Next generation sequencing data analysis using linux power tools

Academic year 2023/2024

Course ID
NEU0293A
Teacher
Ivan Molineris (Lecturer)
Year
1st year
Teaching period
First semester
Type
Related or integrative
Credits/Recognition
2
Course disciplinary sector (SSD)
BIO/11 - molecular biology
Delivery
Formal authority
Language
English
Attendance
Optional
Type of examination
Practice test
Type of learning unit
Module
Modular course
Applied bioinformatics (NEU0293)
Prerequisites
Theoretical knowledge of molecular biology concepts and high-throughput analyses such as DNA and RNA sequencing. Basic knowledge of programming notions, such as: file system, commands, variables, control flow (if/else/loops), lists, functions.
Students must master the concepts covered in the Programming and Bioinformatics modules of the Data Science course.

Course summary

Course objectives

The aim of the course is to provide the students with the knowledge and competences necessary to autonomously run computational analyses in a UNIX environment, with a specific focus on methods for the analysis of Next-Generation Sequencing big data.

Learning outcomes

Understanding of computing processes and the basic ideas of computer architecture.

Familiarity with the UNIX environment.

Ability to proficiently run bioinformatics tools from the command line.

Understanding of parallel computing principles and their basic applications.

Program

  1. Computer science concepts (reviewed from the Programming for Data Science module):
    1. Computer architecture
    2. Process
    3. The file system
    4. Interface and API concept
    5. Structure of a Linux/Unix system
    6. Exchange of data and services, servers
    7. Encoding: everything in bioinformatics is text
  2. The shell and commands
    1. Navigate the filesystem
    2. Filesystem permission system
  3. Unix power tools and basic programming principles
    1. awk
    2. Principles and application of parallel computing
  4. Next generation sequencing data analysis
    1. The FASTA and FASTQ files
    2. FastQC
    3. Analysis of overrepresented sequences
    4. Annotation of genomes and GTF
    5. Mapping with STAR
    6. The BAM format and its display
    7. Expression quantification
    8. Introduction to pseudo-alignments
  5. Error controls and quality assessment
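
To illustrate the program items on text encoding, FASTQ files, and overrepresented sequences, here is a minimal Python sketch. It assumes the simple four-line-per-record FASTQ layout with Phred+33 quality encoding. The function names, the example reads, and the 50% threshold are hypothetical choices made for this illustration; they are not course material and do not reproduce how FastQC works internally.

```python
from collections import Counter
from io import StringIO


def read_fastq(handle):
    """Yield (name, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (or trailing blank line)
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored here
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(seq) != len(qual):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        yield header[1:], seq, qual


def mean_phred(qual, offset=33):
    """Mean base quality, assuming Phred+33 ASCII encoding."""
    return sum(ord(c) - offset for c in qual) / len(qual)


def overrepresented(records, min_fraction=0.5):
    """Return sequences whose share of all reads is at least min_fraction.

    The threshold is an arbitrary value chosen for this toy example.
    """
    counts = Counter(seq for _, seq, _ in records)
    total = sum(counts.values())
    return {s: n / total for s, n in counts.items() if n / total >= min_fraction}


# Three made-up reads: a FASTQ file is plain text, so a string works as input.
EXAMPLE = """@read1
ACGT
+
IIII
@read2
ACGT
+
IIII
@read3
TTTT
+
!!!!
"""

records = list(read_fastq(StringIO(EXAMPLE)))
for name, seq, qual in records:
    print(name, seq, mean_phred(qual))
print(overrepresented(records))
```

In this toy input, `ACGT` accounts for two of the three reads and is therefore the only sequence reported. Real data are normally handled with dedicated tools such as FastQC, which apply their own criteria.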

Course delivery

All lessons take place in the computer lab and blend lectures with practical activities.

Learning assessment methods

The learning assessment is integrated with the other module of the course. See the relevant page.

Students should produce one integrated report covering both modules, and there will be one integrated oral examination.

Suggested readings and bibliography



Last update: 31/08/2023 10:28