BMMB - 597D: Practical Data Analysis (Fall 2011)

The purpose of this course is to introduce students to applications of high-throughput sequencing, including ChIP-Seq, RNA-Seq, SNP calling, metagenomics, de novo assembly and others. The course material concentrates on presenting complete data analysis scenarios for each of these application domains and introduces students to a wide variety of existing tools and techniques. We expect that by the end of the course students will:

  • understand common bioinformatics data formats and standards

  • become familiar with the practice of analyzing short-read sequencing data from various instruments:
    • Illumina HiSeq sequencer
    • ABI SOLiD sequencer
    • Roche 454 platforms
  • develop the computationally oriented thinking needed to take on large-scale data analysis projects

  • understand data analysis principles of methodologies such as:
    • short read and long read alignments
    • ChIP-Seq analysis and peak calling
    • interval query and manipulation
    • SNP calling and genomic variation detection
    • genome assembly with open source tools
    • metagenomics analysis
  • filter, extract and combine data with scripting languages

  • automate tasks with shell scripts to create reusable data pipelines

  • plot and visualize results with R and other packages

A laptop with enough battery power for 25 minutes of work may be required to perform data analysis tasks in class. We can provide support for Mac OS X (Tiger/Leopard), Windows (XP/Vista) and Linux operating systems.

Practical data analysis for life scientists
BMMB 597D - Bio Data Analysis (2 cr.)
Schedule #398704
Tuesday/Thursday 2:30-3:20 in 120 Thomas Building
Limit of 25 students.

Office hours: MW 2-3pm 502B Wartik

Lecture Notes

Important

Read the Getting Started page before the first lecture.

Lectures will appear below as they are presented. Each week we will cover one topic over two lectures. Homework is included in the handouts.

Grading and Homework

The final grade will be an average of the grades obtained on homework and two projects. Please refer to the information in Lecture 1 for more details on the projects.

Homework will be handed out in most lectures in the form of exercises to be turned in at the beginning of each week. Note that many of these may be solved in class during the exercise session.

We want to emphasize that the primary goal of this course is to improve students' ability to handle and interpret data sets. Therefore, evaluation is relative to each student's initial aptitude. We aim to develop lasting skills and talents that are not just immediately useful but also provide the foundation for a deeper understanding of informatics in general.