Bioinformatics BootCamp for Postdocs

We are offering the Bioinformatics BootCamp for Postdocs in partnership with Countway Library of Medicine and the HMS Clinical and Translational Science Center. Complete syllabus and registration links are below.

Bioinformatics is a rapidly growing field involving all areas of biological research. High-throughput technologies have established themselves as indispensable tools for the study of biological systems from gene expression level changes and protein concentrations to their modifications and interactions in complex diseases and systems. With the advancement of bioinformatics, the need for researchers to interpret information embedded in complex biological systems continues to grow.  

The BootCamp runs June 8 – August 30. You may register for as many of the classes as you wish to attend. A BootCamp Certificate of Completion is awarded to those who attend at least 70% of the workshops.

Spaces are limited for these classes. Registration is required and opens on Monday, May 8.

- All classes are held at Countway Library, Room L2-025, 10 Shattuck Street -

Introduction to Bioinformatics
Date: Thur June 08 | 10:00am, Duration: 1.5 hours
This workshop gives an overview of the basic concepts and fundamentals underlying modern bioinformatics. Topics presented include sequence databases, sequence comparisons, database searches, phylogenetic analysis, protein structure, proteomics, RNA structure prediction, gene prediction and identification, genetic analysis of disease, and microarray-based studies of gene expression.

Introduction to Amazon Cloud and UNIX basics
Date: Tue June 13 | 10:00am, Duration: 2 hours
This workshop is an introduction to Amazon cloud computing and UNIX basics. It will help you get set up and build a Windows or Unix computer on the Amazon cloud. Attendees will gain hands-on experience, and by the end of the course should be able to use the command-line interface on a Unix system with confidence, navigate the Unix file system from the command line, and use a number of basic Unix commands.

Analyzing NGS Data: Standard data processing and workflow analysis in a high-performance computing environment
Date: Tue June 20 | 10:00am, Duration: 3 hours
Next-generation sequencing (NGS) technologies have the potential to dramatically accelerate biomedical research by making comprehensive analysis of genomes and transcriptomes inexpensive, routine, and widespread. This workshop will focus on methods for base-calling and variant-calling, for aligning reads to reference sequences (e.g. genomes), and for de novo assembly of short reads into longer sequences. The following topics will be covered on Orchestra, a shared research cluster: quality reports of FASTQ files, trimming and filtering of reads, read alignment using bowtie/bwa, alignment and coverage objects such as SAM/BAM files, calling SNPs with Samtools, and de novo assembly using Velvet.

mRNA‐seq analysis using JMP Genomics Software
Date: Thur July 06 | 10:00am, Duration: 3 hours
Next-generation sequencing is quickly becoming the platform of choice for genomic analysis. JMP Genomics has incorporated a number of functions for working with mRNA-seq data. Many of the functions are similar to those used in traditional expression analysis. Learn how to map reads and generate counts, import mRNA-seq data (counts), and carry out mRNA-seq analysis including data filtering, normalization, ANOVA, and differential expression.

Affymetrix / Illumina Microarray data analysis using R/Bioconductor
Date: Tue July 11 | 10:00am, Duration: 3 hours
The course is a general introduction to microarrays and the use of R/Bioconductor to carry out microarray data analysis. Following the introduction, the workshop starts with a hands-on exercise on how to install R and the Bioconductor GUI packages. The course is mainly based on the use of Bioconductor open-source packages for analyzing single-channel and two-channel data sets. Only basic R coding will be introduced, since all the analyses are performed using OneChannelGUI, a graphical interface to Bioconductor tools designed for life scientists who are not familiar with the R language. Students will learn how to carry out the following: quality control, normalization, filtering, statistical analysis, and differential expression.

Introduction to ChIP‐Seq and data analysis using Galaxy
Date: Wed July 12 | 10:00am, Duration: 3 hours
This three-hour workshop focuses on the analysis of ChIP-seq data using Galaxy. It is aimed at researchers who are using, or planning to use, ChIP-seq methods as part of their research. The course will focus on hands-on training in standard data analysis methods, including loading data into Galaxy, quality control and manipulation, Bowtie mapping, peak calling using MACS, and annotation and functional enrichment of peaks.

Geneious Pro Genomic Analysis Software Platform
Date: Tue July 18 | 10:00am, Duration: 2 hours
Attendees will have an opportunity to learn the following: full genome sequence assembly; sequence, literature, and BLAST searching; phylogenetics; primer design and primer management; in silico cloning and Gateway cloning; variant (SNP) calling; and RNA-Seq mapping and expression analysis.

Analysis of MicroRNA Expression and Function by a Variety of Techniques
Date: Tue July 25 | 10:00am, Duration: 2 hours
MicroRNAs (miRNAs) are post-transcriptional regulators that silence gene expression by binding mainly to the 3' untranslated regions of target messenger RNA transcripts. This session will be useful for any researcher who wants to investigate miRNAs in detail. Topics covered include basic techniques for miRNA isolation, expression profiling, and validation, as well as their functional analysis in mammalian cells.

HTqPCR - high throughput qPCR analysis using R/Bioconductor
Date: Tue Aug 01 | 10:00am, Duration: 2 hours
Quantitative real-time polymerase chain reaction (qPCR) is routinely used for RNA expression profiling, validation of microarray hybridization data, and clinical diagnostic assays. HTqPCR, a package for the R statistical computing environment, enables the processing and analysis of high-throughput qPCR data across multiple conditions or replicates, and in spatially defined formats such as ABI TaqMan Low Density Arrays and conventional 96- or 384-well plates. The workshop is designed to help researchers learn how to load data into HTqPCR and carry out quality assessment, normalization, visualization, and parametric or non-parametric testing for statistically significant differences in Ct values between features (e.g. genes, microRNAs).

Making the Most of the UCSC Genome Browser
Date: Tue Aug 08 | 10:00am, Duration: 2 hours
The UCSC Genome Browser provides rapid, straightforward access to a vast store of genome-oriented material. Learn how to quickly locate gene information and gene features, how to download sequence and track information, and how to make use of the Table Browser to retrieve data in bulk. We'll also examine other UCSC tools such as the Gene Sorter and VisiGene.

Ensembl Genome Browser Workshop
Date: Thur Aug 10 | 10:00am, Duration: 2 hours
Ensembl provides unified access to genomic information and annotation for more than 50 eukaryotic species. Learn how to find what you need, from splice sites to regulatory regions to SNPs. We'll also explore the BioMart tool to select and export Ensembl data. The workshop includes hands-on exercises.

SHRINE Workshop: How to use the SHRINE web-based tool to generate aggregate patient counts across Harvard hospitals
Date: Thur Aug 10 | 1:00pm, Duration: 1 hour
SHRINE (the Shared Health Research Information Network) is a web-based query tool built on top of i2b2 (Informatics for Integrating Biology and the Bedside), a widely used and robust platform for clinical research. SHRINE allows researchers to query across participating hospital electronic medical record systems in order to determine the total counts of patients who meet a given set of inclusion and exclusion criteria (currently demographics, diagnoses, medications, and selected laboratory values). These data will be most useful for investigators interested in identifying or characterizing potential clinical trial cohorts for recruitment, generating new research hypotheses, planning or conducting research requiring large sample sizes, and preparing grant applications. SHRINE is a service available to Harvard Medical School faculty (Instructor or above) and Fellows employed by one of the participating hospitals. Fellows must be sponsored by an approved faculty member. For more information, please visit:

Navigating and Using NCBI BLAST & Gene Expression Omnibus
Date: Tue Aug 15 | 10:00am, Duration: 2 hours
Learn how to use BLAST as an experimental tool. We will cover the use of filters as BLAST tools and contrast them with the use of PHI-BLAST. We will learn about substitution matrices and how the PSSM relates to PSI-BLAST. The Gene Expression Omnibus (GEO) is a public repository that archives and freely distributes microarray, next-generation sequencing, and other forms of high-throughput functional genomic data. Learn how to navigate the GEO interface to retrieve data to inform your experiments. This workshop will acquaint attendees with the NCBI BLAST tool, which compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches. Attendees will also learn how to query and download experiments that are MIAME-compliant.

Pathway Analysis and Networks of Genomic Data using MetaCore
Date: Tue Aug 22 | 10:00am, Duration: 2 hours
Learn how to choose the right network-building algorithm to test and expand your hypothesis. One aspect of systems biology is to integrate complex interactions of biological systems. GeneGo provides a highly annotated and dense interaction database with over ten different network-building algorithms. Here we demonstrate the strength of these tools in visualizing signaling interaction networks and expanding your hypotheses beyond your core research areas. This tutorial describes each network-building algorithm and the modeling workflows, including building with canonical pathway interactions, with examples of when to use each. In this session we also highlight how to optimize the visualization of your interactions of interest on a network we will build. We will show how to add, hide, and show objects and how to manipulate visualizations of pathways using post-filters such as disease, tissue, orthologs, or gene ontology processes.

Graphics and statistical tests using R and JMP software
Date: Wed Aug 30 | 10:00am, Duration: 2 hours
In this class attendees will have the opportunity to learn basic concepts of R and some useful applications of R in research: ggplot2 graphics and some common statistical tests. This class also introduces attendees to JMP Statistical Discovery Software from SAS. This software provides the complete spectrum of statistics and graphics a student or researcher may encounter. JMP is visual, interactive, and dynamic, with a friendly point-and-click, drag-and-drop interface. JMP combines powerful statistics with dynamic graphics, in memory and on the desktop. Its interactive and visual paradigm enables JMP to reveal insights that are impossible to gain from raw tables of numbers or static graphs.