You've got data. We turn it into information.
All data that comes into the facility, whether we create it or you give it to us, is subject to quality control testing. For sequencing data we primarily use FastQC. We use other tools and tests where appropriate.
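To give a feel for what a basic quality check looks at, here is a minimal sketch that computes the mean Phred quality of each read in a FASTQ file and flags reads that fall below a cutoff. This is not how FastQC is implemented, and it is not the facility's pipeline; the function names, the Phred+33 encoding, and the cutoff of 20 are assumptions chosen for illustration.

```python
from io import StringIO
from statistics import mean


def read_fastq(handle):
    """Yield (read_id, sequence, quality_string) from a 4-line-per-record FASTQ."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of file (or a trailing blank line)
        seq = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line is ignored
        qual = handle.readline().rstrip("\n")
        yield header[1:], seq, qual  # drop the leading '@'


def mean_phred(quality, offset=33):
    """Mean Phred score of one read, assuming Phred+33 encoding."""
    return mean(ord(ch) - offset for ch in quality)


def flag_low_quality(handle, threshold=20.0):
    """Return the IDs of reads whose mean quality is below the threshold."""
    return [rid for rid, _, qual in read_fastq(handle)
            if mean_phred(qual) < threshold]


# Two made-up reads: one high quality ('I' = Q40), one very low ('!' = Q0).
example = StringIO("@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\n!!!!\n")
print(flag_low_quality(example))
```

A real QC report covers much more than this, such as per-base quality, adapter content, and duplication levels.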
RNA-Seq requires many of the same steps as microarray analysis, but it is complicated by the lack of well-defined, well-designed oligos with good annotation; instead, it relies on short reads of uncertain origin. A major step in this analysis is aligning the reads to a reference genome or transcriptome, or creating a de novo transcriptome. Once these steps are complete, analysis follows a path very similar to microarray analysis. We use STAR for the mapping, and DESeq2 to calculate fold change and significant differential expression. Partek can also be used for the analysis after mapping is done.
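As a rough illustration of what "fold change" means once reads have been mapped and counted, the sketch below scales raw counts to counts per million and takes a log2 ratio between two samples. This is a deliberately simplified stand-in, not DESeq2's method, which fits a statistical model across replicates. The gene names, counts, and pseudocount are made up for the example.

```python
import math


def counts_per_million(counts):
    """Scale raw read counts so that each sample sums to one million."""
    total = sum(counts.values())
    return {gene: 1e6 * n / total for gene, n in counts.items()}


def log2_fold_change(control, treated, pseudocount=1.0):
    """log2(treated / control) per gene on library-size-scaled counts.

    The pseudocount (an assumption here) avoids division by zero for
    genes with no reads in one of the samples.
    """
    cpm_c = counts_per_million(control)
    cpm_t = counts_per_million(treated)
    return {
        gene: math.log2((cpm_t[gene] + pseudocount) / (cpm_c[gene] + pseudocount))
        for gene in cpm_c
    }


# Hypothetical counts for two genes in one control and one treated sample.
control = {"geneA": 100, "geneB": 900}
treated = {"geneA": 400, "geneB": 600}

for gene, lfc in log2_fold_change(control, treated).items():
    print(gene, round(lfc, 2))
```

A single pair of samples says nothing about significance; that requires replicates and a proper statistical test.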
Single Nucleotide Polymorphisms and structural polymorphisms (Copy Number Variants, Presence/Absence variants) can be discovered and characterized by resequencing individuals from species for which a reference genome exists. They can also be characterized for new genomes by assembling those genomes from scratch (de novo assembly).
Increasingly, entire communities of organisms are being sequenced in bulk from samples collected clinically or in the environment. The microbial composition of these samples can be determined through bulk sequencing, thereby providing a relatively unbiased view of the diversity of organisms and genes present in the sample, and how these change with environmental or clinical conditions.
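Composition and diversity can be summarized in a few lines once reads have been assigned to taxa. The sketch below converts per-taxon read counts into relative abundances and computes the Shannon diversity index, one common diversity measure. The taxon names and counts are invented, and the read assignment step itself (the hard part) is assumed to have been done already.

```python
import math


def relative_abundance(read_counts):
    """Fraction of all assigned reads that belong to each taxon."""
    total = sum(read_counts.values())
    return {taxon: n / total for taxon, n in read_counts.items()}


def shannon_diversity(read_counts):
    """Shannon index H = -sum(p * ln p) over taxa with at least one read."""
    proportions = relative_abundance(read_counts).values()
    return -sum(p * math.log(p) for p in proportions if p > 0)


# Hypothetical read counts per genus for a single sample.
sample = {"Bacteroides": 50, "Escherichia": 30, "Lactobacillus": 20}

print(relative_abundance(sample))
print(round(shannon_diversity(sample), 3))
```

Comparing these numbers between samples is what shows how a community shifts with environmental or clinical conditions.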
We will determine the presence, and even the relative abundance, of known microRNAs. Finding novel microRNAs or circular RNAs is something we can do, but it falls under the category of ‘uncommon tasks’ and would be billed accordingly.
ChIP-Seq and HITS-CLIP
Chromatin immunoprecipitation followed by sequencing (ChIP-Seq) determines the genomic binding sites of any protein for which you have an antibody. Mapping histone methylation is a popular application of this technique. HITS-CLIP is similar, but it immunoprecipitates protein-bound RNA rather than DNA.
Microarray analysis generally consists of three parts: normalizing the data, calculating fold change and statistical significance, and determining biological significance through ontology and pathway analysis. For the first two parts, we primarily use R and Bioconductor. There are several standard procedures we follow to produce publishable results. We also use the Partek Genomics Suite, a commercial product with a graphical interface and automated reference downloading. Partek will do some pathway analysis, though we have not validated its usefulness in that regard. For ontology and pathway analysis we use either DAVID or Ingenuity Pathway Analysis.
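To make the normalization step concrete, here is a minimal sketch of quantile normalization, one widely used approach for making intensity distributions comparable across arrays. It is shown only as an example of the idea; the text above does not say which normalization the facility applies, and the array names and intensities are made up. Ties are not given any special treatment in this simplified version.

```python
from statistics import mean


def quantile_normalize(samples):
    """Force every array to share the same intensity distribution.

    samples maps an array name to a list of intensities, with probes in
    the same order in every list. Each value is replaced by the mean,
    across arrays, of the values that hold the same rank.
    """
    names = list(samples)
    n_probes = len(samples[names[0]])

    # For each array, probe indices ordered from lowest to highest intensity.
    order = {
        name: sorted(range(n_probes), key=lambda i: samples[name][i])
        for name in names
    }

    # Mean of the k-th smallest value across all arrays.
    rank_means = [
        mean(samples[name][order[name][k]] for name in names)
        for k in range(n_probes)
    ]

    # Write each rank mean back to the probe that held that rank.
    normalized = {name: [0.0] * n_probes for name in names}
    for name in names:
        for k, probe_index in enumerate(order[name]):
            normalized[name][probe_index] = rank_means[k]
    return normalized


# Hypothetical intensities for three probes on two arrays.
arrays = {"array1": [5.0, 2.0, 3.0], "array2": [4.0, 1.0, 6.0]}
print(quantile_normalize(arrays))
```

After this step, each probe keeps its rank within its own array, so fold changes between arrays reflect relative differences rather than overall brightness.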
For questions, help, or to offer a beer, get in touch with the bioinformatician, Niel Infante.