West Virginia University Genomics Core Facility

Bioinformatics
You've got data. We turn it into information


This is how I do some of the standard analyses. I am constantly evaluating and changing my procedures, so this page may be out of date. You will want to change file names, directory structure, and similar details to match your own setup. If you are running on the HPC, you'll need to wrap the commands in a script and submit it as a job. You may (will) have to load modules, and some of the commands may be (are) slightly different on different machines.

Get FASTQ files from BaseSpace in R

You will need to get an access token from Illumina first. Information on that is here.
library(BaseSpaceR)

# Replace these with your own token and project ID
ACCESS_TOKEN <- 'dd9...mytoken...43'
PROJECT_ID <- '123456'  ## Get proj ID from url of the project

# Authenticate, then look up the project and its samples
aAuth <- AppAuth(access_token = ACCESS_TOKEN)
selProj <- Projects(aAuth, id = PROJECT_ID, simplify = TRUE)
sampl <- listSamples(selProj, limit = 1000)
inSample <- Samples(aAuth, id = Id(sampl), simplify = TRUE)

# Download every .gz file of each sample into outdir/
for (s in inSample) {
    f <- listFiles(s, Extensions = ".gz")
    print(Name(f))
    getFiles(aAuth, id = Id(f), destDir = 'outdir/', verbose = TRUE)
}
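
Once the download loop has finished, it can be worth a quick look at what actually landed in the output directory. The following is a minimal sketch in base R, not part of BaseSpaceR: check_downloads is a hypothetical helper name, and the recursive search is an assumption in case files end up in subdirectories of outdir/. It lists the .gz files with their sizes and flags any that are empty.

```r
# Hypothetical helper (not a BaseSpaceR function): list downloaded .gz files
# under a directory and flag any that are zero bytes.
check_downloads <- function(dir = "outdir/") {
    # recursive = TRUE is an assumption, in case files sit in subdirectories
    files <- list.files(dir, pattern = "\\.gz$", recursive = TRUE,
                        full.names = TRUE)
    sizes <- file.size(files)
    data.frame(file = files, bytes = sizes, empty = sizes == 0,
               stringsAsFactors = FALSE)
}

# Show the files found in the directory used in the download loop
print(check_downloads("outdir/"))
```

An empty file or a missing sample in this table is a hint that a download was interrupted and may need to be repeated.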



For questions, help, or to offer a beer, get in touch with the bioinformatician, Niel Infante.