I want to reproduce the results reported in the following Nature paper:
Transcriptome genetics using second generation sequencing in a Caucasian population
http://www.nature.com/nature/journal/vaop/ncurrent/full/nature08903.html
I downloaded their SAM files from the group's website: http://funpopgen.unige.ch/data/ceu60
I downloaded a reference fasta and fai file from:
http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/pilot_data/technical/reference/
The main problem seems to be that I am unable to convert these SAM files into working BAM files, which I need in order to produce BED files, the input format for FluxCapacitor (http://flux.sammeth.net/). Because the SAM files have no proper header, some additional steps are needed. I tried the following:
- samtools view -bt human_b36_male.fa.gz.fai first.sam > first.bam
- samtools sort first.bam first.bam.sorted
- samtools index first.bam.sorted
- samtools index aln-sorted.bam
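
To make the missing-header step above easier to check, here is a minimal sketch of the @SQ header lines that can be derived from a .fai index, assuming the standard tab-separated .fai layout with the sequence name in column 1 and its length in column 2. The function name fai_to_sq_header and the output file name in the usage comment are hypothetical, and this is only a way to inspect or prepend the header by hand, not necessarily what samtools does internally.

```shell
# Hypothetical helper: print one @SQ header line per sequence in a .fai index.
# Assumes column 1 is the sequence name and column 2 is the sequence length.
fai_to_sq_header() {
    awk -F'\t' 'BEGIN { OFS = "\t" } { print "@SQ", "SN:" $1, "LN:" $2 }' "$1"
}

# Usage sketch (the output file name is an assumption):
# fai_to_sq_header human_b36_male.fa.gz.fai | cat - first.sam > first.with_header.sam
```

Comparing the SN: names printed here with the reference names in the third column of the SAM records should show whether the reference actually matches the alignments.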