I am trying to plot a heatmap of read density around a feature of interest (TSSs), a kind of figure that is very common in genomics papers. Something like this (B):
However, I am struggling a bit to get it to look "right". A bit of background:
I have mapped ChIP-seq reads for pol2 and calculated the per-nucleotide coverage using bedtools.
coverageBed -d -abam $bamFile -b $TSSs > $coverage.bed
# output:
chr1 67108226 67110226 uc001dct.3 16 + 1 10
chr1 67108226 67110226 uc001dct.3 16 + 2 10
chr1 67108226 67110226 uc001dct.3 16 + 3 10
chr1 67108226 67110226 uc001dct.3 16 + 4 10
chr1 67108226 67110226 uc001dct.3 16 + 5 8
chr1 67108226 67110226 uc001dct.3 16 + 6 8
chr1 67108226 67110226 uc001dct.3 16 + 7 8
chr1 67108226 67110226 uc001dct.3 16 + 8 8
chr1 67108226 67110226 uc001dct.3 16 + 9 8
chr1 67108226 67110226 uc001dct.3 16 + 10 8
Then in R, the position in column 7 (which coverageBed -d reports as a 1-based offset within each window) is converted to a position relative to the TSS, and the read counts in column 8 are normalized to the library size. This is converted to a numeric matrix with each row being a TSS and each column a relative nucleotide position. For plotting, the matrix is ordered by the number of reads per TSS and the values are log-transformed. This is the outcome:
heatmap(cov.mlog, Rowv=NA, Colv= ...
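For anyone who wants to reproduce the R steps described above, here is a minimal sketch on toy data. It is not my exact script: the column names, the `lib.size` value, the pseudocount in the log, the colour palette, and the assumption that the TSS sits at the centre of each window (with minus-strand windows mirrored) are all illustrative choices.

```r
# Toy stand-in for the coverageBed -d output shown above. Column names are
# hypothetical: pos is the 1-based offset within the window, depth the coverage.
set.seed(1)
win <- 10                                  # toy window width (real windows: 2000 bp)
tss <- c("tssA", "tssB", "tssC")
cov <- data.frame(
  name   = rep(tss, each = win),
  strand = rep(c("+", "-", "+"), each = win),
  pos    = rep(seq_len(win), times = length(tss)),
  depth  = rpois(win * length(tss), lambda = rep(c(2, 8, 5), each = win)),
  stringsAsFactors = FALSE
)
lib.size <- 2e6                            # hypothetical number of mapped reads

# Position relative to the window centre (assumed to be the TSS). Minus-strand
# windows are mirrored so that upstream always ends up on the left.
half  <- win / 2
minus <- cov$strand == "-"
cov$rel <- cov$pos - half
cov$rel[minus] <- (win + 1 - cov$pos[minus]) - half

# Normalize to reads per million, then build the matrix:
# one row per TSS, one column per relative position.
cov$rpm <- cov$depth / lib.size * 1e6
cov.m <- tapply(cov$rpm, list(cov$name, cov$rel), sum)

# Order rows by total signal. heatmap() draws the first matrix row at the
# bottom, so ascending order puts the strongest TSSs at the top.
cov.m <- cov.m[order(rowSums(cov.m)), , drop = FALSE]

# Log-transform; the pseudocount of 1 (to avoid log(0)) is an assumption.
cov.mlog <- log2(cov.m + 1)

# No clustering and no per-row scaling, so rows keep the chosen order and
# colours are comparable across TSSs.
heatmap(cov.mlog, Rowv = NA, Colv = NA, scale = "none",
        col = colorRampPalette(c("white", "red"))(50))
```

One thing worth noting in this sketch is `scale = "none"`: `heatmap()` scales each row by default, which can hide differences in overall signal between TSSs.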