I'm trying to find all the reads (by name) from a BAM file that align to various regions in a BED file. Right now I can do this with bedtools using intersectBed:
intersectBed -abam reads.bam -wo -f 1 -b regions.bed -bed
From this output one can parse all the read IDs that land in each interval in regions.bed, but it's not very compact. Is there a way to get bedtools to natively transform this into a more compact format, e.g.
chr1 x y .... read_id1,read_id2,read_id3
where chr1 x y is a given interval in regions.bed and the comma-separated read_id1,... is the list of read IDs from reads.bam that fall in that interval. In this compact format, the output BED file would have at most as many entries as there are regions in regions.bed, whereas with the -wo option the output can have even more lines than there are reads in reads.bam. Thanks.
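For reference, here is a minimal sketch of the manual parsing step mentioned above, i.e. collapsing the -wo output into one line per region outside of bedtools. The function name collapse_reads and its parameters are hypothetical. It assumes the output is tab-separated, that each read is written as a_cols BED columns (12 here, which may differ between bedtools versions) with the read ID in column 4, and that the columns from regions.bed follow immediately, starting with chrom, start and end.

```python
from collections import OrderedDict


def collapse_reads(lines, a_cols=12, b_cols=3):
    """Group read IDs by region from `intersectBed -abam ... -wo -bed` output.

    a_cols: number of columns written for each read (assumed, not verified).
    b_cols: number of columns in regions.bed (assumed to be at least 3).
    """
    regions = OrderedDict()  # keeps regions in order of first appearance
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < a_cols + b_cols:
            continue  # skip blank or malformed lines
        read_id = fields[3]  # BED name column holds the read ID
        region = tuple(fields[a_cols:a_cols + 3])  # chrom, start, end
        regions.setdefault(region, []).append(read_id)
    # One tab-separated line per region, read IDs joined by commas
    return ["\t".join(region + (",".join(ids),))
            for region, ids in regions.items()]
```

Feeding the lines of the intersectBed output to collapse_reads returns one string per region that has at least one read; regions with no overlapping reads do not appear, since -wo does not report them.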