Memory Efficient Bedtools Sort And Merge With Millions Of Entries?

Is there a memory-efficient way to sort and merge a large number of BED files, each containing millions of entries, into a single BED file? Duplicated and partially overlapping entries should be merged so that every interval in the output is unique.

I have tried the following, but it exhausts the 32 GB of memory I have available:

find /my/path -name '*.bed.gz' | xargs gunzip -c | ~/src/bedtools-2.17.0/bin/bedtools sort | ~/src/bedtools-2.17.0/bin/bedtools merge | gzip -c > bed.all.gz

Any suggestions?
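
One possible approach, offered as a sketch rather than a tested recipe for this data set: let an external sort do the ordering with a fixed memory cap, then merge in a single streaming pass over the sorted records. The example below assumes GNU sort (for the `-S` buffer-size and `-T` temp-directory options) and tab-separated input with at least three columns. `SORT_MEM`, `SORT_TMP`, and `sort_and_merge_bed` are hypothetical names introduced for illustration. The merge step is written in awk so the sketch is self-contained; since `bedtools merge` also expects sorted input, it could probably take the place of the awk stage.

```shell
#!/bin/sh
# Sketch: sort BED records with bounded memory, then merge overlaps in one pass.
# Assumptions (not from the original post): GNU sort, tab-separated BED input,
# and the hypothetical SORT_MEM / SORT_TMP settings below.

SORT_MEM="${SORT_MEM:-1G}"     # cap on the in-memory sort buffer
SORT_TMP="${SORT_TMP:-/tmp}"   # directory where sort spills sorted runs to disk

sort_and_merge_bed() {
    # Reads BED on stdin, writes merged chrom/start/end intervals to stdout.
    # LC_ALL=C gives fast byte-wise comparison of chromosome names.
    LC_ALL=C sort -k1,1 -k2,2n -S "$SORT_MEM" -T "$SORT_TMP" |
    awk 'BEGIN { FS = OFS = "\t" }
         # New chromosome, or a gap after the current interval:
         # flush the current interval and start a new one.
         $1 != chrom || $2 > end {
             if (chrom != "") print chrom, start, end
             chrom = $1; start = $2; end = $3
             next
         }
         # Overlapping or book-ended record: extend the current interval.
         $3 > end { end = $3 }
         END { if (chrom != "") print chrom, start, end }'
}

# Possible usage with the files from the question:
#   find /my/path -name '*.bed.gz' -print0 | xargs -0 gunzip -c \
#       | sort_and_merge_bed | gzip -c > bed.all.gz
```

With this layout the sort buffer is limited to `SORT_MEM` and larger inputs are spilled to temporary files in `SORT_TMP`, while the merge step only ever holds one interval at a time. Note that this awk stage keeps only the first three columns and treats book-ended intervals (one starting exactly where the previous one ends) as mergeable.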

