Channel: Post Feed
Viewing all articles
Browse latest Browse all 3764

Genomic Regions To Exclude Before Shuffling Intervals

I want to run a permutation test: randomly reposition (shuffle) a given set of genomic intervals and measure how much the shuffled coordinates intersect a specific genomic element.

Example:

  • Different sets of genes (protein coding, pseudogenes, ncRNA): these are the intervals I want to shuffle.
    L1 genomic repeats: their coordinates stay fixed.
  • For every gene set, shuffle the intervals, intersect them with L1, and measure the overlap (I am using bedtools shuffle - "reposition each feature in the input BED file on a random chromosome at a random position").
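To make the procedure concrete, here is a minimal pure-Python sketch of the permutation test described above. It is not how bedtools implements shuffling; the function names (`overlap_bp`, `shuffle_intervals`, `permutation_test`) and the toy coordinates are hypothetical, intervals are assumed to be 0-based half-open BED-style tuples, the fixed features (L1) are assumed to be non-overlapping, and the chromosome is picked uniformly at random as a simplification.

```python
import random


def overlap_bp(intervals, features):
    """Total bases shared between two lists of (chrom, start, end) tuples.

    Assumes `features` do not overlap each other; otherwise bases are double counted.
    """
    total = 0
    for chrom, start, end in intervals:
        for f_chrom, f_start, f_end in features:
            if chrom == f_chrom:
                total += max(0, min(end, f_end) - max(start, f_start))
    return total


def shuffle_intervals(intervals, chrom_sizes, excluded, rng, max_tries=1000):
    """Give each interval a random position, keeping its length and avoiding `excluded`."""
    chroms = sorted(chrom_sizes)
    placed = []
    for _, start, end in intervals:
        length = end - start
        for _ in range(max_tries):
            chrom = rng.choice(chroms)  # simplification: uniform choice of chromosome
            if chrom_sizes[chrom] < length:
                continue
            new_start = rng.randint(0, chrom_sizes[chrom] - length)
            candidate = (chrom, new_start, new_start + length)
            # Reject any placement that touches an excluded region (e.g. assembly gaps).
            if overlap_bp([candidate], excluded) == 0:
                placed.append(candidate)
                break
        else:
            raise RuntimeError("could not place interval outside excluded regions")
    return placed


def permutation_test(intervals, features, chrom_sizes, excluded, n_perm=1000, seed=0):
    """Return the observed overlap, the null overlaps, and a one-sided empirical p-value."""
    rng = random.Random(seed)
    observed = overlap_bp(intervals, features)
    null = [
        overlap_bp(shuffle_intervals(intervals, chrom_sizes, excluded, rng), features)
        for _ in range(n_perm)
    ]
    # Add-one correction so the empirical p-value is never exactly zero.
    p_value = (sum(1 for x in null if x >= observed) + 1) / (n_perm + 1)
    return observed, null, p_value


# Toy data (made up for illustration only).
chrom_sizes = {"chr1": 1000}
genes = [("chr1", 100, 200)]      # intervals to shuffle
l1 = [("chr1", 150, 250)]         # fixed L1 coordinates
excluded = [("chr1", 0, 50)]      # e.g. an assembly gap
observed, null, p_value = permutation_test(genes, l1, chrom_sizes, excluded, n_perm=200)
```

Whatever is passed as `excluded` shapes the null distribution directly, which is why the choice of excluded regions matters for interpreting the p-value.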

Question: which genomic regions should I exclude from the "genome" (bedtools shuffle -g option) before shuffling the gene intervals?
I was going to exclude gaps in the assembly.
But what about:

  • All gene regions.
    If I am shuffling pseudogene intervals, should I exclude protein coding and ncRNA coordinates?
  • All non-L1 RepeatMasker coordinates.
    As Alu, LTR and DNA transposons aren't L1, there won't be any intersection with them?
