Issue with Alignment Results Using Soft-Masked Genome in Bismark #705

Chanspace · 2024-10-14T02:13:52Z

I am currently conducting Whole Genome Bisulfite Sequencing (WGBS) data analysis using Bismark and plan to utilize a soft-masked genome, where all repetitive and low-complexity regions are marked with lowercase letters.

During the index generation step, I observed that the index created is consistent with the unmasked genome. However, I noticed a significant difference in the results during the alignment step, specifically in the number of uniquely aligned reads. It appears that tools like Bowtie2 ignore the soft-masking, treating the lowercase letters as uppercase during alignment.

Is there a specific parameter or approach in Bismark that would allow me to achieve alignment results with the soft-masked genome that are comparable to those obtained with the unmasked genome? Any guidance or advice would be greatly appreciated!

Thank you!

FelixKrueger · 2024-10-15T07:45:58Z

To be perfectly honest, I don't exactly know whether or not Bowtie2 treats soft-masked genomes differently to unmasked genomes but I don't think it does (Google also doesn't seem to know, "how does Bowtie2 treat soft-masked index" didn't yield any great insights either).

What would you like to achieve by soft-masking repeats?

Chanspace · 2024-10-16T11:55:53Z

I'm sorry, I may not have expressed myself clearly. What I actually want to know is how to ensure consistent detection rates when using unmasked and soft-masked genomes in Bismark. We have used soft-masked genomes in other omics analyses, so we hope to maintain consistency. However, when we compared unmasked and soft-masked genomes in WGBS data analysis with Bismark, the subsequent methylation detection rates still differed even though the generated indexes are the same.

FelixKrueger · 2024-10-20T21:24:58Z

I am afraid I don't really have an answer for you; I would have to do some tests with reproducible examples. I think your best bet would be to cross-post this question over at the Bowtie2 repo, as the effects of this behaviour will likely be part of Bowtie2's strategy for soft-masked indexes. If you get an answer, I'd be curious to learn more details. Sorry if this is not immediately useful.
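
As a possible starting point for a reproducible example, here is a minimal sketch (not part of Bismark or Bowtie2) that checks whether two FASTA files contain the same sequences once letter case is ignored. The function names and the toy sequences are made up for illustration. It only compares the input genomes and says nothing about how Bowtie2 handles lowercase bases internally.

```python
import hashlib
import io


def fasta_digests(handle, ignore_case=True):
    """Return {sequence_name: sha256 hex digest} for a FASTA text stream.

    Sequence lines are hashed in order, so line wrapping does not matter.
    """
    digests = {}
    name, hasher = None, None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                digests[name] = hasher.hexdigest()
            parts = line[1:].split()
            name = parts[0] if parts else ""
            hasher = hashlib.sha256()
        elif name is not None:
            seq = line.upper() if ignore_case else line
            hasher.update(seq.encode("ascii"))
    if name is not None:
        digests[name] = hasher.hexdigest()
    return digests


def differing_sequences(unmasked, masked):
    """Names of sequences that differ by more than letter case."""
    a = fasta_digests(unmasked)
    b = fasta_digests(masked)
    return sorted(n for n in a.keys() | b.keys() if a.get(n) != b.get(n))


if __name__ == "__main__":
    # Toy data standing in for the real unmasked and soft-masked genomes.
    unmasked = io.StringIO(">chr1\nACGTACGT\n>chr2\nTTTTCCCC\n")
    masked = io.StringIO(">chr1\nACGTacgt\n>chr2\nttttCCCC\n")
    print(differing_sequences(unmasked, masked))
```

With real data, the two `io.StringIO` objects would be replaced by `open(...)` calls on the two genome FASTA files. An empty list means the genomes differ only in case, so the source of the differing alignment numbers would have to be looked for elsewhere; a non-empty list points to the chromosomes whose underlying sequence differs (for example, hard-masked `N` stretches or a different assembly version).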
