bismark_methylation_extractor samtools view: writing to standard output failed: Broken pipe #721

Open
caaswangzhen opened this issue Dec 16, 2024 · 4 comments

@caaswangzhen

bismark_methylation_extractor -p --comprehensive --bedGraph \
--counts --cytosine_report --buffer_size 30G --report --parallel 8 \
--genome_folder ../reference \
1_bismark_bt2_pe.deduplicated.bam -o ./

Above is the command I ran; the end of the log file is shown below.


Stored sequence information of 12 chromosomes/scaffolds in total

==============================================================================
Methylation information will now be written into a genome-wide cytosine report

Adding context-specific methylation summaries

Writing genome-wide cytosine report to: 1_bismark_bt2_pe.deduplicated.CpG_report.txt <<<

Writing all cytosine context summary file to: 1_bismark_bt2_pe.deduplicated.cytosine_context_summary.txt <<<

Storing all covered cytosine positions for chromosome:chr01
Writing cytosine report for chromosome chr01 (stored 3575676 different covered positions)
Writing cytosine report for chromosome chr02 (stored 3012099 different covered positions)
Writing cytosine report for chromosome chr03 (stored 3159034 different covered positions)
Writing cytosine report for chromosome chr04 (stored 2905351 different covered positions)
Writing cytosine report for chromosome chr05 (stored 2444130 different covered positions)
Writing cytosine report for chromosome chr06 (stored 2442238 different covered positions)
Writing cytosine report for chromosome chr07 (stored 2317488 different covered positions)
Writing cytosine report for chromosome chr08 (stored 2340837 different covered positions)
Writing cytosine report for chromosome chr09 (stored 1903696 different covered positions)
Writing cytosine report for chromosome chr10 (stored 1953891 different covered positions)
Writing cytosine report for chromosome chr11 (stored 2331355 different covered positions)
Writing cytosine report for last chromosome chr12 (stored 1998764 different covered positions)
Finished writing out cytosine report for covered chromosomes (processed 12 chromosomes/scaffolds in total)

Now processing chromosomes that were not covered by any methylation calls in the coverage file...
All chromosomes in the genome were covered by at least some reads. coverage2cytosine processing complete.

Finished generating genome-wide cytosine report

samtools view: writing to standard output failed: Broken pipe
samtools view: error reading file "1_bismark_bt2_pe.deduplicated.bam": Broken pipe
samtools view: error closing standard output: -1

These are the last few lines of the log file. It looks like most of the process finished, but samtools reported an error.
The following files were generated:
1_bismark_bt2_pe.deduplicated.bedGraph.gz
1_bismark_bt2_pe.deduplicated.bismark.cov.gz
1_bismark_bt2_pe.deduplicated.CpG_report.txt
1_bismark_bt2_pe.deduplicated.cytosine_context_summary.txt
1_bismark_bt2_pe.deduplicated.M-bias.txt
1_bismark_bt2_pe.deduplicated_splitting_report.txt

What causes the samtools error? Is it because "--buffer_size 30G" is not enough memory? Are the generated files complete, and will the error affect downstream analysis?

I look forward to your reply and thank you very much.

@FelixKrueger
Owner

To be honest, that all looks like it succeeded just fine and 'just' died while attempting to close the file handle. May I ask which version you tried this with?
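
For context, a "Broken pipe" message is what a writing process generally reports when the reader at the other end of a pipe closes it before all data has been consumed. The sketch below reproduces that general situation with a hypothetical Python child process standing in for `samtools view`. It is not Bismark's actual code, the names are made up for illustration, and it assumes a POSIX-like system.

```python
import subprocess
import sys

# Hypothetical stand-in for a producer such as `samtools view`:
# it keeps printing lines until its standard output goes away.
CHILD_CODE = "for i in range(1_000_000): print(i)"


def read_first_line_then_close():
    """Start the producer, read one line, then close the pipe early."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    first_line = proc.stdout.readline().strip()
    # The reader goes away while the producer is still writing.
    proc.stdout.close()
    # The producer's complaint ends up on its standard error stream.
    stderr_text = proc.stderr.read()
    proc.stderr.close()
    returncode = proc.wait()
    return first_line, returncode, stderr_text


if __name__ == "__main__":
    line, code, err = read_first_line_then_close()
    print("first line read:", line)
    print("producer exit code:", code)
    print("producer stderr mentions a broken pipe:", "BrokenPipeError" in err)
```

The reader already has the data it wanted at the point the producer complains, which is consistent with the error showing up only after the reports were written. Whether this is exactly what happened in this run is an assumption.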

@caaswangzhen
Author

Thank you for your reply. My bismark version is v0.24.2.
$ bismark --version
Bismark - Bisulfite Mapper and Methylation Caller.
Bismark Version: v0.24.2
Copyright 2010-23 Felix Krueger, Altos Bioinformatics
https://github.com/FelixKrueger/Bismark

I also wonder whether the above error is memory-related. My genome size is 400 Mb; is "--buffer_size 30G --parallel 8" appropriate?

@FelixKrueger
Owner

The 30G of RAM is only required for the bedGraph/coverage file conversion, and that must have completed before the cytosine report step. This latter step also finished fine:

Finished writing out cytosine report for covered chromosomes (processed 12 chromosomes/scaffolds in total)

Now processing chromosomes that were not covered by any methylation calls in the coverage file...
All chromosomes in the genome were covered by at least some reads. coverage2cytosine processing complete.

Finished generating genome-wide cytosine report

The error appears to come from reading the deduplicated BAM file and then, at the end, failing to close streamed file handles (which can be difficult to debug). If you want to be sure all is fine, you could rerun the command (in a different folder) without --parallel 8 and check whether you get the same number of lines in the coverage file. But I would assume that it is all fine.
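
The line-count comparison suggested above could be done along these lines. This is a minimal sketch: the function names are made up for illustration, and the two paths are supplied by the user (for example the `.bismark.cov.gz` file from the original run and the one from a rerun in a separate folder).

```python
import gzip
import sys


def count_lines(path):
    """Count lines in a gzip-compressed text file, reading it to the end.

    Reading to the end also surfaces a truncated gzip stream, because
    Python's gzip module typically raises EOFError in that case.
    """
    with gzip.open(path, "rt") as handle:
        return sum(1 for _ in handle)


def same_line_count(path_a, path_b):
    """Return (counts_match, count_a, count_b) for two gzipped files."""
    count_a = count_lines(path_a)
    count_b = count_lines(path_b)
    return count_a == count_b, count_a, count_b


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python compare_cov.py original.cov.gz rerun.cov.gz
    match, n_a, n_b = same_line_count(sys.argv[1], sys.argv[2])
    print(f"{sys.argv[1]}: {n_a} lines")
    print(f"{sys.argv[2]}: {n_b} lines")
    print("line counts match" if match else "line counts differ")
```

Matching line counts do not prove the two files are identical, but they are a quick sign that the parallel run did not lose any positions.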

@FelixKrueger
Owner

Maybe you could try updating samtools?
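
Before updating, it can help to record which samtools version is currently on the PATH. The sketch below assumes a samtools 1.x install, where `samtools --version` prints a first line such as `samtools 1.17`; the helper names are hypothetical.

```python
import re
import shutil
import subprocess


def parse_samtools_version(version_output):
    """Extract (major, minor) from the first line of `samtools --version`."""
    lines = version_output.splitlines()
    first_line = lines[0] if lines else ""
    match = re.match(r"samtools (\d+)\.(\d+)", first_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_samtools_version():
    """Return the (major, minor) version of samtools on PATH, or None."""
    if shutil.which("samtools") is None:
        return None
    result = subprocess.run(
        ["samtools", "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_samtools_version(result.stdout)


if __name__ == "__main__":
    version = installed_samtools_version()
    if version is None:
        print("samtools not found on PATH, or its version could not be parsed")
    else:
        print("samtools version: {}.{}".format(*version))
```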
