Can CollectWgsMetrics fall back to slow algorithm gracefully if the fast one fails with the default read length? #1970
Comments
@gokalpcelik What exactly are you envisioning? If it encounters an error, will it restart at the beginning of the BAM with the slow algorithm? Or will it process a specific read with the other algorithm and then combine the results? |
This usually happens at the very beginning so restarting the traversal should not be a huge loss. |
That seems reasonable then. |
I wonder if it's possible to make the Fast collector work on these files? If it's just retrying with the other collector that seems pretty simple, but kind of gross. Do you want to implement that? I might change the arguments so instead of USE_FAST_ALGORITHM we have a set of inputs SLOW, FAST, TRY_FAST_AND_FALL_BACK or something like that so you can control the behavior. |
Try fast and fall back is a nice option
(In reply to Louis Bergelson's comment above, written Tue, Jul 9, 2024, 20:36.)
|
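Until something like this exists in the tool itself, the three proposed behaviours can be approximated from the outside. The following is only a sketch: the mode names `SLOW`, `FAST`, and `TRY_FAST_AND_FALL_BACK` are taken from the proposal above and are not existing Picard arguments, and `collect_wgs_metrics` and the `PICARD` variable are hypothetical names chosen for illustration. It assumes a `picard` launcher is on the `PATH` and that it exits non-zero when the fast collector fails.

```shell
# Hypothetical wrapper around Picard's CollectWgsMetrics.
# PICARD is an assumed launcher name; override it if yours differs.
PICARD="${PICARD:-picard}"

# Usage: collect_wgs_metrics MODE [CollectWgsMetrics arguments...]
# MODE is one of SLOW, FAST, TRY_FAST_AND_FALL_BACK (names from the proposal,
# not real Picard options). Do not pass --USE_FAST_ALGORITHM yourself.
collect_wgs_metrics() {
    cwm_mode="$1"
    shift
    case "$cwm_mode" in
        SLOW)
            "$PICARD" CollectWgsMetrics "$@" --USE_FAST_ALGORITHM false
            ;;
        FAST)
            "$PICARD" CollectWgsMetrics "$@" --USE_FAST_ALGORITHM true
            ;;
        TRY_FAST_AND_FALL_BACK)
            # Restart the whole traversal with the slow collector on any failure.
            "$PICARD" CollectWgsMetrics "$@" --USE_FAST_ALGORITHM true \
                || "$PICARD" CollectWgsMetrics "$@" --USE_FAST_ALGORITHM false
            ;;
        *)
            echo "unknown mode: $cwm_mode" >&2
            return 2
            ;;
    esac
}
```

A call would then look like `collect_wgs_metrics TRY_FAST_AND_FALL_BACK --INPUT "$bam" --OUTPUT "$out" --REFERENCE_SEQUENCE "$fasta"`. Note that this version retries on every failure, including ones the slow collector cannot fix.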
We will have to identify the actual issue that's causing this, because we only want to fall back in the cases where doing so would help, and not just any random |
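One way to restrict the fallback to a known failure, sketched as an external wrapper rather than a change to Picard: capture the fast run's stderr and retry only if it contains a specific piece of text. The exact error is not pinned down in this thread, so `FALLBACK_PATTERN` below is a placeholder that must be replaced with the message actually observed; `collect_wgs_metrics_selective` and `PICARD` are likewise hypothetical names.

```shell
# Assumed launcher name; override if yours differs.
PICARD="${PICARD:-picard}"
# Placeholder only: substitute text from the error you actually see.
FALLBACK_PATTERN="${FALLBACK_PATTERN:-REPLACE_WITH_OBSERVED_ERROR_TEXT}"

# Usage: collect_wgs_metrics_selective [CollectWgsMetrics arguments...]
# Runs the fast collector first; retries with the slow collector only when
# the fast run fails AND its stderr contains FALLBACK_PATTERN (fixed string).
collect_wgs_metrics_selective() {
    sf_log=$(mktemp) || return 1
    sf_status=0
    "$PICARD" CollectWgsMetrics "$@" --USE_FAST_ALGORITHM true 2>"$sf_log" \
        || sf_status=$?
    cat "$sf_log" >&2    # keep Picard's own messages visible

    if [ "$sf_status" -ne 0 ] && grep -qF -- "$FALLBACK_PATTERN" "$sf_log"; then
        rm -f "$sf_log"
        echo "fast collector hit the known error; retrying with slow collector" >&2
        "$PICARD" CollectWgsMetrics "$@" --USE_FAST_ALGORITHM false
        return $?
    fi

    # Success, or an unrelated failure: report the fast run's status unchanged.
    rm -f "$sf_log"
    return "$sf_status"
}
```

Matching on message text is brittle across versions; it is used here only because the wrapper cannot see the exception type.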
@lbergelson It is the issue. The cc: @gokalpcelik |
If this is true, then perhaps the argument is mislabelled? Currently it is described as the average read length. |
I hit this too and debugging was tricky. Glad this issue was opened! I have 100 cycle read data and am relying on the default
I'm going to try something naive like:
picard \
-Xmx${memory}M \
CollectWgsMetrics \
--VALIDATION_STRINGENCY SILENT \
--INPUT ${bam} \
--OUTPUT ${sample}.${library}.wgs_metrics.txt \
--REFERENCE_SEQUENCE ${fasta} \
--USE_FAST_ALGORITHM true \
${intervals} \
|| picard \
-Xmx${memory}M \
CollectWgsMetrics \
--VALIDATION_STRINGENCY SILENT \
--INPUT ${bam} \
--OUTPUT ${sample}.${library}.wgs_metrics.txt \
--REFERENCE_SEQUENCE ${fasta} \
--USE_FAST_ALGORITHM false \
${intervals}
Alternatively, I'll see if I can't get what I need out of |
This issue seems to be present in the latest version as well. I have a collection of more than 100 WGS BAM files. The fast algorithm works for most of them, but a few fail with the same error because the read length is not optimal.
I understand that this parameter is important for the estimates. However, instead of failing, can the tool fall back to the slow algorithm gracefully?
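To see which files in a large collection have unusual read lengths, a quick summary of the sequence lengths can help. This is a minimal sketch, assuming `samtools` is available to stream the records; `read_length_summary` is a hypothetical helper name. It measures the stored sequence (SAM column 10), so hard-clipped bases are not counted.

```shell
# Reads SAM records on stdin and prints count, min, mean and max SEQ length.
read_length_summary() {
    awk -F '\t' '
        /^@/       { next }    # skip header lines, if any
        $10 == "*" { next }    # sequence not stored for this record
        {
            n = length($10)
            if (count == 0 || n < min) min = n
            if (n > max) max = n
            sum += n
            count++
        }
        END {
            if (count == 0) { print "no reads"; exit 1 }
            printf "reads=%d min=%d mean=%.1f max=%d\n", count, min, sum / count, max
        }'
}
```

For example, `samtools view "$bam" | head -n 100000 | read_length_summary` samples the first 100000 records, which is cheap because the failure is reported to occur near the start of the traversal.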