Deepvariant 1.8.0 has issues with using --proposed_variants and --variant_caller #922

Open
chaoxinzhang opened this issue Jan 3, 2025 · 3 comments

Comments

@chaoxinzhang

chaoxinzhang commented Jan 3, 2025

Hi!

DeepVariant 1.8.0 has issues when using --proposed_variants and --variant_caller.

Here is the command for DeepVariant 1.8.0:

time sudo docker run  -m 480G \
   -v "/mnt/disk1/ref/bed":"/input" \
   -v "/mnt/disk2/SRA3":"/output" \
   -v "/mnt/disk1/ref":"/reference" \
   google/deepvariant:"1.8.0" \
   /opt/deepvariant/bin/run_deepvariant \
  --model_type WGS \
  --ref /reference/Homo_sapiens_assembly38.fasta \
  --reads /output/ERR10502884.markup.bam \
  --output_vcf /output/ERR10502884.output180.vcf.gz \
  --num_shards 120 \
  --intermediate_results_dir /output/tmp3 \
  --make_examples_extra_args variant_caller="vcf_candidate_importer",proposed_variants="/output/output1.vcf.gz"

The output180.vcf.gz file has over 5 million variants.

zcat /mnt/disk2/SRA3/ERR10502884.output180.vcf.gz  |grep PASS|wc -l

The ERR10502884.output180.vcf.gz file has fewer than 100,000 lines with a PASS flag; the rest have a RefCall flag.
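A note on the counting step: `grep PASS` matches the string anywhere on a line, so it may also count header lines (for example a `##FILTER=<ID=PASS,...>` declaration) or records that mention PASS outside the FILTER column. A minimal sketch that tallies the FILTER column (column 7) directly is shown here. The function name `count_filters` is hypothetical and not part of DeepVariant.

```shell
# Tally the values of the FILTER column (column 7) in a gzipped VCF.
# Header lines (starting with '#') are skipped.
# count_filters is a hypothetical helper name, not a DeepVariant tool.
count_filters() {
  gzip -dc "$1" \
    | awk -F'\t' '!/^#/ { n[$7]++ } END { for (f in n) print f "\t" n[f] }' \
    | sort
}
```

Calling `count_filters /mnt/disk2/SRA3/ERR10502884.output180.vcf.gz` would print one line per FILTER value with its count, which makes the PASS versus RefCall split visible in a single pass.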

For comparison, here is the command for DeepVariant 1.6.1:

time sudo docker run  -m 480G \
  -v "/mnt/disk1/ref/bed":"/input" \
  -v "/mnt/disk2/SRA3":"/output" \
  -v "/mnt/disk1/ref":"/reference" \
  google/deepvariant:"1.6.1" \
  /opt/deepvariant/bin/run_deepvariant \
  --model_type WGS \
  --ref /reference/Homo_sapiens_assembly38.fasta \
  --reads /output/ERR10502884.markup.bam \
  --output_vcf /output/ERR10502884.output161.vcf.gz \
  --num_shards 120 \
  --intermediate_results_dir /output/tmp3 \
  --make_examples_extra_args variant_caller="vcf_candidate_importer",proposed_variants="/output/output1.vcf.gz"

zcat /mnt/disk2/SRA3/ERR10502884.output161.vcf.gz  |grep PASS|wc -l

The ERR10502884.output161.vcf.gz file has more than 3,000,000 lines with a PASS flag.

@kishwarshafin
Collaborator

@mulderdt can you please try your 1.8.0 command with --disable_small_model=true?

@chaoxinzhang
Author

> @mulderdt can you please try your 1.8.0 command with --disable_small_model=true?

I'm glad to see your reply.

Yes, I did.

With --disable_small_model=true added to the 1.8.0 command, the output file has more than 3,000,000 lines with a PASS flag.

What else do I need to try?

@danielecook
Collaborator

@chaoxinzhang can you remove the --intermediate_results_dir /output/tmp3 flag when running? This directory should not be reused between commands.
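If keeping the intermediate files is still useful, one way to avoid reuse is to create a fresh directory for every run. The sketch below assumes a host directory that is mounted into the container; the helper name `new_intermediate_dir` and the `dv_intermediate` prefix are hypothetical.

```shell
# Create a new, uniquely named directory under the given base directory
# and print its path. Each call yields a different directory, so no two
# runs share intermediate files.
# new_intermediate_dir is a hypothetical helper name.
new_intermediate_dir() {
  mkdir -p "$1" && mktemp -d "$1/dv_intermediate.XXXXXX"
}
```

For example, `new_intermediate_dir /mnt/disk2/SRA3` would create a directory on the host; since /mnt/disk2/SRA3 is mounted as /output in the commands above, the matching /output/... path is what would be passed to --intermediate_results_dir.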
