2025-01-08 12:50:15.900768: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2025-01-08 12:50:16.282094: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-01-08 12:50:18.366498: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64:/.singularity.d/libs
2025-01-08 12:50:18.369116: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64:/.singularity.d/libs
2025-01-08 12:50:18.369250: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
I0108 12:50:21.480433 139804324811904 run_deepvariant.py:649] Re-using the directory for intermediate results in /data1/shahs3/users/chois7/tickets/gpu-deepvariant/quickstart-output/intermediate_results_dir
I0108 12:50:21.481722 139804324811904 run_deepvariant.py:847] env = {'SHELL': '/bin/bash', 'NV_LIBCUBLAS_VERSION': '11.11.3.6-1', 'NVIDIA_VISIBLE_DEVICES': 'all', 'NV_NVML_DEV_VERSION': '11.8.86-1', 'NV_CUDNN_PACKAGE_NAME': 'libcudnn8', 'SLURM_JOB_USER': 'chois7', 'SLURM_TASKS_PER_NODE': '1', 'SLURM_JOB_UID': '164064212', 'HISTCONTROL': 'ignoredups', 'NV_LIBNCCL_DEV_PACKAGE': 'libnccl-dev=2.15.5-1+cuda11.8', 'SLURM_TASK_PID': '3398772', 'CONDA_EXE': '/home/chois7/miniforge3/bin/conda', '_CE_M': '', 'NV_LIBNCCL_DEV_PACKAGE_VERSION': '2.15.5-1', 'PKG_CONFIG_PATH': '/home/chois7/packages/lib/pkgconfig:/home/chois7/packages/lib/pkgconfig:/home/chois7/packages/lib/pkgconfig:/home/chois7/packages/lib/pkgconfig:', 'SLURM_JOB_GPUS': '3', 'SLURM_LOCALID': '0', 'SLURM_SUBMIT_DIR': '/data1/shahs3/users/chois7/tickets/gpu-deepvariant', 'ONCOKB_API_KEY': 'c8b739f9-30bb-47d7-8dba-58eda2fcd3a2', 'HISTSIZE': '1000', 'HOSTNAME': 'iscf031', 'PYTHON_VERSION': '3.10', 'LANGUAGE': 'en', 'SINGULARITY_NAME': 'deepvariant-1.8.0-gpu.sif', 'SLURMD_NODENAME': 'iscf031', 'SLURM_JOB_START_TIME': '1736358615', 'TERMCAP': 'SC|xterm-256color|VT 100/ANSI X3.64 virtual 
terminal:\\\n\t:DO=\\E[%dB:LE=\\E[%dD:RI=\\E[%dC:UP=\\E[%dA:bs:bt=\\E[Z:\\\n\t:cd=\\E[J:ce=\\E[K:cl=\\E[H\\E[J:cm=\\E[%i%d;%dH:ct=\\E[3g:\\\n\t:do=^J:nd=\\E[C:pt:rc=\\E8:rs=\\Ec:sc=\\E7:st=\\EH:up=\\EM:\\\n\t:le=^H:bl=^G:cr=^M:it#8:ho=\\E[H:nw=\\EE:ta=^I:is=\\E)0:\\\n\t:li#51:co#209:am:xn:xv:LP:sr=\\EM:al=\\E[L:AL=\\E[%dL:\\\n\t:cs=\\E[%i%d;%dr:dl=\\E[M:DL=\\E[%dM:dc=\\E[P:DC=\\E[%dP:\\\n\t:im=\\E[4h:ei=\\E[4l:mi:IC=\\E[%d@:ks=\\E[?1h\\E=:\\\n\t:ke=\\E[?1l\\E>:vi=\\E[?25l:ve=\\E[34h\\E[?25h:vs=\\E[34l:\\\n\t:ti=\\E[?1049h:te=\\E[?1049l:us=\\E[4m:ue=\\E[24m:so=\\E[3m:\\\n\t:se=\\E[23m:mb=\\E[5m:md=\\E[1m:mh=\\E[2m:mr=\\E[7m:\\\n\t:me=\\E[m:ms:\\\n\t:Co#8:pa#64:AF=\\E[3%dm:AB=\\E[4%dm:op=\\E[39;49m:AX:\\\n\t:vb=\\Eg:G0:as=\\E(0:ae=\\E(B:\\\n\t:ac=\\140\\140aaffggjjkkllmmnnooppqqrrssttuuvvwwxxyyzz{{||}}~~..--++,,hhII00:\\\n\t:po=\\E[5i:pf=\\E[4i:Km=\\E[M:k0=\\E[10~:k1=\\EOP:k2=\\EOQ:\\\n\t:k3=\\EOR:k4=\\EOS:k5=\\E[15~:k6=\\E[17~:k7=\\E[18~:\\\n\t:k8=\\E[19~:k9=\\E[20~:k;=\\E[21~:F1=\\E[23~:F2=\\E[24~:\\\n\t:F3=\\E[1;2P:F4=\\E[1;2Q:F5=\\E[1;2R:F6=\\E[1;2S:\\\n\t:F7=\\E[15;2~:F8=\\E[17;2~:F9=\\E[18;2~:FA=\\E[19;2~:\\\n\t:FB=\\E[20;2~:FC=\\E[21;2~:FD=\\E[23;2~:FE=\\E[24;2~:kb=\x7f:\\\n\t:K2=\\EOE:kB=\\E[Z:kF=\\E[1;2B:kR=\\E[1;2A:*4=\\E[3;2~:\\\n\t:*7=\\E[1;2F:#2=\\E[1;2H:#3=\\E[2;2~:#4=\\E[1;2D:%c=\\E[6;2~:\\\n\t:%e=\\E[5;2~:%i=\\E[1;2C:kh=\\E[1~:@1=\\E[1~:kH=\\E[4~:\\\n\t:@7=\\E[4~:kN=\\E[6~:kP=\\E[5~:kI=\\E[2~:kD=\\E[3~:ku=\\EOA:\\\n\t:kd=\\EOB:kr=\\EOC:kl=\\EOD:km:', 'HYDRA_LAUNCHER_EXTRA_ARGS': '--external-launcher', 'JAVA_HOME': '/home/chois7/miniforge3/envs/py11/lib/jvm', 'NVIDIA_REQUIRE_CUDA': 'cuda>=11.8 brand=tesla,driver>=470,driver<471 brand=unknown,driver>=470,driver<471 brand=nvidia,driver>=470,driver<471 brand=nvidiartx,driver>=470,driver<471 brand=geforce,driver>=470,driver<471 brand=geforcertx,driver>=470,driver<471 brand=quadro,driver>=470,driver<471 brand=quadrortx,driver>=470,driver<471 brand=titan,driver>=470,driver<471 
brand=titanrtx,driver>=470,driver<471', 'NV_LIBCUBLAS_DEV_PACKAGE': 'libcublas-dev-11-8=11.11.3.6-1', 'NV_NVTX_VERSION': '11.8.86-1', 'USER_PRINCIPAL_NAME': '[email protected]', 'JAVA_LD_LIBRARY_PATH': '/home/chois7/miniforge3/envs/py11/lib/jvm/lib/server', 'WINDOW': '10', 'NV_CUDA_CUDART_DEV_VERSION': '11.8.89-1', 'NV_LIBCUSPARSE_VERSION': '11.7.5.86-1', 'SLURM_CLUSTER_NAME': 'iris', 'SLURM_JOB_END_TIME': '1736365815', 'NV_LIBNPP_VERSION': '11.8.0.86-1', 'SLURM_CPUS_ON_NODE': '1', 'DV_GPU_BUILD': '1', 'SINGULARITY_ENVIRONMENT': '/.singularity.d/env/91-environment.sh', 'NCCL_VERSION': '2.15.5-1', 'SLURM_JOB_CPUS_PER_NODE': '1', 'XML_CATALOG_FILES': 'file:///home/chois7/miniforge3/envs/py11/etc/xml/catalog file:///etc/xml/catalog', 'LMOD_DIR': '/usr/share/lmod/lmod/libexec', 'EDITOR': 'vim', 'TF_FORCE_GPU_ALLOW_GROWTH': 'true', 'SLURM_GPUS_ON_NODE': '1', 'KRB5CCNAME': 'FILE:/tmp/krb5cc_164064212', 'PRTE_MCA_plm_slurm_args': '--external-launcher', 'PWD': '/data1/shahs3/users/chois7/tickets/gpu-deepvariant', 'SLURM_GTIDS': '0', 'ISABL_CLIENT_ID': '3', 'GSETTINGS_SCHEMA_DIR': '/home/chois7/miniforge3/envs/py11/share/glib-2.0/schemas', 'DA_SESSION_ID_AUTH': '42ca2176-265a-204c-b0bf-68914c952524', 'LOGNAME': 'chois7', 'CONDA_PREFIX': '/home/chois7/miniforge3/envs/py11', 'NV_CUDNN_PACKAGE': 'libcudnn8=8.9.6.50-1+cuda11.8', 'SLURM_JOB_PARTITION': 'componc_gpu', 'MODULESHOME': '/usr/share/lmod/lmod', 'NVIDIA_DRIVER_CAPABILITIES': 'compute,utility', 'ISABL_API_URL': 'https://isabl.shahlab.mskcc.org/api/v1/', 'MANPATH': '/usr/share/lmod/lmod/share/man:', 'NXF_SINGULARITY_CACHEDIR': '/data1/shahs3/users/chois7/tmp/.cache', 'NV_NVPROF_DEV_PACKAGE': 'cuda-nvprof-11-8=11.8.87-1', 'NV_LIBNPP_PACKAGE': 'libnpp-11-8=11.8.0.86-1', 'TF_ENABLE_ONEDNN_OPTS': '1', 'GSETTINGS_SCHEMA_DIR_CONDA_BACKUP': '', 'NV_LIBNCCL_DEV_PACKAGE_NAME': 'libnccl-dev', 'SLURM_JOB_NUM_NODES': '1', 'SCREENDIR': '/home/chois7/.screen', 'SLURM_JOBID': '10976583', 'NV_LIBCUBLAS_DEV_VERSION': '11.11.3.6-1', 
'NVIDIA_PRODUCT_NAME': 'CUDA', 'I_MPI_HYDRA_BOOTSTRAP_EXEC_EXTRA_ARGS': '--external-launcher', 'SLURM_JOB_QOS': 'normal', 'USER_PATH': '/home/chois7/packages/bin:/data1/shahs3/users/chois7/packages/annovar:/data1/shahs3/users/chois7/packages/node-v20.11.1-linux-x64/bin:/home/chois7/miniforge3/envs/py11/bin:/home/chois7/miniforge3/bin:/home/chois7/.cargo/bin:/home/chois7/packages/bin:/data1/shahs3/users/chois7/packages/annovar:/data1/shahs3/users/chois7/packages/node-v20.11.1-linux-x64/bin:/home/chois7/miniforge3/bin:/home/chois7/packages/bin:/data1/shahs3/users/chois7/packages/annovar:/data1/shahs3/users/chois7/packages/node-v20.11.1-linux-x64/bin:/home/chois7/data1/envs/apps/bin:/home/chois7/miniforge3/bin:/home/chois7/miniforge3/condabin:/data1/shahs3/users/chois7/packages/google-cloud-sdk/bin:/home/chois7/packages/bin:/data1/shahs3/users/chois7/packages/annovar:/data1/shahs3/users/chois7/packages/node-v20.11.1-linux-x64/bin:/home/chois7/miniforge3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/chois7/bin:/home/chois7/bin:/home/chois7/bin:/home/chois7/.local/bin:/home/chois7/bin:/home/chois7/.local/bin:/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin', 'NV_LIBCUBLAS_DEV_PACKAGE_NAME': 'libcublas-dev-11-8', 'CAPSULE_LOG': 'none', 'NV_CUDA_CUDART_VERSION': '11.8.89-1', 'HOME': '/home/chois7', 'LANG': 'C', 'GITHUB_TOKEN': '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', 'LS_COLORS': 'di=00;94:ow=1;34:tw=1;33:fi=0:ln=32:pi=5:so=5:bd=5:cd=5:or=31:mi=0:ex=35:*.rpm=90', 'SLURM_PROCID': '0', 'CUDA_VERSION': '11.8.0', 'SINGULARITY_CONTAINER': '/home/chois7/data1/singularity/sifs/deepvariant-1.8.0-gpu.sif', 'NV_LIBCUBLAS_PACKAGE': 'libcublas-11-8=11.11.3.6-1', 'LMOD_SETTARG_FULL_SUPPORT': 'no', 'NV_CUDA_NSIGHT_COMPUTE_DEV_PACKAGE': 'cuda-nsight-compute-11-8=11.8.0-1', 'CONDA_PROMPT_MODIFIER': '(py11) ', 'TMPDIR': '/data1/shahs3/users/chois7/tmp/.cache', 'PROMPT_COMMAND': 'PS1="Singularity> "; unset PROMPT_COMMAND', 'SLURM_TOPOLOGY_ADDR': 'iscf031', 
'LMOD_VERSION': '8.7.32', 'SSH_CONNECTION': '10.18.25.140 53770 10.247.112.114 22', 'NV_LIBNPP_DEV_PACKAGE': 'libnpp-dev-11-8=11.8.0.86-1', 'PIP_CACHE_DIR': '/data1/shahs3/users/chois7/tmp/.cache', 'NV_LIBCUBLAS_PACKAGE_NAME': 'libcublas-11-8', 'XDG_CACHE_HOME': '/data1/shahs3/users/chois7/tmp/.cache', 'HYDRA_BOOTSTRAP': 'slurm', 'NV_LIBNPP_DEV_VERSION': '11.8.0.86-1', 'MODULEPATH_ROOT': '/usr/share/modulefiles', 'CUDA_VISIBLE_DEVICES': '0', 'JAVA_LD_LIBRARY_PATH_BACKUP': '/home/chois7/miniforge3/envs/py11/lib/jvm/lib/server', 'SLURM_TOPOLOGY_ADDR_PATTERN': 'node', 'LMOD_PKG': '/usr/share/lmod/lmod', 'SLURM_MEM_PER_NODE': '24576', 'TERM': 'xterm-256color', 'NV_LIBCUSPARSE_DEV_VERSION': '11.7.5.86-1', '_CE_CONDA': '', 'LESSOPEN': '||/usr/bin/lesspipe.sh %s', 'USER': 'chois7', 'CDC_PREW2KHOST': 'islogin01', 'CDC_JOINED_SITE': 'SDC', 'LIBRARY_PATH': '/usr/local/cuda/lib64/stubs', 'NV_CUDNN_VERSION': '8.9.6.50', 'SLURM_NODELIST': 'iscf031', 'CDC_JOINED_ZONE': 'CN=IRIS,CN=SDC_Zone,CN=MSK_Digits_HPC_Zone,CN=Zones,OU=Centrify,OU=HPC,OU=Resources,DC=MSKCC,DC=ROOT,DC=MSKCC,DC=ORG', 'ENVIRONMENT': 'BATCH', 'CONDA_SHLVL': '6', 'SLURM_JOB_ACCOUNT': 'shahs3', 'SLURM_PRIO_PROCESS': '0', 'LMOD_ROOT': '/usr/share/lmod', 'SHLVL': '4', 'SLURM_NNODES': '1', 'BASH_ENV': '/usr/share/lmod/lmod/init/bash', 'CDC_JOINED_DOMAIN': 'mskcc.root.mskcc.org', 'NV_CUDA_LIB_VERSION': '11.8.0-1', 'NVARCH': 'x86_64', 'LMOD_sys': 'Linux', 'LC_MESSAGES': 'C', 'SINGULARITY_BIND': '/usr/lib/locale/:/usr/lib/locale/,/data1,/home', 'NV_CUDNN_PACKAGE_DEV': 'libcudnn8-dev=8.9.6.50-1+cuda11.8', 'SLURM_SUBMIT_HOST': 'islogin01.mskcc.org', 'NV_CUDA_COMPAT_PACKAGE': 'cuda-compat-11-8', 'CONDA_PYTHON_EXE': '/home/chois7/miniforge3/bin/python', 'NV_LIBNCCL_PACKAGE': 'libnccl2=2.15.5-1+cuda11.8', 'LD_LIBRARY_PATH': '/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/.singularity.d/libs', 'LC_CTYPE': 'en_US.utf8', 'SLURM_JOB_ID': '10976583', 'SLURM_NODEID': '0', 'SSH_CLIENT': '10.18.25.140 53770 22', 'CDC_JOINED_DC': 
'vsstgpmaddns1.mskcc.root.mskcc.org', 'CONDA_DEFAULT_ENV': 'py11', 'NV_CUDA_NSIGHT_COMPUTE_VERSION': '11.8.0-1', 'JAVA_HOME_CONDA_BACKUP': '/home/chois7/miniforge3/envs/py11/lib/jvm', 'which_declare': 'declare -f', 'NV_NVPROF_VERSION': '11.8.87-1', 'SLURM_CONF': '/etc/slurm/slurm.conf', 'PATH': '/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/conda/bin:/opt/conda/envs/bio/bin:/opt/deepvariant/bin', 'STY': '847481.iris', 'SLURM_JOB_NAME': 'test', 'MODULEPATH': '/etc/modulefiles:/usr/share/modulefiles:/admin/software/lmod/modulefiles', 'NV_LIBNCCL_PACKAGE_NAME': 'libnccl2', 'VERSION': '1.8.0', 'NV_LIBNCCL_PACKAGE_VERSION': '2.15.5-1', 'LMOD_CMD': '/usr/share/lmod/lmod/libexec/lmod', 'MAIL': '/var/spool/mail/chois7', 'SSH_TTY': '/dev/pts/51', 'CDC_LOCALHOST': 'islogin01.mskcc.org', 'CONDA_PREFIX_1': '/home/chois7/miniforge3', 'CONDA_PREFIX_2': '/home/chois7/miniforge3/envs/py11', 'CONDA_PREFIX_3': '/home/chois7/miniforge3', 'CONDA_PREFIX_4': '/home/chois7/miniforge3/envs/py11', 'OMPI_MCA_plm_slurm_args': '--external-launcher', 'SINGULARITY_COMMAND': 'run', 'CONDA_PREFIX_5': '/home/chois7/miniforge3', 'SLURM_JOB_GID': '164064212', 'OLDPWD': '/home/chois7/data1/projects/signatures-ont/deepsomatic', 'SLURM_JOB_NODELIST': 'iscf031', 'I_MPI_HYDRA_BOOTSTRAP': 'slurm', 'BASH_FUNC_ml%%': '() { eval "$($LMOD_DIR/ml_cmd "$@")"\n}', 'BASH_FUNC_which%%': '() { ( alias;\n eval ${which_declare} ) | /usr/bin/which --tty-only --read-alias --read-functions --show-tilde --show-dot $@\n}', 'BASH_FUNC_module%%': '() { if [ -z "${LMOD_SH_DBG_ON+x}" ]; then\n case "$-" in \n *v*x*)\n __lmod_sh_dbg=\'vx\'\n ;;\n *v*)\n __lmod_sh_dbg=\'v\'\n ;;\n *x*)\n __lmod_sh_dbg=\'x\'\n ;;\n esac;\n fi;\n if [ -n "${__lmod_sh_dbg:-}" ]; then\n set +$__lmod_sh_dbg;\n echo "Shell debugging temporarily silenced: export LMOD_SH_DBG_ON=1 for Lmod\'s output" 1>&2;\n fi;\n eval "$($LMOD_CMD shell "$@")" && eval "$(${LMOD_SETTARG_CMD:-:} -s sh)";\n 
__lmod_my_status=$?;\n if [ -n "${__lmod_sh_dbg:-}" ]; then\n echo "Shell debugging restarted" 1>&2;\n set -$__lmod_sh_dbg;\n fi;\n unset __lmod_sh_dbg;\n return $__lmod_my_status\n}', '_': '/usr/bin/python3', 'TPU_ML_PLATFORM': 'Tensorflow', 'TF2_BEHAVIOR': '1', 'TF_CPP_MIN_LOG_LEVEL': '1'}
The following error messages appear in stdout:
2025-01-08 13:06:30.723603: E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:267] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2025-01-08 13:06:35.873709: E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:267] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2025-01-08 13:06:47.534398: E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:1278] could not retrieve CUDA device count: CUDA_ERROR_NOT_INITIALIZED: initialization error
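For triage, the CUDA failures can be pulled out of a long stdout capture programmatically. The following is a minimal sketch, not part of DeepVariant: the helper names `find_cuda_errors` and `count_error_codes` are hypothetical, and the regular expression assumes the TensorFlow log line layout seen in the three lines above (timestamp, `E` severity, source file, then a `CUDA_ERROR_*` code).

```python
import re
from collections import Counter

# Assumed layout, taken from the log lines quoted in this issue:
#   <date> <time>: E <source file:line>] <context>: CUDA_ERROR_<NAME>: <detail>
CUDA_ERROR_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): E "
    r"(?P<source>\S+)\] (?P<context>.*?): "
    r"(?P<code>CUDA_ERROR_[A-Z_]+): (?P<detail>.*)$"
)


def find_cuda_errors(log_text):
    """Return one dict per error-level log line that carries a CUDA_ERROR_* code."""
    errors = []
    for line in log_text.splitlines():
        match = CUDA_ERROR_RE.match(line.strip())
        if match:
            errors.append(match.groupdict())
    return errors


def count_error_codes(errors):
    """Count how often each CUDA error code occurs."""
    return Counter(error["code"] for error in errors)


# The three error lines reported in this issue.
SAMPLE_LOG = """\
2025-01-08 13:06:30.723603: E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:267] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2025-01-08 13:06:35.873709: E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:267] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2025-01-08 13:06:47.534398: E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:1278] could not retrieve CUDA device count: CUDA_ERROR_NOT_INITIALIZED: initialization error
"""

if __name__ == "__main__":
    for code, count in count_error_codes(find_cuda_errors(SAMPLE_LOG)).items():
        print(f"{code}: {count}")
```

Informational (`I`) and warning (`W`) lines, such as the TensorRT library warnings earlier in the log, are deliberately ignored, so only the failures that prevent GPU use are reported.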
Have you checked the FAQ? https://github.com/google/deepvariant/blob/r1.8/docs/FAQ.md:
Describe the issue:
The submitted job does not use the GPU; it runs on CPUs only.
Setup
Steps to reproduce:
Does the quick start test work on your system?
Please test with https://github.com/google/deepvariant/blob/r0.10/docs/deepvariant-quick-start.md.
Is there any way to reproduce the issue by using the quick start?
nvidia-smi
Any additional context: