I am working on a server where I do not have sudo rights, but I can install packages with conda. I need tensorflow>=2.6, but the version currently available through conda is 2.4.1, as listed here.
So I used
pip install tensorflow
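to install TensorFlow. My problem now is that CUDA is not installed, so I run into a lot of problems when running my code (the complete error message is at the end).
So I tried installing
conda install -c anaconda cudatoolkit
and
conda install -c anaconda cudnn
but it still does not work.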
conda list
gives me:
# Name          Version    Build         Channel
cudatoolkit     10.2.89    hfd86e86_1    anaconda
cudnn           7.6.5      cuda10.2_0    anaconda
libgcc-ng       9.1.0      hdf63c60_0    anaconda
libstdcxx-ng    9.1.0      hdf63c60_0    anaconda
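Again, I am very limited in what I can install, and the company IT department would take about a month to install all of this for me, so I would like to see whether I can do it myself.
Is it possible to solve this within conda?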
2021-12-03 13:58:02.834987: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/tools/intel/studio/2017/itac/2017.3.030/mic/slib:/opt/tools/intel/studio/2017/itac/2017.3.030/intel64/slib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/mic/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/ipp/lib/intel64:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/tbb/lib/intel64/gcc4.4:/opt/tools/intel/studio/2017/debugger_2017/iga/lib:/opt/tools/intel/studio/2017/debugger_2017/libipt/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/daal/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/daal/../tbb/lib/intel64_lin/gcc4.4:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/mic/lib
2021-12-03 13:58:02.835238: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2021-12-03 13:58:22.648587: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/tools/intel/studio/2017/itac/2017.3.030/mic/slib:/opt/tools/intel/studio/2017/itac/2017.3.030/intel64/slib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/mic/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/ipp/lib/intel64:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/tbb/lib/intel64/gcc4.4:/opt/tools/intel/studio/2017/debugger_2017/iga/lib:/opt/tools/intel/studio/2017/debugger_2017/libipt/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/daal/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/daal/../tbb/lib/intel64_lin/gcc4.4:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/mic/lib
2021-12-03 13:58:22.649475: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcublas.so.11'; dlerror: libcublas.so.11: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/tools/intel/studio/2017/itac/2017.3.030/mic/slib:/opt/tools/intel/studio/2017/itac/2017.3.030/intel64/slib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/mic/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/ipp/lib/intel64:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/tbb/lib/intel64/gcc4.4:/opt/tools/intel/studio/2017/debugger_2017/iga/lib:/opt/tools/intel/studio/2017/debugger_2017/libipt/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/daal/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/daal/../tbb/lib/intel64_lin/gcc4.4:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/mic/lib
2021-12-03 13:58:22.650392: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcublasLt.so.11'; dlerror: libcublasLt.so.11: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/tools/intel/studio/2017/itac/2017.3.030/mic/slib:/opt/tools/intel/studio/2017/itac/2017.3.030/intel64/slib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/mic/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/ipp/lib/intel64:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/tbb/lib/intel64/gcc4.4:/opt/tools/intel/studio/2017/debugger_2017/iga/lib:/opt/tools/intel/studio/2017/debugger_2017/libipt/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/daal/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/daal/../tbb/lib/intel64_lin/gcc4.4:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/mic/lib
2021-12-03 13:58:22.651072: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcufft.so.10'; dlerror: libcufft.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/tools/intel/studio/2017/itac/2017.3.030/mic/slib:/opt/tools/intel/studio/2017/itac/2017.3.030/intel64/slib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/mic/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/ipp/lib/intel64:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/tbb/lib/intel64/gcc4.4:/opt/tools/intel/studio/2017/debugger_2017/iga/lib:/opt/tools/intel/studio/2017/debugger_2017/libipt/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/daal/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/daal/../tbb/lib/intel64_lin/gcc4.4:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/mic/lib
2021-12-03 13:58:22.651783: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcurand.so.10'; dlerror: libcurand.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/tools/intel/studio/2017/itac/2017.3.030/mic/slib:/opt/tools/intel/studio/2017/itac/2017.3.030/intel64/slib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/mic/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/ipp/lib/intel64:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/tbb/lib/intel64/gcc4.4:/opt/tools/intel/studio/2017/debugger_2017/iga/lib:/opt/tools/intel/studio/2017/debugger_2017/libipt/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/daal/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/daal/../tbb/lib/intel64_lin/gcc4.4:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/mic/lib
2021-12-03 13:58:22.652500: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusolver.so.11'; dlerror: libcusolver.so.11: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/tools/intel/studio/2017/itac/2017.3.030/mic/slib:/opt/tools/intel/studio/2017/itac/2017.3.030/intel64/slib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/mic/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/ipp/lib/intel64:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/tbb/lib/intel64/gcc4.4:/opt/tools/intel/studio/2017/debugger_2017/iga/lib:/opt/tools/intel/studio/2017/debugger_2017/libipt/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/daal/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/daal/../tbb/lib/intel64_lin/gcc4.4:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/mic/lib
2021-12-03 13:58:22.653205: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusparse.so.11'; dlerror: libcusparse.so.11: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/tools/intel/studio/2017/itac/2017.3.030/mic/slib:/opt/tools/intel/studio/2017/itac/2017.3.030/intel64/slib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/mic/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/ipp/lib/intel64:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/tbb/lib/intel64/gcc4.4:/opt/tools/intel/studio/2017/debugger_2017/iga/lib:/opt/tools/intel/studio/2017/debugger_2017/libipt/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/daal/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/daal/../tbb/lib/intel64_lin/gcc4.4:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/mic/lib
2021-12-03 13:58:22.653823: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/tools/intel/studio/2017/itac/2017.3.030/mic/slib:/opt/tools/intel/studio/2017/itac/2017.3.030/intel64/slib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/mic/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/ipp/lib/intel64:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/tbb/lib/intel64/gcc4.4:/opt/tools/intel/studio/2017/debugger_2017/iga/lib:/opt/tools/intel/studio/2017/debugger_2017/libipt/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/daal/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/daal/../tbb/lib/intel64_lin/gcc4.4:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/mic/lib
2021-12-03 13:58:22.653891: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1835] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2021-12-03 13:58:22.654403: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-12-03 13:58:31.483952: I tensorflow/core/profiler/lib/profiler_session.cc:131] Profiler session initializing.
2021-12-03 13:58:31.484056: I tensorflow/core/profiler/lib/profiler_session.cc:146] Profiler session started.
2021-12-03 13:58:31.484111: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1614] Profiler found 1 GPUs
2021-12-03 13:58:31.485153: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcupti.so.11.2'; dlerror: libcupti.so.11.2: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/tools/intel/studio/2017/itac/2017.3.030/mic/slib:/opt/tools/intel/studio/2017/itac/2017.3.030/intel64/slib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/mic/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/ipp/lib/intel64:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/tbb/lib/intel64/gcc4.4:/opt/tools/intel/studio/2017/debugger_2017/iga/lib:/opt/tools/intel/studio/2017/debugger_2017/libipt/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/daal/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/daal/../tbb/lib/intel64_lin/gcc4.4:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/mic/lib
2021-12-03 13:58:31.485943: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcupti.so'; dlerror: libcupti.so: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/tools/intel/studio/2017/itac/2017.3.030/mic/slib:/opt/tools/intel/studio/2017/itac/2017.3.030/intel64/slib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/mic/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/ipp/lib/intel64:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/tbb/lib/intel64/gcc4.4:/opt/tools/intel/studio/2017/debugger_2017/iga/lib:/opt/tools/intel/studio/2017/debugger_2017/libipt/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/daal/lib/intel64_lin:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/daal/../tbb/lib/intel64_lin/gcc4.4:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/intel64/lib:/opt/tools/intel/studio/2017/compilers_and_libraries_2017.4.196/linux/mpi/mic/lib
2021-12-03 13:58:31.485995: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1666] function cupti_interface_->Subscribe( &subscriber_, (CUpti_CallbackFunc)ApiCallback, this)failed with error CUPTI could not be loaded or symbol could not be found.
2021-12-03 13:58:31.486035: I tensorflow/core/profiler/lib/profiler_session.cc:164] Profiler session tear down.
2021-12-03 13:58:31.486073: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1757] function cupti_interface_->Finalize()failed with error CUPTI could not be loaded or symbol could not be found.
2021-12-03 13:58:41.242190: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
2021-12-03 13:58:53.816935: I tensorflow/core/profiler/lib/profiler_session.cc:131] Profiler session initializing.
2021-12-03 13:58:53.817059: I tensorflow/core/profiler/lib/profiler_session.cc:146] Profiler session started.
2021-12-03 13:58:53.817265: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1666] function cupti_interface_->Subscribe( &subscriber_, (CUpti_CallbackFunc)ApiCallback, this)failed with error CUPTI could not be loaded or symbol could not be found.
slurmstepd: error: Step 55223.0 exceeded memory limit (5555060 > 5263360), being killed
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
slurmstepd: error: Exceeded job memory limit
slurmstepd: error: *** STEP 55223.0 ON g201 CANCELLED AT 2021-12-03T13:58:59 ***
[mpiexec@spiro-g201-clu] control_cb (../../pm/pmiserv/pmiserv_cb.c:798): connection to proxy 0 at host g201 failed
[mpiexec@spiro-g201-clu] HYDT_dmxu_poll_wait_for_event (../../tools/demux/demux_poll.c:76): callback returned error status
[mpiexec@spiro-g201-clu] HYD_pmci_wait_for_completion (../../pm/pmiserv/pmiserv_pmci.c:501): error waiting for event
[mpiexec@spiro-g201-clu] main (../../ui/mpich/mpiexec.c:1147): process manager error waiting for completion
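Each dso_loader warning in the log above names the exact shared library TensorFlow tried to open, and those names carry the CUDA and cuDNN major versions the installed wheel was built against. A small sketch for pulling them out of a saved log follows; `requested_libraries` is an illustrative helper (not part of TensorFlow), and the sample lines are shortened copies of the log with the LD_LIBRARY_PATH tail dropped.

```python
import re

# Matches the library name inside TensorFlow's dso_loader warning,
# in the form it takes in the log above.
MISSING_LIB = re.compile(r"Could not load dynamic library '([^']+)'")


def requested_libraries(log_text):
    """Return the distinct library names TensorFlow failed to load, in first-seen order."""
    names = MISSING_LIB.findall(log_text)
    return list(dict.fromkeys(names))  # de-duplicate while keeping order


# Shortened copies of four warnings from the log (LD_LIBRARY_PATH part omitted).
sample_log = "\n".join([
    "W tensorflow/stream_executor/platform/default/dso_loader.cc:64] "
    "Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: "
    "cannot open shared object file: No such file or directory",
    "W tensorflow/stream_executor/platform/default/dso_loader.cc:64] "
    "Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: "
    "cannot open shared object file: No such file or directory",
    "W tensorflow/stream_executor/platform/default/dso_loader.cc:64] "
    "Could not load dynamic library 'libcublas.so.11'; dlerror: libcublas.so.11: "
    "cannot open shared object file: No such file or directory",
    "W tensorflow/stream_executor/platform/default/dso_loader.cc:64] "
    "Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: "
    "cannot open shared object file: No such file or directory",
])

print(requested_libraries(sample_log))
# ['libcudart.so.11.0', 'libcublas.so.11', 'libcudnn.so.8']
```

The version suffixes on these names are the part to compare against the versions reported by `conda list`.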
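For TensorFlow 2, the library names in the log (libcudart.so.11.0, libcupti.so.11.2, libcudnn.so.8) show that the pip wheel expects CUDA 11.x and cuDNN 8, not the cudatoolkit 10.2 and cuDNN 7.6.5 installed above. Create a fresh environment with matching versions:
conda create -n tensorflow2_env python=3.10 numpy scipy pandas matplotlib seaborn scikit-learn ipykernel statsmodels xgboost fastapi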
conda activate tensorflow2_env
conda install -c conda-forge cudatoolkit=11.2 cudnn=8.1.0
python -m pip install "tensorflow<2.11"
conda env config vars set PYDEVD_DISABLE_FILE_VALIDATION=1
python -m ipykernel install --user --name tensorflow2_env --display-name "Python (TensorFlow 2)"
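After these commands, it may be worth checking two things inside the activated environment before submitting a job: that the library files named in the warnings now exist under the environment's lib directory, and that TensorFlow actually lists a GPU. The following is a minimal sketch, assuming conda has set CONDA_PREFIX for the active environment. The helper names `missing_from` and `visible_gpus` and the REQUIRED list (copied from the log in the question) are illustrative, not part of any library.

```python
import importlib.util
import os
from pathlib import Path

# Library names taken from the "Could not load dynamic library" warnings in the question.
REQUIRED = [
    "libcudart.so.11.0",
    "libcublas.so.11",
    "libcublasLt.so.11",
    "libcufft.so.10",
    "libcurand.so.10",
    "libcusolver.so.11",
    "libcusparse.so.11",
    "libcudnn.so.8",
]


def missing_from(lib_dir, required=REQUIRED):
    """Return the required library names that are not present in lib_dir."""
    lib_dir = Path(lib_dir)
    return [name for name in required if not (lib_dir / name).exists()]


def visible_gpus():
    """Return the GPU device names TensorFlow reports, or None if TensorFlow is not installed."""
    if importlib.util.find_spec("tensorflow") is None:
        return None
    import tensorflow as tf
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    prefix = os.environ.get("CONDA_PREFIX")  # set by `conda activate`
    if prefix is None:
        print("CONDA_PREFIX is not set; activate the environment first")
    else:
        print("missing:", missing_from(Path(prefix) / "lib"))
    print("GPUs:", visible_gpus())
```

If the files are present but TensorFlow still prints the same warnings, the loader is probably not searching that directory: the LD_LIBRARY_PATH shown in the log contains only Intel paths. In that case, adding the environment's lib directory to LD_LIBRARY_PATH is the usual remedy.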
Here are the commands for TensorFlow 1:
conda create -n tensorflow1_env python=3.6 numpy scipy pandas matplotlib seaborn scikit-learn ipykernel statsmodels xgboost fastapi
conda activate tensorflow1_env
conda install -c conda-forge cudatoolkit=10.0 cudnn=7.3.1
python -m pip install "tensorflow<2.0"
conda env config vars set PYDEVD_DISABLE_FILE_VALIDATION=1
python -m ipykernel install --user --name tensorflow1_env --display-name "Python (TensorFlow 1)"
For PyTorch:
conda create -n pytorch_env python=3.12 numpy scipy pandas matplotlib seaborn scikit-learn ipykernel statsmodels plotly xgboost fastapi
conda activate pytorch_env
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
conda env config vars set PYDEVD_DISABLE_FILE_VALIDATION=1
python -m ipykernel install --user --name pytorch_env --display-name "Python (PyTorch)"
conda deactivate
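A similar check can be run for the PyTorch environment, from inside the activated pytorch_env (that is, before `conda deactivate`). This is a sketch only; `cuda_report` is an illustrative name, and it relies on `torch.version.cuda` and `torch.cuda.is_available()`, which report the CUDA version the wheel was built for and whether a usable GPU was found.

```python
import importlib.util


def cuda_report():
    """Return basic CUDA facts from PyTorch, or None if PyTorch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch
    return {
        "torch": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(cuda_report())
```

With the cu124 wheels installed above, `built_for_cuda` would be expected to start with 12.4. A value of False for `cuda_available` would point at the driver or the job's GPU allocation rather than at the install itself.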