Dependency Issues #4
Comments
I figured it out. I had to init the submodule https://github.com/synxlin/mini-torchpack.
$ git submodule init
$ git submodule update
I also installed the additional requirements:
$ pip install torchvision
$ pip install six
However, it is still not running. There is still some issue, and I also got a PyTorch version mix-up. I think the submodule requires a CUDA version, from what I can tell. I'm on a machine without GPUs, though, so this might be a problem later. I'll keep at it and post my updates here.
$ python train.py --devices cpu
Extension horovod.torch has not been built: /home/pepper-jk/.conda/envs/deep_comp/lib/python3.7/site-packages/horovod/torch/mpi_lib/_mpi_lib.cpython-37m-x86_64-linux-gnu.so not found
If this is not expected, reinstall Horovod with HOROVOD_WITH_PYTORCH=1 to debug the build error.
Warning! MPI libs are missing, but python applications are still avaiable.
Traceback (most recent call last):
File "/home/pepper-jk/.conda/envs/deep_comp/lib/python3.7/site-packages/horovod/torch/mpi_ops.py", line 33, in <module>
from horovod.torch import mpi_lib_v2 as mpi_lib
ImportError: cannot import name 'mpi_lib_v2' from 'horovod.torch' (/home/pepper-jk/.conda/envs/deep_comp/lib/python3.7/site-packages/horovod/torch/__init__.py)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "train.py", line 17, in <module>
from dgc.horovod.optimizer import DistributedOptimizer
File "/home/pepper-jk/code/deep-gradient-compression/dgc/horovod/__init__.py", line 2, in <module>
from dgc.horovod.optimizer import DistributedOptimizer
File "/home/pepper-jk/code/deep-gradient-compression/dgc/horovod/optimizer.py", line 24, in <module>
from horovod.torch.mpi_ops import allreduce_async_
File "/home/pepper-jk/.conda/envs/deep_comp/lib/python3.7/site-packages/horovod/torch/mpi_ops.py", line 35, in <module>
check_installed_version('pytorch', torch.__version__, e)
File "/home/pepper-jk/.conda/envs/deep_comp/lib/python3.7/site-packages/horovod/common/util.py", line 260, in check_installed_version
raise HorovodVersionMismatchError(name, version, installed_version) from exception
horovod.common.exceptions.HorovodVersionMismatchError: Framework pytorch installed with version None but found version 1.10.0+cu102.
This can result in unexpected behavior including runtime errors.
Reinstall Horovod using `pip install --no-cache-dir` to build with the new version.
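The first lines of the log say the compiled horovod.torch extension was not found, which is consistent with the later "installed with version None" message. A minimal sketch for checking this directly is given here. It assumes the extension files sit under the horovod/torch directory with "mpi_lib" in their name and a .so suffix, as the paths in the log suggest; the function name is hypothetical and not part of Horovod.

```python
import importlib.util
from pathlib import Path


def find_horovod_torch_extensions():
    """Return compiled mpi_lib files under horovod/torch.

    Returns None if the horovod package itself cannot be located.
    An empty list means Horovod is installed but no compiled
    PyTorch extension was found (assumption: files match *mpi_lib*.so).
    """
    # find_spec on a top-level name locates the package without importing it.
    spec = importlib.util.find_spec("horovod")
    if spec is None or not spec.submodule_search_locations:
        return None

    found = []
    for location in spec.submodule_search_locations:
        torch_dir = Path(location) / "torch"
        if torch_dir.is_dir():
            # Search recursively, since the log mentions both
            # horovod/torch/mpi_lib/_mpi_lib.*.so and mpi_lib_v2.
            found.extend(sorted(str(p) for p in torch_dir.rglob("*mpi_lib*.so")))
    return found


if __name__ == "__main__":
    result = find_horovod_torch_extensions()
    if result is None:
        print("horovod is not installed in this environment")
    elif not result:
        print("horovod is installed, but no compiled PyTorch extension was found")
    else:
        for path in result:
            print("found:", path)
```

If the list comes back empty, Horovod was most likely built without PyTorch support, and the rebuild suggested in the error message (HOROVOD_WITH_PYTORCH=1 together with `pip install --no-cache-dir`) would be the next thing to try.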
Hello,
I wanted to try out your code and came across an issue regarding pytorch dependencies.
I installed all the requirements in a fresh conda environment with Python 3.7.11 via your requirements.txt. I made sure the versions are at least the ones listed in the readme.
I installed openmpi via:
conda install openmpi
However, it appears there is a problem with the module torchpack.mtpack. I also tried to go back from torch==1.9.1 to torch==1.5, but that made no difference.
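To narrow down which dependency is the problem, a quick check of whether each module can be located at all may help. This is only a sketch: the helper name module_status is made up for illustration, and the module list is simply the set of names mentioned in this issue.

```python
import importlib.util


def module_status(names):
    """Map each module name to True if Python can locate it, else False."""
    status = {}
    for name in names:
        try:
            status[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # For dotted names, find_spec imports the parent package first;
            # a missing or broken parent ends up here. Other exceptions
            # raised while importing the parent are not caught.
            status[name] = False
    return status


if __name__ == "__main__":
    wanted = ["torch", "torchvision", "six", "horovod", "torchpack.mtpack"]
    for name, ok in module_status(wanted).items():
        print(f"{name}: {'found' if ok else 'NOT found'}")
```

A module reported as found can still fail at import time (for example because of a version mismatch), so this only separates "not installed" from "installed but broken".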
Hope you can help me.
Thanks in advance.
p.s. I will try this again tomorrow and update this issue if I find a solution.