- `cd` into `popcll_torch`
- Run `python setup.py install` to install.
- Run `python setup.py bdist_wheel` to build a wheel (for sharing).
- `cd` into `popcll_torch/dist`
- Run `pip install popcll_torch-1.0-cp39-cp39-linux_x86_64.whl`

Usage:

    import torch as ch
    from popcll_torch import popcll

    # 1-D long tensor on the GPU
    z = ch.tensor([0, 1, 2, 3, 4, 5, 6, 7, 8], dtype=ch.long).cuda()
    counts = popcll(z)

Currently only works with int/long 1-D tensors on CUDA.
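
To sanity-check results without a GPU, a plain-Python reference can be handy. The sketch below assumes that `popcll` returns, for each element, the number of set bits in its 64-bit representation (as the name, which matches CUDA's `__popcll` intrinsic, suggests); this README does not state that explicitly. `popcount_reference` and `MASK64` are hypothetical names used only for illustration and are not part of `popcll_torch`.

```python
# Hypothetical reference helper; NOT part of popcll_torch.
# Assumption: popcll counts the set bits of each element viewed as a 64-bit integer.

MASK64 = (1 << 64) - 1  # keeps the low 64 bits, so negatives map to two's complement


def popcount_reference(values):
    """Return the number of set bits in each integer of `values`."""
    return [bin(v & MASK64).count("1") for v in values]


expected = popcount_reference([0, 1, 2, 3, 4, 5, 6, 7, 8])
print(expected)  # [0, 1, 1, 2, 1, 2, 2, 3, 1]
```

If the assumption holds, `counts.cpu().tolist()` from the usage example should match `expected` on a machine with CUDA.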