Installation¶
Requirements¶
- Python >= 3.10
- PyTorch >= 2.4 with CUDA support
- NVIDIA GPU (Ampere or newer recommended)
- CUDA toolkit (nvcc) matching your PyTorch CUDA version
- ninja (for C++/CUDA JIT compilation)
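Before building, it can save time to sanity-check the parts of this list that don't need a GPU. The helper below is a minimal sketch (the name `check_env` is ours, not part of WarpConvNet) and only covers the Python version and the presence of `nvcc` on `PATH`; it does not verify PyTorch, CUDA/PyTorch version agreement, or the GPU itself.

```python
import shutil
import sys

def check_env() -> list[str]:
    """Return a list of detected problems (empty list means the basics look OK)."""
    problems = []
    # WarpConvNet requires Python >= 3.10.
    if sys.version_info < (3, 10):
        problems.append("Python >= 3.10 required, found "
                        f"{sys.version_info.major}.{sys.version_info.minor}")
    # The CUDA toolkit compiler must be on PATH for JIT compilation.
    if shutil.which("nvcc") is None:
        problems.append("nvcc not found on PATH (install the CUDA toolkit)")
    return problems

if __name__ == "__main__":
    for p in check_env():
        print("WARNING:", p)
```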
Install from source¶
# 1. Install PyTorch for your CUDA version (example: CUDA 12.8)
export CUDA=cu128
pip install torch torchvision --index-url https://download.pytorch.org/whl/${CUDA}
# 2. Install build dependencies
pip install build ninja
# 3. Install WarpConvNet and its dependencies
pip install cupy-cuda12x
pip install git+https://github.com/rusty1s/pytorch_scatter.git
pip install flash-attn --no-build-isolation
pip install .
CUDA version string
The cu128 version string follows PyTorch's convention: cu + major + minor
digits without a dot. For example, CUDA 12.1 → cu121, CUDA 12.8 → cu128.
Check pytorch.org for available versions.
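The mapping described above is a simple string transformation; as an illustration (the function name `cuda_tag` is ours, for this sketch only):

```python
def cuda_tag(version: str) -> str:
    """Map a CUDA version like '12.8' to PyTorch's wheel-index tag 'cu128'.

    Follows PyTorch's convention: 'cu' + major + minor digits, no dot.
    """
    major, minor = version.split(".")[:2]
    return f"cu{major}{minor}"

# cuda_tag("12.1") -> "cu121"
# cuda_tag("12.8") -> "cu128"
```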
Verify the installation¶
python -c "import warpconvnet; print('WarpConvNet installed successfully')"
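For a slightly fuller check, you can also confirm that the other packages installed above are importable before running any examples. This is a sketch using only the standard library; the listed import names (`torch_scatter` for pytorch_scatter, `flash_attn` for flash-attn, `cupy` for cupy-cuda12x) are the conventional ones and should be adjusted if your versions differ.

```python
import importlib.util

def installed(name: str) -> bool:
    """True if a module with this import name can be found."""
    return importlib.util.find_spec(name) is not None

for pkg in ("torch", "cupy", "torch_scatter", "flash_attn", "warpconvnet"):
    print(f"{pkg}: {'ok' if installed(pkg) else 'MISSING'}")
```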
Optional: model extras¶
To run the ScanNet example and other model-training scripts, install with the
models extra:
pip install ".[models]"
Compilation from source¶
For detailed compilation instructions — including GPU architecture targeting,
the SETUPTOOLS_SCM_PRETEND_VERSION workaround for detached HEAD / shallow
clones, and build_ext for development iteration — see the
Compilation Guide.
Troubleshooting¶
cuBLAS fp16 correctness issue¶
Some versions of nvidia-cublas-cu12 shipped with PyTorch produce incorrect
results for fp16/bf16 matrix multiplications. WarpConvNet prints a warning at
import time if this affects your environment. See the
Troubleshooting guide for
details and fixes.