Miscellaneous

Environment variables

WarpConvNet reads the following environment variables at import time. All are optional — defaults are chosen for typical use.

The variables are defined in warpconvnet/constants.py.
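Because the variables are read at import time, they need to be in the process environment before the package is first imported. The following is a minimal sketch of that ordering; the chosen values are only illustrative, and the import is left commented out so the snippet runs without the package installed.

```python
import os

# Set the variables BEFORE the first import of warpconvnet: the text above
# says they are read at import time, so later changes may have no effect.
os.environ["WARPCONVNET_FWD_ALGO_MODE"] = "auto"
os.environ["WARPCONVNET_BWD_ALGO_MODE"] = "implicit_gemm"

# import warpconvnet  # uncomment where the package is installed

configured = {
    name: value
    for name, value in os.environ.items()
    if name.startswith("WARPCONVNET_")
}
print(configured)
```

Setting the same variables in the shell that launches the Python process has the same effect and avoids any dependence on import order inside the program.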

Algorithm selection

| Variable | Default | Description |
| --- | --- | --- |
| WARPCONVNET_FWD_ALGO_MODE | auto | Forward convolution algorithm. auto benchmarks a reduced candidate set. all benchmarks every algorithm. Can also be a single name (e.g., implicit_gemm) or a list ([implicit_gemm,cutlass_implicit_gemm]). |
| WARPCONVNET_BWD_ALGO_MODE | auto | Backward convolution algorithm. Same format as forward. |
| WARPCONVNET_DEPTHWISE_CONV_FWD_ALGO_MODE | auto | Depthwise forward algorithm (explicit_gemm, implicit_gemm, or auto). |
| WARPCONVNET_DEPTHWISE_CONV_BWD_ALGO_MODE | auto | Depthwise backward algorithm. |
| WARPCONVNET_USE_FP16_ACCUM | false | Global default for the fp16 accumulator flag. When true, the production F16Acc tiles (40/42) enter the autotune pool and CUTLASS entries are rewritten to accumulator_type=torch.float16. Per-module use_fp16_accum= overrides this. See Accumulator Precision. |

Valid algorithm names: explicit_gemm, implicit_gemm, cutlass_implicit_gemm, cute_implicit_gemm, explicit_gemm_grouped, implicit_gemm_grouped, cutlass_grouped_hybrid, cute_grouped, production, auto, all, trimmed. Unknown names raise ValueError when passed via fwd_algo/dgrad_algo/wgrad_algo.
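To catch typos before a long run, a value can be checked against the names listed above. The sketch below is not the library's own parser: `parse_algo_mode` and `VALID_ALGOS` are hypothetical names introduced here, and the accepted syntax (a single name, or a bracketed comma-separated list) is assumed from the examples in the table.

```python
# Names copied from the list above; this set is a local assumption, not an
# object exported by warpconvnet.
VALID_ALGOS = frozenset({
    "explicit_gemm", "implicit_gemm", "cutlass_implicit_gemm",
    "cute_implicit_gemm", "explicit_gemm_grouped", "implicit_gemm_grouped",
    "cutlass_grouped_hybrid", "cute_grouped", "production",
    "auto", "all", "trimmed",
})


def parse_algo_mode(raw):
    """Hypothetical helper: split an algo-mode string and validate each name."""
    text = raw.strip()
    if text.startswith("[") and text.endswith("]"):
        # List form, e.g. "[implicit_gemm,cutlass_implicit_gemm]"
        names = [part.strip() for part in text[1:-1].split(",") if part.strip()]
    else:
        # Single-name form, e.g. "implicit_gemm"
        names = [text]
    unknown = [name for name in names if name not in VALID_ALGOS]
    if not names or unknown:
        raise ValueError(f"invalid algorithm mode {raw!r}; unknown names: {unknown}")
    return names


print(parse_algo_mode("[implicit_gemm, cutlass_implicit_gemm]"))
```

Raising ValueError here mirrors the behavior described for fwd_algo/dgrad_algo/wgrad_algo; whether the environment variables are validated the same way is not stated in this section.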

Benchmark cache

| Variable | Default | Description |
| --- | --- | --- |
| WARPCONVNET_BENCHMARK_CACHE_DIR | ~/.cache/warpconvnet | Directory for the auto-tuning benchmark cache |
| WARPCONVNET_BENCHMARK_CACHE_DIR_OVERRIDE | (unset) | If set, overrides the default cache directory |
| WARPCONVNET_AUTOTUNE_LOG | true | Set to false or 0 to suppress auto-tuning log messages |
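As an illustration, a job that should keep its benchmark cache separate and stay quiet during auto-tuning might configure the two settings as follows. This is a sketch under assumptions: the directory name is made up, creating it up front is a precaution rather than a documented requirement, and how WARPCONVNET_BENCHMARK_CACHE_DIR interacts with the _OVERRIDE variable is not specified here.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical per-job cache location; any writable directory would do.
cache_dir = Path(tempfile.gettempdir()) / "warpconvnet_cache_demo"
cache_dir.mkdir(parents=True, exist_ok=True)

# Both variables must be set before warpconvnet is imported.
os.environ["WARPCONVNET_BENCHMARK_CACHE_DIR"] = str(cache_dir)
os.environ["WARPCONVNET_AUTOTUNE_LOG"] = "0"  # "false" is also accepted per the table

print(os.environ["WARPCONVNET_BENCHMARK_CACHE_DIR"])
```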

Other

| Variable | Default | Description |
| --- | --- | --- |
| WARPCONVNET_SKIP_SYMMETRIC_KERNEL_MAP | false | Skip symmetric kernel map optimization |
| WARPCONVNET_SKIP_EXTENSION | 0 | Set to 1 to skip loading the C++ extension (for docs builds, etc.) |
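For a docs build or similar tooling step, one option is to set WARPCONVNET_SKIP_EXTENSION only in the environment of the child process, leaving the parent untouched. The sketch below assumes nothing about the actual build command: `env_without_extension` is a hypothetical helper, and the child process is a stand-in that just echoes the variable.

```python
import os
import subprocess
import sys


def env_without_extension(base=None):
    """Hypothetical helper: copy an environment and request skipping the C++ extension."""
    env = dict(os.environ if base is None else base)
    env["WARPCONVNET_SKIP_EXTENSION"] = "1"
    return env


# Stand-in for a real docs build command; it only prints what the child sees.
result = subprocess.run(
    [sys.executable, "-c",
     "import os; print(os.environ['WARPCONVNET_SKIP_EXTENSION'])"],
    env=env_without_extension(),
    capture_output=True,
    text=True,
    check=True,
)
child_value = result.stdout.strip()
print(child_value)
```

Passing the modified copy through `env=` keeps the setting scoped to the one command, so training or test processes started from the same shell still load the extension.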