Beyond Native PyTorch: The Power of C++/CUDA Integration

An efficient CUDA-based implementation of trilinear interpolation