Slackware: This forum is for the discussion of Slackware Linux.
I have packaged a library-only (non-Python) build of PyTorch (`-DBUILD_PYTHON=OFF` passed to `cmake`) for Slackware (using this SlackBuild), but now I'd like it to build the Python bindings, too.
It seems `-DBUILD_PYTHON=ON` would have created all the necessary files, but when I run `python3 setup.py install --root=$PKG`, it complains that I'm not in a `git` repo; I downloaded `pytorch-v2.3.0.tar.gz` to make the library-only package.
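If it is only the version detection that trips over the missing `.git` directory, PyTorch's `setup.py` reads the `PYTORCH_BUILD_VERSION` and `PYTORCH_BUILD_NUMBER` environment variables to set the version without consulting git. A hedged sketch (verify against the `setup.py` shipped in your tarball before relying on it):

```shell
# Sketch: override git-based version detection when building from a
# release tarball (no .git directory present).  Both variables must be
# set together for setup.py to use them instead of `git describe`.
export PYTORCH_BUILD_VERSION=2.3.0
export PYTORCH_BUILD_NUMBER=1
python3 setup.py install --root=$PKG
```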
So basically I need to use Ninja, run `setup.py build` with the `--cmake-only` flag, and manually move the Python site-packages files out of the tmpxxx directory.
In my case, I need
Code:
-DUSE_CUDA=ON \
-DUSE_CUDNN=ON \
and many other CMake flags and environment variables to point at my somewhat non-standard install locations for CUDA and cuDNN, and to force GCC 13.2 (because 14.1 isn't compatible with my CUDA version), but I can't find any binaries for 13.2 anymore (Slackware 15.0 ships GCC 11, and its glibc is too old for my CUDA version…)… Maybe I should just try a CPU-only build for now…
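The two-stage flow described above can be sketched roughly like this (paths and flags are illustrative, not exact):

```shell
# Rough sketch of the --cmake-only workflow (illustrative paths/flags).
cd /tmp/SBo/pytorch-v2.3.0

# 1. Let setup.py generate the CMake build tree, but stop before compiling.
python3 setup.py build --cmake-only

# 2. Re-run cmake in the generated build directory to adjust options;
#    note the trailing source-directory argument.
cd build
cmake -G Ninja \
  -DUSE_CUDA=ON \
  -DUSE_CUDNN=ON \
  /tmp/SBo/pytorch-v2.3.0

# 3. Build and install into the package staging area.
cd ..
python3 setup.py install --root=$PKG
```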
I get this, however:
Code:
-- Build files have been written to: /tmp/SBo/pytorch-v2.3.0/build
Building wheel torch-2.3.0a0+gitUnknown
-- Building version 2.3.0a0+gitUnknown
cmake -GNinja -DBUILD_PYTHON=True -DBUILD_TEST=True -DCMAKE_BUILD_TYPE=Release -DCMAKE_CUDA_FLAGS='--compiler-bindir=/home/geremia/gcc-11/usr/bin/ -I/home/geremia/gcc-11/usr/include/c++/11.2.0/ -L/home/geremia/gcc-11/usr/lib64/' -DCMAKE_CUDA_HOST_COMPILER=/home/geremia/gcc-11/usr/bin/g++ -DCMAKE_INSTALL_PREFIX=/tmp/SBo/pytorch-v2.3.0/torch -DCMAKE_PREFIX_PATH=/usr/lib/python3.11/site-packages;/usr/lib64 -DCUDA_HOST_COMPILER=/home/geremia/gcc-11/usr/bin/g++ -DCUDNN_INCLUDE_DIR=/usr/include -DCUDNN_LIBRARY=/usr/share/cuda/lib64/libcudnn.so -DJAVA_HOME=/usr/lib64/zulu-openjdk21 -DNUMPY_INCLUDE_DIR=/usr/lib64/python3.11/site-packages/numpy/core/include -DPYTHON_EXECUTABLE=/usr/bin/python3 -DPYTHON_INCLUDE_DIR=/usr/include/python3.11 -DPYTHON_LIBRARY=/usr/lib64/libpython3.11.so.1.0 -DTORCH_BUILD_VERSION=2.3.0a0+gitUnknown -DUSE_NNPACK=0 -DUSE_NUMPY=True /tmp/SBo/pytorch-v2.3.0
Finished running cmake. Run "ccmake build" or "cmake-gui build" to adjust build options and "python setup.py install" to build.
CMake Warning:
No source or binary directory provided. Both will be assumed to be the
same as the current working directory, but note that this warning will
become a fatal error in future CMake releases.
CMake Error: The source directory "/tmp/SBo/pytorch-v2.3.0/build" does not appear to contain CMakeLists.txt.
Specify --help for usage, or press the help button on the CMake GUI.
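That CMake error usually means `cmake` was invoked from inside `build/` without a source-directory argument, so it treated `build/` itself as the source tree (which has no `CMakeLists.txt`). A hedged sketch of the fix, assuming the tarball layout above:

```shell
# The generated build tree lives in build/, but CMakeLists.txt is in the
# source root.  Point cmake at the source explicitly:
cd /tmp/SBo/pytorch-v2.3.0/build
cmake /tmp/SBo/pytorch-v2.3.0     # classic form: source dir as last argument

# or, with CMake >= 3.13, from any directory:
cmake -S /tmp/SBo/pytorch-v2.3.0 -B /tmp/SBo/pytorch-v2.3.0/build
```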
Quote:
So basically I need to use Ninja, run "setup.py build" with the "--cmake-only" flag, and manually move the Python site-package stuff in tmpxxx.
Almost. I updated the build on SBo with a patch similar to the one above: https://slackbuilds.org/slackbuilds/...rch.SlackBuild
I suggest building it as it comes from SBo, with your custom setup, to see whether it builds at all:
Code:
bash pytorch.SlackBuild |& tee build.log
Then make your CUDA adjustments to pytorch.SlackBuild and build again:
Code:
bash pytorch.SlackBuild |& tee build.cuda.log
Then you can diff build.log and build.cuda.log to identify problems.
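A minimal illustration of that log comparison (the file contents here are made up, standing in for the real build logs):

```shell
# Fabricated mini-logs standing in for the real build output.
printf 'cmake configure ok\nUSE_CUDA : 0\n' > build.log
printf 'cmake configure ok\nUSE_CUDA : 1\n' > build.cuda.log

# diff exits non-zero when the files differ, so mask the status
# in case the surrounding script runs under `set -e`.
diff build.log build.cuda.log || true
```

Only the lines that changed between the two runs are printed, so configuration differences (like the `USE_CUDA` flag flipping) stand out immediately.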
I'm not sure how many of the CMake flags were really necessary, but my CUDA install is in somewhat non-standard directories. I definitely needed to set CUDAHOSTCXX / CMAKE_CUDA_HOST_COMPILER to a compatible GCC version (13.2 or earlier), though.
Code:
--- pytorch.SlackBuild 2024-05-24 21:53:28.000000000 -0700
+++ pytorch.SlackBuild.sbopkg 2024-05-25 19:57:42.231797290 -0700
@@ -63,6 +63,8 @@
SLKCFLAGS="-O2"
fi
+SLKCFLAGS+=" -Wno-error=maybe-uninitialized -Wno-error=uninitialized -Wno-error=restrict"
+
set -e
rm -rf $PKG
@@ -85,12 +87,22 @@
# This seems harmless as these get recompiled again locally, but it slows down the building process.
# See https://github.com/icecc/icecream/issues/336
-export USE_NNPACK=0
+export USE_CUDA=1
+export CUDAHOSTCXX=/usr/bin/gcc-11
python3 setup.py build --cmake-only
cd build
unshare -n cmake \
-G Ninja \
+ -DCUDAToolkit_INCLUDE_DIR=/usr/include \
+ -DCUDA_CUDART_LIBRARY=/usr/lib64/libcudart.so \
+ -DCUDA_HOST_COMPILER=/usr/bin/gcc-11 \
+ -DCUDA_TOOLKIT_ROOT_DIR=/usr/share/cuda \
+ -DCUDNN_INCLUDE_DIR=/usr/include \
+ -DCUDNN_LIBRARY=/usr/share/cuda/lib64/libcudnn.so \
+ -DUSE_CUDA=ON \
+ -DUSE_CUDNN=ON \
+ -DUSE_NCCL=OFF \
-DCMAKE_C_FLAGS:STRING="$SLKCFLAGS" \
-DCMAKE_CXX_FLAGS:STRING="$SLKCFLAGS" \
-DCMAKE_CXX_STANDARD=17 \
{NCCL didn't build because it seemed to ignore $CUDAHOSTCXX, so I disabled it. It's only required for inter-GPU communication, i.e. multiple GPUs.}
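Once the package is installed, a quick smoke test might look like the following (guarded so it also runs harmlessly on a machine where torch isn't installed; `torch.cuda.is_available()` will report False on a CPU-only build):

```shell
# Guarded smoke test: print the torch version and CUDA availability
# if the bindings are importable, otherwise say so and move on.
if python3 -c 'import torch' 2>/dev/null; then
  python3 -c 'import torch; print(torch.__version__, torch.cuda.is_available())'
else
  echo "torch not installed"
fi
```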