xgboost multi-GPU support

Build

Environment
Ubuntu 18.04.2
Linux 4.15.0-46-generic
gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0

cuda 10.0
https://docs.nvidia.com/cuda/archive/10.0/cuda-installation-guide-linux/index.html#verify-you-have-supported-version-of-linux
Installation steps omitted.

nccl2
git clone https://github.com/NVIDIA/nccl.git
cd nccl
make -j src.build

xgboost (building from a stable release, e.g. 0.82, is recommended)
mkdir xgboost-src
cd xgboost-src
git clone --recursive https://github.com/dmlc/xgboost.git
cd xgboost
or
git clone https://github.com/dmlc/xgboost.git
cd xgboost
git submodule init
git submodule update

Check out version 0.82 (note: the package that ended up installed reported version 0.81)
git checkout 3f83dcd

mkdir build
cd build
cmake .. -DUSE_CUDA=ON -DUSE_NCCL=ON -DNCCL_ROOT=/xxx/install/nccl-src/nccl/build
make -j4

Wait until output similar to the following appears:
...
Scanning dependencies of target gpuxgboost
[ 95%] Linking CXX static library libgpuxgboost.a
[ 95%] Built target gpuxgboost
Scanning dependencies of target runxgboost
[ 97%] Building CXX object CMakeFiles/runxgboost.dir/src/cli_main.cc.o
[ 98%] Linking CXX executable ../xgboost
[ 98%] Built target runxgboost
Scanning dependencies of target xgboost
[100%] Linking CXX shared library ../lib/libxgboost.so
[100%] Built target xgboost

Install the Python package:
cd ../python-package
python setup.py install

Note:
Switching gcc/g++ versions with update-alternatives can cause a variety of reference errors during the build. If that happens, switch back to one of the installed gcc/g++ versions (e.g. 7.3) and reboot the machine.

------------------------------------------------------
tensorflow (for cuda 10.0)
tensorflow-gpu 1.13.1
pip install tensorflow-gpu==1.13.1

------------------------------------------------------
torch (for cuda 10.0)
pip install https://download.pytorch.org/whl/cu100/torch-1.0.1.post2-cp36-cp36m-linux_x86_64.whl
pip install torchvision