ace-gpu-1 installation log
Log of platform setup and configuration
Base
NVIDIA Driver
This step is probably not required, since the CUDA package installs a driver of its own.
chmod 770 NVIDIA-Linux-x86_64-375.26.run
/etc/init.d/lightdm stop
./NVIDIA-Linux-x86_64-375.26.run
reboot
CUDA
Install CUDA from the Debian package
Confirm GPU
lspci | grep -i nvidia
Output example
01:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)
02:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
02:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)
03:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
03:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)
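Each GPU in the output above appears twice, once as a display controller and once as an audio device, so counting lines by eye is error-prone. A minimal sketch of a counter follows; `count_nvidia_gpus` is a hypothetical helper name, and the `3D controller` class is included on the assumption that headless cards may report it instead of `VGA compatible controller`.

```shell
# Count NVIDIA display controllers in lspci-style text read from stdin.
# count_nvidia_gpus is a hypothetical helper, not part of any NVIDIA tool.
count_nvidia_gpus() {
    # Keep only NVIDIA lines, then count the display-class entries.
    grep -i 'nvidia' | grep -c -i -E 'VGA compatible controller|3D controller'
}

# Only query the real PCI bus when lspci is available.
if command -v lspci >/dev/null 2>&1; then
    echo "NVIDIA GPUs found: $(lspci | count_nvidia_gpus)"
fi
```

For the output example above this would report three GPUs.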
gcc version
gcc --version
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Ensure kernel headers are installed
sudo apt-get install linux-headers-$(uname -r)
Download CUDA Toolkit
wget https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb
Confirm checksum
wget https://developer.nvidia.com/compute/cuda/8.0/Prod2/docs/sidebar/md5sum-txt
md5sum cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb
cat md5sum-txt | grep cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64
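The two commands above leave the comparison of the digests to the eye. A minimal sketch that automates it follows; `verify_md5` is a hypothetical helper name, and the third argument is the same name prefix used in the `grep` above, on the assumption that the downloaded file name and the name in the checksum list may differ in their suffix.

```shell
# verify_md5 FILE LIST PATTERN
# Compare the MD5 digest of FILE with the first entry in LIST whose line
# contains PATTERN. Returns 0 on a match, 1 on a mismatch or missing entry.
# verify_md5 is a hypothetical helper used only for this sketch.
verify_md5() {
    file=$1
    list=$2
    pattern=$3
    # The first field of md5sum output is the digest.
    actual=$(md5sum "$file" | awk '{print $1}')
    # -F treats the pattern as a literal string, not a regular expression.
    expected=$(grep -F -- "$pattern" "$list" | awk 'NR == 1 {print $1}')
    if [ -n "$expected" ] && [ "$actual" = "$expected" ]; then
        echo "OK: $file"
        return 0
    fi
    echo "MISMATCH or missing entry: $file" >&2
    return 1
}
```

Usage with the files from this log would look like `verify_md5 cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb md5sum-txt cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64`.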
Install
mv cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb
sudo apt-get update
sudo apt-get install cuda
Reboot system.
Environment Setup
Add CUDA bin path
export PATH=/usr/local/cuda-8.0/bin:${PATH}
echo $PATH
Check LD_LIBRARY_PATH
Ensure LD_LIBRARY_PATH includes `/usr/local/cuda-8.0/lib64`
echo $LD_LIBRARY_PATH
If it is not set, set it with:
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64
Otherwise, prepend to the existing value:
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH
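The two cases above (variable unset, variable already set) can be folded into one helper that also avoids adding the directory twice. This is a minimal sketch; `prepend_path` is a hypothetical helper name. The empty case matters because the dynamic loader treats an empty entry in LD_LIBRARY_PATH as the current directory, so a stray trailing colon is best avoided.

```shell
# prepend_path VALUE DIR
# Print VALUE with DIR prepended, unless DIR is already one of its
# colon-separated components. An empty VALUE yields DIR alone.
# prepend_path is a hypothetical helper used only for this sketch.
prepend_path() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;     # already present, unchanged
        "::")     printf '%s\n' "$2" ;;     # variable was empty or unset
        *)        printf '%s\n' "$2:$1" ;;  # prepend to existing value
    esac
}

LD_LIBRARY_PATH=$(prepend_path "${LD_LIBRARY_PATH:-}" /usr/local/cuda-8.0/lib64)
export LD_LIBRARY_PATH
```

The same helper works for PATH, for example `PATH=$(prepend_path "$PATH" /usr/local/cuda-8.0/bin)`.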
/etc/skel
Configure `/etc/skel` so that new users have the proper environment configuration
nano /etc/skel/.profile
Content example
# set user PATH to include /usr/local/cuda-8.0/bin
if [ -d "/usr/local/cuda-8.0/bin" ]; then
    PATH="/usr/local/cuda-8.0/bin:$PATH"
fi
# set user LD_LIBRARY_PATH to include /usr/local/cuda-8.0/lib64
if [ -d "/usr/local/cuda-8.0/lib64" ]; then
    LD_LIBRARY_PATH="/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH"
    # export so that child processes see the variable
    export LD_LIBRARY_PATH
fi
GPU Accounting Setup and Configuration
The NVIDIA persistence daemon (nvidia-persistenced) needs to be configured for the target OS's init system. On Ubuntu 16.04 this is systemd.
Confirm Driver Version
nvidia-smi
Output example
Wed Apr 26 14:20:40 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN X (Pascal)    Off  | 0000:01:00.0      On |                  N/A |
| 41%   68C    P2    95W / 250W |   1902MiB / 12186MiB |     96%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      1977    G   /usr/lib/xorg/Xorg                              60MiB |
|    0     10406    C   /data1/data/kwagstyl/anaconda2/bin/python      189MiB |
|    0     10660    C   /data1/data/kwagstyl/anaconda2/bin/python     1495MiB |
|    0     16118    C   ...freesurfer_LBL/bin/mris_fix_topology_cuda   153MiB |
+-----------------------------------------------------------------------------+
Download and uncompress
Download nvidia-persistenced version that matches your driver version (see output example above)
mkdir -p ~/src/ubuntu/16.04/nvidia
cd ~/src/ubuntu/16.04/nvidia
wget ftp://download.nvidia.com/XFree86/nvidia-persistenced/nvidia-persistenced-375.39.tar.bz2
tar xvjf nvidia-persistenced-375.39.tar.bz2
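The version in the tarball name has to be copied from the nvidia-smi banner by hand. A minimal sketch that derives it instead follows; `driver_version_from_header` and `persistenced_tarball` are hypothetical helper names, and the tarball naming pattern is taken from the wget line above rather than from any NVIDIA documentation.

```shell
# driver_version_from_header
# Read nvidia-smi banner text on stdin and print the value that follows
# "Driver Version:". Hypothetical helper used only for this sketch.
driver_version_from_header() {
    sed -n 's/.*Driver Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# persistenced_tarball VERSION
# Print the tarball file name for VERSION, following the naming seen in
# this log (an assumption for other driver versions).
persistenced_tarball() {
    printf 'nvidia-persistenced-%s.tar.bz2\n' "$1"
}

# Only query the driver when nvidia-smi is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    version=$(nvidia-smi | driver_version_from_header)
    echo "Driver $version -> $(persistenced_tarball "$version")"
fi
```

On recent drivers `nvidia-smi --query-gpu=driver_version --format=csv,noheader` prints the version directly, which avoids parsing the banner.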
Edit the nvidia-persistenced.conf.template
Confirm creation of the nvidia-persistenced user
sudo cat /etc/passwd | grep nvidia
Output example:
nvidia-persistenced:x:126:132:NVIDIA Persistence Daemon,,,:/:/sbin/nologin
Edit the systemd template
cd nvidia-persistenced-375.39/init/systemd
nano nvidia-persistenced.service.template
Replace USER with the name of the persistence daemon user, `nvidia-persistenced`.
Run the installer
cd ~/src/ubuntu/16.04/nvidia/nvidia-persistenced-375.39/init
sudo ./install.sh
Output example
Checking for common requirements...
  sed found in PATH? Yes
  useradd found in PATH? Yes
  userdel found in PATH? Yes
  id found in PATH? Yes
Common installation/uninstallation supported
Creating sample System V script... done.
Creating sample systemd service file... done.
Creating sample Upstart service file... done.
Checking for systemd requirements...
  /usr/lib/systemd/system directory exists? No
  /etc/systemd/system directory exists? Yes
  systemctl found in PATH? Yes
systemd installation/uninstallation supported
Installation parameters:
  User : nvidia-persistenced
  Group : nvidia-persistenced
  systemd service installation path : /etc/systemd/system
User 'nvidia-persistenced' already exists, skipping useradd...
User 'nvidia-persistenced' is in primary group 'nvidia-persistenced'.
Stopping nvidia-persistenced.service... done.
Installing sample systemd service nvidia-persistenced.service... done.
Enabling nvidia-persistenced.service... done.
Starting nvidia-persistenced.service... done.
Check
sudo service nvidia-persistenced status
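The service status shows that the daemon is running, not that the GPUs are actually in persistence mode. A minimal sketch of a second check follows; `all_persistence_enabled` is a hypothetical helper name, and it assumes the installed nvidia-smi supports the `persistence_mode` query field and prints `Enabled` or `Disabled` for each GPU.

```shell
# all_persistence_enabled
# Read one persistence-mode value per line on stdin and succeed only if
# there is at least one line and every line is "Enabled".
# Hypothetical helper used only for this sketch.
all_persistence_enabled() {
    awk '
        BEGIN { n = 0; bad = 0 }
        {
            gsub(/^[ \t]+|[ \t]+$/, "")   # trim surrounding whitespace
            n++
            if ($0 != "Enabled") bad++
        }
        END { exit ((n > 0 && bad == 0) ? 0 : 1) }
    '
}

# Only query the driver when nvidia-smi is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    if nvidia-smi --query-gpu=persistence_mode --format=csv,noheader | all_persistence_enabled; then
        echo "Persistence mode is enabled on all GPUs"
    else
        echo "Persistence mode is not enabled on every GPU" >&2
    fi
fi
```

The Persistence-M column of the plain nvidia-smi table carries the same information, shown as On or Off.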
Troubleshooting
Add `--persistence-mode --verbose` to the line where the service is started (the `ExecStart` line in the systemd service file).