ace-gpu-1_installation_log

ace-gpu-1_installation_log [2017/02/15 17:41] – created csteel
ace-gpu-1_installation_log [2024/03/26 13:52] (current) – external edit 127.0.0.1
 Log of platform setup and configuration
 +
 +===== Base =====
 +
 +
 +===== NVIDIA Driver =====
 +
 +This step is probably not required, since the CUDA package includes a driver.
 +
 +<code>
 +chmod 770 NVIDIA-Linux-x86_64-375.26.run 
 +/etc/init.d/lightdm stop
 +./NVIDIA-Linux-x86_64-375.26.run 
 +reboot
 +</code>
 +
 +===== CUDA =====
 +
 +Installation of CUDA from the Debian package.
 +
 +==== Confirm GPU ====
 +
 +<code>
 +lspci | grep -i nvidia
 +</code>
 +
 +Output example
 +
 +<code>
 +01:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
 +01:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)
 +02:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
 +02:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)
 +03:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
 +03:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)
 +
 +</code>
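
Each card shows up twice in the listing (a VGA function and an audio function), so counting raw matches overstates the number of GPUs. A minimal sketch of a counter follows; `count_nvidia_gpus` is a hypothetical helper name, and it assumes GPUs are listed as "VGA compatible controller" or "3D controller" entries.

```shell
#!/bin/sh
# count_nvidia_gpus: read lspci-style text on stdin and print the number
# of NVIDIA display controllers. Hypothetical helper, not an NVIDIA tool.
count_nvidia_gpus() {
    # Skip the per-card audio function by matching controller entries only.
    grep -i 'nvidia' | grep -c -E 'VGA compatible controller|3D controller'
}

# On the real machine:
#   lspci | count_nvidia_gpus
```

For the output example above this should report three GPUs.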
 +
 +==== gcc version ====
 +
 +<code>
 +gcc --version
 +</code>
 +
 +<code>
 +gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
 +Copyright (C) 2015 Free Software Foundation, Inc.
 +This is free software; see the source for copying conditions.  There is NO
 +warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 +</code>
 +
 +==== Ensure kernel headers are installed ====
 +
 +<code>
 +sudo apt-get install linux-headers-$(uname -r)
 +</code>
 +
 +==== Download CUDA Toolkit ====
 +
 +<code>
 +wget https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb
 +</code>
 +
 +==== Confirm checksum ====
 +
 +<code>
 +wget https://developer.nvidia.com/compute/cuda/8.0/Prod2/docs/sidebar/md5sum-txt
 +</code>
 +
 +<code>
 +md5sum cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb
 +cat md5sum-txt | grep cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64
 +</code>
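
The two commands above leave the comparison to the eye. A possible way to automate it is sketched below; `verify_md5` is a hypothetical helper, and it assumes the checksum list uses the usual `<hash>  <name>` layout. The name pattern is passed separately because the downloaded file name may not match the name in the list exactly.

```shell
#!/bin/sh
# verify_md5 FILE LIST PATTERN
# Succeeds only if the MD5 of FILE equals the hash on the first line of
# LIST that contains PATTERN (fixed string). Hypothetical helper.
verify_md5() {
    file=$1
    list=$2
    pattern=$3
    actual=$(md5sum "$file" | awk '{print $1}')
    expected=$(grep -F -- "$pattern" "$list" | awk 'NR == 1 {print $1}')
    # Fail when no entry was found or when the hashes differ.
    [ -n "$expected" ] && [ "$actual" = "$expected" ]
}

# Usage with the files from this log:
#   verify_md5 cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb \
#       md5sum-txt cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64 \
#       && echo "checksum OK"
```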
 +
 +==== Install ====
 +
 +<code>
 +mv cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb
 +sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb
 +sudo apt-get update
 +sudo apt-get install cuda
 +</code>
 +
 +Reboot system.
 +
 +==== Environment Setup ====
 +
 +=== Add CUDA bin path ===
 +
 +<code>
 +export PATH=/usr/local/cuda-8.0/bin:${PATH}
 +echo $PATH
 +</code>
 +
 +=== Set LD_LIBRARY_PATH ===
 +
 +Ensure LD_LIBRARY_PATH includes `/usr/local/cuda-8.0/lib64`.
 +
 +<code>
 +echo $LD_LIBRARY_PATH
 +</code>
 +
 +If it is not set, set it using:
 +
 +<code>
 +export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64
 +</code>
 +
 +Otherwise, prepend the CUDA directory to the existing value:
 +
 +<code>
 +export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH
 +</code>
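
The set and not-set cases can be folded into one step. The sketch below assumes a POSIX shell; `prepend_path` is a hypothetical helper that adds a directory once and avoids leaving an empty entry (a stray colon) when the variable was previously empty.

```shell
#!/bin/sh
# prepend_path DIR CURRENT
# Print CURRENT (a colon-separated list) with DIR in front, unless DIR is
# already present. Hypothetical helper.
prepend_path() {
    dir=$1
    current=$2
    case ":$current:" in
        *":$dir:"*) printf '%s\n' "$current" ;;               # already listed
        *) printf '%s\n' "${dir}${current:+:$current}" ;;     # add once
    esac
}

# Usage with the CUDA 8.0 location from this log:
#   LD_LIBRARY_PATH=$(prepend_path /usr/local/cuda-8.0/lib64 "$LD_LIBRARY_PATH")
#   export LD_LIBRARY_PATH
```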
 +
 +==== /etc/skel ====
 +
 +Configure `/etc/skel` so that new users get the CUDA environment settings.
 +
 +<code>
 +nano /etc/skel/.profile
 +</code>
 +
 +Content example
 +
 +<code>
 +# set user PATH to include /usr/local/cuda-8.0/bin
 +if [ -d "/usr/local/cuda-8.0/bin" ]; then
 +    PATH="/usr/local/cuda-8.0/bin:$PATH"
 +fi
 +
 +# set user LD_LIBRARY_PATH to include /usr/local/cuda-8.0/lib64
 +if [ -d "/usr/local/cuda-8.0/lib64" ]; then
 +    export LD_LIBRARY_PATH="/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
 +fi
 +</code>
 +
 +=== GPU Accounting Setup and Configuration ===
 +
 +The NVIDIA persistence daemon (nvidia-persistenced) needs to be configured for the target OS's init system. On Ubuntu 16.04 this is systemd.
 +
 +== Confirm Driver Version ==
 +
 +<code>
 +nvidia-smi
 +</code>
 +
 +Output example
 +
 +<code>
 +Wed Apr 26 14:20:40 2017       
 ++-----------------------------------------------------------------------------+
 +| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
 +|-------------------------------+----------------------+----------------------+
 +| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
 +| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
 +|===============================+======================+======================|
 +|    TITAN X (Pascal)    Off  | 0000:01:00.0      On |                  N/A |
 +| 41%   68C    P2    95W / 250W |   1902MiB / 12186MiB |     96%      Default |
 ++-------------------------------+----------------------+----------------------+
 +                                                                               
 ++-----------------------------------------------------------------------------+
 +| Processes:                                                       GPU Memory |
 +|  GPU       PID  Type  Process name                               Usage      |
 +|=============================================================================|
 +|    0      1977    G   /usr/lib/xorg/Xorg                              60MiB |
 +|    0     10406    C   /data1/data/kwagstyl/anaconda2/bin/python      189MiB |
 +|    0     10660    C   /data1/data/kwagstyl/anaconda2/bin/python     1495MiB |
 +|    0     16118    C   ...freesurfer_LBL/bin/mris_fix_topology_cuda   153MiB |
 ++-----------------------------------------------------------------------------+
 +</code>
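
Only the driver version is needed from this table. One possible way to pull it out for later steps is sketched below; `driver_version` is a hypothetical helper that assumes the header line has the `Driver Version: <number>` form shown in the output example.

```shell
#!/bin/sh
# driver_version: read nvidia-smi table output on stdin and print the
# driver version from the header line. Hypothetical helper.
driver_version() {
    sed -n 's/.*Driver Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# On the real machine:
#   nvidia-smi | driver_version
```

nvidia-smi also has a `--query-gpu` option that may report the same value directly, which would make the text parsing unnecessary.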
 +
 +== Download and uncompress ==
 +
 +Download the nvidia-persistenced version that matches your driver version (375.39 in the output example above).
 +
 +<code>
 +mkdir -p ~/src/ubuntu/16.04/nvidia
 +cd ~/src/ubuntu/16.04/nvidia
 +wget ftp://download.nvidia.com/XFree86/nvidia-persistenced/nvidia-persistenced-375.39.tar.bz2
 +tar xvjf nvidia-persistenced-375.39.tar.bz2
 +</code>
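
For a different driver version the file name changes with it. The sketch below builds the download URL from a version string; `persistenced_url` is a hypothetical helper, and it assumes other releases follow the same naming pattern as the 375.39 tarball above, which is not confirmed here.

```shell
#!/bin/sh
# persistenced_url VERSION
# Print the download URL for the given driver version, following the
# pattern of the 375.39 URL used in this log. Hypothetical helper.
persistenced_url() {
    version=$1
    printf 'ftp://download.nvidia.com/XFree86/nvidia-persistenced/nvidia-persistenced-%s.tar.bz2\n' "$version"
}

# Usage:
#   wget "$(persistenced_url 375.39)"
```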
 +
 +== Edit the nvidia-persistenced.conf.template ==
 +
 +Confirm creation of the nvidia-persistenced user:
 +
 +<code>
 +grep nvidia /etc/passwd
 +</code>
 +
 +Output example:
 +
 +<code>
 +nvidia-persistenced:x:126:132:NVIDIA Persistence Daemon,,,:/:/sbin/nologin
 +</code>
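
The same check can be scripted without reading the grep output. A minimal sketch, assuming the standard `id` utility; `user_exists` is a hypothetical helper name.

```shell
#!/bin/sh
# user_exists NAME
# Succeeds if the account NAME is known to the system. Hypothetical helper.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Usage:
#   user_exists nvidia-persistenced || echo "nvidia-persistenced user is missing"
```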
 +
 +== Edit the systemd template ==
 +
 +<code>
 +cd nvidia-persistenced-375.39/init/systemd
 +nano nvidia-persistenced.service.template
 +</code>
 +
 +Replace __USER__ with the name of the persistence daemon user, `nvidia-persistenced`.
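
The replacement can also be done non-interactively instead of in nano. The sketch below uses sed; `fill_user` is a hypothetical helper, and it assumes the user name contains no `/` characters.

```shell
#!/bin/sh
# fill_user TEMPLATE USER
# Print TEMPLATE with every __USER__ placeholder replaced by USER.
# Hypothetical helper; the template file itself is left unchanged.
fill_user() {
    sed "s/__USER__/$2/g" "$1"
}

# Usage, keeping a copy of the original template:
#   cp nvidia-persistenced.service.template nvidia-persistenced.service.template.orig
#   fill_user nvidia-persistenced.service.template.orig nvidia-persistenced \
#       > nvidia-persistenced.service.template
```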
 +
 +== Run the installer ==
 +
 +<code>
 +cd ~/src/ubuntu/16.04/nvidia/nvidia-persistenced-375.39/init
 +sudo ./install.sh
 +</code>
 +
 +Output example
 +
 +<code>
 +Checking for common requirements...
 +  sed found in PATH?  Yes
 +  useradd found in PATH?  Yes
 +  userdel found in PATH?  Yes
 +  id found in PATH?  Yes
 +Common installation/uninstallation supported
 +
 +Creating sample System V script... done.
 +Creating sample systemd service file... done.
 +Creating sample Upstart service file... done.
 +
 +Checking for systemd requirements...
 +  /usr/lib/systemd/system directory exists?  No
 +  /etc/systemd/system directory exists?  Yes
 +  systemctl found in PATH?  Yes
 +systemd installation/uninstallation supported
 +
 +Installation parameters:
 +  User  : nvidia-persistenced
 +  Group : nvidia-persistenced
 +  systemd service installation path : /etc/systemd/system
 +
 +User 'nvidia-persistenced' already exists, skipping useradd...
 +User 'nvidia-persistenced' is in primary group 'nvidia-persistenced'.
 +Stopping nvidia-persistenced.service... done.
 +Installing sample systemd service nvidia-persistenced.service... done.
 +Enabling nvidia-persistenced.service... done.
 +Starting nvidia-persistenced.service... done.
 +</code>
 +
 +== Check ==
 +
 +<code>
 +sudo service nvidia-persistenced status
 +</code>
 +
 +== Troubleshooting ==
 +
 +Add "--persistence-mode --verbose" to the line that starts the service (the ExecStart line in the systemd service file).
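
A sketch of that edit with sed follows. `add_daemon_flags` is a hypothetical helper; it assumes the start line begins with `ExecStart=` and, going by the installer output above, that the installed unit is `/etc/systemd/system/nvidia-persistenced.service`.

```shell
#!/bin/sh
# add_daemon_flags FILE
# Print FILE with the troubleshooting flags appended to each ExecStart
# line that does not already carry them. Hypothetical helper.
add_daemon_flags() {
    sed '/^ExecStart=/ {
        /--persistence-mode/! s/$/ --persistence-mode --verbose/
    }' "$1"
}

# Usage (review the result before installing it):
#   add_daemon_flags /etc/systemd/system/nvidia-persistenced.service > /tmp/nvidia-persistenced.service
#   sudo cp /tmp/nvidia-persistenced.service /etc/systemd/system/nvidia-persistenced.service
#   sudo systemctl daemon-reload
#   sudo systemctl restart nvidia-persistenced
```

Running the helper a second time leaves the line unchanged, so it is safe to repeat.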
 +