GPU Resources
GPU Info
For CPU and GPU usage:
glances
Other info
nvcc -V
nvidia-smi
lspci -vnn | grep VGA -A 12
dpkg -l | grep -i nvidia
ssh -X ace-gpu-1 nsight
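The non-interactive commands above can be gathered into one report. The following is a minimal sketch, not an official tool: the function names `run_if_present` and `gpu_report` are made up for illustration, and it assumes a POSIX shell. Commands that are not installed on a given host are skipped rather than aborting the report.

```shell
# Run a command only if it is installed; otherwise note that it was skipped.
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        printf '== %s ==\n' "$*"
        "$@"
    else
        printf '== %s == (not installed, skipped)\n' "$1"
    fi
}

# Hypothetical wrapper around the commands listed above.
gpu_report() {
    run_if_present nvcc -V
    run_if_present nvidia-smi
    # For the piped commands, keep the "==" header lines as well as the matches.
    run_if_present lspci -vnn | grep -A 12 -e VGA -e '^=='
    run_if_present dpkg -l | grep -i -e nvidia -e '^=='
}
```

Calling `gpu_report > gpu-report.txt` on a node would save the output to a file that can be attached to a support request.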
GPU Accounting
SysAdmins: to enable Accounting mode
sudo nvidia-smi -i 0 -am ENABLED
Users: to check whether Accounting mode is enabled or disabled
nvidia-smi -i 0 -q -d ACCOUNTING
Users: to check GPU stats per process:
nvidia-smi -i 0 --query-accounted-apps=gpu_name,pid,gpu_util,max_memory_usage,time --format=csv
Users: Accounting help
nvidia-smi --help-query-accounted-apps
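The per-process query above can be post-processed to find the heaviest memory users. This is a sketch under assumptions: the function name `top_gpu_memory` is hypothetical, the input is expected in `--format=csv,noheader,nounits` form with the same five fields in the same order, and the memory column is assumed to be in MiB (check the header printed by plain `--format=csv` on your driver version).

```shell
# Read accounted-apps rows (gpu_name, pid, gpu_util, max_memory_usage, time)
# on stdin and print the N processes with the highest peak memory (default 5).
top_gpu_memory() {
    sort -t, -k4,4nr | head -n "${1:-5}" |
        awk -F', *' '{ printf "pid %s used up to %s MiB on %s\n", $2, $4, $1 }'
}

# Possible usage on a GPU node (not run here):
#   nvidia-smi -i 0 \
#     --query-accounted-apps=gpu_name,pid,gpu_util,max_memory_usage,time \
#     --format=csv,noheader,nounits | top_gpu_memory 3
```

Sorting is done on the fourth field only, so the other columns can change without affecting the ranking.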
Deep Learning
Freesurfer
* https://surfer.nmr.mgh.harvard.edu/fswiki/SystemRequirements
* https://surfer.nmr.mgh.harvard.edu/fswiki/DevelopersGuide
FreeSurfer 6.0 can be built with CUDA (as well as OpenMP). We have had issues compiling FreeSurfer with CUDA in the recent past, and FreeSurfer no longer actively supports GPU/CUDA: its CUDA support is permanently stuck on version 5.0.35…
Nvidia-Docker
Request to install the following on ACE-GPU-1 so that we can use nvidia-docker:
- Docker: Docker >= 1.9 (official docker-engine only)
- NVIDIA drivers: >= 340.29 with binary nvidia-modprobe
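The two minimum versions listed above can be checked from the shell before requesting the install. This is a rough sketch: the helper names `version_ge` and `check_requirement` are hypothetical, and the comparison relies on `sort -V` from GNU coreutils. It does not check for the `nvidia-modprobe` binary or for the "official docker-engine only" condition.

```shell
# True when version $1 is greater than or equal to version $2.
# Assumes GNU sort, which provides -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_requirement NAME INSTALLED_VERSION MINIMUM_VERSION
check_requirement() {
    if version_ge "$2" "$3"; then
        printf '%s %s: OK (need >= %s)\n' "$1" "$2" "$3"
    else
        printf '%s %s: too old (need >= %s)\n' "$1" "$2" "$3"
        return 1
    fi
}

# Possible usage on ACE-GPU-1 (not run here):
#   check_requirement Docker \
#     "$(docker version --format '{{.Server.Version}}')" 1.9
#   check_requirement "NVIDIA driver" \
#     "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)" 340.29
```

Version sort is used rather than a numeric comparison so that, for example, 1.12 is correctly treated as newer than 1.9.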
Why
- Nvidia-Docker is officially supported by NVIDIA
- Allows GPU applications to be containerized.
- Containers built using this tool should run on both ACE-GPU-1 and Guillimin.
- Official github site for the project: https://github.com/NVIDIA/nvidia-docker
- Requirements page for installation: https://github.com/NVIDIA/nvidia-docker/wiki/Installation
Status
- For discussion at next IT Team Meeting
OpenACC
OpenACC directives are complementary to and interoperate with existing HPC programming models including OpenMP, MPI, and CUDA.
The directives and programming model defined in the OpenACC API document allow programmers to create high-level host+accelerator programs without the need to explicitly initialize the accelerator, manage data or program transfers between the host and accelerator, or initiate accelerator startup and shutdown.
The OpenACC Application Program Interface describes a collection of compiler directives to specify loops and regions of code in standard C, C++ and Fortran to be offloaded from a host CPU to an attached accelerator. OpenACC is designed for portability across operating systems, host CPUs, and a wide range of accelerators, including APUs, GPUs, and many-core coprocessors.
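To make the directive model described above concrete, the sketch below writes out a small C program whose loop is marked for offloading with a single OpenACC directive. It is an illustration only: the function name `write_openacc_example` and the file name `saxpy.c` are hypothetical, and whether an OpenACC-capable compiler (for example GCC built with offloading support, or the PGI compiler) is installed on ACE-GPU-1 is an assumption.

```shell
# Write a minimal OpenACC example (C source) to the path given as $1.
write_openacc_example() {
    cat > "$1" <<'EOF'
#include <stdio.h>

#define N 1000000

static float x[N], y[N];

int main(void) {
    for (int i = 0; i < N; i++) { x[i] = 1.0f; y[i] = 2.0f; }

    /* One directive: the compiler generates the kernel and the
       host/accelerator data transfers named in the clauses. */
    #pragma acc parallel loop copyin(x[0:N]) copy(y[0:N])
    for (int i = 0; i < N; i++) {
        y[i] = 2.0f * x[i] + y[i];
    }

    printf("y[0] = %f\n", y[0]);
    return 0;
}
EOF
}

# Possible usage, assuming one of these compilers is available (not run here):
#   write_openacc_example saxpy.c
#   gcc -fopenacc saxpy.c -o saxpy    # GCC
#   pgcc -acc saxpy.c -o saxpy        # PGI
```

Without an OpenACC flag the directive is ignored and the same source builds as ordinary serial C, which is part of the appeal of the directive approach.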
Status
- For discussion at next IT Team Meeting
Sun Grid Engine - SGE
* Howto set up SGE for CUDA devices?
* Setting Up A Load Sensor in Grid Engine
* kyamagu/sge-gpuprolog - Scripts to manage NVIDIA GPU devices in SGE 6.2u5
* mozhgan-kch/sge-gpuprolog forked from kyamagu/sge-gpuprolog - FORK
* Rocks-Discuss - Grid Engine GPU load sensor
* Tutorial - Submitting a job using qsub
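As a starting point for the load sensor approach covered in the links above, here is a minimal sketch of a Grid Engine load sensor that reports the number of GPUs on a host. Assumptions: the resource name `gpu.count` is hypothetical and would first have to be defined as a complex (for example with `qconf -mc`), the function names are made up for illustration, and hosts without `nvidia-smi` simply report 0.

```shell
# Count GPUs on this host; report 0 when nvidia-smi is not installed.
gpu_count() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi -L | wc -l | tr -d ' '
    else
        echo 0
    fi
}

# Print one load report in the begin / host:resource:value / end format.
report_gpu_load() {
    echo "begin"
    echo "$1:gpu.count:$2"
    echo "end"
}

# Load sensor loop: Grid Engine writes a line to stdin each time it wants
# a report, and sends "quit" when the sensor should exit.
load_sensor() {
    host=$(uname -n)
    while read -r line; do
        [ "$line" = "quit" ] && break
        report_gpu_load "$host" "$(gpu_count)"
    done
}
```

A real sensor script would end with a call to `load_sensor` and be registered in the host or global configuration; the linked sge-gpuprolog project takes a different, prolog-based approach.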
Status
- ???