Differences
This shows you the differences between two versions of the page.
| gpu_resources [2017/04/27 13:20] – csteel | gpu_resources [2024/03/26 13:52] (current) – external edit 127.0.0.1 | ||
|---|---|---|---|
| Line 2: | Line 2: | ||
| This is a collaborative resource; please improve it. Log in using your MCIN user name and ID and add your discoveries. | This is a collaborative resource; please improve it. Log in using your MCIN user name and ID and add your discoveries. | ||
| + | |||
| + | ===== Items of Interest / for Discussion? ===== | ||
| + | |||
| + | |||
| + | |||
| + | ==== Resources ==== | ||
| + | |||
| + | * [ OpenACC - Tutorial - Steps to More Science ]( https:// | ||
| + | |||
| + | "Here are three simple steps to start accelerating your code with GPUs. We will be using PGI OpenACC compiler for C, C++, FORTRAN, along with tools from the PGI Community Edition." | ||
| + | |||
| + | * [ Performance Portability from GPUs to CPUs with OpenACC ](https:// | ||
| + | |||
| + | * [ Data Center Management Tools ]( http:// | ||
| + | |||
| + | * The GPU Deployment Kit | ||
| + | * Ganglia | ||
| + | * Slurm | ||
| + | * NVIDIA Docker | ||
| + | * Others??? | ||
| + | |||
| + | " | ||
| + | |||
| ===== Preventing Job Clobbering ===== | ===== Preventing Job Clobbering ===== | ||
| - | Today I was training a model and inadvertently kicked Konrad' | + | There are currently 3 GPU' |
| + | |||
| + | < | ||
| + | export CUDA_VISIBLE_DEVICES=X | ||
| + | </ | ||
| + | |||
| + | This will only take effect when you log in, so log out and back in and try the following to ensure | ||
| + | |||
| + | < | ||
| + | echo $CUDA_VISIBLE_DEVICES | ||
| + | </ | ||
| + | |||
| + | If it outputs the ID that you selected, then you're ready to use the GPU. | ||
| + | |||
| + | ==== Sharing a single GPU ==== | ||
| + | To configure TensorFlow so that it does not pre-allocate all GPU memory, you can use the following Python code: | ||
| < | < | ||
| Line 15: | Line 53: | ||
| </ | </ | ||
| - | We should develop some kind of policy | + | This has been found to work only to a certain extent, and when there are several |
| ===== GPU Info ===== | ===== GPU Info ===== | ||
| Line 47: | Line 84: | ||
| nsight | nsight | ||
| </ | </ | ||
| + | |||
| + | NVIDIA Visual Profiler (https:// | ||
| + | < | ||
| + | / | ||
| + | </ | ||
| + | |||
| ===== GPU Accounting ===== | ===== GPU Accounting ===== | ||
| Line 130: | Line 173: | ||
| Doesn' | Doesn' | ||
| </ | </ | ||
| + | |||
| + | * [[http:// | ||
| + | |||
| + | * [[http:// | ||
| ===== Deep Learning ===== | ===== Deep Learning ===== | ||
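
The GPU-selection and memory-sharing steps described under "Preventing Job Clobbering" can be sketched in Python as follows. This is only an illustration, not the snippet from the wiki page (which is elided in this diff). It assumes TensorFlow 2.x is installed for the memory-growth part, and the helper names `select_gpu` and `enable_memory_growth` are hypothetical. The GPU ID `0` is a placeholder for whichever device you were assigned.

```python
import os


def select_gpu(gpu_id):
    """Restrict this process to one GPU via CUDA_VISIBLE_DEVICES.

    Call this before any CUDA-using library (e.g. TensorFlow) is initialized,
    otherwise the setting may have no effect on that library.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return os.environ["CUDA_VISIBLE_DEVICES"]


def enable_memory_growth():
    """Ask TensorFlow to allocate GPU memory on demand instead of up front.

    Assumes TensorFlow 2.x. Returns the number of GPUs configured,
    or None if TensorFlow is not installed.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)


if __name__ == "__main__":
    # 0 is a placeholder GPU ID, not a recommendation.
    print("CUDA_VISIBLE_DEVICES =", select_gpu(0))
    print("GPUs configured for memory growth:", enable_memory_growth())
```

Setting the variable inside the script affects only that process, whereas the `export` line shown in the diff applies to the whole login session. Which one suits a shared machine depends on how users are expected to coordinate.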