Help:Cuda linux

CUDA Linux hints

This guide is useful for NVIDIA cards such as the Tegra or Tesla.

For information on driver installation, see nvidia.

Useful commands

  • nvidia-smi (shows GPU status, utilization, and memory use)
  • module load cuda cudnn (puts the CUDA toolkit and cuDNN on your PATH)
  • nvidia-ps (locally written -- ask if you don't have it)
  • nvcc (the CUDA compiler)
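
A quick sanity check on a GPU node might look like this (a sketch; the module names are the ones listed above):

 # show GPU model, driver version, utilization, and memory in use
 nvidia-smi
 # put the CUDA toolkit and cuDNN on your PATH for this shell
 module load cuda cudnn
 # confirm which nvcc you now get
 which nvcc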

NOTE: If you get the following message, do NOT install the suggested package. First check whether /usr/local/cuda/bin exists.

 The program 'nvcc' is not installed.  You can install it by typing:
   apt-get install nvidia-cuda-toolkit

DO NOT DO THIS. Most likely CUDA is already installed but /usr/local/cuda/bin is not in your PATH. If the directory does not exist, follow the instructions at nvidia to install CUDA correctly.
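
A quick way to check (a sketch; /usr/local/cuda is the usual install prefix, but your server may differ):

 # check whether the toolkit is already installed
 ls /usr/local/cuda/bin/nvcc
 # if it is there, add it to your PATH for this session
 export PATH=/usr/local/cuda/bin:$PATH
 # or, preferably, let the module system do it
 module load cuda
 nvcc --version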

CUDA-capable software

If your server has a GPU and is missing any of these packages that you would like to use, let us know.

  • cuDNN (nvidia): NVIDIA deep learning library
  • caffe (needs boost, opencv) (install)
  • theano (via pip)
  • torch : dependencies readline-devel, gnuplot, zeromq-devel, nodejs (qt for qtlua, qttorch) (sox for audio)
  • managedCUDA (C#)
  • TensorFlow / download and setup (not installed yet)
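
To see which of these are already available on a given server, check the module system first; Python packages such as theano can usually go into a per-user install (a sketch; exact module and package names vary by host):

 # list the environment modules available on this machine
 module avail
 # load what you need for the current session
 module load cuda cudnn
 # install a Python package into your home directory
 pip install --user theano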

CUDA with SGE

gpu.sge:

#!/bin/bash
# run the job from the directory it was submitted from
#$ -cwd
# request one GPU
#$ -l gpu=1
module load cuda opencv caffe
caffe.bin train --solver=solver.prototxt

This script contains common options; adjust the module list and the command to fit your needs. Submit it with qsub, as shown below.
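
For example (assuming the script above is saved as gpu.sge; output files follow SGE's default <jobname>.o<jobid> naming):

 # submit the job to the scheduler
 qsub gpu.sge
 # check its status in the queue
 qstat
 # once it runs, stdout lands in gpu.sge.o<jobid> in the submission directory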

RTX issues with TensorFlow

If your job works on Pascal GPUs but not on the newer RTX GPUs, your code may be running into GPU memory allocation issues. Try telling TensorFlow to allocate GPU memory as needed rather than all at once:

# use the TF1-style configuration interface from tensorflow.compat.v1
from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession

# allow_growth makes TensorFlow grab GPU memory as it is needed
# instead of reserving the whole card at startup
config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)

External links