Help:Caffe

From CECS wiki

Caffe is one of several deep learning frameworks made with expression, speed, and modularity in mind.

Software homepage: http://caffe.berkeleyvision.org, https://caffe2.ai/
Software availability: available on multiple clusters for CPU and GPU use
Other related software: cuda, cudnn
Command to type to run: module load caffe ; caffe.bin

Using caffe

Sample SGE script:

#$ -cwd
#$ -l gpu=1
module load cuda cudnn opencv caffe-deps
caffe.bin train --solver=solver.prototxt

Sample SGE script with checkpoint and restart support (UNTESTED; please tell us if this works):

#$ -cwd
#$ -l gpu=1
#$ -ckpt caffe_ckpt -c 36000
module load cuda cudnn opencv caffe-deps
caffe.bin train --solver=solver.prototxt

It may be necessary to add code to the script to tell caffe to use the checkpoint.

Meaning of checkpoint options:

-r y
job is restartable (the sample script above does not set this; add a line #$ -r y to request it)
-ckpt caffe_ckpt
use the caffe_ckpt checkpointing environment to trigger checkpoint and migration
-c 36000
checkpoint every 36000 seconds (10 hours)
$RESTARTED
your script can check for this environment variable to see if the job was restarted automatically
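
The note about telling caffe to use the checkpoint could be sketched roughly as follows. This is an untested sketch, not a confirmed recipe for these clusters. It assumes that solver.prototxt sets snapshot and snapshot_prefix so that caffe writes files named <prefix>_iter_<N>.solverstate (the default binary snapshot format), and that SGE sets RESTARTED to 1 for a restarted job. The helper names latest_snapshot and caffe_train_args and the prefix snapshots/net are hypothetical; --snapshot is the caffe train option for resuming from a .solverstate file.

```shell
# Hypothetical helper: print the .solverstate file with the highest
# iteration number for a given snapshot prefix, or nothing if none exist.
latest_snapshot() {
    local prefix="$1" f
    for f in "${prefix}"_iter_*.solverstate; do
        # If the glob matched nothing it stays literal, so test for existence.
        [ -e "$f" ] && printf '%s\n' "$f"
    done | sort -V | tail -n 1   # sort -V (GNU) orders iter_5000 before iter_10000
}

# Hypothetical helper: print the arguments for caffe.bin.
# Resumes from the latest snapshot only when SGE reports a restart
# (assumption: RESTARTED=1) and a snapshot file is actually present.
caffe_train_args() {
    local solver="$1" prefix="$2" snap
    snap=$(latest_snapshot "$prefix")
    if [ "${RESTARTED:-0}" = "1" ] && [ -n "$snap" ]; then
        echo "train --solver=$solver --snapshot=$snap"
    else
        echo "train --solver=$solver"
    fi
}

# In the job script, the last line would then become (assumed prefix):
#   caffe.bin $(caffe_train_args solver.prototxt snapshots/net)
```

The command substitution is left unquoted on purpose so that the arguments are split on whitespace, which means the solver path and snapshot prefix must not contain spaces.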

Compiling caffe on rocks

Caffe is already available on the cluster as a precompiled module. If you want to modify caffe and compile your own version, the directions below may help. They apply to all Rocks clusters here. If your cluster is missing the caffe-deps module, please ask for it to be installed.

All compilation must be done on the head node.

To compile caffe on the local systems, this is the recommended configuration:

  1. module load cuda cudnn caffe-deps opencv opt-python
  2. cp Makefile.config.example Makefile.config
  3. Edit the following values in Makefile.config (change value or uncomment as appropriate):
USE_CUDNN := 1
BLAS := open
PYTHON_INCLUDE := /opt/python/include/python2.7 \
               /opt/python/lib/python2.7/dist-packages/numpy/core/include
PYTHON_LIB := /opt/python/lib
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib64/atlas /usr/lib64 /usr/lib /share/apps/caffe-deps/lib

Change CUDA_DIR to the path shown by module show cuda for the version of cuda you are using.
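
The single-line settings above could also be applied with sed instead of a text editor. The following is a minimal sketch, assuming a stock Makefile.config copied from Makefile.config.example, in which USE_CUDNN is commented out, BLAS is set to atlas, and the CUDA path variable is named CUDA_DIR. The function name configure_caffe and the example path /opt/cuda are hypothetical. The multi-line PYTHON_INCLUDE, PYTHON_LIB and LIBRARY_DIRS values are not handled here and still need to be edited by hand.

```shell
# Hypothetical helper: edit a Makefile.config in place.
#   $1 = path to Makefile.config
#   $2 = CUDA install path (as reported by "module show cuda")
configure_caffe() {
    local cfg="$1" cuda_dir="$2"
    # sed -i (GNU sed) rewrites the file in place; "|" is used as the
    # delimiter so that slashes in the CUDA path need no escaping.
    sed -i \
        -e 's|^# *USE_CUDNN := 1|USE_CUDNN := 1|' \
        -e 's|^BLAS := .*|BLAS := open|' \
        -e "s|^CUDA_DIR := .*|CUDA_DIR := ${cuda_dir}|" \
        "$cfg"
}

# Example call (path is an assumption, use the one from "module show cuda"):
#   configure_caffe Makefile.config /opt/cuda
```

Checking the result afterwards with grep for USE_CUDNN, BLAS and CUDA_DIR is a quick way to confirm that each pattern actually matched.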

If you want to use python layers, add

WITH_PYTHON_LAYER := 1

You may also need to add the cudnn and caffe-deps paths to the following variables:

INCLUDE_DIRS= 
LIBRARY_DIRS=

Then use make to build caffe.

Use make distribute to install caffe in the distribute directory.