This tutorial explains how to set up TensorFlow on GPUs with CUDA compute capability 3.0 (such as the NVIDIA GRID K520, GTX 650, GTX 770…).
You need this custom build of TensorFlow if you encounter an error message like this one:
tensorflow/core/common_runtime/gpu/gpu_device.cc:611] Ignoring gpu device (device: 0, name: GRID K520, pci bus id: 0000:00:03.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
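If you saved the TensorFlow output to a log file, a small check like the one below can tell you whether you hit this case. It is only a sketch: the helper name `needs_cc30_build` is made up for illustration, and the pattern simply matches the wording of the message quoted above.

```shell
# Hypothetical helper: succeeds if the log file given as $1 contains the
# "Ignoring gpu device" message for a compute capability 3.0 card.
needs_cc30_build() {
    grep -q 'Ignoring gpu device.*Cuda compute capability 3\.0' "$1"
}

# Demo: write the message quoted above to a temporary file and test it.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
tensorflow/core/common_runtime/gpu/gpu_device.cc:611] Ignoring gpu device (device: 0, name: GRID K520, pci bus id: 0000:00:03.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
EOF

if needs_cc30_build "$log_file"; then
    result="custom build needed"
else
    result="stock build is fine"
fi
echo "$result"
rm -f "$log_file"
```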
1 – Install Bazel
First, you need to install Bazel. I recommend Bazel 0.1.0, because some other builds raise errors when compiling TensorFlow.
1.1 – Set up Java JDK 8
Bazel requires JDK 7 or 8. Follow the steps for your Ubuntu version to install it:
Ubuntu (14.04)
$ sudo add-apt-repository ppa:webupd8team/java
$ sudo apt-get update
$ sudo apt-get install oracle-java8-installer
Ubuntu (15.10)
$ sudo apt-get install openjdk-8-jdk
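To confirm which JDK ended up on your machine, you can parse the first line of `java -version`. The snippet below is a rough sketch, not an official check: `java_major_version` is a hypothetical helper, and the sample string only assumes the usual `version "1.8.0_…"` format.

```shell
# Hypothetical helper: extract the JDK major version (7, 8, ...) from the
# first line of `java -version` output passed as $1.
java_major_version() {
    # Older JDKs report "1.N.x"; the optional group drops the leading "1.".
    printf '%s\n' "$1" | sed -n 's/.*version "\(1\.\)\{0,1\}\([0-9][0-9]*\).*/\2/p'
}

# On a real machine (java -version writes to stderr, hence 2>&1):
#   major=$(java_major_version "$(java -version 2>&1 | head -n 1)")
sample='java version "1.8.0_72"'   # assumed sample line
major=$(java_major_version "$sample")

if [ "$major" -ge 7 ]; then
    echo "JDK $major is recent enough for Bazel"
else
    echo "JDK $major is too old for Bazel"
fi
```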
1.2 – Install other dependencies
$ sudo apt-get install pkg-config zip g++ zlib1g-dev unzip
1.3 – Download Bazel 0.1.0
https://github.com/bazelbuild/bazel/releases/download/0.1.0/bazel-0.1.0-installer-linux-x86_64.sh
1.4 – Run the Bazel installer
$ chmod +x bazel-0.1.0-installer-linux-x86_64.sh
$ ./bazel-0.1.0-installer-linux-x86_64.sh --user
The Bazel executable is installed at $HOME/bin/bazel.
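The commands in this tutorial call Bazel by its full path. If you would rather type just `bazel`, one possible approach is to put `$HOME/bin` on your `PATH`, for example from your `~/.bashrc`. This is an optional sketch; the variable `bazel_status` is only there for illustration.

```shell
# Prepend $HOME/bin to PATH unless it is already there.
case ":$PATH:" in
    *":$HOME/bin:"*) ;;
    *) PATH="$HOME/bin:$PATH" ;;
esac
export PATH

# Report whether a bazel executable is now reachable.
if command -v bazel >/dev/null 2>&1; then
    bazel_status="found"
else
    bazel_status="missing"
fi
echo "bazel: $bazel_status"
```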
2 – Install Tensorflow
2.1 – Install dependencies
$ sudo apt-get install python-numpy swig python-dev python-wheel git
2.2 – Download Tensorflow 0.7.0
$ git clone --recurse-submodules https://github.com/tensorflow/tensorflow.git -b 0.7.0
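The --recurse-submodules option matters: readers who cloned without it reported a build failure with "Extension file not found: google/protobuf/protobuf.bzl". The sketch below checks for that file before you start a long build. The helper name and the `TF_SRC` variable are assumptions for illustration, and the file path is taken from that error message.

```shell
# Hypothetical helper: succeeds if the protobuf submodule was checked out
# in the TensorFlow source tree given as $1.
has_protobuf_submodule() {
    [ -f "$1/google/protobuf/protobuf.bzl" ]
}

# TF_SRC is an assumed, optional override; it defaults to ./tensorflow.
src_dir="${TF_SRC:-tensorflow}"

if has_protobuf_submodule "$src_dir"; then
    echo "protobuf submodule present"
else
    echo "protobuf submodule missing; from inside $src_dir, try:"
    echo "  git submodule update --init --recursive"
fi
```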
2.3 – Compile Tensorflow
Configure the TensorFlow build for CUDA compute capability 3.0:
$ cd tensorflow
$ TF_UNOFFICIAL_SETTING=1 ./configure
Answer the basic prompts according to your setup. When asked for the CUDA compute capability, enter: 3.0
Compile TensorFlow and run the example trainer:
$ $HOME/bin/bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
$ bazel-bin/tensorflow/cc/tutorials_example_trainer --use_gpu
The example trainer should print output like this:
000009/000005 lambda = 2.000000 x = [0.894427 -0.447214] y = [1.788854 -0.894427]
000006/000001 lambda = 2.000000 x = [0.894427 -0.447214] y = [1.788854 -0.894427]
000009/000009 lambda = 2.000000 x = [0.894427 -0.447214] y = [1.788854 -0.894427]
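If you want to check the trainer output in a script rather than by eye, one rough option is to count the lines that carry the expected value. This is a sketch only: `count_expected_lines` is a hypothetical helper, and the sample text is just the three lines shown above.

```shell
# Hypothetical helper: count the lines on stdin that contain the expected
# "lambda = 2.000000" value.
count_expected_lines() {
    grep -c 'lambda = 2\.000000'
}

# Demo on the sample output shown above.
sample_output='000009/000005 lambda = 2.000000 x = [0.894427 -0.447214] y = [1.788854 -0.894427]
000006/000001 lambda = 2.000000 x = [0.894427 -0.447214] y = [1.788854 -0.894427]
000009/000009 lambda = 2.000000 x = [0.894427 -0.447214] y = [1.788854 -0.894427]'

matches=$(printf '%s\n' "$sample_output" | count_expected_lines)
echo "lines with the expected lambda: $matches"
```

On a real run, you would pipe the output of bazel-bin/tensorflow/cc/tutorials_example_trainer --use_gpu into the helper instead of the sample text.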
3 – Install Python interface
If you plan to use the TensorFlow Python interface, you first need to build a pip package and install it.
3.1 – Create pip package
First, use Bazel to build the script that creates the pip package:
$ $HOME/bin/bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
3.2 – Install python tensorflow
Then run that script to generate the package in /tmp/tensorflow_pkg, and install it with pip:
$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
# The name of the .whl file depends on your platform and on the TensorFlow version you built.
$ pip install /tmp/tensorflow_pkg/tensorflow-0.6.0-cp27-none-linux_x86_64.whl
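Since the wheel file name varies, you may prefer to look it up instead of typing it by hand. The sketch below shows one way to do that. `find_tf_wheel` and the `PKG_DIR` variable are hypothetical names, and the directory defaults to the /tmp/tensorflow_pkg path used above.

```shell
# Hypothetical helper: print the path of the first TensorFlow wheel found
# in directory $1, or fail if there is none.
find_tf_wheel() {
    for whl in "$1"/tensorflow-*.whl; do
        if [ -f "$whl" ]; then
            printf '%s\n' "$whl"
            return 0
        fi
    done
    return 1
}

# PKG_DIR is an assumed, optional override of the output directory.
pkg_dir="${PKG_DIR:-/tmp/tensorflow_pkg}"

if wheel_path=$(find_tf_wheel "$pkg_dir"); then
    echo "Install with: pip install $wheel_path"
else
    echo "No TensorFlow wheel found in $pkg_dir"
fi
```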
3.3 – Test it
To check that everything works, run this example:
$ cd tensorflow/models/image/mnist
$ python convolutional.py
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Extracting data/t10k-images-idx3-ubyte.gz
Extracting data/t10k-labels-idx1-ubyte.gz
Initialized!
Epoch 0.00
Minibatch loss: 12.054, learning rate: 0.010000
Minibatch error: 90.6%
Validation error: 84.6%
Epoch 0.12
Minibatch loss: 3.285, learning rate: 0.010000
Minibatch error: 6.2%
Validation error: 7.0%
...
...
If you run into any difficulty, leave a comment!
Hi! Thanks for writing this tutorial.
Everything works great until I try to run the command: “$HOME/bin/bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer”.
This is the output I see:
“Extracting Bazel installation…
…..
ERROR: /home/me/tensorflow/tensorflow/core/BUILD:1: Extension file not found: ‘google/protobuf/protobuf.bzl’.
ERROR: /home/me/tensorflow/tensorflow/cc/BUILD:65:1: error loading package ‘tensorflow/core’: Extension file not found: ‘google/protobuf/protobuf.bzl’ and referenced by ‘//tensorflow/cc:tutorials_example_trainer’.
ERROR: Loading failed; build aborted.
INFO: Elapsed time: 1.006s”
Any advice?
Hi, I managed to resolve the issue by cloning the source code with this command: “git clone -b 0.6.0 --recurse-submodules https://github.com/tensorflow/tensorflow.git”
git clone -b 0.6.0 –recurse-submodules https://github.com/tensorflow/tensorflow.git
fatal: repository ‘–recurse-submodules’ does not exist
It is a typo: it should be --recurse-submodules, with two plain hyphens.