Introduction and goal

Before I jumped into the field of deep learning, my first thoughts were about the hardware I would need to run deep learning models. Of course I could have used cloud services such as Amazon AWS GPU instances, but when I saw their pricing I realized that this wasn't a viable solution in the long run. Truth is, I also wanted to replace my old RX 280 with a shiny new GTX 1070 for gaming... And to my surprise this GPU turned out to be much faster than what Amazon could offer at a much higher price! You can find the benchmark I used for the comparison at the end of the post.

We won't discuss how to set up the hardware required for this tutorial. Rather, my goal here is to set up the software. I assume you already have a machine with a CUDA-capable NVIDIA GPU and a Linux OS installed on it (here we will use Ubuntu 16.04 LTS x64). If you have an AMD/ATI GPU most of this tutorial still applies to you, but I won't cover the installation on this kind of hardware.

This post is split into 2 parts:

  • The first one will teach you how to set up your environment locally so that you can use your machine with all the necessary tools for DL.
  • The second part will focus on using your machine remotely with security in mind, so that you can access it and turn it on/off from anywhere in the world.

The installation includes: Jupyter notebooks, Anaconda, Pycharm IDE (yes, not everyone wants to use notebooks) and Tensorflow running on a GPU.

Most of the knowledge gained here (if not all) could also be applied to other deep learning frameworks such as Pytorch.
By the end of this post you will be able to run deep learning models out of the box, as effectively as on an Amazon AWS GPU instance.

Audience

This tutorial targets 2 types of audience: one with a basic computer science background who would like to properly set up a secure remote environment for deep learning, and another without a CS background who would like to have their own deep learning rig. As these 2 audiences are distinct, parts of this tutorial are in collapsed boxes which mainly target the second audience. Those with a CS background can skip these boxes if they already know the subject.

Installation of Anaconda

Anaconda is a very useful tool to switch between different virtual environments.

What is a virtual environment? Why use it?

A virtual environment is a kind of sandbox in which your system programs are replaced by the ones from this environment.
For example, let's say you have Python 2 installed on your machine but you have to launch a Python 3 script. How would you do that? You can install Python 3 alongside Python 2 and pick one of the two by running either

python3 myscript.py

or

python2 myscript.py

But now you notice that the script requires Tensorflow version 0.12 and you have Tensorflow 1.0 on your machine. So what would you do?
Well, you could replace one with the other, but then you get dependency conflicts because the older version depends on older libraries.
Now you scratch your head and tell yourself: "All this mess will only end up breaking my Python installation, and if I want to run code with TF 1.0 I will have to reinstall it". This is where virtual environments come into play, because they allow you to switch between multiple installations with only 1 command.

To install Anaconda, open your terminal and type:

python --version

If you get an output like Python 2.x.x then type:

wget http://repo.continuum.io/archive/Anaconda2-4.3.1-Linux-x86_64.sh

If instead you get an output with Python 3.x.x use:

wget http://repo.continuum.io/archive/Anaconda3-4.3.1-Linux-x86_64.sh

Then use the command:

chmod +x Anaconda2-4.3.1-Linux-x86_64.sh

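to give the execution rights to the file you just downloaded with wget (replace Anaconda2 with Anaconda3 in the file name, here and in the next command, if you downloaded the Python 3 installer).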
Now run it with:

./Anaconda2-4.3.1-Linux-x86_64.sh

Accept the license and default location. Let it install the required files, then type yes when asked to prepend Anaconda to your PATH and you're all set!

Finally type:

source ~/.bashrc

to let your terminal reload the ~/.bashrc file, which contains the path to your Anaconda installation. You can also close and reopen your console to apply the changes.
Now type:

conda

At this point you should get a list of parameters to use with conda. If you don't, restart this section from the beginning and check that you followed the instructions properly.

Creating a virtual environment with Anaconda

To create a virtual environment with Anaconda use the following command:

conda create -n deep-learning python=3.5 anaconda

This command will create an environment called deep-learning which will run Python 3.5 and include the libraries shipped by default with Anaconda. Accept the package installation and let it finish its work.
Now if you type:

conda env list

You should have something like this:

# conda environments:
#
deep-learning            /home/ekami/anaconda2/envs/deep-learning
root                  *  /home/ekami/anaconda2

You just created your first virtual environment!
The little * between root and /home/ekami/anaconda2 indicates the environment you are currently in. Right now we are in the default Python environment of conda, but let's move to our deep-learning environment with this command:

source activate deep-learning

Once you're done you should see something similar to this:

(deep-learning) ekami@ekami-Desktop:~$

The (deep-learning) at the beginning indicates the virtual environment you are in.
Now if you type:

python --version

you should get something like this:

Python 3.5.2 :: Anaconda 4.3.1 (64-bit)
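
You can also cross-check this from inside Python. The sketch below reports which interpreter is running and which conda environment is active. It assumes that activating an environment exports the CONDA_DEFAULT_ENV variable; the helper name describe_environment is made up for this illustration.

```python
import os
import sys


def describe_environment(environ=None, executable=None):
    """Return the interpreter path and the active conda environment name.

    Assumption: activating a conda environment exports CONDA_DEFAULT_ENV.
    If the variable is missing we report None instead of guessing.
    """
    environ = os.environ if environ is None else environ
    executable = sys.executable if executable is None else executable
    return {
        "python": executable,
        "conda_env": environ.get("CONDA_DEFAULT_ENV"),
    }


info = describe_environment()
print("Interpreter:", info["python"])
print("Conda environment:", info["conda_env"])
```

If the interpreter path points inside your anaconda2/envs/deep-learning directory, the environment is the one being used.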

If you want to go back to your original environment just type:

source deactivate

Perfect, now let's move on and install the required packages.

Installing CUDA & cuDNN

[This part is irrelevant if you want to use Tensorflow on your CPU]

To run your deep learning models on the GPU you'll need a set of libraries called CUDA.

What is CUDA?

CUDA is a set of libraries which lets you do GPGPU (general-purpose computing on graphics processing units), acting as a "bridge" between Tensorflow's code and your physical GPU. Just as you use Tensorflow functions to create and run your deep learning models, Tensorflow uses CUDA functions to run them on the GPU.

If you have an AMD card you may want to look at [OpenCL](http://support.amd.com/en-us/kb-articles/Pages/OpenCL2-Driver.aspx), which is more or less an equivalent to CUDA but is supported on both AMD and NVIDIA GPUs, whereas CUDA only works on NVIDIA GPUs.

To install CUDA go to [this page](https://developer.nvidia.com/cuda-downloads), select the version of CUDA for your OS and copy its download link. In my case it looks like this:

[Screenshot: CUDA download page]

Paste your download link into a wget command. For example:

wget https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda_8.0.61_375.26_linux-run

Before running the installer you need to switch to a new tty (a full-screen console if you like). To do so press Ctrl + Alt + F1 on your keyboard, log in and type:

sudo service lightdm stop

This command will stop your graphical interface to allow you to install the NVIDIA drivers.
In this same tty, from the folder where you downloaded CUDA, run this command:

chmod +x ./cuda_8.0.61_375.26_linux-run && sudo ./cuda_8.0.61_375.26_linux-run

Then answer the questions as below:

Do you accept the previously read EULA?
accept/decline/quit: accept     

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 375.26?
(y)es/(n)o/(q)uit: y

Do you want to install the OpenGL libraries?
(y)es/(n)o/(q)uit [ default is yes ]: y

Do you want to run nvidia-xconfig?
This will update the system X configuration file so that the NVIDIA X driver
is used. The pre-existing X configuration file will be backed up.
This option should not be used on systems that require a custom
X configuration, such as systems with multiple GPU vendors.
(y)es/(n)o/(q)uit [ default is no ]: n

Install the CUDA 8.0 Toolkit?
(y)es/(n)o/(q)uit: y

Enter Toolkit Location
 [ default is /usr/local/cuda-8.0 ]: 

Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y

Install the CUDA 8.0 Samples?
(y)es/(n)o/(q)uit: n

Installing the NVIDIA display driver...

And let it finish.

Note:

You can answer "No" when asked to install the GPU drivers if you have already installed the proprietary graphics drivers for your GPU.
If you're not sure, answer "yes".

If you run into a conflict with the open source Nouveau drivers you may need to blacklist it first with these commands.

When it's done you will have to add the CUDA executables and the CUDA dynamic and static libraries to your PATH and LD_LIBRARY_PATH respectively.

What are `PATH` and `LD_LIBRARY_PATH`?

The PATH is an environment variable in Linux and other Unix-like operating systems that tells the shell (the program in which you issue your commands; the one we use by default is called bash) which directories to search for executable files. So basically when you type:

conda

How does your shell know what conda is? Well, it looks through the directories listed in your PATH variable, in order, and executes the program from the first one that contains it. If none of them does, you get a conda: command not found error. If you want to see what your PATH variable looks like, type:

echo $PATH

You'll see all the directories your shell searches when you issue a command.
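
To make the lookup concrete, here is a minimal Python sketch of what the shell roughly does with PATH. It is a simplification and not how bash is actually implemented (bash also caches lookups and handles builtins and aliases); the function name find_in_path is made up for this illustration.

```python
import os


def find_in_path(command, path_value):
    """Return the first executable named `command` found in `path_value`.

    `path_value` is a PATH-style string: directories separated by ':'
    (os.pathsep). Directories are searched left to right, so earlier
    entries win. Returns None when nothing matches.
    """
    for directory in path_value.split(os.pathsep):
        if not directory:
            # Simplification: a real shell treats an empty entry as the
            # current directory; this sketch just skips it.
            continue
        candidate = os.path.join(directory, command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


print(find_in_path("conda", os.environ.get("PATH", "")))
```

Because the search stops at the first match, the order of the directories matters: this is why the Anaconda installer prepends its directory to PATH rather than appending it.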
In a similar fashion to PATH, LD_LIBRARY_PATH is an environment variable used not to look for executables, but to look for what we call dynamic libraries or shared libraries.

What are dynamic and static libraries?

Static libraries are files ending with .a whereas dynamic libraries are files ending with .so. Static libraries are included in a program when it is compiled (for example the program echo in echo $PATH may directly include something like display.a internally to be able to display something on the screen). As the library is internal, the program ends up as 1 big executable file which does not rely on external files.

Dynamic libraries, on the other hand, are excluded from the compilation: they are files living on their own somewhere on your hard drive, and they are loaded when the program runs.
In our case we can see where the CUDA dynamic libraries are located by typing:

ls /usr/local/cuda/lib64

You'll see a bunch of .a and .so files there. Tensorflow will look for these .so files, but to allow it to do so we need to tell it where to look, that is, set LD_LIBRARY_PATH to point to this directory.
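
As a rough illustration of dynamic loading, the Python sketch below asks the system loader to open a shared library by name, which is similar in spirit to what Tensorflow does at startup. The helper can_load is hypothetical, and the library name is an assumption taken from the CUDA 8.0 log shown later in this post; adjust it to the version you installed.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can find and open `library_name`."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the loader cannot locate or open the file.
        return False
    return True


# Assumption: CUDA 8.0 names its cuBLAS library libcublas.so.8.0.
name = "libcublas.so.8.0"
print(name, "->", "found" if can_load(name) else "not found")
```

Note that the loader reads LD_LIBRARY_PATH when the process starts, so the variable has to be set before you launch Python, not from inside the script.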

To add CUDA to your environment open the .bashrc file with this command:

nano ~/.bashrc

Scroll until you reach the end of the file. You should see something like this:

# added by Anaconda2 4.3.1 installer
export PATH="/home/ekami/anaconda2/bin:$PATH"

That looks familiar, right? Now you need to specify where CUDA is, so that you end up with something like this:

# added by Anaconda2 4.3.1 installer
export PATH="/home/ekami/anaconda2/bin:/usr/local/cuda/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda/lib64:$LD_LIBRARY_PATH"

Notice the : used as a separator.
Now close the file: in the nano editor press Ctrl + X, it will ask if you want to save the file, type Y then press Enter.
Finally reload your .bashrc file with:

source ~/.bashrc
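
If you want to double-check the result, the following sketch tests whether the two CUDA directories made it into the variables. The directories are the default locations used in this tutorial; the helper name has_entry is made up for this illustration.

```python
import os


def has_entry(variable_value, directory):
    """Check whether `directory` is one of the ':'-separated entries."""
    wanted = os.path.normpath(directory)
    entries = [e for e in variable_value.split(os.pathsep) if e]
    return any(os.path.normpath(e) == wanted for e in entries)


# Assumption: CUDA was installed with the /usr/local/cuda symbolic link.
checks = [
    ("PATH", "/usr/local/cuda/bin"),
    ("LD_LIBRARY_PATH", "/usr/local/cuda/lib64"),
]
for variable, directory in checks:
    ok = has_entry(os.environ.get(variable, ""), directory)
    print("{}: {} {}".format(variable, directory, "present" if ok else "MISSING"))
```

Run it from a terminal in which you have already reloaded ~/.bashrc, otherwise the script will still see the old values.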

And you're done with CUDA! Yay!
Now let's install cuDNN. You thought you could open your champagne? Not yet!

What is cuDNN?

From the NVIDIA website:

cuDNN is a GPU-accelerated library of primitives for deep neural networks. cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers.

You can think of cuDNN as some .so libraries which could have been shipped with CUDA but are not, because not everyone doing GPGPU wants to do deep learning. So just consider them a set of libraries living alongside CUDA.

To install cuDNN, go to this page, register on the NVIDIA website, then download the file for Linux systems as below:
[Screenshot: cuDNN download]
You should now have downloaded a .tgz file. Go into the directory where you saved it and uncompress it:

cd ~/Downloads/ && tar -xvf cudnn-8.0-linux-x64-v5.1.tgz

A new directory named cuda will be created. Go into it and copy all its content to your CUDA directory by running the following command:

cd cuda && sudo cp -vfR * /usr/local/cuda/

Now if you run this command:

ls /usr/local/cuda/lib64/ | grep libcudnn.so

You should have an output similar to this:

libcudnn.so
libcudnn.so.5
libcudnn.so.5.1.10

If not, you missed something. Restart this section and follow each step carefully.
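
The same check can be done from Python, which is handy if you later want to script your setup. This is a minimal sketch that assumes the default /usr/local/cuda/lib64 location; the function name list_cudnn_libraries is made up for this illustration.

```python
import glob
import os


def list_cudnn_libraries(lib_dir="/usr/local/cuda/lib64"):
    """Return the sorted file names in `lib_dir` that start with libcudnn.so."""
    pattern = os.path.join(lib_dir, "libcudnn.so*")
    return sorted(os.path.basename(path) for path in glob.glob(pattern))


found = list_cudnn_libraries()
print(found if found else "No libcudnn.so files found")
```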
Finally restart your system with:

sudo reboot

Congratulations! You just passed the least intuitive part of this tutorial!

Installing the required packages for deep learning

Let's start by installing OpenBLAS, which is a package used to accelerate tensor operations on the CPU:

sudo apt-get install libopenblas-dev liblapack-dev

Now move to your deep-learning virtual environment if you're not already in it with:

source activate deep-learning

We will install the following packages:

  • Tensorflow
  • The Scipy library collection

What is Scipy?

Scipy is a collection of optimized libraries for scientific computing. It contains libraries such as Numpy, Matplotlib and Pandas.

To install the [Scipy](https://www.scipy.org/) collection execute:

conda install -c anaconda scipy

Finally, to install Tensorflow you have 2 choices. If you want to run Tensorflow on your CPU, you just have to run this command:

pip install tensorflow

If you want it to use your GPU, which is probably what you want if you followed this tutorial entirely, run this one instead:

pip install tensorflow-gpu

(If you're an AMD GPU user you may want to take a look at the community Tensorflow version for OpenCL)
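
Once the package is installed, a quick way to see whether Tensorflow detects your GPU is to list the devices it knows about. The sketch below relies on device_lib from Tensorflow's python.client package, which is an internal module rather than part of the documented public API, so treat it as a convenience that may change between versions. The helper gpu_device_names is made up for this illustration.

```python
def gpu_device_names(devices):
    """Return the names of all devices whose device_type is "GPU"."""
    return [d.name for d in devices if d.device_type == "GPU"]


try:
    # Internal Tensorflow module; assumed available in your installed version.
    from tensorflow.python.client import device_lib
except ImportError:
    print("Tensorflow is not installed in this environment")
else:
    gpus = gpu_device_names(device_lib.list_local_devices())
    print("GPUs visible to Tensorflow:", gpus if gpus else "none")
```

If the list is empty with tensorflow-gpu installed, the usual suspects are the PATH and LD_LIBRARY_PATH settings from the CUDA section.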

Finally you have to tell Jupyter to recognize your virtual environment as a kernel. We can do this simply with the nb_conda package:

conda install nb_conda

This package lets Jupyter detect your installed Anaconda environments.

Now close your current terminal so we can properly test Jupyter from a new terminal session.

And that's all! We're done for this part :)

Testing Jupyter

Now comes the fun part!
You won't have to do much here as Jupyter is shipped with Anaconda. But let's test that our environment works properly.
To do so I invite you to run a handcrafted benchmark which will try to predict digits from the MNIST dataset and output the time it took to do so.
To download it run:

wget https://raw.githubusercontent.com/EKami/deep_learning_foundation_nanodegree/master/tensorflow_benchmark.ipynb

Then run Jupyter with:

jupyter notebook

This will open a new tab in your web browser where you can open the tensorflow_benchmark.ipynb file from Jupyter.

Now select the virtual env we previously created:
[Screenshot: Jupyter]

Then click on "Kernel > Restart & Run All".
If everything went fine and you used Tensorflow on your GPU, you should see something like this in the console from which you ran Jupyter:

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally

And at the end of the last cell you should start to see things like this:

Iter 12800, Minibatch Loss= 2820.980713, Training Accuracy= 0.85938
Iter 25600, Minibatch Loss= 1172.836670, Training Accuracy= 0.91406
...
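
If you want to time your own code the way the benchmark does, a small helper like the one below is enough. It is only a sketch of the general idea and not the code used in the notebook; the name timed and the dummy workload are placeholders.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Measure wall-clock seconds for the enclosed block and store them in `results`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}
with timed("dummy workload", timings):
    # Placeholder work; replace with your own training loop.
    sum(i * i for i in range(100000))

for label, seconds in timings.items():
    print("{}: {:.3f} s".format(label, seconds))
```

Running the same block once with the CPU package and once with the GPU package gives you a rough comparison on your own machine.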

As you run Jupyter locally you don't really need to secure it. In Part 2 we will expose it to the internet, which means that anyone will be able to reach it, especially cryptocurrency miners... Remember, Jupyter can serve as a file explorer and, even worse, can execute code: that's a hacker's heaven if you don't secure it properly!
Everything is now set up, congratulations!! :)

Configuring Pycharm

Now if you don't like working in Jupyter notebooks you can still use a text editor such as Atom or Visual Studio Code, but personally I prefer using Pycharm: it's free and really powerful :) .
Here I'll just show you how to run Pycharm with the conda environment we just created. In the second part I'll show you how to use your Pycharm instance locally and execute code remotely.
Copy this code (which is the same as the one in the Jupyter notebook and which you can find here) into a file named script.py.

Go to File > Settings then navigate to Project: script.py > Project Interpreter.
From the drop-down list select your Anaconda environment, then press OK:
[Screenshot: Pycharm project interpreter]

Let the indexing process finish, then open the drop-down arrow in the top right corner and click on Edit configuration:
[Screenshot: Pycharm]

A new window will appear. Click on the + sign and choose Python.
Give it a name and in the Configuration tab fill the Script entry with the path to your script.py file.
Now click on the ... next to Environment variables: and add the variable LD_LIBRARY_PATH so it looks like this:
[Screenshot: Pycharm environment variables]

Click OK. The final result should look like this:
[Screenshot: Pycharm run configuration]

Press OK. You're now ready to launch your script by clicking on the green "play" button.
If everything works, perfect, you're done! :)

Acknowledgements

A big thanks to Akshay Lakhi for his corrections and feedback.
If you want clarifications, or find any typos or improvements to add, don't hesitate to leave your feedback in the comments!
In Part 2 we will learn how to take what we did so far and expose it to the internet, stay tuned!