Data Science, Artificial Intelligence, and Machine Learning have risen exponentially in the last decade, largely due to the amount of data being generated and the huge computational power of modern computers.
Deep Learning is a sub-field of Machine Learning that makes better predictions than traditional Machine Learning techniques as the amount of data increases. It works on the principle of neural networks, which are loosely modelled on the neurons in our brain. The neurons in the hidden layers of a deep learning model learn patterns from the data and more often than not give accurate predictions. Image classification, digit recognition, etc., are some of the use cases where Deep Learning is used.
Suppose we want to predict the price of a house from variables like its size. Linear Regression could be applied here, but Deep Learning algorithms would produce much better results. The neurons receive the input and apply an activation function, such as the Rectified Linear Unit (ReLU), to generate the output. An activation function takes a real number as input and produces a real number as output.
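As a toy illustration of the house-price example (the sizes and prices below are made-up numbers, not real housing data), a single neuron with no activation reduces to linear regression, which can be fit with plain gradient descent:

```python
# Toy linear-regression fit of price = w * size + b on synthetic data.
# The sizes and prices below are invented purely for illustration.
sizes  = [1.0, 2.0, 3.0, 4.0]   # e.g. size in thousands of sq. ft.
prices = [2.0, 4.0, 6.0, 8.0]   # e.g. price in hundreds of thousands

w, b = 0.0, 0.0                 # start with an uninformed model
lr = 0.01                       # learning rate
for _ in range(5000):
    # Gradients of the mean squared error with respect to w and b
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(sizes, prices)) / len(sizes)
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(sizes, prices)) / len(sizes)
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # w should approach 2.0 and b approach 0.0
```

Since the synthetic data is perfectly linear (price = 2 × size), gradient descent recovers the slope almost exactly.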
Supervised Learning is a process where an input is mapped to an output via a function learned from labelled examples. The types of neural networks vary according to the application: Convolutional Neural Networks are used for image classification, while speech recognition is typically achieved with Recurrent Neural Networks.
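In its simplest form, supervised learning just means learning that input-to-output mapping from labelled examples. A minimal sketch (a toy nearest-neighbour classifier on made-up 1-D data, not a neural network):

```python
# Minimal supervised learning: 1-nearest-neighbour on made-up 1-D data.
# Each training example is a (feature, label) pair.
train = [(1.0, "cheap"), (2.0, "cheap"), (8.0, "expensive"), (9.0, "expensive")]

def predict(x):
    # Map an input to an output by returning the label of the closest example.
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

print(predict(1.5))  # nearest neighbours are labelled "cheap"
print(predict(8.5))  # nearest neighbours are labelled "expensive"
```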
As the data increases, algorithms like Support Vector Machines, Linear Regression, etc., don't improve much, whereas more data keeps improving deep learning algorithms. The choice of activation function is also an important factor in Deep Learning with Python: the sigmoid function increases computation time, a problem which the ReLU function resolves.
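The computational difference between the two activation functions is easy to see in plain Python: sigmoid needs an exponential per call, while ReLU is a single comparison.

```python
import math

def sigmoid(x):
    # Sigmoid squashes any real number into (0, 1), but needs an exponential.
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    # ReLU is just a max with zero: cheap to compute and to differentiate.
    return max(0.0, x)

print(sigmoid(0.0))           # 0.5
print(relu(-3.0), relu(3.0))  # 0.0 3.0
```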
In Deep Learning, relatively little code can give outstanding results, provided that the right parameters are used and the hyperparameters are fine-tuned.
Deep Learning Frameworks
So far we have built a brief intuition about Deep Learning. Based on their popularity, below is a list of five Deep Learning frameworks you could use in your next project. We have provided sample code in Python for three of the five frameworks.
1. Deep learning with Keras
Developed and maintained by Francois Chollet, Keras is a user-friendly application programming interface that minimizes the number of user actions and provides clear feedback on errors. You could easily add new modules to a Keras model, and everything is described in Python code.
The Keras model is the core data structure and the Sequential model could be imported by:
from keras.models import Sequential

model = Sequential()
You can add layers to this model using the .add() method.
from keras.layers import Dense

model.add(Dense(units=64, activation='relu', input_dim=100))
model.add(Dense(units=10, activation='softmax'))
The model is configured using the .compile() method.
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
Based on the number of epochs set, the model is trained.
# x_train and y_train are Numpy arrays -- just like in the scikit-learn API.
model.fit(x_train, y_train, epochs=5, batch_size=32)
The .evaluate() method would evaluate the performance and the .predict() method would give the predictions. Keras could be installed using the pip install keras command.
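To make concrete what a layer such as Dense(units=64, activation='relu') computes, here is a pure-Python sketch of the forward pass, output = relu(x · W + b). The tiny dimensions and weight values are made up purely for illustration:

```python
# Forward pass of a dense (fully connected) layer: output = relu(x . W + b).
# Dimensions and values are invented for illustration only.
x = [1.0, -2.0, 0.5]                        # one input sample with 3 features
W = [[0.2, -0.4], [0.1, 0.3], [0.5, 0.6]]   # weights: 3 inputs x 2 units
b = [0.1, -0.1]                             # one bias per unit

output = []
for j in range(len(b)):
    # Weighted sum of inputs for unit j, plus its bias
    z = sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
    output.append(max(0.0, z))              # ReLU activation

print(output)  # first unit fires (~0.35), second is clipped to 0.0 by ReLU
```

A Dense layer in Keras does this same computation for every sample in the batch, with the weights W and biases b learned during training.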
The methods common to all Keras layers are .get_weights(), .set_weights(weights), and .get_config(). The input and output tensors of a single-node layer could be found with the .input, .output, .input_shape, and .output_shape attributes. Below are some of the other layers in Keras and their representation in Python.
keras.layers.MaxPooling1D(pool_size=2, strides=None, padding='valid', data_format='channels_last')
keras.layers.Conv1D(filters, kernel_size, strides=1, padding='valid', data_format='channels_last', dilation_rate=1, activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None)
Locally Connected Layers
keras.layers.LocallyConnected1D(filters, kernel_size, strides=1, padding='valid', data_format=None, activation=None, use_bias=True, kernel_initializer='glorot_uniform', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None)
keras.layers.RNN(cell, return_sequences=False, return_state=False, go_backwards=False, stateful=False, unroll=False)
2. Deep learning with TensorFlow

Developed by Google, TensorFlow is the most popular Deep Learning framework in use. Here, tensors, which are multi-dimensional arrays, act as the input. The versatility of this framework makes it extremely popular, and TensorFlow's single API allows it to run on different platforms like the cloud, desktop, and so on. The core of TensorFlow is written in C++, which makes it fast, and it could be operated from languages like Python.
TensorFlow could be installed using the pip install tensorflow command. Please note that you would need to set up a local virtual environment to install TensorFlow 2.0. Below is TensorFlow code in Python.
- Importing the libraries –
from __future__ import absolute_import, division, print_function, unicode_literals
!pip install tensorflow-gpu==2.0.0-alpha0
import tensorflow as tf
- The MNIST dataset is loaded and split into a train and a test set.
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
- The model is built and several parameters are set.
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
- The model is trained and evaluated.
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
TensorBoard, a feature of TensorFlow, allows you to graphically display and monitor a TensorFlow program. In a TensorFlow dataflow graph, the nodes represent operations while the edges represent the tensors flowing between them. Both linear and non-linear models could be built using this framework, and its vast collection of tools makes it easy to combine with other Machine Learning algorithms. TensorFlow's auto-differentiation benefits gradient descent algorithms.
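What auto-differentiation automates can be checked numerically. For example, the derivative of x² + 3x is 2x + 3, and a central finite difference recovers that value; this stdlib-only sketch is an illustration of the idea, not TensorFlow's actual mechanism:

```python
def f(x):
    # Example function whose analytic derivative is 2*x + 3
    return x**2 + 3*x

def numeric_grad(f, x, eps=1e-6):
    # Central finite difference: (f(x+eps) - f(x-eps)) / (2*eps)
    return (f(x + eps) - f(x - eps)) / (2 * eps)

# At x = 2.0 the analytic derivative 2*x + 3 equals 7.0
print(numeric_grad(f, 2.0))
```

Frameworks like TensorFlow compute such gradients exactly and efficiently by differentiating the operations in the graph, which is what gradient descent relies on.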
3. Deep learning with PyTorch

Developed by Facebook, PyTorch is a Deep Learning framework that uses GPUs and provides flexibility and speed. In layman's terms, PyTorch uses Tensors, which are similar to NumPy arrays but with GPU acceleration.
Below is sample code that uses PyTorch on random data with a two-layer network. The forward and backward passes are implemented manually.
# -*- coding: utf-8 -*-
import torch

dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0")  # Uncomment this to run on GPU

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random input and output data
x = torch.randn(N, D_in, device=device, dtype=dtype)
y = torch.randn(N, D_out, device=device, dtype=dtype)

# Randomly initialize weights
w1 = torch.randn(D_in, H, device=device, dtype=dtype)
w2 = torch.randn(H, D_out, device=device, dtype=dtype)
learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute predicted y
    h = x.mm(w1)
    h_relu = h.clamp(min=0)
    y_pred = h_relu.mm(w2)

    # Compute and print loss
    loss = (y_pred - y).pow(2).sum().item()
    print(t, loss)

    # Backprop to compute gradients of w1 and w2 with respect to loss
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h_relu = grad_y_pred.mm(w2.t())
    grad_h = grad_h_relu.clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)

    # Update weights using gradient descent
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
The manual implementation of the backward pass could be avoided by using the automatic differentiation mechanism provided by the Autograd package in PyTorch.
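Autograd records the operations applied to tensors and replays them backwards to compute gradients. The idea can be sketched with a toy scalar class (purely an illustration of reverse-mode differentiation, not PyTorch's actual implementation):

```python
class Value:
    """Toy scalar that records its computation graph for reverse-mode autodiff."""
    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents          # inputs that produced this value
        self._local_grads = local_grads  # d(self)/d(parent) for each parent

    def __add__(self, other):
        # d(a+b)/da = 1, d(a+b)/db = 1
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        # d(a*b)/da = b, d(a*b)/db = a
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def backward(self, upstream=1.0):
        # Chain rule: accumulate the upstream gradient into each parent.
        self.grad += upstream
        for parent, local in zip(self._parents, self._local_grads):
            parent.backward(upstream * local)

x = Value(3.0)
w = Value(2.0)
b = Value(1.0)
y = x * w + b   # y = w*x + b = 7.0
y.backward()
print(x.grad, w.grad, b.grad)  # dy/dx = w = 2.0, dy/dw = x = 3.0, dy/db = 1.0
```

PyTorch does the equivalent on whole tensors: marking them with requires_grad and calling .backward() on the loss fills in .grad for every weight, removing the hand-written backprop above.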
PyTorch's front end is hybrid in nature, providing flexibility in its workflow: you could develop eagerly and then transition a model to graph mode, which runs in C++ runtime environments, when speed is needed. It supports P2P (peer-to-peer) communication for distributed training as well. PyTorch could be used with libraries such as Numba and Cython, and several tools for deep learning, such as torchvision, are built on top of it.
4. Deep learning with Caffe

Developed by Berkeley AI Research, the Caffe framework offers speed, modularity, and expressiveness. In Caffe, you could set a single flag to switch between GPU and CPU. Since its first release, several changes have been made to its code and models. Nearly sixty million images could be processed per day using the Caffe framework. The community is large as well, supporting research work and large-scale applications such as computer vision.
5. Deep learning with MXNet

MXNet supports various programming languages and is an open-source, flexible, lean, and ultra-scalable framework that supports models like Convolutional Neural Networks and Long Short-Term Memory networks. It could be deployed on cloud infrastructure using a distributed parameter server, and it supports both symbolic and imperative programming.
Models built on this framework could be improved readily, as it is easier to fine-tune the hyperparameters and to debug. The backend is built in C++, with other languages supported on the frontend. A model could also be deployed on low-end devices using the MXNet framework.
There are several frameworks available in Deep Learning, and recent developments have resulted in the rise of even more. All are efficient and capable of building accurate models; however, the choice of framework depends on the application.