# Notebook Instructions

1. All the <u>code and data files</u> used in this course are available in the downloadable unit of the <u>last section of this course</u>.
2. You can run the notebook document sequentially (one cell at a time) by pressing **Shift + Enter**. 
3. While a cell is running, a [*] is shown on the left. After the cell is run, the output will appear on the next line.

This course is based on specific versions of Python packages. You can find the details of the packages in <a href='https://quantra.quantinsti.com/quantra-notebook' target="_blank" >this manual</a>.


# Artificial Neural Networks
An Artificial Neural Network (ANN) is an information processing paradigm used to study the behaviour of complex systems through computer simulation. It is loosely inspired by the way the human brain processes information, and its key element is the structure of the information processing system itself. An ANN consists of many nodes that mimic the biological neurons of a human brain. The nodes are connected through links: each node takes in data, performs a simple operation on it, and passes the result on to the nodes it is connected to.

A multi-layer perceptron (MLP) classifier consists of multiple layers of nodes, and each layer is fully connected to the next layer in the network. Each link between two nodes carries a weight, a real number that controls the strength of the signal passed between them. The network adjusts these weights as it learns. If the network generates a good or desired output, the weights need little or no adjustment. However, if the network generates a poor or undesired output, the weights are altered to improve subsequent results.
In short, ANNs learn from examples. Their computations can be carried out in parallel, and they build their own internal representation of the information they receive during training. Their ability to derive results from complex data makes ANNs an effective tool for a wide variety of tasks, such as computer vision, handwritten digit recognition and speech recognition. You can learn more about <a href="https://blog.quantinsti.com/neural-network-python/" target="_blank"> Artificial Neural Network</a> and their application in trading in this article.
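
The weight adjustment described above can be illustrated with a minimal sketch: one gradient-descent step on a single sigmoid neuron, written in NumPy. This is not how `MLPClassifier` is implemented internally. The learning rate `lr`, the zero initial weights and the toy data are assumptions chosen only for illustration.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Toy data (assumed for illustration): two samples, two features each
X_toy = np.array([[0., 0.], [1., 1.]])
y_toy = np.array([0., 1.])

def cross_entropy(w, b):
    """Mean binary cross-entropy of a single sigmoid neuron."""
    p = sigmoid(X_toy @ w + b)
    return -np.mean(y_toy * np.log(p) + (1 - y_toy) * np.log(1 - p))

# Hypothetical starting point: all weights zero, learning rate 0.5
w = np.zeros(2)
b = 0.0
lr = 0.5

loss_before = cross_entropy(w, b)

# Gradient of the loss with respect to the weights and the bias
p = sigmoid(X_toy @ w + b)
grad_w = X_toy.T @ (p - y_toy) / len(y_toy)
grad_b = np.mean(p - y_toy)

# Move the weights a small step against the gradient
w = w - lr * grad_w
b = b - lr * grad_b

loss_after = cross_entropy(w, b)
print(loss_before, loss_after)
```

The loss after the update is lower than before it, which is the sense in which a poor output leads to altered weights.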

In this notebook, you will perform the following steps:


1. [Independent and Dependent Variable](#x)


2. [MLP Classifier Model](#model)


3. [Make Prediction](#predict)


4. [Model Coefficients](#coff)


5. [Probability Estimation](#prob)

<a id='x'></a> 

## Independent and Dependent Variable

The array `X`, of shape (n_samples, n_features), holds the training samples represented as floating-point feature vectors. The array `y`, of shape (n_samples,), holds the target values (class labels) for the training samples.

In [1]:
X = [[0., 0.], [1., 1.]]
y = [0, 1]
X,y

([[0.0, 0.0], [1.0, 1.0]], [0, 1])
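
As a quick check of the shapes described above, the lists can be converted to NumPy arrays. This is a small sketch; the names `X_arr` and `y_arr` are introduced here for illustration and are not used elsewhere in the notebook.

```python
import numpy as np

X = [[0., 0.], [1., 1.]]
y = [0, 1]

# Convert the nested lists to arrays to inspect their shapes
X_arr = np.asarray(X, dtype=float)
y_arr = np.asarray(y)

print(X_arr.shape)  # (n_samples, n_features)
print(y_arr.shape)  # (n_samples,)
```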

<a id='model'></a> 
## MLP Classifier Model
We will use the `MLPClassifier` class from scikit-learn's `sklearn.neural_network` module to train the model.

In [2]:
# Import MLPClassifier
from sklearn.neural_network import MLPClassifier

# Create the model
clf = MLPClassifier(alpha=1e-05, hidden_layer_sizes=(5, 2), random_state=1,
                    solver='lbfgs')
# Fit the model
clf.fit(X, y)

MLPClassifier(activation='relu', alpha=1e-05, batch_size='auto', beta_1=0.9,
              beta_2=0.999, early_stopping=False, epsilon=1e-08,
              hidden_layer_sizes=(5, 2), learning_rate='constant',
              learning_rate_init=0.001, max_iter=200, momentum=0.9,
              n_iter_no_change=10, nesterovs_momentum=True, power_t=0.5,
              random_state=1, shuffle=True, solver='lbfgs', tol=0.0001,
              validation_fraction=0.1, verbose=False, warm_start=False)
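
Once fitted, the model exposes a few attributes that describe the network it built. The sketch below repeats the fit from the cell above so that it runs on its own, then prints the class labels, the total number of layers and the output activation.

```python
from sklearn.neural_network import MLPClassifier

X = [[0., 0.], [1., 1.]]
y = [0, 1]

clf = MLPClassifier(alpha=1e-05, hidden_layer_sizes=(5, 2), random_state=1,
                    solver='lbfgs')
clf.fit(X, y)

# Class labels seen during fit
print(clf.classes_)
# Total number of layers: input + two hidden layers + output
print(clf.n_layers_)
# Activation used in the output layer
print(clf.out_activation_)
```

With two hidden layers of sizes 5 and 2, the network has four layers in total, and for a binary problem the output layer uses the logistic activation.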

<a id='predict'></a> 

## Make Prediction
After fitting (training), the model can predict labels for new samples.

In [3]:
clf.predict([[2., 2.], [-1., -2.]])

array([1, 0])
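
The predicted label is the class with the highest estimated probability. The sketch below checks this by comparing `predict` with the arg-max of `predict_proba` (covered in the Probability Estimation section). The names `new_samples` and `labels_from_proba` are introduced here for illustration.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

X = [[0., 0.], [1., 1.]]
y = [0, 1]

clf = MLPClassifier(alpha=1e-05, hidden_layer_sizes=(5, 2), random_state=1,
                    solver='lbfgs')
clf.fit(X, y)

new_samples = [[2., 2.], [-1., -2.]]

# Labels as returned by the model
labels = clf.predict(new_samples)

# Labels rebuilt from the per-class probabilities
proba = clf.predict_proba(new_samples)
labels_from_proba = clf.classes_[np.argmax(proba, axis=1)]

print(labels, labels_from_proba)
```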

<a id='coff'></a> 

## Model Coefficients
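MLP can fit a non-linear model to the training data. `clf.coefs_` contains the weight matrices that constitute the model parameters: the matrix at index i holds the weights between layer i and layer i + 1. This is analogous to the Linear Regression Model, where we have a beta for every independent variable, except that an MLP has one weight matrix per pair of adjacent layers. The corresponding bias terms are stored in `clf.intercepts_`.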

In [4]:
[coef.shape for coef in clf.coefs_]

[(2, 5), (5, 2), (2, 1)]
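
To see how these weight matrices produce a prediction, the sketch below reproduces the forward pass by hand and compares it with `predict_proba`. It assumes the settings used in this notebook: the default `relu` activation in the hidden layers and a logistic output for a binary problem. The helper `manual_forward` is hypothetical and written only for this illustration.

```python
import numpy as np
from scipy.special import expit  # numerically stable logistic function
from sklearn.neural_network import MLPClassifier

X = [[0., 0.], [1., 1.]]
y = [0, 1]

clf = MLPClassifier(alpha=1e-05, hidden_layer_sizes=(5, 2), random_state=1,
                    solver='lbfgs')
clf.fit(X, y)

def manual_forward(model, samples):
    """Propagate samples through the fitted weights layer by layer."""
    a = np.asarray(samples, dtype=float)
    last = len(model.coefs_) - 1
    for i, (W, b) in enumerate(zip(model.coefs_, model.intercepts_)):
        z = a @ W + b
        if i < last:
            a = np.maximum(z, 0.0)  # relu in the hidden layers
        else:
            a = expit(z)            # logistic in the output layer
    return a.ravel()

samples = [[2., 2.], [-1., -2.]]
manual_p1 = manual_forward(clf, samples)
sklearn_p1 = clf.predict_proba(samples)[:, 1]

# One bias vector per weight matrix
print([b.shape for b in clf.intercepts_])
print(np.allclose(manual_p1, sklearn_p1))
```

The manual result is the probability of class 1, which is the second column of `predict_proba`.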

<a id='prob'></a> 

## Probability Estimation
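Currently, `MLPClassifier` supports only the Cross-Entropy loss function, which allows probability estimates through the `predict_proba` method. The MLP is trained using Backpropagation. More precisely, Backpropagation computes the gradients of the loss with respect to the weights, and the chosen solver uses them to update the weights (`sgd` and `adam` are stochastic gradient-based solvers, while the `lbfgs` solver used here is a quasi-Newton method). For classification, the model minimizes the Cross-Entropy loss function, and `predict_proba` returns a vector of probability estimates per sample, one entry per class.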

In [5]:
clf.predict_proba([[2., 2.], [1., 2.]])

array([[1.96718015e-04, 9.99803282e-01],
       [1.96718015e-04, 9.99803282e-01]])
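
The Cross-Entropy loss mentioned above can be computed by hand from a matrix of class probabilities. The sketch below uses hypothetical probabilities (not output from the fitted model) and compares the manual value with `sklearn.metrics.log_loss`.

```python
import numpy as np
from sklearn.metrics import log_loss

# True labels and hypothetical predicted probabilities (assumed values)
y_true = np.array([0, 1])
proba = np.array([[0.9, 0.1],
                  [0.2, 0.8]])

# Probability that each sample assigns to its true class
p_true = proba[np.arange(len(y_true)), y_true]

# Cross-entropy: mean negative log-probability of the true class
manual_loss = -np.mean(np.log(p_true))
sklearn_loss = log_loss(y_true, proba, labels=[0, 1])

print(manual_loss, sklearn_loss)
```

The loss shrinks towards zero as the probability assigned to the true class approaches 1.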

In summary, we can train an MLP classifier on a set of input variables and their labels, and then use the fitted model to predict outcomes for new samples using the learned weights of each layer. During training, the weights are adjusted whenever the predictions disagree with the training labels, and they change little once the predictions are in line with the training sample. After training, the weights stay fixed while the model makes predictions.