Introduction to Neural Networks
Neural Networks are the backbone of deep learning, a subset of machine learning where algorithms inspired by the human brain learn from large amounts of data. At their core, neural networks mimic the way neurons in the human brain operate and communicate. This blog post dives into the fundamentals of neural networks, with a focus on a specific type known as the Multilayer Perceptron (MLP), complete with figures and numeric examples to enhance understanding.
Fundamentals of Neural Networks
Neural networks consist of layers of nodes or "neurons," each connected to others across layers but not within the same layer. These connections represent the synapses in a human brain and carry a weight, which determines the strength of one neuron's influence on another. The basic structure includes an input layer, one or more hidden layers, and an output layer.
The Neuron (Node)
Each neuron in a network processes the input it receives and passes its output to the next layer. The processing involves summing the weighted inputs and then applying an activation function. The activation function's role is to introduce non-linearity into the output of a neuron, allowing the network to learn complex patterns.
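A minimal sketch of this computation, assuming NumPy and a sigmoid activation purely for illustration:

```python
import numpy as np

def neuron_output(inputs, weights, bias):
    """One neuron: weighted sum of inputs plus bias, passed through an activation."""
    z = np.dot(weights, inputs) + bias   # weighted sum of the inputs
    return 1.0 / (1.0 + np.exp(-z))      # sigmoid activation introduces non-linearity

# Example: a neuron with two inputs
print(neuron_output(np.array([2.0, 3.0]), np.array([0.5, -1.0]), 1.0))
```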
Activation Functions
Several activation functions are used in neural networks, including the Sigmoid, Tanh, and ReLU (Rectified Linear Unit) functions. Each has its characteristics and applications, with ReLU being particularly popular in deep learning due to its computational efficiency.
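These three functions can each be written in a line or two; the sketch below assumes NumPy:

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    # Squashes any real number into (-1, 1), zero-centred
    return np.tanh(z)

def relu(z):
    # Passes positive values through unchanged, zeroes out negatives
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 3.0])
print(sigmoid(z), tanh(z), relu(z))
```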
Introduction to Multilayer Perceptron (MLP)
A Multilayer Perceptron is a class of feedforward artificial neural network that consists of at least three layers: an input layer, one or more hidden layers, and an output layer. An MLP is trained using a supervised learning technique called backpropagation.
Structure of MLP
The structure of an MLP is straightforward:
- Input Layer: Receives the input signal to be processed.
- Hidden Layers: Perform computations and feature extractions.
- Output Layer: Delivers the final output produced by the network.
Each neuron, except those in the input layer, uses a non-linear activation function. This non-linearity allows the MLP to solve problems that are not linearly separable, which is a limitation of single-layer perceptrons. A sample MLP architecture is shown below with two input neurons, four hidden neurons, and one output neuron.
Figure 1: MLP architecture with two input neurons, four hidden neurons and one output neuron.
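One way to express the architecture in Figure 1 in code is the sketch below, which assumes PyTorch is available; the layer sizes match the figure (2 inputs, 4 hidden neurons, 1 output):

```python
import torch
import torch.nn as nn

# 2 -> 4 -> 1 MLP matching Figure 1: a ReLU hidden layer and a linear output neuron.
model = nn.Sequential(
    nn.Linear(2, 4),   # input layer -> hidden layer (weights + biases)
    nn.ReLU(),         # non-linear activation on the hidden neurons
    nn.Linear(4, 1),   # hidden layer -> output neuron
)

x = torch.tensor([[2.0, 3.0]])   # one sample with two input features
print(model(x))                  # forward pass produces one output value
```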
Learning in MLP: Backpropagation
Backpropagation is the cornerstone of learning in MLPs. It involves two phases: a forward pass and a backward pass. In the forward pass, input data is passed through the network, layer by layer, until the output layer produces its prediction. The backward pass involves calculating the error (difference between the predicted output and the actual output), which is then used to adjust the weights of the network to minimize this error. The process is repeated for many iterations over the training data.
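In framework code, these two phases map onto a short training loop. The sketch below assumes PyTorch; the model mirrors the Figure 1 architecture, and the data, targets, learning rate, and epoch count are placeholders chosen purely for illustration:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(2, 4), nn.ReLU(), nn.Linear(4, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

# Placeholder training data: inputs and targets are illustrative only.
X = torch.tensor([[2.0, 3.0], [1.0, -1.0]])
y = torch.tensor([[5.0], [0.0]])

for epoch in range(100):
    y_pred = model(X)            # forward pass: prediction, layer by layer
    loss = loss_fn(y_pred, y)    # error between prediction and target
    optimizer.zero_grad()
    loss.backward()              # backward pass: gradients of the loss w.r.t. each weight
    optimizer.step()             # adjust weights in the direction that reduces the loss
```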
Numeric Example
Let's create a numeric example involving a neural network with two input neurons, a hidden layer with three neurons, and one output neuron. We'll use a simple task to illustrate the process: predicting the output based on the sum and product of two inputs. The activation function for the hidden layer will be ReLU (Rectified Linear Unit), and for simplicity, we'll use a linear activation function for the output neuron.
Initialization and Forward Pass
We initialize the weights and biases randomly. For simplicity, let's choose easy-to-follow numbers:
- Weights from Input to Hidden Layer: $w_{11} = 0.5, w_{12} = -1, w_{21} = 0.3, w_{22} = 1.5, w_{31} = -0.7, w_{32} = 1$
- Biases for Hidden Layer: $b_{1} = 1, b_{2} = -1, b_{3} = 0.5$
- Weights from Hidden to Output Layer: $w_{o1} = 1, w_{o2} = -2, w_{o3} = 0.5$
- Bias for Output Layer: $b_{o} = 0.1$
Consider the inputs $x_1 = 2$ and $x_2 = 3$. Reading $w_{ij}$ as the weight from input $j$ to hidden neuron $i$, the forward pass first computes the hidden layer:
- $h_1 = \text{ReLU}(0.5 \cdot 2 + (-1) \cdot 3 + 1) = \text{ReLU}(-1) = 0$
- $h_2 = \text{ReLU}(0.3 \cdot 2 + 1.5 \cdot 3 - 1) = \text{ReLU}(4.1) = 4.1$
- $h_3 = \text{ReLU}(-0.7 \cdot 2 + 1 \cdot 3 + 0.5) = \text{ReLU}(2.1) = 2.1$
The linear output neuron then combines these values: $y = 1 \cdot 0 + (-2) \cdot 4.1 + 0.5 \cdot 2.1 + 0.1 = -7.05$. This process demonstrates how inputs are transformed as they flow through the network.
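The same computation as a short NumPy sketch (the weight-index convention above is an assumption, since the post does not fix one explicitly):

```python
import numpy as np

relu = lambda z: np.maximum(0.0, z)

# Weights and biases from the example; row i holds the weights into hidden neuron i.
W_hidden = np.array([[ 0.5, -1.0],
                     [ 0.3,  1.5],
                     [-0.7,  1.0]])
b_hidden = np.array([1.0, -1.0, 0.5])
w_out = np.array([1.0, -2.0, 0.5])
b_out = 0.1

x = np.array([2.0, 3.0])
h = relu(W_hidden @ x + b_hidden)   # hidden activations: [0. , 4.1, 2.1]
y = w_out @ h + b_out               # linear output: -7.05
print(h, y)
```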
Learning: Backpropagation
To improve the network's performance, we would use backpropagation:
- Compute the loss (e.g., mean squared error) between the predicted output and the target value.
- Calculate the gradient of the loss function with respect to each weight and bias in the network.
- Update the weights and biases in the direction that minimizes the loss, typically using an optimizer like SGD (Stochastic Gradient Descent).
This process is repeated over many epochs (full passes through the training data), adjusting the weights and biases to reduce the error between the network's predictions and the actual target values.
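Continuing the numeric example, the sketch below performs a single gradient-descent step by hand, assuming (purely for illustration) a target value of $y_{target} = 5$, a learning rate of 0.01, and a squared-error loss:

```python
import numpy as np

relu = lambda z: np.maximum(0.0, z)

# Same parameters as in the forward-pass example.
W_hidden = np.array([[0.5, -1.0], [0.3, 1.5], [-0.7, 1.0]])
b_hidden = np.array([1.0, -1.0, 0.5])
w_out = np.array([1.0, -2.0, 0.5])
b_out = 0.1

x = np.array([2.0, 3.0])
y_target, lr = 5.0, 0.01   # target value and learning rate are illustrative assumptions

# Forward pass
z = W_hidden @ x + b_hidden
h = relu(z)
y = w_out @ h + b_out
loss = (y - y_target) ** 2

# Backward pass (chain rule)
dy = 2 * (y - y_target)          # dLoss/dy
dw_out = dy * h                  # gradients for the output-layer weights
db_out = dy
dh = dy * w_out                  # gradient flowing back into the hidden layer
dz = dh * (z > 0)                # ReLU derivative: 1 where z > 0, else 0
dW_hidden = np.outer(dz, x)
db_hidden = dz

# Gradient-descent update
w_out -= lr * dw_out
b_out -= lr * db_out
W_hidden -= lr * dW_hidden
b_hidden -= lr * db_hidden
print(loss)
```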
Conclusion
Neural Networks, particularly Multilayer Perceptrons, are powerful tools for modeling complex patterns in data across various applications. By understanding the basics of how these networks are structured and function, we can appreciate the depth of learning and prediction capabilities they offer. The MLP, with its simple yet effective architecture, exemplifies how adding layers of neurons and utilizing backpropagation can enable the learning of non-linear relationships, paving the way for advancements in AI and machine learning.