
Implementing ANNs Training Process


 

- Overview

Artificial intelligence (AI) uses neural networks - interconnected systems modeled after the human brain - to enable machines to learn from data, reason, and make autonomous decisions. 

These systems, which are not limited to robots, automate complex tasks and improve daily life through pattern recognition, deep learning, and advanced algorithms. 

Neural networks are trained to solve complex problems like medical diagnosis and self-driving technology. The system trains by adjusting weights based on data inputs, similar to human experience.

Core Components of a Neural Network:

  • Neuron (Node/Perceptron): The fundamental unit that receives weighted input, processes it, and generates an output.
  • Input Layer: The entry point that receives raw data and feeds it into the network.
  • Hidden Layers: Intermediate layers between input and output, where, in a Deep Neural Network (DNN), computations and pattern recognition occur.
  • Weights: Numerical values representing the importance of connections between neurons.
  • Bias: An additional constant value used to adjust the activation threshold of a node.
  • Activation Function: A mechanism that introduces non-linearity, allowing the network to learn complex patterns.
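These components can be made concrete with a minimal single-neuron sketch in Python; the input values, weights, and bias below are illustrative assumptions, not values from any trained model:

```python
import math

def neuron(inputs, weights, bias):
    """One neuron: weighted sum of inputs plus bias, squashed by a sigmoid."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-z))  # sigmoid activation maps z into (0, 1)

# Illustrative values: two inputs, their weights, and one bias
output = neuron([0.5, 0.8], [0.4, 0.7], 0.1)
print(output)  # a value between 0 and 1
```

Stacking many such neurons into layers, and learning the weights and biases from data rather than fixing them by hand, is what turns this into a neural network.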
 


 

- Artificial Neural Networks (ANNs) 

Artificial Neural Networks (ANNs) are versatile, biologically inspired models that map inputs to outputs, solving complex classification and regression problems without explicit, hard-coded rules. By training on data, they adjust internal weights to minimize error, making them key to supervised machine learning (ML) applications like image recognition.

1. Key Neural Network Components & Structure:

  • Input Layer: Receives raw data or signals and passes them into the network.
  • Hidden Layer(s): Intermediary layers where calculations, feature extractions, and non-linear transformations occur.
  • Output Layer: Produces the final prediction or classification.
  • Weights & Biases: Parameters adjusted during training to strengthen or weaken the influence of input signals.
  • Activation Functions: Introduce non-linearity, enabling the network to learn complex patterns (e.g., ReLU, Sigmoid, Softmax).

2. Common Types of Neural Networks:
  • Feedforward Neural Networks: Information moves in one direction, from input to output.
  • Multilayer Perceptron (MLP): A type of feedforward network with at least one hidden layer, often trained using backpropagation.
  • Convolutional Neural Networks (CNNs): Specialized for processing data with a grid-like topology, such as images.
  • Recurrent Neural Networks (RNNs): Utilize feedback loops to process sequential data, making them ideal for text or speech.
  • Backpropagation: The core algorithm used to train neural networks by propagating errors backward to update weights.
 

- Steps for Building a Neural Network

Artificial neural networks (ANNs), like humans, learn by example. Through a learning process, ANNs are configured for specific applications, such as pattern recognition or data classification. Learning primarily involves adjustments to the synaptic connections between neurons.

The brain is made up of tens of billions of cells called neurons (roughly 86 billion in humans). These neurons are connected by synapses, the junctions through which one neuron sends impulses to another. 

When one neuron sends an excitation signal to another neuron, that signal is added to all the other inputs to that neuron. If the combined input exceeds a given threshold, the target neuron fires its own signal forward - this chain of activations underlies the thought process. 

In computer science, we model this process by creating "networks" on computers using matrices. These networks can be understood as an abstraction of neurons without all the biological complexity.

Here are some steps for building a neural network:

  • Create an approximation model
  • Configure data set
  • Set network architecture
  • Train neural network
  • Improve generalization performance
  • Test results
  • Deploy model
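The steps above can be sketched end-to-end on a toy problem. The sketch below is a minimal illustration, not a production recipe: it assumes NumPy, substitutes the four-row XOR truth table for a real data set, and uses an arbitrary architecture (4 hidden neurons), seed, learning rate, and epoch count:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(0)  # fixed seed for reproducibility

# Configure data set: XOR truth table as a toy stand-in for real data
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Set network architecture: 2 inputs -> 4 hidden neurons -> 1 output
W1 = rng.normal(0, 1, (2, 4))
b1 = np.zeros((1, 4))
W2 = rng.normal(0, 1, (4, 1))
b2 = np.zeros((1, 1))

initial_loss = float(((sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2) - y) ** 2).mean())

# Train neural network: forward pass, backpropagation, gradient descent
lr = 0.5  # assumed learning rate
for epoch in range(5000):
    h = sigmoid(X @ W1 + b1)             # forward: hidden activations
    out = sigmoid(h @ W2 + b2)           # forward: prediction
    d_out = (out - y) * out * (1 - out)  # backward: output error signal
    d_h = (d_out @ W2.T) * h * (1 - h)   # backward: hidden error signal
    W2 -= lr * h.T @ d_out               # gradient-descent updates
    b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0, keepdims=True)

# Test results: the error should have dropped from its starting value
final_pred = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
final_loss = float(((final_pred - y) ** 2).mean())
print(f"loss: {initial_loss:.4f} -> {final_loss:.4f}")
```

With squared error and sigmoid activations throughout, these update rules are plain gradient descent; a real project would add held-out validation data to cover the generalization and testing steps.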

 

- To Train an ANN 

To train an Artificial Neural Network (ANN), you can use a step-by-step approach, starting with defining the network architecture, loading and preprocessing the data, and then training the model using forward and backward propagation. 

A common example is image classification using the MNIST dataset, where the goal is to build a neural network that can accurately classify handwritten digits (0-9). 

Here's a breakdown of the process: 

1. Define the ANN Architecture: 

  • Input Layer: Determine the number of input nodes based on the dimensions of your data. For MNIST, each image is 28x28 pixels, so you'd likely use 784 input nodes (28 * 28).
  • Hidden Layer(s): Decide on the number of hidden layers and neurons in each layer. A common approach is to start with a small number of hidden layers (e.g., 1-3) and adjust as needed.
  • Output Layer: The number of output nodes depends on the number of classes you're trying to predict. For MNIST, you'd need 10 output nodes, one for each digit (0-9).
  • Activation Functions: Choose activation functions for each layer (e.g., Sigmoid, ReLU, Softmax). ReLU is common for hidden layers, Sigmoid for binary classification outputs, and Softmax for multi-class outputs such as MNIST's ten digits.
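Such an architecture can be captured as plain configuration before any framework code is written; every number below is an assumption for illustration rather than a recommended setting:

```python
# Illustrative MNIST-style architecture (all sizes are assumptions)
architecture = {
    "input_nodes": 28 * 28,          # one node per pixel -> 784
    "hidden_layers": [128],          # a single hidden layer of 128 neurons
    "output_nodes": 10,              # one node per digit class (0-9)
    "hidden_activation": "relu",     # common choice for hidden layers
    "output_activation": "softmax",  # typical for multi-class outputs
}
print(architecture["input_nodes"])  # 784
```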


2. Load and Preprocess the Data:

  • Dataset: Obtain your training and testing data (e.g., MNIST dataset).
  • Preprocessing: Prepare the data by converting it into a suitable format for the ANN. For MNIST, this might involve scaling pixel values to a range between 0 and 1.
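For MNIST-style images, preprocessing often amounts to scaling pixel values into [0, 1] and flattening each image into a vector. A minimal sketch with NumPy, where a random array stands in for real image data:

```python
import numpy as np

# Fake batch of 4 grayscale "images" standing in for real MNIST data
images = np.random.randint(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Scale pixel intensities from [0, 255] to [0.0, 1.0]
scaled = images.astype(np.float32) / 255.0

# Flatten each 28x28 image into a 784-element input vector
flat = scaled.reshape(len(scaled), -1)
print(flat.shape)  # (4, 784)
```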


3. Initialize Weights and Biases:

  • Random Initialization: Randomly initialize the weights and biases of the connections between neurons in each layer.
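A sketch of random initialization with NumPy, using illustrative MNIST-style layer sizes (784 inputs, one hidden layer of 128 neurons, 10 outputs; the seed and scale are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(42)  # seed chosen arbitrarily for reproducibility

n_in, n_hidden, n_out = 784, 128, 10  # illustrative layer sizes

# Small random weights break the symmetry between neurons;
# biases can safely start at zero.
W1 = rng.normal(0.0, 0.01, size=(n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(0.0, 0.01, size=(n_hidden, n_out))
b2 = np.zeros(n_out)

print(W1.shape, W2.shape)  # (784, 128) (128, 10)
```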


4. Forward Propagation:

  • Input: Feed the preprocessed input data into the network.
  • Calculations: Calculate the weighted sum of inputs for each neuron in the hidden layers, apply the activation function, and pass the result to the next layer.
  • Output: Obtain the predicted output from the output layer.


5. Backward Propagation: 

  • Error Calculation: Compare the predicted output with the actual output (target) to calculate the error.
  • Weight Adjustment: Adjust the weights and biases using the calculated error and a learning rate. This is done using a process called backpropagation.
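For a single sigmoid output neuron with squared error, this update can be written out directly; the inputs, weights, target, and learning rate below are illustrative assumptions, and the term out * (1 - out) is the sigmoid derivative that backpropagation uses at this layer:

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# Illustrative values: two inputs into one output neuron
inputs = [0.5, 0.8]
weights = [0.6, 0.5]
bias = 0.3
target = 1.0
lr = 0.1  # assumed learning rate

# Forward pass
z = sum(x * w for x, w in zip(inputs, weights)) + bias
out = sigmoid(z)

# Error gradient of the squared error through the sigmoid
delta = (out - target) * out * (1 - out)

# Gradient-descent update for each weight and the bias
weights = [w - lr * delta * x for w, x in zip(weights, inputs)]
bias = bias - lr * delta

print(weights, bias)
```

After this single step the neuron's output moves slightly toward the target; repeating the step over many examples is what training amounts to.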


6. Training: 

  • Iterate: Repeat steps 4 and 5 multiple times (epochs) until the model achieves satisfactory accuracy.


7. Evaluation:

  • Test Data: Evaluate the trained model on the test data to assess its performance.
  • Metrics: Use metrics like accuracy, precision, recall, and F1-score to evaluate the model's performance.
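All four metrics can be computed by hand from the counts of true/false positives and negatives; the label lists below are a made-up toy example for a binary task:

```python
# Toy example: true vs. predicted labels for a binary classifier
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Count the four outcome types
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

accuracy = (tp + tn) / len(y_true)              # fraction correct overall
precision = tp / (tp + fp)                      # of predicted 1s, how many真 were 1
recall = tp / (tp + fn)                         # of actual 1s, how many were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(accuracy, precision, recall, f1)
```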
 

- Step-by-Step Manual Calculation of an ANN


Consider a basic ANN with: 
  • 1 input layer: 2 input features (x1, x2)
  • 1 hidden layer: 2 neurons (h1, h2)
  • 1 output layer: 1 output neuron (o1)

The input data is processed through the following steps:

1. Input Features:
 

x1 = 0.5, x2 = 0.8


These are the input features fed into the model.


2. Weights and Biases:

  • Weights for connections from the input to the hidden layer:
 

w11 = 0.1, w12 = 0.3, w21 = 0.2, w22 = 0.4


  • Weights for connections from the hidden layer to the output:
 
wo1 = 0.6, wo2 = 0.5
 

  • Biases for the hidden layer and the output:
 
bh1 = 0.1, bh2 = 0.2, bo = 0.3


3. Hidden Layer Computations (Step 1): The hidden layer neurons apply weights and biases, and an activation function such as the sigmoid is used.

  • For hidden neuron h1:
 
zh1 = (x1 × w11) + (x2 × w21) + bh1
 
zh1 = (0.5 × 0.1) + (0.8 × 0.2) + 0.1 = 0.05 + 0.16 + 0.1 = 0.31


Now, apply the sigmoid activation function:
 
h1 = 1 / (1 + e^(-zh1)) = 1 / (1 + e^(-0.31)) ≈ 0.5769
 
 

For hidden neuron h2:
 
zh2 = (x1 × w12) + (x2 × w22) + bh2
 
zh2 = (0.5 × 0.3) + (0.8 × 0.4) + 0.2 = 0.15 + 0.32 + 0.2 = 0.67
 

Apply the sigmoid activation function:
 
h2 = 1 / (1 + e^(-zh2)) = 1 / (1 + e^(-0.67)) ≈ 0.6615
 

4. Output Layer Computation (Step 2): The output neuron takes inputs from the hidden neurons and applies its own weights and bias, followed by the activation function.

For the output neuron o1:
 
zo1 = (h1 × wo1) + (h2 × wo2) + bo
 
zo1 = (0.5769 × 0.6) + (0.6615 × 0.5) + 0.3 = 0.3461 + 0.3308 + 0.3 ≈ 0.9769
 

Apply the sigmoid activation function:
 
o1 = 1 / (1 + e^(-zo1)) = 1 / (1 + e^(-0.9769)) ≈ 0.7265


Thus, the output of this ANN is approximately 0.726.
 

- Python Implementation of the ANN Example

Now, the same steps will be implemented in Python for better understanding.

 

import numpy as np

# Sigmoid activation function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Input features
x1, x2 = 0.5, 0.8

# Weights for input to hidden layer
w11, w12 = 0.1, 0.3
w21, w22 = 0.2, 0.4

# Weights for hidden to output layer
w_o1, w_o2 = 0.6, 0.5

# Biases
b_h1, b_h2 = 0.1, 0.2  # Biases for hidden layer neurons
b_o = 0.3              # Bias for output layer neuron

# Step 1: Hidden layer computations
z_h1 = (x1 * w11) + (x2 * w21) + b_h1
z_h2 = (x1 * w12) + (x2 * w22) + b_h2

# Applying the sigmoid activation function
h1 = sigmoid(z_h1)
h2 = sigmoid(z_h2)

# Step 2: Output layer computation
z_o1 = (h1 * w_o1) + (h2 * w_o2) + b_o

# Applying the sigmoid activation function
o1 = sigmoid(z_o1)

# Output the final result
print(f"Output of the ANN: {o1:.3f}")  # rounded to match the manual calculation
 
 
Output:
 
Output of the ANN: 0.726
 
 

- Explanation of Python Code

This simple example illustrates how an ANN for binary classification processes input, first by manual calculation and then programmatically in Python.
 
  1. Sigmoid function: It is used as the activation function to map any real-valued number into the range of (0, 1).
  2. Weights and biases: Initialized for both the input to hidden layer and hidden to output layer.
  3. Step 1: The hidden layer’s neurons calculate their weighted inputs and apply the activation function.
  4. Step 2: The output neuron receives the outputs of the hidden layer neurons, applies weights and biases, and finally computes the output using the activation function.
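The worked example can be extended one step further with a single backpropagation update for this exact network. The sketch below assumes a squared-error loss, a target of y = 1, and a learning rate of 0.5, none of which appear in the original example:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Same parameters as the worked example
x = np.array([0.5, 0.8])
W1 = np.array([[0.1, 0.3],
               [0.2, 0.4]])   # rows: inputs, columns: hidden neurons
b1 = np.array([0.1, 0.2])
w_o = np.array([0.6, 0.5])
b_o = 0.3

y = 1.0   # assumed target label for this illustration
lr = 0.5  # assumed learning rate

# Forward pass (reproduces the numbers computed by hand)
h = sigmoid(x @ W1 + b1)
o1 = sigmoid(h @ w_o + b_o)
loss_before = (o1 - y) ** 2

# Backward pass: squared-error gradient through the sigmoids
d_o = (o1 - y) * o1 * (1 - o1)
d_h = d_o * w_o * h * (1 - h)

# Gradient-descent updates for every weight and bias
w_o -= lr * d_o * h
b_o -= lr * d_o
W1 -= lr * np.outer(x, d_h)
b1 -= lr * d_h

# One step later, the output has moved toward the target
o1_new = sigmoid(sigmoid(x @ W1 + b1) @ w_o + b_o)
print(o1, o1_new)
```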
 
 
[More to come ...]
 