
Mathematics for Artificial Neural Networks

[Duke University]

- The Math behind Artificial Neural Networks

Artificial neural networks (ANNs) are built from a series of matrix operations, chiefly matrix-vector multiplication and dot products, combined with element-wise nonlinear activation functions.
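To make the matrix view concrete, here is a minimal sketch in plain Python of how one layer's weighted sums reduce to a matrix-vector product, which is itself one dot product per row. The names `dot` and `matvec`, and the values of `W` and `x`, are assumptions made up for illustration and do not come from any particular library.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def matvec(W, x):
    """Matrix-vector product: one dot product per row of W."""
    return [dot(row, x) for row in W]

# Hypothetical 2x3 weight matrix and 3-element input (illustrative values).
W = [[1.0, 0.0, -1.0],
     [0.5, 2.0, 0.0]]
x = [2.0, 1.0, 3.0]

# Each entry of z is the weighted sum feeding one neuron.
z = matvec(W, x)
print(z)
```

In practice a numerical library would perform this product, but the arithmetic is the same.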

The mathematics used for ANNs includes linear algebra, calculus, probability, and statistics.

Training an ANN involves searching for the set of weights that best model the patterns in the data. This process uses backpropagation, which applies the chain rule of calculus to compute the gradient of the error with respect to every weight, and gradient descent, which uses that gradient to adjust the weights.
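As a rough illustration of the search for weights, the sketch below runs gradient descent on a model with a single weight `w`. The toy data (generated from y = 2x), the mean squared error loss, the learning rate, and the step count are all assumptions chosen for the example, not values from the article.

```python
# Toy data generated from y = 2 * x (an assumed example).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

def loss(w):
    """Mean squared error of the model y_hat = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grad(w):
    """Derivative of the loss with respect to w: mean of 2 * (w*x - y) * x."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

w = 0.0               # initial guess
learning_rate = 0.05  # assumed step size

for step in range(100):
    # Move w a small step against the gradient.
    w -= learning_rate * grad(w)

print(w, loss(w))
```

Each step moves `w` in the direction that lowers the loss, so on this data `w` approaches 2. A real network repeats the same idea over many weights at once.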

Here are some other details about the math used for artificial neural networks: 

  • Weights: Represent the strength of the connection between neurons. They decide how much influence the given input will have on the neuron's output.
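  • Neuron's activation: The neuron computes a weighted sum of the values it receives through its connections and adds a bias: y = w · x + b, where x is the input vector, w the weight vector, and b the bias. An activation function is usually applied to y to produce the neuron's output.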
  • Backpropagation: The main goal of training is to reduce the error between the network's output and the target. Backpropagation computes how much each weight contributed to that error, and the weights are then updated accordingly.
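The three items above can be tied together in a small sketch: a single neuron computes y = w · x + b, applies an activation, and takes one backpropagation step. The sigmoid activation, the squared-error loss, and every numeric value here are assumptions for illustration; the article does not specify them.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w, b, x):
    """Weighted sum plus bias, followed by a sigmoid activation."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

def squared_error(a, target):
    return (a - target) ** 2

# Hypothetical values chosen only for illustration.
x = [1.0, 2.0]
w = [0.5, -0.3]
b = 0.1
target = 1.0
learning_rate = 0.5

a = forward(w, b, x)
loss_before = squared_error(a, target)

# Chain rule: dL/dw_i = dL/da * da/dz * dz/dw_i
dL_da = 2.0 * (a - target)
da_dz = a * (1.0 - a)               # derivative of the sigmoid
dL_dz = dL_da * da_dz
grad_w = [dL_dz * xi for xi in x]   # dz/dw_i = x_i
grad_b = dL_dz                      # dz/db = 1

# Gradient descent update of the weights and the bias.
w = [wi - learning_rate * gi for wi, gi in zip(w, grad_w)]
b = b - learning_rate * grad_b

loss_after = squared_error(forward(w, b, x), target)
print(loss_before, loss_after)
```

In a multi-layer network the same chain-rule product is extended backward through each layer, which is where the name backpropagation comes from.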

Artificial Neural Networks (ANNs) combine biological principles with advanced statistics to solve problems in areas such as pattern recognition and game playing. ANNs employ a basic model of neuron analogs that are interconnected in multiple ways.

Today, with the help of open source machine learning (ML) software libraries such as TensorFlow, Keras or PyTorch, we can create a neural network with very high structural complexity in just a few lines of code. 

That said, the math behind neural networks remains a mystery to many of us. Knowing it helps us understand what is going on inside a network, and it also helps with architecture selection, fine-tuning of deep learning models, and hyperparameter tuning and optimization.

Please refer to Wikipedia: Mathematics for Artificial Neural Networks for more details.


- Perceptron Neural Network

A perceptron is a single-layer neural network; multi-layer perceptrons are what are usually called neural networks. The perceptron is a binary linear classifier used in supervised learning: it assigns each input to one of two classes.

A perceptron neural network is a single-layer neural network that performs computations to detect features in input data. It is a fundamental unit of an artificial neural network that takes multiple inputs and outputs a single binary decision. 

Here are some characteristics of a perceptron neural network: 

  • It is a linear classifier because its decision boundary is given by a hyperplane.
  • It is a machine learning algorithm used for supervised learning of binary classifiers.
  • It is a simple yet effective algorithm that can learn from labeled data to perform classification and pattern recognition tasks.
  • It is arguably the oldest and simplest of the ANN algorithms.
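The characteristics listed above can be sketched with the classic perceptron learning rule, trained here on the logical AND function. The dataset, the step threshold at zero, the learning rate of 1, and the epoch limit are assumptions chosen for the example; AND is used because it is linearly separable, so a single hyperplane can split its two classes.

```python
def predict(w, b, x):
    """Binary decision: 1 if the weighted sum plus bias is non-negative, else 0."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if z >= 0 else 0

# Labeled data for logical AND (an assumed toy dataset).
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]

w = [0.0, 0.0]
b = 0.0
learning_rate = 1.0

for epoch in range(20):
    errors = 0
    for x, t in zip(inputs, targets):
        error = t - predict(w, b, x)
        if error != 0:
            # Perceptron rule: nudge the hyperplane toward the misclassified point.
            w = [wi + learning_rate * error * xi for wi, xi in zip(w, x)]
            b += learning_rate * error
            errors += 1
    if errors == 0:
        break  # every training example is classified correctly

predictions = [predict(w, b, x) for x in inputs]
print(predictions)
```

On data that is not linearly separable, such as XOR, this loop would never reach zero errors, which is the usual motivation for multi-layer networks.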


A perceptron network consists of a single layer of S perceptron neurons connected to R inputs through a set of weights w_{i,j}. The indices i and j indicate that w_{i,j} is the strength of the connection from the j-th input to the i-th neuron.
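The indexing convention can be illustrated with a small sketch in which the weights are stored as an S-by-R nested list, so that `W[i][j]` connects input j to neuron i. The bias vector `b`, the step threshold at zero, and all numeric values are assumptions added for the example; the text above only describes the weights.

```python
S = 2  # number of perceptron neurons
R = 3  # number of inputs

# W[i][j]: strength of the connection from input j to neuron i (illustrative values).
W = [[1.0, -1.0, 0.5],
     [-0.5, 0.5, 1.0]]
b = [0.5, -0.25]     # one assumed bias per neuron
p = [1.0, 2.0, 0.0]  # input vector with R elements

def layer_output(W, b, p):
    """Return one binary output per neuron in the layer."""
    outputs = []
    for i in range(len(W)):
        # Weighted sum over all R inputs for neuron i.
        z = sum(W[i][j] * p[j] for j in range(len(p))) + b[i]
        outputs.append(1 if z >= 0 else 0)
    return outputs

outputs = layer_output(W, b, p)
print(outputs)
```

Row i of `W` holds all the weights of neuron i, so the layer produces S binary decisions from the same R inputs.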


[More to come ...]

