
Mathematics for AI, ML, and Data Science

[Istanbul, Turkey - Veem]

 

 

- Overview

Mathematics is a significant aspect of machine learning. Some people adore math and others dislike it, but solving machine learning tasks requires at least a working knowledge of probability, statistics, and calculus. Without mathematics, there’s nothing you can do. Everything around you is mathematics. Everything around you is numbers.

Fueled by data, Machine Learning (ML) models are the mathematical engines of AI, expressions of algorithms that find patterns and make predictions faster than a human can. For the journey to AI, the most transformational technology of our time, the engine you need is a machine learning model.  For example, an ML model for computer vision might be able to identify cars and pedestrians in a real-time video. One for natural language processing might translate words and sentences. 

However, mathematics can be daunting, especially for people coming from a non-technical background, and applying it to machine learning makes it more intimidating still. Widely available libraries in Python and R make it easy to build models and perform various machine learning tasks, so it is easy to avoid the mathematical part of the field.

The main areas of mathematics involved in artificial intelligence are linear functions, linear graphics, linear algebra, probability, and statistics.

 

- Machine Learning Model - The Mathematical Engines of AI

A machine learning model is an expression of an algorithm that combs through mountains of data to find patterns or make predictions. Fueled by data, machine learning (ML) models are the mathematical engines of artificial intelligence. 

Under the hood, a model is a mathematical representation of objects and their relationships to each other. The objects can be anything from “likes” on a social networking post to molecules in a lab experiment.

With no constraints on the objects that can become features in an ML model, there’s no limit to the uses for AI. The combinations are infinite. Data scientists have created whole families of machine learning models for different uses, and more are in the works.

 

- Mathematics Behind AI

The relationship between Artificial Intelligence (AI) and mathematics can be summed up as: "A person working in the field of AI who doesn’t know math is like a politician who doesn’t know how to persuade. Both have an inescapable area to work upon!" 

All AI models are constructed using solutions and ideas from math. The purpose of AI is to create models for understanding thinking. If you want an AI career (Data Scientist, Machine Learning Engineer, Robot Scientist, Data Analyst, Natural Language Expert, or Deep Learning Scientist), you should focus on the mathematical concepts.

The three main branches of mathematics that underpin a thriving career in AI are linear algebra, calculus, and probability. They are the 'languages' in which machine learning (ML) is written. Learning these topics provides a deeper understanding of the underlying algorithmic mechanics and makes it possible to develop new algorithms.

Probability and statistics are related areas of mathematics which concern themselves with analyzing the relative frequency of events. Probability deals with predicting the likelihood of future events, while statistics involves the analysis of the frequency of past events.
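The distinction can be sketched with a simulated coin. This is only an illustration: the fair-coin assumption, the number of flips, and all variable names are made up for the example.

```python
import random

random.seed(0)  # fixed seed so the simulation is repeatable

# Probability: start from an assumed model and predict what should happen.
p_heads = 0.5                       # assumption: a fair coin
n_flips = 10_000
expected_heads = p_heads * n_flips  # prediction about future flips

# Statistics: start from observed outcomes and estimate the underlying frequency.
flips = [random.random() < p_heads for _ in range(n_flips)]
observed_frequency = sum(flips) / n_flips

print("expected number of heads:", expected_heads)
print("observed frequency of heads:", observed_frequency)
```

With enough flips, the observed frequency should land close to the assumed probability, which is the link between the two fields.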

Linear algebra (LA) is a fundamental topic in the subject of mathematics and is extremely pervasive in the physical sciences. It also forms the backbone of many machine learning algorithms. Hence it is crucial for the deep learning practitioner to understand the core ideas.
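As a minimal sketch of why linear algebra shows up everywhere in ML, the predictions of a linear model for a whole batch of samples are a single matrix-vector product. The feature values, the weights, and the helper names `dot` and `matvec` below are hypothetical and written in plain Python for clarity; in practice a numerical library would normally do this.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, vector) for row in matrix]


# Each row is one sample, each column one feature (made-up numbers).
features = [
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
]
weights = [0.5, -1.0]  # one weight per feature

predictions = matvec(features, weights)
print(predictions)  # [-1.5, -2.5, -3.5]
```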

Behind every AI success there is Mathematics.

- Mathematics Behind Machine Learning and Deep Learning

Many supervised machine learning and deep learning algorithms largely entail optimising a loss function by adjusting model parameters. To carry this out requires some notion of how the loss function changes as the parameters of the model are varied. 

Machine learning (ML) and deep learning (DL) applications usually deal with something called the cost function, objective function, or loss function. In general, this function measures how well or badly the model we created fits the data we are working with: it returns a scalar value that tells us how far off the model is. This value is used to adjust the parameters of the model and get better results on the next samples from the training set. For example, the backpropagation algorithm updates the weights in neural networks based on this concept.
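One common choice of loss function is the mean squared error. The sketch below assumes a hypothetical one-feature linear model and made-up data; it only illustrates how a loss turns "how far off the model is" into a single number.

```python
def predict(weight, bias, x):
    """Hypothetical one-feature linear model: y_hat = weight * x + bias."""
    return weight * x + bias


def mean_squared_error(weight, bias, xs, ys):
    """Average squared difference between predictions and targets."""
    errors = [(predict(weight, bias, x) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


# Toy data generated by y = 2x + 1 (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

loss_good = mean_squared_error(2.0, 1.0, xs, ys)  # parameters that match the data
loss_bad = mean_squared_error(0.0, 0.0, xs, ys)   # parameters that ignore the data
print(loss_good, loss_bad)  # 0.0 21.0
```

The lower the value, the better the parameters fit the data, which is what an optimizer tries to exploit.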

For our model to fit the data as well as possible, we would have to find the global minimum of the cost function. However, finding that global minimum exactly over all the parameters is usually very costly and time-consuming. That is why we use iterative optimization techniques such as gradient descent.
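A minimal sketch of gradient descent, assuming a one-feature linear model with a mean squared error loss. The data, the learning rate, the step count, and the function names are illustrative choices, not values taken from the text.

```python
def gradient(weight, bias, xs, ys):
    """Partial derivatives of the mean squared error w.r.t. weight and bias."""
    n = len(xs)
    d_weight = sum(2 * (weight * x + bias - y) * x for x, y in zip(xs, ys)) / n
    d_bias = sum(2 * (weight * x + bias - y) for x, y in zip(xs, ys)) / n
    return d_weight, d_bias


def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Repeatedly step against the gradient, starting from weight = bias = 0."""
    weight, bias = 0.0, 0.0
    for _ in range(steps):
        d_weight, d_bias = gradient(weight, bias, xs, ys)
        weight -= learning_rate * d_weight
        bias -= learning_rate * d_bias
    return weight, bias


# Toy data generated by y = 2x + 1 (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

weight, bias = gradient_descent(xs, ys)
print(round(weight, 4), round(bias, 4))  # approaches 2.0 and 1.0
```

Each step only needs the gradient at the current parameters, which is why the method scales to models where solving for the minimum directly is impractical. For this convex toy problem the iterations reach the global minimum; for general loss functions, gradient descent may settle in a local minimum instead.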

Essentially, optimization is about finding the extrema of a function, that is, its minima and maxima. When optimizing, we also need to consider the set of values of the independent variable over which we are searching. This set is often called the feasible set or feasible region. Mathematically, it is a subset of the real numbers, X ⊆ R (or of R^n when there are n independent variables). If the feasible region is the same as the domain of the function, i.e. if X represents the complete set of possible values of the independent variable, the optimization problem is unconstrained. Otherwise, it is constrained and generally harder to solve.
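The difference between the two cases can be sketched on a single function. The example below minimizes the hypothetical function f(x) = (x - 3)^2, first over all real numbers and then over the feasible region [0, 2]. For the constrained case it uses projected gradient descent (clip back into the interval after every step), which is one simple technique among many and is an assumption of this sketch.

```python
def f_prime(x):
    """Derivative of f(x) = (x - 3)**2."""
    return 2 * (x - 3)


def minimize(start, lower=None, upper=None, learning_rate=0.1, steps=200):
    """Gradient descent; if bounds are given, project back into [lower, upper]."""
    x = start
    for _ in range(steps):
        x -= learning_rate * f_prime(x)
        if lower is not None:
            x = max(x, lower)
        if upper is not None:
            x = min(x, upper)
    return x


x_unconstrained = minimize(start=0.0)                      # feasible set: all of R
x_constrained = minimize(start=0.0, lower=0.0, upper=2.0)  # feasible set: [0, 2]

print(x_unconstrained)  # close to 3.0, the minimum of f
print(x_constrained)    # 2.0, the feasible point closest to 3
```

In the constrained case the best feasible point sits on the boundary of the region, not where the derivative is zero, which is part of what makes constrained problems harder.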

  

[Breakfast in Venice, Italy]

- Difference Between the Mathematics Behind Machine Learning and Data Science

Although Data Science and Machine Learning share a lot of common ground, there are subtle differences in their focus on mathematics. 

The two overlap a lot, but they differ in their primary focus. In Data Science, the primary goal is to explore and analyse the data, generate hypotheses, and test them. These are often the steps needed to draw out hidden inferences in the data that might not be observable at first sight. As a result, we rely heavily on the concepts of statistics and probability to make comparisons and conduct hypothesis tests.
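As one hedged illustration of hypothesis testing, the sketch below runs a permutation test on two small groups of made-up measurements. The permutation test is just one of many possible tests and is not prescribed by the text; the data, the function name, and the number of permutations are assumptions of the example.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Estimate a two-sided p-value for the difference in group means.

    Null hypothesis: the group labels are irrelevant, so shuffling them
    should produce differences as large as the observed one fairly often.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations


# Made-up measurements for two groups.
group_a = [12.1, 11.8, 12.5, 12.0, 12.3]
group_b = [14.2, 13.9, 14.5, 14.1, 13.8]

p_value = permutation_test(group_a, group_b)
print("estimated p-value:", p_value)
```

A small p-value means that a difference this large would rarely arise if the labels did not matter, which is evidence against the null hypothesis.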

Machine learning, on the other hand, focuses more on linear algebra, which serves as the main stage on which all the complex processes take place (besides the efficiency aspect). Multivariate calculus, in turn, deals with numerical optimisation, which is the driving force behind most machine learning algorithms.
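The central object multivariate calculus contributes to optimisation is the gradient, the vector of partial derivatives. A minimal sketch, assuming a hypothetical two-variable function and a central finite-difference approximation (real ML libraries typically use automatic differentiation instead):

```python
def numerical_gradient(func, point, step=1e-6):
    """Approximate each partial derivative with a central finite difference."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += step
        backward[i] -= step
        grad.append((func(forward) - func(backward)) / (2 * step))
    return grad


def bowl(p):
    """Hypothetical example function f(x, y) = x**2 + 3 * y**2."""
    x, y = p
    return x ** 2 + 3 * y ** 2


# The analytic gradient is (2x, 6y), so at (1, 2) it is (2, 12).
grad = numerical_gradient(bowl, [1.0, 2.0])
print(grad)  # approximately [2.0, 12.0]
```

The gradient points in the direction of steepest increase, so an optimiser that wants to reduce a loss moves in the opposite direction.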

Data science is generally considered a prerequisite to machine learning. Think about it: we expect the input data for machine learning algorithms to be clean and prepared for the technique we use. If you want to work end-to-end (Data Science + Machine Learning), it is better to become proficient in the union of the math required for both.

 

- Python vs R for AI, ML, and Data Science

Python is a multi-paradigm programming language that can be characterized as a dynamically-typed, scripting, procedural, interpreted, and object-oriented language. It comes with a very comprehensive built-in library called the standard library. Python’s built-in functionality, power, and flexibility are strong reasons for learning it.

Python is also multi-purpose, and can be used for everything from data science, to system and network administration, building web applications, running utility scripts on your local machine, and so on. Python has become a formidable language in the data science, artificial intelligence, and machine learning spheres. This is largely due to the language’s flexibility and community, but it’s also a direct result of the production of many ultra-powerful, high-quality packages and modules. These packages are fully capable of carrying out tasks such as exploratory data analysis (EDA), statistical analysis, predictive analytics, machine learning, artificial intelligence (neural networks and deep learning), recommender systems, and the list goes on. 

Like Python, R is a multi-paradigm language that can be characterized as a dynamically-typed, scripting, procedural, and interpreted language. R is considered statistical software (similar to SAS and SPSS) and is very specialized and well-suited for statistics, data analysis, and data visualization. It is therefore a less flexible and less general-purpose language than Python. That said, due to its specialization, R enjoys a vast community of people who also specialize in these fields.

 

- The Python Math Library

The Python Math Library (the math module) gives us access to common math functions and constants, which we can use throughout our code for more complex mathematical computations. The library is a built-in Python module, so you don't have to install anything to use it. Here are some examples.

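import math

print(math.pi)      # the constant pi, approximately 3.14159

x = 2.0             # example input value
print(math.exp(x))  # e raised to the power x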
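A few more functions from the same module, as a small sketch. The input values and variable names are chosen only for illustration.

```python
import math

root = math.sqrt(16.0)          # square root
natural_log = math.log(math.e)  # natural logarithm of Euler's number
length = math.hypot(3.0, 4.0)   # Euclidean length of the vector (3, 4)

print(root)         # 4.0
print(natural_log)  # 1.0, up to floating-point rounding
print(length)       # 5.0

# Floating-point results are best compared with a tolerance.
print(math.isclose(natural_log, 1.0))  # True
```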

 

[More to come ...]

 
