
ML Model Training

[ML Model Training - Shreyak]

- Overview

Model training is a stage in the data science development lifecycle. It is the process of running a machine learning algorithm on a dataset and adjusting the model's parameters so that it captures the patterns needed to produce the desired outputs.

Model training involves learning good values for all the weights and biases from labeled examples. The resulting function, with its learned rules and data structures, is called the trained machine learning model.
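
As a minimal sketch of what "learning good values for the weights and biases from labeled examples" can look like, the snippet below fits a single weight and a single bias by batch gradient descent. The function name `train_linear`, the learning rate, the epoch count, and the data points are all assumptions made for illustration; real models have many more parameters and usually rely on a library.

```python
def train_linear(examples, learning_rate=0.05, epochs=2000):
    """Fit y ~ w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in examples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in examples) / n
        # Move each parameter a small step against its gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Labeled examples (input, target) generated from y = 2x + 1 (made-up data).
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]
w, b = train_linear(examples)
print(f"learned weight={w:.3f}, bias={b:.3f}")
```

The learned pair `(w, b)` plays the role of the trained model here: it is the function that maps a new input to a prediction.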

The process of training ML models can be divided into four steps:

  • Data set split for training and evaluation
  • Algorithm selection
  • Hyperparameter tuning
  • Model training
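
The four steps above can be sketched end to end with the standard library alone. This is only an illustration under stated assumptions: the data is synthetic, k-nearest neighbours (k-NN) stands in for "algorithm selection", the candidate values of `k` are arbitrary, and names such as `knn_predict`, `train_set`, and `eval_set` are hypothetical.

```python
import random
from collections import Counter


def knn_predict(train, point, k):
    """Return the majority label among the k training points nearest to `point`."""
    def squared_distance(example):
        (x1, x2), _ = example
        return (x1 - point[0]) ** 2 + (x2 - point[1]) ** 2

    nearest = sorted(train, key=squared_distance)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train, held_out, k):
    """Fraction of held-out examples that k-NN labels correctly."""
    correct = sum(knn_predict(train, point, k) == label for point, label in held_out)
    return correct / len(held_out)


# Made-up data: two Gaussian clusters, one per class.
rng = random.Random(0)
data = []
for label, center in ((0, 0.0), (1, 3.0)):
    for _ in range(50):
        data.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))

# Step 1: split the data set for training and evaluation (80/20).
rng.shuffle(data)
train_set, eval_set = data[:80], data[80:]

# Step 2: algorithm selection. k-NN is assumed here for simplicity.
# Step 3: hyperparameter tuning. Try several k and keep the best scorer.
candidate_ks = [1, 3, 5, 7]
scores = {k: accuracy(train_set, eval_set, k) for k in candidate_ks}
best_k = max(scores, key=scores.get)

# Step 4: model training. For k-NN this amounts to storing the training
# examples; the final model is the predictor with the chosen k.
def model(point):
    return knn_predict(train_set, point, best_k)


print("accuracy per k:", scores, "chosen k:", best_k)
```

In practice a third, untouched test split is usually kept aside, because the split used to choose `k` gives an optimistic estimate of the final model's accuracy.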


How well the model performs during training and evaluation largely determines how well it will work once it is put into an application for end users.

Before training your model, you can:

  • Identify the problem and candidate algorithms.
  • Identify data required to train the algorithms.
  • Collect initial data.
  • Assess its quality and suitability for the task.
  • Plan what is needed to make the dataset suitable for the project.


- Training ML Models

Here are some steps to train a machine learning (ML) model:

  • Data collection: Gather and measure information on targeted variables in an established system.
  • Data preparation: Collect, clean, and organize data before using it to train a model. The quality of the data used to train a model significantly impacts the accuracy of its predictions.
  • Choose a model: Select the appropriate model architecture and algorithms that can best solve the problem at hand.
  • Train the model: Training is the process in which the algorithm works through the training data to learn the relationships between the input values and the target outputs.
  • Analyze and visualize: Visualize the data to have a better understanding of relationships within the dataset.
  • Model evaluation: Model evaluation is one of the most important steps in the ML pipeline. The performance of a model can be measured via dozens of metrics.
  • Parameter tuning: Tune the parameters.
  • Make predictions: Ask the model to make predictions.
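
To make the model evaluation step more concrete, here is a small sketch that computes four common classification metrics from true and predicted labels. The function name `classification_metrics` and the label lists are made up for illustration; which metrics matter depends on the task.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    correct = sum(t == p for t, p in pairs)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical ground-truth labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With these labels there are three true positives, one false positive, and one false negative, so all four metrics come out to 0.75.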


ML is a set of algorithms that learn from data and/or experience rather than being explicitly programmed. Different tasks call for different algorithms, which detect patterns in the data in order to perform those tasks.

Here are some other concepts related to ML: 

  • Representation: How the model looks and how knowledge is represented
  • Evaluation: How good models are differentiated and how programs are evaluated
  • Optimization: The process for finding good models and how programs are generated
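
One way to see how these three concepts fit together is a deliberately tiny learner. In the sketch below, the representation is a single threshold on one feature, evaluation is accuracy on labeled examples, and optimization is an exhaustive search over candidate thresholds. The data and the names `predict`, `evaluate`, and `optimize` are assumptions for illustration only.

```python
def predict(threshold, x):
    """Representation: the model is one number; inputs above it get label 1."""
    return 1 if x > threshold else 0


def evaluate(threshold, examples):
    """Evaluation: fraction of examples the threshold classifies correctly."""
    return sum(predict(threshold, x) == y for x, y in examples) / len(examples)


def optimize(examples):
    """Optimization: try the midpoint between each pair of neighbouring inputs."""
    xs = sorted(x for x, _ in examples)
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    return max(candidates, key=lambda t: evaluate(t, examples))


# Made-up one-dimensional data: small values are class 0, large values class 1.
examples = [(1.0, 0), (2.0, 0), (3.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]
best_threshold = optimize(examples)
print(best_threshold, evaluate(best_threshold, examples))
```

Richer model families change the representation and need smarter optimization (gradient descent, for example), but the same three-part structure remains.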


- Foundation Models

Trained on massive datasets, foundation models (FMs) are large deep learning neural networks that have changed the way data scientists approach ML. 

Rather than develop AI from scratch, data scientists use a foundation model as a starting point to develop ML models that power new applications more quickly and cost-effectively.

Foundation models are ML models that can perform a variety of tasks, such as understanding language, generating text and images, and conversing in natural language. Researchers coined the term to describe ML models that are trained on a wide range of generalized and unlabeled data.


- Generative AI vs. AI/ML

Machine learning (ML) and generative AI (GenAI) are both data-driven learning methods, but they have different goals and strategies:

  • ML: Focuses on analyzing data to find patterns and make accurate predictions. ML models can be trained to help businesses by processing data, finding patterns, and testing correlations. Deep learning models are a type of ML model built from multi-layer neural networks, loosely inspired by how the human brain processes information.
  • GenAI: Focuses on creating new data that resembles training data. Generative AI models are trained to recognize patterns in data and then use these patterns to generate new, similar data. For example, a model trained on English sentences can learn the statistical likelihood of one word following another, allowing it to generate coherent sentences.
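
The English-sentence example above can be sketched as a bigram model: count which word follows which in the training text, turn the counts into probabilities, and sample from them to generate new text. The toy corpus and the names `next_word_probs` and `generate` are assumptions for illustration; real generative models are far larger and condition on much more context than one preceding word.

```python
import random
from collections import Counter, defaultdict

# Toy training text (made up); each token is separated by a space.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count how often each word is followed by each other word.
follower_counts = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    follower_counts[current_word][next_word] += 1

# Convert the counts into conditional probabilities P(next | current).
next_word_probs = {
    word: {nxt: count / sum(counts.values()) for nxt, count in counts.items()}
    for word, counts in follower_counts.items()
}


def generate(start, length, rng):
    """Sample a word sequence by repeatedly drawing a likely next word."""
    words = [start]
    while len(words) < length and words[-1] in next_word_probs:
        options = next_word_probs[words[-1]]
        next_word = rng.choices(list(options), weights=list(options.values()))[0]
        words.append(next_word)
    return words


sentence = generate("the", 8, random.Random(0))
print(" ".join(sentence))
```

Every adjacent pair in the generated sequence was seen in the training text, yet the sequence as a whole can be one that never appeared there, which is the sense in which the model creates new data resembling its training data.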

Some common models used in GenAI include:

  • Variational Autoencoders (VAEs)
  • Generative Adversarial Networks (GANs)
  • Autoregressive models

GenAI is powered by large machine learning models that are pre-trained on vast amounts of data. A subset of these models, called large language models (LLMs), is trained on trillions of words across many natural-language tasks.



[More to come ...]
