
Tokens and Parameters


- Overview

In artificial intelligence (AI) and machine learning (ML), tokens and parameters are both central to model training, but they play different roles:

  • Tokens: The smallest units of data a model processes, such as a word, a character, or a fragment of a word. In natural language processing (NLP), tokens are the basic unit of input and output in language models: during training and inference, input text is first split into a sequence of tokens, and the model learns from the contexts in which those tokens appear.
  • Parameters: Internal variables that a model adjusts during training to improve its performance. Parameters, often called weights, can be thought of as internal settings or dials that are tuned so the model gets better at processing input tokens and generating new ones. They shape the behavior of an AI system: how it interprets language, handles input, and produces output. In general, a model with more parameters can capture more complex language patterns and represent words and concepts more richly.

In short, tokens are the data the model reads and writes, while parameters are the internal values the model adjusts during training to improve its performance.
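
To make the distinction concrete, here is a minimal Python sketch. It is an illustration only: the whitespace tokenizer, the helper names (word_tokens, build_vocab, count_parameters) and the tiny embedding-plus-output-layer model are assumptions made for this example, not how any particular production model works.

```python
def word_tokens(text):
    """Split text into lowercase word-level tokens (a deliberately naive scheme)."""
    return text.lower().split()


def char_tokens(text):
    """Alternative scheme: treat every character as one token."""
    return list(text)


def build_vocab(tokens):
    """Assign each distinct token an integer id, in order of first appearance."""
    vocab = {}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab


def encode(tokens, vocab):
    """Turn a token sequence into the id sequence a model would consume."""
    return [vocab[tok] for tok in tokens]


def count_parameters(vocab_size, embed_dim):
    """Count the weights of a hypothetical two-layer toy model.

    Assumed layout: an embedding table (vocab_size x embed_dim) followed by
    an output layer (embed_dim x vocab_size weights plus vocab_size biases).
    """
    embedding = vocab_size * embed_dim
    output_weights = embed_dim * vocab_size
    output_bias = vocab_size
    return embedding + output_weights + output_bias


text = "the cat sat on the mat"
tokens = word_tokens(text)          # the data the model processes
vocab = build_vocab(tokens)
ids = encode(tokens, vocab)
n_params = count_parameters(len(vocab), embed_dim=8)  # the dials the model tunes

print(tokens)    # ['the', 'cat', 'sat', 'on', 'the', 'mat']
print(ids)       # [0, 1, 2, 3, 0, 4]
print(n_params)  # 85
```

The token sequence grows with the length of the text, whereas the parameter count depends only on the model's shape (here, vocabulary size and embedding width).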

For anyone looking to apply modern AI techniques, whether in natural language processing (NLP) or image recognition, and even for those just beginning their ML journey, understanding tokens and parameters is essential to mastering the basics of model training.

 

- AI Parameters

AI parameters underpin how AI systems operate: they are not directly observable in a model's output, yet they drive its performance.

  • Training phase adaptability: For a large language model (LLM), these parameters are adjusted during the training phase as the model learns to predict the next word from the preceding words in context.
  • Operating function: An individual parameter has no inherent meaning on its own. Parameters work collectively, encoding the complex relationships between words and phrases in the training data.
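
The two points above can be sketched with a toy next-token predictor. This is a simplified illustration under stated assumptions, not how an LLM is actually built: the model is a bare table of logits with one parameter per (previous token, next token) pair, trained with plain gradient descent on a six-token sequence. The names softmax and train_bigram, and the token ids, are hypothetical.

```python
import math
import random


def softmax(logits):
    """Convert a list of logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_bigram(ids, vocab_size, steps=200, lr=0.5, seed=0):
    """Fit a table of logits W[prev][next] by full-batch gradient descent."""
    rng = random.Random(seed)
    # The parameters: small random numbers with no meaning of their own.
    W = [[rng.uniform(-0.1, 0.1) for _ in range(vocab_size)]
         for _ in range(vocab_size)]
    pairs = list(zip(ids[:-1], ids[1:]))  # (previous token, next token)
    losses = []
    for _ in range(steps):
        grad = [[0.0] * vocab_size for _ in range(vocab_size)]
        total_loss = 0.0
        for prev, nxt in pairs:
            probs = softmax(W[prev])
            total_loss += -math.log(probs[nxt])  # cross-entropy loss
            for j in range(vocab_size):
                # Gradient of cross-entropy with respect to logit j.
                grad[prev][j] += probs[j] - (1.0 if j == nxt else 0.0)
        # Adjust every parameter a small step against its gradient.
        for i in range(vocab_size):
            for j in range(vocab_size):
                W[i][j] -= lr * grad[i][j] / len(pairs)
        losses.append(total_loss / len(pairs))
    return W, losses


# Token ids for "the cat sat on the mat" (the=0, cat=1, sat=2, on=3, mat=4).
ids = [0, 1, 2, 3, 0, 4]
W, losses = train_bigram(ids, vocab_size=5)

# After training, the most likely token after "cat" (id 1) is "sat" (id 2).
predicted = max(range(5), key=lambda j: W[1][j])
print(losses[0] > losses[-1], predicted)
```

No single entry of W is interpretable in isolation; the prediction emerges only when a whole row is passed through softmax, which mirrors the point that parameters work collectively.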

 



