Testing Data
- Overview
In machine learning (ML), testing data is a set of data used to evaluate the performance of a trained model. It is kept separate from the training data, so the model never sees it during training. Like the training data, it is usually labeled: the correct output for each data point is known, but it is withheld from the model and used only to check the model's predictions.
Testing data is used to confirm that the model is accurate and generalizes to new data, so that it can be used for prediction and forecasting in real-world situations. For example, if the model is being trained to identify whether there's a dog in a picture, testing data can help reveal whether the model has learned incorrect features, such as treating every four-legged animal as a dog.
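As a rough illustration of the dog example, the sketch below compares a model's predictions on a test set with the known correct answers and reports the fraction that match. The function name `accuracy`, the labels, and the predictions are all hypothetical and invented for this example; accuracy is only one of several possible evaluation metrics.

```python
def accuracy(predictions, labels):
    """Return the fraction of predictions that match the known labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical test-set labels (the known correct answers).
test_labels = ["dog", "cat", "dog", "horse", "dog"]

# Hypothetical output of a model that calls every four-legged animal a dog.
predictions = ["dog", "dog", "dog", "dog", "dog"]

test_accuracy = accuracy(predictions, test_labels)
print(f"Test accuracy: {test_accuracy:.2f}")  # prints "Test accuracy: 0.60"
```

A model like this one may look reasonable if it is only checked against dog pictures; a test set that also contains other animals exposes the mistake.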
The amount of data required for testing depends on several factors, including the complexity of the problem and the learning algorithm. The available data is often split into training and testing sets, with an 80-20 split (80% for training, 20% for testing) being common.
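An 80-20 split can be sketched in a few lines of standard-library Python. The helper name `split_train_test`, its parameters, and the toy dataset are assumptions made for illustration; in practice a library routine such as scikit-learn's `train_test_split` is often used instead.

```python
import random


def split_train_test(examples, test_fraction=0.2, seed=0):
    """Shuffle the examples and split them into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(examples)              # copy so the input is left untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical dataset of 100 examples, represented here by their indices.
examples = list(range(100))
train_set, test_set = split_train_test(examples)

print(len(train_set), len(test_set))  # prints "80 20"
```

Shuffling before splitting helps avoid a test set that contains only the last examples collected, and no example ends up in both sets.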