
Reducing Dimensionality with Principal Component Analysis

(Niagara Falls, Canada - Wei-Jiun Su)

- Overview

Principal component analysis (PCA) is a technique for reducing the dimensionality of data while retaining as much of its meaningful variation as possible. It is a linear technique: it projects data with many columns (features) onto a subspace with fewer dimensions. That subspace is spanned by the principal components, which are mutually orthogonal directions ordered by how much of the variation in the data they explain.

PCA can be thought of as a way to reduce data complexity while losing as little of the information it contains as possible. It is the most widely used method for linear dimensionality reduction. It performs a linear mapping of the data to a lower-dimensional space in such a way that the variance of the data in the low-dimensional representation is maximized. If variance is taken as a measure of information, retaining the maximum variance preserves the maximum information.
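The linear mapping described above can be sketched with NumPy. This is a minimal illustration rather than a reference implementation: the function name `pca_project` and the synthetic three-column dataset are assumptions made for the example. It centers the data, takes the eigenvectors of the covariance matrix as the principal components, and projects onto the `k` directions with the largest variance.

```python
import numpy as np

def pca_project(X, k):
    """Project the rows of X onto the k directions of largest variance.

    Hypothetical helper for illustration; returns the projected data,
    the (d, k) matrix of principal directions, and all eigenvalues
    sorted in descending order.
    """
    X_centered = X - X.mean(axis=0)            # PCA works on mean-centered columns
    cov = np.cov(X_centered, rowvar=False)     # (d, d) sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :k]                # top-k principal directions
    return X_centered @ components, components, eigvals

# Synthetic data (an assumption for this sketch): the third column is
# almost a linear combination of the first two, so the data is
# effectively two-dimensional.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = np.column_stack([a, b, a + b + 0.01 * rng.normal(size=200)])

Z, components, eigvals = pca_project(X, k=2)
print(Z.shape)                   # shape of the reduced data
print(eigvals / eigvals.sum())   # fraction of variance along each PC
```

The variance of each column of `Z` equals the corresponding eigenvalue, which is the sense in which the projection keeps the directions of maximum variance.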

Here are some ways PCA can be used to reduce dimensionality:

  • Cut less important PCs: PCA ranks the principal components (PCs) by the amount of variance each one explains, so the PCs that explain the least variance can be dropped. The remaining PCs can be used as inputs when developing new models.
  • Transform the original dataset: PCA can transform a dataset with many columns into one with fewer columns while maintaining most of the original information.
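Both uses can be combined in one short sketch: rank the PCs by explained variance, keep only as many as are needed to reach a chosen share of the total variance, and transform the dataset onto them. The function name `reduce_by_variance`, the 95% threshold, and the synthetic five-column dataset are assumptions made for illustration. This version uses the singular value decomposition of the centered data, which yields the same principal directions as the covariance eigenvectors.

```python
import numpy as np

def reduce_by_variance(X, threshold=0.95):
    """Keep the fewest PCs whose cumulative explained variance reaches threshold.

    Hypothetical helper for illustration; returns the transformed data,
    the explained-variance fraction of every PC, and the number kept.
    """
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by descending singular value.
    U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)            # fraction of total variance per PC
    cumulative = np.cumsum(explained)
    # First position where the cumulative fraction reaches the threshold.
    k = int(np.searchsorted(cumulative, threshold)) + 1
    k = min(k, len(s))                         # guard against rounding at the top end
    return X_centered @ Vt[:k].T, explained, k

# Synthetic data (an assumption for this sketch): five observed columns
# driven by two latent factors plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 2))
mixing = np.array([[1.0, 0.0, 1.0, 2.0, 0.0],
                   [0.0, 1.0, 1.0, 0.0, 2.0]])
X = latent @ mixing + 0.05 * rng.normal(size=(300, 5))

Z, explained, k = reduce_by_variance(X, threshold=0.95)
print(k, Z.shape)
print(np.round(explained, 4))
```

A threshold on cumulative explained variance is only one common heuristic for deciding how many PCs to cut; a fixed number of components or a scree plot are alternatives.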

 

 

[More to come ...]

