
Principal Component Analysis

 


- Overview

Principal Component Analysis (PCA) is a dimensionality reduction technique that transforms a data set with many variables into one with fewer variables while retaining most of the original variance. It is a linear algebra technique that is widely used in data analysis and machine learning.

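A typical PCA computation proceeds in the following steps:

  • Standardize the dataset
  • Compute the covariance matrix and find its eigenvalues and eigenvectors
  • Sort the eigenvalues in descending order
  • Form the feature vector from the eigenvectors with the largest eigenvalues
  • Transform the original dataset
  • Reconstruct the data (optional)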
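
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `pca_steps`, the parameter `k` (number of components kept), and the synthetic data are all hypothetical, and the sketch assumes no feature is constant (otherwise the standardization would divide by zero).

```python
import numpy as np

def pca_steps(X, k):
    """Hypothetical helper: X has shape (n_samples, n_features); keep k components."""
    # Step 1: standardize each feature to zero mean and unit variance.
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std

    # Step 2: eigenvalues and eigenvectors of the covariance matrix.
    # eigh is used because the covariance matrix is symmetric.
    C = np.cov(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(C)

    # Step 3: arrange the eigenvalues (and matching eigenvectors) in descending order.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Step 4: the feature vector is the matrix of the top-k eigenvectors.
    W = eigvecs[:, :k]

    # Step 5: transform the standardized data into the new feature space.
    scores = Z @ W

    # Step 6: reconstruct an approximation of the original data.
    X_hat = (scores @ W.T) * std + mean
    return eigvals, W, scores, X_hat

# Synthetic, illustrative data: 200 samples of 3 correlated features.
rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.5, 0.1],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.2]])
X = rng.normal(size=(200, 3)) @ mixing

eigvals, W, scores, X_hat = pca_steps(X, k=2)
_, _, _, X_full = pca_steps(X, k=3)  # keeping every component reconstructs X exactly
```

With `k` equal to the number of features the reconstruction is exact up to floating-point error; with smaller `k` it is the best rank-`k` linear approximation of the standardized data.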

The eigenvectors of the covariance matrix are the principal components: they give the directions of the axes of the new feature space. Each eigenvalue equals the variance of the data along its eigenvector, so the largest eigenvalues mark the directions that carry the most information.
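
The link between eigenvalues and variance can be checked numerically. The sketch below, which uses a hypothetical helper `explained_variance_ratio` and synthetic data, projects centered data onto the eigenvectors and compares the variance of each projection with the corresponding eigenvalue.

```python
import numpy as np

def explained_variance_ratio(X):
    """Hypothetical helper: sorted eigenpairs and the share of variance per component."""
    Xc = X - X.mean(axis=0)
    C = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvals, eigvecs, eigvals / eigvals.sum()

# Synthetic data: one feature with standard deviation 3, one with 0.5.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(500, 2)) * np.array([3.0, 0.5])

vals, vecs, ratio = explained_variance_ratio(X_demo)

# Variance of the data measured along each eigenvector.
projected = (X_demo - X_demo.mean(axis=0)) @ vecs
proj_var = projected.var(axis=0, ddof=1)

print("eigenvalues:       ", vals)
print("projected variance:", proj_var)
print("explained ratio:   ", ratio)
```

The two printed variance vectors should agree, and the ratio shows how much of the total variance each principal component accounts for.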

Viewed as a feature engineering technique, PCA summarizes a large data table with a smaller set of "summary indices" (the principal component scores). These indices can be more easily visualized and analyzed than the original variables.

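PCA can be based on either the covariance matrix or the correlation matrix. Using the correlation matrix is equivalent to standardizing each variable first, which matters when the variables are measured on different scales. The new variables (the PCs) are derived from the dataset itself, rather than being pre-defined basis functions.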
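
The difference between the two choices shows up when features live on very different scales. The following sketch uses made-up data (the scale factors and variable names are illustrative assumptions) to compare the eigenvalues of the covariance matrix, the correlation matrix, and the covariance matrix of standardized data.

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative data: two correlated features on very different scales.
data = rng.normal(size=(300, 2)) * np.array([100.0, 1.0])
data[:, 1] += 0.01 * data[:, 0]

# Eigenvalues in descending order (eigvalsh returns them ascending).
cov_vals = np.linalg.eigvalsh(np.cov(data, rowvar=False))[::-1]
corr_vals = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]

# Covariance matrix of the standardized data.
Z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
std_cov_vals = np.linalg.eigvalsh(np.cov(Z, rowvar=False))[::-1]

print("covariance-based share of PC1: ", cov_vals[0] / cov_vals.sum())
print("correlation-based share of PC1:", corr_vals[0] / corr_vals.sum())
```

In the covariance-based version the large-scale feature dominates the first component almost entirely, while the correlation-based eigenvalues match those obtained from standardized data.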

PCA is a popular unsupervised algorithm with applications including exploratory data analysis, data compression, de-noising, and dimensionality reduction.
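
As one illustration of the de-noising use case, the sketch below builds a synthetic rank-1 signal in 10 dimensions, adds Gaussian noise, and keeps only the first principal component. It is a toy example under stated assumptions: the data, noise level, and variable names are invented, and the components are obtained from the singular value decomposition of the centered data, which yields the same directions as the eigenvectors of the covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(3)
n_samples, n_features = 400, 10

# Synthetic clean signal: every sample lies along a single unit direction.
direction = rng.normal(size=n_features)
direction /= np.linalg.norm(direction)
latent = 3.0 * rng.normal(size=(n_samples, 1))
clean = latent @ direction[None, :]

# Observed data: clean signal plus independent Gaussian noise.
noisy = clean + 0.3 * rng.normal(size=clean.shape)

# Principal directions from the SVD of the centered data (rows of Vt).
mean = noisy.mean(axis=0)
centered = noisy - mean
U, S, Vt = np.linalg.svd(centered, full_matrices=False)

# Keep one component: project onto it, then map back to the original space.
top = Vt[:1]
denoised = (centered @ top.T) @ top + mean

err_noisy = np.mean((noisy - clean) ** 2)
err_denoised = np.mean((denoised - clean) ** 2)
print("mean squared error before:", err_noisy)
print("mean squared error after: ", err_denoised)
```

The same projection also acts as compression: each sample is stored as one score instead of ten values, at the cost of discarding the variance in the dropped components.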


 
