Machine Learning

# Principal Component Analysis

PCA is used to reduce the dimensions of a large data set, such as a set of feature vectors $x$. The high-dimensional data is summarized by an orthogonal transformation into uncorrelated principal components.

Dimension reduction is done by keeping only the eigenvectors (principal components) with large eigenvalues, i.e. the directions that explain the most variance. The first component explains the most variance in the data, so an elbow plot of the eigenvalues is used to choose how many principal components to keep for an analysis. The eigenvectors of the covariance matrix $\Sigma$ form an orthogonal basis, and the corresponding components are uncorrelated.
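The selection step above can be sketched with NumPy. This is a minimal illustration on toy data (the data set, the 95% variance threshold, and the correlated second feature are assumptions, not from the text): compute $\Sigma$, take its eigenvalues largest-first, and keep enough components to cover the chosen fraction of the total variance.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                             # 200 samples, 5 features (toy data)
X[:, 1] = 2 * X[:, 0] + rng.normal(scale=0.1, size=200)   # make one feature correlated

Sigma = np.cov(X, rowvar=False)                           # 5x5 covariance matrix
eigvals = np.linalg.eigvalsh(Sigma)[::-1]                 # eigenvalues, largest first

# Fraction of total variance explained by each component
ratio = eigvals / eigvals.sum()

# Number of components needed to explain 95% of the variance
k = int(np.searchsorted(np.cumsum(ratio), 0.95)) + 1
```

In practice one would plot `np.cumsum(ratio)` (or the eigenvalues themselves) and look for the elbow rather than fixing a hard threshold.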

## Covariance Matrix

The covariance matrix $\Sigma$ is the matrix of all pairwise covariances between a set of variables $X_1, \dots, X_n$. The $(i, j)$ entry of $\Sigma$ is $\Sigma_{ij} = \mathrm{cov}(X_i, X_j)$.
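As a small sketch of this definition (the data values are made up for illustration), `np.cov` builds exactly this matrix from a samples-by-variables array:

```python
import numpy as np

X = np.array([[2.0, 0.0],
              [0.0, 2.0],
              [3.0, 1.0],
              [1.0, 3.0]])          # 4 samples of 2 variables

Sigma = np.cov(X, rowvar=False)     # 2x2 covariance matrix
# Sigma[i, j] == cov(X_i, X_j); Sigma is symmetric, and its
# diagonal entries are the sample variances of each variable.
```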

## Eigenvalues

Every eigenvector has a corresponding eigenvalue, which gives the amount of variance in the direction of that eigenvector. The eigenvalues $\lambda$ of $\Sigma$ are found by solving the characteristic equation

$\det(\Sigma - \lambda I) = 0.$
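For a $2 \times 2$ matrix this determinant expands to the quadratic $\lambda^2 - \mathrm{tr}(\Sigma)\,\lambda + \det(\Sigma) = 0$, which can be solved directly. A sketch with an assumed example matrix, cross-checked against NumPy's symmetric eigensolver:

```python
import numpy as np

Sigma = np.array([[2.0, 1.0],
                  [1.0, 2.0]])       # example symmetric covariance matrix

# det(Sigma - lambda*I) = 0 for a 2x2 matrix reduces to
# lambda^2 - trace*lambda + det = 0
tr, det = np.trace(Sigma), np.linalg.det(Sigma)
lams = np.roots([1.0, -tr, det])     # roots of the characteristic polynomial

# Same eigenvalues from NumPy's solver for symmetric matrices
assert np.allclose(sorted(lams), np.linalg.eigvalsh(Sigma))
```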

## Eigenvectors

The principal components are the eigenvectors of the covariance matrix $\Sigma$. Each eigenvector represents a direction in feature space; for $n$-dimensional data there are $n$ eigenvectors. An eigenvector $X$ of $\Sigma$ is found by solving

$(\Sigma - \lambda I)X = 0$

for $X$ at each eigenvalue $\lambda$, which yields a set of $n$ eigenvectors and $n$ eigenvalues for an $n \times n$ matrix.
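In code one rarely solves $(\Sigma - \lambda I)X = 0$ by hand; `np.linalg.eigh` returns the eigenvalues and eigenvectors of a symmetric matrix together. A sketch, reusing the same assumed example matrix, that verifies each returned vector satisfies the defining equation:

```python
import numpy as np

Sigma = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

# eigh returns eigenvalues in ascending order with matching
# orthonormal eigenvectors as the columns of eigvecs
eigvals, eigvecs = np.linalg.eigh(Sigma)

# Each column v solves (Sigma - lambda*I) v = 0, i.e. Sigma v = lambda v
for lam, v in zip(eigvals, eigvecs.T):
    assert np.allclose(Sigma @ v, lam * v)
```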

## Transformation

The eigenvectors form an orthogonal matrix $W$ that is used as a transformation matrix on the features, producing a new set of features from the data:

$y = W^T x$
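Putting the pieces together, a minimal end-to-end sketch (toy data assumed; samples in rows, so $y = W^T x$ for each sample becomes $Y = XW$ after centering):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))          # 100 samples, 3 features (toy data)

Sigma = np.cov(X, rowvar=False)
eigvals, W = np.linalg.eigh(Sigma)     # columns of W are eigenvectors of Sigma
W = W[:, ::-1]                         # reorder so the largest eigenvalue comes first

# y = W^T x applied to every (centered) sample at once
Y = (X - X.mean(axis=0)) @ W
# The columns of Y are the principal component scores; their covariance
# matrix is diagonal (the components are uncorrelated), with the
# eigenvalues of Sigma on the diagonal.
```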