In PCA, we want to find a set of orthogonal basis vectors that maximize the variance of the projections of the data onto them. Let \( \mathbf{X} \) be the \( n \times d \) matrix where each row is a mean-centered data point and \( \mathbf{w} \) be a \( d \times 1 \) basis vector. The optimization problem can be formulated as –

\[ \max_{\mathbf{w}} \; \| \mathbf{X}\mathbf{w} \|^2 \quad \text{subject to} \quad \mathbf{w}^\intercal \mathbf{w} = 1 \]
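Since the projection of the \( i \)-th data point onto \( \mathbf{w} \) is the scalar \( \mathbf{x}_i^\intercal \mathbf{w} \), the objective is proportional to the sample variance of the projections –

\[ \| \mathbf{X}\mathbf{w} \|^2 = \sum_{i=1}^{n} \left( \mathbf{x}_i^\intercal \mathbf{w} \right)^2 \]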
Introducing a Lagrange multiplier \( \lambda \) for the constraint, we get –

\[ J(\mathbf{w}) = \| \mathbf{X}\mathbf{w} \|^2 - \lambda \left( \mathbf{w}^\intercal \mathbf{w} - 1 \right) \]
Rewriting the norm in terms of dot products –

\[ J(\mathbf{w}) = \mathbf{w}^\intercal \mathbf{X}^\intercal \mathbf{X}\, \mathbf{w} - \lambda \left( \mathbf{w}^\intercal \mathbf{w} - 1 \right) \]
The Jacobian \( \nabla J(\mathbf{w}) \) is given by –

\[ \nabla J(\mathbf{w}) = 2\, \mathbf{w}^\intercal \mathbf{X}^\intercal \mathbf{X} - 2 \lambda\, \mathbf{w}^\intercal \]
The maxima will occur when \( \nabla J(\mathbf{w}) = 0 \), i.e. when –

\[ \mathbf{w}^\intercal \mathbf{X}^\intercal \mathbf{X} = \lambda\, \mathbf{w}^\intercal \]
Taking the transpose on both sides (and noting that \( \mathbf{X}^\intercal \mathbf{X} \) is symmetric), we get –

\[ \mathbf{X}^\intercal \mathbf{X}\, \mathbf{w} = \lambda \mathbf{w} \]
From the last equation, it is easy to see that the orthogonal basis vectors are the eigenvectors of the scatter matrix \( \mathbf{X}^\intercal \mathbf{X} \). This can be conveniently implemented in MATLAB using –

[eigvec, eigval] = eig(X'*X)

where the columns of eigvec are the eigenvectors and eigval is a diagonal matrix holding the corresponding eigenvalues.
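Note that eig returns the eigenpairs in no guaranteed order, and the derivation above assumes mean-centered data. A minimal end-to-end sketch (the variable names Xc, k, and scores are illustrative) might look like –

Xc = X - mean(X, 1);                         % mean-center the data (uses R2016b+ implicit expansion)
[eigvec, eigval] = eig(Xc' * Xc);            % eigendecomposition of the scatter matrix
[~, order] = sort(diag(eigval), 'descend');  % sort by eigenvalue, largest first
eigvec = eigvec(:, order);                   % principal components, highest variance first
k = 2;                                       % number of components to keep (illustrative)
scores = Xc * eigvec(:, 1:k);                % project the data onto the top k components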

Using PCA, we will get a set of at most \( d \) eigenvectors with their corresponding non-zero eigenvalues. The eigenvalue represents how much of the variance in the data is captured by its principal component. The greater the eigenvalue, the more information the principal component contains.
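This follows directly from the derivation: left-multiplying \( \mathbf{X}^\intercal \mathbf{X}\, \mathbf{w} = \lambda \mathbf{w} \) by \( \mathbf{w}^\intercal \) and using the constraint \( \mathbf{w}^\intercal \mathbf{w} = 1 \) gives –

\[ \| \mathbf{X}\mathbf{w} \|^2 = \mathbf{w}^\intercal \mathbf{X}^\intercal \mathbf{X}\, \mathbf{w} = \lambda\, \mathbf{w}^\intercal \mathbf{w} = \lambda \]

so each eigenvalue is exactly the variance of the projections onto its eigenvector, and the fraction of variance explained by the \( i \)-th component is \( \lambda_i / \sum_j \lambda_j \).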

Resources

LaTeXed notes of the blog post are available in –