Face Recognition Using Principal Components Analysis (PCA)
Contents
- Introduction to Face Recognition
- Principal Component Analysis (PCA)
- The eigenface approach
- Experimental Results
- References and Resources
- Project
1. Introduction to Face Recognition
Face recognition is a key biometric technology with a wide range of potential applications related to national security and safety, including surveillance, information security, access control, identity-fraud prevention, gang tracking, banking, and finding missing children. Faces are highly dynamic and can vary considerably in orientation, lighting, scale, and facial expression; face recognition is therefore considered a difficult problem to solve. Previous studies on face recognition fall into one of two main categories [1]: feature-based and holistic-based. Feature-based methods identify faces by extracting landmarks, or local features, from an image of the subject's face. Methods using geometrical relationships, such as the relative position, size, and/or shape of the eyes, nose, cheekbones, and jaw, fall into this category [2]. However, facial features are not always easy to extract, and these methods discard textural information, such as the "smoothness" of faces or hair style, that might carry strong identity information. This observation has led to holistic-based methods, which use features extracted from the whole image (i.e., global features) [5]. The main drawback of holistic-based approaches is that they are sensitive to occlusion; however, they tolerate noise better than feature-based approaches.
Considerable progress has been made in template-based face recognition research over the last decade, especially with the development of powerful models of face appearance [1]. These models represent faces as points in high-dimensional spaces and employ dimensionality reduction to find a more meaningful representation. The key observation is that although face images can be regarded as points in a high-dimensional space, they often lie on a manifold (i.e., subspace) of much lower dimensionality, embedded in the high-dimensional image space [3]. The main issue is how to properly define and determine a low-dimensional subspace of face appearance in a high-dimensional image space.
Dimensionality reduction techniques using linear transformations have been very popular in determining the intrinsic dimensionality of the manifold as well as extracting its principal directions (i.e., basis vectors). The most prominent method in this category is Principal Component Analysis (PCA) [4]. PCA determines the basis vectors by finding the directions of maximum variance in the data, and it is optimal in the sense that it minimizes the error between the original image and the one reconstructed from its low-dimensional representation. PCA has been very popular in face recognition, especially with the development of the method of eigenfaces [5]. Its success has triggered significant research in the area of face recognition, and many powerful dimensionality reduction techniques (e.g., Probabilistic PCA, Linear Discriminant Analysis (LDA), Independent Component Analysis (ICA), Local Feature Analysis (LFA), and Kernel PCA) have been proposed for finding appropriate low-dimensional face representations [1].
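To make the reconstruction property concrete, the example below is a minimal sketch (added here for illustration, not part of the original text or its references); it assumes an off-the-shelf PCA implementation from scikit-learn and the Olivetti faces dataset as an example setup, and reports the retained variance and reconstruction error:
<pre>
# Minimal sketch (illustrative only): PCA-based dimensionality reduction of face
# images with scikit-learn; the Olivetti faces dataset is an assumed example setup.
import numpy as np
from sklearn.datasets import fetch_olivetti_faces
from sklearn.decomposition import PCA

faces = fetch_olivetti_faces()        # 400 grayscale face images, 64x64 pixels each
X = faces.data                        # shape (400, 4096): one flattened image per row

pca = PCA(n_components=50)            # keep the 50 leading principal components
Y = pca.fit_transform(X)              # low-dimensional representation, shape (400, 50)
X_rec = pca.inverse_transform(Y)      # reconstruction from the 50 components

print("variance retained:", pca.explained_variance_ratio_.sum())
print("mean squared reconstruction error:", np.mean((X - X_rec) ** 2))
</pre>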
2. Principal Component Analysis (PCA)
Typically, problems arise when performing recognition in high-dimensional spaces (i.e., the "curse of dimensionality"). Significant improvements can be achieved by first mapping the data into a lower-dimensional subspace. Applying dimensionality reduction to some vector <math>\mathbf{x}=[x_1, x_2, ..., x_N]^T</math> in an <math>N</math>-dimensional space yields another vector <math>\mathbf{y}=[y_1, y_2, ..., y_K]^T</math> in a <math>K</math>-dimensional space, where <math>K<N</math>. In principle, dimensionality reduction leads to information loss; the goal of PCA is to reduce the dimensionality of the data while preserving as much of the information in the original data as possible. This is equivalent to retaining as much as possible of the variation present in the original data [4]. In this context, PCA computes a linear transformation <math>\mathbf{T}</math> that maps data from a high-dimensional space to a lower-dimensional subspace, as shown below:
<math> \begin{cases} y_1=t_{11}x_1+t_{12}x_2+\cdots+t_{1N}x_N \\ y_2=t_{21}x_1+t_{22}x_2+\cdots+t_{2N}x_N \\ \vdots \\ y_K=t_{K1}x_1+t_{K2}x_2+\cdots+t_{KN}x_N \end{cases}</math>
or <math>\mathbf{y}=\mathbf{Tx}</math> where
<math>\mathbf{T} = \begin{bmatrix}
t_{11} & t_{12} & \cdots & t_{1N}\\ t_{21}& t_{22} & \cdots & t_{2N}\\ \vdots & \vdots & \ddots & \vdots \\ t_{K1}& t_{K2} & \cdots & t_{KN}
\end{bmatrix}</math>
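For concreteness, the short NumPy sketch below (an illustration added here, not part of the original derivation) applies an arbitrary <math>K \times N</math> matrix <math>\mathbf{T}</math> to an <math>N</math>-dimensional vector <math>\mathbf{x}</math>; in PCA, the rows of <math>\mathbf{T}</math> are chosen as described next.
<pre>
# Minimal sketch of the mapping y = T x with NumPy (dimensions are assumed values);
# here T is an arbitrary K x N matrix, whereas PCA would take its rows to be the
# leading eigenvectors of the data covariance matrix.
import numpy as np

N, K = 1024, 20                  # original and reduced dimensionality
rng = np.random.default_rng(0)

x = rng.standard_normal(N)       # an N-dimensional data vector
T = rng.standard_normal((K, N))  # a K x N transformation matrix

y = T @ x                        # the K-dimensional representation
print(x.shape, y.shape)          # (1024,) (20,)
</pre>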
The optimum transformation <math>\mathbf{T}</math> is the one that minimizes the reconstruction error <math>\left \| \mathbf{x}-\hat{\mathbf{x}} \right \|</math>, where <math>\hat{\mathbf{x}}</math> is the approximation of <math>\mathbf{x}</math> recovered from its low-dimensional representation <math>\mathbf{y}</math>. According to the theory of PCA (see [4]), the optimum low-dimensional space is defined by the "best" eigenvectors of the covariance matrix of the data (i.e., the eigenvectors corresponding to the largest eigenvalues of the covariance matrix, also referred to as "principal components").
Suppose <math>\mathbf{I}_1, \mathbf{I}_2, \cdots, \mathbf{I}_M</math> is a set of <math>M</math> vectors, each of size <math>N \times 1</math>; the main steps of PCA are given below:
Step 1: Compute the average vector <math>\mathbf{\bar{I}}=\frac{1}{M}\sum_{i=1}^{M}\mathbf{I}_i</math>
Step 2: Normalize vectors by subtracting the average vector: <math>\mathbf{\Phi}_i=\mathbf{I}_i-\mathbf{\bar{I}}</math>
Step 3: Form the matrix <math>\mathbf{A}=[\mathbf{\Phi}_1,\mathbf{\Phi}_2 ,\cdots,\mathbf{\Phi}_M ] </math> ( <math>N \times M</math> matrix )
Step 4: Compute the covariance matrix: <math>\mathbf{C}=\frac{1}{M}\sum_{n=1}^{M}\mathbf{\Phi}_n\mathbf{\Phi}_n^T=\frac{1}{M}\mathbf{AA}^T</math> (<math>N \times N</math> matrix; characterizes the variance of the data). See [4] for a review of covariance matrices.
Step 5: Compute the eigenvalues <math>\lambda_1, \lambda_2, \cdots , \lambda_N</math> and eigenvectors <math>\mathbf{u}_1, \mathbf{u}_2, \cdots , \mathbf{u}_N</math> of <math>\mathbf{C}</math> (assume that <math>\lambda_1 > \lambda_2 > \cdots > \lambda_N</math>). See [4] for a review on eigenvalues and eigenvectors.
Since <math>\mathbf{C}</math> is symmetric, its eigenvectors <math>\mathbf{u}_1, \mathbf{u}_2, \cdots , \mathbf{u}_N</math> form a set of basis vectors; that is, any vector <math>\mathbf{I}</math> in the same space can be written as a linear combination of the eigenvectors. Using the mean-subtracted (normalized) vector, we have:
<math>\mathbf{I}-\mathbf{\bar{I}}=y_1\mathbf{u}_1+y_2\mathbf{u}_2+\cdots +y_N\mathbf{u}_N=\sum_{i=1}^{N}y_i\mathbf{u}_i</math>
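The steps above translate almost directly into code. The sketch below (added here as an illustration, not part of the source) implements Steps 1 through 5 with NumPy on randomly generated vectors and verifies the eigenvector expansion of a mean-subtracted vector:
<pre>
# Hedged sketch of Steps 1-5 using NumPy; the data are random vectors generated
# purely for illustration (N-dimensional columns I_1, ..., I_M).
import numpy as np

rng = np.random.default_rng(0)
N, M = 100, 40                                  # assumed dimensions
I = rng.standard_normal((N, M))                 # columns are the vectors I_1, ..., I_M

# Step 1: average vector
I_bar = I.mean(axis=1, keepdims=True)           # N x 1

# Steps 2-3: subtract the average and form A = [Phi_1, ..., Phi_M]
A = I - I_bar                                   # N x M

# Step 4: covariance matrix C = (1/M) A A^T
C = (A @ A.T) / M                               # N x N

# Step 5: eigenvalues/eigenvectors of C, sorted by decreasing eigenvalue
eigvals, eigvecs = np.linalg.eigh(C)            # eigh: C is symmetric
order = np.argsort(eigvals)[::-1]
eigvals, U = eigvals[order], eigvecs[:, order]  # columns of U are u_1, ..., u_N

# Expansion of a mean-subtracted vector in the eigenvector basis:
# I - I_bar = sum_i y_i u_i, with y_i = u_i^T (I - I_bar)
x = I[:, [0]]                                   # first sample, kept as an N x 1 column
y = U.T @ (x - I_bar)                           # coefficients y_1, ..., y_N
x_rec = I_bar + U @ y                           # exact reconstruction using all N terms
print(np.allclose(x, x_rec))                    # True
</pre>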