 Research
 Open access
Deocclusion and recognition of frontal face images: a comparative study of multiple imputation methods
Journal of Big Data volume 11, Article number: 60 (2024)
Abstract
Automatic face recognition algorithms have become increasingly necessary with the development and extensive use of face recognition technology, particularly in the era of machine learning and artificial intelligence. However, unconstrained environmental conditions degrade the quality of acquired face images and may deteriorate the performance of many classical face recognition algorithms. Against this backdrop, many researchers have given considerable attention to image restoration and enhancement mechanisms, but with minimal focus on occlusion-related and multiple-constrained problems. Although occlusion-robust face recognition modules via sparse representation have been explored, they require a large number of features to achieve correct computations and to maximize robustness to occlusions. Such an approach may therefore become deficient in the presence of random occlusions of relatively moderate magnitude. This study assesses the robustness of a face recognition module based on Principal Component Analysis and Singular Value Decomposition, with Discrete Wavelet Transform preprocessing and city-block distance classification (DWT-PCA/SVD-L1), to image degradations due to random occlusions of varying magnitudes (10% and 20%) in test images acquired under varying expressions. Numerical evaluation of the performance of the DWT-PCA/SVD-L1 face recognition module showed that using the deoccluded faces for recognition significantly enhanced the performance of the study recognition module at each level (10% and 20%) of occlusion. The algorithm attained the highest recognition rates of 85.94% and 78.65% at 10% and 20% occlusion respectively when the MICE deoccluded face images were used for recognition. With the exception of Entropy, where the MICE deoccluded face images attained the highest average value, the MICE and RegEM resulted in images of similar quality as measured by their absolute mean brightness error (AMBE) and peak signal-to-noise ratio (PSNR).
The study therefore recommends MICE as a suitable imputation mechanism for deocclusion of face images acquired under varying expressions.
Introduction
The ability of humans to detect, identify and classify faces and other facial attributes (gender, race, emotion) under variable conditions with apparent efficiency derives from a network of brain regions (the fusiform face area and anterior inferotemporal cortex) highly tuned to face information [1, 2]. The study of how machines can perform the same tasks in real time has attracted much attention among researchers, owing to rising security concerns and the fast-paced evolution of supportive technologies. The field of automatic face recognition concerns the use of machines to recognize the identity of persons from a database of stored face images.
Several studies [3,4,5,6,7,8] have shown that the performance of automatic face recognition modules is affected by the quality of the face images acquired and used for recognition. In most instances, image quality is problematic due to the acquisition of images in unconstrained environments. Image quality is often eroded by imbalanced illumination effects, wild poses, noise, occlusions and varying facial expressions. When occlusions are the underlying cause of image degradation, the problem becomes more intractable, as occlusions obscure salient features of the face needed for training recognition algorithms, creating larger intra-subject variability compared with inter-subject variability, such that images of different individuals appear more similar than images of the same individual [9]. This may be further compounded by the wide range of forms of occlusion, which may include randomly occurring occlusions of different magnitudes. Notwithstanding this, only a few studies have focused on how to resolve occlusion-related problems in face recognition.
References [4, 10, 11] have shown that enhancing the quality of acquired images prior to recognition improves the performance of face recognition algorithms. However, choosing the right image enhancement mechanism is often challenging, because the choice is contingent on knowledge of the underlying cause of image degradation which, in most instances, is limited. Also, a combination of more than one enhancement mechanism may be required to attain optimal results, but specifying the right combination of enhancement mechanisms is difficult and continues to be a gap in the literature that requires more research. To deal with occlusions in face images, some researchers have advocated the use of occlusion-invariant features or the non-occluded portions (sparse representation) of the face for recognition, and many such approaches have attained remarkable success [12, 13]. However, these approaches sometimes become deficient for some classes of occlusion, especially when there is significant loss of facial features or pixels. Deocclusion techniques therefore become indispensable in these situations. These methods may leverage the inherent topology of the face [14] to reconstruct missing facial components, or operate from the premise that missing facial pixels can be inferred from the observed facial pixels in order to restore the occluded regions of face images [15]. A key stage in the deocclusion process is the choice of deocclusion mechanism. This is crucial in automatic face recognition because inappropriate handling of occlusions could further degrade the quality of images and may lead to a significant drop in the performance of an otherwise well-performing face recognition module. Chan and Shen [16] considered the problem of image restoration using a diffusion-based approach. Diffusion-based methods are considered optimal for filling small patches in an image [17].
The use of exemplar-based methods has also been explored in the literature [18]. Exemplar-based methods are known to be optimal for filling larger texture areas. However, the approach sometimes decreases the connectivity of structure and the clearness of texture while increasing the time complexity [19]. Some authors have leveraged the advantages of both the diffusion-based method and the texture synthesis technique by first dividing the image into structure and texture layers, then using the diffusion-based method to inpaint the structure layer and the texture synthesis technique to inpaint the texture layer. According to [19], this approach helps overcome the smoothing disadvantage of diffusion-based inpainting algorithms, but it is still very difficult to recover larger missing structures. Refer to [20,21,22] for more information on occlusion-aware systems.
Occlusions in face images can be cast as a missing-data problem. Therefore, deocclusion as used in this study refers to any process that attempts to restore missing pixels in face images. In general missing-value problems, the use of multiple imputation methods has been widely explored [4]. Such methods aim to find plausible values for the missing data, are known to give unbiased results, and can account for the uncertainty in the imputations. This gives multiple imputation methods an edge over single imputation methods [23]. Despite these advantages, some researchers object to their use in handling missing values in datasets, arguing that imputation methods only synthesize numerical (non-real) values for the missing data [24]. Other researchers [25], on the other hand, assert that imputation methods aim not to recreate the missing values in a dataset, but are a means of handling missing data in order to arrive at proper statistical inferences under a given missingness mechanism. Multiple Imputation by Chained Equations (MICE) [26], MissForest [27] and the Expectation Maximization (EM)-based methods (Regularized Expectation Maximization, RegEM) [28] are among the most successful contemporary multiple imputation methods in practice. These methods impute missing values in datasets based on multiple regression models. MICE uses the conditional distributions of variables with missing data, is based on Markov Chain Monte Carlo (MCMC) and attains imputations via Gibbs sampling; MissForest draws imputations via random forest models, while RegEM is a likelihood-based method [4].
In this study, we assess the robustness of the DWT-PCA/SVD-L1 face recognition module to image degradations due to random occlusions of varying magnitudes (10% and 20%) in test images acquired with varying expressions. The study also helps identify the appropriate image restoration mechanism when dealing with moderately low levels of occlusion in face images acquired under varying expressions.
The rest of the paper is organised as follows: section Materials and methods discusses the data acquisition, the mathematical underpinnings of the adopted imputation mechanisms, the recognition modules and their implementation. In section Results and discussion, we evaluate the recognition modules under the adopted imputation mechanisms. We conclude by summarising the overall achievements of the study, with some recommendations and directions for future developments, in section Conclusion and recommendation.
Materials and methods
Data acquisition
The study used two standard face image datasets to benchmark the performance of the study algorithm.
Dataset 1 The Japanese Female Facial Expression (JAFFE) dataset is homogeneous in terms of race and gender. It contains face images of ten (10) female Japanese subjects captured across seven universally accepted principal emotions (neutral, angry, disgust, fear, sad, surprise and happy).
Dataset 2 The Cohn-Kanade AU-Coded Facial Expression (CKFE) dataset is heterogeneous with regard to race and gender. It contains face images of twenty-two (22) subjects of mixed race and gender, also captured across the same seven universally accepted principal emotions.
The neutral expressions of subjects in the two datasets (32 in total) were captured into the train-image database for training the study algorithm after face detection and cropping. Figure 1 depicts the face images of subjects in the train-image database.
All the other face images of subjects acquired under varying expressions (sad, happy, disgust, surprise, angry and fear) in each dataset were synthetically occluded (10% and 20% missingness or degradation) after face detection and cropping. Figure 2 (test-image database 1) and Fig. 3 (test-image database 2) contain expression-variant face images with 10% and 20% occlusions respectively.
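The synthetic occlusion step described above can be sketched as follows. This is an illustrative assumption of the procedure (random pixel missingness at a given rate, with missing pixels encoded as NaN), not the authors' exact implementation:

```python
import numpy as np

def occlude_random(image, rate, seed=0):
    """Synthetically occlude a face image by marking a random
    fraction (`rate`) of its pixels as missing (NaN)."""
    rng = np.random.default_rng(seed)
    occluded = image.astype(float).copy()
    mask = rng.random(image.shape) < rate  # True -> pixel is occluded
    occluded[mask] = np.nan
    return occluded, mask

# Example: 20% random missingness on a toy 8x8 "face"
face = np.arange(64, dtype=float).reshape(8, 8)
occ, mask = occlude_random(face, rate=0.20)
```

The NaN encoding is convenient because the imputation methods discussed below all treat NaN as the missing-value marker.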
The multiple-constrained (occluded, expression-variant) face images in Figs. 2 and 3 were subsequently reconstructed using the MICE, MissForest and RegEM imputation techniques respectively and captured into separate test-image databases.
Deocclusion via imputation methods
Reconstructive methods seek to restore missing components or pixel information for the purposes of completeness, good visual effects, as well as providing relevant features for subsequent feature extraction.
Multiple imputation methods have been successfully used to deal with missing-data problems in many applications. In this work, we use the MICE, MissForest and RegEM imputation methods to remove occlusions in test faces, based on the assumption that such methods can find plausible pixel values to replace missing pixels using information from the existing pixels (the non-occluded portions of the face).
Imputation algorithms
Let \(Y_{n \times p} = (Y_1, Y_2,\dots , Y_p)\) be the image matrix of an occluded face image. For each column (variable) \(Y_j, j\in \{1,2,\dots ,p\}\) that contains missing pixels, Y is divided into the four parts indicated below: \(Y_{n\times p} = \begin{bmatrix} y_{1,1} & y_{1,2} & \dots & y_{1,j}^{(obs)} & \dots & y_{1,p-1} & y_{1,p}\\ y_{2,1} & y_{2,2} & \dots & y_{2,j}^{(obs)} & \dots & y_{2,p-1} & y_{2,p}\\ \vdots & \vdots & & \vdots & & \vdots & \vdots \\ y_{n-1,1} & y_{n-1,2} & \dots & y_{n-1,j}^{(mis)} & \dots & y_{n-1,p-1} & y_{n-1,p} \\ y_{n,1} & y_{n,2} & \dots & y_{n,j}^{(mis)} & \dots & y_{n,p-1} & y_{n,p} \end{bmatrix},\) where

\(Y_j^{(obs)}\) \(\longrightarrow\) Observed pixels of \(Y_j\) and \(k_j^{obs} \in \{1,2,\dots ,n\}\) is the corresponding row index set of the observed pixels.

\(Y_j^{(mis)}\) \(\longrightarrow\) Unobserved pixels of \(Y_j\) and \(k_j^{mis} \in \{1,2,\dots ,n\}\) is the corresponding row index set of the missing pixels. Note that \(k_j^{mis} = \{1,2,\dots ,n\} \setminus k_j^{obs}\).

\(Y_{j}^{k(j,obs)}\) \(\longrightarrow\) The part of all columns other than the jth column \(Y_{j}\), with row index set \(k_j^{obs}\).

\(Y_{j}^{k(j,mis)}\) \(\longrightarrow\) The part of all columns other than \(Y_{j}\), with row index set \(k_j^{mis}\).
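The partition above can be computed directly. A minimal sketch (assuming NaN marks missing pixels; the helper name is ours, not from the paper):

```python
import numpy as np

# Toy image matrix with missing pixels encoded as NaN
Y = np.array([[1.0,    2.0],
              [np.nan, 4.0],
              [5.0,    np.nan]])

def partition_column(Y, j):
    """Return Y_j^(obs), the corresponding parts of the other
    columns, and the row index sets k_j^(obs) and k_j^(mis)."""
    mis = np.isnan(Y[:, j])
    k_obs = np.flatnonzero(~mis)      # k_j^(obs)
    k_mis = np.flatnonzero(mis)       # k_j^(mis)
    other = np.delete(Y, j, axis=1)   # all columns except Y_j
    return Y[k_obs, j], other[k_obs], other[k_mis], k_obs, k_mis
```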
Multiple imputation with chain equations (MICE)
Given the feature matrix of an occluded face image, the MICE algorithm imputes missing pixels using univariate conditional distributions for each feature given all other features [26]. It is assumed that the face image feature matrix has a full multivariate distribution from which the conditional distribution of each feature is obtained, although such a joint distribution need not be explicitly specified [29], provided the conditional distribution of each feature is stated, and may not even exist [30, 31].
The MICE algorithm is an iterative method which imputes missing values based on the fitted conditional (regression) models until a stopping/termination criterion is met and uses the Gibbs sampler to generate multiple imputations.
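The chained-equations idea can be sketched with plain linear regressions. This is a minimal illustration of the core loop only (a full MICE implementation would add Gibbs-style random draws and pool several imputed datasets); it is not the authors' implementation:

```python
import numpy as np

def mice_impute(Y, n_iter=10):
    """Minimal chained-equations imputer: each column with missing
    pixels is regressed on all other columns, and its missing
    entries are replaced by the regression predictions, iterating
    until n_iter sweeps are done (a fixed iteration budget stands
    in for a proper convergence criterion here)."""
    Y = Y.astype(float).copy()
    mis = np.isnan(Y)
    col_means = np.nanmean(Y, axis=0)
    Y[mis] = np.take(col_means, np.where(mis)[1])   # mean initialisation
    for _ in range(n_iter):
        for j in range(Y.shape[1]):
            if not mis[:, j].any():
                continue
            X = np.delete(Y, j, axis=1)
            X = np.column_stack([np.ones(len(X)), X])  # add intercept
            obs = ~mis[:, j]
            beta, *_ = np.linalg.lstsq(X[obs], Y[obs, j], rcond=None)
            Y[mis[:, j], j] = X[mis[:, j]] @ beta
    return Y
```

For example, if one column is an exact linear function of another, the missing entry is recovered exactly after the first sweep.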
MissForest
MissForest [27] is a nonparametric multiple imputation technique based on random forests [32]. Unlike MICE, which fits parametric conditional models, the MissForest algorithm specifies a random forest model for each variable with missing pixels and uses the other variables to predict the missing values. As with MICE, this process is done iteratively for all missing pixels until a stopping criterion is met. The advantage of using random forest models is that they provide much flexibility, capture complex nonlinear interactions [27], require little tuning and provide internally cross-validated error estimates [33].
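A MissForest-style imputer can be approximated with scikit-learn, under the assumption that scikit-learn is available (the reference implementations are the R `missForest` package and its ports); `IterativeImputer` with a random-forest estimator reproduces the per-column fit-and-predict loop:

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import IterativeImputer

def missforest_impute(Y, n_trees=50, max_iter=5, seed=0):
    """Iteratively fit one random-forest model per column with
    missing pixels and predict the missing entries, looping until
    scikit-learn's internal stopping criterion or max_iter is hit."""
    imputer = IterativeImputer(
        estimator=RandomForestRegressor(n_estimators=n_trees,
                                        random_state=seed),
        max_iter=max_iter,
        random_state=seed,
    )
    return imputer.fit_transform(Y)
```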
Regularized expectationmaximization (RegEM)
The expectation-maximization (EM) imputation algorithm is an iterative optimization technique for estimating the parameter set (\(\vartheta\)) of a probability model with incomplete data based on the notion of maximum likelihood [34]. Parameter estimation via EM-based methods is done by first estimating the parameters of the data distribution from the existing data and then imputing the missing data based on the estimated distribution [35]. The EM algorithm comprises two steps: obtaining a probability distribution over all possible complete versions of the incomplete data given the current parameter estimate (E-step), and re-estimating the underlying parameter set using these completions (M-step). In practice, however, one need not specify this probability distribution explicitly, but need only compute expected sufficient statistics over these completions. The EM algorithm attempts to find the parameter set \(\vartheta ^*\) that maximizes the log-likelihood of the observed pixel intensities by casting imputation as a prediction problem [36].
Assuming that the distribution of pixels is multivariate normal with parameter set \(\vartheta =\left[ {\mu ,\Sigma }\right]\), then the missing pixel values can be imputed using a regression model. Although the normality assumption is plausible in many application areas, it can be replaced with other more complex densities, such as mixture of simplex ones [31].
In the presence of occlusions, the feature matrix and associated design matrix become ill-conditioned (as a result of missing pixel values). Consequently, ordinary regression estimates (such as least squares) and their standard errors can be highly unreliable, affecting the stability of such models as well as the quality of predictions amidst multicollinearity [37]. Under these circumstances, a penalized regression method (ridge regression) is recommended for ill-conditioned design matrices instead of least squares estimators. Specifically, given a linear regression model
\(Y = X\beta + \varepsilon\), with \(Y_{n\times 1}\) the vector of observations, \(X_{n\times p}\) the design matrix of rank p, \(\beta _{p\times 1}\) the vector of unknown parameters and \(\varepsilon _{n\times 1}\) the vector of unobserved errors, the ridge regression estimate \({\hat{\beta }}\) of \(\beta\) is
\({\hat{\beta }} = \left( X^{T}X + n\gamma {\mathbb {I}}\right) ^{-1}X^{T}Y,\)
where \(\gamma\) is the ridge parameter to be selected and \({\mathbb {I}}\) is the \(p\times p\) identity matrix. This can be obtained as the solution to the constrained least squares problem
\(\min _{\beta }\; \Vert Y - X\beta \Vert ^2 \quad \text {subject to} \quad \Vert \beta \Vert ^2 \le \tau ,\)
where \(\tau \ge 0\).
The visual quality of the deoccluded image depends on the regularization parameter \(\gamma\). The method of generalized cross-validation has been shown by [38] to give a better estimate of \(\gamma\) than the method of maximum likelihood. In generalized cross-validation, the estimate \(\dot{\gamma }\) of \(\gamma\) is obtained as a minimizer of the generalized cross-validation function
\(V(\gamma ) = \frac{\frac{1}{n}\left\Vert \left( {\mathbb {I}} - A(\gamma )\right) Y\right\Vert ^2}{\left[ \frac{1}{n}\,\mathrm {tr}\left( {\mathbb {I}} - A(\gamma )\right) \right] ^2},\)
where \(A(\gamma ) = X\left( X^{T}X + n\gamma {\mathbb {I}}\right) ^{-1}X^{T}\).
According to [38], using the generalized cross-validation approach to estimate \(\gamma\) does not require knowledge of the noise variance \(\sigma ^2\), making it a natural choice for solving regression-like problems where the design matrix is ill-posed, since in such cases there is no way of estimating \(\sigma ^2\) from the data. The regularized EM using multiple ridge regressions is carried out, starting with initial estimates of the mean \(\mu\) and covariance matrix \(\Sigma\), as follows:

For each row of the feature matrix with missing values, obtain the multiple ridge regression parameters by regressing columns with missing pixel values on the columns with observed pixel values using the mean and covariance matrix.

Fill in the missing pixel values with their conditional expectation values, where the conditional expectation values are obtained as the product of the available pixel values and the estimated ridge regression coefficients \(\hat{\beta _{r}}\).

Re-estimate the mean and covariance matrix, where the mean is obtained as the mean of the completed feature matrix and the covariance matrix is obtained as the sum of the covariance matrix of the completed feature matrix and an estimate of the conditional covariance matrix of the imputation error.
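One pass of the procedure above can be sketched as follows. This is a simplified illustration with a fixed ridge parameter `gamma` (the full RegEM algorithm iterates the step, re-estimates the mean and covariance, and chooses the ridge parameter by generalized cross-validation):

```python
import numpy as np

def regem_step(Y, gamma=1e-3):
    """One regularized-EM pass (sketch): using the mean and
    covariance of the mean-filled matrix, impute each row's missing
    pixels by ridge regression of the missing columns on the
    observed columns."""
    mis = np.isnan(Y)
    Yc = Y.astype(float).copy()
    col_means = np.nanmean(Y, axis=0)
    Yc[mis] = np.take(col_means, np.where(mis)[1])   # initial fill
    mu = Yc.mean(axis=0)
    Sigma = np.cov(Yc, rowvar=False)
    for i in range(Y.shape[0]):
        m = mis[i]
        if not m.any() or m.all():
            continue
        o = ~m
        # Ridge-regularized coefficients B = (S_oo + gamma*I)^-1 S_om
        S_oo = Sigma[np.ix_(o, o)] + gamma * np.eye(int(o.sum()))
        S_om = Sigma[np.ix_(o, m)]
        B = np.linalg.solve(S_oo, S_om)
        # Conditional expectation of the missing pixels
        Yc[i, m] = mu[m] + (Yc[i, o] - mu[o]) @ B
    return Yc
```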
Test-image database 3 contains the MICE, MissForest and RegEM reconstructed images of test-image database 1.
Figure 4 shows the reconstructed face images of some subjects under 10% degradation for the JAFFE and CKFE datasets.
Test-image database 4 contains the MICE, MissForest and RegEM reconstructed images of test-image database 2.
Figure 5 shows the reconstructed face images of some subjects under 20% degradation for the JAFFE and CKFE datasets.
Research design
When face images are sent to the recognition module, they are preprocessed through mean centering and Discrete Wavelet Transform (DWT) mechanisms. The train images are the first to be preprocessed this way. Afterward, the preprocessed images are sent to the feature extraction unit, where the PCA/SVD algorithm extracts discriminative features. The extracted features are then stored in memory as the knowledge base for recognition.
As mentioned before, four test datasets were used in this study. It is worth noting that only one of the adopted imputation mechanisms is used for deocclusion at a time in a database before recognition. The test images are also preprocessed using mean centering and DWT, and their discriminative features are likewise extracted using the PCA/SVD algorithm for recognition.
The discriminative features are passed on to the classifier/recognition unit, where they are matched with the stored knowledge created from the train images; the closest match is defined in terms of minimum recognition distance. We note that only one test image is passed to the recognition module along with the train images at a time. The design of the study recognition module is shown in Fig. 6.
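The matching step, with the city-block (L1) distance used by the study module as the recognition distance, can be sketched as follows (the function name and data layout are illustrative assumptions):

```python
import numpy as np

def classify_l1(test_feat, train_feats, labels):
    """Match a test feature vector to the training subject whose
    stored features are closest in city-block (L1) distance."""
    d = np.abs(train_feats - test_feat).sum(axis=1)  # L1 distances
    idx = int(np.argmin(d))
    return labels[idx], float(d[idx])
```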
Preprocessing
Preprocessing is a key stage in digital image processing. The importance of preprocessing has been underscored by several research works [11, 39,40,41,42].
The goal of image enhancement is to accentuate, via denoising mechanisms, the defining features of the image by improving the image quality. Image enhancement can be carried out in the spatial domain or in a transformed domain of the image. The latter, particularly, has evolved over the years to effectively deal with image denoising and enhance edge features [43]. In this study, we adopted mean centering and the Discrete Wavelet Transform as preprocessing mechanisms.
Discrete wavelet transform (DWT)
The discrete wavelet transform (DWT) is a transform-domain image denoising method with a multiresolution property that allows the analysis of a signal (image) at different frequency resolutions [44]. This is particularly useful because some features of a face or signal have low-frequency components while others have high-frequency components [45]. According to [46], the use of wavelets gives superior performance in image denoising due to this multiresolution property. DWT provides both spatial and temporal information about a given signal. As such, DWT-based image denoising is preferred to other transform-domain denoising mechanisms such as Fourier transforms, which give only the frequency information of a signal [6].
Denoising an image with DWT consists of decomposing the face image, noise filtering and image reconstruction. DWT decomposes an image into two sets of coefficients, namely the approximation coefficients and the detail coefficients. The decomposition is done by passing the image through a series of filters. First, the image is passed through a low-pass filter, yielding the approximation coefficients (LL sub-band). The image is also decomposed simultaneously using a high-pass filter, yielding the detail coefficients (horizontal (LH sub-band), vertical (HL sub-band) and diagonal (HH sub-band) coefficients) [47]. These sub-bands provide different resolutions of the image, with the LL sub-band being the low-resolution form of the image and the remaining sub-bands being the high-resolution forms. The LL sub-band contains global information of the image and is less prone to noise, while the remaining sub-bands contain local information such as the eyes, nose and mouth [6].
DWT is among the most stable invertible transforms for signals in diverse domains. Its efficiency in denoising signals stems from its multiresolution property, which allows the analysis of a signal at different resolutions or scales, making it easier to identify patterns and anomalies in large datasets [48]. The wavelet transform involves the shifting and scaling of basic wavelet functions called mother wavelets [49]. Notable among them are the Haar, Daubechies, Coiflet, Symlet and Morlet wavelets. In this study we chose the Haar wavelet as the mother wavelet and performed a one-level decomposition of the face images, because it is simple and orthogonal (rigid in transformation, preserving distances in the original image). After the one-level decomposition, a Gaussian filter is applied to normalize illumination and the image is reconstructed via the inverse discrete wavelet transform. Figure 7 shows the DWT cycle using the Haar wavelet.
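A one-level 2-D Haar decomposition and its inverse can be written in a few lines. This is a bare-bones sketch assuming an even-sized image (libraries such as PyWavelets provide the same decomposition via `pywt.dwt2`/`pywt.idwt2` with proper boundary handling):

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2-D Haar decomposition: averages and differences
    over 2x2 blocks yield the LL, LH, HL and HH sub-bands."""
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 0::2]; d = img[1::2, 1::2]
    LL = (a + b + c + d) / 2.0   # approximation
    LH = (a + b - c - d) / 2.0   # horizontal detail
    HL = (a - b + c - d) / 2.0   # vertical detail
    HH = (a - b - c + d) / 2.0   # diagonal detail
    return LL, LH, HL, HH

def haar_idwt2(LL, LH, HL, HH):
    """Inverse one-level Haar transform reconstructing the image."""
    n, m = LL.shape
    img = np.empty((2 * n, 2 * m))
    img[0::2, 0::2] = (LL + LH + HL + HH) / 2.0
    img[0::2, 1::2] = (LL + LH - HL - HH) / 2.0
    img[1::2, 0::2] = (LL - LH + HL - HH) / 2.0
    img[1::2, 1::2] = (LL - LH - HL + HH) / 2.0
    return img
```

The forward/inverse pair is exactly invertible, which is the property the denoising cycle in Fig. 7 relies on: filter the sub-bands, then reconstruct.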
Mean centering
Given a matrix of face images whose columns are the vectorized forms of the face images of subjects, its corresponding mean-centered matrix is obtained by subtracting the mean intensity value of each column from each of that column's intensity values. The resultant mean-centered matrix thus has zero mean.
Mean centering is an integral part of eigenvalue analysis; it ensures that the principal components are proportional to the variance of the input data matrix, with the first principal component reflecting the maximum variance, which would otherwise reflect the mean instead of the greatest variance [50].
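The mean-centering step is a one-liner in practice. A minimal sketch for a matrix whose columns are vectorized face images:

```python
import numpy as np

def mean_center(F):
    """Subtract each column's mean intensity from that column,
    giving a zero-mean (per column) matrix of vectorized faces."""
    return F - F.mean(axis=0, keepdims=True)
```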
Feature extraction
Dealing with high dimensional datasets such as the human face is computationally expensive. Besides, with the presence of a large number of features, a learning model tends to overfit and hence underperform [51]. Therefore, feature extraction forms an integral part of every face recognition module. During feature extraction, the dimensionality of the otherwise highdimensional face images is reduced. This is because only the relevant features of each face are selected for classification and redundancy (noise) is removed.
Principal Component Analysis (PCA) is one such effective dimensionality reduction technique widely used in signal and image processing [14]. According to [52], PCA reduces the dimensionality of datasets whilst maintaining as much variability as possible and gives the best possible representation of a pdimensional dataset in q dimensions \((q < p)\) by maximizing variance (statistical information) in q dimensions.
In PCAbased face feature extraction, discriminative facial features are obtained by projecting the given face images onto a feature space spanned by the principal components, which are the eigenvectors of the variancecovariance matrix of the faces data matrix [53]. This approach to feature extraction in face recognition is efficient due to its ease of implementation and low processing steps. In addition, no knowledge of geometry or any specific feature of face is required [54].
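The projection described above can be sketched via SVD of the centered data matrix, whose left singular vectors are the eigenvectors of the variance-covariance matrix (a sketch of the PCA/SVD idea, not the authors' exact pipeline; here centering subtracts the mean face across columns):

```python
import numpy as np

def pca_features(F, q):
    """Project mean-centered face columns onto the top-q principal
    components (eigenfaces), obtained via SVD of the centered data."""
    Fc = F - F.mean(axis=1, keepdims=True)      # subtract the mean face
    U, s, Vt = np.linalg.svd(Fc, full_matrices=False)
    W = U[:, :q]                                # top-q eigenfaces
    return W.T @ Fc, W                          # q-dim features, basis
```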
Several studies [55] have shown that the use of PCA for dimensionality reduction and feature extraction competes favourably with (and may outperform) other dimensionality reduction techniques, including independent component analysis [56] and linear discriminant analysis [57]. Based on the above merits, we adopted PCA for feature extraction.
Assessment of the quality of deoccluded images
An attempt to resolve the challenges posed by occlusions to face recognition using multiple imputation methods may induce other artifacts or further degradations in the resultant faces. Image quality assessment is, therefore, crucial in this regard.
Image quality metrics are used to quantify the quality of the deoccluded images and hence determine the best multiple imputation technique for deocclusion. Here, we refer to the unoccluded images as clean images. The Discrete Shannon Entropy (E), Absolute Mean Brightness Error (AMBE), Peak Signal-to-Noise Ratio (PSNR) and Contrast (C) were used in this context.
Entropy
The entropy of a face image characterizes the average level of information inherent in the face image. A relatively higher entropy after deocclusion signifies better image quality and a good source of information that could be leveraged to enhance the classification performance of face recognition modules, given the right choice of feature selection scheme. If the pixel intensity values in an image are seen as discretely sampled from the underlying image probability density P, then the discrete Shannon entropy with base 2 of the jth image is given by
\(E_j = -\sum _{k=0}^{L-1} P_j(k)\log _2 P_j(k),\)
where \(P_j (k)\) is the probability of occurrence of the kth pixel intensity value and L is the number of grey levels [49].
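The entropy formula above maps directly to a histogram computation (assuming integer grey levels in [0, L)):

```python
import numpy as np

def shannon_entropy(img, L=256):
    """Discrete Shannon entropy (base 2) of an image's grey-level
    histogram: E = -sum_k P(k) log2 P(k)."""
    hist = np.bincount(img.ravel(), minlength=L).astype(float)
    P = hist / hist.sum()
    P = P[P > 0]                    # 0*log(0) terms contribute nothing
    return float(-(P * np.log2(P)).sum())
```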
Absolute mean brightness error (AMBE)
The absolute mean brightness error quantifies the brightness preservation property of the multiple imputation schemes in carrying out deocclusion. For the jth image \((I_j)\), the AMBE is evaluated as the absolute difference between the mean brightness of the clean image and that of its respective deoccluded image \(({\tilde{I}}_j)\) and is given by
\(AMBE_j = \left| m(I_j) - m({\tilde{I}}_j)\right| ,\)
where \(m(I_j)\) and \(m({\tilde{I}}_j)\) represent the mean brightness of the clean and deoccluded images respectively. The multiple imputation method that gives the least (average) AMBE values has the highest brightness preservation and thus conserves the brightness of the "clean" images.
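The AMBE definition above reduces to a single absolute difference of image means:

```python
import numpy as np

def ambe(clean, deoccluded):
    """Absolute mean brightness error: |m(I) - m(I~)|."""
    return float(abs(clean.mean() - deoccluded.mean()))
```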
Peak signaltonoise ratio (PSNR)
The PSNR is computed as the ratio of the highest possible pixel value of the image to the noise (mean square error) that affects the quality of the pixels, expressed on a logarithmic decibel scale. For the jth image, the PSNR is given by
\(PSNR_j = 10\log _{10}\left( \frac{Max_j^2}{MSE}\right) ,\)
where \(Max_j\) is the maximum possible pixel value and MSE is the mean square error. In the absence of noise, a deoccluded image and its respective "clean" image are identical; the MSE is therefore zero and the corresponding PSNR value is infinite. When noise is introduced by the deocclusion process, the multiple imputation method achieving the highest average PSNR values is preferred, since it produces the best-quality deoccluded images.
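The PSNR computation, including the infinite value for identical images, can be sketched as:

```python
import numpy as np

def psnr(clean, deoccluded, max_val=255.0):
    """Peak signal-to-noise ratio in decibels; returns infinity when
    the two images are identical (MSE = 0)."""
    mse = np.mean((clean.astype(float) - deoccluded.astype(float)) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(max_val ** 2 / mse))
```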
Contrast
The contrast of an image refers to the spread in the distribution of its pixel intensity values, which is measured by the range of pixel intensity levels. If the minimum and maximum intensity values are far apart, the image has good contrast, otherwise it has poor contrast. The standard deviation of intensity value is a natural characterization of an image contrast and this is used in this study.
For the jth image, the standard deviation of pixel intensity values is given by
\(\sigma _j = \sqrt{\sum _{k=0}^{L-1}\left( k - \mu _j\right) ^2 P_j(k)},\)
where \(P_j (k)\) is the probability of occurrence of the kth pixel intensity value, \(\mu _j = \sum _{k=0}^{L-1} k\,P_j(k)\) is the mean intensity, and L is the number of grey levels.
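The histogram-based standard deviation above can be computed as follows (assuming integer grey levels in [0, L)):

```python
import numpy as np

def contrast_std(img, L=256):
    """Image contrast as the standard deviation of grey levels,
    computed from the histogram probabilities P(k)."""
    hist = np.bincount(img.ravel(), minlength=L).astype(float)
    P = hist / hist.sum()
    k = np.arange(L)
    mu = (k * P).sum()                       # mean intensity
    return float(np.sqrt(((k - mu) ** 2 * P).sum()))
```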
Results and discussion
Assessment of image quality after using the various deocclusion mechanisms
Figures 8, 9, 10 and 11 show the entropy, AMBE, PSNR and Contrast of the occluded face images after reconstruction with MICE, MissForest and RegEM imputation algorithms respectively.
From Fig. 8, the median entropy value is highest for the MICE deoccluded faces, followed by the RegEM, with the MissForest attaining the least median entropy value.
From Fig. 9, MissForest has the weakest brightness-preserving property, while MICE and RegEM have relatively low and similar AMBE values (better brightness conservation), with a few images having brightness significantly different from their respective clean images.
It can be seen from Fig. 10 that when the MissForest is used for deocclusion, it results in images with the most noise since its associated PSNR values are relatively lower compared to the MICE and RegEM. However, the distribution of PSNR values for MICE and RegEM are relatively similar on average, although that of MICE shows more variability.
Results from Fig. 11 show that the MICE and RegEM produce images with relatively lower contrast. By contrast, the images obtained from MissForest deocclusion have relatively higher contrast than their respective clean images.
Assessment of the performance of the study algorithm under the various deocclusion mechanisms
Sample results of matching the MissForest, MICE, and RegEM deoccluded test face images (of some subjects with happy facial expressions) to the train image database using the study algorithm are presented in Figs. 12 and 13.
Figure 12 shows the decisions and recognition distances for six (6) subjects from the JAFFE database when deocclusion was carried out at 20% random missingness. It is seen that all the six subjects were correctly matched for the MICE and RegEM deoccluded test images. However, there were two (2) mismatches for the MissForest deoccluded test faces.
Figure 13 shows the decisions and recognition distances for six (6) subjects from the CKFE database when deocclusion was carried out at 20% random missingness. It is seen that, there were 3 mismatches each when the MissForest and RegEM deoccluded test images were used for recognition but only one mismatch when the MICE deoccluded test faces were used for recognition.
Table 1 shows the average recognition rates of the study algorithm when using the occluded and deoccluded test face at 10% and 20% rates respectively.
It can be seen from Table 1 that, at a 10% occlusion rate, the average recognition rates of the study algorithm (DWTPCA/SVDL1) were 41.40%, 68.75%, 85.94% and 84.44% when the occluded, MissForest, MICE and RegEM deoccluded images, respectively, were used as test face images. At a 20% occlusion rate, the corresponding average recognition rates were 23.63%, 54.69%, 78.65% and 76.54%. Thus, when the occluded images themselves were used as test images, the recognition algorithm performed poorly at both degradation levels (41.40% and 23.63% at 10% and 20% respectively), with the decline attributable to the increased degree of occlusion (from 10% to 20%).
It is also evident from Table 1 that the deocclusion mechanisms (MICE, MissForest and RegEM) enhanced the performance of the recognition algorithm at each level of occlusion. Notably, the MICE deoccluded test face images gave the highest recognition rates (85.94% and 78.65% at the 10% and 20% degradation levels respectively), followed closely by the RegEM deoccluded test face images (84.44% and 76.54% respectively), with the study algorithm attaining its lowest average recognition rates (68.75% and 54.69% respectively) when the MissForest deoccluded test face images were used for recognition. However, there was a moderate decline in the performance of the study algorithm with increasing level of occlusion, even after deocclusion. These results are consistent with the works of [6, 7].
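The average recognition rates reported in Table 1 are simply the percentage of test faces whose matched identity equals the true identity. A minimal sketch (the function name is ours):

```python
def recognition_rate(decisions, truth):
    """Average recognition rate (%) over a set of test faces: the share of
    matching decisions that agree with the true identities."""
    correct = sum(d == t for d, t in zip(decisions, truth))
    return 100.0 * correct / len(truth)
```

For example, three correct decisions out of four test faces yield a rate of 75.0%.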
Conclusion and recommendation
In this study, we compared three (3) multiple imputation methods (Multiple Imputation with Chain Equations (MICE), MissForest and Regularized Expectation-Maximization (RegEM)) as deocclusion mechanisms for dealing with moderately low levels of occlusion in test face images from two standard face image datasets (the Japanese Female Facial Expressions (JAFFE) and Cohn-Kanade Facial Expression (CKFE) databases), on the basis of their effect on image quality and on the performance of a face recognition module (DWTPCA/SVDL1). In terms of image quality, both MICE and RegEM outperformed the MissForest imputation method when Entropy, PSNR and AMBE were used as evaluation criteria. Except for Entropy, where MICE attained the highest average value, MICE and RegEM resulted in images of similar quality as measured by their AMBE and PSNR. None of the methods produced images with the same contrast as their respective clean images: MissForest resulted in over-enhanced contrast, while MICE and RegEM produced images of relatively lower contrast than the clean images. This suggests that MICE and RegEM yield images with better detail and better brightness conservation. The use of the multiple imputation-based test images improved the performance of the study recognition module. Results from the numerical evaluation showed that the study algorithm achieved its highest average recognition rates when deocclusion was done using MICE (85.94% and 78.65% at the 10% and 20% occlusion levels respectively), closely followed by RegEM (84.44% and 76.54% respectively), and attained its lowest average recognition rates (68.75% and 54.69% respectively) when the MissForest deoccluded test face images were used for recognition. These results were consistent across the 10% and 20% occlusion levels.
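To make the deocclusion framing concrete: occluded pixels are treated as missing values and filled by an imputation model. The following is a toy, single-imputation sketch of the chained-equations idea underlying MICE, assuming the data are arranged as a numeric matrix with NaNs at occluded positions. It omits the defining features of full MICE (multiple draws and sampling from conditional posteriors), and the function name `chained_impute` is ours, not the paper's:

```python
import numpy as np

def chained_impute(X, n_iter=10):
    """Toy chained-equations-style imputation: start from column means,
    then repeatedly regress each incomplete column on the remaining
    columns (ordinary least squares with an intercept) and replace its
    missing entries with the fitted predictions."""
    X = X.astype(float).copy()
    miss = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    X[miss] = np.take(col_means, np.where(miss)[1])  # crude initial fill
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            if not miss[:, j].any():
                continue  # column fully observed; nothing to impute
            others = np.delete(X, j, axis=1)
            A = np.column_stack([np.ones(len(X)), others])  # add intercept
            obs = ~miss[:, j]
            coef, *_ = np.linalg.lstsq(A[obs], X[obs, j], rcond=None)
            X[miss[:, j], j] = A[miss[:, j]] @ coef  # predicted fill-ins
    return X
```

When the columns are strongly linearly related, as neighbouring pixel columns of a face image tend to be, even this stripped-down chained scheme recovers missing entries closely.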
Similar findings were obtained by [4, 6, 7], except that those works adopted different enhancement mechanisms for preprocessing the face images, generated their occlusions under different degrees of missingness, and did not assess the quality of the images after deocclusion. The performance of the study recognition module (regardless of the multiple imputation method used for deocclusion) appeared to depend on the level of occlusion; in particular, the multiple imputation methods appear not to be robust to higher levels of occlusion. Despite this limitation, the study provides useful insight into the use of multiple imputation methods for dealing with occlusions in face recognition and related areas.
Future work will focus on enhancing the recognition rate of the study algorithm, when multiple imputationbased deoccluded test face images are used for recognition, as well as improving the robustness of the study algorithm to higher levels of occlusions.
Availability of data and materials
The image data used in this manuscript are from a previously published manuscript. The processed data are available upon request from the corresponding author.
References
Ghuman AS, Brunet NM, Li Y, Konecky RO, Pyles JA, Walls SA, Destefino V, Wang W, Richardson RM. Dynamic encoding of face information in the human fusiform gyrus. Nat Commun. 2014;5(1):1–10.
Kriegeskorte N, Formisano E, Sorger B, Goebel R. Individual faces elicit distinct response patterns in human anterior temporal cortex. Proc Natl Acad Sci. 2007;104(51):20600–5.
Abate AF, Cimmino L, Mocanu BC, Narducci F, Pop F. The limitations for expression recognition in computer vision introduced by facial masks. Multimedia Tools and Applications. 2023;82(8):11305–19.
Mensah JA, Ocran E, Asiedu L. On multiple imputation-based reconstruction of degraded faces and recognition in multiple constrained environments. Sci Afr. 2023;22:e01964.
Chen G, Peng J, Wang L, Yuan H, Huang Y. Feature constraint reinforcement based age estimation. Multimedia Tools Appl. 2023;82(11):17033–54.
Mensah JA, Asiedu L, Mettle FO, Iddi S. Assessing the performance of DWT-PCA/SVD face recognition algorithm under multiple constraints. J Appl Math. 2021;2021:1–2.
Ayiah-Mensah D, Asiedu L, Mettle FO, Minkah R. Recognition of augmented frontal face images using FFT-PCA/SVD algorithm. Appl Comput Intell Soft Comput. 2021;2021:1–9.
Liu X, Pedersen M, Charrier C, Bours P. Can image quality enhancement methods improve the performance of biometric systems for degraded face images? In: 2018 Colour and Visual Computing Symposium (CVCS). IEEE;2018:1–5.
Asiedu L, Mensah JA, Ayiah-Mensah F, Mettle FO. Assessing the effect of data augmentation on occluded frontal faces using DWT-PCA/SVD recognition algorithm. Adv Multimedia. 2021;2021:1.
Kamenetsky D, Yiu SY, Hole M. Image enhancement for face recognition in adverse environments. In: 2018 Digital Image Computing: Techniques and Applications (DICTA). IEEE;2018:1–6.
Rana ME, Zadeh AA, Alqurneh AMM. Use of image enhancement techniques for improving real time face recognition efficiency on wearable gadgets. J Eng Sci Technol. 2017;12(1):155–67.
Oh HJ, Lee KM, Lee SU. Occlusion invariant face recognition using selective local non-negative matrix factorization basis images. Image Vision Comput. 2008;26(11):1515–23.
Priya GN, Banu RW. Occlusion invariant face recognition using mean based weight matrix and support vector machine. Sadhana. 2014;39(2):303–15.
Asiedu L, Mettle FO, Mensah JA. Recognition of reconstructed frontal face images using FFT-PCA/SVD algorithm. J Appl Math. 2020;2020:1–8.
Zhang N, Ji H, Liu L, Wang G. Exemplar-based image inpainting using angle-aware patch matching. EURASIP J Image Video Process. 2019;2019(1):1–13.
Chan TF, Shen J. Non-texture inpainting by curvature-driven diffusions. J Vis Commun Image Represent. 2001;12(4):436–49.
Criminisi A, Perez P, Toyama K. Object removal by exemplar-based inpainting. In: 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Proceedings, vol. 2. IEEE; 2003. p. II–II.
Zhang J, Zhao D, Gao W. Group-based sparse representation for image restoration. IEEE Trans Image Process. 2014;23(8):3336–51.
Fan Q, Zhang L, Serikawa S. Improvement of patch selection in exemplar-based image inpainting. J Inst Ind Appl Eng. 2015;3(4):197–202.
Ke L, Tai YW, Tang CK. Occlusion-aware instance segmentation via bilayer network architectures. IEEE Trans Pattern Anal Mach Intell. 2023.
Jia M, Sun Y, Zhai Y, Cheng X, Yang Y, Li Y. Semi-attention partition for occluded person re-identification. In: Proceedings of the AAAI Conference on Artificial Intelligence. 2023;37:998–1006.
Xu C, Makihara Y, Li X, Yagi Y. Occlusion-aware human mesh model-based gait recognition. IEEE Trans Inf Forensics Secur. 2023;18:1309–21.
Penone C, Davidson AD, Shoemaker KT, Di Marco M, Rondinini C, Brooks TM, Young BE, Graham CH, Costa GC. Imputation of missing data in life-history trait datasets: which approach performs the best? Methods Ecol Evol. 2014;5(9):961–70.
Alruhaymi AZ, Kim CJ. Why can multiple imputations and how (MICE) algorithm work? Open J Stat. 2021;11(5):759–77.
Kontopantelis E, White IR, Sperrin M, Buchan I. Outcome-sensitive multiple imputation: a simulation study. BMC Med Res Methodol. 2017;17(1):1–13.
Van Buuren S, Oudshoorn K. Flexible multivariate imputation by MICE. Leiden: TNO; 1999.
Stekhoven DJ, Bühlmann P. MissForest: non-parametric missing value imputation for mixed-type data. Bioinformatics. 2012;28(1):112–8.
Li H, Zhang K, Jiang T. The regularized EM algorithm. In: AAAI; 2005. p. 807–12.
Van Buuren S. Multiple imputation of discrete and continuous data by fully conditional specification. Stat Methods Med Res. 2007;16(3):219–42.
Van Buuren S, Brand JP, Groothuis-Oudshoorn CG, Rubin DB. Fully conditional specification in multivariate imputation. J Stat Comput Simul. 2006;76(12):1049–64.
Quintero FOL, Contreras-Reyes JE. Estimation for finite mixture of simplex models: applications to biomedical data. Stat Model. 2018;18(2):129–48.
Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.
Waljee AK, Mukherjee A, Singal AG, Zhang Y, Warren J, Balis U, Marrero J, Zhu J, Higgins PD. Comparison of imputation methods for missing laboratory data in medicine. BMJ Open. 2013;3(8): e002847.
Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B. 1977;39(1):1–22.
Ke J, Zhang S, Yang H, Chen X. PCA-based missing information imputation for real-time crash likelihood prediction under imbalanced data. Transportmetrica A Transp Sci. 2019;15(2):872–95.
Hinton GE, Sabour S, Frosst N. Matrix capsules with EM routing. In: International Conference on Learning Representations; 2018.
Oufdou H, Bellanger L, Bergam A, El Ghaziri A, Khomsi K, Qannari EM, et al. Comparison of different regularized and shrinkage regression methods to predict daily tropospheric ozone concentration in the Grand Casablanca area. Adv Pure Math. 2018;8(10):793.
Golub GH, Heath M, Wahba G. Generalized cross-validation as a method for choosing a good ridge parameter. Technometrics. 1979;21(2):215–23.
Li W, Peng M, Wang Q. Improved PCA method for sensor fault detection and isolation in a nuclear power plant. Nucl Eng Technol. 2019;51(1):146–54.
Gross R, Brajovic V. An image preprocessing algorithm for illumination invariant face recognition. In: International Conference on Audio- and Video-Based Biometric Person Authentication. Springer; 2003. p. 10–18.
Shan S, Gao W, Cao B, Zhao D. Illumination normalization for robust face recognition against varying lighting conditions. In: 2003 IEEE International SOI Conference. Proceedings (Cat. No. 03CH37443). IEEE. 2003;157–64.
Du S, Ward RK. Adaptive regionbased image enhancement method for robust face recognition under variable illumination conditions. IEEE Trans Circuits Syst Video Technol. 2010;20(9):1165–75.
Jung CR, Scharcanski J. Adaptive image denoising and edge enhancement in scalespace using the wavelet transform. Pattern Recognit Lett. 2003;24(7):965–71.
Mallat SG. A theory for multiresolution signal decomposition: the wavelet representation. IEEE Trans Pattern Anal Mach Intell. 1989;11(7):674–93.
Ergen B. Comparison of wavelet types and thresholding methods on wavelet-based denoising of heart sounds. J Signal Inf Process. 2013;4(3B):164.
Al Jumah A. Denoising of an image using discrete stationary wavelet transform and various thresholding techniques. J Signal Inf Process. 2013;4:33–41.
El-Badawy A, Rashad R, et al. Ultrasonic rangefinder spikes rejection using discrete wavelet transform: application to UAV. J Sensor Technol. 2015;5(02):45.
Devi D, Sophia S, Prabhu SB. Deep learning-based cognitive state prediction analysis using brain wave signal. In: Cognitive Computing for Human-Robot Interaction. Elsevier; 2021. p. 69–84.
Nicolis O, Mateu J, Contreras-Reyes JE. Wavelet-based entropy measures to characterize two-dimensional fractional Brownian fields. Entropy. 2020;22(2):196.
Alexandris N, Gupta S, Koutsias N. Remote sensing of burned areas via PCA, part 1; centering, scaling and EVD vs SVD. Open Geospatial Data Softw Stand. 2017;2(1):1–11.
Tang J, Alelyani S, Liu H. Feature selection for classification: a review. Data Classif Algorithms Appl. 2014. p. 37.
Jolliffe IT, Cadima J. Principal component analysis: a review and recent developments. Philos Trans Royal Soc A: Math Phys Eng Sci. 2016;374(2065):20150202.
Turk M, Pentland A. Eigenfaces for recognition. J Cogn Neurosci. 1991;3(1):71–86.
Shinde KK, Tharewal SS, Suryawanshi KS, Kayte CN. Python based face recognition for person identification using PCA and 2DPCA techniques. In: 2020 International Conference on Smart Innovations in Design, Environment, Management, Planning and Computing (ICSIDEMPC). IEEE; 2020. p. 171–5.
Martinez AM, Kak AC. PCA versus LDA. IEEE Trans Pattern Anal Mach Intell. 2001;23:228–33.
Yuen PC, Lai JH. Face representation using independent component analysis. Pattern Recognit. 2002;35(6):1247–57.
Belhumeur PN, Hespanha JP, Kriegman DJ. Eigenfaces vs. Fisherfaces: recognition using class specific linear projection. IEEE Trans Pattern Anal Mach Intell. 1997;19(7):711–20.
Funding
The manuscript did not receive financial support from any funding institution.
Author information
Authors and Affiliations
Contributions
Joseph Agyapong Mensah wrote the original draft of the manuscript, Ezekiel N.N. Nortey and Eric Ocran reviewed and edited the manuscript, Louis Asiedu and Samuel Iddi supervised the methodology development, reviewed and edited the manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
The images in the manuscript are openly published data and allowed to be used for all academic research to benchmark face recognition algorithms for face verification. There are no ethical concerns in using these images as they are openly accessible and created for research purposes.
Consent for publication
All authors grant their consent for the publication of the manuscript.
Competing interests
The authors declare that there is no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Mensah, J.A., Nortey, E.N.N., Ocran, E. et al. Deocclusion and recognition of frontal face images: a comparative study of multiple imputation methods. J Big Data 11, 60 (2024). https://doi.org/10.1186/s40537-024-00925-6
Keywords
 Absolute mean brightness error
 Expression-variant face images
 Multiple constraint
 Multiple Imputation with chain equations
 Occlusions
 Peak signal to noise ratio
 Entropy