De-occlusion and recognition of frontal face images: a comparative study of multiple imputation methods

Automatic face recognition algorithms have become increasingly necessary with the development and extensive use of face recognition technology, particularly in the era of machine learning and artificial intelligence. However, unconstrained environmental conditions degrade the quality of acquired face images and may deteriorate the performance of many classical face recognition algorithms. Against this backdrop, many researchers have given considerable attention to image restoration and enhancement mechanisms, but with minimal focus on occlusion-related and multiple-constrained problems. Although occlusion-robust face recognition modules based on sparse representation have been explored, they require a large number of features to achieve correct computations and to maximize robustness to occlusions. Therefore, such an approach may become deficient in the presence of random occlusions of relatively moderate magnitude. This study assesses the robustness of a face recognition module that uses Principal Component Analysis and Singular Value Decomposition for feature extraction, Discrete Wavelet Transformation for preprocessing and city block distance for classification (DWT-PCA/SVD-L1) to image degradations due to random occlusions of varying magnitudes (10% and 20%) in test images acquired with varying expressions. Numerical evaluation of the performance of the DWT-PCA/SVD-L1 face recognition module showed that the use of the de-occluded faces for recognition significantly enhanced the performance of the study recognition module at each level (10% and 20%) of occlusion. The algorithm attained its highest recognition rates of 85.94% and 78.65% at 10% and 20% occlusion respectively when the MICE de-occluded face images were used for recognition. With the exception of entropy, where the MICE de-occluded face images attained the highest average value, MICE and RegEM result in images of similar quality as measured by their absolute mean brightness error (AMBE) and peak signal-to-noise ratio (PSNR). The study therefore recommends MICE as a suitable imputation mechanism for de-occlusion of face images acquired under varying expressions.


Introduction
The ability of humans to detect, identify and classify faces and other attributes of faces (gender, race, emotion) under variable conditions with some apparent efficiency derives from a network of brain regions (fusiform face area and anterior inferotemporal cortex) highly tuned to face information [1,2]. The study of how machines can perform the same tasks in real time has attracted much attention among researchers due to rising security concerns and the fast-paced evolution of supportive technologies. The field of automatic face recognition concerns the use of machines to recognize the identity of persons or individuals from a database of stored face images.
Several studies [3–8] have shown that the performance of automatic face recognition modules is affected by the quality of the face images acquired and used for recognition. In most instances, image quality is problematic due to the acquisition of images from unconstrained environments. Image quality is often eroded by imbalanced illumination effects, wild poses, noise, occlusions and varying facial expressions. When occlusions are the underlying cause of image degradation, the problem becomes more intractable because occlusions obscure salient features of the face needed for training recognition algorithms, thus creating larger intra-subject variability than inter-subject variability, such that images of different individuals appear more similar than images of the same individual [9]. This may be further compounded by the wide range of forms of occlusion, which may include randomly occurring occlusions of different magnitudes. Notwithstanding this, only a few studies have focused on how to resolve occlusion-related problems in face recognition.
References [4,10,11] have shown that enhancing the quality of acquired images prior to recognition improves the performance of face recognition algorithms. However, choosing the right image enhancement mechanism is often a challenging task. This is because the choice of enhancement mechanism is contingent on knowledge of the underlying cause of image degradation which, in most instances, is limited. Also, a combination of more than one enhancement mechanism may be required to attain optimal results, but specifying the right combination of enhancement mechanisms is challenging and remains a gap in the literature that requires more research. To deal with occlusions in face images, some researchers have advocated the use of occlusion-invariant features or the non-occluded portions (sparse representation) of the face for recognition, many of which have attained remarkable success [12,13]. However, these approaches sometimes become deficient for some classes of occlusions, especially when there is significant loss of facial features or pixels. De-occlusion techniques therefore become indispensable in these situations. These methods may leverage the inherent topology of the face [14] to reconstruct missing facial components, or operate from the premise that missing facial pixels can be inferred from the observed facial pixels in order to restore the occluded regions in face images [15]. A key stage in the de-occlusion process is the choice of de-occlusion mechanism. This is crucial in automatic face recognition because inappropriate handling of occlusions could further degrade the quality of images and may lead to a significant drop in the performance of an otherwise well-performing face recognition module. Chan and Shen [16] considered the problem of image restoration using a diffusion-based approach. Diffusion-based methods are considered optimal for filling small patches in an image [17]. The use of exemplar-based methods has also been explored in the literature [18]. Exemplar-based methods are known to be optimal for filling larger texture areas. However, the approach sometimes decreases the connectivity of structure and clearness of texture while increasing the time complexity [19]. Some authors leveraged the advantages of both the diffusion-based method and the texture synthesis technique by first dividing the image into structure and texture layers, then using the diffusion-based method to in-paint the structure layer and the texture synthesis technique to in-paint the texture layer. According to [19], this approach helps overcome the smoothing effect of the diffusion-based in-painting algorithm, but it is still very difficult to recover larger missing structures. Refer to [20–22] for more information on occlusion-aware systems.
Occlusions in face images can be classified as a missing-data problem. Therefore, de-occlusion as used in this study refers to any process that attempts to restore missing pixels in face images. In general missing-value problems, the use of multiple imputation methods has been widely explored [4]. Such methods aim at finding plausible values for the missing data, are known to give unbiased results and can also account for the uncertainty in the imputations. This gives multiple imputation methods an edge over single imputation methods [23]. Despite these advantages, some researchers object to the use of multiple imputation methods in handling missing values in datasets, arguing that imputation methods only synthesize numerical (non-real) values for the missing data [24]. Other researchers [25], on the other hand, assert that imputation methods aim not to re-create the missing values in a dataset, but are a means of handling missing data in order to arrive at proper statistical inferences under a given missingness mechanism. Multiple Imputation by Chained Equations (MICE) [26], MissForest [27] and the Expectation-Maximization (EM)-based methods (Regularized Expectation-Maximization (RegEM)) [28] are among the most successful contemporary multiple imputation methods in practice. These methods impute missing values in datasets based on multiple regression models. MICE uses the conditional distributions of variables with missing data, is based on Markov Chain Monte Carlo (MCMC) and attains imputations via Gibbs sampling; MissForest draws imputations via random forest models, while RegEM is a likelihood-based method [4].
In this study, we assess the robustness of the DWT-PCA/SVD-L1 face recognition module to image degradations due to random occlusions of varying magnitudes (10% and 20%) in test images acquired with varying expressions. The study also helps identify the appropriate image restoration mechanism when dealing with moderately low levels of occlusion in face images acquired under varying expressions.
The rest of the paper is organised as follows: Section Methods and materials discusses the data acquisition, the mathematical underpinnings of the adopted imputation mechanisms, the recognition modules and their implementation. In section Results and discussion, we evaluate the recognition modules under the adopted imputation mechanisms. We conclude in section Conclusion and recommendation by summarising the overall achievements of the study, with some recommendations and directions for future developments.

Data acquisition
The study used two standard face image datasets to benchmark the performance of the study algorithm.
Dataset 1: The Japanese Female Facial Expression (JAFFE) dataset is homogeneous in terms of race and gender. It contains face images of ten (10) female Japanese subjects captured along seven universally accepted principal emotions (neutral, angry, disgust, fear, sad, surprise and happy).
Dataset 2: The Cohn-Kanade AU-Coded Facial Expression (CKFE) dataset is heterogeneous with regard to race and gender. It contains face images of twenty-two (22) subjects of mixed race and gender, also captured along the above seven universally accepted principal emotions.
The neutral expressions of subjects in the two datasets (totaling 32) were captured into the train-image database for training the study algorithm after face detection and cropping. Figure 1 depicts the face images of subjects in the train-image database.
All the other face images of subjects acquired under varying expressions (sad, happy, disgust, surprise, angry and fear) in each dataset were synthetically occluded (10% and 20%).

De-occlusion via imputation methods
Reconstructive methods seek to restore missing components or pixel information for the purposes of completeness, good visual effects, as well as providing relevant features for subsequent feature extraction.
Multiple imputation methods have been successfully used to deal with missing data problems in many applications. In this work, we use the MICE, MissForest and RegEM imputation methods to de-occlude test faces, based on the assumption that such methods can find plausible pixel values to replace missing components or pixels using information from the existing pixels (non-occluded portions of the face).

Imputation algorithms
Let $Y_{n \times p} = (Y_1, Y_2, \ldots, Y_p)$ be the image matrix of an occluded face image. For each column (variable) $Y_j$, $j \in \{1, 2, \ldots, p\}$, that contains missing pixels, $Y$ is divided into four parts: $Y_j^{(obs)}$, $Y_j^{(mis)}$, $Y_{-j}^{k(j,obs)}$ and $Y_{-j}^{k(j,mis)}$.

Multiple imputation by chained equations (MICE)
Given the feature matrix of an occluded face image, the MICE algorithm imputes missing pixels using univariate conditional distributions for each variable feature given all other variables [26].It is assumed that the face image feature matrix has a full multivariate distribution from which the conditional distribution of each feature is obtained, although such distribution may not be explicitly specified [29] as long as the distribution of each feature is stated, or may not exist [30,31].
The MICE algorithm is an iterative method which imputes missing values based on the fitted conditional (regression) models until a stopping/termination criterion is met and uses the Gibbs sampler to generate multiple imputations.
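The chained-equations cycle described above can be illustrated with a minimal Python sketch. It is not the implementation used in the study: the function name `mice_sketch`, the linear regression models, the column-mean initialisation and the fixed iteration count are all assumptions made for illustration. The random draw around each prediction is a simplification of a full Gibbs sampler, which would also draw the regression parameters.

```python
import numpy as np

def mice_sketch(Y, n_iter=10, n_imputations=5, seed=0):
    """Toy chained-equations imputation; NaN marks a missing (occluded) pixel."""
    rng = np.random.default_rng(seed)
    Y = np.asarray(Y, dtype=float)
    missing = np.isnan(Y)
    col_means = np.nanmean(Y, axis=0)
    completed = []
    for _ in range(n_imputations):
        X = np.where(missing, col_means, Y)  # initial fill with column means
        for _ in range(n_iter):
            # Visit every column that has missing pixels, one at a time.
            for j in np.where(missing.any(axis=0))[0]:
                obs, mis = ~missing[:, j], missing[:, j]
                others = np.delete(X, j, axis=1)
                A = np.column_stack([np.ones(len(X)), others])
                # Fit the conditional (regression) model on the observed rows.
                beta, *_ = np.linalg.lstsq(A[obs], X[obs, j], rcond=None)
                sigma = (X[obs, j] - A[obs] @ beta).std()
                # Impute: prediction plus a random draw of residual size.
                X[mis, j] = A[mis] @ beta + rng.normal(0.0, sigma, mis.sum())
        completed.append(X)
    # Pool the multiple imputations by averaging (an assumed pooling rule).
    return np.mean(completed, axis=0), completed

# Hypothetical toy "image" with three strongly correlated columns.
rng = np.random.default_rng(1)
base = rng.normal(size=(30, 1))
full = np.hstack([base, 2 * base, -base]) + 0.01 * rng.normal(size=(30, 3))
occluded = full.copy()
occluded[[2, 7, 11], 1] = np.nan
occluded[[4, 20], 2] = np.nan
pooled, draws = mice_sketch(occluded)
```

Observed pixels are never modified; only the entries flagged as missing are cycled through the conditional models.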

MissForest
The MissForest [27] is a non-parametric multiple imputation technique based on random forests [32]. Unlike MICE, the MissForest algorithm specifies a random forest model for each variable with missing pixels and uses the other variables to predict the missing values. As in the case of MICE, this process is done iteratively for all missing pixels until a stopping criterion is met. The advantage of using random forest models is that they provide much flexibility, address complex non-linear interactions [27], require little tuning and provide internally cross-validated error estimates [33].

Regularized expectation-maximization (RegEM)
The expectation-maximization (EM) imputation algorithm is an iterative optimization technique for estimating the parameter set (ϑ) of a probability model with incomplete data based on the notion of maximum likelihood [34]. Parameter estimation via EM-based methods is done by first estimating the parameters of the data distribution from the existing data and then imputing the missing data based on the estimated distribution [35]. The EM algorithm encompasses two steps: obtaining a probability distribution over all possible complete versions of the incomplete data given the current parameter estimate (E-step), and re-estimating the underlying parameter set using these completions (M-step). In practice, however, one need not specify this probability distribution explicitly, but need only compute expected sufficient statistics over these completions. The EM algorithm attempts to find the parameter set ϑ* that maximizes the log-likelihood of the observed pixel intensities by casting it as a prediction problem [36].
Assuming that the distribution of pixels is multivariate normal with parameter set ϑ = [µ, Σ], the missing pixel values can be imputed using a regression model. Although the normality assumption is plausible in many application areas, it can be replaced with other more complex densities, such as a mixture of simplex ones [31].
In the presence of occlusions, the feature matrix and associated design matrix become ill-conditioned (as a result of missing pixel values). As a result, ordinary regression estimates (such as least squares) and their standard errors could be highly unreliable, which can affect the stability of such models as well as the quality of predictions amidst multi-collinearity [37]. Under these circumstances, a penalized regression method (ridge regression) is recommended for ill-conditioned design matrices instead of least squares estimators. Specifically, given a linear regression model
$$Y = X\beta + \varepsilon,$$
where $Y$ is an $n \times 1$ vector of observations, $X$ is an $n \times p$ design matrix of rank $p$, $\beta$ is a $p \times 1$ vector of unknown parameters and $\varepsilon$ is an $n \times 1$ vector of unobserved errors, the ridge regression estimate $\hat{\beta}$ of $\beta$ is
$$\hat{\beta} = (X^{T}X + n\gamma I_p)^{-1}X^{T}Y,$$
where $\gamma$ is the ridge parameter to be selected and $I_p$ is the $p \times p$ identity matrix. This can be obtained as the solution to the least squares problem
$$\min_{\beta} \lVert Y - X\beta \rVert^{2} \quad \text{subject to} \quad \lVert \beta \rVert^{2} \le \tau,$$
where $\tau \ge 0$.
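The stabilising effect of the ridge penalty on an ill-conditioned design matrix can be seen in a short Python sketch. The function name `ridge_estimate`, the synthetic data and the value of γ are assumptions for illustration only. The penalty is scaled as nγ, an assumption chosen to agree with the matrix A(γ) used in the generalized cross-validation step.

```python
import numpy as np

def ridge_estimate(X, y, gamma):
    """Ridge estimate (X'X + n*gamma*I_p)^{-1} X'y for an n x p design matrix."""
    n, p = X.shape
    return np.linalg.solve(X.T @ X + n * gamma * np.eye(p), X.T @ y)

# Hypothetical design matrix with two nearly collinear columns.
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
X = np.column_stack([x1, x1 + 1e-6 * rng.normal(size=50)])
y = x1 + 0.1 * rng.normal(size=50)

beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)  # unpenalised least squares
beta_ridge = ridge_estimate(X, y, gamma=0.01)     # penalised estimate
```

With nearly collinear columns the least squares coefficients can become very large and unstable, whereas the ridge coefficients stay small and share the effect between the two columns.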
The visual quality of the de-occluded image depends on the regularization parameter γ. The method of generalized cross-validation has been shown by [38] to give a better estimate of γ compared to the method of maximum likelihood. In generalized cross-validation, the estimate $\hat{\gamma}$ of γ is obtained as a minimizer of the generalized cross-validation function
$$V(\gamma) = \frac{\frac{1}{n}\lVert (I_n - A(\gamma))Y \rVert^{2}}{\left[\frac{1}{n}\,\mathrm{tr}(I_n - A(\gamma))\right]^{2}},$$
where $A(\gamma) = X(X^{T}X + n\gamma I_p)^{-1}X^{T}$, and $I_n$ and $I_p$ denote the identity matrices of order $n$ and $p$. According to [38], using the generalized cross-validation approach to estimate γ does not require knowledge of the noise variance $\sigma^{2}$, making it a natural choice for solving regression-like problems where the design matrix is ill-posed, since in such cases there is no way of estimating $\sigma^{2}$ from the data.
The regularized EM using multiple ridge regression is carried out, starting with initial estimates of the mean µ and covariance matrix Σ, as follows:
• For each row of the feature matrix with missing values, obtain the multiple ridge regression parameters by regressing columns with missing pixel values on the columns with observed pixel values using the mean and covariance matrix.
• Fill in the missing pixel values with their conditional expectation values, where the conditional expectation values are obtained as the product of the available pixel values and the estimated ridge regression coefficients $\hat{\beta}_r$.
• Re-estimate the mean and covariance matrix, where the mean is obtained as the mean of the completed feature matrix and the covariance matrix is obtained as the sum of the covariance matrix of the feature matrix and an estimate of the conditional covariance matrix of the imputation error.
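The selection of γ by generalized cross-validation can be sketched in Python as a search over a grid of candidate values. This is a hedged illustration, not the study's code: the function names `gcv_score` and `select_gamma`, the grid and the synthetic data are assumptions, and the score is the standard generalized cross-validation function of [38] built from the matrix A(γ) given in the text.

```python
import numpy as np

def gcv_score(X, y, gamma):
    """GCV function V(gamma) with A(gamma) = X (X'X + n*gamma*I)^{-1} X'."""
    n, p = X.shape
    A = X @ np.linalg.solve(X.T @ X + n * gamma * np.eye(p), X.T)
    resid = y - A @ y
    return (resid @ resid / n) / (np.trace(np.eye(n) - A) / n) ** 2

def select_gamma(X, y, grid):
    """Return the grid value minimising the GCV score, plus all scores."""
    scores = [gcv_score(X, y, g) for g in grid]
    return grid[int(np.argmin(scores))], scores

# Hypothetical ill-conditioned regression problem (two near-duplicate columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
X[:, 4] = X[:, 3] + 1e-4 * rng.normal(size=40)
y = X[:, 0] - 2 * X[:, 1] + 0.5 * rng.normal(size=40)

grid = [10.0 ** k for k in range(-6, 2)]
gamma_hat, scores = select_gamma(X, y, grid)
```

Note that the noise variance never appears in `gcv_score`, which is the property the text highlights.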

Algorithm 3 RegEM
The test-image database 3 contains the MICE, MissForest and RegEM reconstructed images of test-image database 1.
Fig. 4 shows the reconstructed face images for some subjects under 10% degradation for the JAFFE and CKFE data sets.
The test-image database 4 contains the MICE, MissForest and RegEM reconstructed images of test-image database 2.
Fig. 5 shows the reconstructed face images for some subjects under 20% degradation for the JAFFE and CKFE data sets.

Research design
When face images are sent to the recognition module, they are preprocessed through mean centering and Discrete Wavelet Transform (DWT) mechanisms. The train images are the first to be preprocessed this way. Afterward, the preprocessed images are sent to the feature extraction unit, where the PCA/SVD algorithm extracts discriminative features. The extracted features are then stored in memory as created knowledge for recognition.
As mentioned before, four test datasets were used in this study. It is worth noting that only one of the adopted imputation mechanisms is used for de-occlusion at a time in a database before recognition. The test images are also preprocessed using mean centering and DWT mechanisms, and their discriminative features are also extracted using the PCA/SVD algorithm for recognition.
The discriminative features are passed on to the classifier/recognition unit, where they are matched with the stored knowledge created from the train images; the closest match is defined in terms of minimum recognition distance. We note that only one test image is passed to the recognition module along with the train images at a time. The design of the study recognition module is shown in Fig. 6.
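The matching step can be sketched in Python as a nearest-neighbour search under the city block (L1) distance named in the module's title. The function names, the `gallery` dictionary and the feature values below are hypothetical and serve only to illustrate the idea of a minimum recognition distance.

```python
def city_block(u, v):
    """City block (L1) distance between two equal-length feature vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))

def recognise(test_features, gallery):
    """Return (subject, distance) for the stored vector closest to the test vector."""
    return min(
        ((subject, city_block(test_features, feats)) for subject, feats in gallery.items()),
        key=lambda pair: pair[1],
    )

# Hypothetical stored knowledge: one feature vector per train subject.
gallery = {
    "subject_A": [0.9, -1.2, 0.3],
    "subject_B": [-0.4, 0.8, 1.1],
}
match, distance = recognise([1.0, -1.0, 0.0], gallery)
```

Only one test vector is compared against the gallery per call, mirroring the one-test-image-at-a-time design described above.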

Preprocessing
Preprocessing is a key stage in digital image processing. The importance of preprocessing has been underscored by several research works [11, 39–42].
The goal of image enhancement is to accentuate, via denoising mechanisms, the defining features of the image by improving the image quality. Image enhancement can be carried out in the spatial domain or in a transformed domain of the image. The latter, particularly, has evolved over the years to effectively deal with image denoising and enhance edge features [43]. In this study, we adopted mean centering and the Discrete Wavelet Transform as preprocessing mechanisms.

Discrete wavelet transform (DWT)
The discrete wavelet transform (DWT) is a transform-domain image denoising method with a multi-resolution property that allows the analysis of a signal (image) at different frequency resolutions [44]. This is particularly useful because some features of a face or signal have low-frequency components while others have high-frequency components [45]. According to [46], the use of wavelets gives superior performance in image denoising due to this multi-resolution property. DWT provides both frequency and spatial (location) information about a given signal. As such, DWT-based image denoising is preferred to other transform-domain denoising mechanisms such as the Fourier transform, which gives only the frequency information of a signal [6].
Denoising an image based on DWT consists of decomposing the face image, noise filtering and image reconstruction. DWT decomposes an image into two sets of coefficients, namely the approximation coefficients and the detail coefficients. The decomposition is done by passing the image through a series of filters. First, the image is passed through a low-pass filter, resulting in the approximation coefficients (LL sub-band). The image is also decomposed simultaneously using a high-pass filter, resulting in the detail coefficients (horizontal coefficients (LH sub-band), vertical coefficients (HL sub-band) and diagonal coefficients (HH sub-band)) [47]. These sub-bands provide different resolutions of the image, with the LL sub-band being the low-resolution form of the image and the remaining sub-bands being the high-resolution forms of the image. The LL sub-band contains global information of the image and is less prone to noise, while the remaining sub-bands contain local information such as eyes, nose and mouth [6]. DWT is a stable invertible transform for signals in diverse domains. Its efficiency in denoising signals stems from its multi-resolution property, which allows the analysis of a signal at different resolutions or scales, making it easier to identify patterns and anomalies in large datasets [48]. The wavelet transform involves the displacement of basic wavelet functions called mother wavelets [49]. Notable among them are the Haar, Daubechies, Coiflet, Symlet and Morlet wavelets. In this study we chose the Haar wavelet as the mother wavelet and performed a one-level decomposition of the face images. This is because it is simple and orthogonal (rigid in transformation to preserve distance in the original image) in nature. After the one-level decomposition, a Gaussian filter is applied to normalize illumination and the image is reconstructed via the inverse discrete wavelet transform. Figure 7 shows the DWT cycle using the Haar wavelet.

Fig. 6 Research design
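A one-level Haar decomposition and its inverse can be sketched in Python with NumPy. This is an illustrative sketch under stated assumptions, not the study's implementation: the function names are hypothetical, the image is assumed to have even height and width, the assignment of the LH and HL labels follows one common convention, and the Gaussian filtering step between decomposition and reconstruction is omitted.

```python
import numpy as np

def haar_dwt2(img):
    """One-level orthonormal 2-D Haar transform of an even-sized image."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]  # top-left, top-right of each 2x2 block
    c, d = img[1::2, 0::2], img[1::2, 1::2]  # bottom-left, bottom-right
    ll = (a + b + c + d) / 2  # approximation (low-pass in both directions)
    lh = (a + b - c - d) / 2  # detail: difference between the two rows
    hl = (a - b + c - d) / 2  # detail: difference between the two columns
    hh = (a - b - c + d) / 2  # detail: diagonal difference
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuild the image from its four sub-bands."""
    out = np.empty((2 * ll.shape[0], 2 * ll.shape[1]))
    out[0::2, 0::2] = (ll + lh + hl + hh) / 2
    out[0::2, 1::2] = (ll + lh - hl - hh) / 2
    out[1::2, 0::2] = (ll - lh + hl - hh) / 2
    out[1::2, 1::2] = (ll - lh - hl + hh) / 2
    return out

image = np.arange(16, dtype=float).reshape(4, 4)  # hypothetical 4x4 "face"
ll, lh, hl, hh = haar_dwt2(image)
restored = haar_idwt2(ll, lh, hl, hh)
```

Because the transform is orthogonal, the sum of squared coefficients across the four sub-bands equals the sum of squared pixel values, which is the distance-preserving property mentioned in the text.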

Mean centering
Given a matrix of face images whose columns are the vectorized forms of the face images of subjects, its corresponding mean-centered matrix is obtained by subtracting the mean intensity value of each column from each of that column's intensity values. The resultant mean-centered matrix thus has zero mean.
Mean-centering is an integral part of eigenvalue analysis which ensures that the principal components are proportional to the variance of the input data matrix with the first principal component reflecting the maximum variance, which otherwise would reflect the mean instead of the greatest variance [50].
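The column-wise centering described above amounts to a single array operation. The sketch below uses a hypothetical function name and toy pixel values.

```python
import numpy as np

def mean_centre_columns(faces):
    """Subtract each column's mean intensity from that column's values."""
    faces = np.asarray(faces, dtype=float)
    return faces - faces.mean(axis=0, keepdims=True)

# Two hypothetical 3-pixel "images", stored as the columns of the matrix.
faces = np.array([[10, 200],
                  [20, 220],
                  [30, 240]])
centred = mean_centre_columns(faces)
```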

Feature extraction
Dealing with high-dimensional datasets such as the human face is computationally expensive. Besides, in the presence of a large number of features, a learning model tends to overfit and hence under-perform [51]. Therefore, feature extraction forms an integral part of every face recognition module. During feature extraction, the dimensionality of the otherwise high-dimensional face images is reduced. This is because only the relevant features of each face are selected for classification and redundancy (noise) is removed.
Principal Component Analysis (PCA) is one such effective dimensionality reduction technique widely used in signal and image processing [14]. According to [52], PCA reduces the dimensionality of datasets whilst maintaining as much variability as possible and gives the best possible representation of a p-dimensional dataset in q dimensions (q < p) by maximizing variance (statistical information) in q dimensions. In PCA-based face feature extraction, discriminative facial features are obtained by projecting the given face images onto a feature space spanned by the principal components, which are the eigenvectors of the variance-covariance matrix of the faces data matrix [53]. This approach to feature extraction in face recognition is efficient due to its ease of implementation and few processing steps. In addition, no knowledge of geometry or any specific feature of the face is required [54].
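The projection onto principal components can be sketched in Python via the singular value decomposition, which avoids forming the covariance matrix explicitly. This is a minimal sketch under assumptions: the function names, the random stand-in data and the choice q = 3 are hypothetical, and the data matrix is assumed to have been mean-centered beforehand. The left singular vectors of the centered data matrix are eigenvectors of the matrix product `train @ train.T`, with eigenvalues equal to the squared singular values.

```python
import numpy as np

def pca_basis(train, q):
    """Return the first q left singular vectors and all singular values."""
    U, s, _ = np.linalg.svd(train, full_matrices=False)
    return U[:, :q], s

def extract_features(basis, images):
    """Project vectorised images (as columns) onto the retained components."""
    return basis.T @ images

rng = np.random.default_rng(0)
train = rng.normal(size=(64, 6))                     # 6 hypothetical 64-pixel faces
train = train - train.mean(axis=0, keepdims=True)    # centering, as assumed above

basis, s = pca_basis(train, q=3)
train_features = extract_features(basis, train)      # stored knowledge, 3 x 6
test_features = extract_features(basis, train[:, [0]])  # one test image at a time
```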
Several studies [55] have shown that the use of PCA for dimensionality reduction and feature extraction competes favourably with (and may outperform) other dimensionality reduction techniques, including independent component analysis [56] and linear discriminant analysis [57].Based on the above merits, we adopted PCA for feature extraction.

Assessment of the quality of de-occluded images
An attempt to resolve the challenges posed by occlusions to face recognition using multiple imputation methods may induce other artifacts or further degradations in the resultant faces.Image quality assessment is, therefore, crucial in this regard.
Image quality metrics are used to quantify the quality of the de-occluded images, and hence determine the best multiple imputation technique used to carry out de-occlusion.
Here, we refer to the unoccluded images as clean images.The Discrete Shannon Entropy (E), Absolute mean brightness error (AMBE), Peak Signal-to-Noise Ratio (PSNR), and Contrast (C) were used in this context.

Entropy
The entropy of a face image characterizes the average level of information inherent in the face image. A relatively higher entropy after de-occlusion signifies better image quality and a good source of information that could be leveraged to enhance the classification performance of face recognition modules, given the right choice of feature selection scheme. If the pixel intensity values in an image are seen as discretely sampled from the underlying image probability density P, then the discrete Shannon entropy with base 2 of the jth image is given by
$$E_j = -\sum_{k=0}^{L-1} P_j(k)\,\log_2 P_j(k),$$
where $P_j(k)$ is the probability of occurrence of the kth pixel intensity value and L is the number of grey levels [49].
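The entropy measure can be computed directly from the relative frequencies of the pixel intensities. The Python sketch below uses a hypothetical function name and toy pixel lists; intensity levels that do not occur contribute nothing to the sum and are simply skipped.

```python
import math
from collections import Counter

def shannon_entropy(pixels):
    """Base-2 Shannon entropy of a flat sequence of pixel intensities."""
    n = len(pixels)
    probs = [count / n for count in Counter(pixels).values()]
    return -sum(p * math.log2(p) for p in probs)

flat_image = [7, 7, 7, 7]         # a single grey level carries no information
varied_image = [0, 85, 170, 255]  # four equally likely grey levels
```

Here `shannon_entropy(flat_image)` is 0 bits and `shannon_entropy(varied_image)` is 2 bits, the maximum for four grey levels.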

Absolute mean brightness error (AMBE)
The absolute mean brightness error quantifies the brightness preservation property of the multiple imputation schemes in carrying out de-occlusion. For the jth image ($I_j$), the AMBE is evaluated as the absolute difference between the mean brightness of the clean image and that of its respective de-occluded image ($\tilde{I}_j$) and is given by
$$\mathrm{AMBE}_j = \left| m(I_j) - m(\tilde{I}_j) \right|,$$
where $m(I_j)$ and $m(\tilde{I}_j)$ represent the mean brightness of the clean and de-occluded images respectively. The multiple imputation method that gives the least (average) AMBE values has the highest brightness preservation and thus conserves the brightness of the "clean" images.
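As a small worked example, the AMBE can be computed in Python as follows. The function names and pixel values are hypothetical.

```python
def mean_brightness(pixels):
    """Mean intensity of a flat sequence of pixel values."""
    return sum(pixels) / len(pixels)

def ambe(clean, deoccluded):
    """Absolute difference between the mean brightness of the two images."""
    return abs(mean_brightness(clean) - mean_brightness(deoccluded))

clean = [100, 120, 140, 160]     # mean brightness 130
restored = [100, 118, 150, 160]  # mean brightness 132
error = ambe(clean, restored)    # 2.0
```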

Peak signal-to-noise ratio (PSNR)
The PSNR is computed as the ratio of the highest pixel value of the image to the noise (mean square error) that affects the quality of the pixels, expressed on a logarithmic decibel scale. For the jth image, the PSNR is given by
$$\mathrm{PSNR}_j = 10\,\log_{10}\!\left(\frac{\mathrm{Max}_j^{2}}{\mathrm{MSE}}\right),$$
where $\mathrm{Max}_j$ is the maximum possible pixel value and MSE is the mean square error between the clean image and its de-occluded image. In the absence of noise, a de-occluded image and its respective "clean" image are identical, therefore the MSE is zero and the corresponding PSNR value is infinite. When noise is introduced as a result of the de-occlusion process, the multiple imputation de-occlusion method achieving the highest average PSNR values is preferred since it results in the best quality de-occluded images.
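The PSNR computation, including the infinite value for identical images, can be sketched in Python. The function name, the default maximum of 255 (8-bit images) and the pixel values are assumptions for illustration; the formula is the standard decibel form based on the squared maximum pixel value.

```python
import math

def psnr(clean, deoccluded, max_value=255):
    """Peak signal-to-noise ratio in decibels between two flat pixel sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(clean, deoccluded)) / len(clean)
    if mse == 0:
        return math.inf  # identical images: no noise
    return 10 * math.log10(max_value ** 2 / mse)

clean = [100, 120, 140, 160]
mild = [100, 118, 150, 160]   # small imputation error (MSE = 26)
heavy = [90, 100, 170, 160]   # larger imputation error (MSE = 350)
```

The image with the smaller error (`mild`) yields the higher PSNR, which is why the method with the highest average PSNR is preferred.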

Contrast
The contrast of an image refers to the spread in the distribution of its pixel intensity values, which is measured by the range of pixel intensity levels. If the minimum and maximum intensity values are far apart, the image has good contrast; otherwise it has poor contrast. The standard deviation of intensity values is a natural characterization of image contrast and is used in this study.
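The standard-deviation characterization of contrast can be sketched in Python from the intensity probabilities. The function name and toy pixel lists are hypothetical.

```python
import math
from collections import Counter

def contrast(pixels):
    """Standard deviation of pixel intensities, computed from their probabilities."""
    n = len(pixels)
    probs = {level: count / n for level, count in Counter(pixels).items()}
    mean = sum(level * p for level, p in probs.items())
    return math.sqrt(sum((level - mean) ** 2 * p for level, p in probs.items()))

low_contrast = [7, 7, 7]
high_contrast = [0, 0, 255, 255]  # intensities at the two extremes
```

`contrast(low_contrast)` is 0, while `contrast(high_contrast)` is 127.5, reflecting the wide spread of intensities.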
For the jth image, the standard deviation of pixel intensity values is given by
$$\sigma_j = \sqrt{\sum_{k=0}^{L-1} (k - \mu_j)^{2}\,P_j(k)}, \qquad \mu_j = \sum_{k=0}^{L-1} k\,P_j(k),$$
where $P_j(k)$ is the probability of occurrence of the kth pixel intensity value, $\mu_j$ is the mean intensity and L is the number of grey levels.

From Fig. 8, the median entropy value is highest for the MICE de-occluded faces, followed by the RegEM, with the MissForest attaining the least median entropy value.

Assessment of image quality after using the various de-occlusion mechanisms
From Fig. 9, the MissForest has the least brightness-preserving property compared with the MICE and RegEM, which have relatively lower and similar AMBE values (that is, better brightness conservation), with only a few images having brightness significantly different from that of their respective clean images.
It can be seen from Fig. 10 that when the MissForest is used for de-occlusion, it results in images with the most noise, since its associated PSNR values are relatively lower. Results from Fig. 11 show that the MICE and RegEM produce images with relatively lower contrast. Nonetheless, the images obtained as a result of MissForest de-occlusion have relatively higher contrast compared with their respective clean images.

Assessment of the performance of the study algorithm under the various de-occlusion mechanisms
Sample results of matching the MissForest, MICE and RegEM de-occluded test face images (of some subjects with happy facial expressions) to the train-image database using the study algorithm are presented in Figs. 12 and 13. Figure 12 shows the decisions and recognition distances for six (6) subjects from the JAFFE database when de-occlusion was carried out at 20% random missingness. All six subjects were correctly matched for the MICE and RegEM de-occluded test images. However, there were two (2) mismatches for the MissForest de-occluded test faces.
Figure 13 shows the decisions and recognition distances for six (6) subjects from the CKFE database when de-occlusion was carried out at 20% random missingness. There were three (3) mismatches each when the MissForest and RegEM de-occluded test images were used for recognition, but only one mismatch when the MICE de-occluded test faces were used for recognition. Table 1 shows the average recognition rates of the study algorithm when using the occluded and de-occluded test faces at 10% and 20% occlusion rates respectively.
It can be seen from Table 1 that at a 10% occlusion rate, using the corresponding occluded and the MissForest, MICE and RegEM de-occluded images as test face images, the average recognition rates of the study algorithm (DWT-PCA/SVD-L1) were 41.40%, 68.75%, 85.94% and 84.44% respectively. Also, at a 20% occlusion rate, the average recognition rates of the study algorithm using the corresponding occluded and MissForest, MICE and RegEM de-occluded images as test face images were 23.63%, 54.69%, 78.65% and 76.54% respectively. At the 10% and 20% degradation levels, the DWT-PCA/SVD-L1 recognition algorithm performed poorly, obtaining average recognition rates of 41.40% and 23.63% respectively when the occluded images were used as test images. The decline in performance observed here was due to the increased degree of occlusion (from 10% to 20%).
It is also evident from Table 1 that the de-occlusion mechanisms (MICE, MissForest, RegEM) enhanced the performance of the recognition algorithm at each level of occlusion. Notably, the MICE de-occluded test face images gave the highest recognition rates (85.94% and 78.65% at 10% and 20% degradation levels respectively), followed closely by the RegEM de-occluded test face images (84.44% and 76.54% at 10% and 20% degradation levels respectively), with the study algorithm attaining the least average recognition rates (68.75% and 54.69% at 10% and 20% degradation levels respectively) when the MissForest de-occluded test face images were used for recognition. However, there was a moderate decline in the performance of the study algorithm with increasing level of occlusion and corresponding de-occlusion. These results are consistent with the works of [6,7].

Conclusion and recommendation
In this study, we compared three multiple imputation methods (Multiple Imputation by Chained Equations (MICE), MissForest and Regularized Expectation-Maximization (RegEM)) as de-occlusion mechanisms for moderately low levels of occlusion in test face images from two standard face image datasets (Japanese Female Facial Expressions (JAFFE) and Cohn-Kanade Facial Expression (CKFE)). The comparison was based on their effect on image quality and on the performance of a face recognition module (DWT-PCA/SVD-L1). In assessing image quality, both MICE and RegEM outperformed MissForest when Entropy, PSNR and AMBE were used as the evaluation criteria. Except for Entropy, where MICE attained the highest average value, MICE and RegEM produced images of similar quality as measured by AMBE and PSNR. None of the methods produced images with contrast similar to that of the respective clean images. In particular, MissForest produced images with over-enhanced contrast, while the MICE and RegEM de-occlusion mechanisms produced images with relatively lower contrast than the clean images. This suggests that MICE and RegEM yield images with better detail and better brightness conservation.

The use of the multiple imputation-based test images improved the performance of the study recognition module. The numerical evaluation showed that the study algorithm achieved the highest average recognition rate when de-occlusion was done using MICE (85.94% and 78.65% at 10% and 20% occlusion levels respectively), closely followed by RegEM (85.94% and 78.65% at 10% and 20% occlusion levels respectively), and the lowest average recognition rates (68.75% and 54.69% at 10% and 20% occlusion levels respectively) when the MissForest de-occluded test face images were used for recognition. These results were consistent across the 10% and 20% occlusion levels. Similar findings were obtained by [4,6,7], except that those works adopted different enhancement mechanisms for preprocessing the face images. Their underlying occlusion constraints were also acquired under different degrees of missingness, and they did not assess the quality of the images after de-occlusion.

The performance of the study recognition module, regardless of the multiple imputation method used for de-occlusion, appeared to depend on the level of occlusion. In particular, the multiple imputation methods appear not to be robust to higher levels of occlusion. Despite this limitation, the study provides useful insight into the use of multiple imputation methods for dealing with occlusions in face recognition and related areas. Future work will focus on enhancing the recognition rate of the study algorithm when multiple imputation-based de-occluded test face images are used for recognition, as well as on improving the robustness of the study algorithm to higher levels of occlusion.
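
As an illustration of how an average recognition rate of the kind reported above can be computed, the following minimal sketch matches each test feature vector to its nearest training vector under the city block (L1) distance and reports the percentage of correct matches. It assumes the features have already been extracted (for example by a DWT-PCA/SVD stage, which is not reproduced here). The function names and the toy vectors are hypothetical and are not taken from the study.

```python
import numpy as np

def city_block_match(train_feats, train_labels, test_feats):
    """Assign each test vector the label of its nearest training vector (L1 distance)."""
    # Pairwise city block distances, shape (n_test, n_train).
    dists = np.abs(test_feats[:, None, :] - train_feats[None, :, :]).sum(axis=2)
    return train_labels[np.argmin(dists, axis=1)]

def recognition_rate(predicted, actual):
    """Percentage of test images assigned the correct identity."""
    return 100.0 * np.mean(predicted == actual)

# Hypothetical toy features: three enrolled identities, four test images.
train_feats = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
train_labels = np.array([0, 1, 2])
test_feats = np.array([[1.0, 0.5], [9.0, 9.5], [0.5, 8.0], [6.0, 6.0]])
test_labels = np.array([0, 1, 2, 0])

predicted = city_block_match(train_feats, train_labels, test_feats)
rate = recognition_rate(predicted, test_labels)
print(predicted, rate)
```

In this toy case the last test vector lies closer to identity 1 than to its true identity 0, so three of the four images are matched correctly.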

Figure 2 (test-image database 1) and Fig. 3 (test-image database 2) contain expression-variant face images with 10% and 20% occlusions, respectively. The multiple-constrained (occlusions, varying expressions) face images in Figs. 2 and 3 were subsequently reconstructed using the MICE, MissForest and RegEM imputation techniques and captured into separate test-image databases.
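
To make the 10% and 20% occlusion levels concrete, the sketch below marks a chosen fraction of pixels in a grey-scale image as missing (NaN), which is the form an imputation method expects. This section does not state how the occluded pixels were selected in the study, so pixel-wise uniform random selection is an assumption here; the function name `occlude` and the synthetic image are hypothetical.

```python
import numpy as np

def occlude(image, fraction, rng):
    """Return a float copy of `image` with roughly `fraction` of its pixels set to NaN.

    Assumption: occluded pixels are drawn uniformly at random without replacement.
    """
    occluded = image.astype(float)  # astype returns a new array
    n_missing = int(round(fraction * image.size))
    idx = rng.choice(image.size, size=n_missing, replace=False)
    occluded.flat[idx] = np.nan
    return occluded

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64))  # synthetic stand-in for a face image

occ10 = occlude(image, 0.10, rng)
occ20 = occlude(image, 0.20, rng)
print(np.isnan(occ10).mean(), np.isnan(occ20).mean())
```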

Fig. 2 Sample of face images with 10% occlusion acquired under varying expressions (test-image database 1)

• $Y_j^{(\mathrm{obs})}$ → Observed pixels of $Y_j$; $k_j^{\mathrm{obs}} \subseteq \{1, 2, \ldots, n\}$ is the corresponding row index set of the observed pixels.
• $Y_j^{(\mathrm{mis})}$ → Unobserved pixels of $Y_j$; $k_j^{\mathrm{mis}} \subseteq \{1, 2, \ldots, n\}$ is the corresponding row index set of the missing pixels. Note that $k_j^{\mathrm{mis}} = \{1, 2, \ldots, n\} - k_j^{\mathrm{obs}}$.
• $Y_{-j}^{k(j,\mathrm{obs})}$ → The part of all columns other than the $j$th column $Y_j$ with row index set equal to $k_j^{\mathrm{obs}}$.
• $Y_{-j}^{k(j,\mathrm{mis})}$ → The part of all columns other than $Y_j$ with row index set equal to $k_j^{\mathrm{mis}}$.
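
The partition defined by this notation can be sketched in code. The example below splits column j into its observed and missing row index sets, then regresses the observed part of column j on the matching rows of the other columns and predicts the missing part. This is a deliberately simplified, deterministic chained-regression pass: it produces a single completed matrix and does not draw from a predictive distribution as a full multiple imputation procedure would. The function names, the small ridge term and the toy matrix are assumptions made for illustration.

```python
import numpy as np

def split_column(Y, j):
    """Row index sets of the observed and missing entries of column j."""
    missing = np.isnan(Y[:, j])
    return np.flatnonzero(~missing), np.flatnonzero(missing)

def impute_column(Y_filled, Y, j, ridge=1e-6):
    """Regress column j on the other columns over the observed rows of column j,
    then predict its missing rows. `ridge` is an assumed stabilising term."""
    obs_rows, mis_rows = split_column(Y, j)
    if obs_rows.size == 0 or mis_rows.size == 0:
        return Y_filled
    others = np.delete(Y_filled, j, axis=1)  # all columns except column j
    X_obs = np.column_stack([np.ones(obs_rows.size), others[obs_rows]])
    X_mis = np.column_stack([np.ones(mis_rows.size), others[mis_rows]])
    A = X_obs.T @ X_obs + ridge * np.eye(X_obs.shape[1])
    beta = np.linalg.solve(A, X_obs.T @ Y[obs_rows, j])
    out = Y_filled.copy()
    out[mis_rows, j] = X_mis @ beta
    return out

def chained_imputation(Y, n_cycles=5):
    """Start from column means, then cycle the column-wise regressions."""
    Y_filled = np.where(np.isnan(Y), np.nanmean(Y, axis=0), Y)
    for _ in range(n_cycles):
        for j in range(Y.shape[1]):
            Y_filled = impute_column(Y_filled, Y, j)
    return Y_filled

# Toy matrix in which the second column equals 2 * first column + 1.
Y = np.array([[1.0, 3.0], [2.0, 5.0], [3.0, np.nan], [4.0, 9.0], [np.nan, 11.0]])
obs_rows, mis_rows = split_column(Y, 1)
result = chained_imputation(Y)
print(result)
```

Observed entries are never overwritten; only the rows in the missing index set of each column are updated at each step.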

Fig. 3 Sample of face images with 20% occlusion acquired under varying expressions (test-image database 2)

Fig. 4 Sample of reconstructed face images with 10% occlusion acquired under varying expressions (test-image database 3)

Fig. 5 Sample of reconstructed face images with 20% occlusion acquired under varying expressions (test-image database 4)

Fig. 7 DWT preprocessing cycle for a MICE reconstructed image

Figures 8, 9, 10 and 11 show the entropy, AMBE, PSNR and contrast, respectively, of the occluded face images after reconstruction with the MICE, MissForest and RegEM imputation algorithms. From Fig. 8, the median entropy is highest for the MICE de-occluded faces, followed by RegEM, with MissForest attaining the lowest median entropy. From Fig. 9, MissForest preserves brightness least well, whereas MICE and RegEM have relatively lower and similar AMBE values (that is, better brightness conservation), with a few images having brightness significantly different from that of their respective clean images. Fig. 10 shows that MissForest de-occlusion results in the noisiest images, since its PSNR values are relatively lower than those of MICE and RegEM. However, the distributions of PSNR values for MICE and RegEM are similar on average, although that of MICE shows more variability.

$\sum_{k} \bigl(k - m(I_j)\bigr)^{2} P_j(k),$
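
The three quality measures compared above can be illustrated with a short sketch. It uses the common textbook definitions, which are assumed rather than taken from the study: entropy as the Shannon entropy (in bits) of the grey-level histogram, AMBE as the absolute difference between the mean brightness of the clean and reconstructed images, and PSNR computed from the mean squared error with a peak value of 255. The function names and the 2 x 2 toy images are hypothetical.

```python
import numpy as np

def entropy(image, levels=256):
    """Shannon entropy (bits) of the grey-level histogram of an integer image."""
    hist = np.bincount(image.ravel(), minlength=levels).astype(float)
    p = hist / hist.sum()
    p = p[p > 0]  # empty bins contribute nothing
    return float(-(p * np.log2(p)).sum())

def ambe(clean, restored):
    """Absolute mean brightness error; lower means better brightness preservation."""
    return float(abs(clean.astype(float).mean() - restored.astype(float).mean()))

def psnr(clean, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means less reconstruction noise."""
    mse = np.mean((clean.astype(float) - restored.astype(float)) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / mse))

# Hypothetical toy images: one pixel of the reconstruction is off by 10 grey levels.
clean = np.array([[0, 0], [255, 255]], dtype=np.uint8)
restored = np.array([[0, 10], [255, 255]], dtype=np.uint8)

print(entropy(clean), entropy(restored), ambe(clean, restored), psnr(clean, restored))
```

Images are cast to float before differencing so that unsigned 8-bit arithmetic cannot wrap around.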

Table 1 Average recognition rates of the study algorithm using the de-occlusion mechanisms at 10% and 20% degradation levels