Bayesian zero-inflated regression model with application to under-five child mortality

Under-five mortality is defined as the probability that a child born alive dies between birth and the fifth birthday. Mortality under the age of five has been a major target of public health policies and is a common indicator of mortality levels. This study therefore aimed to assess under-five child mortality and to model its determinants using a Bayesian zero-inflated regression model. A community-based cross-sectional study was conducted using the 2016 Ethiopia Demographic and Health Survey data. The sample was stratified and selected using a two-stage cluster sampling design. A Bayesian approach was applied to model the mixture structure inherent in zero-inflated count data using the negative binomial-logit hurdle model. About 71.09% of the mothers had not experienced any under-five deaths in their lifetime, while 28.91% had experienced the death of at least one under-five child; the data therefore contained excess zeros. From the Bayesian negative binomial-logit hurdle model, twin birth (OR = 1.56; HPD CrI 1.23, 1.94), primary and secondary education (OR = 0.68; HPD CrI 0.59, 0.79), mother's age at first birth of 16–25 (OR = 0.83; HPD CrI 0.75, 0.92) and ≥ 26 (OR = 0.71; HPD CrI 0.52, 0.95), contraceptive use (OR = 0.73; HPD CrI 0.64, 0.84), and antenatal visits during pregnancy (OR = 0.83; HPD CrI 0.75, 0.92) were statistically associated with the number of non-zero under-five deaths in Ethiopia. The Bayesian negative binomial-logit hurdle model is gaining popularity in data analysis over the classical negative binomial-logit hurdle model because the technique is more robust and precise.
Furthermore, the Bayesian negative binomial-logit hurdle model helped to identify the most important determinants of under-five child mortality: mother's education, mother's age, birth order, type of birth, mother's age at first birth, contraceptive use, and antenatal visits during pregnancy.

In many cases, because of the many zeros in the dependent variable, the mean of the dependent variable is not equal to its variance. The Poisson model is therefore no longer suitable for this kind of data. Thus, we suggest using a negative binomial-logit hurdle (NBLH) regression model to overcome the problem of over-dispersion [21]. This study therefore aimed to assess the status of under-five child mortality and to model its determinants using a Bayesian zero-inflated regression model.

Study design and source of data
The dataset used for this study was obtained from the 2016 Ethiopian Demographic and Health Survey (EDHS), conducted from January 18 to June 27, 2016, across the country. The survey was a population-based cross-sectional study. The 2016 EDHS sample was stratified and selected in two stages. In the first stage, a total of 645 clusters (202 urban and 443 rural) were randomly selected from the sampling strata with probability proportional to household size. In the second stage, 28 households per cluster were selected using systematic random sampling. A total of 10,641 children under age 5, born to mothers selected from the 645 clusters, were included in this study.

Dependent variable
The dependent variable for this study was the number of under-five deaths per mother, where an under-five death was defined as the death of a child younger than 60 months in the 5 years preceding the survey.

Independent variable
The main predictors explored for under-five mortality were grouped into demographic factors, socioeconomic factors, and utilization of maternal health services. The demographic factors were the mother's age, birth order number, and mother's age at first birth. The socioeconomic factors were the mother's level of education, the mother's place of residence, and the household wealth index. Frequency of ANC visits, contraceptive use, and type of birth were the variables included under utilization of maternal health services by the mother.

Statistical method
In this study, the variable of interest was a count. When the dependent variable is a count, it is appropriate to use non-linear models based on non-normal distributions to describe the relationship between the response variable and a set of predictor variables. For count data, the standard framework for explaining the relationship between the outcome variable and a set of explanatory variables includes the Poisson, negative binomial (NB), zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB), and hurdle models. The advanced models considered for the count data in this study are the NBLH model and the Bayesian negative binomial-logit hurdle model [22].

Poisson and Negative Binomial Regression Model
Poisson regression has been widely used for fitting count data. It is traditionally regarded as the basic count model upon which a variety of other count models are based [15]. The Poisson probability mass function, with rate parameter $\mu_i$, is given by:

$$P(Y_i = y_i) = \frac{e^{-\mu_i}\,\mu_i^{y_i}}{y_i!}, \quad y_i = 0, 1, 2, \ldots$$

where $y_i$ is the number of under-five deaths of the ith mother in a given time with rate parameter $\mu_i$. The mean and variance of the Poisson distribution are $E(Y_i) = Var(Y_i) = \mu_i$. The Poisson regression model derives from the Poisson distribution and relates $\mu_i$, $\beta$, and $X_i^T$ through:

$$\log(\mu_i) = X_i^T \beta$$

where $\beta$ is the vector of coefficients of the covariates $X_i^T$. Unfortunately, in many cases the number of under-five deaths has a variance greater than its mean, which is known as over-dispersion. Over-dispersion results from extra variation in the mean number of under-five deaths, which can be caused by factors such as model misspecification, the omission of important covariates, and excess zero counts [23]. In this case, applying a Poisson regression model to the number of under-five deaths would underestimate the standard errors of the regression parameters. Therefore, the negative binomial model is introduced with:

$$P(Y_i = y_i) = \frac{\Gamma(y_i + \phi^{-1})}{\Gamma(\phi^{-1})\, y_i!} \left(\frac{1}{1 + \phi\mu_i}\right)^{\phi^{-1}} \left(\frac{\phi\mu_i}{1 + \phi\mu_i}\right)^{y_i}, \quad y_i = 0, 1, 2, \ldots$$

The mean and variance of the negative binomial distribution are $E[y \mid \mu, \phi] = \mu$ and $V[y \mid \mu, \phi] = \mu(1 + \phi\mu)$, where $\phi$ is the dispersion parameter ($\phi > 0$ and $\mu > 0$). Special cases of the negative binomial include the Poisson (the limit $\phi \to 0$) and the geometric ($\phi = 1$). The method of maximum likelihood is used to estimate the parameters in the negative binomial regression model [24].
In some cases, excess zeros exist in the number of under-five deaths and are a source of overdispersion. The NB model cannot handle overdispersion that is due to a high number of zeros. Zero-inflated models, including the ZIP and ZINB models, can be used as alternatives. Both the ZIP and ZINB models assume that zero counts come from two different processes: a binary process that generates excess zeros, and a count process that generates non-negative counts of under-five deaths, including zeros.
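The contrast between the two variance functions can be checked numerically. The following is a minimal sketch, assuming the mean-dispersion form of the negative binomial with variance μ(1 + φμ); the function names and the example values μ = 0.5 and φ = 2 are illustrative choices, not quantities from the study.

```python
import math

def poisson_pmf(y, mu):
    """Poisson probability of count y with mean mu."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

def negbin_pmf(y, mu, phi):
    """Negative binomial probability with mean mu and variance mu * (1 + phi * mu)."""
    r = 1.0 / phi
    log_p = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))
    return math.exp(log_p)

def moments(pmf, upper=500):
    """Mean and variance obtained by summing the pmf over 0..upper-1."""
    mean = sum(y * pmf(y) for y in range(upper))
    var = sum((y - mean) ** 2 * pmf(y) for y in range(upper))
    return mean, var

# Illustrative values only (not estimates from the EDHS data).
pois_mean, pois_var = moments(lambda y: poisson_pmf(y, 0.5))
nb_mean, nb_var = moments(lambda y: negbin_pmf(y, 0.5, 2.0))
```

With the same mean, the Poisson variance equals the mean while the negative binomial variance is inflated by the factor 1 + φμ, which is the extra flexibility the text relies on.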

Zero Inflated Regression Model
Zero-inflated models extend Poisson and negative binomial regression to response variables with many zero outcomes. The ZIP regression model is more effective than Poisson regression when there are many zero outcomes, while the ZINB regression model is more effective than negative binomial regression in the same situation [25].

Zero-Inflated Poisson and Negative Binomial Regression Model
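In ZIP regression, the counts $Y_i$ equal 0 with probability $p_i$ and follow a Poisson distribution with mean $\mu_i$ with probability $1 - p_i$, where $i = 1, 2, \ldots, n$. The ZIP model can thus be seen as a mixture of two component distributions, a zero part and a count part, given by [15]:

$$P(Y_i = y_i) = \begin{cases} p_i + (1 - p_i)\, e^{-\mu_i}, & y_i = 0 \\ (1 - p_i)\, \dfrac{e^{-\mu_i}\,\mu_i^{y_i}}{y_i!}, & y_i = 1, 2, \ldots \end{cases}$$

The ZINB distribution is a mixture distribution assigning a mass of $p$ to 'extra' zeros and a mass of $(1 - p)$ to a negative binomial distribution, where $0 \le p \le 1$. Based on the probability function of the zero-modified distribution, the probability mass function for the ZINB is:

$$P(Y_i = y_i) = \begin{cases} p_i + (1 - p_i)\,(1 + \phi\mu_i)^{-\phi^{-1}}, & y_i = 0 \\ (1 - p_i)\, \dfrac{\Gamma(y_i + \phi^{-1})}{\Gamma(\phi^{-1})\, y_i!} \left(\dfrac{1}{1 + \phi\mu_i}\right)^{\phi^{-1}} \left(\dfrac{\phi\mu_i}{1 + \phi\mu_i}\right)^{y_i}, & y_i = 1, 2, \ldots \end{cases}$$

where $\phi$, $\mu_i$, and $\Gamma(\cdot)$ represent the dispersion parameter, the mean, and the gamma function, respectively. Assume that predictors are available for both the logistic regression function and the count (Poisson or negative binomial) regression function. The ZIP or ZINB regression model can then be written as follows:

$$\operatorname{logit}(p_i) = \log\left(\frac{p_i}{1 - p_i}\right) = Z_i^T \gamma, \qquad \log(\mu_i) = X_i^T \beta$$

where $\beta$ is the vector of coefficients of $X_i^T$ and $\gamma$ is the vector of coefficients of $Z_i^T$.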
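The mixture structure can be made concrete by computing how much probability a zero-inflated model places on zero. This is a minimal sketch under the parameterization described above; the helper names and the values p = 0.4, μ = 1, and φ = 0.5 are assumptions chosen for illustration.

```python
import math

def poisson_pmf(y, mu):
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

def negbin_pmf(y, mu, phi):
    """Negative binomial with mean mu and variance mu * (1 + phi * mu)."""
    r = 1.0 / phi
    return math.exp(math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
                    + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))

def zero_inflated_pmf(y, p, base_pmf):
    """Extra zero with probability p, otherwise a draw from base_pmf."""
    extra_zero = p if y == 0 else 0.0
    return extra_zero + (1.0 - p) * base_pmf(y)

# Illustrative values only.
zip_zero = zero_inflated_pmf(0, 0.4, lambda y: poisson_pmf(y, 1.0))
zinb_zero = zero_inflated_pmf(0, 0.4, lambda y: negbin_pmf(y, 1.0, 0.5))
zinb_total = sum(zero_inflated_pmf(y, 0.4, lambda k: negbin_pmf(k, 1.0, 0.5))
                 for y in range(500))
```

In both cases the zero probability is the sum of the extra-zero mass and the zero mass of the count distribution, so zeros arise from two sources.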

Poisson and Negative Binomial-logit hurdle model
A hurdle model consists of two components: a point mass at zero and a distribution that generates non-zero counts. The first component is a binary component that generates zeros and ones (here "ones" correspond to non-zero values in the data), and the second component generates non-zero values from a zero-truncated distribution. The most widely used hurdle models are those with the hurdle value at zero [4]. All zeros in the hurdle model are assumed to be "structural" zeros, i.e., they are generated from a single process and are observed because the condition is absent. We explore two zero-truncated count distributions for the hurdle model specification [22]. The hurdle model for count data can be expressed as follows for the Poisson and negative binomial distributions. We consider a Poisson hurdle regression model in which the response variable $y$ has the distribution:

$$P(Y_i = y_i) = \begin{cases} 1 - \pi_i, & y_i = 0 \\ \pi_i\, \dfrac{e^{-\mu_i}\,\mu_i^{y_i}}{(1 - e^{-\mu_i})\, y_i!}, & y_i = 1, 2, \ldots \end{cases}$$

where $\pi_i$ is the probability of a non-zero count and $\mu_i$ is the mean of the untruncated Poisson distribution. A negative binomial hurdle distribution is given by:

$$P(Y_i = y_i) = \begin{cases} 1 - \pi_i, & y_i = 0 \\ \dfrac{\pi_i}{1 - (1 + \phi\mu_i)^{-\phi^{-1}}}\, \dfrac{\Gamma(y_i + \phi^{-1})}{\Gamma(\phi^{-1})\, y_i!} \left(\dfrac{1}{1 + \phi\mu_i}\right)^{\phi^{-1}} \left(\dfrac{\phi\mu_i}{1 + \phi\mu_i}\right)^{y_i}, & y_i = 1, 2, \ldots \end{cases}$$

where $\phi$ ($\ge 0$) is a dispersion parameter that is assumed not to depend on covariates.
In both cases the positive counts follow a zero-truncated distribution, $P(Y_i = y_i \mid Y_i > 0) = f(y_i) / (1 - f(0))$ for $y_i = 1, 2, 3, \ldots$, where $f$ is the untruncated Poisson or negative binomial probability mass function. The maximum likelihood estimation (MLE) method is used to estimate the parameters in the count models. This study includes the Poisson, negative binomial, ZIP, ZINB, Poisson hurdle, and NBLH models to accommodate the excess zeros in the under-five death counts. Akaike's information criterion (AIC) and the log-likelihood, which are basic tools for assessing model performance, are used for model selection [15]. Dispersion parameters are used to test for overdispersion. The generalized Pearson χ2 statistic, which is the standard measure of goodness of fit, is used to evaluate the adequacy of the fitted models.
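The difference between a hurdle model and a zero-inflated model is easiest to see in code: in the hurdle model, the zero probability is set entirely by the binary part. The sketch below assumes that the binary part is parameterized by the probability of a non-zero count (called `pi_pos` here, a name introduced for illustration) and uses arbitrary example values.

```python
import math

def negbin_pmf(y, mu, phi):
    """Negative binomial with mean mu and variance mu * (1 + phi * mu)."""
    r = 1.0 / phi
    return math.exp(math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
                    + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))

def nb_hurdle_pmf(y, pi_pos, mu, phi):
    """Hurdle pmf: zeros from the binary part, positives from a zero-truncated NB."""
    if y == 0:
        return 1.0 - pi_pos
    truncation = 1.0 - negbin_pmf(0, mu, phi)  # P(count > 0) under the untruncated NB
    return pi_pos * negbin_pmf(y, mu, phi) / truncation

# Illustrative values only: about 29% of mothers with at least one death.
p_zero = nb_hurdle_pmf(0, 0.29, 1.2, 0.8)
total = sum(nb_hurdle_pmf(y, 0.29, 1.2, 0.8) for y in range(500))
```

Changing `mu` or `phi` leaves `p_zero` untouched, which is why the zero part and the count part can be interpreted separately.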

Bayesian Negative Binomial-Logit Hurdle Model
The number of under-five deaths per mother is a count variable. For modeling count data, two-part models are applied in the presence of excessive zeros. Therefore, for a better fit, an over-dispersed model that incorporates excessive zeros, i.e., the negative binomial-logit hurdle (NBLH) regression model, is used. The hurdle model is flexible and can handle both under-dispersion and over-dispersion. The NBLH model can be used on data with either excessive zero counts in the response or, at times, too few zero counts. When there are too few zero counts, a zero-inflated model cannot be used, and the hurdle model is a good way to deal with such data [22]. It has two parts. The first part, the zero hurdle model, models whether the dependent variable is zero; the second part, the truncated negative binomial model, models the non-zero values (positive integers) of the dependent variable [26]. The probability mass function of the negative binomial-logit hurdle model is:

$$P(Y_i = y_i) = \begin{cases} 1 - \pi_i, & y_i = 0 \\ \dfrac{\pi_i}{1 - (1 + \phi\mu_i)^{-\phi^{-1}}}\, \dfrac{\Gamma(y_i + \phi^{-1})}{\Gamma(\phi^{-1})\, y_i!} \left(\dfrac{1}{1 + \phi\mu_i}\right)^{\phi^{-1}} \left(\dfrac{\phi\mu_i}{1 + \phi\mu_i}\right)^{y_i}, & y_i = 1, 2, \ldots \end{cases}$$

where $\pi_i$ is the probability of a non-zero count, and $\phi$, $\mu_i$, and $\Gamma(\cdot)$ represent the dispersion parameter, the mean, and the gamma function, respectively. The most natural choice is to use a zero hurdle model with a logit link function and a truncated negative binomial model with a log link function:

$$\operatorname{logit}(\pi_i) = Z_i^T \gamma, \qquad \log(\mu_i) = X_i^T \beta$$

where $\beta$ is the vector of coefficients of $X_i^T$ and $\gamma$ is the vector of coefficients of $Z_i^T$. The parameter $\phi$ is a measure of dispersion. As $\phi \to 0$, the count part of the NBLH model reduces to the truncated Poisson, giving the Poisson-logit hurdle model. For $\phi > 0$, the NBLH model can be used to fit overdispersed count data. Since $\phi$ is restricted to be non-negative here, too few zeros relative to the untruncated distribution are accommodated by the hurdle part rather than by $\phi$. The likelihood function of the negative binomial-logit hurdle distribution is as follows:

$$L(\beta, \gamma, \phi \mid y, X) = \prod_{i:\, y_i = 0} (1 - \pi_i) \prod_{i:\, y_i > 0} \frac{\pi_i}{1 - (1 + \phi\mu_i)^{-\phi^{-1}}}\, \frac{\Gamma(y_i + \phi^{-1})}{\Gamma(\phi^{-1})\, y_i!} \left(\frac{1}{1 + \phi\mu_i}\right)^{\phi^{-1}} \left(\frac{\phi\mu_i}{1 + \phi\mu_i}\right)^{y_i}$$

The first and most important step in the Bayesian approach is choosing appropriate prior distributions. Let $\beta$ and $\gamma$ be the sets of regression parameters of the model above. We assume independent priors for these parameters. Since there is no prior information from historical data or previous experiments, all parameters are given non-informative priors. The prior distributions for $\beta$ and $\gamma$ are assumed to be normal, while $\phi$ is assumed to be gamma-distributed. So, the joint prior distribution for the NBLH regression parameters is:

$$f(\beta, \gamma, \phi) = f(\beta)\, f(\gamma)\, f(\phi)$$

where $f(\beta)$ and $f(\gamma)$ are normal densities and $\phi \sim \text{Gamma}(a, b)$ with $a = 0.001$ and $b = 0.001$. The independence assumption reflects our a priori judgment that knowledge of the slope parameter $\gamma$ does not provide any information about $\beta$. Full Bayesian inference was based on the posterior distribution of all parameters. Markov chain Monte Carlo (MCMC) techniques were used to draw samples from the full conditional distributions of all parameters, which were then summarized to obtain model estimates in the posterior analysis, that is:

$$f(\beta, \gamma, \phi \mid y, X) \propto L(\beta, \gamma, \phi \mid y, X)\, f(\beta, \gamma, \phi)$$

where $f(\beta, \gamma, \phi \mid y, X)$ is the joint posterior distribution of all parameters, $L(\beta, \gamma, \phi \mid y, X)$ is the likelihood of all observed data $(Y, X)$, and $f(\beta, \gamma, \phi)$ is the joint prior distribution. The posterior distribution is difficult to obtain analytically. Therefore, numerical simulation using MCMC with the Gibbs sampler was used to update the parameters from initial values and to sample the parameters once the simulation had converged.
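To make the structure of the posterior concrete, the following sketch evaluates an unnormalized log posterior for a negative binomial-logit hurdle model with independent normal priors on the regression coefficients and a Gamma(0.001, 0.001) prior on the dispersion. It is an illustration under stated assumptions, not the authors' implementation: the prior standard deviation `prior_sd = 100` is a hypothetical choice (the text does not report the normal hyperparameters), the binary part models the probability of a non-zero count, and all function names are introduced here.

```python
import math

def log_negbin(y, mu, phi):
    """Log NB pmf with mean mu and variance mu * (1 + phi * mu)."""
    r = 1.0 / phi
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))

def log_normal_pdf(x, mean, sd):
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)

def log_gamma_pdf(x, a, b):
    """Log gamma density with shape a and rate b."""
    return a * math.log(b) - math.lgamma(a) + (a - 1) * math.log(x) - b * x

def dot(coefs, row):
    return sum(c * v for c, v in zip(coefs, row))

def log_posterior(beta, gamma, phi, y, X, Z, prior_sd=100.0, a=0.001, b=0.001):
    """Unnormalized log posterior: log prior plus hurdle log-likelihood."""
    if phi <= 0:
        return -math.inf
    lp = sum(log_normal_pdf(c, 0.0, prior_sd) for c in list(beta) + list(gamma))
    lp += log_gamma_pdf(phi, a, b)
    for yi, xi, zi in zip(y, X, Z):
        eta = dot(gamma, zi)              # logit of P(count > 0)
        mu = math.exp(dot(beta, xi))      # mean of the untruncated NB
        if yi == 0:
            lp += -math.log1p(math.exp(eta))        # log(1 - pi_i)
        else:
            log_f0 = -math.log1p(phi * mu) / phi    # log NB probability of zero
            lp += -math.log1p(math.exp(-eta))       # log(pi_i)
            lp += log_negbin(yi, mu, phi) - math.log(-math.expm1(log_f0))
    return lp

# Tiny illustrative data set with an intercept only (values are made up).
y_demo = [0, 0, 1, 3]
X_demo = [[1.0]] * 4
Z_demo = [[1.0]] * 4
lp_demo = log_posterior([0.1], [-0.9], 0.5, y_demo, X_demo, Z_demo)
```

An MCMC sampler only needs this function up to an additive constant, which is why the normalizing constant of the posterior never has to be computed.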
The most commonly used of these sampling techniques is the Gibbs sampling algorithm. Gibbs sampling generates a sequence of samples from the joint probability distribution of two or more random variables in order to approximate that joint distribution. It is applicable when the joint distribution is not known explicitly but the conditional distribution of each variable is known. The algorithm draws a value for each variable in turn, conditional on the current values of the other variables [27].
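The idea of sampling each variable in turn from its full conditional can be illustrated on a toy target where the conditionals are known exactly. The sketch below uses a standard bivariate normal with correlation ρ, for which each conditional is normal with mean ρ times the other variable and variance 1 − ρ²; it is not the sampler for the NBLH model, and the function names, seed, and settings are illustrative.

```python
import math
import random

def gibbs_bivariate_normal(rho, n_draws, burn_in=500, seed=1):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    cond_sd = math.sqrt(1.0 - rho * rho)
    x, y = 0.0, 0.0
    draws = []
    for t in range(n_draws + burn_in):
        x = rng.gauss(rho * y, cond_sd)  # draw x given the current y
        y = rng.gauss(rho * x, cond_sd)  # draw y given the updated x
        if t >= burn_in:
            draws.append((x, y))
    return draws

def sample_corr(draws):
    """Sample correlation of the retained (x, y) pairs."""
    n = len(draws)
    mx = sum(d[0] for d in draws) / n
    my = sum(d[1] for d in draws) / n
    sxy = sum((d[0] - mx) * (d[1] - my) for d in draws)
    sxx = sum((d[0] - mx) ** 2 for d in draws)
    syy = sum((d[1] - my) ** 2 for d in draws)
    return sxy / math.sqrt(sxx * syy)

draws = gibbs_bivariate_normal(rho=0.6, n_draws=20000)
estimated_rho = sample_corr(draws)
```

Although only the two conditional distributions are ever sampled, the retained draws approximate the joint distribution, so the estimated correlation should be close to the value of ρ used to build the conditionals.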

The convergence of the algorithm
The analysis relied on flexible software for Bayesian analysis of complex statistical models using MCMC methods; we used these tools to estimate the NBLH regression models. MCMC combines Markov chains with Monte Carlo estimation and eventually converges to the target distribution (the posterior distribution). If a chain has converged, the samples it produces can be treated as draws from the target distribution. The MCMC method is a general simulation method for sampling from posterior distributions and computing posterior quantities of interest. MCMC methods sample successively from a target distribution; each sample depends on the previous one, hence the notion of a Markov chain. The Markov chain method has been quite successful in modern Bayesian computing. Only in the simplest Bayesian models can the analytical forms of the posterior distributions be recognized and inferences summarized directly. In moderately complex models, posterior densities are too difficult to work with directly. With the MCMC method, it is possible to generate samples from an arbitrary posterior density and to use these samples to approximate expectations of quantities of interest. Several other aspects of the Markov chain method have also contributed to its success. Most importantly, if the simulation algorithm is implemented correctly, the Markov chain is guaranteed to converge to the target distribution [28][29][30][31]. The MCMC technique starts from an approximate distribution that is improved at each simulation step until convergence to the posterior distribution is achieved. Appropriate diagnostics can be used to assess this, such as the Gelman-Rubin convergence diagnostic, the Heidelberger-Welch stationarity and half-width tests, monitoring of the Monte Carlo (MC) error, checking for autocorrelation, and inspection of trace plots.
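As an illustration of one of these diagnostics, the sketch below computes the basic Gelman-Rubin potential scale reduction factor from several chains of equal length. It is a simplified version written for this example (without chain splitting or degrees-of-freedom corrections) and is not the routine of any particular package; the simulated chains are stand-ins for MCMC output.

```python
import math
import random

def gelman_rubin(chains):
    """Basic potential scale reduction factor for m chains of equal length n."""
    m = len(chains)
    n = len(chains[0])
    means = [sum(c) / n for c in chains]
    grand_mean = sum(means) / m
    # Between-chain variance B and mean within-chain variance W.
    between = n / (m - 1) * sum((mu - grand_mean) ** 2 for mu in means)
    within = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
                 for c, mu in zip(chains, means)) / m
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)

rng = random.Random(42)
# Three chains sampling the same distribution: the factor should be close to 1.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(3)]
# Three chains stuck around different values: the factor should be well above 1.
stuck = [[rng.gauss(shift, 1.0) for _ in range(2000)] for shift in (0.0, 5.0, 10.0)]

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
```

Values close to 1 suggest that the chains agree with each other; values well above 1 indicate that the chains have not yet mixed, so more iterations or better starting values are needed.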

Results
Information on the number of under-five deaths was obtained from a total of 10,274 women in Ethiopia. Table 1 shows the frequency and percentage distribution of the number of under-five deaths based on information from these women. In this study, 71.09% of the women had never experienced a child death, while the remaining 28.91% had experienced at least one. Zero outcomes were therefore large in number, whereas large counts (i.e., many under-five deaths per mother) were observed less frequently, which leads to a positively skewed distribution. Figure 1 visualizes the over-dispersion of the response variable: because of the large number of zero outcomes, the histogram is highly peaked at zero, so we can state that the overdispersion is due to an excess of zeros.
The distribution of the number of under-five deaths has a rapidly decreasing tail and is highly skewed to the right with excess zeros. This indicates that the data could be fitted better by count data models that take excess zeros into account, such as the negative binomial hurdle model.

Test for overdispersion
In the Poisson regression analysis, the deviance and Pearson chi-square goodness-of-fit statistics indicated over-dispersion (Table 2). The Pearson chi-square statistic divided by its degrees of freedom, with an observed value of 1.165, is higher than one, so these goodness-of-fit statistics indicate overdispersion in the data set. Although the deviance and Pearson chi-square statistics dropped considerably in the NB regression (to 7552.41 and 9939.28, respectively), significant over-dispersion still exists, because these values divided by the degrees of freedom should be close to one. Moreover, the ratios of the deviance and Pearson chi-square statistics to their corresponding degrees of freedom are greater than one, indicating overdispersion in the data, and the NB regression model is preferred over the Poisson model. Figure 2a shows how well the model predicts the count values by overlaying the predicted probabilities for each under-five death category on the frequency histogram of the actual under-five mortality data. It appears that the typical regression model under-predicts the 0-5 under-five mortality categories and over-predicts all the other categories. The plots of the predicted probability of each model against the observed probability of the outcome show that the Poisson and NB models under-estimated the zero counts.
The zero-inflated models captured almost all zero values. Based on predicted probabilities, the differences in model fit between the six models were remarkable. The Poisson and NB models do not fit the data reasonably well because of the high number of zero counts. The Poisson model predicted about 68% zeros and the NB model about 70%, whereas the NB hurdle, ZIP, and ZINB models predicted about 71.09%, matching the observed proportion of zeros (Fig. 2b, c).
Table 3 summarizes the model comparisons based on Vuong's statistic for the six regression models explored. The ranking of the models is as follows: Poisson < negative binomial < ZINB = ZIP = NBLH = Poisson-logit hurdle (PLH). According to [32], if the corresponding p-value is larger than a pre-specified significance level such as 0.05, one can conclude that the two models fit the data equally well, with no preference given to either model. If |V| yields a p-value smaller than the 0.05 threshold, then one of the models is better. Therefore, the ZINB, ZIP, NBLH, and PLH models were chosen as the best models.
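The dispersion check described above divides the Pearson chi-square statistic by the residual degrees of freedom. The sketch below shows the arithmetic for a Poisson model, where the variance function equals the mean; the counts are made-up values and the fitted means come from an intercept-only model, so the numbers are illustrative and unrelated to Table 2.

```python
def pearson_dispersion(y, mu, n_params):
    """Pearson chi-square for a Poisson fit divided by residual degrees of freedom."""
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi2 / (len(y) - n_params)

# Hypothetical counts with many zeros; intercept-only fit, so mu is the sample mean.
counts = [0, 0, 0, 0, 0, 0, 0, 1, 2, 5]
fitted_mean = sum(counts) / len(counts)
ratio = pearson_dispersion(counts, [fitted_mean] * len(counts), n_params=1)
```

A ratio near one is consistent with the Poisson assumption, while a ratio clearly above one, as in this toy example, points to over-dispersion.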
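The Vuong comparison is based on the per-observation differences in log-likelihood between two fitted models. The following is a minimal sketch of the uncorrected statistic, using the sample standard deviation of the differences and a two-sided normal p-value; the log-likelihood contributions are made-up numbers, and any correction for the number of parameters is left out.

```python
import math
import statistics

def vuong_statistic(loglik1, loglik2):
    """Uncorrected Vuong statistic and two-sided normal p-value.

    loglik1 and loglik2 hold the per-observation log-likelihood
    contributions of model 1 and model 2 for the same observations.
    """
    diffs = [a - b for a, b in zip(loglik1, loglik2)]
    v = math.sqrt(len(diffs)) * statistics.mean(diffs) / statistics.stdev(diffs)
    p_value = math.erfc(abs(v) / math.sqrt(2.0))
    return v, p_value

# Hypothetical contributions for three observations.
v_stat, p_val = vuong_statistic([-1.0, -1.0, -1.0], [-2.0, -3.0, -4.0])
```

A large positive value favors the first model, a large negative value favors the second, and a p-value above the chosen threshold means neither model is preferred.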

Model selection criteria
The AIC values for the Poisson (PR), negative binomial (NB), ZIP, ZINB, PLH, and negative binomial-logit hurdle (NBLH) models are given in Table 4. The AIC obtained from the PR model was greater than that obtained from the other regression models. The model with the smallest AIC was the NBLH.
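The AIC trades off fit against complexity as AIC = 2k − 2 log L, where k is the number of estimated parameters and log L is the maximized log-likelihood. The sketch below applies this to hypothetical log-likelihoods and parameter counts; these numbers are placeholders and are not the values in Table 4.

```python
def aic(log_lik, n_params):
    """Akaike's information criterion: smaller values indicate a better trade-off."""
    return 2 * n_params - 2 * log_lik

# Hypothetical (log-likelihood, number of parameters) pairs, not study results.
fits = {
    "Poisson": (-1050.0, 5),
    "NB": (-1010.0, 6),
    "NB-logit hurdle": (-980.0, 11),
}
aic_values = {name: aic(ll, k) for name, (ll, k) in fits.items()}
best_model = min(aic_values, key=aic_values.get)
```

In this made-up example the hurdle model has the most parameters but still obtains the smallest AIC because its gain in log-likelihood outweighs the penalty.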

Interpretation of Count Model coefficients (truncated negative binomial with log link)
According to the findings of this study, the household wealth index has a significant influence on the number of under-five deaths. The expected number of non-zero under-five deaths for women in rich households was 0.84 times that for women in poor households. Mother's age had a significant positive association with under-five mortality. The expected number of non-zero under-five deaths for mothers aged 30-39 was 26.1% higher than for mothers aged 29 or younger, controlling for the other variables in the model. Likewise, the expected number of non-zero under-five deaths for mothers aged 40-49 was 91.2% higher than for mothers aged 29 or younger, controlling for the other variables in the model.
The results also revealed that the expected number of non-zero under-five deaths for mothers who visited a health institution during pregnancy was 0.915 times that for mothers who did not receive any antenatal care. The mother's level of education was also a significant factor in the number of under-five deaths. The expected number of non-zero under-five deaths for mothers with primary and secondary education was 0.704 times that for mothers with no education. Contraceptive use was one of the important significant predictors of under-five mortality. The expected number of non-zero under-five deaths for mothers who used contraceptives was 0.731 times that for mothers who did not. The results also showed that the expected number of non-zero under-five deaths for multiple births was 1.571 times that for single births. Regarding the age of mothers at first birth, the expected number of non-zero under-five deaths for mothers aged 26 years or older was 37.9% lower than for mothers aged less than 15 years. Besides, the expected number of non-zero under-five deaths for mothers aged 16-25 years was 16% lower than for mothers aged less than 15 years (see Table 5).
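Because the count part uses a log link, each reported ratio is the exponential of a regression coefficient, and a ratio can be restated as a percentage change via (ratio − 1) × 100. The short sketch below performs this conversion for two of the ratios quoted above (0.704 for primary and secondary education and 1.571 for multiple births); the function names are introduced for illustration.

```python
import math

def ratio_from_coef(coef):
    """Rate ratio implied by a log-link coefficient."""
    return math.exp(coef)

def percent_change(ratio):
    """Percentage change in the expected count implied by a rate ratio."""
    return (ratio - 1.0) * 100.0

education_change = percent_change(0.704)    # about -29.6, i.e. 29.6% lower
multiple_birth_change = percent_change(1.571)  # about +57.1, i.e. 57.1% higher
education_coef = math.log(0.704)            # coefficient on the log scale
```

The same conversion applies to the odds ratios of the zero hurdle part, except that the quantity being scaled is the odds of a non-zero count rather than the expected count.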

Interpretation of Zero hurdle model coefficients (binomial with logit link)
The zero hurdle model indicated that the estimated odds of a non-zero number of under-five deaths for women living in rural areas were 1.248 times those for women living in urban areas. In addition, under-five mortality increases with birth order: the estimated odds of a non-zero number of under-five deaths for birth orders 2-4 and 5+ were 3.106 and 1.024 times those for first-order births, respectively. According to the findings of this study, the household wealth index has a significant influence on the number of under-five deaths. The estimated odds of a non-zero number of under-five deaths for women in rich households were 0.897 times the estimated odds for women in poor households, holding all other variables in the model constant.
The findings also showed that the estimated odds of a non-zero number of under-five deaths for mothers who used contraceptives were about 0.761 times those for mothers who did not. The probability of under-five death also decreased with increasing educational level of the mother. The estimated odds of a non-zero number of under-five deaths for mothers with primary and secondary education were 16.7% lower than for mothers with no education.
Finally, the results revealed that type of birth and age at first birth were significant factors for the likelihood of under-five death. The estimated odds of a non-zero number of under-five deaths for children born in multiple births were 2.496 times those for children born in single births. The odds of a non-zero number of under-five deaths among children whose mothers' age at first birth was greater than 26 and 16-25 years were 46.9% and 30% lower, respectively, than for children whose mothers' age at first birth was less than 16 years. The estimated odds of a non-zero number of under-five deaths for mothers aged 30-39 and 40-49 were 1.25 and 2.233 times those for mothers aged ≤ 29, respectively.

Results of Bayesian Negative Binomial-Logit Hurdle Model
For the Bayesian approach, convergence must be assessed, which involves checking that the chain has converged to, and provides a representative sample from, the posterior distribution. Table 6 shows the Heidelberger and Welch stationarity tests for the Bayesian MCMC.
Time-series plot: this is one of the tools used to diagnose the convergence of a Bayesian analysis. The time-series plot indicates good convergence when the three independently generated chains mix or overlap (Appendix: Fig. 3a and b). Here, the diagnostic graphs indicate that the simulation draws have reasonably converged, and therefore we can be more confident about the accuracy of posterior inference.

Interpretation of Bayesian Count Model coefficients (truncated negative binomial with log link)
The analysis identified the variables with the greatest effect on the number of under-five child deaths. For women's age at first birth, the estimated coefficients of the age groups are statistically significant for the number of under-five deaths. The findings also revealed that the mother's level of education was a significant factor in reducing the number of under-five deaths. The expected number of under-five deaths for women with primary and secondary education was 31.96% lower than for those with no education, controlling for the other variables in the model. Similarly, the household wealth index had a significant influence on reducing the number of under-five deaths. The expected number of under-five deaths for women in rich households was 11.57% lower than for women in poor households, holding all other variables in the model constant.
The findings also revealed that type of birth had a statistically significant impact on the number of under-five deaths. The expected number of under-five deaths for multiple births was higher by a factor of 1.556 than for single births, holding all other variables in the model constant. Regarding the age of mothers at first birth, the expected number of non-zero under-five deaths for mothers aged 16-25 and ≥ 26 years was 16.64% and 29.25% lower, respectively, than for mothers aged ≤ 15 years. The results also revealed that the expected number of non-zero under-five deaths for mothers who visited a health institution during pregnancy was 0.831 times that for mothers who did not receive any antenatal care. In addition, under-five mortality increases with birth order: the expected number of non-zero under-five deaths for birth orders 2-4 and 5+ is 35.52 and 113.86 times more than for first-order births, respectively.
Contraceptive use was one of the important significant predictors of under-five mortality. The expected number of non-zero under-five deaths for mothers who used contraceptives was about 0.733 times that for mothers who did not.

Interpretation of Bayesian Zero hurdle model coefficients
This implies that convergence and accuracy of the posterior estimates were attained and that the model was appropriate for estimating posterior statistics. Based on the results with the non-informative prior given in Table 3 and considering the credible intervals, the following variables were significant predictors of under-five child death: mother's age, education level, birth order number, type of birth, age of respondent at first birth, current contraceptive use, and number of antenatal visits during pregnancy. From the Bayesian zero hurdle model, we found that the number of non-zero under-five deaths for mothers who visited a health institution during pregnancy was 0.831 times lower than for mothers who did not receive any antenatal care (OR = 0.762; HPD CrI 0.690, 0.842). Regarding the effect of maternal education on under-five child mortality, we found that women with higher education and with primary and secondary education were 0.810 and 0.317 times less likely, respectively, to experience a non-zero number of under-five child deaths than women with no education. Thus, as the level of education increases, the odds of a non-zero number of under-five deaths decrease.
Regarding the effect of the age of the respondents at first birth on child mortality, mothers aged 16-25 were 0.744 times less likely to experience a non-zero number of under-five child deaths than mothers aged ≤ 15 years. Besides, the estimated odds of a non-zero number of under-five deaths for mothers aged ≥ 26 years were 27.02% lower than for mothers aged ≤ 15 years. Mothers who used contraceptives had lower odds (OR = 0.756; HPD CrI 0.670, 0.851) of a non-zero number of under-five child deaths than mothers who did not. The estimated odds of a non-zero number of under-five deaths for children born in multiple births were 2.404 times those for children born in single births. The estimated odds of under-five deaths for women aged 30-39 years decreased by 89.6% compared with the age group ≤ 29. Similarly, the estimated odds of under-five deaths for women aged 40-49 years decreased by 20.07% compared with the age group ≤ 29. The estimated odds of a non-zero number of under-five deaths for birth orders 2-4 and 5+ were 4.031 and 16.151 times those for first-order births, respectively (Table 7).

Discussion
In this study, we found that the Poisson and NB models are insufficient in the presence of excess zero counts. A previous study of the performance of count models for health data concluded that the ZINB model fitted best for overdispersed and zero-inflated response variables. However, in the presence of overdispersion and excess zeros, the NBLH model fitted these data, which are characterized by excess zeros and high variability in the non-zero outcomes, better than any other model. Because the NBLH allows for over-dispersion and also accommodates excess zeros, it is the most appropriate among the zero-adjusted models, and NBLH was therefore selected as the best parsimonious model to predict the number of under-five deaths in Ethiopia [33][34][35].
The results showed that there was a significant difference between the two approaches. The comparison between the two approaches, made to better understand the determinants of the number of under-five deaths, highlights lower standard errors of the estimated coefficients in the Bayesian negative binomial-logit hurdle model. Thus, the Bayesian negative binomial-logit hurdle model is more stable. On the other hand, the results from the Bayesian negative binomial-logit hurdle model and the NBLH are difficult to compare because the two utilize different tools for decision-making. Moreover, when both approaches produce similar results, findings from the Bayesian NBLH model are given preference because the technique is more robust and precise than the NBLH. Our results also give some support to previous findings [22,36].
Besides, the purpose of the priors was to reduce the variance of the model and thereby lead to a better model in the Bayesian approach. Based on the prior definition and the results of our analysis, we concluded that the Bayesian approach gives a better result. Findings from Bayesian and classical inference are not significantly different, which could be due to the covariates or the non-informative priors utilized in the model. Despite the similarities in their results, it was still difficult to compare the two approaches because classical inference uses confidence intervals for decisions while Bayesian inference uses credible intervals. Moreover, when both techniques produce similar results, findings from the Bayesian approach are given more attention because it is more robust than the classical approach. It was also possible to assess the convergence of models under the Bayesian approach, which could also make its results better than those of classical inference [37].
According to the results, the mother's education level was an important socio-economic predictor of the number of under-five child deaths: the mortality rate decreases as the mother's education level increases. The higher the level of education of the woman, the lower the risk of mortality. Educated mothers are better informed about factors such as antenatal care and family planning that lead to a reduction in child mortality. Similar results were obtained in previous studies [38][39][40].
Mother's age at first birth is negatively correlated with child mortality: the risk of child mortality decreases as the mother's age at first birth increases. The estimates also show that mothers who gave birth to their first child at a younger age face a higher child mortality risk, which is similar to previous studies conducted by different scholars in developing countries including Ethiopia, Nigeria, and Bangladesh [41][42][43][44][45].
The risk of under-five death associated with multiple births was very high relative to single births. This is similar to previous studies, which found birth type to be linked with under-five child death, with multiple births associated with a higher risk of child mortality [42]. Child deaths are more frequent among multiple births than among single births because children from multiple births have a lower weight due to competition for nutritional intake [46]. In the current study, under-five children, including infants, from multiple births had higher odds of mortality than those from singleton births. These findings indicate the importance of meticulous identification and investigation of high-risk pregnancies, including multiple pregnancies, during the prenatal period so that appropriate action can be taken.
The findings revealed that deaths of under-five children of mothers who used contraceptives were significantly fewer than deaths of children of mothers who did not use a contraceptive [47]. Birth order was another important factor positively associated with under-five child mortality: under-five child mortality increased as birth order increased. A birth order of five or higher (≥ 5) has been reported to be associated with significantly high childhood mortality, possibly due to less care, since the woman has more children to attend to [48,49]. Moreover, as birth order increases, the age of the mother also increases.
The results also revealed that the number of under-five deaths was lower for mothers who had antenatal visits during pregnancy than for mothers who did not receive any antenatal check. Hence, increased attendance at antenatal clinics reduced child mortality [31,35]. According to the results, under-five mortality risk is higher for children of poor mothers than for children of medium and rich mothers. In this study, the AIC statistic and the predictive probability curve indicated that the hurdle negative binomial model was the best model for the number of under-five deaths, with about 71.09% zero counts. Several studies have similarly reported that the hurdle negative binomial model was the best model for count outcomes [50].