Customer behaviour data offer valuable insights for prospective decisions. For instance, churn prediction models identify customers who are likely to stop using a service or product. This is of significant interest to product providers because a large number of churning customers not only erodes revenue but can also harm a company's reputation. According to Engel, Blackwell, and Mansard, consumer behaviour is the action and decision process of people who buy goods and services for their own consumption. Forecasting consumer behaviour is important because it helps a company to differentiate consumers from one another and thereby to define a target market. It also helps in retaining potential consumers, redesigning marketing programs, predicting market trends, creating competition, innovating on existing products and developing new ones, staying relevant to the market, and improving customer service.
Customer churn prediction is well researched in the business-to-customer (B2C) context but receives much less attention in business-to-business (B2B) contexts. The number of customers is often substantially lower in B2B businesses, but their transactional values are usually much higher. Each individual customer is therefore of great interest to a company, and the effect of attrition can be much greater. This supports the appropriateness of customer churn prediction in B2B domains. However, measures established for B2C systems usually cannot be transferred to B2B environments because of the multifaceted setups of the latter.
Customer churn occurs when customer expectations are not fulfilled. It is the loss of a retained customer to a competitor. In this study, a competitor is defined as a different brand, so a customer can count as churned even while remaining with the same company. According to Rohini and Devaki, analysing customer churn in large datasets for the purpose of customer retention is an open research problem in machine learning. They further explain that customer churn means the loss of customers who switch from one sector to another. When customer churn is misclassified by a clustering-based approach, the result can be large financial losses and even harm to the development of the organization.
Managing customer churn requires first identifying the customers who are likely to churn and then encouraging them to stay. The marketing cost of attracting new customers is three to five times higher than that of retaining existing ones, which makes customer retention an interesting topic for all businesses. For instance, insurance companies are particularly concerned with customer retention and satisfaction because the required basic insurance package is generally the same at every company. This creates a highly competitive and dynamic environment in which customers can readily switch between insurance companies. Most firms serve millions of customers, which makes it difficult to extract useful data on customer switching behaviour and to predict changes in customer retention. Amjad et al. suggested a hybrid data mining approach for predicting customer churn. Their three models were built in two stages: clustering followed by prediction. Customer information was filtered using the K-means algorithm, and a multilayer perceptron artificial neural network (MLP-ANN) was used for prediction. In addition to K-means clustering with MLP-ANN, their models applied self-organizing maps (SOM) with MLP-ANN to the data. The churn rates and precision values were compared with those of other state-of-the-art methods. Their work showed that the three hybrid models outperformed single, commonly used models.
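The two-stage idea behind such hybrid approaches (a clustering stage followed by a prediction stage) can be illustrated with a minimal sketch. It is not the method of Amjad et al.: the customer features, the data, and all function names are hypothetical, and a simple per-cluster churn rate is swapped in for the MLP-ANN so that the example stays self-contained.

```python
# Two-stage hybrid sketch: (1) cluster customers with k-means,
# (2) predict churn from the cluster a customer falls into.
# ASSUMPTIONS: toy data, hypothetical feature names, and a per-cluster
# churn rate standing in for the MLP-ANN prediction stage.

def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))

def kmeans(points, k, iters=20):
    """Plain k-means with a deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its members (keep it if empty).
        centroids = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids

def cluster_churn_rates(customers, centroids):
    """Fraction of churned customers observed in each cluster."""
    totals = [0] * len(centroids)
    churned = [0] * len(centroids)
    for features, label in customers:
        i = nearest(features, centroids)
        totals[i] += 1
        churned[i] += label
    return [c / t if t else 0.0 for c, t in zip(churned, totals)]

def predict_churn(features, centroids, rates, threshold=0.5):
    """Flag a customer as likely to churn if the cluster's churn rate is high."""
    return rates[nearest(features, centroids)] >= threshold

# Hypothetical customers: ((monthly_usage_hours, support_tickets), churned)
customers = [
    ((40.0, 0.0), 0), ((38.0, 1.0), 0), ((42.0, 0.0), 0), ((35.0, 1.0), 0),
    ((5.0, 4.0), 1), ((8.0, 5.0), 1), ((6.0, 3.0), 1), ((7.0, 4.0), 0),
]

centroids = kmeans([features for features, _ in customers], k=2)
rates = cluster_churn_rates(customers, centroids)
print(centroids, rates)
print(predict_churn((9.0, 4.0), centroids, rates))
```

In a fuller pipeline, the per-cluster rate would be replaced by a classifier trained on each cluster or on the cluster label as an additional input feature.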
In another study, Wenjie et al. proposed a clustering algorithm called the semantic-driven subtractive clustering method (SDSCM), implemented on a Hadoop MapReduce framework. The proposed model proved faster than other techniques and also suggested several marketing strategies, based on the clustering results, to secure profits. Furthermore, Fathian et al. compared single standard classifiers with ensemble classifiers for predicting customer churn. Their study built a total of 14 prediction models grouped into four categories: (1) basic classifiers (decision tree, k-nearest neighbour, SVM, and artificial neural network); (2) SOM combined with a basic classifier; (3) SOM combined with PCA-based feature reduction and a basic classifier; and (4) SOM combined with PCA-based feature reduction and bagging and boosting ensemble classifiers.
A dynamic, competitive environment is evident even in such a strictly regulated market. The government does not intervene in additional insurance policies, and this combination of regulated and unregulated products creates a competitive and dynamic environment. Customer churn in health insurance firms fell from 6.9% to 5.3% in 2018 owing to the stagnant price level of health insurance, but this still represents 1.2 million customers. The churn percentage in that year reflects the 2018 outflow, which consisted of switches in group insurance policies.
Arowolo et al. used the PCA feature extraction algorithm to obtain latent components that can help improve the classification of Anopheles gambiae mosquito data, applying SVM with polynomial and Gaussian kernels to the PCA-reduced data. The comparison showed that the SVM polynomial kernel, with 99.68%, outperformed the SVM Gaussian kernel. Olaolu et al. applied dimensionality reduction methods to obtain the minimal set of genes that contributes to the efficient performance of classification algorithms on microarray data. The study reported high accuracies and compared the performance of the dimension reduction techniques. The PLS-based method achieved notably higher accuracy than other dimension reduction methods such as PCA and one-way ANOVA.
Arowolo et al. [3, 6] combined feature extraction and selection into a generalized model to obtain an efficient and robust dimensional space. The study employed one-way ANOVA to obtain an optimal number of genes, and then applied partial least squares and PCA, independently, as feature extraction methods. Irrelevant and redundant attributes were thereby removed, yielding an accuracy of almost 98%, which improved on the state of the art.
Arowolo et al. [4, 5] used the PCA feature extraction algorithm to obtain latent components and assessed its classification performance with decision tree and KNN classifiers. The experiment was validated on an RNA-Seq dataset of the mosquito Anopheles gambiae. The findings indicated classification accuracies of 86.7% and 83.3%, respectively. Arowolo et al. [4, 5] also proposed a hybrid dimensionality reduction technique for extracting a pertinent subset of attributes from the data. The selected features were passed, independently, to PCA and independent component analysis (ICA), based on the class variants, to transform the chosen attributes into a lower dimension. The reduced malaria vector dataset was then used with SVM kernel classifiers to evaluate the classification performance. Arowolo et al. [3, 6] demonstrated the effectiveness of feature extraction and investigated the most efficient approach for improving microarray classification. The study applied PCA and PLS, the latter a supervised technique, to the dataset. The overall findings indicated that the PLS algorithm offers better performance than PCA, with an accuracy of 95.2%.
Few studies have examined how customer churn can be predicted in a health insurance company. This study therefore aims to forecast which customers will switch and to understand why they switch. Moreover, the study proposes a prediction model that is reliable and applicable for the marketing department, which can adopt it to predict customer churn in a B2B business. The model can help managers at health insurance companies to forecast consumer behaviour in order to identify potential customers, redesign marketing strategies, predict market trends, create competition, innovate on existing products and services and develop new ones, stay relevant to the market, and improve customer service. Most studies have investigated the business-to-customer relationship. This research helps to fill that gap in the literature by studying the use of knowledge extraction for predicting customer churn in a B2B environment. The study proposes a customer churn prediction model based on CRISP-DM and the decision tree method; data mining techniques are also taken into consideration in designing the model for health insurance companies. The data collection for predicting customer churn in health insurance is the novel aspect of this research.
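To make the decision tree method more concrete, the following minimal sketch shows how a single tree split can be chosen by minimising the weighted Gini impurity, which is one common splitting criterion. It is an illustration under stated assumptions only: the attributes, the data, and the function names are hypothetical, and the study's actual attributes and tree settings are not given in this section.

```python
# Choosing one decision tree split (a "decision stump") by Gini impurity.
# ASSUMPTIONS: hypothetical attributes and toy data; Gini is one common
# criterion and is not necessarily the one used in the study.

def gini(labels):
    """Binary Gini impurity, 1 - p^2 - (1 - p)^2, for 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)

def best_split(rows, labels):
    """Return (weighted_gini, feature_index, threshold) of the best split."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between consecutive values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

# Hypothetical attributes per customer:
# (years_as_customer, claims_last_year, premium_increase_pct)
rows = [
    (1, 0, 8.0), (2, 1, 7.5), (1, 2, 9.0), (3, 1, 2.0),
    (6, 1, 2.0), (8, 0, 1.5), (5, 3, 3.0), (7, 2, 2.5),
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = churned, 0 = retained

score, feature, threshold = best_split(rows, labels)
print(score, feature, threshold)
```

A full decision tree repeats this search recursively on the left and right subsets until a stopping rule, such as a maximum depth, is met. The attribute chosen at the top split also gives a first indication of which customer attributes matter most for churn.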
In addition, this prediction model can be used both within the marketing strategy of a health insurance company and in a general academic context, combining a research-based emphasis with a business problem-solving approach. The objective of this study is to investigate the use of knowledge extraction in predicting customer churn in insurance companies. In this regard, the following questions are addressed:
Question 1 What are the possibilities for creating highly accurate models for predicting the number of churning customers?
Question 2 Which customer behaviours and attributes are essential in predicting customer churn?
Question 3 Which techniques can be used for generating effective churn prediction models?
The remainder of this paper is structured as follows. The “Introduction” section provides an overview of related work in this area, and the “Methods” section elaborates the research method and the research context. The findings and discussion of the study are described in the “Churn prediction model generation” section, followed by the “Conclusion and recommendations” section.