
Computational methods for predicting the outcome of thoracic transplantation

Abstract

Cardiac disease and the death rates due to coronary heart failure and cardiomyopathy are increasing. Thoracic transplantation is now a widely accepted therapeutic option for end-stage cardiac failure, and the survival rate after transplantation is crucial. Survival prediction after heart transplantation is therefore an active area of research. Conventional statistical techniques are computationally expensive and do not provide reliable solutions. Survival prediction based on Artificial Neural Networks helps surgeons make precise decisions and predict the best outcomes. The proposed system implements a multi-layer perceptron algorithm, which shows good performance in survival prediction. We also implemented a Radial Basis Function Network model to validate the accuracy of the proposed model. For this study, data were collected from the United Network for Organ Sharing database, and the attributes relevant to thoracic transplantation survival prediction were extracted with suitable data mining techniques. The multi-layer perceptron model achieved an accuracy of 97.1%, evaluated with various performance measures. To assess the validity of the proposed model, we implemented the Radial Basis Function model and obtained an accuracy of 92.37%. We compared the accuracy of the proposed survival prediction models with existing systems and found that the proposed system gave higher accuracy than the 85.9% reported for the existing system. The outcome of the model will be an asset for lifesaving procedures in the medical field.

Introduction

Thoracic transplantation (TT) is a surgical procedure that can be considered as the only medical procedure suitable for end-stage cardiac disease. Thoracic transplants are performed when other medical treatments for cardiac problems have not worked, resulting in coronary failure. In adults, cardiac infarction can be generated by weakening of the cardiac muscle (cardiomyopathy), advanced heart failure, arrhythmia, inherited heart disease, coronary thrombosis, and heart valve diseases [1]. Doctors consider many factors when evaluating patients for transplant, including analysis of liver and kidney function tests to work out whether poor blood flow is hampering the vital functions of those organs [1]. The surgical outcome of TT is dependent on many aspects, such as severity of cardiac illness, age, and health condition for the pumping of blood [1]. The survival prediction of thoracic transplantation is an extensive area of research. In the clinical research, the survival prediction is done by implementing many mathematical and statistical probability models. In clinical outcomes, the survival prediction is done by using Wilcoxon rank sum and chi-square tests and subjective comparisons. In the computational methods, the outcome is predicted by using different machine learning techniques. The survival predictions are done by using different machine learning algorithms like linear regression, logistic regression, K-Nearest Neighbor, Support Vector Machine, and Naive Bayes. The proposed system implements accurate survival prediction by using Artificial Neural Networks (ANN).

ANNs are models designed to solve problems by mimicking the structure and activity of the nervous system [2]. Neural networks are built from simulated neurons that are linked together in a specific pattern to form networks [2]. A neural network resembles the human brain in two ways: it acquires knowledge through learning, and that knowledge is stored within the connection strengths known as synaptic weights.

Related research

In 2007, Davies et al. proposed a new method to evaluate early post-transplant thoracic survival in high-risk pediatric patients [3]. They collected the data from the United Network for Organ Sharing (UNOS) thoracic registry and implemented the survival prediction using a statistical probability method [3]. In 2009, Asil et al. introduced a method for predicting graft survival for heart–lung transplantation patients using different data mining methods [4]. They implemented the graft survival prediction using machine learning techniques, such as decision trees, logistic regression and multilayer perceptron, on a large feature-rich dataset [4]. The survival prediction obtained an accuracy of only 85.9% [4]. In 2012, Weiss et al. developed a donor risk index to predict short-term mortality in orthotopic thoracic transplantation [5, 6]. They used the UNOS STAR file to develop a donor risk score for orthotopic heart transplantation. They created and validated donor-based risk score elements such as ischemic time, age of donor, race mismatch, and the blood urea nitrogen to creatinine ratio, and implemented them using a logistic regression model [5, 6]. The study was a retrospective analysis of an administrative dataset and could not carry out scientific validation [5, 6]. In 2012, Chokshi et al. conducted a study on hepatic dysfunction associated with impaired clinical outcome after thoracic transplantation. They calculated the model for end-stage liver disease (MELD) score and a modified MELD in which albumin replaces one component, after transplantation [7]. The study used statistical computation methods and was also retrospective in nature [7]. Later, in 2016, Medved et al. introduced a study on selecting an ideal feature set to forecast thoracic transplantation outcome at one, five and ten years, implemented by logistic regression with greedy forward and backward search [8]. 
To predict survival, they made use of the Index for Mortality Prediction After Cardiac Transplantation (IMPACT) and the International Heart Transplant Survival Algorithm (IHTSA) [8]. However, IMPACT could not accurately predict survival at one, five and ten years [8]. In 2016, Ali Dag et al. predicted graft survival after cardiac transplantation by developing data analytical models [9]. The main aim of the research was to forecast the one, five, and nine year graft survival of patients undergoing thoracic transplantation surgery through a sequence of analytical models based on four classification algorithms (support vector machines, decision trees, logistic regression and artificial neural networks) [9]. The results showed that the logistic regression and neural network models yielded the higher performance in survival prediction, with accuracies of 82.4% and 81.9%, respectively [9]. In 2017, Raji et al. implemented survival prediction after liver transplantation with validation of attributes using machine learning techniques [10]. In the same year, Raji et al. introduced a multilayer perceptron for predicting five-year survival after liver transplantation with the help of follow-up data [11]. In 2021, Ayers et al. implemented a one-year survival prediction mechanism after heart transplantation using machine learning [12]. Using the UNOS database, that system evaluated whether modern machine learning techniques could improve risk prediction in orthotopic heart transplantation over prior risk models such as the Donor Risk Index (DRI), Risk Stratification Score (RSS), Index for Mortality Prediction After Cardiac Transplantation (IMPACT), and International Heart Transplant Survival Algorithm (IHTSA) [12]. 
Four algorithms, namely a deep neural network, logistic regression, AdaBoost and random forest, were combined into a final ensemble prognostic model. The ensemble model achieved an accuracy of 95%. The study had all the inherent limitations of a retrospective study [12]. In 2021, Kampaktsis et al. introduced another study on machine learning based contemporary thoracic transplantation survival prediction [13]. The study was based on the UNOS database. They took data from about 18,625 patients and studied one-year mortality after thoracic transplantation using a feature selection algorithm together with the training of five machine learning models [13]. They used AdaBoost, Logistic Regression, Decision Tree, Support Vector Machine and K-nearest neighbour models and compared the results with the Index for Mortality Prediction After Cardiac Transplantation (IMPACT). The predictive accuracy of the chosen machine learning algorithms was evaluated by computing the area under the curve (AUC) of the receiver operating characteristic (ROC) curve as well as sensitivity, specificity, and positive and negative predictive values. The study showed that machine learning based survival prediction after thoracic transplantation is more accurate than the previously published risk scores [13].

Various studies on forecasting survival after thoracic transplantation have shown a number of limitations. Lacking high-accuracy computational models and relevant datasets, researchers were unable to obtain accurate and precise survival predictions after thoracic transplantation. We used ANN models, namely the multi-layer perceptron and the Radial Basis Function Network, for survival prediction with the UNOS dataset and obtained a successful prediction.

Materials and methods

Description of dataset

The proposed system uses a dataset gathered from the UNOS database. UNOS is a non-profit scientific and educational organization. It operates the only Organ Procurement and Transplantation Network (OPTN) in the United States (US), a federal network overseen by the Health Resources and Services Administration, U.S. Department of Health and Human Services. The UNOS data file consists of pre- and post-transplant data for multiple organs. From this extensive data file, we obtained the files linked to thoracic transplantation, which consisted of 543 attributes and 148,509 records.

Thoracic transplantation survival prediction was examined by recognizing and studying the matching relationships between donor and recipient. Although we had a large dataset, only a small set of relevant data was used for the prognosis of graft survival. Survival was evaluated from the nominal variable GRF_STAT. Cases in which the graft failed after thoracic transplantation are recorded as GRF_STAT = N, and GRF_STAT = Y indicates success of the graft. The age of the patient is an important factor when he or she undergoes thoracic transplantation. The donor age, AGE_DON, and the recipient age, INIT_AGE, were represented numerically in years. ABO is the blood group of the recipient and ABO_MAT is the donor–recipient match level, which was represented numerically. The mean and standard deviation for ABO_MAT were 1.105 and 0.307 respectively. TOT_SERUM_ALBUM is the recipient total serum albumin, for which the normal range is 3.4 to 5.4 g/dL. The mean and standard deviation for TOT_SERUM_ALBUM were 3.919 and 0.517 respectively. ECMO_TRR gave the life support status of the recipient and CREAT_TRR was the recipient serum creatinine at the time of transplantation. Normal blood levels of creatinine range from approximately 0.6 to 1.2 mg/dL. The mean and standard deviation obtained for CREAT_TRR were 0.9 and 0.784 respectively. GTIME is the graft lifespan in days from transplant to failure, death or last follow-up, represented numerically. The gender of the recipient, GENDER, was represented as nominal. The attribute RESIST_INF recorded whether or not the patient was resistant to bacterial infection. Both best and baseline recent hemodynamic mean values were collected between 04/01/1994 and 10/25/1999, and later the value on 10/25/1999 was selected as the best value. 
The attribute IABP_TRR indicated whether the patient required life support during transplant. ACUTE_REJ_EPI was a numeric recipient attribute. The attributes TRTREJ1Y and PRAMR_CL2 were represented as nominal and numeric respectively. The ISCHITIME attribute gave the total ischemic time numerically in hours. The cause of graft failure was represented by the attribute GRF_FAIL_CAUSE. The DOPAMINE_DON_OLD and INOTROP_AGENTS attributes were nominal donor attributes. The HLA mismatch level at transplantation was represented by HLAMIS, a numeric attribute.

Table 1 lists the input parameters for donor, recipient and transplantation, along with their characteristics and composite variables. The dataset consisted of donor, recipient and transplantation attributes. To forecast short-term survival after thoracic transplantation, selecting the relevant input parameters is crucial. The total of 543 attributes includes clinical and nonclinical multi-organ data. As in every dataset, some attributes could be eliminated without the help of any data mining technique. Only the data relevant to thoracic transplantation were extracted and the rest were removed. The proposed study considered only the survival prediction of adult patients. Therefore the 9373 records of pediatric patients were removed. For survival prediction, we considered the attributes of thoracic patients at transplantation time. Finally, through different stages of data validation, 24 important attributes with 485 records were extracted, and one further attribute, GRF_STAT, was kept as the output attribute. The attributes regarding donor, recipient and transplantation are included in Table 1.

Table 1 Description of input parameters, characteristics with their composite variables

Table 2 shows the ranking of input attributes. The attributes were selected with InfoGainAttributeEval using the Ranker search method, which ranks the attributes according to their relevance. This yielded the 24 most relevant attributes, which are very useful for the short-term prediction of survival after thoracic transplantation. InfoGainAttributeEval assesses the value of an attribute by calculating the information gain with respect to the class.

Table 2 Ranking of input attributes
$$InfoGain\left(Class,Attribute\right)=H\left(Class\right)-H\left(Class|Attribute\right)$$
(1)

in which H is the information entropy. With this procedure, each attribute in our data file is assessed in the context of the output variable (the class). The Ranker search method ranks the attributes in the dataset in order to arrive at a short list of selected features. The dataset consisted of 24 clinical input attributes that help to achieve survival prediction with increased accuracy. The ages of the recipient and donor are given by AGE and AGE_DON, and their genders by GENDER and GENDER_DON respectively. ABO_MAT represents the donor and recipient match level. ECMO_TRR, TOT_SERUM_ALBUM, RESIST_INF, CREAT_TRR (the recipient’s serum creatinine at the time of transplantation), HEMO_PA_MN_TRR, IABP_TRR, ACUTE_REJ_EPI, GTIME, TRTREJ1Y, PRAMR_CL2, AGE, ISCHITIME and GRF_FAIL_CAUSE are all clinical parameters of the recipient, which include both numerical and nominal data types. AGE_DON is the clinical numerical parameter of the donor. DOPAMINE_DON_OLD, INOTROP_AGENTS, GENDER_DON, and ABO_DON are clinical parameters of the donor that are represented as nominal attributes. HLAMIS and ABO_MAT are the transplantation data, which are represented as numeric data. GRF_STAT is taken as the output class and is not included in the input attribute description table. According to InfoGainAttributeEval, GENDER holds the highest ranking in the dataset, ABO the next highest, TOT_SERUM_ALBUM follows, and so on. INOTROP_AGENTS shows the lowest ranking, but it is still a relevant attribute for predicting survival in thoracic transplantation.
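The information gain of Eq. 1 can be illustrated with a minimal sketch. It assumes nominal attribute values only (numeric attributes would first need to be discretized), and the toy data and the names `entropy`, `info_gain`, `perfect` and `useless` are hypothetical, not taken from the UNOS dataset or from the tool used in the study.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def info_gain(attribute_values, labels):
    """InfoGain(Class, Attribute) = H(Class) - H(Class | Attribute)."""
    total = len(labels)
    # Group the class labels by the attribute value they occur with.
    groups = {}
    for value, label in zip(attribute_values, labels):
        groups.setdefault(value, []).append(label)
    # Conditional entropy: entropy of each group weighted by its share.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

# Hypothetical toy data: four records with a Y/N output class.
grf_stat = ["Y", "Y", "N", "N"]
perfect = ["a", "a", "b", "b"]   # fully determines the class
useless = ["a", "b", "a", "b"]   # carries no information about the class

# Rank attributes by information gain, highest first.
scores = {"perfect": info_gain(perfect, grf_stat),
          "useless": info_gain(useless, grf_stat)}
ranking = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

An attribute that splits the records into pure groups receives the full class entropy as its gain, while an attribute independent of the class receives zero, which is why a ranking by gain places the more informative attributes first.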

Model selection

The model selected for short-term Thoracic transplantation survival prediction was the Artificial Neural Network (ANN). The proposed study consisted of multi-layer perceptron ANN Model and Radial Basis Function ANN model.

Artificial neural network

The first step towards ANNs was the modeling of a simple neural network with electrical circuits [2]. ANNs, or connectionist systems, are computing systems that are inspired by, but not identical to, the biological neural networks that constitute animal brains [2]. Such systems learn to perform tasks by considering examples, generally without being programmed with task-specific rules. Fig. 1 depicts the basic structure of an ANN. An ANN is a nonlinear data processing model based on the neural configuration of the brain that is capable of learning tasks such as classification, prediction, decision-making, visualization, and others just by considering examples. The input layer contains input neurons that transmit information to the hidden layer [2]. The hidden layer transmits data to the output layer. Every neuron has weighted inputs (synapses), an activation function that defines the output for a given input, and one output.

Fig. 1

Basic Structure of an artificial neural network, ANN

Synapses are adjustable parameters that turn a neural network into a parameterized system [2]. The weighted sum of the inputs produces the activation signal, which is passed to the activation function to yield the single output of the neuron [2].

Multilayer perceptron ANN

The MLP model is a class of feedforward ANNs [14]. The term MLP is used ambiguously, sometimes loosely to refer to any feedforward ANN and sometimes strictly to refer to networks composed of multiple layers of perceptrons. An MLP is a perceptron with added complexity through the introduction of layers [14]. An MLP has three types of layers: an input layer, hidden layers and an output layer [14].

An MLP consists of more than one layer of neurons. In a simple three-layer perceptron, the first layer is the input layer, the last is the output layer, and the middle layer is called the hidden layer. We feed the input data into the input layer and take the result from the output layer [14]. The number of hidden layers can be increased as much as needed to make the model more complex according to the task. The feedforward network is the most typical neural network model [14]. The goal of the model is to approximate some function f(). A classifier maps an input k to an output class c as follows,

$$c=f\left(k\right)$$
(2)

The MLP finds the best approximation to that classifier by defining a mapping,

$$c=f\left(k,\theta \right)$$
(3)

The best parameters θ for the classifier need to be learned. An MLP neural network is composed of several functions chained together. A network with three layers forms,

$$f\left(k\right)={f}^{\left(3\right)}\left({f}^{\left(2\right)}\left({f}^{\left(1\right)}\left(k\right)\right)\right)$$
(4)

Each of the three layers consists of units that apply an affine transformation to a linear sum of inputs [12]. Each layer is represented as,

$$c=f\left({W}^{T}k+b\right)$$
(5)

in which f is the activation function, W is the set of parameters or weights in the layer, k is the input vector, which is the output of the previous layer, and b is the bias vector. The layers of an MLP are fully connected, because each unit in a layer is connected to all the units in the preceding layer [14]. In a fully connected layer, the parameters of each unit are independent of the rest of the units in the layer, meaning each unit possesses a unique set of weights. Activation functions, also known as nonlinearities, describe the input-output relations in a nonlinear way. The MLP is trained by applying the back propagation technique [14]. Figure 2 depicts the basic structure of the multi-layer perceptron model. Technically, the back-propagation algorithm is a procedure for setting the weights in a multi-layer feed-forward neural network [14]. It requires a network structure made up of layers in which each layer is fully connected to the next layer [14]. The algorithm trains a neural network efficiently by means of a principle called the chain rule [14].
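The layer equation and the composition of layers described above can be sketched as a forward pass. This is a minimal illustration only: the sigmoid activation, the network size and all weight values are assumptions chosen for the example, not the configuration used in the study.

```python
import math

def sigmoid(z):
    """Logistic activation function (an assumed choice of f)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(W, b, k, f):
    """One fully connected layer: c = f(W^T k + b).

    W[i][j] is the weight from input i to unit j, so each unit j
    has its own column of weights and its own bias b[j].
    """
    return [f(sum(W[i][j] * k[i] for i in range(len(k))) + b[j])
            for j in range(len(b))]

def mlp_forward(layers, k):
    """Chain the layers: the output of one layer is the input of the next."""
    for W, b in layers:
        k = layer(W, b, k, sigmoid)
    return k

# Hypothetical weights: 2 inputs -> 2 hidden units -> 1 output unit.
hidden = ([[0.5, -0.5], [0.25, 0.75]], [0.0, 0.1])
output = ([[1.0], [-1.0]], [0.0])
c = mlp_forward([hidden, output], [1.0, 2.0])
```

With a sigmoid output unit, the single value in `c` lies between 0 and 1 and can be read as a score for the positive class.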

Fig. 2

Multilayer perceptron model

Back propagation

Back propagation is short for the “backward propagation of errors”. It is a standard procedure for training ANNs. This technique computes the gradient of a loss function with respect to all the weights in the network [15]. The mechanism of back propagation is depicted in Fig. 2. Back-propagation is the core of neural network training. It is the procedure of fine-tuning the weights of a neural network based on the error rate (that is, the loss) obtained in the previous stage (that is, iteration) [15]. Appropriate tuning of the weights ensures lower error rates, making the model more reliable by improving its generalization [15]. The algorithm trains a neural network efficiently by means of the chain rule. A back-propagation algorithm involves two passes, a forward pass and a backward pass [15]. In simple terms, after each forward pass through the network, back-propagation performs a backward pass while adjusting the model’s parameters (weights and biases). After each forward pass, the error is computed by comparing the predicted output with the expected output, and it is propagated backward from the output layer; this is repeated until the error becomes sufficiently small [15].
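The forward and backward passes can be sketched for a very small network. The example below is an illustrative assumption throughout: a 2-input, 2-hidden-unit, 1-output network with sigmoid activations, a squared-error loss, a learning rate of 0.5, and made-up initial weights and training example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(w_h, b_h, w_o, b_o, x, target, lr=0.5):
    """One forward pass and one backward pass; updates parameters in place.

    w_h[j][i]: weight from input i to hidden unit j; w_o[j]: weight from
    hidden unit j to the output. Returns the loss before the update.
    """
    # Forward pass: compute hidden activations and the network output.
    h = [sigmoid(sum(w_h[j][i] * x[i] for i in range(2)) + b_h[j])
         for j in range(2)]
    y = sigmoid(sum(w_o[j] * h[j] for j in range(2)) + b_o[0])
    loss = 0.5 * (y - target) ** 2

    # Backward pass: chain rule from the output back to the hidden layer.
    delta_o = (y - target) * y * (1 - y)
    delta_h = [delta_o * w_o[j] * h[j] * (1 - h[j]) for j in range(2)]

    # Gradient descent update of weights and biases.
    for j in range(2):
        w_o[j] -= lr * delta_o * h[j]
        b_h[j] -= lr * delta_h[j]
        for i in range(2):
            w_h[j][i] -= lr * delta_h[j] * x[i]
    b_o[0] -= lr * delta_o
    return loss

# Hypothetical initial parameters and a single training example.
w_h = [[0.1, -0.2], [0.3, 0.4]]
b_h = [0.0, 0.0]
w_o = [0.5, -0.5]
b_o = [0.0]
losses = [train_step(w_h, b_h, w_o, b_o, [1.0, 0.0], 1.0) for _ in range(200)]
```

The recorded `losses` show the error shrinking over repeated forward and backward passes; a real training run would iterate over many examples and stop on a validation criterion rather than a fixed step count.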

Radial basis function ANN

A Radial Basis Function (RBF) ANN is an artificial neural network that uses radial basis functions as activation functions [16]. The output of the network is a linear combination of RBFs of the inputs and neuron parameters [16]. An RBF network consists of three layers: an input layer, a hidden layer with a non-linear RBF activation function, and a linear output layer.

Figure 3 depicts the structure of an RBF network. One neuron in the input layer corresponds to each predictor variable [16]. Every neuron in the hidden layer consists of a radial basis function (for example, a Gaussian) centered on a point with the same dimensions as the predictor variables [16]. The output layer forms the network output as a weighted sum of the outputs from the hidden layer. RBF networks are conceptually similar to K-nearest neighbor models [16]. The basic idea is that the predicted target value of an item is likely to be about the same as that of other items that have close values of the predictor variables [16].

Fig. 3

Radial basis function model

$$f\left(x\right)={\sum }_{j=1}^{m}{w}_{j}{h}_{j}\left(x\right)$$
(6)
$${h}_{j}\left(x\right)=exp\left(-{\left(x-{c}_{j}\right)}^{2}/{r}_{j}^{2}\right)$$
(7)

In Eq. 7, h_j(x) is the Gaussian activation function of the j-th RBF unit, with parameters r_j, the radius, and c_j, the center (the average taken from the input space), defined individually at each RBF unit [16]. The learning process is based on adjusting the parameters of the network to reproduce a set of input–output patterns.
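Equations 6 and 7 can be sketched directly. The example assumes vector inputs, so the squared difference in Eq. 7 is taken as a squared Euclidean distance; the centers, radii and weights are hypothetical values chosen for illustration.

```python
import math

def gaussian_rbf(x, c, r):
    """h_j(x) = exp(-||x - c_j||^2 / r_j^2) for one hidden unit."""
    squared_distance = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return math.exp(-squared_distance / r ** 2)

def rbf_output(x, centers, radii, weights):
    """Network output: weighted sum of the hidden-unit activations."""
    return sum(w * gaussian_rbf(x, c, r)
               for w, c, r in zip(weights, centers, radii))

# Hypothetical network with two hidden units.
centers = [[0.0, 0.0], [10.0, 10.0]]
radii = [1.0, 1.0]
weights = [2.0, -3.0]

at_first_center = rbf_output([0.0, 0.0], centers, radii, weights)
```

An input that coincides with a center activates that unit fully and distant units almost not at all, so the output near a center is dominated by that unit's weight. This locality is what makes the model conceptually close to a nearest-neighbor method.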

Model implementation

The dataset for the study was collected from the UNOS database. It is a live multi-organ dataset consisting of 543 attributes and 148,509 records.

Figure 4 depicts the model implementation of the proposed study. We considered the thoracic transplantation data only. Data validation is an important task in neural networks. The data were validated, and the 25 attributes relevant to thoracic transplantation survival prediction, comprising 24 input attributes and one output attribute, were extracted. The multi-layer perceptron was then selected as the model for survival prediction. The 24 input attributes were given to the MLP model and to the RBF network model, and the classification results of the system were obtained. The MLP showed the higher survival prediction accuracy, 97.1%. The RBF network, implemented to validate the accuracy of the proposed system, achieved an accuracy of 92.37%.

Fig. 4

System design

Survival analysis based on MLP

The survival analysis of cardiac transplantation was calculated on the basis of the number of years in the UNOS dataset. The follow-up information in the dataset was used to perform the survival analysis. Separate tables were created for the survival prediction of different years, and those tables were linked using the identifier PT_CODE. For the survival analysis, the survival probabilities were calculated from the follow-up information, based on the number of cardiac patients alive at the start and the number of patients who died. The survival probability, represented as SP, is the difference between the number of patients alive at the start and the number who died, divided by the number alive at the start.

$${\text{SP}}=\frac{{\text{(initial number of living patients)}}-{\text{(number of patients who died)}}}{{\text{(initial number of living patients)}}}$$
(8)
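As a worked illustration of Eq. 8, the sketch below applies the formula to the counts reported in the survival analysis of this study (405 initial records, of which 14 patients had died after six months). The function name is hypothetical, and the resulting value is only an application of the formula, not a figure quoted from Table 3.

```python
def survival_probability(alive_at_start, died):
    """SP = (alive at start - died) / alive at start, as in Eq. 8."""
    if alive_at_start <= 0:
        raise ValueError("alive_at_start must be positive")
    return (alive_at_start - died) / alive_at_start

# 405 patients at the start, 14 deaths within the first six months.
sp_six_months = survival_probability(405, 14)   # 391 / 405
```

Here 391 of 405 patients remain, giving a six-month survival probability of about 0.965.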

The MLP model was trained on data consisting of 4023 records of cardiac patients covering 3 years. Most patients had multiple records. Multiple datasets were used to train the MLP model for cardiac survival analysis.

Performance

For the short-term prediction of survival in thoracic transplantation, we used the MLP ANN model, and to validate the accuracy of the model, we also trained the RBF model on the dataset.

Performance measures

To determine the best classifier and improve the accuracy of the model, tenfold cross-validation was used on the training set, and the training phase did not use the data from the test set. The accuracy of both NN models exceeded 90%: the accuracy of the MLP was the highest at 97.1%, followed by the RBF at 92.37%. For the performance assessment of the models, TP, FP, TN and FN denote true positive (the number of instances correctly predicted as positive), false positive (the number of instances incorrectly predicted as positive), true negative (the number of instances correctly predicted as negative) and false negative (the number of instances incorrectly predicted as negative) [17]. The evaluation consisted of two types: (1) evaluation with performance measures and (2) evaluation with performance error measures.

$$Accuracy=\frac{TP+TN}{TP+FP+TN+FN}$$
(9)
$$Precision=\frac{TP}{TP+FP}$$
(10)
$$Recall=\frac{TP}{TP+FN}$$
(11)
$$F1\text{-}Measure=\frac{2\times Precision\times Recall}{Precision+Recall}$$
(12)

in which the F1-Measure is defined as the harmonic mean of precision and recall, which reflects the overall performance [18]. In addition to the evaluation criteria above, we used the receiver operating characteristic (ROC) curve and the area under the curve (AUC) to evaluate the strengths and weaknesses of the classifier [18]. The ROC curve shows the trade-off between the true positive rate (TPR) and the false positive rate (FPR) [19]. The closer the ROC curve is to the top left corner of the graph, the better the model [19]. The closer the AUC is to 1, the better the selected model. In medical data, more weight is given to recall than to accuracy [19]. The higher the recall, the lower the chance that a patient who is at risk of disease is predicted to have no risk.
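The measures in Eqs. 9 to 12 and the AUC can be computed as in the following sketch. The accuracy denominator is taken as the total of all four counts. The confusion-matrix counts, labels and scores are hypothetical, and the AUC routine assumes that all scores are distinct, so it is an illustration rather than a general-purpose implementation.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

def roc_auc(labels, scores):
    """Area under the ROC curve by the trapezoidal rule.

    labels: 1 for positive, 0 for negative; scores: higher means more
    likely positive. Assumes distinct scores (ties are not handled).
    """
    pos = sum(labels)
    neg = len(labels) - pos
    tp = fp = 0
    points = [(0.0, 0.0)]                      # (FPR, TPR) pairs
    # Lower the threshold one instance at a time, highest score first.
    for _, label in sorted(zip(scores, labels), reverse=True):
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical counts and scores, for illustration only.
m = classification_metrics(tp=40, fp=10, tn=45, fn=5)
auc_value = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.6])
```

In this example recall (40/45) is higher than precision (40/50), which matches the preference expressed above for medical data: missing a patient at risk is treated as more costly than raising a false alarm.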

Performance error measures

Performance error measures include mean absolute error (MAE), root mean square error (RMSE), relative absolute error (RAE) and root relative squared error (RRSE) [19]. In statistics, MAE is a measure of the difference between two continuous variables. MAE can be expressed as the sum of two components, quantity disagreement and allocation disagreement. Quantity disagreement is the absolute value of the mean error [20]. The root mean square deviation (RMSD), or RMSE, is frequently used as a measure of the differences between the values predicted by a model or an estimator and the values actually observed [20]. The absolute error is the magnitude of the difference between the actual value and the approximation. The relative error is the absolute error divided by the magnitude of the exact value. The RRSE is relative to what the error would have been if a simple predictor had been used. Specifically, the simple predictor is just the mean of the true values. Hence, the relative squared error takes the total squared error and normalizes it by dividing by the total squared error of the simple predictor [20]. Taking the square root of the relative squared error reduces the error to the same dimensions as the quantity being predicted [20]. We can determine these measures by considering the AUC [21].
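The four error measures can be sketched as follows. The sketch takes the simple predictor to be the mean of the actual values, which is the usual convention and an assumption here; the actual and predicted values are hypothetical.

```python
import math

def error_measures(actual, predicted):
    """Return (MAE, RMSE, RAE, RRSE) for paired actual/predicted values."""
    n = len(actual)
    mean_actual = sum(actual) / n              # the simple predictor
    abs_err = [abs(p - a) for a, p in zip(actual, predicted)]
    sq_err = [(p - a) ** 2 for a, p in zip(actual, predicted)]

    mae = sum(abs_err) / n
    rmse = math.sqrt(sum(sq_err) / n)
    # Relative measures: normalize by the error of the simple predictor.
    rae = sum(abs_err) / sum(abs(a - mean_actual) for a in actual)
    rrse = math.sqrt(sum(sq_err) /
                     sum((a - mean_actual) ** 2 for a in actual))
    return mae, rmse, rae, rrse

# Hypothetical class-membership values and predicted probabilities.
actual = [1.0, 0.0, 1.0, 0.0]
predicted = [0.9, 0.1, 0.8, 0.2]
mae, rmse, rae, rrse = error_measures(actual, predicted)
```

RAE and RRSE are often reported as percentages, as in the results tables. A value of 100% means the model does no better than always predicting the mean, so smaller values indicate a better model.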

Results and discussion

The dataset used for the study consisted of 485 adult records and 25 attributes. The models used in the study were trained on 24 input attributes and produced the survival output through the output attribute. The input attributes, state of the recipient, complications in the transplantation, and quality of the graft are the factors considered for the post-transplantation outcome. Initially the study addressed one-year thoracic transplantation survival prediction, and it was then extended to a novel study of three-year survival prediction after thoracic transplantation based on the data collected from the database.

Data validation in survival prediction

In our study, 231 female recipients and 254 male recipients waiting for donors were included. The donor data included 184 females and 301 males. Four recipient blood groups (O, A, B and AB) were present in the dataset: 189 recipients belonged to blood group O, 208 to group A, 61 to group B and 27 to group AB.

The attribute TOT SERUM ALBUM had a minimum value of 0.8 and a maximum value of 6.9, with a mean of 3.919 and a standard deviation of 0.517. Thirty-one records were missing for this attribute. Regarding RESIST IN, 407 patients were not affected by bacterial infection, nine patients were affected, six records were marked as undefined, and three records were missing. For the recipient age, INIT AGE, the minimum and maximum ages were 17 and 68, respectively, which shows that the dataset includes only adult records. The mean value of INIT AGE was 51.192 and the standard deviation was 11.039. The minimum value of CREAT TRR was 0.4 and the maximum value was 17, with a mean of 0.9 and a standard deviation of 0.784. The minimum value of HEMO PA MN TRR was 8 and the maximum value was 99, with a mean of 27.664 and a standard deviation of 11.899.

Six donor blood groups (O, A1, A, B, A2 and AB) were found in the dataset: 227 donors belonged to blood group O, 50 to group A1, 132 to group A, 56 to group B, nine to group A2 and 11 to group AB. The attribute DOPAMINE DON OLD had two distinct values, with 60 "no" values and 425 "yes" values. The minimum age of the donors in the dataset was 18 and the maximum was 68. The minimum ischemic time was 0.9 and the maximum was 12, with a mean value of 4.651 and a standard deviation of 10.943.
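One simple validation step implied by the counts above is to confirm that the levels of each categorical attribute add up to the 485 records in the dataset. The sketch below does this for four attributes whose counts are quoted in the text. The dictionary keys and the helper name `check_totals` are illustrative and are not taken from the study.

```python
# Category counts as quoted in the text; attribute names are illustrative labels.
TOTAL_RECORDS = 485

reported_counts = {
    "recipient_blood_group": {"O": 189, "A": 208, "B": 61, "AB": 27},
    "donor_sex": {"female": 184, "male": 301},
    "donor_blood_group": {"O": 227, "A1": 50, "A": 132, "B": 56, "A2": 9, "AB": 11},
    "donor_dopamine": {"no": 60, "yes": 425},
}


def check_totals(counts, expected_total):
    """Return {attribute: (sum_of_counts, matches_expected_total)}."""
    result = {}
    for name, levels in counts.items():
        total = sum(levels.values())
        result[name] = (total, total == expected_total)
    return result


totals = check_totals(reported_counts, TOTAL_RECORDS)
for name, (total, ok) in totals.items():
    print(f"{name}: {total} ({'ok' if ok else 'mismatch'})")
```

Attributes with missing or undefined records, such as RESIST IN, would need those records counted as separate levels before a check of this kind is meaningful.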

Survival analysis with respect to survival probabilities

After classifying the records over the three-year follow-up, including the first six months, we obtained 391 records. Initially, the UNOS dataset included 405 records of cardiac patients.

Out of the 405 records, 14 patients died within the first six months. Of these 14 cardiac patients, six died after 1 month of cardiac transplantation, three after 2 months, one after 3 months, one after 4 months, two after 5 months and one after 6 months. In the survival analysis, we observed that 387 patients were alive after 1 year, 384 after 2 years and 378 after 3 years of cardiac transplantation. The survival analysis with respect to survival probabilities is listed in Table 3.

Table 3 Survival analysis with respect to survival probabilities
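As a rough illustration of how the counts in the text translate into survival probabilities, the sketch below divides the number of patients alive at each follow-up point by the initial 405 records. This is a crude proportion that ignores censoring. It is an assumption about the calculation and is not necessarily the method used to produce Table 3.

```python
INITIAL_RECORDS = 405  # records before follow-up, as stated in the text

# Deaths reported for each of the first six months after transplantation.
deaths_by_month = {1: 6, 2: 3, 3: 1, 4: 1, 5: 2, 6: 1}

# Patients reported alive at each follow-up point.
alive_at = {
    "6 months": INITIAL_RECORDS - sum(deaths_by_month.values()),
    "1 year": 387,
    "2 years": 384,
    "3 years": 378,
}


def survival_proportions(alive, n_initial):
    """Fraction of the initial cohort alive at each follow-up point."""
    return {point: n / n_initial for point, n in alive.items()}


proportions = survival_proportions(alive_at, INITIAL_RECORDS)
for point, p in proportions.items():
    print(f"{point}: {alive_at[point]} alive, proportion {p:.3f}")
```

The six-month figure computed here (405 minus 14 deaths) matches the 391 records quoted in the text.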

Performance evaluation of the proposed models

Table 4 lists the performance measures of the proposed classifiers, MLP and RBF. The accuracy of the proposed MLP was very high, at 97.1%, while the RBF model had an accuracy of 92.37%. The sensitivity and specificity of the proposed MLP were 0.966 and 0.984, respectively, and its precision was 0.972. The sensitivity and specificity of the RBF were 0.935 and 0.893, respectively, and its precision was 0.923. The recall and F-measure obtained from the MLP model were both 0.971. The TP and FP rates of the RBF were 0.924 and 0.129, respectively. Although training took less time for the RBF than for the MLP, the MLP achieved the higher accuracy.

Table 4 Performance measures of MLP and RBF
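The measures reported in Table 4 all derive from the confusion matrix of a binary classifier. The sketch below shows the standard definitions. The counts passed to the function are hypothetical and chosen only to illustrate the formulas, since the confusion matrices themselves are not reproduced in the text, and the text does not state whether its figures are per-class or averaged over classes.

```python
def classification_measures(tp, fp, tn, fn):
    """Standard binary-classification measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # also called recall or TP rate
    specificity = tn / (tn + fp)   # equals 1 - FP rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "fp_rate": 1 - specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }


# Hypothetical counts, not taken from the study.
measures = classification_measures(tp=90, fp=5, tn=95, fn=10)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```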

Table 5 depicts the performance error measures of the MLP and RBF models. From Table 5, it can be seen that the MAE of the MLP was only 0.0309 and its RMSE was 0.165. The RAE of the MLP was only 7.515% and its RRSE was 36.423%. The MAE of the RBF was 0.1149 and its RMSE was 0.2501. The RAE of the RBF was 27.944% and its RRSE was 55.198%.

Table 5 Performance error measures of MLP and RBF
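For readers unfamiliar with the four error measures, the sketch below implements their common definitions: MAE and RMSE are absolute errors, while RAE and RRSE express the error relative to a baseline that always predicts the mean of the actual values. The text does not state the exact formulas used, so these definitions are an assumption, and the sample values are made up for illustration.

```python
import math


def error_measures(actual, predicted):
    """MAE, RMSE, RAE (%) and RRSE (%) under the common definitions."""
    n = len(actual)
    mean_actual = sum(actual) / n
    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    # Baseline errors of a predictor that always outputs the mean.
    abs_base = sum(abs(a - mean_actual) for a in actual)
    sq_base = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "MAE": abs_err / n,
        "RMSE": math.sqrt(sq_err / n),
        "RAE_percent": 100 * abs_err / abs_base,
        "RRSE_percent": 100 * math.sqrt(sq_err / sq_base),
    }


# Made-up class labels (1 = survived) and predicted probabilities.
actual = [1, 0, 1, 1, 0, 0]
predicted = [0.9, 0.2, 0.8, 0.7, 0.1, 0.4]
errors = error_measures(actual, predicted)
for name, value in errors.items():
    print(f"{name}: {value:.4f}")
```

A relative error below 100% means the model does better than the mean-only baseline, which is why lower RAE and RRSE values indicate a better classifier.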

Analysis of results

The proposed MLP and RBF ANN models for short-term survival prediction of thoracic transplantation were implemented and evaluated in terms of performance measures and performance error measures. The survival prognosis was determined using these computed values, and the survival output was depicted using the ROC curve. The confusion matrix of the MLP model is more precise than that of the RBF model. The MLP classified 471 instances correctly and 14 incorrectly. The RBF classified 448 instances correctly and 37 incorrectly, out of the total 485 instances. The FP rate of the MLP was 0.086 and that of the RBF was 0.129. Thus, classification is more accurate for the MLP model than for the RBF.
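The reported accuracies follow directly from the counts of correctly and incorrectly classified instances. The short check below reproduces them, using only the counts quoted in the text. The function name is illustrative.

```python
def accuracy_percent(correct, incorrect):
    """Accuracy as a percentage of all classified instances."""
    return 100 * correct / (correct + incorrect)


mlp_accuracy = accuracy_percent(correct=471, incorrect=14)
rbf_accuracy = accuracy_percent(correct=448, incorrect=37)
print(f"MLP: {mlp_accuracy:.2f}%  RBF: {rbf_accuracy:.2f}%")
```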

Proposed model comparison with respect to performance measures

The performance comparison of the RBF and MLP ANN models is shown in Tables 4 and 5. The accuracy was higher for the MLP model than for the RBF: the MLP had an accuracy of 97.1%, while the RBF model had an accuracy of 92.37%. The TP rate was also higher for the MLP, at 0.971, compared with 0.924 for the RBF. To validate the accuracy of the proposed MLP classifier, we also implemented the RBF, and the comparison shows that the MLP with back-propagation predicted survival after thoracic transplantation more accurately than the RBF. From the comparison, the MLP had a higher AUC than the RBF, which indicates more accurate classification by the MLP model. Even though the RBF takes less time to build than the MLP, its accuracy is lower.

Studies have shown that models chosen for classification should have an AUC value above 0.5. Thus, the MLP and RBF models can both be selected for medical purposes, with a higher priority given to the MLP. Fig. 5 depicts the ROC curve of the proposed MLP classifier. The ROC area of the proposed MLP was 0.918. The ROC curve was graphed with sensitivity on the Y-axis and 1-specificity on the X-axis. The TP rate of the RBF model was 0.924 and its FP rate was 0.129. The ROC area of the RBF model was 0.95. The MAE was 0.0309 for the MLP and 0.1149 for the RBF. The RAE was 7.515% for the MLP and 27.944% for the RBF. The RRSE was 36.423% for the MLP and 55.198% for the RBF. The error rates were therefore higher for the RBF than for the MLP. Hence, the MLP with back-propagation predicted survival more accurately than the RBF.
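To make the AUC criterion concrete, the sketch below computes the area under the ROC curve through its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. An AUC of 0.5 corresponds to random ranking. The labels and scores are made up for illustration and are not outputs of the models in this study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # A correctly ordered pair counts 1, a tie counts 0.5.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = survived) and classifier scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

This pairwise form is quadratic in the number of records, which is acceptable for a dataset of a few hundred instances.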

Fig. 5 Comparison between classifiers MLP and RBF

Comparison of the proposed system with existing system

Figure 6 compares the proposed system with the existing system. The existing system that we considered predicts the graft survival of heart-lung transplant patients and comes from the research work of Oztekin et al. in 2009 [4]. That system aimed to forecast the integrated survival of heart-lung transplant patients. It used a different dataset from the proposed system and was implemented with MLP, decision tree and logistic regression models [4], of which the MLP produced the highest accuracy, 85.9%. Our proposed system also implemented an MLP, using a dataset collected from UNOS from which the relevant attributes for survival prediction were extracted. The system implemented the MLP with back-propagation and obtained a higher accuracy of 97.1%. In Fig. 6, we compare our classifier result with the existing MLP model, which used different attributes. The accuracy of the proposed MLP classifier was clearly higher than that of the existing system. Table 6 shows the performance comparison of the proposed system with the existing system. The sensitivity of the existing system was 0.847 and that of the proposed system was 0.966. The specificity of the existing system was 0.869 and that of the proposed system was 0.984. The accuracy was 97.1% and 85.9% for the proposed and existing systems, respectively [4]. Ayers et al. [12] obtained an accuracy of 95%, with sensitivity and specificity values of 0.896 and 0.912. In short, our proposed system for survival prediction of thoracic transplantation achieved a higher accuracy of 97.1%, which was obtained by applying the MLP with back-propagation to our relevant dataset. Hence, the proposed system predicted survival more accurately than the existing systems.

Fig. 6 Performance analysis of proposed system with existing system

Table 6 Performance comparison of proposed system with existing system
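The figures quoted in the text for the proposed and existing systems can be lined up to show the size of each difference. The sketch below does so using only values stated in the text. The dictionary layout and function name are illustrative. Because the two systems were evaluated on different datasets, the differences are descriptive only and are not a controlled comparison.

```python
# Values as quoted in the text; accuracy in percent, the rest as fractions.
reported = {
    "existing": {"accuracy": 85.9, "sensitivity": 0.847, "specificity": 0.869},
    "proposed": {"accuracy": 97.1, "sensitivity": 0.966, "specificity": 0.984},
}


def differences(proposed, existing):
    """Proposed minus existing for every measure, rounded to 3 decimals."""
    return {name: round(proposed[name] - existing[name], 3) for name in proposed}


gains = differences(reported["proposed"], reported["existing"])
for name, gain in gains.items():
    print(f"{name}: {reported['existing'][name]} -> "
          f"{reported['proposed'][name]} (difference {gain})")
```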

Conclusion

Thoracic transplantation, and survival after thoracic transplantation, is an active area of research. Donor scarcity is an important problem, and in such a scenario every organ allocation has to be accurate. Survival after thoracic transplantation carries a high risk. Hence, we proposed a computational model for survival prediction, for which we used a relevant live dataset. The data were validated, 25 relevant attributes for survival prediction were extracted, and the dataset was applied to the selected MLP model with back-propagation. The proposed system obtained a higher accuracy of 97.1%. To verify the accuracy of the proposed model, we implemented another computational model, the RBF, which yielded an accuracy of 92.37%, lower than that of the proposed model. Because this is a life-critical problem, and in order to demonstrate the accuracy of the model, we also compared the proposed system with an existing system. The existing system was based on an MLP model and was used to forecast the survival of heart-lung transplant patients with another dataset, producing an accuracy of only 85.9%. Through all of these comparisons, we conclude that our proposed model with our relevant dataset predicted survival in thoracic transplantation more accurately than the existing systems. The results will support doctors in undertaking lifesaving procedures for their patients.

Availability of data and materials

Not applicable.

Abbreviations

TT: Thoracic transplantation

ANN: Artificial neural networks

UNOS: United Network for Organ Sharing

IMPACT: Index for Mortality Prediction after Cardiac Transplantation

OPTN: Organ procurement and transplant network

MLP: Multi-layer perceptron

RBF: Radial basis function

ROC: Receiver operating characteristic

AUC: Area under curve

RMSE: Root mean square error

MAE: Mean absolute error

RRSE: Root relative square error

References

  1. Agrawal A, et al. Heart transplant outcome prediction using UNOS data. In: Proceedings of the KDD workshop on data mining for healthcare (DMH). 2013.

  2. Zupan J. Basics of artificial neural network. In: Leardi R, editor. Nature-inspired methods in chemometrics: genetic algorithms and artificial neural networks. Amsterdam: Elsevier; 2003. p. 199–229.

  3. Davies RR, et al. Predicting survival among high-risk pediatric cardiac transplant recipients: an analysis of the united network for organ sharing database. J Thor Cardiovasc Surg. 2008;135(1):147–55.

  4. Oztekin A, Dursun D, Zhenyu JK. Predicting the graft survival for heart–lung transplantation patients: an integrated data mining methodology. Int J Med Inform. 2009;78(12):e84–96.

  5. Weiss ES, et al. Development of a quantitative donor risk index to predict short-term mortality in orthotopic heart transplantation. J Heart Lung Transplant. 2012;31(3):266–73.

  6. Urban M, et al. Donor and recipient risk factor analysis of inferior post heart transplantation outcome in the era of durable mechanical assist devices. Clin Transplant. 2018;32(10):e13390.

  7. Chokshi A, et al. Hepatic dysfunction and survival after orthotopic heart transplantation: application of the MELD scoring system for outcome prediction. J Heart Lung Transplant. 2012;31(6):591–600.

  8. Medved D, Pierre N, Johan N. Selection of an optimal feature set to predict heart transplantation outcomes. In: 2016 38th annual international conference of the IEEE engineering in medicine and biology society (EMBC). New York: IEEE; 2016.

  9. Dag A, et al. Predicting heart transplantation outcomes through data analytics. Decis Support Syst. 2017;94:42–52.

  10. Raji CG, Vinod Chandra SS. Computer based prognosis model with dimensionality reduction and validation of attributes for prolonged survival prediction. Inform Med Unlocked. 2017;9:93–106.

  11. Raji CG, Vinod Chandra SS. Long-term forecasting the survival in liver transplantation using multilayer perceptron networks. IEEE Trans Syst Man Cybern Syst. 2017;47(8):2318–29.

  12. Ayers B, et al. Using machine learning to improve survival prediction after heart transplantation. J Card Surg. 2021;36(11):4113–20.

  13. Kampaktsis PN, et al. State-of-the-art machine learning algorithms for the prediction of outcomes after contemporary heart transplantation: results from the UNOS database. Clin Transplant. 2021;35(8):e14388.

  14. Mozolin M, Thill J-C, Lynn Usery E. Trip distribution forecasting with multilayer perceptron neural networks: a critical evaluation. Transp Res Part B Methodol. 2000;34(1):53–73.

  15. Siregar SP, Anjar W. Analysis of artificial neural network accuracy using back propagation algorithm in predicting process (forecasting). Int J Inform Syst Technol. 2017;1(1):34–42.

  16. Kelwade JP, Suresh SS. Prediction of heart abnormalities using particle swarm optimization in radial basis function neural network. In: 2016 international conference on automatic control and dynamic optimization techniques (ICACDOT). New York: IEEE; 2016.

  17. Min C, Yixue H, Kai H, Lu W, Lin W. Disease prediction by machine learning over big data from healthcare communities. IEEE Access. 2017. https://doi.org/10.1109/ACCESS.2017.2694446.

  18. Raji CG, Vinod Chandra SS. Artificial neural networks in prediction of patient survival after liver transplantation. J Health Med Inform. 2016;7:1.

  19. Raji CG, Vinod Chandra SS. Graft survival prediction in liver transplantation using artificial neural network models. J Comput Sci. 2016;16:72–8.

  20. Raji CG, Vinod Chandra SS. Predicting the survival of graft following liver transplantation using a nonlinear model. J Public Health. 2016;24(5):443–52.

  21. Su Z, Xinming O, Doina C. Predicting cyber risks through national vulnerability database. Inf Sec J A Global Perspect. 2015.

Acknowledgements

Not applicable.

Funding

Not applicable.

Author information

Contributions

This study was designed and compiled by RCG as the principal investigator. The development of the basic research questions, identification of the problems, selection of appropriate statistical models, data collection, data analysis, interpretation, and critical review of the paper were carried out by RCG and SAK. The overall editing of the work was supported by RCG and SAK. All authors read and approved the final manuscript.

Corresponding author

Correspondence to C. G. Raji.

Ethics declarations

Ethics approval and consent to participate

The data used for this study were based on Organ Procurement and Transplantation Network data as of 5th June 2015. This research work was supported in part by Health Resources and Services Administration contract 234-2005-370011C. The content is the responsibility of the authors alone and does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Raji, C.G., Safna, A.K. Computational methods for predicting the outcome of thoracic transplantation. J Big Data 9, 58 (2022). https://doi.org/10.1186/s40537-022-00609-z

  • DOI: https://doi.org/10.1186/s40537-022-00609-z

Keywords