Artificial intelligence models for prediction of monthly rainfall without climatic data for meteorological stations in Ethiopia

Global climate change is affecting water resources and other aspects of life in many countries. Rainfall is the most significant climate element affecting the livelihood and well-being of the majority of Ethiopians. Rainfall variability has a great impact on agricultural production, water supply, transportation, the environment, and urban planning. Because all agricultural activities and subsequent national crop production hinge on the amount and distribution of rainfall, accurate monthly and seasonal predictions of this rainfall are vital for agricultural planning. Rainfall prediction is also useful for governmental, non-governmental, and private agencies in making long-term decisions and planning in numerous areas such as farming, early warning of potential hazards, drought mitigation, disaster prevention, and insurance policy. Artificial Intelligence (AI) has been widely used in almost every area, and rainfall prediction is one of them. In this study, we attempt to investigate the use of AI-based models to predict monthly rainfall at 92 Ethiopian meteorological stations. The applicability of Artificial Neural Networks (ANNs) and Adaptive Neuro-Fuzzy Inference System (ANFIS) models in predicting long-term monthly precipitation was investigated using geographical and periodicity component (longitude, latitude, and altitude) data collected from 2011 to 2021. The experimental results reveal that the ANFIS model outperforms the ANN model in all assessment criteria across all testing stations. The Nash–Sutcliffe efficiency coefficients were 0.995 for ANFIS and 0.935 for ANN over testing stations.


Introduction
Global climate change is affecting water resources and several other aspects of life in many countries. Studies on climate change due to global warming have gained importance over the past few years [1,2]. Ref. [3] stated that global warming has recently attracted considerable attention from researchers, and that it may cause changes in rainfall patterns, a rise in sea level, and impacts on plants, wildlife, and humans. The magnitude of climatic variations, including temperature and rainfall, differs across parts of the world [4]. Consequently, some arid regions are expected to experience droughts while others may be affected by heavy rainfall [5]. As a result, rainfall prediction is important for early warning of potential hazards and for managing risks arising from climate variability and change [17]. Therefore, various methods have been developed to forecast rainfall efficiently and accurately. Among them, statistical models have been broadly used to predict rainfall [18].
The attempt to predict statistics of rainfall several months in advance needs the predictor's engagement with the theory of climate systems, consideration of trade-offs between physical-based dynamical methods and empirically grounded statistical methods, and selection of appropriate models that are generalizable and provide the best fit to recent observations [19].
AI has been widely used in almost every area, and weather prediction is one of them. Rainfall prediction is one of the most active research areas because extreme rainfall causes loss of life and property damage. Intense rainfall has numerous impacts on society and on our daily life, from cultivation to disaster measures [20]. Weather prediction methods based on ANNs and ANFIS have been investigated intensively in recent years [6]. Different studies indicate that models based on AI can be applied for the identification of nonlinear systems in various fields of engineering, and can be used for rainfall prediction [21,22]. Therefore, this study aims to apply ANN and ANFIS models to predict the monthly rainfall of meteorological stations in Ethiopia.
Weather predictions are identified as major areas requiring further progress in climate research and have thus been selected as one of the World Climate Research Program (WCRP) Grand Challenges [23]. Reliable predictions of climate variables are required on short and long time scales to reduce potential risks and damage that result from weather and climate extremes [24]. Precise and timely weather prediction is a major challenge for national meteorological agencies all over the world.
Weather prediction models are important for developing countries like Ethiopia, where most of the agriculture depends on rainfall. It is a major concern to identify any trends for weather parameters to deviate from their periodicity, which would affect the economy of the country. This fear has been aggravated due to the threat of global warming and the greenhouse effect. The impact of extreme weather phenomena on society is growing more and more costly, causing infrastructure damage, injury, and the loss of life. Therefore, there is a need for accurate weather forecasts today more than ever before, not only as a defense against hazardous weather but also in planning the day-to-day operations of private enterprises and governments, and by individuals to enhance their quality of life [25].
Rainfall prediction and early warning systems are the most important services for an agricultural country like Ethiopia [26]. Meteorological data is periodically gathered by the Ethiopian meteorology agency. However, due to the lack of appropriate data analysis tools, the available data cannot be practically used to alleviate the problems faced by planners, policymakers, and decision-makers. In Ethiopia, agriculture is the backbone of the economy. Irrigation facilities are still limited in the country, and most agriculture depends on rain [27]. A prolonged dry period or heavy rain affects both crop yield and the economy of the country, so reliable early rainfall prediction is crucial. Rainfall forecasting models have been applied in many sectors, such as agriculture [28] and water resources management [29]. Rainfall prediction involves a combination of statistical models, observation, and knowledge of trends and patterns. Using these methods, reasonably accurate forecasts can be made. The main aim of this study is to apply AI-based models for the prediction of monthly rainfall in Ethiopia. The contributions of this study are summarized as follows: 1. To develop models that predict the monthly rainfall of the study area using ANN and ANFIS. 2. To evaluate model performance against observed values using different statistical evaluation criteria, and to select the best-fit model.
This paper is organized as follows: related works are described in "Related works" Section. Our method of monthly rainfall prediction models is defined in "Methods and materials" Section. The experimental findings of the study are defined in "Results and discussions" Section. Finally, "Conclusion" Section contains the conclusion of the study.

Related works
Rainfall prediction is important in water resource engineering, management, and planning. There are difficulties in the accurate prediction of rainfall because of the complexity of physical processes, especially for long-term prediction. As a result, many efforts have been made to develop appropriate methods to predict rainfall, which can be classified into dynamical methods [30], statistical methods [31], soft computing methods [32], and numerical weather prediction methods [33].
Many researchers worldwide have attempted to accurately predict the spatial and temporal distribution of rainfall using various techniques such as simple linear regression and ANNs [18,34]. However, the accuracy of prediction obtained by some of these techniques could not achieve a satisfactory level because of the complex and nonlinear nature of rainfall.
Several studies have indicated that statistical methods are still inaccurate for predicting rainfall because weather data is non-linear [18]. However, in some cases, statistical methods are also able to produce good and accurate predictions. Along with the development of computing technology, many researchers are trying to make predictions using the ANN method in the field of hydrology.
In recent years, different researchers have been applying soft computing techniques such as ANNs, ANFIS, and Support Vector Machines (SVM) in different research areas [35,36]. Among numerous soft computing methods, ANNs are promising tools based on their ability to model nonlinear processes. The ANN algorithm is an inductive, data-driven approach that can model both linear and non-linear systems without the need to make pre-assumptions. It is the most popular approach for rainfall prediction [37].
Different researchers apply ANNs to generating short-term predictions of rainfall. ANNs can be easily adapted to provide spatial predictions, areal average precipitation, or any other precipitation-related parameters that might be useful for hydrologic forecasting [38]. ANNs have been applied for quantitative precipitation forecast, predicting monthly rainfall and temperature using geographical information of stations [3], for prediction of rainfall time series coupled with data preprocessing methods [39], and for flood forecasting by comparing the performance of ANNs with Auto Regressive Moving Average (ARMA) and nearest neighbor methods [40]. The results indicated that the use of ANNs provided a substantial enhancement in flood forecasting accuracy.
Several previous works have applied soft-computing approaches to overcome prediction difficulty, mainly based on neural computation approaches. These approaches have several advantages over global numerical models: they are much simpler and faster to train; they can be applied to data from a specific point of measurement (a specific area in a river basin, for example); and their performance is competitive compared to global techniques [32].
More recently, ANNs have been applied to model and forecast precipitation in Athens, Greece [41] and to forecast precipitation during the summer monsoon season in India using El Niño-Southern Oscillation (ENSO) indices [42], and a neural computation approach has been applied to the short-term forecasting of thunderstorm rainfall [43].
Data-driven modeling, which aims to apply AI techniques to extract the data patterns in historical variables to forecast future events, has proven to be a very popular and successful forecasting and prediction tool. Most recently, a massive development has been accomplished by several researchers in the field of hydrology; for instance, sediment transport modeling [44], water level [45], groundwater simulation [46], rainfall pattern analysis [47], and water irrigation prediction [48].
There are numerous categories of data-driven models, including ANNs and ANFISs. They are used for rainfall and temperature analyses, and these models may perform non-linear regression using various optimization techniques [5]. Data-driven models are simple to use and require less time and effort when compared to Global Circulation Models (GCMs) [49]. These models can efficiently address the non-linearity of systems due to their parallel architecture. ANN in particular is considered a modern technique to address signals in engineering fields and has also been used as a calculation tool to solve certain problems concerning water resources. Other types of data-driven models, such as fuzzy logic and genetic algorithms, cannot be used for long-term predictions due to their logical assumptions [5]. They can be used in a hybrid approach with ANN models, to optimize the weights and bias values during the iteration process. However, ANN and ANFIS are trained based on a database and have the ability to make long-term predictions.
The ANFIS model has a great ability to integrate the power of a fuzzy logic system with the numeric power of a neural system adaptive network in modeling numerous processes. As stated by [50], the advantage of fuzzy rule-base methods such as ANFIS is that they include all of the causes that are not included in the idealized model, whereas they exclude some of the causes that are taken into account in physically-based models [6].
Various identification methods, such as Grid Partitioning (GP) and Subtractive Clustering (SC), can be applied in the ANFIS model, and researchers have applied them for different purposes. For example, [51] compared ANFIS-GP, ANFIS-SC, and ANFIS with the Gustafson-Kessel Clustering (GKC) method for rainfall-discharge modeling; [52] introduced a hybrid model of ANFIS and wavelet transform for precipitation forecasting; [53] applied ANFIS-GP to investigate the influence of lag time on the event-based rainfall-runoff process; [35] compared the performance of ANFIS-GP and ANFIS-SC in streamflow prediction and found that ANFIS-SC has slightly better accuracy than ANFIS-GP in streamflow estimation; [54] applied ANFIS and Gene Expression Programming (GEP) with wavelet to forecast precipitation for two stations in Turkey; [55] applied ANNs and ANFIS-GP for spatial prediction of monthly air temperature using geographical inputs; [56] examined the performance of ARMA, ANNs, ANFIS, SVM, and genetic programming for forecasting monthly discharge time series, with the best performance achieved by ANFIS, SVM, and genetic programming during the training and validation periods; and [57] introduced a model that integrated SVM and a multi-objective genetic algorithm to predict hourly typhoon rainfall, which provided an accurate forecast of hourly rainfall and improved the long lead-time forecasts.
But many studies do not employ spatial modeling of long-term monthly rainfall predictions by ANNs and ANFIS, which uses geographical information of stations as an input. To the best knowledge of the authors, there is no published work in the literature that uses ANNs and ANFIS for predicting long-term monthly rainfall in the study area. This gave motivation to the present study. In this paper, the applicability of ANNs and ANFIS models is investigated for predicting long-term monthly rainfall using the geographical and periodicity components as input data.

Artificial neural networks
The ANN is an AI technique modeled on the human nervous system, in which the brain's nerve cells act as the basic units of information processing. In an ANN, the basic units of information processing (neurons) process information in parallel and immediately. Furthermore, there are many types of ANN training methods, including perceptron, backpropagation, Self-Organizing Map (SOM), and delta.
ANN, as the most general AI method, is the collection of some neurons with a specific structure formed based on the relationships between neurons in different layers [6]. A neural network is a computing system made up of several simple and highly interconnected nodes or processing elements called neurons. The goal of neural networks is to map a set of input patterns onto a corresponding set of output patterns. The neural networks achieve this mapping by first training the neurons to be suitable for a given series of patterns. Then, the neural network applies this model to a new input pattern to predict the appropriate output pattern [58].
There are many kinds of neural networks depending on their structure, function, or training method. In this study, multiple-layer feed-forward neural networks are applied for rainfall prediction using geographical information and a periodicity component. The structure to be considered here includes one input layer, a hidden layer, and an output layer. For each layer, some neurons are related by weighted connections. The number of neurons for the input and output layers is equal to the numbers of input and output variables, but the number of neurons in the hidden layer will be selected by a trial-and-error procedure.
The weights and biases of the connected neurons must be determined before the ANN model is applied. To do so, the model is trained using a dataset. The backpropagation method is used to train the networks, and among various training algorithms, Levenberg-Marquardt, gradient descent, gradient descent with adaptive learning rate, gradient descent with momentum, adaptive learning rate, and scaled conjugate gradient are used. For all training algorithms, the tangent sigmoid transfer function is used in the hidden layers and the linear (purelin) transfer function in the output layer.
A typical neural network propagates information in the feedforward direction using Eq. 1.
$$b_j = f\left(\sum_i w_{ij} a_i - T_j\right) \quad (1)$$
where $a_i$ is the input vector, $b_j$ is the output vector, $w_{ij}$ is a weight factor between two nodes, $T_j$ is the internal threshold, and $f$ is a transfer function.
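The feedforward computation described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the common form $b_j = f(\sum_i w_{ij} a_i - T_j)$, a tangent sigmoid in the hidden layer, and a linear output node. All numeric values, the layer sizes, and the use of a month index as the periodicity input are hypothetical and are not taken from the study.

```python
import math

def tansig(x):
    """Tangent sigmoid transfer function (hyperbolic tangent)."""
    return math.tanh(x)

def layer_forward(inputs, weights, thresholds, transfer):
    """Compute b_j = f(sum_i w_ij * a_i - T_j) for every node j.

    weights[i][j] is the weight from input node i to output node j.
    """
    outputs = []
    for j, t_j in enumerate(thresholds):
        net = sum(weights[i][j] * a_i for i, a_i in enumerate(inputs)) - t_j
        outputs.append(transfer(net))
    return outputs

# Hypothetical normalized inputs: longitude, latitude, altitude, month index.
a = [0.40, 0.25, 0.60, 0.50]
# Hypothetical weights and thresholds: 4 inputs -> 2 hidden nodes -> 1 output.
w_hidden = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2], [0.3, 0.1]]
t_hidden = [0.05, -0.10]
w_out = [[0.7], [-0.4]]
t_out = [0.0]

hidden = layer_forward(a, w_hidden, t_hidden, tansig)
rain = layer_forward(hidden, w_out, t_out, lambda x: x)  # linear output node
print(hidden, rain)
```

In practice the number of hidden nodes would be chosen by trial and error, as the text notes, and the weights would come from training rather than being fixed by hand.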
The backpropagation learning algorithm is based on a generalized delta-rule accelerated by a momentum term. To improve the performance of the neural network, both the weight factors and the internal threshold values are adjusted using Eqs. 2 and 3.
where $\eta$ is the learning rate, $\alpha$ is the momentum coefficient, $\Delta w$ is the previous weight factor change, $\Delta T$ is the previous threshold value change, $O$ is the output, $\delta$ is the gradient-descent correction term, and $p$ stands for the pattern.
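To make the role of the learning rate and the momentum coefficient concrete, the following sketch applies one update step to a single weight and a single threshold. It assumes a commonly used form of the generalized delta rule with momentum, in which the new change equals $\eta$ times the correction term plus $\alpha$ times the previous change. The exact form and sign convention of Eqs. 2 and 3 are not reproduced in this text, so the function and all numeric values here are illustrative assumptions.

```python
def momentum_update(value, gradient_term, prev_change, eta, alpha):
    """One generalized delta-rule step with momentum (assumed form).

    change = eta * gradient_term + alpha * prev_change
    Returns the updated value and the change, which is kept for the next step.
    """
    change = eta * gradient_term + alpha * prev_change
    return value + change, change

eta, alpha = 0.1, 0.9        # hypothetical learning rate and momentum coefficient
w, prev_dw = 0.50, 0.02      # current weight and its previous change
T, prev_dT = 0.10, -0.01     # current threshold and its previous change
delta, O = 0.20, 0.80        # correction term and output feeding the connection

# Weight: the correction term is scaled by the output on that connection.
w, prev_dw = momentum_update(w, delta * O, prev_dw, eta, alpha)
# Threshold: assumed here to use the correction term alone.
T, prev_dT = momentum_update(T, delta, prev_dT, eta, alpha)
print(w, T)
```

The momentum term reuses part of the previous change, which tends to smooth the trajectory of the weights when successive corrections point in similar directions.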
Despite its theoretical simplicity, the neural network model has excellent performance for a wide range of applications and has developed into a powerful and versatile tool in recent years [58]. The ANN method was selected for this study because it is the most popular data-driven method in hydrological applications.

Adaptive neuro-fuzzy inference system
ANFIS is an effective AI model that combines neural networks and fuzzy logic capabilities [6]. ANFIS uses a feed-forward network to search for fuzzy decision rules that perform well on a given problem. Given an input-output dataset, ANFIS creates a Fuzzy Inference System (FIS) for which Membership Function (MF) parameters are adjusted using either a backpropagation algorithm or a combination of a backpropagation algorithm and a least-squares method.
By using a first-order Takagi-Sugeno fuzzy model, Eqs. (4) and (5) present a typical rule set with two fuzzy if/then rules.
Rule 1: if $x$ is $A_1$ and $y$ is $B_1$ then $f_1 = p_1 x + q_1 y + r_1$ (4)
Rule 2: if $x$ is $A_2$ and $y$ is $B_2$ then $f_2 = p_2 x + q_2 y + r_2$ (5)
where $A_1$ (LOW), $A_2$ (LOW) and $B_1$ (HIGH), $B_2$ (MEDIUM) are the MFs for inputs $x$ (LAT) and $y$ (LON), respectively, and $p_1, q_1, r_1$ and $p_2, q_2, r_2$ are the parameters of the output function. The system consists of five layers. The relationship between the input and output of each layer is described as follows:
Layer 1: Every node $i$ in this layer is an adaptive node with a node output defined by
$$O_{1,i} = \mu_{A_i}(x), \quad i = 1, 2$$
where $x$ (LAT) is the input to the node and $A_i$ is a fuzzy set associated with this node, identified by the shape of the MF in this node, which can be any appropriate function that is continuous and piecewise differentiable, such as a Gaussian function. Supposing a Gaussian function as an MF, $\mu_{A_i}$ can be computed as
$$\mu_{A_i}(x) = \exp\left[-\frac{(x - c_i)^2}{2\sigma_i^2}\right]$$
where $\{c_i, \sigma_i\}$ is the parameter set referred to as the premise (antecedent) parameters.
Layer 2: Every node in this layer is a fixed node, which multiplies the incoming signals and outputs the product. For instance,
$$w_i = \mu_{A_i}(x)\,\mu_{B_i}(y), \quad i = 1, 2$$
Each node output describes the firing strength of a rule.
Layer 3: Every node in this layer computes the ratio of the $i$th rule's firing strength to the sum of all rules' firing strengths as follows:
$$\bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2$$
The outputs of this layer are referred to as normalized firing strengths.
Layer 4: Node $i$ in this layer calculates the contribution of the $i$th rule towards the model output as follows:
$$O_{4,i} = \bar{w}_i f_i = \bar{w}_i \left(p_i x + q_i y + r_i\right)$$
where $\bar{w}_i$ is the output of layer 3 and $\{p_i, q_i, r_i\}$ is the parameter set referred to as the consequent parameters.
Layer 5: This layer calculates the overall output as the summation of all incoming signals:
$$O_5 = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i}$$
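The five layers can be traced in a short Python sketch of a two-rule, first-order Takagi-Sugeno system. It assumes Gaussian MFs of the form $\exp[-(x - c)^2 / (2\sigma^2)]$; the premise and consequent parameter values and the two normalized inputs are hypothetical, chosen only to show the data flow, and training of the parameters is not shown.

```python
import math

def gaussian_mf(x, c, sigma):
    """Gaussian membership grade with center c and width sigma (assumed form)."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

def anfis_forward(x, y, premise, consequent):
    """Forward pass of a first-order Takagi-Sugeno ANFIS.

    premise:    one ((c_A, sigma_A), (c_B, sigma_B)) pair per rule
    consequent: one (p, q, r) triple per rule
    Returns the overall output and the normalized firing strengths.
    """
    # Layers 1 and 2: membership grades, then their product (firing strength).
    w = [gaussian_mf(x, *a) * gaussian_mf(y, *b) for a, b in premise]
    # Layer 3: normalize the firing strengths so they sum to 1.
    total = sum(w)
    w_bar = [w_i / total for w_i in w]
    # Layer 4: weight each rule's linear output f_i = p*x + q*y + r.
    contrib = [wb * (p * x + q * y + r)
               for wb, (p, q, r) in zip(w_bar, consequent)]
    # Layer 5: sum the contributions.
    return sum(contrib), w_bar

# Hypothetical parameters for two rules and normalized inputs (LAT, LON).
premise = [((0.2, 0.3), (0.8, 0.3)), ((0.6, 0.3), (0.4, 0.3))]
consequent = [(1.0, 0.5, 0.1), (-0.5, 2.0, 0.0)]
output, strengths = anfis_forward(0.3, 0.7, premise, consequent)
print(output, strengths)
```

Because the normalized firing strengths sum to 1, the overall output is a weighted average of the individual rule outputs and always lies between the smallest and largest of them.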
The ANFIS method was also selected in this study because it is commonly used in hydrological applications.

Data collection
In this paper, the applicability of ANNs and ANFIS models was investigated for predicting long-term monthly rainfall using the geographical and periodicity components (longitude, latitude, and altitude) as input data. The rainfall data from 92 meteorological stations within the study area (Ethiopia) (Fig. 1) were collected from the Climate Prediction Centre (CPC) and used for training, evaluating, and testing the performance of the models. The acquired data are global unified gauge-based rainfall data for 11 years (2011-2021). Sample geographical information for the stations used in this study is shown in Table 1.
The performance of the trained network is verified by determining the error between the predicted value and the real value. Before training the neural network, all the data points for the patterns are normalized to be less than 1. In the evaluation criteria, $n$ is the number of data points, $\bar{R}_o$ is the mean of observed monthly rainfall, and $R_p$ and $R_o$ denote the rainfall values generated by the models and the observed monthly rainfall values, respectively.
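The evaluation criteria named in this paper (mean absolute error, root mean square error, mean absolute percentage error, R-square, and the Nash-Sutcliffe efficiency) can be computed as in the sketch below. The exact formulas used in the study are not reproduced in this text, so the code assumes their standard definitions and takes R-square to be the squared Pearson correlation between observed and predicted values. The sample values are made up for illustration and are not results from the study.

```python
import math

def evaluate(observed, predicted):
    """Standard error metrics for paired observed and predicted rainfall."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    errors = [p - o for o, p in zip(observed, predicted)]
    sse = sum(e * e for e in errors)                     # sum of squared errors
    sst = sum((o - mean_obs) ** 2 for o in observed)     # spread of observations
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted))
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(sse / n),
        # MAPE is undefined when an observed value is zero.
        "mape": 100.0 / n * sum(abs(e / o) for e, o in zip(errors, observed)),
        "r2": cov ** 2 / (sst * var_pred),               # squared correlation
        "nse": 1.0 - sse / sst,                          # Nash-Sutcliffe efficiency
    }

# Made-up monthly rainfall values (mm), for illustration only.
metrics = evaluate([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 39.0])
print(metrics)
```

A Nash-Sutcliffe efficiency of 1 corresponds to predictions that match the observations exactly, while a value of 0 means the model is no better than always predicting the observed mean.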

Results and discussions
All experiments in this study were conducted on a computer running Windows 10 with a Core i7 processor and 16 GB of RAM. A grid search strategy was used to find the optimal hyperparameter values of both ANFIS and ANN. With the input parameter values indicated in Table 2, the ANFIS model produced a better predictive outcome. The hyperparameters for the ANN model were a dropout of 0.5, the sigmoid activation function, 100 epochs, the Adam optimizer, and a batch size of 16.
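A grid search of the kind mentioned above evaluates every combination of candidate hyperparameter values and keeps the one with the lowest validation loss. The sketch below shows the mechanics only. The candidate values are hypothetical because the search ranges of the study are not given in this text, and `toy_validation_loss` is a stand-in for training a model and measuring its validation loss, not a real training run.

```python
from itertools import product

def grid_search(grid, score):
    """Return the parameter combination with the lowest score."""
    names = list(grid)
    best_params, best_score = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(params)
        if current < best_score:
            best_params, best_score = params, current
    return best_params, best_score

# Hypothetical candidate values, for illustration only.
grid = {"dropout": [0.2, 0.5], "batch_size": [16, 32], "epochs": [50, 100]}

def toy_validation_loss(params):
    """Stand-in for 'train the model and return its validation loss'.

    The formula is arbitrary and only gives the example a known minimum.
    """
    return (abs(params["dropout"] - 0.5)
            + abs(params["batch_size"] - 16) / 100
            + abs(params["epochs"] - 100) / 1000)

best_params, best_score = grid_search(grid, toy_validation_loss)
print(best_params, best_score)
```

The cost of this strategy grows with the product of the number of candidate values per hyperparameter, so the grid is usually kept small.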

Model training and performance evaluation
We partitioned the dataset into three sections with an 80%, 10%, and 10% split ratio for training, validating, and testing the model, respectively. ANFIS and ANN training and validation losses using the dataset gathered from 92 weather stations in Ethiopia are shown in Figs. 2 and 3.
As demonstrated in Figs. 2 and 3, the ANFIS model learns the patterns of the input variables in order to predict rainfall. The ANFIS validation loss overlaps the training loss around the 3rd epoch, whereas the ANN validation loss overlaps the training loss around the 20th epoch. As a result, the ANFIS model learns the pattern in the training data faster than the ANN model. After training and validating the two models, we examined their performance on the testing dataset. The results are shown in Figs. 4 and 5 for ANFIS and ANN, respectively. The actual rainfall values in these figures have been normalized to reduce the impact of outliers on the learning process. Figure 4 shows how the rainfall value predicted by ANFIS relates to the actual rainfall value. The predicted and actual rainfall values are remarkably similar, indicating that the ANFIS predictions are highly accurate. In the majority of the months used for testing, the rainfall value predicted by ANN is lower than the actual monthly rainfall, as shown in Fig. 5.
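An 80/10/10 partition can be sketched as follows. The text does not say whether the records were shuffled or split by station, so this example makes the assumption that the stations are numbered 1 to 92 and split in ID order, with the part sizes rounded to whole stations. The function name and the rounding rule are illustrative choices, not the procedure of the study.

```python
def split_in_order(items, train_frac=0.8, val_frac=0.1):
    """Split a sequence into training, validation, and testing parts in order."""
    n = len(items)
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test

# Assumption: station IDs run from 1 to 92.
station_ids = list(range(1, 93))
train_ids, val_ids, test_ids = split_in_order(station_ids)
print(len(train_ids), len(val_ids), len(test_ids))
print(test_ids)
```

Under these assumptions the last part contains nine stations, IDs 84 to 92, which agrees with the nine testing stations named later in the paper.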
We also compare and contrast the two models' performance using other evaluation metrics like mean absolute error, R-square, root mean square error, and mean absolute percentage error. Figures 6 and 7 illustrate a comparison of the two models' results using those evaluation metrics.
The $R^2$ of the ANFIS model is 0.9992, while the $R^2$ of the ANN model is 0.9383, according to Figs. 6 and 7. This means that ANFIS fits the observed rainfall more closely than ANN. We also compare these two predictive models using the Nash-Sutcliffe model efficiency coefficient (E). The value of E is 0.9954 for ANFIS and 0.935 for ANN. The ANFIS value is close to 1, which indicates a near-perfect match between the model predictions and the observed data.
We have tested the ANFIS and ANN models on the rainfall prediction for the nine stations from station ID 84 to 92. At these stations, ANFIS performs better than ANN. Therefore, we recommend researchers use the ANFIS model for applications that require rainfall prediction without climatic data.

Conclusion
The prediction accuracy of ANFIS and ANN models was investigated for monthly rainfall at meteorological stations in Ethiopia. Longitude, latitude, and altitude data from 92 weather stations covering 11 years, from 2011 to 2021, were used for this study. We used several evaluation metrics to evaluate and compare the two models, and the experimental results show that the ANFIS model performs better than the ANN model. In general, the ANFIS model was found to be better than the ANN model in long-term monthly rainfall prediction, and it gave the best prediction accuracy at the nine testing stations.