Time-series analysis with smoothed Convolutional Neural Network

Abstract

CNN originates from image processing and is not commonly known as a forecasting technique in time-series analysis, where performance depends on the quality of the input data. One way to improve that quality is to smooth the data. This study introduces a novel hybrid of exponential smoothing and CNN called Smoothed-CNN (S-CNN). Such combined methods outperform the majority of individual solutions in forecasting. The S-CNN was compared with the original CNN method and other forecasting methods such as Multilayer Perceptron (MLP) and Long Short-Term Memory (LSTM). The dataset is a one-year time series of daily website visitors. Since there are no special rules for choosing the number of hidden layers, the Lucas number was used. The results show that S-CNN is better than MLP and LSTM, with the best MSE of 0.012147693 using 76 hidden layers at 80%:20% data composition.

Introduction

Prediction estimates future events using a specific scientific approach [1] of analyzing time-series data patterns [2, 3]. One of the techniques is Convolutional Neural Network (CNN). CNN applies the basic concept of the Neural Network (NN) algorithm with more layers [4]. CNN is popular in computer vision and image processing for being efficient [5]. CNN uses a convolution layer that can handle spatial information available in images, while fully connected layers have a memory to store information in time-series data [6]. The only difference between computer vision problems and time-series ones is the input given to the model: an image matrix for computer vision and a 1D array for time-series forecasting [7]. The observation sequence can be treated as a raw 1D input array that the CNN model reads and filters. Thus, this principle can be implemented in time-series analysis.

CNN deals with time-series problems effectively. Recent studies which applied CNN to time-series forecasting tasks, mainly involving financial data, show promising results. CNN estimates the stock market by extracting features, and it can be used to collect data from various sources, including different markets such as S&P 500, NASDAQ, DJI, NYSE, and RUSSELL [8]. In forecasting gold price volatility, convolutional layers may filter out the noise of the input data and extract more valuable features, which benefits the final prediction model [9]. Many CNN models can handle various kinds of time-series data, such as univariate, multivariate, multi-step, and multivariate multi-step models [10].

CNN extracts image features from raw pixel data [11]. However, the raw data extraction is unnecessary in time-series analysis because of the numerical pattern. CNN may increase the accuracy by up to 30% and train models twice as fast as other algorithms such as RNN, GRU, and LSTM [12]. CNN weight sharing can reduce the number of parameters to increase the efficiency of model learning [13]. CNN is suitable for forecasting time-series because it offers dilated convolutions, in which filters are applied with gaps (dilations) between cells. The size of the space between each cell allows the neural network to better understand the relationships between the different observations in the time-series [14].

Researchers have conducted various experiments to improve CNN performance. A novel approach which combined CNN with an autoregressive model outperformed both CNN and LSTM [15]. A specific architecture of CNN, WaveNet, outperformed LSTM and the other methods in forecasting financial time-series [16]. Livieris et al. [17] proposed a framework for enhancing deep learning by combining CNN-LSTM with simple exponential smoothing. The technique generated high-quality time-series data that considerably improves the forecasting performance of a deep learning model. Studies show that hybridizing CNN with other methods, creating a specific architecture, and smoothing the input data of CNN can increase the algorithm performance.

This study combines simple exponential smoothing with CNN, called Smoothed-CNN (S-CNN), to reduce forecasting errors. Instead of using a smoothing factor (\(\alpha\)) ranging from 0 to 1 in steps of 0.1 [17], this study proposes a novel optimum \(\alpha\) as the main parameter of the simple exponential smoothing. We use CNN, Multilayer Perceptron (MLP), and Long Short-Term Memory (LSTM) with Lucas number hidden layers for the baseline and select the best method based on the performance analysis. Four different datasets are used to indicate the algorithm’s consistency.

The rest of this paper is organized as follows. “Smoothing algorithm for time-series” section describes details of the smoothing algorithm for time-series. “Experimental design” section presents the experimental design, focusing on the exponential dataset, data normalization, smoothing with optimum α, CNN with Lucas hidden layers, and performance testing. “Results” section presents the results and detailed experimental analysis, focusing on the evaluation of the proposed smoothing with optimum α. The section also summarizes the findings of this research by discussing the numerical experiments. Finally, “Conclusions” section summarizes the general findings of this study and discusses possible future research areas.

Smoothing algorithm for time-series

Data smoothing can enhance the quality of data. Smoothing generates excellent results in small dataset forecasting by removing outliers from time-series data [18]. This method is easy to understand and can be effectively implemented in new research without referring to or taking parameters from other studies [19].

Smoothing procedures improve forecasting by averaging the past value of time-series data [20]. The algorithm assigns a weighting value to previous observations to predict future values [21], smooth the value of fluctuations in the data used, and eliminate noise [22]. Generally, there are four common types of data smoothing, which are simple exponential smoothing (SES)/exponential smoothing (ES), moving average (MA), and random walk (RW). In the case of a forecasting task, data smoothing can help researchers predict trends. Table 1 describes the types of data smoothing and the advantages and disadvantages.

Table 1 Types of data smoothing

In this study, we use exponential smoothing as the data smoothing method. Simple Exponential Smoothing (SES) [23], also known as Exponential Smoothing (ES) [18], is implemented in Hyndman's forecasting libraries for the R software. Similar to other methods, ES works well for short-term forecasts that take seasonality into account, and the models chosen were evaluated only using MAPE. This method is currently used in forecasting tasks due to its performance. Table 2 presents several related works which employ ES for forecasting. Exponential smoothing is a rule-of-thumb approach for smoothing time-series data using the exponential window function. Exponential functions are employed to apply exponentially decreasing weights over time. It is simple to learn and use for making a judgment based on the user's prior assumptions, such as seasonality [24].

Table 2 Related Works

Experimental design

To make the research systematic, we designed the experiment as shown in Fig. 1. We compared the smoothed CNN with the basic CNN using various datasets. We also used various scenarios and metrics to determine the best scenario. The details of Fig. 1 are explained in the following subsections.

Fig. 1 Experimental design of Smoothed-CNN (S-CNN) with optimum \(\alpha\)

Dataset

We used four different datasets in this study. Table 3 shows the characteristics of each dataset. Dataset 1 is the primary dataset, while the rest are used to test the proposed method’s consistency. All of the datasets were multivariate. However, we selected a single attribute (univariate) from each dataset owing to the scope of the study.

Table 3 Dataset Description

The first time-series data in this study is from a journal website of Universitas Negeri Malang (Knowledge Engineering Data Science/KEDS). We retrieved the .csv file from the Statcounter machine connected to the web. The dataset covers the period from January 2018 to 31 December 2018 [30]. In this study, the input and output of the forecasting algorithms are the sessions attribute. Sessions (unique visitors) are the number of visitors from one IP in a certain period [31]. The number of unique visitors is an essential success indicator of an electronic journal. It measures the breadth of distribution that will accelerate the journal accreditation system [32]. The second dataset is similar to the first except for the source and the range of the data. The third and fourth datasets are foreign exchange and electrical energy consumption.

We used two scenarios to discover the influence of training data composition on forecasting performance. The first scenario used 70% (256 days) data training and 30% (109 days) data for testing. We used 80% (292 days) of the dataset as training in the second scenario while the rest was for testing (73 days). Figure 2 illustrates the scheme of training testing data composition.
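The day counts quoted for the two scenarios can be reproduced with a short sketch. It assumes a chronological split (the first part of the series for training, the remainder for testing) and round-half-up on the training size; neither detail is stated in the text, and the function names are hypothetical.

```python
def split_counts(n_samples, train_percent):
    """Return (n_train, n_test) for an integer training percentage."""
    # Integer round-half-up (an assumption) avoids floating-point
    # surprises in expressions such as 0.7 * 365.
    n_train = (n_samples * train_percent + 50) // 100
    return n_train, n_samples - n_train


def chronological_split(series, train_percent):
    """Split a sequence into a leading training part and a trailing testing part."""
    n_train, _ = split_counts(len(series), train_percent)
    return series[:n_train], series[n_train:]


# Placeholder values standing in for one year of daily observations.
days = list(range(365))

train_1, test_1 = chronological_split(days, 70)  # scenario 1
train_2, test_2 = chronological_split(days, 80)  # scenario 2
print(len(train_1), len(test_1))
print(len(train_2), len(test_2))
```

With 365 daily observations this yields 256/109 days for scenario 1 and 292/73 days for scenario 2, matching the counts above.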

Fig. 2 Training testing data composition

Data normalization

The natural behavior of most time-series is dynamic and nonlinear [33]. Data normalization is used to deal with this problem. Because the main objective of data normalization is to ensure the quality of the data before it is fed to any model, it substantially influences the performance of any model [34].

Data normalization is essential for CNN [35] because it scales the attribute into the specific range required by the activation function. This study uses Min–Max normalization. The method ensures that all features have the same scale, although it is sensitive to outliers. Equation (1) shows the Min–Max formula [36], which maps the data into the interval 0–1.

$${X}_{t(norm)}=\frac{{X}_{t}-{X}_{ min}}{{X}_{ max}- {X}_{ min}}$$
(1)

\({X}_{t(norm)}\) is the result of normalization, \({X}_{t}\) is the data to be normalized, while \({X}_{ min}\) and \({X}_{ max}\) stand for the minimum and maximum value of the entire data.
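As an illustration of Eq. (1), the following minimal sketch normalizes a list of values using the minimum and maximum of the entire series. The function name and the handling of a constant series are choices made here, not details taken from the study.

```python
def min_max_normalize(values):
    """Scale values into [0, 1] using the series minimum and maximum (Eq. 1)."""
    x_min = min(values)
    x_max = max(values)
    span = x_max - x_min
    if span == 0:
        # Eq. (1) is undefined for a constant series; raising is an assumption.
        raise ValueError("cannot normalize a constant series")
    return [(x - x_min) / span for x in values]


print(min_max_normalize([10, 20, 30]))  # [0.0, 0.5, 1.0]
```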

Exponential smoothing with optimum \(\boldsymbol{\alpha }\)

In time-series forecasting, the raw data series is generally denoted as \(\{{X}_{t}\}\), with starting time at \(t=0\). Here \(t\) is a day index. The result of the exponential smoothing process is commonly written as \({S}_{t}\), which is considered as the potential future value of \(X\). Equations (2) and (3) give the single exponential smoothing [37], initialized at \(t=0\).

$${S}_{0} = {X}_{0}$$
$${S}_{t} = \alpha {X}_{t} + \left(1-\alpha \right){S}_{t-1}, \quad t>0$$
(2)
$${S}_{t} = {S}_{t-1} + \alpha \left({X}_{t} - {S}_{t-1}\right)$$
(3)

The smoothed data \({S}_{t}\) is the result of smoothing the raw data \(\{{X}_{t}\}\). The smoothing factor \(\alpha\) determines the level of smoothing. The range of \(\alpha\) is between 0 and 1 (0 ≤ \(\alpha\) ≤ 1). When \(\alpha\) is close to 1, the smoothing effect is small and the smoothed series responds quickly to recent changes (fast learning). In contrast, values of \(\alpha\) closer to 0 have a stronger smoothing effect and are less responsive to recent changes (slow learning).
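The recursion in Eq. (3), with the initialization \({S}_{0}={X}_{0}\), can be sketched in a few lines. This is an illustrative implementation with a hypothetical function name, not the code used in the study.

```python
def exponential_smoothing(values, alpha):
    """Single exponential smoothing: S_0 = X_0, S_t = S_{t-1} + alpha * (X_t - S_{t-1})."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if not values:
        return []
    smoothed = [values[0]]  # S_0 = X_0
    for x in values[1:]:
        previous = smoothed[-1]
        smoothed.append(previous + alpha * (x - previous))
    return smoothed


print(exponential_smoothing([1.0, 3.0, 5.0], 0.5))  # [1.0, 2.0, 3.5]
```

Setting `alpha` to 1 returns the raw series unchanged, while setting it to 0 repeats the first value, which corresponds to the two extremes of the smoothing effect described above.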

The value of \(\alpha\) is not the same for every case. Therefore, we propose an optimum value of the smoothing factor based on the dataset characteristics. In this study, we use time-series data as in Fig. 3, which shows the maximum (\({X}_{max}\)), minimum (\({X}_{min}\)), and average (\(\frac{1}{n} \sum_{t=1}^{n}{X}_{t}\)) values of the series.

Fig. 3 Time-series component of optimum \(\alpha\)

We have two considerations in defining the optimum \(\alpha\). First, the average value is less than the difference between \({X}_{max}\) and \({X}_{min}\). Second, the optimum \(\alpha\) must be less than 1. Equations (4)–(6) show the optimum \(\alpha\) formula.

$$\text{Optimum } \alpha = {\alpha}_{max} - \frac{\frac{1}{n}\sum_{t=1}^{n}{X}_{t}}{{X}_{max} - {X}_{min}}$$
(4)
$$\text{Optimum } \alpha = 1 - \frac{\frac{1}{n}\sum_{t=1}^{n}{X}_{t}}{{X}_{max} - {X}_{min}}$$
(5)
$$\text{Optimum } \alpha = \frac{\left({X}_{max} - {X}_{min}\right) - \frac{1}{n}\sum_{t=1}^{n}{X}_{t}}{{X}_{max} - {X}_{min}}$$
(6)

Substituting Eq. (6) into Eq. (3) gives Eq. (7).

$${S}_{t} = {S}_{t-1} + \frac{\left({X}_{max} - {X}_{min}\right) - \frac{1}{n}\sum_{t=1}^{n}{X}_{t}}{{X}_{max} - {X}_{min}}\left({X}_{t} - {S}_{t-1}\right)$$
(7)

We use the optimum smoothed result (\({S}_{t}\)) to improve the CNN algorithm performance.
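The following sketch computes the optimum \(\alpha\) of Eq. (6) from the series statistics and applies it in the recursion of Eq. (7). The function names are hypothetical, the input values are toy numbers for illustration only, and the check that \(\alpha\) falls strictly between 0 and 1 is one reading of the two considerations stated above. Whether the statistics are taken from the raw or the normalized series is not specified here, so the sketch simply uses whatever series it is given.

```python
def optimum_alpha(values):
    """Optimum smoothing factor: ((X_max - X_min) - mean) / (X_max - X_min), Eq. (6)."""
    x_min = min(values)
    x_max = max(values)
    span = x_max - x_min
    if span == 0:
        raise ValueError("optimum alpha is undefined for a constant series")
    mean = sum(values) / len(values)
    alpha = (span - mean) / span
    # Assumed reading of the two considerations: mean < span and alpha < 1.
    if not 0.0 < alpha < 1.0:
        raise ValueError("series does not satisfy the conditions for the optimum alpha")
    return alpha


def smooth_with_optimum_alpha(values):
    """Apply Eq. (7): single exponential smoothing using the optimum alpha."""
    alpha = optimum_alpha(values)
    smoothed = [values[0]]  # S_0 = X_0
    for x in values[1:]:
        previous = smoothed[-1]
        smoothed.append(previous + alpha * (x - previous))
    return smoothed


toy_series = [0.0, 2.0, 10.0]  # illustrative values, not data from the study
print(optimum_alpha(toy_series))
print(smooth_with_optimum_alpha(toy_series))
```

For the toy series the mean is 4 and the range is 10, so the optimum \(\alpha\) is 0.6.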

CNN with Lucas hidden layers

CNN is the main algorithm of this research. CNN has the capacity to learn meaningful features automatically from high-dimensional data. The input layer uses one feature since the model is univariate. A Flatten layer converts the convolutional output into the input of the fully connected layer. The fully connected part uses Dense layers, configured according to the number of hidden layers.

Instead of using a random number, we used the Lucas number to determine the hidden layer. The Lucas number (Ln) is recursive in the same way as the Fibonacci sequence (Fn), with each term equal to the sum of the two preceding terms, yet with different initial values. This sequence was selected since it provides a golden ratio number. The golden ratio emerges in nature, demonstrating that this enchanted number is ideal for determining the optimal solution to numerous covering problems such as arts, engineering, and financial forecasting [38]. To date, several computer science problems do not have an optimal algorithm. Due to the lack of a better solution in these circumstances, approaches based on random or semi-random solutions are frequently used. Therefore, using the Lucas number is expected to provide an optimal result, in this case, to determine a hidden layer [39]. Figure 4 presents the sequence of Fibonacci and Lucas numbers.

Fig. 4 Sequence of Fibonacci and Lucas numbers

In this study, the Lucas number starts from three and ends with the last number before 100, which is 76. We limited the number of hidden layers to keep the training time manageable. Overall, we used 3, 4, 7, 11, 18, 29, 47, and 76 [40] for the numbers of hidden layers.
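The list of hidden-layer counts can be generated from the Lucas recurrence, which uses the standard initial values 2 and 1 and the same additive rule as the Fibonacci sequence. The sketch below is illustrative; the function name and the bounds passed to it are chosen here to match the range described above.

```python
def lucas_numbers(lower, upper):
    """Return the Lucas numbers L_n with lower <= L_n <= upper, in sequence order."""
    current, following = 2, 1  # L_0 = 2, L_1 = 1
    selected = []
    while current <= upper:
        if current >= lower:
            selected.append(current)
        # Each term is the sum of the two preceding terms.
        current, following = following, current + following
    return selected


hidden_layers = lucas_numbers(3, 99)
print(hidden_layers)  # [3, 4, 7, 11, 18, 29, 47, 76]
```

The ratio of consecutive terms approaches the golden ratio, for example 76/47 is about 1.617.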

The CNN parameters differ by layer. The CNN forecast component parameters can be seen in Table 4. The parameter selection is based on prior studies [41,42,43,44,45]. Dropout was used during the weight optimization at all layers to avoid overfitting [46]. Dropout is a regularization strategy that randomly picks a percentage of neurons at each training step and leaves them out. The dropout value used was 0.2 [47].

Table 4 The list of CNN forecast component parameters

Performance testing

All experiments in this study were performed in the Python programming language on Google Colab, using the TensorFlow and Keras libraries, accessed through the Google Chrome browser. We used an Asus VivoBook X407UF laptop with a 7th generation Intel Core i3 processor, 1 TB hard drive, and 12 GB DDR3 RAM.

We used the mean square error (MSE) and the mean absolute percentage error (MAPE) as error evaluation measures. The MSE was employed to identify anomalies or outliers in the forecasting system [48]. MAPE, on the other hand, expresses the errors as percentages, which indicates the accuracy of the forecast [49]. The formulae are as follows [49].

$$MSE = \frac{1}{n}\sum_{t=1}^{n} {\left({A}_{t} - {F}_{t}\right)}^{2}$$
(8)
$$MAPE = \frac{100}{n}\sum_{t=1}^{n} \left|\frac{{A}_{t} - {F}_{t}}{{A}_{t}}\right|$$
(9)

\({A}_{t}\) is the actual data value, \({F}_{t}\) is the forecast value, and \(n\) is the number of instances. Lower MSE and MAPE values indicate better forecasting outcomes and hence a better approach [50]. Accordingly, the configuration with the lowest MSE and MAPE is regarded as having the best forecasting performance.
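Equations (8) and (9) translate directly into code. The sketch below uses hypothetical function names and toy numbers; raising an error when an actual value is zero is a choice made here, since the percentage error is undefined in that case and the text does not say how it is handled.

```python
def mse(actual, forecast):
    """Mean square error, Eq. (8)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error in percent, Eq. (9)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        # Assumption: treat a zero actual value as an error rather than skipping it.
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Toy values for illustration only.
print(mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
print(mape([100.0, 200.0], [110.0, 180.0]))
```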

We also recorded the training time of every scenario. The information is used as an additional performance indicator. In terms of time, the best algorithm is the method with the lowest time consumption.

Results

Tables 5 and 6 present the comparison of CNN and S-CNN in all scenarios. We used \(\alpha\) = 0.57 as the smoothing factor of the S-CNN. The number of hidden layers varies over the Lucas numbers from 3 to 76. These layers were used for all scenarios, including the baselines, MLP and LSTM.

Table 5 MSE of CNN and S-CNN ( \(\alpha\) = 0.57) in all scenarios
Table 6 Training time (s) of CNN and S-CNN ( \(\alpha\) = 0.57) in all scenarios

Table 5 shows the MSE of CNN and S-CNN in all scenarios. For CNN in Scenario 1, with a composition of 70% training and 30% testing data, Tables 5 and 6 show an average MSE of 0.029351031 and an average processing time of 2013 s. The highest MSE, 0.039486471, is obtained when the network has 3 hidden layers, with a processing time of 1401 s. The lowest MSE, 0.024530865, is obtained with 18 hidden layers, with a processing time of 1810 s. The lowest MAPE (10.38339615) is in the network with 3 hidden layers. Figure 5 shows the forecasting result of CNN within scenario 1.

Tables 5 and 7 show that the lowest MSE of smoothed CNN (S-CNN) is obtained with 76 hidden layers. This architecture has an MSE of 0.020868076 and a processing time of 3410 s. The highest MSE, 0.036637854, is obtained when the network has 3 hidden layers, with a processing time of 1221 s. This scenario yields an average MSE of 0.026531364 and an average processing time of 1878 s. For scenario 1, S-CNN with 4 layers produces the best MAPE of 9.45147180. Figure 6 shows the forecasting result of S-CNN within scenario 1.

Fig. 5 The forecasting results of CNN scenario 1

Figure 5 shows the CNN forecasting result with the lowest MSE. Despite the lowest MSE, Fig. 5 shows a fairly significant gap at the beginning and middle of the period, whereas toward the end of the period the forecasting results are similar to the original values. Figure 6 shows a significant difference between the data and the forecasting results at the middle and the end of the forecasting period. Nevertheless, the gap between the testing data and the forecasts in Fig. 5 is larger than the corresponding gap in Fig. 6. Thus, the architecture in Fig. 6 is the best owing to its low MSE.

Fig. 6 The forecasting results of S-CNN scenario 1

Figure 7 compares the CNN with the smoothed one. In general, S-CNN is better than the original CNN in terms of MSE. Figure 7a shows that the MSE of S-CNN is lower than that of CNN, except with 47 hidden layers, where the MSE values of both are 0.026. The MSE values of the two methods begin to settle from 7 hidden layers up to the last, 76, with the average MSE in that range being 0.026165752 for CNN and 0.023423422 for S-CNN. As Fig. 7b shows, the more hidden layers used, the longer the processing time required. With the initial three hidden layers, the processing time is the same for both, 1142 s. Meanwhile, with the last configuration of 76 hidden layers, S-CNN is 111 s faster than CNN. Again, the S-CNN processing time is shorter than that of CNN in every scenario.

Fig. 7 Comparison of CNN and S-CNN with scenario 1: (a) MSE; (b) processing time

Table 5 shows the CNN performance of Scenario 2 with 80% training and 20% testing data composition. The lowest MSE is 0.013227105 and the average MSE is 0.018452732. From Table 6, the average processing time is 2597 s, with times ranging from 1901 to 3641 s. The best structure for scenario 2 uses 7 hidden layers, with MAPE = 9.29571771. Figure 8 presents the forecasting results of CNN within this scenario.

Fig. 8 The forecasting results of CNN scenario 2

Figure 8 presents the CNN forecasting result with the lowest MSE in Scenario 2. Despite the lowest MSE, Fig. 8 indicates a substantial disparity at the start and in the middle of the period. However, the forecasting results approach the original values as the period ends.

Table 5 presents the MSE of S-CNN with the various hidden layers. The lowest MSE, 0.012147693, occurs with 76 hidden layers. Nevertheless, Table 6 shows that its computation is the longest among the architectures, at 3351 s. The highest MSE, 0.023890096, is obtained with 11 hidden layers. In Table 7, the lowest MAPE is 9.49165793 for the S-CNN with 29 layers. Figure 9 shows the forecasting results of the S-CNN.

Table 7 MAPE of CNN and S-CNN ( \(\alpha\) = 0.57) in all scenarios
Fig. 9 The forecasting results of S-CNN scenario 2

Figures 8 and 9 show a considerable difference in the forecasting results. Figure 8 has a greater MSE than Fig. 9, which means that outliers occur more often with CNN than with S-CNN. In Fig. 8, the outliers occur in almost all periods. On the other hand, the outliers occur in the early and late periods in Fig. 9. Therefore, it can be concluded that smoothing can improve performance by reducing the occurrence of outliers.

Figure 10 compares the CNN and S-CNN forecasting of Scenario 2 with the 80%:20% composition of the training and testing data. In general, Fig. 10a shows that the S-CNN has a lower MSE than its original version. It means that the CNN performance is less accurate than the S-CNN. On the other hand, the computation time of both methods increases with the number of hidden layers. In terms of processing time, smoothed CNN is faster than the original CNN in all scenarios, as seen in Fig. 10b.

Fig. 10 Comparison of CNN and S-CNN with scenario 2: (a) MSE; (b) processing time

We also compare our optimum α with α between 0.1 and 0.9 [17]. Table 8 shows the performance comparison using various values of smoothing factors. The results show that the optimum α has the lowest average MSE and MAPE. Therefore, our proposed optimum α outperformed other scenarios.

Table 8 Performance comparison using various values of smoothing factors

Table 9 shows the significance of using Lucas numbers as hidden layers on MSE, MAPE, and training time. The significance is shown when the paired t-test 2-tailed P value < 0.05. The result shows that Lucas numbers have a significant impact on MSE and training time. The insignificance shown in the MAPE results means that the Lucas numbers hidden layers cannot significantly improve the accuracy.

Table 9 Paired t-test result based on Lucas hidden layers

We also used paired t-test to indicate the significance of α on MSE, MAPE, and training time. Since the results in Table 10 are lower than 0.05, the use of α is significant to MSE, MAPE, and training time. In other words, using the smoothing factor is necessary to improve the forecasting performance.

Table 10 Paired T-test result based on alpha

The proposed S-CNN is compared with other time-series forecasting methods using the same dataset, preprocessing process, and general parameter settings. This study uses MLP and LSTM as the baseline. The general parameter setting for MLP and LSTM is the same as the CNN setting in Table 4. Table 11 shows the forecasting comparison of all approaches. In all scenarios, the CNN method has lower MSE and MAPE results than MLP or LSTM. In addition, forecasting using smoothed CNN (S-CNN) has better performance than the original CNN.

Table 11 Forecasting comparison

We used three more datasets to test the consistency of the best algorithm, S-CNN. The best configuration is used to test the datasets: scenario 2, 76 hidden layers, and a smoothing factor based on the statistical parameters of each dataset. The results of the evaluation using various types of datasets and different methods can be seen in Table 12, which reports the MSE and MAPE values of S-CNN on the different datasets. S-CNN outperformed the baseline in all datasets. The computation of S-CNN is more complex than that of the other methods in Table 12, as indicated by a training time longer than that of LSTM and MLP. Due to the smoothing process, S-CNN is slightly faster than the original CNN. Therefore, the results of the performance test are consistent in every dataset.

Table 12 Comparison with other datasets

Overall, the proposed use of the optimum smoothing factor in CNN (S-CNN) may improve the forecasting performance of CNN by reducing the MSE and MAPE. However, two limitations should be noted. First, the proposed smoothing factor is limited because it is suitable for seasonal time-series data. Second, the efficiency of the proposed algorithm for multivariate time-series analysis should be considered, since multivariate data has different ranges, units, and dependencies.

Conclusions

This study aims to optimize the performance of CNN, a widely used algorithm for image processing, in time-series analysis. Based on the results of the analysis, it can be concluded that CNN with optimum smoothing factor performs better than other selected methods in time-series forecasting. The optimum alpha proposed in this study produces the best evaluation results. The use of Lucas numbers as hidden layers significantly raises the performance of the forecasting algorithm due to the generated golden ratio.

While the results have addressed the research objectives, this research still has limitations. The study is focused on implementing optimized exponential smoothing in fundamental deep learning methods. Therefore, the effect of applying this method to more advanced deep learning algorithms (e.g., ResNet, hybrid CNN-LSTM) will be investigated in the future. Our next focus is a deeper analysis of different smoothing techniques for trend data and the implementation of double or triple exponential smoothing. The use of multivariate data will also be considered for further research.

Availability of data and materials

Attached in the submission.

References

  1. Velázquez JA, Petit T, Lavoie A, Boucher M-A. An evaluation of the Canadian global meteorological ensemble prediction system for short-term hydrological forecasting. Hydrol Earth Syst Sci. 2009;13(11):2221–31. https://doi.org/10.5194/hess-13-2221-2009.

  2. Purnawansyah P, Haviluddin H, Alfred R, Gaffar AFO. Network traffic time series performance analysis using statistical methods. Knowl Eng Data Sci. 2017;1(1):1. https://doi.org/10.17977/um018v1i12018p1-7.

  3. Singh J, Tripathi P. Time series forecasting using back propagation neural networks. Neurocomputing. 2017;7(5):147–59.

  4. Lundervold AS, Lundervold A. An overview of deep learning in medical imaging focusing on MRI. Z Med Phys. 2019;29(2):102–27. https://doi.org/10.1016/j.zemedi.2018.11.002.

  5. Kim BS, Kim TG. Cooperation of simulation and data model for performance analysis of complex systems. Int J Simul Model. 2019;18(4):608–19. https://doi.org/10.2507/IJSIMM18(4)491.

  6. Yamashita R, Nishio M, Do RKG, Togashi K. Convolutional neural networks: an overview and application in radiology. Insights Imaging. 2018;9(4):611–29. https://doi.org/10.1007/s13244-018-0639-9.

  7. Lewinson E. Python for finance cookbook. In: Lewinson E, editor. Over 50 recipes for applying modern Python libraries to financial data analysis. 1st ed. Birmingham: Packt Publishing; 2020. p. 434.

  8. Hoseinzade E, Haratizadeh S. CNNpred: CNN-based stock market prediction using a diverse set of variables. Expert Syst Appl. 2019;129:273–85. https://doi.org/10.1016/j.eswa.2019.03.029.

  9. Livieris IE, Pintelas E, Pintelas P. A CNN–LSTM model for gold price time-series forecasting. Neural Comput Appl. 2020;32(23):17351–60. https://doi.org/10.1007/s00521-020-04867-x.

  10. Wang K, Li K, Zhou L, Hu Y, Cheng Z. Multiple convolutional neural networks for multivariate time series prediction. Neurocomputing. 2019;360:107–19. https://doi.org/10.1016/j.neucom.2019.05.023.

  11. Jogin M, Mohana, Madhulika MS, Divya GD, Meghana RK, Apoorva S. Feature extraction using convolution neural networks (CNN) and deep learning. In: 2018 3rd IEEE International conference on recent trends in electronics, information and communication technology (RTEICT). 2018. pp. 2319–2323. https://doi.org/10.1109/RTEICT42901.2018.9012507.

  12. Rajagukguk RA, Ramadhan RAA, Lee H-J. A review on deep learning models for forecasting time series data of solar irradiance and photovoltaic power. Energies. 2020;13(24):6623. https://doi.org/10.3390/en13246623.

  13. Qin L, Yu N, Zhao D. Applying the convolutional neural network deep learning technology to behavioural recognition in intelligent video. Teh Vjesn Tech Gaz. 2018;25(2):528–35. https://doi.org/10.17559/TV-20171229024444.

  14. Borovykh A, Bohte S, Oosterlee CW. Dilated convolutional neural networks for time series forecasting. J Comput Financ. 2018. https://doi.org/10.21314/JCF.2019.358.

  15. Binkowski M, Marti G, Donnat P. Autoregressive convolutional neural networks for asynchronous time series. In: 35th International conference on machine learning, ICML 2018. 2018. Vol. 2, pp. 933–945.

  16. Borovykh A, Bohte S, Oosterlee CW. Conditional time series forecasting with convolutional neural networks. In: Proceedings of the International Conference on Artificial Neural Networks (ICANN). 2017. pp. 1–22.

  17. Livieris IE, Stavroyiannis S, Iliadis L, Pintelas P. Smoothing and stationarity enforcement framework for deep learning time-series forecasting. Neural Comput Appl. 2021;33(20):14021–35. https://doi.org/10.1007/s00521-021-06043-1.

  18. Smyl S. A hybrid method of exponential smoothing and recurrent neural networks for time series forecasting. Int J Forecast. 2020;36(1):75–85. https://doi.org/10.1016/j.ijforecast.2019.03.017.

  19. Muhamad NS, Din AM. Exponential smoothing techniques on daily temperature level data. In: Proceedings of the 6th international conference on computing and informatics. 2017. no. 217, pp. 62–68.

  20. Gustriansyah R, Suhandi N, Antony F, Sanmorino A. Single exponential smoothing method to predict sales multiple products. J Phys Conf Ser. 2019;1175: 012036. https://doi.org/10.1088/1742-6596/1175/1/012036.

  21. Singh K, Shastri S, Singh Bhadwal A, Kour P, Kumari M. Implementation of Exponential Smoothing for Forecasting Time Series Data. Int J Sci Res Comput Sci Appl Manag Stud. 2019;8(1):1–8.

  22. Datta M, Senjyu T, Yona A, Funabashi T. Photovoltaic output power fluctuations smoothing by selecting optimal capacity of battery for a photovoltaic-diesel hybrid system. Electr Power Components Syst. 2011;39(7):621–44. https://doi.org/10.1080/15325008.2010.536809.

  23. Dhamodharavadhani S, Rathipriya R. Region-wise rainfall prediction using mapreduce-based exponential smoothing techniques. In: Peter J, Alavi A, Javadi B, editors. Advances in big data and cloud computing. Advances in intelligent systems and computing. Berlin: Springer; 2019. p. 229–39.

  24. Billah B, King ML, Snyder RD, Koehler AB. Exponential smoothing model selection for forecasting. Int J Forecast. 2006;22(2):239–47. https://doi.org/10.1016/j.ijforecast.2005.08.002.

  25. Frausto-Solís J, Hernández-González LJ, González-Barbosa JJ, Sánchez-Hernández JP, Román-Rangel E. Convolutional Neural Network-Component Transformation (CNN–CT) for confirmed COVID-19 cases. Math Comput Appl. 2021;26(2):29. https://doi.org/10.3390/mca26020029.

  26. Rabbani MBA, et al. A comparison between seasonal autoregressive integrated moving average (SARIMA) and exponential smoothing (ES) based on time series model for forecasting road accidents. Arab J Sci Eng. 2021;46(11):11113–38. https://doi.org/10.1007/s13369-021-05650-3.

  27. Farsi B, Amayri M, Bouguila N, Eicker U. On short-term load forecasting using machine learning techniques and a novel parallel deep LSTM-CNN approach. IEEE Access. 2021;9:31191–212. https://doi.org/10.1109/ACCESS.2021.3060290.

  28. Lee BH, Jung SJ, Kim BS. A study on the prediction of power demand for electric vehicles using exponential smoothing techniques. J Korean Soc Disaster Sec. 2021;14(2):35–42. https://doi.org/10.21729/ksds.2021.14.2.35.

  29. Arceda MAM, Laura PCL, Arceda VEM. Forecasting time series with multiplicative trend exponential smoothing and LSTM: COVID-19 case study. Cham: Springer; 2021. p. 568–82.

  30. Wibawa AP, Izdihar ZN, Utama ABP, Hernandez L, Haviluddin. Min-max backpropagation neural network to forecast e-journal visitors. In: 2021 International conference on artificial intelligence in information and communication (ICAIIC). 2021. pp. 52–58. https://doi.org/10.1109/ICAIIC51459.2021.9415197.

  31. Satapathy SC, Govardhan A, Raju KS, Mandal JK. An overview on web usage mining. In: Satapathy SC, editor. Advances in intelligent systems and computing, vol. 338. Springer; 2015. p. V–VI.

  32. Gracia E. Psychosocial intervention: a journal’s journey towards greater scientific quality, visibility and internationalization. Psicol Reflex Crít. 2015;28:94–8. https://doi.org/10.1590/1678-7153.20152840013.

  33. Tealab A, Hefny H, Badr A. Forecasting of nonlinear time series using ANN. Futur Comput Informatics J. 2017;2(1):39–47. https://doi.org/10.1016/j.fcij.2017.05.001.

  34. McFarland JM, et al. Improved estimation of cancer dependencies from large-scale RNAi screens using model-based normalization and data integration. Nat Commun. 2018;9(1):4610. https://doi.org/10.1038/s41467-018-06916-5.

  35. Patro SGK, Sahu KK. Normalization: a preprocessing stage. IARJSET. 2015. https://doi.org/10.17148/iarjset.2015.2305.

  36. Buttrey SE. Data mining algorithms explained using R. J Stat Softw. 2015. https://doi.org/10.18637/jss.v066.b02.

  37. Prema V, Rao KU. Development of statistical time series models for solar power prediction. Renew Energy. 2015;83:100–9. https://doi.org/10.1016/j.renene.2015.03.038.

  38. Flaut C, Savin D, Zaharia G. Some applications of Fibonacci and Lucas numbers. arXiv preprint arXiv:1911.06863v1. 2019.

  39. Lopez N, Nunez M, Rodriguez I, Rubio F. Introducing the golden section to computer science. In: Proceedings first IEEE international conference on cognitive informatics. 2002. pp. 203–212. https://doi.org/10.1109/COGINF.2002.1039299.

  40. Noe TD, Vos Post J. Primes in Fibonacci n-step and Lucas n-step sequences. J Integer Seq. 2005;8(4):1–12.

  41. Tuba E, Bačanin N, Strumberger I, Tuba M. Convolutional neural networks hyperparameters tuning. In: Pap E, editor. Artificial intelligence: theory and applications. Cham: Springer; 2021. p. 65–84.

  42. Parashar A, Sonker A. Application of hyperparameter optimized deep learning neural network for classification of air quality data. Int J Sci Technol Res. 2019;8(11):1435–43.

  43. Tovar M, Robles M, Rashid F. PV power prediction, using CNN-LSTM hybrid Neural Network Model. Case of study: Temixco-Morelos México. Energies. 2020;13(24):6512. https://doi.org/10.3390/en13246512.

  44. Pelletier C, Webb G, Petitjean F. Temporal convolutional neural network for the classification of satellite image time series. Remote Sens. 2019;11(5):523. https://doi.org/10.3390/rs11050523.

  45. Zatarain Cabada R, Rodriguez Rangel H, Barron Estrada ML, Cardenas Lopez HM. Hyperparameter optimization in CNN for learning-centered emotion recognition for intelligent tutoring systems. Soft Comput. 2020;24(10):7593–602. https://doi.org/10.1007/s00500-019-04387-4.

  46. Koprinska I, Wu D, Wang Z. Convolutional neural networks for energy time series forecasting. In: 2018 International joint conference on neural networks (IJCNN). 2018. pp. 1–8. https://doi.org/10.1109/IJCNN.2018.8489399.

  47. Oh J, Wang J, Wiens J. Learning to exploit invariances in clinical time-series data using sequence transformer networks. 2018. pp. 1–15. http://arxiv.org/abs/1808.06725.

  48. Nguyen H-P, Liu J, Zio E. A long-term prediction approach based on long short-term memory neural networks with automatic parameter optimization by Tree-structured Parzen Estimator and applied to time-series data of NPP steam generators. Appl Soft Comput. 2020;89: 106116. https://doi.org/10.1016/j.asoc.2020.106116.

  49. Khullar S, Singh N. Water quality assessment of a river using deep learning Bi-LSTM methodology: forecasting and validation. Environ Sci Pollut Res. 2022;29(9):12875–89. https://doi.org/10.1007/s11356-021-13875-w.

  50. Alameer Z, Elaziz MA, Ewees AA, Ye H, Jianhua Z. Forecasting gold price fluctuations using improved multilayer perceptron neural network and whale optimization algorithm. Resour Policy. 2019;61:250–60. https://doi.org/10.1016/j.resourpol.2019.02.014.

Acknowledgements

We would like to thank the Knowledge Engineering and Data Science (KEDS) journal and the research group for sharing the dataset and ideas.

Funding

This work was funded by Universitas Negeri Malang (UM).

Author information

Contributions

All authors contributed equally to the paper. All authors have read and approved the final manuscript.

Corresponding author

Correspondence to Aji Prasetya Wibawa.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Wibawa, A.P., Utama, A.B.P., Elmunsyah, H. et al. Time-series analysis with smoothed Convolutional Neural Network. J Big Data 9, 44 (2022). https://doi.org/10.1186/s40537-022-00599-y

  • DOI: https://doi.org/10.1186/s40537-022-00599-y

Keywords

  • CNN
  • Time-series
  • Exponential smoothing
  • Optimum smoothing factor