Predicting LQ45 financial sector indices using RNN-LSTM
Journal of Big Data volume 8, Article number: 104 (2021)
Abstract
As one of the most popular financial market instruments, the stock has formed one of the most massive and complex financial markets in the world. The stock market can handle millions of transactions within a short period of time and is highly unpredictable. In this study, we aim to implement a famous deep learning method, namely the long short-term memory (LSTM) network, for stock price prediction. We limit the stocks to those that are included in the LQ45 financial sector indices, i.e., BBCA, BBNI, BBRI, BBTN, BMRI, and BTPS. Rather than using an overly deep network architecture, we propose a simple three-layer LSTM network architecture to predict the stocks’ closing prices. We found that the prediction results fall in the reasonable forecasting category. Moreover, it is worth noting that two of the considered stocks, namely BBCA and BMRI, have the lowest MAPE values at 19.1020 and 18.6135, which fall in the good forecasting category. Hence, the proposed LSTM model is most recommended for those two stocks.
Introduction
Stock, as one form of securities, is a paper sheet as proof of someone’s ownership in a company [1]. Since it can be exchanged easily, it is popularly used as a kind of investment by many people nowadays. In fact, the stock has become one of the most popular financial market instruments [2] and has formed one of the most massive and complex financial markets in the world—the stock market. The market could handle millions of transactions in a short period and is a highly dynamic environment [3]. It also has nonlinear characteristics and could be highly unpredictable [4, 5].
Many have tried to study and predict stock price movement, which is considered a crucial step in developing trading strategies [6]. A good trading strategy can help traders avoid losses and gain more profit. Different kinds of forecasting methods and approaches have been applied and developed to achieve that. Some have used conventional and technical analysis methods, such as moving average, ARIMA, and stochastic optimization methods, but the accuracy of the results is not satisfactory [5]. Hence, others have moved to emerging techniques, such as machine learning or deep learning methods, for the stock price prediction problem.
One of the most widely used deep learning methods, especially for time series analysis, is the long short-term memory (LSTM) network. As an improved version of recurrent neural networks (RNN), LSTM has been used in many fields. In stock price prediction in particular, LSTM has been proposed and applied by some notable researchers, such as Murtaza et al. [7], Nelson et al. [8], Faurina et al. [6], and Jin et al. [9]. Murtaza et al. [7] built a model to predict the stock returns of NIFTY 50 using LSTM networks. Similarly, Nelson et al. [8] applied LSTM networks to predict the future trends of stock prices based on their historical data. They applied their proposed LSTM networks to five different stocks from the Brazilian stock exchange and found the outcomes very promising. Faurina et al. [6] tried a slightly different approach in predicting stock movement on the Indonesian market. They used three different dimensionality reduction methods, namely Lasso, ElasticNet, and principal component analysis (PCA), combined with LSTM networks for stock price prediction. They found that the PCA-LSTM combination outperforms the other two combinations. Lastly, Jin et al. [9] recently published their findings on stock closing price prediction. In their study, they incorporated investors’ emotional tendency with LSTM networks and found that their proposed model could improve the prediction results.
In this study, we also aim to implement the famous deep learning method, i.e., the LSTM network, for stock price prediction, specifically for stocks that are included in the LQ45 indices. LQ45 consists of the stocks of 45 companies with high liquidity and large market share, which are therefore considered to have good financial status [10]. The Indonesia Stock Exchange (IDX) periodically assesses all companies’ performance in the stock market and announces the report of LQ45 indices semi-annually [11]. This report has been used by many stakeholders, including traders and investors, as one of their decision-making tools. Moreover, many studies have used this report as their criterion in choosing the stocks for their evaluation, as can be found in Nurmalitasari et al. [12], Pantagama and Rikumahu [11], and Pramanaswari and Yasa [13]. Interested readers are encouraged to read their publications for further information.
Although many studies have used LSTM networks in predicting stock price movement, most of them incorporated many hidden layers and neurons in the deep network’s architecture [7, 14, 15, 16]. Here, we argue that a simpler and shallower network architecture can achieve prediction results similar to those of more complex networks. Briefly, the contributions of this study are (1) we propose a simple three-layer LSTM network architecture for predicting the stock’s closing price, (2) we use the newly announced LQ45 report from the IDX for the period of February–July 2021, especially the stocks that are included in the financial sector indices, and (3) we find that the proposed model can give good prediction results and can serve as a baseline for a good trading strategy.
We will further discuss the research method used in this study, namely the LSTM networks in the following section. Some error criteria also will be described in the section. Moreover, the description of datasets being used in this study, the experimental results, and the analysis will be discussed. Lastly, some concluding remarks and suggestions will be given in the last section of this paper.
Research method
A simple diagram of the research methodology applied in this study is shown in Fig. 1. We started the process by handling any missing values in the stocks’ closing prices with a simple data imputation method. Then, we divided the cleaned stock data into training and test sets with an 80:20 ratio. On both sets, we applied feature scaling to get normalized training and test sets. Next, we reshaped both sets into 3D arrays that can be further processed by the prepared LSTM networks. We built the deep learning model using a three-layer LSTM network consisting of an LSTM layer, a dropout layer to prevent overfitting, and a dense layer as the output layer of the network. To get the prediction results, we applied the built model to the reshaped test set and converted the output back to the original scale. Lastly, we calculated the root mean square error (RMSE) and the mean absolute percentage error (MAPE) scores to evaluate the performance of the built model. A more detailed explanation of each step taken in this study is given in the following sections.
RNN-LSTM
The long short-term memory (LSTM) network is an advanced soft computing method that Hochreiter and Schmidhuber first introduced to tackle the limitations of conventional recurrent neural networks (RNN), especially in solving problems with long-term dependencies [17]. Typically, it consists of several self-connected LSTM cells that store the network’s temporal state by using three gates, namely the input, output, and forget gates [17]. An illustration of an LSTM cell containing those three gates is depicted in Fig. 2 [18].
The gate mechanism in an LSTM cell is used to control how much information can be passed throughout the network. The forget gate can be found in the first part of the cell and is used to control how much of the previous cell’s hidden state could be forgotten. Next, the input gate is used to determine what new information will be stored in the current cell state. Finally, as its name implies, the output gate is used to find the value we want to be the output of the current cell [19]. Some equations related to this mechanism in an LSTM cell are given below.
\[{f}_{t}=\sigma \left({W}_{f}{x}_{t}+{U}_{f}{h}_{t-1}+{b}_{f}\right)\]
\[{i}_{t}=\sigma \left({W}_{i}{x}_{t}+{U}_{i}{h}_{t-1}+{b}_{i}\right)\]
\[{\tilde{C}}_{t}=\mathrm{tanh}\left({W}_{C}{x}_{t}+{U}_{C}{h}_{t-1}+{b}_{C}\right)\]
\[{C}_{t}={f}_{t}\odot {C}_{t-1}+{i}_{t}\odot {\tilde{C}}_{t}\]
\[{o}_{t}=\sigma \left({W}_{o}{x}_{t}+{U}_{o}{h}_{t-1}+{b}_{o}\right)\]
\[{h}_{t}={o}_{t}\odot \mathrm{tanh}\left({C}_{t}\right)\]
where \({f}_{t}\) is the forget gate value at the current cell, \({i}_{t}\) is the input gate value, \({C}_{t}\) is the current cell state, \({C}_{t-1}\) is the prior cell state, \({\tilde{C}}_{t}\) is the cell candidate value, \({o}_{t}\) is the output gate value, \({W}_{f}, {W}_{i}, {W}_{C}, {W}_{o}\) and \({U}_{f}, {U}_{i}, {U}_{C}, {U}_{o}\) are weights of the network, \({b}_{f}, {b}_{i}, {b}_{C}, {b}_{o}\) are bias values, \({h}_{t}\) is the current hidden state value, \({h}_{t-1}\) is the prior hidden state value, \({x}_{t}\) is the new input value at the current cell, and \(\odot\) denotes element-wise multiplication. There are two activation functions (AFs) being used here, namely the sigmoid activation function (\(\sigma\)) and the tanh activation function. Both of them are among the most frequently used nonlinear AFs in artificial neural networks [20].
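As an illustration of the gate mechanism described above, the following is a minimal NumPy sketch of a single LSTM cell step in its standard formulation. The function name `lstm_cell_step`, the dictionary layout of the weights, and the toy dimensions are assumptions made for this example; they are not taken from the implementation used in this study.

```python
import numpy as np


def sigmoid(z):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))


def lstm_cell_step(x_t, h_prev, c_prev, W, U, b):
    """Advance one LSTM cell by a single time step.

    W, U, b are dicts keyed by "f", "i", "C", "o" (a hypothetical layout
    chosen for readability): W[k] has shape (hidden, inputs), U[k] has
    shape (hidden, hidden), and b[k] has shape (hidden,).
    """
    f_t = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])      # forget gate
    i_t = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])      # input gate
    c_tilde = np.tanh(W["C"] @ x_t + U["C"] @ h_prev + b["C"])  # cell candidate
    c_t = f_t * c_prev + i_t * c_tilde                          # new cell state
    o_t = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])      # output gate
    h_t = o_t * np.tanh(c_t)                                    # new hidden state
    return h_t, c_t


# Toy example: 1 input feature, 2 hidden units, all weights and biases zero.
hidden, inputs = 2, 1
W = {k: np.zeros((hidden, inputs)) for k in "fiCo"}
U = {k: np.zeros((hidden, hidden)) for k in "fiCo"}
b = {k: np.zeros(hidden) for k in "fiCo"}
h_t, c_t = lstm_cell_step(np.array([0.5]), np.zeros(hidden), np.ones(hidden), W, U, b)
```

With all weights set to zero, every gate evaluates to 0.5 and the candidate to 0, so the cell simply halves its previous state; this makes the role of the forget gate easy to see in isolation.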
Error criteria
There are two prediction error criteria being used as the performance evaluation metrics in this study, i.e., the root mean square error (RMSE) and the mean absolute percentage error (MAPE) criteria. While RMSE gives the degree of error in the unit of the data, MAPE gives the degree of error as a percentage. As explained by Shahid et al. [17] and Hansun et al. [21], both of them can be represented as
\[RMSE=\sqrt{\frac{1}{n}\sum_{t=1}^{n}{\left({Y}_{t}-{F}_{t}\right)}^{2}}\]
\[MAPE=\frac{1}{n}\sum_{t=1}^{n}\left|\frac{{Y}_{t}-{F}_{t}}{{Y}_{t}}\right|\times 100\]
where \(n\) is the total number of data, \({Y}_{t}\) is the actual value, and \({F}_{t}\) is the forecasted value.
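The two criteria can be computed directly from their usual definitions. The sketch below uses plain Python; the function names `rmse` and `mape` and the three-point toy series are illustrative assumptions, not data from this study.

```python
import math


def rmse(actual, forecast):
    """Root mean square error, in the same unit as the data."""
    n = len(actual)
    return math.sqrt(sum((y - f) ** 2 for y, f in zip(actual, forecast)) / n)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    n = len(actual)
    return 100.0 * sum(abs((y - f) / y) for y, f in zip(actual, forecast)) / n


# Hypothetical closing prices and forecasts, for illustration only.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
rmse_value = rmse(actual, forecast)
mape_value = mape(actual, forecast)
```

Because RMSE is expressed in the unit of the data, it grows with the price level of a stock, whereas MAPE remains comparable across stocks with different price magnitudes.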
Result and analysis
This section is divided into three subsections. Firstly, we describe the data source, preprocessing steps, and model development conducted in this study. Secondly, the experimental results for each considered stock in LQ45 indices will be given. Moreover, the evaluation results and analysis will be discussed following the experimental results.
Data source, preprocessing, and model development
In this study, we used some stocks included in the LQ45 indices as semiannually reported by the Indonesia Stock Exchange (IDX). The last (Major Evaluation) Report is for the period of February to July 2021 and can be downloaded from the IDX website [22]. For simplicity, we will focus on some stocks on the list included in the financial sector, as shown in Table 1.
Next, we collected the recorded daily stock prices for each considered stock from Yahoo! Finance [23]. We chose to download the maximum data available in the data source; hence each stock could have a different number of data records. There are several features in the collected datasets, such as ‘Date,’ ‘Open,’ ‘High,’ ‘Low,’ ‘Close,’ ‘Adjusted Close,’ and ‘Volume,’ but we only consider the ‘Close’ values of each set. The downloaded data were then preprocessed to handle any missing values using a simple data imputation method that replaces the missing values with their last known records. Moreover, we used an 80:20 ratio for the training and test sets of each considered stock in the data splitting process. The resulting training and test sets are shown in Table 2.
We then normalized both the training and test sets using the feature scaling method. This step was done after the data splitting process because we normalized both sets based on the training set scale. This is intuitive since the model is trained on the training set, not on the test set, so the scaling applied to both sets should be derived from the training set only. Next, we reshaped the data into the 3D array shape accepted by the LSTM model in Keras, a deep learning package for Python. Keras runs on top of the TensorFlow machine learning platform.
This study proposed a three-layer LSTM network containing an LSTM layer, a dropout layer, and a dense layer. There are 100 neurons in the LSTM layer; meanwhile, for the dropout regularisation, we chose to drop 20% of the processed information in the network to prevent overfitting. For the loss function, we used the simple mean square error (MSE) with the Adam optimizer. Moreover, we trained the model on the training set for 20 training epochs with a batch size of 32. The built model is then used to predict the closing price on the test set. However, a data inversion phase is needed to convert the predicted results back into the original scale of the data. Figure 3 shows a code snippet of the proposed LSTM model in this study. The prediction results based on the built model using the proposed LSTM networks are described in the following section.
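The imputation and splitting steps can be sketched as follows. This is a plain-Python illustration under two assumptions that are not stated explicitly in the text: missing values are represented as `None`, and the 80:20 split is chronological (the earliest 80% of records form the training set). The function names and the toy price list are hypothetical.

```python
def forward_fill(values):
    """Replace each missing value (None) with the last known value.

    A missing value at the very start has no earlier record and stays None.
    """
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def chronological_split(values, train_ratio=0.8):
    """Split a time-ordered list into leading training and trailing test parts."""
    cut = int(len(values) * train_ratio)
    return values[:cut], values[cut:]


# Hypothetical 'Close' values with gaps, for illustration only.
close = [100.0, None, 102.0, 101.0, None, None, 105.0, 106.0, 104.0, 107.0]
clean = forward_fill(close)
train, test = chronological_split(clean)
```

Forward filling keeps the series length unchanged, but long runs of missing records turn into flat segments, which is worth keeping in mind when inspecting the resulting price plots.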
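A minimal NumPy sketch of this step is given below. It assumes min-max scaling as the feature scaling method and a sliding look-back window to build the 3D input of shape (samples, timesteps, features); neither the scaler type nor the window length is specified in the text, so `fit_min_max`, `make_windows`, and `timesteps=3` are illustrative assumptions.

```python
import numpy as np


def fit_min_max(train):
    """Learn the scaling range from the training set only."""
    return float(np.min(train)), float(np.max(train))


def scale(values, lo, hi):
    """Map values to the training-set scale; test values may fall outside [0, 1]."""
    return (np.asarray(values, dtype=float) - lo) / (hi - lo)


def unscale(values, lo, hi):
    """Invert the scaling to recover prices in the original unit."""
    return np.asarray(values, dtype=float) * (hi - lo) + lo


def make_windows(series, timesteps):
    """Build (samples, timesteps, 1) inputs and next-step targets."""
    X = np.array([series[i:i + timesteps] for i in range(len(series) - timesteps)])
    y = np.array(series[timesteps:])
    return X.reshape(-1, timesteps, 1), y


# Hypothetical prices, for illustration only.
train = np.array([10.0, 12.0, 14.0, 16.0, 18.0, 20.0])
test = np.array([22.0, 24.0, 26.0, 28.0])
lo, hi = fit_min_max(train)
train_s, test_s = scale(train, lo, hi), scale(test, lo, hi)
X_train, y_train = make_windows(train_s, timesteps=3)
```

Fitting the range on the training set alone avoids leaking information from the test period into training; the same `lo` and `hi` are reused later to convert predictions back to the original scale.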
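For readers without access to Fig. 3, the following Keras sketch builds a network with the layers and hyperparameters listed above (100 LSTM units, 20% dropout, one dense output unit, MSE loss, Adam optimizer). It is not the code from Fig. 3: the look-back window `TIMESTEPS = 60` and the function name `build_model` are assumptions introduced for this example, and the input shape must match whatever window length is actually used.

```python
from tensorflow import keras

TIMESTEPS = 60  # assumed look-back window; not stated in the text


def build_model(timesteps=TIMESTEPS):
    """Three-layer network: LSTM -> Dropout -> Dense."""
    model = keras.Sequential([
        keras.Input(shape=(timesteps, 1)),  # (timesteps, features)
        keras.layers.LSTM(100),             # 100 neurons in the LSTM layer
        keras.layers.Dropout(0.2),          # drop 20% to limit overfitting
        keras.layers.Dense(1),              # predicted (scaled) closing price
    ])
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model


model = build_model()
# Training as described in the text, given scaled and reshaped arrays
# X_train of shape (samples, TIMESTEPS, 1) and y_train of shape (samples,):
# model.fit(X_train, y_train, epochs=20, batch_size=32)
```

After `model.predict` is called on the reshaped test set, the outputs are still on the normalized scale and need to be inverted with the scaler fitted on the training set.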
Experimental results
We used a machine with an Intel Core i3-8130U CPU @ 2.20 GHz (4 CPUs) processor and 8 GB of RAM in the experimental phase. We also used several libraries in conducting the experiments, such as Pandas, Matplotlib, Keras, and Scikit-learn, with the Python programming language on Jupyter Notebook in the Anaconda 3 environment.
The prediction results on all considered stocks included in the LQ45 financial sector indices are shown in Fig. 4. Furthermore, the loss function results obtained during the model training are shown in Fig. 5. As can be inferred from Fig. 4, the proposed LSTM model could predict all considered stocks data very well, except for BTPS. Moreover, from Fig. 5, we know that all built models have converged quite well and remained stable to be used in the test phase.
Analysis
To get the performance results of the built model on the test set, we evaluated the prediction results by using the root mean square error (RMSE) and the mean absolute percentage error (MAPE) criteria. Table 3 presents the RMSE and MAPE values for each considered stock in this study. As the results show, the RMSE values range from 266.1255 to 2878.4668, which depend not only on the prediction results but also on the magnitude of the stock’s closing prices. Moreover, there are two stocks that have MAPE values under 20%, namely BMRI (18.6135) and BBCA (19.1020). The typical interpretation of MAPE values, as stated by Moreno et al. [24], is highly accurate forecasting (< 10), good forecasting (10–20), reasonable forecasting (20–50), and inaccurate forecasting (> 50). Therefore, the prediction results on BMRI and BBCA stock prices fall in the good forecasting category, while all other prediction results are reasonable based on this interpretation of the MAPE values.
We also compare the results of this study to other similar studies that tried to predict future values of stock prices using various kinds of machine learning or deep learning methods. Table 4 depicts the comparison results of this study with other similar studies. As can be inferred from the MAPE values, our proposed simple three-layer LSTM network can achieve results similar to those of more complex algorithm implementations, even hybrid ones.
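The interpretation scale above can be expressed as a small lookup function. The function name `interpret_mape` is hypothetical, and because the quoted ranges share their endpoints, assigning exactly 20 to "good" and exactly 50 to "reasonable" is an assumption of this sketch.

```python
def interpret_mape(mape_percent):
    """Map a MAPE value (in percent) to the categories quoted from Moreno et al. [24]."""
    if mape_percent < 10:
        return "highly accurate forecasting"
    if mape_percent <= 20:   # boundary handling at exactly 20 is an assumption
        return "good forecasting"
    if mape_percent <= 50:   # boundary handling at exactly 50 is an assumption
        return "reasonable forecasting"
    return "inaccurate forecasting"


# MAPE values reported in the text for BMRI and BBCA.
bmri_label = interpret_mape(18.6135)
bbca_label = interpret_mape(19.1020)
```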
Conclusion
In this study, we have successfully implemented a well-known deep learning method, i.e., the LSTM network, for stock price prediction, specifically for stocks included in the LQ45 financial sector indices. There are six stocks considered in this study, namely BBCA (Bank Central Asia Tbk.), BBNI (Bank Negara Indonesia Tbk.), BBRI (Bank Rakyat Indonesia Tbk.), BBTN (Bank Tabungan Negara Tbk.), BMRI (Bank Mandiri Tbk.), and BTPS (Bank BTPN Syariah Tbk.). Using a simple three-layer LSTM network architecture, we found that the proposed model gave reasonable prediction results for all considered stocks. For BBCA and BMRI, the model achieved good prediction results, and therefore, the built model is most recommended for those two stocks in the LQ45 financial sector indices.
There are also some limitations in this study that are worth noting. As previously explained, we only applied a simple data imputation technique to handle any missing values in the dataset. However, as shown in the prediction plot for BBNI in Fig. 4, almost half of the stock’s closing prices in the training set are stagnant at a value in the 1800s. It seems that there are some periods when the price data for BBNI in Yahoo! Finance were not recorded properly. This could affect the prediction results, as shown by the high MAPE value, and therefore, other preprocessing techniques need to be considered if the stock is to be used in the future.
Moreover, we only used two well-known forecast error criteria in this study as the performance evaluation metrics, namely the RMSE and MAPE. Other criteria, including the R-squared score or coefficient of determination, could also be applied to get a better and more comprehensive analysis [29]. Another possible direction for future work is to compare the results of this study with other technical approaches, such as weighted exponential moving average [30] and double exponential smoothing [31] methods.
Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
Abbreviations
ANN: Artificial neural networks
ARIMA: AutoRegressive Integrated Moving Average
BBCA: Bank Central Asia Tbk
BBNI: Bank Negara Indonesia Tbk
BBRI: Bank Rakyat Indonesia Tbk
BBTN: Bank Tabungan Negara Tbk
BMRI: Bank Mandiri Tbk
BTPS: Bank BTPN Syariah Tbk
CNN: Convolutional neural networks
IDX: Indonesia Stock Exchange
LSTM: Long short-term memory
MAPE: Mean absolute percentage error
MLP: Multilayer perceptron
MSE: Mean square error
RF: Random forest
RMSE: Root mean square error
RNN: Recurrent neural networks
SVR: Support vector regression
References
1. Johan K, Young JC, Hansun S. LSTM-RNN automotive stock price prediction. Int J Sci Technol Res. 2019;8(9).
2. Meizir, Rikumahu B. Prediction of Agriculture and Mining Stock Value Listed in Kompas100 Index Using Artificial Neural Network Backpropagation. In: 2019 7th International Conference on Information and Communication Technology (ICoICT) [Internet]. Kuala Lumpur, Malaysia: IEEE; 2019. pp. 1–5. Available from: https://ieeexplore.ieee.org/document/8835284/.
3. Mootha S, Sridhar S, Seetharaman R, Chitrakala S. Stock Price Prediction using Bi-Directional LSTM based Sequence to Sequence Modeling and Multitask Learning. In: 2020 11th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON) [Internet]. New York, NY, USA: IEEE; 2020. pp. 0078–86. Available from: https://ieeexplore.ieee.org/document/9298066/.
4. Sridhar S, Mootha S, Subramanian S. Detection of market manipulation using ensemble neural networks. In: 2020 International Conference on Intelligent Systems and Computer Vision (ISCV) [Internet]. Fez, Morocco: IEEE; 2020. pp. 1–8. Available from: https://ieeexplore.ieee.org/document/9204330/.
5. Liu S, Liao G, Ding Y. Stock transaction prediction modeling and analysis based on LSTM. In: 2018 13th IEEE Conference on Industrial Electronics and Applications (ICIEA) [Internet]. Wuhan, China: IEEE; 2018. pp. 2787–90. Available from: https://ieeexplore.ieee.org/document/8398183/.
6. Faurina R, Winduratna B, Nugroho P. Predicting stock movement using unidirectional LSTM and feature reduction: the case of an Indonesia stock. In: 2018 International Conference on Electrical Engineering and Computer Science (ICEECS). Bali, Indonesia; 2018. pp. 180–5.
7. Murtaza R, Patel H, Varma S. Predicting Stock Prices Using LSTM. Int J Sci Res. 2017;6(4):1754–6. Available from: https://www.ijsr.net/archive/v6i4/ART20172755.pdf.
8. Nelson DMQ, Pereira ACM, de Oliveira RA. Stock market’s price movement prediction with LSTM neural networks. In: 2017 International Joint Conference on Neural Networks (IJCNN) [Internet]. Anchorage, AK, USA: IEEE; 2017. pp. 1419–26. Available from: http://ieeexplore.ieee.org/document/7966019/.
9. Jin Z, Yang Y, Liu Y. Stock closing price prediction based on sentiment analysis and LSTM. Neural Comput Appl. 2020;32(13):9713–29. https://doi.org/10.1007/s00521-019-04504-2.
10. Tanuwijaya J, Hansun S. LQ45 stock index prediction using k-nearest neighbors regression. Int J Recent Technol Eng. 2019;8(3):2388–91. Available from: https://www.ijrte.org/wpcontent/uploads/papers/v8i3/C4663098319.pdf.
11. Pantagama M, Rikumahu B. Indonesia financial sector stock prediction using long short-term memory network algorithm and modeling (Study of Banking in August 2018 LQ45 Index). In: Anggadwita G, Martini E, editors. Digital Economy for Customer Benefit and Business Fairness [Internet]. Routledge; 2020. pp. 159–64. Available from: https://www.taylorfrancis.com/books/9781000070644.
12. Nurmalitasari, Sumarlinda S, Supriyanto N, Putri DK. LQ45 stock price predictions using the deep learning method. Int J Adv Res Publ. 2020;4(4):20–3. Available from: http://www.ijarp.org/publishedresearchpapers/apr2020/Lq45StockPricePredictionsUsingTheDeepLearningMethod.pdf.
13. Pramanaswari ASI, Yasa GW. Graham & Dodd theory in stock portfolio performance in LQ 45 index at Indonesia Stock Exchange. Int Res J Manag IT Soc Sci. 2018;5(6):52–9. Available from: http://sloap.org/journals/index.php/irjmis/article/view/338.
14. Shah D, Campbell W, Zulkernine FH. A comparative study of LSTM and DNN for stock market forecasting. In: 2018 IEEE International Conference on Big Data (Big Data) [Internet]. Seattle, WA, USA: IEEE; 2018. pp. 4148–55. Available from: https://ieeexplore.ieee.org/document/8622462/.
15. Istiake Sunny MA, Maswood MMS, Alharbi AG. Deep learning-based stock price prediction using LSTM and Bi-directional LSTM model. In: 2020 2nd Novel Intelligent and Leading Emerging Sciences Conference (NILES) [Internet]. Giza, Egypt: IEEE; 2020. pp. 87–92. Available from: https://ieeexplore.ieee.org/document/9257950/.
16. Du J, Liu Q, Chen K, Wang J. Forecasting stock prices in two ways based on LSTM neural network. In: 2019 IEEE 3rd Information Technology, Networking, Electronic and Automation Control Conference (ITNEC) [Internet]. Chengdu, China: IEEE; 2019. pp. 1083–6. Available from: https://ieeexplore.ieee.org/document/8729026/.
17. Shahid F, Zameer A, Muneeb M. Predictions for COVID-19 with deep learning models of LSTM, GRU and Bi-LSTM. Chaos, Solitons & Fractals. 2020;140:110212. Available from: https://linkinghub.elsevier.com/retrieve/pii/S0960077920306081.
18. Phi M. Illustrated Guide to LSTM’s and GRU’s: A step by step explanation. Towards Data Sci. 2018 [cited 2021 Apr 1]. Available from: https://towardsdatascience.com/illustratedguidetolstmsandgrusastepbystepexplanation44e9eb85bf21.
19. Wang P, Zheng X, Ai G, Liu D, Zhu B. Time series prediction for the epidemic trends of COVID-19 using the improved LSTM deep learning method: case studies in Russia, Peru and Iran. Chaos, Solitons & Fractals. 2020;140:110214. Available from: https://linkinghub.elsevier.com/retrieve/pii/S096007792030610X.
20. Li S, Zhao X. Image-based concrete crack detection using convolutional neural network and exhaustive search technique. Adv Civ Eng. 2019;2019:1–12. Available from: https://www.hindawi.com/journals/ace/2019/6520620/.
21. Hansun S, Charles V, Gherman T, Subanar S, Indrati CR. A tuned Holt-Winters white-box model for COVID-19 prediction. Int J Manag Decis Mak. 2021;20(1):1. Available from: http://www.inderscience.com/link.php?id=10034422.
22. IDX. Indeks Saham [Internet]. Jakarta, Indonesia; 2021. Available from: https://www.idx.co.id/datapasar/datasaham/indekssaham/.
23. Yahoo! Finance. Quotes [Internet]. 2021 [cited 2021 May 1]. Available from: https://finance.yahoo.com/lookup.
24. Moreno JJM, Pol AP, Abad AS, Blasco BC. Using the RMAPE index as a resistant measure of forecast accuracy. Psicothema. 2013;25(4):500–6. Available from: http://www.psicothema.com/pdf/4144.pdf.
25. Patel J, Shah S, Thakkar P, Kotecha K. Predicting stock market index using fusion of machine learning techniques. Expert Syst Appl. 2015;42(4):2162–72. Available from: https://linkinghub.elsevier.com/retrieve/pii/S0957417414006551.
26. M H, E.A. G, Menon VK, K.P. S. NSE stock market prediction using deep-learning models. Procedia Comput Sci. 2018;132:1351–62. Available from: https://linkinghub.elsevier.com/retrieve/pii/S1877050918307828.
27. Sun S, Wei Y, Wang S. AdaBoost-LSTM ensemble learning for financial time series forecasting. In: Shi Y, Fu H, Tian Y, Krzhizhanovskaya V V., Lees MH, Dongarra J, et al., editors. Lecture Notes in Computer Science, vol 10862 [Internet]. Springer, Cham; 2018. pp. 590–7. https://doi.org/10.1007/978-3-319-93713-7_55.
28. Nabipour M, Nayyeri P, Jabani H, Mosavi A, Salwana E, S. S. Deep Learning for Stock Market Prediction. Entropy [Internet]. 2020;22(8):840. Available from: https://www.mdpi.com/10994300/22/8/840.
29. Budiharto W. Data science approach to stock prices forecasting in Indonesia during Covid-19 using long short-term memory (LSTM). J Big Data. 2021;8(1):47. https://doi.org/10.1186/s40537-021-00430-0.
30. Hansun S. A new approach of moving average method in time series analysis. In: 2013 International Conference on New Media Studies, CoNMedia 2013. 2013.
31. Hansun S, Wicaksana A, Kristanda MB. Prediction of Jakarta City air quality index: modified double exponential smoothing approaches. Int J Innov Comput Inf Control. 2021;17(4):1363–71. Available from: http://www.ijicic.org/contents.htm.
Acknowledgements
The authors would like to thank the anonymous reviewers for their valuable comments and corrections. The authors would also like to extend their gratitude to Universitas Multimedia Nusantara for the support given during this study.
Funding
Not applicable.
Author information
Contributions
SH initiated and conducted the study and wrote the first draft of this paper. All authors reviewed and contributed to the final version of the paper. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Hansun, S., Young, J.C. Predicting LQ45 financial sector indices using RNN-LSTM. J Big Data 8, 104 (2021). https://doi.org/10.1186/s40537-021-00495-x
DOI: https://doi.org/10.1186/s40537-021-00495-x