
# Time series data analysis under indeterminacy

*Journal of Big Data*
**volume 10**, Article number: 126 (2023)

## Abstract

The existing semi-average method under classical statistics is applied to measure the trend in time series data, but it cannot be applied when the data are recorded in intervals or are imprecise. In this paper, we introduce a semi-average method under neutrosophic statistics to measure the trend in imprecise or interval data. The proposed method is applied to wind speed data. Its efficiency over the semi-average method under classical statistics is assessed in terms of information and adequacy.

## Introduction

The semi-average method is an important method in time series analysis that is used to analyze the trend in time series data. This method is very simple and easy to apply in practice. In this method, the time series data in hand is divided into two parts and the corresponding average of each part is calculated. The semi-average method can be used in a variety of fields for estimation and forecasting purposes. It is used to set the trend in the time series data and to forecast future values from the data. Although this method is simple, easy, objective, and gives identical trend values, it is a crude method. The application of time series methods in organizational research can be seen in [1]. The application of time series analysis can be seen in [2, 3]. The application of the method for geographical data can be seen in [4]. Kosiorowski et al. [5] proposed the Wilcoxon statistic for time series data.

The estimation and forecasting of wind speed cannot be done without the use of appropriate statistical techniques. There are many studies on the use of statistical methods in the fields of energy and weather. Applications of statistical distributions and other statistical methods to wind speed data and wind speed forecasting can be seen in [6,7,8,9,10,11,12].

It is important to note that classical statistics can be applied for forecasting and estimation purposes only when the time series data have precise, certain, and determinate observations. The use of such statistical techniques in an uncertain environment may mislead the expert in estimating or forecasting wind speed. Therefore, statistical methods using fuzzy logic are applied to deal with this type of data. The applications of fuzzy-based statistical methods in estimating or forecasting can be seen in [13,14,15,16].

Smarandache [17] discussed the advantages of neutrosophic logic over fuzzy logic. Based on this logic, the idea of neutrosophic statistics was introduced by [18]. More information on neutrosophic logic can be seen in [19, 20]. Smarandache [18, 21, 22] showed that neutrosophic statistics (NS) is more effective than classical statistics in an uncertain environment. Smarandache [23] proved the efficiency of NS over interval statistics and classical statistics. NS can be applied when imprecise, vague, and uncertain observations are recorded in the time series data. Guan et al. [24] introduced a new perspective for time series forecasting, considering the quantification of inconsistency as a key characteristic. Abdel-Basset et al. [25] presented a novel neutrosophic forecasting approach using neutrosophic time series (NTS), transforming historical data into NTS data with truth, indeterminacy, and falsity functions. Their method includes determining neutrosophic logical relationship groups (NLRGs) and providing a deneutrosophication method for NTS. Singh et al. [26] applied a neutrosophic-entropy based clustering algorithm (NEBCA) to fMRI time series datasets, specifically focusing on working memory tasks and resting-state data. Aslam and Albassam [27] introduced the method of least squares under NS. More advantages and applications of NS can be seen in [28, 29].

Aslam and Albassam [27] made a valuable contribution by introducing the application of the least square method under NS for wind speed data forecasting. However, a thorough exploration of the literature on NS for time series analysis reveals a significant gap: no previous research has focused on the semi-average method under NS. This gap presents an opportunity to introduce the semi-average method under NS in this paper. By addressing it, we aim to bring a novel approach to time series analysis within the framework of NS. Furthermore, we intend to demonstrate the practicality and effectiveness of the proposed method by applying it to wind speed data and comparing its performance against the semi-average method under classical statistics. This comparative analysis will showcase the novelty of the approach and assess its advantages over the existing method.

## Preliminaries

Let \({Y}_{1N}={Y}_{1L}+{Y}_{1U}{I}_{1N};{I}_{1N}\in \left[{I}_{1L},{I}_{1U}\right]\), \({Y}_{2N}={Y}_{2L}+{Y}_{2U}{I}_{2N};{I}_{2N}\in \left[{I}_{2L},{I}_{2U}\right]\), …, \({Y}_{nN}={Y}_{nL}+{Y}_{nU}{I}_{nN};{I}_{nN}\in \left[{I}_{nL},{I}_{nU}\right]\) represent the time series variables under neutrosophic statistics. Suppose that \({n}_{N}\in \left[{n}_{L},{n}_{U}\right]\) is the neutrosophic sample size and \({I}_{N}\in \left[{I}_{nL},{I}_{nU}\right]\) is the measure of uncertainty/indeterminacy. Suppose that \({Y}_{1L}\), \({Y}_{2L}\), …, \({Y}_{nL}\) are the neutrosophic values of the time series. Let \({X}_{1N}={X}_{1L}+{X}_{1U}{I}_{1N};{I}_{1N}\in \left[{I}_{1L},{I}_{1U}\right]\) and \({X}_{2N}={X}_{2L}+{X}_{2U}{I}_{2N};{I}_{2N}\in \left[{I}_{2L},{I}_{2U}\right]\) represent the coded values for the first and second halves of the time series data. For more details, see [18, 21, 22]. The neutrosophic averages can be computed as follows.

**Step-1:** The neutrosophic average of the determinate part of the 1st half is calculated as

**Step-2:** The neutrosophic average of the indeterminate part of the 1st half is calculated as

**Step-3:** The neutrosophic form of both averages is given by.
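The three steps can be sketched in the notation above as follows. These displayed forms are an assumption reconstructed from the step descriptions and from the definition \({Y}_{iN}={Y}_{iL}+{Y}_{iU}{I}_{iN}\); the symbol \(m\) (the number of observations in the 1st half) is introduced here for illustration only, and analogous expressions would hold for the 2nd half.

\[
\begin{aligned}
\overline{Y}_{1L} &= \frac{1}{m}\sum_{i=1}^{m} Y_{iL}, \\
\overline{Y}_{1U} &= \frac{1}{m}\sum_{i=1}^{m} Y_{iU}, \\
\overline{Y}_{1N} &= \overline{Y}_{1L} + \overline{Y}_{1U}\, I_{N}.
\end{aligned}
\]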

## Semi-average method under indeterminacy

The neutrosophic time series data consists of neutrosophic numbers having a lower and an upper value of the variable of interest. Following the classical semi-average method, the neutrosophic time series data is divided into two halves. The neutrosophic averages of the 1st half and the 2nd half are computed and placed at the center of each half. Let \({\overline{Y} }_{1N}\in \left[{\overline{Y} }_{1L},{\overline{Y} }_{1U}\right]\) be the neutrosophic average of the 1st half and \({\overline{Y} }_{2N}\in \left[{\overline{Y} }_{2L},{\overline{Y} }_{2U}\right]\) be the neutrosophic average of the 2nd half. Let \({X}_{1N}\) and \({X}_{2N}\) denote the coded values of each half, respectively. The proposed semi-average method under neutrosophic statistics is explained as follows.

The regression line under the proposed method is given by

\[
{\widehat{Y}}_{N}={a}_{N}+{b}_{N}{X}_{N};\quad {\widehat{Y}}_{N}\in \left[{\widehat{Y}}_{L},{\widehat{Y}}_{U}\right],
\]

where \({a}_{N}\in \left[{a}_{L},{a}_{U}\right]\) is the intercept under neutrosophic statistics and \({b}_{N}\in \left[{b}_{L},{b}_{U}\right]\) is the slope of the regression line.

The necessary steps to show the derivation of the above equations are shown in the appendix.
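The procedure described in this section can be sketched in Python. This is a minimal illustration rather than the paper's implementation, and it rests on assumptions that the text does not state: time is coded as \(t = 1, \dots, n\), the coded value of each half is the mean of its time indices, the middle observation is dropped when \(n\) is odd, and the lower and upper series are fitted separately to give \(\left[{a}_{L},{a}_{U}\right]\) and \(\left[{b}_{L},{b}_{U}\right]\). The function names and the toy values are hypothetical.

```python
from statistics import mean


def semi_average_line(values):
    """Fit a line through the two half-means (classical semi-average method).

    Assumes time is coded as t = 1..n and returns (intercept, slope).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    first, second = values[:half], values[n - half:]  # middle value dropped if n is odd
    x1 = mean(range(1, half + 1))          # coded centre of the 1st half
    x2 = mean(range(n - half + 1, n + 1))  # coded centre of the 2nd half
    y1, y2 = mean(first), mean(second)
    slope = (y2 - y1) / (x2 - x1)          # line through (x1, y1) and (x2, y2)
    intercept = y1 - slope * x1
    return intercept, slope


def neutrosophic_semi_average(lower, upper):
    """Fit the lower and upper series separately (an assumption of this sketch).

    Returns ((a_L, a_U), (b_L, b_U)).
    """
    a_l, b_l = semi_average_line(lower)
    a_u, b_u = semi_average_line(upper)
    return (a_l, a_u), (b_l, b_u)


# Toy interval data for illustration only; these are not the wind speed data.
lower = [0, 0, 0, 0]
upper = [2, 4, 6, 8]
intercepts, slopes = neutrosophic_semi_average(lower, upper)
print("a_N in", intercepts, "b_N in", slopes)
```

Fitting each bound separately keeps the sketch close to the classical method, so it reduces to the classical fit whenever the lower and upper series coincide.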

## Application using wind speed data

As mentioned earlier, wind speed data is usually recorded as a minimum value and a maximum value. Therefore, we use wind speed data (mph) and apply the proposed semi-average method to it. The data was recorded by the Meteorology department in Punjab, Pakistan. Wind speed data recorded in intervals can be forecast more adequately using neutrosophic statistics than using the semi-average method under classical statistics. The data for selected months, taken from [30], are shown in Tables 1, 2 and 3.

Some necessary computations for the implementation of the proposed method are shown as follows.

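The regression line under neutrosophic statistics for January 2020 is expressed as

\[
{\widehat{Y}}_{N}={a}_{N}+{b}_{N}{X}_{N};\quad {a}_{L}=0,\ {a}_{U}=8.86,\ {b}_{L}=0,\ {b}_{U}=0.43.
\]

The regression line under neutrosophic statistics for February 2020 is expressed as

\[
{\widehat{Y}}_{N}={a}_{N}+{b}_{N}{X}_{N};\quad {a}_{L}=0.1428,\ {a}_{U}=10.57,\ {b}_{L}=0.02,\ {b}_{U}=0.19.
\]

The regression line under neutrosophic statistics for March 2020 is expressed as

\[
{\widehat{Y}}_{N}={a}_{N}+{b}_{N}{X}_{N};\quad {a}_{L}=1.4,\ {a}_{U}=15.33,\ {b}_{L}=0.87,\ {b}_{U}=-\,0.29.
\]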

The regression line for January 2020 shows that the intercept lies between 0 and 8.86, and the slope lies in the indeterminate interval from 0 to 0.43. For February 2020, the intercept lies between \(0.1428\) and \(10.57\), and the slope lies in the indeterminate interval from \(0.02\) to \(0.19\). For March 2020, the intercept lies between \(1.4\) and \(15.33\), and the slope ranges from \(0.87\) to \(-\,0.29\). Using these regression lines, the trend values are calculated and placed in Tables 1, 2 and 3. From Tables 1, 2 and 3, it can be noted that the lower (minimum) value of wind speed is mostly zero. Therefore, a larger indeterminacy can be expected in forecasting using this indeterminate data. From Table 3, the fitted regression line for the 4th March 2020 is \({\widehat{Y}}_{N}=0.0628+9.81{I}_{N};{I}_{N}\in \left[0,0.99\right]\). From this trend equation, it can be forecast that the wind speed will be between 0.0628 and 9.81 mph, with a measure of indeterminacy of 0.99. The fitted trend lines and the real wind speed (mph) for the 3 months are shown in Fig. 1. From Fig. 1, it can be seen that for January, the values of the first 11 days are close to the fitted regression lines, while the values of the remaining days are away from the upper value of the trend line. The actual values for February and March are away from the upper values of the trend lines. On the other hand, the lower values of the actual data are very close to the lower values of the trend lines for all 3 months. Figure 1 clearly shows that indeterminate wind speed (mph) should be forecast using the semi-average method under neutrosophic statistics. For this type of time series data, the use of the existing semi-average method under classical statistics may mislead decision-makers. Based on the wind speed data, it can be concluded that the proposed semi-average method under neutrosophic statistics is adequate and suitable for forecasting purposes.
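The neutrosophic form quoted for the 4th March 2020 can be reproduced with a short Python sketch. It assumes, since the text does not state the formula, that the upper limit of the measure of indeterminacy is \({I}_{U}=({\widehat{Y}}_{U}-{\widehat{Y}}_{L})/{\widehat{Y}}_{U}\); this choice is consistent with the reported value of 0.99 for the interval from 0.0628 to 9.81. The function name is hypothetical.

```python
def indeterminacy_upper(y_lower, y_upper):
    """Assumed upper limit of the indeterminacy measure: (Y_U - Y_L) / Y_U."""
    if y_upper == 0:
        return 0.0  # guard against division by zero when the upper value is 0
    return (y_upper - y_lower) / y_upper


# Trend values for the 4th March 2020 as quoted in the text.
y_lower, y_upper = 0.0628, 9.81
i_upper = indeterminacy_upper(y_lower, y_upper)
print(f"Y_N = {y_lower} + {y_upper} I_N; I_N in [0, {i_upper:.2f}]")
```

Under this assumption, a lower trend value near zero pushes the measure toward 1, which matches the remark that mostly-zero minimum wind speeds lead to larger indeterminacy.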

## Competitive study based on wind speed data

The proposed semi-average method under neutrosophic statistics is a generalization of the semi-average method under classical statistics. The equations of the proposed semi-average method reduce to the equations under classical statistics when \({I}_{L}=0\). The efficiency of the proposed method is compared with the existing semi-average method under classical statistics in terms of information and adequacy. For comparison, the trend values, their neutrosophic forms, and the measure of indeterminacy are presented in Tables 1, 2 and 3. Note that in each neutrosophic form in Tables 1, 2 and 3, the first value represents the trend value under classical statistics and the second value represents the indeterminate part. Each neutrosophic form reduces to the determinate part when \({I}_{N}=0\). For example, the fitted regression line for 5th February 2020 is \({\widehat{Y}}_{N}=0.0828+10{I}_{N}\). In this neutrosophic form, the first value \(0.0828\) represents the forecast value under classical statistics, the second value \(10{I}_{N}\) denotes the indeterminate part, and the measure of indeterminacy is \({I}_{N}\in \left[0,0.99\right]\). This is the strength of the proposed model: it provides forecasts as indeterminate intervals rather than exact results in the presence of uncertainty. In addition, the proposed method gives information about the measure of indeterminacy that cannot be obtained from the analysis under classical statistics. Note that there are higher measures of indeterminacy in January and February. In March, after the 6th day, smaller values of the measure of indeterminacy in the trend values are found. From the analysis, it is clear that the proposed semi-average method under neutrosophic statistics is more flexible than the classical method. In addition, in the presence of a high measure of indeterminacy, the forecast may be affected and may mislead decision-makers. We also note that for interval data, it is better to use neutrosophic statistics to forecast wind speed.
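The reduction to the classical value can be checked numerically for the 5th February 2020 example. The sketch below evaluates \({\widehat{Y}}_{N}=0.0828+10{I}_{N}\) at the two ends of the indeterminacy interval. The unrounded upper limit \((10-0.0828)/10\) is an assumption consistent with the reported 0.99, and the helper name is hypothetical.

```python
def evaluate_neutrosophic(determinate, coefficient, i_n):
    """Evaluate Y_N = determinate + coefficient * I_N at a given I_N."""
    return determinate + coefficient * i_n


determinate, coefficient = 0.0828, 10.0
# Assumed unrounded upper limit of I_N (reported as 0.99 in the text).
i_upper = (coefficient - determinate) / coefficient

classical = evaluate_neutrosophic(determinate, coefficient, 0.0)  # I_N = 0
upper = evaluate_neutrosophic(determinate, coefficient, i_upper)  # I_N = I_U
print(classical, upper)
```

At \({I}_{N}=0\) only the determinate part remains, while at the upper limit the form returns the upper trend value, so the single expression spans the whole forecast interval.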

## Conclusions

The paper introduced a modified version of the semi-average method within the context of neutrosophic statistics. This modified semi-average method serves as a generalization of the existing approach. Based on the methodology and application, it can be concluded that the proposed semi-average method is well-suited for situations where data is recorded in intervals. In contrast, utilizing the existing semi-average method for wind speed prediction with interval data may lead decision-makers astray. It is important to note that the proposed method has its limitations and can only be applied when dealing with imprecise observations within time series data. In addition, the neutrosophic semi-average method utilizes less information compared to other time series methods. The applications of the proposed method extend to various fields such as meteorology for weather forecasting, business, and medical science. As for further research, other time series methods can be developed within the framework of neutrosophic statistics, which would open up possibilities for additional advancements in the field.

## Availability of data and materials

The data is given in the paper.

## References

1. Jebb AT, Tay L. Introduction to time series analysis for organizational research: methods for longitudinal analyses. Organ Res Methods. 2017;20(1):61–94.

2. Chatfield C, Xing H. The analysis of time series: an introduction with R. Boca Raton: CRC Press; 2019.

3. McDowall D, McCleary R, Bartos BJ. Interrupted time series analysis. Oxford: Oxford University Press; 2019.

4. Feyrer J. Trade and income—exploiting time series in geography. Am Econ J Appl Econ. 2019;11(4):1–35.

5. Kosiorowski D, Rydlewski JP, Snarska M. Detecting a structural change in functional time series using local Wilcoxon statistic. Stat Pap. 2019;60(5):1677–98.

6. Bidaoui H, et al. Wind speed data analysis using Weibull and Rayleigh distribution functions, case study: five cities Northern Morocco. Procedia Manuf. 2019;32:786–93.

7. Alrashidi M, Rahman S, Pipattanasomporn M. Metaheuristic optimization algorithms to estimate statistical distribution parameters for characterizing wind speeds. Renew Energy. 2020;149:664–81.

8. ul Haq MA, et al. Marshall-Olkin Power Lomax distribution for modeling of wind speed data. Energy Rep. 2020;6:1118–23.

9. Mahmood FH, Resen AK, Khamees AB. Wind characteristic analysis based on Weibull distribution of Al-Salman site, Iraq. Energy Rep. 2019;6:79–87.

10. Akgül FG, Şenoğlu B. Comparison of wind speed distributions: a case study for Aegean coast of Turkey. Energy Sources Part A Recovery Util Environ Eff. 2019. https://doi.org/10.1080/15567036.2019.1663309.

11. Zaman B, Lee MH, Riaz M. An improved process monitoring by mixed multivariate memory control charts: An application in wind turbine field. Comput Ind Eng. 2020;142:106343.

12. Yan J. A comparison wind power forecasting with different methods. Integr Ferroelectr. 2022;227(1):191–201.

13. Grzegorzewski P. K-sample median test for vague data. Int J Intell Syst. 2009;24(5):529–39.

14. Laib M, et al. Multifractal analysis of the time series of daily means of wind speed in complex regions. Chaos Solitons Fractals. 2018;109:118–27.

15. Grzegorzewski P, Śpiewak M. The sign test and the signed-rank test for interval-valued data. Int J Intell Syst. 2019;34(9):2122–50.

16. Sezer OB, Gudelek MU, Ozbayoglu AM. Financial time series forecasting with deep learning: a systematic literature review: 2005–2019. Appl Soft Comput. 2020;90:106181.

17. Smarandache F. Neutrosophy. Neutrosophic probability, set, and logic, proQuest information learning. Ann Arbor Mich USA. 1998;105:118–23.

18. Smarandache F. Introduction to neutrosophic statistics. Infinite Study. 2014. https://doi.org/10.13140/2.1.2780.1289.

19. Broumi S, et al. Interval-valued fermatean neutrosophic graphs. Collect Papers Vol XIII Various Sci Topics. 2022. https://doi.org/10.31181/dmame0311072022b.

20. Broumi S, et al. Complex fermatean neutrosophic graph and application to decision making. Decis Mak App Manag Eng. 2023;6(1):474–501.

21. Chen J, Ye J, Du S. Scale effect and anisotropy analyzed for neutrosophic numbers of rock joint roughness coefficient based on neutrosophic statistics. Symmetry. 2017;9(10):208.

22. Chen J, et al. Expressions of rock joint roughness coefficient using neutrosophic interval statistical numbers. Symmetry. 2017;9(7):123.

23. Smarandache F. Neutrosophic statistics is an extension of Interval Statistics, while plithogenic statistics is the most general form of statistics (second version). Infinite Study. 2022. https://doi.org/10.54216/IJNS.190111.

24. Guan H, et al. A neutrosophic forecasting model for time series based on first-order state and information entropy of high-order fluctuation. Entropy. 2019;21(5):455.

25. Abdel-Basset M, et al. A refined approach for forecasting based on neutrosophic time series. Symmetry. 2019;11(4):457.

26. Singh P, et al. Analysis of fMRI time series: neutrosophic-entropy based clustering algorithm. J Adv Inf Technol. 2022;13(3):224–9.

27. Aslam M, Albassam M. Forecasting of wind speed using interval-based least square method. Front Energy Res. 2022. https://doi.org/10.3389/fenrg.2022.896217.

28. Aslam M. Design of the Bartlett and Hartley tests for homogeneity of variances under indeterminacy environment. J Taibah Univ Sci. 2020;14(1):6–10.

29. Aslam M. On detecting outliers in complex data using Dixon’s test under neutrosophic statistics. J King Saud Univ Sci. 2020;32(3):2005–8.

30. Aslam M. Retracted article: Forecasting of the wind speed under uncertainty. Sci Rep. 2022;10:20300.

## Acknowledgements

The author is deeply thankful to the editor and reviewers for their valuable suggestions to improve the quality and presentation of the paper.

## Funding

None.

## Author information

### Contributions

MA wrote the paper.

## Ethics declarations

### Ethics approval and consent to participate

Not applicable.

### Consent for publication

Not applicable.

### Competing interests

No conflict of interest regarding the paper.

## Appendix

Let \({Y}_{1}={a}_{1}+{b}_{1}{I}_{NY}\) and \({Y}_{2}={a}_{2}+{b}_{2}{I}_{NY}\) for \({I}_{NY}\epsilon \left[{I}_{LY},{I}_{UY}\right]\) be two neutrosophic regression numbers obtained from regression analysis. The basic operation of these neutrosophic regression numbers can be performed as follows.

Or

Based on the above-mentioned information, let \({Y}_{i}={a}_{i}+{b}_{i}{I}_{NY}\ \left(i=1,2,3,\dots ,n\right)\) be a group of neutrosophic numbers obtained from regression analysis having the lower and upper values for \({I}_{NY}\in \left[{I}_{LY},{I}_{UY}\right]\). The neutrosophic average of these regression numbers can be calculated as follows

where \(\overline{a} = \,\frac{1}{n}\sum\nolimits_{i = 1}^{n} {a_{i} }\) and \(\overline{b} = \frac{1}{n}\sum\nolimits_{i = 1}^{n} {b_{i} }\)
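The averaging rule can be illustrated with a small Python sketch. It assumes that addition of neutrosophic regression numbers acts componentwise on the determinate part \(a\) and the coefficient \(b\), which yields the average \(\overline{a}+\overline{b}{I}_{NY}\) implied by the definitions of \(\overline{a}\) and \(\overline{b}\) above. The class and function names are hypothetical, and the sample numbers are toy values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NeutrosophicNumber:
    """A number of the form a + b * I, with I in [I_L, I_U]."""
    a: float  # determinate part
    b: float  # coefficient of the indeterminate part

    def __add__(self, other):
        # Assumed componentwise addition.
        return NeutrosophicNumber(self.a + other.a, self.b + other.b)

    def scale(self, k):
        # Multiplication by a real scalar k.
        return NeutrosophicNumber(k * self.a, k * self.b)

    def interval(self, i_lower, i_upper):
        # Range of values taken as I runs over [i_lower, i_upper].
        lo = self.a + self.b * i_lower
        hi = self.a + self.b * i_upper
        return (min(lo, hi), max(lo, hi))


def neutrosophic_average(numbers):
    """Return a_bar + b_bar * I for a non-empty list of neutrosophic numbers."""
    if not numbers:
        raise ValueError("need at least one number")
    total = NeutrosophicNumber(0.0, 0.0)
    for y in numbers:
        total = total + y
    return total.scale(1.0 / len(numbers))


avg = neutrosophic_average([NeutrosophicNumber(1.0, 2.0), NeutrosophicNumber(3.0, 4.0)])
print(avg, avg.interval(0.0, 0.5))
```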

The development of Eq. (4), Eq. (5), Eq. (6) and Eq. (7) under neutrosophic statistics is given as follows

Suppose that \({\overline{Y} }_{1N}\in \left[{\overline{Y} }_{1L},{\overline{Y} }_{1U}\right]\) and \({\overline{Y} }_{2N}\in \left[{\overline{Y} }_{2L},{\overline{Y} }_{2U}\right]\) are the neutrosophic averages for the 1st half and the 2nd half, respectively. Let \({X}_{1N}\) and \({X}_{2N}\) be the coded values for the 1st and 2nd half, respectively. Let \(\left({X}_{1N},{\overline{Y} }_{1N}\right)\) and \(\left({X}_{2N},{\overline{Y} }_{2N}\right)\) be the two points through which the estimated line \({\widehat{Y}}_{N}={a}_{N}+{b}_{N}{X}_{N};{\widehat{Y}}_{N}\in \left[{\widehat{Y}}_{L},{\widehat{Y}}_{U}\right]\) passes. The two constants \({a}_{N}\) and \({b}_{N}\) can be determined from the following expression

By following the neutrosophic theory, the neutrosophic form of \({\overline{Y} }_{1N}\in \left[{\overline{Y} }_{1L},{\overline{Y} }_{1U}\right]\) is expressed by

The neutrosophic form of \({\overline{Y} }_{2N}\epsilon \left[{\overline{Y} }_{2L},{\overline{Y} }_{2U}\right]\) is expressed by.

The above expression can be written as follows

where

and
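For reference, the two-point derivation described in this appendix can be sketched as follows. The steps are standard algebra for a line passing through \(\left({X}_{1N},{\overline{Y} }_{1N}\right)\) and \(\left({X}_{2N},{\overline{Y} }_{2N}\right)\). They are offered as an assumption about the intended form rather than as the paper's exact expressions, and they presume \({X}_{1N}\ne {X}_{2N}\).

\[
\begin{aligned}
\overline{Y}_{1N} &= a_{N} + b_{N} X_{1N}, \\
\overline{Y}_{2N} &= a_{N} + b_{N} X_{2N}, \\
b_{N} &= \frac{\overline{Y}_{2N} - \overline{Y}_{1N}}{X_{2N} - X_{1N}}, \\
a_{N} &= \overline{Y}_{1N} - b_{N} X_{1N}.
\end{aligned}
\]

The slope follows from subtracting the first equation from the second, and the intercept from substituting the slope back into the first. One simple reading, assumed here, applies these formulas separately to the lower and upper averages to obtain \({a}_{L}\), \({a}_{U}\), \({b}_{L}\) and \({b}_{U}\).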

## Rights and permissions

**Open Access** This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

## About this article

### Cite this article

Aslam, M. Time series data analysis under indeterminacy.
*J Big Data* **10**, 126 (2023). https://doi.org/10.1186/s40537-023-00806-4
