Data analysis for vague contingency data

Abstract

The existing Fisher’s exact test has been widely applied to investigate whether the difference between observed frequencies is significant or not. It can be applied only when the observed frequencies are in determinate form and contain no vague information. In practice, due to the complexity of the production process, it is not always possible to record observed frequencies in determinate form. Therefore, the use of the existing Fisher’s exact test may mislead industrial engineers. This paper presents a modification of Fisher’s exact test using neutrosophic statistics. The operational process, a simulation study, and an application using production data are given. From the analysis of the industrial data, it is concluded that the proposed Fisher’s exact test performs better than the existing Fisher’s exact test.

Introduction

Fisher’s exact test based on classical statistics has been applied to investigate whether observed frequencies from two dichotomous classifications are associated with each other or independent of each other. It is usually applied to a \(2\times 2\) contingency table. The main aim of Fisher’s exact test is to test the null hypothesis that the two dichotomous classifications are independent against the alternative hypothesis that they are associated. According to Kanji [1], the test statistic \(\left(\sum p\right)\) of Fisher’s exact test is calculated and compared with the specified level of significance (the probability of rejecting the null hypothesis when it is true); the null hypothesis is rejected when the calculated value of the test statistic is less than the level of significance, otherwise it is not rejected. Chen [2] differentiated between the chi-square test and Fisher’s exact test for the \(2\times 2\) contingency table. Choi et al. [3] discussed the foundations of inference for the \(2\times 2\) contingency table. Zhong and Song [4] discussed the application of the test to biological data. Ma and Mao [5] discussed its application for scanning dependency. More information on Fisher’s exact test can be found in [6,7,8].

Fuzzy logic is applied where uncertainty is present in the data. Such uncertain data cannot be analyzed with statistical tests based on classical statistics. A fuzzy-based analysis provides information about two measures only (truth and falsehood). Neutrosophic logic, which carries more information under uncertainty, was introduced by [9]. Smarandache [10] showed that neutrosophic logic has an edge over interval-based analysis and fuzzy logic. Basha et al. [11] and Das et al. [12] discussed applications of neutrosophic logic. Based on the idea of neutrosophic numbers, neutrosophic statistics was introduced by [13] and further investigated by [14, 15]. Neutrosophic statistics was found to be more informative and more efficient than classical statistics in [12, 16, 17].

The operational process of Fisher’s exact test under classical statistics is designed to analyze only determinate (exact) observed frequencies; the existing test cannot be applied when the observed frequencies are intervals. From the literature and to the best of the authors’ knowledge, no effort has been made to design Fisher’s exact test using neutrosophic statistics. In this paper, we extend Fisher’s exact test to neutrosophic statistics. The test statistic of Fisher’s exact test is modified to analyze neutrosophic numbers. The power of the test is discussed and an application is given using industrial data. It is expected that Fisher’s exact test under neutrosophic statistics will be more efficient than the existing Fisher’s exact test in terms of power, information, and flexibility.

The proposed Fisher’s exact test

The existing Fisher’s exact test under classical statistics is applied to investigate whether the difference between observed frequencies is significant or not. It cannot be applied if the observed frequencies are intervals rather than exact numbers. To overcome this issue, it is necessary to modify Fisher’s exact test using neutrosophic statistics so that the difference between frequencies can be investigated in the presence of interval, fuzzy, imprecise, and indeterminate data. As with Fisher’s exact test under classical statistics, the proposed test under neutrosophic statistics is applied using a \(2\times 2\) contingency table. Let \({a}_{N}={a}_{L}+{a}_{U}{I}_{N};{I}_{N}\epsilon \left[{I}_{L},{I}_{U}\right]\), \({b}_{N}={b}_{L}+{b}_{U}{I}_{N};{I}_{N}\epsilon \left[{I}_{L},{I}_{U}\right]\), \({c}_{N}={c}_{L}+{c}_{U}{I}_{N};{I}_{N}\epsilon \left[{I}_{L},{I}_{U}\right]\), and \({d}_{N}={d}_{L}+{d}_{U}{I}_{N};{I}_{N}\epsilon \left[{I}_{L},{I}_{U}\right]\) be the neutrosophic observed frequencies. Note that the first parts denote the determinate values, \({a}_{U}{I}_{N}\), \({b}_{U}{I}_{N}\), \({c}_{U}{I}_{N}\), \({d}_{U}{I}_{N}\) are the indeterminate parts, and \({I}_{N}\epsilon \left[{I}_{L},{I}_{U}\right]\) is the measure of indeterminacy associated with the observed frequencies. This measure can be calculated from the imprecise data as (upper value − lower value)/upper value. Let \({{N}_{N}=N}_{L}+{N}_{U}{I}_{N};{I}_{N}\epsilon \left[{I}_{L},{I}_{U}\right]\) be the total observed frequency. A \(2\times 2\) contingency table for carrying out Fisher’s exact test under the idea of neutrosophy is presented in Table 1; see [18, 19] for more details. The neutrosophic test statistic \(\sum {p}_{N}\epsilon \left[\sum {p}_{L},\sum {p}_{U}\right]\) for Fisher’s exact test is defined as
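For a computational view of this setup, the following minimal Python sketch (with purely hypothetical interval counts) shows how the measure of indeterminacy of an interval-valued observed frequency can be obtained from the (upper value − lower value)/upper value rule stated above; the function name and the example numbers are ours, not from the paper.

```python
def indeterminacy(lower: float, upper: float) -> float:
    """Measure of indeterminacy of an interval-valued observed frequency,
    computed as (upper value - lower value) / upper value."""
    return (upper - lower) / upper

# Hypothetical interval-valued cell counts [lower, upper] of a 2x2 table
cells = {"a": (10, 15), "b": (26, 32), "c": (11, 16), "d": (27, 33)}

for name, (lo, up) in cells.items():
    # determinate part = lower value; indeterminate part is scaled by I_N
    print(f"{name}_N = {lo} + {up}*I_N,  I_N in [0, {indeterminacy(lo, up):.3f}]")
```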

$$\sum {p}_{N}=\sum {p}_{L}+\sum {p}_{U}{I}_{{p}_{N}};{I}_{{p}_{N}}\epsilon \left[{I}_{{p}_{L}},{I}_{{p}_{U}}\right]$$
(1)

where the first part \(\sum {p}_{L}\) denotes the statistic of Fisher’s exact test under classical statistics, the second part \(\sum {p}_{U}{I}_{{p}_{N}}\) denotes the indeterminate part, and \({I}_{{p}_{N}}\epsilon \left[{I}_{{p}_{L}},{I}_{{p}_{U}}\right]\) is the uncertainty measure associated with the proposed test statistic. The proposed test statistic \(\sum {p}_{N}\epsilon \left[\sum {p}_{L},\sum {p}_{U}\right]\) reduces to the existing test statistic \(\sum {p}_{L}\) when \({I}_{{p}_{L}}\)=0. Following [1], the test statistic of the proposed test can be written as

Table 1 A \(2\times 2\) contingency table
$$\sum {p}_{N}=\frac{\left({a}_{L}+{b}_{L}\right)!\left({c}_{L}+{d}_{L}\right)!\left({a}_{L}+{c}_{L}\right)!\left({b}_{L}+{d}_{L}\right)!}{{N}_{L}!}\sum_{i}\frac{1}{{a}_{iL}!{b}_{iL}!{c}_{iL}!{d}_{iL}!}+\frac{\left({a}_{U}+{b}_{U}\right)!\left({c}_{U}+{d}_{U}\right)!\left({a}_{U}+{c}_{U}\right)!\left({b}_{U}+{d}_{U}\right)!}{{N}_{U}!}\sum_{i}\frac{1}{{a}_{iU}!{b}_{iU}!{c}_{iU}!{d}_{iU}!}{I}_{{p}_{N}};{I}_{{p}_{N}}\epsilon \left[{I}_{{p}_{L}},{I}_{{p}_{U}}\right]$$
(2)

The proposed test statistic \(\sum {p}_{N}\epsilon \left[\sum {p}_{L},\sum {p}_{U}\right]\) can be expressed as

$$\sum {p}_{N}\epsilon \left\{\begin{array}{c}\frac{\left({a}_{L}+{b}_{L}\right)!\left({c}_{L}+{d}_{L}\right)!\left({a}_{L}+{c}_{L}\right)!\left({b}_{L}+{d}_{L}\right)!}{{N}_{L}!}\sum_{i}\frac{1}{{a}_{iL}!{b}_{iL}!{c}_{iL}!{d}_{iL}!},\\ \frac{\left({a}_{U}+{b}_{U}\right)!\left({c}_{U}+{d}_{U}\right)!\left({a}_{U}+{c}_{U}\right)!\left({b}_{U}+{d}_{U}\right)!}{{N}_{U}!}\sum_{i}\frac{1}{{a}_{iU}!{b}_{iU}!{c}_{iU}!{d}_{iU}!}\end{array}\right\}$$
(3)

As mentioned in [1] “the summation is over all possible 2 × 2 schemes with a cell frequency equal to or smaller than the smallest experimental frequency (keeping the row and column totals fixed as above)”.

The computed value of \(\sum {p}_{N}\epsilon \left[\sum {p}_{L},\sum {p}_{U}\right]\) is compared with the pre-specified level of significance \(\alpha\). The null hypothesis of independence between sample and class is rejected if \(\sum {p}_{N}\epsilon \left[\sum {p}_{L},\sum {p}_{U}\right]<\alpha\); otherwise, the null hypothesis is not rejected. The operational procedure of the proposed Fisher’s exact test under neutrosophic statistics is shown in Fig. 1.
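The decision rule above can be turned into a short computational sketch. The Python code below is our own illustration of Eqs. (2)–(3), not the authors’ implementation: the helper sums the hypergeometric probabilities of all tables with the same margins whose varying cell does not exceed the observed smallest cell (as in the Kanji quotation above), evaluates the statistic at the lower and upper observed frequencies, and compares the resulting interval with \(\alpha\). The reading of the interval comparison (reject only when both endpoints fall below \(\alpha\)), the function names, and the example counts are assumptions on our part.

```python
from math import comb

def fisher_sum_p(a: int, b: int, c: int, d: int) -> float:
    """Sum of hypergeometric probabilities over all 2x2 tables with the same
    row/column totals whose top-left cell is <= the observed value of a.
    This reproduces the sum-of-p statistic when a is the smallest observed cell."""
    n = a + b + c + d
    total = 0.0
    for a_i in range(a + 1):
        c_i = a + c - a_i          # keeps column-1 total fixed
        d_i = d - a + a_i          # keeps row-2 and column-2 totals fixed
        if c_i < 0 or d_i < 0:
            continue
        total += comb(a + b, a_i) * comb(c + d, c_i) / comb(n, a + c)
    return total

def neutrosophic_fisher(lower_table, upper_table, alpha=0.05):
    """Evaluate the statistic at the lower and upper observed frequencies and
    compare the interval [p_L, p_U] with alpha (here: reject only if both are below)."""
    p_L = fisher_sum_p(*lower_table)
    p_U = fisher_sum_p(*upper_table)
    decision = "reject H0" if max(p_L, p_U) < alpha else "do not reject H0"
    return p_L, p_U, decision

# Hypothetical interval data: lower-level table (a, b, c, d) and upper-level table
print(neutrosophic_fisher((1, 2, 8, 20), (2, 3, 10, 24)))
```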

Fig. 1

The procedure of the proposed Fisher’s exact test under neutrosophic statistics

Application using industrial data

In this section, an application of the proposed test is given using information obtained from a manufacturing industry. Two machines, \({A}_{1}\) and \({A}_{2}\), each worked for an hour and the numbers of defective items produced were recorded as intervals. To explain the process of the proposed test, a \(2\times 2\) contingency table is taken from [20] and the data are shown in Table 2. The industrial engineers are interested in investigating whether there is a significant difference between the performance of machines \({A}_{1}\) and \({A}_{2}\). As mentioned before, neutrosophic-based tests can analyze interval-based data more effectively than tests based on classical statistics.

Table 2 A \(2\times 2\) contingency table of machines and production

The neutrosophic test statistic is obtained by computing, with the hypergeometric distribution, the probabilities of all admissible combinations, as outlined in Table 3. The smallest observed cell frequency is identified, and the summation is taken over all combinations whose corresponding cell frequency is equal to or smaller than this minimum. It is important to emphasize that these combinations are selected so that the row and column totals remain the same as those presented in Table 2.

Table 3 1st combination of the original data

Based on the possible combinations in Tables 2 and 3, \(\sum {p}_{N}\) is calculated as

$$\sum {p}_{N}=\frac{2!36!11!27!}{38!}\left\{\frac{1}{10!26!}+\frac{1}{2!11!25!}\right\}+\frac{2!47!16!33!}{49!}\left\{\frac{1}{15!32!}+\frac{1}{2!16!31!}\right\}{I}_{{p}_{N}};{I}_{{p}_{N}}\epsilon \left[{I}_{{p}_{L}},{I}_{{p}_{U}}\right]$$

The simplified neutrosophic form of \(\sum {p}_{N}\epsilon \left[0.9218,0.8980\right]\) is \(\sum {p}_{N}=0.9218-0.8980{I}_{{p}_{N}};{I}_{{p}_{N}}\epsilon \left[0,0.0267\right]\). Suppose that \(\alpha\)=0.05. The calculated values of \(\sum {p}_{N}\epsilon \left[0.9218,0.8980\right]\) are compared with 0.05. Since both values of the statistic are greater than 0.05, the industrial engineers do not reject the null hypothesis \({H}_{0}\) of no difference between the performance of machines \({A}_{1}\) and \({A}_{2}\). Figure 2 depicts the operational procedure of the proposed Fisher’s exact test for the production data.
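As a cross-check, the short Python snippet below recomputes the two ends of the statistic directly from the factorial expression above. The cell counts (1, 1, 10, 26) and (1, 1, 15, 32) are those implied by that expression; the helper name is ours.

```python
from math import comb

def p_sum(a, b, c, d):
    # left-tail sum of hypergeometric probabilities with margins fixed (Eq. 3)
    n, col1 = a + b + c + d, a + c
    return sum(comb(a + b, a_i) * comb(c + d, col1 - a_i) / comb(n, col1)
               for a_i in range(a + 1))

p_L = p_sum(1, 1, 10, 26)   # lower-level table from the expression above
p_U = p_sum(1, 1, 15, 32)   # upper-level table from the expression above
print(round(p_L, 4), round(p_U, 4))   # 0.9218 and 0.898, matching [0.9218, 0.8980]
```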

Fig. 2

The procedure of Fisher’s exact test for production data

Advantages based on industrial data

The proposed Fisher’s exact test using neutrosophic statistics is a generalization of several tests. The efficiency of the proposed test is now compared with Fisher’s exact test under classical statistics, under interval statistics, and under fuzzy logic in terms of information and adequacy. For this comparison, the neutrosophic statistic \(\sum {p}_{N}\epsilon \left[\sum {p}_{L},\sum {p}_{U}\right]\) obtained for the production data is considered. The neutrosophic form of the statistic from the data is \(\sum {p}_{N}=0.9218-0.8980{I}_{{p}_{N}};{I}_{{p}_{N}}\epsilon \left[0,0.0267\right]\). The first value \(\sum {p}_{L}\)=\(0.9218\) presents Fisher’s exact test under classical statistics, \(0.8980{I}_{{p}_{N}}\) is the indeterminate part, and \({I}_{{p}_{N}}\epsilon \left[0,0.0267\right]\) is the measure of indeterminacy associated with \(\sum {p}_{N}\epsilon \left[\sum {p}_{L},\sum {p}_{U}\right]\). The proposed statistic reduces to Fisher’s exact test under classical statistics when \({I}_{{p}_{L}}\)=0. Compared with Fisher’s exact test under classical statistics, the proposed test provides the values of the statistic in an indeterminate interval together with the measure of indeterminacy. For example, for testing the null hypothesis at a level of significance \(\alpha\)=0.05, the proposed test is interpreted as follows: the probability of accepting the null hypothesis is 0.95, the probability of committing an error is 0.05, and the measure of indeterminacy is \(0.0267\). From this comparison, it is clear that the proposed Fisher’s exact test under neutrosophic statistics is more efficient and more informative than Fisher’s exact test using classical statistics. Next, the proposed test is compared with Fisher’s exact test using interval statistics. The statistic based on interval statistics only captures the data inside the interval: it tells only that the value of the test statistic may vary from \(0.9218\) to \(0.8980\). Similarly, Fisher’s exact test using fuzzy logic gives information about the measure of truth, 0.95, and the measure of falseness, 0.05; like interval statistics, it tells only that the statistic may change from 0.9218 to \(0.8980\) in an uncertain environment. From the analysis, it is concluded that the proposed Fisher’s exact test under neutrosophic statistics has an edge over the other three Fisher’s exact tests. Therefore, its use in the production industry will give more information and facilitate decision-makers in the presence of an indeterminate environment.
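A small numerical check of this reduction property, using only the values reported above (our own illustration), evaluates the neutrosophic form of the statistic at the two extremes of the indeterminacy measure:

```python
def statistic(I: float) -> float:
    # Neutrosophic form of the statistic for the production data (values from the text)
    return 0.9218 - 0.8980 * I

print(statistic(0.0))     # 0.9218: reduces to the classical (determinate) value at I = 0
print(statistic(0.0267))  # ~0.8978: value at the upper measure of indeterminacy
```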

Simulation study

This section examines whether the measure of indeterminacy \({I}_{{p}_{N}}\epsilon \left[{I}_{{p}_{L}},{I}_{{p}_{U}}\right]\) affects the decision about the null hypothesis. To study this effect, various interval values of \(\sum {p}_{N}\epsilon \left[\sum {p}_{L},\sum {p}_{U}\right]\) are considered in Table 4. The neutrosophic forms of \(\sum {p}_{N}\) for the selected values, the corresponding measures of indeterminacy \({I}_{{p}_{N}}\epsilon \left[{I}_{{p}_{L}},{I}_{{p}_{U}}\right]\), and the decisions about the null hypothesis are also reported in Table 4. From Table 4, it can be seen that as \(\sum {p}_{N}\epsilon \left[\sum {p}_{L},\sum {p}_{U}\right]\) increases, the value of \({I}_{{p}_{N}}\epsilon \left[{I}_{{p}_{L}},{I}_{{p}_{U}}\right]\) decreases. For example, when \(\sum {p}_{N}\epsilon \left[0.01, 0.04\right]\), the value of \({I}_{{p}_{N}}\) is \({I}_{{p}_{N}}\epsilon \left[0,3\right]\); when \(\sum {p}_{N}\epsilon \left[0.95, 0.99\right]\), the value of \({I}_{{p}_{N}}\) is \({I}_{{p}_{N}}\epsilon \left[0,0.04\right]\). In addition, for the smaller measures of indeterminacy up to \({I}_{{p}_{N}}\epsilon \left[0,0.80\right]\), the decision about the null hypothesis does not change when comparing with \(\alpha\)=0.05, even though the values of \(\sum {p}_{N}\epsilon \left[\sum {p}_{L},\sum {p}_{U}\right]\) change. Larger values of \({I}_{{p}_{N}}\), however, affect the decision about the null hypothesis. For example, when \({I}_{{p}_{N}}\epsilon \left[0,3\right]\), the decision about the null hypothesis changes from “do not reject \({H}_{0}\)” to “reject \({H}_{0}\)”. From this study, it is clear that larger values of the measure of uncertainty/indeterminacy affect the decision about the null hypothesis. Therefore, industrial engineers should be very careful when making decisions in the presence of uncertainty.

Table 4 Effect of indeterminacy

Sensitivity analysis

The sensitivity of the proposed Fisher’s exact test under neutrosophic statistics is discussed now. The values of \({I}_{{p}_{N}}\epsilon \left[{I}_{{p}_{L}},{I}_{{p}_{U}}\right]\) are shown in Table 4. From Table 4, it can be seen that when \(\sum {p}_{N}\epsilon \left[\sum {p}_{L},\sum {p}_{U}\right]\) changes from [0.79, 0.75] to [0.89, 0.85], the measure of indeterminacy remains the same, that is, 0.05. When \(\sum {p}_{N}\epsilon \left[\sum {p}_{L},\sum {p}_{U}\right]\) changes from [0.94, 0.90] to [0.99, 0.95], the measure of indeterminacy remains the same, that is, 0.04. Similarly, there is not much change in \({I}_{{p}_{N}}\epsilon \left[{I}_{{p}_{L}},{I}_{{p}_{U}}\right]\) when \(\sum {p}_{N}\epsilon \left[\sum {p}_{L},\sum {p}_{U}\right]\) changes from [0.34, 0.30] to [0.44, 0.40]. This analysis shows that a change in the statistic \(\sum {p}_{N}\epsilon \left[\sum {p}_{L},\sum {p}_{U}\right]\) changes the values of \({I}_{{p}_{N}}\epsilon \left[{I}_{{p}_{L}},{I}_{{p}_{U}}\right]\) but does not necessarily affect the decision about the null hypothesis. From the analysis, it is concluded that the proposed test is sensitive only to the higher values of \({I}_{{p}_{N}}\epsilon \left[{I}_{{p}_{L}},{I}_{{p}_{U}}\right]\).

Power of the test

This section presents a discussion of the power of Fisher’s exact test under neutrosophic statistics. Let \(\alpha\) and \(\beta\) denote the probability of rejecting \({H}_{0}\) when it is true and the probability of accepting \({H}_{0}\) when it is false, respectively. The power of the test is denoted by \(\left(1-\beta \right)\).

Following Nosakhare and Bright [21], the steps used to calculate \(\beta\) are as follows (a simulation sketch is given after the steps).

  • Step-1: Generate a set of 10,000 random samples of the test statistic \(\sum {p}_{N}\)

  • Step-2: Compare the values of \(\sum {p}_{N}\) with the level of significance and record whether the null hypothesis \({H}_{0}\) is rejected or accepted.

  • Step-3: Determine the value of \(\beta\) (Type II error rate) as the ratio of the number of erroneous conclusions to the total number of replications.
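The paper does not spell out the data-generating mechanism behind Step-1, so the Python sketch below is only one possible reading: 2×2 tables are generated under an assumed alternative (two different defect proportions, here 0.2 vs. 0.5), the sum-of-p statistic is applied, and the rejection fraction estimates the power \(1-\beta\). The sample sizes, proportions, and helper names are all assumptions of ours.

```python
import random
from math import comb

def p_sum(a, b, c, d):
    # left-tail sum of hypergeometric probabilities with margins fixed (Eq. 3)
    n, col1 = a + b + c + d, a + c
    return sum(comb(a + b, a_i) * comb(c + d, col1 - a_i) / comb(n, col1)
               for a_i in range(a + 1))

def estimate_power(n1=30, n2=30, p1=0.2, p2=0.5, alpha=0.05, reps=10_000, seed=1):
    """Steps 1-3: simulate tables under an assumed alternative and estimate 1 - beta."""
    random.seed(seed)
    rejections = 0
    for _ in range(reps):
        a = sum(random.random() < p1 for _ in range(n1))  # defectives from machine 1
        c = sum(random.random() < p2 for _ in range(n2))  # defectives from machine 2
        b, d = n1 - a, n2 - c
        if p_sum(a, b, c, d) < alpha:                     # reject H0 of no difference
            rejections += 1
    return rejections / reps                              # estimated power

print(estimate_power())
```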

The values of \(\left(1-\beta \right)\) for various values of \(\alpha\) are shown in Table 5, and the power curve for the proposed Fisher’s exact test under neutrosophic statistics is shown in Fig. 3. As mentioned earlier, the proposed test reduces to Fisher’s exact test under classical statistics when no uncertainty is found. The lower curve in Fig. 3 shows the power for the indeterminate part and the upper curve shows the power for the determinate part; together they give the power of Fisher’s exact test under neutrosophic statistics. From Fig. 3, it is clear that as the value of \(\alpha\) increases, the power of the test decreases. For example, when \(\alpha =0.01\), the power of the test ranges from 0.9857 to 1; when \(\alpha =0.10\), the power ranges from 0.8978 to 0.6705. The first value, 0.8978, is the power of the test under classical statistics, so under indeterminacy the power reduces from 0.8978 to 0.6705; therefore, the use of the existing test under classical statistics may mislead decision-makers. In a nutshell, under neutrosophy the power of the test is an indeterminate interval rather than an exact value. This study shows that the proposed test is more flexible than the existing Fisher’s exact test.

Table 5 The values of power of the tests
Fig. 3

The power curve of Fisher’s exact test under neutrosophic statistics

Effect of indeterminacy on level of significance

Now, the effect of indeterminacy on the level of significance is discussed. To study this effect, various specified values of the level of significance are considered. Let \({\alpha }_{0}\) denote the pre-defined/specified level of significance and let \({\widehat{\alpha }}_{N}\epsilon \left[{\widehat{\alpha }}_{L},{\widehat{\alpha }}_{U}\right]\) denote the level of significance computed from the following simulation process (a sketch of which is given after the steps).

  • Step-1: Generate a set of 10,000 random samples of the test statistic \(\sum {p}_{N}\)

  • Step-2: Compare the values of \(\sum {p}_{N}\) with the level of significance and record whether the null hypothesis \({H}_{0}\) is rejected or accepted.

  • Step-3: Determine the value of \({\widehat{\alpha }}_{N}\epsilon \left[{\widehat{\alpha }}_{L},{\widehat{\alpha }}_{U}\right]\) (Type I error rate) as the ratio of the number of rejections to the total number of replications.
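Analogously to the power sketch, the Python illustration below estimates the Type I error rate by simulating under the null hypothesis of no difference (a common defect proportion, here 0.3). The proportion, sample sizes, and helper names are assumptions of ours, since the paper does not state them.

```python
import random
from math import comb

def p_sum(a, b, c, d):
    # left-tail sum of hypergeometric probabilities with margins fixed (Eq. 3)
    n, col1 = a + b + c + d, a + c
    return sum(comb(a + b, a_i) * comb(c + d, col1 - a_i) / comb(n, col1)
               for a_i in range(a + 1))

def estimate_alpha(n1=30, n2=30, p=0.3, alpha0=0.05, reps=10_000, seed=1):
    """Steps 1-3: simulate tables under H0 (same defect rate for both machines)
    and return the empirical rejection rate, an estimate of the attained alpha."""
    random.seed(seed)
    rejections = 0
    for _ in range(reps):
        a = sum(random.random() < p for _ in range(n1))
        c = sum(random.random() < p for _ in range(n2))
        b, d = n1 - a, n2 - c
        if p_sum(a, b, c, d) < alpha0:
            rejections += 1
    return rejections / reps

print(estimate_alpha())   # compare with the nominal alpha0
```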

By implementing the above simulation process, the values of \({\widehat{\alpha }}_{N}\epsilon \left[{\widehat{\alpha }}_{L},{\widehat{\alpha }}_{U}\right]\) are reported in Table 6. From Table 6, it can be noted that the lower values of \({\widehat{\alpha }}_{N}\) are the same as \({\alpha }_{0}\), but the upper values of \({\widehat{\alpha }}_{N}\) are larger than \({\alpha }_{0}\). In addition, as \({\alpha }_{0}\) increases, \({\widehat{\alpha }}_{N}\) increases. From the study, it is clear that while implementing the test under uncertainty, the level of significance may deviate from \({\alpha }_{0}\). For example, when \({\alpha }_{0}\)=0.05, the computed value is \({\widehat{\alpha }}_{N}\epsilon \left[0.05,0.20\right]\). The level of significance thus changes from 0.05 to 0.20, which can affect the decision about the null hypothesis.

Table 6 The computed values of \({\widehat{\alpha }}_{N}\)

Concluding remarks

In this paper, Fisher’s exact test under neutrosophic statistics was presented. The design of the proposed test under an indeterminate environment was given, and its operational procedure was explained with the help of industrial data. The proposed Fisher’s exact test is a generalization of the existing Fisher’s exact test under classical statistics. Based on the analysis and the simulation studies, it is concluded that the proposed test efficiently indicates the change in the power of the test and in the level of significance when the test is implemented in the presence of imprecise data. The proposed test is more adequate than the existing test for application in uncertain environments. Based on the analysis and simulation studies, the application of the proposed Fisher’s exact test is recommended in industries where the production data are ambiguous, imprecise, or in intervals. For future research, other statistical properties of the proposed test under neutrosophic statistics can be studied. Another fruitful area of research may be the extension of the proposed test to other sampling schemes.

Availability of data and materials

The data is given in the paper.

References

  1. Kanji GK. 100 Statistical tests. Newcastle upon Tyne: Sage; 2006.

  2. Chen Y-P. Do the chi-square test and Fisher’s exact test agree in determining extreme for 2× 2 tables? Am Stat. 2011;65(4):239–45.

  3. Choi L, Blume JD, Dupont WD. Elucidating the foundations of statistical inference with 2 x 2 tables. PLoS ONE. 2015;10(4):e0121263.

  4. Zhong H, Song M. A fast exact functional test for directional association and cancer biology applications. IEEE/ACM Trans Comput Biol Bioinf. 2018;16(3):818–26.

  5. Ma L, Mao J. Fisher exact scanning for dependency. J Am Stat Assoc. 2019;114(525):245–58.

  6. West LJ, Hankin RK. Exact tests for two-way contingency tables with structural zeros. J Stat Softw. 2008;28:1–19.

  7. Bolboacă SD, et al. Pearson-Fisher chi-square statistic revisited. Information. 2011;2(3):528–45.

  8. Ludbrook J. Analysing 2× 2 contingency tables: which test is best? Clin Exp Pharmacol Physiol. 2013;40(3):177–80.

  9. Smarandache F. Neutrosophy: neutrosophic probability, set, and logic. Ann Arbor: ProQuest Information & Learning; 1998.

  10. Smarandache F. Introduction to neutrosophic measure, neutrosophic integral, and neutrosophic probability. Infinite Study; 2013

  11. Basha SH, et al. Hybrid intelligent model for classifying chest X-ray images of COVID-19 patients using genetic algorithm and neutrosophic logic. Soft Comput. 2021. https://doi.org/10.1007/s00500-021-06103-7.

  12. Das R, Mukherjee A, Tripathy BC. Application of neutrosophic similarity measures in Covid-19. Ann Data Sci. 2021. https://doi.org/10.1007/s40745-021-00363-8.

  13. Smarandache F. Introduction to neutrosophic statistics. HITEC City: Infinite Study; 2014.

  14. Chen J, Ye J, Du S. Scale effect and anisotropy analyzed for neutrosophic numbers of rock joint roughness coefficient based on neutrosophic statistics. Symmetry. 2017;9(10):208.

  15. Chen J, et al. Expressions of rock joint roughness coefficient using neutrosophic interval statistical numbers. Symmetry. 2017;9(7):123.

  16. Aslam M, Arif OH, Sherwani RAK. New diagnosis test under the neutrosophic statistics: an application to diabetic patients. BioMed Res Int. 2020. https://doi.org/10.1155/2020/2086185.

  17. Saeed M, et al. An application of neutrosophic hypersoft mapping to diagnose hepatitis and propose appropriate treatment. IEEE Access. 2021;9:70455–71.

  18. Aslam M, Arif OH. Test of association in the presence of complex environment. Complexity. 2020. https://doi.org/10.1155/2020/2935435.

  19. Aslam M. Chi-square test under indeterminacy: an application using pulse count data. BMC Med Res Methodol. 2021;21(1):1–5.

  20. Parthiban S, Gajivaradhan P. A comparative study of chi-square goodness-of-fit under fuzzy environments. Int Knowl Shar Platf. 2020;6(2):2224.

  21. Nosakhare UH, Bright AF. Statistical analysis of strength of W/S test of normality against non-normal distribution using monte carlo simulation. Am J Theor Appl Stat. 2017;6(5–1):62–5.

Acknowledgements

Thanks to the editor and reviewers for their valuable comments to improve the quality and presentation of the paper. This research was funded by Princess Nourah bint Abdulrahman University and Researchers Supporting Project number (PNURSP2023R346), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

Funding

This research was funded by Princess Nourah bint Abdulrahman University and Researchers Supporting Project number (PNURSP2023R346), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

Author information

Authors and Affiliations

Authors

Contributions

M.A. and F.S.A. wrote the paper.

Corresponding author

Correspondence to Faten S. Alamri.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

No competing interests regarding the paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Aslam, M., Alamri, F.S. Data analysis for vague contingency data. J Big Data 10, 131 (2023). https://doi.org/10.1186/s40537-023-00812-6
