Estimating Latent Linear Correlations from Fuzzy Frequency Tables

Antonio Calcagnì

Communications in Mathematics and Statistics, 2022, Vol. 12, Issue (3): 435-461. DOI: 10.1007/s40304-022-00295-6


Abstract

This research concerns the estimation of latent linear, or polychoric, correlations from fuzzy frequency tables. Fuzzy counts are of interest to many disciplines, including the social and behavioral sciences. They are especially relevant in two situations: when observed data are classified using fuzzy categories (as in socioeconomic studies, clinical evaluations, content analysis, and inter-rater reliability analysis), and when imprecise observations are classified into either precise or imprecise categories (as in the analysis of ratings data or fuzzy-coded variables). In these cases, the space of count matrices is no longer defined over the natural numbers and, consequently, the standard polychoric estimator cannot be used to accurately estimate latent linear correlations. The aim of this contribution is twofold. First, we illustrate a computational procedure based on generalized natural numbers for computing fuzzy frequencies. Second, we reformulate the problem of estimating latent linear correlations from fuzzy counts in the context of expectation-maximization (EM)-based maximum likelihood estimation. A simulation study and two applications are used to investigate the characteristics of the proposed method. Overall, the results show that the fuzzy EM-based polychoric estimator handles imprecise count data more efficiently than the standard polychoric estimators that might otherwise be used in this context.
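
For orientation, the classical (crisp) polychoric likelihood that the abstract takes as its starting point can be written as follows. This is a sketch of the standard formulation for an R x C table with natural-number counts, not the paper's own notation: the symbols n_{ij}, \tau^X_i, \tau^Y_j, \pi_{ij}, and \tilde{n} are assumptions introduced here for illustration. The last line indicates, only schematically, how an EM treatment of fuzzy counts would typically replace the unobserved crisp counts with their conditional expectations; the exact E-step is specified in the article itself.

```latex
% Cell probabilities under a latent bivariate standard normal with correlation \rho,
% thresholds -\infty = \tau^X_0 < \dots < \tau^X_R = +\infty (and likewise for Y):
\pi_{ij}(\rho,\tau) =
    \Phi_2(\tau^X_i,\tau^Y_j;\rho)
  - \Phi_2(\tau^X_{i-1},\tau^Y_j;\rho)
  - \Phi_2(\tau^X_i,\tau^Y_{j-1};\rho)
  + \Phi_2(\tau^X_{i-1},\tau^Y_{j-1};\rho)

% Multinomial log-likelihood for crisp counts n_{ij} (up to an additive constant):
\ell(\rho,\tau) = \sum_{i=1}^{R}\sum_{j=1}^{C} n_{ij}\,\ln \pi_{ij}(\rho,\tau)

% Schematic EM surrogate when the counts are only observed as fuzzy numbers \tilde{n}
% (assumed generic form, with \theta = (\rho,\tau)):
Q(\theta \mid \theta^{(t)}) =
  \sum_{i=1}^{R}\sum_{j=1}^{C}
  \mathbb{E}\!\left[\,N_{ij} \mid \tilde{n},\, \theta^{(t)}\right] \ln \pi_{ij}(\theta)
```

Here \Phi_2 denotes the bivariate standard normal distribution function. Because the log-likelihood is linear in the counts, replacing n_{ij} by expected counts leaves the M-step in the same form as the crisp estimator, which is one reason an EM formulation is a natural fit for this problem.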

Keywords

Fuzzy frequency / Generalized natural numbers / Polychoric correlations / Fuzzy data analysis

Cite this article

Antonio Calcagnì. Estimating Latent Linear Correlations from Fuzzy Frequency Tables. Communications in Mathematics and Statistics, 2022, 12(3): 435-461 DOI:10.1007/s40304-022-00295-6


