Estimating Latent Linear Correlations from Fuzzy Frequency Tables

Antonio Calcagnì

Communications in Mathematics and Statistics, 2022, Vol. 12, Issue 3: 435-461. DOI: 10.1007/s40304-022-00295-6

Abstract

This research concerns the estimation of latent linear, or polychoric, correlations from fuzzy frequency tables. Fuzzy counts are of interest to many disciplines, including the social and behavioral sciences. They are especially relevant when observed data are classified using fuzzy categories (as in socioeconomic studies, clinical evaluations, content analysis, and inter-rater reliability analysis) or when imprecise observations are classified into either precise or imprecise categories (as in the analysis of ratings data or fuzzy-coded variables). In these cases, the space of count matrices is no longer defined over the natural numbers and, consequently, the standard polychoric estimator cannot be used to accurately estimate latent linear correlations. The aim of this contribution is twofold. First, we illustrate a computational procedure based on generalized natural numbers for computing fuzzy frequencies. Second, we reformulate the problem of estimating latent linear correlations from fuzzy counts in the context of expectation-maximization (EM) based maximum likelihood estimation. A simulation study and two applications are used to investigate the characteristics of the proposed method. Overall, the results show that the fuzzy EM-based polychoric estimator deals with imprecise count data more efficiently than the standard polychoric estimators that may be used in this context.

Keywords

Fuzzy frequency / Generalized natural numbers / Polychoric correlations / Fuzzy data analysis

Cite this article

Antonio Calcagnì. Estimating Latent Linear Correlations from Fuzzy Frequency Tables. Communications in Mathematics and Statistics, 2022, 12(3): 435‒461 https://doi.org/10.1007/s40304-022-00295-6
