Recent progress and trends in predictive visual analytics

Junhua LU, Wei CHEN, Yuxin MA, Junming KE, Zongzhuang LI, Fan ZHANG, Ross MACIEJEWSKI

Front. Comput. Sci. ›› 2017, Vol. 11 ›› Issue (2) : 192-207. DOI: 10.1007/s11704-016-6028-y
REVIEW ARTICLE


Abstract

A wide variety of predictive analytics techniques have been developed in statistics, machine learning and data mining; however, many of these algorithms take a black-box approach in which data is input and predictions are output with no insight into what happens in between. Such a closed-system approach leaves little room for injecting domain expertise and can frustrate analysts when results seem spurious or confusing. To allow for more human-centric approaches, the visualization community has begun developing methods that let users incorporate expert knowledge into the prediction process at all stages, including data cleaning, feature selection, model building and model validation. This paper surveys current progress and trends in predictive visual analytics, identifies the common framework in which predictive visual analytics systems operate, and summarizes the predictive analytics workflow.

Keywords

predictive visual analytics / visualization / visual analytics / data mining / predictive analysis

Cite this article

Junhua LU, Wei CHEN, Yuxin MA, Junming KE, Zongzhuang LI, Fan ZHANG, Ross MACIEJEWSKI. Recent progress and trends in predictive visual analytics. Front. Comput. Sci., 2017, 11(2): 192‒207 https://doi.org/10.1007/s11704-016-6028-y


RIGHTS & PERMISSIONS

2016 Higher Education Press and Springer-Verlag Berlin Heidelberg