Please wait a minute...

Frontiers of Computer Science

Front. Comput. Sci.    2018, Vol. 12 Issue (5) : 939-949
Achieving data-driven actionability by combining learning and planning
Qiang LV(), Yixin CHEN, Zhaorong LI, Zhicheng CUI, Ling CHEN, Xing ZHANG, Haihua SHEN()
1. College of Information Engineering, Yangzhou University, Yangzhou 225127, China
2. Department of Computer Science and Engineering,Washington University in St. Louis, St. Louis MO 63130, USA
3. School of Management, Fudan University, Shanghai 200433, China
4. School of Computer and Control Engineering, University of Chinese Academy of Science, Beijing 100049, China
Download: PDF(437 KB)  
Export: BibTeX | EndNote | Reference Manager | ProCite | RefWorks

A main focus of machine learning research has been improving the generalization accuracy and efficiency of prediction models. However, what emerges as missing in many applications is actionability, i.e., the ability to turn prediction results into actions. Existing effort in deriving such actionable knowledge is few and limited to simple action models while in many real applications those models are often more complex and harder to extract an optimal solution. In this paper, we propose a novel approach that achieves actionability by combining learning with planning, two core areas of AI. In particular, we propose a framework to extract actionable knowledge from random forest, one of the most widely used and best off-the-shelf classifiers. We formulate the actionability problem to a sub-optimal action planning (SOAP) problem, which is to find a plan to alter certain features of a given input so that the random forest would yield a desirable output, while minimizing the total costs of actions. Technically, the SOAP problem is formulated in the SAS+ planning formalism, and solved using a Max-SAT based approach. Our experimental results demonstrate the effectiveness and efficiency of the proposed approach on a personal credit dataset and other benchmarks. Our work represents a new application of automated planning on an emerging and challenging machine learning paradigm.

Keywords actionable knowledge extraction      machine learning      planning      random forest     
Corresponding Authors: Qiang LV,Haihua SHEN   
Just Accepted Date: 05 January 2017   Online First Date: 06 March 2018    Issue Date: 21 September 2018
 Cite this article:   
Qiang LV,Yixin CHEN,Zhaorong LI, et al. Achieving data-driven actionability by combining learning and planning[J]. Front. Comput. Sci., 2018, 12(5): 939-949.
E-mail this article
E-mail Alert
Articles by authors
Qiang LV
Yixin CHEN
Zhaorong LI
Zhicheng CUI
Haihua SHEN
1 Mitchell T M. Machine learning and data mining. Communications of the ACM, 1999, 42(11): 30–36
2 Bailey T C, Chen Y X, Mao Y, Lu C Y, Hackmann G, Micek S T, Heard K M, Faulkner K M, Kollef M H. A trial of a real-time alert for clinical deterioration in patients hospitalized on general medical wards. Journal of Hospital Medicine, 2013, 8: 236–242
3 Johnson R A, Gong R, Greatorex-Voith S, Anand A, Fritzler A. A datadriven framework for identifying high school students at risk of not graduating on time. Bloomberg Data for Good Exchange, 2015
4 Liu B, Hsu W. Post-analysis of learned rules. In: Proceedings of the AAAI Conference on Artificial Intelligence. 1996, 828–834
5 Liu B, Hsu W, Ma Y M. Pruning and summarizing the discovered associations. In: Proceedings of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 1999, 125–134
6 Cao L B, Zhang C Q. Domain-driven, actionable knowledge discovery. IEEE Intelligent Systems, 2007, 22(4): 78–88
7 Cao L B, Zhao Y C, Zhang H F, Luo D, Zhang C Q, Park E K. Flexible frameworks for actionable knowledge discovery. IEEE Transactions on Knowledge and Data Engineering, 2010, 22(9): 1299–1312
8 DeSarbo W S, Ramaswamy V. Crisp: customer response based iterative segmentation procedures for response modeling in direct marketing. Journal of Direct Marketing, 1994, 8(3): 7–20
9 Levin N, Zahavi J. Segmentation analysis with managerial judgment. Journal of Direct Marketing, 1996, 10(3): 28–47<28::AID-DIR3>3.0.CO;2-#
10 Moro S, Cortez P, Rita P. A data-driven approach to predict the success of bank telemarketing. Decision Support Systems, 2014, 62: 22–31
11 Hilderman R J, Hamilton H J. Applying objective interestingness measures in data mining systems. In: Proceedings of European Conference of Principles of Data Mining and Knowledge Discovery. 2000, 432–439
12 Cao L B, Luo D, Zhang C Q. Knowledge actionability: satisfying technical and business interestingness. International Journal of Business Intelligence and Data Mining, 2007, 2(4): 496–514
13 Cortez P, Embrechts M J. Using sensitivity analysis and visualization techniques to open black box data mining models. Information Sciences, 2013, 225: 1–17
14 Szegedy C, Zaremba W, Sutskever I, Bruna J, Erhan D, Goodfellow I, Fergus R. Intriguing properties of neural networks. In: Proceedings of the International Conference on Learning Representations. 2014
15 Yang Q, Yin J, Ling C, Chen T. Postprocessing decision trees to extract actionable knowledge. In: Proceedings of the 3rd IEEE International Conference on Data Mining. 2003, 685–688
16 Yang Q, Yin J, Ling C, Pan R. Extracting actionable knowledge from decision trees. IEEE Transactions on Knowledge and Data Engineering, 2007, 19(1): 43–56
17 Cui Z C, Chen W L, He Y J, Chen Y X. Optimal action extraction for random forests and boosted trees. In: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2015, 179–188
18 Friedman J, Hastie T, Tibshirani R. The Elements of Statistical Learning, Vol 1. New York: Springer-Verlag, 2001
19 Shotton J, Sharp T, Kipman A, Fitzgibbon A, Finocchio M, Blake A, Cook M, Moore R. Real-time human pose recognition in parts from single depth images. Communications of the ACM, 2013, 56(1): 116–124
20 Viola P, Jones M J. Robust real-time face detection. International Journal of Computer Vision, 2004, 57(2): 137–154
21 Mohan A, Chen Z, Weinberger K. Web-search ranking with initialized gradient boosted regression trees. Journal of Machine Learning Research, 2011, 14: 77–89
22 Lu Q, Cui Z C, Chen Y X, Chen X P. Extracting optimal actionable plans from additive tree models. Frontiers of Computer Science, 2017, 11(1): 160–173
23 Freund Y, Schapire R E. A decision-theoretic generalization of online learning and an application to boosting. Journal of Computer and System Sciences, 1997, 55: 119–139
24 Friedman J H. Greedy function approximation: a gradient boosting machine. The Annals of Statistics, 2001, 29: 1189–1232
25 Breiman L. Random forests. Machine Learning, 2001, 45(1): 5–32
26 Fox M, Long D. PDDL2.1: an extension to PDDL for expressing temporal planning domains. Journal of Artificial Intelligence Research, 2003, 20: 61–124
27 Bäckström C, Nebel B. Complexity results for SAS+ planning. Computational Intelligence, 1995, 11(4): 625–655
28 Jonsson P, Bäckström C. State-variable planning under structural restrictions: algorithms and complexity. Artificial Intelligence, 1998, 100(1–2): 125–176
29 Helmert M. The fast downward planning system. Journal of Artificial Intelligence Research, 2006, 26: 191–246
30 Kautz H A, Selman B. Planning as satisfiability. In: Proceedings of European Conference on Artificial Intelligence. 1992, 359–363
31 Blum A, Furst M L. Fast planning through planning graph analysis. Artificial Intelligence, 1997, 90(1–2): 281–300
32 Lu Q, Huang R Y, Chen Y X, Xu Y, Zhang W X, Chen G L. A SATbased approach to cost-sensitive temporally expressive planning. ACM Transactions on Intelligent Systems and Technology, 2014, 5(1): 18
33 Huang R Y, Chen Y X, Zhang W X. A novel transition based encoding scheme for planning as satisfiability. In: Proceedings of the AAAI Conference on Artificial Intelligence. 2010, 89–94
34 Huang R Y, Chen Y X, Zhang W X. SAS+ planning as satisfiability. Journal of Artificial Intelligence Research, 2012, 43: 293–328
35 Balyo T, Chrpa L, Kilani A. On different strategies for eliminating redundant actions from plans. In: Proceedings of the 7th Annual Symposium on Combinatorial Search. 2014, 10–18
Related articles from Frontiers Journals
[1] Ruochen HUANG, Xin WEI, Liang ZHOU, Chaoping LV, Hao MENG, Jiefeng JIN. A survey of data-driven approach on multimedia QoE evaluation[J]. Front. Comput. Sci., 2018, 12(6): 1060-1075.
[2] Ashish Kumar DWIVEDI, Anand TIRKEY, Santanu Kumar RATH. Software design pattern mining using classification-based techniques[J]. Front. Comput. Sci., 2018, 12(5): 908-922.
[3] Bo SUN, Haiyan CHEN, Jiandong WANG, Hua XIE. Evolutionary under-sampling based bagging ensemble method for imbalanced data classification[J]. Front. Comput. Sci., 2018, 12(2): 331-350.
[4] Min-Ling ZHANG, Yu-Kun LI, Xu-Ying LIU, Xin GENG. Binary relevance for multi-label learning: an overview[J]. Front. Comput. Sci., 2018, 12(2): 191-202.
[5] Zhongqing WANG, Shoushan LI, Guodong ZHOU. Personal summarization from profile networks[J]. Front. Comput. Sci., 2017, 11(6): 1085-1097.
[6] Chao CHEN,Xia CHEN,Zhu WANG,Yasha WANG,Daqing ZHANG. ScenicPlanner: planning scenic travel routes leveraging heterogeneous user-generated digital footprints[J]. Front. Comput. Sci., 2017, 11(1): 61-74.
[7] Qiang LU,Zhicheng CUI,Yixin CHEN,Xiaoping CHEN. Extracting optimal actionable plans from additive tree models[J]. Front. Comput. Sci., 2017, 11(1): 160-173.
[8] Ruifeng XU,Lin GUI,Qin LU,Shuai WANG,Jian XU. Incorporating multi-kernel function and Internet verification for Chinese person name disambiguation[J]. Front. Comput. Sci., 2016, 10(6): 1026-1038.
[9] Zongzhang ZHANG,Qiming FU,Xiaofang ZHANG,Quan LIU. Reasoning and predicting POMDP planning complexity via covering numbers[J]. Front. Comput. Sci., 2016, 10(4): 726-740.
[10] Hongbo NI,Shu WU,Bessam ABDULRAZAK,Daqing ZHANG,Xiaojuan MA,Xingshe ZHOU. Non-intrusive sleep pattern recognition with ubiquitous sensing in elderly assistive environment[J]. Front. Comput. Sci., 2015, 9(6): 966-979.
[11] Wenge RONG,Baolin PENG,Yuanxin OUYANG,Chao LI,Zhang XIONG. Structural information aware deep semi-supervised recurrent neural network for sentiment analysis[J]. Front. Comput. Sci., 2015, 9(2): 171-184.
[12] Jamal YOUSAF, Juanzi LI, Lu CHEN, Jie TANG, Xiaowen DAI. Generalized multipath planning model for ride-sharing systems[J]. Front. Comput. Sci., 2014, 8(1): 100-118.
[13] Nicolas CEBRON. Active improvement of hierarchical object features under budget constraints[J]. Front Comput Sci, 2012, 6(2): 143-153.
[14] Liangxiao JIANG. Learning random forests for ranking[J]. Front Comput Sci Chin, 2011, 5(1): 79-86.
[15] Mingxin YUAN, Sun-an WANG, Canyang WU, Kunpeng LI. APF-guided adaptive immune network algorithm for robot path planning[J]. Front Comput Sci Chin, 2009, 3(2): 247-255.
Full text