Structural diversity for decision tree ensemble learning
Tao SUN, Zhi-Hua ZHOU
Decision trees are off-the-shelf predictive models, and they have been successfully used as base learners in ensemble learning. To construct a strong classifier ensemble, the individual classifiers should be both accurate and diverse. However, although there have been many attempts, how to measure diversity remains an open problem. We conjecture that a deficiency of previous diversity measures lies in the fact that they consider only behavioral diversity, i.e., how the classifiers behave when making predictions, neglecting the fact that classifiers may differ even when they make the same predictions. Based on this recognition, in this paper we advocate considering structural diversity in addition to behavioral diversity, and propose the TMD (tree matching diversity) measure for decision trees. To investigate the usefulness of TMD, we empirically evaluate the performance of selective ensemble approaches on decision forests, incorporating different diversity measures. Our results validate that stronger ensembles can be constructed by considering structural and behavioral diversity together. This may open a new direction for designing better diversity measures and ensemble methods.
Keywords: ensemble learning / structural diversity / decision tree
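To make the idea of structural diversity concrete, the sketch below scores two decision trees by matching positionally aligned nodes on their split features and normalizing by the larger tree's size. This is a minimal illustrative sketch, not the paper's exact TMD definition: the Node representation, the top-down matching rule, and the normalization are all assumptions made here for illustration.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    feature: Optional[int] = None    # split feature index; None marks a leaf
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def size(t: Optional[Node]) -> int:
    return 0 if t is None else 1 + size(t.left) + size(t.right)

def matched(a: Optional[Node], b: Optional[Node]) -> int:
    # Count positionally aligned nodes that agree on their split feature
    # (two leaves also count as a match); stop where the structures diverge.
    if a is None or b is None or a.feature != b.feature:
        return 0
    return 1 + matched(a.left, b.left) + matched(a.right, b.right)

def structural_diversity(a: Node, b: Node) -> float:
    # 0 for structurally identical trees, approaching 1 for trees that
    # already disagree at the root.
    return 1.0 - matched(a, b) / max(size(a), size(b))

if __name__ == "__main__":
    t1 = Node(0, Node(1, Node(), Node()), Node())
    t2 = Node(0, Node(2, Node(), Node()), Node())   # one internal split differs
    print(structural_diversity(t1, t2))             # 0.6: 2 of 5 nodes match

In a selective-ensemble setting, such a structural score could be combined with a behavioral measure such as pairwise prediction disagreement when choosing which trees to keep; the paper's actual combination scheme is evaluated empirically and may differ from this toy normalization.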