Please wait a minute...

Frontiers of Computer Science

Front. Comput. Sci.    2018, Vol. 12 Issue (5) : 950-965     https://doi.org/10.1007/s11704-016-6104-3
RESEARCH ARTICLE |
Polygene-based evolutionary algorithms with frequent pattern mining
Shuaiqiang WANG1,2(), Yilong YIN3()
1. Research Center of Big Data Applications, Qilu University of Technology, Jinan 250100, China
2. Department of Computer Science and Information Systems, University of Jyvaskyla, Jyvaskyla 40014, Finland
3. School of Computer Science and Technology, Shandong University, Jinan 250101, China
Download: PDF(994 KB)  
Export: BibTeX | EndNote | Reference Manager | ProCite | RefWorks
Abstract

In this paper, we introduce polygene-based evolution, a novel framework for evolutionary algorithms (EAs) that features distinctive operations in the evolutionary process. In traditional EAs, the primitive evolution unit is a gene, wherein genes are independent components during evolution. In polygene-based evolutionary algorithms (PGEAs), the evolution unit is a polygene, i.e., a set of co-regulated genes. Discovering and maintaining quality polygenes can play an effective role in evolving quality individuals. Polygenes generalize genes, and PGEAs generalize EAs. Implementing the PGEA framework involves three phases: (I) polygene discovery, (II) polygene planting, and (III) polygene-compatible evolution. For Phase I, we adopt an associative classificationbased approach to discover quality polygenes. For Phase II, we perform probabilistic planting to maintain the diversity of individuals. For Phase III, we incorporate polygenecompatible crossover and mutation in producing the next generation of individuals. Extensive experiments on function optimization benchmarks in comparison with the conventional and state-of-the-art EAs demonstrate the potential of the approach in terms of accuracy and efficiency improvement.

Keywords polygenes      evolutionary algorithms      function optimization      associative classification      data mining     
Corresponding Authors: Shuaiqiang WANG   
Just Accepted Date: 19 October 2016   Online First Date: 06 March 2018    Issue Date: 21 September 2018
 Cite this article:   
Shuaiqiang WANG,Yilong YIN. Polygene-based evolutionary algorithms with frequent pattern mining[J]. Front. Comput. Sci., 2018, 12(5): 950-965.
 URL:  
http://journal.hep.com.cn/fcs/EN/10.1007/s11704-016-6104-3
http://journal.hep.com.cn/fcs/EN/Y2018/V12/I5/950
Service
E-mail this article
E-mail Alert
RSS
Articles by authors
Shuaiqiang WANG
Yilong YIN
1 Wang S Q, Gao B J, Wang S L, Cao G B, and Yin Y L. Polygenebased evolution: a novel framework for evolutionary algorithms. In: Proceedings of the 21st ACM Conference on Information and Knowledge Management. 2012, 2263–2266
2 Boughanem M, Tamine L. Query optimization using an improved genetic algorithm. In: Proceedings of the 9th International Conference on Information and Knowledge Management. 2000, 368–373
3 Venkatraman S, Yen G G. A generic framework for constrained optimization using genetic algorithms. IEEE Transactions on Evolutionary Computation, 2005, 9(4): 424–435
https://doi.org/10.1109/TEVC.2005.846817
4 Malek H, Ebadzadeh M M, Rahmati M. Three new fuzzy neural networks learning algorithms based on clustering, training error and genetic algorithm. Applied Intelligence, 2012, 37(2): 280–289
https://doi.org/10.1007/s10489-011-0327-7
5 Zafra A, Ventura S. Multi-objective genetic programming for multiple instance learning. In: Proceedings of the 18th European Conference on Machine Learning. 2007, 790–797
https://doi.org/10.1007/978-3-540-74958-5_81
6 Chang D X, Zhang X D, Zheng C W. A genetic algorithm with gene rearrangement for k-means clustering. Pattern Recognition, 2009, 42(7): 1210–1222
https://doi.org/10.1016/j.patcog.2008.11.006
7 Özyer T, Alhajj R. Parallel clustering of high dimensional data by integrating multi-objective genetic algorithm with divide and conquer. Applied Intelligence, 2009, 31(3): 318–331
https://doi.org/10.1007/s10489-008-0129-8
8 Wang S Q, Ma J, and Liu J M. Learning to rank using evolutionary computation: Immune programming or genetic programming? In: Proceedings of the 18th ACM Conference on Information and Knowledge Management. 2009, 1879–1882
https://doi.org/10.1145/1645953.1646254
9 Kaya MAlhajj R. Utilizing genetic algorithms to optimize membership functions for fuzzy weighted association rules mining. Applied Intelligence, 2006, 24(1): 7–15
https://doi.org/10.1007/s10489-006-6925-0
10 Weale T, Seitzer J. EVOC: a music generating system using genetic algorithms. In: Proceedings of the 18th International Joint Conference on Artificial Intelligence. 2003, 1383–1384
11 Bryden K M, Ashlock D A, Corns S M, Willson S J. Graph-based evolutionary algorithms. IEEE Transactions on Evolutionary Computation, 2006, 10(5): 550–567
https://doi.org/10.1109/TEVC.2005.863128
12 Ishibuchi H, Tsukamoto N, Nojima Y. Diversity improvement by nongeometric binary crossover in evolutionary multiobjective optimization. IEEE Transactions on Evolutionary Computation, 2010, 14(6): 985–998
https://doi.org/10.1109/TEVC.2010.2043365
13 Qu B Y, Suganthan P N, and Liang J J. Differential evolution with neighborhood mutation for multimodal optimization. IEEE Transactions on Evolutionary Computation, 2012, 16(5): 601–614
https://doi.org/10.1109/TEVC.2011.2161873
14 Mabu S, Hirasawa K, Hu J L. A graph-based evolutionary algorithm: genetic network programming (GNP) and its extension using reinforcement learning. Evolutionary Computation, 2007, 15(3): 369–398
https://doi.org/10.1162/evco.2007.15.3.369
15 Hu T, Chen Y Z P, Banzhaf W. WiMAX network planning using adaptive-population-size genetic algorithm. In: Proceedings of the International Conference on Applications of Evolutionary Computation. 2010, 31–40
https://doi.org/10.1007/978-3-642-12242-2_4
16 Zhang J, Chung H S H, Lo W L. Clustering-based adaptive crossover and mutation probabilities for genetic algorithms. IEEE Transactions on Evolutionary Computation, 2007, 11(3): 326–335
https://doi.org/10.1109/TEVC.2006.880727
17 Cross A D J, Myers R, Hancock E R. Convergence of a hill-climbing genetic algorithm for graph matching. Pattern Recognition, 2000, 33(11): 1863–1880
https://doi.org/10.1016/S0031-3203(99)00171-5
18 Tantar A A, Melab N, Talbi E G. A grid-based genetic algorithm combined with an adaptive simulated annealing for protein structure prediction. Soft Computing, 2008, 12(12): 1185–1198
https://doi.org/10.1007/s00500-008-0298-8
19 Chen Y P, Goldberg D E. Introducing start expression genes to the linkage learning genetic algorithm. In: Proceedings of the 7th International Conference on Parallel Problem Solving from Nature. 2002, 351–360
https://doi.org/10.1007/3-540-45712-7_34
20 Chen Y P, Peng W C, Jian M C. Particle swarm optimization with recombination and dynamic linkage discovery. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 2007, 37(6): 1460–1470
https://doi.org/10.1109/TSMCB.2007.904019
21 Goldman B W, Tauritz D R. Linkage tree genetic algorithms: Variants and analysis. In: Proceedings of the 14th Annual Conference on Genetic and Evolutionary Computation Conference. 2012, 625–632
https://doi.org/10.1145/2330163.2330252
22 Shao K Y, Li F, Jiang B Y, Wang N, Zhang H Y, Li W C. Neural network optimization based on improved diploidic genetic algorithm. In: Proceedings of the International Conference on Machine Learning and Cybernetics. 2010, 1470–1475
https://doi.org/10.1109/ICMLC.2010.5580839
23 Manjari K M, Gallagher M. Variable screening for reduced dependency modelling in gaussian-based continuous estimation of distribution algorithms. In: Proceedings of IEEE Congress on Evolutionary Computation. 2012, 1–8
24 Rastegar R. On the optimal convergence probability of univariate estimation of distribution algorithms. Evolutionary Computation, 2011, 19(2): 225–248
https://doi.org/10.1162/EVCO_a_00022
25 Lawrence E. Henderson’s Dictionary of Biology. New York: Pearson/ Prentice Hal, 2005
26 Lewis R. Human Genetics: Concepts and Applications. New York: McGraw Hill, 2002
27 Mather K M, Jinks J L. Biometrical Genetics. 3rd ed. London: Chapman and Hall, 1982
https://doi.org/10.1007/978-1-4899-3406-2
28 Beurton P J, Falk R, and Rheinberger H J. The Concept of the Gene in Development and Evolution. Cambridge: Cambridge University Press, 2000
https://doi.org/10.1017/CBO9780511527296
29 Gilbert S F. Developmental Biology. 6th ed. Sunderland, MA: Sinauer Associates, 2000
30 Agrawal R, Imielinski T, Swami A. Mining association rules between sets of items in large databases. In: Proceedings of ACM SIGMOD International Conference on Management of Data. 1993, 207–216
https://doi.org/10.1145/170035.170072
31 Liu B, Hsu W, Ma Y M. Integrating classification and association rule mining. In: Proceedings of the 4th ACM SIGKDD International Conference on Knowledge Discovery in Databases. 1998, 443–447
32 Holland J H. Adaptation in Natural and Artificial Systems. Cambridge, MA: The MIT Press, 1975
33 Han J W, Pei J, Yin Y W. Mining frequent patterns without candidate generation. In: Proceeding of ACM SIGMOD International Conference on Management of Data. 2000, 1–12
https://doi.org/10.1145/342009.335372
34 Agarwal R, Aggarwal C C, Prasad V V V. A tree projection algorithm for generation of frequent itemsets. Journal of Parallel and Distributed Computing, 2001, 61(3): 350–371
https://doi.org/10.1006/jpdc.2000.1693
35 Hämäläinen W. Statapriori: an efficient algorithm for searching statistically significant association rules. Knowledge and Information Systems, 2010, 23(3): 373–399
https://doi.org/10.1007/s10115-009-0229-8
36 Beil F, Ester M, Xu X W. Frequent term-based text clustering. In: Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery in Databases. 2002, 436–442
https://doi.org/10.1145/775047.775110
37 Leung C W, Chan S C, Chung F L. A collaborative filtering framework based on fuzzy association rules and multiple-level similarity. Knowledge and Information Systems, 2006, 10(3): 357–381
https://doi.org/10.1007/s10115-006-0002-1
38 Yan X F, Yu P S, Han J W. Graph indexing: a frequent structure-based approach. In: Proceedings of ACM SIGMOD International Conference on Management of Data. 2004, 335–346
https://doi.org/10.1145/1007568.1007607
39 Teredesai A, Ahmad M A, Kanodia J, Gaborski R S. Comma: a framework for integrated multimedia mining using multi-relational associations. Knowledge and Information Systems, 2006, 10(2): 135–162
https://doi.org/10.1007/s10115-005-0221-x
40 Punin J, Krishnamoorthy M, Zaki M. Web Usage Mining: Languages and Algorithms. Berlin: Springer-Verlag, 2001
41 Liu C, Fei L, Yan X F, Han J W, Midkiff S P. Statistical debugging: a hypothesis testing-based approach. IEEE Transactions on Software Engineering, 2006, 32(10): 831–848
https://doi.org/10.1109/TSE.2006.105
42 Han J W, Cheng H, Xin D, Yan X F. Frequent pattern mining: Current status and future directions. Data Mining and Knowledge Discovery, 2007, 15(1): 55–86
https://doi.org/10.1007/s10618-006-0059-1
43 Dong G Z, Li J Y. Efficient mining of emerging patterns: discovering trends and differences. In: Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery in Databases. 1999, 43–52
https://doi.org/10.1145/312129.312191
44 Li J Y, Dong G Z, Ramamohanarao K. Making use of the most expressive jumping emerging patterns for classification. In: Proceeding of the 4th Pacific-Asia Conference on Knowledge Discovery and Data Mining. 2000, 131–145
https://doi.org/10.1007/3-540-45571-X_29
45 Li W M, Han J W, Pei J. CMAR: accurate and efficient classification based on multiple class-association rules. In: Proceeding of the International Conference on Data Mining. 2001, 369–376
46 Yin X X, Han J W. CPAR: classification based on predictive association rules. In: Proceeding of SIAM International Conference on Data Mining. 2003, 331–335
https://doi.org/10.1137/1.9781611972733.40
47 Cong G. Mining top-k covering rule groups for gene expression data. In: Proceedings of the 24th ACM SIGMOD International Conference on Management of Data. 2005, 670–681
https://doi.org/10.1145/1066157.1066234
48 Ting C K, Zeng W M, Lin T C. Linkage discovery through data mining. IEEE Computational Intelligence Magazine, 2010, 5(1): 10–13
https://doi.org/10.1109/MCI.2009.935310
49 Chen Y P, Goldberg D E. Convergence time for the linkage learning genetic algorithm. Evolutionary Computation, 2005, 13(3): 279–302
https://doi.org/10.1162/1063656054794806
50 Ng K P, Wong K C. A new diploid scheme and dominance change mechanism for non-stationary function optimization. In: Proceedings of the 6th International Conference on Genetic Algorithms. 1995, 159–166
51 Baluja S. Population-based incremental learning: a method for integrating genetic search based function optimization and competitive learning. Technical Report CMU-CS-94-163, 1994
52 Tang K, Yao X, Suganthan P N, MacNish C, Chen Y P, Chen C M, Yang Z. Benchmark functions for the CEC’2008 special session and competition on large scale global optimization. Technical Report, 2007
Related articles from Frontiers Journals
[1] Bo SUN, Haiyan CHEN, Jiandong WANG, Hua XIE. Evolutionary under-sampling based bagging ensemble method for imbalanced data classification[J]. Front. Comput. Sci., 2018, 12(2): 331-350.
[2] Yong WANG, Zhi-Zhong LIU, Jianbin LI, Han-Xiong LI, Jiahai WANG. On the selection of solutions for mutation in differential evolution[J]. Front. Comput. Sci., 2018, 12(2): 297-315.
[3] Yuan LI, Yuhai ZHAO, Guoren WANG, Xiaofeng ZHU, Xiang ZHANG, Zhanghui WANG, Jun PANG. Finding susceptible and protective interaction patterns in large-scale genetic association study[J]. Front. Comput. Sci., 2017, 11(3): 541-554.
[4] Junhua LU,Wei CHEN,Yuxin MA,Junming KE,Zongzhuang LI,Fan ZHANG,Ross MACIEJEWSKI. Recent progress and trends in predictive visual analytics[J]. Front. Comput. Sci., 2017, 11(2): 192-207.
[5] Chengliang WANG,Yayun PENG,Debraj DE,Wen-Zhan SONG. DPHK: real-time distributed predicted data collecting based on activity pattern knowledge mined from trajectories in smart environments[J]. Front. Comput. Sci., 2016, 10(6): 1000-1011.
[6] Wenmei LIU,Hui LIU. Major motivations for extract method refactorings: analysis based on interviews and change histories[J]. Front. Comput. Sci., 2016, 10(4): 644-656.
[7] Xin XU,Wei WANG,Jianhong WANG. A three-way incremental-learning algorithm for radar emitter identification[J]. Front. Comput. Sci., 2016, 10(4): 673-688.
[8] Lijin WANG,Yilong YIN,Yiwen ZHONG. Cuckoo search with varied scaling factor[J]. Front. Comput. Sci., 2015, 9(4): 623-635.
[9] Yaobin HE, Haoyu TAN, Wuman LUO, Shengzhong FENG, Jianping FAN. MR-DBSCAN: a scalable MapReduce-based DBSCAN algorithm for heavily skewed data[J]. Front. Comput. Sci., 2014, 8(1): 83-99.
[10] Fabian GIESEKE, Gabriel MORUZ, Jan VAHRENHOLD. Resilient k-d trees: k-means in space revisited[J]. Front Comput Sci, 2012, 6(2): 166-178.
[11] Xuesong YIN, Enliang HU. Distance metric learning guided adaptive subspace semi-supervised clustering[J]. Front Comput Sci Chin, 2011, 5(1): 100-108.
[12] Xinguang TIAN, Xueqi CHENG, Miyi DUAN, Rui LIAO, Hong CHEN, Xiaojuan CHEN, . Network intrusion detection based on system calls and data mining[J]. Front. Comput. Sci., 2010, 4(4): 522-528.
[13] Qiang YANG, . Three challenges in data mining[J]. Front. Comput. Sci., 2010, 4(3): 324-333.
[14] Lifei CHEN, Shanjun HE, Qingshan JIANG, . Validation indices for projective clustering[J]. Front. Comput. Sci., 2009, 3(4): 477-484.
[15] Dongling ZHANG, Yong SHI, Yingjie TIAN, Meihong ZHU. A class of classification and regression methods by multiobjective programming[J]. Front Comput Sci Chin, 2009, 3(2): 192-204.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed