ContextAug: model-domain failing test augmentation with contextual information

Zhuo ZHANG; Jianxin XUE; Deheng YANG; Xiaoguang MAO

doi:10.1007/s11704-023-2521-2

Front. Comput. Sci., 2024, Vol. 18, Issue 2: 182202. DOI: 10.1007/s11704-023-2521-2

Software

RESEARCH ARTICLE

Abstract

In software development, the ability to localize faults is crucial for efficient debugging. Detecting and repairing errant behavior at an early stage of the development cycle considerably reduces costs and development time. Researchers have proposed various methods to locate faulty code. However, failing test cases usually account for only a small portion of the test suite, which inevitably leads to class imbalance and hampers the effectiveness of fault localization.

Accordingly, in this work, we propose a new fault localization approach named ContextAug. After obtaining dynamic executions through test cases, ContextAug traces these executions to build an information model; subsequently, it constructs a failure context with propagation dependencies and intersects it with new model-domain failing test samples synthesized by the minimum variability of the minority feature space. In contrast to traditional test generation, which works directly from the input domain, ContextAug takes a new perspective and synthesizes failing test samples from the model domain, which makes augmenting test suites much easier. In an empirical study on large real-world programs with 13 state-of-the-art fault localization approaches, ContextAug significantly improved fault localization effectiveness, by up to 54.53%. These results indicate that ContextAug can improve fault localization effectiveness.
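The idea of synthesizing failing samples in the model domain can be illustrated with a small sketch. It is not the authors' implementation: it assumes the information model is a coverage matrix (one row per test, one column per statement), that the failure context is given as a set of statement indices (for example, from a dynamic slice), and that "minimum variability of the minority feature space" can be approximated by SMOTE-style interpolation between a failing test and its nearest failing neighbour. The function name and parameters are hypothetical.

```python
import random


def synthesize_failing_samples(coverage, labels, context, n_new, seed=0):
    """Sketch of model-domain augmentation (assumed, SMOTE-style).

    coverage: list of 0/1 rows, one per test (assumed information model)
    labels:   1 for a failing test, 0 for a passing test
    context:  set of statement indices in the failure context (assumed input)
    n_new:    number of synthetic failing samples to create
    """
    rng = random.Random(seed)
    # Keep only failing tests and zero out statements outside the context.
    failing = [
        [v if j in context else 0 for j, v in enumerate(row)]
        for row, y in zip(coverage, labels)
        if y == 1
    ]
    if not failing:
        raise ValueError("at least one failing test is required")

    samples = []
    for _ in range(n_new):
        base = rng.choice(failing)
        # Nearest failing neighbour by Hamming distance (fall back to base).
        others = [f for f in failing if f is not base] or [base]
        neighbour = min(
            others, key=lambda f: sum(a != b for a, b in zip(base, f))
        )
        # Interpolate between the two vectors, as SMOTE does.
        gap = rng.random()
        samples.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return samples


# Toy data: 5 tests over 4 statements; tests 0 and 4 fail.
coverage = [
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [1, 1, 0, 0],
]
labels = [1, 0, 0, 0, 1]
context = {0, 1, 3}  # hypothetical failure context

# Add enough synthetic failing samples to balance the two classes.
n_new = labels.count(0) - labels.count(1)
new_samples = synthesize_failing_samples(coverage, labels, context, n_new)
print(len(new_samples), "synthetic failing sample(s)")
```

The synthetic rows would be appended to the coverage matrix with a failing label before running any fault localization technique. Because the samples are built from coverage vectors rather than program inputs, no new test needs to be executed.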

Keywords

context / fault localization / test cases

Cite this article

Zhuo ZHANG, Jianxin XUE, Deheng YANG, Xiaoguang MAO. ContextAug: model-domain failing test augmentation with contextual information. Front. Comput. Sci., 2024, 18(2): 182202 https://doi.org/10.1007/s11704-023-2521-2

Zhuo Zhang received his BA in computer science and technology and his MA and PhD degrees in software engineering, all from the National University of Defense Technology, China. His research interests include fault localization and intelligent software technology.

Jianxin Xue received his MS in software engineering from the National University of Defense Technology, China, and his PhD in computer software and theory from Shanghai Jiao Tong University, China. He is an associate professor in the School of Computer and Information Engineering, Institute for Artificial Intelligence, Shanghai Polytechnic University, China. His primary research interests include concurrency theory and the analysis of concurrent programs.

Deheng Yang is currently a Master's student at the National University of Defense Technology, China, under the supervision of Dr. Xiaoguang Mao. He received his BA in computer science and technology from the National University of Defense Technology, China. His research interests include fault localization and automated program repair.

Xiaoguang Mao is a professor at the College of Computer, National University of Defense Technology, China. His research interests include high confidence software, software development methodology, software assurance, and software service engineering.

RIGHTS & PERMISSIONS

2024 Higher Education Press

Received: 10 Aug 2022
Accepted: 29 Dec 2022
Published: 15 Apr 2024
Just Accepted Date: 30 Dec 2022
Issue Date: 23 Mar 2023
