Measuring code maintainability with deep neural networks
Yamin HU , Hao JIANG , Zongyao HU
Front. Comput. Sci. ›› 2023, Vol. 17 ›› Issue (6) : 176214
The maintainability of source code is a key quality characteristic for software quality. Many approaches have been proposed to quantitatively measure code maintainability. Such approaches rely heavily on code metrics, e.g., the number of Lines of Code and McCabe’s Cyclomatic Complexity. The employed code metrics are essentially statistics regarding code elements, e.g., the numbers of tokens, lines, references, and branch statements. However, natural language in source code, especially identifiers, is rarely exploited by such approaches. As a result, replacing meaningful identifiers with nonsense tokens would not significantly influence their outputs, although the replacement should have significantly reduced code maintainability. To this end, in this paper, we propose a novel approach (called DeepM) to measure code maintainability by exploiting the lexical semantics of text in source code. DeepM leverages deep learning techniques (e.g., LSTM and attention mechanism) to exploit these lexical semantics in measuring code maintainability. Another key rationale of DeepM is that measuring code maintainability is complex and often far beyond the capabilities of statistics or simple heuristics. Consequently, DeepM leverages deep learning techniques to automatically select useful features from complex and lengthy inputs and to construct a complex mapping (rather than simple heuristics) from the input to the output (code maintainability index). DeepM is evaluated on a manually-assessed dataset. The evaluation results suggest that DeepM is accurate, and it generates the same rankings of code maintainability as those of experienced programmers on 87.5% of manually ranked pairs of Java classes.
code maintainability / lexical semantics / deep learning / neural networks
| [1] |
|
| [2] |
|
| [3] |
|
| [4] |
|
| [5] |
|
| [6] |
|
| [7] |
|
| [8] |
|
| [9] |
|
| [10] |
|
| [11] |
|
| [12] |
|
| [13] |
|
| [14] |
|
| [15] |
|
| [16] |
Bavota G, Dit B, Oliveto R, Di Penta M, Poshyvanyk D, De Lucia A. An empirical study on the developers’ perception of software coupling. In: Proceedings of the 35th International Conference on Software Engineering (ICSE). 2013, 692−701 |
| [17] |
Pantiuchina J, Lanza M, Bavota G. Improving code: the (Mis) perception of quality metrics. In: Proceedings of 2018 IEEE International Conference on Software Maintenance and Evolution (ICSME). 2018, 80−91 |
| [18] |
|
| [19] |
|
| [20] |
Scalabrino S, Linares-Vasquez M, Poshyvanyk D, Oliveto R. Improving code readability models with textual features. In: Proceedings of the 24th IEEE International Conference on Program Comprehension. 2016, 1−10 |
| [21] |
|
| [22] |
Schankin A, Berger A, Holt D V, Hofmeister J C, Riedel T, Beigl M. Descriptive compound identifier names improve source code comprehension. In: Proceedings of the 26th IEEE/ACM International Conference on Program Comprehension (ICPC). 2018, 31−40 |
| [23] |
|
| [24] |
|
| [25] |
|
| [26] |
|
| [27] |
|
| [28] |
|
| [29] |
|
| [30] |
|
| [31] |
|
| [32] |
|
| [33] |
Alzahrani M, Alqithami S, Melton A. Using client-based class cohesion metrics to predict class maintainability. In: Proceedings of the 43rd IEEE Annual Computer Software and Applications Conference (COMPSAC). 2019, 72−80 |
| [34] |
|
| [35] |
|
| [36] |
|
| [37] |
|
| [38] |
|
| [39] |
Hofmeister J, Siegmund J, Holt D V. Shorter identifier names take longer to comprehend. In: Proceedings of the 24th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). 2017, 217−227 |
| [40] |
|
| [41] |
|
| [42] |
|
| [43] |
Smith N, Van Bruggen D, Tomassetti F. JavaParser: Visited. Victoria: Leanpub, 2021 |
| [44] |
|
| [45] |
|
| [46] |
|
| [47] |
|
| [48] |
|
| [49] |
|
| [50] |
|
| [51] |
Alon U, Zilberstein M, Levy O, Yahav E. Code2vec: learning distributed representations of code. Proceedings of the ACM on Programming Languages, 2019, 3(POPL): 40 |
| [52] |
|
| [53] |
|
| [54] |
|
| [55] |
Aniche M. Java code metrics calculator (CK), 2015 |
| [56] |
|
| [57] |
|
| [58] |
Li Y, Yang Z, Guo Y, Chen X. Humanoid: a deep learning-based approach to automated black-box android app testing. In: Proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). 2019, 1070−1073 |
| [59] |
Gu X, Zhang H, Zhang D, Kim S. Deep API learning. In: Proceedings of the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering. 2016, 631−642 |
| [60] |
|
| [61] |
|
| [62] |
|
| [63] |
|
| [64] |
|
| [65] |
|
| [66] |
|
| [67] |
Allamanis M, Barr E T, Bird C, Sutton C. Suggesting accurate method and class names. In: Proceedings of the 10th Joint Meeting on Foundations of Software Engineering. 2015, 38−49 |
| [68] |
|
| [69] |
|
| [70] |
Zerouali A, Mens T. Analyzing the evolution of testing library usage in open source java projects. In: Proceedings of the 24th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). 2017, 417−421 |
Higher Education Press
Supplementary files
/
| 〈 |
|
〉 |