Evaluating the use of synthetic data for machine learning prediction of self-healing capacity of concrete
Franciana Sokoloski de Oliveira, Ricardo Stefani
AI in Civil Engineering, 2025, Vol. 4, Issue 1: 25
The scarcity of experimental data poses a significant challenge in predicting the self-healing capacity of bacteria-driven concrete. To address this issue, we explored the use of synthetic data generation to augment the limited available dataset. By creating a synthetic dataset derived from real-world data, we substantially expanded the original data volume. We then trained and evaluated multiple machine learning (ML) models, encompassing both probabilistic and ensemble methods, for predicting self-healing capacity. Our comparative analysis revealed that ensemble methods, specifically the random forest (RF) algorithm, achieved the highest performance, with accuracy and F1-score both equal to 0.863, surpassing the probabilistic models. Furthermore, when applied to real-world cases, the models maintained high predictive accuracy. This work confirms the value of synthetic data for enhancing the accuracy and reliability of predictive models in civil engineering, especially in data-scarce contexts. Our findings underscore the potential of machine learning and artificial intelligence to transform concrete research and highlight the role of synthetic data in overcoming common data limitations.
Self-healing / Concrete / Bacteria / Synthetic data / Machine learning
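The abstract reports model quality as accuracy and F1-score. As a minimal sketch of how these two metrics can be computed for a classifier's predictions, the following uses only the Python standard library. The label encoding (1 = healed, 0 = not healed), the toy label lists, and the choice of macro averaging for F1 are assumptions made for illustration; the abstract does not state which averaging the authors used, and the numbers below are not results from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (macro averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels, illustrative only: 1 = healed, 0 = not healed.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f}, macro F1={f1:.3f}")
```

Reporting F1 alongside accuracy is useful when the classes are imbalanced, because accuracy alone can look high for a model that mostly predicts the majority class.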