Heuristic solution using decision tree model for enhanced XML schema matching of bridge structural calculation documents

Sang I. PARK; Sang-Ho LEE

doi:10.1007/s11709-020-0666-8

Front. Struct. Civ. Eng., 2020, Vol. 14, Issue 6: 1403-1417. DOI: 10.1007/s11709-020-0666-8

RESEARCH ARTICLE

Abstract

Research on the quality of data in structural calculation documents (SCDs) is lacking, although the SCD of a bridge serves as an essential reference throughout the lifecycle of the facility. XML Schema matching enables qualitative improvement of the stored data. This study aimed to enhance the applicability of XML Schema matching to bridge SCDs by improving both the matching speed and the quality of the stored information. First, the authors proposed a method for reducing the computing time of schema matching for bridge SCDs; streamlining the correlation-checking process made schema matching 13 to 1800 times faster. Second, the authors developed a heuristic solution, based on a decision tree, for selecting the optimal weight factors used in the matching process so that high accuracy is maintained. The decision tree model was built with the content elements stored in the SCD, the design companies, the bridge types, and the weight factors as input variables, and with the matching accuracy as the target variable. An inverse-calculation method was then applied to the decision tree model to extract the weight factors that yield high-accuracy schema matching results.
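
The inverse-calculation step described in the abstract can be illustrated with a minimal sketch. It assumes a regression-style tree whose internal nodes split on weight factors and whose leaves hold a mean matching accuracy. The tree layout, the factor names `w_name` and `w_structure`, and every number are hypothetical placeholders chosen for illustration, not values or structures from the study. The sketch walks the tree, finds the leaf with the highest accuracy, and collects the split conditions on the path to it as admissible ranges for each weight factor.

```python
from math import inf

# Hypothetical toy tree (not from the study): internal nodes split on a
# weight factor, leaves store the mean matching accuracy of their cases.
TREE = {
    "feature": "w_name", "threshold": 0.5,
    "left": {"leaf": 0.62},                      # w_name <= 0.5
    "right": {                                   # w_name > 0.5
        "feature": "w_structure", "threshold": 0.3,
        "left": {"leaf": 0.91},                  # w_structure <= 0.3
        "right": {"leaf": 0.78},                 # w_structure > 0.3
    },
}


def best_leaf(node, bounds=None):
    """Return (accuracy, bounds) for the highest-accuracy leaf under node.

    bounds maps each feature to a (low, high] interval implied by the
    split conditions on the path from the root to that leaf.
    """
    bounds = dict(bounds or {})
    if "leaf" in node:
        return node["leaf"], bounds

    feature, threshold = node["feature"], node["threshold"]
    low, high = bounds.get(feature, (-inf, inf))

    # Left branch: feature <= threshold tightens the upper bound.
    left = best_leaf(node["left"], {**bounds, feature: (low, min(high, threshold))})
    # Right branch: feature > threshold tightens the lower bound.
    right = best_leaf(node["right"], {**bounds, feature: (max(low, threshold), high)})
    return max(left, right, key=lambda result: result[0])


accuracy, weight_ranges = best_leaf(TREE)
print(accuracy, weight_ranges)
```

In this toy case the best leaf has an accuracy of 0.91 and is reached when `w_name > 0.5` and `w_structure <= 0.3`, so any weight combination inside those ranges would be a candidate. A tree learned from real matching results would also split on the other input variables named in the abstract (content elements, design company, bridge type), which this sketch leaves out.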

Keywords

structural calculation document / bridge structure / XML Schema matching / weight factor / data mining / decision tree model

Cite this article

Sang I. PARK, Sang-Ho LEE. Heuristic solution using decision tree model for enhanced XML schema matching of bridge structural calculation documents. Front. Struct. Civ. Eng., 2020, 14(6): 1403‒1417 https://doi.org/10.1007/s11709-020-0666-8

Acknowledgement

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (2016R1A6A3A11934917).