A Practical Data Quality Assessment Method for Raw Data in Vessel Operations

Gang Chen, Jie Cai, Niels Gorm Maly Rytter, Marie Lützen

Journal of Marine Science and Application, 2023, Vol. 22, Issue 2: 370-380. DOI: 10.1007/s11804-023-00326-w
Research Article

Abstract

With the ongoing Shipping 4.0 revolution, a tremendous amount of data is accumulated during vessel operations. Data quality (DQ) is becoming increasingly important for further digitalization and effective decision-making in the shipping industry. This study proposes a practical DQ assessment method for raw data in vessel operations. In this method, specific data categories and data dimensions are developed based on engineering practice and the existing literature. Concrete validation rules are then formed, which can be used to properly divide raw datasets. A scoring method is subsequently used to assess the data quality, and three levels, namely good, warning and alarm, are adopted to reflect the final data quality. Once the internal dependency among rules has been built, the root causes of bad data quality can be revealed, which facilitates further improvement of DQ in practice. A case study based on datasets from a Danish shipping company is conducted, in which the DQ variation is monitored, assessed and compared. The results indicate that the proposed method effectively helps the shipping industry improve the quality of raw data in practice. This innovative research can help the shipping industry lay a solid foundation at the early stage of its digitalization journey.
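
The pipeline outlined in the abstract (validation rules, a score, then a three-level label) can be illustrated with a minimal sketch. The abstract does not give the actual rules, data dimensions, scoring formula or thresholds, so everything below is a hypothetical stand-in: the field names `speed_kn` and `fuel_mt`, the three rules, the pass-rate score, and the 90/70 cut-offs are assumptions chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Record = Dict[str, Optional[float]]


@dataclass
class Rule:
    name: str
    dimension: str  # hypothetical DQ dimension label, e.g. "completeness"
    check: Callable[[Record], bool]


def _is_number(value: object) -> bool:
    return isinstance(value, (int, float)) and not isinstance(value, bool)


# Hypothetical validation rules for a noon-report-like record (assumed fields).
RULES: List[Rule] = [
    Rule("speed_present", "completeness",
         lambda r: r.get("speed_kn") is not None),
    Rule("speed_in_range", "validity",
         lambda r: _is_number(r.get("speed_kn")) and 0 <= r["speed_kn"] <= 30),
    Rule("fuel_non_negative", "validity",
         lambda r: _is_number(r.get("fuel_mt")) and r["fuel_mt"] >= 0),
]


def quality_score(records: List[Record], rules: List[Rule]) -> float:
    """Share of passed (record, rule) checks, in percent (assumed formula)."""
    results = [rule.check(rec) for rec in records for rule in rules]
    return 100.0 * sum(results) / len(results) if results else 0.0


def rule_pass_rates(records: List[Record], rules: List[Rule]) -> Dict[str, float]:
    """Per-rule pass rate in percent, a starting point for tracing failures."""
    if not records:
        return {rule.name: 0.0 for rule in rules}
    return {
        rule.name: 100.0 * sum(rule.check(rec) for rec in records) / len(records)
        for rule in rules
    }


def quality_level(score: float, good_at: float = 90.0, warning_at: float = 70.0) -> str:
    """Map a score to good / warning / alarm using assumed thresholds."""
    if score >= good_at:
        return "good"
    if score >= warning_at:
        return "warning"
    return "alarm"


# Made-up sample data, not taken from the case study.
SAMPLE: List[Record] = [
    {"speed_kn": 12.5, "fuel_mt": 20.0},
    {"speed_kn": None, "fuel_mt": 18.0},
    {"speed_kn": 45.0, "fuel_mt": -1.0},
]

if __name__ == "__main__":
    score = quality_score(SAMPLE, RULES)
    print(round(score, 1), quality_level(score))
    print(rule_pass_rates(SAMPLE, RULES))
```

In this sketch a missing speed fails both `speed_present` and `speed_in_range`, which hints at why the abstract stresses dependencies among rules: without them, a single root cause is counted as several independent failures.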

Keywords

Data quality / Vessel operations / Shipping / Validation rules / Noon reports

Cite this article

Gang Chen, Jie Cai, Niels Gorm Maly Rytter, Marie Lützen. A Practical Data Quality Assessment Method for Raw Data in Vessel Operations. Journal of Marine Science and Application, 2023, 22(2): 370-380. DOI: 10.1007/s11804-023-00326-w
