Matching dependencies: semantics and query answering

Jaffer GARDEZI, Leopoldo BERTOSSI, Iluju KIRINGA

PDF(384 KB)
PDF(384 KB)
Front. Comput. Sci. ›› 2012, Vol. 6 ›› Issue (3) : 278-292. DOI: 10.1007/s11704-012-2007-0
RESEARCH ARTICLE

Matching dependencies: semantics and query answering

Author information +
History +

Abstract

Matching dependencies (MDs) are used to declaratively specify the identification (or matching) of certain attribute values in pairs of database tuples when some similarity conditions on other values are satisfied. Their enforcement can be seen as a natural generalization of entity resolution. In what we call the pure case of MD enforcement, an arbitrary value from the underlying data domain can be used for the value in common that is used for a matching. However, the overall number of changes of attribute values is expected to be kept to a minimum. We investigate this case in terms of semantics and the properties of data cleaning through the enforcement of MDs. We characterize the intended clean instances, and also the clean answers to queries, as those that are invariant under the cleaning process. The complexity of computing clean instances and clean query answering is investigated. Tractable and intractable cases depending on the MDs are identified and characterized.

Keywords

databases / data cleaning / duplicate and entity resolution / integrity constraints / matching dependencies

Cite this article

Download citation ▾
Jaffer GARDEZI, Leopoldo BERTOSSI, Iluju KIRINGA. Matching dependencies: semantics and query answering. Front Comput Sci, 2012, 6(3): 278‒292 https://doi.org/10.1007/s11704-012-2007-0

References

[1]
Elmagarmid A, Ipeirotis P, Verykios V. Duplicate record detection: a survey. IEEE Transactions on Knowledge and Data Engineering, 2007, 19(1): 1-16
CrossRef Google scholar
[2]
Bleiholder J, Naumann F. Data fusion. ACM Computing Surveys, 2008, 41(1): 1-41
CrossRef Google scholar
[3]
Benjelloun O, Garcia-Molina H, Menestrina D, Su Q, Whang S, Widom J. Swoosh: a generic approach to entity resolution. The VLDB Journal, 2009, 18: 255-276
CrossRef Google scholar
[4]
Fan W. Dependencies revisited for improving data quality. In: Proceedings of the 27th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems. 2008, 159-170
[5]
Fan W, Jia X, Li J, Ma S. Reasoning about record matching rules. Proceedings of the VLDB Endowment, 2009, 2(1): 407-418
[6]
Arenas M, Bertossi L, Chomicki J. Consistent query answers in inconsistent databases. In: Proceedings of the 18th ACM SIGMODSIGACTSIGART Symposium on Principles of Database Systems. 1999, 68-79
[7]
Bertossi L. Consistent queryanswering in databases. ACM Sigmod Record, 2006, 35(2): 68-76
CrossRef Google scholar
[8]
Chomicki J. Consistent query answering: five easy pieces. In: Proceedings of the 11th International Conference on Database Theory. LNCS 4353. Springer, 2007, 1-17
[9]
Bertossi L. Database repairing and consistent query answering. Synthesis Lectures on Data Management. Morgan & Claypool, 2011
[10]
Bertossi L, Bravo L. Consistent query answers in virtual data integration systems. In: Bertossi L, Hunter A, Schaub T, eds. Inconsistency Tolerance. LNCS 3300. Berlin: Springer, 2005, 42-83
CrossRef Google scholar
[11]
Bertossi L, Kolahi S, Lakshmanan L. Data cleaning and query answering with matching dependencies and matching functions. In: Proceedings of the 14th International Conference on Database Theory. 2011, 268-279
[12]
Gardezi J, Bertossi L, Kiringa I. Matching dependencies with arbitrary attribute values: semantics, query answering and integrity constraints. In: Proceedings of the 4th International Workshop on Logic in Databases. 2011, 23-30
[13]
Abiteboul S, Hull R, Vianu V. Foundations of Databases. 1st edition. Addison-Wesley, 1995
[14]
Bertossi L, Kolahi S, Lakshmanan L. Data cleaning and query answering with matching dependencies and matching functions. Theory of Computing Systems, 2012 (in press)
CrossRef Google scholar
[15]
Franconi1 E, Palma A, Leone N, Perri S, Scarcello F. Census data repair: A challenging application of disjunctive logic programming. In: Nieuwenhuis R, Voronkov A, eds. Logic for Programming, Artificial Intelligence, and Reasoning. LNCS 2250. Berlin: Springer, 2001, 561-578
[16]
Flesca S, Furfaro F, Parisi F. Querying and repairing inconsistent numerical databases. ACM Transactions on Database Systems, 2010, 35(2): 14
CrossRef Google scholar
[17]
Wijsen J. Database repairing using updates. ACM Transactions on Database Systems, 2005, 30(3): 722-768
CrossRef Google scholar
[18]
Bertossi L, Bravo L, Franconi E, Lopatenko A. The complexity and approximation of fixing numerical attributes in databases under integrity constraints. Information Systems, 2008, 33(4): 407-434
CrossRef Google scholar
[19]
Barceló P. Logical foundations of relational data exchange. ACM SIGMOD Record, 2009, 38(1): 49-58
CrossRef Google scholar
[20]
Bahmani Z, Bertossi L, Kolahi S, Lakshmanan L V S. Declarative entity resolution via matching dependencies and answer set programs. In: Proceedings of the 13th International Conference on Principles of Knowledge Representation and Reasoning. 2012 (in press)
[21]
Gardezi J, Bertossi L. Query answering under matching dependencies for data cleaning: Complexity and algorithms. Arxiv preprint arXiv: 1112. 5908, 2011
[22]
Garey M, Johnson D. Computers and Intractability: A Guide to the Theory of NP-completeness. WH Freeman & Co, 1979
[23]
Goldreich O. Computational Complexity: A Conceptual Perspective. 1st edition. New York: Cambridge University Press, 2008
[24]
Papadimitriou C. Computational complexity. Addison-Wesley, 1994
[25]
Gelfond M, Lifschitz V. Classical negation in logic programs and disjunctive databases. New Generation Computing, 1991, 9(3): 365-385
CrossRef Google scholar
[26]
Brewka G, Eiter T, Truszczy’nski M. Answer set programming at a glance. Communications of the ACM, 2011, 54(12): 92-103
CrossRef Google scholar
[27]
Buccafurri F, Leone N, Rullo P. Enhancing disjunctive datalog by constraints. IEEE Transactions on Knowledge and Data Engineering, 2000, 12(5): 845-860
CrossRef Google scholar
[28]
Fan W, Li J, Ma S, Tang N, Yu W. Towards certain fixes with editing rules and master data. Proceedings of the VLDB Endowment, 2010, 3(1-2): 173-184

RIGHTS & PERMISSIONS

2014 Higher Education Press and Springer-Verlag Berlin Heidelberg
AI Summary AI Mindmap
PDF(384 KB)

Accesses

Citations

Detail

Sections
Recommended

/