Machine Learning Models: Combining Evidence of Similarity for XML Schema Matching

Tran Hong-Minh, Dan J. Smith

Research output: Chapter in Book/Report/Conference proceeding › Chapter


Schema matching at the element level or the structural level is generally categorized as either hybrid, which uses a single algorithm, or composite, which combines evidence from several different matching algorithms into the final similarity measure. We present a composite approach that combines element-level evidence of similarity for matching XML schemas. By combining high-recall algorithms in a composite system, we reduce the number of real matches missed. By experimenting with a number of machine learning models for combining evidence and choosing SMO (sequential minimal optimization, an algorithm for training support vector machines) for its high precision and recall, we increase the reliability of the final matching results. Precision is enhanced as a result (e.g., on the data sets used by Cupid and suggested by the author of LSD, our precision is on average 13.05% and 31.55% higher than that of COMA and Cupid, respectively).
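To make the composite idea concrete, the following is a minimal sketch and not the chapter's implementation. It assumes three illustrative element-level matchers (a character-level ratio, a token Jaccard overlap, and a trigram Dice coefficient) whose scores form a feature vector for each pair of element names, and it swaps in a plain perceptron as a stand-in for SMO so that the example needs only the Python standard library. All function names, element names, and labels below are hypothetical.

```python
import re
from difflib import SequenceMatcher

def edit_sim(a, b):
    """Character-level similarity in [0, 1] (difflib ratio)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def tokens(name):
    """Split camelCase or snake_case names into lowercase word tokens."""
    return {t.lower() for t in re.findall(r"[A-Za-z][a-z]*|\d+", name)}

def token_sim(a, b):
    """Jaccard overlap of the two names' token sets."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def trigram_sim(a, b):
    """Dice coefficient over character trigrams."""
    a, b = a.lower(), b.lower()
    ga = {a[i:i + 3] for i in range(len(a) - 2)}
    gb = {b[i:i + 3] for i in range(len(b) - 2)}
    if not ga or not gb:          # names shorter than three characters
        return float(a == b)
    return 2 * len(ga & gb) / (len(ga) + len(gb))

# Hypothetical matcher set; the chapter's actual matchers are not listed here.
MATCHERS = (edit_sim, token_sim, trigram_sim)

def features(a, b):
    """One similarity score per matcher: the evidence to be combined."""
    return [m(a, b) for m in MATCHERS]

def train_perceptron(pairs, labels, max_epochs=1000):
    """Learn weights for combining matcher scores (stand-in for SMO)."""
    w, bias = [0.0] * len(MATCHERS), 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for (a, b), label in zip(pairs, labels):
            x = features(a, b)
            t = 1 if label else -1
            if t * (sum(wi * xi for wi, xi in zip(w, x)) + bias) <= 0:
                w = [wi + t * xi for wi, xi in zip(w, x)]
                bias += t
                mistakes += 1
        if mistakes == 0:         # every training pair classified correctly
            break
    return w, bias

def is_match(model, a, b):
    """Combined decision: True if the weighted evidence is positive."""
    w, bias = model
    return sum(wi * xi for wi, xi in zip(w, features(a, b))) + bias > 0

# Made-up training data: element-name pairs labelled match / non-match.
PAIRS = [("shipToAddress", "ship_to_addr"), ("custName", "customerName"),
         ("orderDate", "date_of_order"), ("unitPrice", "price"),
         ("shipToAddress", "unitPrice"), ("custName", "orderDate"),
         ("quantity", "city"), ("zip", "itemNumber")]
LABELS = [True, True, True, True, False, False, False, False]

model = train_perceptron(PAIRS, LABELS)
```

After training, `is_match(model, "custName", "customerName")` returns the combined decision for a pair. In a setup closer to the one described above, the same feature vectors would instead be passed to an SVM trained with SMO, and the candidate models would be compared on precision and recall.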
Original language: English
Title of host publication: Knowledge Discovery from XML Documents
Editors: Richi Nayak, Mohammed Zaki
Publisher: Springer Berlin / Heidelberg
Number of pages: 11
Publication status: Published - 2006

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer Berlin / Heidelberg
