Proceedings of the ACM Symposium on Applied Computing
Building Automatic Mapping between XML Documents Using Approximate Tree Matching
Authors:
Xing, G.M., Xia, Z.H. and Ernest, A.
Keywords:
Building Automatic Mapping between XML Documents Using Approximate Tree Matching
Abstract:
The eXtensible Markup Language (XML) is becoming the standard format for data exchange on the Internet, providing interoperability among Web applications. It is important to provide efficient algorithms and tools to manipulate XML documents that are ubiquitous on the Web.
In this paper, we present a novel system for automating the transformation of XML documents based on structural mapping, under the restriction that the leaf text information is exactly the same in the source and target documents.
First, a tree edit distance algorithm is used to find the mapping between a pair of source and target documents. The introduction of tree partition significantly improves the efficiency of the tree matching algorithm. Second, template rules for transformation are inferred from the mapping using generalization. Third, a template matching component is used to process new documents.
Experimental studies show that our methods are promising and are applicable to Web document cleaning, information filtering, and other applications.
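
Illustrative sketch (not from the paper): the first step of the abstract relies on tree edit distance between a source and a target document. The Python below is a minimal, generic version of the classic recursive forest formulation with unit costs and memoization. It is not the authors' algorithm, omits their tree partition optimization, and returns only the distance rather than the node mapping. The names `to_tuple`, `size`, `forest_distance`, and `tree_edit_distance` are hypothetical, and element text is ignored on the assumption that only structure differs.

```python
from functools import lru_cache
import xml.etree.ElementTree as ET


def to_tuple(elem):
    """Convert an Element into a hashable (tag, children) tuple; text is ignored."""
    return (elem.tag, tuple(to_tuple(child) for child in elem))


def size(forest):
    """Count all nodes in a forest (a tuple of trees)."""
    return sum(1 + size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Unit-cost edit distance between two ordered forests."""
    if not f:
        return size(g)  # insert every node of g
    if not g:
        return size(f)  # delete every node of f
    (v_label, v_children) = f[-1]
    (w_label, w_children) = g[-1]
    # Delete the root of the rightmost tree in f; its children move up.
    delete = forest_distance(f[:-1] + v_children, g) + 1
    # Insert the root of the rightmost tree in g.
    insert = forest_distance(f, g[:-1] + w_children) + 1
    # Map the two rightmost roots onto each other (relabel if tags differ).
    match = (forest_distance(v_children, w_children)
             + forest_distance(f[:-1], g[:-1])
             + int(v_label != w_label))
    return min(delete, insert, match)


def tree_edit_distance(xml_a, xml_b):
    """Edit distance between the element structures of two XML strings."""
    a = to_tuple(ET.fromstring(xml_a))
    b = to_tuple(ET.fromstring(xml_b))
    return forest_distance((a,), (b,))


source = "<book><title>T</title><author>A</author></book>"
target = "<book><info><title>T</title><author>A</author></info></book>"
distance = tree_edit_distance(source, target)
```

In this example the target differs from the source only by one wrapper element, so a single node insertion accounts for the whole difference. This plain recursion is practical only for small documents; the efficiency gain from tree partition described in the abstract is not reproduced here.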