Abstract
Purpose – This paper presents the semantic content identifier (SCI), a permanent identifier computed by a linear-time onion-peeling algorithm that extracts semantic features from a text and integrates this information into the identifier itself.

Design/methodology/approach – The authors use the SCI to propose a mechanism that simultaneously checks the authenticity of information objects and the degree of similarity between them, and they present an empirical investigation of the method. A management scenario is proposed for controlling the authentication process and detecting the degree of violation of documents.

Findings – Such a mechanism could be adopted as a component of libraries’ strategy for protecting the copyright of documents published on the web.

Practical implications – The proposed numeric code can be used as a constituent part of the digital object identifier (DOI) system, making its computation more efficient and meaningful.

Originality/value – The proposed identifier can provide a more efficient index for identifying and retrieving objects in digital libraries, as well as in online repositories and commercial applications, so that information retrieval requests can be handled more effectively.
Original language | English |
---|---|
Pages (from-to) | 439-451 |
Number of pages | 13 |
Journal | Program: Electronic Library and Information Systems |
Volume | 45 |
Issue number | 4 |
Publication status | Published - 2011 |
Keywords
- Text identification, Information retrieval, Semantics, Persistent identifiers, Data handling, Copyright, Research work