The dilated triple

Abstract

This article was published as The dilated triple. Marko A. Rodriguez, Alberto Pepe, Joshua Shinavier. In: Emergent Web Intelligence: Advanced Semantic Technologies, Advanced Information and Knowledge Processing series, Pages 3-16, ISBN:978-1-84996-076-2, Springer-Verlag. 2010.

Abstract. The basic unit of meaning on the Semantic Web is the RDF statement, or triple, which combines a distinct subject, predicate and object to make a definite assertion about the world. A set of triples constitutes a graph, to which they give a collective meaning. It is upon this very simple foundation that the rich, complex knowledge structures of the Semantic Web are built. Yet the very expressiveness of RDF, by inviting comparison with real-world knowledge, highlights a fundamental shortcoming: RDF is limited to statements of absolute fact, in contrast to the thoroughly context-sensitive nature of human thought. However, when a statement is interpreted from beyond the scope of its local graph representation, other statements augment its meaning and identify its uniqueness. Following this line of thought, a model is presented in which each statement in an RDF graph is supplemented by some subjectively related subgraph of the same RDF graph, thereby framing the meaning of the statement within a broader context.

Introduction

The World Wide Web introduced a set of standards and protocols that has led to the development of a collectively generated graph of web resources. Individuals participate in creating this graph by contributing digital resources (e.g. documents and images) and linking them together by means of dereferenceable Hypertext Transfer Protocol (HTTP) Uniform Resource Identifiers (URIs) (Berners-Lee 1994). While the World Wide Web is primarily a technology that came to fruition in the early nineties, much of the inspiration behind it originated earlier, in systems such as Vannevar Bush’s visionary Memex device (Bush 1945) and Ted Nelson’s Xanadu (Nelson 1981). What made the World Wide Web the de facto standard was that it provided a common, relatively simple, distributed platform for the exchange of digital information. The World Wide Web has had such a strong impact on the processes of communication and cognition that it can be regarded as a revolution in the history of human thought – following those of language, writing and print (Harnad 1991).

While the World Wide Web has provided an infrastructure that has revolutionized the way in which many people go about their daily lives, it has become apparent over the years that there are shortcomings to its design. Many of the standards developed for the World Wide Web lack a mechanism for representing “meaning” in a form that can be easily interpreted and used by machines. For instance, the majority of the Web is made up of a vast collection of Hypertext Markup Language (HTML) documents. HTML documents are structured such that a computer can discern the intended layout of the information contained within a document, but the content itself is expressed in natural language and is thus understandable only to humans. Furthermore, all HTML documents link web resources according to a single type of relationship. The meaning of a hypertext relationship can be loosely interpreted as “cites” or “related to”. The finer, specific meaning of this relationship is not made explicit in the link itself. In many cases, this finer meaning is made explicit within the natural-language text of the HTML document. Unfortunately, without sophisticated text analysis algorithms, machines are not privy to the communication medium of humans. Yet, even within the single relationship model, machines have performed quite well in supporting humans as they go about discovering and sharing information on the World Wide Web (Brin 1998, Kleinberg 1999, Haveliwala 2002, Golder 2006).

The Web was designed as an information space, with the goal that it should be useful not only for human-human communication, but also that machines would be able to participate and help. One of the major obstacles to this has been the fact that most information on the Web is designed for human consumption, and even if it was derived from a database with well defined meanings (in at least some terms) for its columns, that the structure of the data is not evident to a robot browsing the web. (Berners-Lee 1998)

As a remedy to the aforementioned shortcoming of the World Wide Web, the Semantic Web initiative has introduced a standard data model which makes explicit the type of relationship that exists between two web resources (Berners-Lee 2001, Berners-Lee 2001a). Furthermore, the Linked Data community has not only seen a need to link existing web resources in meaningful ways, but also a need to link the large amounts of non-typical web data (e.g. database information) (Bizer 2008). The standard for relating web resources on the Semantic Web is the Resource Description Framework (RDF) (Berners-Lee 2001, Klyne 2004). RDF is a data model that is used to create graphs of the form \[R \subseteq \underbrace{(U \cup B)}_{\text{subject}} \times \underbrace{U}_{\text{predicate}} \times \underbrace{(U \cup B \cup L)}_{\text{object}},\] where \(U\) is the i