Alberto Pepe edited Comparing contexts.tex about 11 years ago

Commit id: 2572acd4c0800b636d9011fc376a3b830f03b0c7

A simple way for the process to distinguish between the various interpretations of \texttt{foaf:knows} is to intersect its history with the context of the relationships. In other words, the process can compare its history subgraph with the subgraph that constitutes a dilated triple. If $H \subseteq R$ is a graph defining the history of the process, which includes the process' traversal through the scholarly aspects of Marko, then it is the case that $| H \cap T_x | > | H \cap T_y |$, as the process' scholarly perspective is more related to Marko and Alberto than it is to Marko and Carole. That is, the process' history $H$ has more triples in common with $T_x$ than with $T_y$. Thus, what the process means by \texttt{foaf:knows} is a ``scholarly'' \texttt{foaf:knows}. This idea is diagrammed in Figure 4, where $H$ has more in common with $T_x$ than with $T_y$; thus, an intersection of these sets would yield a solution to the query (\texttt{lanl:marko}, \texttt{foaf:knows}, ?o) that includes Alberto and not Carole. (Note: $H$ need not be a dynamic context that is generated as a process moves through an RDF graph. $H$ can also be seen as a static, hardwired ``expectation'' of what the process should perceive. For instance, $H$ could include ontological triples and known instance triples. In such cases, querying for such relationships as \texttt{foaf:knows}, \texttt{foaf:fundedBy}, \texttt{foaf:memberOf}, etc. would yield results related to $H$ -- biasing the results towards those relationships that are most representative of the process' expectations.) In other words, the history of the process ``blinds'' the process in favor of interpreting its place in the graph from the scholarly angle. (Note: This notion is sometimes regarded as a ``reality tunnel'' \cite{nerosoc:wilson1979,prome:wilson1983}.) The trivial intersection method of identifying the degree of similarity between two graph structures can be extended. 
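
As a minimal worked illustration of the intersection comparison, consider a toy instance in which the triples $t_1, \ldots, t_7$ are hypothetical placeholders rather than triples from the example data. Suppose
\[
H = \{t_1, t_2, t_3, t_4\}, \quad T_x = \{t_2, t_3, t_4, t_5\}, \quad T_y = \{t_4, t_6, t_7\}.
\]
Then
\[
| H \cap T_x | = |\{t_2, t_3, t_4\}| = 3 > 1 = |\{t_4\}| = | H \cap T_y |,
\]
so the process would favor $T_x$. Because dilated triples can differ in size, one possible refinement (an assumption made here, not part of the method described above) is to normalize the overlap, for example with the Jaccard coefficient $|H \cap T| / |H \cup T|$, which gives $3/5$ for $T_x$ and $1/6$ for $T_y$ in this toy instance.
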
Other algorithms, such as those based on spreading activation within a semantic graph \cite{spread:collins1975,inform:cohen1987,search:crestani2000,grammar:rodriguez2008}, can be used as a fuzzier, more probabilistic means of determining the relative ``semantic distance'' between two graphs \cite{semdist:delugach1993}. Spreading activation methods are more analogous to the connectionist paradigm of cognitive science than to the symbolic methods of artificial intelligence research \cite{rumelhart:conn1993}. The purpose of a spreading activation algorithm is to determine which resources in a semantic graph are most related to some other set of resources. In general, a spreading activation algorithm diffuses an energy distribution over a graph, starting from a set of resources and proceeding until a predetermined number of steps have been taken or the energy decays to some $\epsilon \approx 0$. (Note: In many ways this is analogous to finding the primary eigenvector of the graph using the power method. However, the energy vector at time step $1$ only has values for the source resources, the energy vector is decayed on each iteration, and only a limited number of iterations are executed, as a steady-state distribution is not desired.) Those resources that receive the most energy flow during the spreading activation process are considered the most similar to the set of source resources. With respect to the particular example at hand, the energy diffusion would start at the resources in $H$ and the results would be compared with the resources of $T_x$ and $T_y$. If the set of resources in $T_x$ received more energy than those in $T_y$, then the dilated triple $T_x$ is considered more representative of the context of $H$. (Note: Spreading activation on a semantic graph is complicated as edges have labels. 
A framework that makes use of this fact to perform arbitrary path traversals through a semantic graph is presented in \cite{grammar:rodriguez2008}.) By taking advantage of the supplementary information contained within a dilated triple, a process has more information on which to base its interpretation of the meaning of a triple. To the process, a triple is not simply a string of three symbols, but is instead a larger knowledge structure that encapsulates the uniqueness of the relationship. The process can use this information to bias its traversal of the graph and, thus, how it goes about discovering information in the graph.
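
The spreading activation procedure described in this section can be sketched in matrix form. The notation that follows is introduced only for illustration and is not taken from the cited work: $V(G)$ denotes the set of resources appearing in the triples of a graph $G$, $n = |V(R)|$, edge labels are ignored, $\delta \in (0,1)$ is an assumed decay parameter, and $k$ is an assumed maximum number of steps. Let $A$ be the $n \times n$ matrix with $A_{ij} = 1/d_i$ if at least one triple in $R$ has subject $i$ and object $j$, and $A_{ij} = 0$ otherwise, where $d_i$ is the number of distinct such objects $j$ for resource $i$. The initial energy vector is $e_1(i) = 1/|V(H)|$ for $i \in V(H)$ and $e_1(i) = 0$ otherwise, and each step diffuses and decays the energy:
\[
e_{t+1} = \delta \, A^{\top} e_t, \qquad t = 1, \ldots, k-1,
\]
stopping early if $\| e_t \|_1 < \epsilon$. Since every row of $A$ sums to at most $1$ and $e_t$ is nonnegative, $\| e_{t+1} \|_1 \leq \delta \, \| e_t \|_1$, so $\| e_t \|_1 \leq \delta^{\,t-1}$ and the threshold $\epsilon$ is guaranteed to be reached once $t > 1 + \log \epsilon / \log \delta$. Under these assumptions, the energy received by a dilated triple $T$ can be scored as
\[
s(T) = \sum_{i \in V(T)} \sum_{t=2}^{k} e_t(i),
\]
where the sum starts at $t = 2$ so that only energy received through diffusion is counted, not the energy initially placed on the source resources. In this sketch, $T_x$ would be considered more representative of the context of $H$ than $T_y$ when $s(T_x) > s(T_y)$.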