Extracting and Representing Causal Relations in Children’s Stories
Stories are an essential part of human knowledge and communication. They are composed of a series of related concepts, such as events and states, which people use to share ideas with other members of society. Previous research has tried to replicate the way humans produce or understand stories through creative text generation systems. Unfortunately, data on the relationships between events within and across sentences in a story is scarce because common sense knowledge is lacking. To address this, we implemented Eventure, a system that extracts instances of event relations from children’s stories. The system identifies concepts and their meta-data in stories using a third-party language processing tool that provides preprocessing capabilities such as tokenization and POS tagging. With the concepts and meta-data collected, Eventure applies a predefined list of grammar templates and rules to extract instances of event relations and stores them in an ontology. The initial list of grammar rules was taken from (Samson 2014) and modified to accommodate the meta-data of concepts. A new event relation between a causing state and a resulting event was also added. To validate the system’s accuracy, a gold standard of event relation instances was created from ten children’s stories. The system yielded a precision of only 3.27%, a recall of 10.14%, and an F-measure of 4.95%. These low scores are attributed to the relatively generic extraction templates, the complexity of the children’s stories, and inherent problems with the POS tagger used.
Keywords: causal relation, relation extraction, knowledge representation, lexical semantics
A story is essentially a series of events. ...
In this paper, we describe Eventure, our system that extracts event relations from children’s stories using predefined extraction templates and rules, as well as concept indicators. Several word and sentence analysis tools, such as morphological analyzers and transducers, are also used. Section 2 describes an event relation and the representation of an event in Eventure’s ontology. Section 3 discusses the templates and rules used in the extraction process. Section 4 presents an analysis of the quality of the extracted relations. The paper ends with a discussion of issues and recommendations for future work.
some intro text...
An event relation is a form of binary semantic relation represented as common sense assertions of the form relation(concept1, concept2). This form was patterned after ConceptNet (cite) and is used to provide the storytelling knowledge needed by story generation systems (MakeBelieve, PB1, PB2).
A number of relations are used by ConceptNet to describe events, as shown in Table 1.
Table 1. Event Relations <INSERT TABLE 4.2 HERE>
<please insert something here...>
To extract event relations, different types of concepts need to be identified; these are listed in Table 2. Of the event relations, the first four, namely EffectOf, EffectOfIsState, EventForGoalEvent and EventForGoalState, are similar to those used in ConceptNet to describe events. Happens(f, t) represents that a fluent f holds at time t; fluent is a concept adopted from (cite) and is treated as an event in our research. The last event relation, CauseOfIsState, is derived from the first two event relations and represents a state of a story character that may lead to the execution of an event. For example, CauseOfIsState(sleep, tired) means that if a story character is tired (a state), he/she may go to sleep (an event).
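As an illustration of the relation(concept1, concept2) form, the following is a minimal Python sketch of how such assertions could be held in memory. The class names Relation, Assertion and AssertionStore are hypothetical and are not taken from Eventure; the store is only a stand-in for the ontology, whose actual structure is not described here.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    """Relation names mentioned in the text; the full inventory is in Table 1."""
    EFFECT_OF = "EffectOf"
    EFFECT_OF_IS_STATE = "EffectOfIsState"
    EVENT_FOR_GOAL_EVENT = "EventForGoalEvent"
    EVENT_FOR_GOAL_STATE = "EventForGoalState"
    HAPPENS = "Happens"
    CAUSE_OF_IS_STATE = "CauseOfIsState"


@dataclass(frozen=True)
class Assertion:
    """A binary assertion of the form relation(concept1, concept2)."""
    relation: Relation
    concept1: str
    concept2: str

    def __str__(self) -> str:
        return f"{self.relation.value}({self.concept1}, {self.concept2})"


class AssertionStore:
    """Hypothetical in-memory stand-in for the ontology (an assumption)."""

    def __init__(self) -> None:
        self._assertions = set()

    def add(self, assertion: Assertion) -> None:
        # Identical assertions collapse into one entry because a set is used.
        self._assertions.add(assertion)

    def by_relation(self, relation: Relation) -> list:
        # Return all stored assertions of one relation type, sorted by text.
        return sorted(
            (a for a in self._assertions if a.relation is relation),
            key=str,
        )


store = AssertionStore()
store.add(Assertion(Relation.CAUSE_OF_IS_STATE, "sleep", "tired"))
store.add(Assertion(Relation.CAUSE_OF_IS_STATE, "sleep", "tired"))  # duplicate
for assertion in store.by_relation(Relation.CAUSE_OF_IS_STATE):
    print(assertion)
```

Storing assertions as immutable values makes repeated extractions of the same relation from different stories collapse naturally; whether Eventure instead keeps frequency counts for repeated instances is not stated in the text.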
Table 2. Concepts in Eventure <INSERT TABLE 4.1 HERE>
These concepts are used to define the elements that make up an extraction template, which are shown in Table 3. The last two elements are based on McIntyre and Lapata’s (cite) content planning phase for ?story generation?, and are used in Eventure to extract event relations that span sentences.
Table 3. Elements used in Extraction Templates <INSERT THE COMBINED CONTENTS OF TABLES 4.3 AND 4.5 HERE>
Twelve extraction templates were defined for event relations; they are shown in Table 4.
Table 4. Extraction Templates for Event Relations <INSERT THE COMBINED CONTENTS OF TABLES 4.4 AND 4.5 / 5.2 HERE>
Each of the extraction templates has an associated set of rules (why?).
Summarize the types of rules and provide one or two example...
The corpus comprises 30 stories for children aged five to eight, collected from the Internet. The stories were manually pre-processed to reformat dialogues into direct sentences and to split compound, complex and compound-complex sentences into simple sentences. The corpus is then passed to an automated pre-processing module that applies tokenization, POS tagging and co-reference resolution, yielding the sample output in Listing 1:
Listing 1. Sample output from pre-processing module
Sample Sentence: Piglet smiled because Tigger gave him a present
Tokenizer, POS Tagger: [Piglet] [smiled] [because] [Tigger] [gave] [him] [a] [present] [.]
Gazetteer: [character] [taskindicator, causeindicator] [character] [determiner]
Morphological Analyzer: [smile] [give]
Chunker: [B-NP] [B-VP] [B-SBAR] [B-NP] [B-VP] [B-NP] [I-NP] [I-NP] [O]
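To make the role of these annotation layers concrete, the sketch below merges the output of Listing 1 into per-token records and applies one hypothetical template of the form "event, cause indicator, event". It is a minimal illustration in Python, not Eventure's implementation: the Token class, both function names, the template itself, and the alignment of gazetteer labels and lemmas to individual tokens are all assumptions. Because the text does not specify the argument order of the relations, the result is returned as unlabeled (cause, effect) pairs rather than as a named relation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Token:
    text: str
    chunk: str                          # chunker tag from Listing 1
    lemma: Optional[str] = None         # morphological analyzer output (verbs only)
    gazetteer: frozenset = frozenset()  # gazetteer labels, if any


# Listing 1 rebuilt token by token. The alignment of gazetteer labels and
# lemmas to tokens is an assumption; the listing does not state it explicitly.
SENTENCE = [
    Token("Piglet", "B-NP", gazetteer=frozenset({"character"})),
    Token("smiled", "B-VP", lemma="smile"),
    Token("because", "B-SBAR",
          gazetteer=frozenset({"taskindicator", "causeindicator"})),
    Token("Tigger", "B-NP", gazetteer=frozenset({"character"})),
    Token("gave", "B-VP", lemma="give"),
    Token("him", "B-NP"),
    Token("a", "I-NP", gazetteer=frozenset({"determiner"})),
    Token("present", "I-NP"),
    Token(".", "O"),
]


def nearest_event(tokens, start, step):
    """Walk from `start` in direction `step`; return the first verb lemma found."""
    i = start
    while 0 <= i < len(tokens):
        if tokens[i].chunk.endswith("-VP") and tokens[i].lemma:
            return tokens[i].lemma
        i += step
    return None


def extract_causal_pairs(tokens):
    """Hypothetical template: <event> causeindicator <event>.

    Returns (cause, effect) lemma pairs. The event after the indicator is
    taken as the cause and the event before it as the effect.
    """
    pairs = []
    for i, token in enumerate(tokens):
        if "causeindicator" in token.gazetteer:
            effect = nearest_event(tokens, i - 1, -1)
            cause = nearest_event(tokens, i + 1, +1)
            if cause and effect:
                pairs.append((cause, effect))
    return pairs


print(extract_causal_pairs(SENTENCE))
```

For the sample sentence this yields the pair (give, smile): the giving event is read as the cause of the smiling event. A template this generic will also fire on sentences where the nearest verbs are not the causally related ones.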