ConceptNet 5.5: An Open Multilingual Knowledge Graph about Natural Language
ConceptNet is a knowledge graph that connects words and phrases of natural language with labeled edges. It is designed to improve natural language applications by allowing the application to better understand the meanings behind the words people use. Version 5.5 extends its representation to include word forms in many languages. ConceptNet provides applications with understanding that they would not acquire from distributional semantics (such as word2vec) alone, nor from narrower resources such as WordNet or DBPedia. We demonstrate this with state-of-the-art results on intrinsic evaluations (word relatedness and analogies) that translate into improvements on an extrinsic evaluation in story understanding (the Story Cloze Test).
ConceptNet is a knowledge graph that connects words and phrases of natural language (terms) with labeled, weighted edges (assertions). The original release of ConceptNet (Liu 2004) was intended as a parsed representation of Open Mind Common Sense (Singh 2002), a crowd-sourced knowledge project. This paper describes the release of ConceptNet 5.5, which has expanded to include lexical and world knowledge from many different sources in many languages.
In this paper, we will concisely represent an assertion as a triple of its start node, relation label, and end node: the assertion that "a dog has a tail" can be represented as (dog, HasA, tail).
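As an illustration of this notation, the following minimal Python sketch stores an assertion as a named triple. The class name `Assertion`, its field names, the `as_triple` helper, and the default weight of 1.0 are assumptions made for this example; they are not ConceptNet's actual data format or API.

```python
from typing import NamedTuple


class Assertion(NamedTuple):
    """One labeled, weighted edge between two terms (illustrative only)."""
    start: str
    relation: str
    end: str
    weight: float = 1.0  # assumed default; the text gives no specific weights

    def as_triple(self) -> str:
        # Render the edge in the (start, relation, end) notation used here.
        return f"({self.start}, {self.relation}, {self.end})"


# The assertion that "a dog has a tail".
edge = Assertion("dog", "HasA", "tail")
print(edge.as_triple())
```

A named tuple keeps the triple immutable and hashable, so assertions could be placed in sets or used as dictionary keys without extra work.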
ConceptNet 5.5 is built from the following sources:
With the combination of these sources, ConceptNet contains over 21 million edges and over 8 million nodes. Its English vocabulary contains approximately 1,500,000 nodes, and there are 83 languages in which it contains at least 10,000 nodes.
The largest source of input for ConceptNet is Wiktionary, which provides 18.1 million edges and is mostly responsible for its large multilingual vocabulary. However, much of the character of ConceptNet comes from OMCS and the various games with a purpose, which express many different kinds of relations between terms, such as PartOf ("a wheel is part of a car") and UsedFor ("a car is used for driving").
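To make the role of relation labels concrete, the sketch below groups a few example triples by relation and looks up the end nodes reachable from a term. The `by_relation` index and the `ends_for` function are hypothetical names introduced for this example; they do not describe how ConceptNet itself stores or queries edges.

```python
from collections import defaultdict

# Example triples taken from the assertions quoted in the text.
assertions = [
    ("dog", "HasA", "tail"),
    ("wheel", "PartOf", "car"),
    ("car", "UsedFor", "driving"),
]

# Hypothetical index: relation label -> list of (start, end) pairs.
by_relation = defaultdict(list)
for start, relation, end in assertions:
    by_relation[relation].append((start, end))


def ends_for(start_term, relation):
    """Return every end node linked from start_term by the given relation."""
    return [end for start, end in by_relation.get(relation, []) if start == start_term]


print(ends_for("car", "UsedFor"))
```

Indexing by relation label first is one simple design choice; an index keyed by start node would suit term-centered lookups better.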