ConceptNet 5.5: An Open Multilingual Knowledge Graph about Natural Language

Abstract

ConceptNet is a knowledge graph that connects words and phrases of natural language with labeled edges. It is designed to improve natural language applications by allowing them to better understand the meanings behind the words people use. Version 5.5 extends its representation to include word forms in many languages. ConceptNet provides applications with understanding that they would not acquire from distributional semantics (such as word2vec) alone, nor from narrower resources such as WordNet or DBpedia. We demonstrate this with state-of-the-art results on intrinsic evaluations (word relatedness and analogies) that translate into improvements on an extrinsic evaluation in story understanding (the Story Cloze Test).

Introduction

ConceptNet is a knowledge graph that connects words and phrases of natural language (terms) with labeled, weighted edges (assertions). The original release of ConceptNet (Liu 2004) was intended as a parsed representation of Open Mind Common Sense (Singh 2002), a crowd-sourced knowledge project. This paper describes the release of ConceptNet 5.5, which has expanded to include lexical and world knowledge from many different sources in many languages.

In this paper, we concisely represent an assertion as a triple of its start node, relation label, and end node. For example, the assertion that "a dog has a tail" can be represented as (dog, HasA, tail).
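The triple notation can be illustrated with a minimal Python sketch. The `Assertion` type, its field names, the `triple` helper, and the default weight of 1.0 are assumptions made here for illustration; they are not part of ConceptNet's own code or API.

```python
from typing import NamedTuple


class Assertion(NamedTuple):
    """A labeled, weighted edge between two terms (hypothetical helper type)."""

    start: str            # start node (a term)
    relation: str         # relation label, e.g. "HasA"
    end: str              # end node (a term)
    weight: float = 1.0   # edge weight; this default value is an assumption

    def triple(self) -> str:
        """Render the assertion in (start, relation, end) notation."""
        return f"({self.start}, {self.relation}, {self.end})"


# The example from the text: "a dog has a tail".
dog_has_tail = Assertion("dog", "HasA", "tail")
print(dog_has_tail.triple())  # prints "(dog, HasA, tail)"
```

The weight is kept out of the printed triple because the notation above names only the start node, relation label, and end node.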