
MathSemantifier - a Notation-based Semantification Study
  • Ion Toloaca

Corresponding Author: [email protected]


Abstract

Mathematical formulae are highly ambiguous content for which typesetting systems such as LaTeX store only rendering information. MathSemantifier is an open-source, notation-based mathematical formula semantification system that attempts to tackle the problem of ambiguity in mathematical documents and produce knowledge-rich equivalents. The system extracts formulae (from formats such as LaTeX or MathML) and produces content-rich results (Content MathML) that contain no semantic ambiguity. The disambiguation is achieved by matching the input formulae against a known database of notation definitions, which is aggregated into a context-free grammar. This paper outlines an implementation of MathSemantifier that focuses on helping researchers semantify their works, with the ultimate goal being a scalable implementation that needs minimal human help and could therefore be used to semantify large collections of mathematical papers such as arXiv \citep{arXiv}.
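To make the idea of matching formulae against notation definitions concrete, the following is a minimal Python sketch and not MathSemantifier's actual implementation. It assumes a toy notation database (`NOTATIONS`), a hypothetical `readings` function, and pre-tokenized input. Each notation is treated as a grammar production, and every reading of the token sequence is enumerated as a nested tuple, standing in for the Content MathML the real system would emit.

```python
from functools import lru_cache

ARG = None  # marks an argument slot inside a notation pattern

# Hypothetical toy notation database: (meaning, presentation pattern).
# Two meanings share one pattern, which is the source of ambiguity.
NOTATIONS = [
    ("open_interval", ("(", ARG, ",", ARG, ")")),
    ("ordered_pair", ("(", ARG, ",", ARG, ")")),
    ("plus", (ARG, "+", ARG)),
]


def readings(tokens):
    """Return every reading of the token sequence as a nested tuple."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def expr(i, j):
        # All readings of tokens[i:j] as a single expression.
        results = []
        if j - i == 1 and tokens[i].isalnum():
            results.append(tokens[i])  # a bare identifier or number
        for name, pattern in NOTATIONS:
            for args in match(pattern, i, j):
                results.append((name,) + args)
        return tuple(results)

    def match(pattern, i, j):
        # Yield argument tuples for which pattern covers tokens[i:j] exactly.
        if not pattern:
            if i == j:
                yield ()
            return
        head, rest = pattern[0], pattern[1:]
        if head is ARG:
            # Leave at least one token for each remaining pattern item,
            # so every recursive span is strictly smaller.
            for k in range(i + 1, j - len(rest) + 1):
                for sub in expr(i, k):
                    for tail in match(rest, k, j):
                        yield (sub,) + tail
        elif i < j and tokens[i] == head:
            yield from match(rest, i + 1, j)

    return list(expr(0, len(tokens)))


if __name__ == "__main__":
    for reading in readings("( a + b , c )".split()):
        print(reading)
```

In this sketch, `( a + b , c )` yields two readings, one as an open interval and one as an ordered pair, each with `a + b` resolved to `plus`. Choosing between such candidates is the point at which a system of this kind would need either additional context or help from a human.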