We’ve combined the grammar with NLP tools such as part-of-speech (POS) taggers, noun phrase (NP) chunkers and stemmers/lemmatizers to make the extraction procedure more robust. We could have used a machine learning (ML) based parser, but given the limited vocabulary of the recipe domain, that seemed like overkill.

Ontology Mapping

An ontology represents knowledge as a hierarchy of concepts within a domain, using a shared vocabulary to denote the types, properties and interrelationships of those concepts. Loosely speaking, an existing food database can be treated as a food ontology (e.g. foods have classifications, relationships to one another, etc.). Using the United States Department of Agriculture (USDA)’s National Nutrient Database for Standard Reference as our standard food ontology, the primary aim of this part of the algorithm was to map an ingredient input like “1 teaspoon vanilla” to a specific node in the database, in this case “Vanilla extract” (http://ndb.nal.usda.gov/ndb/search/list?qlookup=02050).

Instead of a traditional approach, we used the text analysis capabilities of Elasticsearch, an open source search engine, for the ontology mapping.
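To make the idea concrete, here is a minimal pure-Python sketch of analysis-based matching. It is not Elasticsearch and not our production setup: it only imitates the general pattern of analyzing both the ingredient text and the food names (lowercasing, tokenizing, crude stemming, dropping unit words) and then ranking by token overlap. The food table, the unit list, the Jaccard scoring and the names `analyze` and `best_match` are all assumptions made for illustration; only the “Vanilla extract” entry (02050) comes from the example above.

```python
import re

# Tiny stand-in for the food database. Only "02050" comes from the text;
# the other ids and names are hypothetical placeholders.
FOODS = {
    "02050": "Vanilla extract",
    "demo-sugar": "Sugar, granulated",
    "demo-butter": "Butter, salted",
}

# Hypothetical list of measurement words to ignore during matching.
UNITS = {"teaspoon", "tablespoon", "cup", "ounce", "pound", "gram"}


def analyze(text):
    """Lowercase, keep alphabetic tokens, strip a trailing 's', drop units."""
    tokens = re.findall(r"[a-z]+", text.lower())
    # Naive stand-in for a real stemmer: "cups" -> "cup", "eggs" -> "egg".
    stems = {t[:-1] if t.endswith("s") and len(t) > 3 else t for t in tokens}
    return stems - UNITS


def best_match(ingredient, foods=FOODS):
    """Return (food_id, score) for the food whose tokens overlap most."""
    query = analyze(ingredient)
    best_id, best_score = None, 0.0
    for food_id, name in foods.items():
        doc = analyze(name)
        union = query | doc
        # Jaccard similarity between the two token sets.
        score = len(query & doc) / len(union) if union else 0.0
        if score > best_score:
            best_id, best_score = food_id, score
    return best_id, best_score


print(best_match("1 teaspoon vanilla"))
```

With this toy table, “1 teaspoon vanilla” reduces to the single token “vanilla” and lands on the “Vanilla extract” entry. A real search engine adds much more on top of this (proper stemming, synonyms, relevance scoring), which is why we preferred reusing one over hand-rolling the matching.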