Introduction

Recipes vary enormously in cooking styles and ingredients, some of which are specific to a community, a culture, or even a country. This diversity makes it challenging to design a system that can infer nutritional information accurately and with little manual intervention. Although it’s possible to look up and enter each ingredient manually from an enormous database, doing so is time-consuming and impractical in day-to-day life. To automatically deduce nutritional information from textual recipes, we’ve segmented the core procedure into the following steps:

  • Information Extraction (IE) from text recipes, using a rule-based or NLP (Natural Language Processing) parser

  • Conversion to structured data - amount, unit, ingredient name, and any modifiers (e.g. “lightly beaten”)

  • Mapping of each ingredient to an existing food ontology (the USDA Food Database is used here for demonstration purposes; the approach can be extended to other food databases such as NUTTAB)

  • Deduction of weights from various lexical clues and ingredient densities

  • Deduction of final nutritional information and