Ontotext

EDAMAM Food KB

EDAMAM is a US new media startup by the founders of Bulgaria's most successful internet media holding. Edamam contracted Ontotext to develop an extensive Knowledge Base about Food and Cooking, with the ambition to become an authoritative source of cooking information and to provide attractive end-user interfaces for access to this wealth of knowledge. This success story describes some aspects of the application and the technologies involved.

 

Edamam is about eating better. We harness technology to organize food knowledge and give it back to the people so they can make smarter choices about food. We aim to bring the joy of food and cooking back into people's lives.
Transforming the organic and implied knowledge about food into structured data was a huge challenge. Ontotext was instrumental in solving this problem.
– Victor Penev, Edamam CEO

See a SemTech 2011 case study on this topic, and the Jan 2013 news item "Semantic Technology for Healthier Eating".

Focused Crawling

EDAMAM uses Ontotext's Focused Crawling technology and Web Mining Framework to extract recipes from numerous sites. Currently the KB includes more than 300,000 recipes, and more sites are added all the time. After creating the initial crawlers and providing training, additional crawlers can be added by the client.

EDAMAM parses the recipes, extracts the available information, converts it to a common ontology, and provides recipe de-duplication. A link to the original site and full credit are provided by the search interfaces.
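The source does not say how de-duplication is performed. As a minimal sketch of one possible approach, the Python below treats two recipes as duplicates when their normalized title and ingredient set match. The field names (`title`, `ingredients`, `url`) and the example URLs are assumptions made for illustration, not part of the EDAMAM ontology.

```python
import hashlib
import re


def normalize(text):
    """Lower-case the text and collapse punctuation and whitespace runs."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def fingerprint(title, ingredients):
    """Hash the normalized title plus the sorted set of normalized ingredients."""
    parts = [normalize(title)] + sorted({normalize(i) for i in ingredients})
    return hashlib.sha1("|".join(parts).encode("utf-8")).hexdigest()


def deduplicate(recipes):
    """Keep the first recipe seen for each fingerprint, preserving input order."""
    seen = {}
    for recipe in recipes:
        key = fingerprint(recipe["title"], recipe["ingredients"])
        seen.setdefault(key, recipe)
    return list(seen.values())


# Hypothetical crawled records; the first two differ only in case,
# punctuation, and ingredient order.
recipes = [
    {"title": "Tomato Soup!", "ingredients": ["Tomato", "Onion"], "url": "http://example.com/a"},
    {"title": "tomato soup", "ingredients": ["onion ", "tomato"], "url": "http://example.org/b"},
    {"title": "Onion Tart", "ingredients": ["Onion", "Flour"], "url": "http://example.com/c"},
]
unique = deduplicate(recipes)
```

Exact fingerprint matching only catches near-identical copies. Recipes that were reworded would need a fuzzier similarity measure.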

Ontology, Pragmatics, Searching

EDAMAM's food ontology includes recipes, ingredients, nutrition information, measures, etc.

Derived (inferred) data includes cooking time, dietary restrictions (e.g. Vegan, Vegetarian, Kosher), recipe classification, recipe complexity, nutrition information for the whole recipe and per serving, and the degree to which the recipe contributes to a balanced diet (based on Recommended Daily Intakes). The EDAMAM ontology includes some 30 classes and detailed properties.
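To make the per-serving and balanced-diet derivations concrete, here is a minimal Python sketch. It assumes that per-serving values are the recipe totals divided by the number of servings, and that the contribution to a balanced diet is the per-serving amount as a fraction of a Recommended Daily Intake. The nutrient names and all numbers are invented placeholders, not values from the EDAMAM KB or from any official table.

```python
def per_serving(totals, servings):
    """Divide whole-recipe nutrient totals by the number of servings."""
    if servings <= 0:
        raise ValueError("servings must be positive")
    return {nutrient: amount / servings for nutrient, amount in totals.items()}


def rdi_share(amounts, rdi):
    """Fraction of the daily intake covered by each nutrient that has an RDI entry."""
    return {n: amounts[n] / rdi[n] for n in amounts if n in rdi}


# Placeholder data for illustration only.
recipe_totals = {"protein_g": 60.0, "fiber_g": 10.0}
placeholder_rdi = {"protein_g": 50.0, "fiber_g": 25.0}

serving = per_serving(recipe_totals, servings=4)
share = rdi_share(serving, placeholder_rdi)
```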

Numerous domain-specific facts and pragmatics are taken into account, such as:

  • Conversion from a measure (e.g. Cup) to weight depends on the state of the ingredient. For example, minced onion weighs more than chopped onion.
  • Certain measures depend on the ingredient. E.g. a pouch of Dry Onion Soup has a different weight than a pouch of Flavor Fresh Tuna.
  • Default values are provided for all measures, even subjective ones such as "to taste", "dash of", "top it up", etc. In some cases these defaults depend on the total meal weight (e.g. "salt to taste")
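The three rules above can be sketched as a small lookup in Python. This is an illustration under stated assumptions, not the EDAMAM implementation: the table layout, the function name `weight_in_grams`, and every gram value are hypothetical. Only the ordering "minced onion weighs more than chopped onion" comes from the text.

```python
# Grams per one unit of a measure, keyed by (measure, ingredient, state).
# All numbers are invented placeholders.
UNIT_WEIGHT_G = {
    ("cup", "onion", "chopped"): 160.0,
    ("cup", "onion", "minced"): 180.0,  # minced is heavier per cup than chopped
    ("pouch", "dry onion soup", None): 40.0,
    ("pouch", "tuna", None): 75.0,
}

# Subjective measures with a fixed default weight in grams.
FIXED_DEFAULT_G = {"dash of": 0.5}

# Subjective measures whose default is a fraction of the total meal weight.
MEAL_FRACTION_DEFAULT = {"to taste": 0.005}


def weight_in_grams(measure, ingredient, state=None, quantity=1.0, meal_weight_g=None):
    """Resolve a recipe measure to grams using the lookup tables above."""
    if measure in MEAL_FRACTION_DEFAULT:
        if meal_weight_g is None:
            raise ValueError("meal_weight_g is required for measure %r" % measure)
        return MEAL_FRACTION_DEFAULT[measure] * meal_weight_g
    if measure in FIXED_DEFAULT_G:
        return FIXED_DEFAULT_G[measure]
    key = (measure, ingredient, state)
    if key not in UNIT_WEIGHT_G:
        raise KeyError("no weight known for %r" % (key,))
    return quantity * UNIT_WEIGHT_G[key]


two_cups_minced = weight_in_grams("cup", "onion", "minced", quantity=2)
salt_to_taste = weight_in_grams("to taste", "salt", meal_weight_g=1000.0)
```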

The next picture illustrates a small populated example of specific individual measures.

ANAM specific measures

The KB is stored in the OWLIM semantic repository, allowing very fast inferencing and search. The KB offers a development SPARQL endpoint, RDF exploration through Ontotext's Forest UI, and several search applications. Full-text search is available through the Lucene index that is integrated in OWLIM.

Data Sources, Text Analysis, Mapping

After extracting the parts of a recipe, EDAMAM uses Text Analysis and Semantic Annotation techniques provided by the KIM Platform to map ingredients, cooking techniques and tools to industry databases. The most important of these databases are:

  • US Department of Agriculture's Standard Reference: provides a list of some 9000 ingredients, including full nutrition information for some 140 nutrients
  • A very comprehensive food description thesaurus (multitude of classifications)

These data sets allow EDAMAM to compute precise nutritional information, and filter by various dietary restrictions.
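As a rough illustration of how ingredient classifications could drive dietary filtering, the Python sketch below assigns a diet label to a recipe only when none of its ingredients falls into a category that the diet excludes. The category names, the ingredient table, and the exclusion rules are hypothetical simplifications, not taken from the thesaurus or from EDAMAM's rules.

```python
# Hypothetical ingredient classification (a real thesaurus is far richer).
INGREDIENT_CATEGORY = {
    "tofu": "plant",
    "onion": "plant",
    "butter": "dairy",
    "egg": "egg",
    "chicken": "meat",
}

# Categories that each diet excludes (simplified assumption).
DIET_EXCLUDES = {
    "vegetarian": {"meat", "fish"},
    "vegan": {"meat", "fish", "dairy", "egg"},
}


def diet_labels(ingredients):
    """Return the diets a recipe satisfies; unknown ingredients yield no labels."""
    if any(i not in INGREDIENT_CATEGORY for i in ingredients):
        return set()  # be conservative when an ingredient cannot be classified
    categories = {INGREDIENT_CATEGORY[i] for i in ingredients}
    return {diet for diet, excluded in DIET_EXCLUDES.items() if not categories & excluded}


def filter_by_diet(recipes, diet):
    """Keep only the recipes (name -> ingredient list) that satisfy the diet."""
    return [name for name, ingredients in recipes.items() if diet in diet_labels(ingredients)]


recipes = {
    "tofu stir-fry": ["tofu", "onion"],
    "buttered onions": ["butter", "onion"],
    "roast chicken": ["chicken", "onion"],
}
vegetarian = filter_by_diet(recipes, "vegetarian")
```

Returning no labels for an unclassified ingredient errs on the side of not claiming a restriction that cannot be verified.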

The EDAMAM KB is also mapped to available Linked Open Data, such as DBpedia and Freebase.

User Interfaces

The aim of the project is to create a comprehensive knowledge base that can be used for various applications: recipe search, healthy eating applications, shopping, even cooking robots and smart fridges. The initial release of the project includes two consumer applications:

Mobile Application

A smartphone application for iPhone and Android (developed by Ontotext's sibling company Sirma Mobile). The first screen below shows a recipe view in which the user refines (restricts) the result set by selecting the computed criterion "Balanced Diet". The second screen shows detailed nutrition information:

Refine (restrict) by computed criterion 'Balanced Diet'

Detailed nutrition information

Web Application

The recipe detail screen shows instructions, ingredient list, dietary classifications, total energy, a bar of the fundamental nutrients, and detailed nutrition information:

 recipe details

UI Features

The user interface provides efficient full-text search, ranking by various criteria, and filtering by dietary restrictions and other recipe classifications.

Auto completion and spelling correction are also provided, as illustrated below (phonetic completion and spelling correction respectively):

Spelling correction
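The source does not describe how completion and correction are implemented. As a simple stand-in, the Python sketch below uses prefix matching for completion and the standard library's `difflib.get_close_matches` for spelling correction. This is string-similarity matching, not the phonetic technique mentioned above, and the vocabulary is a hypothetical sample.

```python
import difflib

# Hypothetical sample of indexed terms.
VOCABULARY = ["bread", "broccoli", "brownie", "zucchini"]


def complete(prefix, vocabulary=VOCABULARY, limit=5):
    """Return vocabulary terms that start with the typed prefix, alphabetically."""
    p = prefix.lower()
    return sorted(term for term in vocabulary if term.startswith(p))[:limit]


def correct(word, vocabulary=VOCABULARY, limit=3):
    """Return the closest vocabulary terms to a possibly misspelled word."""
    return difflib.get_close_matches(word.lower(), vocabulary, n=limit, cutoff=0.6)


completions = complete("bro")
suggestions = correct("brocoli")
```

The `cutoff` of 0.6 is the library default. Raising it returns fewer but closer suggestions.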