Create a Packaged Food Ingredient Network
Packaged foods contain a wide variety of ingredients, including food colorings, preservatives and more. With detailed lists of thousands of computable food ingredient entities, the Wolfram Language allows for easy analysis of over 100,000 packaged foods and their ingredients.
For example, it is easy to find foods with a specific ingredient, such as lemon powder.
It is also easy to find foods that do not have certain ingredients, in order to satisfy a dietary restriction. For example, find refried beans without lard for a vegetarian diet.
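The two lookups above amount to filtering foods by whether an ingredient is present or absent. The following is a minimal Python sketch of that idea, not the Wolfram Language input itself; the `foods` dictionary and the helper names `foods_with` and `foods_without` are hypothetical toy stand-ins for the entity data.

```python
# Toy data (hypothetical): food name -> list of ingredient names.
foods = {
    "Lemon Pepper Seasoning": ["salt", "black pepper", "lemon powder"],
    "Traditional Refried Beans": ["pinto beans", "water", "lard", "salt"],
    "Vegetarian Refried Beans": ["pinto beans", "water", "canola oil", "salt"],
    "Lemonade Mix": ["sugar", "citric acid", "lemon powder"],
}


def foods_with(foods, ingredient):
    """Return the sorted names of foods whose ingredient list contains `ingredient`."""
    return sorted(name for name, items in foods.items() if ingredient in items)


def foods_without(foods, ingredient, name_contains=""):
    """Return the sorted names of foods matching `name_contains` that lack `ingredient`."""
    return sorted(
        name
        for name, items in foods.items()
        if name_contains.lower() in name.lower() and ingredient not in items
    )


with_lemon = foods_with(foods, "lemon powder")
beans_no_lard = foods_without(foods, "lard", name_contains="refried beans")
```

Matching on exact ingredient strings is an assumption of this sketch; real ingredient lists would likely need normalization first.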
Write an entity function to find the most common ingredient pairings for an entity class of foods.
Use the entity function to compare some food brands by their most commonly paired ingredients.
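As a rough illustration of the pairing function and the brand comparison, here is a Python sketch on toy data. It is not the entity function from the original example; the brand names, ingredient lists and the helper `most_common_pairs` are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def most_common_pairs(ingredient_lists, n=3):
    """Count unordered ingredient pairs across foods and return the n most frequent."""
    counts = Counter()
    for items in ingredient_lists:
        # Sort and de-duplicate so that each pair has one canonical form per food.
        counts.update(combinations(sorted(set(items)), 2))
    return counts.most_common(n)


# Toy data (hypothetical): brand -> ingredient lists of its foods.
brands = {
    "Brand A": [["sugar", "cocoa", "salt"], ["sugar", "cocoa", "milk"]],
    "Brand B": [["water", "salt", "vinegar"], ["water", "salt", "tomato"]],
}

# Compare brands by their single most common pairing.
top_by_brand = {brand: most_common_pairs(lists, n=1) for brand, lists in brands.items()}
```

Sorting each list before pairing means that `("cocoa", "sugar")` and `("sugar", "cocoa")` are counted as the same pair.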
A much larger analysis of packaged-food ingredients is also possible. Start by finding ingredient lists for about 67,000 foods from the USDA Branded Food Products Database.
For each food, find all pairs of ingredients and how frequently they occur together.
Find the five most common ingredient pairs.
Construct a graph in which each edge is weighted by how often a pair of ingredients occurs together and each vertex is weighted by how often the individual ingredient appears.
First, determine how frequently the individual ingredients appear.
Then construct a graph from the weighted ingredients and their pair weights. The resulting graph is very large, with over 5,000 vertices and 700,000 edges.
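The counting and graph-construction steps can be sketched in Python as follows. This is an illustrative analog on a handful of made-up ingredient lists, not the original Wolfram Language code; the graph is represented as a plain adjacency dictionary, which is an assumption of this sketch.

```python
from collections import Counter
from itertools import combinations

# Toy data (hypothetical): one ingredient list per food.
ingredient_lists = [
    ["salt", "water", "sugar"],
    ["salt", "water", "yeast"],
    ["sugar", "cocoa", "salt"],
    ["water", "citric acid"],
]

ingredient_counts = Counter()  # vertex weights: number of foods containing each ingredient
pair_counts = Counter()        # edge weights: number of foods containing both ingredients
for items in ingredient_lists:
    unique = sorted(set(items))
    ingredient_counts.update(unique)
    pair_counts.update(combinations(unique, 2))

# Undirected weighted graph as an adjacency map: graph[a][b] == graph[b][a] == pair weight.
graph = {name: {} for name in ingredient_counts}
for (a, b), weight in pair_counts.items():
    graph[a][b] = weight
    graph[b][a] = weight

vertex_count = len(graph)
edge_count = sum(len(neighbors) for neighbors in graph.values()) // 2
```

Because every food with k distinct ingredients contributes k(k-1)/2 pairs, the edge count grows much faster than the vertex count, which is consistent with the large edge total reported above.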
Despite its large size, the graph is connected, which indicates that there exists a path of pairings between any two ingredients in the ingredient lists for these foods.
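Connectivity can be checked with a breadth-first search from any vertex: the graph is connected exactly when the search reaches every vertex. A small Python sketch on a hypothetical toy network (the ingredient names and the helper `is_connected` are illustrative only):

```python
from collections import deque

# Toy undirected network (hypothetical): ingredient -> set of paired ingredients.
graph = {
    "salt": {"water", "sugar", "cocoa"},
    "water": {"salt", "citric acid"},
    "sugar": {"salt", "cocoa"},
    "cocoa": {"salt", "sugar"},
    "citric acid": {"water"},
}


def is_connected(graph):
    """Return True if every vertex is reachable from an arbitrary start vertex."""
    if not graph:
        return True
    start = next(iter(graph))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return len(seen) == len(graph)


connected = is_connected(graph)
# Adding an ingredient that is never paired with anything breaks connectivity.
with_isolated = is_connected({**graph, "saffron": set()})
```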
You can find the center of the ingredient network, that is, the ingredients whose greatest distance to any other ingredient is smallest. These are the most widely paired ingredients.
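In graph-theoretic terms, the center of a connected graph is the set of vertices with minimum eccentricity, where the eccentricity of a vertex is its greatest shortest-path distance to any other vertex. The Python sketch below computes this on a hypothetical toy network; it assumes the graph is connected and unweighted, and the helper names are illustrative.

```python
from collections import deque

# Toy undirected network (hypothetical): ingredient -> set of paired ingredients.
graph = {
    "salt": {"water", "sugar", "cocoa"},
    "water": {"salt", "citric acid"},
    "sugar": {"salt", "cocoa"},
    "cocoa": {"salt", "sugar"},
    "citric acid": {"water"},
}


def eccentricity(graph, source):
    """Greatest shortest-path distance (in edges) from `source`; assumes a connected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return max(dist.values())


def graph_center(graph):
    """Return the sorted vertices whose eccentricity is minimal."""
    ecc = {vertex: eccentricity(graph, vertex) for vertex in graph}
    smallest = min(ecc.values())
    return sorted(vertex for vertex, value in ecc.items() if value == smallest)


center = graph_center(graph)
```

In this toy network the center contains both "salt" and "water", even though "water" has no more pairings than "sugar" or "cocoa"; centrality by eccentricity and by number of pairings are related but not identical measures.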
Another way to find the most commonly paired ingredients is to compute the PageRank centrality for the network, which shows that salt, water and sugar are among the most commonly paired ingredients.
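For intuition, PageRank can be approximated by power iteration: each vertex repeatedly passes a damped share of its score to its neighbors. The Python sketch below is a simplified, unweighted version on a hypothetical toy network; the damping factor 0.85 and the fixed iteration count are conventional choices assumed here, not values taken from the original example.

```python
# Toy undirected network (hypothetical): ingredient -> set of paired ingredients.
graph = {
    "salt": {"water", "sugar", "cocoa"},
    "water": {"salt", "citric acid"},
    "sugar": {"salt", "cocoa"},
    "cocoa": {"salt", "sugar"},
    "citric acid": {"water"},
}


def pagerank(graph, damping=0.85, iterations=100):
    """Unweighted PageRank by power iteration; scores sum to 1."""
    n = len(graph)
    rank = {vertex: 1.0 / n for vertex in graph}
    for _ in range(iterations):
        new_rank = {vertex: (1.0 - damping) / n for vertex in graph}
        for vertex, neighbors in graph.items():
            if neighbors:
                share = damping * rank[vertex] / len(neighbors)
                for neighbor in neighbors:
                    new_rank[neighbor] += share
            else:
                # A vertex with no pairings spreads its score evenly over all vertices.
                for other in graph:
                    new_rank[other] += damping * rank[vertex] / n
        rank = new_rank
    return rank


ranks = pagerank(graph)
ranked = sorted(ranks, key=ranks.get, reverse=True)
```

Here `ranked[0]` is the ingredient with the highest score. Weighting each share by pair frequency would be a natural extension, but it is left out to keep the sketch short.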
The full graph is too large to visualize usefully, so use the neighborhood graph of a specific ingredient instead.
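A neighborhood graph keeps a chosen vertex, the vertices within a given distance of it, and the edges among them. The following Python sketch extracts such a subgraph from a hypothetical weighted toy network; the helper `neighborhood_graph` and its `radius` parameter are illustrative names, not a library API.

```python
from collections import deque

# Toy weighted network (hypothetical): graph[a][b] is how often a and b are paired.
graph = {
    "salt": {"water": 2, "sugar": 2, "cocoa": 1},
    "water": {"salt": 2, "citric acid": 1},
    "sugar": {"salt": 2, "cocoa": 1},
    "cocoa": {"salt": 1, "sugar": 1},
    "citric acid": {"water": 1},
}


def neighborhood_graph(graph, center, radius=1):
    """Return the subgraph induced by all vertices within `radius` edges of `center`."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue  # do not expand beyond the requested radius
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    # Keep only the edges whose endpoints are both inside the neighborhood.
    return {
        vertex: {other: w for other, w in graph[vertex].items() if other in dist}
        for vertex in dist
    }


cocoa_neighborhood = neighborhood_graph(graph, "cocoa")
```

The result keeps the original pair weights, so it can be styled by edge weight in the same way as the full network.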
Visualize the neighborhood graph as a 3D graph.
Style the graph according to edge and vertex weights, noting how connected the structure is and how the most common ingredients are in the center of the graph, as was computed in the full network.