Development of a Food Image Recognition Algorithm Using Machine Learning


 Background: Researchers and consumers have limited options for objectively collecting or tracking data related to food choices. Objective: To develop and pilot test an algorithm that could accurately categorize food items from a meal photograph. Methods: We used a dataset of 7721 meal photographs taken by patrons in a cafeteria setting. We designed 22 broad categories recognizable by image that are parents of the original 1239 types of items in the photographs. We split the dataset into 3 mutually exclusive subsets: a training set (5250 images), a validation set (1312 images), and a test set (1159 images). Using a convolutional neural network and standard machine learning techniques, we tested the operating characteristics of the algorithm. Results: Salad recognition had the lowest specificity (0.74), while multiple categories had specificities close to 1.0 (e.g. cereals, pastries, sushi, yogurt). Areas under the ROC curve (AUCs), reflecting trade-offs between sensitivity and specificity, ranged from 0.73 (for yogurt) to 0.97 (for sushi). Conclusions: This work provides proof-of-concept for an algorithm that can categorize food items from a meal photograph.

Obesity and physical inactivity are leading causes of premature death in the United States and across the globe. The costs of obesity and its health-related problems are also a substantial societal burden. Obese adults have more visits to physicians and a higher number of inpatient hospital days per year [1], [2]. Given the public health and medical consequences of obesity and unhealthy eating, nutrition research is of critical importance.

Methods used to measure the nutritional content of foods that people eat include various forms of food surveys. A food diary (or food/diet record) has people record all foods and drinks consumed over a specified time period (e.g. consecutive days). Measuring the items is critical to obtaining reliable estimates. The method does not rely on recall, but it is burdensome and the measurement process itself might alter eating behaviors. The Food Frequency Questionnaire [3] asks people to report how frequently they consumed certain foods and drinks over a specified time period. It is less burdensome and does not alter eating behaviors, but it relies on recall. Additionally, it is not as precise in terms of quantifying intake. The Automated Self- [...]

[...] we were limited to binary labels indicating presence or absence of the food items.

A few example pictures are shown in Figure 1. [...] Meat Lovers Pizza) is a 0 since "12oz Chicken Noodle" is a subcategory of "Soup", but "Meat Lovers Pizza" is not a subcategory of "Desserts".

In order to convert a label $\tilde{y}_i$ for an image $x_i$ into a coarse label vector $y_i$, we defined the following: [...] where $\mathbb{1}(\cdot)$ is the elementwise indicator function. This process established a dataset [...]

Having reduced the number of categories from 1239 to 22, the number of instances of each label in the dataset is sufficient to train a deep learning system to label food items according to the broad categories (the number of instances for each coarse label is shown in Table 1).

We use $D_{\text{val}}$ to validate the performance of the system and tune hyperparameters.
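The conversion from fine-grained labels to coarse labels can be illustrated with a minimal sketch. It assumes a binary membership matrix whose entry is 1 when a fine item is a subcategory of a coarse category, followed by an elementwise indicator. The names `FINE`, `COARSE`, `M`, and `to_coarse` are hypothetical, as are the "Pizza" category and the extra menu items; the toy hierarchy is not the paper's actual 1239-to-22 mapping.

```python
# Hypothetical toy hierarchy; the real dataset has 1239 fine items and 22 coarse categories.
FINE = ["12oz Chicken Noodle", "Meat Lovers Pizza", "Cheese Pizza", "Chocolate Brownie"]
COARSE = ["Soup", "Pizza", "Desserts"]

# M[c][f] == 1 means fine item f is a subcategory of coarse category c (assumed layout).
M = [
    [1, 0, 0, 0],  # Soup
    [0, 1, 1, 0],  # Pizza
    [0, 0, 0, 1],  # Desserts
]

def to_coarse(fine_vec):
    """Map a binary fine-label vector to a binary coarse-label vector.

    Each coarse entry is the indicator of (row of M) . fine_vec > 0, so
    several fine items under the same parent still yield a single 1.
    """
    return [
        1 if sum(m * y for m, y in zip(row, fine_vec)) > 0 else 0
        for row in M
    ]

# A tray holding both kinds of pizza: the count of 2 collapses to a binary 1.
two_pizzas = to_coarse([0, 1, 1, 0])
print(dict(zip(COARSE, two_pizzas)))  # {'Soup': 0, 'Pizza': 1, 'Desserts': 0}
```

The indicator step is what keeps the coarse labels binary, which matches the presence-or-absence labels available in the dataset.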

Loss & Training

We use an elementwise binary cross-entropy loss to train the model: [...] Here, $L$ represents the loss on a batch of size $n$, and $i$ is used to denote the $i$-th [...] and specificities to get a better sense of our model's performance.
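As a rough illustration of an elementwise binary cross-entropy loss for multi-label outputs, the sketch below treats every (image, category) pair as an independent binary prediction. Averaging over all entries of the batch and clipping probabilities with `eps` are assumptions made here for numerical convenience; the paper's exact normalization is not recoverable from the text. The function name `bce_loss` is hypothetical.

```python
import math

def bce_loss(y_true, y_prob, eps=1e-7):
    """Mean elementwise binary cross-entropy over a batch.

    y_true: list of n binary label vectors (one per image).
    y_prob: list of n vectors of predicted probabilities, same shape.
    The mean over all n * C entries is an assumed normalization.
    """
    total = 0.0
    count = 0
    for labels, probs in zip(y_true, y_prob):
        for y, p in zip(labels, probs):
            p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
            total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            count += 1
    return total / count

# Two images, three coarse categories each.
labels = [[1, 0, 0], [0, 1, 1]]
confident = [[0.9, 0.1, 0.1], [0.1, 0.9, 0.9]]
uncertain = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]
print(bce_loss(labels, confident) < bce_loss(labels, uncertain))  # True
```

Because each category contributes its own term, an image can be labeled with several categories at once, which a softmax cross-entropy over mutually exclusive classes would not allow.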

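Per-category sensitivity, specificity, and AUC of the kind reported here can be computed from binary labels and predicted scores. The following is a minimal sketch for a single category, assuming a decision threshold of 0.5 (an assumption, not a value stated in the text) and computing AUC as the probability that a random positive image scores higher than a random negative one. The function names and the example data are hypothetical.

```python
def sensitivity_specificity(y_true, y_score, threshold=0.5):
    """Return (sensitivity, specificity) for one category.

    Assumes at least one positive and one negative example.
    """
    tp = fn = tn = fp = 0
    for y, s in zip(y_true, y_score):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, y_score):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))

# Made-up labels and scores for one category across five test images.
y_true = [1, 1, 0, 0, 0]
y_score = [0.9, 0.4, 0.6, 0.2, 0.1]
sens, spec = sensitivity_specificity(y_true, y_score)
print(round(sens, 2), round(spec, 2), round(auc(y_true, y_score), 2))  # 0.5 0.67 0.83
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes the trade-off across all thresholds, which is why both views are useful when comparing categories.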
AUC values ranged from 0.73 for yogurt to 0.97 for sushi. Figure 3 shows the ROC curves for each food category, which show that the model works very well for most items.

A few test set images and their corresponding predictions are shown in Figure 4. As can be seen, the algorithm correctly identifies the categories of foods in each photo. These categories are broad, however, and additional information would need to be gathered from the user to maximize the usefulness of the program.

[...] customizable to a person's usual preferences for food (e.g., the category "sandwich" could default to a "half-size ham & cheese on white bread", and "drink" could default to a "16 oz sweetened iced tea"