GCAM: Gaussian and causal-attention model of food fine-grained recognition

doi:10.21203/rs.3.rs-4134165/v1

Research Article

This work is licensed under a CC BY 4.0 License

Journal Publication

published 20 Jun, 2024

Read the published version in Signal, Image and Video Processing →

You are reading this latest preprint version

Most current food recognition methods rely on deep learning for category classification. However, these approaches struggle to distinguish visually similar food samples, which makes fine-grained food recognition a pressing problem. To address it, we propose a Gaussian and causal-attention model designed for fine-grained object recognition. The model is first trained to capture Gaussian characteristics in target regions and then extracts detailed features from the objects, improving the feature mapping of those regions. To counter data drift caused by skewed data distributions, we adopt a counterfactual reasoning strategy: counterfactual interventions are used to examine the effect of the learned image attention on network predictions, which allows the attention weights to be optimized for fine-grained recognition. A learnable loss strategy is also developed to keep training consistent across modules, thereby improving the precision of the final recognition task. Our method has been validated on four relevant datasets, where it demonstrated superior performance. Specifically, the Gaussian and Causal-Attention Model (GCAM) outperformed existing state-of-the-art methods on the ETH-FOOD101, UECFOOD256, and Vireo-FOOD172 datasets and achieved leading results on the CUB-200 dataset.
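The counterfactual intervention on attention described in the abstract can be illustrated with a small sketch. The abstract does not give the formulation, so everything here is an assumption made for illustration: the function names, the use of uniform attention as the counterfactual, the attention-weighted pooling, and the toy linear classifier are all hypothetical and are not taken from the GCAM paper. The idea shown is only that the effect of the learned attention can be measured as the difference between the prediction under the learned attention and the prediction under a counterfactual attention.

```python
def attention_pool(features, attention):
    """Attention-weighted average of region features (weights normalized to sum to 1)."""
    total = sum(attention)
    weights = [a / total for a in attention]
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


def linear_logits(pooled, class_weights):
    """Toy linear classifier: one weight row per class (hypothetical stand-in for a network head)."""
    return [sum(w * x for w, x in zip(row, pooled)) for row in class_weights]


def counterfactual_effect(features, attention, class_weights, counterfactual=None):
    """Per-class difference between logits under learned and counterfactual attention.

    Assumption: if no counterfactual attention is given, uniform attention is used.
    """
    if counterfactual is None:
        counterfactual = [1.0] * len(attention)
    factual = linear_logits(attention_pool(features, attention), class_weights)
    baseline = linear_logits(attention_pool(features, counterfactual), class_weights)
    return [f - b for f, b in zip(factual, baseline)]


# Three image regions with 2-dimensional features; region 0 carries the class-0 cue.
features = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
attention = [0.8, 0.1, 0.1]               # learned attention focuses on region 0
class_weights = [[1.0, 0.0], [0.0, 1.0]]  # class 0 reads feature 0, class 1 reads feature 1

effect = counterfactual_effect(features, attention, class_weights)
print(effect)
```

In this toy case the effect is positive for class 0 and negative for class 1, meaning the learned attention moves the prediction toward class 0 relative to attending everywhere equally. A training objective could then reward attention whose effect favors the correct class, though how GCAM does this is not stated in the abstract.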

Gaussian function

Counterfactuals are inferences

Fine-grained identification of food

attention mechanism

No competing interests reported.

Editorial decision: Revision requested
12 Apr, 2024
Reviews received at journal
09 Apr, 2024
Reviewers agreed at journal
21 Mar, 2024
Reviewers invited by journal
20 Mar, 2024
Submission checks completed at journal
20 Mar, 2024
Editor assigned by journal
20 Mar, 2024
First submitted to journal
20 Mar, 2024

Status:

Journal Publication

Version 1
