Deep learning-based 3D reconstruction and pervasive monitoring for nutritional analysis
File | Description | Size | Format |
---|---|---|---|
Lo-P-2022-PhD-Thesis.pdf | Thesis | 245.43 MB | Adobe PDF |
Title: | Deep learning-based 3D reconstruction and pervasive monitoring for nutritional analysis |
Authors: | Lo, Po Wen |
Item Type: | Thesis or dissertation |
Abstract: | A recent National Health Service (NHS) survey in England reported that the proportion of adults who were obese or overweight was steadily increasing. Unhealthy food consumption, including nutritional imbalance and excess calorie intake, is one of the causes of obesity. Commonly used daily dietary assessment methods, such as 24-hour dietary recalls (24HR), have proved effective in helping users to understand their dietary behaviour and in enabling targeted interventions to address underlying health problems, such as obesity and diabetes. However, in this self-reporting technique, the food types and portion sizes reported depend heavily on users' subjective judgement, which may lead to biased and inaccurate dietary analysis results. As such, a variety of objective vision-based dietary assessment approaches have been proposed recently. To facilitate the development and industrialisation of objective dietary assessment technologies, and to motivate researchers to improve the accuracy of dietary reporting, an in-depth study of vision-based approaches is an important step forward. This thesis first summarises the computing algorithms, mathematical models and methodologies used in the field of vision-based dietary assessment. It also provides a comprehensive comparison of the state-of-the-art approaches in image-based food recognition and portion size estimation in terms of their model accuracy, efficiency and constraints. While these methods show promising outcomes in tackling issues in nutritional epidemiology studies, several challenges and forthcoming opportunities, as detailed in this thesis, still exist. For instance, in portion size estimation, previous research has mostly focused on model-based or stereo-based approaches, which respectively rely on manual intervention or require users to capture multiple frames from different viewing angles, a process that can be tedious.
The main objective of the thesis is to tackle these challenges by investigating innovative measures to quantify individual nutrient intake. With advances in Artificial Intelligence (AI) for computer vision applications, existing approaches show promising results in image-based food recognition; however, a number of hurdles remain in estimating nutrient intake, as accurate food consumption measurement requires accurate portion size estimation, and current portion size estimation techniques are neither satisfactory in performance nor convenient to use. Therefore, this thesis explores the feasibility and potential of achieving accurate image-based portion size estimation using deep learning-based approaches. Specifically, a novel vision-based method based on deep learning view synthesis and point cloud registration is proposed to enable accurate portion size estimation of food items consumed. View synthesis is a method that infers the occluded or unseen view of an object given only an image from a single viewing angle. With this approach, 3D reconstruction can be carried out using Iterative Closest Point (ICP) to recover the full geometric shape of the 3D food models without requiring images captured from multiple viewing angles and positions. Building on progress in point cloud processing, we further developed an end-to-end learning-based 3D reconstruction approach via point cloud completion networks. To achieve this, pairs of partial and corresponding complete 3D models of food items are required. In this context, a new volume-annotated dataset with synthetic 3D models of food items, namely the Volume-3D dataset, is also constructed for validating our proposed portion size estimation pipeline.
To ensure that the synthetic training dataset is applicable to real-world scenarios, domain adaptation has also been investigated to help bridge the reality gap via a Generative Adversarial Network (GAN)-based approach. Despite the great promise of the learning-based 3D reconstruction approach in portion size estimation, it requires depth information to address the issue of scale ambiguity; however, hand-held or wearable devices with depth sensors, such as Time-of-Flight (ToF) cameras or LiDAR scanners, are still costly and bulky. To facilitate large-scale dietary assessment and reduce system costs, an alternative dietary assessment pipeline via egocentric cameras is explored. This pipeline does not rely on depth information and requires only RGB images. In this approach, semantic segmentation is applied to recognise multiple food types, and newly designed handcrafted features are extracted for portion size estimation. Comprehensive experiments are conducted to validate our methods on a large-scale in-the-wild dataset captured under settings that simulate the unique conditions of Low-and-Middle-Income Countries (LMIC), with participants of Ghanaian and Kenyan origin eating common Ghanaian/Kenyan dishes. To demonstrate efficacy, experienced dietitians performed visual portion size estimation, and their estimates were compared with those of our proposed method. In my PhD work, I have demonstrated the feasibility of accurate image-based portion size estimation for dietary assessment. The advantages and limitations of the approaches proposed in this thesis are also discussed, followed by potential research directions and new challenges that need to be addressed in the future. |
Content Version: | Open Access |
Issue Date: | Oct-2021 |
Date Awarded: | Mar-2022 |
URI: | http://hdl.handle.net/10044/1/103228 |
DOI: | https://doi.org/10.25560/103228 |
Copyright Statement: | Creative Commons Attribution NonCommercial Licence |
Supervisor: | Lo, Benny |
Sponsor/Funder: | Lee, Richard (Dr); Bill and Melinda Gates Foundation |
Funder's Grant Number: | OPP1171395 |
Department: | Department of Surgery & Cancer |
Publisher: | Imperial College London |
Qualification Level: | Doctoral |
Qualification Name: | Doctor of Philosophy (PhD) |
Appears in Collections: | Department of Surgery and Cancer PhD Theses |
This item is licensed under a Creative Commons License