3D gaze based semi-autonomous wheelchair
Author(s)
Subramanian, Mahendran
Type
Thesis
Abstract
A gaze-based semi-autonomous wheelchair was developed to control mobility platforms by
decoding how the user looks at the environment to understand where they want to navigate
their mobility device. However, many natural eye movements are not relevant for action
intention decoding; only some are, which poses a challenge for decoding, the so-called ‘Midas
touch’ problem.
Herein, a new solution is presented, consisting of (1) deep computer vision to understand what
object a user is looking at in their field of view, (2) an analysis of where on the object’s
bounding box the user is looking, and (3) a simple machine learning classifier to determine
whether the overt visual attention on the object is predictive of a navigation intention to that
object.
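The three steps above can be illustrated with a minimal sketch. It is not the thesis implementation: the bounding-box layout `(x_min, y_min, x_max, y_max)`, the mean-position feature, the nearest-centroid classifier, and all function and class names are assumptions made for illustration. The abstract only specifies "a simple machine learning classifier", and the object detector is assumed to have already supplied the box.

```python
from statistics import mean


def normalise_gaze(gaze_xy, box):
    """Map a gaze point to coordinates relative to a bounding box.

    box is assumed to be (x_min, y_min, x_max, y_max) in image pixels;
    (0, 0) is the box's top-left corner and (1, 1) its bottom-right.
    """
    x, y = gaze_xy
    x_min, y_min, x_max, y_max = box
    return ((x - x_min) / (x_max - x_min), (y - y_min) / (y_max - y_min))


def fixation_features(gaze_points, box):
    """Mean relative position of the gaze samples that land inside the box.

    Returns None when no sample falls on the object.
    """
    inside = [
        normalise_gaze(p, box)
        for p in gaze_points
        if box[0] <= p[0] <= box[2] and box[1] <= p[1] <= box[3]
    ]
    if not inside:
        return None
    return (mean(u for u, _ in inside), mean(v for _, v in inside))


class NearestCentroid:
    """Hypothetical stand-in for the 'simple machine learning classifier'."""

    def fit(self, features, labels):
        # Store one centroid (per-dimension mean) for each class label.
        self.centroids = {}
        for label in set(labels):
            rows = [f for f, l in zip(features, labels) if l == label]
            self.centroids[label] = tuple(mean(col) for col in zip(*rows))
        return self

    def predict(self, feature):
        # Return the label whose centroid is closest in squared distance.
        def sq_dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(feature, centroid))

        return min(self.centroids, key=lambda lab: sq_dist(self.centroids[lab]))
```

In use, a classifier trained on labelled gaze episodes would be queried with the features of a new episode, for example `NearestCentroid().fit(train_features, train_labels).predict(fixation_features(samples, box))`, where the training data and label names are supplied by the experimenter.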
This decoding system ultimately determines whether the user wants to drive to, e.g., a TV or
is merely looking at it. Crucially, we find that when users look at an object and imagine
interacting with it, the resulting eye movements from this motor imagery (akin to neural interfaces)
remain decodable.
Once a driving intention, and thus the target location, is detected, our system instructs our
autonomous wheelchair platform, the A.Eye-Drive, to navigate to the desired object while
avoiding static and moving obstacles.
Thus, we have realised a cognitive-level human interface for navigation purposes, as it requires
the user only to cognitively interact with the desired goal, not to continuously steer their
wheelchair to the target (low-level human interfacing).
Version
Open Access
Date Issued
2022-05
Date Awarded
2024-09
Copyright Statement
Creative Commons Attribution NonCommercial NoDerivatives Licence
Advisor
Faisal, Aldo
Sponsor
Engineering and Physical Sciences Research Council
Grant Number
EP/N509486/1: 1979819
Publisher Department
Computing
Publisher Institution
Imperial College London
Qualification Level
Doctoral
Qualification Name
Doctor of Philosophy (PhD)