Computer vision and machine learning for in-play tennis analysis: framework, algorithms and implementation
Author(s)
Vinyes Mora, Silvia
Type
Thesis or dissertation
Abstract
Statistical analysis has become an essential part of professional sports, to the point that every major professional team employs expert analysts. With the widespread availability of high-definition streaming data, increased processing power and algorithmic advances, statistical models are evolving from static, coarse-grained pre-play models based on simple statistical data to finer-grained, dynamic in-play models that exploit spatio-temporal data and event streams.
Our primary hypothesis is that Computer Vision (CV) and Machine Learning (ML) research has opened up the opportunity to develop automated systems to collect the fine-grained data needed to support these more sophisticated models. We have chosen tennis as our focus on account of it being a highly structured sport which has enthusiastically embraced technology, but our research is also applicable to other sports.
In this work, we propose a novel framework for applying CV and ML techniques to the in-play spatio-temporal analysis of tennis using commodity hardware, with the ultimate objective of obtaining insights into players’ performance or predicting future events. Our framework consists of three layers: Vision, Classification and Modelling, each of which features various algorithmic innovations.
For the Vision Layer, we propose a multi-camera system able to detect the player and ball positions in real time, at over 60 fps. Their inferred 3D locations have an error lower than 10 cm in 80% of the frames. The Classification Layer uses data from the previous layer to obtain high-level information. Our contributions within this layer are two-fold: (1) insightful visualizations of Vision Layer data and automatic extraction of high-level statistics, and (2) the first deep Neural Network for fine-grained action recognition in tennis, which yields highly competitive results: 88.16% one-versus-all accuracy for classifying backhands, forehands and serves, and 47.22% for classifying 12 finer-grained actions. The Modelling Layer incorporates the data obtained from the previous two layers to gain insights about the game. For this layer, we survey and critically examine different approaches to using tennis spatio-temporal data for prediction and analysis.
Our work opens the door to a deeper understanding of tennis and other sports, leading to new approaches to coaching, better analysis of rivals for developing strategies, and in-play match analysis to enhance spectators’ experience, amongst many other applications.
Version
Open Access
Date Issued
2018-01
Date Awarded
2019-02
Copyright Statement
Creative Commons Attribution 4.0 International Licence
Advisor
Knottenbelt, William
Sponsor
Engineering and Physical Sciences Research Council
Publisher Department
Computing
Publisher Institution
Imperial College London
Qualification Level
Doctoral
Qualification Name
Doctor of Philosophy (PhD)