Kernel two-sample and independence tests for non-stationary random processes
File(s)
engproc-05-00031.pdf (2.21 MB)
Published version
Author(s)
Laumann, Felix
von Kuegelgen, Julius
Barahona, Mauricio
Type
Conference Paper
Abstract
Two-sample and independence tests with the kernel-based MMD and HSIC have
shown remarkable results on i.i.d. data and stationary random processes.
However, these statistics are not directly applicable to non-stationary random
processes, a prevalent form of data in many scientific disciplines. In this
work, we extend the application of MMD and HSIC to non-stationary settings by
assuming access to independent realisations of the underlying random process.
These realisations - in the form of non-stationary time-series measured on the
same temporal grid - can then be viewed as i.i.d. samples from a multivariate
probability distribution, to which MMD and HSIC can be applied. We further show
how to choose suitable kernels over these high-dimensional spaces by maximising
the estimated test power with respect to the kernel hyper-parameters. In
experiments on synthetic data, we demonstrate superior performance of our
proposed approaches in terms of test power when compared to current
state-of-the-art functional or multivariate two-sample and independence tests.
Finally, we employ our methods on a real socio-economic dataset as an example
application.
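The idea in the abstract, treating each realisation on a shared temporal grid as one sample from a multivariate distribution and then applying MMD, can be illustrated with a minimal sketch. Everything here is an illustrative assumption and not the authors' implementation: the function names, the Gaussian kernel, the median-heuristic bandwidth (the abstract instead describes choosing kernel hyper-parameters by maximising estimated test power), the permutation-based p-value, and the synthetic random-walk data.

```python
import numpy as np


def sq_dists(A, B):
    """Pairwise squared Euclidean distances between rows of A and rows of B."""
    d2 = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.maximum(d2, 0.0)  # clip tiny negatives caused by round-off


def mmd2_biased(K, n):
    """Biased (V-statistic) estimate of squared MMD.

    K is the Gram matrix of the pooled sample; its first n rows/columns
    belong to the first sample, the remaining ones to the second.
    """
    return K[:n, :n].mean() + K[n:, n:].mean() - 2.0 * K[:n, n:].mean()


def mmd_permutation_test(X, Y, n_perm=200, seed=0):
    """Two-sample test on realisations X (n, T) and Y (m, T).

    Each row is one time series observed on the same grid of T points.
    Returns the observed statistic and a permutation p-value.
    """
    rng = np.random.default_rng(seed)
    Z = np.vstack([X, Y])
    n = X.shape[0]
    d2 = sq_dists(Z, Z)
    # Assumption: median-heuristic bandwidth, chosen here only for brevity.
    bandwidth2 = np.median(d2[np.triu_indices_from(d2, k=1)])
    K = np.exp(-d2 / (2.0 * bandwidth2))
    observed = mmd2_biased(K, n)
    exceed = 0
    for _ in range(n_perm):
        # Shuffling the pooled sample simulates the null of equal distributions.
        idx = rng.permutation(Z.shape[0])
        if mmd2_biased(K[np.ix_(idx, idx)], n) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


# Hypothetical data: random walks (non-stationary), the second set with a linear trend.
rng = np.random.default_rng(1)
T = 20
t = np.arange(1, T + 1)
X = np.cumsum(rng.normal(size=(50, T)), axis=1)
Y = np.cumsum(rng.normal(size=(50, T)), axis=1) + 0.5 * t
stat, p_value = mmd_permutation_test(X, Y)
print(f"MMD^2 = {stat:.4f}, p = {p_value:.4f}")
```

Because each row is treated as a single T-dimensional observation, no stationarity assumption along the time axis is needed; the only requirement in this sketch is that the rows are independent realisations measured on the same grid.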
Date Issued
2021-06-30
Date Acceptance
2021-06-01
Citation
Eng. Proc. 2021, 5(1), 31, pp. 1-13
Publisher
https://www.mdpi.com/2673-4591/5/1/31
Start Page
1
End Page
13
Journal / Book Title
Eng. Proc. 2021, 5(1), 31
Volume
5
Issue
1
Copyright Statement
© 2021 by the authors.
Licensee MDPI, Basel, Switzerland.
This article is an open access article
distributed under the terms and
conditions of the Creative Commons
Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
License URL
https://creativecommons.org/licenses/by/4.0/
Sponsor
Engineering and Physical Sciences Research Council (EPSRC)
Identifier
http://arxiv.org/abs/2010.00271v3
Grant Number
EP/N014529/1
Source
ITISE 2021 (7th International Conference on Time Series and Forecasting)
Subjects
stat.ME
stat.AP
Publication Status
Published
Start Date
2021-07-19
Finish Date
2021-07-21
Coverage Spatial
Gran Canaria, Spain
Date Publish Online
2021-06-30