Multi-view data visualisation via manifold learning
File(s)
peerj-cs-1993.pdf (15.34 MB)
Published version
Author(s)
Rodosthenous, Theodoulos
Shahrezaei, Vahid
Evangelou, Marina
Type
Journal Article
Abstract
Non-linear dimensionality reduction can be performed by manifold learning approaches, such as Stochastic Neighbour Embedding (SNE), Locally Linear Embedding (LLE) and Isometric Feature Mapping (ISOMAP). These methods aim to produce two or three latent embeddings, primarily to visualise the data in intelligible representations. This manuscript proposes extensions of Student’s t-distributed SNE (t-SNE), LLE and ISOMAP, for dimensionality reduction and visualisation of multi-view data. Multi-view data refers to multiple types of data generated from the same samples. The proposed multi-view approaches provide more comprehensible projections of the samples compared to the ones obtained by visualising each data-view separately. Commonly, visualisation is used for identifying underlying patterns within the samples. By incorporating the obtained low-dimensional embeddings from the multi-view manifold approaches into the K-means clustering algorithm, it is shown that clusters of the samples are accurately identified. Through extensive comparisons of novel and existing multi-view manifold learning algorithms on real and synthetic data, the proposed multi-view extension of t-SNE, named multi-SNE, is found to have the best performance, quantified both qualitatively and quantitatively by assessing the clusterings obtained. The applicability of multi-SNE is illustrated by its implementation in the newly developed and challenging multi-omics single-cell data. The aim is to visualise and identify cell heterogeneity and cell types in biological tissues relevant to health and disease. In this application, multi-SNE provides an improved performance over single-view manifold learning approaches and a promising solution for unified clustering of multi-omics single-cell data.
Date Issued
2024-05-24
Date Acceptance
2024-03-25
Citation
PeerJ Computer Science, 2024, 10
ISSN
2167-8359
Publisher
PeerJ Inc.
Journal / Book Title
PeerJ Computer Science
Volume
10
Copyright Statement
© 2024 Rodosthenous et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
License URL
Identifier
https://peerj.com/articles/cs-1993/
Publication Status
Published
Article Number
e1993
Date Publish Online
2024-05-24