Novel methods for multi-view learning with applications in cyber security
File | Description | Size | Format |
---|---|---|---|
Hogan-J-2023-PhD-Thesis.pdf | Thesis | 4.46 MB | Adobe PDF |
Title: | Novel methods for multi-view learning with applications in cyber security |
Authors: | Hogan, Jack |
Item Type: | Thesis or dissertation |
Abstract: | Modern data is complex. It exists in many different forms, shapes and kinds. Vectors, graphs, histograms, sets, intervals, etc.: they each have distinct and varied structural properties. Tailoring models to the characteristics of various feature representations has been the subject of considerable research. In this thesis, we address the challenge of learning from data that is described by multiple heterogeneous feature representations. This situation arises often in cyber security contexts. Data from a computer network can be represented by a graph of user authentications, a time series of network traffic, a tree of process events, etc. Each representation provides a complementary view of the holistic state of the network, and so data of this type is referred to as multi-view data. Our motivating problem in cyber security is anomaly detection: identifying unusual observations in a joint feature space, which may not appear anomalous marginally. Our contributions include the development of novel supervised and unsupervised methods, which are applicable not only to cyber security but to multi-view data in general. We extend the generalised linear model to operate in a vector-valued reproducing kernel Hilbert space implied by an operator-valued kernel function, which can be tailored to the structural characteristics of multiple views of data. This is a highly flexible algorithm, able to predict a wide variety of response types. A distinguishing feature is the ability to simultaneously identify outlier observations with respect to the fitted model. Our proposed unsupervised learning model extends multidimensional scaling to directly map multi-view data into a shared latent space. This vector embedding captures both commonalities and disparities that exist between multiple views of the data. Throughout the thesis, we demonstrate our models using real-world cyber security datasets. |
Content Version: | Open Access |
Issue Date: | Apr-2023 |
Date Awarded: | Aug-2023 |
URI: | http://hdl.handle.net/10044/1/106417 |
DOI: | https://doi.org/10.25560/106417 |
Copyright Statement: | Creative Commons Attribution NonCommercial NoDerivatives Licence |
Supervisor: | Adams, Niall |
Sponsor/Funder: | QinetiQ (Firm) |
Department: | Mathematics |
Publisher: | Imperial College London |
Qualification Level: | Doctoral |
Qualification Name: | Doctor of Philosophy (PhD) |
Appears in Collections: | Mathematics PhD theses |