Novel methods for multi-view learning with applications in cyber security

File: Hogan-J-2023-PhD-Thesis.pdf
Description: Thesis
Size: 4.46 MB
Format: Adobe PDF
Title: Novel methods for multi-view learning with applications in cyber security
Authors: Hogan, Jack
Item Type: Thesis or dissertation
Abstract: Modern data is complex. It exists in many different forms, shapes and kinds. Vectors, graphs, histograms, sets, intervals, etc.: they each have distinct and varied structural properties. Tailoring models to the characteristics of various feature representations has been the subject of considerable research. In this thesis, we address the challenge of learning from data that is described by multiple heterogeneous feature representations. This situation arises often in cyber security contexts. Data from a computer network can be represented by a graph of user authentications, a time series of network traffic, a tree of process events, etc. Each representation provides a complementary view of the holistic state of the network, and so data of this type is referred to as multi-view data. Our motivating problem in cyber security is anomaly detection: identifying unusual observations in a joint feature space, which may not appear anomalous marginally. Our contributions include the development of novel supervised and unsupervised methods, which are applicable not only to cyber security but to multi-view data in general. We extend the generalised linear model to operate in a vector-valued reproducing kernel Hilbert space implied by an operator-valued kernel function, which can be tailored to the structural characteristics of multiple views of data. This is a highly flexible algorithm, able to predict a wide variety of response types. A distinguishing feature is the ability to simultaneously identify outlier observations with respect to the fitted model. Our proposed unsupervised learning model extends multidimensional scaling to directly map multi-view data into a shared latent space. This vector embedding captures both commonalities and disparities that exist between multiple views of the data. Throughout the thesis, we demonstrate our models using real-world cyber security datasets.
Content Version: Open Access
Issue Date: Apr-2023
Date Awarded: Aug-2023
URI: http://hdl.handle.net/10044/1/106417
DOI: https://doi.org/10.25560/106417
Copyright Statement: Creative Commons Attribution NonCommercial NoDerivatives Licence
Supervisor: Adams, Niall
Sponsor/Funder: QinetiQ (Firm)
Department: Mathematics
Publisher: Imperial College London
Qualification Level: Doctoral
Qualification Name: Doctor of Philosophy (PhD)
Appears in Collections: Mathematics PhD theses



This item is licensed under a Creative Commons License.