Modelling temporal and structural dependencies in computer network traffic with applications in cyber security
Author(s)
Price-Williams, Matthew
Type
Thesis or dissertation
Abstract
Anomaly detection for cyber security defence has received much attention in recent years, providing a complementary approach to traditional signature-based detection systems. Anomaly detection methods rely on building probability models of normal computer network behaviour and detecting deviations from the model.
Traffic flowing between nodes in a computer network can be interpreted and modelled as counting processes, where events in a process indicate connections being established for data to be exchanged. Due to the typical size of computer networks, postulating full joint models on the network is often infeasible. For this reason, the methods in this thesis first look at building independent edge-based models on computer network graphs, before detecting correlation between different edges and constructing locally joint models on the correlated sub-networks.
The first proposed method uses changepoint detection techniques to construct models of normal connectivity behaviour for each edge in a computer network graph. This is achieved by identifying key user features such as seasonality or self-exciting behaviour, since events typically arise in bursts and at particular times of day in patterns which may be peculiar to that edge. In particular, a flexible, nonparametric model for the excitation function of a Wold process is proposed for modelling the conditional intensities of network edges. When monitoring a computer network in real time, unusual patterns of activity against the intensity model of normality could indicate the presence of a malicious actor.
To build robust, realistic models it is important to understand the dependencies that exist between the large numbers of routinely interacting communication pathways within a computer network. For two counting processes $A$ and $B$ denoting the interactions between two distinct edges in a computer network, we often wish to assess whether events occurring in $A$ trigger events to then occur in $B$. A test will be introduced using the well-known higher criticism statistic to detect such dependence when only a subset of the events in $A$ exhibit a triggering effect on process $B$; this test will allow us to detect even weakly correlated edges within a computer network graph. After identifying correlated sub-networks, joint models of normal network behaviour are constructed where events in one edge cause an increase in activity along other network edges.
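As a rough illustration of the higher criticism statistic mentioned above, the sketch below computes it in its standard form, $\mathrm{HC} = \max_{1 \le i \le \alpha_0 n} \sqrt{n}\,(i/n - p_{(i)}) / \sqrt{p_{(i)}(1 - p_{(i)})}$, from a collection of p-values. It is not the test developed in the thesis: the function name `higher_criticism`, the parameter `alpha0`, and the assumption that each event in $A$ has already been given a p-value for "no triggering effect on $B$" are all hypothetical choices made for this example.

```python
import math


def higher_criticism(p_values, alpha0=0.5):
    """Standard higher criticism statistic over the smallest alpha0 * n p-values.

    Assumes (hypothetically) that each p-value tests one event in A for
    a triggering effect on B, and that the p-values are roughly
    independent and uniform under the null of no dependence.
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("at least one p-value is required")
    sorted_p = sorted(p_values)
    k = max(1, int(alpha0 * n))  # number of order statistics to scan
    best = float("-inf")
    for i, p in enumerate(sorted_p[:k], start=1):
        if p <= 0.0 or p >= 1.0:
            continue  # the standardised deviation is undefined at 0 and 1
        z = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        best = max(best, z)
    return best


# Evenly spread p-values, as expected when no event in A triggers B.
null_p = [(i - 0.5) / 100 for i in range(1, 101)]
# The same p-values, but five events now show strong evidence of triggering.
signal_p = [1e-6] * 5 + null_p[5:]

hc_null = higher_criticism(null_p)
hc_signal = higher_criticism(signal_p)
```

A handful of very small p-values is enough to inflate the statistic, which is why it is suited to settings where only a subset of events carries any signal.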
Finally, many pairs of interacting computers exchange a mixture of user-driven and automated events. The latter category most often appears as periodic polling behaviour. Separating these automated events from those caused by human activity is an essential precursor to modelling user-driven computer network behaviour. This thesis presents a changepoint detection framework for identifying automated network events appearing as periodic subsequences of event times. The opening event of each subsequence is interpreted as a human action, which then generates an automated, periodic process.
Version
Open Access
Date Issued
2018-09
Date Awarded
2018-12
Copyright Statement
Creative Commons Attribution NonCommercial NoDerivatives Licence
Advisor
Heard, Nick
Sponsor
Engineering and Physical Sciences Research Council
Government Communications Headquarters (Great Britain)
Grant Number
4177302
Publisher Department
Mathematics
Publisher Institution
Imperial College London
Qualification Level
Doctoral
Qualification Name
Doctor of Philosophy (PhD)