An autoencoder-based reduced-order model for eigenvalue problems with application to neutron diffusion

Using an autoencoder for dimensionality reduction, this paper presents a novel projection-based reduced-order model for eigenvalue problems. Reduced-order modelling relies on finding suitable basis functions which define a low-dimensional space in which a high-dimensional system is approximated. Proper orthogonal decomposition (POD) and singular value decomposition (SVD) are often used for this purpose and yield an optimal linear subspace. Autoencoders provide a nonlinear alternative to POD/SVD, that may capture, more efficiently, features or patterns in the high-fidelity model results. Reduced-order models based on an autoencoder and a novel hybrid SVD-autoencoder are developed. These methods are compared with the standard POD-Galerkin approach and are applied to two test cases taken from the field of nuclear reactor physics.


Introduction
Reduced-order modelling or model reduction [52] is a numerical technique that has made a significant impact across a broad range of fields, including aerodynamics [65], haemodynamics [4], fracture [34,1], porous media [31,63] and molecular dynamics [29]. Its aim is to produce a low-dimensional model which is a good approximation of a highdimensional model (high-fidelity model), but which can be solved in a fraction of the computational time required by the high-fidelity model (HFM). Main applications for reduced-order modelling are multi-query problems, such as optimisation and control [13,65], and applications which require real-time solutions [41,2]. The computational efficiency of reduced-order models is achieved by an offline/online split of the computational work. There are typically three parts to the offline stage. First, the high-fidelity model is solved for different parameter values. Second, dimensionality reduction techniques are applied to the HFM solutions (referred to as snapshots) producing basis functions that span the low-dimensional space or reduced space. Finally, the discretised governing equations are approximated in the reduced space. In the online stage, the reduced-order model is solved for an unseen parameter value in a relatively fast time.
A number of dimensionality reduction methods have been proposed for finding the basis of the reduced space, however Proper Orthogonal Decomposition [57,8] (POD) is by far the most common. Also known as Principal Component Analysis (PCA), the Karhunen-Loève transform, Empirical Orthogonal Functions and the Hotelling transform, POD was introduced by Lumley [37] as a method to extract coherent structures from turbulent flow data, and was made tractable with the method of snapshots by Sirovich [54]. The POD basis functions are found by taking a Singular Value Decomposition (SVD) of the snapshots, or by solving a related eigenvalue problem. Using the basis functions, a Galerkin projection can be applied to the discretised governing equations (i.e. the highfidelity model) to produce the reduced system of equations. For HFMs whose operators can be affinely decomposed, the Galerkin projection will be applied to all the matrices in the decomposition. Finding the reduced system for an unseen parameter can be done by interpolating the resulting reduced system matrices. Reduced-order models produced by this method are referred to as projection-based reduced-order models [5]. For non-affine or non-linear systems, this process is more complex, and generally more sampling is required to represent the non-affine nature or non-linearities of the system. Hyper-reduction [48,19] and Discrete Empirical Interpolation [12] are two methods that can be used for this.
In some circumstances, the projection step is challenging, as the matrices representing the high-fidelity model matrices are not easily available, for example when using licensed software or for complex multi-physics problems. To alleviate this difficulty, an alternative method has been developed and is referred to variously as 'non-intrusive reduced-order modelling' (NIROM), 'POD with interpolation', 'Galerkin-free', 'surrogate POD' (see [7] and references within), 'data-driven reduced-order modelling' [32,23,56] and 'system/model identification' [46,60]. As in projection-based methods, these non-intrusive methods take a set of high-fidelity model solutions (snapshots) and find a reduced basis, often by using POD. However, to find solutions for unseen parameter values, projection of the high-fidelity system onto the basis functions is replaced by interpolating between snapshots (or snapshots that have themselves been projected onto the basis functions). Although cubic interpolation [11] or radial basis functions [63] were originally used for the interpolation, NIROM methods have since embraced machine learning, using a variety of neural networks for this purpose, including autoencoders in combination with Long Short-Term Memory networks [22,61], multi-layer perceptrons [27] and Gaussian Process Regression [23,64]. When using neural networks, the offline stage involves training a neural network to learn how the solutions depend on various model parameters, and the online stage involves evaluating the neural network for an unseen parameter value or values.
Although machine learning techniques are becoming more prevalent for predicting the behaviour of the HFM (third part of offline stage and online stage in NIROM-type methods), it is much less common to find machine learning algorithms being used in the dimensionality reduction stage (second part of offline stage in both projection-based or NIROM methods). One type of neural network that is ideal for dimensionality reduction is the autoencoder. A type of unsupervised (or self-supervised) feed-forward neural network, the autoencoder learns the identity map with a network architecture that includes a bottleneck. The central bottleneck layer forces the autoencoder to learn a reduced or low-dimensional representation of the data. The neurons in this layer determine the so-called latent space and are referred to as latent variables. These types of networks are common in image classification and identification [28,59], and have been used with great success to fill in gaps in images [38] and to remove noise from data [44]. Several authors [42,6,3] have highlighted the connection between autoencoders and SVD (or PCA), namely that an autoencoder with one hidden layer and linear activation functions can produce basis functions than span the same space as basis functions derived from an SVD. This connection between the autoencoder and the SVD paves the way for an autoencoder to be seen as a tool to perform nonlinear dimensionality reduction (by choosing nonlinear activation functions) and therefore to be used as an alternative to the SVD/PCA [58,28]. Amongst those early to recognise the potential of the autoencoder for dimensionality reduction within a model reduction framework were Milano et al. [39], who reconstructed the near wall field from computational solutions for pressure and shear stress at the wall. The suitability of autoencoders for dimensionality reduction having been established, a number of authors have since explored their use in reduced-order models, mostly non-intrusive reduced-order models. This is because the strength of the autoencoder is that it is a nonlinear embedding, but this also means that the reduced-order system will be nonlinear and is therefore more complex to solve than for a POD-based ROM. For non-intrusive models, this nonlinearity does not present any additional challenge and no change in the algorithm is required, hence the greater uptake of autoencoders in NIROM-type methods. For example, Gonzalez et al. [22] and Wiewel et al. [61] both use a convolutional autoencoder to reduce the dimension of their problems and Long Short-Term Memory networks to learn the dynamics. The former demonstrate their method on interacting vortices in a 2D box and 2D lid-driven cavity flow. The interacting-vortices problem aims to demonstrate the location invariance properties of convolutional networks. The long-term statistics of the flow, as embodied by the turbulent kinetic energy, are captured much better by the autoencoder-based method than by the projection-based POD-Galerkin model. Coming from the field of computer graphics, Wiewel et al. [61] solve 3D problems with millions of spatial degrees of freedom. With their reduced-order model, they are able to produce realistic-looking simulations of complex dynamical systems such as sloshing waves, colliding bodies of fluid and smoke convection. Their reduced model runs approximately two orders of magnitude faster than their high fidelity solver.
As previously mentioned, the substitution of an autoencoder for an SVD is less straightforward for projection-based reduced-order models than it is for non-intrusive reduced-order models. There are a couple of simple examples of the incorporation of autoencoders in a projection-based reduced-order model [25,33], however, the most comprehensive to date is a paper by Lee et al. [36] They propose a framework for projecting dynamical systems onto nonlinear manifolds using deep convolutional autoencoders and derive a posteriori error bounds. Focusing on advectiondominated benchmark problems known to be challenging for POD-based ROMs due to a slow decay of singular values or Kolmogorov n-width [53], they model the convection, diffusion and chemical reactions of a flame as it evolves in time over a range of values for two parameters which occur in the nonlinear reaction source term. The error in the autoencoder-based method is over 40 times less than that of the standard POD-based projection method, showing that, although their method requires a higher offline computational time (for the training of the autoencoder), they can produce much more accurate results than projection-based methods for some nonlinear problems. In this paper, we develop autoencoder-based ROMs for eigenvalue problems, using a standard autoencoder and a hybrid SVD-autoencoder.
The field of nuclear engineering has some extremely demanding applications such as neutron transport within a reactor core, the geometry of which presents extremely challenging aspect ratios. A recent discretisation of a Westinghouse PWR-900 core with 44 energy groups used over 1.7 trillion degrees of freedom [43,55] precluding the use of optimisation, safety-analysis or real-time analysis. It comes as no surprise, therefore, that reduced-order modelling has been applied to reactor physics problems. Based on the multigroup neutron diffusion equation for criticality, Chunyu et al. [15] applied a reduced basis method to 2D and 3D Pressurized Water Reactor (PWR) benchmark problems. Buchan et al. [10] also tackled criticality by transforming the eigenvalue problem into a time dependent problem and applied this to simple 1D and 2D models of nuclear reactors (test cases based on these are used in this paper). These papers [15,10] used a single POD basis for all the energy groups, the so-called monolithic approach. Heaney et al. [26] developed a method to model control-rod movement with a POD-based ROM for a PWR fuel assembly, in which POD basis functions were calculated independently for each energy group. German et al. [20] demonstrated that such an approach produced more accurate results than the monolithic approach. All the methods discussed in this paragraph show computational gains of at least three orders of magnitude over the respective HFMs. Outside of criticality problems, ROM has also been applied to time-dependent movement of control rods [50], and coupled neutronics and heat transfer [49], both using a reduced basis method based on POD.
Although, in many areas, adding diffusion or solving diffusion problems often leads to fields that are smooth and well able to be represented accurately with POD basis functions, in other areas, such as reactor physics, one is often confronted by problems with abruptly changing fields similar to advection problems. A good example is a control rod that is partially inserted into a reactor, where one would see a near zero flux within the control rod and a much higher flux a small distance away from the control rod; see the scalar flux solutions shown in Buchan et al. [9] for instance. This similarity between advection and diffusion problems is also seen in discretisation methods such as the Self Adjoint Angular Flux method [40], in which the first order transport equations are transformed into a series of diffusion-like equations with the application of a least squares principle and a particular weighting. Although two relatively simple test cases are presented in this paper, our future work will involve increasingly complicated/more realistic assembly or reactor configurations which will benefit from the improved modelling capabilities that autoencoders provide. This paper reports, what the authors believe to be, the first application of an autoencoder-based reduced-order model to an eigenvalue problem. Also presented is a novel hybrid SVD-autoencoder method that combines an SVD with an autoencoder. This is useful as training a fully-connected autoencoder with a long input vector is computationally expensive. Without the SVD, the length of the input vector would be equal to the number of degrees of freedom of the problem. Applying the SVD reduces the length of the input vector to, at most, the number of snapshots.
The remainder of the paper is organised as follows. In Section 2, the governing equations and their controlvolume discretisation are presented. Used to find the dominant eigenvalues and eigenvectors of eigenvalue problems, the power method is also described. Section 3 contains background information relevant to reduced-order modelling techniques, including descriptions of POD, SVD, selecting basis functions and constructing projection-based reduced-order models. A description of autoencoders is included in Section 4 along with theory on how to include these networks in a projection-based ROM. In Section 5 results of the 1D and 2D test cases are given for several different autoencoder-based reduced-order models and they are compared with a standard POD-based reduced-order model. Concluding remarks are made in the final section.

Governing equations and their discretisation
This section introduces the equations that govern criticality and their control-volume discretisation, referred to as the high-fidelity model. A description is then given of the power method, which can be used to solve generalised eigenvalue problems such as that arising in criticality.

Diffusion Equation
The one-group steady-state diffusion equation for criticality takes the form of a generalised eigenvalue problem and can be written as where φ is the scalar flux of the neutron population, Σ a represents the absorption cross-section, Σ f represents the fission cross-section and ν is the average number of neutrons produced per fission event. The diffusion coefficient, D, is given by: where Σ s is the scattering coefficient. The eigenvalue, λ, is defined as the reciprocal of k eff , that is λ = 1 k eff , where: k eff = number of neutrons in one generation number of neutrons in the preceding generation .
The boundary conditions are for reflection: and for a vacuum or bare-surface: in which n is the outward-pointing normal to the boundary. In this paper, autoencoder-based reduced-order models are developed for the one-group equations, however the methods described here could easily be applied to the multi-group equations.

Discretisation
A control-volume discretisation of the diffusion equation in 2D with a regular N x × N y mesh can be written as: in which ∆x and ∆y are the uniform cell widths in the x-and y-directions respectively, N x and N y are the numbers of cells in the x-and y-directions respectively, the subscripts i, j refer to the cells in the x-and y-directions respectively, and φ i,j represents the scalar flux in cell i, j. In this expression, the first and last cells are omitted in order to apply the boundary conditions efficiently. To apply a reflective boundary condition, see Equation (4), the diffusion coefficients in the outermost cells are set to a large negative number. To set a bare surface boundary condition, see Equation (5), the diffusion coefficients in the outermost cells are again set to a large negative number and the absorption term is modified as now described. To apply such a boundary condition to the left or right edges (where the normal to the boundary is aligned with the x-direction): and to apply this to the top or bottom edges (where the normal to the boundary is aligned with the y-direction): For cells that have both boundary conditions the following modification is made: .
The discretised form of Equation (1) can therefore be written as: where matrix A contains the transport terms (scattering, diffusion) from the left-hand side of Equation (6), matrix B represents the fission terms given in the right-hand side of Equation (6) and the vector φ contains the values of the scalar flux for each cell. The matrices are of size (N x − 2)(N y − 2) by (N x − 2)(N y − 2). Although a 2D discretisation is given here, the methods can also be applied in 1D or 3D.

The Power Method
The power method [21] is an algorithm which finds the dominant eigenvalue and eigenvector of an eigenvalue problem. In certain circumstances the method can be slow to converge, however, it is possible to accelerate convergence [47]. The power method has a broad range of application and has been used to rank scientific journals [45] and to make suggestions to Twitter users of whom to follow [24]. In order to apply the power method to the criticality eigenvalue problem [51], Equation (10) is rearranged to be in the form of a standard eigenvalue problem as follows First the algorithm iterates to find a flux solution by repeatedly applying A −1 B to the current scalar flux. Once these inner iterations have converged, the algorithm normalises the scalar flux and calculates the associated eigenvalue. The outer iterations continue until convergence is reached, which, in this paper, is either when the difference in the eigenvalue in successive iterations is less than 10 −8 or when 1000 iterations have been completed. This procedure is detailed in Algorithm 1, in which b is a vector whose entries all equal one. The normalisation applied here, seen on line 9, differs from the normalisation usually applied in the power method, however, results in a unit number of fissions, which is often preferred in nuclear applications. The parenthesised superscript on the flux solutions and the eigenvalues indicates the outer iteration index. The inner iteration loop is not described here. Although not described here, the inner iterations use a forward backward Gauss Seidel method to solve Equation (11) when generating the high-fidelity model solutions. while not converged do 8: ! normalising the flux 10: return φ, λ 20: end function

Projection-based Reduced-Order Modelling
This section briefly describes the key stages of constructing a projection-based reduced-order model, including explanations of proper orthogonal decomposition (POD) and the method of snapshots, singular value decomposition, selecting POD basis functions and constructing a projection-based reduced-order model.

Proper Orthogonal Decomposition
Proper orthogonal decomposition was introduced by Lumley [37] to identify the most energetic coherent structures within turbulent flow fields. The method can be applied to data from experiments as well as from numerical simulations. In the latter case, the numerical data is gathered into a matrix, S, whose columns are individual snapshots pertaining to the solution of the high-fidelity model for a particular set of parameters. For a problem with N spatial degrees of freedom and M snapshots, the snapshots' matrix will be of size N by M . According to POD, a snapshot can be decomposed into a finite number of coefficients (α i ) and POD basis functions or modes (ψ i ) as follows: where α i is a scalar and φ, ψ i ∈ R N . POD gives the optimal linear representation of the data in the snapshots' matrix. In order to find the POD basis functions, the average squared error between the snapshots and their projection onto the basis functions is minimised [30]. This leads to an eigenvalue problem, which was originally written in the form where µ j is the eigenvalue, ψ j is the eigenvector, and, without loss of generality, it is assumed that the eigenvalues are given in descending order and that the eigenvectors have been orthonormalised. For a large number of degrees of freedom, this problem quickly becomes challenging to solve, and it was Sirovich [54] who noted that the same (non-zero) eigenvalues could be found by instead solving the following, more tractable problem for eigenvectors ϕ k . Called the method of snapshots, its lower computational burden has led to a wide uptake of POD for analysing results from numerical simulations [57,8]. Assuming, again, that the eigenvalues are arranged in descending order, the eigenvectors of the two problems (13) and (14) are related through Finding the POD basis functions can also be done by taking a singular value decomposition of the snapshots' matrix [30], which we now describe briefly. For a real N × M matrix S, where N M , the singular value decomposition of S is as follows: where U is an N × N matrix consisting of orthonormalised eigenvectors associated with the M largest eigenvalues of SS T and V is an M × M matrix consisting of orthonormalised eigenvectors associated with S T S. Here, the POD basis functions are the columns of U . The matrix Σ contains the non-negative square roots of the eigenvalues of S T S (called the singular values) on its diagonal. These singular values are ordered as follows: A reduced rank approximation to S can be found by setting to zero all but the first P largest singular values in the matrix Σ, and then pre-and post-multiplying by U and V T respectively. This will give the optimal rank P approximation of the matrix S according to the Eckart-Young-Mirsky theorem [18].

POD Basis Functions
For some problems, a few basis functions will capture much of the behaviour of the system (as represented by the snapshots), and this is seen by the magnitude of the first few singular values being much larger than the remaining singular values. In such cases, the POD basis can be truncated without introducing much error. To determine how many basis functions should be kept, an empirical expression can be used that relates the number of basis functions retained to the fraction of information captured by them. The amount of information carried by each basis function depends on the square of its singular value. Suppose that the fraction of information of the original system to be captured is γ, where 0 γ 1, then the lowest integer value of P is sought, such that the following is satisfied: After truncation is applied (if indeed it is), the basis functions are stored in the matrix R ∈ R N ×P . If the SVD has been used then the first P basis functions from U are retained to make up the columns of the matrix R. If no truncation is applied then P = M (i.e. the number of basis functions is equal to the number of snapshots). If either of the related eigenvalue problems are solved to determine the basis functions, then, once orthonormalised, the first P eigenvectors are retained to make up R, either from Equation (13) or Equations (14) and (15). The complexity of solving the eigenvalue problems is O(N 3 ) for Equation (13) and O(M 3 ) for Equation (14). For an SVD, the complexity is O(N M 2 ) (see Golub et al. [21]), thus, for large problems (many snapshots but where N M ), the eigenvalue problem in Equation (14) will be the cheapest to solve.
Once R has been determined, the P reduced variables α associated with a snapshot φ can be determined from the basis functions by: and the cell-based scalar flux values can be recovered from the reduced variables by: where R represents the POD basis functions and α the POD coefficients or reduced variables. Equation (20) is the matrix equivalent of Equation (12).

Constructing a Reduced-Order Model for Criticality with POD
Having found the POD basis functions for the system, the discretised governing equations can be projected onto the reduced space. By inserting Equation (20) into Equation (10) and pre-multiplying by R T , the reduced system of equations is found: where the matrices A and B both depend on the material parameters Σ a , Σ s , Σ f . This reduced system of equations has matrices of size P by P , whereas the original system given in Equation (10) is N by N where N P .
In this paper, the aim is to construct a reduced-order model based on a set of snapshots, each of which correspond to a particular control-rod configuration, and to use this model to predict solutions for previously unseen control-rod configurations. An important point to address is how to modify the reduced system in Equation (21) for unseen parameters. One possibility is to exploit an affine decomposition of the matrices A and B, by calculating R T A i R and R T B i R as part of the offline stage, where A i is the ith matrix in the affine decomposition of A, similarly for B i . These matrices could be interpolated in order to approximate the matrices corresponding to an unseen parameter [26]. Another approach would be to use the Matrix Discrete Empirical Interpolation Method [62], which involves sampling the high-fidelity model matrices A and B at a relatively low number of points before pre-and post-multiplying by R T and R respectively, in order to estimate A and B for the unseen parameter. For both these methods, the sampling of A and B, and their pre-and post-multiplication by the POD basis functions would be part of the offline stage, whereas the interpolation of the resulting reduced matrices would be part of the online stage.
As the aim of this paper is to demonstrate an autoencoder-based ROM for eigenvalue problems, a simple method is chosen which avoids approximation due to sampling, but would be impractical for large problems. To evaluate R T AR and R T BR for an unseen parameter, the high-fidelity model is used to assemble A and B (Equations (6) and (10)) and then pre-and post-multiplied by R T and R respectively. This would not be practical for systems with large number of degrees of freedom, as the online stage, in which R T AR and R T BR are approximated for the unseen parameter, should be independent of the high-fidelity model for reasons of computational efficiency. Future work will involve larger scale problems, for which more computationally efficient methods will be investigated.
For the POD-based ROM, the offline stage consists of solving the high-fidelity model many times for different material parameters (cross-sections), see Equation (10) and Algorithm 1. In this paper, the POD basis functions are determined by solving the eigenvalue problem associated with the method of snapshots, see Equation (14). The online stage of the POD-based ROM consists of assembling the matrices A and B for a given set of parameters, projecting these matrices onto the reduced space and then solving with the power method. The outer iteration algorithm for the POD-based ROM is very similar to that of the high-fidelity model (Algorithm 1), except for the passing down of the POD basis functions from the outer to the inner iterations. The so-called inner iterations are described for POD-based ROM in Algorithm 2. (Although there are no iterations, the description 'inner iterations' is still used for Algorithm 2). Here the reduced-order model, Equation (21), is used to solve for the scalar fluxes. The right-hand side of Equation (21) is evaluated using the current value of scalar flux, resulting in a system that can be solved for the reduced variables. As the system is small, the problem is solved directly with Gaussian elimination. There is no need to use the eigenvalue in the source term (see line 2) as, once passed back to the outer iterations, the flux is normalised. However, it is included here as it is needed in the inner iterations for the autoencoder-based reduced order models.
Algorithm 2 POD-based reduced-order model: inner iterations

Autoencoders and their inclusion in projection-based ROMs
First, the two types of autoencoder used in this paper are described: a standard, fully-connected autoencoder and an SVD-autoencoder. Then, their incorporation into projection-based reduced-order models is explained. The use of autoencoders for dimensionality reduction in eigenvalue problems represents the main novelty in this paper. To the authors' best knowledge, the hybrid SVD-autoencoder is also a new method.

A Standard Autoencoder
An autoencoder is a special type of feed-forward neural network that is trained to learn the identity map. A bottleneck at its central layer forces the autoencoder to learn a reduced representation of the data. The simplest (vanilla) autoencoder has an input layer with n neurons, a hidden layer with m neurons where m < n, and an output layer with n neurons. The values of the neurons in the hidden layer define the latent space and can be referred to as the latent variables. Networks with additional hidden layers can learn more complicated patterns creating a more effective representation of complex data. An autoencoder can be split into two networks: the encoder, which maps an input to the latent variables, and a decoder, which maps from the latent variables to the output. If the input vector is given by x ∈ R n , the output vector byx ∈ R n and the latent variables of the hidden layer by y ∈ R m , then the autoencoder can be written as a composition of functions as followŝ where f AE , f ENC and f DEC represent the autoencoder, encoder and decoder respectively, with associated weights w AE , w ENC and w DEC . Figure 1 illustrates a fully-connected autoencoder with 6 layers (only layers with trainable weights are included in the total), 5 of these are hidden layers. Neurons can have connections with sending neurons from the previous layer (except for those in the input layer) and with receiving neurons from the following layer (except for those in the output layer). In this diagram, the input and output layers have 6 neurons each and the latent space is of dimension 1. The grey arrows indicate how the information passes through the autoencoder from left to right. The input to a neuron is calculated as a weighted sum of the outputs of the sending neurons to which it is connected and a bias term. The output of the neuron is the evaluation of the activation function for this weighted sum. For example, the output of the ith neuron of the hidden layer of the vanilla autoencoder described above, is given by where g is the activation function, b i is the bias of the ith neuron, {x j } n j=1 are the values of the sending neurons connected to the ith neuron and w ij is the weight of the jth sending neuron of the ith neuron in the hidden layer.

Input
Encoder Encoder Latent Variable(s) Decoder Decoder Output The data available to build the network is customarily split into three parts: training data, validation data and test data. The use of these data sets in the training process will now be described. The activation function and number(s) of neurons in the hidden layer(s) are examples of hyper-parameters of the network, and the weights and biases are referred to as parameters of the network. For given hyper-parameters, the objective of training the autoencoder is to find weights that minimise the so-called loss function, which, for regression problems, is often taken as the mean squared error of the reconstruction: where N ex is the number of samples or examples used to train the autoencoder and the norm used is the Euclidean norm. Based on gradient descent-type methods, a back-propagation algorithm is used to optimise the weights by solving In order to find the optimal hyper-parameters, less sophisticated approaches than back-propagation are employed, such as cross-validation [35]. This involves training a number of networks each with a different set of hyperparameters and comparing the performance of these networks on the validation data. The best-performing network is selected and can be re-trained with both the training and validation data combined. The performance of this network is then assessed on the test data.

SVD Autoencoder
The length of the input to the autoencoder is one factor that governs the difficulty of training an autoencoder. The longer these vectors are, the more parameters (weights) there are to be adjusted during optimisation, which increases the computational cost. To combat this, the dimensionality reduction can be split into two steps. The first step is to apply an SVD to the inputs of the autoencoder (i.e. the snapshots). A basis is obtained and the snapshots are projected onto the resulting low-dimensional space yielding reduced variables α (just as is done in POD-based ROM), see Equation (19). This reduces the length of the input vectors from the number of degrees of freedom of the problem (N ) to the number of basis functions (P where P N ). The second step is to train a fully-connected autoencoder with the reduced variables of the snapshots (or the subset of these that have been selected for training), thereby further reducing the dimensionality.

Projection-based ROMs with autoencoders
POD defines a linear map between the scalar fluxes φ, and the corresponding reduced variables, α, see Equation (19). Using an autoencoder for the dimensionality reduction means that the map between these variables will, in general, be nonlinear. Consequently, even for a linear system of equations, the resulting ROM will be nonlinear, and for an eigenvalue problem, this adds an additional level of non-linearity. To construct a reduced-order model, the nonlinear mapping between the reduced variables and the scalar fluxes needs to be linearised and the reduced system of equations needs to be derived. Both points are now elucidated.
Suppose that we have a known state of reduced variablesα and scalar fluxesφ, and wish to find the change in the scalar flux ∆φ due to a small change in the reduced variables ∆α. We can linearise the map C between these variables using a first order Taylor series as follows: and The scalar flux φ can therefore be approximated as In order to calculate C(α), perturbations are made toα from which the entries of C can be inferred. Each of the P entries inα are perturbed in turn by a small value, which yields a column of entries in C. The kth perturbation simply adds a small number to the kth component ofα, so, for example, the second perturbation is given by The decoder can be used to calculate the scalar flux corresponding to the perturbed reduced variables k α: The change in the scalar flux can be calculated by subtracting the known stateφ from the above. The kth column of C is now known: where φ j is the jth component of the scalar flux. Once every component of the reduced variables α has been perturbed, the C matrix will be fully determined. Having found the map between the reduced variables and the high-fidelity model variables, the projection-based ROM, using an autoencoder, can now be developed.
When solving the high-fidelity model, an inner iterative loop solves for the scalar flux and an outer iterative loop solves for the eigenvalue. When using a reduced-order model with an autoencoder, we solve a reduced-order model for the scalar flux but use the same outer iterations to solve for the eigenvalues, similar to what was done for the POD-based method. Hence, Algorithm 1 is used unchanged for the reduced model. The right-hand side of Equation (10) is treated as a source which will be held constant while solving, iteratively, for the flux. In other words, although φ and λ will be updated in the inner iterations, the source term will use the values of these variables from the outer iterations: where s = λ (·) Bφ (·) . Substituting Equation (30) into Equation (34), rearranging and pre-multiplying the result by C T gives the reduced system of equations for the autoencoder-based ROM: As this system is relatively small for the examples used in this paper, it is solved with Gaussian elimination. The algorithm for the inner iterations, which solve for the scalar flux, is given in Algorithm 3 in which the dependence of C onα has been dropped for readability. Regularisation has been applied to the reduced-order model by adding a small number, ε, to the diagonal of C T AC as, without this, there is no guarantee that this matrix will be nonsingular. The regularisation will also ensure that the iterative change in reduced variables (∆α) will not become too large. Note that this is a modified power iteration, as the eigenvalue is included in the source term. For the standard power method this is not required when determining the flux, as normalisation renders this unnecessary. However, in line 11 of the algorithm, the right-hand side of this expression has a source term and an additional term, meaning that the size of the source term is now important. Both autoencoders described in  while not converged do 9: calculate C (k) for α (k) using Equation (33) 10: solve the reduced-order model: 11: 12: 14:

Results
The methods outlined in Sections 3 (POD-based ROM) and 4 (autoencoder-based ROMs) are applied to two test cases. Before presenting the results, the error measures that are used to compare the methods are introduced, followed by an explanation of how the material parameters (cross-sections) are homogenised. The first test case is a 1D slab reactor and results from a POD-based ROM and an AE-based ROM are presented. Further comparisons are made between a POD-based ROM and an AE-based ROM constructed with fewer reduced variables. The second test case is a simplified 2D reactor. For this case, results are presented for ROMs based on POD, on an autoencoder and a hybrid SVD-autoencoder. The reduced-order models were developed in python using Keras [14] with the TensorFlow backend.

Error measures
To assess the accuracy of the results, a number of error measures are introduced. In all the expressions, the reduced-order model solutions are measured against the high-fidelity model results. First, the error committed in the dimensionality reduction step is evaluated. For the POD-based ROM, this error is incurred by projecting the high-fidelity model solutions onto the reduced space and for the AE-based ROM this error occurs during the compression and de-compression of the high-fidelity model solutions by the autoencoder. For the SVD-AE-based ROM, the error is due to both projecting the solution onto a space defined by POD basis functions, and compressing and de-compressing with an autoencoder. The normalised maximum errors in the reconstruction of the solution are defined here using for the POD-based ROM, the AE-based ROM and SVD-AE-based ROM respectively, where · ∞ represents the maximum norm, i and k represent cell indices, and the superscript 'HFM' indicates the flux is associated with the high-fidelity model.
Second, the error in the scalar flux of the reduced-order models is calculated. The normalised maximum error in the flux profile is determined by in which the superscript 'ROM' indicates the solution is from a reduced-order model (either POD-based or autoencoderbased), and i and j represent cell indices.
Calculating the error in k eff is done by comparing the value calculated by the high-fidelity model with that of the reduced-order model: As k eff is usually close to one, no normalisation is applied.
Finally, an average maximum error is defined, where, for N sets of material parameters resulting in N solutions of the high-fidelity model and N solutions of one of the reduced-order models: where µ l refers to the parameters (i.e. the macroscopic cross-sections) used for the lth problem. Until this point, dependence of the solution on the material parameters has not been explicitly indicated (for readability), however here it is required. A similar error measure can also be used to find the average maximum error in k eff : and the average maximum error in the reconstruction: where DR represents one of the three dimensionality reduction methods used: POD, AE or SVD-AE.

Homogenising the Material Parameters
In the test cases that follow, there are fuel regions and also control-rod regions, within which the control rods can be fully inserted, completely withdrawn or partially inserted. In the latter case, the cross-sections will be calculated by combining the cross-sections for fuel and control rod. Control-rod regions are assumed to have a uniform distribution of the averaged properties of the mixture within the system based on a mixing coefficient for each region. This can be written as follows where the mixing coefficient, r, for a given region, lies between 0 and 1, and the subscripts 'hom', 'cr' and 'fuel' indicate a homogenised cross-section, and cross-sections of the control rod and the fuel respectively. If r were chosen to be proportional to the amount of control rod present in a control-rod region, the high absorption of the control rods would dominate the homogenised cross-section. In order to address this, the reciprocal of the absorption cross-sections are averaged: By combining Equations (44) and (46) it can be shown that the mixing coefficient is given by In this way, the desired amount of control rod in a region can be chosen by setting z, converting this into a value for r by using Equation (47), and then calculating the homogenised absorption and scattering cross-sections from Equations (44) and (45).

1D Slab Reactor
The length of the slab reactor is 10 cm, and it consists of fuel and two control-rod regions labelled as 1 and 2 in Figure 2. Region 1 is located between 2.2 cm and 2.5 cm on the x-axis, region 2, between 7.5 cm and 7.8 cm. This test case was used by Buchan et al. [10] x axis (cm)  Table 1 contains the macroscopic cross-sections used for the control rods and the fuel. The control rods can be fully inserted, completely withdrawn or partially inserted into regions 1 and 2. In the latter case, the cross-sections for these regions will be calculated by combining the cross-sections for fuel and control rod by defining mixing coefficients as described in the previous section. The amount of control rod in region 1 will, in general, be different from that of region 2, henceforth, z 1 and r 1 refer to region 1, and z 2 and r 2 refer to region 2. 144 cells were used in the control-volume discretisation although 44 of these were used to enforce the boundary conditions, resulting in a system with 100 degrees of freedom.  To generate the data necessary to build the reduced-order models, 200 problems were solved using the high-fidelity model with different control-rod settings. This was done by choosing values for z 1 and z 2 at random from the interval [0, 1] and using Equations (47), (44) and (45) to calculate the corresponding cross-sections. The controlvolume discretisation along with the cross-sections determine the matrices A and B in Equation (10). Given an initial guess for the scalar flux, the system in (10) is solved with the power method as described in Section 2.3 yielding 200 solutions of the high-fidelity model. One hundred of the high-fidelity model solutions were treated as snapshots and were used to generate basis functions (either by applying POD or by training an autoencoder). These snapshots are referred to as 'seen' data (in the context of ROM) or as training data (in the context of neural networks). The remaining 100 solutions were used in the online stage to test how good the models were at predicting solutions for unseen data (i.e. parameter sets whose corresponding solutions were not included in the snapshots). This data is referred to as unseen data (in the context of ROM) and as test data (in the context of neural networks). Two reduced-order models are compared: a POD-based ROM and an autoencoder-based ROM. The models have 10 reduced variables (10 POD coefficients for the POD-based model and 10 latent variables for the autoencoder-based model). Following this, a POD-based model and an autoencoder-based model are developed with just two reduced variables.

POD-based Reduced-Order Model
The first method to be tested is a POD Galerkin method which uses POD and the SVD to determine basis functions, projects the discretised governing equations onto the subspace defined by the POD basis functions, and results in a generalised eigenvalue problem of reduced dimension, Equation (21), to be solved in the online stage for a given set of macroscopic cross-sections. This method is described in Section 3.3.
An SVD is applied to the 100 snapshots and from a possible 100 basis functions, 10 are retained, which corresponds to capturing 99.999% of the information contained in the snapshots, see Figure 3a. As can be seen from this figure, the magnitude of the singular values decreases rapidly from the 1st to the 20th singular value (by about 7 orders of magnitude) before levelling off, and it can be expected from this that a POD model with a sufficient number of basis functions will be perform well. The first four basis functions can be seen in Figure 3b. Notice that the fourth basis function has more detail or structure than the other basis functions, following the general trend that the higher-order basis functions tend to have more structure.   Figure 4 shows the error, e POD max (φ HFM ), resulting from projecting high-fidelity model solutions onto the reduced space for both the seen data (snapshots or training data) and the unseen data (test data), see Equation (36). Both data sets have a similar range of errors and the errors themselves are low: O(10 −6 ).
The 200 sets of parameters that generated the high-fidelity model results are used again to obtain the matrices A and B. The system of equations describing the POD-based ROM, given in (21), is solved for each problem using the power method described in Algorithms 1 and 2, and Section 3.3 using numpy's Gaussian elimination function to solve for the reduced variables. Half of the results have been seen by the model and half have not been seen.    To further analyze the results, the solutions from the POD-based ROM were compared with solutions from the high-fidelity model (both seen and unseen data sets) using the error measures for the scalar flux and k eff as given by Equations (39) and (40) respectively. Figure 7a shows a histogram of the error in the scalar flux profile and Figure 7b shows a histogram of the error in k eff . The plots demonstrate both the accuracy of the POD-based model (for seen data) and its ability to prediction (for unseen parameters). The ranges of values of the errors for both the seen and unseen parameters are similar. Table 2 displays the average maximum errors in the reconstruction, scalar flux and k eff as defined in Equations (41), (42) and (43). The projection error (for one solution) is given by Equation (36). The error in the scalar flux solution will contain this error and other numerical errors made when solving the reduced generalised eigenvalue problem. Therefore we expect that the projection error would be lower   Table 2. Also noteworthy is the similarity between the range of errors for the seen and unseen data. This may mean that the snapshots represent well the behaviour that was tested by the 100 unseen problems. For k eff , the error for the unseen results is slightly lower that than of the seen results. This is surprising but could be explained by several outliers and the averaging of the maximum errors.   Table 2: Average maximum errors for seen and unseen data using the POD-based reduced-order model.

Autoencoder-based reduced-order model
In this method the dimensionality reduction (or compression) is performed by an autoencoder instead of an SVD. Making this change requires an additional iteration loop when solving the reduced generalised eigenvalue problem as explained in Section 4.2. For the online prediction, an initial flux guess is compressed using the encoder. Following this, the C matrix is formed by using Equation (29). The system in Equation (33) is then solved, the flux is decoded and passed to the outer iterations, and the whole process is repeated until k eff converges. In order to compare with results from the previous section, the autoencoder compresses to 10 latent variables. Autoencoders with different architectures were trained and the best performing architecture comprised 14 fully-connected layers with the following number of neurons in each layer: The activation function for every layer was the exponential linear unit (ELU) [16]; the optimiser used was 'Nadam' [17] and the loss function was defined to be the mean squared reconstruction error. The snapshots were scaled between 0 and 1. The network was trained over 10,000 epochs using gradient descent (with a batch size of 100). In one epoch, all the training data is presented to the neural network.
A similar procedure is carried out as in the previous section and the same solutions of the high-fidelity model are used as seen data (the snapshots) and unseen data. When training neural networks, the seen data is usually referred to as training data and the unseen data is referred to as test data. The autoencoder was trained on the 100 snapshots and the other 100 unseen high-fidelity model results were used to test the prediction power of the AE-based reduced-order model. Figure 8a shows the error (as defined by Equation (37)) between each HFM solution and the corresponding output of the autoencoder. The results are split into seen or training data and unseen or test data. The autoencoder performs well, achieving, at worst, an error of 1.5% in the flux, with almost all of the absolute values of the error less than 0.5% . Although the training and test data have a similar range of errors, it should be noted that these errors are much larger than those of the SVD (see figure 4). Shown in Figure 8b are the minimum and maximum values taken by the 10 latent variables over the training data. Most of the latent variables here are contributing to the system although a couple of the variables have minimum and maximum values close to zero.
(a) Compression error in the flux profile output by the autoencoder.
(b) Minimum and maximum values of the 10 latent variables generated from the autoencoder over the training data.  Figures 9 and 10 show the flux profile and the convergence of k eff for two sets of parameters, the first was used in the training (seen) and the second from the test data (unseen). Both seen and unseen flux profiles of the reduced-order model agree well with the high-fidelity model solution, although the convergence of the reduced-order model is very slightly worse for the unseen case, and the value of k eff is slightly less accurate. Figure 11a shows the error in flux and Figure 11b shows the error in k eff based on the error measures defined in Equations (39) and (40). Excluding outliers, most of the flux solutions have an absolute error of 1% or less,   and the absolute error in the k eff values is less than 0.4 × 10 −3 . These values are higher than for the POD-based reduced-order model, and there is also a larger spread of errors for the AE-based model than for the POD-based method. Table 3 shows the average maximum errors for the AE-based reduced-order model (see Equations (41), (42) and (43)). It can be seen that the scalar flux error is of the same magnitude as the compression error. Here, as for the POD-based ROM, the errors for the training data (seen) are very close to those of the test data (unseen) which, again, would seem to confirm that the training data captures most of the significant behaviour that we test with the unseen data set. Again, the compression error acts as a lower bound for the flux error. On the whole, these errors are one or two orders of magnitude larger than those seen for the POD-based ROM.
For this test case, with two control-rod regions, one could postulate that having a low-dimensional space of dimension 2 should be enough to describe the behaviour of the system. The next section compares a projection-based ROM and an autoencder-based ROM with two basis functions and latent variables respectively.

Reduction to two variables
In order to see how well the models perform reducing to a low-dimensional space of dimension 2, a POD-based ROM and an AE-based ROM were constructed using just two POD basis functions for POD, and two latent variables for    Table 3: Average maximum errors for seen and unseen data using the autoencoder-based reduced-order model.
the autoencoder. Table 4 shows the errors for compression, flux and k eff when the system is compressed from 100 variables to two variables. For this extreme case, it can be seen that the autoencoder performs better than POD, and for k eff , has errors that are an order of magnitude lower than for POD. This suggests that the autoencoder can capture more of the features than POD with a given number of basis functions/latent variables.

2D reactor core eigenvalue problem
The methods are now applied to a simple 2D reactor. The domain is square-shaped with sides of length 90 cm and there are four different materials in this mock reactor, see Figure 12. The domain is discretised with 90 cells in both directions, each cell is of length 1 cm, making 8100 cells. In addition to this, 364 cells were used to enforce the boundary conditions. Once again, this test case was used by Buchan et al. [10] who based the cross-sections on IAEA benchmarks. The same macroscopic cross-sections are used here and their values are given in Table 5.
In Figure 12 the blue region represents graphite, the red area contains the fuel and the four control-rod regions are white and labelled 1, 2, 3 and 4. Each control-rod region is 10 cm square with centres at (25,35), (55,35), (25,65) and (55,65) respectively. These are assumed to have a uniform distribution of the averaged properties of the mixture within the system based on a mixing coefficient for each region chosen. The method for determining the mixing coefficient is described in Section 5.2, however, instead of mixing fuel and control rod materials in the control-rod   regions as done for the 1D test case, here, water and control rod are mixed, with more weight given to the water in order to prevent domination of the strong absorption of the control rods.
For this system 800 high-fidelity model solutions are generated, each with four distinct values of r, one for each control-rod region. The method for generating the solutions is similar to Section 5.3. Of the 800 solutions, 400 are used during the offline phase as the snapshots (or seen or training data). The other 400 results make up the unseen data or test data. These high-fidelity model solutions are used to construct a POD-based ROM and two autoencoder-based ROMs: the first uses a fully-connected autoencoder, the second uses the hybrid SVDautoencoder. The number of degrees of freedom of the high-fidelity model solutions is 8100, and, for all ROMs, the number of POD coefficients or latent variables is chosen as 4, as this is perhaps the minimum number required to represent the system.

POD-based ROM
For the POD-based ROM, an SVD is used to compress the snapshots. From a possible 400 basis functions, 4 are retained which corresponds to capturing 99.878% of the information contained in the snapshots. In the online stage the parameter values are chosen and matrices A and B of the high-fidelity model are formed. The first 4 basis functions are used to create R and Equation (21) is formed. This is solved with the power method as described in Algorithm 2 and Section 3.1. Figure 13a shows the singular values, of which the first 100 decrease rapidly, the remaining values reach a plateau. The first four corresponding basis functions can be seen in Figure 13b. Figure 14 shows the reconstruction error in truncating the POD basis for the snapshots (seen data) and the unseen data, see Equation (36). The range of errors for both sets of data is very similar.   The two flux profiles shown in Figure 15, for a seen set of parameter values, are in close agreement. The high-fidelity model results are on the left, results from the POD-based ROM are on the right. The pointwise error in the flux is shown in Figure 16a, and most error can be seen near the control-rod regions. The value of k eff converges to a value within 0.0035 of the value given by the high-fidelity model, see Figure 16b.  Figure 18a, where most of the error is concentrated around the control-rod regions. In Figure 18b, k eff is seen to converge closely to the value obtained from the highfidelity model. Figure 19 shows histogram plots of the errors from comparing the POD-based ROM with both the 400 snapshots (training data) and the 400 unseen HFM solutions (the test data). Figure 19a shows the error in flux profile calculated using Equation (39), and Figure 19b shows the error in k eff calculated using Equation (40). Setting aside some outliers, the error in flux profile has a range similar to the range of errors seen in the truncation of the POD basis, implying that the main contribution to the error for the POD-based ROM is due to the truncation of the POD basis.

Autoencoder-based reduced order model
In the autoencoder-based ROM, the SVD is replaced by an autoencoder, which, for this case, compresses down from 8100 variables to 4 latent variables. The autoencoder comprises 14 fully connected layers with the following number of neurons in each layer: Note here, the large reduction in neurons between the input and the first hidden layer. Although needed to reduce the number of layers which would otherwise be too high, this is expected to cause some difficulties with the accuracy of the network's predictions. The activation function for every layer is the exponential linear unit; the optimiser used is 'Nadam' and the loss function is defined to be the mean squared reconstruction error. The snapshots are scaled between 0 and 1, and the network was trained using mini-batch gradient descent with a batch size of 50. The number of epochs was 30,000. Figure 20a shows the reconstruction or compression error (i.e. the difference between each high-fidelity model solution and the corresponding output of the autoencoder after training) calculated using Equation (37). The range of errors in the test data is much larger (approximately ten times larger) than the range of errors for the training   data, revealing that over-fitting has occurred. Figure 20b shows the minimum and maximum values attained by the latent variables over the training data, which shows the relative sizes and ranges of the latent variables.
Results for a seen set of parameter values can be seen in Figures 21 and 22. There is a good agreement of the scalar flux profiles of the high-fidelity model and the AE-based ROM, with most error occurring in the vicinity of the control-rod regions (especially the lower two regions). This error is much more diffuse than for the POD-based ROM. The value of k eff seen in Figure 22b, is in close agreement with that of the high-fidelity model.
The results for an unseen set of parameters can be seen in Figures 23 and 24. The pointwise error between the high-fidelity model and the AE-based ROM is shown in Figure 24a is again more diffuse than the error for the POD-based ROM. The k eff value for the AE-based ROM, seen in Figure 24b, has a final value close to that of the high-fidelity model, although convergence is not as smooth for the other models. Figure 25 shows four basis functions of the nonlinear map, linearised at convergence, for both the seen and unseen sets of parameter values. These basis functions are columns of the C matrix. As for the POD-based model, it can be seen the basis functions capture variation associated with the control-rod regions. Figure 26 shows, despite the over-fitting of the network, that the training and test data both have similar ranges of errors as calculated by Equation (39) for the error in scalar flux and Equation (40) for the error in k eff . In   comparison with results from the POD-based ROM, this model has a lower standard deviation associated with the errors, but a much longer tail. For this reason 5% of the results have been removed from Figure 26a to focus the graph on the more frequently-occurring errors. The original range of the errors was between -0.6 and +0.32. Figure 26b shows errors k eff from 80% of the results, where the original range of errors was between -0.04 and +0.003. The majority of the errors are lower than for the POD-based ROM, and the standard deviation is small compared to the results of the POD-based ROM.

Singular Value Decomposition and Autoencoder approach
As an alternative to having a large difference between the number of neurons in the input and the first hidden layer, a combination of an SVD and an autoencoder is explored. First the SVD is applied to the snapshots, after which, 100 of the 400 possible basis functions are retained. The snapshots are projected onto the basis functions (see Equation (19)) and the resulting snapshot coefficients are used to train the autoencoder. The architecture is similar to that of the 1D problem, however, in order to be consistent with the other 2D reduced order models, the 100 variables are compressed to 4 latent variables. The autoencoder comprises of 14 fully connected layers with the number of neurons in each layer being:   The activation function for every layer was the exponential linear unit; the optimiser used was 'Nadam' and the loss function was defined to be the mean squared reconstruction error. The snapshots were scaled between 0 and 1, and the network was trained using mini-batch gradient descent with a batch size of 50. The number of epochs was 30,000. Figure 27a shows the error in the reconstruction when using the SVD-autoencoder for dimensionality reduction, calculated using Equation (38). The ranges of error for the training data and the test data are similar, which indicates that this network has been trained to a much better degree than the previous one, with no over-fitting occurring. Figure 27b shows the minimum and maximum values attained by the latent variables over the training data, which shows the relative sizes and ranges of the latent variables. Figure 28 shows the scalar flux profiles of the high-fidelity model and the SVD-AE-based ROM for a seen set of parameter values. The agreement is good, and much better than the previous POD-based and AE-based ROMs. The pointwise error between the high-fidelity model and the AE-based ROM seen in Figure 29a is lower than for the two previous models and the error is concentrated around the control-rod regions. The k eff value of the SVD-AE-based model, seen in Figure 29b, converges to a value extrememly close to that of the high-fidelity model.    The errors of the training and test data for the scalar flux and k eff can be seen in Figure 32. The errors are much lower than those seen in both the POD-based ROM and the AE-based ROM. Figure 32b has had outliers removed (amounting to 5% of the whole data set) to improve clarity of the plot. Figure 33 shows four basis functions of the nonlinear map, linearised at the point of convergence, for both the seen and unseen sets of parameter values. These basis functions are columns of the C matrix. As for the POD-based ROM and the AE-based ROM, it can be seen the basis functions capture variation associated with the control-rod regions. Table 6 shows the average maximum errors, defined in Equations (41), (42) and (43), for all three ROMs applied to the 2D problem. It can be seen that both autoencoder-based ROMs have lower compression errors than POD. However, the SVD-AE ROM performs better than both of AE-based ROM and POD-based ROM. Both AE-based ROMs generate better k eff values with the SVD-AE ROM being better than both the other models by an order of  (a) Histogram of the error in the flux profile, after removing outliers (5% of the values).
(b) Histogram of the error in k eff , after removing some outliers (20% of values).  (a) Reconstruction error in the flux profile for seen (blue) and unseen data (green).
(b) Minimum and maximum of the 4 latent variables generated from the SVD-autoencoder over the training data.

Conclusions
By adopting an autoencoder for dimensionality reduction or compression, a novel projection-based reduced-order model for solving eigenvalue problems is developed. Two simple reactor problems are used as test cases to compare a number of autoencoder-based reduced order models with a reduced-order model (ROM) based on Proper Orthogonal Decomposition (POD). One of the autoencoder-based ROMs uses a novel combination of singular value decomposition (SVD) with an autoencoder.
POD and singular value decomposition (SVD) are often used to obtain basis functions of a low-dimensional or 'reduced' space onto which the high-fidelity model is projected. This embedding of a high-dimensional space into a low-dimensional space is linear for POD, whereas, the embedding found by an autoencoder (AE) is, in general, nonlinear. This nonlinearity means that an autoencoder can provide a more accurate representation of data which could be important for larger, more complex problems.
For the 1D test case, when using a reduced space of dimension two, the AE-based reduced order model performs better than the POD-based model, with slightly lower errors in compression and in predicting the scalar flux profiles. When predicting k eff , the AE-based ROM has an error that is one order of magnitude lower than the POD-based ROM. This supports the claim that each latent variable of the autoencoder can capture more information than each POD basis function.
For the 2D test case, a standard autoencoder-based ROM demonstrated similar levels of accuracy to the POD-based ROM. However, the novel hybrid SVD-autoencoder is more accurate than both the other models, with errors that are an order of magnitude lower for k eff . Combining an SVD with an autoencoder reduces the length of the inputs to the autoencoder, which, in turn, reduces the complexity of the network, making overfitting less likely.
Although two relatively simple test cases are presented in this paper, our future work will involve increasingly complicated (i.e. more realistic) assembly or reactor configurations which will benefit from the increased modelling capabilities that autoencoders provide. It can be seen that, for this eigenvalue problem, there is a clear benefit to using autoencoders over the standard POD approach. For minimal numbers of basis functions, autoencoders can be used to achieve a level of compression that cannot be matched in accuracy by POD.