A new method based on surface‐sample pollen data for reconstructing palaeovegetation patterns

Abstract Aim Biomisation has been the most widely used technique to reconstruct past regional vegetation patterns because it does not require an extensive modern pollen dataset. However, it has well‐known limitations including its dependence on expert judgement for the assignment of pollen taxa to plant functional types (PFTs) and PFTs to biomes. Here we present a new method that combines the strengths of biomisation with those of the alternative dissimilarity‐based techniques. Location The Eastern Mediterranean‐Black Sea Caspian Corridor (EMBSeCBIO). Taxon Plants Methods Modern pollen samples, assigned to biomes based on potential natural vegetation data, are used to characterize the within‐biome means and standard deviations of the abundances of each taxon. These values are used to calculate a dissimilarity index between any pollen sample and every biome, and thus assign the sample to the most likely biome. We calculate a threshold value for each modern biome; fossil samples with scores below the threshold for all modern biomes are thus identified as non‐analogue vegetation. We applied the new method to the EMBSeCBIO region to compare its performance with existing reconstructions. Results The method captured changes in the importance of individual taxa along environmental gradients. The balanced accuracy obtained for the EMBSeCBIO region using the new method was better than obtained using biomisation (77% vs. 65%). When the method was applied to high‐resolution fossil records, 70% of the entities showed more temporally stable biome assignments than obtained using biomisation. The technique also identified likely non‐analogue assemblages in a synthetic modern dataset and in fossil records. Main conclusions The new method yields more accurate and stable reconstructions of vegetation than biomisation. It requires an extensive modern pollen dataset, but is conceptually simple, and avoids subjective choices about taxon allocations to PFTs and PFTs to biomes.


| INTRODUC TI ON
Pollen evidence has been widely used to reconstruct Holocene changes in vegetation, both at individual sites and at a regional scale (e.g. Bozilova & Tonkov, 1995;Edwards et al., 2017;Huntley & Birks, 1983). These reconstructions provide insights into how vegetation responds to climate changes and human activities (Gaillard et al., 2010;Prentice, 1986), relevant for understanding how vegetation patterns and biodiversity may respond to future climate changes (Bradshaw et al., 2015;Cole, 2010). They also provide information about the resource base for human societies, relevant for understanding how cultural changes such as the adoption of agriculture affected the natural environment, and documenting the sensitivity of human societies to environmental change and degradation (Leroy et al., 2010;Turner & Sabloff, 2012). Vegetation reconstructions have also been widely used to test the performance of Earth System models (Foley et al., 2013;Kaplan et al., 2003;Song et al., 2021).
Available methods use modern pollen-vegetation relationships to reconstruct regional vegetation changes. Semi-quantitative approaches, including biomisation (Prentice et al., 1996, distinguish between vegetation types by grouping individual taxa into plant functional types (PFTs) and PFTs into biomes or Land Cover Classes (LCC) (pseudobiomisation: Fyfe et al., 2010). Quantitative approaches, based on pollen source area theory (Prentice, 1985;Prentice & Parsons, 1983;Sugita, 1993), include the Landscape Reconstruction Algorithms (Sugita, 2007a(Sugita, , 2007b) that accounts for pollen dispersal dynamics using region-specific pollen productivity estimates (PPEs) and pollen size and deposition velocities, or the Regional Estimates of VEgetation Abundance from Large Sites (REVEALS) without PPEs method (ROPES : Theuerkauf & Couwenberg, 2018), which derives PPEs and mean plant abundances from single pollen records. Lack of information on PPEs and deposition velocities means these model-based reconstructions have only been applied in limited regions and for limited vegetation classes (Gaillard et al., 2010;Harrison et al., 2020). Statistical approaches including the Modern Analogue Technique (MAT; Gaillard et al., 1994;Jackson & Williams, 2004;Overpeck et al., 1985Overpeck et al., , 1992Wang et al., 2020;Williams & Jackson, 2007) use modern training datasets to establish relationships between pollen assemblages and biomes, and then use these relationships to reconstruct biomes for fossil pollen assemblages. The accuracy of statistical reconstructions depends on the representativeness of the training dataset (Turner et al., 2021); the lack of sufficiently extensive modern training data precludes MAT reconstructions for many regions. Biomisation has been one of the most widely used vegetation-reconstruction methods (Allen & Huntley, 2009;Allen et al., 2000;Bigelow et al., 2003;Edwards et al., 2000;Elenga et al., 2000;Harrison et al., 2001;Jolly et al., 1998;Marchant et al., 2001Marchant et al., , 2009Pickett et al., 2004;Prentice & Webb, 1998;Prentice et al., 1996;Takahara et al., 2000;Tarasov et al., 1998Tarasov et al., , 2013Thompson & Anderson, 2000;Williams et al., 1998Williams et al., , 2000Yu et al., 1998Yu et al., , 2000, in large part because of its conceptual simplicity and minimal data requirements. Although modern pollen samples are used to test the method, biomisation does not require the extensive modern calibration datasets needed for statistical approaches, nor does it require regional information about pollen productivity and deposition velocities. The principle of the biomisation method is to assign pollen taxa to PFTs defined based on life form, leaf type, phenology and climate tolerance, where taxa within a PFT are assumed to have similar responses to physical and biotic environmental factors. Major vegetation types (biomes) are then characterized as assemblages of PFTs, where the PFTs included represent either the dominant taxa or taxa that are characteristic of the biome. The allocation of pollen taxa into PFTs and PFTs into biomes is initially determined by knowledge of the regional vegetation and subsequently optimized, after comparison of reconstructions to observed vegetation, by removing non-diagnostic taxa or PFTs. The assignment of individual pollen samples to biomes is made by calculating an affinity score for the similarity of the pollen assemblage to every biome, based on the weighted average of the square root of the pollen abundances of taxa that could be present in each biome.
Biomisation has been shown to produce robust reconstructions of vegetation patterns at key times in the past. However, it has some well-known limitations. These include the fact that there is no direct relationship between pollen abundance and plant abundance such that the dominant taxa in the regional vegetation may be under-represented in the pollen assemblages. For example, Larix is the dominant genus in the boreal cold-deciduous forest of Eurasia but produces little pollen and is systematically under-represented in pollen samples . In contrast, Pinus produces abundant and easily dispersed pollen and is often over-represented in pollen samples, both in biomes where it actually occurs and as a result of long-distance transport into biomes representing open vegetation types Edwards et al., 2000). The use of the square root of pollen abundance in the biomisation formula decreases the weight of over-represented taxa, but they still contribute to biomes that contain the PFT to which they belong. An alternative method of down-weighting over-represented taxa is to set a universal threshold of abundance for inclusion in the analysis higher than the 0.5% default value , but this eliminates a large number of under-represented taxa that might be diagnostic.
The definition of biomes in terms of diagnostic or dominant PFTs emphasizes the most characteristic expression of a biome but does not account for the fact that there is variability in the abundance of these PFTs within a biome and considerable differences between PFT abundance at, for example, the northern and southern limits of the biome (Edwards et al., 2000). This issue has been solved by creating additional PFTs or biomes. For example, the temperate summergreen PFT in the eastern United States and Canada was split into intermediate and warm variants to separate out taxa with different climatic tolerances . Similarly, the tundra (TUND) biome has been split into multiple sub-biomes which are differentiated by the presence/absence of different categories of shrubs . However, these approaches still only consider part of the within-biome variability.
A further problem is associated with the fact that some biomes are characterized by a subset of the PFTs present in another biome.
For example, the PFTs defining deciduous forest types are often a subset of those defining equivalent mixed forest types, creating a situation where identical affinity scores are obtained for the two biomes. The biomisation approach solves this through a tie-break rule that favours the less PFT-rich biome. However, the presence of a small amount of pollen from a PFT that is not shared would be sufficient to change the affinity score and hence the biome allocation, meaning that the biome assignments can be sensitive to small changes in pollen abundance and may not be stable. Small changes in pollen abundance can also result in shifts towards biomes that are not characterized by a subset of the PFTs representative of another biome, for example in the case where taxa are assigned to more than one PFT. The problem, referred to as the 'flickering switch' problem, is most noticeable when biomisation is used to reconstruct changes in vegetation through time using down-core pollen samples at a single site (Allen et al., 2000;Fyfe et al., 2018). The flickering switch problem has been observed to be acute in high-resolution records where one might expect greater similarity in biome allocation between adjacent samples (see e.g. Allen & Huntley, 2009;Marchant et al., 2001).
One final issue poorly addressed by the biomisation technique is the existence of palaeovegetation communities with no analogue in the present-day vegetation because they consist of extant species but in combinations not found at present (Jackson & Overpeck, 2000;Overpeck et al., 1992;Williams & Jackson, 2007).
Statistical dissimilarity approaches have addressed this by establishing thresholds for detecting such assemblages, but it is important then to establish criteria to avoid ad hoc choices of such thresholds (Gavin et al., 2003).
In this study, we have derived a method that overcomes these problems and extends existing dissimilarity-based approaches by accounting explicitly for within-biome variability. We calculated a dissimilarity index (which takes account of this variability) between a given pollen sample and every biome, and thereby assessed the likelihood that a sample belongs to a particular biome. We applied this method to reconstruct the modern vegetation of the Eastern Mediterranean-Black Sea Caspian Corridor (EMBSeCBIO) region (33°-49°N, 20°-60°E), where the performance of the new approach can be compared to reconstructions made using biomisation (Marinova et al., 2018). The EMBSeCBIO region is characterized by strong temperature and precipitation gradients and topographic heterogeneity, resulting in clear patterns in biome distribution within a relatively limited geographical space and thus provides an excellent case to test how well the different methods capture spatially complex vegetation patterns.

| MATERIAL S AND ME THODS
The workflow is summarized in Figure 1. We first assigned modern pollen samples to biomes, where the biomes are defined using a recently developed global modern potential vegetation map. Since the modern samples were derived from different settings and basins of different sizes, we tested what would be an appropriate search window for determining the biome for each sample. We then divided the modern pollen samples into training and testing datasets. The training dataset was used to characterize the within-biome variability of the pollen assemblages assigned to a given biome, in terms of the mean, range and standard deviation of each pollen taxon. We then calculated the dissimilarity index between every sample in the test dataset and every biome. This index provides an approximation of the probability that a given pollen assemblage would be produced by a particular biome such that samples can be allocated to the biome with the highest likelihood. We tested the robustness of these assignments using different training and testing data, since the selection and size of the training and testing datasets could influence the within-biome variability and hence the dissimilarity calculation.

F I G U R E 1
Overview of the approach for pollen-based vegetation reconstruction accounting for within-biome variability. PNV, potential natural vegetation.
We then compared the reconstructions based on this new approach to previous vegetation reconstructions for the EMBSeCBIO region.
Finally, we estimated optimal threshold values for the biome scores by pairwise comparison of every biome, using the receiver operating characteristic (ROC) curve as a performance metric, to determine whether assemblages were characteristic of an observed modern biome or could represent a non-analogue vegetation type. We tested this procedure using a synthetic modern dataset and also applied it to fossil pollen assemblages from the EMBSeCBIO region.

| Pollen and vegetation data
The modern pollen dataset was derived from the SPECIAL Modern Pollen Data Set (SMPDS: Harrison, 2019) and the EMBSeCBIO pollen database (Harrison & Marinova, 2017;Harrison et al., 2021). The SMPDS consists of relative abundance records of 247 pollen taxa from 6459 terrestrial sites from Europe, the Middle East and northern Eurasia. These taxa result from taxonomic harmonization of the individual records and amalgamation of rare taxa into higher taxonomic groups, after ensuring that this was consistent with their distribution in climate space (Harrison, 2020;Wei et al., 2020). To avoid duplication, we removed all SMPDS samples from the EMBSeCBIO region (28° to 49°N, 20° to 62°E). We did not use SMPDS sites from east of 62°E, where the sampling is limited and likely not representative of the diversity of the vegetation. Most of the pollen assemblages in the SMPDS (94%) are from lake or bog environments where sediments are currently accumulating, have been dated to the last 50 years, or are modern samples from moss polsters, litter or pollen traps. For comparability, we only used sites from the EMBSeCBIO database with ages <150 calibrated years before present (cal BP).
This resulted in the selection of 1356 samples from the EMBSeCBIO database and 4409 samples from the SMPDS (Figure 2). Pollen counts in the EMBSeCBIO database were amalgamated into higher taxon groupings consistent with the SMPDS (Table S1).
Since our goal was to develop a method to reconstruct vegetation changes through time, we used potential natural vegetation (PNV) as a target for the modern reconstructions. Maps of PNV have been widely used in this way to test regional vegetation reconstructions (e.g. Bigelow et al., 2003;Marchant et al., 2009;Marinova et al., 2018). The modern vegetation data were extracted from an updated version of the Global PNV map produced by Hengl et al. (2018).
The original version of this map had a resolution of 1 km; the updated version (https://github.com/Envir ometr ix/PNVmaps) has a resolution of 250 m. The PNV map was produced using pollen-based vegetation reconstructions as a target, a large set of spatially explicit covariate datasets representing the potential climatic, topographic, geologic and hydrological controls on plant growth, and an ensemble of five machine-learning approaches (neural networks, random forest, gradient boosting, K-nearest neighbourhoods, Cubist) to account for the relationships between vegetation and these covariates. Different machine-learning approaches vary in terms of computational requirements, predictive power and interpretability; the use of an ensemble of machine-learning tools allows an assessment to be made of the robustness of the predictions (Hengl et al., 2018;Heung et al., 2016). The prediction accuracy of this dataset at 1 km is F I G U R E 2 Map of the distribution of modern pollen samples from the SPECIAL modern pollen dataset (SMPDS) and the eastern Mediterranean-Black Sea Caspian corridor (EMBSeCBIO) database used for the training and testing datasets. The red box delineates the area covered by the EMBSeCBIO database; samples outside the red box were obtained from the SMPDS dataset. The background map shows the distribution of major biomes derived from the Hengl et al. (2018) reconstruction of potential natural vegetation. The biome codes are: CENF, cold evergreen needleleaf forest; CMIX, cool mixed evergreen needleleaf and deciduous broadleaf forest; DESE, desert; ENWD, evergreen needleleaf woodland; GRAM, graminoids with forbs; TEDE, temperate deciduous malacophyll broadleaf forest; TUND, tundra; WTFS, warm-temperate evergreen needleleaf and sclerophyll broadleaf forest; XSHB, xeric shrubland. ca 70% globally. The PNV map has 13 biomes in the area covered by the SMPDS and EMBSeCBIO databases. We amalgamated biomes that covered a limited area or occurred as disjunct patches (e.g. erect dwarf shrub tundra, cool evergreen needleaf forest) because they do not sample within-biome pollen variability adequately. As a result, we defined nine biomes ( Table 1) that are structurally distinctive and represent distinct parts of the climate space of the region: tundra (TUND), desert (DESE), graminoids with forbs (GRAM), evergreen needleleaf woodland (ENWD), xeric shrubland (XSHB), cold evergreen needleleaf forest (CENF), temperate malacophyll broadleaf forest (TEDE), cool mixed evergreen needleleaf and deciduous broadleaf forest (CMIX), warm-temperate evergreen needleleaf and sclerophyll broadleaf forest (WTFS). Thus, graminoid and forb tundra, erect dwarf shrub tundra, low and high shrub tundra, and prostrate dwarf shrub tundra were amalgamated into a single tundra biome (TUND in Table 1) and the cool evergreen needleleaf forest was amalgamated into the cool mixed evergreen needleleaf and deciduous broadleaf forest (CMIX in Table 1).

| Biome characterization
The training and testing datasets were created by sub-sampling the modern pollen samples using the PNV map to assign samples to biomes. We used samples from this larger area to sample the whole of the realized range of each biome and to include biomes that are not currently represented in the EMBSeCBIO region but may have occurred there under different climate conditions in the past. This 'space-for-time' substitution is warranted because the climatic drivers of plant compositional turnover across space are similar to those that drive compositional turnover through time during the Late Quaternary (Blois et al., 2013).
There are some locations with multiple modern samples in a very small area, for example, multiple moss polsters within a catchment of a few square kilometres. This could lead to over-representation of some localities in the training dataset and therefore an apparent reduction in the variability of assemblages from the biome(s) involved. Similarly, there are some regions where there are modern pollen samples from sites that are geographically close together and have similar vegetation, resulting in over-sampling of that biome in the training dataset ( Figure S1) and consequently overfitting to the better sampled biome. To limit redundancy in such cases, a single modern sample at each point (defined by latitude and longitude coordinates) was randomly selected for the analyses. We tested different ways of subsampling the data to reduce the biome oversampling bias. The best performance was obtained by down-sampling biomes with a large number of samples to create a dataset with a similar number of samples from each biome. The down-sampled data were then split into training and test datasets.
The observed PNV biome for each pollen sample in the modern training dataset was derived using different search windows (from 12 × 12 km up to 50 × 50 km) around the location of the sample point.
We determined both the dominant and subdominant biome in each search window for subsequent evaluation, based on which biomes occupied the largest (dominant) and second largest (sub-dominant) number of 1 km 2 pixels within the search window. We also tested the impact of using different proportions of sites for the training set and the test set for each of these search windows, splitting using ratios of 50:50, 60:40, 70:30, 75:25 and 80:20. We tested how well each of these combinations predicted the modern vegetation of the EMBSeCBIO region using only modern samples from this region in the test set. The best performance, based on the balanced accuracy, was obtained using a search window of 20 × 20 km centred on the sample location, with a training/test data partitioning ratio of 70:30 (Tables S2 and S3).
The training samples were grouped according to the dominant biome observed in the PNV map. Each modern biome was then characterized by the relative abundance (expressed in terms of the mean, and standard deviation) of all taxa present to account for variability in pollen abundances within each biome.

| Calculation of the dissimilarity and similarity scores
We calculated the following coefficient of dissimilarity between the pollen assemblage of a sample from the test dataset and each biome: where D ik is the dissimilarity of pollen sample k from biome i; p jk is the pollen percentage of taxon j in sample k; μ ji is the mean of taxon j in biome i, s ji is the sample standard deviation of taxon j in biome i and ε is a parameter. Summation is over all taxa in sample k.
Equation (1) is based on the idea that each taxon has a certain distribution of values within a biome, and more weight is assigned to those taxa for which this distribution is narrow. For each taxon, D ik is related inversely to the log-likelihood that a pollen sample is drawn from the population represented by the abundances of that taxon among the modern pollen samples in each biome. The term ε has two functions. First, it decreases the weight of characteristic taxa with low values of s (e.g. due to limited sampling) in the calculation of the dissimilarity measure. Second, it allows the absence of taxa from a fossil sample (s = 0) to be informative, with ε = 0 they could not be considered in the calculation. The choice of the value of ε represents a balance between accuracy and sensitivity, since small values of ε could result in more accurate predictions of well-sampled biomes but increase the sensitivity such that poorly sampled biomes are less well predicted. Values between 0% and 1% were tried, and good results obtained with a value of 0.5% (Table S4) which was therefore used for the subsequent calculations.
The dissimilarity values were converted to similarities by: where S ik is an approximation of the likelihood of biome i given pollen sample k. By the nature of the dissimilarity index, these similarity scores quantify how close a pollen sample is to the mean abundance values in the biome, accounting for within-biome variability.
Samples that represent vegetation closer to the ecotonal boundary of a biome will necessarily have a lower similarity score, although this score will still be higher than the score for other biomes.

| Evaluation of the modern biome reconstructions
The biome reconstructions were evaluated quantitatively using a matrix of predicted versus observed vegetation at each site ('confusion matrix': e.g. Bigelow et al., 2003). We constructed confusion matrices for the test dataset and the EMBSeCBIO dataset, based on both the dominant and subdominant biome registered in the search window around the sample. The confusion matrix provides an assessment of the accuracy of the reconstructions, but is strongly influenced by imbalances in the sampling of different classes (Carrillo et al., 2014;Grandini et al., 2020). Some biomes are less well represented in the training dataset even after downsampling to reduce differences in sample numbers between biomes. Under these circumstances, the 'balanced accuracy' metric, defined as the average accuracy obtained on all classes (Carrillo et al., 2014), provides a better performance metric. The balanced accuracy metric is given by: where k i is the number of sites correctly predicted for biome i, l is the number of biomes and n i is the number of sites in biome i. Evaluation was performed both on the testing dataset and on samples only from the EMBSeCBIO region. The reconstructed modern biome distribution for the EMBSeCBIO region was mapped.
We compared the performance of the new method to a reconstruction made using the biomisation method (Marinova et al., 2018).
We reclassified the 13 biomes used by Marinova et al. (2018) to match the set of nine biomes (Table 1)

| Impact on the flickering switch problem
We used fossil pollen data from the EMBSeCBIO database to assess the impact of the new method on the stability of biome reconstructions through time. The EMBSeCBIO database contains 187 records covering some part of the Holocene. New age-depth models have been produced for 149 of these records using the IntCal20 calibration curve (Reimer et al., 2020) and the 'rbacon' R package (Blaauw et al., 2021) in the framework of the

| Estimation and evaluation of biome analogue thresholds
We used the similarity scores obtained from the pollen samples in the modern testing dataset for each biome ( Figure S3) and made pairwise comparisons between biomes. We obtained the optimal threshold value differentiating each pair of biomes by calculating specificity and sensitivity metrics ( Figure S4) and plotting these on a ROC curve ( Figure S5) using the 'cutpoint' R package (Thiele, 2021), where the point with the maximum balance (sensitivity + specificity) was selected as the optimal threshold between two biomes. The ROC curve has been used previously in MAT-based vegetation reconstructions to assess the ability of different distance metrics to distinguish between groups and to avoid ad hoc interpretations of dissimilarity scores and thresholds (Gavin et al., 2003). We derived a single threshold value per biome, as the lowest value obtained in comparisons of that biome against all other biomes. If a pollen sample has a similarity score below the lowest threshold value for all biomes, it is assumed to represent a non-analogue assemblage.
We tested this approach using (1) the 3650 modern samples in the testing dataset, and (2) a synthetic dataset of 70 samples, using pollen abundances from actual samples and removing taxa or assigning abundances to different taxa randomly. We then applied this approach to fossil samples to determine if it identified non-analogue samples in these records and summarized these analyses by quantifying the proportion of records with at least one non-analogue sample in specific time periods, defined as 200-year windows with 50% overlap.

| Within-biome variability in the training dataset
Most biomes show considerable within-biome variability in taxon abundances, reflecting changes in the importance of individual taxa along climate and environmental gradients (Figure 3; Figure S2). The most abundant taxa in every biome of the training dataset correspond to PFTs characteristic of that vegetation type (Figure 3).

| Assessment of the modern reconstructions
The modern reconstruction using the training dataset reached a balanced accuracy of 66%, which improved to 76% when both dominant and subdominant biomes were considered ( Table 2). The best predictions, considering both dominant and subdominant biomes (

F I G U R E 3
Examples of the characterization of individual biomes in terms of their pollen assemblages and the distribution of the pollen sites belonging to these biomes in bioclimatic space. The bioclimatic space is defined in terms of the mean temperature of the coldest month (MTCO) and growing degree days above a base level of 0°C (GDD0). The climate data at each pollen site were derived from Harrison (2020).
Plot ( The modern reconstruction using the EMBSeCBIO dataset (

| Impact of choice of training dataset
The construction of the training dataset had only a minor impact on the overall accuracy of the reconstructions: the average balanced accuracy was 76 ± 1.87% (Table S2). In evaluation of different biome search windows with different training/test splits, the average balanced accuracy was 76 ± 0.96% (Table S3). The prediction accuracy in the EMBSeCBIO region was similar (average balanced accuracy 76 ± 0.77%) to that for datasets covering all of the SMPDS region (Table S3). The removal of lakes larger than 50 km 2 , which have a larger pollen source area than the area used to allocate dominant and sub-dominant biomes for each sample (Prentice, 1985;Sugita, 1993), improved the prediction, but only marginally from a balanced accuracy of 76% to 77%.

| Comparison with reconstructions using the biomisation method
The balanced accuracy obtained for the EMBSeCBIO region using the new method was better than that obtained using the biomisation method (excluding CENF and TUND) ( Table 4) Mismatches occur between similar biomes in both methods.
Some GRAM samples, for example, were allocated to DESE in 12% of cases under the new method, and 15% of cases in biomisation. Similarly, CMIX samples were allocated to TEDE in 15% of cases using the new method and to CENF in 26% of cases using biomisation. Some TEDE samples were wrongly allocated either to ENWD (8% in the new method, 20% in biomisation) or WTFS (5% in the new method, 22% in biomisation).

F I G U R E 4
Maps showing (a) observed and (b) reconstructed biomes using the modern records across the eastern Mediterranean-Black Sea Caspian corridor region. The colours show the observed or reconstructed biome. The biome codes are: CENF, cold evergreen needleleaf forest; CMIX, cool mixed evergreen needleleaf and deciduous broadleaf forest; DESE, desert; ENWD, evergreen needleleaf woodland; GRAM, graminoids with forbs; TEDE, temperate deciduous malacophyll broadleaf forest; TUND, tundra; WTFS, warm-temperate evergreen needleleaf and sclerophyll broadleaf forest; XSHB, xeric shrubland. The size of the dots in (b) provides an indication of the similarity score obtained, where the larger dots mean higher similarities. Plot (c) shows an enlargement of the map of reconstructed biomes in the Caucasus region, where the distribution is strongly controlled by elevational gradients.

| Reconstruction stability
Down-core reconstructions for high-resolution records using the new method showed generally greater stability than was obtained using the biomisation approach. There was no difference in the range of the proportion of changes between contiguous samples per thousand years (0%-77%), but the new method reduced the proportion of changes in 70% of the records, and produced a similar number of changes in 7% of the records (Table S7). Thus, the new method provides more stable reconstructions and reduces the tendency for small variations in taxon abundance to cause unrealistic temporal shifts between biomes.

| Assessment of the approach for non-analogue detection
A threshold value for identifying analogue and non-analogue samples was estimated for each biome (Table S8). Only 4.5% of the modern samples were identified as non-analogue, a value that is reasonable for false positives. However, 100% of the synthetic dataset were identified as non-analogue. Application of this approach to the fossil dataset showed an abrupt increase in the proportion of non-analogues at the transition between the late-glacial and early Holocene (~11 ky), where the number of entities identified as having non-analogue assemblages exceeds 20% ( Figure 5).
After ~4 ky, the proportion of entities identified as having nonanalogue samples is around 5% and taken to indicate the presence of false positives.

| DISCUSS ION
We have devised a method to reconstruct vegetation that is concep- when both dominant biome and subdominant biomes were considered shows, moreover, that many of the remaining misallocations occurred between climatically related biomes-a feature that is common to biomisation reconstructions (Prentice et al., 1996. The method identifies even poorly sampled biomes successfully: DESE is represented by far fewer samples than forest biomes in the training set, for example, but is predicted with 70% accuracy whereas scarcity of samples led to a very low probability of reconstructing DESE using biomisation, for example in Africa and the Arabian Peninsula (Jolly et al., 1998).
Our approach exploits a large modern pollen dataset to characterize biomes. This is important because previous research (Turner et al., 2021) has shown that the accuracy of pollen-based climate reconstructions is contingent on maximizing the sampled environmental space and is unaffected by the inclusion of climatically insensitive or systematically overrepresented taxa. Turner et al. (2021) also TA B L E 4 Quantitative comparison of predicted and observed dominant and sub-dominant (in brackets) biomes in the eastern Mediterranean-Black Sea Caspian corridor region. The predicted dominant and sub-dominant biomes were obtained using the biomisation method as described by Marinova et al. (2018); the observed dominant and sub-dominant biomes are taken from the Hengl et al. (2018) potential natural vegetation map. The biome codes are as follows: CENF, cold evergreen needleleaf forest; CMIX, cool mixed evergreen needleleaf and deciduous broadleaf forest; DESE, desert; ENWD, evergreen needleleaf woodland; GRAM, graminoids with forbs; TEDE, temperate deciduous malacophyll broadleaf forest; TUND, tundra; WTFS, warm-temperate evergreen needleleaf and sclerophyll broadleaf forest; XSHB, xeric shrubland. The total number of predicted and observed records for each biome are also shown (Σ). The grey diagonal shows the number of correctly predicted samples, while off-diagonal elements are those incorrectly predicted.   are considered suggests a greater degree of precision to take account of the pollen source area of each site could be beneficial.

Predicted
A fifth of the modern samples are from lakes or bogs (19%), and 68% of these records are from basins <1 km 2 in area. The selected search window is close to the theoretical pollen source area for small (<1 km 2 ) basins (Prentice, 1988;Sugita, 1993). Of the remaining modern samples, 33% are from moss polsters and 16% are from soil or other surface materials, both categories that record an even more local signal than small lakes or bogs, although the area sampled depends on vegetation openness and may be comparable with small lakes in some settings (Beer et al., 2007;Bunting et al., 2004;Jackson & Wong, 1994;Prentice, 1988).
This again suggests that the choice of a 20 × 20 km search window is reasonable. Improvements to prediction accuracy could be achieved using a search window appropriate to the pollen source of each site. However, the EMBSeCBIO database lacks information about site type for 27% and about basin size for 14% of the sample records. Furthermore, basin size is only recorded categorically, so the quantitative information required to allow us to test this idea still needs to be collected.
Our new method combines the relative simplicity of the biomisation approach with the more rigorous statistical basis of dissimilarity-based reconstruction techniques. One major advantage of the current approach compared to existing dissimilaritybased reconstruction techniques is that it explicitly takes account of within-biome variability. Samples that represent ecotonal transitions between biomes can therefore be assigned to a biome and the nature of the ecotone represented can be determined by comparing the scores for other biomes, providing a more nuanced way of examining past vegetation changes and transitions.

| CON CLUS IONS
Our new approach provides a promising way of reconstructing past vegetation taking account of the non-proportionality in the relationship between vegetation cover and pollen abundances. There are alternative approaches to doing this, but their accuracy is tempered by very high data demands. Our new method requires an extensive modern pollen dataset but is less data-demanding than other approaches and yields robust results. Biomisation depends on expert judgement for the assignment of taxa to PFTs and PFTs to biomes; our method avoids this limitation, while producing more accurate and temporally stable reconstructions.

CO N FLI C T O F I NTE R E S T
The authors declare no conflict of interest.

DATA AVA I L A B I L I T Y S TAT E M E N T
The SMPDS data are available through the University of Reading