An atlas of genetic scores to predict multi-omic traits

Genetically predicted levels of multi-omic traits can uncover the molecular underpinnings of common phenotypes in a highly efficient manner. Here, we utilised a large cohort (INTERVAL; N=50,000 participants) with extensive multi-omic data for plasma proteomics (SomaScan, N=3,175; Olink, N=4,822), plasma metabolomics (Metabolon HD4, N=8,153), serum metabolomics (Nightingale, N=37,359), and whole blood Illumina RNA sequencing (N=4,136). We used machine learning to train genetic scores for 17,227 molecular traits, including 10,521 which reached Bonferroni-adjusted significance. We evaluated genetic score performances in external validation across European, Asian and African American ancestries, and assessed their longitudinal stability within diverse individuals. We demonstrated the utility of these multi-omic genetic scores by quantifying the genetic control of biological pathways and by generating a synthetic multi-omic dataset of UK Biobank to identify disease associations using a phenome-wide scan. Finally, we developed a portal (OmicsPred.org) to facilitate public access to all genetic scores and validation results as well as to serve as a platform for future extensions and enhancements of multi-omic genetic scores.


Introduction 74
Multi-omic analysis has become a powerful approach to improve disease predictors and dissect 75 the regulatory networks that underpin disease biology 1-3 . However, the collection of 76 transcriptomic, proteomic, metabolomic and other modalities is an extremely expensive and 77 time-consuming process. Because of these barriers, large-scale population cohorts typically 78 generate multi-omic data for only a subset of participants (or not at all), which consequently 79 reduces the statistical power of subsequent analyses and creates inequities for studies that do 80 not have ample resources or are from underrepresented ancestries and other demographics. 81 It has been shown that genetic prediction of complex human traits can have both analytic 82 validity and potential utility in research and clinical settings 4-8 . Genetic prediction has also been 83 extended to omics data, for example whole blood 9 and multi-tissue transcriptomics 10,11 as well 84 as plasma proteomics 12,13 . The value of such genetically-predicted traits is primarily in the 85 elucidation of the molecular aetiology of common diseases, incorporating both directionality 86 (as the germline genome is more or less fixed over a life course) and the power of large-scale 87 genotyped biobanks to overcome prediction noise [14][15][16] . 88 The use of genetic scores to predict, expand and thereby democratize multi-omics data is an 89 area of intense interest. While foundational, genetic prediction in this area has historically 90 focused on gene expression, drawing on heterogeneous sources for training data which have 91 limited sample sizes. With many cohorts now performing multi-omics profiling at scale, there 92 is a unique opportunity to create genetic scores which capture multi-omic variation of 93 population-based samples. Given suitably robust external validation, the reliability of multi-94 omic genetic scores can be quantified and extended to analyses assessing their transferability 95 across ancestries, thus facilitating equitable tools for molecular investigations in multiple 96 populations. This approach both facilitates integrative cross-cohort analyses for multi-omic 97 studies and enables the efficient generation of synthetic multi-omic data for studies with only 98 genetic data assayed. 99 Here, we utilise the INTERVAL study 17 , a cohort of UK blood donors with extensive multi-100 omic profiling, to train genetic prediction models. We externally validated these genetic scores 101 in seven different external studies, comprising European, East Asian (Chinese, Malay), South 102 Asian (Indian) and African American ancestries. We then demonstrate the use of genetically-103 predicted molecular data, including their coverage of biological pathways and the identification 104 of multi-omic predictors of diseases and traits in UK Biobank. Finally, we construct an open 105 resource (OmicsPred.org) which makes all genetic scores, validations and biomarker analyses 106 freely available to the wider community. 107 108

110
This study aimed to develop genetic scores for blood biomolecular traits, including transcripts, 111 proteins, metabolites (Figure 1). To do this, we used the INTERVAL study which collected 112 participant serum or plasma on which assays from five different omics platforms were 113 performed: SomaScan v3 (SomaLogic Inc., Boulder, Colorado, US), an aptamer-based 114 multiplex protein assay; Olink Target (Olink Proteomics Inc., Uppsala, Sweden), an antibody-115 based proximity extension assay for proteins; Metabolon HD4 (Metabolon Inc., Durham, US), 116 an untargeted mass spectrometry metabolomics platform; Nightingale (Nightingale Health Plc., 117 Helsinki, Finland), a proton nuclear magnetic resonance (NMR) spectroscopy platform; and 118 whole blood RNA sequencing via the Illumina NovaSeq 6000 (Illumina Inc., San Diego, 119 California, US) (Methods). INTERVAL participants were genotyped on the Affymetrix 120 Biobank Axiom array which was then imputed using a combined 1000 Genomes Phase 3-121 UK10K reference panel (Methods). After quality control, there were 10,572,788 genetic 122 variants for constructing genetic scores. 123 124 To train genetic scores, we utilised Bayesian ridge regression (BR), which has been shown to 125 have equal or better performance as other machine learning methods for genetic prediction 8 126 and is more computationally efficient with a smaller carbon footprint 18 . In the data used here, 127 we confirmed the generalisability of these findings across multiple platforms (Metabolon,128 Olink, SomaScan), assessing the impact of different sets of variants arising from different 129 filtering strategies (Methods; Figures S1-4). Overall, we found the best performing approach 130 overall to be BR with a genome-wide variant selection using GWAS p-value < 5´10 -8 ( Figures  131  S1-4). 132 We developed genetic scores for 17,227 biomolecular traits from the five platforms, including 133 726 metabolites (Metabolon HD4), 141 metabolic traits (Nightingale), 308 proteins measured 134 by Olink, 2,384 protein targets measured by SomaScan, 13,668 genes for Illumina RNAseq 135 (Ensembl gene-level counts) (Methods). Across all platforms, we found wide variation in the 136 predictive value (R 2 between the genetically predicted and the directly measured biomolecular 137 trait) and the number of variants of the genetic scores in internal validation ( Figure S5). 138 Overall, we found 10,521 biomolecular traits could be genetically predicted at  adjusted significance (correcting for all genetic scores tested), including 1,051, 206, 379, 137 140 and 8,748 for SomaScan, Olink, Metabolon, Nightingale and RNAseq respectively. Of these, 141 5,816 and 409 genetic scores could predict their biomolecular traits with R 2 > 0.1 and R 2 > 0.5, 142 respectively (Figure 2 and Tables S1-5). 143 between internal and external validation in European ancestries, with genetic scores of many 153 traits being highly predictive (Figure 3 and Figures S6-11). As expected, we also found that 154 genetic scores with high missingness rates amongst variants (e.g. due to allele frequency 155 differences or technical factors) had attenuated power (Methods; Figure S12). 156

Validation in external cohorts of European ancestries
The SomaScan v3 platform quantified 3,622 plasma protein targets in INTERVAL 27 , of which 157 2,384 proteins had at least one significant genetic variant that could be used for genetic score 158 development ( Figure S5) found to be similarly genetically predicted (Figure 3). As compared to SomaScan, a larger 177 proportion of Olink proteins in NSPHS (N=117; 39%) and ORCADES (N=87; 29%) could be 178 genetically predicted with R 2 > 0.10 in external validation. Overall, we found broad consistency 179 between validations in NSPHS and ORCADES ( Figure S13). 180 The Metabolon HD4 platform quantifies >900 plasma metabolites and was used here in two 181 different phases of the INTERVAL study (Methods ORCADES, respectively (Figure 3). We again found broad consistency between the two 189 external validation sets ( Figure S13). and SomaScan trait levels in cohorts of Asian and African American ancestries, but as expected 234 their performances were significantly reduced when compared to the validations in European 235 ancestry cohorts (Figure 4). For Nightingale, the European-trained genetic score performance 236 generally declined from Chinese to Indian to Malay ancestries, with LDL subclasses displaying 237 some of the most variable cross-ancestry R 2 (Figure 4a and 4b). The most transferrable 238 Nightingale genetic scores were triglycerides in IDL, triglycerides in small HDL and medium 239 HDL, degree of unsaturation and phosphatidylcholines (Figure 4c). When assessing 240 transferability of SomaScan, we found genetic score performance generally declined from 241 Indian to Malay to Chinese to African American ancestries (Figure 4d). The SomaScan genetic 242 scores that attenuated most in non-European ancestries were those for CD177 (a cell-surface 243 expressed protein on neutrophil and Treg's) and GDF5 (a secreted ligand of TGF-beta) ( Figure  244 4e). The most transferable SomaScan genetic scores included SIGLEC9 (which mediates 245 sialic-acid binding to cells), SIRPA (a cell surface receptor for CD47 involved in signal 246 transduction) and ACP1 (an acid and protein tyrosine phosphatase), where all internal and 247 external validation R 2 were >0.50 (Figure 4f). Multi-omic genetic scores may be used to probe the relevance of biological pathways to a 261 particular trait or disease outcome of interest. To assess the coverage of biological pathways 262 by the proteomic genetic scores we present here, we applied the genetic scores for SomaScan 263 and Olink to assess the extent to which pathways are genetically controlled (Methods). Here, 264 we considered all genetic scores with R 2 > 0.01 in internal validation (2,205 unique proteins) 265 and jointly mapped the SomaScan and Olink scores onto data curated from Reactome 30 ( Figure  266 6a, Figure S15). 267 For the plasma proteome, we found wide variation amongst the 27 super-pathways with some 268 super-pathways under relatively little genetic control (e.g. chromatic organisation, or transport 269 of small molecules) and others under substantially greater genetic control (e.g. digestion and 270 absorption, or extracellular matrix organisation) (Figure 6a). Approximately 18% of proteins 271 in the digestion and absorption super-pathway had internal validation R 2 > 0.10, and ~4% with 272 R 2 > 0.30. For the lowest-level pathway annotation (N=1,717) of the 27 super-pathways, we 273 found that a majority (N=1,169, 68%) were covered by at least one SomaScan or Olink genetic 274 score with an internal validation R 2 > 0.01 ( Figure S15). For both the digestion and absorption 275 and the extracellular matrix organisation super-pathways, 25% and 42%, respectively, of 276 lowest-level pathway annotations were covered by at least one SomaScan or Olink genetic 277 score with internal R 2 > 0.30. 278

279
Using the multi-omic genetic scores, we generated genetically predicted Metabolon HD4, 280 Nightingale NMR, Olink, SomaScan and whole blood RNAseq data for the UK Biobank 281 (Methods). Next, using these predicted multi-omics data of UKB, we performed a phenome-282 wide association study using PheCodes 31 (ICD-9 and ICD-10 based diagnosis codes collapsed 283 into hierarchical clinical disease groups; Methods). For simplicity and to maximize the number 284 of qualified PheCodes, we focused the analysis on UKB individuals of white British ancestry. 285 Multiple testing was controlled using Benjamini-Hochberg FDR of 0.05 (Methods). 286 Overall, at an FDR of 5%, we identified 18,404 associations between genetic scores of the 287 biomolecular traits and 18 categories of PheCodes (Figure 6b). These associations comprised 288 1,668 for Metabolon HD4, 2,854 for Nightingale NMR, 740 for Olink, 5,501 for SomaScan 289 and 7,641 for RNAseq (Table S6 and S7). Circulatory system diseases, endocrine/metabolic 290 and digestive diseases yielded the largest number of associations across platforms (Figure 6b). 291 The PheWAS detected many well-known blood biomarkers as well as intriguing associations 292 across genes, proteins and metabolites. For example, total cholesterol was significantly 293 associated with myocardial infarction (HR = 1.13 per s.d., FDR-corrected p-value = 1´10 -61 ). 294 Interleukin-6 (IL-6) pathways have been shown to have a causal association with coronary 295 artery disease 32,33 , and notably, IL-6 receptor genetic scores in SomaScan and Olink had R 2 > 296 0.50 in both internal and external validation, showing its high genetic predictability. 297 Genetically predicted levels of IL-6 receptor in both Olink and SomaScan were significantly 298 associated with myocardial infarction (HR = 0.97 per s.d., FDR-corrected p-value = 2´10 -4 ; 299 HR = 0.97 per s.d., FDR-corrected p-value = 4´10 -4 , respectively). Microseminoprotein-beta 300 has been identified as a biomarker for prostate cancer 34 and PheWAS findings support this 301 association (HR = 0.87 per s.d., FDR-corrected p-value = 3´10 -49 ). Genetically predicted Sex 302 Hormone-Binding Globulin (SHBG) protein was associated with type 2 diabetes (HR = 0.98 303 per s.d., FDR-corrected p-value = 0.03), consistent with previous observational and genetic 304 analyses 35 . Similarly, we found associations for insulin signaling pathway related proteins, e.g. 305 insulin receptor (INSR) and insulin-like growth factor 1 receptor (IGF1R), with type 2 306 diabetes 36,37 ; ABO 38 with type 2 diabetes; IL-6 with asthma 39 ; and HLA-DQA1/DQB1 with 307 celiac disease 40 (Table S6). 308 Our results validate those of a recent study identifying putative causal plasma protein mediators 309 between polygenic risk and incident cardiometabolic disease 5 , including six of the novel and 310 putatively causal associations for coronary artery disease ( While the utility of generating predicted transcriptomic data for cohorts with genome-wide 349 genotype data has been demonstrated 42 , our work substantially extends these foundations using 350 a large multi-omic cohort, quantifying both the intra-and inter-ancestry reliability of proteomic 351 and metabolomic genetic scores across multiple platforms. We generate a predicted multi-omic 352 dataset for UK Biobank and show that PheWAS can uncover many known and novel omic 353 associations with disease. Given that the increase in sample size required to detect an 354 association for a noisy explanatory variable can be estimated by the formula n/R (where n is 355 the sample size required if no measurement error exists and R is the reliability coefficient) 14 , 356 even genetic scores of apparently low predictive value are likely powerful enough to detect 357 true associations at the sample sizes of current and forthcoming biobank-scale data. This 358 suggests that large biobanks could reliably test trait-disease associations using efficient 359 genetically-predicted data, before committing to novel data generation using (frequently 360 expensive) molecular assays. 361 Our study has several limitations. While blood is a key tissue of broad utility in discovery 362 science and medicine, it is most likely a correlate but not the main site of function for many of 363 the biomolecules assessed here. While genetic score performance was generally consistent 364 across cohorts, there were factors that could affect their performance, including technical 365 factors (e.g. use of serum versus plasma; genetic variant missingness), participant 366 demographics, and genetic factors (e.g. allele frequency differences). Genetic scores may also 367 pick up differences in molecular traits shared by multiple platforms (e.g. Olink and SomaScan). 368 Despite genetic scores for most shared proteins being consistently predictive across platforms, 369 there can be large differences which can be due to technical factors (e.g. binding affinity) 370 (Methods), as assessed in a recent study 43 . The attenuated performance of polygenic scores 371 across ancestries is a well-known limitation 44 and our analysis also found this in multi-omics 372 data. Multi-omics for non-European ancestries will likely become more common in the future, 373 and we see a key role for OmicsPred in facilitating robust genetic scores which enable multi-374 omic prediction in diverse populations. Finally, we acknowledge that there are many highly 375 sophisticated machine learning approaches, which may improve genetic score performance 376 and/or transferability. We selected Bayesian ridge because it has been previously shown to 377 both perform well relative to other machine learning approaches and because it scales very well 378 to large numbers of traits, thus improving computational efficiency and promoting green 379 4,811 measurements proceeding to standard QC assessments. We also checked for missing 437 measurements and measurements below the limit of detection. No missing measurements were 438 found. 8 out of 92 proteins had values below the limit of detection (LOD), of which 4 (HAGH, 439 BDNF, GDNF, CSF3) had more than 5% of measurements below the LOD so were not taken 440 forward for further analyses. No participant had more than 4% of protein measurements below 441 LOD, and we did not observe over-representation of particular proteins below LOD for specific 442 participants. Protein measurements were then adjusted for age, sex, season and the first 11 443 genetic PCs, residuals of which were further inverse normal rank transformed for their GWASs. 444 It was noted that there are a small number of shared proteins across the four Olink panels 445 (detailed numbers of proteins and participants per panel after QC were given in Table S8). To 446 avoid duplication in genetic score construction, these shared proteins were merged by 447 averaging their protein levels on each sample across panels, and taken as a unique protein. quality samples with RNA integrity number (RIN) < 4 or read depth < 10 million uniquely 506 mapped reads were removed. We further removed one random individual from each flagged 507 pair of related individuals, which were first-or second-degree estimated from genetic data. 508 Finally, sample swaps and cross-contamination were assessed using match bam to VCF (MBV) 509 method from QTLtools 51 , which identified and corrected 10 pairs of mislabelled samples;  call level, genotypes with DP<5 or GQ<20 or AB>0.8 (heterozygotes calls), were set to NULL. 594 Finally, samples with abnormal ploidy were excluded, and genetic ancestry were determined 595 with k-means clustering from the top 15 principal components. Both SomaScan (version 4) and 596 Nightingale NMR platforms were used to assay baseline and revisit blood samples of 597 participants in MEC. For quality control of Nightingale data, participants with >10% missing 598 metabolic biomarker values were excluded from subsequent analyses. For participants with 599 biomarker values lower than detection level, we replaced values of 0 with a value equivalent 600 to 0.9 multiplied by the non-zero minimum value of that measurement. For quality control of 601 SomaScan data, protein levels were first normalized to remove hybridization variation within 602 a run. This was followed by median normalization across calibrator control samples to remove 603 other assay biases within the run. Overall scaling and calibration were then performed on a per-604 plate basis to remove overall intensity differences between runs with calibrator controls. 605 Finally, median normalization to a reference was performed on the individual samples with QC 606 controls. During these standardization steps, multiple scaling factors were generated for each 607 sample/aptamer at each step. The final number of samples in each ethnic groups used in our 608 validation were given in Table 1 were detailed in the previous studies 29,59,60 . SomaScan IDs were used to match shared proteins 618 between JHS and INTERVAL, which identified 820 proteins in total. Protein levels were 619 adjusted for age, sex and the first 10 principal components of genetic ancestry in JHS, before 620 they were used for evaluating performance of genetic scores. 621

Polygenic scoring method 622
A genetic score is most commonly constructed as a weighted sum of genetic variants carried 623 by an individual, where the genetic variants are selected and their weights quantified via 624 univariate analysis in a corresponding genome-wide association study 61,62 : 625 (1) 626 where S is the set of variants, referring to single nucleotide polymorphisms (SNPs) in this study, 627 that are identified in the variant selection step described below; βj is the effect size of the SNP 628 j that is obtained through the univariate statistical association tests in the GWAS; xij is the 629 genotype dosage of SNP j of the individual i. As the variant set S is derived through a LD 630 pruning and p-value thresholding process, this method is often named as the P+T. However, 631 P+T relies on hard cut-off thresholds to remove LD correlations among variants and select 632 associated variants. It is often challenging to balance between keeping predictive variants and 633 removing redundant and uninformative variants that can limit the prediction precision. Also, 634 due to the inherent linear assumption of the univariate analysis in P+T, this method leaves no 635 modelling considerations for joint effects between variants. To alleviate these limitations, 636 various machine learning based methods, such as Bayesian ridge (BR), elastic net (EN) 45 and 637 LDpred 63 , have been utilized to construct genetic scores for a wide range of traits and diseases 8 . 638 In particular, BR and EN have been shown to outperform other methods when developing 639 scores for predicting biomolecular traits, such as blood cell traits and gene expression 8,10 , which 640 are similar to the type of traits considered in this study. We adopted the BR method for the 641 genetic score construction of all the biomolecular traits as BR is more efficient to run in practice 642 (see details below). 643 Bayesian ridge is a multivariate linear model which assumes that the genetic variants have 644 linear additive effects on the genetic score of the trait 8,64 . In addition, BR also assumes that the 645 genetic score of a trait follows a Gaussian distribution, and the prior for effect sizes of variants 646 is also given by a spherical Gaussian: 647 where α and λ are coefficients of the model and subject to two Gamma distribution: Gamma(α1, 651 α2) and Gamma(λ1, λ2). These two prior Gamma distributions can be set via a validation step. 652

Genetic score model training and evaluation 653
The explained variance (R 2 ) and Spearman's rank correlation coefficient were used to measure 654 the performance of constructed genetic scores in the INTERVAL training samples and external 655 cohorts (or INTERVAL withheld subset), where R 2 scores were calculated using the squared 656 Pearson correlation coefficient. We adopted a similar strategy for sample partition when 657 training and evaluating genetic scores within the training samples as previous studies 8,10 that 658 utilised learning-based methods to construct genetic scores for molecular traits. The training 659 samples of a trait were randomly and equally partitioned to five subsets, from which any four 660 subsets are used as true-training data to learn a genetic score model of the trait, and test the 661 model's performance on the remaining 20% of samples. Given a genetic scoring method and a 662 trait, we obtained five different genetic score models of the trait and the mean of their 663 performance measurements in the corresponding testing samples in INTERVAL was reported 664 (internal validation). Note that, due to the high similarities between the five genetic score 665 models trained for most traits, only one model was randomly selected from the five and 666 evaluated in the external cohorts (or INTERVAL withheld subset). 667 When training genetic score models using the BR method, we need to select two appropriate 668 prior gamma distributions, i.e. α1, α2, λ1 and λ2. To do so, a grid search across the set [-10 10 , -669 10 5 ,-10, 0, 10, 10 5 , 10 10 ] was performed on the true-training data set, in which 10% of the 670 samples were used as a validation set. However, running this validation process is resource and 671 time-intensive, which makes it challenging to run for all the traits. To address this problem, we 672 found that it is reasonable to assume that the same category of molecular traits, i.e. proteomic 673 traits or metabolomic traits, share the same prior distributions, without sacrificing model 674 performance. Thus, we only needed to run the validation process once for each of the platforms 675 (a trait was randomly selected), and applied the identified optimal prior distributions to other 676 traits. 677

Variant selection and performance comparison between BR and P+T 678
Selecting a proper set of variants and feeding into a polygenic scoring method are a key step 679 for effective genetic score construction. To do so and further confirm the superiority of BR 680 method, we looked at the performance of BR and P+T on a variety of variant selection schemes 681 for the traits in three platforms (SomaScan, Olink and Metabolon). 682 To ensure the generalizability of genetic score models when applied to other cohorts, a variant 683 filtering step was first performed for all the traits considered, which applied a MAF threshold 684 of 0.5% and excluded all multi-allelic variants as well as ambiguous variants (i.e. A/T, G/C). 685 To remove LD dependencies among variants, a follow-up LD thinning step was carried out at 686 an (1) p-value < 5 ´ 10 -8 ; (2) p-value < 1 ´ 10 -3 . 696 697 Then, we compared the performance of BR and P+T on these variant sets in the internal 698 validation (Figure S1-S3). Regarding the proteomic traits (SomaScan and Olink), the two 699 variant selection schemes: (1) p-value < 5´10 -8 on genome-wide variants and (2) all the cis 700 variants and p-value < 1´10 -3 on the trans variants, were shown to be the best performing 701 schemes with either of the methods; BR method largely outperformed P+T across the two 702 variant selection schemes. Meanwhile, it was noted that the two selection schemes led to 703 greatly different performance, with the latter scheme achieving an unrealistic mean R 2 of ~0.74 704 across all the proteins (only ~0.09 for the former scheme). Similarly, for the metabolomic traits 705 (Metabolon), the applied two variant selection schemes significantly differ in their performance 706 in internal validation, and BR was also shown to a better performing method. 707 To further identify the optimal variant selection scheme, we also looked at the performance of 708 validated genetic score models trained with the two identified (for proteins) or all the two 709 applied (for metabolites) schemes using BR method for Olink traits and Metabolon traits 710 (Figure 3 and Figure S4) in external cohorts (NSPHS and ORCADES) or withheld 711 INTERVAL data. Despite the second scheme (all the cis variants and p-value < 1´10 -3 on the 712 trans variants for proteins, or p-value < 1´10 -3 on genome-wide variants for metabolites) 713 showed outstanding performance in internal validation, its performance saw a dramatic decline 714 in external validation for almost every trait validated ( Figure S4). It indicates this variant 715 selection scheme caused an overfitting problem in genetic score training, which is consistent 716 with previous findings when using overly lenient p-value thresholds for variant selection 8 . 717 These results suggested that the BR method with the variant selection scheme of p-value < 718 5´10 -8 on genome-wide variants was the optional method (of those tested) for genetic score 719 development of these biomolecular traits, thus we applied this approach to all other traits for 720 their genetic score development in this study. 721

722
SomaScan and Olink used two different technologies for protein level measurement. The two 723 platforms measured many proteins in common,among which there are 169 unique proteins 724 whose genetic scores we have validated . To check the impact of technologies on genetic  725  prediction, we looked at how the genetic scores trained on one platform can predict protein  726 levels from the other platform on the INTERVAL training samples (Figure S16). We 727 confirmed that performance of these overlapped genetic scores trained in the other platform 728 was generally consistent with that of the scores trained in their original platform. However, we 729 did observe, in some cases, the genetic scores trained in the two platforms can lead to very 730 different predictions, for which we found that they are mainly due to the differences in what 731 the two platforms are actually quantifying. For example, among the 169 proteins, there are 11 732 proteins in SomaScan that had a R 2 > 0.3 in internal validation, in which 10 proteins also 733 achieved a R 2 > 0.3 but the remaining protein (CHI3L1) received a poor R 2 < 0.1 when 734 predicting with Olink genetic scores. We found that the remaining protein received a lowest 735 Pearson's r score among the 11 proteins between their actual protein levels measured in the 736 two platforms. In INTERVAL, there were ~700 participants (depending on the protein) who 737 were assayed by both SomaScan and Olink, which allowed us to calculate the correlations 738 between the actual protein levels measured by the two platforms for the same protein. These 739 results suggested, despite great consistency, genetic scores of the same protein trained in the 740 two platforms can represent distinct aspects of protein biology of prediction and integration of 741 diverse proteomic techniques may enable to develop better genetic scores for these proteins 65 . 742

Pathway coverage analysis of heritable proteins 743
In this analysis, SomaScan and Olink proteins were combined based on their Uniprot ID, where 744 duplicate proteins were removed if identified. We only kept proteins with R 2 > 0.01 in internal 745 validation, resulting in a total of 2,205 unique proteins for the analysis. We used pathway data 746 of Homo sapiens curated at Reactome 30 and conducted analyses to uncover the coverage of 747 these proteins in the pathways. In detail, this analysis looked at the percentages of these proteins 748 in annotated physical entities of each super-pathway, and the percentages of the lowest-level 749 pathways these proteins covered among all the lowest-level pathways of each super-pathway. 750 Where at least one protein in this study are included in entities of a lowest-level pathway, we 751 considered this pathway is covered by proteins of this study. 752

Phenome-wide association analysis (PheWAS) in UKB 753
We included biomolecular traits with R 2 > 0.01 in internal validation in this analysis (11,576 754 traits in total) and considered only participants of European ancestry in UKB (the White British 755 subset). We used the version 3 of imputed and quality controlled genotype data for UKB, which 756 were detailed in the previous study 25 . Using version 1.2 of the PheWAS Catalog 31 , we extracted 757 the curated phenotype definitions of all phecodes. Each phecode is provided as a set of WHO 758 International Classification of Diseases (ICD) diagnosis codes in versions 9 (ICD-9) and 10 759 (ICD-10) of the ontology to define individuals with the phenotype of interest, and a set of 760 related phecodes that should be excluded from the control cohort of unaffected individuals. To 761 define cases for each phecode, we searched for the presence of any of the constituent ICD-9/10 762 codes in linked health records (including in-patient Hospital Episode Statistics data, cases of 763 invasive cancer defined in the cancer registry, and primary and secondary cause of death 764 information from the death registry), and converted the earliest coded date to the age of 765 phenotype onset. Individuals without any codes for the phenotype of interest were recorded as 766 controls, and censored according to the maximum follow-up of the health linkage data (January 767 31, 2020) or the date of death whichever came first. To define the cohort for testing molecular 768 genetic score associations with the age-of-onset of each phenotype, we used the set of events 769 and censored individuals described above and removed any individuals with related 770 phenotypes (based on definitions from the PheWAS Catalog), restricting analyses to be sex-771 specific (e.g. ovarian and prostate cancer) where requires. To ensure a well-powered study we 772 restricted the PheWAS analysis to phenotypes with at least 200 cases in the 409,703 European 773 ancestry individuals whose reported sex match the genetically inferred sex from the UKB 774 quality controlled genotype data 25 , resulting in a set of 1,123 phecodes included in the final 775 analysis. The association of the genetic score for biomolecular traits with the onset of each 776 phenotype was assessed by using a Cox proportional hazards model with age-as-timescale, 777 stratified by sex and adjusted for genotyping array and 10 PCs of genetic ancestry. The 778 association between genetic scores and each phecode is reported in terms of its effect size 779 (Hazard ratio) and corresponding significance (p-value), and significant results were defined 780 as Benjamini/Hochberg FDR-corrected p-value < 0.05 for all the tested traits.   platform, genetic scores were constructed using Bayesian ridge regression on the genome-wide 1075 genetic variants with univariate p-value < 5´10 -8 in INTERVAL. The variance explained in the 1076 measured biomolecular trait (R 2 ) by the genetic score is assessed in the internal validation set 1077 (Methods). Pie charts reflect the number of genetic scores in a particular R 2 range. 1078