Cell-specific Bioorthogonal Tagging of Glycoproteins

Altered glycosylation is an undisputed corollary of cancer development. Understanding these alterations is paramount but hampered by limitations underlying cellular model systems. For instance, the intricate interactions between tumour and host cannot be adequately recapitulated in monoculture of tumour-derived cell lines. More complex co-culture models usually rely on sorting procedures for proteome analyses and rarely capture the details of protein glycosylation. Here, we report a strategy termed Bio-Orthogonal Cell-specific Tagging of Glycoproteins (BOCTAG). Cells are equipped by transfection with an artificial biosynthetic pathway that transforms bioorthogonally tagged sugars into the corresponding nucleotide-sugars. Only transfected cells incorporate bioorthogonal tags into glycoproteins in the presence of non-transfected cells. We employ BOCTAG as an imaging technique and to annotate cell-specific glycosylation sites in mass spectrometry-glycoproteomics. We demonstrate application in co-culture and mouse models, allowing for profiling of the glycoproteome as an important modulator of cellular function.

Cancer is a multifactorial disease consisting of an interplay between different host and tumour cells. Emulating the complexity of a tumour using cell monoculture is thus incomplete by design, requiring more elaborate co-culture systems or in vivo models. 1-3 Recent years have seen a stark increase in methods to probe the transcriptomes of tumour and host cell populations, respectively, providing some insight into their state within a multicellular conglomerate. 4 However, the relationship between transcriptome and proteome is still elusive. 5 In addition, posttranslational modifications (PTMs) heavily impact the plasticity of the proteome. Glycosylation is the most complex and most abundant PTM, but challenging to probe due to the non-templated nature of glycan biosynthesis. 6 Glycans are generated by the combinatorial interplay of >250 glycosyltransferases (GTs) and glycosidases, mostly in the secretory pathway. 7 Certain glycoproteins aberrantly expressed in cancer, such as mucins, are approved as diagnostic markers, but their discovery is a particular challenge. 8,9 This is especially true when in vivo or in vitro model systems comprise cell populations from the same organism that do not allow distinction of proteomes by amino acid sequence. 10,11 Methods to study the glycoproteome of a cell type in co-culture or in vivo are therefore an unmet need.

Many MOE reagents are based on analogues of sugars such as N-acetylgalactosamine (GalNAc) that are straightforward to chemically tag by replacing the acetamide with bioorthogonal N-acylamides (Fig. 1a).
modifications at the N-acyl moiety can render GalNAc analogues recalcitrant to these metabolic processes. For instance, analogues of UDP-GalNAc with long alkyne-containing N-acyl substituents are not biosynthesised by the GalNAc salvage pathway and not used as substrates by wild type (WT)-GalNAc-Ts. 18,22-24 While this is a substantial impediment to generating MOE reporters, we realised that overcoming these metabolic roadblocks might enable programmable bioorthogonal glycoprotein tagging. Such a strategy would allow for studying the glycoproteome in a cell-specific fashion, which is currently elusive despite the rapid advances in the development of new MOE reagents.

Here, we develop a technique called Bio-Orthogonal Cell-specific Tagging of Glycoproteins (BOCTAG). The strategy uses an artificial biosynthetic pathway to generate alkyne-tagged UDP-GalNAc and UDP-GlcNAc analogues from a readily available GalNAc precursor that is not accepted by the GalNAc salvage pathway. We find that a single additional methylene group, the difference between the 5-carbon (GalNAlk) and 6-carbon (GalN6yne) N-acyl substituents, drastically reduces uptake by the native GalNAc salvage pathway and thereby reduces the background of bioorthogonal labelling in non-transfected cells. Only cells carrying the artificial pathway biosynthesise the corresponding UDP-sugars (UDP-GalN6yne and UDP-GlcN6yne) that are then used by GTs to chemically tag the glycoproteome. We further expand the strategy with mutant GalNAc-Ts that are engineered to accept UDP-GalN6yne as a substrate. The combined use of an artificial biosynthetic pathway and engineered GalNAc-Ts enables GalN6yne-mediated fluorescent labelling of the cellular glycoproteome that is two orders of magnitude higher than in cells carrying neither component. We demonstrate that BOCTAG allows for programmable glycoprotein tagging in co-culture and mouse models. Moreover, the nature of the artificial biosynthetic pathway allowed for the use of readily available Ac4GalN6yne as a precursor with enhanced stability over previously used caged GalN6yne-1-phosphates, an essential prerequisite for in vivo applications. We show that the chemical modification enters a range of glycan subtypes, supporting the use of BOCTAG to tag a large number of glycoproteins in complex biological systems.
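The qualitative logic of the strategy can be sketched as a small Boolean model. This is only an illustration under simplifying assumptions, not part of the reported method: the class `Cell`, the function `tagged_glycans` and the two glycan labels are hypothetical names, and real labelling is graded rather than all-or-nothing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """A cell and the transgenes it carries (hypothetical toy model)."""
    name: str
    nahk: bool = False         # kinase NahK
    mut_agx1: bool = False     # engineered pyrophosphorylase mut-AGX1
    bh_galnac_t: bool = False  # engineered BH-GalNAc-T


def tagged_glycans(cell: Cell, fed_galn6yne: bool) -> set:
    """Return which glycan classes carry the alkyne tag (assumed labels).

    Simplification: UDP-GalN6yne/UDP-GlcN6yne are made only when both
    NahK and mut-AGX1 are present and the precursor is supplied.
    """
    if not (fed_galn6yne and cell.nahk and cell.mut_agx1):
        return set()  # no tagged UDP-sugar, hence no labelling
    tagged = {"other glycan subtypes"}
    if cell.bh_galnac_t:
        # Only engineered GalNAc-Ts accept UDP-GalN6yne as a substrate.
        tagged.add("O-GalNAc glycans")
    return tagged


# A toy co-culture: only the transfected cells are labelled.
co_culture = [
    Cell("mock"),
    Cell("pathway only", nahk=True, mut_agx1=True),
    Cell("pathway + BH-GalNAc-T", nahk=True, mut_agx1=True, bh_galnac_t=True),
]
for cell in co_culture:
    print(cell.name, sorted(tagged_glycans(cell, fed_galn6yne=True)))
```

In this sketch, a cell that lacks either enzyme of the artificial pathway returns an empty set even when the precursor is supplied, which mirrors the intended absence of labelling in non-transfected bystander cells.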

Developing an artificial biosynthetic pathway for chemically tagged UDP-sugars.

We next assessed UDP-sugar biosynthesis in the living cell. Stable bicistronic expression of a codon-optimized version of Bifidobacterium longum NahK as well as mut-AGX1 in K-562 cells led to biosynthesis of UDP-GalNAlk and UDP-GalN6yne from the membrane-permeable per-acetylated precursors Ac4GalNAlk and Ac4GalN6yne, respectively (Fig. 1c). Expression of either enzyme alone or WT-AGX1 led to inefficient biosynthesis compared to levels of native UDP-sugars (Fig. S3). We confirmed these results by feeding cells a caged precursor of

with the clickable fluorophore CF680-picolyl azide by CuAAC. The MOE reagent Ac4ManNAlk that enters the pool of the sugar sialic acid was included as a positive control. Alkyne tags were visualized by in-gel fluorescence after cell lysis (Fig. 2a). While Ac4GalNAlk feeding led to high-intensity fluorescent signal when NahK and mut-AGX1 were expressed, substantial signal was observed in cells expressing WT-AGX1 when NahK was present (Fig. 2a). Fluorescent signal after Ac4GalNAlk feeding was also observed in cells transfected with an empty plasmid or only overexpressing WT-AGX1, confirming the permissiveness of the GalNAc salvage pathway for GalNAlk (Fig. 1b). 18 In contrast, Ac4GalN6yne incorporation was critically dependent on

we observed that the day of sample collection has a greater effect on transcript levels than either transgene expression or compound treatment (Fig. S4b). These data suggest that neither the artificial biosynthetic pathway nor compound feeding has substantial effects on the transcriptome. Due to the robustness of metabolic incorporation, we used Ac4GalN6yne as an MOE reagent for all subsequent applications of BOCTAG.

Cell type-specific glycoproteome tagging in co-culture.

BOCTAG enables cell-specific tagging of cell surface glycoproteins in co-culture.

Assessing and manipulating the glycan types tagged by GalN6yne.

We next sought to assess and expand the glycan subtypes targeted by our MOE approach. We were prompted by our recent findings that GalNAc analogues with bulky N-acyl chains such as GalN6yne are not incorporated into O-GalNAc glycans by WT-GalNAc-Ts (Fig. 4a). 23,27,34 We have created GalNAc-T mutants termed BH-

Expression of BH-GalNAc-Ts increased the intensity of in-gel fluorescence more than sevenfold over expression of WT-GalNAc-Ts when cells were fed with Ac4GalN6yne (Fig. 4b). WT-AGX1-expressing cells lacked UDP-GalN6yne/UDP-GlcN6yne biosynthesis and did not show any discernible fluorescent signal over vehicle control DMSO (Fig. 1c). We assessed the subtypes of the chemically tagged glycans by digestion with

We then validated BOCTAG as a strategy for cell-specific MS-glycoproteome analysis. We chose a co-culture model between murine 4T1 and human MCF7 breast cancer cell lines, opting to distinguish labelled glycoproteins with species-specific peptide sequences by label-free quantitative (LFQ) LC/MS-MS analysis. We transfected cells with either NahK/mut-AGX1/BH-GalNAc-T2 (termed "BOCTAG-T2") or empty plasmid (pSBbi-Hyg, mock), co-cultured murine and human cells overnight and subsequently fed the co-cultures with either Ac4GalN6yne or vehicle DMSO. Chemically tagged glycoproteins in the secretome were reacted with acid-cleavable biotin-picolyl azide by CuAAC and enriched on neutravidin magnetic beads (Fig. 5a). On-bead digest yielded a peptide fraction and left glycopeptides bound to beads to be separately eluted with formic

of the respective other species (Fig. 5b, Supplementary Table 2). Only two human peptides and one murine peptide were found in the enriched datasets from the corresponding other species.
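The idea of distinguishing co-cultured cell lines by species-specific peptide sequences can be illustrated with a minimal sketch. It assumes a simple exact-substring match against each species' protein sequences; the function `assign_species` and all sequences below are hypothetical toy data, not the search pipeline or proteins used in this work.

```python
from __future__ import annotations


def assign_species(peptide: str, proteomes: dict[str, list[str]]) -> str:
    """Return the one species whose proteome contains the peptide,
    'shared' if several species contain it, or 'unmatched' if none does."""
    hits = [
        species
        for species, proteins in proteomes.items()
        if any(peptide in protein for protein in proteins)
    ]
    if len(hits) == 1:
        return hits[0]
    return "shared" if hits else "unmatched"


# Toy proteomes (made-up sequences): the first protein differs by a single
# residue between species, the second is identical in both.
toy_proteomes = {
    "human": ["MKTLLVAGSPEETR", "AAGTSPLQWK"],
    "mouse": ["MKTLLVAGSPDETR", "AAGTSPLQWK"],
}

for pep in ["VAGSPEETR", "VAGSPDETR", "AAGTSPLQWK", "QQQQ"]:
    print(pep, "->", assign_species(pep, toy_proteomes))
```

Only peptides that map to exactly one species are informative for attributing a glycoprotein to its cell line of origin; peptides shared between species are kept apart here rather than being assigned arbitrarily.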

BOCTAG-T2 allows for cell-specific glycosylation site identification. Using a tandem MS technique consisting of higher-energy collisional dissociation (HCD)-triggered electron transfer dissociation (ETD), we identified 37 specific glycosylation sites on 57 murine glycopeptides from 4T1 cells and 9 specific glycosylation sites on 12 human glycopeptides from MCF7 cells in secretome samples (Supplementary Table 2). Our data indicated glycosylation of homologous glycopeptides from murine and human origins in pro-X carboxypeptidase in secretome (Fig. 5d, Fig. S9c). We also performed an MS-glycoproteomics experiment in lysate from the 4T1/MCF7 co-culture expressing BOCTAG-T2 or empty plasmid. We annotated a total of 4 specific glycosylation sites on 11 murine glycopeptides from 4T1 samples and 2 specific glycosylation sites on 8 human glycopeptides from MCF7 cells (Supplementary Table 3). Particularly, we identified a homologous glycopeptide from both human and murine glucosidase 2 (Fig. S9b). The presence of the chemical tag facilitated

To evaluate the protein expression levels of NahK/mut-AGX1/BH-T2 ex vivo, parts of the tumours were digested and plated, and the cells were cultured. Protein expression of NahK/mut-AGX1/BH-T2 was assessed by Western blot and found to be comparable to expression levels before in vivo injection (Fig. S10b). Cells also generally retained the capacity for Ac4GalN6yne-dependent chemical glycoproteome tagging (Fig. S10b).

glycosylation sites that could not be confidently assigned. d, HCD spectra of homologous glycopeptides from murine (left) and human (right) origins. Peptide sequences were confirmed by ETD (Fig. S9c). e, in vivo glycoproteome tagging by BOCTAG-T2. Tumours were grown in fat pads of mice as described. BOCTAG-T2 and mock tumours were grown in the same mouse treated systemically by intraperitoneal (i.p.) administration for five days with 300 mg/kg Ac4GalN6yne, Ac4ManNAlk or the corresponding volume of vehicle. Tumours

We developed BOCTAG to address two major shortcomings in prominent research fields such as cancer biology. First, there is still an unmet need for characterising proteins produced by a particular cell type. Glycans are a means to an end in this respect, and the large signal-to-noise ratio in our fluorescent labelling experiments indicates that BOCTAG allows for efficient protein tagging. The approach is complementary to other techniques, including the use of unnatural amino acids and proximity biotinylation. 39,40 Second, directly incorporating glycans in the analysis will give insight into cell-type-specific glycosylation sites and glycan structures to add another dimension to proteome profiling. The presence of a modification that can be observed by MS and is a direct corollary of using chemical tools allows for further validation of enriched glycoproteins, facilitating glycoproteome analysis even in complex co-culture or in vivo settings. An artificial biosynthetic pathway was essential to ensure minimal background labelling while being able to supply the tagged sugar as an easy-to-synthesise MOE reagent. To this end, the use of the kinase NahK allows for use of a per-acetylated bioorthogonal sugar that is fundamental to in vivo use, in marked contrast to the highly unstable caged sugar-1-phosphates used previously. 19,27 To enable BOCTAG, cells require transfection with at least two transgenes. However, the design of a multicistronic, transposase-responsive plasmid ensures that transfection efforts are straightforward. 41,42 BOCTAG allowed us to selectively tag tumour glycoproteomes in vivo, highlighting the robustness of the approach. MOE reagents have been chemically caged to be released by enzymes overexpressed in cancer. 43-45 While independent of transfection, such targeting can be accompanied by substantial background labelling in non-cancerous tissue. BOCTAG allows for programmable glycoprotein tagging with remarkable signal-to-noise ratio and is an enabling technology that will transform our understanding of tumour-host interactions, particularly in the context of protein glycosylation.