A Region Within the Third Extracellular Loop of Rat AQP6 Precludes Its Trafficking to Plasma Membrane in a Heterologous Cell Line.

The inability to over-express AQP6 in the plasma membrane of heterologous cells has hampered efforts to further characterize the function of this aquaglyceroporin membrane protein at atomic detail. Using the AGR reporter system we have identified a region within loop C of AQP6 that is responsible for severely hampering its plasma membrane localization. Serine substitution corroborated that amino acids present within residues 194-213 of AQP6 loop C contribute to intracellular retention. This intracellular retention signal may preclude proper plasma membrane trafficking and severely curtail expression of AQP6 in heterologous cells.


Introduction
Aquaporins are specific membrane channel proteins in which each monomer is composed of six membrane-spanning helices that form the monomeric membrane-spanning channel. All monomeric pores exhibit a conserved Asn-Pro-Ala (NPA) [1] constriction site formed by two short half-helical segments, and a selectivity filter (SF) at the narrowest point in the channel. Four monomers, held together by hydrophobic interactions, come together as a tight tetramer, with a central pore in between. Aquaporins are traditionally divided into 'water channels' and 'aquaglyceroporins' that also facilitate the permeation of glycerol and other small solutes via the monomeric pore [2][3][4]. This classification has expanded considerably over the past decade as evidence first emerged of potential gas permeability of membrane-spanning channel proteins. Boron and co-workers first demonstrated that Aquaporin 1 (AQP1) expressed in oocytes is also permeable to CO2 [5]. Molecular dynamics (MD) simulations [6] and monomeric pore inhibitor studies [7] suggested that about half the channel-dependent flux of CO2 moves through the four water pores, whereas half permeates via a central pore pathway.
AQP6 is localized mainly in the kidney, where it co-localizes with vH+-ATPase, and is also found in other tissues; along with AQP3, AQP7, AQP8 and AQP10 it belongs to the family of aquaglyceroporins. Interestingly, although AQP6 is classified as an aquaglyceroporin, its highest sequence homology is to the pure water channels AQP0, AQP2 and AQP5. Uniquely, AQP6 is also permeable to anions, and this pathway has been shown to be activated by acidic pH [8]. It is therefore surmised that AQP6 is involved in acid-base regulation [9]. A single amino acid substitution of asparagine 60 (N60) by glycine (G60, N60G) totally eliminates the anion permeability of AQP6 when expressed in Xenopus oocytes [10] and at the same time produces significantly increased water permeability, similar to AQP2, which is not inhibited by HgCl2.
Oocyte studies performed by Boron and co-workers [7,11] showed that AQP6 is also permeable to CO2 and NH3. Reciprocal glycine-to-asparagine mutations at the corresponding position in other aquaporins (AQP0, AQP1 and AQP2) all resulted in a failure of the protein to traffic to the plasma membrane, suggesting that the interaction of transmembrane domains 2 and 5 (TM2 and TM5) may result in significant conformational changes [10] and perturbation of the protein structure.
Currently, heterologous overexpression of AQP6 is a major bottleneck to obtaining sufficient amounts of purified protein to determine the 3-dimensional structure of this channel. AQP6 basal expression is low in the kidney, its native location [12], and for this reason purification from renal tissue is not practicable. Transient transfection of AQP6 in mammalian cell lines is also not efficient [13].
One important feature of AQP6 is that, like most eukaryotic membrane proteins, it is glycosylated. One potential glycosylation site, N134, is in the region of loop B and may be essential for translocation and function [14]. Therefore, heterologous expression of AQP6 would ideally be carried out in a system capable of post-translational modifications.
In the present work we took advantage of a previously created Aquaporin 3-GFP (AQP3-GFP) construct [15] that displays intense plasma membrane (PM) fluorescence with low cytoplasmic fluorescence, termed the AQP3-GFP-based reporter (AGR) system, to identify unknown endoplasmic reticulum (ER) retention regions in proteins of interest that show poor expression in heterologous systems [16]. We identified a region within the third extracellular loop of AQP6 which, when attached to AQP3-GFP using the systematic AGR approach, decreased expression of the reporter and abolished its PM localization in HEK293 cells. Using serine mutagenesis we restored the ability of a reporter carrying this region to reach the PM. We hypothesize that the third extracellular loop of AQP6, with the sequence FTGCSMNPARSFGPAVIVGKFAVHWIF, harbors an uncharacterized ER retention sequence that may explain the inability to over-express AQP6 in heterologous cell systems.

Expression and localization of tagged AQP6 in HEK293 cells
To study the cellular localization of AQP6 in transiently transfected HEK293 cells, two tagged constructs were made using turbo GFP (tGFP) at the N- or C-terminus of AQP6 (Fig. 1A and B, respectively). The C-terminal GFP-tagged construct produced almost no AQP6 expression, as previously reported by Ikeda M. et al. [17], with a complete absence of plasma membrane (PM) localization (Fig. 1A and B). Although the N-terminal GFP-tagged version displayed higher levels of expression, it localized mainly in the ER. Nevertheless, we were able to identify a small number of transfected cells displaying PM localization (Fig. 1C), confirming previous reports using a different heterologous system [18]. In both N- and C-terminal versions, however, expression levels of AQP6 were so low overall that any attempt to use these constructs for mass production of AQP6 for structural studies is unlikely to succeed.

Scanning AQP6 Residues 1-138 Using The AGR System
In order to identify potential regions of AQP6 that may be responsible for reducing expression levels (thus containing uncharacterized ER retention or degron motifs [19]) we used our previously described AGR system [16]. Using the AGR system we focused on the intracellular and extracellular loops of AQP6, omitting its transmembrane domains (TMDs) except for the first.
To our surprise, as shown in Fig. 2, every fragment tested from this region reached the PM when C-terminally attached to AGR.

The third outside loop of AQP6 precludes PM localization of AGR
We further scanned the rest of the AQP6 loops, individually attaching fragments AQP6 158-167, AQP6 187-213 and the C-terminus AQP6 232-276 to the C-terminus of the AGR system, as shown in Fig. 3. Interestingly, fragment AQP6 187-213 produced a marked decrease in expression and a total absence of PM localization of the AGR reporter. In contrast, a scrambled version of AQP6 187-213 was able to reach the PM when C-terminally attached to AGR (Fig. 3D), suggesting the presence of an uncharacterized sequence-specific ER retention motif within region AQP6 187-213.
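The scrambled control described above keeps the amino acid composition of AQP6 187-213 while destroying its order. A minimal sketch of how such a control sequence could be generated is given here for illustration only. The function name `scramble`, the seed, and the retry limit are assumptions, and the result is not the scrambled sequence actually used in this study, which is not given in the text.

```python
import random
from collections import Counter

# AQP6 187-213 (loop C fragment), as given in the text.
FRAGMENT = "FTGCSMNPARSFGPAVIVGKFAVHWIF"


def scramble(seq, seed=0, max_tries=1000):
    """Return a random permutation of `seq` that differs from `seq`.

    Hypothetical helper: the seed is fixed only to make the sketch
    reproducible; it has no biological meaning.
    """
    rng = random.Random(seed)
    residues = list(seq)
    for _ in range(max_tries):
        rng.shuffle(residues)
        candidate = "".join(residues)
        if candidate != seq:
            return candidate
    raise ValueError("could not produce a sequence different from the input")


scrambled = scramble(FRAGMENT)

# Same residues, different order: composition is preserved.
assert Counter(scrambled) == Counter(FRAGMENT)
print(scrambled)
```

Because composition is unchanged, a difference in localization between the native and scrambled fragments points to residue order rather than overall charge or hydrophobicity.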

Mutagenesis Of Loop C Identifies The Retention Region
In order to further pinpoint the region that prevents PM targeting in the AGR system, as shown in Fig. 3B, we undertook a systematic serine-mutagenesis approach to elucidate the characteristics of "loop C" (Fig. 4A). Using four different constructs, each C-terminally attached to AGR, we identified the amino acid sequence PARSFGPAVIVGKFAVHWIF as the one responsible for intracellular retention (Fig. 4B-E).
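The serine-substitution strategy can be illustrated by enumerating every run of five consecutive residues in AQP6 187-213 replaced by serine. This is a minimal sketch under stated assumptions: the function name `serine_window_mutants` is hypothetical, the five-residue window follows the substitution length mentioned in the Discussion, and the windows listed are not the four constructs actually tested, whose exact sequences appear only in Fig. 4.

```python
# AQP6 187-213 (loop C fragment), as given in the text.
LOOP_C = "FTGCSMNPARSFGPAVIVGKFAVHWIF"
FIRST_RESIDUE = 187  # residue number of the first amino acid in LOOP_C


def serine_window_mutants(seq, window=5, first_residue=1):
    """List (start, end, mutant) for each run of `window` residues set to serine.

    `start` and `end` are inclusive residue numbers in the full-length protein.
    Hypothetical helper for illustration; not the authors' construct design.
    """
    mutants = []
    for i in range(len(seq) - window + 1):
        mutant = seq[:i] + "S" * window + seq[i + window:]
        start = first_residue + i
        mutants.append((start, start + window - 1, mutant))
    return mutants


mutants = serine_window_mutants(LOOP_C, window=5, first_residue=FIRST_RESIDUE)

for start, end, mutant in mutants:
    print(f"{start}-{end}: {mutant}")
```

Numbering the windows in full-length coordinates makes it straightforward to relate a given substitution to the two sub-regions named in the text, FTGCSMN (187-193) and PARSFGPAVIVGKFAVHWIF (194-213).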

Discussion
Polytopic multi-pass membrane proteins are notoriously difficult to mass express for functional and structural studies [20]. The aquaglyceroporin subtype AQP6 has been known for quite some time to be very difficult to express in heterologous systems [17]. In the present work, we sought to examine the molecular structure of AQP6, focusing on the cellular expression and localization characteristics of intracellular and extracellular AQP6 loops. Using the AGR system [16], we were able to identify a region contained within the third extracellular loop of AQP6, the so-called "loop C", that completely abrogated PM localization while decreasing the expression levels of the AGR reporter system in HEK293 cells (Fig. 3B). To determine whether this ER retention could be abolished, we used serine mutagenesis to localize the amino acid sequence containing the ER retention signal. As shown in Fig. 4, substitution of any 5 consecutive amino acids by serines along the FTGCSMNPARSFGPAVIVGKFAVHWIF sequence (underlined) restored PM localization of the AGR system (Fig. 4C-D), suggesting the retention motif is encoded within the residues FTGCSMN (italicized above).
This novel insight may help in designing a mutagenized AQP6 that displays high expression levels when transiently expressed in heterologous cells such as HEK293. Overcoming this obstacle would improve the chances of producing sufficient quantities of AQP6 to allow further crystallographic work on AQP6 structure.

Materials And Methods
Plasmid constructs.
The AQP3-GFP-pcDNA5/FRT plasmid was synthesized as previously described [16]. The full cDNA containing human AQP6-tGFP (NP_001643) was synthesized by Biomatik (Wilmington, DE) and tGFP-AQP6 (rat NP_071517.1) by Genscript (Piscataway, NJ). In the tGFP-AQP6 construct, the tGFP was separated from AQP6 by the following peptide linker, which included a PreScission Protease site (underlined in blue): AASAVNGSLEVLFQGPAA, and contained AfeI and HpaI restriction sites, as well as a 10X C-terminal histidine tag (His-tag) as shown in Fig. 1A (in black). For the AQP6-tGFP construct, the tGFP was separated from AQP6 by the following peptide linker (Fig. 1B in blue), which included a PreScission Protease site: GGSLEVLFQGPAA, and a C-terminal 10X His-tag (in black) as shown in Fig. 1B.
The AQP6-tGFP construct was subcloned into a pcDNA5/FRT plasmid (Invitrogen, Carlsbad, CA) using restriction enzymes BamHI/EcoRV, while the AQP6 was subcloned using HpaI/EcoRI into a pcDNA5/FRT plasmid containing tGFP. Each construct was custom synthesized by Genscript and codon optimized for mammalian expression in HEK293 cells. Each AQP6-based sequence was then subcloned into the AQP3-GFP-pcDNA5/FRT plasmid using HpaI/EcoRV restriction sites. All constructs were confirmed by sequencing.
Declarations

Figure 2 AGR scanning of the first half of AQP6. A) When the N-terminus of AQP6, comprising residues AQP6 1-10, is C-terminally attached to AGR, the construct reaches the PM (white arrows). B) When the first transmembrane domain (TMD) of AQP6, comprising residues AQP6 7-34, is C-terminally attached to AGR, it also reaches the PM. C) When the entire AQP6 1-34 region (which includes the N-terminus and first TMD) is attached to AGR, it also reaches the PM. D) When residues AQP6 30-42 are attached to AGR, they also reach the PM. E) When residues AQP6 62-101 are attached to AGR, they also reach the PM. F) When residues AQP6 121-138 are attached to AGR, they also reach the PM. Scale bars: 8 µm

Figure 3 AGR scanning of the second half of AQP6. A) When residues AQP6 158-167 are C-terminally attached to AGR, the construct reaches the PM (white arrows). B) When residues AQP6 187-213 are C-terminally attached to AGR, the construct does not reach the PM. C) When residues AQP6 232-276 (comprising the AQP6 C-terminus) are attached to AGR, they reach the PM. D) Interestingly, when a scrambled version of residues AQP6 187-213 is C-terminally attached to AGR, it reaches the PM. Scale bars: 8 µm

Figure 4 Serine mutagenesis pinpoints Omega signal localization. A) The amino acid sequence comprising AQP6 187-213 is shown. B) When the amino acid sequence shown with serine substitutions is C-terminally attached to AGR, this construct fails to localize in the PM of HEK293 cells. C) When the amino acid sequence shown with serine substitutions is C-terminally attached to AGR, this construct localizes in the PM of HEK293 cells (white arrows). D) When the amino acid sequence shown with serine