Capturing acyltransferase(s) transforming final step in the biosynthesis of a major Iridoid Glycoside, (Picroside-II) in a Himalayan Medicinal Herb, Picrorhiza kurroa

Picrorhiza kurroa has been reported as an age-old ayurvedic hepato-protection to treat hepatic disorders due to the presence of iridoids such as picroside-II (P-II), picroside-I, and kutkoside. The acylation of catalpol and vanilloyl coenzyme A by acyltransferases (ATs) is critical step in P-II biosynthesis. Since accumulation of P-II occurs only in roots, rhizomes and stolons in comparison to leaves uprooting of this critically endangered herb has been the only source of this compound. Recently, we reported that P-II acylation likely happen in roots, while stolons serve as the vital P-II storage compartment. Therefore, developing an alternate engineered platform for P-II biosynthesis require identification of P-II specific AT/s. In that direction, egg-NOG function annotated 815 ATs from de novo RNA sequencing of tissue culture based ‘shoots-only’ system and nursery grown shoots, roots, and stolons varying in P-II content, were cross-compared in silico to arrive at ATs sequences unique and/or common to stolons and roots. Verification for organ and accession-wise upregulation in gene expression of these ATs by qRT-PCR has shortlisted six putative ‘P-II-forming’ ATs. Further, six-frame translation, ab initio protein structure modelling and protein-ligand molecular docking of these ATs signified one MBOAT domain containing AT with preferential binding to the vanillic acid CoA thiol ester as well as with P-II, implying that this could be potential AT decorating final structure of P-II. Organ-wise comparative transcriptome mining coupled with reverse transcription real time qRT-PCR and protein-ligand docking led to the identification of an acyltransferases, contributing to the final structure of P-II.


Introduction
Picrorhiza kurroa Royle ex Benth, Plantaginaceae earlier known as Scrophulariaceae family has been reported as a major herbal therapeutic for liver diseases [1][2][3]. Annual trade of 1000-2000 metric tons of P. kurroa has been Shoots grown in tissue culture at 15 o C (PKS15) and 25 o C (PKS25) and shoots from field-grown plants (PKSS) usually contain only P-I, while P-II is mainly detected in fieldgrown roots (PKSR) and stolons (PKSTS). P. kurroa stolon is the only organ that accumulates significant amounts of both P-I and P-II. De novo next generation RNA sequencing (RNA-seq) of PKS15, PKS25, PKSS, PKSR, and PKSTS samples has been recently reported [1]. In the present study, the annotated transcriptomes were thoroughly mined for transcripts annotated as acyltransferases (ATs). The criteria of in silico differential expression (transcripts per million values of RNA-seq reads, TPM) of ATs transcripts in transcriptome samples of shoot, root and stolons from P. kurroa coupled with their transcript abundance by reverse transcription real-time quantitative PCR (qRT-PCR) have been recently utilized to identify gene paralogues of picrosides biosynthetic pathways [2]. Further, gene co-expression sub-networks generated from RNA-seq of organs varying in P-I and P-II accumulation in P. kurroa have revealed crucial components of picrosides biosynthetic machinery [1]. Using similar approach, bioinformatics sorting of ATs common or unique between any two transcriptomes varying in P-I and P-II contents have been carried out in the present investigation. ATs shortlisted for acylation of P-II were further analysed for their real time organ-wise RNA expression profiles by qRT-PCR. Abundance of gene expression of these transcripts was further verified in P. kurroa accessions collected from wild habitat of P. kurroa exhibiting 89.33 fold variations in P-II content (fold change determined from fresh weight%). Transcripts short-listed from above criterion were translated in silico by six-frame translation and were modeled into three dimensional proteins by protein structure modeling to observe for protein to acyl donor, protein to acyl acceptor, and protein to P-I/ P-II interactions. This study is a step-wise approach of utilizing parameters for sorting ATs specific to P-II biosynthesis. Thus an initial comparison and TPM based bioinformatics sorting of ATs have been accompanied with additional reduction of possibilities by qRT-PCR screening for higher transcript abundance and finally cross-verified by molecular 'protein to ligand' docking interactions. Both P-II and the acyl donor vanilloyl-CoA exhibited preferential binding to protein structure modeled proteins with membrane-bound O-AT domain (membrane-bound O-acyltransferase family, MBOAT domain). soil moisture and high soil organic matter, but with lowest possible total soil nitrogen availability and spatial remoteness to higher land plants of vascular origin [7]. P. kurroa biosynthesize and harbour a rich repertoire of terpenoids having diverse decorations of the main backbone and one among the superior most terpenoid structure modifications is the acylation of bicyclic monoterpenes with phenylpropanoids or phenolics to form iridoid glycosides [1,2]. The bryophytes too are recognized for their voluminous store of terpenoids [8]. It is not unusual to infer that these terpenoids act as allelopathic and/ or symbiotic ground microbial interactors and assist in the survival of the herb or the bryophytes [9]. More strikingly, inclusion of nitrogen in these terpenoid structures are less profound or missing in comparison to monoterpenoids indole alkaloid structures commonly found in medicinal plants like Catharanthus roseus that resides in warmer climates.
Picroside-II (P-II, amphicoside) is a major iridoid found in stolons and roots of P. kurroa [10]. P-II has a vanilloyl moiety acylated to C-6 of a catalpol moiety and is also commercially referred to as 6-vanillyl catalpol (https://www. sigmaaldrich.com/DE/en/search/amphicoside). P-II serves as the dominant active principle in hepatoprotective herbal formulations of P. kurroa, which also includes picroside-I (P-I, 6-O-Cinnamoyl catalpol) and kutkoside as the other active chemical ingredients [1]. In addition P-II has been demonstrated to possess plethora of pharmacological activities ( Table 1). The last committed step in P-II biosynthesis is the acylation of vanillic acid and catalpol [11,12]. Considering pharmacological viewpoint, this acylation between the monoterpene bicyclic iridoid catalpol and the phenolic acid, vanillic acid has conferred the bioactive properties in P-II. Therefore, identification of acyltransferase's (ATs) specific to P-II biosynthesis is important for engineering and developing an exclusive production platform for P-II.

Validation of gene expression using qRT-PCR analysis
Gene expression profiles of ATs were checked by qRT-PCR in PKSS, PKS15, PKS25, PKSTS, PKSR, PKR84 (P. kurroa accession having high P-II content in roots), and PKR31 (P. kurroa accession having low P-II content in roots). For qRT-PCR, equal volumes of cDNA of each sample were subjected to qRT-PCR on a CFX96 system (Bio-Rad Laboratories) using AT partial transcript-specific primers (Suppl. File 1 A) and Hi-SYBr master mix (with Taq Polymerase) (Hi-media) in biological duplicates where each biological replicate consisted of three technical replicates. The qRT-PCR reaction mix (10 µL) for AT expression analysis consisted of 0.3 µL forward primer (10 pmol), 0.3 µL reverse primer (10 pmol), 5µL Hi-SYBr master mix (Hi-media), 1 µL cDNA, and 3.4 µL nuclease-free water. The qRT-PCR cycles involved an initial denaturation of 3 min at 95 °C, followed by 39 cycles of denaturation at 95 °C for 10 s (s), annealing for 30 s at 50 − 65 °C, and elongation for 20 s at 72 °C. In the final step, melt curve analysis was done at 65-95 °C with a 0.5 °C increment at each 0.05 s to verify the amplification of single products. Expression levels of each sample were normalized relative to 26 S rRNA as an

Mining of putative acyltransferase(s) from NGS transcriptome datasets
For extracting and identification of putative ATs that catalyse the final step of picrosides biosynthesis in P. kurroa, five transcriptome datasets viz. PKS-25, PKS-15, PKSS, PKSTS, and PKSR varying for picrosides content were computationally searched for sequences that have putative AT function.
Alignment analysis of mined transcripts from 5 transcriptome datasets using BLASTn with criteria of > 75% query coverage and > 90% was performed to pull out sequences in different permutations and combinations of the datasets. Comparative analysis was performed using five transcriptome datasets to shortlist acyltransferase(s) corresponding to P-I and P-II biosynthesis. Comparison of stolon (P-I &P-II) and roots (P-II) have identified acyltransferase(s) that play possible role in P-II biosynthesis. Unique acyltransferase(s) were extracted and pooled together with ATs that were common in stolons (PKSTS) and roots (PKSR) with respect to P-II (Fig. 1).

Collection of plant material
Field-grown plants of P. kurroa were collected from experimental nursery established in experimental research station of Himalayan Forest Research Institute at Jagatsukh, H.P. India (2193 m, altitude; 32°12'0 N and 77°12'0E). The plants at the time of harvesting were in fully developed stage of 3 years old and were harvested in the month of October, washed with tap water and segregated into shoots, stolons, and roots. All samples were stored at -80 °C for use in future experiments (for example, in qRT-PCR and highperformance liquid chromatography (HPLC)).

RNA extraction and cDNA synthesis
Total RNA was isolated from P. kurroa tissues by using Takara total RNA isolation kit (Nucleospin RNA Plant) by following the manufacturer's instructions. The isolated RNA was stored at -80 °C and the quality of RNA was checked by 1% (w/v) ethidium bromide-stained agarose gel and visualized by observing UV fluorescence through the gel documentation system (Biorad). RNA was quantified in a Nano Drop Multiskan GO (Thermo Scientific) by measuring absorbance at 260 nm and 260/280 nm ratio. The synthesis of cDNA was carried out with 1 µg RNA using a Takara cDNA synthesis kit according to the manufacturer's was performed in triplicates and average of corresponding values was considered as final score. All 8 ligands were screened simultaneously with six proteins separately. The grid was generated by looking into the active binding sites of template proteins as indicated by I-TASSER tool. Different conformations of Protein-ligand binding complexes were generated. Molecules with lowest binding energy (highest binding affinity) and comparatively predominant interactions were chosen for further analysis. Bond interactions within protein-ligand complexes were visualized using LigPlot.

In silico sorting of Picroside-II specific Acyltransferases in P. kurroa
The eggNOG function annotated RNA-seq files pertaining to five transcriptomes under study were thoroughly searched for the general term 'acyltransferase'. A sum-total of 815 AT FASTA (FAST All) sequences corresponding to unique identifiers were retrieved (Figs. 1 and 2; Table 2). BLAST alignment comparison taking pairs of individual transcriptome samples as query and subject, resulted in the identification of transcripts which were unique to stolons (Stolon uniseq) when compared with RNAseq samples corresponding to PKS15 (58 unique ATs), PKS25 (63 unique ATs), and PKSS (67 unique ATs) (Fig. 3). Similarly BLAST alignment comparison taking pairs of individual transcriptome samples as query and subject, resulted in the identification of transcripts which were unique to PKSR (Root uniseq) when compared with RNAseq samples corresponding to PKS15 (69 unique ATs), PKS25 (70 unique ATs), and PKSS (79 unique ATs) (Fig. 3).Additionally, 123 AT sequences were found to be common between root and stolon samples (Stolon & Root: comseq, Figs. 2 and 3). An additional layer of in silico reductive screening was used by pair-wise cross-comparisons between unique and common AT sequences retrieved from root and stolon AT sequences (Fig. 3). Thus, a total of 177 AT sequences specific to P-II were obtained from these comparative analyses. AT redundancy was removed by searching for duplicate sequence identifiers occurring more than one time. Manual shortselection of ATs were carried out to remove transcripts with insignificant TPM values (TPM values of transcripts > 0.5 were considered as significantly expressed and were proceeded for further qRT-PCR analyses). A final list of 22 ATs with corresponding BLASTX identities is tabulated in Suppl. File 1B.
internal control and relative expression was calculated using the 2 −∆∆Ct method [13].

Quantification of picrosides through HPLC analysis
P. kurroa accessions PKR84 and PKR31 were collected and stored at -80 °C and were further evaluated for their organspecific P-II content using reverse phase HPLC. Representative HPLC chromatograms showing P-II from root samples of PKR84 and PKR31 are given alongside a chromatogram of P-II standard obtained from Sigma (G0174). P-II retention times in the HPLC runs ranged at 10.5 ± 0.5 min. Briefly, P. kurroa root samples were ground to fine powder using a mortar and pestle in liquid nitrogen. Powdered samples (100.0 mg) were extracted in 10.0 mL 80% HPLC grade aqueous methanol. The samples were then vortexed to mix properly and extracted overnight at room temperature (~ 25℃). Samples were centrifuged at 10,000 rpm for 15 min. and the supernatant was filtered through 0.22 μm filter (Millipore) and used for HPLC analysis [14]. A reverse phase HPLC system consisting of Sunfire C18 µm (Waters, 4.6 × 250 mm, 5 μm) and photodiode array detector (PDA; Waters 2998) was used for detection of P-II at 280 nm. The isocratic mode consisted of mobile phase A 0.03% trifluoroacetic acid in HPLC grade water (Molecular grade water Aldo from Merck) and mobile phase B acetonitrile/methanol (Merck, USA) in 1:1 ratio. A flow rate of 1.0 mL min − 1 was maintained throughout the run. Sample volume of 20 µL was injected and HPLC run was conducted for 30 min at 30 °C. The extraction and quantification experiments were performed in triplicates [15].

Molecular modeling and Docking
The nucleotide sequences (partial transcripts) were translated using Transeq (https://www.ebi.ac.uk/Tools/st/ emboss_transeq/) and further used as input files for Pfam (http://pfam.xfam.org/) to identify domain and family of protein sequence. Three dimensional (3-D) structures of these sequences/ domains were obtained by ab initio modelling using I-Tasser (https://zhanglab.dcmb.med.umich.edu/ I-TASSER/). Models were chosen based on their Z-score, Qmean and C-value. After protein structure modeling, energy minimization was done using Swiss PDB Viewer and structure validation was performed on Ramachandran plot server (https://zlab.umassmed.edu/bu/rama/). The six protein structures and 8 ligands used in this study were prepared for docking analysis using AutoDock Tools (http:// autodock.scripps.edu/resources/adt). Molecular Docking The partial AT transcripts were optimized for single band amplification with both genomic DNA and cDNA as templates, Hi-SYBr master mix (with Taq Polymerase) (Himedia) was used in these PCR amplification reactions. Eight AT partial transcripts failed to amplify under all tested conditions, which is why qRT-PCR studies were carried out with 14 partial AT transcripts. NODE_48592/618 bp designated as AT_PII-10 exhibited a 6.22 ± 0.03 fold higher expression in PKSR compared to the PKSTS and PKSS samples (Fig. 4). NODE_37534/628 bp designated as AT_PII-1 exhibited 15.34 ± 0.05 fold higher expression in roots in comparison to stolon and leaf samples (Fig. 4). NODE_59938/512 bp named AT_PII-5 in Fig. 4 was found to have 3.23 ± 0.4 fold higher gene expression in root samples in comparison to stolon and leaf samples, respectively. NODE_34024/833 bp designated as AT_PII-6 showed similar relative expression profiles in root and stolon samples (Fig. 4). NODE_16506/1324 bp designated as AT_PII-7 revealed 8.20 ± 0.32 fold higher expression in PKSR compared to PKSTS and PKSS (Fig. 4). NODE_3996/2193 bp or AT_PII-8 exhibited 6.03 ± 0.12 fold higher expression relative to stolon and leaf samples, respectively (Fig. 4). NODE_11774/1549 bp and NODE_107081/320 bp designated as AT_PII-4 and AT_PII-14, respectively, exhibited insignificant expression in PKSR and PKSTS (Fig. 4). Further, NODE_93830/351 bp designated as AT_PII-13 displayed zero relative expression in PKSTS (Fig. 4). Finally, six ATs were selected based on high expression in roots or stolons and BLASTN and BLASTX searches against viridiplantae these six ATs have been represented in Suppl. File 1. P. kurroa accessions PKR84 (2.68% fresh weight P-II) and PKR31 (0.03% fresh weight P-II) exhibited contrasting variation of around 89 fold in P-II content in roots (Suppl. Figure 1). qRT-PCR analyses of the 6 selected transcripts from Fig. 4 with root cDNA of PKR84 and PKR31 exhibited significantly higher relative expression of five transcripts in PKR84 in comparison to PKR31 (Fig. 5). AT_PII-1 exhibited 0.78 ± 0.18 fold higher expression in PKR84 in comparison to PKR31. AT_PII-5 exhibited 1.92 ± 0.30 fold higher expression in PKR84 in comparison to PKR31. AT_PII-6 exhibited 1.63 ± 0.08 fold higher expression in PKR84 in comparison to PKR31. AT_PII-7 exhibited 5.05 ± 0.02 fold higher expression in PKR84 in comparison to PKR31. AT_PII-8 exhibited 2.49 ± 0.09 fold higher expressions in PKR84 in comparison to PKR31. AT_PII-10 exhibited 2.21 ± 0.33 fold higher expression in PKR84 in comparison to PKR31 (Fig. 5). analyses might provide more adequate quantifications of pair-wise binding energies. Node_48592/618 bp (AT_PII-10, MBOAT) exhibited a highest comparative binding affinity for vanilloyl-CoA, catalpol, and P-II (Suppl. File 2, Figs. 6 and 7). Evidently, binding affinity of vanillic acid was the lowest, suggesting absence of binding without acyl CoA activation (Suppl. File 2, Figs. 6 and 7). Further, feruloyl catalpol exhibited similar binding energy values to P-II but the binding affinity of vanilloyl-CoA was comparatively slightly higher than the end product iridoids, suggesting probable true binding; if we consider the fact that the activated acyl carrier vanilloyl-CoA binds true to the protein with Node_48592/618 bp (AT_PII-10), while the end products docked to the protein pocket represent an instantaneous time when the bound products might be on the verge of escaping from the binding pockets after acylation (Suppl. File 2, Figs. 6 and 7). It remains to be verified whether this protein is a vacuolar membrane bound protein as was predicted from root gene co-expression sub-network analysis [1], that acylation occurs in vacuoles inside the roots.

Discussion
The current study has utilized RNA-seq data from P. kurroa plants grown at an experimental farm located at around 2193 m altitude in Jagatsukh, Himachal Pradesh, India. A single AT could be sorted out after applying multi-step screening protocols which include in silico sorting of unique and common ATs from five transcriptomes exhibiting contrasting P-II contents; TPM-wise screening and removal of redundancy in transcripts, and screening by higher relative gene expression profiles of fourteen ATs in shoot, root and stolons of field-grown P. kurroa. ATs displaying significantly higher relative gene expression were manually picked up and transferred to prediction analysis by molecular protein-ligand docking after ab initio protein structure modelling of the respective proteins. Vanilloyl-CoA exhibited highest binding preference to a partial AT transcript with the identifier NODE_48592/618 bp. Strikingly, even P-II independently exhibited a close binding preference to NODE_48592/618 bp similar to vanilloyl-CoA, indicating high probability of this specific transcript to turn up as the P-II specific AT on the wet-lab bench.
Active fractions and pure compounds from P. Kurroa have been reported to exhibit anti-asthma, anti-inflammation, hepato-protection, immune-stimulation, and free radical scavenging properties (Table 1). These pharmacological properties are mostly attributed to the presence of a group of monoterpenoids bicyclic iridoid glycosides of catalposide class, namely, P-I, P-II, 6-feruloyl catalpol, and pikuroside [10,16]. Among these compounds, P-II and pikuroside are

Protein structure modeling and molecular docking to verify specificity of acyltransferases to final structural decoration of Picroside-II
The shortest AT partial transcript spanned 512 bp (AT_ PII-5) while the longest transcript was 2.193 kb in length (Suppl. File 1). Six translated coding frames were generated for each transcript using Transeq tool. Inclusion of conserved AT domain/s in the transcripts in the correct frame were confirmed by conserved domain database searches via Pfam (http://pfam.xfam.org/) (Suppl. File 2). Ab initio modeling of the translated amino acids was carried out using I-Tasser. It is imperative to mention here that since complete complementarity determining sequences (CDSs) was possibly not available in any of the partial AT transcripts, the downstream analyses after protein structure modeling will always have some degree of added inaccuracy, although since correct codon frames were carefully chosen before initiation of the ab initio modeling, the final 3-D protein structures will essentially retain the preferred binding pockets. Thus, the proteins with the selected partial AT transcript identifiers NODE_37534/628 bp, NODE_59938/512 bp, NODE_34024/833 bp, NODE_16506/1324 bp, NODE_3996/2193 bp, and NODE_48592/618 bp were individually docked with the supposed active acyl donors cinnamoyl-CoA, feruloyl-CoA, and vanilloyl-CoA one at a time to compare 3-D best-fit conformation as well as highest binding affinity (Suppl. File 2, Figs. 6 and 7). Additionally, the probable acyl acceptor catalpol was docked to each protein separately (Suppl. File 2, Fig. 7). Further, the end products P-II, and feruloyl catalpol as well as the inactive acyl donor's cinnamic acid and vanillic acid have been docked separately (Suppl. File 2, Fig. 7). We argued that docking both the active acyl CoA thiol ester donor and acyl acceptor (catalpol) onto the protein might increase inaccuracy levels of assumption; therefore, separate binding  [19]. Several BAHD ATs ranging from medium-chain-length aliphatic and benzyl-derived alcohols as substrates and acetyl CoA to other acyl CoAs as acyl donors have been biochemically and genetically characterized [20]. Two alcohol ATs VpAAT1 from Vasconcellea pubescens and CpAAT1 from Carica papaya that accept the substrates benzyl acetate and ethyl butanoate/ methyl butanoate, respectively, could be distinguished for their substrate size specificity based on the size and shape of the solvent channels in CpAAT1 and VpAAT1 proteins [21]. Around 110 diverse BAHD family enzymes have been reported of having been biochemically characterized and include acyl-CoA dependent acylation of substrates ranging from aliphatic/ aromatic aldehydes/ alcohols to terpenoids, alkaloids and phenolics [22]. Depending upon specific CoA thioester acyl donors or specific substrates, the BAHDs have been classified till date into eight distinct clades [23][24][25][26].
Antimicrobial triterpenoid saponin group of avenacins found in oats (Avena spp) have been reported to undergo O-Glc ester mediated acylation of either benzoate or N-methyl anthranilate moieties at C-21 of avenacins via SCPL ATs [27]. The authors also noted that the SCPL AT was found in a BAC contigs spanning four genes within the avenacin gene cluster which included a glucosyl transferase and a N-methyltransferase responsible for the respective glucosylation and methylation of the avenacins [27]. Thus, gene machinery for biosynthesis and modification of cyclic compounds like iridoids might exist in adjacent gene clusters or in other words might possibly be co-expressed. Acylation can impact solubilisation, stabilization, vacuolar uptake and color modulation of phenylpropanoids [28] as well as regulate the transport and storage of molecules [29][30][31] reported that ABC-type and proton gradient-driven transporters are most probably involved in vacuolar import of abscisic acid-glucose esters. Accordingly, a possibility crops up that activity tagging by glucose esterification of cyclic terpenoids like the iridoids might aid their transport into vacuoles where they might be acylated by SCPL or BAHD ATs for future storage and transport. Thus, these reports serve as vital clues to investigate gene co-expression based networks in future for sorting out ATs required for P-II found mainly in roots and rhizomes and structurally both of these compounds are constituted of two moieties, a catalpol and a vanillic acid moiety bonded in a ester linkage at C-6 of the backbone catalpol moiety. Pikuroside differs from P-II in having a 3, 10-acetal linkage in the catalpol backbone. Apart from P. Kurroa, similar iridoids with rigid three ring skeletons have been identified in Catalpa bignonioides and Rehmannia glutinosa [16]. Following the formation of geraniol from geranyl pyrophosphate or farnesyl pyrophosphate, epi-iridodial is cyclized to form epi-iridotrial, which involves the formation of an aldehyde group at C-4, glucosylation at C-1 and concurrent oxidation of the aldehyde to acid at C-4 to form 8-epiloganic acid, and the conversion to mussaenosidic acid. Dehydration of mussaenosidic acid yields deoxygeniposidic acid, which is then hydrolyzed at C-10 to yield geniposidic acid. Decarboxylation of geniposidic acid at C-4 produces bartsioside, a penultimate hydroxylation at C-6 produces aucubin, and an epoxidation with the alcohol at C-10 produces catalpol [11,12].
Cinnamic acid pKa = 4.37 with the formula C6H5CH = CHCOOH or hydroxycinnamic acids pKa = 4.65 such as p-coumaric acid and caffeic acid, which are hydroxy derivatives of cinnamic acid, do not seem much suited to act as an acyl donor for P-II which is a vanilloyl catalpol (6-Vanilloyl catalpol). Vanilloyl-CoA (vanillic acid CoA thiol ester), obtained from the 3-methoxy-4-hydroxy derivative of ferulic acid (vanillic acid) supposedly, can act as activated acyl donor/s for acylation of CAT and VA to form P-II. Feruloyl-CoA obtained from the 3-methoxy-4-hydroxy side chain unsaturated derivative of cinnamic acid (ferulic acid) will require an additional hydratase/ lyase/ dehydrogenase to form the VA moiety of P-II. According to an earlier report, feeding of radio-labelled ferulic acid and VA in Vanilla planifolia resulted in the bio-conversion of ferulic acid to vanillin by CoA-dependent β-oxidative cleavagemediated pathway [17]. A modification of ferulic acid to feruloyl-CoA, and further oxidation to vanilloyl-CoA followed by a reduction step yielded vanillin [17,18]. Thus, vanilloyl-CoA as a reported active acyl donor was utilized in our protein docking studies to investigate ATs capable of acylating catalpol. Further, as an evidence for the existence of vanilloyl-CoA, earlier reports have confirmed the presence of vanillic acid in roots and rhizomes of P. kurroa and a table (supplementary spreadsheet) representing compounds identified in P. kurroa roots, rhizomes and stems have been provided in Sharma et al. 2021 [1].
ATs aid natural product scaffold modification by acylating oxygen or nitrogen atoms to generate esters or amides. Among the two largely known AT families, the BAHD ATs accept CoA thioesters as a source of the acyl group, while the serine carboxypeptidase-like (SCPL) ATs frequently utilize the 1-O -β -ester of glucose (1-O -β -acyl acetals) as or cinnamic or vanillic acid as independent substrates must be substantially less.
In a published report concerning comparative acetylation of 19-hydroxytabersonine derivatives of monoterpenoid indole alkaloids like minovincinine, 19-hydroxytabersonine, and horhammericine or other structurally almost similar 16-hydroxytabersonine derivatives of monoterpenoid indole alkaloids like deacetylvindoline from C. roseus, 19-O-acetyltransferase (TAT) was unable to acetylate the 16-hydroxytabersonine derivatives or any compounds outside the monoterpenoid indole alkaloid family, thus corroborating to the fact that all the three major participants in an acyl group transfer/ esterification reaction; the AT, the acyl donor and the acyl acceptor are highly structurespecific [22]. Thus, acylation predictions based on in silico docking studies require assessment of individual binding affinity scores for the active acyl donor with the AT (a), the acyl acceptor with the AT (b), the acyl donor and acceptor with the AT (c) and the formed final product with the AT (d).
We assumed that an equivalently solid approach would be to dock (a) and (d). In a conceived scenario of an instance where acylation has been achieved, the final product will have a close binding with the AT before being released from the binding protein pocket.
Activated acyl carrier formation in case of SCPL ATs involves the action of UDP-glucosyltransferases before SCPLs can function. We assumed that the active acyl donor vanilloyl CoA would be a better fit to mine for P-II-specific ATs. Our lab recently worked out 'no-direction' terpenoidglycoside gene co-expression sub-networks in P. kurroa and hypothesized that the biosynthesis and accumulation of isoprene units stemmed from negative regulation of abscisic acid (ABA) or ABA degradation, with cautious inclusion of downstream terpenoid glycosides/ iridoids in the hypothesis, since their up-accumulation might not be concurrent to ABA degradation and is inevitably subject to further functional scrutiny [1]. Generation of P-II accumulation dependent organ-wise unique and common gene co-expression networks involving ATs in P. kurroa might bring in novel information supposedly missing from the current analysis. Our basic assumption in using this strategy would be that the required AT for final acylation of catalpol and vanillic acid to form P-II might appear preferentially in either the terpenoid or the phenylpropanoid gene co-expression subnetworks. Moreover an AT gene co-expression network might provide an additional clue to P-II biosynthesis via gene-gene co-expressed interactions. The major drawbacks in this assumption are that firstly, P-II is an iridoid glycoside biosynthesized in P. kurroa to meet the requirements of P. kurroa, therefore there are ample possibilities that the candidate AT will not show up in any of the co-expressed gene sub-networks in as much as of the 'need to express' criteria biosynthesis. Furthermore, engineering ATs in plants might lead to the biosynthesis of substrate-dependent acetates in plants inducing characteristic aromas such as in transgenic lisianthus plants, which are ornamental plants with beautiful but scentless flowers [32]. Moving even further, [33] had preconceived that it is highly likely that specific ATs might determine alkaloid patterns in particular plant species. Similar activities of alkaloid ATs were detected in Lupinus species and accessions largely varying in alkaloid contents and were interpreted accordingly that the allied gene expression might not affect concentrations of alkaloids in cells. Thus, specific ATs might determine presence of iridoids such as picrosides in plants like P. kurroa, C. bignonioides and in R. glutinosa.
Normally, isotope studies and other chromatographyoriented procedures were used in the past decades to find out the nature of CoA thiol esters required for a particular scaffold generation in a secondary metabolite such as a monoterpenoid of interest. By far, p-coumaroyl-CoA, cinnamoyl-CoA, benzoyl-CoA, feruloyl-CoA, hydroxy cinnamoyl-CoA, 3-hydroxybenzoyl-CoA, 3-hydroxy-3-phenylpropanoyl-CoA, 3-oxo-3-phenylpropanoyl-CoA, and anthraniloyl-CoA are the known active acyl CoA donors in plants that have been reported to mediate acyl transfer reactions upon catalysis by specific ATs [34]. Kinetic study of a ion-exchange and gel-filtration chromatography purified vegetal acyclic monoterpenol AT in Palmarosa aroma grass have been reported till date [35]. Recently, regulation of santalol sesquiterpene biosynthesis in sandalwood (Santalum album) were shown to be co-dependent on the elicitor methyl jasmonate and up-accumulation of two mevalonate pathway genes which include an acetyl-CoA C-acetyl transferase (AACT) and a 3-hydroxy-3methyglutary-CoA synthase (HMGS) [36]. An alcohol AT (PpAAT1), exhibiting the biosynthesis of γ-decalactone was identified by a combined protein molecular docking and site-directed mutagenesis approach [37]. In this study, the authors docked acetyl-CoA and the four alcohols propanol, hexanol, benzyl alcohol, and decanol on to the active centre of PpAAT1 to identify amino acids participating in enzyme-substrate recognition in the esterification reaction. During our docking studies to select an AT from the qRT-PCR shortlisted ATs, we assumed that an AT committing to an esterification reaction between a phenylpropanoid or phenolic acid such as cinnamic acid or vanillic acid and the catalpol moiety, will exhibit a higher binding affinity to cinnamoyl-CoA or vanilloyl-CoA, and a slightly lower binding affinity to 6-vanilloyl catalpol (P-II) or feruloyl catalpol which will facilitate their release from the ATs after the completion of the esterification reaction. Another natural assumption to shortlist these ATs was that the binding preferences of either catalpol

Conclusions
Bioactive fractions obtained from dried roots and rhizomes of P. kurroa are rich in the hepato-protectant iridoids P-II, P-I, and kutkoside. Acylation of catalpol with vanillic acid forms P-II. In need to decipher the P-II specific AT, de novo RNA-seq data from five P. kurroa transcriptomes, significantly varying in P-II content have been sorted to pull out every transcript annotated as an AT. With regard to their possible functional aspect, these ATs have been subjected to pair-wise cross-comparisons with one transcriptome data being the main feed while the other data was used as a query. All possible cross-comparisons of unique and common ATs from root and stolon could be converted into a list of 177 P-II specific putative ATs. A second round of screening of these ATs based on redundancy removal and observation of significant TPM values resulted in further shortlisting of ATs to just 22. An efficient relative gene expression oriented sorting of these ATs further brought down the number of putative ATs to six. In the end, experimentation of protein structure modeling and protein-ligand docking of these putative 6 ATs favoured a single AT possessing MBOAT domain. Thus, AT PII-10 has been identified as a putative acyltransferase in Picrorhiza kurroa that catalyses the conversion of vanillic acid to Picroside-II in the presence of catalpol. RNAi and CRISPR-CAS9 mediated silencing of these putative ATs adjunct to biochemical experiments with the cloned and heterologously expressed putative AT protein will have the final say on the observed in silico binding affinity and binding preferences of the putatively sorted and identified AT transcripts predicted to be specific to P-II biosynthesis.

Acknowledgements
The authors thank the Department of Biotechnology, Ministry of Science and Technology, Government of India for providing financial support to RSC in the form of Programme Support on high altitude medicinal herbs. SS thank Bennett University for providing PhD research fellowship.
Contributions RSC conceived and designed the central flow and objectives to be investigated in the current research. AK, SS, and AS performed more than 90% of the crucial experiments (in silico and wet lab) described in this study and compiled the results section. NN was involved during pursual of primary initiative work for this publica-