Plant Science Today

In the present study, we report identification and characterization of the plant-specific WUSCHELrelated homeobox (WOX) gene family in Apostasia shenzhenica, a primeval orchid. WOX proteins are DNA-binding WUSCHEL-related homeobox (WOX) encoding transcription factors that play critical role in zygote patterning, embryo development, organogenesis, florigenesis, stress responses etc. Ten putative AsWOX genes were predicted in the A. shenzhenica genome and were characterized by the presence of DNA-binding helix-loop-helix-turn-helix motif. AsWOX proteins were grouped into three clades, ancient, intermediate and WUS on the basis of sequence homology with Arabidopsis thaliana (AtWOX), Oryza sativa (OsWOX), Phalaenopsis equestris (PeWOX) and Dendrobium catenatum (DcWOX) and their phylogenetic relationship was established. Gene structure analysis revealed that three AsWOX genes had two introns, six genes had a single intron, and one gene was intron-less. Expression profiling in a variety of tissue (tubers, seeds and pollens) was analysed in light of the presence of specific cisregulatory elements in the promoter region and their role in various developmental processes was discussed. Three dimensional structures were predicted for three selected AsWOX proteins representing the three clades. The present study provides insights to the role of AsWOX gene family in various vital developmental processes, establishes phylogenetic relationships with related plant species and provides a platform for functional validation of specific AsWOX genes.


Introduction
The WUSCHEL-related homeobox (WOX) gene family is involved in plant embryonic patterning, stem cell maintenance, organogenesis, florigenesis, somatic embryogenesis and stress responses (1)(2)(3). These genes encode plant specific DNA-binding homeobox transcription factors which are characterized by 60-66 amino acid (aa) residues long homeobox domain with embedded DNA-binding helix-loop-helix-turn-helix motif (4,5). Based on sequence homology, WOX proteins can be classified into three clades, Ancient, Intermediate and WUS (1). The Ancient clade evolved earlier and can be found from algae to angiosperms. The Intermediate clade emerged after the origin of pteridophytes and is absent in algae and bryophytes. The WUS clade, on the other hand, is found only in angiosperms, indicating that it is the most advanced clade (1).
In post-genomics era, genome-wide characterizations of gene families are prevalent in crop plants, however, such studies are scarce in orchids as only a few orchid genomes have been sequenced so far (24)(25)(26). For the present work, A. shenzhenica, a primeval terrestrial orchid, was selected because of its evolutionarily significance as it represents the most primitive subfamily, Apostasioideae which has strong divergence from Orchidaceae. This taxa is characterized by actinomorphic flowers, indistinct labellum, rudimentary gynostemium, absence of pollinia and non-resupinating ovary which are contrasting to the general characteristics of orchids (27). Thus, it is important to study the plant in order to evaluate its evolutionary status and relationship. In this report, a genome-wide analysis was done to study physicochemical characterization, structure prediction, phylogenetic relationships, analysis of cis-regulatory elements and spatio-temporal expression profiling of AsWOX gene family. The study would pave way for functional characterization of WOX genes and provide insights for their potential role in growth and development of orchids.

Phylogenetic analysis
Full length protein sequences (AsWOX, PeWOX, DcWOX, AtWOX and OsWOX) were initially aligned with MUSCLE program and the phylogenetic tree was then constructed using MEGA7 tool (32) by maximumlikelihood method at bootstrap value of 1000.

Physico chemical characterization
The AsWOX sequences were analysed using the Expasy-ProtParam server (33) to calculate the physico chemical properties such as molecular weight, 166 RAMKUMAR ET AL.   (37), TargetP-2.0 (38) and Plant-mPLoc (39) were used to predict the sub-cellular localization.

Gene structure and cis-regulatory elements analyses
For each AsWOX protein sequence, coding sequence (CDS) and gene sequence were retrieved from NCBI database. The gene structure with exon-intron display was drawn using Gene Structure Display Server 2.0 (40). The promoter regions were retrieved from 1.5 kb upstream sequences of the genes, from NCBI database and the presence of cis-regulatory elements was confirmed using PLACE server (41). Promoter elements were further analysed to identify common and specific promoter elements using Venn Diagram tool.

Gene duplication events and ortholog prediction
The sequence similarity index among AsWOX CDS sequences were obtained using MUSCLE tool (42), and the genes sharing ≥ 80% identity were considered duplicated. To predict the ortholog proteins, a local NCBI BLASTp search was performed with each candidate AsWOX protein sequences querying independently against the WOX protein sequences of target species i.e. A. thaliana (AtWOX), O. sativa (OsWOX), P. equestris (PeWOX) and D. catenatum (DcWOX), and the orthologs were identified.

Expression analysis
The AsWOX CDS sequences were used for BLASTn search against the high throughput RNA-seq data for various developmental stages like tuber (SRX2938654), seed (SRX2938653) and pollen (SRX2938652) of A. shenzhenica available from NCBI database (26). The hits were counted and the RPKM values (Reads per Kilobase per Million) were calculated using the formula RPKM = (C x 10 9 ) / (N x L), in which C represents number of hits for the candidate gene, N represents total mapped reads in the concerned RNAseq experiment and L represents the length of gene in base-pairs (43). The heat maps to visualise the differential expression of were generated using Hierarchical Clustering Explorer 3.5 (44).

Molecular modelling
The secondary structure of AsWOX proteins was analysed using the online tool SOPMA secondary structure prediction (45), for the presence of alpha helices, random coils, beta turns and extended strands. For prediction of tertiary structure, I-Tasser, a molecular modelling tool (46) was used by simulating with top 10 closely related homologous templates in PDB (Protein Data Bank) with the help of BS-scores, TM-scores, IDEN coverage. The DNA-bindng site prediction was based on identification of analogs with similar binding sites with BS-score value of > 0.5.

Identification and characterization of WOX gene family proteins
WUS gene was the first WOX gene family member identified in Arabidopsis thaliana (AtWUS) and characterized to be involved in meristem maintenance in shoot and floral apices (6, 47) and later this gene was shown to regulate floral patterning as well (48). In general, WOX gene family members regulate zygote and embryonic patterning and development, organogenesis, florigenesis and participate in stress responses (1,3). To identify WOX members in Apostasia shenzhenica (AsWOX), extensive BLASTp was carried out and a total of 10 AsWOX protein sequences were identified (Table 1). No splice variants for any of the WOX genes were identified. The size of the WOX gene family in A. shenzhenica (10 genes) was comparable with the   related orchid species like Phalaenopsis equestris (14 genes, including 3 duplicated genes) and Dendrobium catenatum (10 genes) (19). Multiple sequence alignment indicated that all AsWOXs carry DNAbinding helix-loop-helix-turn-helix region (Fig. 1a). Domain analysis showed that all AsWOX sequences consisted of WUSCHEL-related homeobox domain (helix-loop-helix-turn-helix region) (pfam00046) (Fig.  1b). MEME Suite server identified five conserved motifs, with reduction in evolutionary clade advancement (Fig. 1c).
Phylogenetic analysis for AsWOXs was carried out with those of D. catenatum (DcWOXs), P. equestris (PeWOXs), A. thaliana (AtWOXs) and O. sativa (OsWOXs), to establish the evolutionary relationships (Fig. 2). Tight clustering of AsWOX13 with PeWOX13A, PeWOX13B, PeWOX13C, DcWOX13 and AtWOX13 revealed that it is a member of ancient clade. Similarly, rest of the genes clustered in the other two clades, Intermediate and WUS, along with the respective sequences of P. equestris, D. catenatum, O. sativa and A. thaliana (Fig. 2).
The peptide length of AsWOXs varied from 137 aa (AsWOX7) to 349 aa (AsWOX9), with an average length of 228 aa (Table 1) which is in sync with reports of Dendrobium catenatum (19).  (Table 1). No signal peptide or transmembrane helix region was detected in any of the AsWOX protein sequences. These were localised in the nucleus as expected for DNA-binding transcription factors (Table 1) which is in conformity with earlier reports (49).

Gene characterization and duplication events and ortholog prediction
Genomic scaffold regions for each AsWOXs were identified from the NCBI database and listed ( Table  2). The exon-intron gene structure analysis of WOXs showed that three WOX genes (AsWUS, AsWOX9 and AsWOX13) carried two introns, while six genes (AsWOX2A, AsWOX2B, AsWOX3A, AsWOX3B, AsWOX11 and AsWOX12) were mono-intronic genes and one gene (AsWOX7) was intron less (Fig. 3). Multiple sequence alignment-based sequence similarity index among AsWOX CDS sequences indicated that no gene duplication event occurred within WOX gene family in A. shenzhenica genome (Supplementary Table 1) which is in conformity with the results in a related orchid species, D. catenatum having 10 WOX genes (19).
The orthologs for each AsWOX were predicted against the sequences of taxonomically closest 168 RAMKUMAR ET AL.  (Table 2).

Cis-regulatory elements prediction and Expression analysis
Detailed analysis revealed that the 1.5 kb upstream promoter sequences in all the WOX genes carried cisregulatory elements including core promoter elements TATA-box (TATABOX5) and CAAT-box (CAATBOX1), and other elements that direct specific expressions such as, root-specific (ROOTMOTIFTAPOX1, OSE2ROOTNODULE), mesophyll-specific (CACTFTPPCA1), pollen-specific (POLLEN1LELAT52, GTGANTG10), dehydration-esponsive (MYCCONSENSUSAT), light-responsive (IBOXCORE, GT1CONSENSUS), Dof proteins binding domain (DOFCOREZM), WRKY proteins binding W-box (WRKY71OS) and wound activating W-box (WBOXNTERF3). In addition, the AGL15 binding element (CARGCW8GAT) and scaffold or matrix attachment region (1 MARTBOX) were predominant (Supplementary Table 2). This reflects the diverse role of WOX genes in plant development. In A. shenzhenica, RNA-seq data was available for three developmental stages i.e. tuber, seed and pollen. AsWOX13, the ancient clade member, showed high expression in all the three developmental stages (Fig. 4). The presence of pollen-specific (POLLEN1LELAT52, GTGANTG10) promoter elements in AsWOX13 suggests a role in anther development, the expression profile of this gene also confirms the same (Fig. 4). This has also been reported in A. thaliana, where, the expression of AtWOX13 was found to be higher at floral transition stage, in inflorescences, floral buds, gynoecium and relatively weak expression in fruits and leaves indicating towards its role in flower and embryo development (7,51). It is reported that WOX13 from Physcomitrella patens (PpWOX13) is involved in reprogramming of leaf cells and protoplast cells into stem cells (50). In A. thaliana, AtWOX14, a homologous gene of AtWOX13 gene, is involved in gibberellin synthesis, vascular cell differentiation, floral transtition, anther development and lateral root development (7,52,53). The WOX13 genes from Ananas comosus (AcoWOX13), D. catenatum (DcWOX13) and P. equestris (PeWOX13A, PeWOX13B and PeWOX13C) had similar expression pattern (18,19). The AsWOX9 gene (Fig. 2) had maximum expression in seed (Fig. 4). The promoter region of AsWOX9 carried cis-regulatory elements AACACOREOSGLUB1, 2SSEEDPROTBANAPA, DPBFCOREDCDC3, for seed/embryo specific expression ( Table 3). The A. thaliana genes AtWOX9 and AtWOX8 which are phylogenetically closest to AsWOX9, have been reported to play a role in zygote patterning, late embryo development and apical growth (54,55). This suggests that the AsWOX9 gene, with maximum expression in seed could have a role in embryo development. AsWUS showed significant expression in pollen (Fig. 4) which is comparable to the expression profile for WUS gene in P. equestris (19). Earlier reports in A. thaliana, indicated the role of AtWUS protein as a repressor in stem cell regulation and an activator in floral patterning (56). AtWUS also promotes vegetative-to-embryonic transition (57) and involves in somatic embryogenesis (58). Promoters of AsWOX9 and AsWUS carried tuber-specific element (SP8BFIBSP8BIB) and water stress regulated element (MYB2AT). The AsWUS gene promoter carried element (EVENINGAT) for circadian rhythm as well. These observations suggest that AsWUS could possibly play a role in flower development. Another member, AsWOX2B gene (Fig. 2), had seed specific expression (Fig. 4). AsWOX2B can be related to the AtWOX2 of A. thaliana, which expresses downstream to AtWOX8 and was shown to be involved in zygote patterning and apical growth patterning regulation (54,55). AsWOX2B with seed specific expression might be playing a similar role. All the other WOX genes showed weak expression in tuber, pollen and seed. Promoters of genes AsWOX2A, AsWOX3A, AsWOX9 and AsWOX12 carried elements that interacts with AGAMOUS gene target sequence WUSATAg of intron and that are light-regulated (IBOX). A coupling element (CGACGOSAMY3) for the G box element was found in the promoters of genes AsWOX2A, AsWOX7, AsWOX11 and AsWOX13. Promoters of the genes, AsWOX2B, AsWOX3B, AsWOX9 and AsWOX13 carried (gibberellic acid) GA-responsive element (GAREAT) (Supplementary Table 3). These results support the involvement of WOX gene family in overall plant development.

Protein structure and Homology modeling
AsWOX13 of ancient clade, AsWOX9 gene of intermediate clade and AsWOX2B gene of WUS clade were selected for protein structure simulation and homology modelling, based on high expression profile. These sequences were dominated with random coils ranging from 52.65% (AsWOX13) to 62.79% (AsWOX2B) and alpha helix regions ranged from 32.24% in AsWOX13 to 20.16% in AsWOX2B (Fig. 5a-c; Table 3). The three-dimensional structure simulation analysis predicted that AsWOX13 and AsWOX9 were nucleic acid binding proteins (Fig. 5d, e, g, h; Table 3), while AsWOX2B was predicted to bind with glycerol ( Fig. 5f, i).

Conclusion
The present study characterizes the WOX gene family in Apostasia shenzhanica. Ten AsWOX genes were identified and grouped into three clades, ancient, intermediate and WUS based on homology modelling with related plants and establishes phylogenetic relationships amongst these. This study opens vistas for functional characterization of AsWOX gene family in various developmental pathways.