Abstract
BACKGROUND:
R2R3-MYB transcription factor (TF) family plays important roles in various biological processes in many plants, especially in the regulation of plant flavonoid accumulation. The fruit of Lonicera caerulea contains abundant anthocyanin.
OBJECTIVE:
The R2R3-MYB TF family was systematically analyzed according to the RNA-seq data, and the R2R3-MYB candidate genes that were involved in anthocyanin biosynthesis in the fruit of Lonicera caerulea were screened.
METHODS:
The R2R3-MYB TFs in Lonicera caerulea were identified, and the physical and chemical properties, protein conserved sequence alignment and motifs of each R2R3-MYB TFs were analyzed using bioinformatics methods. The expression levels of these genes and anthocyanin levels in different tissues and different developmental stages of fruit were determined by RT-qPCR and pH shift method.
RESULTS:
A total of 59 genes encoding R2R3-MYB TFs in Lonicera caerulea were identified and clustered into 20 subgroups (C1 to C20) based on the relationship to AtR2R3-MYBs. Expression profiles showed that the expression of CL6086 and CL552 in fruit were higher than other tissues, and upregulated in the veraison fruit compared to the green ripe fruit. As the expression of the two genes was concurrent with the anthocyanin content, and showed high correlation with anthocyanin biosynthetic structural genes, they were considered as closely related to anthocyanin biosynthesis in the fruit.
CONCLUSION:
The results provide a systematic analysis of LcR2R3-MYBs, and the foundation for further molecular mechanisms research of anthocyanin biosynthesis regulated by R2R3-MYB in the fruit of Lonicera caerulea.
Introduction
Blue honeysuckle (Lonicera caerulea) is a shrub species belongs to Lonicera, Caprifoliaceae. This species widely distributed in the northern region of Russia, China, Japan and North Korea due to its high tolerance to cold [1, 2]. The fruit of blue honeysuckle is economically important berrie with high nutrient contents, including amino acids, vitamins, mineral elements, and antioxidant phenolic compounds [3]. These substances could provide health benefits and prevent against cancer, hypertension, cardio-vascular diseases et al. [4]. In addition, the anthocyanin level in the fruit of blue honeysuckle is higher than the other small berries [5], such as cranberry [6], raspberry [7], blueberry and black currant [8].
Anthocyanin is an important kind of secondary metabolite in plants. Tissues of plants contained anthocyanins exhibit different color, such as red, blue and purple [9]. Anthocyanin can protect the plant against biological and abiotic stresses due to its anti-oxidative property. In recent years, people are increasingly interested in the anthocyanin-rich foods, which is good for health [10]. And the anthocyanin level has become an important parameter for appraising the fruit quality [11]. The biosynthesis of anthocyanin has been widely studied in many plants [12]. Generally, anthocyanins are synthesized via enzymes encoded by early and late steps structural genes (such as Chalcone synthase (CHS), chalcone isomerase (CHI), flavanone 3-hydroxylase (F3H) and flavonoid 3′-hydroxylase (F3’H), dihydroflavonol (DFR), UDP-glucoside: flavonoid glucosyltransferase (UFGT) and anthocyanidin synthase (ANS)) [13–15]. The expression of PAL, CHS, F3H, DFR and ANS genes showed positive correlation to anthocyanin content in rice (Oryza sativa) [16]. Huo et al. found that the expression of PAL, CHS, F3H, ANS gene were upregulated in harf-veraison fruit, compared with green ripe fruit of Lonicera caerulea [17]. In addition, transcription factors including MYB, bHLH and WD40 were reported to regulate anthocyanin biosynthesis through activating or inhibiting these structural genes.
Over the past decade, the function of MYB transcription factors have been studied in numerous plants, such as Arabidopsis, rice, maize (Zea mays), apple (Malus domestica), grape (Vitis vinifera) et al. [18]. They were reported to play roles in the regulation of various biological processes including development (cell differentiation [19], organ morphogenesis [20], chloroplast development [21]), secondary metabolism (phenylpropanoid [22], flavonoids [23], capsaicinoid [24] metabolism), responses to biotic and abiotic stresses (insect [25], salt [26], drought [27], Ultraviolet B stress [28]). The members of MYB family were identified by a highly conserved DNA-binding domain in the N-terminus. According to the numbers of this conserved domain, MYBs were divided into four class, including 1R-MYB (or MYB-related and others), R2R3-MYB, 3R-MYB and 4R-MYB. Among them, R2R3-MYB subfamily with a large amount of members is the most widely studied and show diverse function in plants. They were reported to be involved in the regulation of primary and secondary metabolism, responses to biotic and abiotic stresses, developmental processes, and cell fate and identity [29]. Recently, the role of R2R3-MYB transcription factors in the regulation of anthocyanin biosynthesis has also been characterized [30]. In Arabidopsis, AtMYB4, AtMYB11/PFG1, AtMYB12/PFG1, AtMYB111, AtMYB75/PAP1, AtMYB90/PAP2, AtMYB113, AtMYB114 and AtMYB123/TT2 were found to control anthocyanin and flavonoid biosynthesis in different tissues [31–33]. Moreover, SmMYB1 identified from Solanum melongena was responsible for the anthocyanin biosynthesis in the fruit skin of purple cultivar [34]. PaMYB10 was involved in anthocyanin biosynthesis and the formation of red blushed skin in Prunus armeniaca [35]. PpMYB140 was reported to repress the expression of anthocyanin-related structural genes in Pyrus pyrifolia [36]. However, R2R3-MYB transcription factors in blue honeysuckle have not been identified to date.
In the present study, the R2R3-MYB genes were screened using the reported RNA-seq data of blue honeysuckle. Furthermore, GO categories, physical and chemical properties, multiple sequence alignment, phylogeny and conserved motif of the identified 59 LcR2R3-MYB members were analyzed using bioinformatics software. Finally, two LcR2R3-MYBs were selected as candidate genes of regulating the anthocyanin biosynthesis according to the correlation analysis and expression profiles in different tissues and different developmental stages of fruits. The results of our study will help to better understand the classification, evolutionary relations and functions of R2R3-MYB genes in blue honeysuckle.
Materials and methods
Plant materials
The blue honeysuckle cultivar Berel (L. caerulea subsp. Kamtschatica) grown in a resource nursery of small berry in Harbin, China was chosen for this research. The tissues of lateral root, stem, leaf, bud, and fruit (green ripe, verasion and softening stage) were harvested and immediately frozen in liquid nitrogen and stored in a –80°C refrigerator for further RT-qPCR analysis. Three biological replicates were performed for each sample. Green ripe and verasion fruits used for RNA-seq were collected in different year.
Determination of total anthocyanin content
The total anthocyanin content of root, stem, leaf, bud and fruit (green ripe, verasion and ripe stage) were determined using pH shift method as described previously [37]. About 5 g lyophilized powder of each samples were extracted in 80%methanol. The absorbance of the crude extract was measured at 510 nm and 700 nm using an Epoch microplate reader (Biotek, VT, USA). The anthocyanin contents were calculated according to absorbance value.
RNA-seq and functional annotation
RNA-seq was performed using Illumina pair-end sequencing on the Hiseq TM 2000 platform by BGI Tech (Shenzhen, China) in our previously published work [17]. Two RNA-seq libraries (green ripe and veraison fruits) was constructed using TruSeq technology and sequenced. Extracted mRNA of samples was cleaved into fragmented, and then treated with end repair and poly(A) addition. The first-strand cDNA was synthesized using these fragments with random hexamer primers and the second-strand cDNA was synthesized using DNA polymerase. After adding the sequencing adapters, the suitable fragments were selected for PCR amplification and cDNA library construction. The data was deposited in the NCBI Sequence Read Archive (PRJNA749367).
The clean data (54,629,616 reads of green ripe fruit and 53,070,888 reads of veraison fruit) were obtained from raw data (59,959,904 reads of green ripe fruit and 56,068,218 reads of veraison fruit) by removing adapter sequences and low-quality sequences using Trimmomatic (version 0.22) [38]. The Q20 percentages were both over 97 %and GC percentage was 45.64 %and 46.40%in green ripe and veraison fruit, respectively.
The clean data were de novo assembled using the Trinity assembling program. The contigs were connected to generate the sequence until they could not be extended on either end. The N50 of Unigenes was 1588 and the mean length of Unigenes was 867 nt.
The function of assembled All-unigenes was annotated using BLASTx according to the non-edundant (NR), Clusters of Orthologous Groups (COG), Kyoto Encyclopedia of Genes and Genomes (KEGG), Gene ontology (GO) and Swiss-Prot databases.
Identification of MYB gene family members in the fruit of blue honeysuckle
The HMM profile of MYB domain (Pf00249) was obtained from Pfam database (http://pfam.xfam.org/). The sequences of LcR2R3-MYB identified which shared > 95%matches were considered as redundant transcripts and removed from the LcR2R3-MYB gene family [39]. The R2R3-MYB conserved domain was verified on the BLAST program of NCBI (National Center for Biotechnology Information) database (https://blast.ncbi.nlm.nih.gov/Blast.cgi). Additionally, theoretical isoelectric point (pI), molecular weight (Mw), instability index and grand average of hydropathicity of LcR2R3-MYB proteins were predicted using Protparam tool (https://www.expasy.org/resources/protparam). The secondary structures of these proteins were predicted using SOPMA tool (https://npsa-prabi.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_sopma.html).
Conserved domains and motifs analysis of LcR2R3-MYB proteins
Multiple sequence alignment of LcR2R3-MYBs was performed by Clustal omega program (https://www.ebi.ac.uk/Tools/msa/clustalo/) with default parameters [40]. The conserved domain was created by WebLogo online software (https://weblogo.Threeplusone.com/). Phylogenetic tree of LcR2R3-MYBs was constructed using the neighbor-joining (NJ) method with 1000 bootstrap replicates by MEGA X program. The motif of LcR2R3-MYBs was analyzed by MEME program (https://meme-suite.org/tools/meme) and default parameters were used.
Phylogenetic analysis of LcR2R3-MYBs and AtR2R3-MYBs
The R2R3-MYB family sequences of Arabidopsis thaliana were downloaded from TAIR (http://www.arabidopsis.org/). Multiple sequence alignment and phylogenetic tree of AtR2R3-MYBs and LcR2R3-MYBs were performed by Clustal omega with default parameters and MEGA X program with 1000 bootstrap replicates, respectively.
Isolation of total RNA and RT-qPCR analysis
The total RNA of root, stem, leaf, bud, and fruit (green ripe, verasion and softening stage) were isolated using a RN53-EASYspin plant RNA extraction kit (Aidlab, Beijing, China). About 500 mg of each RNA samples were reversed transcribed to cDNA using TransScript One-Step gDNA Removal and cDNA Synthesis SuperMix (TransGen, Beijing, China). The cDNA was diluted 10 fold and used as temples. RT-qPCR (reverse transcription quantitative real-time polymerase chain reaction) was performed using SYBR Green PCR master mix (Toyobo, Osaka, Japan) on a qTOWER 3.0 real-time PCR system (Analytik Jena, Jena, Germany). The relative expression of target genes was calculated using the 2–ΔΔ Ct method. The β-Actin gene was selected as the internal control. Three biological replicates were performed for each measurement. Differences of the means of gene expression in green ripe, verasion, and ripe fruit were determined using one-way analysis of variance and the Tukey multiple comparison procedure (p < 0.05). Primer sequences were listed in Table 1, and all the primers had a good specificity and efficiency (Figure S1). The correlations of each gene expression were analyzed using Pearson correlation coefficients
Primer sequences of LcR2R3-MYBs and structural genes in anthocyanin biosynthesis used for RT-qPCR
Primer sequences of LcR2R3-MYBs and structural genes in anthocyanin biosynthesis used for RT-qPCR
F, forward primer; R, reverse primer.
Anthocyanin content analysis in different tissues of blue honeysuckle
Anthocyanin is a class of important secondary metabolites in plants. To explore the distribution of anthocyanin in blue honeysuckle, the total anthocyanin contents in root, stem, leaf, bud and fruit (green ripe, verasion and ripe stage) were measured. The result displayed that root, stem, leaf and bud contained little anthocyanin. In contrast, there was more anthocyanin in fruit. The levels of anthocyanin were continuously accumulated during fruit maturation, and reached the maximum amount at the fruit ripe stage (Fig. 1). Thus, fruits are the ideal materials to study the biosynthesis of anthocyanin in blue honeysuckle.

The content of total anthocyanins in different tissues and different developmental stages of blue honeysuckle. Error bars represent the SD of three biological replicates. Different letter means statistically significant differences (p < 0.05).
To identify the R2R3-MYB gene family in blue honeysuckle, the transcriptome database obtained previously was used. The database is composed of 66,610 unigenes extracted from green ripe and veraison period fruits of a blue honeysuckle cultivar Berel. The R2R3-MYB gene family in blue honeysuckle were screened using Pf00249 domain as a query against the transcriptome database using HMMER software (v3.0) with a threshold of 0.001. And the final R2R3-MYB sequences were screened according to the highest alignments score and lowest E value that matched to the Arabidopsis 126 R2R3-MYB proteins. Then the obtained R2R3-MYB sequences were confirmed based on the blast result on NCBI. After removing the repetitive sequences and redundant transcripts, a total of 59 genes were identified as R2R3-MYB in blue honeysuckle.
The physical and chemical properties of the 59 LcR2R3-MYB proteins were analyzed (Table 2). The length of LcR2R3-MYB proteins ranged from 51 amino acids to 567 amino acids. The molecular weight of LcR2R3-MYBs ranged from 5954.83 Da to 61495.21 Da. The predicted isoelectric points (pI) values varied with a minimum of 4.89 (Unigene8494) and maximum of 11.4 (Unigene25226). The predicted instability index revealed that most of the LcR2R3-MYBs were unstable, except for Unigene34370 and Unigene39711. The predicted hydrophobic index indicated that all LcR2R3-MYBs were hydrophilic proteins. The predicted of secondary structure indicated that α-helix and random curl accounted for a large proportion, and relatively little β-angular and extension in secondary structure.
Physicochemical characteristics and secondary structure of LcR2R3-MYB proteins in blue honeysuckle
Physicochemical characteristics and secondary structure of LcR2R3-MYB proteins in blue honeysuckle
MW, molecular weight; pI, theoretical isoelectric point; IC, instability index GRAVY, grand average of hydropathicity; α-h, α-helix; Ex, extension; β-a, β-angular; RC, random curl.
In order to explore the potential functions of the R2R3-MYB transcription factors, GO categories and annotation were then performed. In the GO categories, these genes were classified into cellular component, molecular function and biological process. For the cellular component, the most three overrepresented terms were cell part, organelle and cell. For the molecular function, most of the LcR2R3-MYBs were involved in binding and transcription regulator activity, indicated that most of the LcR2R3-MYBs had DNA binding domain and participated in transcription regulation. In the biological process, many LcR2R3-MYBs were involved in cellular process, biological regulation and regulation of biological process (Fig. 2).

Function classification of LcR2R3-MYBs by GO categories. CC, cellular component. MF, molecular function. BP, biological process.
GO annotation revealed that LcR2R3-MYBs were involved in the regulation of gene expression, response to stimulus (including UV-B, abscisic acid, jasmonic acid, ethylene, salicylic acid, gibberellin, hyperosmotic salinity, water deprivation, fungus, virus, bacterium, cellular phosphate starvation, and sucrose stimulus), anthocyanin-containing compound biosynthetic process, trichome morphogenesis, regulation of secondary cell wall biogenesis, stamen development, photoperiodism and signal transduction process (including abscisic acid mediated signaling pathway, ethylene mediated signaling pathway, jasmonic acid mediated signaling pathway). The result revealed that LcR2R3-MYBs participated in various biological pathway in blue honeysuckle.
Multiple sequence alignments of 59 LcR2R3-MYB proteins were analyzed to estimate the conservation of sequences (Fig. 3A). All the screened LcR2R3-MYB proteins contained the conserved domain [W]-X(3)-[ED]-X(2)-L-X(7)-[G]-X(3)-[W]-X(8)-[R]-X(2)-[KSCRLRW]-X(32)-[W]-X(2)-[IA]-X(5)-[RTDN]-X(2)-KN-X(1)-W (Fig. 3B). The Phylogeny and conserved motifs of LcR2R3-MYBs were implemented suing MEME program. A total of fifteen conserved motifs were identified, ranged from 6 aa to 50 aa (Fig. 4A). Among them, motif1, motif2, motif3 and motif4 were distributed in 54, 50, 45 and 55 LcR2R3-MYBs, respectively. And the 59 LcR2R3-MYBs were divided into 4 groups (A to D) according to the numbers and location of these motifs (Fig. 4B). Motif 5 were located in the N-terminal of 21 LcR2R3-MYBs proteins in group A. And most of the members in group A had a conserved order of motif 5- motif 2- motif 7- motif 4- motif 1- motif 3. The members in group B had an order of motif 1- motif 3- motif 6. Motif 15 was only existed in group C. Almost all members in group D contained motif 13, and located between motif 2 and motif 1.

Multiple sequence alignments and conserved domain of LcR2R3-MYB protein. (A) Multiple sequence alignments of 59 LcR2R3-MYB proteins. The solid black triangle indicates the highly conserved Trp residues in the R2R3-MYB domain. (B) The sequence logos of LcR2R3-MYB conversed DNA-binding domain.

Structure analysis of LcR2R3-MYB proteins. (A) The sequence logos of motif in LcR2R3-MYB proteins. (B) Phylogenetic relationship and conserved motif structure analysis of LcR2R3-MYB proteins.
To investigate the phylogenetic relationship of R2R3-MYB sequences in blue honeysuckle and Arabidopsis, a neighbor-joining phylogenetic tree was constructed using MRGA-X. Previously, the R2R3-MYBs in Arabidopsis were divided into 25 subgroups (S1 to S25) according to the conservation of the DNA-binding domain and motifs in C-terminal [41]. The 59 LcR2R3-MYBs were divided into 20 subgroups (C1 to C20) based on the phylogenetic relationship to Arabidopsis (Fig. 5). The proteins clustered into the same subgroups might have the same function. R2R3-MYBs including AtMYB11/PFG1, AtMYB12/PFG1 and AtMYB111/PFG3 in subgroup 7, AtMYB75/PAP1, AtMYB90/PAP2, AtMYB113 and AtMYB114 in subgroup 6, AtMYB123/TT2 in subgroup 5, and AtMYB4 in subgroup 4 have been reported to be involved in the regulation of structure genes in anthocyanins and flavonoids biosynthesis pathways in Arabidopsis. C15 to C20 subgroups of blue honeysuckle and S4 to S7 subgroups of Arabidopsis were clustered together, and C15 to C20 contained 15 LcR2R3-MYBs (CL552, Unigene14460, CL284.Conting2/4, CL284.Conting3/5, Unigene12084, Unigene8424, CL5223.Contig1, CL5223.Contig2, Unigene16358, Unigene20420, Unigene 36593, Unigene20332, Unigene22789, CL6086, Unigene19500). Thus, the 15 LcR2R3-MYBs were most likely to participate in the regulation of anthocyanin biosynthesis in blue honeysuckle.

Phylogenetic tree of R2R3-MYB proteins in blue honeysuckle and Arabidopsis. S1-S25 were previously reported subgroups of AtR2R3-MYB proteins. C1-C20 were subgroups of LcR2R3-MYB proteins.
Gene expression pattern is closely related to gene function. To explore the potential function of 15 LcR2R3-MYBs in blue honeysuckle, the expression levels of 15 LcR2R3-MYBs in root, stem, leaf, bud and fruit were detected by RT-qPCR. The expression levels of different LcR2R3-MYB genes varied significantly in the five tissues (Fig. 6A). The 15 LcR2R3-MYB genes were clustered into four branches according to their expressions, and the genes in the same branch showed a similar expression pattern. The first branch including Unigene19500, Unigene14460, Unigene16358 and CL284 were highly expressed in root compared with the other tissues. The second branch including Unigene22789, Unigene20420 and Unigene36593 were highly expressed in stem, and these genes might mainly function in stem. The genes including CL6086, CL552, Unigene12084, CL5223.Contig1 and CL5223.Contig2 in the third branch showed high expression in fruits, and these genes might paly important role in fruit. The expression of genes in the fourth branch were higher in stem, leaf, bud and fruit than in root.

Gene expression profiles of 15 LcR2R3-MYBs in different tissues. (A) Heat map of 15 LcR2R3-MYB genes in root, stem, leaf, buds and fruit. (B) The expression of 15 LcR2R3-MYB genes in fruits at green ripe, veraison and ripe stage. The expression of genes in RNA-seq was normalized in the logarithm scale (log(FPKM+1)). Means with the same letter were not significantly different (p < 0.05).
To clarify the LcR2R3-MYB genes involved in fruit coloring, the expression of fruits in different developmental stage (green ripe, veraison and ripe) were also detected. The expression of genes including CL6086, CL552, Unigene20420, Unigene8424 and Unigene12084 were upregulated in veraison than green ripe, and downregulated in ripe than veraison (Fig. 6B). The expression of genes including Unigene20332, Unigene22789, Unigene16358, Unigene14460, CL5223.Contig1, CL5223.Contig2, CL284.Contig2/4 and CL284.Conting3/5 showed no significant difference in green ripe and verasion. Additionally, the expression of Unigene19500 and Unigene36593 were downregulated in veraison compared to green ripe. As the anthocyanin contents were increased during fruit ripening, the differently expressed LcR2R3-MYBs might be related to anthocyanin biosynthesis. Thus, CL6086, CL552 and Unigene12084 were preliminary selected as candidate genes for anthocyanin biosynthesis in the fruit of blue honeysuckle.
To further identify the R2R3-MYBs involved in the regulation of anthocyanin biosynthesis in the fruits of blue honeysuckle, the correlation of LcR2R3-MYBs and structural genes in anthocyanin biosynthesis were analyzed. The expression of the key structural genes, including PAL, CHS, CHI, F3H, DFR and ANS in anthocyanin biosynthesis were detected by RT-qPCR. The expression of these structural genes except F3H gene were significantly upregulated in the veraison fruit, compared with the green ripe fruit, and downregulated or unchanged in the ripe fruit (Fig. 7A). These genes were classified into three groups according to their expressions pattern. All the anthocyanin biosynthetic structural genes except F3H were clustered in group III. And three LcR2R3-MYB genes including CL6086, CL552 and Unigene20420 were also clustered in group III (Fig. 7B). The correlation analysis revealed that the expression of CL6086, CL552, Unigene20420 with anthocyanin biosynthetic structural genes had high correlation coefficients (Fig. 7C). However, the expression of Unigene20420 showed a lower expression in fruit than its in stem and bud. In summary, CL6086 and CL552 were most likely to be involved in anthocyanin biosynthesis in the fruit of blue honeysuckle.

Correlations and clustering analysis of LcR2R3-MYBs and anthocyanin biosynthetic structural genes. (A) The expression of anthocyanin biosynthetic key structural genes in fruits at green ripe, veraison and ripe stage. The expression of genes in RNA-seq was normalized in the logarithm scale (log(FPKM+1)). Means with the same letter were not significantly different (p < 0.05). (B) The hierarchically clustered heat maps of LcR2R3-MYBs and anthocyanin biosynthetic structural genes in fruits at green ripe, veraison and ripe stage. (C) The Pearson’s r correlations of LcR2R3-MYBs and anthocyanin biosynthetic structural genes expression levels.
R2R3-MYB gene family in blue honeysuckle
The R2R3-MYB gene family is a very large family of transcription factors in plants. As the genomes and transcriptomes of many species have been completely sequenced, the R2R3-MYB gene family were comprehensive identified in many plants. And at least 126 members in Arabidopsis thaliana, 88 members in Oryza sativa [42], 51 members in Hordeum vulgare [43], 45 members in Ginkgo biloba [44], 108 members in Vitis vinifera [45], 105 members in Fragaria vesca [46] and 184 members in Pyrus bretschneideri [47] of the R2R3-MYB subfamilies have been detected. The number of genes from a gene family is highly species-dependent. The identification of genes using RNA-seq data is highly depend on the tissue samples used for RNA-seq and this may affect the number of genes identified. However, it also becomes an effective way for the systematical analysis of gene families for the species that lack of genome database, and also for the analysis of gene function in specific tissue [48, 49]. In this work, a total of 59 R2R3-MYB members in blue honeysuckle were identified using the transcriptome data of green ripe and veraison stage fruits. And the function annotation analysis showed that these genes were related to anthocyanin biosynthetic process, trichome, cell wall and stamen developmental processes, and responses to biotic and abiotic stresses. This was consistent with the researches of AtR2R3-MYBs in Arabidopsis [29].
The R2R3-MYB proteins are characterized by the conserved R2R3 domain at the N-terminal end. Each repeat of MYB contains about 52 amino acids and form three α-helices. Three highly conserved tryptophan residues separated by 18 or 19 amino acids formed a helix-turn-helix structure [50]. The result revealed that the amino acid sequences of R2R3-MYB conserved domain in blue honeysuckle were similar to the other plant species, such as Arabidopsis, Camellia sinensis [51], Capsicum annuum [52] and Medicago truncatula [53]. Most of the LcR2R3-MYB proteins had 18 or 19 amino acids between the three tryptophan residues in R2 motif. The R3 motif had two tryptophan residues and spaced by 18 amino acids. The first tryptophan residue was replaced by phenylalanine, leucine and isoleucine. These results indicated that the evolution of R2R3-MYB proteins was relatively conservative in different species. The 59 LcR2R3-MYBs were divided into 4 groups. The MYB members in the same group had similar motif structure. The adjacent genes in the phylogenetic tree were likely evolved from the same origin and shared the same function.
The candidate LcR2R3-MYB genes involved in anthocyanin biosynthesis
Anthocyanin plays an important role in the response of plants to various abiotic stresses such as ultraviolet light, low temperature and drought. In recent years, studies have showed members of R2R3-MYB transcription factors promoted or suppressed the structural genes related to the anthocyanin biosynthesis in leaves, flowers, fruits and roots of plants. The functions of LcR2R3-MYB family genes was preliminarily predicted by aligning with 126 R2R3-MYBs in Arabidopsis thaliana. The 59 LcR2R3-MYBs were divided into 20 subgroups (C1 to C20). In Arabidopsis, members of subgroup S4, S5, S6 and S7 were observed to regulate anthocyanins and flavonoids biosynthesis pathways. AtMYB4 in subgroup S4 was reported to interact with the bHLH transcription factors TT8, GL3 and EGL3, and negatively regulated the flavonoid accumulation by downregulating the expression of Arogenate Dehydratase 6 gene, encoding the enzyme of the final step in the biosynthesis of phenylalanine [54]. AtMYB123 (TT2) in subgroup S5 could form a complex with TT8 and TTG1 and control proanthocyanidin biosynthesis in the seed coat by regulating the expression of BAN and DFR gene [31, 55]. AtMYB75 (PAP1) in subgroup S6 in combination with TT8, EGL3, GL3 and TTG1 suppressed the expression of CHS, DFR, LDOX, and BAN and anthocyanin biosynthesis [56, 57]. ATMYB90 (PAP2), AtMYB113 and AtMYB114 in subgroup S6 downregulated the same late anthocyanin structural genes as PAP1[33]. AtMYB11/PFG1, AtMYB12/PFG1 and AtMYB111/PFG3 in subgroup S7 exhibited functional similarity and affected flavonol accumulation by regulating the expression of CHS, CHI, F3H, FLS gene [58]. Six subgroups C15, C16, C17, C18, C19, C20 contained 15 LcR2R3-MYBs were clustered together with subgroup S4, S5, S6 and S7 of Arabidopsis, and might be related to anthocyanin biosynthesis in blue honeysuckle.
Expression profiles of candidate LcR2R3-MYBs
R2R3-MYB transcription factors control anthocyanin biosynthesis by activating or inhibiting the anthocyanin structural genes [30, 59]. Thus, the expression pattern of LcR2R3-MYBs is closely correlated with anthocyanin level. Our result displayed that the fruit of blue honeysuckle had the most abundant anthocyanin contents than root, stem, leaf and bud. The anthocyanin level was accumulated during fruit ripening and reached a peak at ripen stage. The expression profiles exhibited that the expression of genes in the third branch including CL6086, CL552, Unigene12084, CL5223.Contig1 and CL5223.Contig2 in fruit were higher than the other tissues, and the five genes might play roles in fruit. Among them, the expression of CL6086, CL552 and Unigene12084 was upregulated in the green ripe stage compared to the veraison stage. Moreover, the expression of CL6086 and CL552 showed high corrections with anthocyanin biosynthetic structural genes, and they were clustered in the same group. Thus, CL6086 and CL552 appeared to be most likely to participate the positive regulation of anthocyanin biosynthesis in the fruit of blue honeysuckle according to the expression profiles and correlation analysis. In the GO annotation, CL6086 was homologue to the Arabidopsis MYB90. CL552 was homologue to the Arabidopsis MYB5. MYB5 and MYB90 were reported to positively regulate anthocyanin biosynthesis in plants [60, 61]. This study provides a basic information for studying the molecular regulation of CL6086 and CL552 in anthocyanin biosynthesis in the fruit of blue honeysuckle.
Footnotes
Acknowledgments
The authors thank Northeast Agricultural University and Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (Northeast Region), Ministry of Agriculture and Rural Affairs for their support of platforms. We also thank the editors and reviewers for their comments of this manuscript.
Funding
This research was supported by the National Key Research and Development Project (2016YFC0500304), Natural Science Foundation of Heilongjiang Province, China (LH2020C008), “Young Talents” Project of Northeast Agricultural University, China (19QC05), Application Technology Research and Development Program of Heilongjiang Province, China (GA19B102) and National Key R&D Program of China (2018YFD100200).
Conflict of interest
The authors have no conflicts of interest to report.
