INTRODUCTION
Reproductive efficiency is one of the most valued aspects of animal production. In the particular case of beef and dairy cattle, reproduction-associated traits, especially of females, are included in the selection objective of the most modern breeds and they have received much attention in genetic and genomic studies. Starting with puberty and then cyclically repeated along productive life, the reproductive process in females requires the release of the gonadotropin-releasing hormone (GnRH) from the hypothalamus, which in turn is needed for the secretion of gonadotropins from the anterior pituitary gland (Amstalden and Williams, 2015; Atkins et al., 2013). Research devoted to understanding the mechanisms underlying the onset of puberty and the regulation of subsequent reproductive cycles in different mammal species contributed to building the “KNDy hypothesis” (Smith et al., 2014). According to this hypothesis three peptides, Kisspeptin, Neurokinin B, and Dynorphin A, are co-expressed in the same group of neurons in specific regions of the hypothalamus (Goodman et al., 2007). These peptides would have the role of a “pulse generator” that drives GnRH secretion. Neurokinin B and Dynorphin A have stimulatory and inhibitory effects on Kisspeptin expression, respectively. Moreover, these neurons also express the Neurokinin B receptor and the Dynorphin A receptor, but not the Kisspeptin receptor that is expressed in GnRH neurons (Weems et al., 2018). The regulatory system involving the KNDy neurons would be sensitive to sex steroid feedback and different regulatory cues (e.g. energy balance, lactation, photoperiod, stress) but the specific mechanisms involved are not clear (Lehman et al., 2010). The role of Dynorphin A in the KNDy neurons is consistent with previous evidence showing an inhibitory effect of endogenous opioid peptides on reproduction (Malven, 1986). Suckling increases levels of endogenous opioid peptides in the brain, making the hypothalamus more sensitive to the negative feedback from estrogen and decreasing the production of GnRH (Squires, 2010). Despite the role of Kisspeptin, Neurokinin B and Dynorphin A in the regulation of the reproductive cycle, the corresponding genes (KISS1, TAC3, and PDYN) and their receptors (KISS1R, TACR3, and OPRK1) have not been analyzed as functional or positional candidates in cattle. The only known exception is the Dynorphin receptor (OPRK1), proposed as a candidate gene associated with sexual precocity in a GWAS for sexual precocity in Nellore cattle from Brazil (Irano et al., 2016). From the standpoint of animal breeding, the dissection of genetic variation at the level of the KNDy system could be of help to understand differences in reproductive performance among cattle breeds, and in turn, to improve management strategies. As an example, we have evaluated the reproductive efficiency of Angus and Argentinean Creole females (Corva et al., 1995). The Creole is a local beef breed that has been much less selected than Angus, but it is appreciated due to its rusticity and some other favorable traits such as calving ease. The superiority of Angus over Creole in reproductive performance was noticed starting with age at puberty (Pardo et al., 2018) and other differences between the two breeds were detected in subsequent reproductive cycles (Corva et al., 1995). While body weight change during the breeding season was a key factor defining pregnancy rate in Angus females, lactation and the interaction with the offspring was the most limiting factor in Creole females. The marked difference between European (Bos taurus) and Indicine (Bos indicus) breeds in sexual precocity is another well-known example of genetic variation in reproductive performance (Freetly et al., 2011; Laster et al., 1979) even when were evaluated in similar restrictive environmental conditions (Meirelles et al., 1994; Ferraz Jr. et al., 2018). Taking together the evidences about the inhibitory effect of endogenous opioid peptides on reproduction and the recently proposed role of Dynorphin A in KNDy neurons, we hypothesized that PDYN, the gene that codes for the propeptide Prodynorphin, which is cleaved to produce several opioid peptides including Dynorphin A (Day et al., 1998, Figure 1), could be under selection in cattle. This selection probably resulting in the relaxation of the effects of different cues that convey to the hypothalamus to regulate the onset of reproductive activity. To test this hypothesis, we compared the sequence of the PDYN gene from different cattle breeds, taking advantage of the availability of whole genome sequences.

Figure 1 Structure of the bovine PDYN gene (Top), the corresponding transcript (Genbank sequence: NM_174139.3) (Middle) and the translated propeptide (Bottom). The different functional peptides resulting from proteolytic cleavage by the enzyme Prohormone convertase 2 are also indicated in the figure.
MATERIALS AND METHODS
Retrieval and processing of PDYN genomic sequences
The genomic sequence of the Bovine Reference Genome (ARS-UCD1.2) corresponding to a region of BTA13 spanning the PDYN gene (GenBank accession: NC_037340.1: 53,130,235 - 53,148,819 bp) was downloaded from the NCBI site (http://www.ncbi.nlm. nih.gov). This DNA sequence was used as a reference to retrieve the corresponding genomic sequence of Bos indicus (GenBank accession: NC_032662.1: 53,688,047 - 53,707,634 bp) and also the genomic sequences of a panel of individuals from selected bovine breeds as it is described below. The sequenced genomes of selected animals had already been aligned to the Bovine Reference Genome UMD 3.1.1 (Merchant et al., 2014; Zimin et al., 2009). Sequence alignment confirmed that apart from a change in relative coordinates, the region of interest resulted in identical between UMD 3.1.1 and ARS-UCD1.2, the latest version of the Bovine Reference Genome (wwww.bovinegenome.org). The whole panel of genomic sequences used for the analyses included: The reference Bos taurus and Bos indicus genomes, corresponding to the Hereford and Nellore breeds respectively; one individual representing extinct wild cattle (Auroch, Bos primigenius, Park et al., 2015) and individuals from the “1000 Bull Genomes Project” (http://www.1000bullgenomes.com/) or the U.S. Meat Animal Research Center (USMARC) Beef Cattle Diversity Panel 2.9 (MBCDPv2.9, Heaton et al., 2016). Selected breeds were: Holstein (n=13), Jersey (n=11), Angus (n=10), Limousine (n=6), Corriente (n=4), Longhorn (n=3) and Brahman (n=7). In the first experiment, the average sequencing depth was 8.3X (Daetwyler et al., 2014) while in the experiment of Heaton et al. (2016) it was 14.8X. Aligned genomic sequences (in bam format) were downloaded from the Sequence Read Archive (SRA) division of NCBI (http://www.ncbi.nlm.nih.gov/sra) and the USMARC website (https://www.ars.usda.gov/plainsarea/ clay-center-ne/marc/wgs/bovref/), respectively. A detailed list of these genomic sequences is presented in Table 1. Based on breed selection history and well known productive differences among bovine biotypes (particularly in terms of sexual precocity; Diskin and Kenny, 2014; Sartori et al., 2016) four contrasting groups were defined: highly selected Bos taurus dairy (Holstein, Jersey, n=24) and beef breeds (Angus, Limousine, Hereford, n=17); less selected -hardy- Bos taurus breeds (Corriente, Longhorn, Auroch, n=8) and zebu breeds (Brahman, Nellore, n=8). These groups are hereinafter identified as “Dairy”, “Beef”, “Less selected” and “Zebu”, respectively. Each genomic sequence in bam format was searched for variant sites using BCFtools software v1.5 (Li, 2011) with the command: “bcftools mpileup -f sequence.fasta input.bam | bcftools call -m -Ob | tabix | bcftools query -f ‘%POS\t%QUAL\t%ALT[\t%IUPACGT]\n’ -o output. txt” where “sequence.fasta” was the sequence of the Bovine Reference Genome UMD 3.1.1 used as a reference in the original alignment. Variants with a Phred quality score below 20 (base call accuracy above 99%) were manually discarded and a consensus sequence was obtained for each sample in fasta format using UNIX commands. The average depth of coverage of each sample for the whole genome sequence was retrieved from the original database. On the other hand, the average depth of coverage of each sample for the PDYN gene region was calculated using Mosdepth software v 0.2.9 (Pedersen and Quinlan, 2018) (Table 1).
Haplotype reconstruction
The Prodynorphin cDNA was cloned by Jiang et al. (1997); Genbank accession U58500.1. This sequence agrees with accession NM_174139.3 from the annotation of the last version of the Bovine Reference Genome (ARSUCD1.2), and therefore it was used in the present work as a reference of gene organization. In the comparative sequence analyses, the whole sequence and four functional regions of the mRNA were considered: a 1,000 bp fragment spanning the putative proximal promoter; the 5’ untranslated region (5’UTR, 174 bp); the coding sequence (CDS, 777 bp); and the 3’ untranslated region (3’UTR, 1,588 bp) (Figure 1). The sequences of the breed panel were aligned online with the MAFFT software version 7 (https://mafft.cbrc.jp/alignment/server/) and the resulting alignment was manually edited using the Aliview software version 1.23 (Larsson, 2014). Then, haplotype phases from sequences with heterozygous sites were reconstructed with DnaSP version 6.10.03 (Rozas et al., 2017) using the option fastPHASE (Scheet and Stephens, 2006) with default parameters. Haplotypes were reconstructed for the whole sequence and then trimmed into the functional regions as defined above. The phylogenetic relationships among the haplotypes identified in the putative proximal promoter (1,000 bp) and in the coding sequence (CDS) of the PDYN gene, were inferred through a median-joining network analysis (Bandelt et al., 1999) using the PopART software version 1.7 (Leigh and Bryant, 2015).
Principal component analyses (PCA)
To evaluate population structure and the genetic relationships among the four breed groups (“Dairy”, “Beef”, “Less selected” and “Zebu”), principal component analyses (PCA) were computed on haplotype frequencies of both, the putative proximal promoter region (1,000 bp) and the cDNA (complementary DNA, 2,539 bp), of PDYN using the ‘prcomp’ function of Rstudio v.3.4.4 (R Development Core Team, 2018). PCA results were plotted using the R package ‘ggplot2’ (Wickham, 2016).
Signatures of selection
To detect departures from the expectations of the neutral theory of evolution (Kimura, 1968) the neutrality tests Tajima’s D (Tajima, 1989) and Fu & Li’s F* and D* (Fu and Li, 1993) were conducted. Tajima’s D compares the estimate of DNA sequence variation based on the average pairwise distance between all sequences in the sample (φ), to the estimate of DNA sequence variation based on the observed number of segregating sites and the number of chromosomes in a sample (θ). Under neutrality, the means θ and φ should be approximately equal to each other. Therefore, the expected value of Tajima’s D for populations conforming to a standard neutral model is zero. Significant deviations from zero indicate a skew in the allele frequency distribution relative to neutral expectations. Positive values of Tajima’s D arise from an excess of intermediate frequency alleles and can result from population bottlenecks, structure and/ or balancing selection. Negative values of Tajima’s D indicate an excess of low frequency alleles and can result from population expansions or positive selection (Tajima, 1989; Biswas and Akey, 2006). On the other hand, Fu & Li´s test makes the distinction between old and recent mutations as determined by where they occur on the branches of genealogies. The D* and F* statistics compare an estimate of the population mutation rate based on the number of derived variants seen only once in a sample (referred to as singletons) with φ or θ, respectively. Fu & Li´s F* and D* values are negative when there is an excess of recent mutations and will be taken as evidence against the neutrality of mutations (Fu and Li, 1993; Biswas and Akey, 2006). These tests were conducted in the four defined cattle groups for the whole PDYN gene and also for each functional region defined above, using DnaSP version 6.10.03 (Rozas et al., 2017).
RESULTS
Principal component analyses
The PCA conducted on the haplotype frequencies of the cDNA and the putative promoter of PDYN are presented in Figure 2. It can be seen that the distribution of the breed groups does not agree between both gene regions. In the case of the promoter, PC1 alone explains 79.7% of the variability. Breeds are distributed along this axis, in agreement with the criteria that were originally considered to make the groups (selection history and, particularly, sexual precocity) suggesting a population structure defined not only by breed isolation and genetic drift but probably also by selection. On the contrary, when haplotype frequencies of the cDNA are considered, PC1 and PC2 explain 37.1% and 32.4% of the variation, respectively, and breed distribution do not match grouping criteria.

Figure 2 Bidimensional plot illustrating the results of the Principal Component Analysis performed on haplotype frequencies of the putative proximal promoter (left) and the cDNA (NM_174139.3) (right) of the bovine PDYN gene. The percentages of total variability explained by each PC are included between brackets. Dairy: Highly selected dairy breeds (Bos taurus); Beef: Highly selected beef breeds (Bos taurus); Less selected: Less selected breeds (Bos taurus), included the Auroch (Bos primigenius); Zebu: Zebu breeds (Bos indicus).
Signatures of selection
Departures from neutral theory expectations were analyzed on each of the four defined breed groups (“Dairy”, “Beef”, “Less selected” and “Zebu”) (Table 4). In the “Dairy” cattle group, these tests were consistently negative and resulted to be statistically significant for the Genomic sequence as well as for the putative promoter region (Table 2). These results suggest that this gene and particularly the promoter region could be under positive selection. Moreover, these results pointed to a potential effect on regulatory features of PDYN gene expression in Dairy cattle. The other groups, on the contrary, had much higher nucleotide and haplotype diversity in the promoter and CDS than the Dairy breeds, and none of their neutrality tests were statistically significant (Table 2).
Haplotype description
Table 3 shows the mutations identified in the promoter region of the PDYN gene that had evidence of positive selection according to the neutrality tests, and in the CDS. These mutations define the haplotypes reported in Table 4. The combination of 14 SNPs defined the haplotypes of the promoter region (1,000 bp) (Table 4). In this region, a total of 32 haplotypes were found among the 114 chromosomes (57 samples by 2 chromosomes each) analyzed. Sixteen of them were found in at least two chromosomes and were represented by 98 chromosomes (Table 4). The most abundant haplotype (p-Hap_1, ref; n=56) was found in the Bovine Reference Genome ARSUCD1.2 but it was absent in the “Zebu” cattle group. In the CDS (777 bp), a total of 28 haplotypes defined by 10 SNPs were found among the 114 chromosomes analyzed. Twelve of them were found in at least two chromosomes and were represented by 98 chromosomes (Table 3). In this region, the haplotype found in the Bovine Reference Genome ARS-UCD1.2 (c-Hap_1, ref; n=15) was not the most abundant and was the only one with the “A” allele in the SNP rs42388967 (SNP 3 in the CDS, Table 4). Although “Beef” and “Dairy” groups shared their most frequent haplotype (p-Hap_9) in the promoter that was not the case for the CDS (c-Hap_9 and c-Hap_1 for “Dairy” and “Beef” groups, respectively). The promoter of PDYN had five different haplotypes in dairy breeds but with little variability, given that 34 out of 40 chromosomes shared p-Hap_1. This same haplotype was present in 20 out of 33 chromosomes of selected beef breeds, but the rest of the chromosomes of this group had seven different haplotypes in low frequencies each (Table 4). Moreover, in the group of dairy breeds, 40 out of 44 haplotypes of de CDS share the same six alleles in their 5’ end. This same trend was detected in the promoter, where 37 out of 40 haplotypes share the same seven alleles in their 3’ end. This pattern is not seen in any other breed group (Table 4). Zebu breeds had four haplotypes with a predominance of p-Hap_11 in the promoter region and showed little coincidence in haplotype distribution with the other breed groups (Table 4). Interestingly, differences between breed groups were smoothed when the CDS was considered and even the “Zebu” group showed more coincidence with the other Bos taurus groups. Nevertheless, for the CDS the “Dairy” group showed again the lowest variability (30 out of 44 chromosomes had Hap_9; Table 4). The “Median Joining” network constructed using the haplotypes identified in the promoter region suggested the existence of two clades separated by three mutational steps (Figure 3A). In this network, the haplotypes found in the “Dairy” cattle group could be differentiated from those of the “Zebu” cattle group. Noteworthy, the “Dairy” group exhibited few haplotypes and they formed a star-like configuration with the reference haplotype (p-Hap_1) as central haplotype in the clade (Figure 3A). In contrast, the haplotypes from the other groups were sparsely distributed. On the other hand, the “Median Joining” network constructed using the haplotypes identified in the CDS did not exhibit a clear pattern of variation and contrary to what happened in the promoter region, the haplotypes found in the CDS in the different cattle groups were not separated in this network (Figure 3B). The 28 haplotypes identified in the CDS (Table 4) could be translated into six propeptide sequences, which had no more than four changes compared to the translated reference sequence. The four breed groups coincidentally had the same translated sequence (translated sequence of c-Hap_9) as the most abundant (Table 4). Interestingly, none of the aminoacid substitutions affected the sequence of the functional peptides that are processed from the translated propeptide (Table 3, Figure 1).

Table 3 List of mutations identified in two regions of the PDYN gene (promoter and coding sequence, CDS).

Table 4 List of haplotypes. List of haplotypes defined by mutations identified in the coding sequence (CDS) and in the promoter region of the PDYN. Only haplotypes found in at least two chromosomes are shown.

Figure 3 Median Joining network constructed with the haplotypes corresponding to (A) the putative proximal promoter (1,000 bp) and (B) the coding sequence (CDS) of PDYN gene. The size of each circle is proportional to the corresponding chromosome frequency. Slashes represent the number of mutations separating each haplotype. In each network Hap_1 (box) corresponds to the Bovine Reference Genome ARS-UCD1.2. Dairy: Highly selected dairy breeds (Bos taurus); Beef: Highly selected beef breeds (Bos taurus); Less selected: Less selected breeds (Bos taurus), included the Auroch (Bos primigenius); Zebu: Zebu breeds (Bos indicus).
DISCUSSION
The involvement of Endogenous Opioid Peptides (EOP) as negative regulators of the reproductive cycle has been known for a long time, but their role has not been completely elucidated. In cattle, EOP received attention as factors consistently involved in the inhibitory effects of suckling on reproductive activity (Malven et al., 1986). Frequent suckling or milking increases levels of EOP in the brain, which in turn increases the sensitivity of the hypothalamus to the negative feedback from estrogen (Squires, 2010). This experimental evidence confirms the role of Dynorphin A and EOP in general, as inhibitors of reproductive activity in mammals, and justify the search of selection signatures conducted in the present work to explain the well-known variability in reproductive performance among cattle breeds, that is noticeable since the onset of puberty (Diskin and Kenny, 2014). In the present experiment, we analyzed an annotated sequence of PDYN, which encoded, among others, the endogenous opioid peptide Dynorphin A (Figure 1), from the last version of the Bovine Reference Genome (Genbank accession NM_174139.3). This sequence has a genomic organization that is consistent with the cDNA originally cloned by Jiang et al. (1997); Genbank accession U58500.1). The dataset analyzed here included the only available sample from the Auroch (Bos primigenius, Park et al., 2015) that were grouped with less selected breeds. This sample did not show marked differences with other groups. However, it must be noted that a single sample is not enough to identify signatures of selection. Thus, more genomes from this group should be sequenced to be more conclusive about the effects of domestication (Orlando, 2015). The PCA results presented in Figure 2, showing sample substructure differed depending in the analyzed regions of PDYN (promoter and cDNA) indicates that the observed differences are not only justified by the well-known genetic distance between Bos indicus and Bos taurus breeds, product of sub-speciation, and justify the definition of four separated groups used in the other analyses. The comparison of haplotype frequencies among breed groups for both genomic regions (Table 4, Figure 3) also supports the existence of directional selection on regulatory regions of the gene and not only random genetic drift associated with the breed or subspecies formation in the definition of sequence variation. It could be argued that low genetic variability is the result of reduced effective population size (Ne), a feature common to selected populations in most livestock species (Taylor et al., 2016). In this case, the group of dairy breeds was integrated by two different breeds, Holstein and Jersey that share many selection objectives and also a common production system. Also, this group had the largest sample size. The methods that were used in the present work are appropriate to identify recent selection (Biswas and Akey, 2006). Indeed, the results of the Tajima’s D (Tajima, 1989) and Fu & Li D* and F* (Fu and Li, 1993) exhibited statistically significant negative values in the “Dairy” group (Table 2) which pointed out to a recent selection process in modern, highly specialized breeds. This evidence of recent selection is consistent with the emphasis of intense artificial selection on production traits, particularly those related to reproductive efficiency. A compelling argument in favor of positive selection on a gene such as PDYN comes from the intrinsic characteristics of different livestock production systems. Under extensive conditions, the cow not only nurses but also protects the progeny, and the maternal bond exerts a strong inhibiting effect on reproductive activity (Williams, 1990). In more intensified systems such as those for dairy production the calf is separated from the mother hours after parturition. Moreover, dairy cows are usually expected to get pregnant under a strongly negative energy balance (Butler and Smith, 1989; Beam and Butler, 1999; Vercouteren et al., 2015). Clearly, in these extreme cases, the physiological and environmental cues that have to be processed at the hypothalamic level to regulate reproductive cycles are very different. It could be hypothesized that in females from more intensified production systems, the effects of some inhibitory mechanisms acting on the regulation of reproduction are relaxed, favoring the selection response for reproductive efficiency. The results reported here showed that the “Dairy” group presented fewer haplotypes than the other cattle groups in the promoter region and that these haplotypes exhibit low differences among them (Table 4, Figure 3A). A detailed description on the epigenetic regulation of the expression on KISS1 and TAC3 by implementing histone modifications was recently made by Toro et al. (2018). However, it does not seem to be the case of PDYN gene, the third member of the triad of the “KNDy hypothesis” (Smith et al., 2014). The results presented here warrants further investigation on the mechanisms underlying the PDYN transcriptional regulation. The results presented here suggest that there is selection pressure acting on putative regulatory regions on the proximal promoter of PDYN (Table 2), which seems to be a common feature of selection for quantitative or complex traits. Modifications of the coding sequence are more common in the case of genes underlying mendelian traits, in which the mutation has a strong effect on the phenotype (Boyle et al., 2017). Although aminoacid substitutions on the propeptide were detected in the present analysis, they do not seem to affect the sequence of functional peptides, stressing their biological importance (Table 3, Figure 1). Also, a not clear pattern of differentiation was observed among the haplotypes of the four cattle groups in the coding sequence (Figure 3B). Selection on regulatory regions of PDYN would affect the expression of all the different peptides coded by the gene. Given the physiological role of Dynorphin A, the interest to improve reproductive efficiency in highly specialized cattle breeds provides a very reasonable explanation to justify selection on PDYN. Nevertheless, at this point, the relevance of the other peptides cannot be ruled out. Further research should clarify the role of EOP, particularly Dynorphin A, in the regulation of reproduction both within and between cattle breeds. The usefulness of QTN identification (Quantitative Trait Nucleotides underlying Quantitative Trait Loci) in animal breeding is a matter of debate. Present genomic selection in cattle is mostly based on dense arrays of anonymous markers; however, methods are being developed to include QTN information to improve prediction accuracy (Fragomeni et al., 2017). Besides, there are initiatives to enhance progress in animal selection trough the combination of conventional breeding and gene editing (e.g.: Promotion of Alleles by Genome Editing; Jenko et al., 2015). However, there are doubts about the efficacy of this approach due to the intrinsic polygenic basis of quantitative variation (Simianer, 2018). Independently of potential direct applications of the knowledge about PDYN variation among cattle breeds, the present work contributed with elements for a deeper understanding of the complex interaction among genetic variation, artificial selection, and environmental effects.