Principle Component Analysis of Breeding Values Estimated by Six Animal Models for Evaluating Some Productive and Reproductive Traits of Holstein Dairy Cattle

| This study was conducted to estimate genetic parameters and breeding values (EBVs) for milk yield (MY), peak yield (PY), lactation length (LL), days open (DO), calving interval (CI), and services per conception (SC) of Holstein dairy cattle. The direct genetic, maternal genetic and maternal permanent environmental effects were separately evaluated. Furthermore, the principle components analysis (PCA) was applied to explore the relationships among the animal EBVs. Genetic parameters were estimated using the multi-trait restricted maximum likelihood methodology by incorporating six different models that either included or excluded maternal effects. The best model was selected based on the likelihood ratio test. In this context of the research, a total of 18221 cows were assessed for records between 2007 and 2018. Out of the six animal models, the fourth model was chosen as the best model, because it had the smallest -2 Log Likelihood value. The range of direct heritability values were 0.21-0.35, 0.02-0.30, 0.15-0.33, 0.04-0.18, 0.05-0.18, and 0.05-0.15 for MY, PY, LL, DO, CI, and SC, respectively. The estimated maternal heritabilities were lower than direct heritabilities informed by all models. However, models 4 and 6 showed the greatest increase in maternal heritability, for all traits. PCA reduced the standardized EBVs of traits into two components, explaining 75.04 % of the total genetic variance. The EBVs of MY, LL, DO, SC, and CI highly associated with PC1, whereas those of PY is closely connected with PC2. In conclusion, the selection indices could be planned based on two PCs instead of all traits.

Principle components analysis is defined as a multivariate statistical method that can be applied to minimize the number of correlated traits into a smaller number of independent variables called principle components, with minimum loss of information in the original data (Bolormaa et al., 2010). This approach produces orthogonal components that are linear combinations of the main variables, depending on the eigenvalues of the variables of interest. The eigenvalues are created in an order from the highest to the lowest one and each Principle component explains superior variability than the next PC (Meyer, 2007).
Multivariate methodologies, such as principle components analysis could be used to extract the loadings or coefficients that explain the maximum variation in the datasets, providing a tool that picks up the animals with similar characteristics. Once identified, these new components or groups could be selected for dairy breeding programs to improve both productivity and fertility (Karacaören and Kadarmideen 2008; Buzanskas et al., 2013;Jolliffe and Cadima 2016;. Besides, multivariate statistical models might denote relationships and significant outcomes that could not be possible when using univariate approaches (Lopes et al., 2013;Moraes et al., 2015;Fraga et al., 2016). Furthermore, multivariate methods help handle pertinent decisions in animal breeding programs (Cardoso et al., 2003;Selim et al., 2018).
Several studies used principle components methodology in animal breeding. For example, PCA has been investigated for genetic assessment of beef cattle (Bignardi et al., 2014;Boligon et al., 2016;Tramonte et al., 2019) and evaluation of reproductive traits of different breeds (Savegnago et al., 2011;Buzanskas et al., 2013). Bignardi et al. (2012), Moawed and Osman (2018), and Mello et al. (2019) applied PCA for dimension reduction of dairy cow traits. A recent exploratory study was conducted by Oliveira et al. (2014) who utilized PCA to evaluate nine traits in buffalo cattle in Brazil and concluded that the first four PCs are adequate to explore the covariance structure of these traits. The previous studies concluded that PCA permits minimizing the traits dimension, simplifies the interpretation of data with few components, and visualizes the relationship between the original datasets. However, few studies considered the breeding values in the PCA and only incorporated the original phenotypic traits.
By analyzing animal breeding values, it could be possible to identify the genetic relationships among the economi-cally important traits of dairy cows, both in magnitude and direction (Savegnago et al. 2011;Porto-Neto et al. 2013;Osorio-Avalos et al., 2015). Therefore, the present study aimed to estimate genetic parameters and breeding values for the most economic traits of Holstein dairy cows using six different animal models. Furthermore, the principle components analysis (PCA) was applied to explore the relationship between the estimated breeding values for the functional traits.

dAtASet deScription
The dataset investigated in this study was provided by a commercial dairy herd belonging to Modern Agricultural Development Company (MADC) located nearly 80 km from Alexandria, Egypt. The traits analyzed in Holstein Friesian cows were milk yield (MY), 305-day milk yield (305 DMY), peak yield (PY), lactation length (LL), days open (DO), calving interval (CI), and services per conception (SC). The dataset from three stations was collected representing 18221 cows. Animals were born between 2007 and 2018. At all three stations, cows were kept under similar feeding and management systems. Animals are kept in open sheds. All year round, cows fed ad libitum using Total Mixed Ration (TMR), and the ration formulations were done by the National Research Council (NRC) program. In most herds, heifers and cows were artificially bred using frozen semen imported from U.S.A and Canada. Heifers were bred when reached 350 -375 kg of body weight and cows were served during the first heat after the 45 th day post-partum. Cows were machine milked three times daily at eight hours intervals. Cows were usually milked until two months before the expected calving date.

StAtiSticAl ModelS And genetic pArAMeterS
Before estimation of genetic parameters and breeding values for the studied traits, data were examined and tested to be valid for the analytical model fitting. For all traits, records with fewer than three observations, and bulls having less than three offspring were excluded from the prepared data file. Normality of the trait's residuals was verified and data with residual standard deviations greater than 3.5 and below -3.5 standard devotions were removed from the analysis (Buzanskas et al., 2013;Tramonte et al., 2019).
Six animal models were incorporated to estimate variance components, genetic parameters and subsequently breeding values for each trait. In all models, parity, age at calving, calving season (summer, autumn, winter, and spring), and year of calving (2007 and 2018) were included as fixed effects. Random effects were fitted based on the model type. Technically, the analysis was repeated several times until a minimum value of -2 Log L was detected, when -2 Log L

Advances in Animal and Veterinary Sciences
July 2021 | Volume 9 | Issue 8 | Page 1115 remained constant (Lee and Taper 2002;Tilki et al., 2008). In all models applied, the direct additive genetic effect of the animal was considered as a random effect along with the random effect of residuals. In model 2, the random permanent environmental effect was employed. In model 3, the maternal genetic effect of the dam was added without consideration of the direct-maternal covariance. Model 4 was similar to model 3 but the direct-maternal covariance was assumed. Model 5 and model 6 employed all random effects (animal, maternal, environmental, and residuals), but the former model assumed no direct-maternal covariance, while the latter one took into account such covariance. The statistical models proposed for the studied traits are summarized as follows: Model 1: Model 2: Model 3: (σ AM = 0) Model 4: (σ AM ≠ 0) Model 5: (σ AM = 0) Model 6: In which Y is n x 1 vector of observations for the traits; b is p x 1 vector of fixed effects, p = number of levels for fixed effects; u a is q x 1 vector of random animal effects, q = number of levels for random effects; u pe is the vector of random permanent maternal environmental effects; u m is the vector of N x N m maternal additive genetic effects; X is the design matrix of order n x p, which relates records to fixed effects; Z a is the design matrix of order n x q, which relates records to random animal effects; Z pe is the incidence matrix of permanent maternal environment effects; Z m is the incidence matrix of maternal genetic effects, and e is n x 1 vector of random residual effects.
The expected values and variance components are presented as follows: var (a) = A = G, var (pe) = and var (m) = where A is the numerator relationship matrix and I is identity matrix.
The estimated heritabilities for both direct additive genetic effect and maternal genetic effects are given by the following equations (Willham, 1980): Where, is the direct heritability; = maternal heritability; , and are the direct additive genetic variance, maternal additive genetic variances, and phenotypic variance, respectively. Variance components estimation, genetic parameters, and estimated breeding values (EBVs) for all traits were performed using VCE version 6.0.2 software according to permission from Groeneveld et al. (2008), based on the restricted maximum likelihood procedures of the general linear models.

principle coMponent AnAlySiS
Principle component analysis (PCA) is a multivariate dimensionality reduction statistical method and aimed to summarize the information involved in the original EBVs of studied traits into a smaller number of newly generated variables called principle components, without loss of vital information. Moreover, the analysis endeavored to find out the relationships between the EBVs (Hair et al., 2009). The EBVs of all traits were standardized through the standard normal distribution.
The Kaiser-Meyer-Olkin (KMO) and Bartlett's tests were inserted for checking the sampling sufficiency of PCA (Cerny and Kaiser, 1977;Snedecor and William, 1989). The Kaiser criterion was also applied to choose the number of principle components that explain the maximum genetic variation in the data. Such criterion deemed only the principle components with eigenvalues greater than one. The eigenvalue of the principle component is connected with the variability of all EBVs of traits involved in the principle component, which intern constitutes an eigenvector (Rencher, 2002). These eigenvectors explain the correlation of each trait's variance with the principle component. The KMO was calculated according to the following equation: Where R ij is the correlation matrix and C ij is the partial covariance matrix.
Bartlett's test of sphericity was utilized to detect the appropriateness of the data to be analyzed using PC analysis. This test compares the correlation matrix of EBVs of traits with the corresponding zero matrices, or identity matrix, to check the overall correlations among EBVs. This test is distributed as chi-square with a [p (p-1) / 2] as the degree of freedom, and is given as follows:

Advances in Animal and Veterinary Sciences
July 2021 | Volume 9 | Issue 8 | Page 1116 In which p is the number of traits, n is the overall sample size, and |R| is the determinant of correlation matrix R. The null hypothesis for Bartlett's test of sphericity states that the correlation matrix is not diverged from the identity matrix (H 0 : traits are orthogonal). If H 0 is rejected, PCA will do the reduction process for EBVs of traits without loss of information.
The principle components are independent variables representing a linear combination of variables (estimated breeding values). The first extracted and rotated principle component elucidates the highest percent of the total genetic variation in EBVs, followed by the second and third components. With a dataset consisted of p variables, the given principle component i (PC i ) can be calculated as follows: PC i = a i1 X 1 + a i2 X 2 + a i3 X 3 + … + a ij X j Where, i = 1,2,3,…,p and j = 1,2,3,…,p = j th standardized coefficient of the j th EBV of the i th principle component. X j is the value of the original EBV.
By using the standardized EBVs of traits, the principle components score can be outputted. These scores resulted from the sum of standardized BVs for each trait weighted by the corresponding standardized score coefficient. The principle components could be used as an index to assess animals for different traits. The standardized coefficients were estimated as follows: Where, a ij is the standardized coefficient for EBVs of the j th productive or reproductive trait in the j th principal component. Data mining and analyses were conducted by SPSS software (SPSS, version 25) and, the PRINCOMP procedure statement of SAS (SAS 9.4, USA).

RESULTS AND DISCUSSION
The descriptive statistics for the productive and fertility traits obtained in this study are presented in  Table 2 showed the values of maternal heritability for all traits as estimated by models 3, 4, 5, and 6. The values of maternal heritability of MY, PY, and LL ranged from 0.02 to 0.32, from 0.01 to 0.17, and from 0.03 to 0.07, respectively. Fertility traits presented maternal heritability estimates ranged from 0.04 to 0.21 for DO, from 0.04 to 0.14 for CI, and from 0.02 to 0.11 for SC. It was apparent that the model used in the estimation process considerably affected the values of direct and maternal heritability.
In general, the estimates of maternal heritability for productive traits were lower than the direct heritabilities in the four mentioned models. While as, models 4 and 6 informed maternal heritability estimates higher than direct heritability values of reproductive traits.
According to models, the current results revealed that the inclusion of maternal effects (permanent environmental and additive genetic) in some models along with the existence of covariance between the direct effect of animals and maternal effects lead to varied heritability estimates and intern their variance components. Overall, an increase was observed in the values of direct heritability for productive traits (

Advances in Animal and Veterinary Sciences
July 2021 | Volume 9 | Issue 8 | Page 1117    Table 6 showed the correlations of the significant EBVs of traits with each of the extracted components. These values represent the strength and direction of EBVs of traits picked or clustered by each PC. Given model 4, EBVs of MY, LL, DO, SC, and CI showed the greatest and positive correlations (> 0.70) with the first PC. EBVs of PY connected and related positively and strongly with the second PC, denoted a correlation value close to 0.76. Putting all models into consideration, it was observed that EBVs for most of the studied traits were positively and strongly associated with PC1. Although model 6 was suggested to be the second-best model, PCA showed the distribution of EBVs of traits in three components.
The overall description of the traits in this study (Table 1) were consistent with the findings of (Salem et al., 2006;Sahin et al., 2017;Mello et al., 2019), who conducted similar studies and estimated genetic parameters using the same six animal models. The high percentages of the coefficient of variations for all traits indicated the presence of high phenotypic variations among individual animals and hence, suggesting the possibility of selection and improvement of the current herd. Among the studied traits, the highest CV % was recorded for MY, LL, DO, and SC, which means that these traits could have a superior chance of being included in the selection index.
Overall, the direct heritability estimates for MY were moderate, and the highest values were informed by model 4. These values were in accordance with the findings reported by (Tilki et al., 2008;Karabulut et al., 2012;Agudelo-Gomez et al., 2015), who estimates direct heritability estimates for milk yield from 0.15 to 0.37. The direct heritabilities for PY were low in magnitude and the only improvement was observed in model 4, where the maternal genetic effect was added to the model, assuming the

Advances in Animal and Veterinary Sciences
July 2021 | Volume 9 | Issue 8 | Page 1118    (Rosati and Van Vleck, 2002;Bolivar-Vergara et al., 2012), but higher than that reported by (Boli gon et al., 2010;Seno et al., 2010;Buzanskas et al., 2013). It was obvious that the direct heritability estimates for all fertility traits were low (< 0.20) in all models, which agrees with the results of Sahin et al. (2017) who applied similar six animal models, reporting the same findings. In terms of accuracy and validity of the current estimates, the standard errors of heritability estimates were low for all traits and models, indicating that the current estimates are reliable and unbiased (Buzanskas et al., 2013). The present estimates of direct heritability revealed that selection based on productive traits, particularly MY and LL may be more effective and would lead to genetic progress compared with selection based on fertility traits because they denoted the highest estimates compared with other traits. Although, the direct heritability of fertility traits was low, suggesting that these traits could be influenced by the environmental conditions, however, the inclusion of these traits in the selection index is important to improve the cow's performance (Buzanskas et al., 2013;Shalaby et al., 2015).
The present study showed lower estimates of maternal heritability (Table 2) as compared with the corresponding direct heritabilities for productive traits, indicating that the maternal additive genetic variances were lower than the direct additive genetic variance of the animals. These findings were in agreement with those reported by (Tilki et al., 2008;Abera et al., 2011;Sahin et al., 2012) who determined higher direct heritability estimates for Brown Swiss Cattle. Contrary, Karabulut et al. (2012) estimated higher maternal heritability estimates for the same studied traits. The estimated maternal heritability for fertility traits agrees with values denoted by (Albuquerque and Meyer, 2001;Malhado et al., 2007;Bolivar-Vergara et al., 2012) who reported values between 0.02 and 0.18.
In this research, the maternal effects were separated into two units, maternal environmental and maternal genetic effects. The current results revealed that model 4 with maternal additive genetic effects along with the direct-maternal covariance informed the highest improvement in heritability values, particularly for MY trait. These results came in accordance with those in the study of (Agudelo- Gomez et al., 2015;Shalaby et al., 2015) who estimated genetic parameters in Holstein-Friesian cows and fitted a similar model as best for genetic evaluation of dairy cattle. Although, model 6 informed similar results to model 4, however, the latter denoted superior improvement, both indirect and maternal heritabilities. This suggested that the insertion of permanent maternal environmental effect in the analytical model was insignificant. This conclusion is in contradiction with the results of Tilki et al. (2008) who mentioned that model 6 was the best. Based on the present heritability estimates and the best-fitted model (lowest LRT), where the direct-maternal genetic covariance has existed, it was concluded that the inclusion of maternal genetic effect in the model allows for a better estimation of heritability for productive traits. Besides, the reproductive efficiency of cows is affected by their dams. Accordingly, the improvement of health status, reproductive performance of dams would have an impact on cows in the future.
PCA was conducted using breeding values estimated by the six models for all traits. The high KMO values (Table  4) in the present study imply that the correlation between EBVs of traits was not unique, which is not related to the remaining EBVs outside each sample correlation. The values of KMO in this study are in agreement with the measures evaluated by Verma et al. (2015), who reported KMO measure of sampling adequacy to be 0.75, and Egena et al. (2014), who reported that KMO was equal to 0.80. The significance of correlation matrices tested with Bartlett's test of Sphericity (Table 4) gives support for the authenticity of PCA analysis for the data set. The results of PCA (Table 5) are consistent with the previous reports (Buzanskas et al., 2013;Agudelo-Gomez et al., 2015;Moawed and Osman, 2018;Tramonte et al., 2019), indicating the usefulness of PCA to reduce data dimensions. According to Val and Ferraudo (2008), the first two components explained 71 % of the total variation in the original traits of dairy cattle. Oliveira et al. (2014) evaluated seven productive and reproductive traits in Brazil and concluded that the first three PCs are sufficient to explain more than 80 % of the total variance of EBVs of traits. The positive correlations of PCs with EBVs of traits (Table 6, model 4) suggested that the selection of animals could be made