Principal Component Analysis for Body Weight Prediction of Corriedale Ewes from Southern Peru

We aimed to verify the relationship between body measurements (BM) and body weight, and to investigate the prediction of live weight (LW) from original BM and from principal component scores in Corriedale ewes. BM of 100 ewes collected at the Illpa Experimental Centre of the National University of Altiplano in Peru were used. Data were recorded on LW, wither height (WH), rump height (RH), thoracic perimeter (TP), abdominal perimeter (AP), fore-shank length (FSL), fore-shank width (FSW), fore-shank perimeter (FSP), tail width (TW), tail perimeter (TPe), hip width (HW), loin width (LWi), shoulder width (SW), forelimb length (FL) and body length (BL). Pearson correlation and principal component analysis (PCA) were applied to LW and the other BM. Additionally, regression equations of LW on BM and on their principal components (PC) were computed. Models were compared using the coefficient of multiple determination (R²), the Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the root mean squared error (RMSE). Correlations (r) of all BM with LW were positive and significant (r = 0.20-0.78), except for FSW (r = 0.18). The PCA of BM and LW extracted four components explaining 68.7% of the total variance. The LW prediction model using four PC had the lowest RMSE, AIC and BIC values and the highest R² compared with models with fewer PC or based on original measurements. Our results suggest that this approach is a feasible alternative for predicting LW.

An alternative to this situation is efficient herd management through periodic evaluation of production parameters (e.g., body weight and composition, measurements). Body weight is an important parameter for assessing the general condition of an animal, increasing meat production via selection, feeding management, health care, determining the end of the fattening period and so on (Kunene et al., 2009; Yilmaz et al., 2013). However, given the importance of sheep farming for the country and the scarcity of studies on this topic, it is necessary to develop a way to estimate body weight, composition and other measurements in a simpler and less biased manner.
A feasible alternative would be to use body measurements, an indirect, fast and low-cost method. Prediction of sheep body weight from body measurements has been reported by several authors for different breeds (Atta and El khidir, 2004; Riva et al., 2004; Silva et al., 2006; Sowande and Sobola, 2008; Kunene et al., 2009). It has been observed, however, that different models might be needed to predict body weight for different environmental conditions, body condition scores and breeds (Enevoldsen and Kristensen, 1997). This is due to the different biological relationships that exist among linear body measurements (Yakubu and Ayoade, 2009), which may lead to collinearity (Akinsola et al., 2014). Thus, principal component analysis (PCA), a multivariate technique, can be used successfully when morphological traits present multicollinearity (Mavule et al., 2013). PCA uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components (Jolliffe, 2002). PCA of body measurements has been used as a tool in breed description and characterization of different sheep breeds (Riva et al., 2004; Cerqueira et al., 2011; Legaz et al., 2011; Silva et al., 2013) as well as to predict body weight (Mavule et al., 2013; Eyduran et al., 2013).
To the best of our knowledge, no studies have developed linear regression equations for predicting body weight using principal component scores in the Corriedale breed raised in extensive systems in Peru. Therefore, our objective was to estimate the relationship between body measurements and body weight, and to determine the best model for body weight prediction, i.e., whether based on original body measurements or on principal component scores.

MATERIAL AND METHODS
This study was conducted at the Illpa Experimental Centre of the National University of Altiplano, in Puno Department, Peru. The climate of the study area is rainy during summer and dry in winter. It has an average annual precipitation of 654.20 mm, an average annual temperature of 8 °C and an average relative humidity of 53.5%. The animals were grazed on natural pastures. The dominant grass species in these areas are Festuca dolichophylla, Muhlenbergia fastigiata, Alchemilla pinnata, Calamagrostis vicunarum and Stipa ichu. Oat hay was supplemented ad libitum. Since birth records were not available, the age of each ewe was estimated by counting permanent incisors as described by Gatenby (1991).
Live weights and body measurements were recorded on 100 Corriedale ewes between 1.5 and 2 years old. Both were taken after eight hours of feed restriction to avoid error due to gut fill. A digital weighing scale accurate to the nearest 0.1 kg was used to record live weight, whereas body measurements were taken by two technicians using a tape measure and a Vernier caliper. The 14 body measurements were: wither height (WH), rump height (RH), thoracic perimeter (TP), abdominal perimeter (AP), fore-shank length (FSL), fore-shank width (FSW), fore-shank perimeter (FSP), tail width (TW), tail perimeter (TPe), hip width (HW), loin width (LWi), shoulder width (SW), forelimb length (FL) and body length (BL).

Statistical analyses
The data were organized and analyzed using several statistical procedures (PROC) in SAS (SAS Inst. Inc., Cary, NC, 2003). Pearson's correlation coefficients between body measurements and body weight were calculated using PROC CORR and tested for significance. PCA was performed on the body traits using PROC FACTOR. PCA was used to check whether the body traits could be reduced to uncorrelated dimensions, that is, linear combinations of the original variables, called principal components (PC). The Kaiser-Guttman rule was used to determine the number of factors to extract, i.e., factors with eigenvalues higher than 1 (Kaiser, 1960). Bartlett's test of sphericity was used to test whether the correlation matrix was an identity matrix. The Kaiser-Meyer-Olkin measure of sampling adequacy of the correlation matrix and the communalities were also computed to validate the use of PCA (sampling adequacy > 0.5). Subsequently, the varimax rotation algorithm was applied to enhance the interpretability of the PC. Loadings, i.e., the estimated values for each body measurement in every PC, greater than or equal to 0.50 were used for PCA interpretation.
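As an illustration of the sphericity check described above, the following is a minimal sketch of the usual chi-square approximation for Bartlett's test, written in plain Python rather than SAS. The function names and the toy correlation matrix are hypothetical and are not part of the original analysis; the statistic is assumed to take the textbook form χ² = -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom, where R is the correlation matrix of p traits measured on n animals.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]  # work on a copy
    size = len(a)
    det = 1.0
    for col in range(size):
        # choose the row with the largest absolute pivot in this column
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return det


def bartlett_sphericity(corr, n_obs):
    """Return (chi_square, df) for H0: the correlation matrix is an identity."""
    p = len(corr)
    chi_square = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(determinant(corr))
    df = p * (p - 1) // 2
    return chi_square, df


# Hypothetical 3-trait correlation matrix, for illustration only
toy_corr = [
    [1.0, 0.6, 0.3],
    [0.6, 1.0, 0.4],
    [0.3, 0.4, 1.0],
]
chi2, df = bartlett_sphericity(toy_corr, n_obs=100)
print(f"chi-square = {chi2:.2f} on {df} df")
```

With 15 variables (LW plus the 14 body measurements), this formula gives 105 degrees of freedom; a large chi-square relative to that reference distribution indicates that the traits are correlated enough for PCA to be worthwhile.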
To predict live weight from original body measurements [1] and from the established principal component scores [2], multiple regression analysis was performed with PROC REG using the stepwise selection procedure. The following regression models were used:

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_p x_{ip} + \varepsilon_i, \quad \text{for } i = 1, 2, 3, \dots, n \qquad [1]$$

$$y_i = \beta_0 + \beta_1 PC_{i1} + \beta_2 PC_{i2} + \dots + \beta_p PC_{ip} + \varepsilon_i, \quad \text{for } i = 1, 2, 3, \dots, n \qquad [2]$$

where $y_i$ is the value of the i-th observation; $\beta_0$ is the intercept; $\beta_1, \beta_2, \dots, \beta_p$ are the p partial regression coefficients; $x_{i1}, x_{i2}, \dots, x_{ip}$ and $PC_{i1}, PC_{i2}, \dots, PC_{ip}$ are the p original body measurements and principal component scores, respectively, for the i-th observation; and $\varepsilon_i$ is the residual error, assumed to be statistically independent, with common mean 0 and variance $\sigma^2$, and approximately normally distributed.

Journal of Animal Health and Production
December 2021 | Volume 9 | Issue 4 | Page 419

The stepwise procedure provided the best prediction equations for body weight and excluded variables with P > 0.05, as suggested by Diaz et al. (2004) and Marshall et al. (2005). The accuracy of the prediction equations was estimated through the coefficient of multiple determination (R²) and the root mean squared error (RMSE). Akaike's information criterion (AIC) and the Bayesian information criterion (BIC) were also used to assess the quality of the models (goodness of fit and model complexity). The best-fitting model should have the lowest AIC (Akaike, 1974) and BIC (Schwarz, 1978) values, the maximum R² and the minimum RMSE.
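The four comparison criteria can be sketched as follows. This is a hedged illustration, not the SAS computation used in the study: the function name and the toy weights are hypothetical, AIC and BIC are assumed to take their common least-squares forms (n ln(SSE/n) + 2k and n ln(SSE/n) + k ln n, where k counts the estimated coefficients including the intercept), and RMSE is assumed to use the residual degrees of freedom (n - k). Statistical packages may use slightly different variants.

```python
import math


def fit_criteria(observed, predicted, n_params):
    """R2, RMSE, AIC and BIC for a least-squares fit.

    n_params is the number of estimated regression coefficients,
    intercept included (an assumption of this sketch).
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    sse = sum((y - f) ** 2 for y, f in zip(observed, predicted))  # residual SS
    sst = sum((y - mean_obs) ** 2 for y in observed)              # total SS
    return {
        "R2": 1 - sse / sst,
        "RMSE": math.sqrt(sse / (n - n_params)),
        "AIC": n * math.log(sse / n) + 2 * n_params,
        "BIC": n * math.log(sse / n) + n_params * math.log(n),
    }


# Hypothetical live weights (kg) and fitted values, for illustration only
observed = [30.0, 32.0, 35.0, 38.0, 40.0]
predicted = [31.0, 32.0, 34.0, 38.0, 40.0]
for name, value in fit_criteria(observed, predicted, n_params=2).items():
    print(f"{name}: {value:.3f}")
```

Both AIC and BIC add a penalty that grows with the number of coefficients, so a model with more predictors is preferred only when the reduction in residual sum of squares outweighs that penalty.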

RESULTS AND DISCUSSION
Descriptive analysis of live weight and body measurements
Descriptive statistics of live weight and body measurements are presented in Table 1. Phenotypic variability among animals was greater for LW, FSW, TW and TPe, with coefficients of variation (CV) ranging from 13.62 to 18.08%, whereas the other traits showed smaller variability (CV < 9%). The average LW was 34.4 ± 4.7 kg. The WH (59.8 cm) and RH (61.1 cm) were of average proportion. The average TP was 82.6 cm, whereas the average AP was 96.7 cm. The fore-shank region measured 20.4 cm in length, 2.5 cm in width and 8.4 cm in perimeter, whereas BL was 95.5 cm on average. Such information is essential for pursuing optimal production efficiency and better value for sheep products.

Pearson's coefficients of correlation
Pearson's coefficients of correlation (r) between live weight and body measurements, and among body measurements, are presented in Table 2. All body measurements were positively and significantly correlated with LW (r = 0.20-0.78; P < 0.05 or P < 0.01), except FSW (r = 0.18; P > 0.05). LW had the highest correlation with AP (r = 0.78; P < 0.01), followed by TP (r = 0.64) and BL (r = 0.52). The correlation between AP and TP was also high (r = 0.772; P < 0.01). On the other hand, the lowest values were observed between LW and TPe (r = 0.20), and between LW and FSW (r = 0.18; P > 0.05). Among the body measurements, the strongest correlations were found between WH and RH (r = 0.93; P < 0.01) and between TW and TPe (r = 0.80; P < 0.01), whereas the correlations of TW with AP and FSL were relatively low (r = 0.20; P < 0.05). Our results agree with those reported by Afolayan et al. (2006) and Sowande and Sobola (2008) for Yankasa, West African Dwarf and Karayaka breeds.
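The significance pattern reported above can be checked with the standard t test for a Pearson correlation, t = r√(n - 2)/√(1 - r²) with n - 2 degrees of freedom. The sketch below is an illustration in Python, not the PROC CORR output: it assumes that all 100 ewes contributed to every pair of traits (no missing records), and the critical value 1.984 is the tabulated two-sided 5% point of Student's t with 98 degrees of freedom.

```python
import math


def correlation_t(r, n):
    """t statistic for H0: population correlation = 0, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


N_EWES = 100        # sample size reported in the study
T_CRIT_05 = 1.984   # two-sided 5% critical value of t with 98 df (table value)

# Correlations with live weight taken from the text
for trait, r in [("TPe", 0.20), ("FSW", 0.18)]:
    t = correlation_t(r, N_EWES)
    verdict = "significant" if abs(t) > T_CRIT_05 else "not significant"
    print(f"{trait}: r = {r:.2f}, t = {t:.2f}, {verdict} at P < 0.05")
```

Under these assumptions, r = 0.20 gives t of about 2.02, just above the critical value, whereas r = 0.18 gives t of about 1.81, just below it. This is consistent with FSW being the only measurement not significantly correlated with LW.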
Heavier animals tend to have larger thoracic and abdominal perimeters, indicating that selecting for deep and wide sheep would lead to ewes with greater body weight. This is confirmed by the Pearson's correlations in Table 2, where these two perimeters showed the highest correlations with body weight. On the other hand, fore-shank width, tail width and tail perimeter were either unrelated or only weakly related to body weight (r below 0.25) and would not be recommended as predictor variables. Among the variables with r between 0.26 and 0.52, hip width and body length (r = 0.500 and 0.517, respectively) have the best potential for practical use, as they can be measured relatively easily in the handling chute. Therefore, thoracic and abdominal perimeters, together with hip width and body length, are well related to body weight and easy to measure during routine management in sheep facilities.

Principal component analysis (PCA)
The Kaiser-Meyer-Olkin measure of sampling adequacy was high for Corriedale ewes (0.70), which supports the use of the correlation matrix for PCA, i.e., it indicates that true PC factors exist (Yakubu et al., 2011). In addition, Bartlett's sphericity test showed a highly significant chi-square (χ² = 925.06; P < 0.001), which also supports the use of PCA.
The estimated factor loadings extracted by PCA, the eigenvalues and the variation explained by each factor are presented in Table 3. After varimax rotation of the component matrix, only four PC with eigenvalues equal to or higher than one were extracted. Extracting only four PC allows a better understanding of the complex correlations among traits, as well as the use of more parsimonious models (Mota et al., 2016). Legarra et al. (2004) reported that more parsimonious models have smaller computational demands and are less susceptible to numerical errors. In other words, a few PC are generally enough to explain a large part of the total variability (Boligon et al., 2013).
The four PC accounted for 68.7% of the total variation among all traits. Of the total variance, 33.67% was accounted for by the first component (PC1). PC1 had high positive loadings for LW, TP, AP, HW and SW. The second component (PC2) explained 15.39% of the total variance and was characterised by high positive loadings for WH, RH, FSL and FL. The third component (PC3), which was associated with TW, TPe and LWi, accounted for 11.39% of the total variance, whereas the fourth component (PC4) had high positive loadings for FSW, FSP and SW and contributed 8.26% of the total variance. In addition, communalities represent the proportion of the variance in the original variables that is accounted for by the PC. In general, the communalities were high for almost all traits, ranging from 0.39 (FL) to 0.95 (TPe) in Corriedale ewes (Table 3). The use of four PC may play a crucial role in the ranking of animals. This provides a chance to better select animals by using groups of traits instead of a single trait (Yakubu et al., 2011; Pinto et al., 2006). Silva et al. (2015), using a ranking method in performance testing in a Morada Nova sheep population, reported that the first three PC explained most of the variability of all evaluated traits. In a Portuguese Bordaleira sheep population, Cerqueira et al. (2011) observed that two PC were enough to explain 70.5% of the total variation, with PC1 alone explaining 61.4%. These authors reported that height at withers, height at back, height at rump, length of trunk, length of head, perimeter of the shin and live weight contributed positively to most of the variation. Mavule et al. (2013), studying two Zulu sheep populations (young and adult), reported that two PC for the young population and four PC for the adult population were sufficient to explain most of the variability.
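The arithmetic behind the cumulative variance and the communalities can be sketched briefly in Python. The variance shares are those reported in the text; the loadings passed to the communality function are hypothetical values chosen only for illustration and are not taken from Table 3. The sketch assumes the usual definition of a communality as the sum of squared loadings of a trait on the retained components.

```python
# Percentage of total variance explained by each component, as reported in the text
variance_share = [("PC1", 33.67), ("PC2", 15.39), ("PC3", 11.39), ("PC4", 8.26)]


def cumulative_share(shares):
    """Running total of explained variance (%), rounded to two decimals."""
    total = 0.0
    running = []
    for _, share in shares:
        total += share
        running.append(round(total, 2))
    return running


def communality(loadings):
    """Sum of squared loadings of one trait across the retained components."""
    return sum(value * value for value in loadings)


print(cumulative_share(variance_share))

# Hypothetical loadings of one trait on PC1..PC4, for illustration only
print(round(communality([0.80, 0.20, 0.10, 0.05]), 4))
```

The running total reaches 68.71% at the fourth component, which matches the 68.7% reported. A trait with a low communality, such as FL (0.39), is one whose variance is poorly captured by the four retained components.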
It is clear that the variability of body measurements may differ across breeds, but some traits commonly influence sheep populations regardless of breed. This indicates that these traits could be integrated into a selection index in breeding programs for different sheep breeds. Figure 1 shows the component plot of the first three PC in rotated space for body measurements. The plot clearly shows that the body measurements clustered into the following five groups: (1) TW and TPe; (2) LWi and SW; (3) LW, TP and AP; (4) FSL, HW and BL; (5) FL, FSP, FSW, WH and RH.
Table 4 shows the regression equations predicting LW of Corriedale ewes from original body measurements and from their PC scores. The stepwise multiple regression analysis revealed that abdominal perimeter alone accounted for 60% of the variation in live weight. The inclusion of body length increased this proportion to 68%. The proportion explained improved further to 76% when shoulder width, fore-shank length and loin width were added to the equation. Besides the highest R², this model showed the lowest AIC, BIC and RMSE values, which confirms its better goodness of fit compared with the other models based on original body measurements. Kunene et al. (2013) reported that the model including all traits (rump height, withers height, back height, chest depth, chest width and body length) gave the best R² (0.76) for estimating mature live weight in a Karya sheep population.
However, better results were achieved when using PC instead of original body measurements (Table 4). As mentioned above, four PC accounted for 68.7% of the total variation among the recorded body traits. Hence, we performed stepwise multiple regression analyses using one, two, three and four PC (four models). Goodness of fit increased when all four PC were included in the model. Although more parsimonious models are preferable, as mentioned above, the model with four PC had the lowest RMSE, AIC and BIC values and the highest R² (Table 4). Nevertheless, the four-PC model is more parsimonious than the model using five original measurements, which was the best-fitting model among those based on original measurements. In a similar study, Yadav et al. (2016) found that a body weight prediction model with PC1 and PC2 had a higher R² (0.94) and a lower RMSE (1.86) than the model with only PC1 (R² = 0.80 and RMSE = 3.84) in a Madgyal sheep population. The PC obtained here may be used as a group of variables or even in a selection index. In this case, the selection index would have four weighted coefficients, which would decrease computational demands (Pinto et al., 2006; Mota et al., 2016).
These results suggest that it is possible to predict the live weight of Corriedale ewes from body measurements, which is an advantage for farmers who cannot afford a weighing scale. In addition, predicting live weight from four PC calculated from the body measurements is a better alternative than using the original measurements. The four PC reported here are highly associated with most of the evaluated body measurements.

CONCLUSIONS
Our results suggest that principal component analysis is a suitable approach for evaluating live weight and the relationships among body measurements in the Corriedale sheep population in Peru. This approach allows animals' live weight to be predicted through regression equations based on principal component scores. It also represents a viable alternative for farmers to rank their animals.

CONFLICT OF INTEREST
There were no conflicts of interest.