Published in Issue No. 122, pages 16 to 23

Multivariate analysis of the genetic diversity of Bolivian quinua germplasm

Wilfredo Rojas, Patricio Barriga and Heriberto Figueroa

Introduction

Andean crops characteristically have broad genetic variability. However, in most species this variability is not adequately used, mainly because not much is known about it. One such crop is quinua (Chenopodium quinoa Willd.), whose edible starchy seeds are an important food throughout the Andes.

Bolivia currently has a collection of over 2500 quinua accessions, which include entries from the Andes between Ecuador and northwestern Argentina, and the coastal lowlands of Chile. To provide options for germplasm use as well as guidelines for future collecting, plant-breeding programmes must understand the patterns of variability within existing quinua collections. Maintaining this broad variability will also make an important contribution to the crop’s stability in Bolivia and in the rest of the Andean region.

Toward these ends, three multivariate methods were applied (Dillon and Goldstein 1984; Hair et al. 1992) to describe the material in the Bolivian quinua collection, taking into account several characters and also the relationships between them. The study aimed to determine patterns of germplasm variation and to identify groups of accessions.

Materials and methods

Of the entire quinua germplasm collection of 2032 accessions existing in 1992, 1512 accessions were analyzed for their diversity (60% of the total). The following accessions were excluded: wild material, accessions with poor or no germination, and accessions without passport data. Most of the accessions were Bolivian and Peruvian, but also included were entries from Argentina and Chile (Table 1). The geographic range of the accessions was from 11°S in Peru to 43°S in southern Chile. Altitudes ranged from sea level (Chile) to 3885 m asl (Bolivia).

Germplasm characterization and agronomic evaluation were carried out during 1992-1993 and 1993-1994 at the Patacamaya Experiment Station, located in the Altiplano, Department of La Paz, Bolivia (17°15′S, 68°55′W; 3789 m asl), where the climate varies from arid to semi-arid. Average annual precipitation is 381 mm, average annual temperature is 11°C and the number of frost-days per year averages 187.

Quantitative variables considered in the descriptive and multivariate analyses were:

a. phenological

- emergence of seedlings and flower buds

- initiation of flowering

- 50% flowering

- physiological maturity;

b. morphological

- number of branches

- number of teeth on the serrated margins of leaves

- stem diameter

- panicle length

- panicle diameter

- plant height

- harvest index;

c. grain

- grain diameter

- 100-grain weight

- saponin content.

Two qualitative variables were included for cluster analysis:

- panicle shape

- growth habit.

Statistical analysis

Descriptive analyses of central tendency and dispersion were applied to estimate and describe the performance of the different accessions in terms of each character (Steel and Torrie 1988). Analysis of genetic diversity was performed with the program SYSTAT, version 5 (Wilkinson 1988) and followed four steps:

1. Estimate of degree of association among the different characters analyzed, according to Pearson’s coefficient (Clifford and Stephenson 1975).

2. Derivation of orthogonal variables, using principal components analysis (PCA) based on the correlation matrix (Hair et al. 1992).

3. Classification of accessions in similar groups by the non-hierarchical, k-means technique of cluster analysis (Hair et al. 1992).

4. Verification of the significance of groups by multiple group discriminant function analysis (DFA) to determine the power of each variable to separate groups (Hair et al. 1992).


Descriptive statistical parameters

Table 2 summarizes the parameters estimated for each quantitative variable. There was much variation in phenological variables. For example, the earliest accessions reached maturity in 119 days and the last to mature took 209 days, with a 3-month interval between the earliest- and the latest-maturing material. This broad variability is promising from the viewpoint of genetic improvement to cope with abiotic problems such as frost and drought, the two factors that most affect crop production.

Harvest index also presented broad variability. Accessions with a low harvest index, which is dependent upon plant architecture, have potential for forage, while material with a high harvest index can be used for grain production. In addition, the significant variation in grain diameter, 100-grain weight and saponin content can be used to improve the product’s presentation for market. Plants with highly diverse architecture can be used according to the breeding objectives being pursued.

Pearson’s coefficient

Of all the coefficients, 83 were highly significant (P<0.001). However, only coefficients with an absolute value of 0.4 or higher were considered as linear associations representing natural variation patterns. Thus, the most important correlations corresponded to variables related to phenology and the grain, rather than to morphology.
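The screening step described above can be sketched as follows. This is only an illustration: the trait names and measurements are hypothetical (they are not data from the study), and the cut-off is applied to the absolute value of r, which is an assumption based on the negative correlations reported in this section.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def strong_pairs(traits, threshold=0.4):
    """Return (trait_a, trait_b, r) for every pair of traits with |r| >= threshold."""
    names = list(traits)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(traits[a], traits[b])
            if abs(r) >= threshold:
                pairs.append((a, b, round(r, 2)))
    return pairs

# Hypothetical measurements for five accessions (NOT data from the study).
traits = {
    "days_to_50pct_flowering": [80, 85, 95, 100, 110],
    "days_to_maturity": [150, 160, 170, 185, 200],
    "harvest_index": [0.50, 0.45, 0.40, 0.30, 0.20],
}
print(strong_pairs(traits))
```

With real data, each list would hold one value per accession for the trait in question.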

Among the phenological variables, the highest correlation corresponded to initiation of flowering and 50% flowering (r=0.94). Traits presenting highly significant correlations with these two characteristics were flower buds with r=0.69 and r=0.73 respectively, physiological maturity with r=0.63 and r=0.61, and harvest index with r=-0.59 and r=-0.57. The negative correlations with the harvest index show that these values tend to become smaller as the phenological phase becomes longer. This phenomenon is corroborated by the negative correlations of harvest index with physiological maturity (r=-0.55) and flower buds (r=-0.42).

The positive correlations between physiological maturity and plant height (r=0.56) and stem diameter (r=0.41) indicate that plants tend to become taller and their stems thicker the longer the phenological cycle. However, the negative correlation with 100-grain weight (r=-0.42) indicates that, in turn, harvest indexes become lower. Risi and Galwey (1989a) also reported significant associations for physiological maturity and stem diameter, as did Ochoa and Peralta (1988) for physiological maturity and 100-grain weight.

Positive associations between 50% flowering and number of branches (r=0.44) and number of teeth on leaf margins (r=0.45), and between initiation of flowering and number of branches (r=0.43), indicate that accessions that flower late develop more branches and leaves that are more dentated, as was evident in the accessions from the valleys. Risi and Galwey (1989a) had similar results for these characteristics.

Grain diameter and 100-grain weight formed the second most important correlation (r=0.89). Grain diameter also correlated positively with saponin content (r=0.40) and negatively with panicle length (r=-0.40), indicating that large-grained accessions tend to develop short panicles and high saponin content. Ochoa and Peralta (1988) and Cayoja (1996), in similar evaluations of quinua germplasm, also determined highly significant correlations for grain diameter and 100-grain weight.

Among morphological variables, stem diameter correlated positively with plant height (r=0.69), panicle diameter (r=0.60) and panicle length (r=0.40); and panicle length correlated positively with plant height (r=0.58), indicating that accessions with greater stem diameters and plant height during early phenological phases also developed larger panicles. Risi and Galwey (1989a) determined that plant height, stem diameter and panicle length and diameter were significantly correlated with each other. Ochoa and Peralta (1988) also found associations between panicle length and stem diameter and plant height.

Principal components analysis

This analysis clearly and concisely explained the genetic diversity of quinua. The linear transformation performed by this method generated a new set of 15 independent variables, known as principal components, which are described by their latent roots and vectors (Table 3). The latent root associated with each principal component measures the contribution of each principal component to the total variance, while the coefficients of the latent vector associated with a given principal component indicate the degree of contribution (or ‘loading’) of each original variable to the principal component in question.

There are no tests to evaluate the significance of latent roots. Therefore, we chose to follow the criterion established by Kaiser (1960), which adapts very well to the purpose of this analysis. This criterion is based on the selection of principal components whose latent roots are >1. According to this criterion, the first three components qualify, accounting for more than 63% of total variation (Table 3), giving a clear idea of the structure underlying the quantitative variables analyzed. The first principal component accounted for more than 30% of the total variance. Initiation of flowering, 50% flowering and physiological maturity were the variables with the largest positive loadings. In contrast, harvest index had the largest negative loading (Table 3).
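A minimal sketch of the Kaiser selection rule follows. The latent roots below are hypothetical values chosen only to resemble the proportions quoted in the text; they are not the values of Table 3. The sketch relies on the fact that, for a PCA on a correlation matrix, the latent roots sum to the number of variables (15 here).

```python
def kaiser_selection(latent_roots):
    """Keep the components whose latent root (eigenvalue) is greater than 1.

    Returns a list of (component number, root, share of total variance)
    and the cumulative share explained by the retained components.
    """
    total = sum(latent_roots)
    kept = [(i + 1, root, root / total)
            for i, root in enumerate(latent_roots) if root > 1.0]
    cumulative = sum(share for _, _, share in kept)
    return kept, cumulative

# Hypothetical latent roots for 15 standardized variables (they sum to 15).
roots = [4.6, 3.2, 1.8, 0.9, 0.8, 0.7, 0.6, 0.5,
         0.45, 0.4, 0.35, 0.3, 0.2, 0.1, 0.1]

kept, cumulative = kaiser_selection(roots)
for number, root, share in kept:
    print(f"PC{number}: latent root {root:.2f}, {share:.1%} of total variance")
print(f"Retained components explain {cumulative:.1%} of total variance")
```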

As a result, the first principal component differentiated between accessions that flower and mature later in the season, and thus register low harvest index values, and those with opposite characteristics. The positive contribution of plant height and stem diameter indicates that these accessions, in addition to being late maturing, develop prominent plant architecture that negatively affects the harvest index, although the negative signs of grain diameter and 100-grain weight indicate that low harvest indexes are also partly a result of the formation of small grains. In addition, these accessions are characterized by having many teeth on their leaves and greater branching.

Except for seedling emergence, the first principal component identified mainly phenological variables. Scaff (1996) observed a similar phenomenon when he studied the diversity of accessions collected in southern Chile. Risi and Galwey (1989b) had different results for this component, with the exception of seedling emergence, which presented a negative contribution; in their study, phenological variables were instead important for the second component. However, they scored each phenological time period separately, whereas in the analysis presented here, cumulative days from sowing were used.

The second principal component accounted for more than 21% of the total variance. Variables with high positive loadings were grain diameter and 100-grain weight, followed by saponin content. In contrast, most morphological variables presented a negative secondary contribution, although panicle length had quite a high negative coefficient value (-0.785) (Table 3). Consequently, this component distinguished quinua accessions forming large grains, with high saponin content and small plant architecture in terms of plant height, stem diameter and panicle size.

Thus, the second component basically identified grain-related variables presenting positive contributions and, secondarily, morphological variables contributing in both senses. Harvest index and phenological variables, except for flower buds, were not important. Riveros (1997) likewise found that grain diameter and 100-grain weight were the variables that contributed most to this component, highlighting the limited relationship with phenological variables. In contrast, Risi and Galwey (1989b) found that all phenological variables, except for seedling emergence, were the most important (with negative loadings). Of the morphological variables, panicle diameter had the highest loading.

The third principal component (12% of total variance) was associated with panicle and stem diameter, plant height, grain diameter and 100-grain weight, thus differentiating those accessions with outstanding architecture, plant height, thick stems, large panicles and medium to large grains (Table 3). Scaff (1996) found that stem diameter and plant height were the variables that contributed most, whereas Risi and Galwey (1989b) and Riveros (1997) found panicle length had the highest loading.

The total proportion of variance accounted for by the first three components was calculated for each variable (Table 4) (Crisci and Lopez 1983). Phenological variables, except for seedling emergence, were more important than grain-related variables and these, in turn, were more important than morphological variables. Among the first group (phenological variables), 50% flowering and initiation of flowering were outstanding, followed by physiological maturity; in the second group, grain diameter and 100-grain weight were outstanding; and, in the third group, stem diameter and plant height.
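One common way to obtain the proportion of a variable's variance accounted for by the retained components is to sum its squared loadings over those components. The sketch below assumes this is the calculation meant here; the source does not spell it out. The loadings are hypothetical, not Table 3 values. If a table lists latent-vector coefficients rather than loadings, they would first be scaled by the square root of the corresponding latent root.

```python
from math import sqrt

def loadings_from_vector(coefficients, latent_roots):
    """Scale latent-vector coefficients to loadings: coefficient * sqrt(root)."""
    return [c * sqrt(r) for c, r in zip(coefficients, latent_roots)]

def variance_accounted_for(loadings):
    """Sum of squared loadings over the retained components."""
    return sum(value ** 2 for value in loadings)

# Hypothetical loadings on the first three components (NOT Table 3 values).
loadings = {
    "days_to_50pct_flowering": [0.92, 0.15, -0.05],
    "grain_diameter": [-0.30, 0.80, 0.25],
    "seedling_emergence": [0.10, 0.12, 0.20],
}

# Rank variables from most to least variance accounted for.
ranking = sorted(loadings,
                 key=lambda name: variance_accounted_for(loadings[name]),
                 reverse=True)
for name in ranking:
    print(f"{name}: {variance_accounted_for(loadings[name]):.3f}")
```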

Harvest index, despite its important contribution to the first component, occupied a relatively low position, contributing little to the other two components. The contribution of the variables (number of teeth on leaf margins and saponin content) to the third and first components, respectively, was also small. Of all the variables studied, seedling emergence was the least discriminatory.

Cluster analysis

Using k-means non-hierarchical clustering, the quinua germplasm was grouped into seven clusters, each cluster containing quinua that were highly similar. This clustering, combined with passport data, provided a very useful description of the germplasm overall. The only qualitative variables included in this analysis were panicle shape and growth habit.
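The general idea of k-means clustering can be sketched with a plain version of Lloyd's algorithm. This is an illustration only: the SYSTAT procedure used in the study may differ in initialization and stopping rules, the points are hypothetical standardized scores on two traits, and k=2 is used instead of the seven clusters of the study to keep the example small.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iterations=100):
    """Plain Lloyd's algorithm; the first k points seed the centroids (naive choice)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                      for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, new_labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return labels, centroids

# Hypothetical standardized scores (days to maturity, grain diameter).
points = [(-1.2, 1.1), (1.3, -0.9), (-1.0, 0.9),
          (1.1, -1.0), (-1.1, 1.0), (1.2, -1.1)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

Variables would normally be standardized before clustering so that traits measured in days do not dominate traits measured in millimetres.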

Accessions in Cluster 1 (cluster labels were assigned arbitrarily) characteristically had medium to large grains (2.12±0.14 mm), with the highest saponin content of the collection (6.49±2.02 cc), and were relatively early maturing (166±13 days). Plants had an intermediate architecture, usually with a branching growth habit and high harvest index. Amaranthiform panicles prevailed over glomerulate panicles. Accessions from Cercado, in the Department of Oruro, Bolivia, were most representative of this cluster.

Accessions in Cluster 2 characteristically had the largest grains in the collection (2.31±0.13 mm) and were early maturing (148±13 days). While branching growth habits were predominant, plant architecture was below the overall average for plant height, stem diameter, and panicle length and diameter. Harvest indexes, however, were high. As expected, amaranthiform panicles predominated within the cluster. The most representative accessions were those collected from the Salares (a region of salt-pans extending across southern Oruro and northern Potosí), in Ladislao Cabrera, Department of Oruro, and Nor Lípez and Daniel Campos, Department of Potosí.

Accessions in Cluster 3 had high harvest indexes (0.47±0.09); small architecture (plant height = 95.06±11.62 cm, stem diameter = 15.04±1.66 mm; simple growth habits, i.e. few and short branches); and small to medium-sized (1.8±0.13 mm) grains with low saponin content (1.38±2.27 cc). The period of maturity was similar to that of Cluster 1 (168±14 days) but glomerulate panicles predominated. Most of the accessions were collected in Aroma, Department of La Paz, and in several districts in the Department of Puno, Peru. Surprisingly, the Chilean accessions were grouped in this cluster, perhaps because the study did not consider discriminatory characters, as did Wilson (1988) in his work with allozymes.

Accessions in Cluster 4 were tall (119.17±9.45 cm) and moderately late maturing (184±10 days). Their grains were the smallest of the collection (1.79±0.13 mm) and had the lowest saponin content (0.97±1.46 cc). Plants had few short branches and leaves were relatively less dentated. Panicles were glomerulate and longer than wide. This cluster contained accessions from areas surrounding Lake Titicaca and had the highest percentage of Peruvian accessions.

Accessions in Cluster 5 were mostly late maturing (196±7 days) and had low harvest indexes (0.28±0.09). Plants were tall and heavily branched with thick stems and highly dentated leaves; this last characteristic is common among quinua accessions from valley regions. Grains were medium sized (2±0.13 mm) with high saponin content (5.39±2.74 cc). Panicles were amaranthiform and, because of the plant’s open canopy, wider than long. The most representative accessions of this cluster come from the highland valleys in the Bolivian Departments of Potosí, Chuquisaca, Cochabamba and Tarija.

Overall, accessions in Cluster 6 had the largest plant architecture of the collection (plant height = 128.99±13.4 cm, stem diameter = 20.81±2.17 mm, panicle length = 46.12±5.79 cm, panicle diameter = 9.05±1.88 cm) and numerous branches. These accessions, however, produced small to medium-sized grains (1.84±0.15 mm), with low saponin content, and were moderately late maturing (188±11 days), similar to Cluster 4. Glomerulate panicles predominated. As in Cluster 4, accessions from areas surrounding Lake Titicaca were most representative of this cluster.

Cluster 7 contained the smallest number of accessions; these took the longest to mature (204±5 days) and had the lowest harvest index (0.19±0.08). Plant architecture was large and plants were highly branched with highly dentated leaves (24±6), a characteristic of quinua accessions from valley regions (Risi and Galwey 1989a). Glomerulate panicles clearly predominated. Accessions from the lower altitude valleys of the Cochabamba Department were the most representative.

Table 5 summarizes the profile of each cluster, again highlighting the broad variability of the germplasm. Clusters are ordered according to the variables that characterized the first principal component. For example, quinua accessions from Cluster 2 were the earliest maturing and had high harvest indexes, while those from Cluster 7 matured late and had low harvest indexes. Accessions of the other clusters (1, 3, 4, 5 and 6) had intermediate growth cycles and the harvest index decreased as the maturity period lengthened.

Quinua accessions collected from areas around Lake Titicaca and in the central Altiplano of Bolivia (Clusters 3, 4 and 6) had the fewest branches and the simplest growth habits, corroborating Gandarillas’s (1968) findings and contrasting with the highly branched accessions collected from valley regions (Clusters 5 and 7) (Risi and Galwey 1989b). They also had small- to medium-sized grains with low saponin content (Table 5). The predominance of small-grain accessions in this part of the Altiplano was reported by Arze et al. (1977) who found that the grain diameter of germplasm from Puno (Peru) ranged between 1.00 and 1.19 mm. Accessions in Clusters 1 and 2 tended to have the largest grains, a characteristic of accessions from the Salares.

Discriminant function analysis

DFA tells us whether a particular set of variables, in this case the 15 quantitative variables discussed above, is useful in separating previously delineated groups, in this case the seven clusters described in the previous section. The discriminant functions that differentiated among these clusters were obtained by the stepwise procedure (Table 6). All discriminatory functions were statistically significant at a probability of 0.001 according to the chi-square test. However, latent roots indicated that the first two functions accounted for more than 80% of total variance (Table 6).

Comparing DFA with PCA

The interpretation of the discriminatory power of the original characterization variables was carried out with the "potency index" (Hair et al. 1992), which measures the contribution of each variable, taking into account all significant functions and thus the total discriminatory effect. The results obtained are presented, together with the degree of importance of the variables as determined by PCA, in order to compare the efficiency of both methods (Table 4).
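As usually described for the potency index, each variable's squared discriminant loading on a function is weighted by that function's relative eigenvalue (its eigenvalue divided by the sum of eigenvalues of all significant functions), and the weighted values are summed across functions. The sketch below follows that description; the loadings and eigenvalues are hypothetical and are not taken from Table 6.

```python
def potency_index(loadings, eigenvalues):
    """Composite potency index per variable.

    loadings: {variable: [loading on function 1, function 2, ...]}
    eigenvalues: eigenvalues of the same significant functions, in order.
    """
    total = sum(eigenvalues)
    relative = [value / total for value in eigenvalues]
    return {
        variable: sum(l ** 2 * w for l, w in zip(values, relative))
        for variable, values in loadings.items()
    }

# Hypothetical values for three significant functions (NOT Table 6 values).
eigenvalues = [6.0, 3.0, 1.0]
loadings = {
    "days_to_50pct_flowering": [0.9, 0.1, 0.0],
    "seedling_emergence": [0.1, 0.1, 0.2],
}

potency = potency_index(loadings, eigenvalues)
for variable, value in sorted(potency.items(), key=lambda item: -item[1]):
    print(f"{variable}: {value:.3f}")
```

Because every significant function contributes, a variable that loads moderately on several functions can outrank one that loads heavily on a single weak function.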

In general, both methods determined that 50% flowering was the most discriminatory variable and emergence the least. They also determined that grain diameter and 100-grain weight ranked third and fourth in importance. The exact position reached by several variables, however, was not the same for the two methods. For example, in the most notable cases, DFA assigned greater importance to physiological maturity, even more than to grain diameter and 100-grain weight, whereas PCA allocated this variable to seventh position. In contrast, DFA notably improved the position of saponin content, which was penultimate with PCA.

Time to 50% flowering and physiological maturity were the most discriminatory variables for cluster classification and are strongly dependent on genotype. Likewise, grain diameter and 100-grain weight were highly discriminatory and, being highly heritable, are also dependent on genotype (Espíndola 1980). Among the morphological traits, panicle length, plant height and stem diameter were the most important variables; these form part of quinua’s yield components.

The methods complemented each other well in selecting the most discriminatory variables.

Mahalanobis distance

According to the Mahalanobis distances (D²) among clusters (Table 7), also calculated by DFA, the seven clusters were statistically different from each other. Clusters 2 and 7 were the most distant with 112.89 units. Cluster 7 grouped together the quinua accessions that were the last to mature and which originated in the lower altitude valleys of Cochabamba (from 2558 to 3100 m asl). In contrast, Cluster 2 gathered together the earliest-maturing quinua accessions, which came mainly from the southern areas of the Altiplano or from the Salares (from 3665 to 3700 m asl).
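The squared Mahalanobis distance between two cluster centroids is the difference of the centroids weighted by the inverse of the pooled within-cluster covariance matrix. The sketch below shows the calculation for two traits; the centroids and covariance matrix are hypothetical and do not reproduce Table 7, which was computed on all quantitative variables.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def mahalanobis_d2(mean_a, mean_b, pooled_cov):
    """Squared Mahalanobis distance: diff' * inverse(pooled_cov) * diff."""
    diff = [a - b for a, b in zip(mean_a, mean_b)]
    weighted = solve(pooled_cov, diff)
    return sum(d * w for d, w in zip(diff, weighted))

# Hypothetical centroids (days to maturity, grain diameter in mm) and a
# hypothetical pooled within-cluster covariance matrix.
centroid_early = [148.0, 2.31]
centroid_late = [204.0, 1.90]
pooled_cov = [[100.0, 0.0],
              [0.0, 0.02]]

print(round(mahalanobis_d2(centroid_early, centroid_late, pooled_cov), 2))
```

With an identity covariance matrix the result reduces to the squared Euclidean distance, which gives a quick sanity check.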

The most similar clusters were, on the one hand, Clusters 4 and 6 (7.34 units) and, on the other, Clusters 3 and 4 (7.81 units). The three clusters gathered together quinua accessions from the Altiplano shared by Peru and Bolivia.

Verifying the predictive ability of discriminatory functions

The classification matrix presented in Table 8 summarizes the predictive ability of discriminatory functions when classifying the different groups of germplasm. Each accession was assigned to a cluster based on discriminant functions. In Table 8, these are compared to the actual cluster membership of each accession. DFA is particularly informative because misclassified accessions were identified and reassigned to the appropriate group.

In general, the discriminatory functions reached a high degree of precision for group classification. In all cases, more than 87% of accessions were correctly assigned to clusters. The degree of total precision was highly significant according to the "Q" statistical test (Hair et al. 1992), indicating the high discriminatory ability of the classification matrix (Table 8).
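The "Q" test is assumed here to be Press's Q as presented in multivariate textbooks: Q = (N - nK)² / (N(K - 1)), where N is the number of accessions, n the number correctly classified and K the number of groups, compared with the chi-square critical value for one degree of freedom. The sketch below uses the study's N = 1512 and K = 7, but the number of correct classifications is hypothetical because the text does not give the overall total.

```python
def press_q(total, correct, groups):
    """Press's Q statistic for a classification matrix."""
    return (total - correct * groups) ** 2 / (total * (groups - 1))

# Chi-square critical value for 1 degree of freedom at the 0.001 level.
CRITICAL_0_001 = 10.828

total_accessions = 1512   # from the text
number_of_groups = 7      # from the text
correct = 1400            # HYPOTHETICAL count of correctly classified accessions

q = press_q(total_accessions, correct, number_of_groups)
print(f"Q = {q:.1f}; significant at 0.001: {q > CRITICAL_0_001}")
```

When classification is no better than chance (n = N/K), the numerator is zero and Q is zero.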

In Group 1, 204 of the 218 accessions were correctly classified (93.58%); the 14 misclassified accessions (6+4+2+2) corresponded, respectively, to Groups 2, 3, 4 and 5. In addition, 25 entries should be reassigned to add up to the 229 accessions predicted for Group 1. In Group 2, 93.66% of the entries were correctly classified. Among the accessions misclassified, 12 corresponded to Group 1 and only 2 accessions to Group 3. Six should be reassigned from Group 1 to add up to the 213 accessions predicted. The same procedure of interpretation should be performed for Groups 3, 4, 5, 6 and 7 (Table 8).
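The arithmetic for Group 1 can be checked with a few lines, using only the counts quoted in the paragraph above (the dictionary layout is just a convenient illustration of one row of Table 8).

```python
# Actual Group 1 accessions by the group the discriminant functions assigned
# them to (counts quoted in the text; no accessions went to Groups 6 or 7).
group1_row = {1: 204, 2: 6, 3: 4, 4: 2, 5: 2}
predicted_group1_total = 229  # accessions the functions assigned to Group 1

actual_total = sum(group1_row.values())
correct = group1_row[1]
misclassified = actual_total - correct
hit_rate = 100 * correct / actual_total
reassigned_in = predicted_group1_total - correct

print(f"Actual Group 1 accessions: {actual_total}")
print(f"Correctly classified: {correct} ({hit_rate:.2f}%)")
print(f"Misclassified into other groups: {misclassified}")
print(f"Reassigned to Group 1 from other groups: {reassigned_in}")
```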

Figure 1 shows graphically how accessions are classified into the seven groups according to the first two discriminatory functions. The first function separated groups 1, 2 and 3 very clearly from groups 4, 5, 6 and 7. Group 2 was the earliest maturing, with high harvest indexes and small plants. In contrast, the accessions in group 7 were the last to mature of all the germplasm in the collection, and had the lowest harvest indexes and big plants. The other groups occupied intermediate positions. The second discriminatory function distinctly separated groups 3, 4 and 6 from groups 1, 2, 5 and 7. Accessions in the first set of groups had small to medium seeds with low saponin content and long panicles. In the second set of groups, accessions were characterized by medium to large grains with high saponin content and short panicles.

Three areas of genetic diversity can be readily defined according to altitude: (1) the lower altitude valleys to which group 7 belongs (from 2558 to 3000 m asl), (2) the Altiplano (from 3665 to 3750 m asl) which includes most of the groups (1, 2, 3, 4 and 6), and (3) the higher altitude valleys (from 3000 to 3500 m asl), to which group 5 belongs and which represents a transitional zone between the lower altitude valleys and the Altiplano. Figure 2 shows the geographical location of a typical accession from each group of accessions.

In the Altiplano, group 2, which represents the accessions from the Salares, is found farthest south. These accessions are the earliest maturing, not just of the Andean region but of the entire germplasm collection. Accessions from group 6 were, in contrast, the last of the Altiplano groups to mature and were found farthest north. The maturity period of these quinua accessions thus varied gradually with latitude of origin. Groups 5 and 7 mature even later, but because of altitude rather than latitude.

The 12 quinua accessions from southern Chile included in the survey were expected to fall within the group of sea-level accessions (Lescano 1989; Tapia 1990). However, this did not occur because the traits considered were not sufficiently discriminatory to permit the classification of the accessions individually as Wilson (1988) did by using allozymes and describing in detail the leaf-blade surface.

Most quinua production is found in the Bolivian Altiplano. This is usually divided into three regions: the northern Altiplano, represented by the most predominant Bolivian quinua accessions of groups 4 and 6; the central Altiplano, mainly represented by group 3; and the southern Altiplano, represented by group 2. However, the limit between the central and southern Altiplano is not well defined and is thus controversial, the reason being that between these two regions is an intermediate area of quinua diversity, represented by group 1 (Fig. 2). According to this analysis, the Bolivian Altiplano has four diversity areas and not the three distinct production areas usually considered.


Thanks are due to Luigi Guarino of IPGRI for his kind collaboration in handling the preparation, translation and revision of the present article.


Arze, J., P. Reyes, D. Ramírez, J.L. Ramos and F. Molina. 1977. Ecofisiología de la quinua. Pp. 75-80 in Curso de Quinua. (Universidad Nacional Técnica del Altiplano, ed.). Fondo Simón Bolivar and IICA-UNTA, Puno, Peru.

Cayoja, M.R. 1996. Caracterización de variables contínuas y discretas del grano de quinua (Chenopodium quinoa Willd.) del germoplasma de la Estación Experimental Patacamaya. Thesis Fac. Agron., Univ. Técnica, Oruro, Bolivia.

Clifford, H.T. and W. Stephenson. 1975. An introduction to numerical classification. Academic Press, New York, USA.

Crisci, J.V. and M.F. Lopez. 1983. Introducción a la teoria y practica de la taxonomia numérica. Secretaria General de la Organización de los Estados Americanos. Washington, D.C. USA.

Dillon, W.R. and M. Goldstein. 1984. Multivariate analysis: methods and applications. Academic Press, New York, USA.

Espíndola, G. 1980. Análisis de los componentes de rendimiento de la quinua. Thesis Fac. Cienc. Agric. Pecu., Univ. Mayor de San Simón, Cochabamba, Bolivia.

Gandarillas, H. 1968. Razas de quinua. Boletín informativo no. 34. Ministerio de Asuntos Campesinos y Agropecuarios, La Paz, Bolivia.

Hair, J.F., R.E. Anderson, R.L. Tatham and W.C. Black. 1992. Multivariate data analysis. Macmillan Publishing Company, New York, USA.

International Board for Plant Genetic Resources. 1981. Descriptores de quinua. IBPGR, Rome, Italy.

Kaiser, H.F. 1960. The application of electronic computers to factor analysis. Educ. Psychol. Meas. 20:141-151.

Lescano, J.L. 1989. Avances sobre los recursos fitogenéticos altoandinos. Pp. 19-35 in Curso: "Cultivos altoandinos". Potosí, Bolivia.

Ochoa, J. and E. Peralta. 1988. Evaluación preliminar morfológica y agronómica de 153 entradas de quinua en Santa Catalina, Pichincha. Pp. 137-142 in Actas del VI Congreso Internacional sobre Cultivos Andinos. Quito, Ecuador.

Risi, J. and N.W. Galwey. 1989a. The pattern of genetic diversity in the Andean grain crop quinoa (Chenopodium quinoa Willd.). I. Associations between characteristics. Euphytica 41:147-162.

Risi, J. and N.W. Galwey. 1989b. The pattern of genetic diversity in the Andean grain crop quinoa (Chenopodium quinoa Willd.). II. Multivariate methods. Euphytica 41:135-145.

Riveros, M.E. 1997. Evaluación agronómica de ecotipos de quinua (Chenopodium quinoa Willd.) en el sur de Chile. Thesis Fac. Cienc. Agrarias, Univ. Austral de Chile, Valdivia, Chile.

Scaff, R.T. 1996. Caracterización agronómica y morfológica de accesiones de quinua de la zona sur de Chile. Thesis Fac. Cienc. Agrarias, Univ. Austral de Chile, Valdivia, Chile.

Steel, R.G. and J.H. Torrie. 1988. Principles and procedures of statistics. McGraw Hill, New York, USA.

Tapia, E.M. 1990. Cultivos andinos subexplotados y su aporte a la alimentación. Food and Agriculture Organization, Rome, Italy.

Wilkinson, L. 1988. SYSTAT: The system for statistics. Systat Inc., Evanston, IL, USA.

Wilson, H.D. 1988. Quinua biosystematics, I: Domesticated populations. Econ. Bot. 42:461-477.


Copyright © Bioversity International - FAO. All rights reserved.
