Multivariate genetic analyses unveil the complexity of grain yield and attributing traits diversity in Oryza sativa L. landraces from North-Eastern India

In the North-Eastern region of India, rice stands as the predominant staple, with diverse cultivars evolving over the past six decades. This study systematically evaluated 20 rice landraces, analyzing eleven variables related to yield and its attributing traits. The aim was to identify promising genotypes for potential breeding programs and to ascertain the minimum number of components essential for explaining the total diversity. Among the eleven principal components (PCs) examined, four PCs exhibited eigenvalues surpassing 1.0, collectively contributing to 80.45% of the total variability in the traits. PC1, which explained 31.19% of the overall variance, was associated with plant height, days to 50% flowering, panicle length, grain breadth, and grain length-to-breadth ratio. Utilizing cluster analysis, the 20 rice landraces were categorized into seven distinct clusters. Maximum inter-cluster divergence was observed between clusters VI and I, as well as clusters VI and V, indicating greater genetic distinctiveness among genotypes in these clusters compared to others. Notably, rice landraces such as Borosolpana, Phougak, Satyaranjan, Kakcheng Phou, Moniram, Kanaklata, and Bahadur were identified as genetically divergent. These genotypes hold promise for generating segregating populations, serving as valuable source materials for targeted yield improvement through meticulous selection, as indicated by inter-cluster distances.


Introduction
Rice (Oryza sativa L.) stands as a crucial linchpin for global food security, serving as a dietary staple for over half of the world's population. The cultivation of rice has been intricately woven into human history for millennia, leading to the emergence of diverse landraces and locally adapted varieties shaped by natural and farmer-driven selection processes. These landraces serve as invaluable genetic reservoirs, harboring essential traits for sustainable and resilient production systems (1). With the global population expected to reach 8 billion in November 2022 and India poised to surpass China as the most populous country in 2023 (World Population Prospects, 2022), there is an urgent imperative to augment rice production to meet escalating demands. Achieving a 50% increase in rice production necessitates a focused effort by breeders to develop cultivars not only characterized by heightened yield potential but also possessing desirable agronomic traits (3). Effectively addressing these challenges requires a profound understanding of the genetic diversity within rice landraces. This paper aims to elucidate the significance of rice landraces as genetic resources and their pivotal role in contemporary agricultural research and development. Additionally, it underscores the need to harness this genetic diversity through multivariate analysis, which provides a robust toolkit to dissect the intricate relationships between genotypic and phenotypic variations (4). This analytical approach holds great promise for steering breeding programs towards the creation of improved rice varieties characterized by heightened adaptability, productivity, and resilience in the face of current and future challenges.
A comprehensive investigation (5) examined the genetic diversity of thirty-two elite rice varieties. Utilizing Mahalanobis' D² statistics and Principal Component Analysis (PCA), the authors discerned pivotal factors influencing genetic variability. Time to maturity, single plant yield, and days to 50 percent flowering emerged as primary contributors to genetic diversity. The varieties were stratified into seven clusters. PCA revealed that the first three components explained 77.3 percent of the variability, with the first component, encapsulating single plant yield, panicles per plant, and time to maturity, accounting for 40.6 percent. This research yields crucial insights into the genetic diversity inherent in elite rice varieties.
The principal aim of this study was to delineate the inherent diversity among rice varieties. To achieve this objective, Mahalanobis D² statistics were employed as a robust statistical tool to quantify genetic distances between landraces. This methodology proves highly efficacious in categorizing varieties into discrete groups (6,7). Furthermore, PCA complements the insights derived from the D² analysis by reinforcing the identification of key contributors to the variability between genotypes (8,9). Through PCA, essential independent variables are succinctly represented while preserving the original variability (10). The combination of Mahalanobis D² statistics and PCA is of paramount significance in the selection of genetically divergent parents from a pool of 20 rice landraces, a critical aspect for the success of breeding programs.

Experimental site
The current investigation was conducted at Uttar Banga Krishi Viswavidyalaya situated in Pundibari, Coochbehar, West Bengal, within the terai region of Bengal. The geographical coordinates of the site were 26°19'86" N latitude and 89°23'53" E longitude, at an elevation of 43 meters above mean sea level.

Experimental materials
Twenty (20) distinct rice genotypes were procured from the Regional Agricultural Research Station located in Titabar, Jorhat, Assam, for the experimentation (Table 1). The study was conducted over two seasons in 2017, encompassing both the pre-kharif and kharif periods. The soil at the experimental site was a sandy loam with optimal drainage and a pH of 5.74.

Experimental layout
A Randomized Complete Block Design (RCBD) experiment employing three replications was conducted to assess various genotypes for both yield and yield attributing traits. The experimental design comprised 20 treatments, with each plot measuring 2 m by 1.5 m. Plots were spaced at a 50 cm interval, with plant-to-plant spacing set at 15 cm and row-to-row spacing at 20 cm. Standardized amounts of nitrogen (N), phosphorus (P), and potassium (K) fertilizers were applied at rates of 60, 40, and 40 kg per acre, respectively. During land preparation, the entire dose of potash and phosphorus, along with half of the nitrogen, was applied as the basal dose. Subsequently, 22 days post-transplanting, one-fourth of the nitrogen was administered as the first top dressing, followed by the application of the remaining one-fourth nitrogen as the second top dressing 22 days later. Intercultural operations and crop protection measures were implemented as needed throughout the experimental duration.
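An RCBD with one value per genotype per replication is usually analysed with a two-way ANOVA that separates treatment (genotype) and block (replication) effects from error. The sketch below is illustrative only and is not the authors' code; the function name `rcbd_anova` and the 3 x 3 demo data are hypothetical. For the 20 genotypes and 3 replications described above, the same formulas give 19 treatment and 38 error degrees of freedom.

```python
import numpy as np

def rcbd_anova(y):
    """Two-way ANOVA for an RCBD.

    y has shape (treatments, blocks) with one observation per cell.
    Returns sums of squares, degrees of freedom and the treatment F ratio.
    """
    y = np.asarray(y, dtype=float)
    t, r = y.shape
    grand = y.mean()
    ss_total = ((y - grand) ** 2).sum()
    ss_treat = r * ((y.mean(axis=1) - grand) ** 2).sum()   # genotype effect
    ss_block = t * ((y.mean(axis=0) - grand) ** 2).sum()   # replication effect
    ss_error = ss_total - ss_treat - ss_block
    df_treat, df_block = t - 1, r - 1
    df_error = df_treat * df_block
    f_treat = (ss_treat / df_treat) / (ss_error / df_error)
    return {
        "ss_treat": ss_treat, "ss_block": ss_block, "ss_error": ss_error,
        "df_treat": df_treat, "df_block": df_block, "df_error": df_error,
        "F_treat": f_treat,
    }

# Hypothetical data: 3 genotypes (rows) x 3 replications (columns).
demo = [[10, 12, 11],
        [14, 15, 16],
        [20, 19, 21]]
result = rcbd_anova(demo)
print(result["df_treat"], result["df_error"], round(result["F_treat"], 2))
```

The F ratio is compared against the F distribution with (treatment, error) degrees of freedom to judge whether genotypes differ significantly for a trait.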

Name of Landraces Denotation
Phougak

Observations recorded
Plant height (PH) was quantified in centimeters, measuring from the base to the apex of the primary panicle before the plants were harvested. Days to 50% flowering (DF) were recorded when half of the population within the experimental plot exhibited panicle emergence. Panicle length (PL) was determined in centimeters, measured from the basal node of selected plants to the tip of the uppermost spikelet. Effective tillers per plant (ETP) were assessed at maturity by counting from five randomly chosen plants, with the total divided by five to yield the effective tillers per plant in each replication. Filled grains per tiller (FGT) were calculated as the average number of fertile grains on 10 randomly selected panicles. Grain dimensions, including length (GL) and breadth (GB), were measured with digital slide calipers on 10 randomly chosen rice grains with husks, and the mean values were computed.
The grain length-to-breadth ratio (GLBR) was determined as the averaged ratio of length to breadth for 10 grains. Thousand grain weight (TGW) was assessed by weighing selected grains with an electric balance, and the mean was calculated. Grain yield per plant (GYP) was determined by harvesting five randomly selected plants, and the obtained grain amount was divided by five to derive the yield per plant. Harvest index (HI) was calculated and expressed as a percentage for each plot.
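The derived traits above reduce to simple averages and ratios. The sketch below illustrates them with hypothetical measurements; the helper names are invented for this example. The harvest index formula (grain yield divided by total biological yield, times 100) is the conventional definition and is an assumption here, since the text does not spell it out.

```python
from statistics import mean

def grain_lb_ratio(lengths_mm, breadths_mm):
    """GLBR: mean of the per-grain length/breadth ratios."""
    if len(lengths_mm) != len(breadths_mm):
        raise ValueError("each grain needs both a length and a breadth")
    return mean(l / b for l, b in zip(lengths_mm, breadths_mm))

def per_plant_mean(plant_values):
    """ETP or GYP: total over the sampled plants divided by their number."""
    return sum(plant_values) / len(plant_values)

def harvest_index(grain_yield_g, biological_yield_g):
    """HI in percent (assumed conventional definition)."""
    return 100.0 * grain_yield_g / biological_yield_g

# Hypothetical measurements for one plot.
glbr = grain_lb_ratio([8.0, 9.0], [2.0, 3.0])
gyp = per_plant_mean([18.0, 20.0, 22.0, 19.0, 21.0])
hi = harvest_index(20.0, 50.0)
print(glbr, gyp, hi)
```

Note that the mean of per-grain ratios generally differs from the ratio of mean length to mean breadth; the function follows the "averaged ratio" wording of the text.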

Statistical analysis
The statistical analysis in this study involved aggregating average data from each replication. To ascertain the significance of variance among distinct genotypes (treatments), an analysis of variance for a randomized block design was carried out (11). Genetic diversity within a pool of 20 landraces was assessed using Mahalanobis D² statistics (12). The resulting clustering pattern was visualized through hierarchical clustering (13) utilizing the dendextend package in R software (14) to elucidate relationships among genotypes based on their similarities and differences. Furthermore, principal components with eigenvalues surpassing one were explored using the FactoMineR package (15). This transformation converted the original set of variables into a new set of uncorrelated variables, denoted as principal components. Biplots, generated using the factoextra package (16) in R, were employed to visualize relationships between quantitative variables and individual observations, as well as to depict the clustering structure. The resulting dendrogram was integrated into these visualizations. These methodologies collectively offered a comprehensive evaluation of genetic diversity and relationships among the investigated genotypes.
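For readers unfamiliar with the D² statistic, the following Python sketch shows how pairwise values can be computed from genotype mean vectors and a common covariance matrix. It is a minimal illustration rather than the authors' R workflow; the function name `mahalanobis_d2`, the use of a single pooled covariance matrix, and the demo numbers are all assumptions.

```python
import numpy as np

def mahalanobis_d2(means, cov):
    """Pairwise Mahalanobis D^2 between genotype mean vectors.

    means: array (genotypes, traits) of genotype means.
    cov:   array (traits, traits), assumed here to be the pooled
           error variance-covariance matrix of the traits.
    """
    means = np.asarray(means, dtype=float)
    inv_cov = np.linalg.inv(np.asarray(cov, dtype=float))
    diff = means[:, None, :] - means[None, :, :]      # (g, g, p) differences
    # D^2_ij = diff_ij^T * inv_cov * diff_ij for every pair (i, j)
    return np.einsum("ijk,kl,ijl->ij", diff, inv_cov, diff)

# Hypothetical example: 3 genotypes, 2 traits.
demo_means = np.array([[120.0, 25.0],
                       [135.0, 27.0],
                       [150.0, 22.0]])
demo_cov = np.array([[25.0, 2.0],
                     [2.0, 4.0]])
print(np.round(mahalanobis_d2(demo_means, demo_cov), 2))
```

When the covariance matrix is the identity, D² reduces to the squared Euclidean distance, which is one way to sanity-check an implementation.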

Results and discussion
The genetic diversity within a population significantly influences genetic advancement. Hence, it is imperative to assess the genetic diversity among a collection of breeding materials, enabling the categorization of genotypes into genetically analogous and dissimilar types. Genotypes displaying significant genetic divergence are expected to be conducive to recombination breeding, thereby offering a broad array of genetic variations and increased opportunities for the emergence of transgressive segregants (17,18). Consequently, the degree of genetic divergence among the existing rice genotypes was assessed.
Utilizing Euclidean genetic distance measurements between potential pairs of test landraces, the research categorized the 20 rice landraces into seven distinct genetic clusters based on their grain yield and attributing traits, as outlined in Table 2. Notably, Cluster IV emerged as the largest cluster, encompassing six landraces (Joymati, Mahsuri, Haripowasali, Kushal, Piolee, and Teti Sali). Subsequently, Cluster II featured four landraces (Disang, Luit, Dhansiri, and Diphalu), and Cluster V comprised three landraces (Satyaranjan, ChakhaoSempak, and Phourin Nakuppi). Clusters I, III, and VII each accommodated two landraces: Phougak and Moniram in Cluster I, Kakcheng Phou and Kanaklata in Cluster III, and Ranjit and Bahadur in Cluster VII, while Cluster VI was monogenotypic, containing only Borosolpana, as detailed in Table 2. The clustering was visually represented through a dendrogram, illustrating the relationships among the 20 rice landraces concerning both yield and yield attributing traits. Fig. 1 illustrates the outcomes of clustering analyses, revealing the grouping patterns of 20 rice landraces through D² analysis. Intriguingly, the examination disclosed seven distinct clusters, challenging the conventional assumption that germplasm with a shared origin would inherently belong to the same category. Instead, it underscored that genetic divergence was predominantly influenced by morphological traits rather than geographical provenance alone. Moreover, the application of Mahalanobis D² statistics validated the clear categorization of the 20 rice landraces into seven groups (Fig. 1). Subsequent chi-square testing accentuated highly significant differences (p < 0.01) among these clusters, aligning with earlier research (19) that utilized D² analysis to classify rice genotypes into seven clusters, thereby affirming the robustness of this analytical approach. The observed genetic diversity, leading to the formation of distinct solitary clusters, is presumed to result from geographical barriers impeding gene flow or the effects of intense natural and human selection for unique and adaptive gene complexes. This finding aligns with similar results reported in a previous study (20). A previous investigation on rice genotypes (22) identified ten clusters, where cluster I comprised a single genotype, while clusters II and III each encompassed nine genotypes.
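The grouping step can be sketched as hierarchical clustering on a pairwise distance matrix, cut at a chosen number of clusters. The study used R (dendextend); the Python version below, built on SciPy, is only an analogous illustration. The average-linkage choice, the function name, and the toy distances are assumptions and not details reported in the text.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def cluster_from_distances(dist, n_clusters, method="average"):
    """Cut a hierarchical tree built from a square distance matrix.

    dist must be symmetric with a zero diagonal. Returns one integer
    label per genotype; the label numbering itself is arbitrary.
    """
    condensed = squareform(np.asarray(dist, dtype=float), checks=True)
    tree = linkage(condensed, method=method)
    return fcluster(tree, t=n_clusters, criterion="maxclust")

# Toy example: five genotypes placed on a line, forming three obvious groups.
positions = np.array([0.0, 0.1, 5.0, 5.2, 10.0])
toy_dist = np.abs(positions[:, None] - positions[None, :])
labels = cluster_from_distances(toy_dist, n_clusters=3)
print(labels)
```

With the paper's data, `dist` would be the 20 x 20 matrix of pairwise distances between landraces and `n_clusters` would be 7.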
In Table 3, it is discernible that Cluster I manifests the highest mean intra-cluster distance for grain yield and its associated traits, registering a value of 12.87. Closely following are Clusters III and V, exhibiting distances of 9.60, succeeded by Cluster II at 8.97, and Cluster IV at 7.18. Cluster VI, being monogenotypic, records an intra-cluster distance of 0, while Cluster VII displays a distance of 3.81. Notably, Clusters VI and I demonstrate the most substantial inter-cluster distance at 32.33, succeeded by Clusters VI and V (29.18), III and I (28.79), V and I (27.75), VI and III (27.15), and finally, Clusters VII and I with a distance of 26.51. Based on the inter-cluster distances, it is anticipated that hybridization between genotypes from Cluster VI with Cluster V, Cluster III with I, and Cluster V with I could yield promising candidates with enhanced grain yield and other pivotal traits in rice (Table 3). Moreover, incorporating genotypes from more distantly positioned clusters in hybridization programs has the potential to introduce a broader spectrum of variability, facilitating the generation of superior candidates for grain yield and essential agronomic characteristics. It is recommended to select parents from two clusters with wider inter-cluster distances, as suggested by prior investigators (23,24), to achieve a robust heterotic impact and foster high variability.
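A table of this kind (intra-cluster values on the diagonal, inter-cluster values off the diagonal) can be derived from a pairwise D² matrix and cluster labels. The sketch below assumes plain averaging of the pairwise values, with a monogenotypic cluster assigned 0; the exact averaging convention used by the authors is not stated, and the function name and demo matrix are hypothetical.

```python
import numpy as np

def cluster_distance_table(d2, labels):
    """Average intra- (diagonal) and inter-cluster (off-diagonal) D^2.

    d2:     square symmetric matrix of pairwise D^2 values.
    labels: cluster label for each genotype, in the same order as d2.
    """
    d2 = np.asarray(d2, dtype=float)
    labels = np.asarray(labels)
    ids = sorted(set(labels.tolist()))
    table = np.zeros((len(ids), len(ids)))
    for a, ca in enumerate(ids):
        ia = np.flatnonzero(labels == ca)
        for b, cb in enumerate(ids):
            ib = np.flatnonzero(labels == cb)
            block = d2[np.ix_(ia, ib)]
            if a == b:
                n = len(ia)
                # mean over distinct pairs; a single-member cluster gets 0
                table[a, b] = block[np.triu_indices(n, k=1)].mean() if n > 1 else 0.0
            else:
                table[a, b] = block.mean()
    return ids, table

# Hypothetical D^2 matrix for four genotypes in three clusters.
demo_d2 = np.array([[0, 2, 10, 20],
                    [2, 0, 12, 22],
                    [10, 12, 0, 30],
                    [20, 22, 30, 0]], dtype=float)
ids, table = cluster_distance_table(demo_d2, [1, 1, 2, 3])
print(ids)
print(table)
```

The largest off-diagonal entries then point to the cluster pairs from which divergent parents would be drawn.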
The examination of genetically divergent clusters and the calculation of distances (D² values) among selected Oryza sativa L. landraces are presented in Table 4. Upon close scrutiny of the distances, Borosolpana in cluster VI and Phougak in cluster I exhibited a conspicuously high genotypic distance (D² = 34.12). Analogous patterns of substantial genetic distances between landraces were discerned in other clusters, such as the case of Borosolpana in cluster VI and Satyaranjan in cluster V (D² = 32.31). Kakcheng Phou in cluster III and Moniram in cluster I manifested a considerable genotypic distance (D² = 34.70). Furthermore, Satyaranjan in cluster V and Phougak in cluster I displayed a substantial genotypic distance (D² = 30.42), and Bahadur in cluster VII and Phougak in cluster I exhibited a noteworthy distance as well (D² = 30.11) (Table 4). The identification of parental lines from these distinct clusters holds considerable potential for hybridization programs, as mating genetically divergent parents often leads to enhanced variability and the emergence of transgressive segregants with robust heterotic effects. The magnitude of inter-cluster distances directly corresponds to the level of variability among genotypes between clusters and vice versa. Consequently, a deliberate mating strategy involving genotypes from these clusters has the potential to yield robust hybrids or recombinants with heightened vigor. This observation aligns with previous findings (25) reported in the literature.
Among the individual traits, PH contributed the maximum to total divergence at 35.2% (Fig. 2), followed by HI at 16.4%, DF at 12.2%, FGT at 10.6%, and GYP at 7.9%. This information underscores the significance of these traits in genotype selection, with plant height emerging as a primary determinant for divergence and potential utility in parent selection (26,27).
PH = plant height (cm), DF = days to 50% flowering, PL = panicle length (cm), ETP = effective tillers plant⁻¹, FGT = filled grains tiller⁻¹, GL = grain length (mm), GB = grain breadth (mm), GLBR = grain length-to-breadth ratio, TGW = 1000 grain weight (g), HI = harvest index (%), and GYP = grain yield plant⁻¹ (g).

Principal component analysis (PCA)
PCA is a multivariate statistical method that retains a reduced set of principal components, effectively discarding less critical information. This process involves arranging genotypes based on PC scores, prioritizing and elucidating the major variability within the overall variance (28,29). The determination of the number of retained factors is guided by eigenvalues. Table 6 presents the eigenvalues, percentage of variance, and cumulative variance for quality-related traits across 20 landraces. Our investigation identified eleven PCs, of which only the first four had eigenvalues exceeding one and were therefore considered significant. The remaining seven PCs offered limited additional insights and only partially accounted for the observed variation. The four retained components, PC1, PC2, PC3, and PC4, had eigenvalues of 3.43, 2.63, 1.51, and 1.27, respectively. Collectively, these four components explained approximately 80.45% of the total variation. To gain a more nuanced understanding, emphasis was placed on the variation associated with these four PCs (Table 6). Notably, characters in the first principal component with absolute values closer to unity exerted a more pronounced influence on grouping than those with smaller absolute values nearing zero.
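The link between the eigenvalues and the percentages can be checked with a few lines of arithmetic. The sketch below assumes the PCA was run on standardized traits (a correlation matrix), so that the total variance equals the number of traits, 11; this is an assumption, not a detail stated in the text. The eigenvalues are those quoted above.

```python
def explained_variance(eigenvalues, n_traits):
    """Percent of total variance per component for a correlation-matrix PCA,
    where the eigenvalues sum to the number of traits."""
    return [100.0 * ev / n_traits for ev in eigenvalues]

reported = [3.43, 2.63, 1.51, 1.27]              # PC1..PC4, as quoted in the text
retained = [ev for ev in reported if ev > 1.0]   # eigenvalue-greater-than-one rule
percent = explained_variance(retained, n_traits=11)
cumulative = sum(percent)

for i, p in enumerate(percent, start=1):
    print(f"PC{i}: {p:.2f}%")
print(f"cumulative: {cumulative:.2f}%")
```

Under that assumption, PC1 comes out at about 31.2% and the four components at roughly 80.4% together, close to the reported 31.19% and 80.45%; the small gap is presumably due to rounding of the printed eigenvalues.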
After PC4, the scree curve flattened toward a nearly straight line, with minimal change from one PC to the next. The scree plot clearly shows that PC1 captures the highest proportion of variance compared to the other PCs (Fig. 3). Consequently, favoring the lines derived from PC1 for inclusion in genetic improvement programs holds considerable advantages. These findings are consistent with the outcomes of prior investigations (30) on the distribution of variance across principal components. In a study involving forty rice genotypes, (31) reported that PC1 accounted for 33% of the variability, while PC2, PC3, and PC4 contributed 14.3%, 11.4%, and 9%, respectively. Similarly, a separate investigation (32) demonstrated that the first two principal components, with eigenvalues exceeding 1, explained 23% of the variation in sixteen agro-morphological parameters across twenty-three rice germplasm lines.
In an earlier study analyzing 31 rice germplasm lines, (33) found that the first five components with eigenvalues greater than 1 collectively explained 82.90% of the overall variance. Moreover, research on rice landraces (34) revealed that, among thirteen principal components, five were statistically significant, together amounting to 84.67% of the cumulative variance. A prior investigation in pearl millet, involving forty germplasm lines, revealed that the first six principal components, each with an eigenvalue surpassing one, collectively contributed 78.29% of the observed variability (35).
Analysis of Eleven Key Yield-Related Components in Twenty Rice Landraces: Insights from Multivariate Genetic Assessment
PC1, explaining 31.2% of the total variability, underscores discriminatory traits such as PH (0.467), PL (0.473), GLBR (0.418), and DF (0.339). This suggests their pivotal role in delineating the inherent quality potential of each landrace, leading to distinct cluster formations. Similarly, PC2, contributing 23.8% to the overall variation, is notably influenced by ETP (0.481) and FGT (0.343) (Table 7). These findings underscore the richness of trait variants within rice landraces, implying substantial prospects for genetic advancements in the investigated traits.
PC1 and PC2 were utilized to construct a biplot, facilitating an examination of the interrelationships among twenty rice landraces based on yield and yield-related traits in the present study (Fig. 4). Together, the two components underlying the biplot explained 55.06% of the total variation, providing a reasonable approximation for understanding the impact of traits on yield and their inter-similarities. Notably, characters such as DF, PL, PH, GLBR, GL, GB, and ETP exhibited longer vector lengths, indicating a more substantial influence on the variation in a particular dimension. Conversely, traits like FGT, GYP, HI, and TGW displayed shorter vector lengths, suggesting a limited contribution to the observed variation. Piolee, Satyaranjan, and Dhansiri formed a distinctive group in the first (upper right) quadrant of the biplot, exhibiting positive values for both PCs. Traits such as GYP, FGT, and DF were positioned in the same quadrant, as illustrated in Fig. 4. Previous research on South Indian rice landraces identified positive correlations between PC1 and PC2 for traits such as spikelet fertility, kernel length, and length-to-breadth ratio. Similarly, correlations were observed for eight traits, including days to fifty percent flowering, plant height, flag leaf width, panicle length, number of grains, spikelet fertility, length-to-breadth ratio, and single plant yield. The grouping of traits within distinct principal components suggests potential priority selection in breeding programs, emphasizing their tendency to co-segregate (36,37).
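The vector lengths discussed above follow from the loadings and eigenvalues. In a PCA on standardized traits, a variable's coordinate on a component is its eigenvector loading multiplied by the square root of that component's eigenvalue, and its arrow length in the PC1-PC2 plane is the Euclidean norm of the two coordinates. The sketch below assumes the values quoted in the text (for example PH = 0.467 on PC1) are eigenvector loadings; the PC2 loading of 0.10 for PH is purely hypothetical, since the text does not report it.

```python
import math

def variable_coordinate(loading, eigenvalue):
    """Coordinate of a trait on one PC (its correlation with that PC)."""
    return loading * math.sqrt(eigenvalue)

def vector_length(loading_pc1, loading_pc2, eig_pc1, eig_pc2):
    """Length of a trait's arrow in the PC1-PC2 biplot plane."""
    return math.hypot(variable_coordinate(loading_pc1, eig_pc1),
                      variable_coordinate(loading_pc2, eig_pc2))

EIG_PC1, EIG_PC2 = 3.43, 2.63          # eigenvalues quoted in the text

ph_pc1 = variable_coordinate(0.467, EIG_PC1)             # PH on PC1
ph_length = vector_length(0.467, 0.10, EIG_PC1, EIG_PC2)  # 0.10 is hypothetical
print(round(ph_pc1, 3), round(ph_length, 3))
```

A length near 1 means the trait is almost fully represented by the first two components, while a short arrow means most of its variation lies in later components.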
In the presented study, a comprehensive examination of the contribution of various yield attributing traits to PCs was conducted, as illustrated in the corrplot (Fig. 5).
Table 7. Eleven principal components along with their factor loadings for yield and its attributing traits of 20 rice landraces.

Conclusion
The analysis of genetic diversity among rice landraces provides comprehensive insights into improving grain yield and morphological traits. Clustering based on genetic distance reveals seven distinct clusters that challenge the assumption that landraces of shared origin group together, highlighting the influence of morphological traits over geographical origin. Statistical tests and Mahalanobis D² analysis confirm significant differences among clusters, emphasizing opportunities for hybridization between genetically divergent landraces like Borosolpana, Phougak, Satyaranjan, Kakcheng Phou, Moniram, Kanaklata, and Bahadur to create superior segregating populations. Traits such as PH, HI, and DF markedly contribute to divergence among clusters. PCA of twenty rice landraces identifies four pivotal components (PC1 to PC4) explaining about 80.45% of the total variability. PC1, accounting for 31.2% of the variability, highlights crucial traits like PH, PL, GLBR, and DF. Biplot analysis showcases distinct groups among landraces based on yield traits, emphasizing trait correlations with specific PCs. This insight directs targeted genetic improvement efforts, particularly focusing on PC1 to select lines with superior genetic potential for quality traits and yield.

Fig. 1. Dendrogram showing the clustering of 20 landraces of rice for yield and yield attributing traits.

Fig. 3. Scree plot showing the clustering of 20 landraces of rice for yield and its attributing traits.

Table 1. List of 20 landraces of rice.

Table 2. Grouping of 20 landraces of rice on different clusters for yield and yield attributing traits.
Another study on rice diversity analysis (21) allocated twenty-six genotypes into six clusters, with the maximum inter-cluster distance noted between clusters V and VI.

Table 3. Average intra (diagonal) and inter-cluster (off-diagonal) D² values of 20 rice landraces for different yield and yield attributing traits.

Table 5 illustrates the mean values of eleven yield and yield-related traits within each identified cluster. In Cluster I, comprising two distinct landraces, noteworthy attributes included maximum GB at 2.35 mm, highest GYP at 20.24 g, moderate FGT at 368.50, and TGW at 21.92 g. Additionally, Cluster I exhibited the shortest PH at 123.67 cm and PL at 22.99 cm. Cluster II, consisting of four landraces, exhibited characteristics such as high HI at 52.
HI at 55.83, and GYP at 14.98 g. Lastly, Cluster VII, consisting of two landraces, exhibited lengthy PL at 28.16 mm and GB at 2.24 mm. However, this cluster also presented the lowest HI at 33.52 and GYP at 14.98 g. The contribution of individual yield and yield-related traits to total divergence is depicted in Fig. 2.

Table 4. Genetically divergent clusters and distance (D² value) between the landraces selected for yield and its attributing traits in 20 landraces of rice.

Table 5. Cluster means of 20 landraces of rice for different yield and yield attributing traits.

Table 6. Principal component analysis eigenvalues, variance and cumulative variance of yield and attributing traits in rice landraces.
(39) identified 24 genotypes and specific traits, such as Days to 50% Flowering, Plant Height, and Grain Width, displaying positive values in the biplot between PC1 and PC2. A separate investigation on Indonesian local rice germplasm revealed that PC1 was predominantly influenced by Productive Tiller Number, accounting for 32.54% of the total variability. Additionally, Culm Length and Plant Height collectively contributed 22.1% to the overall variability. Furthermore, PC3, driven by Panicle Length, Culm Diameter, and Flag Leaf Width, explained 9.93% of the overall variation. Notably, the top two PCs collectively explained 54.65% of the total variation. Moreover, in-depth research demonstrated that PCA of traits such as Panicle Exertion, Flag Leaf Blade Width, and Panicle Length generated five PCs, collectively contributing 80.00% towards the observed diversity (39).