Accurate prediction of crop yields is not only a scientific challenge but also an economic necessity, as it directly influences food security, market stability and efficient resource allocation in agriculture. This study is driven by the hypothesis that integrating satellite-based vegetation data with machine learning (ML) can substantially improve yield forecasting accuracy under semiarid climatic conditions, thereby reducing financial risks for farmers and agribusinesses. To test this hypothesis, we developed and compared multiple data-driven prediction models for three key crops – peas, rapeseed and wheat – that are major contributors to regional agricultural income. We used freely available satellite imagery from the Sentinel-2 mission to calculate several vegetation indices that describe crop greenness, canopy structure and water content. These indices were analyzed to determine which combination best captures the relationship between crop condition and final yield. To test robustness, we augmented the dataset with controlled random noise and assessed model stability. Nine ML approaches were compared, and gradient boosting consistently delivered the most accurate results, achieving up to 99 % agreement with
observed yields and average errors below 5 %. The most informative vegetation indices differed among crops, offering interdisciplinary insight into how crop physiology and environmental stress interact with spectral indicators. The main contribution of this research is a crop-specific optimization strategy that connects remote sensing, agronomy and data science in a single predictive framework. This approach can be applied immediately to improve yield estimation systems at regional and national scales, potentially reducing forecasting uncertainty by 20–30 % and saving agricultural producers millions of euros annually through optimized input management and market planning. Future research should focus on integrating weather forecasts, soil moisture data and economic models to transform yield prediction into a comprehensive decision-support system for precision agriculture. These findings therefore provide a practical pathway toward data-driven, climate-resilient and economically sustainable crop production worldwide.
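The workflow summarized above (spectral indices, noise-based augmentation, accuracy assessment) can be illustrated with a minimal Python sketch. It is not the implementation used in the study, and several elements are assumptions made purely for illustration: the abstract does not name the indices, so two normalized-difference indices (an NDVI-style greenness index from NIR and red, and a moisture index from NIR and SWIR) stand in for them; the "controlled random noise" is assumed to be Gaussian jitter on the features; and "agreement" and "average error" are assumed to correspond to the coefficient of determination (R²) and the mean absolute percentage error (MAPE). All reflectance and yield values are placeholders, not data from the study.

```python
import random
from statistics import mean


def normalized_difference(band_a, band_b):
    """Generic normalized-difference index, e.g. NDVI = (NIR - red) / (NIR + red)."""
    return (band_a - band_b) / (band_a + band_b)


def augment_with_noise(rows, targets, copies=3, sigma=0.01, seed=0):
    """Return the original samples plus `copies` jittered duplicates of each row.

    Gaussian noise with standard deviation `sigma` is an assumption; the
    abstract only states that controlled random noise was added.
    """
    rng = random.Random(seed)
    aug_rows, aug_targets = list(rows), list(targets)
    for _ in range(copies):
        for row, target in zip(rows, targets):
            aug_rows.append([value + rng.gauss(0.0, sigma) for value in row])
            aug_targets.append(target)
    return aug_rows, aug_targets


def r_squared(observed, predicted):
    """Coefficient of determination (assumed meaning of 'agreement')."""
    mu = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mu) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mape(observed, predicted):
    """Mean absolute percentage error (assumed meaning of 'average error')."""
    return 100.0 * mean(abs(o - p) / abs(o) for o, p in zip(observed, predicted))


# Placeholder surface reflectances (red, NIR, SWIR) for three hypothetical fields.
fields = [(0.05, 0.45, 0.18), (0.08, 0.35, 0.22), (0.04, 0.50, 0.15)]
features = [
    [normalized_difference(nir, red), normalized_difference(nir, swir)]
    for red, nir, swir in fields
]
yields_t_ha = [4.1, 3.2, 4.6]  # placeholder yields, not measured values

# Augment the training samples only, so that evaluation data stay untouched.
X_aug, y_aug = augment_with_noise(features, yields_t_ha)
```

The augmented samples `X_aug` and `y_aug` could then be passed to any regressor, for example a gradient boosting model, and its predictions on held-out fields scored with `r_squared` and `mape`. Applying the noise only to training samples is a design choice made here so that the evaluation set is not contaminated by near-duplicates of training data.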