Research Articles

Early Access

Prediction of agricultural crop yields based on spatial vegetation indices and machine learning

DOI
https://doi.org/10.14719/pst.11454
Submitted
25 August 2025
Published
26 February 2026

Abstract

Accurate prediction of crop yields is both a scientific challenge and an economic necessity, because it directly influences food security, market stability and efficient resource allocation in agriculture. This study tests the hypothesis that integrating satellite-based vegetation data with machine learning (ML) can substantially improve yield forecasting accuracy under semiarid climatic conditions, thereby reducing financial risks for farmers and agribusinesses. We developed and compared multiple data-driven prediction models for three key crops – peas, rapeseed and wheat – which are major contributors to regional agricultural income. Freely available satellite imagery from the Sentinel-2 mission was used to calculate several vegetation indices that describe crop greenness, canopy structure and water content. These indices were analyzed to determine which combination best captures the relationship between crop condition and final yield. To assess model stability, we expanded the dataset with controlled random noise. Nine ML approaches were compared, and the gradient boosting algorithm consistently delivered the most accurate results, achieving up to 99 % agreement with observed yields and average errors below 5 %. The most informative vegetation indices differed among crops, which offers insight into how crop physiology and environmental stress interact with spectral indicators. The main contribution of this research is a crop-specific optimization strategy that connects remote sensing, agronomy and data science in a single predictive framework. The approach can be applied to improve yield estimation systems at regional and national scales, potentially reducing forecasting uncertainty by 20–30 % and saving agricultural producers millions of euros annually through optimized input management and market planning. Future research should focus on integrating weather forecasts, soil moisture data and economic models to turn yield prediction into a comprehensive decision-support system for precision agriculture. These findings therefore provide a practical pathway toward data-driven, climate-resilient and economically sustainable crop production.
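The workflow outlined in the abstract (vegetation indices from Sentinel-2 reflectance, noise-based dataset expansion, and agreement and error metrics) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not name the indices, the noise model or the metrics, so the choices here are assumptions: NDVI and NDMI as example indices (computed from Sentinel-2 bands B8, B4 and B11), multiplicative Gaussian noise with a hypothetical relative standard deviation of 2 %, and R² and MAPE as the agreement and error measures. All function names are illustrative. The gradient boosting model itself is not shown; it would typically come from an ML library.

```python
import random


def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def ndvi(nir, red):
    """NDVI from near-infrared (Sentinel-2 B8) and red (B4) reflectance."""
    return normalized_difference(nir, red)


def ndmi(nir, swir):
    """NDMI from near-infrared (B8) and shortwave-infrared (B11) reflectance."""
    return normalized_difference(nir, swir)


def augment_with_noise(rows, copies=3, rel_sigma=0.02, seed=0):
    """Return the original rows plus `copies` noisy replicas of each row.

    Assumption: multiplicative Gaussian noise with relative standard
    deviation `rel_sigma`; the abstract does not specify the noise model.
    """
    rng = random.Random(seed)
    out = [list(row) for row in rows]
    for _ in range(copies):
        for row in rows:
            out.append([v * (1.0 + rng.gauss(0.0, rel_sigma)) for v in row])
    return out


def r_squared(observed, predicted):
    """Coefficient of determination between observed and predicted values."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mape(observed, predicted):
    """Mean absolute percentage error, in percent (observed must be nonzero)."""
    errors = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(errors) / len(errors)


if __name__ == "__main__":
    # Hypothetical per-field reflectance values: (B8 NIR, B4 red, B11 SWIR).
    fields = [(0.45, 0.08, 0.20), (0.38, 0.10, 0.24), (0.52, 0.06, 0.18)]
    features = [[ndvi(nir, red), ndmi(nir, swir)] for nir, red, swir in fields]
    augmented = augment_with_noise(features, copies=3)
    print(len(features), "rows expanded to", len(augmented))
```

Seeding the random generator keeps the expanded dataset reproducible, which makes it possible to compare models on identical noisy replicas when checking stability.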
