Adaptive routing in agricultural supply chains: Harnessing Q-learning for optimal decision-making in dynamic environments

Authors

Chow MS, Prahadeeswaran M, Karthick V, Sumathi C, Patil S

DOI:

https://doi.org/10.14719/pst.5426

Keywords:

Markov Decision Process (MDP), logistics, Q-learning, routing optimization

Abstract

This study examines how Q-learning, a model-free reinforcement learning (RL) technique, can be used to optimize routing in a grid-based environment. The aims are to assess the efficacy of Q-learning in enhancing routing for agricultural supply chains, to investigate its flexibility in dynamic environments, and to compare its performance across several real-world scenarios. In the specific case of a banana supply chain, an agent moves through the entities of the system, from local growers to small traders and warehouses. The routing problem is modelled as a Markov Decision Process (MDP), and the goal is to maximize cumulative reward. Several scenarios are simulated: finding an optimal route for a given visit sequence, accounting for charging time, re-routing around non-drivable paths when unexpected blockages occur, avoiding energy and wear penalties, and minimizing costs. The results demonstrate the adaptability and robustness of Q-learning in dynamic environments, yielding near-optimal solutions across diverse settings. The study adds to a growing body of research on the application of RL in logistics and supply chain management, highlighting its potential to enhance decision-making in complex and variable environments. The findings suggest that Q-learning can effectively balance multiple objectives, such as minimizing distance, reducing costs, and avoiding high-wear areas, making it a valuable tool for optimizing routing in real-world supply chains. Future work will explore broader applications and other RL algorithms in similar contexts.
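
To make the setup described in the abstract concrete, the following is a minimal sketch of tabular Q-learning on a small grid world, written in Python. The grid size, start and goal cells, blocked cells, reward values, and learning parameters (alpha, gamma, epsilon) are hypothetical placeholders chosen for illustration only; they are not taken from the published model, and the paper's fuller formulation (visit sequences, charging time, wear penalties) is not reproduced here.

import numpy as np

rng = np.random.default_rng(0)

SIZE = 5                              # grid is SIZE x SIZE
START, GOAL = (0, 0), (4, 4)          # hypothetical grower and warehouse cells
BLOCKED = {(1, 1), (2, 3), (3, 1)}    # hypothetical non-drivable cells
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(state, action):
    """Apply an action; blocked or off-grid moves leave the state unchanged."""
    nxt = (state[0] + action[0], state[1] + action[1])
    if not (0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE) or nxt in BLOCKED:
        nxt = state
    reward = -1.0                      # per-step cost (proxy for distance/energy)
    done = nxt == GOAL
    if done:
        reward += 10.0                 # bonus for reaching the destination
    return nxt, reward, done

# Q-table indexed by (row, col, action)
Q = np.zeros((SIZE, SIZE, len(ACTIONS)))
alpha, gamma, epsilon = 0.1, 0.95, 0.1

for episode in range(2000):
    state, done = START, False
    while not done:
        # epsilon-greedy action selection
        if rng.random() < epsilon:
            a = int(rng.integers(len(ACTIONS)))
        else:
            a = int(np.argmax(Q[state[0], state[1]]))
        nxt, r, done = step(state, ACTIONS[a])
        # standard Q-learning update toward the bootstrapped target
        target = r + gamma * np.max(Q[nxt[0], nxt[1]]) * (not done)
        Q[state[0], state[1], a] += alpha * (target - Q[state[0], state[1], a])
        state = nxt

# Greedy rollout of the learned policy from the start cell
state, path = START, [START]
while state != GOAL and len(path) < 50:
    a = int(np.argmax(Q[state[0], state[1]]))
    state, _, _ = step(state, ACTIONS[a])
    path.append(state)
print(path)

In this sketch the negative step reward plays the role of the distance and energy costs in the abstract, and the blocked cells stand in for unexpected blockages; extending the state or reward terms (e.g. charging time or wear penalties) follows the same update rule.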

Published

09-12-2024

How to Cite

1. Chow MS, Prahadeeswaran M, Karthick V, Sumathi C, Patil S. Adaptive routing in agricultural supply chains: Harnessing Q-learning for optimal decision-making in dynamic environments. Plant Sci. Today [Internet]. 2024 Dec. 9 [cited 2024 Dec. 22];11(sp4). Available from: https://horizonepublishing.com/journals/index.php/PST/article/view/5426
