loading page

Reinforcement Learning for the Traveling Salesman Problem: Performance Comparison of Three Algorithms
  • +1
  • Jiaying Wang,
  • Chenglong Xiao,
  • Shanshan Wang,
  • Yaqi Ruan
Jiaying Wang
Shantou University
Author Profile
Chenglong Xiao
Shantou University

Corresponding Author:[email protected]

Author Profile
Shanshan Wang
Shantou University
Author Profile
Yaqi Ruan
Shantou University
Author Profile


TSP is one of the most famous problems in graph theory, as well as one of the typical NP-hard problems in combinatorial optimization. Its applications range from how to plan the most reasonable and efficient road traffic to how to better set up nodes in the Internet environment to facilitate information flow, among others. Reinforcement learning has been widely regarded as an effective tool for solving combinatorial optimization problems. This paper attempts to solve the TSP problem using different reinforcement learning algorithms and evaluated the performance of three RL algorithms (Q-learning, Sarsa, and Double Q-Learning) under different reward functions, ε-greedy decay strategies, and running times. The results show that the Double Q-Learning algorithm is the best algorithm, as it could produce results closest to the optimal solutions, and by analyzing the results, better reward strategies and epsilon-greedy decay strategies are obtained.
11 May 2023Submitted to The Journal of Engineering
12 May 2023Submission Checks Completed
12 May 2023Assigned to Editor
12 Jun 2023Reviewer(s) Assigned
03 Jul 2023Review(s) Completed, Editorial Evaluation Pending
26 Jul 2023Editorial Decision: Revise Minor
30 Jul 20231st Revision Received
01 Aug 2023Submission Checks Completed
01 Aug 2023Assigned to Editor
01 Aug 2023Reviewer(s) Assigned
22 Aug 2023Editorial Decision: Accept