
E. Khanlari, L. Borgianni, D. Adami, S. Giordano: “Decentralized Intelligence for Centralized Control: Multi-Agent Reinforcement Learning for SD-WAN”, ACM 21st International Conference on Network and Service Management, 2025


Modern Software-Defined Wide Area Network (SD-WAN) deployments are required to manage traffic over heterogeneous underlay networks while meeting stringent Quality of Service (QoS) requirements. In scenarios where multiple branches share overlay resources, independent tunnel selection decisions often lead to congestion and degraded performance. Existing approaches lack coordination mechanisms to handle the dynamic interactions between agents competing for shared resources. This paper presents a Multi-Agent Reinforcement Learning (MARL) framework for distributed overlay selection in SD-WANs. Each branch is modeled as an autonomous agent that learns routing policies through interaction with the network environment. To account for the mutual impact of decisions across branches, we adopt the Centralized Training with Decentralized Execution (CTDE) paradigm, enabling agents to learn globally consistent behaviors while preserving scalability at inference. To encourage cooperative policies, we introduce a $\lambda$-weighted reward shaping mechanism that balances local QoS goals with global resource fairness. We evaluate our approach using both PPO and DQN algorithms in a simulated SD-WAN environment. The findings highlight the necessity of MARL in addressing resource contention and ensuring equitable shared overlay utilization.
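
The abstract does not give the exact form of the $\lambda$-weighted reward, so the following is only a minimal sketch of one plausible shape for such a trade-off, not the authors' formulation. It assumes a convex combination of a per-branch QoS score and a global fairness term, and it assumes Jain's fairness index as the fairness measure. The names `jain_fairness`, `shaped_reward`, `local_qos`, `overlay_loads`, and `lam` are hypothetical and chosen for illustration.

```python
from typing import Sequence


def jain_fairness(loads: Sequence[float]) -> float:
    """Jain's fairness index over per-overlay loads.

    Returns 1.0 when all loads are equal and 1/n when a single
    overlay carries all of the traffic.
    """
    n = len(loads)
    sum_sq = sum(x * x for x in loads)
    if n == 0 or sum_sq == 0:
        return 1.0  # assumption: no traffic is treated as trivially fair
    total = sum(loads)
    return (total * total) / (n * sum_sq)


def shaped_reward(local_qos: float, overlay_loads: Sequence[float], lam: float) -> float:
    """Blend a branch's own QoS score with a global fairness term.

    Assumed form (not taken from the paper):
        r = (1 - lam) * local_qos + lam * fairness
    lam = 0 gives a purely selfish agent; lam = 1 rewards fairness only.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (1.0 - lam) * local_qos + lam * jain_fairness(overlay_loads)
```

Under these assumptions, a branch that achieves high local QoS by crowding onto one shared tunnel is penalized more strongly as `lam` grows, which is the kind of cooperative pressure the abstract describes.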