Multi-Armed Bandits Learning For Optimal Decentralized Control of Electric Vehicle Charging
Optimal control of new grid elements, such as electric vehicles, can ensure efficient and stable operation of distribution networks. Decentralization can provide scalability, higher reliability, and privacy, which may not be available in centralized or hierarchical control solutions. A decentralized multi-agent combinatorial multi-armed bandit system using Thompson Sampling is presented for smart charging of electric vehicles. The proposed system uses bandit-based reinforcement learning to manage uncertainty in the other players’ actions and in intermittent photovoltaic energy production. The solution is fully decentralized, real-time, scalable, model-free, and fair. Its performance is evaluated by comparison with other charging strategies, namely basic charging and centralized optimization.
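For readers unfamiliar with Thompson Sampling, the following is a minimal single-agent sketch of the general technique, not the system described in the abstract. It assumes a Beta-Bernoulli reward model, in which each arm (for example, a candidate charging action) returns a reward of 0 or 1, and it omits the multi-agent and combinatorial aspects entirely. The names ThompsonSamplingAgent and simulate, and the success probabilities used, are hypothetical and chosen only for illustration.

```python
import random


class ThompsonSamplingAgent:
    """Beta-Bernoulli Thompson Sampling over a fixed set of arms.

    Illustrative assumption: each arm stands for one discrete charging
    action and yields a binary reward (1 = good outcome, 0 = bad outcome).
    """

    def __init__(self, n_arms, rng=None):
        self.n_arms = n_arms
        # Beta(1, 1) prior (uniform) on each arm's success probability.
        self.alpha = [1.0] * n_arms
        self.beta = [1.0] * n_arms
        self.rng = rng or random.Random()

    def select_arm(self):
        # Draw one sample per arm from its posterior and play the largest.
        samples = [
            self.rng.betavariate(a, b) for a, b in zip(self.alpha, self.beta)
        ]
        return max(range(self.n_arms), key=samples.__getitem__)

    def update(self, arm, reward):
        # Conjugate update: successes increase alpha, failures increase beta.
        self.alpha[arm] += reward
        self.beta[arm] += 1 - reward


def simulate(success_probs, n_rounds=2000, seed=0):
    """Run one agent against arms with fixed (hypothetical) success rates."""
    rng = random.Random(seed)
    agent = ThompsonSamplingAgent(len(success_probs), rng)
    pulls = [0] * len(success_probs)
    for _ in range(n_rounds):
        arm = agent.select_arm()
        reward = 1 if rng.random() < success_probs[arm] else 0
        agent.update(arm, reward)
        pulls[arm] += 1
    return agent, pulls


if __name__ == "__main__":
    agent, pulls = simulate([0.2, 0.5, 0.8])
    print("pulls per arm:", pulls)
```

Because the agent learns only from the rewards it observes, no model of the environment is required, which is the sense in which such bandit methods are called model-free. In a decentralized setting, each vehicle would run its own copy of such an agent, with the reward reflecting the effect of the other agents' choices and of the available photovoltaic energy; how that reward is defined is not specified in the abstract.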