Multi-Armed Bandits Learning For Optimal Decentralized Control of Electric Vehicle Charging

Optimal control of new grid elements, such as electric vehicles, can ensure efficient and stable operation of distribution networks. Decentralization can provide scalability, higher reliability, and privacy, which may be absent from centralized or hierarchical control solutions. A decentralized multi-agent combinatorial multi-armed bandit system using Thompson Sampling is presented for smart charging of electric vehicles. The proposed system uses bandit-based reinforcement learning to manage the uncertainty in other players’ actions and in intermittent photovoltaic energy production. The proposed solution is fully decentralized, real-time, scalable, model-free, and fair. Its performance is evaluated by comparison with other charging strategies, namely basic charging and centralized optimization.
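
For readers unfamiliar with Thompson Sampling, the following is a minimal single-agent sketch of the basic idea, not the decentralized multi-agent combinatorial system presented in this work. It assumes a Beta-Bernoulli bandit in which each arm could stand for a hypothetical charging power level and the reward is a binary success signal. The function name, the arm probabilities, and the reward model are illustrative assumptions only.

```python
import random


def thompson_sampling(true_probs, rounds, seed=0):
    """Run Beta-Bernoulli Thompson Sampling and return pull counts per arm.

    true_probs: hidden success probability of each arm (illustrative values,
    unknown to the learner; used only to simulate rewards).
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms  # observed rewards of 1 per arm
    failures = [0] * n_arms   # observed rewards of 0 per arm
    pulls = [0] * n_arms

    for _ in range(rounds):
        # Draw one sample per arm from its Beta posterior (uniform Beta(1, 1) prior).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Play the arm whose sampled success probability is highest.
        arm = max(range(n_arms), key=samples.__getitem__)
        # Simulate a Bernoulli reward from the hidden probability.
        reward = 1 if rng.random() < true_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1

    return pulls


# Hypothetical example: three arms, the third being the best.
pulls = thompson_sampling([0.2, 0.5, 0.8], rounds=2000)
print(pulls)
```

Because each arm is played in proportion to the posterior belief that it is the best one, exploration fades as evidence accumulates, and the learner needs no model of the environment beyond the observed rewards.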

Sharyal Zafar
École normale supérieure de Rennes (ENS Rennes)
France

Raphaël Féraud
Orange Labs
France

Anne Blavette
École normale supérieure de Rennes (ENS Rennes)
France

Guy Camilleri
Paul Sabatier University Toulouse
France

Hamid Ben Ahmed
École normale supérieure de Rennes (ENS Rennes)
France