DCRAC: Deep Conditioned Recurrent Actor-Critic for Multi-Objective Partially Observable Environments
In many decision-making problems, agents aim to balance multiple, possibly conflicting objectives. Existing research in deep reinforcement learning mainly focuses on fully observable, single-objective settings. In this paper, we propose DCRAC, a deep reinforcement learning framework for solving partially observable multi-objective problems. DCRAC follows a conditioned actor-critic approach to learning the optimal policy, where the network is conditioned on the weights, i.e., the relative importance of the different objectives. To deal with longer action-observation histories in partially observable environments, we introduce DCRAC-M, which uses memory networks to further enhance the reasoning ability of the agent. Experimental evaluation on benchmark problems shows the superiority of our approach when compared to the state of the art.
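As a rough illustration of what conditioning on objective weights can mean, the sketch below appends a weight vector to the observation before it is passed to a network, and collapses a vector-valued reward into a scalar with the same weights. It is not taken from the paper: the linear scalarization, the concatenation scheme, and the names `scalarize` and `conditioned_input` are all assumptions made for this example.

```python
from typing import List, Sequence


def scalarize(reward_vector: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of per-objective rewards (linear scalarization, assumed here)."""
    if len(reward_vector) != len(weights):
        raise ValueError("reward_vector and weights must have the same length")
    return sum(w * r for w, r in zip(weights, reward_vector))


def conditioned_input(observation: Sequence[float], weights: Sequence[float]) -> List[float]:
    """Concatenate the observation with the objective weights.

    Hypothetical conditioning scheme: a single network that receives this
    vector can, in principle, produce different behaviour for different weights.
    """
    return list(observation) + list(weights)


if __name__ == "__main__":
    weights = [0.75, 0.25]          # relative importance of two objectives
    reward = [1.0, -2.0]            # one reward per objective
    print(scalarize(reward, weights))
    print(conditioned_input([0.1, 0.2], weights))
```

Under these assumptions, changing `weights` changes both the scalar learning signal and the network input, so one set of parameters can cover a range of preferences instead of training one policy per weight setting.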
Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems
Nian, X., Irissappane, A. A., & Roijers, D. (2020). DCRAC: Deep Conditioned Recurrent Actor-Critic for Multi-Objective Partially Observable Environments. Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 8.