BALANCING OF PERSONAL AND GROUP GOALS FOR AGENTS USING MULTI-AGENT REINFORCEMENT LEARNING

Author: Zhabinets, Maxim
Dates: 2022-06-30; 2022-06-30; 2022-04
Citation: Zhabinets, M. (2022). BALANCING OF PERSONAL AND GROUP GOALS FOR AGENTS USING MULTI-AGENT REINFORCEMENT LEARNING (Unpublished master's thesis). Nazarbayev University, Nur-Sultan, Kazakhstan
URI: http://nur.nu.edu.kz/handle/123456789/6357

Abstract: The number of AI agents in the world is increasing every day, and they will need to interact with each other. It is in humanity's best interest to teach these agents to respect the goals of others and live in harmony. In this study, we try to balance the personal and group goals of agents in social dilemma scenarios using the Proximal Policy Optimisation algorithm under both a decentralized and a centralized learning approach. We then compare the results of the two approaches and point out their strengths and weaknesses. We also test the impact of an inequity-averse penalty, which penalizes policies that result in unequal rewards for agents, in both decentralized and centralized learning. The thesis first briefly describes the history of multi-agent learning. It then reviews the latest achievements in applying centralized and decentralized multi-agent learning approaches, focusing on methods for balancing agents' personal preferences with group goals. Next, it describes the environments and methods used in this study, followed by the details of the experiments and a discussion of the results. We show that both centralized and decentralized learning approaches have their advantages and discuss them. We also show that an inequity-averse penalty is an efficient technique for balancing agents' rewards in social dilemma environments.

Language: en
License: Attribution-NonCommercial-ShareAlike 3.0 United States
Subject: Research Subject Categories::TECHNOLOGY
Type of access: Open Access
Keywords: Proximal Policy Optimisation algorithm; AI; artificial intelligence; Multi-Agent Reinforcement Learning; decentralized learning; Reinforcement Learning
Type: Master's thesis
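
The abstract mentions an inequity-averse penalty without stating its exact form. As a rough illustration only, the sketch below uses a Fehr-Schmidt style formulation that is common in the inequity-aversion literature; whether the thesis uses this exact form is an assumption. The function name `inequity_averse_rewards` and the weights `alpha` (penalty for earning less than others) and `beta` (penalty for earning more than others) are hypothetical, and their default values are illustrative rather than taken from the thesis.

```python
from typing import List


def inequity_averse_rewards(
    rewards: List[float],
    alpha: float = 1.0,  # assumed weight on disadvantageous inequity
    beta: float = 0.5,   # assumed weight on advantageous inequity
) -> List[float]:
    """Return each agent's reward minus a Fehr-Schmidt style inequity penalty.

    This is an assumed formulation for illustration, not the thesis's own.
    """
    n = len(rewards)
    if n < 2:
        # With a single agent there is nobody to compare against.
        return list(rewards)

    shaped = []
    for i, r_i in enumerate(rewards):
        others = [r_j for j, r_j in enumerate(rewards) if j != i]
        # How far the other agents are ahead of agent i in total.
        disadvantage = sum(max(r_j - r_i, 0.0) for r_j in others)
        # How far agent i is ahead of the other agents in total.
        advantage = sum(max(r_i - r_j, 0.0) for r_j in others)
        penalty = (alpha * disadvantage + beta * advantage) / (n - 1)
        shaped.append(r_i - penalty)
    return shaped


if __name__ == "__main__":
    # Two agents with unequal rewards: both are penalized, and the agent
    # that earned less is penalized more because alpha > beta here.
    print(inequity_averse_rewards([4.0, 0.0]))
```

Under this assumed form, equal rewards incur no penalty, so a policy is only pushed away from outcomes in which some agents gain at the expense of others. The shaped values would replace the raw per-agent rewards fed to the learner, whether that learner is trained in a decentralized or a centralized setup.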