Multi-Agent Reinforcement Learning

- I worked on multi-agent self-play in Atari games, in both collaborative and competitive settings.
- I used variational autoencoders to disentangle multiple near-optimal policies extracted using latent codes.
- In multi-agent CTF, our initial results gave the model a win probability of 72%, close to the state-of-the-art (SOTA) value of 80% and well above the human score of 40%.
- I worked on a generative model for InfoRL that keeps latent code generation unsupervised, so that all standard MARL algorithms can be used with InfoRL.