DAP is a ground-breaking system for learning distributed asynchronous policies. Born out of my master's thesis and fuelled by my passion for Multi-Agent Reinforcement Learning (MARL) it is the first and so far the only system for learning Asynchronous Policies with Gradients. Imagine a Distrubuted Algorithm with message gradients.
Distributed Asynchronous Policy
· One min read