DAP is a ground-breaking system for learning distributed asynchronous policies. Born out of my master's thesis and fuelled by my passion for Multi-Agent Reinforcement Learning (MARL), it is the first and, so far, the only system for learning Asynchronous Policies with Gradients. Imagine a distributed algorithm with message gradients.
The project page and some basic visualizations can be found at dap.bru.lu.