This project is a PyTorch implementation of the PCC-RL project by Jay et al., available here; the paper is available here.
In this project we implemented the PCC-RL agent, which learns to adapt the client's sending rate in place of the standard congestion control mechanisms provided by Linux. The agent is trained in the network simulation provided by the original repository.
We implemented the RL agent with a standard actor-critic method in PyTorch.
The network is simple: one hidden layer with 128 units each for the actor and the critic. The actor learns the mean of the policy, modeled as a Normal distribution with a predefined variance. The state is a history of statistics from the last 10 monitor intervals, with the specific features described in the original paper.
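The architecture above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the number of features per monitor interval and the fixed standard deviation are placeholder values, and the real ones follow the original paper.

```python
import torch
import torch.nn as nn

HISTORY_LEN = 10            # monitor intervals kept in the state
FEATURES_PER_INTERVAL = 3   # assumption for illustration only
STATE_DIM = HISTORY_LEN * FEATURES_PER_INTERVAL
SIGMA = 0.5                 # predefined (fixed) std of the Normal policy

class ActorCritic(nn.Module):
    def __init__(self):
        super().__init__()
        # Actor: one hidden layer of 128 units; outputs the mean of the policy.
        self.actor = nn.Sequential(
            nn.Linear(STATE_DIM, 128), nn.Tanh(), nn.Linear(128, 1)
        )
        # Critic: one hidden layer of 128 units; outputs a scalar value estimate.
        self.critic = nn.Sequential(
            nn.Linear(STATE_DIM, 128), nn.Tanh(), nn.Linear(128, 1)
        )

    def forward(self, state):
        mean = self.actor(state)
        dist = torch.distributions.Normal(mean, SIGMA)
        value = self.critic(state)
        return dist, value

net = ActorCritic()
state = torch.zeros(1, STATE_DIM)      # flattened 10-interval history
dist, value = net(state)
action = dist.sample()                 # sending-rate adjustment
```

The actor only outputs the mean; keeping the variance fixed simplifies training, at the cost of not learning the exploration scale.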
We ran a task similar to the one in the PCC-RL paper and obtained similar results.

Requirements:
- Python 3
- PyTorch
- OpenAI Gym
- NumPy