jia-yi-chen / bandit-and-reinforcement-learning
Python implementation of reinforcement learning algorithms: bandit algorithms, MDPs, dynamic programming (value/policy iteration), and model-free control (off-policy Monte Carlo, Q-learning)