MDP
Policy computation for a Markov Decision Process (MDP) model using two methods: Value Iteration and Linear Programming.
Model Description
The model world is a 4x4 grid of blocks with one positive terminal state and one negative terminal state. is total description of the world.
Value Iteration
value_iter.py contains the code that runs the value iteration algorithm to find the utilities of all states and then derive the final policy. It prints the result of every iteration.
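For orientation, here is a minimal sketch of value iteration on a 4x4 grid. It is not the code from value_iter.py. The terminal positions, the reward values, the discount factor, the deterministic moves, and every name (move, value_iteration, TERMINALS, and so on) are assumptions made for illustration only.

```python
# Sketch only: terminal positions, rewards, discount factor and deterministic
# moves are assumptions, not values taken from value_iter.py.
SIZE = 4
GAMMA = 0.9                                # assumed discount factor
STEP_REWARD = -0.04                        # assumed reward of each non-terminal state
TERMINALS = {(0, 3): 1.0, (1, 3): -1.0}    # assumed (row, col) -> terminal utility
ACTIONS = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}


def move(state, action):
    """Deterministic move; hitting the border leaves the agent in place."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    return (nr, nc) if 0 <= nr < SIZE and 0 <= nc < SIZE else state


def value_iteration(epsilon=1e-6, verbose=False):
    """Return (utilities, policy) once the largest update falls below epsilon."""
    states = [(r, c) for r in range(SIZE) for c in range(SIZE)]
    U = {s: 0.0 for s in states}
    iteration = 0
    while True:
        iteration += 1
        new_U = {}
        for s in states:
            if s in TERMINALS:
                # Terminal states keep their fixed utility.
                new_U[s] = TERMINALS[s]
            else:
                # Bellman update: reward plus discounted best successor utility.
                best = max(U[move(s, a)] for a in ACTIONS)
                new_U[s] = STEP_REWARD + GAMMA * best
        delta = max(abs(new_U[s] - U[s]) for s in states)
        U = new_U
        if verbose:
            print(f"iteration {iteration}: max change = {delta:.6f}")
        if delta < epsilon:
            break
    # Greedy policy: in each non-terminal state pick the action whose
    # successor has the highest utility.
    policy = {
        s: max(ACTIONS, key=lambda a: U[move(s, a)])
        for s in states
        if s not in TERMINALS
    }
    return U, policy


if __name__ == "__main__":
    utilities, policy = value_iteration(verbose=True)
    for r in range(SIZE):
        print(" ".join(policy.get((r, c), "T") for c in range(SIZE)))
```

Calling value_iteration(verbose=True) prints the largest utility change at each iteration, which loosely mirrors the per-iteration output described above. A stochastic transition model would replace the single successor in the update with an expectation over successor states.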