ML Quest — Learn Machine Learning by Playing


The warehouse robot goes live tomorrow. The CEO wants a demo that shows everything: an agent that learns from nothing, explores intelligently, converges on the optimal path, and has metrics to prove it. You must combine Q-learning with epsilon-greedy action selection on a grid world — and track every reward to show the learning curve. No scaffolding. No hints until you really need them. This is the final test of your RL skills.

~35 min · project


Goals: 4 tests

Q-table should be trained (non-trivial values)

trained agent should reach the goal in over 80% of test episodes

epsilon should decay during training

should track reward history across episodes
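If you do get stuck, the four goals above can be met with a fairly small program. The following is a minimal sketch, not the course's reference solution: the 4x4 grid, the reward values (+10 at the goal, -1 per step), the hyperparameters, and every function name (`step`, `greedy`, `train`, `evaluate`) are assumptions chosen for illustration, and the lesson's own environment and test harness may expect different names or shapes.

```python
import random

# Hypothetical 4x4 grid world: start at top-left, goal at bottom-right.
SIZE = 4
START, GOAL = (0, 0), (SIZE - 1, SIZE - 1)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Move one cell, clamped to the grid; return (next_state, reward, done)."""
    row = min(max(state[0] + ACTIONS[action][0], 0), SIZE - 1)
    col = min(max(state[1] + ACTIONS[action][1], 0), SIZE - 1)
    next_state = (row, col)
    if next_state == GOAL:
        return next_state, 10.0, True   # assumed goal reward
    return next_state, -1.0, False      # assumed per-step cost


def greedy(q, state, rng):
    """Pick the highest-valued action, breaking ties at random."""
    best = max(q[state])
    return rng.choice([a for a, v in enumerate(q[state]) if v == best])


def train(episodes=500, alpha=0.1, gamma=0.95, epsilon=1.0,
          epsilon_min=0.05, decay=0.99, max_steps=100, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS)
         for r in range(SIZE) for c in range(SIZE)}
    reward_history, epsilon_history = [], []
    for _episode in range(episodes):
        state, total = START, 0.0
        for _t in range(max_steps):
            # Explore with probability epsilon, otherwise exploit.
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                action = greedy(q, state, rng)
            next_state, reward, done = step(state, action)
            # Q-learning update: bootstrap from the best next action,
            # except at terminal states, where there is no future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state, total = next_state, total + reward
            if done:
                break
        reward_history.append(total)       # one entry per episode
        epsilon_history.append(epsilon)    # epsilon used in this episode
        epsilon = max(epsilon_min, epsilon * decay)
    return q, reward_history, epsilon_history


def evaluate(q, episodes=100, max_steps=50, seed=1):
    """Run the greedy policy (no exploration); return the success rate."""
    rng = random.Random(seed)
    successes = 0
    for _episode in range(episodes):
        state = START
        for _t in range(max_steps):
            state, _reward, done = step(state, greedy(q, state, rng))
            if done:
                successes += 1
                break
    return successes / episodes
```

Calling `q, reward_history, epsilon_history = train()` and then `evaluate(q)` gives the Q-table, the per-episode reward totals for the learning curve, the epsilon schedule, and the test-time success rate. Multiplicative epsilon decay with a floor is only one common choice; a linear schedule would satisfy the same goal.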


RL Project