The learning rate is the single most important hyperparameter in gradient descent. Too small and your model crawls toward the minimum over thousands of iterations. Too large and it overshoots, oscillates, or explodes into NaN. Your mission: run the same gradient descent algorithm with four different learning rates and plot the results on one chart. See the difference for yourself.
~20 min sandbox lab
Goals: 4 tests
- results dict should have 4 entries
- LR=0.001 should converge (final cost < first cost)
- LR=1.0 should diverge or oscillate
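A minimal sketch of what the finished lab might look like, assuming a toy cost function f(w) = (w - 3)² (the actual function and test harness are not specified in this page). It runs the same gradient descent loop at four learning rates and collects each cost history in a `results` dict; to see the divergence for yourself, pass each history to `matplotlib.pyplot.plot` with a `label` and call `legend()`.

```python
def gradient_descent(lr, w0=0.0, steps=50):
    """Minimize the toy cost f(w) = (w - 3)**2; its gradient is 2*(w - 3)."""
    w = w0
    costs = []
    for _ in range(steps):
        costs.append((w - 3) ** 2)   # record cost before each update
        w -= lr * 2 * (w - 3)        # gradient descent step
    return costs

# One run per learning rate, all stored in a single dict for plotting.
results = {lr: gradient_descent(lr) for lr in (0.001, 0.01, 0.1, 1.0)}

# lr=0.001 creeps toward the minimum; lr=1.0 maps w to 6 - w each step,
# so it bounces between 0 and 6 forever and the cost never drops.
for lr, costs in results.items():
    print(f"lr={lr}: first cost {costs[0]:.3f}, final cost {costs[-1]:.6f}")
    # to chart: plt.plot(costs, label=f"lr={lr}")
```

With these four rates the chart shows the full spectrum: a near-flat line (0.001), steady convergence (0.01 and 0.1), and a flat line stuck at the starting cost (1.0, pure oscillation). A rate above 1.0 would instead blow up toward NaN.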