ML Quest

Your spam dataset is imbalanced: roughly 85% ham, 15% spam. A model that blindly predicts "ham" every time scores 85% accuracy and catches zero spam. That's useless. Your mission: train a baseline model, measure its F1 score on the minority class (spam), then train a weighted model that takes the imbalance into account. Compare the two F1 scores and show that weighting matters.
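
To make the accuracy trap concrete, here is a minimal sketch in plain Python. The label lists are hypothetical stand-ins built to match the 85/15 split (0 = ham, 1 = spam), and `f1_for_class` is an illustrative helper, not part of the challenge's API.

```python
# Hypothetical labels matching the 85/15 split: 0 = ham, 1 = spam.
y_true = [0] * 850 + [1] * 150
y_pred = [0] * len(y_true)  # a "model" that always predicts ham


def f1_for_class(y_true, y_pred, positive):
    """F1 score treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
spam_f1 = f1_for_class(y_true, y_pred, positive=1)

print(f"accuracy={accuracy:.2f}, spam F1={spam_f1:.2f}")
# prints: accuracy=0.85, spam F1=0.00
```

The always-ham predictor has no true positives for spam, so precision and recall on that class are both zero, and so is F1. That is why the goals measure F1 on the minority class rather than accuracy.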

~20 min · scenario · 1000 rows
Goals: 4 tests
baseline_f1 should exist and be a float
weighted_f1 should exist and be a float
weighted model should outperform baseline on minority class F1
both F1 scores should be valid numbers
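
One possible shape for a solution is sketched below. It assumes scikit-learn is available and uses synthetic data from `make_classification` as a stand-in for the challenge's spam dataset, which is not shown here. The choice of `LogisticRegression`, the feature counts, and the train/test split are all assumptions made for illustration. Only the variable names `baseline_f1` and `weighted_f1` come from the goals.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in (assumption): 1000 rows, about 15% in the positive
# "spam" class (label 1). Replace with the real dataset in the challenge.
X, y = make_classification(
    n_samples=1000,
    n_features=20,
    n_informative=5,
    weights=[0.85, 0.15],
    random_state=42,
)

# Stratify so the train and test sets keep the same class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Baseline: every sample counts equally during training.
baseline = LogisticRegression(max_iter=1000)
baseline.fit(X_train, y_train)
baseline_f1 = float(
    f1_score(y_test, baseline.predict(X_test), pos_label=1, zero_division=0)
)

# Weighted: class_weight="balanced" scales each class's contribution
# inversely to its frequency, so minority-class errors cost more.
weighted = LogisticRegression(max_iter=1000, class_weight="balanced")
weighted.fit(X_train, y_train)
weighted_f1 = float(
    f1_score(y_test, weighted.predict(X_test), pos_label=1, zero_division=0)
)

print(f"baseline F1 (spam): {baseline_f1:.3f}")
print(f"weighted F1 (spam): {weighted_f1:.3f}")
```

Wrapping the result in `float(...)` keeps both scores as plain Python floats. Balanced weighting usually raises recall on the minority class at some cost in precision, so the F1 gain is typical but not guaranteed on every dataset or split.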