Machine-learning-powered predictions for college football games, built on historical data, team metrics, and game conditions.
This project grew out of my passion for college football and my curiosity about building a machine learning model. I wanted to create a prediction engine that could compete with professional handicappers while showing transparently how each prediction is made. The challenge was to develop a system that could process complex historical data and generate accurate predictions for future games, whether or not the two teams regularly play each other.
The solution combines multiple machine learning models trained on historical game data, team statistics, and seasonal, yearly, strength-of-schedule, and situational factors. Using Python with scikit-learn and pandas, I developed a custom ensemble model that weights each factor according to its historical predictive power. I found that the Gradient Boosting Regressor performed better than the Random Forest and XGBoost models. When tested against completed games, the model's predicted scores come quite close to the actual scores. This was easily my most challenging project to date, given how difficult it is to produce reasonable predictions.
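The kind of model comparison described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the data is synthetic, the feature names (`off_rating`, `opp_def_rating`, `strength_of_schedule`, `is_home`) and the `predict_points` helper are hypothetical, and XGBoost is left out so the example needs only scikit-learn, pandas, and NumPy. It cross-validates two regressors on mean absolute error, refits the better one, and then derives a spread and feature weights from it.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for historical game data (all column names are hypothetical).
rng = np.random.default_rng(42)
n_games = 400
games = pd.DataFrame({
    "off_rating": rng.normal(30, 7, n_games),
    "opp_def_rating": rng.normal(25, 6, n_games),
    "strength_of_schedule": rng.uniform(0, 1, n_games),
    "is_home": rng.integers(0, 2, n_games),
})
# Made-up scoring rule plus noise, only so the models have something to learn.
points = (
    14
    + 0.8 * games["off_rating"]
    - 0.5 * games["opp_def_rating"]
    + 6.0 * games["strength_of_schedule"]
    + 3.0 * games["is_home"]
    + rng.normal(0, 4, n_games)
)

models = {
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Compare candidates by 5-fold cross-validated mean absolute error (lower is better).
mae_by_model = {}
for name, model in models.items():
    scores = cross_val_score(
        model, games, points, cv=5, scoring="neg_mean_absolute_error"
    )
    mae_by_model[name] = -scores.mean()

# Refit the best-scoring model on all of the data.
best_name = min(mae_by_model, key=mae_by_model.get)
best_model = models[best_name].fit(games, points)

# Feature importances play the role of the "model weights" shown to the user.
feature_weights = pd.Series(
    best_model.feature_importances_, index=games.columns
).sort_values(ascending=False)


def predict_points(off_rating, opp_def_rating, strength_of_schedule, is_home):
    """Predict one team's score from a single row of (hypothetical) features."""
    row = pd.DataFrame(
        [[off_rating, opp_def_rating, strength_of_schedule, is_home]],
        columns=games.columns,
    )
    return float(best_model.predict(row)[0])


# Spread is taken here as predicted home points minus predicted away points.
home_points = predict_points(34, 22, 0.6, 1)
away_points = predict_points(27, 24, 0.5, 0)
spread = home_points - away_points

print(mae_by_model)
print(feature_weights)
print(f"Predicted: home {home_points:.1f}, away {away_points:.1f}, spread {spread:+.1f}")
```

Which model wins on this toy data says nothing about real games; the point is only the shape of the workflow: score each candidate the same way, keep the one with the lowest error, and read the spread and weights off that model.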
To try it, navigate below and select the two conferences from the drop-down lists. Then select the two teams you want to match up; the model will predict the scores and show the spread and model weights.