Livedocs
XGBoost
EPL Prediction
This notebook predicts the Premier League winner for the 2024-25 season using an XGBoost model. It outlines data collection from football-data.org, extensive feature engineering, and a comprehensive prediction framework. The analysis includes visualizations of championship probabilities, current season performance, and model feature importance, highlighting Liverpool as the predicted winner with an 82.9% probability.
Requirement already satisfied: requests in ./data/.venv/lib/python3.12/site-packages (2.32.3)
Requirement already satisfied: pandas in ./data/.venv/lib/python3.12/site-packages (2.3.2)
Requirement already satisfied: numpy in ./data/.venv/lib/python3.12/site-packages (2.2.4)
Requirement already satisfied: scikit-learn in ./data/.venv/lib/python3.12/site-packages (1.4.2)
Requirement already satisfied: xgboost in ./data/.venv/lib/python3.12/site-packages (3.0.2)
Requirement already satisfied: matplotlib in ./data/.venv/lib/python3.12/site-packages (3.10.1)
Requirement already satisfied: seaborn in ./data/.venv/lib/python3.12/site-packages (0.13.2)
Requirement already satisfied: plotly in ./data/.venv/lib/python3.12/site-packages (6.1.1)
Requirement already satisfied: charset-normalizer<4,>=2 in ./data/.venv/lib/python3.12/site-packages (from requests) (3.4.3)
Requirement already satisfied: idna<4,>=2.5 in ./data/.venv/lib/python3.12/site-packages (from requests) (3.10)
Requirement already satisfied: urllib3<3,>=1.21.1 in ./data/.venv/lib/python3.12/site-packages (from requests) (2.5.0)
Requirement already satisfied: certifi>=2017.4.17 in ./data/.venv/lib/python3.12/site-packages (from requests) (2025.8.3)
Requirement already satisfied: python-dateutil>=2.8.2 in ./data/.venv/lib/python3.12/site-packages (from pandas) (2.9.0.post0)
Requirement already satisfied: pytz>=2020.1 in ./data/.venv/lib/python3.12/site-packages (from pandas) (2025.2)
Requirement already satisfied: tzdata>=2022.7 in ./data/.venv/lib/python3.12/site-packages (from pandas) (2025.2)
Requirement already satisfied: scipy>=1.6.0 in ./data/.venv/lib/python3.12/site-packages (from scikit-learn) (1.16.2)
Requirement already satisfied: joblib>=1.2.0 in ./data/.venv/lib/python3.12/site-packages (from scikit-learn) (1.5.2)
Requirement already satisfied: threadpoolctl>=2.0.0 in ./data/.venv/lib/python3.12/site-packages (from scikit-learn) (3.6.0)
Requirement already satisfied: nvidia-nccl-cu12 in ./data/.venv/lib/python3.12/site-packages (from xgboost) (2.28.3)
Requirement already satisfied: contourpy>=1.0.1 in ./data/.venv/lib/python3.12/site-packages (from matplotlib) (1.3.3)
Requirement already satisfied: cycler>=0.10 in ./data/.venv/lib/python3.12/site-packages (from matplotlib) (0.12.1)
Requirement already satisfied: fonttools>=4.22.0 in ./data/.venv/lib/python3.12/site-packages (from matplotlib) (4.60.0)
Requirement already satisfied: kiwisolver>=1.3.1 in ./data/.venv/lib/python3.12/site-packages (from matplotlib) (1.4.9)
Requirement already satisfied: packaging>=20.0 in ./data/.venv/lib/python3.12/site-packages (from matplotlib) (25.0)
Requirement already satisfied: pillow>=8 in ./data/.venv/lib/python3.12/site-packages (from matplotlib) (11.3.0)
Requirement already satisfied: pyparsing>=2.3.1 in ./data/.venv/lib/python3.12/site-packages (from matplotlib) (3.2.5)
Requirement already satisfied: narwhals>=1.15.1 in ./data/.venv/lib/python3.12/site-packages (from plotly) (2.5.0)
Requirement already satisfied: six>=1.5 in ./data/.venv/lib/python3.12/site-packages (from python-dateutil>=2.8.2->pandas) (1.17.0)
Testing API access...
Status Code: 403
Response: {"message":"The resource you are looking for is restricted and apparently not within your permissions. Please check your subscription.","errorCode":403}...
Testing competitions endpoint...
Competitions - Status Code: 200
Number of competitions: 183
Premier League found: Premier League - Season: 2025-08-15
Testing areas endpoint...
Areas - Status Code: 200
Checking for API key...
No API key found in secrets.
Getting Premier League competition details...
Status Code: 403
Error accessing Premier League data: {"message":"The resource you are looking for is restricted and apparently not within your permissions. Please check your subscription.","errorCode":403}
==================================================
Testing Premier League standings access...
Standings Status Code: 403
Standings Error: {"message":"The resource you are looking for is restricted and apparently not within your permissions. Please check your subscription.","errorCode":403}
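The log above shows open endpoints answering 200 while the Premier League endpoints answer 403, alongside "No API key found in secrets". One plausible reading is that the restricted resources need a token. The sketch below is a minimal illustration of sending one, assuming the v4 API root and the `X-Auth-Token` header used by football-data.org; the helper names (`build_headers`, `explain_status`, `fetch_json`) are hypothetical and not taken from the notebook, and the standard library is used here instead of `requests` only to keep the sketch dependency-free.

```python
import json
import urllib.error
import urllib.request

# Assumed API root for football-data.org v4 (not shown in the notebook output).
BASE_URL = "https://api.football-data.org/v4"


def build_headers(api_key=None):
    """Return request headers; the token header is added only when a key exists."""
    return {"X-Auth-Token": api_key} if api_key else {}


def explain_status(status_code, has_key):
    """Turn a status code into a short hint (hypothetical helper)."""
    if status_code == 200:
        return "ok"
    if status_code == 403 and not has_key:
        return "restricted: no API key was sent"
    if status_code == 403:
        return "restricted: key lacks permission for this resource"
    return f"unexpected status {status_code}"


def fetch_json(path, api_key=None, timeout=10):
    """GET BASE_URL + path and return (status_code, parsed_body_or_hint)."""
    request = urllib.request.Request(BASE_URL + path, headers=build_headers(api_key))
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, json.load(response)
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx, so map the code to a readable hint instead.
        return err.code, {"message": explain_status(err.code, bool(api_key))}
```

A call such as `fetch_json("/competitions/PL/standings", api_key)` would then return the status code together with either the parsed body or a hint. Whether a given key unlocks the resource still depends on the subscription tier, as the 403 message itself says.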
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[86], line 18
     16 # Get unique teams
     17 if 'HomeTeam' in premier_league_data.columns and 'AwayTeam' in premier_league_data.columns:
---> 18     unique_teams = sorted(set(premier_league_data['HomeTeam'].unique()) | set(premier_league_data['AwayTeam'].unique()))
     20 # Basic statistics for goals
     21 if 'FTHG' in premier_league_data.columns and 'FTAG' in premier_league_data.columns:

TypeError: '<' not supported between instances of 'float' and 'str'
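This error typically means the team columns mix strings with missing values: a missing entry is stored as a float `NaN`, and `sorted()` cannot compare a float with a string. The sketch below shows one way to avoid it by dropping missing entries before sorting. It assumes missing names are `None` or `NaN`; `sorted_team_names` is a hypothetical helper, and the sample values are made up for illustration.

```python
import math


def is_missing(value):
    """True for None or a float NaN, the usual stand-ins for a missing name."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def sorted_team_names(home_teams, away_teams):
    """Return sorted unique team names, skipping missing entries."""
    names = {
        str(team)
        for team in [*home_teams, *away_teams]
        if not is_missing(team)
    }
    return sorted(names)


# Made-up sample with missing values in both columns.
home = ["Arsenal", None, "Liverpool"]
away = ["Liverpool", "Chelsea", float("nan")]
print(sorted_team_names(home, away))  # → ['Arsenal', 'Chelsea', 'Liverpool']
```

The helper accepts any iterable, so pandas columns such as `premier_league_data['HomeTeam']` can be passed directly. In pandas itself, calling `.dropna()` on each column before `.unique()` should have the same effect.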
================================================================================
PREMIER LEAGUE 2025/2026 PREDICTION SUMMARY
================================================================================
• PREDICTED WINNER: LIVERPOOL
• Championship Probability: 86.5%
• Current Points: 15 points
• Points Per Game: 3.00
• Goal Difference: +6
• TOP 3 CONTENDERS:
  🥇 Liverpool (86.5% chance)
  🥈 Arsenal (78.0% chance)
  🥉 Tottenham (66.1% chance)
• MODEL PERFORMANCE:
  - Training Data: 2760 matches
  - Test Accuracy: 50.6%
  - Historical Seasons: 9
  - Total Matches Analyzed: 3470
• KEY INSIGHTS:
  - Liverpool leads with superior points per game (3.00)
  - Strong goal difference (+6) indicates dominant attacking and defensive play
  - Historical performance data supports current form
  - Model considers form, historical strength, and head-to-head records
================================================================================
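The three "chance" values in the summary add up to well over 100%, so they read more like independent per-team scores than a single probability distribution over the title. If a distribution is wanted, one simple option is to divide each score by the total. The sketch below does that for the three listed teams only, which is an assumption made for illustration: a real normalization would need the scores of every team in the league, so the resulting shares are not a corrected forecast. `normalize_scores` is a hypothetical helper, not part of the notebook.

```python
def normalize_scores(scores):
    """Scale non-negative scores so they sum to 1 (hypothetical helper)."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    return {team: score / total for team, score in scores.items()}


# Scores copied from the summary; the rest of the league is missing here.
top_three = {"Liverpool": 0.865, "Arsenal": 0.780, "Tottenham": 0.661}
shares = normalize_scores(top_three)

for team, share in shares.items():
    print(f"{team}: {share:.1%}")
```

The ranking is unchanged by this step; only the scale changes, so that the values can be compared as shares of one total.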