News, updates and analysis
We've added 4 new prediction models to CS2PREDICT, bringing the total to 21 AI models. Each was discovered through automated optimization across 3,400+ feature combinations tested against 667 completed matches.
APEX — ELO + Form + Streak + Strength of Schedule
Balanced model optimized for the best accuracy-to-coverage ratio. Combines team rating with current momentum, win/loss streaks, and opponent difficulty. 66% accuracy, 60% coverage.
ORACLE — ELO + ELOHQ + Form + Tilt + SOS
Five-factor model that adds psychological pressure (tilt) and dual ELO systems. Performs best when teams have clear form differences. 69% accuracy, 51% coverage.
EAGLE — Rank + ELO + ELOHQ + Form + WinRate
High-confidence model that combines three ranking signals with team form. Predicts less frequently but with strong conviction. 77% accuracy, 14% coverage.
PHANTOM — ELO + TierELO + WinRate + Tilt
Maximum accuracy model. Only makes predictions when all signals align strongly. When PHANTOM speaks, it's almost always right. 92% accuracy, 6% coverage.
All four models are currently in calibration mode — they need at least 10 confident predictions before appearing in the model rankings. You can already see their predictions on individual match pages.
The new models are available exclusively for Pro subscribers.
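The accuracy and coverage figures quoted for each model can be read as two separate ratios. The sketch below is an illustration only: it assumes coverage means the share of matches on which a model issued a prediction at all, and accuracy means the share of those issued predictions that were correct. The function name and the input format are hypothetical, not taken from CS2PREDICT.

```python
from typing import Optional, Sequence, Tuple


def accuracy_and_coverage(
    results: Sequence[Optional[bool]],
) -> Tuple[Optional[float], float]:
    """Compute (accuracy, coverage) for one model.

    Each entry in `results` is one match (assumed encoding):
      None  -> the model abstained (no prediction shown)
      True  -> the model predicted and was correct
      False -> the model predicted and was wrong
    """
    made = [r for r in results if r is not None]
    coverage = len(made) / len(results) if results else 0.0
    # Accuracy is undefined when the model never predicted anything.
    accuracy = sum(made) / len(made) if made else None
    return accuracy, coverage


# Hypothetical history: 10 matches, 4 predictions made, 3 of them correct.
history = [True, None, False, True, None, None, True, None, None, None]
accuracy, coverage = accuracy_and_coverage(history)
```

Under this reading, a model like PHANTOM trades coverage for accuracy: it abstains on most matches and is scored only on the few it commits to.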
The Model Leaderboard ranks all AI models by their prediction accuracy on Top-tier matches. It's a live scoreboard — accuracy updates after every completed match.
Models are colored on a gradient from green (best accuracy) to red (worst). The color updates dynamically as accuracy changes — if a model goes on a hot streak, it moves up and turns greener.
New models show a pulsing orange dot and "calibrating" status until they have at least 10 predictions. This prevents ranking models based on too few data points.
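The ranking and calibration rules described above could be sketched roughly as follows. This is not the site's actual implementation: the `ModelStats` class, the function names, and the choice to color by leaderboard position (rather than by raw accuracy) are assumptions made for illustration. Only the 10-prediction threshold comes from the text.

```python
from dataclasses import dataclass

MIN_PREDICTIONS = 10  # calibration threshold quoted in the text


@dataclass
class ModelStats:
    name: str      # hypothetical model identifier
    correct: int   # number of correct predictions
    total: int     # number of predictions made

    @property
    def accuracy(self) -> float:
        return self.correct / self.total if self.total else 0.0

    @property
    def calibrating(self) -> bool:
        # Too few data points to rank fairly.
        return self.total < MIN_PREDICTIONS


def build_leaderboard(models):
    """Return (ranked, pending): ranked models sorted best-first,
    and models still in calibration, which are kept off the ranking."""
    ranked = sorted(
        (m for m in models if not m.calibrating),
        key=lambda m: m.accuracy,
        reverse=True,
    )
    pending = [m for m in models if m.calibrating]
    return ranked, pending


def gradient_color(position: int, count: int) -> str:
    """Map a leaderboard position to a hex color: green for the top
    spot, red for the last (assumed linear blend between the two)."""
    t = position / (count - 1) if count > 1 else 0.0
    red, green = round(255 * t), round(255 * (1 - t))
    return f"#{red:02x}{green:02x}00"
```

A model with 4 correct out of 4 would show 100% accuracy, yet it stays in the pending list until it reaches 10 predictions, which is the point of the calibration rule.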
No single model is always right. The best strategy is to look at consensus — when most models agree, the prediction is usually reliable. When models disagree sharply, the match is genuinely uncertain.
Check the leaderboard regularly — the best model this week might not be the best next week. Form matters for models too.
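One way to make the consensus idea concrete is to count how many models pick each team. The sketch below is an illustration under assumptions: the 75% agreement threshold and the function name are hypothetical, and the text does not state how CS2PREDICT itself measures consensus.

```python
from collections import Counter
from typing import Dict, Optional, Tuple


def consensus(
    picks: Dict[str, str], min_share: float = 0.75
) -> Tuple[Optional[str], float, bool]:
    """Summarize agreement between models.

    `picks` maps a model name to the team it predicts to win.
    Returns (most-picked team, share of models picking it,
    whether that share reaches the assumed `min_share` threshold).
    """
    if not picks:
        return None, 0.0, False
    team, votes = Counter(picks.values()).most_common(1)[0]
    share = votes / len(picks)
    return team, share, share >= min_share


# Hypothetical picks from four models for one match.
team, share, agreed = consensus(
    {"model_a": "Team A", "model_b": "Team A", "model_c": "Team A", "model_d": "Team B"}
)
```

A low share signals the kind of sharp disagreement that marks a match as genuinely uncertain.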
Brier Score measures how good probability predictions are, not just whether the winner was guessed correctly. It ranges from 0 (perfect) to 1 (worst possible).
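Formula: Average of (predicted probability - actual outcome)² across all predictions, where the predicted probability is the chance assigned to a team and the actual outcome is 1 if that team won and 0 if it lost.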
Consider two models predicting the same 10 matches:
Brier Score captures this: Model B has a much better (lower) Brier Score because its confident predictions are correct.
| Brier Score | Interpretation |
|---|---|
| 0.25 | Random coin flip (baseline) |
| 0.20-0.25 | Poor: barely better than random |
| 0.15-0.20 | Good: meaningful prediction quality |
| 0.10-0.15 | Very good: strong predictive power |
| <0.10 | Excellent: rare in esports |
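The Brier Score formula can be worked through in a few lines of code. The two models and all of their numbers below are hypothetical, chosen only to show how two models with the same win/loss record can end up with very different scores. Both pick the right winner in 8 of 10 matches. The first always says 55%, while the second says 90% on the matches it gets right and 55% on the two it gets wrong.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probability and outcome.

    probabilities[i] is the probability given to a team in match i,
    outcomes[i] is 1 if that team won and 0 if it lost.
    """
    if not probabilities or len(probabilities) != len(outcomes):
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical data: the favored team won 8 of the 10 matches.
outcomes = [1] * 8 + [0] * 2
cautious_model = [0.55] * 10               # always barely above a coin flip
confident_model = [0.90] * 8 + [0.55] * 2  # confident where it is right

cautious_score = brier_score(cautious_model, outcomes)    # about 0.22
confident_score = brier_score(confident_model, outcomes)  # about 0.07
```

Both models are 80% accurate, but only the confident one scores in the "Excellent" band. A model that says 50% on every match scores exactly 0.25, which is the coin-flip baseline.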
We also track "Sharp": accuracy specifically when the model is highly confident (80%+). A model with an 85% Sharp score is right 85% of the time when it is very confident. This is what matters for high-stakes decisions.
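As a rough sketch, Sharp can be computed by filtering predictions to the high-confidence ones before measuring accuracy. The 80% threshold comes from the text. The function name and the input format (pairs of confidence and correctness) are assumptions for illustration.

```python
from typing import Iterable, Optional, Tuple


def sharp_accuracy(
    predictions: Iterable[Tuple[float, bool]], threshold: float = 0.80
) -> Optional[float]:
    """Accuracy restricted to predictions with confidence >= threshold.

    Each prediction is (confidence in the picked team, whether the pick won).
    Returns None when the model made no high-confidence predictions.
    """
    confident = [correct for confidence, correct in predictions
                 if confidence >= threshold]
    if not confident:
        return None
    return sum(confident) / len(confident)


# Hypothetical history: four high-confidence picks, three of them correct.
history = [(0.85, True), (0.90, True), (0.82, False), (0.60, False), (0.95, True)]
sharp = sharp_accuracy(history)
```

The 60% pick is ignored entirely, so a model can have a mediocre overall accuracy and still post a strong Sharp score.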
When a model predicts Team A at 65%, it means that in similar situations historically, Team A won about 65 out of 100 times. It's not a guarantee — it's a probability based on patterns the model has learned.
80%+ — The model is very confident. One team has a clear advantage in skill, form, or matchup. These predictions are correct about 75-85% of the time.
60-70% — Moderate confidence. The favored team has an edge but upsets happen regularly. About 65% accuracy.
50-55% — Coin flip. The model sees no clear advantage. We hide these predictions (shown as 50/50) because they carry no useful signal.
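The display rule for near-coin-flip predictions could look something like the sketch below. It assumes the cutoff sits at 55% for the favored team, inclusive, which is one reading of the "50-55%" band above. The function name and output format are hypothetical.

```python
def display_prediction(team_a: str, team_b: str, prob_a: float,
                       hide_at_or_below: float = 0.55) -> str:
    """Return the label shown for a match.

    `prob_a` is the model's probability that team_a wins.
    Predictions where the favorite is at or below the cutoff are
    shown as "50/50" because they carry no useful signal.
    """
    favorite_prob = max(prob_a, 1 - prob_a)
    if favorite_prob <= hide_at_or_below:
        return "50/50"
    favorite = team_a if prob_a >= 0.5 else team_b
    return f"{favorite} {round(favorite_prob * 100)}%"
```

For example, `display_prediction("Team A", "Team B", 0.53)` yields `"50/50"`, while a 65% prediction for Team A is shown with its percentage.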
If ELO says 70% Team A but MIND says 60% Team B, the matchup is genuinely uncertain. Different models weigh different factors, so they can reach different conclusions about the same match.
When models disagree, the match is likely to be close.
Our models achieve strong accuracy on top-tier matches where we have rich data about both teams. For lower-tier matches with unknown teams, accuracy drops to ~50%, no better than a coin flip. We only show predictions where they add real value.
Every match on CS2PREDICT is automatically classified into one of three tiers based on team rankings and event importance:
Matches between top-30 ranked teams at significant events (BLAST, ESL, PGL, IEM). This is where our models perform best. Both teams are well-known with extensive historical data.
Major championship qualifiers and playoff matches. Even if teams aren't in the top 30, the event importance is high. Good data availability, reliable predictions.
Lower-tier events, regional qualifiers, open qualifiers. Teams often have limited data — small match history, unstable rosters, no HLTV ranking. Our models achieve only ~50% accuracy here (coin flip level), so we show these matches as schedule only, without predictions.
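The tier rules above can be approximated with a small classifier. This is a sketch under stated assumptions, not the production logic: the input fields, the substring match on organizer names, the `major_stage` flag, and the label `"lower"` for the third tier are all hypothetical. The top-30 cutoff and the organizer list come from the text.

```python
from typing import Optional

SIGNIFICANT_ORGANIZERS = ("BLAST", "ESL", "PGL", "IEM")  # listed in the text
TOP_RANK_CUTOFF = 30                                     # "top-30 ranked teams"


def classify_match(rank_a: Optional[int], rank_b: Optional[int],
                   event_name: str, major_stage: bool = False) -> str:
    """Assign a match to a tier.

    rank_a / rank_b: team rankings, or None when a team is unranked.
    major_stage: True for Major qualifiers and playoff matches
                 (assumed to be flagged upstream).
    """
    both_top = all(r is not None and r <= TOP_RANK_CUTOFF
                   for r in (rank_a, rank_b))
    significant = any(org in event_name.upper()
                      for org in SIGNIFICANT_ORGANIZERS)
    if both_top and significant:
        return "top"
    if major_stage:
        return "major"
    return "lower"  # placeholder label for the schedule-only tier


def shows_predictions(tier: str) -> bool:
    # Lower-tier matches are listed as schedule only.
    return tier in ("top", "major")
```

An unranked team automatically rules out the top tier here, which matches the description of lower-tier teams as often having no HLTV ranking.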
Showing bad predictions is worse than showing no predictions. A user who follows a 50% accurate model will lose trust quickly. By focusing on Top and Major tiers, we ensure every prediction we show has genuine analytical value behind it.