Predicting the Premier League
A statistical journey from naive combinatorics to Bayesian updating in the Egyptian Premier League.
The Egyptian Premier League was witnessing a historic final round—a scenario that hadn't occurred in 34 years. Three teams were competing for the title until the very end: Zamalek (53 points), Pyramids (51 points), and Al Ahly (50 points).
So what is the probability that Al Ahly, sitting third, wins the title? It's a strange question, because anyone following the league knew the scenario required to win the championship was "miraculous". But we are statisticians; let's try to answer it.
Only one match remained for each team before the curtain closed on the season:
- Zamalek vs Ceramica Cleopatra (4th place)
- Pyramids vs Smouha (5th place)
- Al Ahly vs Al Masry (6th place)
These were all tough matchups against top-six opponents. We couldn't say for sure that any of the three teams would win its match.
Phase 1: Naive Combinatorics
For each of these three matches, there are 3 possible outcomes: Win, Draw, or Lose. Assuming the results are completely independent of each other (e.g., Al Ahly winning doesn't force Zamalek to lose), we have 3 × 3 × 3 = 27 different possible endings.
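Al Ahly can only be champion if it wins, Zamalek loses, and Pyramids does not win. Writing each ending as [Al Ahly, Zamalek, Pyramids], that means [Win, Lose, Draw] or [Win, Lose, Lose]. That leaves only 2 scenarios out of the 27, so we can divide 2 by 27 directly to get roughly 7.4%.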
This is a simple application of combinatorial probability. The probability of an event equals the number of desired outcomes divided by the total number of possible outcomes.
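The count can be checked by brute force. The following is a minimal sketch, not the article's own code: it lists all 27 endings as (Al Ahly, Zamalek, Pyramids) and keeps the ones where Al Ahly finishes top. It assumes that Al Ahly wins the head-to-head tiebreaker when level on points with Zamalek, which is what the 2-out-of-27 count implies; the function and variable names are illustrative.

```python
from itertools import product

POINTS = {"W": 3, "D": 1, "L": 0}
START = {"Al Ahly": 50, "Zamalek": 53, "Pyramids": 51}


def al_ahly_champion(al_ahly, zamalek, pyramids):
    """Return True if Al Ahly tops the table for this combination of results."""
    a = START["Al Ahly"] + POINTS[al_ahly]
    z = START["Zamalek"] + POINTS[zamalek]
    p = START["Pyramids"] + POINTS[pyramids]
    # Assumption: Al Ahly wins the head-to-head tiebreaker,
    # so finishing level on points is enough.
    return a >= z and a >= p


# Every ending, written as (Al Ahly, Zamalek, Pyramids).
scenarios = list(product("WDL", repeat=3))
winning = [s for s in scenarios if al_ahly_champion(*s)]
naive_probability = len(winning) / len(scenarios)

print(len(scenarios), winning, round(naive_probability, 4))
```

Only the two endings named in the text survive the filter, which gives 2/27.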
But what is the problem with this method? We implicitly assumed that a Win, a Draw, and a Loss are all equally likely for all 3 teams (1/3 each). Is the probability of Al Ahly winning exactly the same as the probability of it losing? Almost certainly not. That's why we need to look at the probabilities of events individually.
Phase 2: Adding Weights (Expected Goals)
If we assume, for example, that Al Ahly's probabilities of a win, a draw, and a loss are not 1/3 each, the overall probability of winning the league is no longer 2/27. Each scenario now has a vastly different weight, so we need to multiply the actual probabilities of the specific required outcomes.
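With independent matches, the weighted version is a product: P(Al Ahly win) × P(Zamalek lose) × (P(Pyramids draw) + P(Pyramids lose)). Here is a small sketch of that arithmetic. The probabilities in the second example are made-up placeholders for illustration, not the article's figures.

```python
def title_probability(al_ahly, zamalek, pyramids):
    """P(Al Ahly champion) from per-match W/D/L probabilities.

    Each argument is a dict with keys "W", "D", "L" that sum to 1.
    Assumes the three matches are independent.
    """
    return al_ahly["W"] * zamalek["L"] * (pyramids["D"] + pyramids["L"])


# Sanity check: equal weights reproduce the naive 2/27.
uniform = {"W": 1 / 3, "D": 1 / 3, "L": 1 / 3}
naive = title_probability(uniform, uniform, uniform)

# Hypothetical weights, chosen only to show how the answer moves.
example = title_probability(
    al_ahly={"W": 0.5, "D": 0.3, "L": 0.2},
    zamalek={"W": 0.5, "D": 0.3, "L": 0.2},
    pyramids={"W": 0.5, "D": 0.3, "L": 0.2},
)

print(round(naive, 4), round(example, 4))
```

Changing any single input moves the result, which is why the choice of weights matters more than the counting itself.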
But where do we get these probabilities? From guessing? No, we use available facts and data: the team's form over the last 5 matches, win ratios, historical direct matchups, and average goals scored and conceded.
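One common way to turn goal averages into match probabilities is an independent Poisson model. The sketch below assumes that approach; the text does not say which model was actually used, and the scoring rates in the usage line are hypothetical.

```python
from math import exp, factorial


def poisson_pmf(k, rate):
    """Probability of exactly k goals when the expected number is `rate`."""
    return exp(-rate) * rate**k / factorial(k)


def match_probabilities(team_rate, opponent_rate, max_goals=10):
    """W/D/L probabilities for a team, from two expected-goal rates.

    Assumes each side's goals follow independent Poisson distributions.
    Scorelines above `max_goals` are ignored and the result renormalized.
    """
    win = draw = loss = 0.0
    for scored in range(max_goals + 1):
        for conceded in range(max_goals + 1):
            p = poisson_pmf(scored, team_rate) * poisson_pmf(conceded, opponent_rate)
            if scored > conceded:
                win += p
            elif scored == conceded:
                draw += p
            else:
                loss += p
    total = win + draw + loss
    return {"W": win / total, "D": draw / total, "L": loss / total}


# Hypothetical rates: the team averages 1.6 goals, the opponent 1.1.
print(match_probabilities(1.6, 1.1))
```

In practice the two rates would come from the form, head-to-head, and goals data mentioned above; how those are blended is a modeling choice this sketch leaves open.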
Phase 3: Monte Carlo Simulations
We've extracted the variables, calculated the probabilities, and arrived at a final probability. Beautiful. But what if these numbers changed, whether slightly or significantly? What if we simulated the final round thousands of times to see what happens?
Initially, the model threw a curveball: Al Ahly had a chance, but there was a massive chance of a "Tie". The model didn't understand the league's head-to-head tiebreaker rules! It treated tied points as an "unknown result". Once we fed the tiebreaker rules into the engine, the unresolved ties collapsed into the final distribution, perfectly matching our mathematical calculation.
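A simulation along these lines could look like the sketch below. It is not the article's engine: the match probabilities are illustrative placeholders, and the tiebreak order is an assumption (Al Ahly ahead of Zamalek on head-to-head, as the text implies; Zamalek ahead of Pyramids is an arbitrary placeholder that does not affect Al Ahly's result).

```python
import random
from collections import Counter

START = {"Zamalek": 53, "Pyramids": 51, "Al Ahly": 50}
POINTS = {"W": 3, "D": 1, "L": 0}

# Hypothetical per-match probabilities, for illustration only.
PROBS = {
    "Zamalek": {"W": 0.5, "D": 0.3, "L": 0.2},
    "Pyramids": {"W": 0.5, "D": 0.3, "L": 0.2},
    "Al Ahly": {"W": 0.5, "D": 0.3, "L": 0.2},
}

# Assumed ranking used to break ties on points (earlier wins the tie).
TIEBREAK_ORDER = ["Al Ahly", "Zamalek", "Pyramids"]


def simulate_final_round(rng):
    """Play the last round once and return the champion."""
    final = {}
    for team, probs in PROBS.items():
        result = rng.choices(list(probs), weights=list(probs.values()))[0]
        final[team] = START[team] + POINTS[result]
    best = max(final.values())
    # Without this step, level-on-points endings stay unresolved "ties".
    return next(t for t in TIEBREAK_ORDER if final[t] == best)


def estimate_title_odds(n_runs, seed=0):
    """Share of simulated seasons won by each team."""
    rng = random.Random(seed)
    champions = Counter(simulate_final_round(rng) for _ in range(n_runs))
    return {team: champions[team] / n_runs for team in START}


odds = estimate_title_odds(100_000)
exact = PROBS["Al Ahly"]["W"] * PROBS["Zamalek"]["L"] * (
    PROBS["Pyramids"]["D"] + PROBS["Pyramids"]["L"]
)
print(odds, exact)
```

With enough runs, the simulated share for Al Ahly should settle close to the exact product, which is the agreement described above.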
Phase 4: Updating Beliefs (Bayes' Theorem)
Up until now, we've treated the probabilities as static. As if the world doesn't change. But what if new information appears before the match? A sudden injury? A tactical change? Do we stick to the old numbers, or do we update our conviction entirely?
This brings us to the most beautiful philosophical part of our analysis: What is the probability of Al Ahly winning this match, GIVEN that they won the league? Because winning this match is a strict requirement for winning the league, this probability (the Likelihood) is exactly 100% (P(B|A) = 1).
With A as winning the league and B as winning the match, the equation becomes brilliantly simple: P(A|B) = P(B|A) × P(A) / P(B) = P(A) / P(B). The probability almost doubled. Here lies the secret beauty of Bayes' Theorem: based on a single piece of evidence, you shift your entire belief.
- Prior belief P(A): the initial chance of winning the league
- Evidence P(B): the probability of winning the match
- Likelihood P(B|A): fixed at 100% (must-win match)
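The update itself is one line of arithmetic. Here is a minimal sketch using the three quantities above; the function name is illustrative and the numbers in the example are hypothetical, not the article's values.

```python
def posterior_title_probability(prior_title, p_match_win, likelihood=1.0):
    """Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B).

    A = Al Ahly wins the league, B = Al Ahly wins its final match.
    The likelihood defaults to 1.0 because the match is a must-win.
    """
    if not 0 < p_match_win <= 1:
        raise ValueError("p_match_win must be in (0, 1]")
    posterior = likelihood * prior_title / p_match_win
    if posterior > 1:
        # A implies B here, so a coherent prior cannot exceed P(B).
        raise ValueError("inputs are inconsistent: posterior exceeds 1")
    return posterior


# Hypothetical inputs: a 5% prior and a 50% chance of winning the match.
updated = posterior_title_probability(prior_title=0.05, p_match_win=0.5)
print(updated)
```

The less likely the match win was beforehand, the more it is worth as evidence once it happens: halving P(B) doubles the posterior.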
Final Thoughts
From Guesswork to Logical Simulation
The Beauty of Statistics
Perhaps the real question wasn't "Who will win the league?", but rather, "How do we attempt to measure something so chaotic?" Statistics turns random chaos into a logical journey of building probabilities and updating convictions.
Behind the Numbers
A model doesn't "understand" football. It only sees patterns. Modeling is an attempt to match the past with the present to give a glimpse of the future.