## World cup group stage outcomes

With the group stages of World Cup 2018 drawing to a close, I was wondering which score configurations were attainable in each group (e.g. 9, 6, 3, 0 for Group A), and how many different match outcomes result in each score configuration. With just $3^6 = 729$ possibilities (“win”, “draw” or “loss” for each of 6 games), this was easy to code up.

There are 40 different possible group score configurations, with 7, 4, 4, 1 and 6, 4, 4, 3 being the most “common”, in the sense that they would be the most likely results if “win”, “draw” and “loss” were equally likely for each game. The table below shows the full list:

| Score configuration | No. of permutations |
|---|---|
| 7, 4, 4, 1 | 36 (4.9%) |
| 6, 4, 4, 3 | 36 (4.9%) |
| 9, 6, 3, 0 | 24 (3.3%) |
| 9, 4, 3, 1 | 24 (3.3%) |
| 9, 4, 2, 1 | 24 (3.3%) |
| 7, 6, 4, 0 | 24 (3.3%) |
| 7, 6, 3, 1 | 24 (3.3%) |
| 7, 6, 2, 1 | 24 (3.3%) |
| 7, 5, 4, 0 | 24 (3.3%) |
| 7, 5, 3, 1 | 24 (3.3%) |
| 7, 5, 2, 1 | 24 (3.3%) |
| 7, 4, 3, 3 | 24 (3.3%) |
| 7, 4, 3, 2 | 24 (3.3%) |
| 7, 4, 3, 1 | 24 (3.3%) |
| 7, 4, 2, 2 | 24 (3.3%) |
| 6, 6, 4, 1 | 24 (3.3%) |
| 6, 6, 3, 3 | 24 (3.3%) |
| 6, 5, 4, 1 | 24 (3.3%) |
| 6, 4, 4, 2 | 24 (3.3%) |
| 5, 5, 4, 1 | 24 (3.3%) |
| 5, 4, 4, 3 | 24 (3.3%) |
| 5, 4, 4, 2 | 24 (3.3%) |
| 5, 4, 3, 2 | 24 (3.3%) |
| 9, 6, 1, 1 | 12 (1.6%) |
| 9, 4, 4, 0 | 12 (1.6%) |
| 7, 7, 3, 0 | 12 (1.6%) |
| 7, 3, 2, 2 | 12 (1.6%) |
| 6, 5, 2, 2 | 12 (1.6%) |
| 5, 5, 3, 2 | 12 (1.6%) |
| 5, 5, 3, 1 | 12 (1.6%) |
| 5, 5, 2, 2 | 12 (1.6%) |
| 5, 3, 3, 2 | 12 (1.6%) |
| 9, 3, 3, 3 | 8 (1.1%) |
| 6, 6, 6, 0 | 8 (1.1%) |
| 4, 4, 4, 3 | 8 (1.1%) |
| 7, 7, 1, 1 | 6 (0.8%) |
| 4, 4, 4, 4 | 6 (0.8%) |
| 9, 2, 2, 2 | 4 (0.5%) |
| 5, 5, 5, 0 | 4 (0.5%) |
| 3, 3, 3, 3 | 1 (0.1%) |

The code I used to produce the table above is below:

    import collections
    import itertools


    def update_score(scores, home, away, outcome):
        """Update scores (in place) based on the outcome of one match."""
        if outcome == "win":        # home team wins
            scores[home] += 3
        elif outcome == "loss":     # home team loses
            scores[away] += 3
        else:                       # draw
            scores[home] += 1
            scores[away] += 1


    # maps each score configuration (sorted high to low) to its number of outcomes
    score_dict = collections.defaultdict(int)

    for outcome in itertools.product(["win", "draw", "loss"], repeat=6):
        # compute points for each of the teams
        scores = [0, 0, 0, 0]
        update_score(scores, 0, 1, outcome[0])
        update_score(scores, 0, 2, outcome[1])
        update_score(scores, 0, 3, outcome[2])
        update_score(scores, 1, 2, outcome[3])
        update_score(scores, 1, 3, outcome[4])
        update_score(scores, 2, 3, outcome[5])

        score_dict[tuple(sorted(scores, reverse=True))] += 1

    # sort by number of outcomes, most common first
    score_list = [(v, k) for k, v in score_dict.items()]
    score_list.sort(reverse=True)

    for item in score_list:
        print(item[1], item[0], round(item[0] / 729.0 * 100, 1))

    print(len(score_list))


Posted in Sports & Games

## World cup 2018 FAQ

World cup fever is underway! I’ve assembled a little FAQ below on some questions that I thought about while watching the group stages of the tournament. (I wrote a similar post back in 2014 which you can view here.)

Credit: technadu.com

1. What is the minimum number of points needed to guarantee qualifying for the knockout stage?

A team needs 7 points to guarantee qualifying. The maximum number of points the 4 teams can earn together is 18 (6 games of 3 points each). While it’s possible for 3 teams to get 6 points each, it’s not possible for 3 teams to get at least 7 points each, since that would require at least 21 points in total.
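
As a quick sanity check, this claim can be brute-forced over all $3^6 = 729$ outcomes. The snippet below is only an illustrative sketch (the names `MATCHES` and `final_points` are made up for this example); it finds the most points a third-placed team can end up with.

```python
import itertools

# The six group matches as (home, away) pairs of team indices 0..3.
MATCHES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]


def final_points(outcomes):
    """Return the four teams' points for one win/draw/loss assignment."""
    points = [0, 0, 0, 0]
    for (home, away), outcome in zip(MATCHES, outcomes):
        if outcome == "win":        # home team wins
            points[home] += 3
        elif outcome == "loss":     # home team loses
            points[away] += 3
        else:                       # draw
            points[home] += 1
            points[away] += 1
    return points


# Largest points total held by the third-placed team over all 729 outcomes.
max_third_place = max(
    sorted(final_points(o), reverse=True)[2]
    for o in itertools.product(["win", "draw", "loss"], repeat=6)
)
print(max_third_place)
```

It prints 6, so a team can finish third with 6 points, but never with 7.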

2. Is 2 wins enough to guarantee qualification for the knockout stage?

Somewhat surprisingly, no! It is possible to get knocked out even with 2 wins (i.e. 6 points). Let’s say our 4 teams are A, B, C and D. A beats B, B beats C, C beats A, and all 3 teams beat D. In this case, A, B and C all have 6 points but only two of them can qualify.

Interestingly enough, this might happen in this year’s Group F (Mexico, Germany, Sweden, South Korea).

3. Is it possible for the group to be decided after Matchday 2 (i.e. first 4 matches)?

Yes, in the sense that after Matchday 2, we know which 2 teams go through to the knockout rounds and which two get knocked out. For example, if A beats C and D, and B beats C and D, then A and B are through for sure. This happened in this year’s Group A (Russia, Uruguay, Egypt, Saudi Arabia) and Group G (England, Belgium, Tunisia, Panama).

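Having A and B beat C and D (as above) is the only way for the group to be decided after Matchday 2 in the sense above. In that case the winner of the group is still undecided, since A and B have yet to play each other. The group winner can be settled after Matchday 2 in a different scenario, though: if A beats B and C while D draws with both of them, A has 6 points and no other team can finish with more than 5.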
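
The uniqueness claim about the two qualifiers can also be checked by brute force. The sketch below makes two assumptions that are mine rather than part of the argument above: a particular schedule (A vs. B and C vs. D on Matchday 1, A vs. C and B vs. D on Matchday 2), and that “decided” means the same two teams finish strictly ahead on points however Matchday 3 goes. All names in it are hypothetical.

```python
import itertools

OUTCOMES = ["win", "draw", "loss"]
# Assumed schedule for teams 0..3 (A..D): Matchdays 1 and 2, then Matchday 3.
FIRST_FOUR = [(0, 1), (2, 3), (0, 2), (1, 3)]
LAST_TWO = [(0, 3), (1, 2)]


def add_points(points, matches, outcomes):
    """Return a copy of points with the given match outcomes applied."""
    points = list(points)
    for (home, away), outcome in zip(matches, outcomes):
        if outcome == "win":        # first-named team wins
            points[home] += 3
        elif outcome == "loss":     # first-named team loses
            points[away] += 3
        else:                       # draw
            points[home] += 1
            points[away] += 1
    return points


def qualifiers_decided(points):
    """True if the same two teams finish strictly above the rest in all 9 completions."""
    top_twos = set()
    for rest in itertools.product(OUTCOMES, repeat=2):
        final = add_points(points, LAST_TWO, rest)
        ranked = sorted(range(4), key=lambda team: final[team], reverse=True)
        if final[ranked[1]] == final[ranked[2]]:
            return False  # second and third are level on points
        top_twos.add(frozenset(ranked[:2]))
    return len(top_twos) == 1


decided = []
for first in itertools.product(OUTCOMES, repeat=4):
    points = add_points([0, 0, 0, 0], FIRST_FOUR, first)
    if qualifiers_decided(points):
        decided.append(points)

print(decided)
```

Of the 81 possible positions after Matchday 2, the only ones it reports are the two in which a pair of teams has 6 points each and the other two have none.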

4. Is 2 losses enough to guarantee elimination?

Interestingly, it is not enough! This is the case with this year’s Group F (Mexico, Germany, Sweden, South Korea). Here is a possible configuration: A beats B, C and D to top the group with 9 points. B beats C, C beats D, and D beats B, so they each have 3 points and 2 losses. Since the top 2 teams go through, one of the teams with 2 losses will go into the knockout stages.

This is the only possible configuration for a team with 2 losses to go through. Let’s say D loses to A and B. The maximum number of points D can obtain at the end of the group stage is 3 points. In order for D to advance, we cannot have two other teams scoring more than 3 points. However, A and B each already have 3 points from beating D. Consider their matches against C (A vs. C and B vs. C), from which they can earn more points:

• If A and B both win or draw these games, they will both have at least 4 points and D cannot advance.
• If A and B both lose these games, C will have 6 points, and at least one of A and B will pick up points from A vs. B, so D cannot advance.
• If A draws and B loses, then A and C will have 4 points, and D cannot advance. (Similarly, if B draws and A loses.)
• If A wins and B loses, then A has 6 points, B has 3 points and C has 3 points. The only way D can advance is for A to beat B in the remaining match and for D to beat C, which is the configuration above up to relabeling the teams. (Similarly, if A loses and B wins.)
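
The case analysis above can be double-checked by enumerating all 729 outcomes. In this rough sketch (function and variable names are invented for the example), a team “can go through” if fewer than two teams finish with strictly more points, which leaves the final decision to tiebreakers.

```python
import itertools

# The six group matches as (home, away) pairs of team indices 0..3.
MATCHES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]


def points_and_losses(outcomes):
    """Return (points, losses) per team for one win/draw/loss assignment."""
    points, losses = [0] * 4, [0] * 4
    for (home, away), outcome in zip(MATCHES, outcomes):
        if outcome == "win":        # home team wins
            points[home] += 3
            losses[away] += 1
        elif outcome == "loss":     # home team loses
            points[away] += 3
            losses[home] += 1
        else:                       # draw
            points[home] += 1
            points[away] += 1
    return points, losses


# Score configurations in which a team with at least 2 losses can go through.
configs = set()
for outcomes in itertools.product(["win", "draw", "loss"], repeat=6):
    points, losses = points_and_losses(outcomes)
    for team in range(4):
        teams_above = sum(p > points[team] for p in points)
        if losses[team] >= 2 and teams_above < 2:
            configs.add(tuple(sorted(points, reverse=True)))

print(configs)
```

The only configuration it finds is 9, 3, 3, 3, in line with the argument above.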

Posted in Sports & Games

## p^2-q and q^2-p prime

“$p$ and $q$ are 2 prime numbers. $p^2 - q$ and $p - q^2$ are also prime. If you divide $p^2 - q$ by a composite number $n$, where $n < p$, you’ll get a remainder of 14. If you divide $p - q^2 + 14$ by the same number, what will you get as the remainder?” – Akash

Posted in Random

## Statistical odds and ends

Between school and family duties, I’ve been finding it hard to find any time to indulge in olympiad math blogging 😦 At the same time, I’ve missed the feeling of typing up stuff that I find interesting and sharing it with others.

To that end, I just started a new blog Statistical Odds and Ends! The idea for this began when I found myself spending a lot of time googling relatively simple things in the course of my studies and research. For example:

• Why does the ridge regression solution exist and why is it unique?
• What is the formula for the matrix $P$ such that the projection of the vector $v$ onto the column space of a matrix $A$ is $Pv$?
• Can I switch supremums and expectations and still have equality? If not, can I get an inequality instead?
• How can I derive the bias-variance decomposition?

I was often googling for the same things over and over again, and trying to re-understand what others were writing.

Hence the idea of Statistical Odds and Ends. The blog will be a place for me to pen down my understanding of these statistical tidbits, and to share it with others. Hopefully some of the material there will be of interest to you! If the content is relevant to this audience, I will cross-post over on this blog too.

Posted in Uncategorized

## Stats Joke

From page 68 of Simon Singh’s The Simpsons and their Mathematical Secrets:

While heading to a conference on board a train, three statisticians meet three biologists. The biologists complain about the cost of the train fare, but the statisticians reveal a cost-saving trick. As soon as they hear the inspector’s voice, the statisticians squeeze into the toilet. The inspector knocks on the toilet door and shouts: “Tickets, please!” The statisticians pass a single ticket under the door, and the inspector stamps it and returns it. The biologists are impressed. Two days later, on the return train, the biologists show the statisticians that they have bought only one ticket, but the statisticians reply: “Well, we have no ticket at all.” Before they can ask any questions, the inspector’s voice is heard in the distance. This time the biologists bundle into the toilet. One of the statisticians secretly follows them, knocks on the toilet door and asks: “Tickets please!” The biologists slip the ticket under the door. The statistician takes the ticket, dashes into another toilet with his colleagues, and waits for the real inspector. The moral of the story is simple: “Don’t use a statistical technique that you don’t understand.”

Posted in Random

## [Soln] Central Limit Theorem: Strange Result!

For $n \in \mathbb{N}$, define the random variable

$X_n = \begin{cases} \pm 1 &\text{each with probability } \frac{1}{2}\left( 1 - \frac{1}{n^2} \right), \\ \pm n &\text{each with probability } \frac{1}{2n^2}. \end{cases}$

Let $S_n = \displaystyle\sum_{k = 1}^n X_k$. Prove that as $n \rightarrow \infty$,

a) the distribution of $\displaystyle\frac{S_n}{\sqrt{n}}$ converges to $\mathcal{N}(0, a)$ for some real number $a \neq 2$,

b) but $\text{Var} \displaystyle\frac{S_n}{\sqrt{n}}$ converges to 2.
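
Part b) can be checked numerically. The sketch below rests on two assumptions that I am stating explicitly: the $X_k$ are independent, and the large values of $X_k$ are $\pm k$, each with probability $\frac{1}{2k^2}$ (the scaling under which the variance in b) tends to 2). The function names are made up for this example.

```python
from fractions import Fraction


def variance_of_x(k):
    """Var(X_k), assuming X_k = +-1 w.p. (1 - 1/k^2)/2 each and +-k w.p. 1/(2k^2) each."""
    p_big = Fraction(1, k * k)  # total probability of the two large values +-k
    # X_k is symmetric about 0, so its mean is 0 and its variance is E[X_k^2].
    return (1 - p_big) * 1 + p_big * k * k


def variance_of_scaled_sum(n):
    """Var(S_n / sqrt(n)), assuming X_1, ..., X_n are independent."""
    return sum(variance_of_x(k) for k in range(1, n + 1)) / n


for n in (10, 100, 1000):
    print(n, float(variance_of_scaled_sum(n)))
```

Under these assumptions $\text{Var}(X_k) = 2 - \frac{1}{k^2}$, so the printed values creep up towards 2 as $n$ grows.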

(Credits: I learnt of this problem from Persi Diaconis in my probability class.)

Posted in Undergraduate

## [Hints] Central Limit Theorem: Strange Result!

Posted in Undergraduate