*Article originally posted on 30th March 2017 here*

In *Part I* of the series we looked getting data and some basic formulas. *Part II* looked at xGoals in more detail and a basic method of finding the expected number of goals scored between Liverpool and Everton. This method is by no means the end point and there is still some improvements to be made, but we’re beginning from scratch and we’ll come back to improving the numbers later on.

In Part III of the series we are going to delve deeper into the Liverpool v Everton match to discuss bookmaker margin or overround, but first we need to do some troubleshooting.

##### Troubleshooting and Formula Breakdown

It’s great to see so many of you following along with the series in building your own models. My goal with this series is to give people a starting point within this area of analysis and to teach some basic Excel skills. Many of you have contacted me through Twitter and email with some problems you have been having with formulas, so I’ll answer the main question you have all been asking:

**Q: Why does my formula return a DIV/0 ERROR? or Why does my formula open a dialogue box looking for a file?**

**A: **This is due to copying the formulas directly from the articles and pasting them in your formula bar. The issue is the ‘ symbol. The formatting of this website changes this to an apostrophe like in the word Stephen**‘**s and Excel does not recognise it. Compare the following:

=(AVERAGEIF(Results!$C:$C,’Shots and Goal Data’!$A2,Results!L:L)+AVERAGEIF(Results!$D:$D,’Shots and Goal Data’!$A2,Results!M:M))/2

In the formula typed first the symbols before and after Shots and Goal Data have curves, in the image below they are straight. Excel sees the curved symbol as an indication that the sheet you are trying to reference is a different workbook and therefore will not calculate.

So instead of telling you to just manually type the formulas, let’s look at the different components of some formulas so you can see what each part does and more quickly build them yourself.

In Excel, press CTRL+N to open a new workbook. From cell A2 to A11 enter the numbers 1 to 10.

In cell C2 type the following: =SUM(

The “=” sign tells Excel that we want to run a formula.

“SUM” is the *function* we want to run

The open bracket ( tells Excel what to sum.

As you can see, when type a formula in Excel, it gives us a little help, showing us the next part we need to enter. We could type all of the cell references we want to calculate like:

=SUM(A2,A3,A4,A5…..

But there is a much faster way. In your formula, after your first open bracket, simply click cell A2 and press CTRL+SHIFT+DOWN.

This selects the **range** of cells rather than each individual cell. Finish the formula with a closed bracket ) and press enter. Cell C2 should now show a value of 55.

In cell C3 follow the same steps again, except this time find the **AVERAGE.**

Cell C3 should now have a value of **5.5**

We can get Excel to calculate specific data within a data set by adding an **IF** function. Let’s imagine we want to calculate how many goals Arsenal have scored and conceded at home this season.

Copy the results page we created in part one and paste it on a new sheet in our new practice workbook. Be sure to name the new sheet **Results. **

So, to calculate how many goals Arsenal have scored, we need to combine our SUM function with an IF function. Excel already has this function so in cell E2 type the team name: Arsenal. In F2, we will calculate how many goals they have scored at home by typing:

=SUMIF(

This function asks us to first to give a range to check our **IF **statement. So we want to get results only for Arsenal. So the range will be the column that shows the **Home Team:**

Our **criteria **is the value we are searching for within the range. We want to only find results for Arsenal, so after $C:$C type a **comma **and **$E2** or after the comma, click on cell E2 and press the F4 key three times.

After **$E2 **we type one more comma. Now we need to add the range we want to sum, which is **Home Goals**. This leaves us with the formula:

This should give us the total number of goals Arsenal have scored at home this season as **26**. Select cell F2 and hover your cursor over the bottom-right edge until you see a.

Click and drag one cell to the right. Cell G2 should now show how many goals Arsenal conceded at home: **13**.

If you look at the completed formula above again, the only cell or range reference that does not have a $ is the final one. This means that when we dragged it across to cell G2, it changed the sum range from E:E (Home Goals) to F:F (Away Goals).

Do the same again in cells H2 and I2 but this time find the average using **AVERAGEIF**. The results should be 2.0 scored and 1.0 conceded.

So what these formulas do is search through our Home Team column, when it finds Arsenal it pulls the number from the same row in column E and adds it to give us our total & average home goals scored etc… We can do the same for Arsenal’s away record by changing one part of the formula too.

**Overround: The Bookmaker’s Money-Maker**

Imagine you have a deck of cards. If you shuffle the deck there is a 50% chance that the first card you turn over is either red or black. If you put that card back in the deck and shuffle again, you still have a 50% chance of the first card being either red or black. Let’s say you do this 10 times and get black each time. That does not mean that you have a higher chance of getting black the 11th time. The deck has no memory, so if we repeated this 10,000 times we should end up with about half the results being red, and half being black.

If a bookmaker came along and offered you 2.0 odds (EVENS in old money), that the next card would be red, over 10,000 bets you would theoretically make neither a profit nor a loss as you would win half of the time. This assumes of course that the market is fair. What the bookmaker does however, is shift the odds in his favour. So instead of offering you odds of 2.0 (50%), he offers you 1.91 (52.36%) that the next card will be red or black. This means that regardless of the outcome of the next 10,000 bets, the bookmaker will make a 4.72% profit (52.36% + 52.36% = 104.72%). This extra percentage above 100 is what is known as the bookmaker’s margin or overround.

The same thing is done with football. Bookmakers will ~~manipulate~~ shorten or lengthen odds to ensure enough money is bet on each outcome to secure a profit. Let’s look at our Liverpool v Everton example:

Bet365 currently have their market at:

- Liverpool – 1.62 (1/ 1.62 = 61.73%)
- Draw – 3.80 (1/ 3.80 = 26.32%)
- Everton – 5.50 (1/ 5.50 = 18.18%)

This adds up to a total market value of 106.23%, meaning Bet365 have a margin of 6.23%. This is not the “true market”. The odds we calculated do not include an overround, so in order to compare our odds to Bet365, we need to make their market equal to 100%. We do this by dividing the % chance of each outcome by Bet365’s total %.

There’s still an issue here, and is due to the *favourite-longshot **bias *which you can and should read about at the link above.

##### Markets beyond Home-Draw-Away

We can get everything set up now for other match markets like double chance, draw no bet, total goals, both to score, handicap markets, etc.. all by selecting different parts of our correct score matrix and the odds we’ve already calculated. But the xGoals are not tight enough to give anything close to a reliable probability in these markets, yet.

Lets take a look at calculating Both to Score:

We can calculate the market “Both to Score – Yes” by summing all of the parts of this matrix inside the red outline. Similarly, we can calculate the No side of the market by calculating the numbers inside the blue outline.

The same can be done for Over/Under total goals. Sum all of the parts that are **less **that 3 goals. This gives your under 2.5 goals market. If you take this number away from 1 (1- Under2.5Goals %), it will show the probability for over 2.5 goals. We could do the same for all over/under markets.

**Please note, the % and odds for the O/U and BTS markets are not accurate at this stage**. We will be looking at a different method to find these odds in Part IV when we discuss **value **and look at past results, before tackling the issues with our xGoals in Part V.

Alright, I’ll leave it there for now. As usual, If you have any questions please feel free to contact me below or *@petermckeever *but I may be a bit slow to respond as I am currently in Rome on my first holiday in 3 years!

hey man, thanks for the write up. I am looking at calculating individual player rating versus the team’s charisma to get a more accurate outlook of the match. Can it be done via excel or R?

LikeLike

Hey Dennis, that’s an interesting idea. How will you quantify charisma is the first question though!

LikeLike