EPL 29-31 August 2015

EPL 29-31st August 2015

Having left for Tbilisi, Georgia, to watch Scotland lose on Saturday August the 29th, I'm only just getting back now to take a brief look over that weekend's results in the English Premier League.

Here are the stats from all the matches:

In [1]:
%matplotlib inline
import league_analysis
from IPython.display import display, HTML
epl = league_analysis.epl
league_analysis.display_matches(epl, '29/08/15', '31/08/15')
        Home            Away
Team    Aston Villa     Sunderland
Goals   2               2
Shots   21              7
SOT     6               3

        Home            Away
Team    Bournemouth     Leicester
Goals   1               1
Shots   5               4
SOT     2               2

        Home            Away
Team    Chelsea         Crystal Palace
Goals   1               2
Shots   26              13
SOT     9               6

        Home            Away
Team    Liverpool       West Ham
Goals   0               3
Shots   13              12
SOT     1               5

        Home            Away
Team    Man City        Watford
Goals   2               0
Shots   18              7
SOT     5               0

        Home            Away
Team    Newcastle       Arsenal
Goals   0               1
Shots   1               22
SOT     0               9

        Home            Away
Team    Stoke           West Brom
Goals   0               1
Shots   13              15
SOT     5               5

        Home            Away
Team    Tottenham       Everton
Goals   0               0
Shots   20              8
SOT     6               3

        Home            Away
Team    Southampton     Norwich
Goals   3               0
Shots   23              6
SOT     8               1

        Home            Away
Team    Swansea         Man United
Goals   2               1
Shots   9               11
SOT     4               4

Chelsea 1 - 2 Crystal Palace


Results Histograms

Result Histograms

Suppose you wish to bet on the correct score for games in a particular league. Of course, the odds for each result will depend heavily on the two teams involved, but it would at least help to know what the base rates are, that is, how often each kind of result occurs in a particular league. There is no point in betting on a four-nil away victory if that result never happens. Here, we give a few histograms of the kinds of results that happen in our standard five leagues.

We start off with a bunch of imports.

In [1]:
%matplotlib inline
from IPython.display import display, HTML
import league_analysis
import collections
import matplotlib.pyplot as plot
import numpy
league_names = ['epl', 'ech', 'elo', 'elt', 'spl']

Then we require some code to actually draw the histogram of results from a particular set of matches. Our code here is generic enough to draw the histograms for both results (home, away, draw) and correct scores (0-0, 0-1, etc.).
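As a rough illustration of the counting step behind such histograms, here is a minimal sketch. It is not the code from the full post: the helper names (`result_of`, `count_results`, `count_scores`, `text_histogram`) are made up for this example, and the scores are simply the ten EPL matches of 29-31 August 2015 listed on this blog rather than a whole league history.

```python
import collections

# Goals (home, away) for the ten EPL matches of 29-31 August 2015,
# copied from the match tables on this blog.
scores = [(2, 2), (1, 1), (1, 2), (0, 3), (2, 0),
          (0, 1), (0, 1), (0, 0), (3, 0), (2, 1)]


def result_of(home_goals, away_goals):
    """Classify a score as a home win, an away win or a draw."""
    if home_goals > away_goals:
        return 'home'
    if home_goals < away_goals:
        return 'away'
    return 'draw'


def count_results(scores):
    """Count how often each kind of result (home, away, draw) occurs."""
    return collections.Counter(result_of(h, a) for h, a in scores)


def count_scores(scores):
    """Count how often each correct score (0-0, 0-1, etc.) occurs."""
    return collections.Counter('{}-{}'.format(h, a) for h, a in scores)


def text_histogram(counter):
    """Render the counts as a crude text histogram, one bar per key."""
    return '\n'.join('{:>5} {}'.format(key, '#' * count)
                     for key, count in sorted(counter.items()))


result_counts = count_results(scores)
score_counts = count_scores(scores)
print(text_histogram(result_counts))
print(text_histogram(score_counts))
```

The same two counters could be passed to a plotting library to draw bar charts; the text bars here just keep the sketch free of dependencies.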


Shots Per Goal


A very quick post which explores the average number of shots and shots on target it takes to score a goal. If we assume that shots and shots on target are a better indicator of skill than the number of goals, then we can get an idea of what the score should have been for any given game. Of course, this simple analysis ignores score effects, which are almost certainly prevalent.

In [1]:
import league_analysis
from IPython.display import display, HTML

The following tables then show for each year the number of shots divided by the number of goals for each of the 5 leagues we are examining and the average across all 5 leagues. We then get an overall picture for all years, firstly for each league individually and then in the final row of the final table we have the overall picture which includes all the matches we have data for.

In addition, we do the same analyses separately for home shots/goals and away shots/goals, since it may be that there is a difference in the number of shots taken for each goal by home and away teams.
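To make the calculation concrete, here is a minimal sketch of the shots-per-goal arithmetic. It is not the `league_analysis` code; the `per_goal` helper is invented for this example, and the data is just the ten EPL matches of 29-31 August 2015 from this blog, which is far too small a sample to say anything about the leagues as a whole.

```python
# (home goals, home shots, home SOT, away goals, away shots, away SOT)
# for the ten EPL matches of 29-31 August 2015, copied from this blog.
matches = [
    (2, 21, 6, 2, 7, 3),    # Aston Villa v Sunderland
    (1, 5, 2, 1, 4, 2),     # Bournemouth v Leicester
    (1, 26, 9, 2, 13, 6),   # Chelsea v Crystal Palace
    (0, 13, 1, 3, 12, 5),   # Liverpool v West Ham
    (2, 18, 5, 0, 7, 0),    # Man City v Watford
    (0, 1, 0, 1, 22, 9),    # Newcastle v Arsenal
    (0, 13, 5, 1, 15, 5),   # Stoke v West Brom
    (0, 20, 6, 0, 8, 3),    # Tottenham v Everton
    (3, 23, 8, 0, 6, 1),    # Southampton v Norwich
    (2, 9, 4, 1, 11, 4),    # Swansea v Man United
]


def per_goal(attempts, goals):
    """Average number of attempts per goal (NaN if there were no goals)."""
    return attempts / goals if goals else float('nan')


# Totals of [goals, shots, shots on target] for home teams, away teams, and both.
home = [sum(m[i] for m in matches) for i in (0, 1, 2)]
away = [sum(m[i] for m in matches) for i in (3, 4, 5)]
overall = [h + a for h, a in zip(home, away)]

for label, (goals, shots, sot) in [('Home', home), ('Away', away), ('All', overall)]:
    print('{:<5} shots per goal: {:.2f}  SOT per goal: {:.2f}'.format(
        label, per_goal(shots, goals), per_goal(sot, goals)))
```

Dividing a team's shots in a single game by the overall shots-per-goal figure then gives one crude estimate of the score that game "should" have had.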


EPL 22-24 August

EPL 22-24 August 2015/2016

Weekly round-up of games in the English Premier League.

In [1]:
%matplotlib inline
import league_analysis
from IPython.display import display, HTML
epl = league_analysis.epl
In [2]:
league_analysis.display_matches(epl, '22/08/15', '24/08/15')
        Home            Away
Team    Crystal Palace  Aston Villa
Goals   2               1
Shots   16              11
SOT     6               2

        Home            Away
Team    Leicester       Tottenham
Goals   1               1
Shots   13              19
SOT     2               6

        Home            Away
Team    Man United      Newcastle
Goals   0               0
Shots   20              7
SOT     8               0

        Home            Away
Team    Norwich         Stoke
Goals   1               1
Shots   21              6
SOT     7               1

        Home            Away
Team    Sunderland      Swansea
Goals   1               1
Shots   10              20
SOT     2               9

        Home            Away
Team    West Ham        Bournemouth
Goals   3               4
Shots   10              15
SOT     4               7

        Home            Away
Team    Everton         Man City
Goals   0               2
Shots   10              16
SOT     1               9

        Home            Away
Team    Watford         Southampton
Goals   0               0
Shots   13              14
SOT     0               5

        Home            Away
Team    West Brom       Chelsea
Goals   2               3
Shots   15              15
SOT     6               5

        Home            Away
Team    Arsenal         Liverpool
Goals   0               0
Shots   19              15
SOT     5               8

This is being done on Monday before the Arsenal vs Liverpool game so I'll update this post after data for that match is available.


EPL August 14-17

EPL Second Round

The English Premier League has now gotten through a second round of matches and, as everyone predicted, Leicester are five points clear of the champions Chelsea. Here are the important statistics from all of the weekend's games:

In [1]:
%matplotlib inline
import league_analysis
epl = league_analysis.epl
In [2]:
league_analysis.display_matches(epl, '14/08/15', '17/08/15')
        Home            Away
Team    Aston Villa     Man United
Goals   0               1
Shots   5               9
SOT     1               2

        Home            Away
Team    Southampton     Everton
Goals   0               3
Shots   17              10
SOT     4               4

        Home            Away
Team    Sunderland      Norwich
Goals   1               3
Shots   6               19
SOT     2               6

        Home            Away
Team    Swansea         Newcastle
Goals   2               0
Shots   19              4
SOT     6               2

        Home            Away
Team    Tottenham       Stoke
Goals   2               2
Shots   13              16
SOT     7               7

        Home            Away
Team    Watford         West Brom
Goals   0               0
Shots   16              6
SOT     5               0

        Home            Away
Team    West Ham        Leicester
Goals   1               2
Shots   10              11
SOT     3               6

        Home            Away
Team    Crystal Palace  Arsenal
Goals   1               2
Shots   11              20
SOT     4               7

        Home            Away
Team    Man City        Chelsea
Goals   3               0
Shots   18              10
SOT     8               3

        Home            Away
Team    Liverpool       Bournemouth
Goals   1               0
Shots   18              13
SOT     2               2

Man City 3 - 0 Chelsea

The biggest game was undoubtedly between 1st and 2nd in the league last season. There is little doubt that Man City outplayed their guests and deserved their win, although if you really wanted to argue the case you would point to the potential sending off of Fernandinho just prior to half time. Whether or not you agree that it deserved a red card, there was certainly the potential for the referee to see it that way.

Anyway, the shots indicate that City were indeed the better team, but 3-0 likely overstates the case. In particular Chelsea had some very decent opportunities. It's the sort of game that Chelsea could have undeservedly won/drawn.


EPL First Weekend


A quick post with some statistics from the first weekend of matches in the EPL. Of course at this stage it is not very sensible to draw any conclusions.

In [1]:
%matplotlib inline
import league_analysis
epl = league_analysis.year_201516.epl_league

Matches

Here are the matches:

In [2]:
league_analysis.display_matches(epl, '08/08/15', '10/08/15')
        Home            Away
Team    Bournemouth     Aston Villa
Goals   0               1
Shots   11              7
SOT     2               3

        Home            Away
Team    Chelsea         Swansea
Goals   2               2
Shots   11              18
SOT     3               10

        Home            Away
Team    Everton         Watford
Goals   2               2
Shots   10              11
SOT     5               5

        Home            Away
Team    Leicester       Sunderland
Goals   4               2
Shots   19              10
SOT     8               5

        Home            Away
Team    Man United      Tottenham
Goals   1               0
Shots   9               9
SOT     1               4

        Home            Away
Team    Norwich         Crystal Palace
Goals   1               3
Shots   17              11
SOT     6               7

        Home            Away
Team    Arsenal         West Ham
Goals   0               2
Shots   22              8
SOT     6               4

        Home            Away
Team    Newcastle       Southampton
Goals   2               2
Shots   9               15
SOT     4               5

        Home            Away
Team    Stoke           Liverpool
Goals   0               1
Shots   7               8
SOT     1               3

        Home            Away
Team    West Brom       Man City
Goals   0               3
Shots   9               19
SOT     2               7

Interesting week. Swansea really dominated Chelsea in shots and shots on target at Stamford Bridge, probably a result of Chelsea losing their goalkeeper to a red card.

Two of the new boys, Bournemouth and Norwich, both managed to out-shoot their opponents, but they had fewer shots on target and that proved telling as both lost. Watford can be assured that their draw with Everton was well-deserved.

Graphs

For no particular reason, I'm now graphing the shots and shots on target differences against the goal difference. First, the difference in total shots vs the difference in goals.
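The numbers behind such a graph can be worked out by hand. The following is only a sketch, not the plotting code from the full post: `best_fit` is a hypothetical helper doing an ordinary least-squares fit, and the input is the ten opening-weekend matches listed on this blog. With so few matches the fitted line should not be read as meaning anything.

```python
# (home goals, home shots, away goals, away shots) for the opening
# weekend of 2015/16, copied from the match tables on this blog.
matches = [
    (0, 11, 1, 7),    # Bournemouth v Aston Villa
    (2, 11, 2, 18),   # Chelsea v Swansea
    (2, 10, 2, 11),   # Everton v Watford
    (4, 19, 2, 10),   # Leicester v Sunderland
    (1, 9, 0, 9),     # Man United v Tottenham
    (1, 17, 3, 11),   # Norwich v Crystal Palace
    (0, 22, 2, 8),    # Arsenal v West Ham
    (2, 9, 2, 15),    # Newcastle v Southampton
    (0, 7, 1, 8),     # Stoke v Liverpool
    (0, 9, 3, 19),    # West Brom v Man City
]

# Differences are taken as home minus away.
shot_diffs = [hs - as_ for _, hs, _, as_ in matches]
goal_diffs = [hg - ag for hg, _, ag, _ in matches]


def best_fit(xs, ys):
    """Ordinary least-squares line through the points; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = best_fit(shot_diffs, goal_diffs)
sign = '+' if intercept >= 0 else '-'
print('line of best fit: {:.4g} x {} {:.4g}'.format(slope, sign, abs(intercept)))
```

Plotting `shot_diffs` against `goal_diffs` as a scatter plot, with this line drawn through it, gives the kind of graph described above.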


Arsenal vs West Ham

Arsenal vs West Ham 09/08/2015

Possibly the largest upset of the week came at the Emirates where Arsenal started their title campaign with a home loss to West Ham. A quick look at the game statistics:

In [1]:
import league_analysis
epl = league_analysis.year_201516.epl_league
league_analysis.display_given_matches([epl.get_game('Arsenal', 'West Ham', '09/08/15')])
        Home            Away
Team    Arsenal         West Ham
Goals   0               2
Shots   22              8
SOT     6               4

So Arsenal were pretty dominant in shots, 22 vs 8. They did seem to have trouble converting those shots to shots on target though. We can look at all of the matches in the leagues we have data from (the top four English leagues and the Scottish Premiership), to see how often the team that has fewer shots wins the match:
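As a small-scale sketch of that kind of count, the example below classifies only the ten opening-weekend EPL matches listed on this blog, not the full multi-league data set the post goes on to use. The `classify` function and its category labels are invented for this illustration.

```python
import collections

# (home team, home goals, home shots, away team, away goals, away shots)
# for the opening weekend of 2015/16, copied from the tables on this blog.
matches = [
    ('Bournemouth', 0, 11, 'Aston Villa', 1, 7),
    ('Chelsea', 2, 11, 'Swansea', 2, 18),
    ('Everton', 2, 10, 'Watford', 2, 11),
    ('Leicester', 4, 19, 'Sunderland', 2, 10),
    ('Man United', 1, 9, 'Tottenham', 0, 9),
    ('Norwich', 1, 17, 'Crystal Palace', 3, 11),
    ('Arsenal', 0, 22, 'West Ham', 2, 8),
    ('Newcastle', 2, 9, 'Southampton', 2, 15),
    ('Stoke', 0, 7, 'Liverpool', 1, 8),
    ('West Brom', 0, 9, 'Man City', 3, 19),
]


def classify(home_goals, home_shots, away_goals, away_shots):
    """Say whether the winner of a match took more, fewer or equal shots."""
    if home_goals == away_goals:
        return 'draw'
    if home_shots == away_shots:
        return 'winner had equal shots'
    home_won = home_goals > away_goals
    home_outshot = home_shots > away_shots
    if home_won == home_outshot:
        return 'winner had more shots'
    return 'winner had fewer shots'


outcomes = collections.Counter(
    classify(hg, hs, ag, as_) for _home, hg, hs, _away, ag, as_ in matches)

for outcome, count in sorted(outcomes.items()):
    print('{:<24} {}'.format(outcome, count))
```

Arsenal v West Ham falls into the "winner had fewer shots" group here; running the same classification over every match in all five leagues would give the base rate for that kind of upset.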


2014/2015 League Commentary

I've done a data dump automatically graphing a bunch of statistics from the 2014/15 season for five leagues: the four English leagues (Premiership, Championship, League One and League Two) and the Scottish Premiership.

You can peruse all of the graphs here. But you'll likely get a bit tired of seeing scatter plots. Hence here, I explain a few of the more interesting graphs.

In [1]:
%matplotlib inline
In [2]:
import league_analysis
epl_league = league_analysis.year_201415.epl_league
ech_league = league_analysis.year_201415.ech_league
elo_league = league_analysis.year_201415.elo_league
elt_league = league_analysis.year_201415.elt_league
spl_league = league_analysis.year_201415.spl_league
all_leagues = [epl_league, ech_league, elo_league, elt_league, spl_league]

Shots For

One of the simplest metrics we measure is the number of shots a team takes. Generally the more shots a team takes the better they do. Of course some teams are quite frugal in that they only take shots when there is a high chance of scoring whilst others shoot on sight. However, here are a couple of graphs from the English and Scottish Premierships, showing a slight difference in the way that the two champions have gone about their successes.

In [3]:
league_analysis.graph_leagues('Shots For', 'Points', leagues=[epl_league, spl_league],
              annotate_teams=['Chelsea', 'Liverpool', 'Man City', 'QPR', 'Swansea', 'Arsenal',
                              'Celtic', 'St Mirren', 'Motherwell', 'St Johnstone', 'Aberdeen'])
line of best fit: 0.1662 x - 29.48
line of best fit: 0.1744 x - 13.14

Chelsea managed to win the league whilst ranking only 4th in the league for shots taken. This is likely due to their defensive approach in the second half of the season. One could conclude that Chelsea were simply winning a lot of matches from early on and hence spent a lot of time in the lead. Teams tend to take fewer shots when defending a lead. However, look at Celtic. A theme which occurred in many of the Scottish Premiership graphs is that Celtic are a huge outlier in terms of their output, but they are close to being exactly on the line of best fit. In other words, Celtic are scoring more points than their rivals, but not more than you might expect from their underlying statistics.

QPR are another huge outlier, having finished bottom of the league despite taking a large number of shots, a number more in line with the top half of the league. Of course, they may be taking a bunch of dreadful shots, but still they had some adventure about them.

A couple of other outliers, Aberdeen and St Johnstone, appear to have gotten quite a bit of mileage (that is, points) out of their shots. Aberdeen still ranked second in the league for shots taken (and finished second in the league on points), but they are above the trend line. St Johnstone, on the other hand, are well above the trend line and well down on the number of shots they took.


2014/2015 League Graphs Data Dump

Season 2014/2015 Data Dump

This post really just represents a data dump of the information we have from the season 2014/2015. From the data available from football-data I've calculated some statistics and graphed those for the top four English leagues and the Scottish Premiership. I've mostly graphed each statistic against Points, but I've also done some Home/Away graphs. If you'd like any others, let me know in the comments.

In [1]:
from IPython.display import display, HTML
%matplotlib inline
import league_analysis
In [2]:
epl_league = league_analysis.year_201415.epl_league
ech_league = league_analysis.year_201415.ech_league
elo_league = league_analysis.year_201415.elo_league
elt_league = league_analysis.year_201415.elt_league
spl_league = league_analysis.year_201415.spl_league
all_leagues = [epl_league, ech_league, elo_league, elt_league, spl_league]
In [3]:
league_analysis.graph_leagues('TSR', 'Points', leagues=[epl_league],
              annotate_teams=['Chelsea', 'Liverpool', 'Man City', 'QPR', 'Swansea', 'Arsenal'])
line of best fit: 195.7 x - 45.61
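The post does not spell out what TSR means. The sketch below assumes it is the usual "total shots ratio", that is, a team's shots divided by all shots in its matches; that definition is an assumption, not something taken from `league_analysis`. The helper names are made up, and the Arsenal figures are from the four 2015/16 matches listed elsewhere on this blog rather than from the 2014/15 season graphed here. The `predicted_points` function just evaluates the fitted line printed above.

```python
# (shots for, shots against) in Arsenal's first four matches of 2015/16,
# copied from the match tables on this blog.
arsenal_shots = [
    (22, 8),    # Arsenal v West Ham
    (20, 11),   # Crystal Palace v Arsenal
    (19, 15),   # Arsenal v Liverpool
    (22, 1),    # Newcastle v Arsenal
]


def total_shots_ratio(shot_pairs):
    """Assumed definition of TSR: shots for / (shots for + shots against)."""
    shots_for = sum(f for f, _ in shot_pairs)
    shots_against = sum(a for _, a in shot_pairs)
    return shots_for / (shots_for + shots_against)


def predicted_points(tsr, slope=195.7, intercept=-45.61):
    """Points implied by the EPL 2014/15 line of best fit printed above."""
    return slope * tsr + intercept


tsr = total_shots_ratio(arsenal_shots)
print('Arsenal TSR after four matches: {:.3f}'.format(tsr))
print('Points implied by a TSR of 0.5: {:.2f}'.format(predicted_points(0.5)))
```

Under this definition a team that takes exactly as many shots as it concedes has a TSR of 0.5, which is why that value is a natural reference point on the graph.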

Introduction

The intention of this blog is mostly to use the freely available data on football matches to make some graphs which are hopefully interesting. Initially I will concentrate on the English Premiership, Championship, Leagues One and Two, and the Scottish Premiership.