For the Love of Soccer and Data Analytics

During my school days, I remember one of my teachers saying, “Science is everywhere.” Today we probably relate more to “Analytics is everywhere,” and sports is no exception. Over the past decade, there has been a major drive for sports to become data-driven. On one side are sports like baseball and tennis, which are already heavily shaped by data analytics; on the other are sports like cycling, which have yet to adopt a data-driven approach (read my previous blog on Opportunities of Data Analytics in Cycling). The potential for sports analytics is undoubtedly tremendous.

In this blog, my goal is to explore the data analytics angle in one of the world’s most followed sports, SOCCER!

Moneyball was not the start

If you think Michael Lewis’s Moneyball pioneered analytics in sports, you’d be wrong. Soccer, the most popular sport in the world with more than 200 soccer-watching countries, has embraced analytics over the past few years. The journey of data analytics in soccer, though, has not been smooth. The old-school method of analysis in soccer is so-called notational analysis, introduced by a British accountant, Charles Reep, who began scribbling down observations using a system of symbols. Reep’s conclusion was that teams would be more efficient if they spent less time trying to string together passes and more time lobbing the ball into their opponent’s area. This strategy became known as the long-ball game. Most sports analysts did not agree with the long-ball game, as Reep’s conclusion was based on minimal data. According to one mathematician who did extensive research on this, “Reep had very strong preconceived notions, and when he found what he was looking for (a chance to play the game with minimum input for maximum output) he didn’t investigate other hypotheses like other analysts typically would.”

Analytics in football

Initially, data collection was a big challenge. People tried things such as filming players striking the ball, to study technique from a biomechanical perspective. Those initiatives, however, never had much impact.

Prozone, a pioneer of performance analysis in sports, has played a vital role in the rise of analytics in soccer. Prozone developed proprietary player-tracking software which, fed by eight cameras around the pitch, would generate a two-dimensional bird’s-eye animation of a soccer match. The system could track each player’s movement every 0.1 seconds, registering an average of 3,000 touches of the ball per game, and provide answers to a range of statistical questions. Some soccer teams welcomed data analytics wholeheartedly, as they understood that in the long run analytics is the way to get the best out of the team. Others still argued, “The real game is played on the field, not in the computers.”
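
To make the idea of tracking data more concrete, here is a minimal sketch in Python of what one might compute from positions sampled every 0.1 seconds. It is not Prozone's software or data format: the function names, the (x, y) coordinates in metres and the sample track are all assumptions made up for illustration.

```python
import math

SAMPLE_INTERVAL_S = 0.1  # sampling interval mentioned in the text


def distance_covered(positions):
    """Total path length in metres for a list of (x, y) position samples."""
    return sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))


def average_speed(positions, interval_s=SAMPLE_INTERVAL_S):
    """Average speed in m/s over the sampled window (0.0 if under 2 samples)."""
    if len(positions) < 2:
        return 0.0
    elapsed = (len(positions) - 1) * interval_s
    return distance_covered(positions) / elapsed


# Hypothetical track: four consecutive samples for one player, in metres.
track = [(0.0, 0.0), (0.3, 0.4), (0.6, 0.8), (0.6, 1.3)]
print(round(distance_covered(track), 2))  # total metres run in this window
print(round(average_speed(track), 2))     # metres per second
```

Summing the straight-line distance between consecutive samples is the simplest possible choice; a real system would presumably smooth the positions first to limit measurement noise.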

Clive Woodward, the coach of England’s World Cup-winning rugby team in 2003, says of adapting Prozone to rugby, “When I first saw it I was fascinated because I’d never seen a game where you’re looking down and just see dots and data and movement. It removed a lot of the preconceived notions we had about how other teams played. It made a big difference when we started to see them as data, as opposed to teams we had never beaten before.”

Simon Wilson, who is the manager of Strategic Performance at Manchester City FC, and his team of data analysts explored some uncharted territory that answered a lot of questions. Here are some of the things they looked at:

  • Number of line breaks, i.e. forward passes that go through the opposition’s line of midfielders or defenders.
  • What happens within the 20 seconds after the team wins or loses the ball.
  • Ball possession in the final third of the field, which is strongly correlated with winning matches.
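
As a rough illustration of the last metric, the sketch below computes the share of a team's possession that falls in its attacking third. This is not Manchester City's method: the data layout, the team labels, the 105 m pitch length and the sample values are assumptions chosen only to keep the example small.

```python
PITCH_LENGTH_M = 105.0  # assumed pitch length; real pitches vary


def final_third_share(samples, team):
    """Fraction of a team's possession samples located in its attacking third.

    samples: iterable of (team_name, x) pairs, where x is the ball's distance
    in metres from that team's own goal line while the team has possession.
    """
    xs = [x for name, x in samples if name == team]
    if not xs:
        return 0.0
    threshold = PITCH_LENGTH_M * 2 / 3  # start of the attacking third
    return sum(1 for x in xs if x >= threshold) / len(xs)


# Hypothetical possession samples (team label, distance from own goal line).
samples = [
    ("Home", 20.0),
    ("Home", 75.0),
    ("Home", 90.0),
    ("Away", 80.0),
    ("Home", 50.0),
]
print(final_third_share(samples, "Home"))  # 2 of the 4 Home samples -> 0.5
```

Computing this share for every match and comparing it with the results would be one simple way to check the correlation the analysts describe.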

How the focus shifted to business from the game!

The revolution started with analyzing tracked match data and measuring players’ performance for the betterment of the team. Several opportunities came up when these measures were converted into insights. The next step was to use these insights to recruit players. The idea of measuring a player’s utility to the team was popularized by Michael Lewis in his book Moneyball. Initially, features like performance, ability and special skills were the key factors for coaches and team management in deciding whether to retain or release a player. Later, features like a player’s popularity started to play a major role in maintaining a team’s fan base.

This was just the beginning of the role of data analytics in the betterment of the sport. Professional soccer clubs have now transformed into profitable businesses, which means profitability depends on both players’ performance and their ability to grow the fan base. So, to improve their businesses, soccer clubs had to do something about their fan base. Analytics once again proved to be the trump card for ensuring packed stadiums.


Some advanced sports analytics techniques that have been introduced lately are as follows:

  • Predicting the match result and the number of yellow/red cards in a soccer match using regression techniques. (Read more)
  • Classifying whether a match will be a home win, an away win or a draw.

Both of the above are implemented with machine learning techniques, based on huge digital databases captured through advanced tracking technology.
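
To give a flavour of the classification problem, here is a minimal sketch that turns expected goals for each side into home win, draw and away win probabilities. It uses a simple independent Poisson model for goals, which is a common textbook approach and not necessarily what the work referenced above uses. The goal rates are made-up numbers; in practice they would come from a regression fitted on historical match data.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k goals when goals follow Poisson(rate)."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def outcome_probabilities(home_rate, away_rate, max_goals=10):
    """Home win / draw / away win probabilities, assuming independent goals."""
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    # Renormalise to correct for the scorelines cut off above max_goals.
    total = home_win + draw + away_win
    return {
        "home_win": home_win / total,
        "draw": draw / total,
        "away_win": away_win / total,
    }


# Hypothetical expected goals for the home and away side.
probs = outcome_probabilities(home_rate=1.6, away_rate=1.1)
predicted = max(probs, key=probs.get)  # the most likely of the three classes
print(probs, predicted)
```

Picking the most probable of the three outcomes turns the model into a classifier, while the probabilities themselves show how confident the prediction is.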

There are applications like Statzpack and SOCCERLAB that help users analyse a recorded match digitally. A US-based company, Wealth Engine, is helping American soccer teams better understand what they can do to improve their ticket and sponsorship sales. Read this blog on the prediction of the UEFA Champions League.

With the 2014 FIFA World Cup underway, analysts are ready to show their skills. Bayesialab recently came up with their idea of predicting the potential winner of the World Cup with a group-by-group analysis.

Is data killing the sport or powering a new game?

What is the next gift of analytics to the sport? Is it some advanced modelling method or some high-end data tracking/visualization tool? Or is the game being influenced by data so much that the quality of the sport is slowly suffering? Let us see what others are saying. Everything soccer has seen over the years points in one direction: there is an acute awareness at the coalface that there remains a need to move away from theory and into practical applications. Ultimately, analysis has to lead to decisions. After all, without that you’re just flailing about attending analytics conferences.

“Data is worthless. Only decisions have value.”

So what is the problem? Since the popularity of Moneyball, it has been common for the richest sports clubs to throw money at analysis, often without really knowing what to do with it. According to Mr. Moneyball (Billy Beane) himself, even if all 30 Major League Baseball teams used analysts after Moneyball, only five or six were actually doing analytics. He also adds, “What is the point spending time and money on analysis if you’re not going to use it? You’d be better off going out and buying a new player.”

The big-spending clubs have been among the biggest investors in analysts in recent years, acquiring some of the game’s prominent thinkers in the field. But translating that into practical implementation remains a challenge. Since the coach is a pivotal part of a team and its strategy, it is essential to have close alignment between the analysts and the coach, so that the analysts provide what matters to the team rather than collecting data ad infinitum. And after years of excitement about the big data revolution, there is a growing sense that there could now be too much noise out there.

At such a critical time, it is up to analysts to prove their worth to those who feel that sports like soccer are going to be lost in data. As an analyst, I feel that we will definitely be able to find something in our arsenal to regain their trust. What do you think?

Please share your thoughts.

This blog is authored by Rahul Dutta, Analytics Consultant at BRIDGEi2i

About BRIDGEi2i:

BRIDGEi2i provides Business Analytics Solutions to enterprises globally, enabling them to achieve accelerated business impact harnessing the power of data. Our analytics services and technology solutions enable business managers to consume more meaningful information from big data, generate actionable insights from complex business problems and make data driven decisions across pan-enterprise processes to create sustainable business impact. To know more visit



The views and opinions expressed in this article are those of the author and do not necessarily reflect the official position or viewpoint of BRIDGEi2i.