How the spreadsheet-wielding geeks are taking over football

The statistical revolution comes to the pitch.

The Numbers Game: Why Everything You Know About Football Is Wrong
Chris Anderson and David Sally
Viking, £12.99, 384pp

Almost exactly a decade ago, the American writer Michael Lewis published a book called Moneyball. It told the story of Billy Beane, the general manager of an unfashionable baseball team, the Oakland A’s, who was using new statistics to evaluate baseball players and strategies. From this unpromising material, Lewis crafted a bestseller that has sold more than a million copies.

Books hardly ever change anything but this one did. Moneyball changed baseball and almost all ball games from basketball to cricket but it also affected worlds beyond sport. Ken Mehlman, Republican campaign manager in the US presidential election of 2004, instructed his staff to read it, because he realised that it wasn’t just a sports book. It was also a perfect case study of how crunching numbers can give you an edge. That made it a book for our era of “Big Data”, in which the amount of data on earth more than doubles every two years and the only mystery is how to use all these confusing numbers.

Football, or “soccer”, was always the most hidebound sport and it held out longest against the numbers revolution. But now, as Chris Anderson and David Sally write in their engaging and stimulating book The Numbers Game, “The datafication of life has started to infiltrate football.” That is quite a change. In football, people always did what they did because they had always done it that way. Clubs were historically run by autocratic managers who had left school at 16 to become players and didn’t hold with book-learning. However, unseen by most fans, something profound is happening inside the sport.

Like Moneyball, The Numbers Game reaches us from the US, which is rapidly becoming a soccer society. Anderson, originally from Germany, played semi-professional soccer before becoming a professor of government at Cornell. His neighbour Sally, once a baseball pitcher at Harvard, is a behavioural economist at the Tuck School of Business at Dartmouth. Watching soccer on television together, they grew interested in the game’s relative lack of numbers and analytics.

That absence had struck the first pioneer of numbers in football, Wing Commander Charles Reep. Not a fighter pilot but an accountant in the RAF’s Bomber Command, Reep made what is probably the first known attempt to log “match data”.

In 1950, at a Swindon Town game, he logged 147 attacks by Swindon in the second half. Extrapolating from this small sample, Reep calculated that 99.29 per cent of attacks in football failed. He continued to offer his services as an analyst to clubs into his late nineties but, as Anderson and Sally show, he was on a wild goose chase. Reep assumed that there was only one correct way to play football and, naturally, he thought he had found it. Boot the ball long, said the wing commander – put it near the opposition’s goal and you will win.

In reality, Anderson and Sally write, “There is no winning formula. There is no right answer to football.” Different strokes suit different teams. To quote the great Liverpool manager Bob Paisley: “It’s not about the long ball or the short ball; it’s about the right ball.” (The best managers of the past, including Paisley, intuited many of the findings now emerging from the numbers.)

In the mid-1990s, the spread of computers reignited the data revolution. Companies such as Opta and Prozone began collecting stats on football matches. Suddenly, clubs knew how many passes each player had completed, how many tackles he had made and how many kilometres he had run.

As soon as data becomes available in any industry, some people will use it – but they often use it wrongly. As the American baseball analyst-turned-master psephologist Nate Silver says of the new Big Data: “Most of the data is just noise, as most of the universe is filled with empty space.” Alex Ferguson, Manchester United’s manager, discovered this after he sold his defender Jaap Stam in 2001 because Stam’s number of tackles was decreasing. Ferguson thought Stam was in decline. Stam went on to play several more years for big clubs.

It turned out that tackles were a poor measure of a defender’s worth: they were just noise. We now know that great defenders such as the Italian Paolo Maldini barely tackle. Maldini stopped attacks from happening by positioning himself to close holes. Yet, as Anderson and Sally point out, that kind of negative event – the attack that doesn’t happen, the dog that doesn’t bark – is often hard to spot in match data. Football statistics tend to focus on things that do happen and, above all, on goals that do get scored.

As the data revolution progresses, more and more clubs are finding clever ways to use numbers. Each season, the number of sceptics declines – in part because many people in the game have now read Moneyball or at least seen the 2011 Hollywood movie starring Brad Pitt as Beane.

Not all the traditionalists are going quietly into the night. Some are now scheming to defend their turf against spreadsheet-wielding geeks. But others are learning not to believe their own eyes. As Beane told me: “The idea that I trust my eyes more than the stats – I don’t buy that, because I’ve seen magicians pull rabbits out of hats and I just know that rabbit’s not in there.”

The data revolution keeps stumbling on new truths. At Manchester City, for instance, the analysts finally persuaded the club’s then manager, Roberto Mancini, that the most dangerous corner kick is the inswinger, the ball that swings towards goal. Mancini had long argued (strictly from intuition) that outswingers were best. Eventually he capitulated and, in the 2011-2012 season, when City won the English title, they scored 15 goals from corners, the most in the Premier League. The decisive goal, Vincent Kompany’s header against Manchester United, came from an inswinging corner.

The most powerful figure in English football remains the manager and the statistical revolution has progressed fastest at clubs where the manager believes in data. Probably the leaders in this field in England today are Arsenal’s Arsène Wenger (an economics graduate and gifted mathematician), West Ham’s Sam Allardyce (not a gifted mathematician) and Manchester United’s incoming manager, David Moyes.

In March, I visited Moyes’s then club, Everton, and one of his data analysts told me, “In terms of managers, he is probably as into it [data] as any.” Moyes would often march into the analysts’ offices firing out questions: how efficient were Everton’s next opponents at scoring from crosses? What types of passes did their midfielders make? In which areas of the field did Tottenham’s superstar Gareth Bale usually receive the ball?

For managers such as Moyes, data isn’t everything. It is one tool among many. It gives you an edge and, since you could employ perhaps 30 statisticians for the £1.5m that the average player in the Premier League earns, it’s an edge you can afford. Still, as Anderson and Sally caution: “The data cannot do the manager’s job.” Interpreting data is an art more than a science.

In 2004, the data told Wenger that an unknown French teenager, Mathieu Flamini, was running an astonishing 14 kilometres a game. By itself, that number wasn’t enough. Did Flamini run in the right direction? Wenger went to watch him, decided he did and signed him for peanuts.

Even more cheaply than hiring another statistician, a cunning manager could pop into a bookshop and splash £12.99 on The Numbers Game. The book contains several fascinating examples of statistics that could help club chairmen, managers or fans. Perhaps the book’s most remarkable finding is that football is a “weakest-link game” – although it’s nice to have great players in your team, it’s more important not to have rubbish players. Games are typically decided not by the Wayne Rooneys but by oafs such as Zurab Khizanishvili, a defender whose blunders in a play-off in 2011 arguably cost Reading promotion to the Premier League.

Anderson and Sally crunch some of the new data on individual players to estimate that upgrading your weakest link typically improves your team more than buying a new superstar would. Despite this, managers, being human and wanting to please fans and journalists, usually prefer the superstar.

The book overturns several other tenets of football thinking. For instance, the old saying that you’re most likely to concede a goal straight after scoring turns out to be nonsense. According to the stats, that’s when you’re least likely to concede.

The numbers also show the outsize role of chance in football. In one study of 43,000 matches, the underdog won 45.2 per cent. Favourites win much less often in football than in other ball games.

That is chiefly because goals in football are so scarce: you can attack all match but if the opposition nicks one lucky goal, you can lose. Then the media and fans provide a post hoc rationalisation for your defeat, even though it was dumb luck.
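The link between scarce goals and upsets can be illustrated with a rough sketch. It assumes, purely for illustration, that each side's score is an independent Poisson count; the scoring averages (1.5 against 1.0, and a hypothetical high-scoring game with ten times as many scores at the same ratio) are invented and come neither from the book nor from the study quoted above.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k scores when the average is `mean`."""
    # Work in log space so large k does not overflow.
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


def outcome_probabilities(fav_mean, dog_mean, max_goals=80):
    """Return (favourite wins, draw, underdog wins).

    Assumption (not from the article): the two scores are independent
    Poisson counts. Scores above `max_goals` are ignored as negligible.
    """
    fav = [poisson_pmf(k, fav_mean) for k in range(max_goals + 1)]
    dog = [poisson_pmf(k, dog_mean) for k in range(max_goals + 1)]
    fav_win = draw = dog_win = 0.0
    for f, p_f in enumerate(fav):
        for d, p_d in enumerate(dog):
            p = p_f * p_d
            if f > d:
                fav_win += p
            elif f == d:
                draw += p
            else:
                dog_win += p
    return fav_win, draw, dog_win


# Hypothetical averages: the favourite is 1.5 times as strong in both games.
low_scoring = outcome_probabilities(1.5, 1.0)     # football-like
high_scoring = outcome_probabilities(15.0, 10.0)  # ten times as many scores

for name, (fav, draw, dog) in [("low-scoring", low_scoring),
                               ("high-scoring", high_scoring)]:
    print(f"{name}: favourite {fav:.1%}, draw {draw:.1%}, underdog {dog:.1%}")
```

Under these assumptions the favourite wins less often in the low-scoring game, even though its relative strength is identical in both. Real matches are messier than a Poisson model, so the figures show only the direction of the effect.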

The Numbers Game also shows that sacking the manager – football’s equivalent of the human sacrifice – is usually pointless. Typically, the manager is sacked when the team hits its lowest point. Yet any statistician can predict what will happen after you hit your lowest point: performance will improve, because of the statistical phenomenon known as regression to the mean. Anderson and Sally explain: “An extraordinary period of poor performance is just that: extraordinary. It will auto-correct as players return from injury, shots stop hitting the post or fortune shines her light on you once more.”

Sunderland briefly improved this spring under their new manager, Paolo Di Canio, not because fascism works but because of regression to the mean.
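Regression to the mean can be seen in a small simulation. The sketch below is an illustration under invented assumptions, not the authors' analysis: every team has the same fixed chance of winning (35 per cent), drawing (30 per cent) and losing (35 per cent) each game, nobody is sacked, and a "slump" is defined arbitrarily as three points or fewer from six games.

```python
import random
from statistics import mean


def simulate_slumps(n_teams=20000, window=6, slump_max_points=3, seed=1):
    """Compare points in a slump window with points in the next window.

    Assumptions (hypothetical, for illustration only): every team has the
    same unchanging odds in every game and nothing about the team changes
    between the two windows.
    """
    rng = random.Random(seed)

    def game_points():
        r = rng.random()
        if r < 0.35:
            return 3  # win
        if r < 0.65:
            return 1  # draw
        return 0      # loss

    before, after = [], []
    for _ in range(n_teams):
        first = sum(game_points() for _ in range(window))
        second = sum(game_points() for _ in range(window))
        if first <= slump_max_points:  # this team has hit a low point
            before.append(first)
            after.append(second)
    return mean(before), mean(after), len(before)


slump_avg, next_avg, n_slumps = simulate_slumps()
print(f"{n_slumps} slumping teams: {slump_avg:.2f} points in the slump, "
      f"{next_avg:.2f} in the next six games")
```

Because the odds never change, the teams picked out for their slump drift back towards the long-run average of 1.35 points a game (0.35 × 3 + 0.30 × 1) in the following window. In this toy model any manager appointed at the low point would appear to have engineered a recovery.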

There is an excellent final chapter predicting how football’s data revolution will progress. The authors forecast, for instance, that the historic undervaluation of goalkeepers and defenders – who command lower average salaries and transfer fees than strikers – is likely to end.

That’s because stats show that keeping a “clean sheet” helps a team more than scoring lots of goals does. And, as data evolves, we will find ways to value the almost invisible contributions that defenders such as Maldini make. Anderson and Sally believe that football data will increasingly focus on the geometry of the game off the ball – which is crucial, as the average player has the ball for only 53 seconds a game.

They also predict that the biggest innovations will come from poorer clubs, football’s equivalents of the Oakland A’s: “The strong do not need to innovate; it is the weak who must adapt or die.” Rich clubs such as Chelsea can succeed simply by buying great players. As the authors admit: “Analytics will help you win, but so will money.”

Moneyball was the Communist Manifesto of the data revolution, in sport and beyond. The Numbers Game isn’t as groundbreaking as its authors proclaim. Its subtitle – “Why Everything You Know About Football Is Wrong” – is an unnecessary overstatement. Nonetheless, the book is a valuable addition to the scarce literature at a time when pioneers inside football are only just starting to work out which stats matter, while people outside the game still scarcely know that anything is changing.

The Numbers Game is energetically and cleanly written and is free of academic jargon, though it is occasionally guilty of faux-poetic overwriting: “Each side possesses a light side, seeking the goal, and a dark side, hoping to divert it. And at the centre of that collision between the positive and the negative, the yin and the yang, is the ball” – and so on.

The authors have done their homework and I have only one sad correction to make: Nick Broad isn’t a performance scientist with Paris Saint-Germain any more. He was killed in a car crash in January, aged 38.

Simon Kuper is co-author of “Soccernomics” (HarperSport, £8.99) and a columnist with the Financial Times

An aerial view of the Hackney Marshes football pitches in London. Photograph: Getty Images

This article first appeared in the 10 June 2013 issue of the New Statesman, G0

Who will win in Copeland? The Labour heartland hangs in the balance

The knife-edge by-election could end 82 years of Labour rule on the West Cumbrian coast.

Fine, relentless drizzle shrouds Whitehaven, a harbour town exposed on the outer edge of Copeland, West Cumbria. It is the most populous part of the coastal north-western constituency, which takes in everything from this old fishing port to Sellafield nuclear power station to England’s tallest mountain, Scafell Pike. Sprawling and remote, it protrudes from the heart of the Lake District out into the Irish Sea.

Billy, a 72-year-old Whitehaven resident, is out for a morning walk along the marina with two friends, his woolly-hatted head held high against the whipping rain. He worked down the pit at the Haig Colliery for 27 years until it closed, and now works at Sellafield on contract, where he’s been since the age of 42.

“Whatever happens, a change has got to happen,” he says, hands stuffed into the pockets of his thick fleece. “If I do vote, the Bootle lass talks well for the Tories. They’re the favourites. If me mam heard me saying this now, she’d have battered us!” he laughs. “We were a big Labour family. But their vote has gone. Jeremy Corbyn – what is he?”

The Conservatives have their sights on traditional Labour voters like Billy, who have been returning Labour MPs for 82 years, to make the first government gain in a by-election since 1982.

Copeland has become increasingly marginal, held with a majority of just 2,564 votes by former frontbencher Jamie Reed, who resigned from Parliament last December to take a job at the nuclear plant. He triggered a by-election now regarded by all sides as too close to call. “I wouldn’t put a penny on it,” is how one local activist sums up the mood.

There are 10,000 people employed at the Sellafield site, and 21,000 jobs are promised for nearby Moorside – a project to build Europe’s largest nuclear power station now thrown into doubt, with Japanese company Toshiba likely to pull out.

Tories believe Jeremy Corbyn’s stance on nuclear power (he limply conceded it could be part of the “energy mix” recently, but his long prevarication betrayed his scepticism) and opposition to Trident, which is hosted in the neighbouring constituency of Barrow-in-Furness, could put off local employees who usually stick to Labour.

But it’s not that simple. The constituency may rely on nuclear for jobs, but I found a notable lack of affection for the industry. While most see the employment benefits, there is less enthusiasm for Sellafield being part of their home’s identity – particularly in Whitehaven, which houses the majority of employees in the constituency. Also, unions representing Sellafield workers have been in a dispute for months with ministers over pension cut plans.

“I worked at Sellafield for 30 years, and I’m against it,” growls Fred, Billy’s friend, a retiree of the same age who also used to work at the colliery. “Can you see nuclear power as safer than coal?” he asks, wild wiry eyebrows raised. “I’m a pit man; there was just nowhere else to work [when the colliery closed]. The pension scheme used to be second-to-none, now they’re trying to cut it, changing the terms.”

Derek Bone, a 51-year-old who has been a storeman at the plant for 15 years, is equally unconvinced. I meet him walking his dog along the seafront. “This county, Cumbria, Copeland, has always been a nuclear area – whether we like it or don’t,” he says, over the impatient barks of his Yorkshire terrier Milo. “But people say it’s only to do with Copeland. It ain’t. It employs a lot of people in the UK, outside the county – then they’re spending the money back where they’re from, not here.”

Such views might be just enough of a buffer against the damage caused by Corbyn’s nuclear reluctance. But the problem for Labour is that neither Fred nor Derek is particularly bothered about the result. While awareness of the by-election is high, many tell me that they won’t be voting this time. “Jeremy Corbyn says he’s against it [nuclear], now he’s not, and he could change his mind – I don’t believe any of them,” says Malcolm Campbell, a 55-year-old lorry driver who is part of the nuclear supply chain.

Also worrying for Labour is the deprivation in Copeland. Everyone I speak to complains about poor infrastructure, shoddy roads, derelict buildings, and lack of investment. This could punish the party that has been in power locally for so long.

The Tory candidate Trudy Harrison, who grew up in the coastal village of Seascale and now lives in Bootle, at the southern end of the constituency, claims local Labour rule has been ineffective. “We’re isolated, we’re remote, we’ve been forgotten and ignored by Labour for far too long,” she says.

I meet her in the town of Millom, at the southern tip of the constituency – the opposite end to Whitehaven. It centres on a small market square dominated by a smart 19th-century town hall with a mint-green domed clock tower. This is good Tory door-knocking territory; Millom has a Conservative-led town council.

While Harrison’s Labour opponents are relying on their legacy vote to turn out, Harrison is hoping that the same people think it’s time for a change, and can be combined with the existing Tory vote in places like Millom. “After 82 years of Labour rule, this is a huge ask,” she admits.

Another challenge for Harrison is the threat to services at Whitehaven’s West Cumberland Hospital. It has been proposed for a downgrade, which would mean those seeking urgent care – including children, stroke sufferers, and those in need of major trauma treatment and maternity care beyond midwifery – would have to travel the 40-mile journey to Carlisle on the notoriously bad A595 road.

Labour is blaming this on Conservative cuts to health spending, and indeed, Theresa May dodged calls to rescue the hospital in her campaign visit last week. “The Lady’s Not For Talking,” was one local paper front page. It also helps that Labour’s candidate, Gillian Troughton, is a St John Ambulance driver, who has driven the dangerous journey on a blue light.

“Seeing the health service having services taken away in the name of centralisation and saving money is just heart-breaking,” she tells me. “People are genuinely frightened . . . If we have a Tory MP, that essentially gives them the green light to say ‘this is OK’.”

But Harrison believes she would be best-placed to reverse the hospital downgrade. “[I] will have the ear of government,” she insists. “I stand the very best chance of making sure we save those essential services.”

Voters are concerned about the hospital, but divided on the idea that a Tory MP would have more power to save it.

“What the Conservatives are doing with the hospitals is disgusting,” a 44-year-old carer from Copeland’s second most-populated town of Egremont tells me. Her partner, Shaun Grant, who works as a labourer, agrees. “You have to travel to Carlisle – it could take one hour 40 minutes; the road is unpredictable.” They will both vote Labour.

Ken, a Conservative voter, counters: “People will lose their lives over it – we need someone in the circle, who can influence the government, to change it. I think the government would reward us for voting Tory.”

Fog engulfs the jagged coastline and rolling hills of Copeland as the sun begins to set on Sunday evening. But for most voters and campaigners here, the dense grey horizon is far clearer than what the result will be after going to the polls on Thursday.

Anoosh Chakelian is senior writer at the New Statesman.