A statistical trick which reveals whether MPs are lying about expenses

Benford's law has many uses. Can it trip up MPs?

Are politicians routinely making up expenses? A simple statistical test suggests not.

Benford's law is a statistical artefact found in numerical data spanning several orders of magnitude. Ben Goldacre explains:

Imagine you have data on, say, the population of every world nation. Now, take only the "leading digit" from each number: the first number in the number, if you like. For the UK population, which was 61,838,154 in 2009, that leading digit would be "six". Andorra's was 85,168, so that's "eight". And so on.

If you take all those leading digits, from all the countries, then overall, you might naively expect to see the same number of ones, fours, nines, and so on. But in fact, for naturally occurring data, you get more ones than twos, more twos than threes, and so on, all the way down to nine. This is Benford's law: the distribution of leading digits follows a logarithmic distribution, so you get a "one" most commonly, appearing as first digit around 30% of the time, and a nine as first digit only 5% of the time.

This pattern should repeat for almost any data which matches the key condition of spanning a large range of sizes. Take the example above, world populations, which goes from 800 in the Vatican City to 1.35 billion in China. But one category of data which rarely obeys the law is that where the numbers are made-up. When people are trying to "randomly" write down numbers, they rarely do it very well, more frequently following the intuition that random data ought to have just as much chance of starting with any given digit.
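
The logarithmic distribution Goldacre describes has a simple closed form: the expected share of numbers with leading digit d is log10(1 + 1/d), which works out at roughly 30.1% for a one and 4.6% for a nine. A minimal Python sketch of that formula follows; the function names `benford_expected` and `leading_digit` are illustrative choices, not taken from any library or from the analysis in this article.

```python
import math


def benford_expected(digit: int) -> float:
    """Expected share of numbers whose first significant digit is `digit`."""
    if not 1 <= digit <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)


def leading_digit(value: float) -> int:
    """First significant digit of a positive number (61838154 -> 6, 0.10 -> 1)."""
    if value <= 0:
        raise ValueError("value must be positive")
    # Scientific notation always starts with the first significant digit,
    # e.g. 85168 is formatted as '8.516800000000000e+04'.
    return int(f"{value:.15e}"[0])


if __name__ == "__main__":
    # Print the expected share for each possible leading digit.
    for d in range(1, 10):
        print(d, f"{benford_expected(d):.1%}")
```

The nine shares sum to exactly one, because the product of (1 + 1/d) for d from 1 to 9 is 10.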

The value of MPs' expenses certainly spans several orders of magnitude. Excluding repaid claims, expenses in the latest tranche, released last week, range from 10p (reconciliation for a travelcard between Euston and Coventry) to £9,900 (for staffing costs in the Woking constituency office).

So does the data follow Benford's law? It largely does.

The largest variation is a 3 percentage point difference between the expected number of leading 2s and the actual number, with most other digits being present in slightly larger quantities than expected.
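
A comparison of this kind can be sketched in a few lines of Python. This is not the code behind the figures above, and the claim values in `sample_claims` are invented purely for illustration; the helper name `benford_gaps` is likewise hypothetical. It reports, for each leading digit, the observed share minus the Benford share in percentage points.

```python
import math
from collections import Counter


def leading_digit(value: float) -> int:
    """First significant digit of a positive number."""
    return int(f"{value:.15e}"[0])


def benford_gaps(amounts):
    """Observed minus expected share of each leading digit, in percentage points."""
    counts = Counter(leading_digit(a) for a in amounts if a > 0)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one positive amount")
    gaps = {}
    for d in range(1, 10):
        observed = counts[d] / total          # Counter returns 0 for absent digits
        expected = math.log10(1 + 1 / d)      # Benford's law
        gaps[d] = 100 * (observed - expected)
    return gaps


if __name__ == "__main__":
    # Hypothetical claim values in pounds, NOT real expenses data.
    sample_claims = [0.10, 9.99, 12.50, 15.00, 18.75, 20.00, 20.00,
                     34.90, 139.26, 139.26, 450.00, 9900.00]
    gaps = benford_gaps(sample_claims)
    worst = max(gaps, key=lambda d: abs(gaps[d]))
    print(f"largest gap: digit {worst}, {gaps[worst]:+.1f} percentage points")
```

With only a dozen values the gaps are mostly noise; the comparison becomes meaningful only on a full set of claims spanning several orders of magnitude.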

Scanning through the data, it's easy to see why this is. A large number of claims are made repeatedly. For instance, 18 different MPs claimed £139.26 for the same twin pack of HP toner cartridges, while nearly every claim for petrol costs came in between £10 and £19.99, boosting the 1s' count again. Conversely, there simply weren't that many must-have services whose price began with a 2 (although a lot of things MPs need do, apparently, cost £20 on the dot, from venue hire to cleaning bills and car parking).

None of which means there may not still be fraud in the expenses. It simply means that the actual values being claimed have been drawn from real life. MPs are not, on the whole, making up numbers on the spot as they fill in expense forms; whether what they are claiming for ought to be paid out of the public pocket is a question statistics are less likely to help with.

(As an aside, it's actually surprising that the figures match Benford's law quite so well; while MPs may not be choosing the numbers they submit, the people who set the prices clearly are. That's probably the reason for the slight uptick in the 9s, for instance; a lot of things which might cost £10 are instead charged at £9.99. It seems that there are either enough counter-examples that it gets balanced out, or lots of claims for things like mileage, which have no set price.)

Two data CDs, much like the ones which sparked the original expenses scandal. Photograph: Getty Images

Alex Hern is a technology reporter for the Guardian. He was formerly staff writer at the New Statesman. You should follow Alex on Twitter.

Photo: Getty

Who will win in Stoke-on-Trent?

Labour are the favourites, but they could fall victim to a shock in the Midlands constituency.  

The resignation of Tristram Hunt as MP for Stoke-on-Trent Central has triggered a by-election in the safe Labour seat. That had Westminster speculating about the possibility of a victory for Ukip, which only intensified once Paul Nuttall, the party’s leader, was installed as the candidate.

If Nuttall’s message that the Labour Party has lost touch with its small-town and post-industrial heartlands is going to pay dividends at the ballot box, there can hardly be a better set of circumstances than this: the sitting MP has quit to take up a well-paid job in London, and although the overwhelming majority of Labour MPs voted not to block Brexit, the well-advertised divisions in that party over the vote should help Ukip.

But Labour started with a solid lead – it is always more useful to talk about percentages, not raw vote totals – of 16 points in 2015, with the two parties of the right effectively tied in second and third place. Just 33 votes separated Ukip in second from the third-placed Conservatives.

There was a possible – but narrow – path to victory for Ukip that involved swallowing up the Conservative vote, while Labour shed votes in three directions: to the Liberal Democrats, to Ukip, and to abstention.

But as I wrote at the start of the contest, Ukip’s chances of winning the seat were, in my view, overstated. In Westminster we talk a lot about Labour’s problem appealing to “aspirational” voters, but less covered, and equally important, is Ukip’s aspiration problem.

For some people, a vote for Ukip is effectively a declaration that you live in a dump. You can have an interesting debate about whether it was particularly sympathetic of Ken Clarke to brand that party’s voters as “elderly male people who have had disappointing lives”, but that view is not just confined to pro-European Conservatives. A great number of people, in Stoke and elsewhere, who are sympathetic to Ukip’s positions on immigration, international development and the European Union also think that voting Ukip is for losers.

That always made winning over Conservative voters harder than it looked. At the risk of looking very, very foolish in six days’ time, I found it difficult to imagine why Tory voters in Hanley would take the risk of voting Ukip. As I wrote when Nuttall announced his candidacy, the Conservatives were, in my view, a bigger threat to Labour than Ukip.

Under Theresa May, almost every move the party has made has been designed around making inroads into the Ukip vote and that part of the Labour vote that is sympathetic to Ukip. If the polls are to be believed, she’s succeeding nationally, though even on current polling, the Conservatives wouldn’t have enough to take Stoke on Trent Central.

Now Theresa May has made a visit to the constituency. Well, seeing as the government has a comfortable majority in the House of Commons, it’s not as if the Prime Minister needs to find time to visit the seat, particularly when there is another, easier battle down the road in the shape of the West Midlands mayoral election.

But one thing is certain: the Conservatives wouldn’t be sending May down if they thought that they were going to do worse than they did in 2015.

Parties can be wrong, of course. The Conservatives knew that they had found a vulnerable spot in the last election as far as a Labour deal with the SNP was concerned. They thought that vulnerable spot was worth 15 to 20 seats. They gained 27 from the Liberal Democrats and a further eight from Labour. Labour knew they would underperform public expectations and thought they’d end up with around 260 to 280 seats. They ended up with 232.

Nevertheless, Theresa May wouldn’t be coming down to Stoke if CCHQ thought that four days later, her party was going to finish fourth. And if the Conservatives don’t collapse, anyone betting on Ukip is liable to lose their shirt. 

Stephen Bush is special correspondent at the New Statesman. His daily briefing, Morning Call, provides a quick and essential guide to British politics.