A statistical trick which reveals whether MPs are lying about expenses

Benford's law has many uses. Can it trip up MPs?

Are politicians routinely making up expenses? A simple statistical test suggests not.

Benford's law is a statistical artefact found in numerical data spanning several orders of magnitude. Ben Goldacre explains:

Imagine you have data on, say, the population of every world nation. Now, take only the "leading digit" from each number: the first number in the number, if you like. For the UK population, which was 61,838,154 in 2009, that leading digit would be "six". Andorra's was 85,168, so that's "eight". And so on.

If you take all those leading digits, from all the countries, then overall, you might naively expect to see the same number of ones, fours, nines, and so on. But in fact, for naturally occurring data, you get more ones than twos, more twos than threes, and so on, all the way down to nine. This is Benford's law: the distribution of leading digits follows a logarithmic distribution, so you get a "one" most commonly, appearing as first digit around 30% of the time, and a nine as first digit only 5% of the time.

This pattern should repeat for almost any data which matches the key condition of spanning a large range of sizes. Take the example above, world populations, which goes from 800 in the Vatican City to 1.35 billion in China. But one category of data which rarely obeys the law is that where the numbers are made-up. When people are trying to "randomly" write down numbers, they rarely do it very well, more frequently following the intuition that random data ought to have just as much chance of starting with any given digit.
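For readers who want to see the arithmetic behind Goldacre's description, here is a minimal sketch in Python. It assumes the standard statement of Benford's law, that leading digit d is expected with probability log10(1 + 1/d). The function names `benford_expected` and `leading_digit` are illustrative, not taken from any particular library.

```python
import math
from decimal import Decimal


def benford_expected(d: int) -> float:
    """Expected share of values whose leading digit is d (1-9) under Benford's law."""
    return math.log10(1 + 1 / d)


def leading_digit(x) -> int:
    """First non-zero digit of a positive number."""
    if x <= 0:
        raise ValueError("leading digit is only defined here for positive numbers")
    # Decimal stores significant digits without leading zeros,
    # so the first stored digit is the leading digit.
    return Decimal(str(x)).as_tuple().digits[0]


# The two populations quoted above.
print(leading_digit(61838154))  # UK, 2009
print(leading_digit(85168))     # Andorra

# Expected share of each leading digit, as a percentage.
for d in range(1, 10):
    print(d, round(100 * benford_expected(d), 1))
```

The nine expected shares sum to one, because the logarithms telescope to log10(10).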

The value of MPs' expenses certainly spans several orders of magnitude. Excluding repaid claims, expenses in the latest tranche, released last week, range from 10p (reconciliation for a travelcard between Euston and Coventry) to £9,900 (for staffing costs in the Woking constituency office).

So does the data follow Benford's law? It largely does.

 

The largest variation is a 3 percentage point difference between the expected number of leading 2s and the actual number, with most other digits being present in slightly larger quantities than expected.
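A comparison of this kind can be sketched in a few lines of Python. This is not the analysis used for the figures above: the `claims` list is a handful of hypothetical amounts chosen for illustration, not the real expenses data, and `benford_gaps` is an illustrative name.

```python
import math
from collections import Counter
from decimal import Decimal


def leading_digit(x) -> int:
    """First non-zero digit of a positive number."""
    return Decimal(str(x)).as_tuple().digits[0]


def benford_gaps(values):
    """Observed minus expected share of each leading digit, in percentage points."""
    counts = Counter(leading_digit(v) for v in values if v > 0)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one positive value")
    return {
        d: 100 * (counts[d] / total - math.log10(1 + 1 / d))
        for d in range(1, 10)
    }


# Hypothetical claim amounts in pounds, for illustration only.
claims = [0.10, 139.26, 20.00, 9900.00, 15.40, 9.99, 12.75, 20.00]

gaps = benford_gaps(claims)
worst = max(gaps, key=lambda d: abs(gaps[d]))
print(f"largest gap: digit {worst}, {gaps[worst]:+.1f} percentage points")
```

A positive gap means a digit leads more often than Benford's law predicts, and a negative gap means less often. With only a few values the gaps are large by chance alone, so a real check needs the full set of claims.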

Scanning through the data, it's easy to see why this is. A large number of claims are made repeatedly. For instance, 18 different MPs claimed £139.26 for the same twin pack of HP toner cartridges, while nearly every claim for petrol costs came in between £10 and £19.99, boosting the count of 1s again. Conversely, there simply weren't that many must-have services with prices that began with a 2 (although a lot of things MPs need do, apparently, cost £20 on the dot, from venue hire to cleaning bills and car parking).

None of this means there is no fraud in the expenses. It simply means that the actual values being claimed have been drawn from real life. MPs are not, on the whole, making up numbers on the spot as they fill in expense forms; whether what they are claiming for ought to be paid out of the public pocket is a question statistics are less likely to help with.

(As an aside, it's actually surprising that the figures match Benford's law quite so well; while MPs may not be choosing the numbers they submit, the people who set the prices clearly are. That's probably the reason for the slight uptick in the 9s, for instance; a lot of things which might cost £10 are instead charged at £9.99. It seems that there are either enough counter-examples to balance this out, or lots of claims for things like mileage, which have no set price.)

Two data CDs, much like the ones which sparked the original expenses scandal. Photograph: Getty Images

Alex Hern is a technology reporter for the Guardian. He was formerly staff writer at the New Statesman.

Photo: Getty

UnHerd's rejection of the new isn't as groundbreaking as it seems to think

Tim Montgomerie's new venture has some promise, but it's trying to solve an old problem.

Information overload is oft-cited as one of the main drawbacks of the modern age. There is simply too much to take in, especially when it comes to news. Hourly radio bulletins, rolling news channels and the constant stream of updates available from the internet – there is just more than any one person can consume. 

Luckily Tim Montgomerie, the founder of ConservativeHome and former Times comment editor, is here to help. Montgomerie is launching UnHerd, a new media venture that promises to pull back and focus on "the important things rather than the latest things". 

According to Montgomerie, the site has a "package of investment", at least some of which comes from Paul Marshall, co-founder of Marshall Wace, one of Europe's largest hedge funds. Marshall, formerly a longstanding Lib Dem, is also chair and one of the main backers of Ark Schools, an academy chain. The money behind the project is on display in UnHerd's swish (if slightly overwhelming) site, Google ads promoting the homepage, and article commissions worth up to $5,000. The selection of articles at launch includes an entertaining piece by Lionel Shriver on being a "news-aholic", though currently most of the bylines belong to Montgomerie himself.

Guidelines for contributors, also meant to reflect the site's "values", contain some sensible advice. This includes breaking down ideas into bullet points, thinking about who is likely to read and promote articles, and footnoting facts. 

The guidelines also suggest focusing on what people will "still want to read in six, 12 or 24 months" and that will "be of interest to someone in Cincinnati or Perth as well as Vancouver or St Petersburg and Cape Town and Edinburgh" – though it's not clear how one of Montgomerie's early contributions, a defence of George Osborne's editorship of the Evening Standard, fits those global criteria. I'm sure it has nothing to do with the full-page comment piece Montgomerie got in Osborne's paper to bemoan the deficiencies of modern media on the day UnHerd launched.

UnHerd's mascot – a cow – has also created some confusion, compounded by another line in the writing tips describing it as "a cow, who like our target readers, tends to avoid herds and behave in unmissable ways as a result". At least Montgomerie only picked the second-most famous poster animal for herding behaviour. It could have been a sheep. In any case, the line has since disappeared from the post – suggesting the zoological inadequacy of the metaphor may have been recognised.

There is one way in which UnHerd perfectly embodies its stated aim of avoiding the new – the idea that we need to address the frenetic nature of modern news has been around for years.

"Slow news" – a more considered approach to what's going on in the world that takes in the bigger picture – has been talked about since at least the beginning of this decade.

In fact, it's been around so long that it has become positively mainstream. That pusher of rolling coverage, the BBC, has been talking about using slow news to counteract fake news, and Montgomerie's old employer, the Times, decided last year to move to publishing digital editions at set points during the day, rather than constantly updating as stories break. Even the Guardian – which has most enthusiastically embraced the crack-cocaine of rolling web coverage, the live blog – publishes regular long reads taking a deep dive into a weighty subject.

UnHerd may well find an audience particularly attuned to its approach and values. It intends to introduce paid services – an especially good idea given the perverse incentives to chase traffic that come with relying on digital advertising. The ethos it is pitching may well help persuade people to pay, and I don't doubt Montgomerie will be able to find good writers who will deal with big ideas in interesting ways. 

But the idea that UnHerd is offering a groundbreaking solution to information overload is faintly ludicrous. There are plenty of ways for people to disengage from the news cycle – and plenty of sources of information and good writing that allow people to do so while staying informed. It's just that, given so many opportunities to stay up to date with what has just happened, few people decide they would rather not know.