Why public policy shouldn’t be guided by master numbers

GDP, the Wellby and other indicators may tell us something about societal well-being, but they aren't substitutes for political decision-making.

By Anna Alexandrova

Illustration by Alice Mollon/Ikon Images

There is a persistent dream among some social scientists: the dream of a master number. This would be an indicator of progress that integrates all the values we care about – individual, social, economic, cultural – into one summary quantity. If such an indicator existed, an increase in its value would be a sure sign that life is getting better and a decrease a reliable warning that it is getting worse. Decisions about what to do, for whom, and when, would then be regulated by the movements of the master indicator, rather than by politics and emotion. A technocratic dream.

Each time a new such number is proposed, it is hailed as an advance on previous versions. And it usually is. GDP is much better than earlier attempts to estimate national income. But it fails to capture non-monetary goods. As Bobby Kennedy noted in a speech in 1968, it counts many things that it shouldn’t (more prisons, for example) and disregards many things that it should (such as good parenting). He bemoaned that it counts everything “except that which makes life worthwhile”.

That was an exaggeration. In fact, GDP counts a lot of things of genuine value and techniques for its estimation are constantly improving. But the complaints have served as effective campaign slogans for the idealistic initiatives to devise a replacement. Among these replacements are the Millennium Development Goals (MDGs), superseded more recently by the Sustainable Development Goals (SDGs), and the many different attempts to measure happiness, well-being, and quality of life.

Each of these numbers is better than GDP or another competitor in some respect. MDGs capture poverty better than GDP, while SDGs capture environmental costs, and so on. But this doesn’t mean we are approaching some omega point of the maximally inclusive indicator, or even moving in a single direction of improvement. There is no progress in all respects; there are only piecemeal improvements of specific estimation methodologies, improvements that often come at the expense of others. For example, the more inclusive an indicator is, the harder it is to collect and to analyse sufficient data to use it. The simpler it is, the more it is open to criticism.

What if there is something wrong with the dream of a master number itself? With the very idea that rational decision-making and effective policy demand an integrated indicator? My worry is not whether numbers can ever fully capture our values, but whether one number can capture all the numbers that matter.

Consider the most recent attempt at integration now gaining ground in the UK, a Wellby or “well-being-adjusted life year” – that is, the value of a year of life for a person depending on its subjective quality. The indicator is now endorsed by the Treasury, and charities and NGOs all around the UK are advised to use Wellbys to produce evidence about the value of their activities, be it protection of rivers or youth sports.

Researchers gauge “subjective quality” with surveys that invite respondents to evaluate some aspect of their lives. There are many such surveys but Wellby opts for “life satisfaction”: overall, how satisfied are you with your life? Answer from 0 (not at all) to 10 (completely). The assumption is that the answer to this question reflects all the values that make up a person’s quality of life and no irrelevant factors. If many people answer this question, repeatedly, and in a variety of settings, then it is possible, at least in principle, to determine the effect of any given policy on the overall quantity of life satisfaction.

The brainchild of LSE economists, the Wellby method takes this survey data and converts it into estimates of which collection of policies produces the most life satisfaction over time at the lowest cost. Instead of asking how much GDP a policy produces or how far it furthers SDGs, we ask how many Wellbys – “well-being-adjusted life years” – we can expect from it.
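
To make the arithmetic concrete, here is a minimal sketch in Python of the kind of comparison described above. It assumes that one Wellby corresponds to a one-point change on the 0-to-10 life satisfaction scale, for one person, for one year. The policies, costs and effect sizes are invented purely for illustration, and the sketch leaves out the hard part, which is estimating those effects in the first place.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    name: str
    cost: float                 # total cost in pounds (hypothetical figure)
    people: int                 # number of people affected (hypothetical)
    satisfaction_change: float  # average change on the 0-10 scale (hypothetical)
    years: float                # how long the change is assumed to last


def wellbys(policy: Policy) -> float:
    # Assumed unit: one point of life satisfaction, for one person, for one year.
    return policy.people * policy.satisfaction_change * policy.years


def cost_per_wellby(policy: Policy) -> float:
    return policy.cost / wellbys(policy)


# Invented examples; none of these numbers come from real evaluations.
policies = [
    Policy("river protection", cost=200_000, people=1_000,
           satisfaction_change=0.1, years=2),
    Policy("youth sports", cost=150_000, people=500,
           satisfaction_change=0.5, years=1),
]

# Rank policies from cheapest to most expensive per Wellby.
ranked = sorted(policies, key=cost_per_wellby)
for p in ranked:
    print(f"{p.name}: {wellbys(p):.0f} Wellbys, "
          f"GBP {cost_per_wellby(p):,.0f} per Wellby")
```

Under these made-up inputs the ranking falls out mechanically: whichever policy has the lowest cost per Wellby comes first. Everything contestable sits in the inputs, not in the arithmetic.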

According to its advocates, the Wellby is a master indicator because life satisfaction ratings are comprehensive and democratic. They are comprehensive because when a person is asked how satisfied they are with their life as a whole, they take into account every value that matters to them – be it income, health, leisure, or anything else.

They are democratic in that each person surveyed is weighted equally, no one is told by outsiders how to rate their life, and the whole cost-effectiveness analysis can be reproduced, at least in theory, by anyone. The Wellby is not overly complicated and labour-intensive, unlike the SDGs, which have 231 unique indicators, nor is it narrowly focused on monetisable goods, unlike GDP. What’s not to like?

A lot, actually. The idea that people perform the complex balancing of each of their values when asked to rate their life satisfaction from 0 to 10 is a very controversial assumption about our psychology. And without a plausible psychology, we cannot obtain a comprehensive evaluation of life through these ratings.

There are also doubts that ratings of life satisfaction are reliably comparable and scalable across people and time. These doubts dampen the claim to rigorous measurement.

The Wellby methodology also requires evidence of which aspects of life – say, housing or education – have the largest effect on life satisfaction. I am sceptical about whether this demand to identify the precise causal contribution of different factors to overall life satisfaction can be met in practice. In fact, these relationships are changeable, complex and context-dependent.

Finally, many of us resist the utilitarian principle that all decisions in public policy need to maximise how people feel, rather than, for example, their constitutional rights and opportunities.

Wellby advocates have prepared stock answers to all these objections. They retort that problems with life satisfaction ratings dissipate with large enough samples. That fallible evidence of causality is better than none. That public policy is well served by clarity about which policies maximise Wellbys, even if Wellbys might not be the only determinant of policy.

These answers are not watertight but they are pragmatic. They are judgement calls. The Wellby, its proponents say, may not be perfect but in practice it is superior. It is our least bad option at a price worth paying to improve upon the standard methods of economic evaluation or the unwieldy multidimensional metrics of the SDGs.

But how can we know that the Wellby is the least bad option? I do not know and do not pretend to. My bigger frustration is with the very exercise that the search for a master indicator like Wellby forces upon us.

This exercise compels us to accept compromises for the sake of the widespread ideal of “evidence-based practice”. But if evidence-based practice requires compromises, then why not resolve them the way compromises should be resolved – by honest politics?

Why can’t we keep track of the state of our communities using multiple indicators each representing different values, rather than a single master indicator? When these different indicators do not unequivocally support one policy over another, we should have a political debate about which indicator best captures our priorities at that time. The decision thus taken will not be solely evidence-based, and in any case we have already seen that evidence about well-being always comes with many judgements that remain in experts’ hands – judgements that hide in the assumptions about measurement and estimation of well-being.


The dream of a master indicator is dangerous precisely because it buries disagreements about what else matters other than life-years weighted by life satisfaction. Consider, for example, a town deciding whether to fund a public library. Suppose there is data showing that this library would likely make residents more informed, but depress their mental health. Which matters more? The Wellby expert reassures the town: “There is no need to even raise this thorny question. Here’s the relevant Wellby estimate of the effect of a public library. It reflects the trade-off between informedness and mental health from the residents’ point of view.”

The expert’s offer is tempting. Who doesn’t want to avoid making a difficult choice? But the Wellby estimate does not reflect residents’ considered judgement about what matters to them as citizens – that’s not the question life satisfaction answers – and so the Wellby is not an adequate substitute for public deliberation about what to prioritise.

What the Wellby, or any other master indicator, does well is to make conflicts between values invisible. Such indicators create the utterly false impression that legitimate decisions can be made on the basis of evidence alone. It then becomes harder for the public to challenge the expert judgements that, in this case, clearly deserve to be challenged. This technocratic dream thus perpetuates the illusion that there can be politics-free rule on the basis of science alone.

This myth will persist among the faithful. But it needs to be eradicated from the social sciences. They are our best source of knowledge about the fact that we are far more complex and interesting than any one number can capture.

Anna Alexandrova is Professor in Philosophy of Science at the University of Cambridge. She is the author of “A Philosophy for the Science of Well-Being” (OUP).

This article is part of the Agora series, a collaboration between the New Statesman and Aaron James Wendland. Wendland is Vision Fellow in Public Philosophy at King’s College, London and a Senior Research Fellow at Massey College, Toronto. He tweets @aj_wendland.