Rise of the geeks

A new class of specialists is analysing which websites you look at, what you buy in the supermarket,

In the city of Portland, Oregon, hundreds of elderly people have invited Intel Corp, the semiconductor manufacturer, to wire their houses with sensors. This machinery maps their movements through their homes and measures their average strides. It notes the volume at which they speak and the amount of time it takes them to recognise a known friend or relative on the telephone. Sensors in their beds keep track of their weight and nocturnal activity, including trips to the bathroom. Toothbrushing, midnight snacks, nighttime thrashing: it is all in the data, and all of it travels through the internet to Intel’s computers.

With this trove of information, Intel researchers are developing what they call “behavioural baselines” for each household. Any deviation from these norms is a signal that something might be amiss. Research is at an early stage, but in time, Intel hopes the computers will be able to recognise the patterns of certain diseases, from the early stages of Parkinson’s to Alzheimer’s. Eventually, the thinking goes, expensive home help and hospital care will be replaced by ever cheaper surveillance gizmos.
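The idea of a "behavioural baseline" can be illustrated in a few lines of Python. This is only a minimal sketch, not Intel's method: it assumes a baseline is simply the mean and standard deviation of past readings, and that a "deviation" is any reading more than three standard deviations away. The function names, the threshold and the sample data are all hypothetical.

```python
from statistics import mean, stdev

def build_baseline(history):
    """Summarise past readings as (mean, sample standard deviation)."""
    return mean(history), stdev(history)

def is_deviation(reading, baseline, threshold=3.0):
    """Flag a reading more than `threshold` standard deviations from the mean."""
    avg, spread = baseline
    if spread == 0:
        # No variation in the history: any different value counts as a deviation.
        return reading != avg
    return abs(reading - avg) / spread > threshold

# Hypothetical data: nightly bathroom trips in one household over two weeks.
nightly_trips = [1, 2, 1, 1, 2, 1, 2, 2, 1, 1, 2, 1, 1, 2]
baseline = build_baseline(nightly_trips)

print(is_deviation(2, baseline))  # an ordinary night
print(is_deviation(6, baseline))  # a night far outside the household's norm
```

A real system would have to weigh many signals at once and cope with habits that drift over time; the sketch shows only the core comparison of today's reading against the household's own past.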

As that happens, a new class of medical professionals will rise. They are not doctors or nurses, but instead specialists in finding patterns in the growing mountains of digital data that we produce. These specialists are one sub-section of a group I call “the numerati”. Engineers, mathematicians and computer scientists, they are sifting through data we produce – and not only in the field of medicine. They are studying the web pages we visit, the groceries we buy, our wanderings with our mobile phones. For them, our digital records create a fast-growing laboratory of human behaviour. It holds the clues as to what we might buy, which online ads we are most likely to click on, which diseases loom largest in our future, and whether we might be inclined – statistically speaking – to strap a bomb under our coat and climb on to a bus.

This science, based on statistics, determines only probability. It cannot predict with certainty the behaviour of any individual. For that reason, the numerati are rising fastest in the sectors in which they can afford to make mistakes on a regular basis without getting themselves (or, in theory, us) into trouble. Advertising and marketing are their test beds; and Google, a company that answers our queries with educated guesses, is the first titan of the realm.

I have given quite a number of talks about the numerati recently, and as I describe their research into our shopping carts and medicine cabinets, I see people start to squirm in their seats, alarmed that Yahoo!, for example, captures a monthly average of 2,500 bits of data about each of its 250 million users. Towards the end of each session, someone usually asks how we can protect ourselves from the inquisitive numerati.

This growing concern is pushing politicians and regulators on both sides of the Atlantic to rein in a form of internet advertising known as behavioural targeting. This involves companies such as Yahoo!, Google, and the British online advertising company Phorm. They reach agreements with publishers, including leading newspapers and magazines, to tag each visitor with a bit of computer code, known as a cookie, which allows them to trace much of our online activity. Most of these companies do not bother searching out our names or addresses. Our patterns alone suffice. A Londoner who reads an article about Mallorca and checks the prices of Rioja is more likely than most, the automatic program might decide, to click on an Iberia Airlines ad. So it drops one in that person’s path. Those concerned about privacy can erase cookies or even instruct their computer not to accept them. In doing so, they are opting to be treated as an unknown person. For decades, that is what millions of us have been in shopping centres and supermarkets, and on the pavements of big cities: virtually indistinguishable from everyone else. Many of us associate this anonymity with privacy.

However, not everyone shares that view, not by a long shot. Sitting in the same audiences where some fret about privacy and vow to go “off the grid” are others who publish the most intimate details of their lives on Facebook, MySpace and Twitter. Many of these people take the time to answer surveys which pop up on book, movie and online dating sites. They want automatic systems to know them better so that they can get customised service.

There is a fundamental divide between those who want machines to be informed and those who’d rather that they stay in the dark. The privacy divide is not between the numerati and the rest of us; it exists, and is widening, among ourselves. We’ve not yet made up our minds about the machines that increasingly manage our lives.

One thing is clear. The amount of digital data that we produce will continue to grow exponentially. Those concerned about behavioural advertising are getting just the slightest whiff of what is ahead. Consider Sense Networks. A New York-based start-up, the company studies the paths we follow as we move around with mobile phones. In Sense’s computers, each of the millions of mobile-phone users simply shows up as an anonymous blinking dot on a map. But by studying those dots, Sense’s scientists can derive all sorts of insights about what they signify. On the basis of which neighbourhood a dot spends its nights in, Sense can calculate average home value or income. Dots that pause at regular stops en route to work are train commuters. It is easy to see the ones that go clubbing in the early hours. Golfers, churchgoers, people who sleep around: it is all in the data.

That is just the beginning. As Sense’s system follows the movements of the dots, it begins to recognise similar patterns. A group, or tribe, of dots that behave in similar ways is identified by a particular colour. The patterns are picked out by the computer, not people, so the tribes largely transcend the traditional demographic groupings that marketers have focused on for decades. In Sense’s scheme, a pair of identical twins might have different coloured dots. Common behaviours, after all, may well be more telling than similar ages or skin colours.
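How a computer might sort dots into tribes without any demographic labels can be sketched roughly as follows. This is not Sense Networks' algorithm, which the article does not describe; it is a simple greedy grouping by cosine similarity, chosen for brevity. The feature layout (weekly hours at three kinds of place), the similarity threshold and the dot names are assumptions made up for illustration.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two behaviour profiles (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def assign_tribes(profiles, min_similarity=0.9):
    """Greedy grouping: a dot joins the first tribe whose founding profile is
    similar enough to its own; otherwise it founds a new tribe."""
    founders = []  # profile of the first member of each tribe
    tribes = {}    # dot id -> tribe number (the "colour")
    for dot_id, profile in profiles.items():
        for tribe, founder in enumerate(founders):
            if cosine(profile, founder) >= min_similarity:
                tribes[dot_id] = tribe
                break
        else:
            founders.append(profile)
            tribes[dot_id] = len(founders) - 1
    return tribes

# Hypothetical weekly hours per dot: (golf course, nightclub, train station).
profiles = {
    "dot_a": (6, 0, 1),
    "dot_b": (0, 8, 2),
    "dot_c": (5, 1, 1),
    "dot_d": (1, 7, 3),
}
tribes = assign_tribes(profiles)
print(tribes)  # dot_a and dot_c share one tribe; dot_b and dot_d share another
```

Nothing in the input says who the dots are. The grouping depends only on how they behave, which is why two people of the same age and background can end up with different colours.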

Why focus on these dots? Say a brewery runs a promotion in a series of pubs around Charing Cross. The campaign is successful, so the company wants to extend it. Sense’s maps can identify other neighbourhoods pulsating with dots of the same colour, or bus routes popular with that hue, in which the brewery might advertise. Politicians, who are starting to use similar data-crunching techniques to target voters, could study their supporters’ dots, and hunt for other, untapped clusters of the same tribes.

Studying people’s movements via their mobile phones is only just beginning. As handsets grow more sophisticated, we deliver ever more information to the numerati. Our mobile web browsers tell advertisers when and where the craving for shopping or fine dining hits us. Nokia envisions analysing people by the places they take, and send, photos of. What can a company infer about those who take snaps of Buckingham Palace or Wembley Stadium? They won’t know until they crunch the data.

While some no doubt recoil at the notion of being tracked as a coloured dot, others embrace it. In February, Google launched its Latitude program in 27 countries. The application lets people with high-end mobile phones share location data with friends – and with Google. Meanwhile, more than 25 million people have downloaded Facebook’s mobile application. This allows the social networking company, which already houses an immense amount of personal information, to study the movements and behaviour of its large and growing following.

As the global economy swoons, the prospects for the numerati grow brighter. Their efforts to target people carry the promise of efficiency and lower costs. Nowhere is this more evident than in the workplace, where employers can scrutinise the patterns of workers’ clicks and web browsing. One San Francisco technology company, Cataphora, has developed a system for evaluating workers on the basis of their emails. Those whose sentences are forwarded most often to others are marked out as “idea generators”, and those who pass on these nuggets get high marks as “networkers”. On a chart that Cataphora prepared for an internet company, each worker is portrayed as a coloured disc. Large, dark-hued discs are judged to be the most active, and therefore the most effective. And small, pale ones? They may be the first to be considered when it comes to workforce reduction.

Cataphora’s system, of course, is primitive, and managers who bow to its dictates probably deserve small, pale discs of their own. After all, the much-quoted messages could contain anything: dirty jokes, juicy gossip about colleagues (or bosses). The point, though, is that the quantification of the knowledge workforce is under way. Managers will increasingly take its conclusions into account. And the techniques are sure to grow more sophisticated.
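The counting behind such labels is easy to picture. The sketch below is not Cataphora's system; it assumes, purely for illustration, a non-empty log of (forwarder, original author) pairs and awards each label to whoever tops the relevant count. The names and the log format are hypothetical.

```python
from collections import Counter

def label_workers(forward_events):
    """forward_events: non-empty list of (forwarder, original_author) pairs."""
    authored = Counter(author for _, author in forward_events)
    forwarded = Counter(forwarder for forwarder, _ in forward_events)
    return {
        # Most-forwarded author: the "idea generator".
        "idea_generator": authored.most_common(1)[0][0],
        # Most active forwarder: the "networker".
        "networker": forwarded.most_common(1)[0][0],
    }

# Hypothetical log: who forwarded whose message.
events = [("bea", "ann"), ("cal", "ann"), ("bea", "dev"), ("bea", "ann")]
labels = label_workers(events)
print(labels)
```

The sketch also makes the weakness plain: it counts forwards without ever looking at what was forwarded.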

Researchers at Massachusetts Institute of Technology and IBM, a leader in workplace analysis, recently studied the social networks of several thousand of IBM’s technology consultants. They noted that workers who maintain regular email contact with their managers bring in nearly $1,000 revenue a month more than the average. Those who communicate with several managers, and contact each one less often, perform worse than the norm, bringing in $88 less. Again, what these numbers prove is arguable, and the conclusions would be obvious to anyone familiar with the age-old adage about the effect that too many cooks can have on a broth. But as we workers produce more data, the machines are sure to develop ever more nuanced analysis.

Not that the numerati do not face steep challenges. Many of IBM’s workforce studies are based on the same mathematics the company uses to fine-tune industrial supply chains. But humans do not behave like machine parts. We learn, we change, we conspire when our interests are at risk. And we are experts at playing the very systems designed to monitor and control us. So the numerati at IBM work with teams of anthropologists, psychologists and linguists.

Their goal is to place each worker in the right job with just the right training, and surrounded by the most supportive colleagues to ensure that they are as productive as humanly possible. If you think that sounds grim – and it does to me – there is an upside. Studies leave no doubt that happier workers are more productive and come up with better ideas. So job satisfaction is sure to make its way into these productivity algorithms at some stage.

In some areas, like the workplace, the methods of the numerati are imposed upon us. But in other domains, like online dating, it is up to us. We can choose to send our data (and even how truthful to be). There, we are the data masters.

But let us return to the monitored homes in Portland. Would you agree to be monitored for, say, a £100 tax break? How about £200? Increasingly, we are going to face these questions. And I am betting that many of us would choose to keep an electronic eye on the people we feel responsible for. A sensor that can report when a 90-year-old grandmother spends the day in bed might be a very sensible idea. And the black boxes that insurers are testing to monitor driving patterns might help keep a 17-year-old alive – or at least lower his or her premium.

If surveillance of the young and the old makes sense, it will not be long until we all surround ourselves with it. We will be spying on ourselves and sending in digital reports. In fact, the process is well under way: think of all those security cameras. As far as the numerati are concerned, we are already delivering our lives into their laboratories, each day in greater detail.

Stephen Baker’s book “The Numerati” is out now, published by Jonathan Cape (£12.99)

Who is watching you?

The UK government Spends £16bn a year on databases, and plans to spend £105bn on data collection projects over the next five years. The number of systems it operates runs into the thousands.

Google Owns more than 50 applications that collect users’ information, and has multimillion-dollar licensing deals with several news agencies, to help newspapers make money through “interest-based ads”.

Tesco Two hundred million purchases a week are recorded on Tesco loyalty cards. This data is used to build personality profiles of shoppers including their travel habits, donations to charity and concern for the environment. The data is sold to companies such as Sky, Orange and Gillette.

Yahoo! Captures 2,500 bits of data a month about each user. It has formed an alliance with the Newspaper Consortium, which represents more than 800 dailies.

Facebook Two hundred and twenty-two million people visited the site in December 2008. Facebook grants itself a perpetual licence to use content, scanning profiles, pictures and messages, and tailoring the adverts that run beside them.

Phorm Allows partner websites to target their users with tailored ads. The European Commission has started legal action against Britain, after complaints the behavioural advertising service was tested on BT’s broadband network without users’ consent.

Charlotte Middlehurst