Opening up government

Jeni Tennison, technical director of the Open Data Institute, discusses progress on releasing data, how it feeds into wider Whitehall technology reform and why companies should follow the government's lead.

It’s three years since the launch of data.gov.uk and we now have access to datasets covering topics as diverse as HS2, broadband and GP prescriptions. The UK has won many plaudits as a result and is often touted as a world leader in the field of open data.

Time then, perhaps, to give some credit to the people behind these achievements, one of whom is Jeni Tennison.

Tennison holds several roles. She is primarily technical director of the Open Data Institute, but she also sits on the Cabinet Office’s Open Data User Group (ODUG) and the Government Digital Service’s Open Standards Board, among others.

Tennison agrees that major progress has been made. She says:

We’re getting to a stage now in the UK where, happily, a lot of data that can easily be released has already been published.

However, she is far from complacent. Tennison says, “The challenges are around data that is still being made available for money, where a business case has to be made to demonstrate that there is wider economic benefit against the short-term cost of losing that revenue. There are also challenges around the formats in which information is made available and the regularity and guarantees around that data.”

She explains that a key part of ODUG’s work is representing user requirements for data into government and helping to prioritise datasets for release.

Tennison adds, “It’s just trying to unpick those requirements and represent them into government. Obviously we’ve also had campaigns that ODUG has run around open address information, and ongoing discussions with Ordnance Survey over its licensing terms for example.

“So basically ODUG is a campaigning body that’s close to government and that tries to represent the need for data, spanning a broad range of different interests including big businesses, small businesses and civil society.”

Privacy trade-offs

Regarding concerns raised about open data, particularly around security and privacy, Tennison explains, “I think that you'll find that within the open data community there is a very strong recognition of the need to protect privacy but it’s more complicated than baldly saying 'open data is not personal data'. There are grey areas where personal information needs to be anonymised or aggregated in order to provide the benefits of openness without intruding on people's lives.

“And we're doing that very slowly, very gradually and are very much aware of the risks. The worst thing that could possibly happen for the open data community would be for some personal data to be released under the open data agenda because it would set it back. So it's not in anyone's interest. It's something we feel very strongly about in fact.”

She adds, “It's not 'everything should be out there', it's much more 'what data can we best make use of', and where are the trade-offs?”

As an example of these concerns in practice, Tennison points to a recent HM Revenue & Customs consultation about the VAT register, which contains information on businesses.

However, she explains:

A large proportion of businesses in the VAT register, especially if they aren't in Companies House already, will be named individuals as sole traders. So there is personal information within the register, certainly personally identifiable information.

On the other hand there's a big transparency and open data story about being able to get VAT registration information because it helps you to tie up all of the spending that local government does and central government does with particular companies. So there are lots of benefits economically, and in terms of service delivery and transparency, of making that information available. But then you have this personal data. So you have to balance these up.

And the argument that we made in response to that was that there was a subset of fields that didn't reveal too much but provided enough information to be useful for the 80% of things that didn't need all of this information to be made available. And so, just opening that up would be sufficient to address most of the requirements. It doesn't address all of them but in this case that was the right place to draw the line.
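The approach Tennison describes, releasing a subset of fields rather than the whole register, can be sketched in a few lines of Python. This is a minimal illustration only: the field names and the sample record are hypothetical, since the actual fields proposed in the consultation response are not listed here.

```python
# Hypothetical allow-list of fields judged safe to publish.
# The real VAT register fields are not given in the article.
PUBLISHABLE_FIELDS = ("vat_number", "trading_name", "registration_date")


def publishable_view(record):
    """Return a copy of the record containing only allow-listed fields."""
    return {key: record[key] for key in PUBLISHABLE_FIELDS if key in record}


# Made-up record for a sole trader, including personal details.
record = {
    "vat_number": "GB000000000",
    "trading_name": "Example Plumbing",
    "registration_date": "2010-04-01",
    "proprietor_name": "A. N. Other",
    "home_address": "1 Example Street",
}

open_record = publishable_view(record)
print(open_record)
```

An allow-list, rather than a list of fields to strip out, means that any field added to the register later stays private until someone decides it can be released.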

Tennison suggests that organisations follow ICO (Information Commissioner’s Office) advice, which says “it’s open by default, but if you have personal information in your data then you should conduct a Privacy Impact Assessment: you consult with any stakeholders that would be affected by revealing this information, and you do a considered analysis of what the impact will be of releasing that data.

“And when you've done that considered analysis then you may find there are certain bits that shouldn't be released, other bits that can be released, and it allows you to make those trade-offs in a risk-assessment-like process.”

Expanding the audience

Tennison says that one of her concerns is explaining what open data means to the general public. She says, “People often don’t know they want open data until they need it, for example about their local schools and hospitals.”

“Certainly one of the things that we concentrate really strongly on at the ODI is trying to expand the set of people that think about and understand what open data is and what it means beyond the existing open data community and the geeks. And the kind of technique that we're using at the moment is lots of stories to try to bring home how open data enables new things to happen.”

Tennison is also keen to expand an open data ethos throughout Whitehall. She regularly visits the Government Digital Service (GDS) for meetings of the Open Standards Board, and recently gave a presentation at one of GDS’s monthly meetings.

She says, “The pitch that I made to GDS was that open data should be very much at the heart of what they are doing, and they get that. It’s been very hard for them because they’ve had to focus on getting GOV.UK running, getting the government departments in, and now getting the public sector agencies in too.

“They are starting now, really excitingly, to look more at how they make data available as open data and to build that into their thinking. GDS should be one of the greatest beneficiaries of open data from the public sector because they should be able to build any open data that is available into their own services and the transactions that they are supporting.

“And so I’d really like GDS to be making a stronger stand around asking for the data that they need to be made available as open data by departments. To be asking for that themselves. They’re beginning to do some really great things. They are starting to build in micro APIs to the GOV.UK back end, so if you’re building an application that depends on the VAT rate, you can have the API call that goes and gets whatever the current VAT rate is and uses that.

“That’s always been what they intended to do at GDS: making it wholesale, enabling other people to build in services and data that they’re making available. So it’s really good to see them starting to move in that direction and I’m expecting great things from them.”
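To make the VAT example concrete, here is a minimal Python sketch of an application that asks a service for the current rate instead of hard-coding it. It does not show the real GOV.UK interface: the JSON shape, the `standard_rate` key and the stub fetcher are all assumptions made for illustration.

```python
import json
from decimal import Decimal


def current_vat_rate(fetch):
    """Parse the VAT rate (as a percentage) from a JSON response.

    `fetch` is any callable returning JSON text; in a real application it
    would call the rate API. The "standard_rate" key is hypothetical.
    """
    payload = json.loads(fetch())
    return Decimal(str(payload["standard_rate"]))


def price_with_vat(net, rate_percent):
    """Add VAT at the given percentage and round to the nearest penny."""
    gross = net * (1 + rate_percent / Decimal(100))
    return gross.quantize(Decimal("0.01"))


def stub_fetch():
    # Stand-in for a network call, so the sketch runs offline.
    return '{"standard_rate": 20}'


rate = current_vat_rate(stub_fetch)
total = price_with_vat(Decimal("50.00"), rate)
print(total)
```

Because the rate is looked up rather than written into the code, the application would pick up a change in VAT without needing to be edited.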

Making government more open

Tennison explains that ‘opening up’ technology is crucial if the government is to succeed in its plans to reform technology and break away from a reliance on a small pool of suppliers.

She says, “What I think is really interesting here is how open source, open standards and open data all come together to support the same agenda. Government doesn’t want to be locked into massive IT contracts with a small set of suppliers. ‘Open’ helps because when you have open data being published then you have a level playing field about what people know. When you have open standards being used then you have a level playing field about what tools can be used. And when you have open source established you have a level playing field about really just getting started with using that data and using those standards.

“We’re not going to see the effect [of the reforms] until the big contracts start coming to an end, but they are laying the groundwork. We can’t think that it’s job done. If you look at Dun & Bradstreet identifiers and the ACORN classification, you get those being built into IT systems and that forces a reliance on one supplier. These things have been built into the heart of some of the IT systems and processes and metrics and everything that is being used within government in order to make decisions. You’re completely locked in.

“And the other thing that you have to be extremely careful about within government is ‘open-washing’ that sometimes goes on. So, for example, you will get an API that is described as an ‘open API’ where in fact it doesn’t mean that the data is open data or using an open standard. The underlying standard that is used is open but the actual language, the vocabulary itself, is developed by one company and only one company uses it.

“So there are lots of things to be careful of. And of course it’s completely understandable that it’s in a company’s interest to try and keep people locked into using them and they’re going to keep on trying to get people locked into using them, because that’s what companies do. The government initiatives around open source, open standards and open data are so important because it’s only those that push back. It’s only those that enable you to stop that kind of thing from happening.”

Moving beyond the public sector

Regarding the ODI’s future, Tennison says, “A really important role that ODI will have over the next few years is about moving the open data community outside the government data space. Because although government holds a lot of data, so do companies. And so do third sector organisations, like charities and not-for-profits.

“In just the same way you have websites for all kinds of organisations, we should be thinking about how we have open data for all kinds of organisations and how publishing open data can benefit every kind of organisation. So the most heartening thing to me about the way that it looks like the ODI will work in future is our partnerships with companies who are not just thinking about consuming open data but actually publishing open data themselves.

“And they’re doing that for a number of reasons: for transparency, for innovation, in order to communicate better with their partners and peers, and for regulation reasons. There’s a whole range of reasons why companies should open data, and to me that’s the untapped space.”

She adds, “In some ways some companies are able to move much quicker than government. I think it’s going to be very interesting to see if we can actually accelerate the amount of open data that’s available by moving into corporate open data.”

However, she says, “Pre-GDS, government transactional services had been very much behind the private sector which has pushed ahead. In the open data arena it’s actually government that’s ahead. Government has really led the way and pushed innovation.”

This article originally appeared on governmentcomputing.com

What data can we best make use of - and where are the trade-offs? Photograph: Getty Images.

Charlotte Jee is a Reporter at Government Computing

 

What it’s like to fall victim to the Mail Online’s aggregation machine

I recently travelled to Iraq at my own expense to write a piece about war graves. Within five hours of the story's publication by the Times, huge chunks of it appeared on Mail Online – under someone else's byline.

I recently returned from a trip to Iraq, and wrote an article for the Times on the desecration of Commonwealth war cemeteries in the southern cities of Amara and Basra. It appeared in Monday’s paper, and began:

“‘Their name liveth for evermore’, the engraving reads, but the words ring hollow. The stone on which they appear lies shattered in a foreign field that should forever be England, but patently is anything but.”

By 6am, less than five hours after the Times put it online, a remarkably similar story had appeared on Mail Online, the world’s biggest and most successful English-language website with 200 million unique visitors a month.

It began: “Despite being etched with the immortal line: ‘Their name liveth for evermore’, the truth could not be further from the sentiment for the memorials in the Commonwealth War Cemetery in Amara.”

The article ran under the byline of someone called Euan McLelland, who describes himself on his personal website as a “driven, proactive and reliable multi-media reporter”. Alas, he was not driven or proactive enough to visit Iraq himself. His story was lifted straight from mine – every fact, every quote, every observation, the only significant difference being the introduction of a few errors and some lyrical flights of fancy. McLelland’s journalistic research extended to discovering the name of a Victoria Cross winner buried in one of the cemeteries – then getting it wrong.

Within the trade, lifting quotes and other material without proper acknowledgement is called plagiarism. In the wider world it is called theft. As a freelance, I had financed my trip to Iraq (though I should eventually recoup my expenses of nearly £1,000). I had arranged a guide and transport. I had expended considerable time and energy on the travel and research, and had taken the risk of visiting a notoriously unstable country. Yet McLelland had seen fit not only to filch my work but put his name on it. In doing so, he also precluded the possibility of me selling the story to any other publication.

I’m being unfair, of course. McLelland is merely a lackey. His job is to repackage and regurgitate. He has no time to do what proper journalists do – investigate, find things out, speak to real people, check facts. As the astute media blog SubScribe pointed out, on the same day that he “exposed” the state of Iraq’s cemeteries McLelland also wrote stories about the junior doctors’ strike, British special forces fighting Isis in Iraq, a policeman’s killer enjoying supervised outings from prison, methods of teaching children to read, the development of odourless garlic, a book by Lee Rigby’s mother serialised in the rival Mirror, and Michael Gove’s warning of an immigration free-for-all if Britain brexits. That’s some workload.

Last year James King published a damning insider’s account of working at Mail Online for the website Gawker. “I saw basic journalism standards and ethics casually and routinely ignored. I saw other publications’ work lifted wholesale. I watched editors...publish information they knew to be inaccurate,” he wrote. “The Mail’s editorial model depends on little more than dishonesty, theft of copyrighted material, and sensationalism so absurd that it crosses into fabrication.”

Mail Online strenuously denied the charges, but there is plenty of evidence to support them. In 2014, for example, it was famously forced to apologise to George Clooney for publishing what the actor described as a bogus, baseless and “premeditated lie” about his future mother-in-law opposing his marriage to Amal Alamuddin.

That same year it had to pay a “sizeable amount” to a freelance journalist named Jonathan Krohn for stealing his exclusive account in the Sunday Telegraph of being besieged with the Yazidis on northern Iraq’s Mount Sinjar by Islamic State fighters. It had to compensate another freelance, Ali Kefford, for ripping off her exclusive interview for the Mirror with Sarah West, the first female commander of a Navy warship.

Incensed by the theft of my own story, I emailed Martin Clarke, publisher of Mail Online, attaching an invoice for several hundred pounds. I heard nothing, so emailed McLelland to ask if he intended to pay me for using my work. Again I heard nothing, so I posted both emails on Facebook and Twitter.

I was astonished by the support I received, especially from my fellow journalists, some of them household names, including several victims of Mail Online themselves. They clearly loathed the website and the way it tarnishes and debases their profession. “Keep pestering and shaming them till you get a response,” one urged me. Take legal action, others exhorted me. “Could a groundswell from working journalists develop into a concerted effort to stop the theft?” SubScribe asked hopefully.

Then, as pressure from social media grew, Mail Online capitulated. Scott Langham, its deputy managing editor, emailed to say it would pay my invoice – but “with no admission of liability”. He even asked if it could keep the offending article up online, only with my byline instead of McLelland’s. I declined that generous offer and demanded its removal.

When I announced my little victory on Facebook some journalistic colleagues expressed disappointment, not satisfaction. They had hoped this would be a test case, they said. They wanted Mail Online’s brand of “journalism” exposed for what it is. “I was spoiling for a long war of attrition,” one well-known television correspondent lamented. Instead, they complained, a website widely seen as the model for future online journalism had simply bought off yet another of its victims.