Predicting the text in redacted documents is close to reality

Releasing delicate information with big black bars all over it has kept secrets safe for years - but not for much longer, maybe.

For those with secrets they want to keep, redacting documents is a pretty important thing to get right. First, it’s necessary to understand how to redact documents - look to Southwark Council, which in February uploaded its controversial agreement with developer Lend Lease for the regeneration of the Heygate Estate in a form that let people copy and paste the text underneath the black bars.

But it’s also necessary to know which parts of a document to redact so that the context from the stuff left open doesn’t give the game away. There is always, however, information left behind. The choices made in how to block text - be it with other bits of paper, or black marker pen, or even by typing out new words and then covering those up - can reveal something about the person doing the redacting. Different agencies had different redaction standards at different times, which gives a further clue as to what technique is needed. Each typeface fits into the space under a bar in a limited number of contextually-relevant ways, as well.
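To make the typeface point concrete, here is a minimal sketch of how the width of a black bar could narrow down what sits beneath it. It is an illustration only, not the Declassification Engine's method: the character widths, the `text_width` and `candidates_that_fit` helpers and the candidate words are all hypothetical, and real widths would have to come from the metrics of the actual font.

```python
# Hypothetical advance widths (arbitrary units) for a proportional typeface.
# These numbers are invented for illustration; real values come from font metrics.
CHAR_WIDTHS = {
    "i": 3, "l": 3, "t": 4, "r": 4, "s": 5,
    "a": 6, "e": 6, "n": 6, "o": 6,
    "m": 9, "w": 9, " ": 3,
}
DEFAULT_WIDTH = 6  # assumed width for any character not listed above


def text_width(text):
    """Return the total rendered width of `text` under the assumed metrics."""
    return sum(CHAR_WIDTHS.get(ch, DEFAULT_WIDTH) for ch in text.lower())


def candidates_that_fit(candidates, bar_width, tolerance=1):
    """Keep only the candidates whose width is within `tolerance` of the bar."""
    return [
        word for word in candidates
        if abs(text_width(word) - bar_width) <= tolerance
    ]


if __name__ == "__main__":
    # Hypothetical list of words that would make sense in context.
    countries = ["Iran", "Italy", "Oman", "Yemen"]
    print(candidates_that_fit(countries, bar_width=27))
```

With these made-up widths, a bar 27 units wide leaves only one of the four candidates standing. In practice the shortlist would come from context, and the width check would simply rule out the words that cannot physically fit.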

In the New Yorker, William Brennan reports on The Declassification Engine, an intriguing attempt by a group of academics to use these clues to try and crack any redacted text. A snippet:

Together with a group of historians, computer scientists, and statisticians, [Columbia history professor Matthew] Connelly is developing an ambitious project called the Declassification Engine, which, among other things, employs machine-learning and natural language processing to study the semantic patterns in declassified text. The project’s goals range from compiling the largest digitized archive of declassified documents in the world to plotting the declassified geographical metadata of over a million State Department cables on an interactive global map, which the researchers hope will afford them new insight into the workings of government secrecy. Though the Declassification Engine is in its early stages, Connelly told me that the project has “gotten to the point where we can see it might be possible to predict content of redacted text. But we haven’t yet made a decision as to whether we want to do that or not.”

One thing that jumps out here is the parallel between the "mosaic theory" - where "pieces of banal, declassified information, when pieced together, might provide a knowledgeable reader with enough emergent detail to uncover the information that remains classified" - and the argument made by critics of the NSA that mass collection of metadata, rather than the actual content of communications, is in many ways just as bad.

Redacted Iraq War info at a 2004 US Senate press conference (Photo: Getty)

Ian Steadman is a staff science and technology writer at the New Statesman. He is on Twitter as @iansteadman.

The answer to the antibiotics crisis might be inside your nose

The medical weapons we have equipped ourselves with are losing their power. But scientists scent an answer. 

They say there’s a hero in everyone. It turns out that, actually, it resides within only about ten per cent of us. Staphylococcus lugdunensis may be the species of bacteria that we arguably don’t deserve, but it is the one that we need.

Recently, experts have cautioned that we may be on the cusp of a post-antibiotic era. In fact, less than a month ago, the US Centers for Disease Control and Prevention released a report on a woman who died from a "pan-resistant" infection – one that survived the use of all available antibiotics. Back in 1945, the discoverer of penicillin, Alexander Fleming, warned during his Nobel Prize acceptance speech against the misuse of antibiotics. More recently, Britain's Chief Medical Officer Professor Dame Sally Davies has referred to anti-microbial resistance as “the greatest future threat to our civilisation”.

However, hope has appeared in the form of "lugdunin", a compound secreted by a species of bacteria found in a rather unlikely location – the human nose.

Governments and health campaigners alike may be assisted by a discovery by researchers at the University of Tübingen in Germany. According to a study published in Nature, the researchers had been studying Staphylococcus aureus. This is the bacterium responsible for the so-called "superbug" MRSA. MRSA is not particularly virulent but, crucially, it is not susceptible to common antibiotics. This means that MRSA spreads quickly from crowded locations where residents have weaker immune systems, such as hospitals, before becoming endemic in the wider local community. In the UK, MRSA is a factor in hundreds of deaths a year.

The researchers in question were investigating why S. aureus is not present in the noses of some people. They discovered that another bacterium, S. lugdunensis, was especially effective at wiping out its opposition, even MRSA. The researchers named the compound created and released by S. lugdunensis "lugdunin".

In the animal testing stage, the researchers observed that the presence of lugdunin was successful in radically reducing and sometimes purging the infection. The researchers subsequently collected nasal swabs from 187 hospital patients, and found S. aureus on roughly a third of the swabs, and S. lugdunensis on up to 10 per cent of them. In accordance with previous results, samples that contained both species saw an 80 per cent decrease of the S. aureus population, in comparison to those without lugdunin.

Most notably, the in vitro (laboratory) testing phase provided evidence that the new discovery is also useful in eliminating other kinds of superbugs, none of which seemed to develop resistance to the new compound. The authors of the study hypothesised that lugdunin had evolved “for the purpose of bacterial elimination in the human organism, implying that it is optimised for efficacy and tolerance at its physiological site of action”. How it works, though, is not fully understood.

The discovery of lugdunin as a potential new treatment is a breakthrough on its own. But that is not the end of the story. It holds implications for “a new concept of finding antibiotics”, according to Andreas Peschel, one of the bacteriologists behind the discovery.

The development of antibiotics has drastically slowed in recent years. In the last 50 years, only two new classes of this category of medication have been released to the market. This is because almost all antibiotics in use are derived from soil bacteria. By contrast, the new findings describe an antibiotic produced by a strain of bacteria that lives within the human body. Some researchers now suggest that the more hostile the environment to bacterial growth, the more likely it may be for novel antibiotics to be found. This could open up a new list of potential areas in which antibiotic research may be carried out.

When it comes to beating MRSA, there is hope that lugdunin will be our next great weapon. Peschel and his collaborators are in talks with various companies about developing a medical treatment that uses lugdunin.

Meanwhile, in September 2016, the United Nations committed itself to opposing the spread of antibiotic resistance. Of the many points to which the UN signatories have agreed, possibly the most significant is their commitment to “encourage innovative ways to develop new antibiotics”. 

The initiative has the scope to achieve a lot, or to dissolve into a box-ticking exercise. The discovery of lugdunin may well be the spark that drives it forward. That's nothing to sniff at.

Anjuli R. K. Shere is a 2016/17 Wellcome Scholar and science intern at the New Statesman