Science, technology, and all things awesome with Ian Steadman


The statistical inevitability of ISP porn filters blocking the wrong stuff

Even if it's an honest mistake, blocking useful sexual health and LGBT resources is an inevitable encroachment on online civil liberties.

Sky's content filter in action.

Consumers have had to put up with the government’s stupid, naive and/or malicious decision to mandate censorship of the internet in the UK - requiring ISPs to block certain “harmful” content for their customers’ own good - for almost a month. The results are starting to roll in, and - unsurprisingly - they’re needlessly frustrating.

Over the course of Sunday evening and Monday morning this week, a crucial JavaScript library called jQuery was listed by Sky’s filter as malicious, and code.jquery.com, the site that hosts it, was blocked. jQuery is a library that a huge proportion of modern websites rely on to handle scripting and interactivity. Block jQuery, and a lot of the web stops working as smoothly as it should. Sky eventually fixed the problem, but the cause of the error is still unknown.

This follows on from the problems experienced by ISPs in December, when it became clear that the filters were classifying sites offering sexual health advice, or advice on how to deal with abusive partners, as content that needed to be blocked.

As Martin Robbins laid out in the NS in December, there’s a massive gap between the PR-spin “porn filter” idea and the reality of an “objectionable content” filter. The former seeks to block something that doesn’t have a universal definition, while the latter does block sites of a set definition - set, that is, arbitrarily. It’s dangerous to enable censorship - by default! - of stuff that could save lives, just as it’s irresponsible to conflate kids accessing porn (which, by the way, has never been shown to cause long-term damage to children) and the very real problem of child pornography (or child abuse media, as it should really be referred to).

The most frustrating aspect of this problem might well be that it only takes a bit of back-of-the-envelope statistics to show how useless a concept porn filters are as a tool, regardless of whether the sites are picked by algorithms or by humans. This is an argument Cory Doctorow has been making with characteristic intelligence and wisdom for years:

There simply aren't enough people of sound judgment in all the world to examine all the web pages that have been created and continue to be created around the clock, and determine whether they are good pages or bad pages. Even if you could marshal such a vast army of censors, they would have to attain an inhuman degree of precision and accuracy, or would be responsible for a system of censorship on a scale never before seen in the world, because they would be sitting in judgment on a medium whose scale was beyond any in human history.

Think, for a moment, of what it means to have a 99 percent accuracy rate when it comes to judging a medium that carries billions of publications.

Consider a hypothetical internet of a mere 20bn documents that is comprised one half "adult" content, and one half "child-safe" content. A 1 percent misclassification rate applied to 20bn documents means 200m documents will be misclassified. That's 100m legitimate documents that would be blocked by the government because of human error, and 100m adult documents that the filter does not touch and that any schoolkid can find.

In practice, the misclassification rate is much, much worse.

How much worse? Try a misclassification rate of more than 75 percent, according to the Electronic Frontier Foundation.
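Doctorow's back-of-the-envelope arithmetic can be sketched in a few lines of code. The figures below come straight from the quoted example - 20bn documents, split evenly between "adult" and "child-safe" content, with a 1 percent misclassification rate - and the function is just an illustration, not any filter vendor's actual methodology:

```python
# Back-of-the-envelope arithmetic for filter misclassification,
# using the figures from Doctorow's hypothetical above.

def misclassified(total_docs, adult_fraction, error_rate):
    """Return (overblocked, underblocked) document counts.

    overblocked:  legitimate pages wrongly blocked by the filter
    underblocked: adult pages the filter fails to catch
    """
    safe = total_docs * (1 - adult_fraction)
    adult = total_docs * adult_fraction
    return safe * error_rate, adult * error_rate

over, under = misclassified(20_000_000_000, 0.5, 0.01)
print(f"{over:,.0f} legitimate pages blocked")  # 100,000,000
print(f"{under:,.0f} adult pages missed")       # 100,000,000
```

Even a seemingly impressive 99 percent accuracy rate, applied at the scale of the web, leaves errors in the hundreds of millions - and plugging in the EFF's 75-percent-plus misclassification rate makes the totals correspondingly grimmer.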


In December, as TalkTalk discovered that its HomeSafe filter was blocking the site of LGBT organisation London Friend, its spokesman’s response, delivered on Newsnight, was telling: “Sadly there is no silver bullet when it comes to internet safety and we have always been clear that no solution can ever be 100 percent. We continue to develop HomeSafe and welcome feedback to help us continually improve the service.” ISPs are just as aware as anyone of how impossible a challenge it is to develop a filter that’s 100 percent safe.

The nuts-and-bolts work of combating child abuse falls to a government agency whose budget the coalition has cut; it’s hard not to sympathise with the argument (or is it a conspiracy theory?) that the filters have instead been introduced as a kind of backdoor Section 28, designed to appease the puritanical dinosaur wing of the Tory party. And in the meantime, Chrome users with control over their computers can install the Go Away Cameron plugin, which will route around the filters using a proxy.