Digital healthcare is a fast-growing market, predicted to grow to £2.9bn in the UK by the end of 2018. Tens of thousands of health apps are available on Apple’s App Store and Google Play, and one of the most prominent, with more than 14,000 reviews, is Babylon Health.
The UK-based private company says it has 1.4m users in the UK. Among them is the new health secretary, Matt Hancock. It has been trialled by the NHS, but the Babylon app and the AI that underpins it are unregulated (other than self-regulation), which has raised concerns among healthcare professionals.
A report from the Care Quality Commission found 43 per cent of digital health providers were not providing “safe” care. A hospital doctor working in acute medicine told Spotlight that in his experience, Babylon Health “couldn’t tell the difference” between “a heart attack or heartburn”, and that if it was being promoted as being able to make this type of distinction, there was a risk that “people would have been harmed.”
The company’s claims for the app have met with some concern. Posters for the company’s GP At Hand app were ruled to have been “misleading” by the Advertising Standards Authority, and at a press conference Ali Parsa, the CEO of Babylon, claimed that the AI scored 81 per cent on a medical exam, while the average pass mark for doctors is 72 per cent.
Professor Martin Marshall, vice chair of the Royal College of GPs, responded that “no app or algorithm will be able to do what a GP does,” and described the assertion that Babylon’s AI could perform better than the average GP as “dubious”.
The AI chatbot, which offers a triage service for patients to type in ailments and receive advice, has come under particular scrutiny. Although warnings on the app stress that it does not provide a diagnosis, the app gives suggestions as to what could be wrong with the patient and recommends whether to see a doctor or call an ambulance.
In tests described by a hospital doctor, it appeared to struggle to tell the difference between a heart attack and a panic attack, and suggested symptoms of a chest infection could indicate multiple sclerosis. In a test of the app conducted by Spotlight, when the “patient” reported vomiting it questioned whether the patient’s testicles were sore, and whether wind could be passed, before determining that a doctor should be seen. When typing in symptoms of a panic attack (as described on the NHS website) it advised calling 999.
The hospital doctor, who asked to remain anonymous, described the service as “absolutely woeful… I think most members of the general public would do better. It doesn’t seem to nuance the length of symptoms so the answers and responses you get back from the algorithm are often quite bizarre.”
He believes that the app was “inadequately tested and overpromoted”. When launched in 2015 it was presented as an app that “gave safe advice 100 per cent of the time”. But for those who had chest pain or breathlessness, placing their trust in an app that could make errors could have constituted, he said, “a significant risk”. Babylon Health has since been presented as a more basic triage app, which, he accepts, means “the risk is [now] low”.
But Margaret McCartney, a GP of 20 years, warned that the false positives it could produce would be “just as harmful to the healthcare service as a whole”. “When people are identified as needing help, when in fact they don’t, it creates an increase in waiting times and an increased difficulty getting to see a health care professional; this harms everybody.”
What regulation is in place to ensure patient safety?
The Babylon website states: “The outcomes, usage data and feedback are audited regularly by our in-house medical team to ensure that we are satisfying our users, providing a safe and useful service, and to see how we can improve our content.”
Dr Mobasher Butt, medical director at Babylon, admitted that the two published technical reports assessing the performance and safety of the app do not include a randomised controlled trial – a form of trial which Dr Butt described as providing “the highest level of evidence”. His reasoning was that the slow and methodical nature of such a trial made it unsuited to testing the “rapidly evolving technology”, meaning that any results would be out of date by the time they were published. He added: “our approach to clinical testing and validation is incredibly robust.” He explained that it involves several stages of testing and validation by internal and external clinicians, and that this is an ongoing process.
Dr Butt also claimed that there was “a strong regulatory component” from the Medicines and Healthcare products Regulatory Agency (MHRA), and “while the device manufacturer might need to submit their own materials it is actually a very comprehensive process”.
Health apps, such as Babylon, Ada and Push Doctor, are listed as class one, or low-risk, medical devices – the same class as more rudimentary devices such as stethoscopes, bandages or splints. This means that they require only self-assessment in order to be registered with the MHRA.
McCartney wrote in the British Medical Journal: “We have many regulators but little proactivity, even for an app which – despite the small print warning us that it ‘does not constitute medical advice, diagnosis, or treatment’ – is being used as the front door into NHS care.”
“AI has great potential in healthcare, but this potential will not be realised, and harm may be caused, if we don’t accept the need for robust testing before it’s publicly launched and widely used. We have no clear regulator, no clear trial process, and no clear accountability trail. What could possibly go wrong?”
Her sentiments were echoed by Professor Marshall, who said that the claim that the AI worked better than a GP “made for a nice headline,” but “wasn’t very meaningful”.
“Technology like this has enormous potential to help doctors make better diagnoses,” but “I think regulation needs to evolve and it needs to do so rapidly”. Marshall also thought that it was the government’s responsibility “to promote the technology, and it’s likely to change the nature of healthcare in the future”. However, he said, “I don’t think it’s their job to promote single products”. Instead, he called for more support for “making sure that it’s properly evaluated and properly regulated”.
The MHRA declared that it regularly carries out “post-market surveillance and maintain[s] dialogue with manufacturers. Patient safety is our highest priority and should anything be identified during our post-market surveillance, we take action as appropriate to protect public health.”
Changes to MHRA regulation come into force by May 2020, when the agency will be required to assess the apps’ data. Murphy believes this will give the MHRA “greater oversight over the safety of these apps”. Babylon said it was already working to meet the requirements.
Concerns with health apps were raised in parliament in July, during questions in the House of Commons on online NHS services following Matt Hancock’s appointment as Health Secretary. Sarah Wollaston asked why no regulator is examining the safety and effectiveness of diagnostic apps, to which Hancock replied: “The response when there are challenges such as the one my honourable friend raises is not to reject the technology, but the opposite: to keep improving the technology so that it gets better and better, and to make sure that the rules keep up to pace.”
Babylon says its mission is “to provide affordable and accessible healthcare and put it into the hands of everyone on earth”. It is a bold ambition, and a problem that the company believes AI can solve. However, regulation may still have to follow the lead of technology as it continues to set the pace.