Speech Recognition in Healthcare

October 7, 2019 | Kulika Supcharoen

Speech recognition is a catch-all term for a wide range of academic fields that all work towards something truly complex and valuable: the translation of spoken language into text that computers can read. It’s also often referred to as Automatic Speech Recognition (ASR), Computer Speech Recognition or Speech-to-Text (STT).

Whatever you call it, it’s possibly the apogee of human-machine interfacing, as it lets us control and communicate with machines through the exact same medium we squishy humans use to communicate with other squishy humans. Simply put, ASR allows machines to literally become part of the conversation.

Speech recognition isn’t new. In fact, work on it began as early as 1952 at Bell Labs, where researchers Stephen Balashek, R. Biddulph and K. H. Davis built a system they called “Audrey”, a huge machine that occupied a six-foot-high relay rack. Audrey laid the foundations of ASR, as she was able to recognise the fundamental units of speech sounds, known as phonemes. It wasn’t until the 1970s, however, that ASR research really took off. In 1971, DARPA funded five years of “Speech Understanding Research”, seeking a minimum vocabulary size of 1,000 words.

One of the first consumer products to use ASR was Dragon Dictate, released in 1990. In 1992, AT&T deployed its Voice Recognition Call Processing service, developed by Lawrence Rabiner and others at Bell Labs, to route calls without the use of a human operator.

The development of ASR has continued apace since these early commercial deployments. We’re all now familiar with consumer products used daily by millions of people, such as Siri, Google Assistant and Alexa, and the technology has reached the point where it is uncommon for it to misunderstand you, whereas with previous generations you were lucky if it understood you at all. Beyond these highly successful commercial applications, a new area of development is opening up that could bring huge benefits to healthcare services.

There are few domains where clarity and quality of communication are as paramount as they are in healthcare. The NHS found that 1 in 20 deaths was preventable, a large number of them caused by misdiagnosis. When a patient explains their symptoms, there can be many barriers for health providers, from simple ones such as language or accents making patients difficult to understand, to patients’ innate inability to communicate complex feelings in a way that can effectively lead to a diagnosis. On top of that, patient care is often only as effective as the records that are kept about that care. This requires doctors and nurses to diligently record all interactions, actions and follow-up treatment for each patient.

That’s a lot of information packed into a small space about the challenges facing healthcare, but it’s important to note that many of these challenges could be either solved by ASR or, at the very least, have their impact on patient care greatly reduced.

Firstly, let’s take a look at language and accents. You may or may not be familiar with it, but the Google Translate app has a conversation feature that turns your phone into a digital interpreter. Both people talk to the app, which detects the language being spoken, then translates what was said and speaks it out loud in the chosen language. The other person can then respond in their native tongue, and the app translates and speaks that back in the first person’s language. It might be slightly slower than speaking the same language, and it might not yet be 100% accurate, but a dedicated healthcare translator that could understand complex medical diagnoses and explain, at the very least, a definition of the illness in someone’s native tongue could be a genuine lifesaver. It would also mean that a foreign patient could explain their problem in their own language and have it communicated effectively to a doctor in the doctor’s native tongue, improving the quality of the diagnosis and leading to better patient care.

A CMIO, or Chief Medical Information Officer (known in the UK as a CCIO, or Chief Clinical Information Officer), is a healthcare executive who is responsible for the health informatics platform, that is, the management and use of patient healthcare information. They are also responsible for the efficient design, implementation and use of health technology within a healthcare organisation, so the rollout of any ASR tool would be led by a CMIO. Patient healthcare information is a really powerful tool.

In the near future, artificial intelligence (AI) will be a regular tool that enables and empowers doctors to reach more accurate, faster diagnoses based on huge amounts of statistical data gathered from patients. ASR has a huge role to fulfil here on two fronts. The first is acting as a simple scribe, taking notes during patient examinations and transcribing the patient’s own words as well as recording the doctor’s diagnosis and course of action. This would drastically improve patient care, as it would act as a complete record of what a patient has said, something that doesn’t exist anywhere in the world currently. The second is that a full account of machine-readable patient information would allow artificial intelligence to mine the data, either to suggest a potential diagnosis to a doctor or to build on its diagnostic models, helping it to diagnose patients in the future. By analysing the language patients use to describe their illnesses and ailments, it could be possible for a machine to understand these descriptions and to diagnose someone else who is describing the same or similar problems.

Beyond the advantages of full documentation and minable data, a tool that can automatically understand diagnostic conversations, and not just transcribe them verbatim but interpret what was said and record it in a medical format, would be hugely beneficial. What I mean by this is that doctors talk to patients while asking specific diagnostic questions covering things such as height, weight, heart rate, medication and dosages, where the pain is, how the patient would describe that pain and so on. The doctor then records this, often in a patient chart or table. A machine that can understand these conversations and automatically fill out the forms that doctors currently have to complete manually would save a doctor a huge amount of time, potentially increasing the number of patients they could see in a day by reducing their paperwork. There is a question of how accurate a machine could be at this kind of interpretation, but if it can reduce data entry to simply double-checking that the entered data is correct, then it could become a valuable tool for physicians.
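To make the idea of turning a conversation into a chart a little more concrete, here is a minimal Python sketch. It assumes the ASR step has already produced a plain-text transcript, and it uses simple pattern matching to pull out three of the values mentioned above. The field names, the patterns and the sample transcript are all hypothetical illustrations, not part of any real clinical system, and a production tool would need far more robust language understanding than this.

```python
import re
from typing import Dict, Optional

# Hypothetical patterns for three chart fields; real conversations would
# need much broader coverage (units, phrasing, spoken numbers and so on).
FIELD_PATTERNS = {
    "height_cm": re.compile(r"(\d+(?:\.\d+)?)\s*(?:cm|centimetres|centimeters)", re.I),
    "weight_kg": re.compile(r"(\d+(?:\.\d+)?)\s*(?:kg|kilos|kilograms)", re.I),
    "heart_rate_bpm": re.compile(r"(\d+)\s*(?:bpm|beats per minute)", re.I),
}


def extract_chart_fields(transcript: str) -> Dict[str, Optional[float]]:
    """Return the first value found for each field, or None if it is absent."""
    chart: Dict[str, Optional[float]] = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(transcript)
        chart[field] = float(match.group(1)) if match else None
    return chart


# Made-up transcript standing in for the output of an ASR system.
transcript = (
    "Doctor: Your height is 172 cm and you weigh 68.5 kilos. "
    "Resting heart rate is 74 beats per minute."
)
chart = extract_chart_fields(transcript)
print(chart)
```

Any field the patterns cannot find is left as None, which fits the double-checking workflow described above: the doctor reviews the pre-filled values and completes whatever the machine missed.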

The big challenge with ASR in healthcare, however, is that it is a software solution that requires hardware to be fully utilised in situ. Designing medical devices and hardware that comply with medical regulations and requirements is a real challenge, especially if you don’t have any experience designing and developing medical equipment. Fear not though, as we here at Detekt and our partners have a wealth of experience designing and delivering medical products that meet the highest of standards. If you would like help getting your medical product idea to market, then call us today to find out how we can get you there faster.