Typically, a manual control input, such as a finger control on the steering wheel, enables the speech recognition system, and this is signaled to the driver by an audio prompt. Following the audio prompt, the system has a “listening window” during which it may accept speech input for recognition.
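The enable-prompt-listen flow described above can be sketched as follows. This is a minimal illustration, not any manufacturer's implementation; the callback names and the window length are invented for the example.

```python
import time

# Minimal sketch (all names and values invented) of the push-to-talk flow:
# a steering-wheel button enables recognition, an audio prompt is played,
# and speech is accepted only during a fixed "listening window".

LISTEN_WINDOW_S = 5.0  # assumed window length; real systems vary

def on_button_press(play_prompt, capture_audio, recognize):
    """Run one prompt/listen cycle; return a recognition result or None."""
    play_prompt()                      # audible cue to the driver
    deadline = time.monotonic() + LISTEN_WINDOW_S
    while time.monotonic() < deadline:
        audio = capture_audio()
        if audio is not None:          # speech detected inside the window
            return recognize(audio)
    return None                        # window elapsed with no input
```

Keeping the recognizer gated behind an explicit window avoids the system reacting to ordinary conversation in the cabin.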
Simple voice commands can be used to initiate phone calls, select radio stations, or play music from a compatible smartphone, MP3 player, or music-loaded flash drive. Voice recognition capabilities vary between car make and model. Some recent car models offer natural-language speech recognition in place of a fixed set of commands, allowing the driver to use full sentences and common phrases. With such systems the user therefore does not need to memorize a set of fixed command words.
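The difference between fixed-command and natural-language systems can be illustrated with a small sketch. The command set and action tuples below are invented for the example; a fixed-command system only acts when the recognized utterance matches one of its memorized phrasings exactly.

```python
# Hypothetical fixed-command dispatcher for an in-car voice interface.
# The recognizer's transcript must match a known phrase exactly (after
# normalization), which is why drivers of such systems must memorize
# the supported command wordings.

FIXED_COMMANDS = {
    "call home": ("phone", "dial_contact", "home"),
    "tune to fm": ("radio", "set_band", "FM"),
    "play music": ("media", "play", None),
}

def dispatch(transcript: str):
    """Map a recognized utterance to an action tuple, or None if unsupported."""
    key = transcript.strip().lower()
    return FIXED_COMMANDS.get(key)
```

A natural-language system would instead extract the intent ("call", destination "home") from a free-form sentence such as "could you call home please", which the table lookup above cannot do.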
In the healthcare sector, speech recognition can be implemented in the front end or the back end of the medical documentation process. Front-end speech recognition is where the provider dictates into a speech recognition engine, the recognized words are displayed as they are spoken, and the dictator is responsible for editing and signing off on the document. Back-end or deferred speech recognition is where the provider dictates into a digital dictation system, the voice is routed through a speech recognition machine, and the recognized draft document is routed along with the original voice file to an editor, where the draft is edited and the report finalized. Deferred speech recognition is currently widely used in the industry.
One of the major issues surrounding the use of speech recognition in healthcare is that the American Recovery and Reinvestment Act of 2009 (ARRA) provides substantial financial benefits to physicians who use an EMR according to “Meaningful Use” standards. These standards require that a significant amount of data be maintained by the EMR (now more commonly called an Electronic Health Record or EHR). The use of speech recognition is more naturally suited to generating narrative text, as part of a radiology/pathology interpretation, progress note, or discharge summary: the ergonomic benefits of using speech recognition to enter structured discrete data (such as numeric values or codes from a list or a controlled vocabulary) are relatively minimal for people who are sighted and can operate a keyboard and mouse.
A more significant issue is that most EHRs have not been expressly designed to take advantage of voice recognition capabilities. Much of the clinician’s interaction with the EHR involves navigation through the user interface using menus and tab/button clicks and is heavily dependent on the keyboard and mouse: voice-based navigation provides only modest ergonomic benefits. By contrast, many highly customized radiology and pathology dictation systems use voice “macros”, where the use of certain phrases – for example, “normal report” – automatically fills in a large number of default values and/or generates boilerplate that varies with the type of examination – for example, a chest X-ray versus a gastrointestinal contrast series in a radiology system.
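The macro mechanism just described amounts to a lookup from a trigger phrase and exam type to boilerplate text. The sketch below is illustrative only; the template wording and key names are invented, not taken from any real dictation product.

```python
# Hedged sketch of a dictation "macro": a trigger phrase expands into a
# report template whose default text depends on the exam type. All
# phrases and template strings here are invented examples.

TEMPLATES = {
    ("normal report", "chest x-ray"):
        "Lungs are clear. Heart size is normal. No acute findings.",
    ("normal report", "gi contrast"):
        "Contrast passes freely. No filling defects or obstruction.",
}

def expand_macro(phrase: str, exam_type: str) -> str:
    """Return boilerplate for a recognized macro phrase, else the phrase itself."""
    return TEMPLATES.get((phrase.lower(), exam_type.lower()), phrase)
```

Because one short spoken phrase replaces a paragraph of typing, this is where dictation systems deliver their largest ergonomic gains.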
Prolonged use of speech recognition software in combination with word processors has shown benefits for short-term memory in brain AVM patients who have been treated with resection. Further research needs to be conducted to determine the cognitive benefits for individuals whose AVMs have been treated using radiologic techniques.
High-performance fighter aircraft
Substantial efforts have been devoted in the last decade to the test and evaluation of speech recognition in fighter aircraft. Of particular note are the US program in speech recognition for the Advanced Fighter Technology Integration (AFTI)/F-16 aircraft (F-16 VISTA), the program in France for Mirage aircraft, and other programs in the UK dealing with a variety of aircraft platforms. In these programs, speech recognizers have been operated successfully in fighter aircraft, with applications including setting radio frequencies, commanding an autopilot system, setting steer-point coordinates and weapons-release parameters, and controlling flight display.
Working with Swedish pilots flying in the JAS-39 Gripen cockpit, Englund (2004) found that recognition deteriorated with increasing g-loads. The report also concluded that adaptation greatly improved the results in all cases, and that introducing models for breathing significantly improved recognition scores. Contrary to what might have been expected, no effects of the speakers’ broken English were found. It was evident that spontaneous speech caused problems for the recognizer, as could have been expected. A restricted vocabulary and, above all, a proper syntax could thus be expected to improve recognition accuracy substantially.
The Eurofighter Typhoon, currently in service with the UK RAF, employs a speaker-dependent system that requires each pilot to create a template. The system is not used for any safety-critical or weapons-critical tasks, such as weapon release or lowering of the undercarriage, but is used for a wide range of other cockpit functions. Voice commands are confirmed by visual and/or aural feedback. The system is seen as a major design feature in the reduction of pilot workload, and even allows the pilot to assign targets to his own aircraft with two simple voice commands, or to any of his wingmen with only five commands.
Speaker-independent systems are also being developed and tested for the F-35 Lightning II (JSF) and the Alenia Aermacchi M-346 Master lead-in fighter trainer. These systems have produced word accuracy scores in excess of 98%.
The problems of achieving high recognition accuracy under stress and noise pertain strongly to the helicopter environment as well as to the jet fighter environment. The acoustic noise problem is actually more severe in the helicopter environment, not only because of the high noise levels but also because the helicopter pilot, in general, does not wear a face mask, which would reduce the acoustic noise reaching the microphone. Substantial test and evaluation programs have been carried out over the past decade on speech recognition systems in helicopters, notably by the US Army Avionics Research and Development Activity (AVRADA) and by the Royal Aerospace Establishment (RAE) in the UK. Work in France has included speech recognition in the Puma helicopter. There has also been much useful work in Russia. Results have been encouraging, and voice applications have included control of communication radios, setting of navigation systems, and control of an automated target handover system.
As in fighter aircraft applications, the overriding issue for voice in helicopters is the impact on pilot effectiveness. Encouraging results are reported for the AVRADA tests, although these represent only a feasibility demonstration in a test environment. Much remains to be done, both in speech recognition and in speech technology overall, in order to consistently achieve performance improvements in operational settings.
Training of air traffic controllers
Training for air traffic controllers (ATC) represents an excellent application for speech recognition systems. Many ATC training systems currently require a person to act as a “pseudo-pilot”, engaging in a voice dialog with the trainee controller that simulates the dialog the controller would have to conduct with pilots in a real ATC situation. Speech recognition and synthesis techniques offer the potential to eliminate the need for a person to act as a pseudo-pilot, thus reducing training and support personnel. In theory, air controller tasks are also characterized by highly structured speech as the primary output of the controller, so reducing the difficulty of the speech recognition task should be possible. In practice, this is rarely the case. FAA document 7110.65 details the phrases that should be used by air traffic controllers. While this document gives fewer than 150 examples of such phrases, the number of phrases supported by one simulation vendor’s speech recognition system exceeds 500,000.
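One reason a small number of example phrases expands into an enormous supported phrase inventory is that each phrase template contains slots (callsigns, altitudes, headings) that multiply combinatorially. The sketch below is illustrative only: the two patterns are written in the spirit of ATC phraseology and are not taken from FAA document 7110.65 itself.

```python
import re

# Hedged sketch of slot-based phrase templates, in the spirit of ATC
# phraseology. These two patterns are invented examples; filling each
# slot (callsign, altitude, heading) with every legal value turns a few
# templates into a very large number of concrete phrases.

PATTERNS = [
    re.compile(r"^(?P<callsign>\w+ \d+), climb and maintain (?P<alt>\d+)$"),
    re.compile(r"^(?P<callsign>\w+ \d+), turn (left|right) heading (?P<hdg>\d{3})$"),
]

def parse(utterance: str):
    """Match an utterance against the phrase templates; return its slots or None."""
    for pattern in PATTERNS:
        match = pattern.match(utterance.lower())
        if match:
            return match.groupdict()
    return None
```

Constraining the recognizer to such a grammar is what makes the structured-speech assumption attractive, even though real vendor systems end up supporting hundreds of thousands of phrase variants.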
The US Air Force, USMC, US Army, US Navy, and FAA, as well as a number of international ATC training organizations, such as the Royal Australian Air Force and civil aviation authorities in Italy, Brazil, and Canada, currently use ATC simulators with speech recognition from a number of different vendors.