21 October, 2020

Developed by the automotive start-up Hi-Auto, the plugin converts any action in an interface into a voice command. The solution is aided by a camera that "reads" the speaker's lips.

Since the outbreak of the COVID-19 crisis, numerous Israeli auto-tech companies have adapted technology originally intended for the automotive market into dedicated solutions for the special needs of the new reality. The Tel Aviv-based auto-tech company Hi-Auto has done the same, launching a solution that turns touch-based user interfaces, such as those found in self-service machines in clinics or fast food chains, into “sterile” touch-free interfaces operated by voice commands.

The new solution is based on the audio-video technology the company developed for in-vehicle voice control systems. Hi-Auto’s technology includes a microphone, a camera focused on the driver’s lips, and deep-learning software installed on the vehicle’s computer that removes background noise. The add-on is based on the same technology and can be installed on any existing touch interface. Roy Baharav, one of the founders of Hi-Auto and the director of the new venture, told TechTime that as soon as the COVID-19 crisis broke out, the company identified the new need and opportunity.

“We realized very quickly that everything related to voice command would gain momentum, and that voice interfaces would transform from an application that’s nice to have into something imperative [in light of the current reality]. We built a solution that is separate from what we do in the automotive world, that is simpler, and that allows people to use voice-based interfaces in a reliable and user-friendly way, without having to make significant adjustments from business to business.”

The challenge that voice-command interfaces face inside a vehicle, namely identifying the relevant speaker (the user) and separating that person's voice commands from background noise, also exists in public spaces. Baharav: “Even in a restaurant, airport, or train station there is considerable environmental noise that interferes with speech comprehension.”

A camera that reads lips

The plugin solves the problem in two ways. First, it converts every action in the interactive interface into a defined voice command and shows the user on the screen what the relevant command is, for example, to order a particular dish in a restaurant or to issue a certain document at a governmental self-service machine. This makes the interface accessible to the user and narrows the possible “conversation” scenarios between the user and the machine, making it easier for the voice processor to identify the command accurately. Second, the recognition of voice commands is aided by a camera, which reads the speaker’s lip movements and helps to identify commands.
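The first of these ideas, limiting recognition to the handful of commands shown on the current screen, can be illustrated with a short Python sketch. This is not Hi-Auto's implementation: the command phrases, action names, and the `match_command` helper are all hypothetical, and simple string similarity from the standard library stands in for the company's audio-visual recognition, which the article does not describe in detail.

```python
import difflib

# Hypothetical screen definition: every on-screen action is paired with
# exactly one spoken phrase, which is also displayed to the user.
SCREEN_COMMANDS = {
    "order burger": "add_burger_to_cart",
    "order salad": "add_salad_to_cart",
    "pay now": "open_payment",
    "go back": "previous_screen",
}


def normalize(text):
    """Lower-case the text and collapse repeated whitespace."""
    return " ".join(text.lower().split())


def match_command(transcript, commands, cutoff=0.75):
    """Return the action for the closest allowed phrase, or None.

    Only phrases defined for the current screen are considered, so a
    slightly misheard transcript can still resolve to a valid action,
    while unrelated speech is rejected.
    """
    phrase = normalize(transcript)
    candidates = difflib.get_close_matches(
        phrase, list(commands), n=1, cutoff=cutoff
    )
    return commands[candidates[0]] if candidates else None
```

Because the matcher only ever chooses among a few known phrases, a noisy transcript such as "order burgr" can still map to the intended action, whereas speech unrelated to the screen returns `None` instead of triggering anything.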

At this stage, the company has adapted the software to English, Japanese and Hebrew, and has begun pilots with several retail chains in the US and Europe. The market for contactless control interfaces, such as those based on body gestures, is still in its infancy. It is most relevant in the automotive sector, where there is a safety need for contactless control of information and entertainment systems, while in consumer electronics and the smart home the technology has not yet been adopted. The current need to maintain hygiene in public spaces may give the market a significant boost. The research company Research & Markets estimates that the market for contactless operating interfaces will grow in the coming years at an annual rate of 17.6% and reach approximately $15.3 billion in 2025.

