Spoken Language Interfaces In Industrial Equipment

Key Takeaways

  • There is significant interest in deploying spoken language interfaces (SLIs) in both traditional manufacturing and newer, Industry 4.0-aware processes.
  • Spoken language dialogue appeals to the manufacturing industry due to its hands-free, flexible, and natural features, which are expected to allow integration between the manufacturing workforce and growing digital information systems.
  • BSH recently integrated Fluent.ai's voice recognition solution at one of its German factories to reduce transition time along its assembly lines.

Introduction

This report provides insights into the integration of spoken language interfaces into industrial equipment as part of the broader Industry 4.0 trend.

Mixed Reality

  • The industrial metaverse is expected to take shape with the further integration of technologies, leveraging rich, immersive experiences that have applications in several sectors, including remote maintenance and onsite training.
  • Mixed reality is emerging as an innovative means of collaboration and communication that is expected to phase out keyboards and flat displays. In such environments, spoken language interfaces (SLIs) provide meaningful interactions and real-time data to operators immersed in virtual experiences.
  • One study proposed SLIs for aircraft maintenance metaverses, integrating them into smart glasses to provide education and training for trainee engineers.
  • With this setup, engineers can rely on deep learning speech interaction modules to control virtual assets and workflows using speech commands, leaving both hands free while operating.
  • Based on the study outcome, the speech interaction module significantly improved aircraft maintenance training and education, enabling the trainees to have efficient and intuitive control over their operations.

Robotics Assistance

  • There is significant interest in deploying SLIs in both traditional manufacturing and newer, Industry 4.0-aware processes.
  • While investment in robotic processes is still quite futuristic for traditional manufacturing, "spoken language access to databases on the shop floor would be valuable right now."
  • Spoken language interactions are becoming more relevant as robotics advances, since speech is potentially the most efficient means for robots to communicate in real time.
  • SLIs would find more compelling applications in more digitally advanced environments. Spoken language dialogue appeals to the manufacturing industry due to its hands-free, flexible, and natural features, which are expected to allow integration between the manufacturing workforce and growing digital information systems.
  • The potential applications for this technology include smart decision support for complex assembly tasks, which can be combined with digital work instructions available visually. It could also support data capture for maintenance, repair, and overhaul (MRO).
  • However, spoken language interactions in robotics extend significantly beyond manufacturing processes and into education, healthcare, field assistance, and other areas.

Integrations

BSH

  • BSH recently integrated Fluent.ai's voice recognition solution at one of its German factories to reduce transition time along its assembly lines.
  • The technology lets workers speak into a headset to carry out production functions. After saying a wake word, workers can issue commands that automatically trigger assembly line movement.
  • The technology is 100% hands-free and is expected to produce 75%-100% efficiency gains. It also offers multilingual support, allowing BSH to roll it out easily in any of the 50+ countries where it operates.
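
The wake-word pattern described above can be sketched in a few lines. This is a minimal illustration that assumes a speech recognizer has already turned audio into a text transcript; it is not Fluent.ai's API or BSH's implementation. The wake word "line", the command names, and the functions `parse_command` and `dispatch` are all hypothetical.

```python
from typing import Callable, Dict, Optional

# Hypothetical wake word and command table; the real system's vocabulary is not
# described in the source.
WAKE_WORD = "line"

ACTIONS: Dict[str, Callable[[], str]] = {
    "advance": lambda: "conveyor advanced",
    "stop": lambda: "conveyor stopped",
}


def parse_command(transcript: str, wake_word: str = WAKE_WORD) -> Optional[str]:
    """Return the text after the wake word, or None if the wake word is missing."""
    words = transcript.lower().split()
    if len(words) < 2 or words[0] != wake_word:
        return None
    return " ".join(words[1:])


def dispatch(transcript: str) -> Optional[str]:
    """Run the action for a wake-word-prefixed command; ignore everything else."""
    command = parse_command(transcript)
    if command is None:
        return None
    action = ACTIONS.get(command)
    return action() if action is not None else None


print(dispatch("line advance"))  # command preceded by the wake word
print(dispatch("advance"))       # no wake word, so the utterance is ignored
```

Requiring the wake word first means ordinary shop-floor conversation that happens to contain "stop" or "advance" does not move the line.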

Travel Industry

  • Spoken dialogue systems and technologies are useful in the travel industry for tourism booking and improving the customer lifecycle experience. Voice-user interfaces (VUIs) provide an innovative approach to business in this industry.
  • Travel and tourism businesses can use VUIs to provide information quickly and conveniently. Since many travelers (81%) already use voice assistants on their smartphones, travel businesses can employ voice technology to provide more engaging and easily accessible information.
  • They can also provide more personalized experiences by leveraging high-value interactions and improved functionality. VUIs can help travel agents overcome language barriers, allowing them to interact with a diverse group of customers.
  • The multilingual capability of voice recognition systems is arguably the most significant opportunity for VUIs in the travel industry, leading to "location-independent human-computer interaction" improvements.

Integrations

Heathrow Airport

  • Heathrow Airport leveraged Amazon's Alexa voice assistant to create its own Alexa skill, which consumers can download to receive flight information in near real-time by simply asking aloud.
  • The Heathrow skill responds to passenger flight numbers, providing travelers with the most recent live data for their flight. The integration relays data to consumers without the manual effort of, for example, an online search.
  • As the skill grows in popularity, Heathrow is expected to expand it to include secondary services and allow consumers to book their flights using voice.

Marriott

  • Marriott partnered with Amazon's Alexa to improve guest experience, ease translation, and save time and costs.
  • The company has developed a concept hotel room that employs voice controls for virtually every command, including air conditioning and TV control.
  • Marriott is leveraging Internet of Things (IoT) technology and VUIs to gather data on guest habits, which would improve guest personalization.

SLI Integration Factors and Challenges

Speed and Efficiency

  • Because SLI integration lets workers and operators carry out processes hands-free, the technology can significantly improve productivity by increasing speed and efficiency.
  • According to a Stanford study, the average person can speak about 200 words per minute (WPM) but type only about 50 WPM. On raw input speed alone, SLI integration could therefore make human-computer interaction up to 4x faster.
  • For example, speed is critical to robot interactions with people in real time, especially in fast-paced operations where slower performance could cause failure.
  • Efficient human-to-computer interactions require speed before and during command execution, and SLIs will allow more cooperative action and real-time coordination between operators and their machines.
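
The 4x figure above follows directly from the two rates quoted in the text. The short calculation below is only a worked illustration: the 200 and 50 WPM values come from the report, while the 10-word command length and the function names are assumptions chosen for the example. It ignores recognition latency and error correction, so it shows an upper bound on the gain rather than a measured result.

```python
SPEAKING_WPM = 200  # speaking rate quoted in the text
TYPING_WPM = 50     # typing rate quoted in the text


def input_seconds(word_count: int, words_per_minute: int) -> float:
    """Seconds needed to enter `word_count` words at a given rate."""
    return word_count * 60 / words_per_minute


def speedup(fast_wpm: int, slow_wpm: int) -> float:
    """Ratio of the faster input rate to the slower one."""
    return fast_wpm / slow_wpm


command_length = 10  # hypothetical command length in words
print(input_seconds(command_length, SPEAKING_WPM))  # seconds to speak the command
print(input_seconds(command_length, TYPING_WPM))    # seconds to type the command
print(speedup(SPEAKING_WPM, TYPING_WPM))            # ratio of the two rates
```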

Intuitiveness and Convenience

  • Natural communication promoting convenience and frictionless interaction constitutes one of SLIs' primary advantages.
  • Compared to graphical user interfaces (GUIs), VUIs provide more convenience and ease of use since they require less cognitive effort. VUI developers might no longer need to provide usage instructions, since users can simply ask the voice assistant for help.
  • VUIs can also accept several expressions for the same command, such as "display," "view," "visualize," and "show," which is another advantage over GUIs.
  • "Voice is today's vehicle for human-machine interactions. This democratizes the use of technology by letting users interact with their computers as if they were speaking with a friend."
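
One simple way to accept several expressions for the same command is to map each synonym to a single canonical verb before acting on it. The sketch below assumes the utterance has already been transcribed to text; the `SYNONYMS` table and the `canonical_verb` function are hypothetical names, and production systems typically rely on trained intent models rather than a fixed lookup.

```python
from typing import Optional

# The four verbs listed in the text, all mapped to one canonical command.
SYNONYMS = {
    "display": "show",
    "view": "show",
    "visualize": "show",
    "show": "show",
}


def canonical_verb(utterance: str) -> Optional[str]:
    """Return the canonical command for the first known verb, or None."""
    for word in utterance.lower().split():
        verb = SYNONYMS.get(word.strip(".,!?"))
        if verb is not None:
            return verb
    return None


print(canonical_verb("Visualize the torque log"))
print(canonical_verb("Please view the work order."))
print(canonical_verb("Delete the record"))
```

All three phrasings of the request resolve to the same handler, while an unrelated verb is left unmatched so the system can ask the user to rephrase.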

Data Privacy

  • Data privacy is one challenge to SLI integration and adoption, as integrating the technology often involves third-party data sharing.
  • Companies seeking to leverage SLI solutions from third-party developers must entrust them with extensive information while trusting them to keep it secure. A data breach at the third party would in turn become a data breach for the company.
  • In other sectors, such as education, some teachers approve of the potential benefits of VUI integrations. However, many school districts refrain from implementing the technology due to concerns about compliance with the Children's Online Privacy Protection Act (COPPA).
  • To address this difficulty, companies might consider establishing robust infrastructure with end-to-end data encryption to ensure that personal and enterprise information is well protected.

Integration Requirements

  • Voice technology adoption and integration also have significant process and product requirements that might be difficult to meet.
  • Two essential integration requirements that pose a challenge to SLI adoption are "the complexity of voice recognition technology" and the significant computational power that automated speech recognition demands.
  • Not all companies have the time and resources to carry out this integration, which can also be cost-intensive. Most companies would require a separate team with the operational skills to simplify the integration process.
  • That team would need to be well-versed in speech recognition, language, computer vision, and machine learning technologies to implement SLIs efficiently in the company.

Research Strategy

We drew on several scientific reports and studies from publications such as MDPI and the Engineering and Physical Sciences Research Council (EPSRC) to provide the requested information. A few of these sources fall outside Wonder's typical two-year standard for sources. However, since they provide information relevant to a topic that is generally focused on future developments and advancements, we assumed the findings to be valid.
