NXP Semiconductors announced the VIT Speech to Intent engine, a natural language understanding engine that leverages edge computing to enable local voice control. Part of NXP’s Voice Intelligent Technology (VIT) software suite, the VIT Speech to Intent engine rivals the performance of cloud-based systems without requiring a cloud connection, supporting improved user privacy in consumer electronics and automotive applications.
Whether in the smart home, smart factory or smart city, voice is an essential user interface for device control. However, many of these smart devices require precise phrasing to execute the desired action or cloud connections to translate the user’s speech into device actions. NXP’s VIT Speech to Intent allows people to speak naturally to all sorts of machines, across all application fields, rather than memorize precise commands or phrases to operate the devices.
“Natural language understanding delivered by VIT Speech to Intent allows devices to understand users’ intent, without requiring exact phrasing or cloud connectivity. This opens new possibilities for innovation, particularly in the smart home and places where users’ hands may not be free such as hospitals or factory floors. From advanced AI-driven devices to context-aware voice commands using VIT Speech to Intent, NXP helps simplify the development of voice-first devices with software optimized for its MCUs and MPUs,” the company announced.
“As we move towards smart devices that can better anticipate and automate based on our needs, particularly in the smart home, voice has emerged as one of the preferred ways to communicate our preferences to devices. VIT Speech to Intent allows people to interact with smart devices seamlessly, without needing to rely on specific keywords, delivering convenience and ease of use, reducing complexity and alleviating user frustration. This further enables the transition from a smart home to an autonomous one,” says Rafael Sotomayor, Executive Vice President and General Manager, Secure Connected Edge, NXP.
With a small memory footprint and modest computational requirements, the VIT Speech to Intent engine is available for NXP devices including i.MX RT Crossover MCUs, RW61x MCUs, and the i.MX 8M Mini, i.MX 8M Plus, and i.MX 9x applications processors. Because the engine runs locally, it eliminates the need for a cloud connection, and manufacturers benefit from improved user privacy, lower latency, reduced power consumption and reduced costs. The engine can support natural speech interfaces in a wide variety of applications, including smart watches, smart HVAC, and more.
At launch, VIT Speech to Intent supports English language interactions, with support for Mandarin coming later in 2023. Additional support is planned for Spanish, German, Korean, French and Japanese in 2024.
The free VIT Wake Word and Voice Command engines, available through the MCUXpresso SDK and an online model tool, allow developers to start developing with NXP’s Voice Intelligent Technology portfolio immediately. Developers interested in evaluating VIT Speech to Intent should visit NXP.com/SpeechToIntent.
VIT Speech to Intent is part of the Voice Intelligent Technology (VIT) software suite, a comprehensive local voice control software package. Based on advanced deep learning, VIT comprises an always-on Wake Word engine, a Voice Command engine, and a Speech to Intent engine. Developers can get started with the free, ready-to-use Wake Word and Voice Command engines available through the MCUXpresso SDK and supported by an online model creation tool. In addition, developers can upgrade to the newly released Speech to Intent engine.