Acoustic Simulation Technology Transforms Voice AI and Robotics

While there is no shortage of development in AI applications involving large language models and, more recently, agentic AI, one crucial element is missing as applications such as robotics increasingly rely on AI: sound. Current methods for developing AI-based sound rely heavily on trial-and-error recognition.

A five-year-old company called Treble, headquartered in Iceland, hopes to change that. The company has developed a program that uses physical simulation to recognize sound scenarios and help engineers create appropriate AI-generated sound for robots, wearables, and other devices. Treble’s platform is based on patented algorithms in acoustic simulation and spatial audio. According to the company, these algorithms deliver measurement-grade acoustic realism much faster than existing simulation methods.

“Sound has been overlooked,” said Finnur Pind, CEO of Treble, in a recent interview with Design News. “AI needs training to teach a robot and to recognize sound and speech. AI needs to experience a lot of sound scenarios.”

Pind explained to Design News that current methods of acoustic simulation rely on geometric acoustics, which involves a high-frequency approximation of sound, a method Pind says is not always accurate. By contrast, Treble’s acoustic simulation method relies on numerical, wave-based acoustic simulation, which directly solves the wave equation and so captures wave phenomena such as diffraction, phase, and scattering.
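To give a sense of what solving the wave equation directly means, here is a minimal textbook sketch in Python. It is not Treble’s algorithm, which the article does not describe in detail. It uses a basic leapfrog finite-difference scheme in one dimension, and the function name, grid size, pulse shape, and boundary treatment are all assumptions chosen for illustration.

```python
import math


def simulate_wave_1d(n_points=201, n_steps=80, courant=0.9):
    """Leapfrog finite-difference solution of u_tt = c^2 * u_xx on a 1D grid.

    Illustrative only: `courant` is c * dt / dx and must not exceed 1
    for this scheme to remain stable. Both ends are held at zero.
    """
    if not 0 < courant <= 1:
        raise ValueError("courant number must be in (0, 1]")
    c2 = courant ** 2
    center = n_points // 2

    # Initial condition: a narrow Gaussian pulse with zero initial velocity.
    u_prev = [math.exp(-((i - center) / 5.0) ** 2) for i in range(n_points)]
    u_prev[0] = u_prev[-1] = 0.0

    # First time step from a Taylor expansion (zero initial velocity).
    u = u_prev[:]
    for i in range(1, n_points - 1):
        lap = u_prev[i + 1] - 2 * u_prev[i] + u_prev[i - 1]
        u[i] = u_prev[i] + 0.5 * c2 * lap

    # Remaining steps: the standard three-level leapfrog update.
    for _ in range(n_steps - 1):
        u_next = [0.0] * n_points
        for i in range(1, n_points - 1):
            lap = u[i + 1] - 2 * u[i] + u[i - 1]
            u_next[i] = 2 * u[i] - u_prev[i] + c2 * lap
        u_prev, u = u, u_next
    return u


if __name__ == "__main__":
    field = simulate_wave_1d()
    peak_index = max(range(len(field)), key=lambda i: field[i])
    print(f"peak amplitude {max(field):.3f} at grid index {peak_index}")
```

The initial pulse splits into two halves that travel in opposite directions, which is the behavior the wave equation predicts. Production wave-based solvers work in three dimensions with realistic geometry and boundary materials, which is where the computational cost comes from.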

Audio AI performance is shaped by acoustic factors such as room acoustics and reverberation, source distance and positioning, competing speakers and background noise, and microphone characteristics and device placement. More accurate acoustic simulation enables multichannel speech enhancement, reducing word error rate. Treble’s hybrid algorithm outperforms traditional software by accurately modeling single reflections, diffraction, and coupled-room dynamics, which makes accurate simulation possible in complex scenarios and captures low-frequency effects that other methods miss.
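Word error rate, the metric mentioned above, is conventionally computed as the word-level edit distance between a reference transcript and the recognizer’s output, divided by the number of reference words. The Python sketch below shows that standard calculation. The function name and the sample sentences are made up for illustration and are not taken from Treble’s tooling.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from output
            insertion = curr[j - 1] + 1   # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcript pair: one deletion and one substitution.
    print(word_error_rate("turn on the kitchen light",
                          "turn on kitchen lights"))
```

In this made-up example the recognizer drops "the" and mishears "light" as "lights", so two of the five reference words are wrong. Real evaluations usually normalize case and punctuation before scoring, a step this sketch omits.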

Treble also enables realistic simulation of own-voice propagation and multi-microphone device acoustics, generating high-fidelity training and testing data for voice AI, headsets, earbuds, and advanced mic array systems. According to the company, the simulation platform can be useful for voice AI and conversational systems, generating synthetic audio data, virtual prototyping of audio hardware, robotics and embedded AI, automotive acoustics and infotainment, and spatial audio and immersive data. Pind expects the simulation platform to reduce development time for these and other applications.

To account for the wide variability in voice, Treble developed a leaderboard that provides a comprehensive, easy-to-use, community-driven benchmark for evaluating automatic speech recognition performance under conditions reflective of real-world deployment. The evaluation conditions are based on actual end-user far-field scenarios, and benchmark results are divided into scenarios (e.g., easy, medium, difficult) to give more insight.

According to Pind, Treble will refine future versions of its AI sound platform to make the tool easier to use for those without acoustic knowledge.