
Univ. Lab Creates Open-Source Intelligent Assistant

Developers of the Sirius open-source intelligent personal assistant software, l-r, professors Lingjia Tang and Jason Mars, with graduate students Johann Hauswald and Yiping Kang (Joseph Xu, University of Michigan)

11 March 2015. A computer science lab at the University of Michigan is developing an intelligent personal assistant program that responds to voice commands like Apple’s Siri and Google Now, but is freely available for use or adoption in other software. The team from Michigan’s Clarity Lab, led by professors Jason Mars and Lingjia Tang, will give a half-day tutorial on Sirius, as the program is called, on Saturday, 14 March, and later present a paper at the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) in Istanbul, Turkey.

The Michigan team sought to build intelligent personal assistant software with the voice recognition and natural language processing currently found on mobile devices, but to offer it in an open-source package, like the Linux operating system, to encourage wider adoption and integration into more kinds of devices, such as the emerging array of wearable systems coming on the market. The developers also believe offering Sirius under an open-source license will promote the technology in a greater variety of industries and applications, such as health care and auto maintenance.

In addition, Sirius offers more advanced functions, including image matching and question-and-answer capabilities, that its designers say are not fully operational on commercial systems. “What we’ve done with Sirius is pushed the limits of the traditional intelligent personal assistant,” says doctoral student Johann Hauswald in a university statement. “Not only can you interact with your voice but you can also ask questions about what you’re seeing, which is a new way to interact with this type of device.”

Mars, Tang, and colleagues combine elements of other open-source software with functions similar to commercial systems. Voice recognition borrows features from Sphinx developed at Carnegie Mellon University, RASR written by RWTH Aachen University in Germany, and Kaldi from Microsoft Research. Image matching is derived from a computer-vision algorithm known as SURF, developed at Swiss Federal Institute of Technology in Zurich and commercialized by a company recently acquired by Qualcomm. Question-and-answer functions come from OpenEphyra, an open-source framework of natural language algorithms for answering questions.
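To illustrate how such components fit together, here is a minimal sketch, not the actual Sirius code, of an assistant that routes the three input types described above to separate back ends. Every function and name is a hypothetical placeholder standing in for the open-source components the article names (Sphinx/RASR/Kaldi for speech, SURF for image matching, OpenEphyra for question answering).

```python
# Hedged sketch of a Sirius-style dispatch pipeline.
# All names are illustrative placeholders, not real Sirius APIs.

def recognize_speech(audio: bytes) -> str:
    """Placeholder for an ASR engine such as Sphinx, RASR, or Kaldi."""
    return "what is the capital of france"  # canned result for illustration

def match_image(image: bytes) -> str:
    """Placeholder for SURF-style feature matching against an image database."""
    return "eiffel tower"  # canned result for illustration

def answer_question(text: str) -> str:
    """Placeholder for an OpenEphyra-style question-answering back end."""
    answers = {"what is the capital of france": "Paris"}
    return answers.get(text, "unknown")

def handle_request(audio: bytes = None, image: bytes = None) -> str:
    """Route a request: voice input flows through ASR then QA;
    image input flows through feature matching."""
    if audio is not None:
        query = recognize_speech(audio)
        return answer_question(query)
    if image is not None:
        return match_image(image)
    raise ValueError("request must include audio or an image")
```

The point of the sketch is the modularity the Clarity Lab describes: each stage is an interchangeable open-source component behind a simple interface.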

Sirius relies on computing power in the cloud for its more complex processes. Mobile devices may translate voice to text, but it is the cloud-based processes that interpret the text questions, then find and return the answers. A demonstration version of Sirius answers questions from the entire Wikipedia knowledge base, but the system’s designers say Wikipedia can be swapped out for other information sources, such as those used in industries and professional practices, stored in the cloud.
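The division of labor described above can be sketched as follows. This is an assumption-laden illustration, not the Sirius implementation: the device produces text, and a cloud service interprets it against a swappable knowledge source (Wikipedia in the demo, or a domain-specific corpus).

```python
# Hedged sketch of the client/cloud split described in the article.
# All names are illustrative; the knowledge base is a toy stand-in.

KNOWLEDGE_BASE = {  # stands in for Wikipedia or a domain-specific corpus
    "who wrote hamlet": "William Shakespeare",
}

def device_side(audio: bytes) -> str:
    """On-device step: convert speech to text (placeholder ASR)."""
    return "who wrote hamlet"  # canned transcription for illustration

def cloud_side(query: str, kb: dict = KNOWLEDGE_BASE) -> str:
    """Cloud step: interpret the text query and look up an answer."""
    return kb.get(query.lower().strip(), "no answer found")

# End-to-end: device transcribes, cloud answers.
answer = cloud_side(device_side(b"<audio bytes>"))
```

Swapping knowledge sources, as the designers suggest for fields like health care, amounts to replacing the dictionary behind `cloud_side` with a different corpus and retrieval method.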

The researchers say proliferation of intelligent personal assistants and wearable devices using these techniques will likely require a more powerful cloud architecture. “We have to think of new ways to redesign our cloud platforms to support this type of workload,” notes Mars. The team estimates voice recognition processes can be more than 100 times more computationally intensive than text searches, and require a data-center infrastructure 165 times larger than what’s offered today.

“Some people ask whether speech or visual-driven computer interaction is just hype or the next big thing,” says Tang, “and I truly believe it’s the natural trend.”

The Michigan team says more about Sirius in the following video.

*     *     *
