Virtual Personal Assistant (Mockup).
Voice Input: "Show me this one!"
When humans communicate face to face, they employ a variety of channels to convey meaning. Speech is not necessarily the primary mode: it is supplemented or replaced by facial expressions, body posture, and gestures (deictic, iconic). Can we use these channels to communicate with machines? How can the signals from the different channels that a human computer user may employ be integrated into a single meaning representation, i.e. a computer command? Can a computer automatically and adaptively display content in a coherent and concise fashion? These questions are part of the research work done in FASiL. The focus of this two-year project is to produce a conversational language engine, demonstrated in three languages: English, Swedish and Portuguese. Important objectives for the EU are to ensure that the end system is useful in terms of functional requirements and inclusive of all citizens, including hard-of-hearing and visually impaired people.
Generated multimodal output,
Voice Output: "Send it now?"
UI on the Fly is a technique that allows a computer to automatically generate multimodal user interfaces, in particular for small computers such as cell phones or iPAQs. We enable these devices to engage in natural language conversation, using the touch screen together with voice input and output at the same time. The output is tailored to the particular usage situation (in a restaurant, in the car, at home) as well as to the device and the preferences of the user. The central system can thus remain blissfully agnostic as you switch from using a phone, to a PDA, to a computer, and back.
Technically, we formulate a hybrid natural language generation approach as a constraint optimization problem with hard and soft constraints. Multimodal Functional Unification Grammar (MUG) is a formalism based on the unification of attribute-value matrices. It enforces cross-modal coherence in the output. A graphical workbench allows us to maintain and debug grammars. The generation system has been evaluated positively with users, who judged it to be more efficient and showed a trend toward better performance on a recall task.
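To make the two ideas above more concrete, here is a minimal sketch in Python. It is not the MUG implementation: attribute-value matrices are approximated as nested dictionaries without structure sharing, and every name in it (`unify`, `choose_output`, the `context`, `voice` and `display` attributes, and the cost function) is a hypothetical chosen for illustration. Unification failure plays the role of a hard constraint, and a numeric cost plays the role of the soft constraints.

```python
from typing import Any, Callable, Optional

AVM = dict[str, Any]  # simplified attribute-value matrix: nested dicts, atomic leaves


class UnificationFailure(Exception):
    """Raised when two attribute-value matrices carry conflicting values."""


def unify(a: Any, b: Any) -> Any:
    """Unify two values; nested dicts are merged recursively, atoms must be equal."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)  # copy so the inputs are left untouched
        for key, b_val in b.items():
            result[key] = unify(result[key], b_val) if key in result else b_val
        return result
    if a == b:
        return a
    raise UnificationFailure(f"{a!r} conflicts with {b!r}")


def choose_output(situation: AVM, candidates: list[AVM],
                  cost: Callable[[AVM], float]) -> Optional[AVM]:
    """Pick the cheapest candidate that unifies with the situation.

    Hard constraint: the candidate must unify with the situation.
    Soft constraint: among the survivors, a lower cost is preferred.
    """
    best: Optional[tuple[float, AVM]] = None
    for candidate in candidates:
        try:
            realized = unify(situation, candidate)
        except UnificationFailure:
            continue  # hard constraint violated, discard this candidate
        score = cost(realized)
        if best is None or score < best[0]:
            best = (score, realized)
    return None if best is None else best[1]


# Hypothetical usage situation: the user is driving.
situation = {"context": "car", "device": {"input": "touch"}}

# Hypothetical output variants for the same confirmation question.
candidates = [
    {"context": "restaurant", "voice": "", "display": "Send it now? [Yes] [No]"},
    {"voice": "Send it now?", "display": "[Yes] [No]"},
    {"context": "car", "voice": "Send it now?", "display": ""},
]


def reading_cost(avm: AVM) -> float:
    """Assumed soft constraint: penalize on-screen text the user has to read."""
    return float(len(avm.get("display", "")))


best = choose_output(situation, candidates, reading_cost)
```

In this toy setup the restaurant variant is ruled out because its `context` clashes with the situation, and of the two remaining variants the voice-only one wins because it puts no text on the display. A real grammar would presumably generate the candidates itself and weigh many soft constraints at once.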
Detailed information about MUG Workbench and the system, and Open Source download
(Developed at MIT Media Lab Europe, 2002-2004, with E. M. Panttaja, F. Cummins, and others.)