The conference is now over. We are very thankful to all the participants, the authors and the
Program Committee members who made this possible.
The conference day, September 2, 2008, had 11 participants from the following institutions and companies (from left to right in the image above):
Aidan Kehoe, University College Cork, Ireland
Antonio M. Peinado, University of Granada, Spain
Jose A. Gonzalez, University of Granada, Spain
Nitendra Rajput, IBM Research, India
Nikolaos Tsourakis, University of Geneva, Switzerland
Arjan Geven, Center for Usability Research and Engineering, Austria
Alexander I. Rudnicky, Carnegie Mellon University, USA
Ian Pitt, University College Cork, Ireland
Tomi Heimonen, Univ. of Tampere, Finland
Markku Turunen, Univ. of Tampere, Finland
Zhipeng Zhang, NTT DoCoMo, Japan
In addition to the six presentations of peer-reviewed papers, we had interesting discussions on the three key aspects of SiMPE:
1. Usability issues in SiMPE applications
2. Applications where SiMPE can provide an added advantage
3. Core speech processing issues in SiMPE.
The following are some of the key thoughts that came up during the discussions:
Usability issues in SiMPE
There is a huge difference between lab tests and field experiments.
We should differentiate usability in the three components:
Speech input
Speech output
Dialog systems
To evaluate usability as close to the real environment as possible, it is important to evaluate features in the lab and usability in the field.
There is a difference between a trained user and a novice user, so usability tests should include both types of users.
Applications for SiMPE
Look at applications where we provide help in the other modality, so that a user does not need to switch context while using the application.
Situations where people are mobile or driving are the standard use case for speech.
People choose the modality that is more efficient. Therefore, the accuracy of the ASR system will also decide the modality.
Speech processing issues in SiMPE area
We need to look at network and distributed speech recognition (NSR/DSR) systems because large-vocabulary speech recognition on devices is still not possible.
It is important to look at environment-specific / context-specific recognition techniques.
We need to look at and develop server-based TTS for improved quality, since user expectations of TTS are very high.
We also discussed how to improve the SiMPE workshop, and there were several suggestions. We played a fun game in which we identified the best comment of the day and the best suggestion to improve SiMPE. The following are the "awarded" comment and suggestion:
1. Best comment: "Use speech as a help channel to visual interfaces" - Aidan Kehoe.
2. Best suggestion: "Rotate the conference venue to attract a larger/different audience" - Alex Rudnicky.
Hopefully we will be able to continue these discussions in the future. We will be glad to hear your feedback on the workshop and suggestions on taking this area forward. Kindly post them at the SiMPE Wiki.
Here are some images:
Markku describes the model to capture user experience.
Arjan explains the multimodal reminder system.
Nikos and Markku argue about the mobile architecture.
Antonio happily points at the results of his ambient detection system.