Sydney, Australia | 29 January 2006 to 1 February 2006
Sunday, Jan 09
Tutorials and Workshops
7:00-9:30 Opening Reception

Monday, Jan 10
9:00-9:25 Welcome and Outstanding Paper Award Presentation
9:25-10:55 Invited Talk: Justine Cassell
11:25-12:40 Papers: Affective Computing
2:10-3:50 Papers: Multimodal Interaction
4:20-6:00 Papers: Personal Assistants

Tuesday, Jan 11
9:00-10:30 Invited Talk: Stuart Card
11:00-12:40 Papers: Visualization and Presentation
2:10-3:25 Papers: Natural Language and Gestural Input
4:00-5:30 Panel Discussion
6:30-10:00 Poster / Demo Session and Banquet

Wednesday, Jan 12
9:00-10:30 Invited Talk: Barry Smyth
11:00-12:40 Papers: Recommendation and Instruction
2:10-3:50 Papers: Knowledge Acquisition and Knowledge-Based Design
4:20-5:35 Papers: Smart Environments and Ubiquitous Computing
5:35-5:50 Farewell
 

 

Program Details:

Tutorials and Workshops
 
Workshops, Part 1
Sunday, January 9th, 8:30 - 10:00 am

Workshops, Part 2
Sunday, January 9th, 10:30 am - 12:00 pm

Tutorials, Part 1
Sunday, January 9th, 1:30 - 3:00 pm

Tutorials, Part 2
Sunday, January 9th, 3:30 - 5:00 pm

Opening Reception
Sunday, January 9th, 7:00 - 9:30 pm

  Kontiki Room, Catamaran Resort Hotel
San Diego, California

Welcome and Outstanding Paper Award Presentation
Monday, January 10th, 9:00 - 9:25 am

 

Robert St. Amant (North Carolina State University)
John Riedl (University of Minnesota)
Anthony Jameson (DFKI and International University in Germany)

Invited Talk: Justine Cassell
Monday, January 10th, 9:25 - 10:55 am
Chair: W. Lewis Johnson (University of Southern California)

 

Oral Tradition, Aboral Coordination: Building Rapport with Embodied Conversational Agents
Justine Cassell (Northwestern University)

Abstract
Harmony or rapport between people is essential for relationships as diverse as seller-buyer and teacher-learner. In this talk I describe the kinds of verbal behaviors - such as common interactional structures and narrative resonance - and non-verbal behaviors - such as attention, positivity, and coordination - that function together to establish a sense of rapport between two people in conversation. These studies are used as the basis for the implementation of virtual peers - adults, but also more recently embodied conversational virtual children who are capable of acting as friends and learning partners with real children from different ethnic traditions, collaborating to tell stories from the child's own cultural context, and aiding children in making the transition between home and school language.

About Justine Cassell
Justine Cassell is a full professor in the departments of Communication Studies and Computer Science at Northwestern University, the director of the ArticuLab research group, and the graduate director of the interdisciplinary Technology and Social Behavior Ph.D. program in the School of Communication. Before coming to Northwestern, Cassell was a tenured associate professor at the MIT Media Lab where she directed the Gesture and Narrative Language Research Group. In 2001, Cassell was awarded the Edgerton Faculty Achievement Award at MIT.

Cassell's research concentrates on building technologies that simulate, mediate, and facilitate everyday kinds of talk. These technologies, such as Embodied Conversational Agents, Story Listening Systems, and Lowest Common Denominator Online Communities, in turn allow her to study the nature of human communication with and through technology.

Papers: Affective Computing
Monday, January 10th, 11:25 am - 12:40 pm
Chair: Dina Goren-Bar (Ben-Gurion University of the Negev)

 

Experimental Evaluation of Polite Interaction Tactics for Pedagogical Agents
Ning Wang and W. Lewis Johnson (University of Southern California)
Paola Rizzo (University of Rome "La Sapienza")
Erin Shaw and Richard E. Mayer (University of California, Santa Barbara)

Abstract
Recent research shows that instructors commonly use politeness strategies to achieve affective scaffolding in educational contexts. The importance of affective factors such as self-confidence and interest that contribute to learner motivation is well recognized. In this paper, we describe the results of a Wizard-of-Oz experiment to study the effect of politeness strategies on both cognitive and motivational factors. We compare the results of two different politeness strategies, direct and polite, in assisting seventeen students in a computer-based learning task. We find that politeness can affect students' motivational state and help students learn difficult concepts. The results of the experiment provide a basis for the design of a polite pedagogical agent and its tutorial intervention strategies.

Recognising Emotions in Human and Synthetic Faces: The Role of the Upper and Lower Parts of the Face
Erica Costantini, Fabio Pianesi, and Michela Prete (ITC-irst)

Abstract
Embodied Conversational Agents that can express emotions are a popular topic. Yet, despite recent attempts, reliable methods are still lacking to assess the quality of facial displays. This paper extends and refines previous work, focusing on the role of the upper and the lower portions of the face. We analysed the recognition rates and errors from the responses of 74 subjects to the presentations of dynamic (human and synthetic) faces. The results point to the possibility of: a) addressing the issue of the naturalness of synthetic faces, and b) a greater importance of the upper part.

Extraction and Classification of Facemarks with Kernel Methods
Yuki Tanaka, Hiroya Takamura, and Manabu Okumura (Tokyo Institute of Technology)

Abstract
We propose methods for extracting facemarks (emoticons) in text and classifying them into some emotional categories. In text-based communication, facemarks have gained popularity, since they help us understand what writers imply. However, there are two problems in text-based communication using facemarks; the first is the variety of facemarks and the second is lack of good comprehension in using facemarks. These problems are more serious in the areas where 2-byte characters are used, because the 2-byte characters can generate a quite large number of different facemarks. Therefore, we are going to propose methods for extraction and classification of facemarks. Regarding the extraction of facemarks as a chunking task, we automatically annotate a tag to each character in text. In the classification of the extracted facemarks, we apply the dynamic time alignment kernel (DTAK) and the string subsequence kernel (SSK) for scoring in the k-nearest neighbor (k-NN) method and for expanding usual Support Vector Machines (SVMs) to accept sequential data such as facemarks. We empirically show that our methods work well in classification and extraction of facemarks, with appropriate settings of parameters.

Papers: Multimodal Interaction
Monday, January 10th, 2:10 - 3:50 pm
Chair: Antonio Krüger (University of Münster)

 

Two-Way Adaptation for Robust Input Interpretation in Practical Multimodal Conversation Systems
Shimei Pan, Michelle Zhou, and Keith Houck (IBM T. J. Watson Research Center)
Siwei Shen (University of Michigan)

Abstract
Multimodal conversation systems allow users to interact with computers effectively using multiple modalities, such as natural language and gesture. However, these systems have not been widely used in practical applications mainly due to their limited input understanding capability. As a result, conversation systems often fail to understand user requests and leave users frustrated. To address this issue, most existing approaches focus on improving a system's interpretation capability. Nonetheless, such improvements may still be limited, since they would never cover the entire range of input expressions. Alternatively, we present a two-way adaptation framework that allows both users and systems to dynamically adapt to each other's capability and needs during the course of interaction. Compared to existing methods, our approach offers two unique contributions. First, it improves the usability and robustness of a conversation system by helping users to dynamically learn the system's capabilities in context. Second, our approach enhances the overall interpretation capability of a conversation system by learning new user expressions on the fly. Our preliminary evaluation shows the promise of this approach.

Linguistic Theories in Efficient Multimodal Reference Resolution: An Empirical Investigation
Joyce Y. Chai, Zahar Prasov, Joseph Blaim, and Rong Jin (Michigan State University)

Abstract
Multimodal conversational interfaces provide a natural means for users to communicate with computer systems through multiple modalities such as speech, gesture, and gaze. To build effective multimodal interfaces, understanding user multimodal inputs is important. Previous linguistic and cognitive studies indicate that user language behavior does not occur randomly, but rather follows certain linguistic and cognitive principles. Therefore, this paper investigates the use of linguistic theories in multimodal interpretation. In particular, we present a greedy algorithm that incorporates Conversation Implicature and Givenness Hierarchy for efficient multimodal reference resolution. Empirical studies indicate that this algorithm significantly reduces the complexity in multimodal reference resolution compared to a previous graph-matching approach. One major advantage of this greedy algorithm is that the prior linguistic and cognitive knowledge can be used to guide the search and significantly prune the search space. Because of its simplicity and generality, this approach has the potential to improve the robustness of interpretation and provide a more practical solution to multimodal input interpretation.

Multimodal New Vocabulary Recognition Through Speech and Handwriting in a Whiteboard Scheduling Application
Edward C. Kaiser (Oregon Health & Science University)

Abstract
Our goal is to automatically recognize and enroll new vocabulary in a multimodal interface. To accomplish this our technique aims to leverage the mutually disambiguating aspects of co-referenced, co-temporal handwriting and speech. The co-referenced semantics are spatially and temporally determined by our multimodal interface for schedule chart creation. This paper motivates and describes our technique for recognizing out-of-vocabulary (OOV) terms and enrolling them dynamically in the system. We report results for the detection and segmentation of OOV words within a small multimodal test set. On the same test set we also report utterance, word and pronunciation level error rates both over individual input modes and multimodally. We show that combining information from handwriting and speech yields significantly better results than achievable by either mode alone.

Multimodal Interaction for Pedestrians: An Evaluation Study
Matthias Jöst, Jochen Häußler, Matthias Merdes, and Rainer Malaka (European Media Laboratory)

Abstract
What are the most suitable interaction paradigms for navigational and informative tasks for pedestrians? Is there an influence of social and situational context on multimodal interaction? Our study takes a closer look at a multimodal system on a handheld device that was recently developed as a prototype for mobile navigation assistance. The system allows visitors of a city to navigate, to get information on sights, and to use and manipulate map information. In an outdoor evaluation, we studied the usability of such a system on site. The study yields insight about how multimodality can enhance the usability of hand-held devices with their future services.

Papers: Personal Assistants
Monday, January 10th, 4:20 - 6:00 pm
Chair: Mathias Bauer (DFKI)

 

Automated Email Activity Management: An Unsupervised Learning Approach
(Honorable Mention for Outstanding Paper Award)
Nicholas Kushmerick (University College Dublin)
Tessa Lau (IBM T. J. Watson Research Center)

Abstract
Many structured activities are managed by email. For instance, a consumer purchasing an item from an e-commerce vendor may receive a message confirming the order, a warning of a delay, and then a shipment notification. Existing email clients do not understand this structure, forcing users to manage their activities by sifting through lists of messages. As a first step to developing email applications that provide high-level support for structured activities, we consider the problem of automatically learning an activity's structure. We formalize activities as finite-state automata, where states correspond to the status of the process, and transitions represent messages sent between participants. We propose several unsupervised machine learning algorithms in this context, and evaluate them on a collection of e-commerce email.
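
The finite-state view of an activity described in this abstract can be illustrated with a small sketch. It is not the authors' learning algorithm: the automaton below is written by hand rather than learned, and the state names and message types (`ordered`, `delay_warning`, and so on) are hypothetical labels chosen to mirror the purchase example in the abstract.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hand-written automaton for one purchase activity (an assumption for
# illustration; the paper learns such structures from email instead).
# Keys are (current state, message type); values are the next state.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("start", "order_confirmation"): "ordered",
    ("ordered", "delay_warning"): "delayed",
    ("ordered", "shipment_notification"): "shipped",
    ("delayed", "shipment_notification"): "shipped",
}


def track_activity(messages: Iterable[str], state: str = "start") -> Optional[str]:
    """Follow one transition per message.

    Returns the final state, or None if some message has no transition
    from the current state (the message sequence does not fit the activity).
    """
    for message in messages:
        key = (state, message)
        if key not in TRANSITIONS:
            return None
        state = TRANSITIONS[key]
    return state


if __name__ == "__main__":
    inbox = ["order_confirmation", "delay_warning", "shipment_notification"]
    print(track_activity(inbox))
```

With such a structure in hand, an email client could show the current status of each activity instead of a flat list of messages.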

TaskTracer: A Desktop Environment to Support Multi-Tasking Knowledge Workers
Anton Dragunov, Thomas G. Dietterich, Kevin Johnsrude, Matthew McLaughlin, Lida Li, and Jonathan L. Herlocker (Oregon State University)

Abstract
This paper reports on TaskTracer - a software system being designed to help highly multitasking knowledge workers rapidly locate, discover, and reuse past processes they used to successfully complete tasks. The system monitors users' interaction with a computer, collects detailed records of users' activities and resources accessed, associates (automatically or with users' assistance) each interaction event with a particular task, and enables users to access records of past activities and quickly restore task contexts. We present a novel Publisher-Subscriber architecture for collecting and processing users' activity data, describe several different user interfaces tried with TaskTracer, and discuss the possibility of applying machine learning techniques to recognize/predict users' tasks.

Intelligent Data Entry Assistant for XML Using Ensemble Learning
Danico Lee and Costas Tsatsoulis (University of Kansas)

Abstract
XML has emerged as the primary standard of data representation and data exchange. Although many software tools exist to assist the XML implementation process, data must be manually entered into the XML documents. Current form filling technologies are mostly for simple data entry and do not provide support for the complexity and nested structures of XML grammars. This paper presents SmartXAutofill, an intelligent data entry assistant for predicting and automating inputs for XML documents based on the contents of historical document collections in the same XML domain. SmartXAutofill incorporates an ensemble classifier, which integrates multiple internal classification algorithms into a single architecture. Each internal classifier uses approximate techniques to propose a value for an empty XML field, and, through voting, the ensemble classifier determines which value to accept. As the system operates it learns which internal classification algorithms work better for a specific XML document domain and modifies its weights (confidence) in their predictive ability. As a result, the ensemble classifier adapts itself to the specific XML domain, without the need to develop special learners for the infinite number of domains that XML users have created. We evaluated our system performance using data from eleven different XML domains. The results show that the ensemble classifier adapted itself to different XML document domains, and most of the time (for 9 out of 11 domains) produced predictive accuracies as good as or better than the best individual classifier for a domain.
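
The voting and weight-adaptation idea in this abstract can be sketched as follows. This is a minimal illustration, not SmartXAutofill itself: the classifier names, the multiplicative update rule, and the `rate` parameter are all assumptions, since the abstract only says that the system modifies its weights as it learns which internal classifiers work better.

```python
from collections import defaultdict
from typing import Dict


def weighted_vote(proposals: Dict[str, str], weights: Dict[str, float]) -> str:
    """Pick the proposed field value with the largest total classifier weight.

    `proposals` maps a (hypothetical) classifier name to the value it
    proposes for an empty field. Ties are broken alphabetically.
    """
    totals: Dict[str, float] = defaultdict(float)
    for name, value in proposals.items():
        totals[value] += weights[name]
    return max(sorted(totals), key=lambda value: totals[value])


def update_weights(proposals: Dict[str, str], weights: Dict[str, float],
                   accepted: str, rate: float = 0.1) -> Dict[str, float]:
    """Raise the weight of classifiers that proposed the accepted value and
    lower the others, then renormalize so the weights sum to 1.

    The multiplicative rule is an assumed stand-in for the paper's update.
    """
    updated = {}
    for name, weight in weights.items():
        if name not in proposals:
            updated[name] = weight
        elif proposals[name] == accepted:
            updated[name] = weight * (1 + rate)
        else:
            updated[name] = weight * (1 - rate)
    total = sum(updated.values())
    return {name: weight / total for name, weight in updated.items()}


if __name__ == "__main__":
    proposals = {"frequency": "USD", "recency": "EUR", "neighbor": "USD"}
    weights = {"frequency": 1.0, "recency": 1.0, "neighbor": 1.0}
    choice = weighted_vote(proposals, weights)
    print(choice, update_weights(proposals, weights, accepted=choice))
```

Repeating the update over many fields lets the classifiers that suit a given document domain dominate the vote.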

Active Preference Learning for Personalized Calendar Scheduling Assistance
Melinda Gervasio (SRI International)
Michael D. Moffitt, Martha E. Pollack, and Joseph M. Taylor (University of Michigan)
Tomas E. Uribe (SRI International)

Abstract
We present PLIANT, a learning system that supports adaptive assistance in an open calendaring system. PLIANT learns user preferences from the feedback that naturally occurs during interactive scheduling. It contributes a novel application of active learning in a domain where the choice of candidate schedules to present to the user must balance usefulness to the learning module with immediate benefit to the user. Our experimental results provide evidence of PLIANT's ability to learn user preferences under various conditions and reveal the tradeoffs made by the different active learning selection strategies.

Invited Talk: Stuart Card
Tuesday, January 11th, 9:00 - 10:30 am
Chair: John Riedl (University of Minnesota)

 

Attention-Reactive User Interfaces for Sensemaking
Stuart Card (Palo Alto Research Center)

Abstract
I will talk about an emerging class of user interfaces that if not exactly intelligent are at least attention-reactive. They are being developed to handle "sensemaking" tasks, in which users find, analyze, and create products or action from large collections of documents. Applications might be expected to develop in law, education, scholarship, security, and medicine. These interfaces have a focus + context visualization on the front end and a semantic contextual computing engine on the back end. Ultimately they can be expected to have mixed initiatives. These interfaces require the development of a supporting science of human information interaction that stresses interaction between the user and information and deemphasizes the platform through which this occurs.

About Stuart Card
Stuart Card is a Senior Research Fellow and the manager of the User Interface Research group at the Palo Alto Research Center. His study of input devices led to the Fitts's Law characterization of the mouse and was a major factor leading to the mouse's commercial introduction by Xerox. His group has developed theoretical characterizations of human-machine interaction, including the Model Human Processor, the GOMS theory of user interaction, information foraging theory, and statistical descriptions of Internet use. These theories have been put to use in new paradigms of human-machine interaction including the Rooms workspace manager, papertronic systems, and the Information Visualizer. The work of his group has resulted in a dozen Xerox products as well as contributing to the founding of three software companies, Inxight Software, Outride, and Content Guard. Card is a co-author of the book The Psychology of Human-Computer Interaction and a co-editor of the book, Human Performance Models for Computer-Aided Engineering. He has served on many editorial boards, government panels, and university review boards. He received his A.B. in Physics from Oberlin College and his Ph.D. in Psychology from Carnegie Mellon University, where he pursued an interdisciplinary program in psychology, artificial intelligence, and computer science. He has been an adjunct faculty member at Stanford University. His most recent book, Readings in Information Visualization was published in 1999. Card is currently concentrating on the theory and design of systems for attending to and interpreting large amounts of information (information foraging theory and sensemaking theory). Card is a Fellow of the ACM, the first recipient of the ACM CHI Lifetime Achievement Award, and the first member of the ACM CHI Academy.

Papers: Visualization and Presentation
Tuesday, January 11th, 11:00 am - 12:40 pm
Chair: Jim Blythe (University of Southern California)

 

The Centrality of Pivotal Points in the Evolution of Scientific Networks
Chaomei Chen (Drexel University)

Abstract
In this paper, we describe the development of CiteSpace as an integrated environment for identifying and tracking thematic trends in scientific literature. The goal is to simplify the process of finding not only highly cited clusters of scientific articles, but also pivotal points and trails that are likely to characterize fundamental transitions of a knowledge domain as a whole. The trails of an advancing research field are captured through a sequence of snapshots of its intellectual structure over time in the form of Pathfinder networks. These networks are subsequently merged with a localized pruning algorithm. Pivotal points in the merged network are algorithmically identified and visualized using the betweenness centrality metric. An example of finding clinical evidence associated with reducing risks of heart diseases is included to illustrate how CiteSpace could be used. The contribution of the work is its integration of various change detection algorithms and interactive visualization capabilities to simplify users' tasks.
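
To illustrate how betweenness centrality singles out a pivotal point, here is a minimal sketch for a small unweighted, undirected graph. It uses Brandes' algorithm, which is an assumption on our part: the abstract names the metric but not how CiteSpace computes it. The toy graph and its node names are hypothetical.

```python
from collections import deque
from typing import Dict, List


def betweenness(graph: Dict[str, List[str]]) -> Dict[str, float]:
    """Betweenness centrality for an unweighted, undirected graph.

    `graph` is an adjacency list in which every edge appears in both
    directions. Scores count, for each node, the shortest paths between
    other node pairs that pass through it (fractionally when a pair has
    several shortest paths).
    """
    score = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search that counts shortest paths from `source`.
        order: List[str] = []
        preds: Dict[str, List[str]] = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back to the source.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: s / 2 for v, s in score.items()}


if __name__ == "__main__":
    # Two triangles of articles joined only through the node "p".
    toy = {
        "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "p"],
        "p": ["c", "d"],
        "d": ["p", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
    }
    scores = betweenness(toy)
    print(max(scores, key=scores.get), scores)
```

In the toy graph every shortest path between the two triangles runs through `p`, so it receives the highest score even though it has only two links.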

Interfaces for Networked Media Exploration and Collaborative Annotation
Preetha Appan, Bageshree Shevade, Hari Sundaram, and David Birchfield (Arizona State University)

Abstract
In this paper, we present our efforts towards creating interfaces for networked media exploration and collaborative annotation. The problem is important since online social networks are emerging as conduits for exchange of everyday experiences. These networks do not currently provide media-rich communication environments. Our approach has two parts: collaborative annotation and a media exploration framework. The collaborative annotation takes place through a web-based interface and provides each user with personalized recommendations, based on media features and on a common sense inference toolkit. We develop three media exploration interfaces that allow for two-way interaction amongst the participants: (a) spatio-temporal evolution, (b) event cones, and (c) viewpoint-centric interaction. We also analyze the user activity to determine important people and events for each user, and we develop subtle visual interface cues for activity feedback. Preliminary user studies indicate that the system performs well and is well liked by the users.

A Graph-Matching Approach to Dynamic Media Allocation in Intelligent Multimedia Interfaces
(Outstanding Paper Award)
Michelle X. Zhou, Zhen Wen, and Vikram Aggarwal (IBM T. J. Watson Research Center)

Abstract
To aid users in exploring large and complex data sets, we are building an intelligent multimedia conversation system. Given a user request, our system dynamically creates a multimedia response that is tailored to the interaction context. In this paper, we focus on the problem of media allocation, a process that assigns one or more media, such as graphics or speech, to best convey the intended response content. Specifically, we develop a graph-matching approach to media allocation, whose goal is to find a set of data-media mappings that maximizes the satisfaction of various allocation constraints (e.g., data-media compatibility and presentation consistency constraints). Compared to existing rule-based or plan-based approaches to media allocation, our work offers three unique contributions. First, we provide an extensible computational framework that optimizes media assignments by dynamically balancing all relevant constraints. Second, we use feature-based metrics to uniformly model various allocation constraints, including those cross-content and cross-media constraints, which often require special treatment in existing approaches. Third, we further improve the quality of a response by automatically detecting and repairing undesired allocation results. We have applied our approach to two different applications and our preliminary study has shown the promise of our work.

A Location Representation for Generating Descriptive Walking Directions
Gary Look, Buddhika Kottahachchi, Robert Laddaga, and Howard Shrobe (Massachusetts Institute of Technology)

Abstract
An expressive representation for location is an important component in many applications. However, while many location-aware applications can reason about space at the level of coordinates and containment relationships, they have no way to express the semantics that define how a particular space is used. We present lair, an ontology that addresses this problem by modeling both the geographical relationships between spaces as well as the functional purpose of a given space. We describe how lair was used to create an application that produces walking directions comparable to those given by a person, and a pilot study that evaluated the quality of these directions. We also describe how lair can be used to evaluate other intelligent user interfaces.

Papers: Natural Language and Gestural Input
Tuesday, January 11th, 2:10 - 3:25 pm
Chair: Alfred Kobsa (University of California at Irvine)

 

User Interfaces with Semi-Formal Representations: A Study of Designing Argumentation Structures
Timothy Chklovski, Varun Ratnakar, and Yolanda Gil (University of Southern California)

Abstract
When designing mixed-initiative systems, full formalization of all potentially relevant knowledge may not be cost-effective or practical. This paper motivates the need for semi-formal representations that combine machine-processable structures with free text statements, and discusses the need to design them in a way that makes the free text more amenable to automated structuring and processing. Our work is done in the context of argumentation systems, and has explored a range of tradeoffs in combining informal free-text statements with formal connectors. The paper compares alternative argument representations which combine structured argument connectors with free text. We discuss merits of the systems based on a variety of analysis structures that we have collected from Web users to date.

An Agent-Based Approach to Dialogue Management in Personal Assistants
Anh Nguyen and Wayne Wobcke (University of New South Wales)

Abstract
Personal assistants need to allow the user to interact with the system in a flexible and adaptive way such as through spoken language dialogue. In this research we focus on an application in which the user can use a variety of devices to interact with a collection of personal assistants each specializing in a task domain such as email or calendar management, information seeking, etc. We propose an agent-based approach for developing the dialogue manager that acts as the central point maintaining continuous user-system interaction and coordinating the activities of the assistants. In addition, this approach enables development of multi-modal interfaces. We describe our initial implementation which contains an email management agent that the user can interact with through a spoken dialogue and an interface on PDAs. The dialogue manager was implemented by extending a BDI agent architecture.

Relaxing Stylus Typing Precision by Geometric Pattern Matching
Per-Ola Kristensson (Linköping University)
Shumin Zhai (IBM Almaden Research Center)

Abstract
Fitts' law models the inherent speed-accuracy trade-off constraint in stylus typing. Users attempting to go beyond the Fitts' law speed ceiling will tend to land the stylus outside the targeted key, resulting in erroneous words and increasing users' frustration. We propose a geometric pattern matching technique to overcome this problem. Our solution can be used either as an enhanced spell checker or as a way to enable users to escape the Fitts' law constraint in stylus typing, potentially resulting in higher text entry speeds than what is currently theoretically modeled. We view the hit points on a stylus keyboard as a high resolution geometric pattern. This pattern can be matched against patterns formed by the letter key center positions of legitimate words in a lexicon. We present the development and evaluation of an "elastic" stylus keyboard capable of correcting words even if the user misses all the intended keys, as long as the user's tapping pattern is close enough to the intended word.
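
The idea of matching a tap pattern against the key-center patterns of lexicon words can be sketched in a few lines. This is only an illustration under stated assumptions: the keyboard geometry (a unit grid with no row stagger), the mean point-to-center distance, the same-length restriction, and the `max_distance` threshold are all hypothetical simplifications, not the matching technique evaluated in the paper.

```python
import math
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]

# Hypothetical key centers: one unit per key, rows stacked without stagger.
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
KEY_CENTERS: Dict[str, Point] = {
    ch: (float(col), float(row))
    for row, keys in enumerate(ROWS)
    for col, ch in enumerate(keys)
}


def pattern_distance(taps: List[Point], word: str) -> float:
    """Mean distance between each tap and the center of the word's key."""
    total = sum(math.dist(tap, KEY_CENTERS[ch]) for tap, ch in zip(taps, word))
    return total / len(taps)


def best_match(taps: List[Point], lexicon: List[str],
               max_distance: float = 1.0) -> Optional[str]:
    """Return the lexicon word whose key pattern is closest to the taps.

    Only words with as many letters as there are taps are considered.
    Returns None when no candidate is within `max_distance`.
    """
    if not taps:
        return None
    candidates = [w for w in lexicon if len(w) == len(taps)]
    if not candidates:
        return None
    best = min(candidates, key=lambda w: pattern_distance(taps, w))
    return best if pattern_distance(taps, best) <= max_distance else None


if __name__ == "__main__":
    # Taps aimed at "the" but shifted 0.6 key widths to the right, so each
    # tap lands nearer to the neighboring key (y, j, r) than to the target.
    taps = [(4.6, 0.0), (5.6, 1.0), (2.6, 0.0)]
    print(best_match(taps, ["the", "tie", "toe", "and"]))
```

Because the whole pattern is compared at once, a consistent offset that misses every key can still resolve to the intended word.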

Panel Discussion
Tuesday, January 11th, 4:00 - 5:30 pm

 

The Usability Crisis in High-Tech Home Products: An Opportunity for Intelligent User Interfaces?
Charles Rich (Mitsubishi Electric Research Laboratories)
David Keyson (Technical University of Delft)
Yogendra Jain (Personica)
Boris de Ruyter (Philips)

Abstract
Ordinary people already have great difficulty using the advanced features of digitally operated household devices such as personal video recorders and DVD burners and "white goods" such as washing machines, microwave ovens, and programmable thermostats. The problem is getting worse as more customization and programming features are continually being added. This is a challenging and practical application for intelligent user interface research, one in which new ideas are badly needed. This panel brings together industrial and academic researchers as well as business people to report on their activities and to stimulate others to join.

Poster / Demo Session and Banquet
Tuesday, January 11th, 6:30 - 10:00 pm

Chairs:
Tessa Lau (IBM T. J. Watson Research)
Daniel Billsus (FX Palo Alto Laboratory)

 

Note: During this session, system demonstrations will be given by some of the short paper presenters, as well as by presenters of some of the long papers.


Affective Computing

Person-Independent Estimation of Emotional Experiences From Facial Expressions
Timo Partala, Veikko Surakka, and Toni Vanhala (University of Tampere)

Abstract
The aim of this research was to develop methods for the automatic person-independent estimation of experienced emotions from facial expressions. Ten subjects watched series of emotionally arousing pictures and videos, while the electromyographic (EMG) activity of two facial muscles, zygomaticus major (activated in smiling) and corrugator supercilii (activated in frowning), was registered. Based on the changes in the activity of these two facial muscles, it was possible to distinguish between ratings of positive and negative emotional experiences at a rate of almost 70% for pictures and over 80% for videos. Using these methods, the computer could adapt its behavior according to the user's emotions during human-computer interaction.

Pedagogical Agent Image Matters
Amy Baylor (Florida State University)

Abstract
Pedagogical agent image is a key feature for animated interface agents. Experimental research indicates that agent interface images should be carefully designed, considering both the relevant outcomes (learning or motivational) together with student characteristics. This paper summarizes empirically-derived design guidelines for pedagogical agent image.

Emotive Alert: HMM-Based Emotion Detection In Voicemail Messages
Zeynep Inanoglu and Ron Caneel (MIT Media Laboratory)

Abstract
Voicemail has become an integral part of our personal and professional communication. The number of messages that accumulate in our voice mailboxes necessitates new ways of prioritizing them. Currently, we are forced to actively listen to all messages in order to find out which ones are important and which ones can be attended to later on. In this paper, we describe Emotive Alert, a system that can detect some of the significant emotions in a new message and notify the account owner along various affective axes, including urgency, formality, valence (happy vs. sad) and arousal (calm vs. excited). We have used a purely acoustic, HMM-based approach for identifying the emotions, which allows application of this system to all messages independent of language.


Human-Robot Interaction

Vision-Based GUI for Interactive Mobile Robots
Randeep Singh, Bhartendu Seth, and Uday B. Desai (Indian Institute of Technology)

Abstract
Interactive mobile robots are an active area of research. This paper presents a framework for designing a real-time vision-based hand-body gesture user interface for such robots. The framework works in real-world lighting conditions, with complex backgrounds, and can handle intermittent motion of the camera. The input signal is captured by using a single monocular color camera. Vision is the only feedback sensor being used. It is assumed that the gesturer is wearing clothes that are slightly different from the background. We have tested this framework on a gesture database consisting of 11 hand-body gestures and have recorded recognition accuracy up to 90%.

User Intentions Funneled Through a Human-Robot Interface
Michael T. Rosenstein, Andrew H. Fagg, Shichao Ou, and Roderic A. Grupen (University of Massachusetts Amherst)

Abstract
We describe a method for predicting user intentions as part of a human-robot interface. In particular, we show that funnels, i.e., geometric objects that partition an input space, provide a convenient means for discriminating individual objects and for clustering sets of objects for hierarchical tasks. One advantage of the proposed implementation is that a simple parametric model can be used to specify the shape of a funnel, and a straightforward heuristic for setting initial parameter values appears promising. We discuss the possibility of adapting the user interface with machine learning techniques, and we illustrate the approach with a humanoid robot performing a variation of a standard peg-insertion task.


Personal Assistants

Context-Based Similar Words Detection and Its Application in Specialized Search Engines
Hisham Al-Mubaid and Ping Chen (University of Houston)

Abstract
This paper presents a new context-based method for automatic detection and extraction of similar and related words from texts. Finding similar words is a very important task for many NLP applications including anaphora resolution, document retrieval, text segmentation, and text summarization. Here we use word similarity to improve search quality for search engines in (general and) specific domains. Our method is based on rules for extracting the words in the neighborhood of a target word, then connecting this with the surroundings of other occurrences of the same word in the (training) text corpus. This is ongoing work, and is still under extensive testing. The preliminary results, however, are promising and encourage more work in this direction.

Interactively Building Agents for Consumer-Side Data Mining
Rattapoom Tuchinda and Craig A. Knoblock (University of Southern California)

Abstract
Integrating and mining data from different web sources can make end-users well-informed when they make decisions. One of many limitations that bar end-users from taking advantage of such a process is the complexity of each of the steps required to gather, integrate, monitor, and mine data from different websites. We present the idea of combining data integration, monitoring, and mining into a single process in the form of an intelligent assistant that guides end-users to specify their mining tasks by just answering questions. This easy-to-use approach, which trades off complexity in terms of available operations against ease of use, has the ability to provide interesting insight into data that would require days of human effort to gather, combine, and mine manually from the web.

Adaptive Teaching Strategy for Online Learning
Jungsoon Yoo, Cen Li, and Chrisila Pettey (Middle Tennessee State University)

Abstract
Finding the optimal teaching strategy for an individual student is difficult even for an experienced teacher. Identifying and incorporating multiple optimal teaching strategies for different students in a class is even harder. This paper presents an Adaptive tutor for online Learning, AtoL, for Computer Science laboratories that identifies and applies the appropriate teaching strategies for students on an individual basis. The optimal strategy for a student is identified in two steps. First, a basic strategy for a student is identified using rules learned from a supervised learning system. Then the basic strategy is refined to better fit the student using models learned using an unsupervised learning system that takes into account the temporal nature of the problem solving process. The learning algorithms as well as the initial experimental results are presented.

Providing Intelligent Help Across Applications in Dynamic User and Environment Contexts
Ashwin Ramachandran and R. Michael Young (North Carolina State University)

Abstract
The problem of providing help for complex application interfaces has been a source of interest for a number of research efforts. As the computational power of computers increases, typical applications not only increase in functionality but also in the degree of interaction with the computational environment in which they reside. This paper describes an ongoing project to design an Intelligent Help System (IHS) that provides context-sensitivity not only through its modeling of application states but also its modeling of the interaction between applications and between an application and the environment in which it resides.


Visualization and Presentation

ScentHighlights: Highlighting Conceptually Related Sentences During Reading
Ed Chi, Lichan Hong, Michelle Gumbrecht, and Stuart Card (Palo Alto Research Center)

Abstract
Researchers have noticed that readers are increasingly skimming instead of reading in depth. Skimming also occurs in re-reading activities, where the goal is to recall specific topical facts. Bookmarks and highlighters were invented precisely to achieve this goal. For skimming activities, readers need effective ways to direct their attention toward the most relevant passages within text. We describe how we have enhanced skimming activity by conceptually highlighting sentences within electronic text that relate to search keywords. We perform the conceptual highlighting by computing what conceptual keywords are related to each other via word co-occurrence and spreading activation. Spreading activation is a cognitive model developed in psychology to simulate how memory chunks and conceptual items are retrieved in our brain. We describe the method used, and illustrate the idea with realistic scenarios using our system.
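
The combination of word co-occurrence and spreading activation mentioned in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the ScentHighlights implementation: the sentence-level co-occurrence window, the `decay` and `steps` parameters, and all function names are hypothetical choices made for the example.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(sentences):
    """Link every pair of words that appear in the same sentence (assumed window)."""
    graph = defaultdict(lambda: defaultdict(float))
    for sentence in sentences:
        words = set(sentence.lower().split())
        for a, b in combinations(sorted(words), 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def spread_activation(graph, seeds, decay=0.5, steps=2):
    """Start with activation 1.0 on the seed keywords and push a decayed
    share of each word's activation to its co-occurring neighbours."""
    activation = {word: 1.0 for word in seeds}
    for _ in range(steps):
        incoming = defaultdict(float)
        for word, level in activation.items():
            neighbours = graph.get(word, {})
            total = sum(neighbours.values())
            for other, weight in neighbours.items():
                incoming[other] += decay * level * weight / total
        for word, extra in incoming.items():
            activation[word] = activation.get(word, 0.0) + extra
    return activation


def rank_sentences(sentences, seeds, decay=0.5, steps=2):
    """Score each sentence by the summed activation of its words, best first."""
    graph = cooccurrence_graph(sentences)
    activation = spread_activation(graph, seeds, decay=decay, steps=steps)
    scored = []
    for sentence in sentences:
        words = set(sentence.lower().split())
        scored.append((sum(activation.get(w, 0.0) for w in words), sentence))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Calling `rank_sentences(sentences, ["cats"])` would place sentences that share vocabulary with the keyword, directly or through intermediate words, ahead of unrelated ones; the top-scoring sentences are the candidates for highlighting.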

Personal Reporting of a Museum Visit as an Entry Point to Future Cultural Experience
Charles Callaway, Tsvi Kuflik, Elena Not, Alessandra Novello, Oliviero Stock, and Massimo Zancanaro (ITC-irst)

Abstract
Museum visitors can continue interacting with museum exhibits even after they have left the museum. We can help them do this by creating a report that includes a basic, personalized narration of their visit, the items and relationships they found most interesting, pointers to additional related online information, and suggestions for future visits to the current and other museums. In this work we describe the automatic generation of personalized natural language reports to help create one episode in an ongoing coherent sequence of cultural activities.


Speech- and Vision-Based Interfaces

How to Wreck a Nice Beach You Sing Calm Incense
Henry Lieberman, Alexander Faaborg, Waseem Daher, and José Espinosa (MIT Media Laboratory)

Abstract
A principal problem in speech recognition is distinguishing between words and phrases that sound similar but have different meanings. Speech recognition programs produce a list of weighted candidate hypotheses for a given audio segment, and choose the "best" candidate. If the choice is incorrect, the user must invoke a correction interface that displays a list of the hypotheses and choose the desired one. The correction interface is time-consuming, and accounts for much of the frustration of today's dictation systems. Conventional dictation systems prioritize hypotheses based on language models derived from statistical techniques such as n-grams and Hidden Markov Models. We propose a supplementary method for ordering hypotheses based on Commonsense Knowledge. We filter acoustical and word-frequency hypotheses by testing their plausibility with a semantic network derived from 700,000 statements about everyday life. This often filters out possibilities that "don't make sense" from the user's viewpoint, and leads to improved recognition. Reducing the hypothesis space in this way also makes possible streamlined correction interfaces that improve the overall throughput of dictation systems.

HMM-Based Efficient Sketch Recognition
Tevfik Metin Sezgin and Randall Davis (Massachusetts Institute of Technology)

Abstract
Current sketch recognition systems treat sketches as images or a collection of strokes, rather than viewing sketching as an interactive and incremental process. We show how viewing sketching as an interactive process allows us to recognize sketches using Hidden Markov Models. We report results of a user study indicating that in certain domains people draw objects using consistent stroke orderings. We show how this consistency, when present, can be used to perform sketch recognition efficiently. This novel approach enables us to have polynomial time algorithms for sketch recognition and segmentation, unlike conventional methods with exponential complexity.

Interaction Techniques Using Prosodic Features of Speech and Audio Localization
Alex Olwal (Royal Institute of Technology)
Steven Feiner (Columbia University)

Abstract
We describe several approaches for using prosodic features of speech and audio localization to control interactive applications. This information can be applied to parameter control, as well as to speech disambiguation. We discuss how characteristics of spoken sentences can be exploited in the user interface; for example, by considering the speed with which a sentence is spoken and the presence of extraneous utterances. We also show how coarse audio localization can be used for low-fidelity gesture tracking, by inferring the speaker's head position.

Doubleshot: An Interactive User-Aided Segmentation Tool
Tom Yeh and Trevor Darrell (Massachusetts Institute of Technology)

Abstract
In this paper, we describe an intelligent user interface designed for camera phones to allow mobile users to specify the object of interest in the scene simply by taking two pictures: one with the object and one without the object. By comparing these two images, the system can reliably extract the visual appearance of the object, which can be useful to a wide range of applications such as content-based image retrieval and object recognition.

Generating Semantic Contexts from Spoken Conversation in Meetings
Jürgen Ziegler and Zoulfa El Jerroudi (University of Duisburg-Essen)
Karsten Böhm (University of Leipzig)

Abstract
SemanticTalk is a tool for supporting face-to-face meetings and discussions by automatically generating a semantic context from spoken conversations. We use speech recognition and topic extraction from a large terminological database to create a network of discussion topics in real-time. This network includes concepts explicitly addressed in the discussion as well as semantically associated terms, and is visualized to increase conversational awareness and creativity in the group.

Conventions in Human-Human Multi-Threaded Dialogues: A Preliminary Study
Peter Heeman and Fan Yang (Oregon Health & Science University)
Andrew Kun and Alexander Shyrokov (University of New Hampshire)

Abstract
In this paper, we explore the conventions that people use in managing multiple dialogue threads. In particular, we focus on where in a thread people interrupt when switching to another thread. We find that some subjects are able to vary where they switch depending on how urgent the interrupting task is. When time allowed, they switched at the end of a discourse segment, which we hypothesize is less disruptive to the interrupted task when it is later resumed.

Towards Automatic Transcription of Expressive Oral Percussive Performances
Amaury Hazan and Rafael Ramirez (Pompeu Fabra University)

Abstract
We describe a tool for transcribing voice-generated percussive rhythms. The system consists of: (a) a segmentation component which separates the monophonic input stream into percussive events, (b) a descriptor generation component that computes a set of acoustic features from each of the extracted segments, and (c) a machine learning component which assigns a symbolic class to each of the segmented sounds of the input stream. We describe each of these components and compare different machine learning strategies that can be used to obtain a symbolic representation of the oral percussive performance.

Communicating the User's Focus of Attention by Image Processing as Input for a Mobile Museum Guide
Adriano Albertini, Roberto Brunelli, Oliviero Stock, and Massimo Zancanaro (ITC-irst)

Abstract
The paper presents a first prototype of a handheld museum guide delivering contextualized information based on the recognition of drawing details selected by the user through the guide camera. The resulting interaction modality has been analyzed and compared to previous approaches. Finally, alternative, more scalable, solutions are presented that preserve the most interesting features of the system described.


Knowledge Acquisition and Knowledge-Based Design

Acquiring Story Scripts Using Common Sense Feedback
Ryan Williams, Barbara Barry, and Push Singh (MIT Media Laboratory)

Abstract
At the Media Lab we are developing a resource called StoryNet, a very large database of story scripts that can be used for commonsense reasoning by computers. This paper introduces ComicKit, an interface for acquiring StoryNet scripts from casual internet users. The core element of the interface is its ability to dynamically make commonsense suggestions that guide user story construction. We describe the encouraging results of a preliminary user study, and discuss future directions for ComicKit.

Metafor: Visualizing Stories as Code
Hugo Liu and Henry Lieberman (MIT Media Laboratory)

Abstract
Every program tells a story. Programming, then, is the art of constructing a story about the objects in the program and what they do in various situations. So-called programming languages, while easy for the computer to accurately convert into code, are, unfortunately, difficult for people to write and understand. We explore the idea of using descriptions in a natural language as a representation for programs. While we cannot yet convert arbitrary English to fully specified code, we can use a reasonably expressive subset of English as a visualization tool. Simple descriptions of program objects and their behavior generate scaffolding (underspecified) code fragments, that can be used as feedback for the designer. Roughly speaking, noun phrases can be interpreted as program objects; verbs can be functions, adjectives can be properties. A surprising amount of what we call programmatic semantics can be inferred from linguistic structure. We present a program editor, Metafor, that dynamically converts a user's stories into program code, and in a user study, participants found it useful as a brainstorming tool.

Task-Aware Information Access for Diagnosis of Manufacturing Problems
Larry Birnbaum, Wallace Hopp, Seyed Iravani, Kevin Livingston, and Biying Shou (Northwestern University)

Abstract
Pinpoint is a promising first step towards using a rich model of task context in proactive and dynamic IR systems. Pinpoint allows a user to navigate decision tree representations of problem spaces, built by domain experts, while dynamically entering annotations specific to their problem. The system then automatically generates queries to information repositories based on both the user's annotations and location in the problem space, producing results that are both task focused and problem specific. Initial feedback from users and domain experts has been positive.

Designing Interfaces for Guided Incremental Collection of Knowledge About Everyday Objects from Volunteers
Timothy Chklovski (University of Southern California)

Abstract
A new generation of intelligent applications can be enabled by broad-coverage knowledge repositories about everyday objects. We distill lessons in design of intelligent user interfaces which collect such broad-coverage knowledge from untrained volunteers. We motivate the knowledge-driven template-based approach adopted in LEARNER2, a second generation proactive acquisition interface for eliciting such knowledge. We present volume, accuracy, and recall of knowledge collected by fielding the system for 5 months. LEARNER2 has so far acquired 99,018 general statements, emphasizing knowledge about parts of and typical uses of objects.

An Ontology-Based Interface for Machine Learning
Mathias Bauer and Stephan Baldes (DFKI, German Research Center for Artificial Intelligence)

Abstract
Machine learning (ML) is a complex process that can hardly be carried out by non-expert users. Especially when using adaptive systems that interpret and exploit observations of the user to modify their behavior according to the user's perceived preferences, even naïve users may be confronted with learning systems. This paper presents an approach to make non-expert users understand and influence an ML system such as to improve trust and acceptance of the overall system behavior.


Smart Environments and Ubiquitous Computing

A Framework for Designing Intelligent Task-Oriented Augmented Reality User Interfaces
Leonardo Bonanni, Chia-Hsun Lee, and Ted Selker (MIT Media Laboratory)

Abstract
A task-oriented space can benefit from an augmented reality interface that layers the existing tools and surfaces with useful information to make cooking easier, safer, and more efficient. To serve experienced users as well as novices, augmented reality interfaces need to adapt modalities to the user's expertise and allow for multiple ways to perform tasks. We present a framework for designing an intelligent user interface that informs and choreographs multiple tasks in a single space according to a model of tasks and users. A residential kitchen has been outfitted with systems to gather data from tools and surfaces and project multi-modal interfaces back onto the tools and surfaces themselves. Based on user evaluations of this augmented reality kitchen, we propose a system to tailor information modalities based on the spatial and temporal qualities of the task, and the expertise, location and progress of the user. The intelligent augmented reality user interface choreographs multiple tasks in the same space at the same time.

Seamless User Notification in Ambient Soundscapes
Andreas Butz (University of Munich)
Ralf Jung (Saarland University)

Abstract
We describe a method for notifying users through auditory cues embedded in an ambient soundscape in the environment. It uses pieces of music which are composed in such a way that particular instruments or motifs can be added or omitted without losing the aesthetic quality of the overall composition. This allows for very subtle modifications in the soundscape which are only noticed by those users who have previously chosen this particular instrument or motif as their notification instrument. As a side effect, the soundscape itself can be used to subtly influence the mood of users. The method has been implemented in a prototype, which we briefly discuss. The prototype is implemented using a spatial audio framework and can hence notify users from particular directions.

A Cart-Mounted Intelligent Shopping Assistant
Chad Cumby, Andrew Fano, Rayid Ghani, and Marko Krema (Accenture Technology Labs)

Abstract
This paper describes an Intelligent Shopping Assistant designed for a shopping-cart-mounted tablet PC that enables individual interactions with customers. We use machine learning algorithms to predict a shopping list for the customer's current trip and present this list on the device. As customers navigate through the store, personalized promotions are presented using consumer models derived from loyalty card data for each individual. In order for shopping assistant devices to be effective, we believe that they have to be powered by algorithms that are tuned for individual customers and can make accurate predictions about an individual's actions. We formally frame the shopping list prediction as a classification problem, describe the algorithms and methodology behind our system, and show that shopping list prediction can be done with high levels of accuracy, precision, and recall. Beyond the prediction of shopping lists we briefly introduce other aspects of the shopping assistant project, such as the use of consumer models to select appropriate promotional tactics, and the development of promotion planning simulation tools to enable retailers to plan personalized promotions delivered through such a shopping assistant.

Adaptive Navigation Support with Public Displays
Christian Kray and Gerd Kortuem (Lancaster University)
Antonio Krüger (University of Münster)

Abstract
In this paper, we describe a public navigation system which uses adaptive displays as directional signs. The displays are mounted on walls where they provide passersby with directional information. Each sign is an autonomous, wirelessly networked digital display connected to a central server. The signs are position-aware and able to adapt their display content in accordance with their current position. Advantages of such a navigation system include improved flexibility, dynamic adaptation and ease of setup and maintenance.


Invited Talk: Barry Smyth
Wednesday, January 12th, 9:00 - 10:30 am
Chair: Anthony Jameson (DFKI and International University in Germany)

 

Adaptive Information Access and the Quest for the Personalization-Privacy Sweetspot
Barry Smyth (University College Dublin and ChangingWorlds)

Abstract
This talk will focus on how so-called personalization techniques are being used in response to the information overload problem. Personalization research brings together ideas from artificial intelligence, user profiling, information retrieval and user-interface design to provide users with more proactive and intelligent information services that are capable of predicting the needs of individuals and adapting to their implicit preferences. We will describe how personalization techniques have been successfully applied to the two dominant modes of information access, browsing and search, with reference to deployed applications in the mobile Internet and Web search arenas. Particular attention will be paid to the natural tension that exists between the potential value of personalization, on the one hand, and the perceived privacy risk associated with profiling, on the other. We will highlight certain recent approaches to personalization that appear to achieve a useful balance between personalization and privacy and argue that realizing this personalization-privacy sweetspot may be the key to the large-scale success of personalization technologies in the future. (The support of the Informatics Research Initiative of Enterprise Ireland and Science Foundation Ireland is gratefully acknowledged.)

About Barry Smyth
Barry Smyth holds the Digital Chair of Computer Science. He is an ECCAI Fellow and is currently the head of the Department of Computer Science at University College Dublin. He is also a founder of ChangingWorlds Ltd., a company that has developed a range of personalization solutions for the Mobile Internet space. He received a BSc in Computer Science from University College Dublin and a PhD from Trinity College, Dublin. Barry works in several areas of artificial intelligence including personalization, case-based reasoning and information retrieval. He has authored almost 200 technical papers and received best paper awards from conferences such as the International Joint Conference on Artificial Intelligence and the European Conference on Artificial Intelligence.


Papers: Recommendation and Instruction
Wednesday, January 12th, 11:00 am - 12:40 pm
Chair: Jon Herlocker (Oregon State University)

 

Improving Proactive Information Systems
Daniel Billsus and David M. Hilbert (FX Palo Alto Laboratory)
Dan Maynes-Aminzade (Stanford University)

Abstract
Proactive contextual information systems help people locate information by automatically suggesting potentially relevant resources based on their current tasks or interests. Such systems are becoming increasingly popular, but designing user interfaces that effectively communicate recommended information is a challenge: the interface must be unobtrusive, yet communicate enough information at the right time to provide value to the user. In this paper we describe our experience with the FXPAL Bar, a proactive information system designed to provide contextual access to corporate and personal resources. In particular, we present three features designed to communicate proactive recommendations more effectively: translucent recommendation windows increase the user's awareness of particularly highly-ranked recommendations, query term highlighting communicates the relationship between a recommended document and the user's current context, and a novel recommendation digest function allows users to return to the most relevant previously recommended resources. We present empirical evidence supporting our design decisions and relate lessons learned for other designers of contextual recommendation systems.

Trust in Recommender Systems
John O'Donovan and Barry Smyth (University College Dublin)

Abstract
Recommender systems have proven to be an important response to the information overload problem, by providing users with more proactive and personalized information services. Collaborative filtering techniques have proven to be a vital component of many such recommender systems, as they facilitate the generation of high-quality recommendations by leveraging the preferences of communities of similar users. In this paper we suggest that the traditional emphasis on user similarity may be overstated. We argue that additional factors have an important role to play in guiding recommendation. Specifically we propose that the trustworthiness of users must be an important consideration. We present two computational models of trust and show how they can be readily incorporated into standard collaborative filtering frameworks in a variety of ways. We also show how these trust models can lead to improved predictive accuracy during recommendation.
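
One way trust might be folded into a standard collaborative filtering prediction is sketched below. The abstract does not specify the two trust models, so everything here is an assumption made for illustration: trust is taken to be the fraction of a neighbour's past recommendations that fell within a tolerance `epsilon` of the actual rating, similarity and trust are combined by their harmonic mean, and the names `profile_trust`, `combined_weight` and `predict_rating` are hypothetical.

```python
def profile_trust(past_predictions, epsilon=1.0):
    """Assumed trust score: share of (predicted, actual) pairs that were
    within epsilon of each other. Returns 0.0 when there is no history."""
    if not past_predictions:
        return 0.0
    correct = sum(1 for predicted, actual in past_predictions
                  if abs(predicted - actual) <= epsilon)
    return correct / len(past_predictions)


def combined_weight(similarity, trust):
    """Harmonic mean of similarity and trust (both assumed non-negative)."""
    if similarity + trust == 0:
        return 0.0
    return 2 * similarity * trust / (similarity + trust)


def predict_rating(target_mean, neighbours, min_trust=0.0):
    """Mean-centred weighted prediction for one item.

    neighbours: iterable of (rating, neighbour_mean, similarity, trust).
    Neighbours below min_trust are filtered out; the rest contribute their
    deviation from their own mean, weighted by similarity and trust.
    """
    numerator = 0.0
    denominator = 0.0
    for rating, mean, similarity, trust in neighbours:
        if trust < min_trust:
            continue
        weight = combined_weight(similarity, trust)
        numerator += weight * (rating - mean)
        denominator += abs(weight)
    if denominator == 0:
        return target_mean  # no usable neighbours: fall back to the user's mean
    return target_mean + numerator / denominator
```

With `min_trust` left at zero, trust acts only as a weight; raising it turns trust into a filter on which neighbours are consulted at all. Both uses fit inside the same prediction formula, which is one reading of "a variety of ways" in the abstract.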

Experiments in Dynamic Critiquing
Kevin McCarthy, James Reilly, Lorraine McGinty, and Barry Smyth (University College Dublin)

Abstract
Conversational recommender systems are commonly used to help users to navigate through complex product-spaces by alternately making product suggestions and soliciting user feedback in order to guide subsequent suggestions. Recently, there has been a surge of interest in developing effective interfaces that support user interaction in domains of limited user expertise. Critiquing has proven to be a popular and successful user feedback mechanism in this regard, but is typically limited to the modification of single features. We review a novel approach to critiquing, dynamic critiquing, that allows users to modify multiple features simultaneously by choosing from a range of so-called compound critiques that are automatically proposed based on their current position within the product-space. In addition, we introduce the results of an important new live-user study that evaluates the practical benefits of dynamic critiquing.

Animating an Interactive Conversational Character for an Educational Game System
Andrea Corradini, Manish Mehta, Niels-Ole Bernsen, and Marcela Charfuelan (University of Southern Denmark)

Abstract
Within the framework of the project NICE (Natural Interactive Communication for Edutainment) [2], we have been developing an educational and entertaining computer game that allows children and teenagers to interact with a conversational character impersonating the fairy tale writer H.C. Andersen (HCA). The rationale behind our system is to make kids learn about HCA's life, fairy tales and historical period while playing and having fun. We report on the character's generation and realization of both verbal and 3D graphical non-verbal output behaviors, such as speech, body gestures and facial expressions. This conveys the impression of a human-like agent with relevant domain knowledge, and distinct personality. With the educational goal in the foreground, coherent and synchronized output presentation becomes mandatory, as any inconsistency may undermine the user's learning process rather than reinforcing it.