Showing posts with label spatial arrangement. Show all posts

Monday, September 14, 2009

GaZIR: Gaze-based Zooming Interface for Image Retrieval (Kozma L., Klami A., Kaski S., 2009)

From the Helsinki Institute for Information Technology, Finland, comes GaZIR, a research prototype for gaze-based image retrieval built by László Kozma, Arto Klami and Samuel Kaski. The prototype uses a lightweight logistic regression model to predict relevance from eye movement data (such as viewing time, revisit counts and fixation length), all computed online in real time. The system is built around the PicSOM (paper) retrieval engine, which is based on tree-structured self-organizing maps (TS-SOMs). When provided with a set of reference images, the PicSOM engine goes online to download a set of similar images (based on color, texture or shape).
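To make the relevance prediction idea concrete, here is a minimal sketch of how a logistic regression model could turn gaze features into a relevance score. The feature names, weights and bias are made up for illustration; GaZIR learns its estimator from data in a separate training phase, and its actual feature set is described in the paper.

```python
import math

# Hypothetical weights, for illustration only. A real system would learn
# these from recorded gaze data during a training phase.
WEIGHTS = {"viewing_time_s": 1.2, "revisit_count": 0.8, "mean_fixation_s": 2.0}
BIAS = -2.5


def predict_relevance(features):
    """Return the logistic-regression probability that an image is relevant."""
    z = BIAS + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def rank_images(gaze_features):
    """Sort image ids by predicted relevance, most relevant first."""
    return sorted(
        gaze_features,
        key=lambda image_id: predict_relevance(gaze_features[image_id]),
        reverse=True,
    )
```

Called with a dictionary that maps image ids to feature dictionaries, `rank_images` returns the ids in the order in which a zooming interface might bring them out.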

Abstract
"We introduce GaZIR, a gaze-based interface for browsing and searching for images. The system computes on-line predictions of relevance of images based on implicit feedback, and when the user zooms in, the images predicted to be the most relevant are brought out. The key novelty is that the relevance feedback is inferred from implicit cues obtained in real-time from the gaze pattern, using an estimator learned during a separate training phase. The natural zooming interface can be connected to any content-based information retrieval engine operating on user feedback. We show with experiments on one engine that there is sufficient amount of information in the gaze patterns to make the estimated relevance feedback a viable choice to complement or even replace explicit feedback by pointing-and-clicking."


Fig1. "Screenshot of the GaZIR interface. Relevance feedback gathered from outer rings influences the images retrieved for the inner rings, and the user can zoom in to reveal more rings."

Fig2. "Precision-recall and ROC curves for user-independent relevance prediction model. The predictions (solid line) are clearly above the baseline of random ranking (dash-dotted line), showing that relevance of images can be predicted from eye movements. The retrieval accuracy is also above the baseline provided by a naive model making a binary relevance judgement based on whether the image was viewed or not (dashed line), demonstrating the gain from more advanced gaze modeling."

Fig 3. "Retrieval performance in real user experiments. The bars indicate the proportion of relevant images shown during the search in six different search tasks for three different feedback methods. Explicit denotes the standard point-and-click feedback, predicted means implicit feedback inferred from gaze, and random is the baseline of providing random feedback. In all cases both actual feedback types outperform the baseline, but the relative performance of explicit and implicit feedback depends on the search task."
  • László Kozma, Arto Klami, and Samuel Kaski: GaZIR: Gaze-based Zooming Interface for Image Retrieval. To appear in Proceedings of 11th Conference on Multimodal Interfaces and The Sixth Workshop on Machine Learning for Multimodal Interaction (ICMI-MLMI), Boston, MA, USA, November 2-6, 2009. (abstract, pdf)

Monday, November 17, 2008

Wearable Augmented Reality System using Gaze Interaction (Park et al., 2008)

Hyung Min Park, Seok Han Lee and Jong Soo Choi from the Graduate School of Advanced Imaging Science, Multimedia & Film at Chung-Ang University, Korea, presented a paper on their Wearable Augmented Reality System (WARS) at the 7th IEEE/ACM International Symposium on Mixed and Augmented Reality. For selection they use a half-blink mode (called "aging"), which is detected by their custom eye tracking algorithms. See the end of the video.

Abstract
Undisturbed interaction is essential to provide immersive AR environments. There have been a lot of approaches to interact with VEs (virtual environments) so far, especially in hand metaphor. When the user's hands are being used for hand-based work such as maintenance and repair, necessity of alternative interaction technique has arisen. In recent research, hands-free gaze information is adopted to AR to perform original actions in concurrence with interaction. [3, 4]. There has been little progress on that research, still at a pilot study in a laboratory setting. In this paper, we introduce such a simple WARS (wearable augmented reality system) equipped with an HMD, scene camera, eye tracker. We propose 'Aging' technique improving traditional dwell-time selection, demonstrate AR gallery – dynamic exhibition space with wearable system.
Download paper as PDF.

Thursday, August 28, 2008

Mixed reality systems for technical maintenance and gaze-controlled interaction (Gustafsson et al)

To follow up on the wearable display with an integrated eye tracker, one possible application is in the domain of mixed reality. This allows interfaces to be projected on top of a video stream (i.e. the "world view"), thus blending the physical and virtual worlds. The paper below investigates how this could be used to assist technical maintenance of advanced systems such as fighter jets. It's an early prototype, but the field is very promising, especially when an eye tracker is involved.


Abstract:
"The purpose of this project is to build up knowledge about how future Mixed Reality (MR) systems should be designed concerning technical solutions, aspects of Human-Machine-Interaction (HMI) and logistics. The report describes the work performed in phase 2. Regarding hardware a hand-held MR-unit, a wearable MR-system and a gaze-controlled MR-unit have been developed. The work regarding software has continued with the same software architecture and MR-tool as in the former phase 1. A number of improvements, extensions and minor changes have been conducted as well as a general update. The work also includes experiments with two test case applications, "Turn-Round af Gripen (JAS)" and "Starting Up Diathermy Apparatus". Comprehensive literature searches and surveys of knowledge of HMI aspects have been conducted, especially regarding gaze-controlled interaction. The report also includes a brief overview of other projects within the area of Mixed Reality."
  • Gustafsson, T., Carleberg, P., Svensson, P., Nilsson, S., Le Duc, M., Sivertun, Å., Mixed Reality Systems for Technical Maintenance and Gaze-Controlled Interaction. Progress Report Phase 2 to FMV., 2005. Download paper as PDF

Wednesday, July 2, 2008

Hot Zone prototype (Wakaruru)

The following video demonstrates a prototype of the Hot Zone system for controlling Windows applications. The Hot Zone is called up by pressing a single hotkey and blinking. The menu can be closed by looking outside the zone and blinking. Submitted to YouTube by "Wakaruru"



The second video demonstrates how Hot Zone could be used to work with real-world applications (PowerPoint). The center of the zone is located at the current gaze position when it is called up (like a context menu). The commands in the five zones depend on the currently selected object (based on what you are looking at). No mouse is needed (the cursor is set to the gaze position through an API). A blink without the hotkey pressed is treated as a single mouse click, so blinking on a text area starts text editing. The keyboard is only used for typing and for the hotkey that calls up the zone. The eye tracking device used is the ASL EH6000.


Submitted to YouTube by "Wakaruru"

Tuesday, April 15, 2008

Gaze Interaction Demo (Powerwall@Konstanz Uni.)

During the last few years quite a few wall-sized displays have been used for novel interaction methods. Often these have been used with multi-touch, such as Jeff Han's FTIR technology. This is the first demonstration I have seen where eye tracking is used for a similar purpose. A German Ph.D. candidate, Jo Bieg, is working on this at the HCI department of the University of Konstanz. The Powerwall measures 5.20 x 2.15 m and has a resolution of 4640 x 1920 pixels.



The demonstration can be viewed at a better quality (10 MB).

Also make sure to check out the 360-degree Globorama display demonstration. It uses a laser pointer rather than eye tracking for interaction. Nevertheless, it is a really cool immersive experience, especially the Google Earth zoom into 360-degree panoramas.

Wednesday, February 20, 2008

Inspiration: StarGazer (Skovsgaard et al, 2008)

A major area of research for the COGAIN network is to enable communication for the disabled. The Innovative Communications group at the IT University of Copenhagen continuously works on making gaze-based interaction technology more accessible, especially in the field of assistive technology.

The ability to enter text into the system is crucial for communication, and without hands or speech this is somewhat problematic. The StarGazer software aims to solve this by introducing a novel 3D approach to text entry. In December I had the opportunity to visit ITU and try StarGazer (among other things) myself. It is astonishingly easy to use: within just a minute I was typing with my eyes. Rather than describing what it looks like, see the video below.
The associated paper is to be presented at the ETRA08 conference in March.



This introduces an important solution to the problem of eye tracker inaccuracy, namely zooming interfaces. Fixating on a specific region of the screen displays an enlarged version of that area, where objects can be more easily discriminated and selected.

The eyes are incredibly fast but, from the perspective of eye trackers, not very precise. This is due to the physiological properties of our visual system, specifically the foveal region of the eye. This retinal area produces the sharp, detailed region of our visual field, which in practice covers about the size of a thumbnail at arm's length. To bring another area into focus a saccade takes place, which moves the pupil and thus our gaze; this is what the eye tracker registers. Hence the accuracy of most eye trackers is in the range of 0.5-1 degree (in theory, that is).
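To get a feeling for what 0.5-1 degree means on a screen, the visual angle can be converted to an on-screen size with s = 2 · d · tan(θ/2), where d is the viewing distance. The sketch below does this conversion. The viewing distance of 60 cm and the pixel density of 38 pixels per cm (close to a 96 dpi display) are assumptions chosen for illustration, not figures from any particular eye tracker.

```python
import math


def angle_to_pixels(angle_deg, distance_cm, pixels_per_cm):
    """Convert a visual angle to the on-screen size it covers, in pixels."""
    size_cm = 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)
    return size_cm * pixels_per_cm


# Assumed setup: 60 cm viewing distance, about 38 pixels per cm.
for angle in (0.5, 1.0):
    print(angle, "deg ->", round(angle_to_pixels(angle, 60, 38)), "px")
```

Under these assumptions the uncertainty comes out at roughly 20 and 40 pixels, which is larger than many standard buttons and menu items. This is why targets in gaze interfaces tend to be big.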

A feasible solution to this limitation in accuracy is to use the display space dynamically and zoom into areas of interest upon a glance. The zooming interaction style solves some of the issues with inaccuracy and jitter of eye trackers, but it has to be carefully balanced so that it still provides a quick and responsive interface.
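One simple way to pick the area to enlarge is to center a rectangle on the fixation point and clamp it to the screen edges. The following is a minimal sketch of that idea, not how StarGazer or any other system mentioned here does it; the function name and parameters are my own.

```python
def zoom_region(fix_x, fix_y, screen_w, screen_h, zoom):
    """Return (left, top, width, height) of the screen area to enlarge.

    The region is centered on the fixation point and shifted inward where
    it would otherwise extend past the screen edges.
    """
    if zoom < 1:
        raise ValueError("zoom factor must be at least 1")
    width = screen_w / zoom
    height = screen_h / zoom
    left = min(max(fix_x - width / 2, 0), screen_w - width)
    top = min(max(fix_y - height / 2, 0), screen_h - height)
    return (left, top, width, height)
```

A higher zoom factor makes targets easier to hit but shows less context and needs more zoom steps, which is the balance between accuracy and responsiveness mentioned above.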

However, to me the novelty in StarGazer is the notion of traveling through a 3D space. The sensation of movement really catches one's attention and streamlines the interaction. Since text entry is linear, character by character, flying through space by navigating to character after character is a suitable interaction style. Since the interaction is nowhere near the speed of two-handed keyboard entry, employing linguistic probability algorithms such as those found in cellphones would be very beneficial (i.e. type two or three letters and the most likely words are displayed in a list). Overall, I find the spatial arrangement of gaze interfaces to be a somewhat unexplored area. Our eyes are made to navigate a three-dimensional world, while traditional desktop interfaces mainly contain a flat 2D view. This is something I intend to investigate further.
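The word list idea mentioned above (type two or three letters and get the most likely words) can be sketched in a few lines. This is a minimal illustration that assumes a plain dictionary of word frequencies; the words and counts in the usage example are made up, and real predictive text systems use far richer language models.

```python
def suggest_words(prefix, word_counts, limit=3):
    """Return up to `limit` words starting with `prefix`, most frequent first."""
    prefix = prefix.lower()
    matches = [word for word in word_counts if word.startswith(prefix)]
    # Sort by descending frequency, then alphabetically to break ties.
    matches.sort(key=lambda word: (-word_counts[word], word))
    return matches[:limit]


# Made-up frequencies for illustration.
counts = {"the": 500, "then": 120, "there": 200, "they": 180, "tree": 40}
print(suggest_words("the", counts))
```

With a gaze keyboard each suggestion could then be presented as one more large target to fly towards.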

Tuesday, February 19, 2008

Inspiration: GazeSpace (Laqua et al. 2007)

In parallel with working on the prototypes I continuously search for and review papers and theses on gaze interaction methods and techniques, hardware and software development, etc. I will post references to some of these on this blog. A great deal of research and theory on interaction and cognition lies behind the field of gaze interaction.

The paper below was presented last year at a conference held by the British Computer Society specialist group on Human Computer Interaction. What caught my attention is the focus on providing custom content spaces (a canvas), good feedback and a dynamic dwell-time, something I intend to incorporate into my own gaze GUI components. Additionally, the idea of expanding the content canvas upon a gaze fixation is really nice and something I will attempt to do in .Net/WPF (initial work displays a set of photos that become enlarged upon fixation).

GazeSpace Eye Gaze Controlled Content Spaces (Laqua et al. 2007)

Abstract
In this paper, we introduce GazeSpace, a novel system utilizing eye gaze to browse content spaces. While most existing eye gaze systems are designed for medical contexts, GazeSpace is aimed at able-bodied audiences. As this target group has much higher expectations for quality of interaction and general usability, GazeSpace integrates a contextual user interface, and rich continuous feedback to the user. To cope with real-world information tasks, GazeSpace incorporates novel algorithms using a more dynamic gaze-interest threshold instead of static dwell-times. We have conducted an experiment to evaluate user satisfaction and results show that GazeSpace is easy to use and a “fun experience”. Download paper (PDF)
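To illustrate the difference between a static dwell-time and a more dynamic interest threshold, here is a minimal sketch in which interest in an item builds up while it is looked at and decays while the gaze is elsewhere. This is my own simplification and not the algorithm from the GazeSpace paper; the class name, parameters and default values are all assumptions.

```python
class InterestTracker:
    """Accumulate gaze interest per item and select on crossing a threshold."""

    def __init__(self, threshold=1.0, gain=1.0, decay=0.5):
        self.threshold = threshold  # interest level that triggers selection
        self.gain = gain            # interest gained per second of looking
        self.decay = decay          # interest lost per second of looking away
        self.interest = {}

    def update(self, gazed_item, dt):
        """Advance by dt seconds; return the selected item, or None."""
        # Items that are not being looked at slowly lose interest.
        for item in list(self.interest):
            if item != gazed_item:
                lowered = self.interest[item] - self.decay * dt
                self.interest[item] = max(0.0, lowered)
        if gazed_item is None:
            return None
        level = self.interest.get(gazed_item, 0.0) + self.gain * dt
        if level >= self.threshold:
            self.interest[gazed_item] = 0.0  # reset after a selection
            return gazed_item
        self.interest[gazed_item] = level
        return None
```

Unlike a static dwell timer, a short glance away does not reset the progress completely, so two looks of 0.6 seconds separated by a brief interruption can still add up to a selection.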





About the author
Sven Laqua is a PhD Student & Teaching Fellow at the Human Centred Systems Group, part of the Dept. of Computer Science at University College London. Sven has a personal homepage, a university profile and a blog (rather empty at the moment).