Copyright © 2002
PARC Inc.
All Rights Reserved
Publications
The Use of Proximal Information Scent to Forage for Distal Content on the World Wide Web.
The particular focus of this chapter will be on a psychological theory of information scent (Pirolli, 1997, 2003; Pirolli & Card, 1999) that is embedded in a broader model (Pirolli & Fu, 2003) of information foraging on the Web. The notion of information scent also has been used in developing models of people seeking information in document-clustering browsers (Pirolli, 1997) and highly interactive information visualizations (Pirolli, Card, & Van Der Wege, 2003). Information scent refers to the cues used by information foragers to make judgments related to the selection of information sources to pursue and consume. These cues include items such as Web links or bibliographic citations that provide users with concise information about content that is not immediately available. The information scent cues play an important role in guiding users to the information they seek, and they also play a role in providing users with an overall sense of the contents of collections. The purpose of this paper is to present a theoretical account of information scent that supports the development of models of navigation choice.
Pirolli, P. (2004).
Working with Technology in Mind: Brunswikian Resources for Cognitive Science and Engineering. Oxford University Press. [PDF]
|
Log-based Longitudinal Study Finds Window Thrashing
Although large displays are becoming more cost effective,
most user interfaces are optimized for a single monitor of
modest size even though many traditional workspaces such
as desks and workbenches are much larger and some studies
have found benefits from large displays. This paper
explores whether a single monitor is sufficient for
information work using standard software. A log-based
longitudinal field study finds that most of the time a single
monitor allows skilled information analysts to have a
reasonable pattern of window activity. However, a novel
visualization of the data shows that windows typically fill
the monitor and the pattern is occasionally interrupted by
window thrashing, the rapid manipulation of windows
caused by limited display resource. Given these findings,
we identify some common tasks that justify the development
and the expense of wideband visual interfaces that are
optimized for larger displays.
Mackinlay, J. D. and Royer, C. (2004). [PDF]
|
Transient User Profiling
Our work in the past five years on modeling user actions on the Web has shown that a great deal of information about user actions can be recovered from the informational cues processed by the user during navigation. We call these informational cues "Information Scent." We have shown in various papers that Information Scent can be used as a methodology for clustering a group of user profiles [Chi02], simulating a collection of users navigating through the Web with an information need [Chi03], and providing navigational cues to users with transient information goals [Olston03].
We argue in this position paper for a CHI2004 workshop that more research in user profiling should be done for user goals that are transient in nature.
Chi, E. H. (2004).
Proceedings of the Workshop on User Profiling (CHI2004), Vienna, Austria. [PDF]
|
Wideband Displays: Mitigating Multiple Monitor Seams
Wideband displays fill our field of view, creating new
opportunities to develop effective visual interfaces.
Although multiple monitors are becoming an affordable
way to create wideband displays, the resulting seams create
gaps in words and divide diagonal lines into nonaligned
segments. We present several novel user interface
techniques for creating seam-aware applications, proving
that vendors need not wait for affordable seamless displays
to exploit the potential of wideband displays.
Mackinlay, J. D. and Heer, J. (2004).
Proceedings of the Human Factors in Computing Systems Conference (CHI2004), Vienna, Austria. [PDF]
|
3Book: a 3D Electronic Smart Book
This paper describes the 3Book, a 3D interactive visualization
of a codex book as a component for various digital
library and sensemaking systems. The book is designed to
hold large books and to support sensemaking operations by
readers. The book includes methods in which the automatic
semantic analysis of the book’s content is used to dynamically
tailor access.
Card, S. K., Hong, L., Mackinlay, J. D. and Chi, E. H. (2004).
Advanced Visual Interfaces (AVI) 2004. [PDF]
|
3Book: A Scalable 3D Virtual Book
This paper describes the 3Book, a 3D interactive visualization
of a codex book as a component for digital library and
information-intensive applications. The 3Book is able to
represent books of almost unlimited length, allows users to
read large format books, and has features to enhance reading
and sensemaking.
Card, S. K., Hong, L., Mackinlay, J. D. and Chi, E. H. (2004).
Proceedings of the Human Factors in Computing Systems Conference (CHI2004) Conference Companion, Vienna, Austria. [PDF]
|
eBooks with Indexes that Reorganize Conceptually
Subject indexes were an important step forward for books
because they enabled the comparison and correlations of
information without extensive reading, re-reading and
memorization. In this short paper, we focus on the user
interaction and usage scenario of a new system called
ScentIndex that enhances the subject index of an eBook by
conceptually reorganizing it to suit particular information
needs. Users first enter information needs via keywords
describing the concepts they are trying to retrieve and
comprehend. ScentIndex then computes what index entries
are conceptually related, and reorganizes and displays these
index entries on a single page.
Chi, E. H., Hong, L., Heiser, J. and Card, S. K. (2004).
Proceedings of the Human Factors in Computing Systems Conference (CHI2004) Conference Companion, Vienna, Austria. [PDF]
|
Efficient User Interest Estimation in Fisheye Views
We present a new technique for efficiently computing Degree-of-Interest distributions to inform the visualization of graph-structured data. The technique is independent of the interest distribution used, and enables fluid interaction with very large data sets (over 100,000 nodes).
Heer, J. and Card, S. K. (2003).
Extended Abstracts of CHI 2003, Conference on Human Factors in Computing Systems, Fort Lauderdale, FL. [PDF]
|
Wideband Visual Interfaces: Sensemaking on Multiple Monitors
Although vendors have made multiple-monitor systems for many years, our interfaces have been stuck in a 30-year-old windows paradigm focused on displays much smaller than the desktops we use when working with paper. Advances in flat panel displays and graphics cards now enable affordable personal computers with 6-8 monitors and may someday eliminate seams. This paper argues that vendors should be developing wideband visual interfaces that are designed for displays that fill the human visual field. We describe a longitudinal field study of window activity that found that windows almost always filled a typical single monitor display and that subjects occasionally struggled with window thrashing when they needed to work with two or more windows at the same time. Vendors need not wait for affordable seamless wideband displays before addressing these findings. We have implemented several novel user interface techniques for creating seam-aware applications that target wideband displays based on multiple monitors.
Mackinlay, J. D., Heer, J. and Royer, C. (2003).
Technical Report. [PDF]
|
AVID: Supporting the creation of scalable, responsive visualizations
In this paper we describe a visualization architecture (AVID) that employs a dynamic model of user interest to support the design and creation of highly responsive, scalable visualizations of hierarchical data. We present evidence of the architecture's efficacy, showcasing dynamic visualizations with near-immediate (<100ms) update times, even on structures of over 100,000 nodes. We discuss how the key concepts used generalize to arbitrary graph structures. Additionally, we present the results of a user study comparing a prototypical visualization built using AVID to a more traditional file-browser interface, showcasing up to 20% improvement in information access times.
Heer, J., Card, S. K., Heiser, J. and Pirolli, P. (2003).
Working Paper.
|
ScentTrails: Integrating Browsing and Searching on the Web
The two predominant paradigms for finding information on the Web are browsing and keyword searching. While they exhibit complementary advantages, neither paradigm alone is adequate for complex information goals that lend themselves partially to browsing and partially to searching. To integrate browsing and searching smoothly into a single interface, we introduce a novel approach called ScentTrails. Based on the concept of information scent developed in the context of information foraging theory, ScentTrails highlights hyperlinks to indicate paths to search results. This interface enables users to interpolate smoothly between searching and browsing to locate content matching complex information goals effectively. In a preliminary user study, ScentTrails enabled subjects to find information more quickly than by either searching or browsing alone.
Olston, C. and Chi, E. H. (2003).
ACM Transactions on Computer-Human Interaction. [PDF]
|
SNIF-ACT: A model of information foraging on the world wide web.
No Abstract Available
Pirolli, P. and Fu, W.-T. (2003).
Ninth International Conference on User Modeling, Johnstown, PA. [PDF]
|
The Bloodhound Project: Automating Discovery of Web Usability Issues using the InfoScent™ Simulator
According to usability experts, the top user issue for Web sites is difficult navigation. We have been developing automated usability tools for several years, and here we describe a prototype service called InfoScent™ Bloodhound Simulator, a push-button navigation analysis system, which automatically analyzes the information cues on a Web site to produce a usability report. We further build upon previous algorithms to create a method called Information Scent Absorption Rate, which measures the navigability of a site by computing the probability of users reaching the desired destinations on the site. Lastly, we present a user study involving 244 subjects over 1385 user sessions that shows how Bloodhound correlates with real users surfing for information on four Web sites. The hope is that, by using a simulation of user surfing behavior, we can reduce the need for human labor during usability testing, thus dramatically lowering testing costs and ultimately improving user experience. The Bloodhound Project is unique in that we apply a concrete HCI theory directly to a real-world problem. The lack of empirically validated HCI theoretical models has plagued the development of our field, and this is a step in that direction.
Chi, E. H., Rosien, A., Supattanasiri, G., Williams, A., Royer, C., Chow, C., Robles, E., Dalal, B., Chen, J. and Cousins, S. (2003).
CHI 2003, Fort Lauderdale, FL. [PDF]
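As a rough illustration of what "the probability of users reaching the desired destinations" can mean, the following Python sketch pushes a unit of simulated surfers through a small site and sums the share that arrives at a target page within a click budget. It is not the published Information Scent Absorption Rate algorithm: the function name `absorption_rate`, the `transitions` table, and the click limit are all assumptions made for this example, and the link probabilities are simply given rather than derived from information cues.

```python
def absorption_rate(transitions, start, target, max_clicks):
    """Share of simulated surfers who reach `target` within `max_clicks`.

    transitions: {page: {next_page: probability}} (hypothetical input;
    probabilities on a page may sum to less than 1 to model users leaving).
    """
    if start == target:
        return 1.0
    current = {start: 1.0}   # surfer mass currently on each page
    absorbed = 0.0           # mass that has already reached the target
    for _ in range(max_clicks):
        following = {}
        for page, mass in current.items():
            for dest, prob in transitions.get(page, {}).items():
                if dest == target:
                    absorbed += mass * prob   # target is absorbing
                else:
                    following[dest] = following.get(dest, 0.0) + mass * prob
        current = following
    return absorbed


# Hypothetical three-page site: half the surfers take a detour via "b".
site = {
    "home": {"a": 0.5, "b": 0.5},
    "a": {"goal": 1.0},
    "b": {"home": 1.0},
}
rate_two_clicks = absorption_rate(site, "home", "goal", 2)
rate_four_clicks = absorption_rate(site, "home", "goal", 4)
```

In this toy site a low rate after two clicks that rises with a larger click budget would point at the detour page as the navigation problem.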
|
Browse Hierarchical Data with the Degree of Interest Tree
This demonstration shows a method and implementation to interactively display large hierarchies (up to 10,000 nodes) within a web browser. This software computes a degree of interest (DOI) for each node in the hierarchy and displays an overview of the complete hierarchy while showing more detail for nodes with a higher DOI value.
Nation, D., Roberts, D. and Card, S. K. (2002).
ACM CHI 2002 Conference on Human Factors in Computing Systems. [PDF]
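A minimal sketch of a per-node degree-of-interest computation is given here in Python. It assumes the classic fisheye formulation, where interest is an a priori importance (here, minus the node's depth) minus the tree distance to the node in focus; the demonstration's actual DOI function is not specified in the abstract, so the formula, the function name `degree_of_interest`, and the `children` dictionary format are assumptions for illustration only.

```python
from collections import deque


def degree_of_interest(children, root, focus):
    """Return {node: DOI} with DOI = -depth(node) - tree_distance(node, focus)."""
    # Breadth-first pass: record depth and parent of every node.
    depth, parent = {root: 0}, {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            depth[child] = depth[node] + 1
            parent[child] = node
            queue.append(child)

    # Steps from the focus up to each of its ancestors (focus itself is 0).
    focus_path, node, steps = {}, focus, 0
    while node is not None:
        focus_path[node] = steps
        node, steps = parent[node], steps + 1

    doi = {}
    for node in depth:
        # Climb until we meet the focus-to-root path, then descend to the focus.
        cur, up = node, 0
        while cur not in focus_path:
            cur, up = parent[cur], up + 1
        doi[node] = -depth[node] - (up + focus_path[cur])
    return doi


# Hypothetical hierarchy with the user's focus on leaf "a1".
tree = {"root": ["a", "b"], "a": ["a1", "a2"]}
doi = degree_of_interest(tree, "root", "a1")
```

Under this assumed formula the focus node and all of its ancestors share the highest score, so a display that shows more detail for higher DOI values keeps the path from the root to the focus expanded while siblings collapse.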
|
Degree-of-Interest Trees: A Component of an Attention-Reactive User Interface
This paper proposes Degree-of-Interest trees. These trees use degree-of-interest calculations and focus+context visualization methods, together with bounding constraints, to fit within pre-established bounds. The method is an instance of an emerging “attention-reactive” user interface whose components are designed to snap together in bounded spaces.
Card, S. K. and Nation, D. (2002).
Advanced Visual Interfaces, Trento, Italy. [PDF]
|
LumberJack: Intelligent Discovery and Analysis of Web User Traffic Composition
Web Usage Mining enables new understanding of user goals on the Web. This understanding has broad applications, and traditional mining techniques such as association rules have been used in business applications. We have developed an automated method to directly infer the major groupings of user traffic on a Web site [Heer01]. We do this by utilizing multiple data features in a clustering analysis. We have performed an extensive, systematic evaluation of the proposed approach, and have discovered that certain clustering schemes can achieve categorization accuracies as high as 99% [Heer02b]. In this paper, we describe the further development of this work into a prototype service called LumberJack, a push-button analysis system that is both more automated and accurate than past systems.
Chi, E. H., Rosien, A. and Heer, J. (2002).
ACM-SIGKDD Workshop on Web Mining for Usage Patterns and User Profiles, Edmonton, Canada. [PDF]
|
Expressiveness of the Data Flow and Data State Models in Visualization Systems
Visualization can be viewed as a process that transforms raw data (value) into views. Two major categories of data process models have been proposed to model the visualization transformation process. This paper seeks to compare the Data Flow Models and the Data State Models. Specifically, it proves that, in terms of expressiveness, anything that can be represented using the Data Flow Model can also be represented using the Data State Model, and vice versa.
Chi, E. H. (2002).
Advanced Visual Interfaces Conference, Trento, Italy 375-378. [PDF]
|
Scent of the Web
No Abstract Available
Chi, E. H. (2002).
Human Factors and Web Development. Erlbaum, Hillsdale, New Jersey: pp. 265-285.
|
A User-Tracing Architecture for Modeling Interaction with the World Wide Web
No Abstract Available
Pirolli, P., Fu, W.-T., Reeder, R. and Card, S. K. (2002).
Advanced Visual Interfaces, Trento, Italy. [PDF]
|
Automatic Centerline Extraction for Virtual Colonoscopy
In this paper, we introduce a concise and concrete definition of an accurate colon centerline and provide an efficient automatic means to extract the centerline and its associated branches (caused by a forceful touching of colon and small bowel or a deep fold in twisted colon lumen). We further discuss its applications on fly-through path planning and endoscopic simulation, as well as its potential to solve the challenging touching and colon collapse problems in virtual colonoscopy. Experimental results demonstrated its centeredness, robustness, and efficiency.
Wan, M., Liang, Z., Ke, Q., Hong, L., Bitter, I. and Kaufman, A. (2002).
IEEE Transactions on Medical Imaging 21(12): 1450-1460. [PDF]
|
Improving Web Usability Through Visualization
Predictive Web usage visualizations can help analysts uncover
traffic patterns and usability problems.
Chi, E. H. (2002).
IEEE Internet Computing: 64-71. [PDF]
|
A Framework for Visualizing Information
Information visualization is the design and creation of interactive graphic depictions of information by combining principles in the disciplines of graphic design, cognitive science, and interactive computer graphics. This book describes a framework to make information visualization systems easier to develop through the creation of a reference model. It develops and discusses the general utility of this Data State Model, and validates it by applying it to various visualization techniques and showing several systems that illustrate issues such as how to model operators and interactions in visualization systems.
The book also applies this reference model to make information visualization more accessible to potential users by creating a `Visualization Spreadsheet', where each cell can contain an entire set of data represented using interactive graphics.
Chi, E. H. (2002).
Human-Computer Interaction Kluwer Academic Publishers, Netherlands: 176.
|
What Did They Do? Understanding Clickstreams with the WebQuilt Visualization System
This paper describes the visual analysis tool WebQuilt, a web usability logging and visualization system that helps web design teams record and analyze usability tests. The logging portion of WebQuilt unobtrusively gathers clickstream data as users complete specified tasks. This data is then aggregated and presented as an interactive graph, where nodes of the graph are images of the web pages visited, and arrows are the transitions between pages. To aid analysis of the gathered usability test data, the WebQuilt visualization provides filtering capabilities and semantic zooming, allowing the designer to understand the test results at the gestalt view of the entire graph, and then drill down to sub-paths and single pages. The visualization highlights important usability issues, such as pages where users spent a lot of time, pages where users get off track during the task, navigation patterns, and exit pages, all within the context of a specific task. WebQuilt is designed to conduct remote usability testing on a variety of Internet-enabled devices and provide a way to identify potential usability problems when the tester cannot be present to observe and record user actions.
Waterson, S., Hong, J. I., Sohn, T., Heer, J., Matthews, T. and Landay, J. A. (2002).
Advanced Visual Interfaces, Trento, Italy. [PDF]
|
Mining the Structure of User Activity using Cluster Stability
Recent research has explored web user session clustering as a means of understanding user activity and interests on the World Wide Web. Though the proposed techniques have proven to be useful and effective, they require that one either specify the number of clusters in advance or browse a large hierarchy of clusters to find the optimal depth at which to describe user activity. In this paper, we examine the utility of a stability-based technique for automatically determining the optimal number of clusters in the context of web user session clustering. We present two case studies evaluating the technique’s effectiveness.
Heer, J. and Chi, E. H. (2002).
SIAM International Conference on Data Mining, Workshop on Web Analytics, Arlington, VA. [PDF]
|
Fluid Annotations Through Open Hypermedia: Using and Extending Emerging Web Standards
The Fluid Documents project has developed various research prototypes that show that powerful annotation techniques based on animated typographical changes can help readers utilize annotations more effectively. Our recently-developed Fluid Open Hypermedia prototype supports the authoring and browsing of fluid annotations on third-party Web pages. This prototype is an extension of the Arakne Environment, an open hypermedia application that can augment Web pages with externally stored hypermedia structures. This paper describes how various Web standards, including DOM, CSS, XLink, XPointer, and RDF, can be used and extended to support fluid annotations.
Bouvin, N. O., Zellweger, P. T., Grønbæk, K. and Mackinlay, J. D. (2002).
WWW2002, Hawaii. [PDF]
|
CHI@20: Fighting Our Way from Marginality to Power
The Special Interest Group on Computer Human Interaction (SIGCHI) has had a successful history of 20 years of growth in its numbers and influence. To help guide the continued evolution of the academic discipline and professional community, we invite several senior members to offer their visions: What has the field of CHI actually accomplished over the past several decades, and what do we still need to accomplish? What do we need to do differently/better/smarter? What haven't we tried because the technology, the money or the will wasn't there in the past, but perhaps is now?
The CHI field is more than just technology. We understand that our work can have a profound effect on individuals, families, neighborhoods, corporations, and countries. We know that we can influence education, commerce, healthcare, and government. How can we contribute to bridging the digital divides in developed and developing countries? What agendas can we offer for the academic, research, industrial, and civic spheres for the next 20 years? How can we be more ambitious? How can we truly serve human needs?
Shneiderman, B., Card, S. K., Norman, D. A., Tremaine, M. and Waldrop, M. M. (2001).
ACM CHI 2002 Conference on Human Factors in Computing Systems. [PDF]
|
Using Information Scent to Model User Information Needs and Actions on the Web
On the Web, users typically forage for information by
navigating from page to page along Web links. Their surfing
patterns or actions are guided by their information needs.
Researchers need tools to explore the complex interactions
between user needs, user actions, and the structures and
contents of the Web. In this paper, we describe two
computational methods for understanding the relationship
between user needs and user actions. First, for a particular
pattern of surfing, we seek to infer the associated information
need. Second, given an information need, and some pages as
starting points, we attempt to predict the expected surfing
patterns. The algorithms use a concept called “information
scent”, which is the subjective sense of value and cost of
accessing a page based on perceptual cues. We present an
empirical evaluation of these two algorithms, and show their
effectiveness.
Chi, E. H., Pirolli, P., Chen, K. and Pitkow, J. (2001).
ACM CHI 2001 Conference on Human Factors in Computing Systems, Seattle, WA 490--497. [PDF]
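The second method described above (predicting surfing patterns from an information need and a starting page) can be pictured with a small Python sketch. It is only an illustration under simplifying assumptions, not the paper's algorithm: scent is approximated as plain keyword overlap between the need and the words around a link, and the names `scent`, `predict_visits`, and the `links` structure are hypothetical.

```python
def scent(need, cue_words):
    """Fraction of the need's keywords found among a link's cue words."""
    need = set(need)
    return len(need & set(cue_words)) / len(need) if need else 0.0


def predict_visits(links, need, start, clicks):
    """Spread one unit of surfers along links in proportion to scent.

    links: {page: [(target_page, cue_words), ...]} (hypothetical format).
    Returns {page: expected visits accumulated over `clicks` clicks}.
    """
    current = {start: 1.0}
    visits = {start: 1.0}
    for _ in range(clicks):
        following = {}
        for page, mass in current.items():
            scored = [(t, scent(need, cues)) for t, cues in links.get(page, [])]
            total = sum(score for _, score in scored)
            if total == 0:
                continue  # no scent at all: surfers stop here in this sketch
            for target, score in scored:
                following[target] = following.get(target, 0.0) + mass * score / total
        for page, mass in following.items():
            visits[page] = visits.get(page, 0.0) + mass
        current = following
    return visits


# Hypothetical site and information need.
site_links = {
    "home": [("products", ["laptop", "prices"]), ("about", ["company", "history"])],
    "products": [("laptops", ["laptop", "deals"])],
}
visits = predict_visits(site_links, ["laptop", "prices"], "home", 2)
```

Running the same idea in reverse, starting from an observed path and collecting the cue words along it, gives a crude picture of the first method, inferring the need behind a surfing pattern.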
|
Reliable Path for Virtual Endoscopy: Ensuring Complete Examination of Human Organs
Virtual endoscopy is a computerized, noninvasive procedure for detecting anomalies inside human organs. Several preliminary studies have demonstrated the benefits and effectiveness of this modality. Unfortunately, previous work cannot guarantee that an existing anomaly will be detected, especially for complex organs with multiple branches. In this paper, we introduce the concept of reliable navigation, which ensures the interior organ surface is fully examined by the physician performing the virtual endoscopy procedure. To achieve this, we propose computing a reliable fly-through path that ensures no blind area during the navigation. Theoretically, we discuss the criteria of evaluating a reliable path and prove that the problem of generating an optimal reliable path for virtual endoscopy is NP-complete. In practice, we develop an efficient method for the calculation of an effective reliable path. First, a small set of center observation points are automatically located inside the hollow organ. For each observation point, there exists at least one patch of interior surface visible to it, but that cannot be seen from any of the other observation points. These chosen points are then linked with a path that stays in the center of the organ. Finally, new points inside the organ are recursively selected and connected into the path until the entire organ surface is visible from the path. We present encouraging results from experiments on several data sets. For a medium size volumetric model with several hundred thousand inner voxels, an effective reliable path can be generated in several minutes.
He, T., Hong, L., Chen, D. and Liang, Z. (2001).
IEEE Transactions on Visualization and Computer Graphics 7(4): 333-342. [PDF]
|
Separating the Swarm: Categorization Methods for User Access Sessions on the Web
Understanding user behaviors on Web sites enables site owners to make sites more usable, ultimately helping users to achieve their goals more quickly. Accordingly, researchers have devised methods for categorizing user sessions in hopes of revealing user interests. These techniques build user profiles by combining users' navigation paths with other data features, such as page viewing time, hyperlink structure, and page content. Previously, we have presented complex techniques of combining many of these data features to cluster user profiles. In this paper, we introduce a user study and a systematic evaluation of these different data features and their associated weighting schemes. We present the results of our study, including accuracy measures for a number of clustering approaches, and offer recommendations for Web analysts. While further investigation over more sites is needed to definitively settle on a robust scheme, we have characterized this analytic space.
Heer, J. and Chi, E. H. (2002).
Proc. of the Human Factors in Computing Systems Conference (CHI 2002), Minneapolis, MN. [PDF]
|
WebQuilt: A Proxy-based Approach to Remote Web Usability Testing
WebQuilt is a web logging and visualization system that helps web design teams run usability tests (both local and remote) and analyze the collected data. Logging is done through a proxy, overcoming many of the problems with server-side and client-side logging. Captured usage traces can be aggregated and visualized in a zooming interface that shows the web pages people viewed. The visualization also shows the most common paths taken through the web site for a given task, as well as the optimal path for that task, as designated by the designer. This paper discusses the architecture of WebQuilt and also describes how it can be extended for new kinds of analyses and visualizations.
Hong, J. I., Heer, J., Waterson, S. and Landay, J. A. (2001).
ACM Transactions on Information Systems. [PDF]
|
Identification of Web User Traffic Composition using Multi-Modal Clustering and Information Scent
On the Web, users typically forage for information by navigating from page to page along Web links. Their surfing patterns or actions are guided by their information needs. Researchers need tools to explore the complex interactions between user needs, user actions, and the structures and contents of the Web. In this paper, we describe two computational methods for understanding the relationship between user needs and user actions. First, for a particular pattern of surfing, we seek to infer the associated information need. Second, given an information need, and some pages as starting points, we attempt to predict the expected surfing patterns. The algorithms use a concept called “information scent”, which is the subjective sense of value and cost of accessing a page based on perceptual cues. We present an empirical evaluation of these two algorithms, and show their effectiveness.
Heer, J. and Chi, E. H. (2001).
Proceedings of the Workshop on Web Mining, SIAM Conference on Data Mining, Chicago, IL 51-58. [PDF]
|
WebEyeMapper and WebLogger: Tools for Analyzing Eye Tracking Data Collected in Web-use Studies
Eye trackers output a stream of points at which the eye was looking. To give these points meaning, researchers analyzing eye tracking data need to map the points onto the meaningful objects at which the eye was looking. Performing this mapping has proven to be a tedious, time-consuming task. We present a software system that automates this task for Web usability studies that incorporate eye tracking.
Reeder, R. W., Pirolli, P. and Card, S. K. (2001).
CHI 2001, Seattle. [PDF]
|
Using Thumbnails to Search the Web
We introduce a technique for creating novel, textually-enhanced thumbnails of Web pages. These thumbnails combine the advantages of image thumbnails and text summaries to provide consistent performance on a variety of tasks. We conducted a study in which participants used three different types of summaries (enhanced thumbnails, plain thumbnails, and text summaries) to search Web pages to find several different types of information. Participants took an average of 67, 86, and 95 seconds to find the answer with enhanced thumbnails, plain thumbnails, and text summaries, respectively. We found a strong effect of question category. For some questions, text outperformed plain thumbnails, while for other questions, plain thumbnails outperformed text. Enhanced thumbnails (which combine the features of text summaries and plain thumbnails) were more consistent than either text summaries or plain thumbnails, having for all categories the best performance or performance that was statistically indistinguishable from the best.
Woodruff, A., Faulring, A., Rosenholtz, R., Morrison, J. and Pirolli, P. (2001).
CHI 2001, Seattle. [PDF]
|
A Taxonomic Analysis of What World Wide Web Activities Significantly Impact People's Decisions and Actions
In this paper, we present three taxonomic classification schemes based on Web users' responses to what Web activities significantly impacted their decisions and actions. The taxonomic classifications focus on three variables: the Purpose of people's search on the Web, the Method people use to find information, and the Content of the information for which they are searching. These taxonomies are useful for understanding people's activity on the Web and for developing ecologically-valid tasks to be used when studying web behavior.
Morrison, J. B., Pirolli, P. and Card, S. K. (2001).
CHI 2001, Seattle. [PDF]
|
The Guidebook, the Friend, and the Room: Visitor Experience in a Historic House
In this paper, we describe an electronic guidebook prototype and report on a study of its use in a historic house. Supported by mechanisms in the guidebook, visitors constructed experiences that had a high degree of interaction with three entities: the guidebook, their companions, and the house and its contents. For example, we found that most visitors played audio descriptions through speakers (rather than using headphones or reading textual descriptions) to facilitate communication with their companions.
Woodruff, A., Aoki, P. M., Hurst, A. and Szymanski, M. H. (2001).
CHI 2001, Seattle. [PDF]
|
Tap Tips: Lightweight Discovery of Touchscreen Targets
We describe tap tips, a technique for providing touch-screen target location hints. Tap tips are lightweight in that they are non-modal, appear only when needed, require a minimal number of user gestures, and do not add to the standard touchscreen gesture vocabulary. We discuss our implementation of tap tips in an electronic guidebook system and some usability test results.
Aoki, P. M., Hurst, A. and Woodruff, A. (2001).
CHI 2001, Seattle. [PDF]
|
Visual Information Foraging in a Focus+Context Visualization
Eye tracking studies of the Hyperbolic Tree browser suggest that visual search in focus+context displays is highly affected by information scent (i.e., local cues, such as text summaries, used to assess and navigate towards distal information sources). When users detected a strong information scent, they were able to reach their goal faster with the Hyperbolic Tree browser than with a conventional browser. When users detected a weak scent or no scent, users exhibited less efficient search of areas with a high density of visual items. In order to interpret these results we present an integration of the CODE Theory of Visual Attention (CTVA) with information foraging theory. Development of the CTVA-foraging theory could lead to deeper analysis of interaction with visual displays of content, such as the World Wide Web or information visualizations.
Pirolli, P., Card, S. K. and Van Der Wege, M. (2001).
CHI 2001, Seattle. [PDF]
|
Information Scent as a Driver of Web Behavior Graphs: Results of a Protocol Analysis Method for Web Usability
The purpose of this paper is to introduce a replicable WWW protocol analysis methodology illustrated by application to data collected in the laboratory. The methodology uses instrumentation to obtain detailed recordings of user actions with a browser, caches Web pages encountered, and videotapes talk-aloud protocols. We apply the current form of the method to the analysis of eight Web protocols, visualizing the structure of the interaction and showing the strong effect of information scent in determining the path followed.
Card, S. K., Pirolli, P., Van Der Wege, M., Morrison, J., Reeder, R. W., Schraedley, P. and Boshart, J. (2001).
CHI 2001, Seattle. [PDF]
|
The Future of Software: Visualization+Computation Tools
The complexity of software has continuously risen since the invention of the computer, and while Moore's law predicts the growth in processor speed, it fails to take into account our ability in managing complex software processes. Our utilization of the increase in processor speed is very much dependent on our ability to manage this complexity.
Chi, E. H. (2000).
Future of Software Special Issue. [PDF]
|
Visualization Method for Biological Sequence Similarity Reports
Previously, we presented a system called AlignmentViewer that uses information visualization techniques to visualize similarities between a single DNA sequence and a large database of other sequences. In this paper, we extend, summarize, and describe the system using several interesting case studies. We present our comb glyph technique for visualizing alignments between sequences. In this paper, we also extend the original system by incorporating computational steering, and the visualization of differences between data sets. The case studies and the new extended system present our novel approach of extracting significant relationships in the biological data set.
Chi, E. H., Riedl, J. T., Shoop, E. and Barry, P. (2000).
Journal of Electronic Imaging: Special Issue on Visualization and Data Analysis. [PDF]
|
Getting Portals to Behave
Data visualization environments help users understand and analyze their data by permitting interactive browsing of graphical representations of the data. To further facilitate understanding and analysis, many visualization environments have special features known as portals, which are sub-windows of a data canvas. Portals provide a way to display multiple graphical representations simultaneously, in a nested fashion. This makes portals an extremely powerful and flexible paradigm for data visualization. Unfortunately, with this flexibility comes complexity. There are over a hundred possible ways each portal can be configured to exhibit different behaviors. Many of these behaviors are confusing and certain behaviors can be inappropriate for a particular setting. It is desirable to eliminate confusing and inappropriate behaviors. In this paper, we construct a taxonomy of portal behaviors and give recommendations to help designers of visualization systems decide which behaviors are intuitive and appropriate for a particular setting. We apply these recommendations to an example setting that is fully visually programmable and analyze the resulting reduced set of behaviors. Finally, we consider a real visualization environment and demonstrate some problems associated with behaviors that do not follow our recommendations.
Olston, C. and Woodruff, A. (2000).
InfoVis 2000, Salt Lake City 15-25. [PDF]
|
A Taxonomy of Visualization Techniques Using the Data State Reference Model
In previous work, researchers have attempted to construct taxonomies of information visualization techniques by examining the data domains that are compatible with these techniques. This is useful because implementers can quickly identify various techniques that can be applied to their domain of interest. However, these taxonomies do not help the implementers understand how to apply and implement these techniques. In this paper, we will extend and then propose a new way to taxonomize information visualization techniques by using the Data State Model. In fact, as the taxonomic analysis in this paper will show, many of the techniques share similar operating steps that can easily be reused. The paper shows that the Data State Model not only helps researchers understand the space of design, but also helps implementers understand how information visualization techniques can be applied more broadly.
Chi, E. H. (2000).
InfoVis 2000, Salt Lake City 69-75. [PDF]
|
Improving Electronic Guidebook Interfaces Using a Task-Oriented Design Approach
Item selection is a key problem in electronic guidebook design. Many systems do not apply so-called "context-awareness" technologies to infer user interest, placing the entire burden of selection on the user. Conversely, to make selection easier, many systems automatically eliminate information that they infer is not of interest to the user. However, such systems often eliminate too much information, preventing the user from finding what they want.
To realize the full potential of electronic guidebooks, designers must strike the right balance between automatic context-based inference and manual selection. In this paper, we introduce a task-oriented model of item selection for electronic guidebooks to help designers explore this continuum. We argue that item selection contains three sub-tasks and that these sub-tasks should be considered explicitly in system design. We apply our model to existing systems, demonstrating pitfalls of combining sub-tasks, and discuss how our model has improved the design of our own guidebook prototype.
Aoki, P. M. and Woodruff, A. (2000).
DIS 2000, New York 319-325. [PDF]
|
Opportunities for Information Visualization
No Abstract Available
Mackinlay, J. D. (2000).
IEEE Computer Graphics and Applications 20(1).
|
WebLogger: A Data Collection Tool for Web-use Studies
Considering the amount of interest in studying Web-browsing behavior, there is a relative lack of tools for data collection in this area. Those tools that do exist have significant limitations on the data they are able to collect or on their suitability for efficient analysis. We present WebLogger, a tool which instruments Microsoft's Internet Explorer Web browser. We have found that WebLogger alleviates some of the problems associated with other approaches to browser-based data collection methods.
Reeder, R. W., Pirolli, P. and Card, S. K. (2000).
Xerox PARC, Palo Alto, CA. [PDF]
|
Guidelines for Using Multiple Views in Information Visualization
A multiple view system uses two or more distinct views to support the investigation of a single conceptual entity. Many such systems exist, ranging from computer-aided design (CAD) systems for chip design that display both the logical structure and the actual geometry of the integrated circuit to overview-plus-detail systems that show both an overview for context and a zoomed-in-view for detail. Designers of these systems must make a variety of design decisions, ranging from determining layout to constructing sophisticated coordination mechanisms. Surprisingly, little work has been done to characterize these systems or to express guidelines for their design. Based on a workshop discussion of multiple views, and based on our own design and implementation experience with these systems, we present eight guidelines for the design of multiple view systems.
Baldonado, M. Q. W., Woodruff, A. and Kuchinsky, A. (2000).
AVI 2000, Palermo, Italy. [PDF]
|
The Effect of Information Scent on Searching Information Visualizations of Large Tree Structures
Focus + context information visualizations have sought to amplify human cognition by increasing the amount of information immediately available to the user. We study how the focus + context distortion of the Hyperbolic Tree browser affects information foraging behavior in a task similar to the CHI '97 Browse Off. In comparison to a more conventional browser, Hyperbolic users searched more nodes, searched at a faster rate, and showed more learning. However, the performance of the Hyperbolic was found to be highly affected by "information scent", proximal cues to the value of distal information. Strong information scent made hyperbolic search faster than with a conventional browser. Conversely, weak scent put the hyperbolic tree at a disadvantage. There appear to be two countervailing processes affecting visual attention in these displays: strong information scent expands the spotlight of attention whereas crowding of targets in the compressed region of the Hyperbolic narrows it. The results suggest design improvements.
Pirolli, P., Card, S. K. and Van Der Wege, M. (2000).
AVI 2000, Palermo, Italy. [PDF]
|
Case Study: Resource Steering in a Visualization System
Visual computational steering environments extend traditional visualization environments by enabling the user to interactively steer the computations applied to the data. In this paper, we develop a new type of computational steering. "Resource steering" extends current visual steering techniques by providing machine resource estimation and control to the user. With resource steering, the user controls the execution of the computation on a parallel or distributed computer based on experimentally or theoretically derived estimates of the parallel performance of the computation. We demonstrate this extended steering model by applying it to an information visualization system that analyzes genetic sequence similarity reports. We show how our extended steering model enhances the user's ability to control visualization computations.
Chi, E. H. and Riedl, J. T. (2000).
Proceedings of the Joint Eurographics IEEE TCVG Symposium on Visualization (VisSym '00), Amsterdam, The Netherlands. [PDF]
|
Enhancing a Digital Book with a Reading Recommender
Digital books can significantly enhance the reading experience, providing many functions not available in printed books. In this paper we study a particular augmentation of digital books that provides readers with customized recommendations. We systematically explore the application of spreading activation over text and citation data to generate useful recommendations. Our findings reveal that for the tasks performed in our corpus, spreading activation over text is more useful than citation data. Further, fusing text and citation data via spreading activation results in the most useful recommendations. The fused spreading activation techniques outperform traditional text-based retrieval methods. Finally, we introduce a preliminary user interface for the display of recommendations from these algorithms.
Woodruff, A., Gossweiler, R., Pitkow, J., Chi, E. H. and Card, S. K. (2000).
CHI 2000, The Hague, The Netherlands 153-160. [PDF]
|
The Scent of a Site: A System for Analyzing and Predicting Information Scent, Usage, and Usability of a Web Site
Designers and researchers of users' interactions with the World Wide Web need tools that permit the rapid exploration of hypotheses about complex interactions of user goals, user behaviors, and Web site designs. We present an architecture and system for the analysis and prediction of user behavior and Web site usability. The system integrates research on human information foraging theory, a reference model of information visualization and Web data-mining techniques. The system also incorporates new methods of Web site visualization (Dome Tree, Usage Based Layouts), a new predictive modeling technique for Web site use (Web User Flow by Information Scent, WUFIS), and new Web usability metrics.
Chi, E. H., Pirolli, P. and Pitkow, J. (2000).
CHI 2000, The Hague, The Netherlands 161-168. [PDF]
|
Sensemaking of Evolving Web Sites using Visualization Spreadsheets
In the process of knowledge discovery, workers examine available information in order to make sense of it. By sensemaking, we mean interacting with and operating on the information with a variety of information processing mechanisms [3, 18]. Previously, we introduced a concept that uses the spreadsheet metaphor with cells containing visualizations of complex data. In this paper, we extend and apply a cognitive model called "visual sensemaking" to the Visualization Spreadsheet. We use the task of making sense of a large Web site as a concrete example throughout the paper for demonstration. Using a variety of visualization techniques, such as the Disk Tree and Cone Tree, we show that the interactions of the Visualization Spreadsheet help users draw conclusions from the overall relationships of the entire information set.
Chi, E. H. and Card, S. K. (1999).
Symposium on Information Visualization (InfoVis '99), San Francisco. [PDF]
|
Web Analysis Visualization Spreadsheet
In this paper, we present methods in information visualization that apply to the discovery of patterns in World-Wide Web sites. We hope to use techniques of information visualization to help in the organization and categorization of Web sites. We present a detailed case study of using the spreadsheet to analyze the content, usage, and structure of a large Web site. We demonstrate how the visualization spreadsheet principles apply in this specific data domain.
Chi, E. H. (1999).
ACM Digital Library Workshop on Organizing Web Space (WOWS '99), Berkeley, CA 24-31. [PDF]
|
Mining longest repeated subsequences to predict World Wide Web surfing
Modeling and predicting user surfing paths involves tradeoffs between model complexity and predictive accuracy. In this paper we explore predictive modeling techniques that attempt to reduce model complexity while retaining predictive accuracy. We show that compared to various Markov models, longest repeating subsequence models are able to significantly reduce model size while retaining the ability to make accurate predictions. Sharp increases in the overall predictive capabilities of these models are achievable by modest increases in the number of predictions made.
Pitkow, J. E. and Pirolli, P. (1999).
Second USENIX Symposium on Internet Technologies and Systems. [PDF]
|
The Internet Edge: A Book of Changes
No Abstract Available
Stefik, M. J. (1999).
MIT Press.
|
Information Foraging
Information Foraging Theory is an approach to understanding how strategies and technologies for information seeking, gathering and consumption are adapted to the flux of information in the environment. The theory assumes that people, when possible, will modify their strategies or the structure of the environment to maximize their rate of gaining valuable information. Field studies inform the theory by illustrating that people do freely structure their environments and their strategies to yield high gains in information foraging. The theory is developed by (a) adaptation (rational) analysis of information foraging problems and (b) a detailed process model (ACT-IF). The adaptation analysis develops (a) information patch models, which deal with time allocation and information filtering and enrichment activities in environments in which information is encountered in clusters (e.g. bibliographic collections), (b) information scent models which address the identification of information value from proximal cues, and (c) information diet models which address decisions about the selection and pursuit of information items. ACT-IF is developed to instantiate these rational models and to fit the moment-by-moment behavior of people interacting with complex information technology. ACT-IF is a production system in which the information scent of bibliographic stimuli is calculated by spreading activation mechanisms. Time allocation and item selection heuristics make use of information scent to select production rules in ways that maximize information foraging activities.
Pirolli, P. and Card, S. K. (1999).
Psychological Review 106(4): 643-675. [PDF]
|
Cognitive Architectures and Cognitive Engineering Models in Human-Computer Interaction
(First paragraph) We engage our physical and social environments through highly evolved technologies, in interactions that often require sophisticated knowledge and virtuoso performance. Finding the order underlying the complexity of these interactions may be one of the most daunting challenges facing science. Over the past few decades, the study of human-computer interaction (HCI) has become an increasingly important arena for pursuing this challenge. HCI is a discipline concerned with the study and design of interactive computing systems used by people towards satisfying their goals. HCI has become an arena in which new computer applications can benefit from new cognitive engineering models that synthesize results from sound cognitive science. It has also become a useful testbed for cognitive architectures, which are integrated theories of psychological mechanisms that aim to predict complex learning, cognition, and performance. This chapter provides an overview of cognitive engineering models and cognitive architectures in the context of HCI.
Pirolli, P. (1999).
The handbook of applied cognition. John T. Wiley, Sussex, England. [PDF]
|
VIDA (Visual Information Density Adjuster)
Multiple studies have shown that clutter or sparsity in visual representations can have negative effects ranging from decreased user performance to diminished visual appeal. We have developed a system that assists users in the construction and navigation of visualizations with appropriate visual information density. This system, VIDA (Visual Information Density Adjuster), applies a cartographic principle to minimize clutter and sparsity in visual displays of information.
Woodruff, A., Landay, J. and Stonebraker, M. (1999).
CHI '99, Pittsburgh, PA. [PDF]
|
Distributions of Surfers' Paths Through the World Wide Web: Empirical Characterization
Surfing the World Wide Web (WWW) involves traversing hyperlink connections among documents. The ability to predict surfing patterns could solve many problems facing producers and consumers of WWW content. We analyzed WWW server logs for a WWW site, collected over ten days, to compare different path reconstruction methods and to investigate how past surfing behavior predicts future surfing choices. Since log files do not explicitly contain user paths, various methods have evolved to reconstruct user paths. Session times, number of clicks per visit, and Levenshtein Distance analyses were performed to show the impact of various reconstruction methods. Different methods for measuring surfing patterns were also compared. Markov model approximations were used to model the probability of users choosing links conditional on past surfing paths. Information-theoretic (entropy) measurements suggest that information is gained by using longer paths to estimate the conditional probability of link choice given surf path. The improvements diminish, however, as one increases the length of path beyond one. Information-theoretic (Total Divergence to the Average entropy) measurements suggest that the conditional probabilities of link choice given surf path are more stable over time for shorter paths than longer paths. Direct examination of the accuracy of the conditional probability models in predicting test data also suggests that shorter paths yield more stable models and can be estimated reliably with less data than longer paths.
Pirolli, P. and Pitkow, J. E. (1999).
World Wide Web 2(1-2): 29-45. [PDF]
|
A Framework for Information Visualization Spreadsheets
Information has become interactive. Information visualization is the design and creation of interactive graphic depictions of information by combining principles in the disciplines of graphic design, cognitive science and interactive computer graphics. As the volume and complexity of the data increases, users require more powerful visualization tools that allow them to more effectively explore large abstract datasets. This thesis seeks to make information visualization more accessible to potential users by creating a "Visualization Spreadsheet", where each cell can contain an entire set of data represented using interactive graphics. Just as a numeric spreadsheet enables exploration of numbers, a visualization spreadsheet enables exploration of visual forms of information. Unlike numerical spreadsheets, which store only simple data elements and formulas in each cell, a cell in the Visualization Spreadsheet can hold an entire abstract data set, selection criteria, viewing specifications, and other information needed for a full-fledged information visualization. Similarly, intra-cell and inter-cell operations are far more complex, stretching beyond simple arithmetic and string operations to encompass a range of domain-specific operators. The complexity of operations and interactions requires a visualization framework that is easily understandable to both end-users and visualization designers. This thesis develops and discusses the general utility of a novel visualization framework, and validates the framework by applying it to various visualization techniques and showing several systems that illustrate some of these research issues. We show that the spreadsheet approach facilitates certain visual user tasks that are more difficult using other approaches. The underlying approach in our work allows domain experts to define new data types and data operations and enables visualization experts to incorporate new visualizations, viewing parameters, and view operations.
Chi, E. H. (1999).
University of Minnesota.
|
Summary of WWW Characterizations
To date there have been a number of efforts that attempt to characterize various aspects of the World Wide Web. This paper presents a summary of these efforts, highlighting regularities and insights that have been discovered across the variety of access points available for instrumentation. Characterizations that are derived from client, proxy, and server instrumentation are reviewed as well as efforts to characterize the entire structure of the WWW. Given the dynamic nature of the Web, it may be surprising for some readers to find that many properties of the Web follow regular and predictable patterns that have not changed in form over the Web's lifetime. Understanding these aspects as well as those that vary is critical to designing a better Web, and as a direct consequence, creating a more enjoyable user experience.
Pitkow, J. E. (1998).
Web Journal 2(1-2): 3-13. [PDF]
|
Perception and Its Application in Computer Graphics
No Abstract Available
Ferwerda, J., Gossweiler, R., Healey, C., Interrante, V. and Reingans, P. (1998).
|
Information Visualization: Using Vision to Think
No Abstract Available
Card, S. K., Mackinlay, J. D. and Shneiderman, B. (1998).
Morgan-Kaufmann, San Francisco, California.
|
Fluid Links for Informed and Incremental Link Transitions
We have developed a novel user interface technique for hypertext, called fluid links, that has several advantages over current methods. Fluid links provide additional information at a link source to support readers in choosing among links and understanding the structure of hypertext. Fluid links present this information in a convenient location that does not obscure the content or layout of the source material. The technique uses perceptually-based animation to provide a natural and lightweight feeling to readers. In their richer forms, fluid links can provide a novel hypertext navigation paradigm that blurs the boundaries of hypertext nodes and can allow readers to fluidly control the focus on the material to support their current reading goals.
Zellweger, P. T., Chang, B.-W. and Mackinlay, J. D. (1998).
Proceedings of Hypertext'98 50-57. [PDF]
|
Goal-Directed Zoom
We introduce a novel zoom method, goal-directed zoom. In a goal-directed zoom system, users specify which representation of an object they wish to see. The system automatically zooms to the elevation at which that representation appears at appropriate detail. We have extended a database visualization environment to support end-user construction of visualizations that have goal-directed zoom. We present a sample visualization we have constructed using this environment.
Woodruff, A., Landay, J. and Stonebraker, M. (1998).
CHI '98, Los Angeles 305-6. [PDF]
|
Constant Information Density in Zoomable Interfaces
We introduce a system that helps users construct interactive visualizations with constant information density. This work is an extension of the DataSplash database visualization environment. DataSplash is a direct manipulation system in which users can construct and navigate visualizations. Objects' appearances change as users zoom closer to or further away from the visualization. Users specify graphically the point at which these changes occur. Our experience with DataSplash indicates that users find it difficult to construct visualizations that display an appropriate amount of detail. In this paper, we introduce an extension to DataSplash based on the Principle of Constant Information Density. This extension gives users feedback about the density of visualizations as they create them. We also introduce an extension that suggests improvements to existing visualizations. We have performed an informal study of user navigation in applications with and without constant information density. We suggest that designers take density into account when designing applications to avoid biasing user navigation in unexpected ways.
Woodruff, A., Landay, J. and Stonebraker, M. (1998).
Advanced Visual Interfaces '98, L'Aquila, Italy 57-65. [PDF]
|
Constant Density Visualizations of Non-Uniform Distributions of Data
The cartographic Principle of Constant Information Density suggests that the amount of information in an interactive visualization should remain constant as the user pans and zooms. In previous work, we presented a system, VIDA (Visual Information Density Adjuster), which helps users manually construct applications in which overall display density remains constant. In the context of semantic zoom systems, this approach ensures uniformity in the z dimension, but does not extend naturally to ensuring uniformity in the x and y dimensions. In this paper, we present a new approach that automatically creates displays that are uniform in the x, y, and z dimensions. In the new system, users express constraints about visual representations that should appear in the display. The system applies these constraints to subdivisions of the display such that each subdivision meets a target density value. We have implemented our technique in the DataSplash/VIDA database visualization environment. We describe our algorithm, implementation, and the advantages and disadvantages of our approach.
Woodruff, A., Landay, J. and Stonebraker, M. (1998).
UIST '98, San Francisco 19-28. [PDF]
|
Summary of WWW Characterizations
To date there have been a number of efforts that attempt to characterize various aspects of the World Wide Web. This paper represents a summary of these efforts, highlighting regularities and invariants that have been discovered.
Pitkow, J. E. (1998).
The Seventh International World Wide Web Conference, Brisbane, Australia. [PDF]
|
A Theory of the Measurement of Knowledge Content, Access, and Learning
We develop an approach to the measurement of knowledge content, knowledge access and knowledge learning. This approach has two elements: First we describe a theoretical view of cognition, called the Newell-Dennett framework, which we see as being particularly favorable to the development of a measurement approach. Then, we describe a class of measurement models, based on Rasch modeling, which we see as being particularly favorable to the development of cognitive theories. Knowledge content and access are viewed as determining the observable actions selected by an agent in order to achieve desired goals in observable situations. To the degree that models within the theory fit the data at hand, one considers measures of observed behavior to be manifestations of intelligent agents having specific classes of knowledge content and varying degrees of access to that knowledge. Although agents, environment, and knowledge are constitutively defined (in terms of one another), successful application of our theory affords separation of parameters associated with the person from those associated with the environment. We present and discuss two examples of measurement models developed within our approach that address the evolution of cognitive skill, strategy choice and application, and developmental changes in mixtures of strategy use.
Pirolli, P. and Wilson, M. (1998).
Psychological Review 105: 58-82. [PDF]
|
Information Foraging Models of Browsers for Very Large Document Spaces
Information Foraging (IF) Theory addresses user strategies and technology for seeking, gathering, and using on-line information. We present IF-based models and evaluations of two interfaces: the Scatter/Gather browser for large document collections, and the Butterfly interface for surfing the citation link structure of scientific literatures. A computational cognitive model, ACT-IF, models observed users by assuming that they have heuristics that optimize their information foraging behavior in accordance with IF theory.
Pirolli, P. and Card, S. K. (1998).
AVI '98, L'Aquila, Italy 83-93. [PDF]
|
Exploring Browser Design Trade-Offs Using a Dynamical Model of Optimal Information Foraging
Designers and researchers of human-computer interaction need tools that permit the rapid exploration and management of hypotheses about complex interactions of designs, task conditions, and user strategies. Dynamic programming is introduced as such a tool for the analysis of information foraging technologies. The technique is illustrated in the context of the Scatter/Gather text clustering browser. Hypothetical improvements in browser speed and text clustering are examined in the context of variations in task deadlines and the quality of the document repository. A complex and non-intuitive set of tradeoffs emerges from even this simple space of factors, illustrating the general utility of the approach.
Pirolli, P. (1998).
Conference on Human Factors in Computing Systems, CHI '98, Los Angeles 33-40. [PDF]
|
Report of the 7-8 May 1998 ONR/CNMOC Interactive METOC Working Group
(First paragraph) The purpose of this workshop was to develop an appreciation and understanding of the METOC function for Naval Operations and then, using a scenario-based group discussion method, to develop recommendations to facilitate the METOC function.
ONR/CNMOC Working Group (1998).
Office of Naval Research, San Diego, California. [PDF]
|
Fluid Visualization of Spreadsheet Structures
Spreadsheets augment a visible tabular layout with invisible formulas. Direct manipulations of the tabular layout may or may not result in the desired changes to the formulas. The user is forced to explore the individual cells to find, verify, and modify the formulas, which causes heavy cognitive overhead. We present a set of techniques that make these formulas and their resulting dataflow structure easily accessible while maintaining the natural appearance of the spreadsheet. Transient local views visualize dataflow structures associated with individual cells, while static global views and animated global explanations visually present the entire dataflow structure at once. Semantic navigation enables the user to navigate through the dataflow structure interactively, and visual editing techniques make it possible to construct formulas using graphical editing techniques. Central to these techniques is the use of animation and lightweight interaction for rapid and non-intrusive visualization. Our prototype implementation suggests that these techniques can greatly improve the expressive power of current spreadsheets as well as other applications that have rich underlying structures.
Igarashi, T., Zellweger, P. T., Chang, B.-W. and Mackinlay, J. D. (1998).
Proceedings of Visual Languages'98. [PDF]
|
Strong Regularities in World Wide Web Surfing
One of the most common modes of accessing information in the World Wide Web is surfing from one document to another along hyperlinks. Several large empirical studies have revealed common patterns of surfing behavior. A model that assumes that users make a sequence of decisions to proceed to another page, continuing as long as the value of the current page exceeds some threshold, yields the probability distribution for the number of pages that a user visits within a given Web site. This model was verified by comparing its predictions with detailed measurements of surfing patterns. The model also explains the observed Zipf-like distributions in page hits observed at Web sites.
Huberman, B. A., Pirolli, P., Pitkow, J. and Lukose, R. J. (1998).
Science 280: 95-97. [PDF]
|
Information Visualization
No Abstract Available
Gershon, N., Eick, S. and Card, S. K. (1998).
Interactions March-April.
|
An Operator Interaction Framework for Visualization Systems
Information visualization encounters a wide variety of different data domains. The visualization community has developed representation methods and interactive techniques. As a community, we have realized that the requirements in each domain are often dramatically different. In order to easily apply existing methods, researchers have developed a semiology of graphic representation. We have extended this research into a framework that includes operators and interactions in visualization systems, such as a visualization spreadsheet. We discuss properties of this framework and use it to characterize operations spanning a variety of different visualization techniques. The framework developed in this paper enables a new way of exploring and evaluating the design space of visualization operators, and helps end-users in their analysis tasks.
Chi, E. H. and Riedl, J. T. (1998).
Symposium on Information Visualization (InfoVis '98), Research Triangle Park, North Carolina 63-70. [PDF]
|
Principles for Information Visualization Spreadsheets
Spreadsheets have proven highly successful for interacting with numerical data, such as applying algebraic operations, refining data propagation relationships, manipulating rows or columns, and exploring "what-if" scenarios. Spreadsheet techniques have recently been extended from numeric domains to other domains. Here we present a spreadsheet approach to displaying and exploring information visualizations, with large, abstract, multidimensional data sets that are visually represented in multiple ways. We illustrate how spreadsheet techniques provide a structured, intuitive, and powerful interface for investigating information visualizations. An earlier version of this article appeared in the proceedings of the 1997 Information Visualization Symposium. Here we refocus the discussion to illustrate principles that make the spreadsheet approach powerful. These principles show how we can perform many user tasks easily in the visualization spreadsheet that prove much more difficult using other approaches.
Chi, E. H., Riedl, J., Barry, P. and Konstan, J. (1998).
Computer Graphics and Applications: 30-38. [PDF]
|
Visualizing the Evolution of Web Ecologies
Several visualizations have emerged which attempt to visualize all or part of the World Wide Web. Those visualizations, however, fail to present the dynamically changing ecology of users and documents on the Web. We present new techniques for Web Ecology and Evolution Visualization (WEEV). Disk Trees represent a discrete time slice of the Web ecology. A collection of Disk Trees forms a Time Tube, representing the evolution of the Web over longer periods of time. These visualizations are intended to aid authors and webmasters with the production and organization of content, assist Web surfers making sense of information, and help researchers understand the Web.
Chi, E. H., Pitkow, J., Mackinlay, J., Pirolli, P., Gossweiler, R. and Card, S. K. (1998).
ACM Conference on Human Factors in Computing Systems (CHI '98), Los Angeles 400-407, 644-645. [PDF]
|
A Negotiation Architecture for Fluid Documents
The information presented in a document often consists of primary content as well as supporting material such as explanatory notes, detailed derivations, illustrations, and the like. We introduce a class of user interface techniques for fluid documents that supports the reader's shift to supporting material while maintaining the context of the primary material. Our approach initially minimizes the intrusion of supporting material by presenting it as a small visual cue near the annotated primary material. When the user expresses interest in the annotation, it expands smoothly to a readable size. At the same time, the primary material makes space for the expanded annotation. The expanded supporting material must be given space to occupy, and it must be made salient with respect to the surrounding primary material. These two aspects, space and salience, are subject to a negotiation between the primary and supporting material. This paper presents the components of our fluid document techniques and describes the negotiation architecture for ensuring that the presentations of both primary and supporting material are honored.
Chang, B.-W., Mackinlay, J. D., Zellweger, P. T. and Igarashi, T. (1998).
Proceedings of UIST'98, ACM Symposium on User Interface Software and Technology 123-132. [PDF]
|
A Spreadsheet Approach to Information Visualization
In information visualization, as the volume and complexity of the data increases, researchers require more powerful visualization tools that enable them to more effectively explore multidimensional datasets. In this paper, we discuss the general utility of a novel visualization spreadsheet framework. Just as a numerical spreadsheet enables exploration of numbers, a visualization spreadsheet enables exploration of visual forms of information. We show that the spreadsheet approach facilitates certain information visualization tasks that are more difficult using other approaches. Unlike traditional spreadsheets, which store only simple data elements and formulas in each cell, a visualization spreadsheet cell can hold an entire complex data set, selection criteria, viewing specifications, and other information needed for a full-fledged information visualization. Similarly, inter-cell operations are far more complex, stretching beyond simple arithmetic and string operations to encompass a range of domain-specific operators. We have built two prototype systems that illustrate some of these research issues. The underlying approach in our work allows domain experts to define new data types and data operations, and enables visualization experts to incorporate new visualizations, viewing parameters, and view operations.
Chi, E. H., Barry, P., Riedl, J. and Konstan, J. (1997).
Symposium on Information Visualization (InfoVis '97), Phoenix, AZ 17-24. [PDF]
|
Supporting Fine-Grained Data Lineage in a Database Visualization Environment
The lineage of a datum records its processing history. Because such information can be used to trace the source of anomalies and errors in processed data sets, it is valuable to users for a variety of applications including investigation of anomalies and debugging. Traditional data lineage approaches rely on metadata. However, metadata does not scale well to fine-grained lineage, especially in large data sets. For example, it is not feasible to store all of the information necessary to trace from a specific floating point value in a processed data set to a particular satellite image pixel in a source data set. In this paper, we propose a novel method to support fine-grained data lineage. Rather than relying on metadata, our approach lazily computes lineage using a limited amount of information about the processing operators and the base data. We introduce the notions of weak inversion and verification. While our system does not perfectly invert the data, it uses weak inversion and verification to provide a number of guarantees about the lineage it generates. We propose a design for the implementation of weak inversion and verification in an object-relational database management system.
Woodruff, A. and Stonebraker, M. (1997).
13th Int'l Conf. on Data Engineering, Birmingham, England 91-102. [PDF]
|
In Search of Reliable Usage Data on the WWW
The WWW is currently the hottest test-bed for future interactive digital systems. While much is understood technically about how the WWW functions, substantially less is known about how this technology is used collectively and on an individual basis. This disparity of knowledge exists largely as a direct consequence of the decentralized nature of the Web. Since each user of the Web is not uniquely identifiable across the system and the system employs various levels of caching, measurement of actual usage is problematic. This paper establishes terminology to frame the problem of reliably determining usage of WWW resources while reviewing current practices and their shortcomings. A review of the various metrics and analyses that can be performed to determine usage is then presented. This is followed by a discussion of the strengths and weaknesses of the hit-metering proposal [Mogul and Leach 1997] currently in consideration by the HTTP working group. Lastly, new proposals based upon server-side sampling are introduced and assessed against the other proposal. It is argued that server-side sampling provides more reliable and useful usage data while requiring no change to the current HTTP protocol and enhancing user privacy.
Pitkow, J. E. (1997).
The Sixth International World Wide Web Conference, Santa Clara, California. [PDF]
|
A Spreadsheet Approach to Information Visualization
In information visualization, as the volume and complexity of the data increases, researchers require more powerful visualization tools that allow them to more effectively explore multi-dimensional datasets. In this paper, we show a novel visualization framework built upon the spreadsheet metaphor, where each cell can contain an entire dataset. Just as a numerical spreadsheet enables exploration of numbers, a visualization spreadsheet enables exploration of visualizations of data. Our prototype spreadsheet enables users to compare visualizations in cells using the tabular layout. Users can use the spreadsheet to display, manipulate, and explore multiple visual representation techniques for their data. By applying different operations to the cells, we show how visualization spreadsheets afford the construction of 'what-if' scenarios. The possible set of operations that users can apply consists of animation, filtering, and algebraic operators.
Chi, E. H., Konstan, J., Barry, P. and Riedl, J. (1997).
ACM Symposium on User Interface Software and Technology (UIST '97) 79-80. [PDF]
|
Characterizing World Wide Web Ecologies
One of the fastest growing sources of information today is the World Wide Web (WWW), having grown from only fifty sources of information in January of 1993 to over a half million four years later. The exponential growth of information within the Web has created an overabundance of information and a poverty of human attention, with users citing the inability to navigate and find relevant information on the Web as one of the biggest problems facing the Web today. The primary goal of the research presented here is to put forth new techniques and models that can be used to help efficiently manage people's attentional processes when dealing with large, unstructured, heterogeneous information environments. The primary model is based upon the desirability of items on the Web. This research searches for lawful patterns of structure, content, and use. Methods are developed to exploit these patterns to organize and optimize users' information foraging and sensemaking activities. The enhancements rely on prediction, categorization, and allocation of attention. Several methods are explored for inducing categorical structures for the WWW. Some of these enhancements involve clustering in a high-dimensional space of content, use, and structural features. Others derive from cocitation analysis methods used in the study of scientific communities. A user would also be aided by retrieval mechanisms that predict and return the most likely needed WWW pages, given that the user is attending to some given page(s). The approach of this research uses a spreading activation mechanism to predict the needed, relevant information, computed using past usage patterns, degree of shared content, and WWW hyperlink structure.
Pitkow, J. E. (1997).
Xerox PARC, Palo Alto, CA. [PDF]
|
The Evolutionary Ecology of Information Foraging
We present Information Foraging Theory as an approach to understanding how strategies and technologies for information seeking, gathering, and consumption are adapted to the flux of information in the cultural environment. The theory is developed within an evolutionary-ecological framework that includes analysis of adaptation, knowledge, and cognition. The theory is applied to field studies, controlled experiments, and technology design. We present the Information Diet Model and Information Patch Residence Time Model as optimization models of information foraging under some strong constraints. These are used to develop a specific production system model called ACT-IF that predicts the fine-grained information seeking and gathering behavior of participants using a sophisticated document browsing system. We also present the Overlapped Patch Foraging with Queueing Model to address situations in which information search and information handling may occur in parallel, the Extreme Variance Rule which deals with information foraging under deadlines and uncertainty, a general class of Dynamic Information Foraging Models, and the Hogg-Huberman Model of the phase space of cost functions for heuristic information search.
Pirolli, P. and Card, S. (1997).
Xerox PARC, Palo Alto, CA. [PDF]
|
Analyzing Differences Between Internet Information System Software Architectures
No Abstract Available
Abowd, G., Pitkow, J. and Kazman, R. (1996).
International Communications Conference 1996 (ICC 96), Dallas, Texas.
|
The Human, the Computer, the Task, and their Interaction: Analytic Models and Use-Centered Design
No Abstract Available
Card, S. K. (1996).
Mind Matters: A Tribute to Allen Newell. Erlbaum, Hillsdale, New Jersey.
|
An Investigation of Documents on the World Wide Web
We report on our examination of pages from the World Wide Web. We have analyzed data collected by the Inktomi Web crawler (this data currently comprises over 2.6 million HTML documents). We have examined many characteristics of these documents, including: document size; number and types of tags, attributes, file extensions, protocols, and ports; the number of in-links; and the ratio of document size to the number of tags and attributes. For a more limited set of documents, we have examined the following: the number and types of syntax errors and readability scores. These data have been aggregated to create a number of ranked lists, e.g., the ten most-used tags, the ten most common HTML errors.
Woodruff, A., Aoki, P. M., Brewer, E., Gauthier, P. and Rowe, L. A. (1996).
5th Int'l Conf. on the World Wide Web, Paris 963-980. [PDF]
|
Predicting Document Access in Large Multimedia Repositories
No Abstract Available
Recker, M. M. and Pitkow, J. E. (1996).
Transactions on Computer-Human Interaction 3(4): 352-375.
|
Emerging Trends in the WWW User Population
Vast amounts of attention and resources have recently been devoted toward the World Wide Web (WWW) [1], but relatively little research has been conducted examining Web usage and societal implications. With the goals of understanding the Web user population and promoting the Web as a viable surveying medium, the WWW User Surveys were initially conducted by Georgia Institute of Technology's Graphics, Visualization, and Usability Center during January of 1994. Subsequent surveys have been administered approximately every six months thereafter. Each survey is conducted for one month using the limited interactivity of the Web, where users point and click on responses within their Web browsers and submit results to a centralized server for processing. The first survey [6] was administered during January 1994 and received over 1,500 responses, which was a considerable number at that time. This response rate, along with tremendous positive feedback from the Web community, justified continuing the surveys. The second survey (October 1994) [7] employed an extended and refined question base, which included a set of questions developed by the University of Michigan's Hermes Team regarding consumer attitudes toward electronic commerce. The response rate continued to grow significantly, recording over 4,500 unique users. This tremendous growth has continued through the third and fourth surveys (April and October 1995) [4,5], with 13,000 and 23,300 users responding, respectively. Based upon current estimates, the last two surveys were completed by nearly one out of every thousand web users [2,3]. We expect this trend to continue for the fifth survey, the results of which will be available in mid-June 1996.
Pitkow, J. E. and Kehoe, C. M. (1996).
Communications of the ACM 39(6). [PDF]
|
Supporting the Web: A Distributed Hyperlink Database System
In its current implementation, the World-Wide Web lacks much of the explicit structure and strong typing found in many closed hypertext systems. While this property has directly fueled the explosive acceptance of the Web, it further complicates the already difficult problem of identifying usable structures and aggregates in large hypertext collections. These reduced structures, or localities, form the basis for simplifying visualizations of and navigation through complex hypertext systems. Much of the previous research into identifying aggregates utilizes graph theoretic algorithms based upon structural topology, i.e., the linkages between items. Other research has focused on content analysis to form document collections. This paper presents our exploration into techniques that harness both the topology and textual similarity between items as well as integrate new analyses based upon actual usage of Xerox's WWW space. Linear equations and spreading activation models are employed to arrange Web pages based upon functional categories, node types, and relevancy.
Pitkow, J. E. and Jones, R. K. (1996).
The Fifth International World Wide Web Conference, Paris, France. [PDF]
|
Silk from a Sow's Ear: Extracting Usable Structures from the Web
In its current implementation, the World Wide Web lacks much of the explicit structure and strong typing found in many closed hypertext systems. While this property probably relates to the explosive acceptance of the Web, it further complicates the already difficult problem of identifying usable structures and aggregates in large hypertext collections. These reduced structures, or localities, form the basis for simplifying visualizations of and navigation through complex hypertext systems. Much of the previous research into identifying aggregates utilizes graph theoretic algorithms based upon structural topology, i.e., the linkage between items. Other research has focused on content analysis to form document collections. This paper presents our exploration into techniques that utilize both the topology and textual similarity between items as well as usage data collected by servers and page meta-information like title and size. Linear equations and spreading activation models are employed to arrange Web pages based upon functional categories, node types, and relevancy.
Pirolli, P., Pitkow, J. and Rao, R. (1996).
Conference on Human Factors in Computing Systems, CHI '96, Vancouver, Canada. [PDF]
|
Surveying the Territory: GVU's Five WWW User Surveys
Five years is not very long on most historical scales, but for the World Wide Web (WWW) it constitutes a lifetime. A question almost as old as the web itself is, "Who is using it, and for what?" One way to answer this question is to use paper surveys, telephone surveys, or diaries, which are some of the same methods used to measure the audiences of other one-way media such as television and radio. However, something interesting happened in early 1994: the implementation of HTML Forms turned the web into a two-way medium, which made it possible to contact the audience directly. To test the viability of the web as a survey medium and collect preliminary data on the web population, the first GVU WWW User Survey was conducted in January 1994. Subsequent surveys have been conducted approximately every six months. The collection of responses from over 55,000 Web users over five surveys has given us a unique perspective on the advances in surveying technology and methodology and changes in the web population itself. In the following sections we discuss what we have learned in each of these areas.
Kehoe, C. M. and Pitkow, J. E. (1996).
The World Wide Web Journal 1(3). [PDF]
|
Flexible Information Visualization of Multivariate Data from Biological Sequence Similarity Searches
Information visualization faces challenges presented by the need to represent abstract data and the relationships within the data. Previously, we presented a system for visualizing similarities between a single DNA sequence and a large database of other DNA sequences [6]. Similarity algorithms generate similarity information in textual reports that can be hundreds or thousands of pages long. Our original system visualized the most important variables from these reports. However, the biologists we work with found this system so useful they requested visual representations of other variables. We present an enhanced system for interactive exploration of this multivariate data. We identify a larger set of useful variables in the information space. The new system involves more variables, so it focuses on exploring subsets of the data. We present an interactive system allowing mapping of different variables to different axes, incorporating animation using a time-axis, and providing tools for viewing subsets of the data. Detail-on-demand is preserved by hyperlinks to the analysis reports. We present three case studies illustrating the use of these techniques. The combined technique of applying a time axis with a 3D scatter plot and query filters to visualization of biological sequence similarity data is both powerful and novel.
Chi, E. H., Riedl, J., Shoop, E., Carlis, J., Retzel, E. and Barry, P. (1996).
IEEE Visualization '96 133-140, 477. [PDF]
|
The WebBook and the Web Forager: An Information Workspace for the World-Wide Web
The World-Wide Web has achieved global connectivity stimulating the transition of computers from knowledge processors to knowledge sources. But the Web and its client software are seriously deficient for supporting users' interactive use of this information. This paper presents two related designs with which to evolve the Web and its clients. The first is the WebBook, a 3D interactive book of HTML pages. The WebBook allows rapid interaction with objects at a higher level of aggregation than pages. The second is the Web Forager, an application that embeds the WebBook and other objects in a hierarchical 3D workspace. Both designs are intended as exercises to play off against analytical studies of information workspaces.
Card, S. K., Robertson, G. G. and York, W. (1996).
ACM Conference on Human Factors in Computing Systems (CHI '96) 111-117. [PDF]
|
Tioga-2: A Direct Manipulation Database Visualization Environment
This paper reports on user experience with Tioga, a DBMS-centric visualization tool developed at Berkeley. Based on this experience, we have designed Tioga-2 as a direct manipulation system that is more powerful and much easier to program. A detailed design of the revised system is presented, together with an extensive example of its application.
Aiken, A., Chen, J., Stonebraker, M. and Woodruff, A. (1996).
12th Int'l Conf. on Data Engineering, New Orleans, LA 208-17. [PDF]
|
Shifting the Possible
No Abstract Available
Stefik, M. (1996).
Xerox PARC, Palo Alto, CA.
|
Letting Loose the Light: Igniting Commerce in Electronic Publication
In "The Digital Library Project: The World of Knowbots" in Part 1, Robert Kahn and Vinton Cerf ask, "If a thousand books are combined on a single CD-ROM and the acquirer of the CD-ROM only intends to read one of them, what sort of royalty arrangement is appropriate to compensate the copyright owners? How would compensation be extended for cases in which electronic copies are provided to users?" Their questions show how, in 1988, issues about copyright protection and payment for using information arose in the context of early CD-ROM distribution. By 1994 copyright issues had not only not been settled, they were coming to a boil. Laura Fillmore's effort to build a successful publishing business on the Internet reveals the limitations of what was practical in May of 1994. Although digital works were being sold on the Internet, provisions for commerce were primitive. Furthermore, the ease of copying digital works had led many people to believe that digital information should be free. Fast access to the network had made trading programs or other data as easy as mixing songs on audio tape. In short, it had become much simpler for network users to infringe copyright than to uphold it. This is the context for the oft-quoted statement by John Perry Barlow of the Electronic Frontier Foundation, "Copyright is dead." Advocates of free information argue that because you don't lose the original when you make a copy of a digital work, there should be no charge for copying information. The conventional wisdom among publishers in late-1994, when this article was written, was that digital containers for software were inherently leaky vessels and that no viable solution w | | | |