The first paper. Describes how to cluster an entire document collection hierarchically, and allow the user to gather clusters at different levels and rescatter them. Implementation details are provided.
Douglass Cutting, David Karger, Jan Pedersen, and John W. Tukey. Scatter/Gather: A Cluster-based Approach to Browsing Large Document Collections, Proceedings of the 15th Annual International ACM/SIGIR Conference, Copenhagen, 1992. postscript (209K) html
Describes performance improvements for the system presented in the first paper.
Douglass Cutting, David Karger, and Jan Pedersen. Constant Interaction-Time Scatter/Gather Browsing of Large Document Collections. Proceedings of the 16th Annual International ACM/SIGIR Conference, Pittsburgh, PA, 1993. postscript (160K) html
Describes an evaluation of the effectiveness of the system browsing the contents of a large collection.
Peter Pirolli, Patricia Schank, Marti A. Hearst, and Christine Diehl, Scatter/Gather Browsing Communicates the Topic Structure of a Very Large Text Collection, Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI), May 1996. html
The first paper discussing the idea of using Scatter/Gather on retrieval results, that is, documents retrieved as the result of a query. Anecdotes only.
Marti A. Hearst, David R. Karger, and Jan O. Pedersen, Scatter/Gather as a Tool for the Navigation of Retrieval Results, Proceedings of the 1995 AAAI Fall Symposium on Knowledge Navigation. postscript (2.2M)
Presents the discovery that when clustering is applied to documents retrieved as the result of a query, most of the relevant documents tend to appear in one of the clusters. This implies that picking the best cluster will find most of the relevant documents. The observation is verified experimentally.
Marti Hearst and Jan Pedersen, Reexamining the Cluster Hypothesis: Scatter/Gather on Retrieval Results, Proceedings of the 19th Annual International ACM/SIGIR Conference, Zurich, August 1996. postscript (1.3M) html
Anecdotal description of users' responses to Scatter/Gather as part of an interface that also uses TileBars.
Marti Hearst, Jan Pedersen, Peter Pirolli, Hinrich Schuetze, Gregory Grefenstette, and David Hull, Four TREC-4 Tracks: the Xerox Site Report, Proceedings of the Fourth Text REtrieval Conference (TREC-4), Nov 1-3, Arlington, VA, 1996. postscript (11M)
A videotape that shows both Scatter/Gather and TileBars.
Marti Hearst and Jan Pedersen, Revealing Collection Structure through Information Access Interfaces, Video track of the Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), Montreal, Canada, Volume II, 2047-2048, 1995.
Describes an experiment comparing several systems with different approaches to showing clusters. Finds that an approach that shows the contents of the titles, as does Scatter/Gather, works better for naive users than systems that show clusters graphically.
Adrienne J. Kleiboemer, Manette B. Lazear, and Jan O. Pedersen, Tailoring a Retrieval System for Naive Users, Proceedings of the Fifth Annual Symposium on Document Analysis and Information Retrieval (SDAIR), Las Vegas, NV, 1996.