Xerox PARC's Bayou Project

Bayou was a project at the Xerox Palo Alto Research Center (PARC) from July 1993 until December 1997. Enclosed below is information about the project including a list of publications.

Project Members

Overview

The Bayou system was designed to support collaboration among users who cannot be or choose not to be continuously connected. Network connections may at times be too slow, too expensive, or too faulty for users to effectively utilize, or it may not be possible to establish a network connection at all. This can occur, for example, when collaborators are widely dispersed across the Internet or work on portable computers. In such a setting, weak consistency replication of shared data is the key to obtaining high data availability, good access performance, and good scalability.

A major premise that distinguished the Bayou effort from previous work on replicated data algorithms is that disconnected operation by mobile users is a common, rather than exceptional, case. Earlier replicated data algorithms, such as those based on maintaining strong data consistency by atomically updating a quorum of copies or based on server-initiated callbacks for cached data invalidation, do not work well in a frequently-partitioned network. Algorithms that had been designed for operation in a partitioned network either enforce overly pessimistic locking on replicas or else provide few consistency guarantees and little support for resolving conflicts.

Bayou uses weak consistency replication techniques to manage replicas of shared calendars, electronic mail messages, databases, documents, and other artifacts that are central to collaboration. To maximize availability, users can read and write any accessible replica. Bayou's design focused on supporting application-specific mechanisms to detect and resolve the update conflicts that naturally arise in such a system, ensuring that replicas move towards eventual consistency, and defining a protocol by which the resolution of update conflicts stabilizes. It includes novel methods for application-specific conflict detection and per-write conflict resolution based on client-provided merge procedures.

Thus, Bayou's key features are:

Two implementations of the Bayou server and the client libraries were developed. The first implementation is based on POSIX complicant C, and currently runs on both SunOS and Linux. The second implementation is in Java using commercial databases through JDBC. Bayou clients and servers can communication with each other via both wired network connections and wireless infrared channels, such as those commonly found on laptops.

Bayou applications include the Bayou calendar application, a bibliographic database, and the BXMH electronic mail reader. BXMH permits a user to have multiple replicas of his mailbox, such as one on a workstation and another on a laptop, or permits several users to share a replicated mailbox. Users can concurrently update mailboxes, such as by deleting messages or moving them into folders, with confidence that all replicas will eventually converge to a consistent state.

Publications

Each paper is available in multiple formats. The PostScript rendition is best, followed by the HTML, then the plain ASCII. Please see our CopyRight Notice before downloading copyrighted materials.

``The Case for Non-transparent Replication: Examples from Bayou''

D. B. Terry, K. Petersen, M. J. Spreitzer, and M. M. Theimer.
IEEE Data Engineering, December 1998, pages 12-20.

Abstract: Applications that rely on replicated data have different requirements for how their data is managed. For example, some applications may require that updates propagate amongst replicas with tight time constraints, whereas other applications may be able to tolerate longer propagation delays. Some applications only require replicas to interoperate with a few centralized replicas for data synchronization purposes, while other applications need communication between arbitrary replicas. Similarly, the type of update conflicts caused by data replication varies amongst applications, and the mechanisms to resolve them differ as well. The challenge faced by designers of replicated systems is providing the right interface to support cooperation between applications and their data managers. Application programmers do not want to be overburdened by having to deal with issues like propagating updates to replicas and ensuring eventual consistency, but at the same time they want the ability to set up appropriate replication schedules and to control how update conflicts are detected and resolved. The Bayou system was designed to mitigate this tension between overburdening and underempowering applications. This paper looks at two Bayou applications, a calendar manager and a mail reader, and illustrates ways in which they utilize Bayou's features to manage their data in an application-specific manner.

Paper available in PDF (68K), Postscript (199K), HTML (47K), and plain text (33K), .

Copyright © 1998 by Institute of Electrical and Electronics Engineers (IEEE).

``Flexible Update Propagation for Weakly Consistent Replication''

K. Petersen, M. J. Spreitzer, D. B. Terry, M. M. Theimer, and A. J. Demers.
Proceedings of the 16th ACM Symposium on Operating Systems Principles (SOSP-16), Saint Malo, France, October 5-8, 1997, pages 288-301.

Abstract: Bayou's anti-entropy protocol for update propagation between weakly consistent storage replicas is based on pair-wise communication, the propagation of write operations, and a set of ordering and closure constraints on the propagation of the writes. The simplicity of the design makes the protocol very flexible, thereby providing support for diverse networking environments and usage scenarios. It accommodates a variety of policies for when and where to propagate updates. It operates over diverse network topologies, including low-bandwidth links. It is incremental. It enables replica convergence, and updates can be propagated using floppy disks and similar transportable media. Moreover, the protocol handles replica creation and retirement in a light-weight manner. Each of these features is enabled by only one or two of the protocol's design choices, and can be independently incorporated in other systems. This paper presents the anti-entropy protocol in detail, describing the design decisions and resulting features.

Paper available in HTML and PostScript.

Copyright © 1997 by the Association for Computing Machinery, Inc.

``Designing and Implementing Asynchronous Collaborative Applications with Bayou''

W. K. Edwards, E. D. Mynatt, K. Petersen, M. J. Spreitzer, D. B. Terry, and M. M. Theimer.
Proceedings of the Tenth ACM Symposium on User Interface Software and Technology (UIST), Banff, Alberta, Canada, October, 1997, pages 119-128.

Abstract: Asynchronous collaboration is characterized by the degree of independence collaborators have from one another. In particular, collaborators working asynchronously typically have little need for frequent and fine-grained coordination with one another, and typically do not need to be notified immediately of changes made by others to any shared artifacts they are working with. We present an infrastructure, called Bayou, designed to support the construction of asynchronous collaborative applications. Bayou provides a replicated, weakly-consistent, data storage engine to application writers. The system supports a number of mechanisms for leveraging application semantics; using these mechanisms, applications can implement complex conflict detection and resolution policies, and choose the level of consistency and stability they will see in their databases. We present a number of applications we have built or are building using the Bayou system, and examine how these take advantage of the Bayou architecture.

Paper available in PDF (122K) and Postscript (310K) .

Copyright © 1997 by the Association for Computing Machinery, Inc.

``Dealing with Server Corruption in Weakly Consistent, Replicated Data Systems''

M. J. Spreitzer, M. M. Theimer, K. Petersen, A. J. Demers, and D. B. Terry.
Proceedings of the Third Annual ACM/IEEE International Conference on Mobile Computing and Networking (MobiCom '97), Budapest, Hungary, September 1997, pages 234-240.

Abstract: Providing high availability and the ability to share data despite the weak connectivity of mobile computing raises the problem of trusting replicated data servers that may be corrupt. This is because servers must be run on portable computers, and these machines are less secure and thus less trustworthy than those traditionally used to run servers. We describe the kinds of problems one must be prepared to deal with, noting that even users of secured, non-portable computers are at risk if servers trust all authorized peers. We show that high availability through data replication on portable computers need not be mutually exclusive with various levels of data security one might want. We give three solutions to this trust problem, achieving progressively higher levels of security with progressively higher costs.

Paper available in Postscript (121K) plus there's an Errata in HTML (2K) .

``Dealing with Server Corruption in Weakly Consistent Replicated Data Systems''

M. J. Spreitzer, M. M. Theimer, K. Petersen, A. J. Demers, and D. B. Terry.
Wireless Networks, Baltzer Science Publishers, Bussum, The Netherlands, 1999 Vol. 5 No. 5, pages 357-371.

Abstract: Providing high availability and the ability to share data despite the weak connectivity of mobile computing raises the problem of trusting replicated data servers that may be corrupt. This is because servers must be run on portable computers, and these machines are less secure and thus less trustworthy than those traditionally used to run servers. We describe the kinds of problems one must be prepared to deal with, noting that even users of secured, non-portable computers are at risk if servers trust all authorized peers. We show that high availability through data replication on portable computers need not be mutually exclusive with various levels of data security one might want. We give three solutions to this trust problem for a simple example architecture, achieving progressively higher levels of security with progressively higher costs. We then show how to solve this trust problem for the more complex architecture of Bayou, a weakly consistent replicated data system we built at Xerox PARC.

Paper available through Baltzer's web site.

``Bayou: Replicated Database Services for World-wide Applications''

K. Petersen, M. J. Spreitzer, D. B. Terry, and M. M. Theimer.
Proceedings Seventh ACM SIGOPS European Workshop (EuroSIGOPS '96), Connemara, Ireland, September 1996, pages 275-280.

Abstract: The Bayou architecture provides scalability, availability, extensibility, and adaptability features that address database storage needs of world-wide applications. In addition to discussing these features, this paper presents Bayou's mechanisms for permitting the replicas of a database to vary dynamically without global coordination. Key is the use of weak consistency replication among autonomous machines and strict adherence to the tenet that no oper- ation should involve more than two machines.

Paper available in Postscript (109K) and plain ASCII (22K). This paper is also available in Postscript (109K) from the workshop site.

Copyright © 1996 by the Association for Computing Machinery, Inc.

``Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System''

D. B. Terry, M. M. Theimer, K. Petersen, A. J. Demers, M. J. Spreitzer, and C. Hauser.
Proceedings 15th Symposium on Operating Systems Principles (SOSP-15) , Cooper Mountain, Colorado, December 1995, pages 172-183.

Abstract: Bayou is a replicated, weakly consistent storage system designed for a mobile computing environment that includes porta- ble machines with less than ideal network connectivity. To maxi- mize availability, users can read and write any accessible replica. Bayou's design has focused on supporting application-specific mechanisms to detect and resolve the update conflicts that natu- rally arise in such a system, ensuring that replicas move towards eventual consistency, and defining a protocol by which the resolu- tion of update conflicts stabilizes. It includes novel methods for conflict detection, called dependency checks, and per-write con- flict resolution based on client-provided merge procedures. To guarantee eventual consistency, Bayou servers must be able to roll- back the effects of previously executed writes and redo them according to a global serialization order. Furthermore, Bayou per- mits clients to observe the results of all writes received by a server, including tentative writes whose conflicts have not been ultimately resolved. This paper presents the motivation for and design of these mechanisms and describes the experiences gained with an initial implementation of the system.

Paper available in Postscript (271K) , `gzip`ped Postscript (75K) , HTML (76K + 29K inline GIFs) , and plain ASCII (73K). You might also consult the Proceedings on ACM's server.

Towards a Quality of Service Model for Replicated Data Access

D. B. Terry.
Proceedings Second International Workshop on Services in Distributed and Networked Environments (SDNE) , Whistler, British Columbia, Canada, June 1995, pages 118-122.

Abstract: This paper introduces the notion of a quality of service model for read and write access to a replicated database or file system. The goal is to provide clients with a choice of service guarantees and to separate the specification of a client's needs from the mechanisms implemented to meet those needs. An analogy is drawn with the quality of service models being explored for communications protocols.

Paper is available in Postscript (72K) and plain ASCII (14K).

Copyright © 1995 Institute of Electrical and Electronics Engineers.

``The Bayou Architecture: Support for Data Sharing among Mobile Users''

A. J. Demers, K. Petersen, M. J. Spreitzer, D. B. Terry, M. M. Theimer, and B. B. Welch.
Proceedings of the Workshop on Mobile Computing Systems and Applications, Santa Cruz, California, December 1994, pages 2-7.

Abstract: The Bayou System is a platform of replicated, highly-available, variable-consistency, mobile databases on which to build collaborative applications. This paper presents the preliminary system architecture along with the design goals that influenced it. We take a fresh, bottom-up and critical look at the requirements of mobile computing applications and carefully pull together both new and existing techniques into an overall architecture that meets these requirements. Our emphasis is on supporting application-specific conflict detection and resolution and on providing application-controlled inconsistency.

Paper is available in Postscript, 100K , `gzip`ped Postscript, 31K , and HTML, 34K .

Copyright © 1994 Institute of Electrical and Electronics Engineers.

``Dealing with Tentative Data Values in Disconnected Work Groups''

M. M. Theimer, A. J. Demers, K. Petersen, M. J. Spreitzer, D. B. Terry, and B. B. Welch.
Proceedings of the Workshop on Mobile Computing Systems and Applications, Santa Cruz, California, December 1994, pages 192-195.

Abstract: This paper describes a problem of weakly-consistent replicated data systems used in support of disconnected groups of people. The problem concerns actions and updates derived from tentative data updates that are ulti- mately determined to be in conflict. While some such actions and updates can be automatically resolved, many require human intervention. Furthermore, although some file and database systems support internal conflict detec- tion and resolution, derived actions may be external to those systems, implying that human users must ensure that proper consistency is maintained between independent components of the system. The entire problem becomes exascerbated when disconnected work groups are taken into account, where tentative data values may be seen and acted upon by multiple people.

Paper available in Postscript (77K) , `gzip`ped Postscript (23K) , HTML (19K + 3K inline GIF).

Copyright © 1994 Institute of Electrical and Electronics Engineers.

``Session Guarantees for Weakly Consistent Replicated Data''

D. B. Terry, A. J. Demers, K. Petersen, M. J. Spreitzer, M. M. Theimer, and B. B. Welch.
Proceedings International Conference on Parallel and Distributed Information Systems (PDIS), Austin, Texas, September 1994, pages 140-149.

Abstract: Four per-session guarantees are proposed to aid users and applications of weakly consistent replicated data: Read Your Writes, Monotonic Reads, Writes Follow Reads, and Monotonic Writes. The intent is to present individual applications with a view of the database that is consistent with their own actions, even if they read and write from various, potentially inconsistent servers. The guarantees can be layered on existing systems that employ a read-any/write-any replication scheme while retaining the principal benefits of such a scheme, namely high-availability, simplicity, scalability, and support for disconnected operation. These session guarantees were developed in the context of the Bayou project at Xerox PARC in which we are designing and building a replicated storage system to support the needs of mobile computing users who may be only intermittently connected.

Copyright © 1994 Institute of Electrical and Electronics Engineers.

Paper available in Postscript (171K) , `gzip`ped Postscript (45K) , HTML (56K + 2K inline GIF) , and plain ASCII (52K) .

Copyright Notice

The documents distributed by this server have been provided by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a noncommercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

ACM copyright: Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org.

IEEE copyright: This material is posted here with permission of the IEEE. Such permission of the IEEE does not in any way imply IEEE endorsement of any of Xerox's products or services. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by sending a blank email message to info.pub.permission@ieee.org.

Why the name ``Bayou''?

We had our reasons.


Last edited June 25, 1999 by Doug Terry.