Abstract 3 (003)Enabling productive online collaboration that enhances the ability of people to work toward a common goal or examine a common problem has been one of the fundamental goals of computing since Vannevar Bush's "As We May Think" appeared in 1945. Douglas C. Engelbart has been pursuing this goal since the sixties and has recently defined what he calls an "Open Hyperdocument System" as the foundation for its solution. We examine the need for such a collaborative environment and the barriers to its development and deployment. Consideration of these barriers has lead to the development of a system for ubiquitous collaboration which we call NODAL. 3A (004) NODAL is designed as a general, document-oriented distributed database and filesystem with a data model that allows addressing, searching and linking of content of any kind from any document. The data model defines documents as directed graphs of content nodes and provides adaptable addressing, security, privacy and version control at the granularity of these nodes. Moreover, it is built on a distributed client-server (or peer-to-peer) communication model that seamlessly shifts from synchronous, real-time interaction to asynchronous or intermittently-connected interaction. Finally, it is designed to extensibly support a wide range of input and output formats so that it will interoperate easily with systems using existing standard document formats and exchange protocols, including even applications unaware of its existence. It is hoped that this simple system will become a standard, universal component of the infrastructure of information management and exchange and thus allow for flexible, productive collaboration between willing people for any purpose, in any place, using any tools. 3B (005) |
Copyright 2001 SRI International, Inc. All Rights Reserved. 3C (0188)
Douglas Engelbart's Augmentation Research Center (ARC) at SRI International, perhaps most famous for developing the computer mouse, was a crucible for the evolution of interactive computing (Bardini, 2000). Engelbart and his team developed and deployed the oNLine System (NLS), a computer system that implemented a graphically rich, interactive, collaborative computing and software development environment in the late 1960's, an era when punched cards and batch processing were the state of the art for accessing computing resources. At its very core, NLS embedded a completely shared working environment so that anybody working on any part of the system was always collaborating with his/her colleagues, either actively or passively. NLS was a system of ubiquitous collaboration. 5A (007)
The ideas behind NLS catalyzed a revolution in the development of computing environments and personal computing, but for many reasons both technical and social, the system itself was largely ignored and the vision was transformed in ways that diminished the essential role of collaborative work. 5B (008)
As with many such efforts, a "good enough" solution was presented that stood in the way of Engelbart's revolutionary vision of ubiquitous collaboration. On the early Arpanet, e-mail was adopted as the universal tool for collaboration, in large part because it was a simple technology that was much more evolutionary than the radically revolutionary system Engelbart had assembled. Email became the killer app for the early Arpanet (Bardini, 2000) and is in some ways still the primary collaborative technology in wide deployment today. 5C (009)
The dream of ubiquitous collaboration still remains today, a fundamental part of what Engelbart has called his Unfinished Revolution (Engelbart, 2000). Engelbart was and is convinced that the only way to really tackle the most difficult basic problems that we are presented with as individuals, groups and societies is through a kind of teamwork in which the accumulated knowledge of the group is necessarily greater than that of any individual. An individual's role in the collective enterprise is then to augment the collective knowledge base by observation and synthesis. The networked computing environment that is now widely deployed as the Internet and World Wide Web is but one piece of a larger system that is needed to truly enable this collaborative knowledge development. Collaboration is currently a kind of afterthought in much knowledge work, with each individual locked in his or her own cubicle between group meetings, exchanging a few email messages when necessary. 5D (010)
Imagine instead a work environment in which every person is organically connected to a team of common purpose no matter where they are or what software they are using. An individual's ideas, decisions and actions become amplified and enhanced by an automatic, seamless integration with the ideas, decisions and actions of others. And the collective enterprise becomes centered on a deeply cross-linked repository of accumulated knowledge and action, which is completely open for exploitation and application to any new team activity. Without this kind of infrastructure in place, the potential for radical enhancement of efficiency and productivity that has always been the promise of networked computing can never be realized. But this will only happen when the barriers to the accumulation of group knowledge and group processes are dismantled. 5E (011)
In order to move toward ubiquitous collaboration, we must understand both the technical and social barriers that hinder collective work. In order to frame this analysis we will assume that the "worker" is someone Engelbart has called a "knowledge worker," someone whose job or avocation involves the exploitation, organization and production of knowledge. It is a class that includes virtually all white-collar workers and managers including CEOs, software developers, lawyers, reporters, researchers, virtually anyone doing any kind of computer or paperwork. 6A (013)
A first technical barrier is the tyranny of format, the fact that an enormous amount of what could be useful and reusable knowledge is trapped inside unshareable, proprietary or inaccessible data formats. Much of this reality is driven by the fact that virtually every new application and new version of an application involves the creation of a new, incompatible data storage format. The barrier to sharing even the products of work, much less the process, becomes one of incompatibility of application formats. 7A (015)
Another problem can be described as the dis-integration of knowledge, in which the concentration of work on personal computers with distinct data stores means that in many cases, the preponderance of potentially exploitable team knowledge is distributed in a completely inaccessible way on un-indexed and often inaccessible individual systems. Many large enterprises attempt to solve this problem with central data stores and shared filesystems to ensure that the products of work are centrally maintained and accessible. Unfortunately, these systems become imposed solutions that often do little to significantly enhance the productivity of individual users. They are designed and organized for the benefit of management and rarely exhibit the flexibility and open access policies that are necessary for adaptive collaboration. Ted Nelson has gone a step further and blamed the emphasis on "files" themselves as an aspect of this problem, noting that to simply change a particular date (e.g. a proposal deadline that has been modified) may require one to modify many different files when, in fact, they should all be using a shared resource. 7B (016)
A further barrier to collaboration is the un-integration of implicit knowledge, in which knowledge that is not actively and directly integrated into a system is lost. Consider the ubiquity and importance of meetings in any kind of collective work, from impromptu gatherings at the water-cooler, to brainstorming sessions, to formal meetings. Every aspect of a meeting that goes unrecorded cannot be integrated into a collective repository and is thus inaccessible to anyone but the participants. And even for those present, much of the process of producing good ideas and rejecting bad ones is lost to recall. The barrier here is the effort often required to ensure that useful dialogues and observations are both recorded and then integrated into a collective knowledge base in a useful way. It is often simple to record and store meetings either with a human note-taker or a digital recording, but without content-based retrieval mechanisms that provide the means to exploit these recordings, and the ability to integrate them into the products that are being developed by the team, it may be of little advantage to do so. 7C (017)
An aspect of this obstacle that is worthy of consideration on its own is the transience of history, that is, the loss of the process of development of collective knowledge to this unrecorded and imperfectly recalled past. It is clear how this is an aspect of the meeting scenario above, but even in situations that are completely computer-bound, the development of ideas, and thus much of the reasoning and justification for decisions, can be difficult to trace or reconstruct. For example, the relationship between email discussing a document and the document itself is often never made explicit, and is thus inaccessible to future collaborators or those trying to reconstruct an argument. 7D (018)
So far the discussion has focused on technical problems, when many of the barriers to collaboration are primarily social in origin. The first point that needs to be made about social barriers is that most revolve around issues of trust. Communities form and become self-sustaining only when the members trust each other. Contributors lose motivation or interest in collaboration when social conventions that create and sustain trust break down. The following discussion will consider a number of different barriers that form around issues of trust. 8A (020)
The first of these can be thought of as the detachment of interaction that creates distance in the personal relationships formed or maintained using certain forms of typewritten communication, such as electronic mail. Email users quickly recognize how easily misinterpretations or escalations of conflicts in email can create deep rifts in relationships ? even in cases where the email exchanges are only a small part of the relationship. There have been many theories proposed as to why this is so, but it is certainly some combination of the lack of immediacy of email and the terse, uninflected mode of correspondence that seem to naturally evolve. Emoticons (e.g. smileys :-) and their cousins) are a social construct intended to provide a counterbalance to this tendency. 8B (021)
Another barrier to trust is the evaporation of credit that happens in environments where digital products can be easily copied and disguised. Since so much of the social and economic incentive for collaboration is tied to proper attribution and even payment for contributions, the lack of adequate audit trails can obscure the actual relationships between digital products and their key contributors. This can then stand in the way of an individual's or corporation's commitment to the social contracts that encourage contribution to a collective endeavour. Volunteer-oriented open source projects are encouraged to be very generous with public recognition of individual contributions, since such recognition is often the primary currency in the open source community. In a certain sense, the recognition of the need to encourage collaboration by creating means of assigning credit and reward systems for contributions is the foundation for modern patent and intellectual property law. The current focus of artists and distributors on digital rights management is a result of breakdowns in existing credit-based reward structures when copyrighted works are distributed in easily copied digital form. 8C (022)
Another barrier is the maintenance of conflict, in which unmanaged electronic "collaboration" often actually promotes the hardening of points of disagreement rather than consensus. This effect may arise from some of the same factors that lead to disconnection in electronic mail exchanges, the overall lack of the secondary (i.e. nonverbal) modes of communication that people use in conversation. In general, the kinds of interactions that seem to promote consensus building are difficult to communicate with text-based communication. Telephone and video conferencing are technologies that have been deployed to bring nuance to distributed consensus-building, but, as was previously noted, the records of these conversations are rarely integrated into a project's digital knowledge base. 8D (023)
While there are certainly other such barriers, we believe that these comprise the most important ones to overcome. To move forward though, we will limit our immediate focus. Ultimately, this is a paper about a technology, so we can't really hope to actually remove these social barriers with technology. We will instead suggest that a strategy of co-evolution that combines careful social monitoring with an adaptive set of technologies capable of managing networks of trust and sharing will be a necessary step on the path to their solution. 8E (024)
How then shall we achieve the goal of ubiquitous collaboration? In the tradition of Engelbart's Unfinished Revolution, we propose an updated, subtler revolution in the development of computer-mediated collaboration. We will describe the development of a toolset that can destroy most of the barriers to collaboration described above and lower the others. 9A (026)
What is required is a new kind of distributed database infrastructure, which could be called an "SQL for the Millenium". It will be a document-oriented, distributed, multi-user database language and API designed primarily for the shared composition, publication and dissemination of rich, structured documents. It will consist of a data modeling language appropriate for representing, addressing, and interchanging the complete range of document formats needed to populate a rich, accessible knowledge management infrastructure. It will provide implementation-neutral APIs that will support simple, filesystem-like interaction with existing applications as well as a persistence and communications infrastructure upon which new, inherently collaborative applications can be created. It will provide means for interacting with the knowledge repositories both synchronously and asynchronously, with a wide array of existing protocols including at minimum HTTP, FTP, and electronic mail. It will provide automatic, integrated search capabilities so that all accumulated knowledge is accessible in whatever context the user needs. It will automatically audit all interactions to ensure that both history and attribution of modifications will be maintained and recoverable. It will contain a flexible, understandable and trustable security and privacy management system at its very core. It will be made available as a high-quality open source sample implementation including both client and server. Finally, it will never be more complicated than is actually necessary to "get the job done." 9B (027)
In short, it will provide a means by which both developers and end users can incrementally (on a tool-by-tool, need-by-need basis) move to a transparent, universal and ubiquitous collaborative environment. 9C (028)
Currently, the term groupware is used to describe a set of applications for coordinating group activities. These typically consist of a number of messaging and discussion archiving systems (including email, online forum and instant-messaging or chat tools) and a small set of "shared" applications such as calendar and workflow tracking. In general, these systems are moving to web-based solutions and suffer from a general lack of deep integration. 11A (031)
A conceptual descendant of Lotus Notes, Groove is a peer-to-peer infrastructure for group collaboration. It has received a great deal of adulatory press, and is seen by many to be the next generation of online collaboration. It seems to actually address many of the principles outlined in the manifesto above. It does however suffer from two flaws that would seem to disqualify it from consideration as a candidate for our system: it is intimately tied to Microsoft Windows, and it is a proprietary, owned system. 12A (033)
Subversion was designed and is being built as the next generation of the Concurrent Version System (CVS). A client-server application built around WebDAV and the Apache Web Server, it uses much the same simple, intuitive interaction model as CVS. A user checks out a directory tree onto his own filesystem, makes modifications using whatever editing tools he prefers and then checks in the files. While editing files locally, changes from the repository are only incorporated into the local copy on request, but such an update will automatically merge non-conflicting changes, even in the same files as have been edited. The client-server protocol depends on file difference summaries being transmitted, so an automatic file difference discovery engine is an integral part of the system. 13A (035)
The Network-Oriented Document Abstraction Language (NODAL) is a set of standards and protocols for a client-server or peer-to-peer system supporting ubiquitous collaboration. It is intended as a simple, general, language-independent API and protocol set that can unify a wide variety of approaches to collaborative knowledge development and exploitation. It is designed as a client-server system (although it should be possible to balance the protocols to support true peer-to-peer implementations) in which the client-side APIs create a directly manipulable, automatically updated, local copy of a working subset of the server-side data repository. The relationships between the local copies and the remote sources are automatically maintained, updated and audited so that changes are never lost and the attributed history of all objects is accessible. It will be supported by a number of high-quality, open source sample implementations. NODAL is intended as a candidate implementation and standard upon which one can build Engelbart's Open Hyperdocument System. 15A (038)
In order to achieve the radical implementation language independence required, the system is designed around a very simple data modeling language that maps easily to existing programming languages and data formats. A document is modeled as a graph of typed nodes with serialization and de-serialization methods that support legacy formats. Any of these typed nodes or their contents are addressable either within the document context or directly via globally unique identifiers. Using this addressability, the data model directly supports arbitrary, granular reuse of content within documents of almost any type, thus breaking down the data format barriers described above. 15B (039)
For example, a plain text document is modeled as a sequence of character strings, each of which represents a line in the document. The serialization method simply writes out the character strings separated by whichever line-ending sequence is appropriate for the target system. HTML, XML and SGML files have more complicated data models but all serialize to textual form, while other document types such as Microsoft Word, PowerPoint and Excel, audio, video and image files serialize to binary formats. The data models for these formats are defined with NODAL document schema files and the serialization and de-serialization methods are provided as component plugins for both clients and servers. Using this design, document editing and viewing applications written or extended with the NODAL library can operate directly on the data model with no need to ever interface with external data formats. 15C (040)
Along with the automatic change auditing, a simple, flexible security and privacy system is inherent in the APIs and protocol. Using these facilities, a user can completely control access, editing and management of any content he creates even within shared documents and thus have full confidence that sharing his knowledge with these tools is under his control. 15D (041)
In the description below, we will offer sample XML encodings and IDL interface descriptions. These interface descriptions, however, represent only one of a number of possible language bindings of the NODAL networked data modeling system. Since it is important to find efficient, natural bindings for NODAL data models in whichever implementation language is chosen, we assume that each implementation language will offer a potentially distinct language binding. As an example, even though the Java binding will naturally track very close to the IDL descriptions, the C++ binding is likely to depend highly on templates and the STL in order to be maximally efficient and usable. The IDL examples, however, do represent fragments of a language-independent IDL binding that can be used to define interfaces for CORBA, COM, XPCOM, XML-RPC or SOAP services. 15E (042)
While it is true that the system has been designed so that both existing and new applications may build directly on the NODAL APIs to support direct collaboration, that path will be unlikely to achieve true ubiquity since it would require users to move to these new, NODAL-based systems. To avoid this requirement, NODAL is designed so that it can be used as a simple filesystem, with file reading and saving via FTP, HTTP or other distributed filesystem protocols. The simple act of storing a file into the repository would trigger the integration of that file's contents into the shared, structured APIs. In this way, we hope to minimize the impact on reluctant users while offering radically improved functionality to those who move to the new applications. 15F (043)
The NODAL system is centered around a simple, natural data modeling language based on individually addressable, typed nodes in a document graph. All objects in a graph have an associated type, either an atomic or a node type. Atomic objects are the typical data typing primitives such as integers, characters and floating point numbers. Nodes are collections of objects and provide interfaces for accessing their components which is equivalent to traversing the edges of a graph. Using this model, a document consists of all of the nodes reachable from a single identified node, the document root. 16A (045)
The collection primitives chosen as building blocks for these graphs are designed to be a minimal set that is sufficient to provide both programmability and a natural mapping to existing document types. This design is derived directly from "Groves" ( the Graph Representation Of property ValuEs), a data-modeling standard for SGML developed as part of the HyTIME standard. While it could have been used as a building block for subsequent standards such as XML, it has been largely ignored by the larger community, possibly due to its perceived complexity and non-standard nomenclature. The modeling language described here is a radical generalization of Groves. Groves proponents will recognize the lineage and hopefully forgive the liberties we have taken in the name of simplicity and familiarity for the larger body of computer scientists and programmers. 16B (046)
Any data model must begin with a set of primitive or atomic types. These form a set of irreducible components usable as property or attribute values for the graph nodes. While making no promises as to universality, we believe the atomic types we have selected cover a sufficiently wide range of applications as to form a useful, evolvable starting point for the NODAL system. In particular, they form a sufficient subset for implementation of the NODAL system itself. 17A (048)
Type Name | Standard Reference | Description |
---|---|---|
A unique point in time, with at least millisecond accuracy for 1900-2100AD 17U (0251) |
||
A time interval with at least millisecond accuracy up to 100 years. 17X (0254) |
||
Initially, these atomic types are unmodifiable and inextensible and thus limited in their ability to express finer constraints on a data model. We anticipate the extension of these basic types with facets that constrain their values and representations in various ways. Using this extension model, each facet applied to a data type represents a restriction of the value range of that type. Therefore we can eventually define all of the numeric types above as restrictions of a Number type, and allow end users to define their own restrictions of these types (e.g. a 16-bit signed integral type). 17132 (075)
Note that there is no string type among these primitives. This omission is deliberate. In the NODAL model, a string is node comprising a sequence of characters. As we shall see, this allows any individual character or subsequence of a string to be addressable and reusable. Were we to represent strings at the level of an atomic type they would be, by definition, un-decomposable. Moreover, since the auditing and history tracking will also be associated with sequence nodes, manipulations of string values will be properly tracked. 17133 (076)
The node types can be seen as a minimal set of collection types. Their role in the NODAL model is fundamental, as they not only form the building blocks for the document graphs but are also the fundamental units of addressability, auditing and security. There are three types of nodes: the struct, the sequence and the map. 18A (078)
A struct is a node that represents a collection of named values. Variously called a "class," a "record," or a "table" in relational databases, it is a familiar building block for structured data. In abstract terms, a struct type is a set of mappings from names of fields to data types. In most data models, the fields of a struct are considered to be ordered, but for our purposes we will assume that this ordering is significant only in the context of a specific serialization binding. 19A (080)
A sequence is an ordered set of values of like type. Also referred to as a "list", a sequence is the basic building block for ordered data. A sequence type consists of an itemtype, which constrains the contents of the sequence. As we noted previously, in NODAL, a string is a represented as sequence of characters. 20A (082)
A map is an unordered collection which maps values of one data type to values of another. Also known as a "dictionary," it is an extensible association between objects. Unlike a struct, a map type has a single key type and a single value type. For example, an XML attribute list is a map from names to strings. 21A (084)
Type declarations are expressed in XML and provide a simple means of defining document types. Shown below is an excerpt of the definitions for the basic NODAL types. 22A (086)
<?xml version="1.0"> <!-- Base datatypes for NODAL --> <schema xmlns="http://nodal.sf.net/types.ndl"> <atomicType name="boolean"/> <atomicType name="character"/> <atomicType name="octet"/> <atomicType name="integer"/> ... <nodeType name="string"> <sequence itemType="character"/> </nodeType> ... </schema>
A complete, simple document schema is that for plain text files. It consists entirely of a sequence of lines, represented as strings. 22B (087)
<?xml version="1.0"> <!-- A schema for a simple text file --> <schema xmlns:n="http://nodal.sf.net/types.ndl"> <n:nodeType name="plain"> <sequence itemType="n:string"/> </n:nodeType> </schema>
One of the most important features of the NODAL design is that every object in a NODAL document is addressable via a URI. This is accomplished by associating a unique node identifier (NID) with every node in a repository. Accessing a node via its repository and NID is referred to as direct addressing, and is the basis for all object reference. NODAL also provides a path language that describes the traversal of edges between nodes in a document graph. Accessing an object via a path from some node is referred to as relative addressing. 23A (090)
The URI associated with a directly addressed node consists of the URI of the repository's "nodes" pseudo-document and then the NID of the node expressed as an ID. For example, a repository at "http://nodal.sf.net/nodal" would provide direct access to the node "N4520" via the URI "http://nodal.sf.net/nodal?nid=N4520". 24A (092)
In addition to this, the URI of a document in a NODAL repository is also a direct reference to the root node of that document. 24B (093)
A path consists of a sequence of path operators and describes the process by which one would walk the edges of a document graph from one point to another. From a mathematical point of view, a path is itself an operator that produces an object when applied to a node. Since there are three different types of nodes, there are three basic path operators representing movement through those node types. A struct node is associated with the field path operator, which consists of a name nm. The field operator returns the value the named field for that node. A sequence node is associated with the index path operator, which consists of an integer n. The index operator returns the value of the nth element of the sequence. A map node is associated with the select operator, which consists of a value v. The select operator returns the value associated with v in the node's map. 25A (095)
These simple building blocks comprise the canonical path operators, each of which represents a simple arc from one node to another object. A canonical path is one which consists entirely of canonical operators. Like the XPath language, we intend to support a richer variety of operators for expressing extensions (or even retractions) of paths. The algebra of paths is, however, expressed entirely in terms of canonical paths. 25B (096)
In order to allow relative addressing to map to URIs, it is necessary to describe a syntax for translating paths to strings. The canonical path operators can be serialized as described in the table below. 25C (097)
Operator | Short Form | Functional Form |
---|---|---|
A full path is serialized by concatenating the serializations of each of its operators in sequence, using either the short or functional forms for each operator. If an operator is serialized with the functional form, it is separated from a previous operator by a `/' character. Clearly, any absolute address must have a direct address as an initial component, since there must be a node to start from. The URI for a path address is then simply the URI of the direct address with the path serialization appended. In contrast, a relative address does not have such a direct address as a prefix. 25M (098)
Note that while direct addresses can only specify nodes, path or relative addresses can select any object in a document, whether of atomic or nodal type. 25N (099)
NODAL is designed to allow virtually complete flexibility in controlling privacy and security of the information stored in a repository. The first component of this facility is a notion of user identity encapsulated in aUser object, which is uniquely identified by name within a given repository. Users are authenticated with some external method (a minimal password-based scheme is provided by default), with unauthenticated access associated with an anonymous user. 26A (0101)
Permission to perform various operations within a NODAL repository is then granted to individual users via the delegation of capabilities. Unlike many such shared repositories, in NODAL this control is not only associated with individual documents, but with all of the actual nodes in a document graph. Each node in a repository has an associated set of capabilities including at minimum: visibility,content access, pedigree access, editing, and permission management. Each of these capabilities is expressed by distinct interfaces to a node. These interfaces are accessible only if the associated capability is available. By organizing permissions around these faceted interfacesit becomes impossible to perform any action without the appropriate capability. And since these capabilities can be created, delegated and revoked at will, they form a direct and understandable means for a user to manage his network of trust. 26B (0102)
In order to make the management of these capabilities as simple as possible for the end user, each User identity object acts as a kind of bank for capabilities. When a capability is delegated to a user, it is inserted into a capability list that represents that user's full complement of permissions. These permissions can then be grouped into a Role within the User object and some subset of a user's Roles can be made active at any time. In this way, a user can also have complete control over the portion of the shared workspace that is currently within his reach. 26C (0103)
While the principles outlined above are clear, the details of both design and implementation of the capability management and node permission interfaces are still an active area of research. 26D (0104)
To this point, we have only discussed the problem of defining and accessing content in a NODAL repository. Creating and editing that content is, of course, just as fundamental. 27A (0106)
Recall that we have adopted faceted interfaces in order to control access to various capabilities of a node. The Editor interfaces encapsulate all of the data mutation operations for a given node. Since there are three node types, there are three basic Editor interfaces: StructEditor, SequenceEditor and MapEditor. In addition, we have chosen to enforce an additional data integrity constraint by disallowing the creation of new nodes without edit access to a collection in which the new node might be placed. Therefore, the NodeFactory interface is encapsulated by Editor. The following fragment of IDL illustrates these basic interfaces. 27B (0107)
interface NodeFactory { NodeEditor create (in NodeType type); NodeEditor copy (in Node node); }; interface Editor : NodeFactory { readonly attribute Cursor cursor; }; interface Setter { boolean with (in Object val); }; interface StructEditor : Editor { readonly attribute Struct struct; Setter setField(in Name name); }; interface SequenceEditor : Editor { readonly attribute Sequence sequence; Setter replaceRange(in int startIndex, in int endIndex); boolean removeRange(in int startIndex, in int endIndex); Setter insertBefore(in int index); Setter insertAfter(in int index); }; interface MapEditor : Editor { readonly attribute Map map; Setter setKeyValue(in Object key); boolean removeKey(in Object key); };
Note that the Setter interface is an abstract interface encapsulating value assignment with an arbitrary value. Its exposure allows for a wide range of compile-time and run-time optimizations of access and mutation of values. 27C (0109)
It is now possible to appreciate some of the programming methodologies that may be used to build applications on top of the NODAL data model. For example, when an implementation object in some programming language is associated with a NODAL struct, the Struct interface and all of the Getter and Setter interfaces for the struct's fields can be cached private to the object, thus enabling very efficient access to the node's fields. For exactly these situations, NODAL also provides Observer interfaces for each of the node types, as described by the Observer design pattern in (Gamma et. al., 1995, p. 293). Thus, the object would register as a StructObserver of the associated Struct and thus ensure that it is dynamically notified when any of the struct's fields are modified elsewhere. A complete discussion of these idioms with examples drawn from the sample implementation of the NODAL client-server system will be presented in a paper to follow. 27D (0110)
The IDL fragment above contained an unexplained reference to a Cursor object. A cursor represents not only a reference to a particular node in a repository but the entire context of that reference, including the current User state, document through which the node was accessed, the minimal path from that document's root node, and an optional date reference if we are examining document's history. It is via the Cursor interface, and not the node, that a program accesses the data mutation interfaces, since the system needs such a context description in order to evaluate permissions and maintain an audit trail, or pedigree. In general, a program accessing a NODAL repository for any other purpose than browsing will deal primarily with Cursor interfaces. 28A (0112)
Pedigree is the term we will use to describe the complete history of a document, including a full description of who did what, when and where. This document pedigree is stored as a chain of previous versions for every node in the repository. One way to think of a node's pedigree is to begin with the metadata typically associated with files in a traditional filesystem. Each such file has a set of attributes, typically including creation time, last modification time, and possibly access time. More sophisticated filesystems may also store user attributions for these events. 29A (0114)
In NODAL, each node tracks the User/Timestamp pair for each of these events (referred to as an Attribution) and maintains a chain of previous versions of the node's content accessible via a NodeHistory interface. 29B (0115)
interface Node : Object { readonly attribute Name id; readonly attribute Attribution created; readonly attribute Node copiedFrom; readonly attribute Attribution modified; }; interface NodeHistory : Node { readonly attribute NodeHistory current; readonly attribute NodeHistory newerVersion; readonly attribute NodeHistory olderVersion; NodeHistory dated (in Timestamp ts); };
Tracking the history of a graph becomes a little interesting because of the granularity of this auditing information. One approach is simply to use a timestamped Cursor object. Any document navigation done via such a cursor is implicitly accessing the state of the document at the time of that timestamp. Notably, all access to the data mutation interfaces from such a cursor is disabled, independent of permissions. To replay all of the changes made to a document involves a more substantial effort of sorting the document's nodes in a priority queue based on modification timestamps and then navigating the history of these nodes via the queue. This functionality is encapsulated in a higher-level HistoryWalker class. 29C (0116)
So far we have described a system in which all content is held within the same repository and all relationships are expressed as containment. For a truly universal and flexible system we will need to be able to refer to nodes in other repositories and express hyperlinks that are independent of document structure. 30A (0118)
The first of these constitutes a remote transclusion operator (Nelson, 1995), in which a node from a different repository is placed in a document. Such an external reference is expressed by creating a proxy node that contains a resolvable URI to the remote node. These proxy nodes are referred to as external references and can be created and embedded anywhere within a document, and are normally cached in the embedding repository. Permissions are, of course, managed with respect to the original location. 30B (0119)
The second type is a more traditional hyperlink. This construct is built on the range type, which represents the collection of nodes between two paths that are the inclusive end-points of the range (note that a simple path is also a range that identifies exactly one node). This collection of nodes is determined by the path difference between the two paths: a relative path from the range start to the range end. Any node in or below the relative path is considered to be part of the graph fragment identified by the range. A hyperlink is then constructed from two or more such range expressions and can be either unidirectional or multidirectional, embedded or external. All such hyperlinks are managed by a linkdatabase that maintains these paths through modifications and delivers link descriptions along with published documents. 30C (0120)
The simplest top-level interface to a NODAL repository is presented as a traditional hierarchical filesystem. Directories in this filesystem may contain documents and subdirectories. Documents consist of a type reference, a document namespace, and a reference to a root node in the repository. A bare URI reference to the document returns the document encoded as per its associated MIME type. 31A (0122)
Whereas nothing unfamiliar is introduced with the directory structure, a document in a NODAL repository is a somewhat richer building block than in traditional filesystems. 32A (0124)
As we have previously stated, all documents have an associated type. A document type is identified by a MIME type identifier. Each such document type is further associated with a NODAL schema and serialization and de-serialization methods that translate from node graph to character stream and vice versa. These serialization methods are supported by a system of server and client-side plugins. It is important to note however, that any NODAL document form can be serialized in XML using the base translation described below. A lack of access to the appropriate plugins therefore limits only the ability to import and export particular document types in legacy form, not interchange between NODAL systems. 33A (0126)
The discussion above has ignored the issue of access control within the filesystem interface for a simple reason: it is no different than security and privacy management for general NODAL document graphs. The filesystem interface is, in fact, a perfect case study of how interfaces are constructed on top of the NODAL APIs and data model. While the details of this will be presented in a subsequent paper, it suffices to say that filesystem objects are modeled and implemented with the NODAL building blocks described above and thus get all of the collaboration, addressing, security and auditing capabilities of NODAL for free. 34A (0128)
All content stored in a NODAL repository is automatically indexed and searchable. Moreover, since the content is almost entirely structural, the basic building blocks for the search system are also primarily structural. Without going into any detail at all, it is possible to search for words and phrases, data types and structures, and Boolean combinations of these with simple, general interfaces. Moreover, all such search expressions translate into URIs and are resolved as path references, so they may be embedded in external references and hyperlinks. 35A (0130)
As a system for network computing, an essential component of the NODAL system is a suite of network protocols for defining, accessing and manipulating data embedded in NODAL repositories. In reality, support for the network protocols and their semantics is all that is required for a repository to be compliant with the NODAL system. We anticipate that a commercial marketplace for high-performance and high-security implementations of client libraries and servers for these protocols will develop, in addition to the open-source sample implementation we will provide. Without such work, it would be impossible to imagine the kind of organic growth and ubiquity that is the core motivation for this system. 36A (0132)
At its core, the NODAL system can be implemented as a client-server system with a local client attached to various user-oriented applications and a remote server managing the integration and publishing of documents being produced by multiple users. In order to support this, there must be a standard interface for accessing, creating and modifying content on a remote server. An outline of such an interface is provided below. 37A (0134)
interface Storage { readonly attribute Repository repository; /* Node access */ Node getNode (in NodeRef id, in int depth); Node getNodeDated (in HistoryRef id, in int depth, in Timestamp ts); /* Transaction management */ boolean beginTransaction (); boolean endTransaction (); /* Node creation */ Node createNode (in NodeRef typeRef); /* Modifiers for struct nodes */ boolean setField (in EditRef structNode, in Name field, in ObjectRef val); /* Modifiers for map nodes */ boolean setKeyValue (in EditRef mapNode, in ObjectRef key, in ObjectRef value); boolean removeKey (in EditRef mapNode, in ObjectRef key); /* Modifiers for sequence nodes */ boolean replaceRange (in int startIndex, in int endIndex, in ObjectRef val); boolean removeRange (in int startIndex, in int endIndex); boolean insertBefore (in int index, in ObjectRef val); boolean insertAfter (in int index, in ObjectRef val); /* Permission management interfaces */ Node getPermission (in ManageRef ref); boolean setPermission (in ManageRef ref, in NodeRef perm); };
By way of explanation, the two getNode interfaces have a depth argument, which restricts the depth of content expansion for the returned node. A depth value of -1 requests the full subgraph of the named node. Moreover, the HistoryRef, EditRef and ManageRef values are returned as attributes of a Node returned by a getNode or createNode call whenever the server determines that the current session has permission to access these interfaces. Finally, an ObjectRef value may be an atomic type or a NodeRef. With these methods, an application can perform any necessary operation on a repository. 37B (0135)
It is not accidental that this interface is named Storage and not Server. In an abstract sense, this interface describes a minimal set of access and manipulation primitives for any service acting as a storage device for a repository. The same set if interface methods can represent the client-server model, an SQL database, or a local filesystem interface. By providing a set of implementations of this interface for these different storage substrates, it will be possible to compose these interfaces similarly on both clients and servers. Given this possibility, it is no stretch to imagine implementation of this interface symmetrically between client and server in a much more peer-to-peer arrangement. 37C (0136)
There are a number of reasons why submitting repository manipulation requests at the level of granularity of the single operations shown above is not desirable. The simplest reason is that wide-area networks have non-negligible latencies and costs per message. Increasing the granularity of the messaging to the level naturally implied by a single-operation-per-message protocol would be horribly inefficient. In addition, if the NODAL client-server pair is performing validation of structures versus type constraints (and we imagine that this validation will increase in power and specificity as the system evolves), many of these single transactions may leave the system in a temporarily invalid state. For at least these reasons, a transaction mode is built into the Storage interface. 38A (0138)
A transaction consists of a sequence of operations to be performed on a repository that are accumulated locally and then all submitted at once. In traditional database terms, such a transaction is viewed as the atomic unit of change, with the failure of any one operation requiring failure of the entire transaction. Database integrity is only checked on the completion of a transaction and is a condition for the transaction's success. 38B (0139)
For example, a typical transaction might involve the creation of a new node, initialization of the node's content to non-default values, and then storage of a reference to the new node in some other node's content. Without embedding such a sequence of operations in a transaction, the newly created node would not be referenced in any document node graph until the after the final operation, and would thus potentially be subject to any garbage collection on the server during this interval. 38C (0140)
For purposes of auditing, each transaction is assigned a specific Attribution record and all nodes affected appear to have been updated at a single instant in time. 38D (0141)
The traditional and simplest binding to an interface specification for a remote service is via some Remote Procedure Call (RPC) protocol, such as Sun's XDR/RPC, CORBA, DCOM or COM+, XML-RPC or SOAP. We anticipate that bindings of the NODAL client-server interface to a variety of these RPC substrates will be defined. The sample implementation will, however probably provide only XML-RPC and SOAP bindings. 39A (0143)
However, RPC models have a significant problem when deployed over wide-area networks or with heavily-loaded servers: they block. Because an RPC simulates a local procedure call, the flow of control through an RPC call requires the client application to block until the server has performed the requested action and returned some result. Since we anticipate building interactive user-oriented applications on top of the NODAL client libraries, this blocking behavior may be unacceptable in many situations. One solution to this problem is to encapsulate the RPC calls in separate threads of execution from the main user interface. This however may impose too much of a constraint on the implementation of client applications which want to take advantage of the NODAL facilities. 39B (0144)
An alternative to this paradigm is a request-update style of protocol, in which a client request is made to a server and an acknowledgement message is returned immediately. The server has simply queued the request for processing. At some later time, when the server processes the request, it sends a response to the client with the return value of the request. Ideally, the client-to-server and server-to-client transport streams are asynchronous, in which case the client needs only ever block when waiting for truly necessary data from the server (e.g. the response to a getNode request). Another advantage is that the server may send any update messages to the client at any time, thus fulfilling the need to notify the client when other users have modified content. That is not to say that the request-update model cannot be layered on top of a synchronous transport protocol such as HTTP, it is just necessary to embed the update messages in the responses to client-initiated messages. 39C (0145)
It is important to demonstrate the ease with which data models can be developed in the NODAL language. Once such a data model is in place, it is possible to store and validate document in a NODAL repository and take full advantage of the collaborative infrastructure this provides. 40A (0147)
The following model represents a parsed XML document with all entity references resolved. Details have been omitted for clarity (e.g. the Entity, EntityRef and DocumentType types), but the general principles and ability to store, manipulate and serialize the resulting XML documents is essentially complete. In fact, a full DOM implementation could be built on this simple data model. 41A (0149)
<?xml version="1.0"> <!-- A NODAL schema for XML documents --> <n:schema xmlns:n="http://nodal.sf.net/types.ndl"> <n:nodeType name="Document"> <n:struct> <n:property name="idref" type="NamedNodeMap"/> <n:property name="element" type="Element"/> </n:struct> </n:nodeType> <n:nodeType name="NamedNodeMap"> <n:map keyType="n:name" valueType="Node"/> </n:nodeType> <n:nodeType name="Attributes"> <n:map keyType="n:name" valueType="n:string"/> </n:nodeType> <n:nodeType name="Element"> <n:struct> <n:property name="tag" type="n:name"/> <n:property name="attr" type="Attributes"/> <n:property name="children" type="NodeList"/> </n:struct> </n:nodeType> <n:nodeType name="CharacterData"> <n:struct> <n:property name="type" type="n:integer"/> <n:property name="text" type="n:string"/> </n:struct> </n:nodeType> <n:nodeType name="Node"> <n:union types="Element CharacterData"/> </n:nodeType> <n:nodeType name="NodeList"> <n:sequence itemType="Node"/> </n:nodeType> </n:schema>
This model represents an RGBA image. It is sufficiently rich that it could be used as the base data model for GIF, JPEG or PNG images (without animation). 42A (0151)
<?xml version="1.0"> <!-- A NODAL schema for RGBA images --> <n:schema xmlns:n="http://nodal.sf.net/types.ndl"> <n:nodeType name="image"> <struct> <property name="width" type="n:integer"/> <property name="height" type="n:integer"/> <property name="data" type="content"/> </struct> </n:nodeType> <n:nodeType name="content"> <sequence> <sequence itemType="pixel"/> </sequence> </n:nodeType> <n:nodeType name="pixel"> <struct> <property name="red" type="n:octet"/> <property name="green" type="n:octet"/> <property name="blue" type="n:octet"/> <property name="alpha" type="n:octet"/> </struct> </n:nodeType> </n:schema>
In this paper, we have discussed the need for a system of universal, ubiquitous collaboration and the many barriers to the deployment of such a system. With knowledge of the need and these barriers, it was possible to design the Network-Oriented Document Abstraction Language (NODAL), a set of APIs and protocols for data modeling and manipulation. We have outlined the components of this system and some of the constraints on implementations of the APIs and network protocols. The system as presented is general, secure and useful. Moreover, it is not only possible, but straightforward to build applications on top of the NODAL model and APIs. 43A (0154)
In order to breach the technical and social barriers described above, NODAL is designed to be sneaky. It can be presented as a simple typed, version-controlled filesystem for people who think they only want that. It can be a database backend for a PHP or Perl or JSP web publishing system. It can be a building block for collaborative work in developing documents, design specifications, diagrams or source code. It can be a subsystem on which to build new systems for handling online dialogue (starting with email and instant messaging or chat). It can be a repository for all kinds of the kinds of explicit knowledge infrastructure presupposed by the development of DAML, Topic Maps and the Semantic Web. Most importantly, it can be all of these things at the same time and in such a way that you can layer on new higher-level representations and personalities as your social and organizational models demand. 43B (0155)
At this point in time, the design is complete save for details of the security model and the network protocol bindings. With appropriate support, we plan to refine this model and implement a system upon which ubiquitous collaboration and knowledge development can become a reality. 43C (0156)
This work has been completely inspired by Doug Engelbart and his passionate devotion to the Augmentation of Human Intellect. Without him, the world of computers would be a very different place right now. 44A (0158)
I must also thank the members of the OHS and Nodeland teams which grew out of Doug's "Unfinished Revolution" colloquium at Stanford. These groups were a crucible for virtually all of the ideas that became the NODAL design. Notable for their direct contributions to this design have been Eric Armstrong, Eugene Kim, and Jack Park. In particular Eric's earlier system KRNL was a direct inspiration for this design. 44B (0159)
At SRI, this work has been done in the context of the CyberHabitats project, lead by Charlie Ortiz and Victoria Stavridous, and has been deeply informed in structure and conceptualization by discussions with Jamie Fenton and Steve Dawson. 44C (0160)
Thierry Bardini, "Bootstrapping: Douglas Engelbart, Coevolution, and the Origins of Personal Computing" Stanford University Press, Stanford California, 2000. 45A (0162)
Paul V. Biron and Ashok Malhotra (editors), "XML Schema Part 2: Datatypes" Online: http://www.w3.org/TR/xmlschema-2/ 45B (0163)
Vannevar Bush, "As We May Think" The Atlantic Monthly, July 1945. Volume 176, No. 1; pages 101-108. http://www.theatlantic.com/unbound/flashbks/computer/bushf.htm 45C (0164)
Dan Connolly, "Overview of SGML Resources" Online: http://www.w3.org/MarkUp/SGML/ 45D (0165)
Douglas C. Engelbart, "Knowledge-Domain Interoperability and an Open Hyperdocument System", Proceedings of the Conference on Computer-Supported Cooperative Work, Los Angeles, CA, October 7-10, 1990, pp. 143-156. Republished in Hypertext/Hypermedia Handbook, Emily Berk & Joseph Devlin [Editors], McGraw-Hill, 1991, pp. 397-413. Online: http://www.bootstrap.org/augment-132082.htm 45E (0166)
Douglas C. Engelbart, "OHS Framework" http://www.bootstrap.org/alliance-980.htm 45F (0167)
Douglas C. Engelbart, "The unfinished revolution, strategy and means for coping with complex problems" Proceedings of Engelbart's Colloquium Stanford University, January 6 - March 9, 2000. http://bootstrap.org/colloquium/index.html. 45G (0168)
N. Freed and N. Borenstein "RFC 2046: Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types" Internet Engineering Task Force, 1996. http://www.faqs.org/rfcs/rfc2046.html. 45H (0169)
Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides, "Design Patterns: Elements of Reusable Object-Oriented Software" Addison-Wesley Professional Computing Series, Addison-Wesley, Reading, Massachusetts (1995). 45I (0170)
Theodor H. Nelson, "The Heart of Connection: Hypermedia Unified by Transclusion." Communications of the ACM, 38:8 (August 1995), 31-33. 45J (0171)
ISO/IEC JTC1/SC18/WG8 "Hypermedia/Time-based Structuring Language (HyTime)" http://www.ornl.gov/sgml/wg8/document/n1920/html/n1920.html. 45K (0172)
Paul Prescod "Addressing the Enterprise: Why the Web needs Groves", August 1999. http://www.prescod.net/groves/shorttut/. 45L (0173)