NODAL: Introduction

Lee Iverson
OHS Design Group

Lee Iverson <leei@ai.sri.com>
Last modified: Wed May 2 10:20:02 2001

What Do We Want?

Through years of discussion and months of consensus building within Doug Engelbart's Open Hyperdocument System meeting groups, we have agreed on a set of design principles for a collaborative document repository. It must be:

In essence, we are describing a need for a new kind of database language that has a standard, language-independent API, a document-oriented data modelling language, a fully addressable and navigable object hierarchy, and an extensible security model. In the next section we will propose just such an architecture.

NODAL: An Object-Oriented SQL for Documents

Careful readers will notice that many of the motivating requirements outlined above are exactly those that led to the design and development of relational database management systems (RDBMSs) 20 years ago and to the development of SQL 10-15 years ago. There was a need for an interoperable, shareable, secure resource behind many kinds of enterprise-level applications. These database management systems filled that need admirably, and SQL became a standard language for modelling data in RDBMSs and for formulating updates and queries to those databases.

Unfortunately, RDBMSs do not adapt well to the kinds of graph-like document structures that are represented by modern markup languages and various forms of knowledge representation languages (e.g. XTM, RDF/S, DAML+OIL, etc.). They typically do not handle tree or graph structures well, do not track change histories for table rows, and do not provide granular and adaptable control over security and privacy at levels lower than the individual table. Moreover, their networking models severely limit the ability to maintain small-scale local caches of data and to operate well when transactions and queries are distributed over a wide-area network.

We would like to suggest a new paradigm. NODAL is a language for data modelling that directly and efficiently supports arbitrarily complicated typed graph structures with a very small number of general building blocks. This language is able to express the internal structure of a wide variety of document formats from markup languages to multimedia files and will allow applications and knowledge bases to access and share their contents heedless of the containing data format.
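To make the idea of "typed graph structures with a very small number of general building blocks" more concrete, here is a minimal Python sketch. It is not NODAL's actual data model: the `Node` class, its `type_name` and `fields` attributes, and the `reachable` helper are all hypothetical names chosen for illustration. It only shows that a single node type with named, typed fields is enough to represent both tree-shaped and graph-shaped (cyclic) documents.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union

# Hypothetical building block; the names here are illustrative, not NODAL's.
Value = Union[str, int, "Node", List["Node"]]


@dataclass(eq=False)  # identity-based equality, so cyclic graphs are safe to compare and hash
class Node:
    type_name: str
    fields: Dict[str, Value] = field(default_factory=dict)


def reachable(root):
    """Return every node reachable from root, visiting each node once."""
    seen, order, stack = set(), [], [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # already visited: the graph may contain cycles
        seen.add(node)
        order.append(node)
        for value in node.fields.values():
            children = value if isinstance(value, list) else [value]
            stack.extend(c for c in children if isinstance(c, Node))
    return order


# A tiny document: a tree of children plus a cross-link and a back-link.
title = Node("Heading", {"text": "Introduction"})
para = Node("Paragraph", {"text": "What do we want?", "see_also": title})
doc = Node("Document", {"children": [title, para]})
title.fields["parent"] = doc  # introduces a cycle

nodes = reachable(doc)
```

The cross-link and the parent pointer are what make this a graph rather than a tree; the traversal still terminates because each node is visited only once.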

NODAL implementations will automatically manage distributed, multi-user change tracking, attribution and historical recovery of individual nodes in these graphs. The design is expressed using modern object-oriented principles and will allow server implementations to support a variety of access protocols. The client APIs will allow applications to be built directly on top of this data modelling language so as to painlessly support synchronous and asynchronous collaboration in the development and exploitation of shared documents and knowledge bases in a wide variety of user-oriented tools.
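As a rough illustration of per-node change tracking, attribution, and historical recovery, consider the following Python sketch. It is an assumption-laden toy, not the NODAL design: `Revision`, `TrackedNode`, and their methods are hypothetical names, and the sketch ignores distribution and concurrency entirely.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Revision:
    # Hypothetical record of one change: who made it and the resulting value.
    author: str
    value: str


@dataclass
class TrackedNode:
    # Hypothetical node that keeps every revision instead of overwriting.
    history: List[Revision] = field(default_factory=list)

    def update(self, author, value):
        """Append a new revision and return its 1-based version number."""
        self.history.append(Revision(author, value))
        return len(self.history)

    def current(self):
        """Return the most recent revision."""
        return self.history[-1]

    def at_version(self, version):
        """Recover the revision stored under a given 1-based version number."""
        if not 1 <= version <= len(self.history):
            raise IndexError("no such version: %d" % version)
        return self.history[version - 1]


node = TrackedNode()
v1 = node.update("alice", "Draft text")
v2 = node.update("bob", "Final text")
```

Here `node.current()` reports the latest value and its author, while `node.at_version(v1)` recovers the earlier state of the same node.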

Why Not Just XML?

Many proponents of XML technology (and XML databases) suggest that XML, and more recently the XML Schema language, provide a basis for supporting exactly these capabilities. They claim that XML is a general data modelling language and that Document Object Model (DOM) interfaces to XML database implementations form a general, interoperable basis for shared document management. If that is so, then why hasn't this revolution already started? The simple answer is that XML has both problems and limitations that restrict its applicability to the range of problems we hope to solve. One of the most fundamental problems is the lack of separation between the XML data model and the expressive syntax. This complicates many aspects of the design of XML-aware applications and libraries to the point that XML-based specifications have become enormously complicated (e.g. XML Schema). Pointedly, the XML language does not even have a broadly recognized data model of its own. The XML Infoset is still an area of debate, and the lack of consensus on its structure makes the continued development of such things as the DOM itself increasingly difficult.

In the NODAL design, we have expressly separated the data modelling language and APIs from the serialization of the data so that we may support a wide variety of serializations for the same document model. In this context, we see XML both as a tool we can use to build NODAL serializations and protocols and as a particular target application that may be built using the NODAL tools. We might thus define an XML Infoset using the NODAL language and use the client APIs as a basis for a DOM implementation.
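The separation of data model from serialization can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Item` class and the `to_xml` and `to_json` functions are hypothetical and are not part of NODAL, and JSON is used here simply as a convenient second format. The point is that the in-memory model knows nothing about either syntax, and each serializer is an independent function over it.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    # Hypothetical in-memory model, independent of any serialization syntax.
    type_name: str
    text: str = ""
    children: List["Item"] = field(default_factory=list)


def to_xml(item):
    """Serialize the model as an XML string (one possible serialization)."""
    def build(it):
        element = ET.Element(it.type_name)
        if it.text:
            element.text = it.text
        for child in it.children:
            element.append(build(child))
        return element
    return ET.tostring(build(item), encoding="unicode")


def to_json(item):
    """Serialize the same model as a JSON string (another serialization)."""
    def build(it):
        return {
            "type": it.type_name,
            "text": it.text,
            "children": [build(child) for child in it.children],
        }
    return json.dumps(build(item), sort_keys=True)


doc = Item("document", children=[Item("title", "NODAL"), Item("para", "Hello")])
xml_text = to_xml(doc)
json_text = to_json(doc)
```

Adding a third format would mean writing one more function over `Item`, without touching the model or the existing serializers.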

Related Work

Kimber's Groves

Subversion

INXAR

Castor

Requirements Table

Over the course of discussions in Douglas Engelbart's OHS group, and eventually in a small group of collaborators referred to as Nodeland, we have come to some consensus on a minimal set of architectural requirements for implementing an Open Hyperdocument System. I have selected from among those requirements a subset that I feel is directly addressed by the NODAL design. I list these below, with some explanation, and will provide hyperlinks to the sections of the design documents in which these requirements are addressed.