nfirvine.com wiki

Superstruct

Filed in: Ideas.Superstruct · Modified on: Tue, 14 Dec 10

Impetus

While JSON is fun, it's really meant for a one-off, "here's your document; I don't need it back" kind of situation. If we compare JSON to XML (and its entourage), JSON's missing a common manipulation framework, for one thing.

Instead of looking at JSON as simply a "notation" and a way to transmit information, in this article, we look at it more as a changeable object in memory, akin to the DOM.

The term "superstruct" refers to this new DOM-like idea (sexier name forthcoming).

The superstruct

A JSON document is essentially a tree with a few types of nodes and edges:

  • Objects, which relate a child node to its parent node via a string edge. The "order" of children is undefined.
  • Arrays, which relate a child node to its parent node via an index (integer) edge in the range 0..(n-1).
  • Leaves, of any of the primitive datatypes (number, string, true, false, null).
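To make the tree view concrete, here is a minimal sketch that walks a parsed JSON document and labels every node with its kind and the chain of edges leading to it from the root. The names node_kind and walk are mine, purely for illustration.

```python
import json


def node_kind(node):
    """Classify a parsed JSON value as 'object', 'array', or 'leaf'."""
    if isinstance(node, dict):
        return "object"
    if isinstance(node, list):
        return "array"
    return "leaf"  # number, string, true, false, or null


def walk(node, edges=()):
    """Yield (edges_from_root, kind) for every node, parents before children."""
    yield edges, node_kind(node)
    if isinstance(node, dict):
        for key, child in node.items():       # string edges
            yield from walk(child, edges + (key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):  # integer edges 0..(n-1)
            yield from walk(child, edges + (index,))


document = json.loads('{"name": "x", "tags": ["a", null]}')
nodes = list(walk(document))
```

Here nodes holds one entry per node in the tree, starting with the root (whose edge chain is empty).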

The structure of a JSON document can be described using JSON Schema. The schema/structure of a document is fixed for its lifetime.

There are essentially two subtypes of arrays that often need to be treated separately:

  • Tuples, which are heterogeneous, fixed-length. Use of tuples should generally be discouraged as an object is clearer.
  • Lists, which are homogeneous (i.e., all elements are similar in some respect, not necessarily having the same schema) and variable-length.
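One way to tell the two subtypes apart, assuming the schema follows the JSON Schema drafts of this era (where an array-valued "items" means tuple typing and a single schema means every element shares it), is sketched below. The function name array_subtype and the two sample schemas are made up for illustration.

```python
def array_subtype(schema):
    """Return 'tuple' or 'list' for an array schema.

    Assumption: tuple typing is spelled as a list under "items",
    as in the older JSON Schema drafts.
    """
    items = schema.get("items")
    return "tuple" if isinstance(items, list) else "list"


# A fixed-length, heterogeneous pair: (label, count).
pair_schema = {"type": "array", "items": [{"type": "string"}, {"type": "number"}]}

# A variable-length, homogeneous list of strings.
names_schema = {"type": "array", "items": {"type": "string"}}
```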

Paths

There's a need to specify particular node(set)s within the tree. The path syntax I use looks like paths in UNIX:

  • /'s separate path components (keys or indices).
  • Absolute paths (those starting with '/') start at the superstruct root, whereas relative paths (those starting with a key/index) are relative to some context node.
  • The last component of the path is called the "tail", the first the "head", and everything in between the "body".
  • I use Python-esque notation to refer to components, like [-1] for the tail and [1:-1] for the body.
  • Pipes (|) and ampersands (&) are used to denote set operations "union" and "intersection", respectively.

Every node should have exactly one canonical absolute path.

Unlike UNIX filesystem paths, a superstruct path may specify multiple nodes.
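The component notation maps directly onto Python slices. A rough sketch, assuming keys never contain a "/" themselves (escaping isn't addressed here) and using a helper name, split_path, that is only illustrative:

```python
def split_path(path):
    """Split a path into (is_absolute, components)."""
    is_absolute = path.startswith("/")
    components = [part for part in path.split("/") if part]
    return is_absolute, components


is_absolute, components = split_path("/users/3/name")
head = components[0]     # first component
body = components[1:-1]  # everything between head and tail
tail = components[-1]    # last component
```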

Operations

The intention of JSON is to transmit data, not to store it. As such, there are no operations defined to manipulate the tree. JavaScript itself has methods for dealing with the components of a superstruct, but not with a superstruct as a whole.

Superstruct operations have:

  • A name, which is limited to C-like identifiers;
  • An input superstruct; and
  • An output superstruct.

While the parameter and the return value are both superstructs, either could be as simple as a single leaf.

This allows for compatibility with JSON-RPC.
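As a sketch of what that compatibility could look like, the name/input/output triple lines up with the method, params, and result members of a JSON-RPC call. The example below assumes the JSON-RPC 2.0 envelope; the helper name to_jsonrpc is hypothetical. Note that 2.0 only allows an array or object as params, so a single-leaf input would have to be wrapped.

```python
import json


def to_jsonrpc(method, params, request_id):
    """Encode one superstruct operation as a JSON-RPC 2.0 request string."""
    if not isinstance(params, (list, dict)):
        # JSON-RPC 2.0 requires structured params; wrap a bare leaf.
        params = [params]
    request = {
        "jsonrpc": "2.0",
        "method": method,    # the operation name
        "params": params,    # the input superstruct
        "id": request_id,
    }
    return json.dumps(request)


wire = to_jsonrpc("read", ["/a", "/b"], 1)
```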

Method signatures are specified in the form input:datatype»methodname»output:datatype.

Some common datatypes:

  • nodeset: an object mapping canonical paths to the content thereof.
  • path: a string; discussed above.
  • freetree: a superstruct without a schema.
  • schema: a JSON Schema freetree.

{init:freetree, schema:schema} » new » null

Creates a new superstruct. A superstruct is initialized in a few phases:

  1. Build structure of superstruct.
  2. Initialize with defaults from schema.
  3. Initialize with init freetree.
  4. Validate whole tree against schema.

Building the structure of the tree is somewhat difficult. Optional elements (including variable-length/non-tuple-typed arrays) should not be precreated; instead, they should be created when the optional element is first written to.
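The four phases might look roughly like the sketch below. It is only an illustration under heavy assumptions: it understands a tiny subset of JSON Schema ("type" as a single string, "properties", a draft-04-style "required" array, "default", and "items"), and the helper names build, graft, validate, and new are mine. It also returns the tree instead of null so the result can be inspected.

```python
import copy

TYPES = {"object": dict, "array": list, "string": str,
         "number": (int, float), "boolean": bool, "null": type(None)}


def build(schema):
    """Phases 1 and 2: create required structure, filling in defaults."""
    if "default" in schema:
        return copy.deepcopy(schema["default"])
    if schema.get("type") == "object":
        required = schema.get("required", [])
        return {key: build(sub)
                for key, sub in schema.get("properties", {}).items()
                if key in required}          # optional members are not precreated
    if schema.get("type") == "array":
        items = schema.get("items")
        # Tuples get their fixed slots; lists start out empty.
        return [build(sub) for sub in items] if isinstance(items, list) else []
    return None                              # leaf with no default yet


def graft(base, init):
    """Phase 3: lay the init freetree over the built structure."""
    if isinstance(base, dict) and isinstance(init, dict):
        merged = dict(base)
        for key, value in init.items():
            merged[key] = graft(base.get(key), value)
        return merged
    return copy.deepcopy(init)


def validate(node, schema):
    """Phase 4: check types and required members (subset of JSON Schema)."""
    expected = TYPES.get(schema.get("type"))
    if expected is not None and not isinstance(node, expected):
        return False
    if schema.get("type") == "number" and isinstance(node, bool):
        return False                         # bool is an int subclass in Python
    if isinstance(node, dict):
        if any(key not in node for key in schema.get("required", [])):
            return False
        props = schema.get("properties", {})
        return all(validate(v, props[k]) for k, v in node.items() if k in props)
    if isinstance(node, list) and isinstance(schema.get("items"), dict):
        return all(validate(v, schema["items"]) for v in node)
    return True


def new(init, schema):
    tree = graft(build(schema), init)
    if not validate(tree, schema):
        raise ValueError("init freetree does not satisfy the schema")
    return tree
```

A required leaf with no default is left as null by build, so validation fails unless the init freetree supplies a value for it.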

[path, path, ...] » read » nodeset

Evaluates the paths and returns the resulting nodeset.

["/a", "/b"] » read is equivalent to ["/(a|b)"] » read.
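A minimal sketch of read, assuming absolute paths only, a union written as a parenthesised component like (a|b), and paths that match nothing being silently skipped. None of this is settled; the helper names expand and step are invented for the example.

```python
def expand(component):
    """Expand '(a|b)' into ['a', 'b']; a plain component expands to itself."""
    if component.startswith("(") and component.endswith(")"):
        return component[1:-1].split("|")
    return [component]


def step(node, key):
    """Follow one edge; return (found, child)."""
    if isinstance(node, dict) and key in node:
        return True, node[key]
    if isinstance(node, list) and key.isdigit() and int(key) < len(node):
        return True, node[int(key)]
    return False, None


def read(root, paths):
    """Evaluate each path and map canonical paths to the nodes they name."""
    nodeset = {}
    for path in paths:
        frontier = [("", root)]              # (canonical path so far, node)
        for component in (part for part in path.split("/") if part):
            next_frontier = []
            for prefix, node in frontier:
                for key in expand(component):
                    found, child = step(node, key)
                    if found:
                        next_frontier.append((prefix + "/" + key, child))
            frontier = next_frontier
        for canonical, node in frontier:
            nodeset[canonical or "/"] = node
    return nodeset


doc = {"a": 1, "b": [10, 20], "c": {"d": None}}
separate = read(doc, ["/a", "/b"])
combined = read(doc, ["/(a|b)"])
```

Both calls produce the same nodeset, keyed by the canonical paths /a and /b.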

{canonical_path:freetree,...} » write » null

Writes each freetree to its canonical path. Writing is really more of a "graft" operation.

Everything but the tail of the path ([0:-1]) must already exist in the superstruct.

Depending on the last few components, write has different behaviour:

  • If [-1] is an array, append the new freetree to it.
  • If [-2] is an array and [-1] is an index:
    • If the index is within the bounds of the array, update that element.
    • If the index is one larger than the last valid index, append to the array.
    • If the index is two or more larger, this is an error; we cannot create "Swiss cheese" arrays full of holes :)

write supports writing to multiple paths at once, so it is in effect a transaction.
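The rules above could be sketched as follows. This is an illustration under assumptions: it works on a deep copy and returns it (so a failure partway through leaves the original untouched, which is how it fakes a transaction), it checks the "tail is an array" rule before the index rules, and it skips schema validation entirely. The helper names child and graft are hypothetical.

```python
import copy


def child(node, key):
    """Follow one existing edge, raising KeyError if it is missing."""
    if isinstance(node, list):
        if key.isdigit() and int(key) < len(node):
            return node[int(key)]
    elif isinstance(node, dict) and key in node:
        return node[key]
    raise KeyError(key)


def graft(parent, tail, value):
    """Apply the write rules for one path whose [0:-1] resolved to parent."""
    try:
        target = child(parent, tail)
    except KeyError:
        target = None
    if isinstance(target, list):
        target.append(value)                 # [-1] is an array: append
    elif isinstance(parent, list):
        if not tail.isdigit() or int(tail) > len(parent):
            raise IndexError("index would leave a hole in the array")
        if int(tail) == len(parent):
            parent.append(value)             # one past the end: append
        else:
            parent[int(tail)] = value        # within bounds: update
    elif isinstance(parent, dict):
        parent[tail] = value
    else:
        raise TypeError("cannot write below a leaf")


def write(root, updates):
    """Graft every freetree in updates; all of them apply or none do."""
    draft = copy.deepcopy(root)
    for path, value in updates.items():
        parts = [part for part in path.split("/") if part]
        if not parts:
            raise ValueError("cannot write to the root")
        parent = draft
        for key in parts[:-1]:               # [0:-1] must already exist
            parent = child(parent, key)
        graft(parent, parts[-1], copy.deepcopy(value))
    return draft
```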

[canonical_path, canonical_path, ...] » schema » {canonical_path:schema, canonical_path:schema, ...}

Returns subschemas for each canonical_path.
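A rough sketch of the lookup, assuming the schema nests objects under "properties" and arrays under "items" (with a list of schemas for tuples), and using the made-up names subschema and schema_op:

```python
def subschema(schema, path):
    """Descend the schema along a canonical path and return what is there."""
    node = schema
    for part in (p for p in path.split("/") if p):
        if node.get("type") == "object":
            node = node["properties"][part]
        elif node.get("type") == "array":
            items = node["items"]
            # Tuples have one schema per slot; lists share a single schema.
            node = items[int(part)] if isinstance(items, list) else items
        else:
            raise KeyError(part)             # cannot descend below a leaf
    return node


def schema_op(root_schema, paths):
    """Map each canonical path to its subschema."""
    return {path: subschema(root_schema, path) for path in paths}


root_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
}
result = schema_op(root_schema, ["/name", "/tags/0"])
```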

