Overview
In a corporate environment, the task of centralizing the "enterprise" data model has had its challenges. Communicating the definition of what a data object looks like has been rather inflexible with some popular technologies like XML Schema, or awkwardly mismatched with the needs of end-applications using relational databases. JSON document encoding has become popular for transporting and storing application data, but it is often prone to problems because the common methods of defining what should be expected in a JSON document (structure, field names, etc.) are somewhat haphazard and weak. JSON Schema goes a long way towards satisfying the need to explicitly define JSON content, but it is still a challenge to implement a process that provides a useful data-document definition that can support meaningful data validation while retaining JSON's agile, freely changeable roots. This post describes a possible approach to getting the best of both worlds, by implementing processes around JSON Schemas that achieve flexibility and clearly defined data-documents at the same time.
Definitions
Since there is some confusion about what means what in the world of JSON data, let's get a few terms clear up front.
"Consumer-View" JSON Schema - an artifact, meant to be published for use by consumers of the corresponding data (i.e. application developers), that describes what can be expected in a JSON document that complies with the schema. Unlike a database or XML schema, there isn't an expectation that this FULLY describes the document, just that the document should match what actually is defined in the schema.
"Producer-View" JSON Schema - a schema artifact, meant to be used internally (i.e. not published for consumers), that exactly defines every detail of a concrete JSON Document.
JSON Document - a "data document" encoded in JSON, that, if it is advertised as compliant with a particular JSON Schema, should at least include data matching the field names and structures defined in that JSON Schema.
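To make the "at least include" idea concrete, here is a minimal sketch of a consumer-view compliance check in Python. The function name complies_with and the JSON_TYPES table are hypothetical, not part of any JSON Schema library. The check is hand-rolled and deliberately stricter than standard JSON Schema validation, where "properties" alone does not make a field mandatory (that takes the "required" keyword). It also assumes each "type" is a single string.

```python
# Hypothetical helper: a hand-rolled "at least these fields" check.
# Maps JSON Schema primitive type names to Python types.
JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def complies_with(document, schema):
    """Return True if every property the schema defines is present in the
    document with the declared type. Fields the schema does not mention
    are ignored, which is what makes this a consumer-view check."""
    for name, rule in schema.get("properties", {}).items():
        if name not in document:
            return False
        value = document[name]
        expected = JSON_TYPES.get(rule.get("type"))
        if expected is None:
            continue  # no type, or one this sketch does not know: presence is enough
        # bool is a subclass of int in Python, so reject it for non-boolean types.
        if isinstance(value, bool) and expected is not bool:
            return False
        if not isinstance(value, expected):
            return False
    return True


schema = {"title": "Book", "type": "object",
          "properties": {"bookTitle": {"type": "string"}}}

# Extra fields do not break compliance; a missing field does.
with_extra = {"bookTitle": "Hitchhiker's Guide to the Galaxy", "isbn": "0345391802"}
without_title = {"isbn": "0345391802"}
print(complies_with(with_extra, schema))     # True
print(complies_with(without_title, schema))  # False
```

A producer-view check would differ only in also rejecting fields the schema does not define.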
The Problem
The "desired" definition of a data document changes over time. Attribute names change. Data types might be altered. New stuff is included. Old stuff disappears. The organizational structure of the data gets deeper or flatter. Also, if multiple projects require different changes to a data document they use in common, at the same time, it becomes VERY difficult to manage release timing and cross-compatibility. If there must be only one JSON Schema that defines what an actual JSON document looks like, that one JSON Schema will end up having impossible constraints in order to meet everyone's needs.
The Typical, Rigid, "One-Schema" Approach
A common strategy for defining a JSON Document is to lock it together with one and only one JSON Schema. In other words, this demands that everything that is defined in the schema must be represented exactly that way in the document. Nothing more. Nothing less. This comes with all sorts of concerns and frustrations about when something can be added, and/or whether anything can ever be renamed or removed. If an old application were written using a previous version of the JSON Schema, changing anything besides adding more fields either breaks that old application or requires it to be updated. This also implies the need to keep all instances of the JSON Document in perfect sync with the JSON Schema that defines the document.
The Proposed Solution
Freely change, or "version," the "consumer-view" JSON Schema as often as necessary, and in ways that would not be permitted if it were rigidly mapped one-to-one with a JSON Document. Retain all previous versions of the JSON Schema in a published catalog, and include, in each JSON Document, a list of the JSON Schema versions it still supports. Then, separately, if desired, create a "producer-view" JSON Schema to rigidly define an actual JSON document, and "version" that separately.
Detailed Example (Bookstore Theme)
1st Published JSON Schema Version - Everything Starts Out One-to-One
JSON Schema - One field named bookTitle
{"title": "Book",
"type": "object",
"version": "X",
"properties": {
"bookTitle": {
"type": "string"
}
}
}
JSON Document - Complies with one version (X) of JSON Schema
{"jsonSchemaVersions": ["X"],
"bookTitle": "Hitchhiker's Guide to the Galaxy"
}
2nd Published JSON Schema Version - Add Field - Nothing Complicated Yet
JSON Schema - One new field named isbn
{"title": "Book",
"type": "object",
"version": "ProjectISBN",
"properties": {
"bookTitle": {
"type": "string"
},
"isbn": {
"type": "string"
}
}
}
JSON Document - Complies with both published versions of JSON Schema
{"jsonSchemaVersions": ["X", "ProjectISBN"],
"bookTitle": "Hitchhiker's Guide to the Galaxy",
"isbn": "0345391802"
}
- Note: The JSON Document still complies with JSON Schema version "X", because all it requires is the "bookTitle" field... and it's still in the document, still has the same name, etc.
3rd Published JSON Schema Version - Oops, ISBN Wasn't Quite Right
JSON Schema - Replace "isbn" with Separate Fields for ISBN-10 and ISBN-13
{"title": "Book",
"type": "object",
"version": "ISBN-FIX",
"properties": {
"bookTitle": {
"type": "string"
},
"isbn10": {
"type": "string"
},
"isbn13": {
"type": "string"
}
}
}
JSON Document - Duplicates Some Data to Remain Compliant with both "ProjectISBN" and "ISBN-FIX" JSON Schema Versions (... for now).
{"jsonSchemaVersions": ["X", "ProjectISBN", "ISBN-FIX"],
"bookTitle": "Hitchhiker's Guide to the Galaxy",
"isbn": "0345391802",
"isbn10": "0345391802",
"isbn13": "978-0345391803"
}
- Note: This document has everything it needs for all JSON Schemas published so far. However, any application that is using the "isbn" field just got notified that it may not be around forever.
- Note: This illustrates a little more clearly how the JSON Document can satisfy the requirements of previous JSON Schema versions, without the "latest" JSON Schema rigidly defining everything in the document. This JSON Schema does not define old field "isbn" but the document still has it in order to still support the "ProjectISBN" version of the "Book" Schema.
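On the consuming side, the jsonSchemaVersions list gives an application something to check before it trusts a field. The sketch below assumes a hypothetical application that was coded against the "ProjectISBN" version and only reads "isbn". The names CODED_AGAINST and read_isbn are made up for illustration.

```python
import json

# Hypothetical: the schema version this application was coded against.
CODED_AGAINST = "ProjectISBN"


def read_isbn(raw_json, coded_against=CODED_AGAINST):
    """Parse a Book document and return its "isbn" field, but only if the
    document still advertises support for the version we rely on."""
    document = json.loads(raw_json)
    supported = document.get("jsonSchemaVersions", [])
    if coded_against not in supported:
        raise ValueError(
            f"document no longer supports schema version {coded_against!r}; "
            f"it lists {supported}"
        )
    return document["isbn"]


raw = """{"jsonSchemaVersions": ["X", "ProjectISBN", "ISBN-FIX"],
"bookTitle": "Hitchhiker's Guide to the Galaxy",
"isbn": "0345391802",
"isbn10": "0345391802",
"isbn13": "978-0345391803"}"""

print(read_isbn(raw))  # 0345391802
```

Failing loudly when the version is missing from the list is one possible design choice. An application could just as well fall back to a newer field such as "isbn10" once it knows about "ISBN-FIX".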
4th and 5th Published JSON Schema Versions - Concurrent Projects
JSON Schema - Add Fields to Support Selling Books
{"title": "Book",
"type": "object",
"version": "ProjectSellBooks",
"properties": {
"bookTitle": {
"type": "string"
},
"isbn10": {
"type": "string"
},
"isbn13": {
"type": "string"
},
"cost": {
"type": "number"
},
"price": {
"type": "number"
}
}
}
Another JSON Schema Published Independently, at the Same Time - Add Fields to Support Inventory Management
{"title": "Book",
"type": "object",
"version": "ProjectInventory",
"properties": {
"bookTitle": {
"type": "string"
},
"isbn10": {
"type": "string"
},
"isbn13": {
"type": "string"
},
"countOnHand": {
"type": "integer"
},
"countOnOrder": {
"type": "integer"
}
}
}
JSON Document - Adds Support for BOTH Projects, Independently - Also Drops ProjectISBN Compliance ("isbn" field is gone now)
{"jsonSchemaVersions": ["X", "ISBN-FIX", "ProjectSellBooks", "ProjectInventory"],
"bookTitle": "Hitchhiker's Guide to the Galaxy",
"isbn10": "0345391802",
"isbn13": "978-0345391803",
"cost": 5.05,
"price": 7.99,
"countOnHand": 20,
"countOnOrder": 10
}
- Note: This document still has everything it needs for most previously published JSON Schema versions as well as both of the two new ones. Notice that the two new JSON Schemas do not need to include each other's added fields. The independent schema changes ONLY affect the document.
- Note: The jsonSchemaVersions list no longer has "ProjectISBN" because the document no longer supports everything the "ProjectISBN" schema included (i.e. the "isbn" field). The app developers were warned this was coming!!
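The producer does not have to maintain the jsonSchemaVersions list by hand. As a rough sketch, assuming the published catalog can be reduced to a mapping from version name to the field names that version defines (the CATALOG dict and supported_versions function below are hypothetical), the list can be recomputed whenever the document changes. This version compares field names only and ignores types.

```python
# Hypothetical catalog: every published consumer-view schema, keyed by its
# "version" value, reduced here to the set of field names it defines.
CATALOG = {
    "X": {"bookTitle"},
    "ProjectISBN": {"bookTitle", "isbn"},
    "ISBN-FIX": {"bookTitle", "isbn10", "isbn13"},
    "ProjectSellBooks": {"bookTitle", "isbn10", "isbn13", "cost", "price"},
    "ProjectInventory": {"bookTitle", "isbn10", "isbn13",
                         "countOnHand", "countOnOrder"},
}


def supported_versions(document, catalog=CATALOG):
    """List the catalog versions whose fields are all present in the document."""
    present = set(document) - {"jsonSchemaVersions"}
    return [version for version, fields in catalog.items() if fields <= present]


document = {
    "bookTitle": "Hitchhiker's Guide to the Galaxy",
    "isbn10": "0345391802",
    "isbn13": "978-0345391803",
    "cost": 5.05,
    "price": 7.99,
    "countOnHand": 20,
    "countOnOrder": 10,
}
document["jsonSchemaVersions"] = supported_versions(document)
print(document["jsonSchemaVersions"])
# ['X', 'ISBN-FIX', 'ProjectSellBooks', 'ProjectInventory']
```

"ProjectISBN" drops out of the result on its own, because the "isbn" field it defines is no longer in the document.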
Latest Published JSON Schema Version - Single New Project Additions + Cleanup
JSON Schema - Pull Multiple Previous JSON Schemas Together, and Add a few Things
{"title": "Book",
"type": "object",
"version": "Book2.0",
"properties": {
"bookTitle": {
"type": "string"
},
"isbn10": {
"type": "string"
},
"isbn13": {
"type": "string"
},
"cost": {
"type": "number"
},
"price": {
"type": "number"
},
"countOnHand": {
"type": "integer"
},
"countOnOrder": {
"type": "integer"
},
"coverImageLink": {
"type": "string"
},
"synopsis": {
"type": "string"
},
"author": {
"type": "string"
}
}
}
JSON Document - Everything that was Published Before, and then Some...
{"jsonSchemaVersions": ["X", "ISBN-FIX", "ProjectSellBooks", "ProjectInventory", "Book2.0"],
"bookTitle": "Hitchhiker's Guide to the Galaxy",
"isbn10": "0345391802",
"isbn13": "978-0345391803",
"cost": 5.05,
"price": 7.99,
"countOnHand": 20,
"countOnOrder": 10,
"coverImageLink": "http://mybookstore.example.com/images/covers/img0345391802.jpg",
"synopsis": "The answer to life, the universe, and everything, is 42.",
"author": "Douglas Adams"
}
- Note: This document still identifies all of the previously published JSON Schema versions it supports, and any application that was coded against any one of those listed should still find the fields it knows about, right where they should be.
Producer-View JSON Schema
One of the main things that seems to aggravate the process of modeling data-documents that are shared by multiple consumers is the lack of separation between the "consumer-view" of the data and the "producer-view" of the data. Back up in the "Definitions" section, there are two different JSON Schema artifacts defined. The example doesn't say much (or maybe anything at all) about the "Producer-View" JSON Schema. That's because the example focuses on the primary reason for defining JSON Documents, which is the application-end / consumer-view.
Part of this proposed solution is to stop trying to combine them. Apart from the very first one, the JSON Document examples above didn't match up exactly to the entire set of JSON Schema documents. In some cases, like the multiple, independent changes for concurrent projects, the actual JSON document wouldn't have exactly matched any single JSON Schema. This fact exposes the need for an internal-use-only "super-schema," or "producer-view JSON Schema," that exactly defines the content of a document that satisfies the requirements of all of its supported "consumer-view JSON Schemas." While it isn't strictly necessary to create this Schema document, having it would help to communicate with the "back-office" developers who need to know what the actual super-set document needs to have in it.
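One way such a "super-schema" could be derived, rather than written by hand, is to merge the properties of every consumer-view schema the document still supports. The sketch below is only an illustration under stated assumptions: build_producer_schema is a hypothetical name, the two input schemas are abbreviated versions of "ProjectSellBooks" and "ProjectInventory" (the ISBN fields are left out for brevity), and making every field required with "additionalProperties": false is one reading of "exactly defines." The "jsonSchemaVersions" field is added to the result because every document carries it.

```python
def build_producer_schema(consumer_schemas, title="Book"):
    """Merge the properties of every supported consumer-view schema into one
    strict schema: every field is required and nothing else is allowed."""
    # Assumption: every document carries the list of versions it supports.
    merged = {
        "jsonSchemaVersions": {"type": "array", "items": {"type": "string"}},
    }
    for schema in consumer_schemas:
        for name, rule in schema.get("properties", {}).items():
            if name in merged and merged[name] != rule:
                raise ValueError(f"conflicting definitions for field {name!r}")
            merged[name] = rule
    return {
        "title": title,
        "type": "object",
        "properties": merged,
        "required": sorted(merged),
        "additionalProperties": False,
    }


# Abbreviated consumer-view schemas (ISBN fields omitted for brevity).
sell_books = {"version": "ProjectSellBooks", "properties": {
    "bookTitle": {"type": "string"},
    "cost": {"type": "number"},
    "price": {"type": "number"}}}
inventory = {"version": "ProjectInventory", "properties": {
    "bookTitle": {"type": "string"},
    "countOnHand": {"type": "integer"},
    "countOnOrder": {"type": "integer"}}}

producer = build_producer_schema([sell_books, inventory])
print(producer["required"])
# ['bookTitle', 'cost', 'countOnHand', 'countOnOrder', 'jsonSchemaVersions', 'price']
```

Raising an error on conflicting definitions is a deliberate choice here: if two consumer-view schemas disagree about a field's type, no single document can satisfy both, and the producer should find that out early.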