Method | Akoma Ntoso

Strategic Goals

“lingua franca”, long term storage, common metadata, self-explanatory, extensible.

The Akoma Ntoso model has been informed by the following strategic goals:

To create a “lingua franca” for the interchange of parliamentary, legislative and judiciary documents between institutions. For example, Parliament/Court X should be able to easily access a piece of legislation made available in Akoma Ntoso format by Parliament/Court Y. The goal here is to speed up the process of drafting new legislation, writing sentences, etc. by reducing the amount of accessing, re-keying, re-formatting etc. required.
To provide a long-term storage and access format for parliamentary, legislative and judiciary documents that allows search, interpretation and visualization of such documents several years from now, even in the absence of the specific applications and technologies that were originally used to generate them.
To provide an implementable baseline for parliamentary, legislative and judiciary systems in institutions. It is envisaged that this will lead to one or more systems that provide the base layer of software “out of the box” that can then be customized to local needs. The goals here are twofold. Firstly, to facilitate the process of introducing IT into institutions. Secondly, to reduce the amount of re-invention of the wheel that would result if all institutions pursued separate IT initiatives in the area of parliamentary, legislative and judiciary document production and management.
To create common data and metadata models so that information retrieval tools and techniques used in Parliament/Court X can also be used in Parliament/Court Y. To take a simple example, it should be possible to search across the document repositories of multiple Parliaments/Courts in a consistent and effective way.
To create common mechanisms for resource naming and linking so that documents produced by Parliaments/Courts can be easily cited and cross-referenced – either by other Parliaments/Courts or by other users.
To be “self-explanatory”, that is, documents should provide all the information needed for their use and meaning through a simple examination, even without the aid of specialized software.
To be “extensible”, that is, it must be possible to modify the models within the Akoma Ntoso framework so that local customisation can be achieved without sacrificing interoperability with other systems.

Simple data model

identify a number of basic, fundamental classes of structures.

The Akoma Ntoso document model is designed, first and foremost, to be actually used. As a consequence, a high premium has been placed on simplicity throughout its design. Data models created to handle complex document types (such as legislation or sentences) need to deal with two apparently opposed requirements: on the one hand, they need to be sufficiently sophisticated to handle all possible semantic and structural occurrences and situations that may occur in the actual documents. On the other, they need to be speedily understood and used by the people who would need to apply these models.

These opposed requirements can be jointly satisfied not by simplifying the vocabularies of available structures and elements, which would reduce the available descriptive sophistication of the language, but rather by simplifying the structure variability and types (in XML parlance, the content models), thereby reducing the learning time and the software complexity without compromising a full and detailed descriptive power of the language. The idea therefore is to identify a number of basic, fundamental classes of structures (containers, hierarchies, blocks, etc.) that can be immediately understood and used appropriately, regardless of their actual names.
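As a rough illustration of this idea, the sketch below models two of the structure classes named above (hierarchies and blocks) as Python types. The class names, fields and the `outline` helper are assumptions made for this example and are not taken from the Akoma Ntoso schema; the point is only that one content model can serve elements with many different names.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Block:
    """A block of text; 'name' is whatever the local vocabulary calls it."""
    name: str
    text: str


@dataclass
class Hierarchy:
    """A nested structure with an optional number and heading.

    The same class covers a chapter, a section or an article: only the
    'name' differs, the content model stays the same.
    """
    name: str
    num: Optional[str] = None
    heading: Optional[str] = None
    children: List[Union["Hierarchy", Block]] = field(default_factory=list)


def outline(node, depth=0):
    """Return an indented outline, treating every hierarchy alike."""
    indent = "  " * depth
    if isinstance(node, Block):
        return [indent + node.name]
    label = " ".join(part for part in (node.name, node.num, node.heading) if part)
    lines = [indent + label]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines
```

A tool written against `Hierarchy` needs no changes when a country calls its subdivisions "article" rather than "section", which is the saving in learning time and software complexity the paragraph above describes.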

Ability to evolve

built to stand evolutions and changes over time.

A critical attribute of a successful XML model is its ability to evolve over time. This “evolvability” has been a key concern in the creation of the Akoma Ntoso model. Thus the language is built to stand evolutions and changes over time: it can be customized at will for local needs and purposes, and still be made compatible with the overall Akoma Ntoso infrastructure and the general language.

Furthermore, the language is built to stand evolutions and changes even regarding the number of actual functionalities provided: features such as the number and type of metadata values, or the automatic generation of amended text, or the activation of special analysis tools on the text may require the language to evolve over time. In these cases, it can be guaranteed that existing documents already marked up according to initial versions of Akoma Ntoso will be either immediately compatible with the new schemas, or easily convertible to them via a single XSLT stylesheet to be provided.

Correctness assurance

verifying the correctness of an XML document against specific schema.

“Validation” is the word used in XML parlance to refer to the act of checking the correctness of an XML document according to pre-defined structural rules expressed in the formal definition of the language, given in one or more DTDs or XML Schemas. The validation step verifies whether the XML document contains, in number and position, all the expected elements of the type this document is an instance of. The Akoma Ntoso XML language is formally defined using an XML Schema, and the corresponding validation schema imposes a number of constraints and restrictions on the final form of the XML document, requiring for instance a specific order in the containment of parts, or that any part is preceded by a heading, etc.

The problem with being very restrictive in the constraints specified in the validation schema is that documents may be drafted (e.g. approved by a Parliament) that do not conform to these rules: many countries have guidelines for the correct drafting of legislation or judgements, but these have no prescriptive value. They are just what they are called, guidelines, which can be ignored and customized at will by a higher authority such as a Parliament or Court.

This fact has a very important effect on the generation of electronic versions of such documents: everything that gets approved by Parliaments or written by Courts has to be accepted by the system, and everything that has already been approved even more so. Therefore, failing XML validation (i.e., violating one or more of the constraints and restrictions expressed in the schemas) cannot have the effect of rejecting documents, but, at most, of pointing out issues and differences from the guidelines that the authority itself, if it wants and has time to spend on this, can consider for editing and modifications.

If we consider the validation schema a contract on the form a document has to assume, then this contract clearly binds only the author of the XML markup, leaving the author of the textual content (i.e., the legislator or the judge) absolutely free to organise the text as he/she chooses. Thus compliance to rules such as “A numerical identifier will always be associated to each substructure of the act” or “The enactment date will be specified” can be safely required, as they only bind the XML markup of the document, while structural rules (such as “Every subpart will have a heading”, or “A section will contain paragraphs which contain clauses”) can only be suggested, and not imposed, as they would interfere with the authority and independence of the legislator, which in the case of Parliaments is most often total and not constrainable by the mundane requirement of adherence to an abstract document structure.

The goals of forcing authors of XML markup to fully describe all the parts of the document, and of leaving the legislator with the maximum freedom in writing, may seem incompatible and hard-to-reach, but they can be and are reached within the Akoma Ntoso framework. Akoma Ntoso clearly separates data and metadata, thereby clearly distinguishing the contribution of the legislator (data) and the contribution of the author of the XML markup (metadata); Akoma Ntoso provides a richly evocative vocabulary of structures and elements, so that the markup author can correctly and precisely describe what is actually contained in the documents. Akoma Ntoso imposes little or no constraints on data, letting the legislator write and organize the text matter as wished, but imposes a number of constraints on the metadata, forcing the markup author to provide all bits of information that are necessary to manage and organize the document.

Yet it might be appropriate for the tool to guide the drafter through the drafting guidelines enacted in each country, and help in following them as precisely as possible. One way out, as adopted by Akoma Ntoso, is to provide a number of concentric schemas, the outermost of which, called the General Schema, is fully descriptive: it binds not the legislator but only the author of the markup, allowing him or her to describe as precisely as possible the actual structure of the document as approved and issued by the Parliament. Within the rules dictated by the General Schema one can add any number of custom schemas, which can be made more prescriptive and can be used to check whether the document actually conforms to the existing legal drafting guidelines in each individual country. Successful validation of documents against the general schema is required, since errors there would signal incorrect markup (which is not acceptable in Akoma Ntoso), while the detailed schema can be used, at the discretion of the Parliament itself, to automatically check conformance of the proposed bill against the drafting guidelines, and thus warn the legislator to modify it accordingly in case conformance is sought.

Both descriptive and prescriptive schemas, of course, are closely related: they both use the same vocabularies, and have the same set of basic, fundamental rules. They then differ in the number of additional rules and constraints they impose. All rules enforced in the general schema also exist in all custom schemas, so that all documents that are valid according to any of the custom schemas are also valid according to the general schema.
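The relationship between the two layers can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Akoma Ntoso schemas themselves: the element names, the `date` and `id` attributes, and the functions `general_rules`, `custom_rules` and `validate` are all hypothetical, and the rules are written as plain functions rather than as XML Schema. The general layer checks only what binds the markup author; the custom layer adds a drafting guideline whose violation produces a warning instead of a rejection.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; not the real Akoma Ntoso vocabulary.
SAMPLE = """
<act date="2020-01-15">
  <section id="sec_1"><heading>Scope</heading><p>Text of section 1.</p></section>
  <section id="sec_2"><p>Text of section 2, approved without a heading.</p></section>
</act>
"""


def general_rules(root):
    """Rules that bind the markup author; a violation means incorrect markup."""
    problems = []
    if not root.get("date"):
        problems.append("enactment date is missing")
    for section in root.iter("section"):
        if not section.get("id"):
            problems.append("a section has no identifier")
    return problems


def custom_rules(root):
    """Extra rules from one country's drafting guidelines (assumed here)."""
    problems = []
    for section in root.iter("section"):
        if section.find("heading") is None:
            problems.append(f"section {section.get('id')} has no heading")
    return problems


def validate(xml_text):
    """Reject only on general-rule errors; report guideline issues as warnings."""
    root = ET.fromstring(xml_text)
    errors = general_rules(root)
    warnings = custom_rules(root)
    return {"accepted": not errors, "errors": errors, "warnings": warnings}
```

With the sample above, `validate(SAMPLE)` accepts the document and reports one warning for the section without a heading, which mirrors the principle that text approved by a Parliament must be accepted even when it departs from the guidelines.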

Furthermore, this double layer of schemas also allows interoperability of documents coming from different countries. In fact the descriptive general schema can accept all Akoma Ntoso documents from any interested country, and can be used as the baseline for accessing and displaying documents regardless of their provenance.

On the other hand, each prescriptive detailed schema is created to deal with the specific guidelines of an individual country, supporting a more precise legal drafting process and the correct preservation of that country's cultural peculiarities. The fundamental commonality of these schemas therefore provides a full description of individual and country-specific document types without giving up interoperability and document interchange.

Tools

editor, converter, name resolver, post-editing tools.

Just as there are many users (some of whom are not even aware that they are using or relying on Akoma Ntoso-compatible systems), there are many tools that need to be created around the Akoma Ntoso document model. Some of them are basic tools that are necessary for the Akoma Ntoso system to work at all. Others are additional applications that are to be used once a sufficiently large document base has been generated using the Akoma Ntoso language.

Although this is not the place to provide a full list of the foreseeable tools, a brief list of the main categories may help in explaining the breadth and variety of the Akoma Ntoso project, and the number of issues that need to be considered in the development of the data format.

It is also to be remembered that Akoma Ntoso is fundamentally an open standard for data formats: tools can be produced by any individual and organization, and those proposed within the Akoma Ntoso project are only to be considered as proposals and baseline tools: as long as a new tool conforms to the requirements of the Akoma Ntoso data formats, it is to be considered as good as the Akoma Ntoso tools for all purposes.

The editor

The editor is one of the two fundamental tools for the generation of XML versions of legislation, judgements, etc. Although in most real-life scenarios drafting can be done without a specialized editor, there will be situations in which using a specific editor is appropriate and possibly even necessary.

The editor is used in three different scenarios:

as an application for the direct insertion of both text and markup, starting from an empty document: although this is the easiest case to understand, it is probably the rarest concrete scenario of use, as the drafting offices will most usually work from existing documents in some other format.
as an application to manually mark up a document whose textual content was provided in a different format. Depending on the sophistication of the conversion engine, this scenario will most probably blend naturally with the following one, with automatic tools suggesting markup that is then verified and approved by the human user. The editor basically provides functionalities to edit and add any kind of Akoma Ntoso-conformant markup, and is able to check the validity of any intermediate result.
as an interface to activate, control and verify the automatic conversion obtained by the tool described in the next section. Through the editor the user can verify the correctness of the conversion and, using the standard editing interface, change and add whatever markup or content the conversion engine has missed or misidentified.

The converter

The converter is, with the editor, the most fundamental tool for the Akoma Ntoso system. The need for a converter is based on the assumption that in many cases the drafting process is already in place with a number of generic and specialized tools, and that the offices and the drafters may start conformance by adopting a tool for the final task of generating the XML version of the document, rather than replacing their existing workflow with new tools and new tasks. This final tool is the converter.

The converter has the double purpose of

converting into Akoma Ntoso XML documents the files that the “drafter” is producing traditionally, and,

converting into Akoma Ntoso files the legacy documents, such as the existing acts and judgements that form the current situation of each country, and whose conversion into XML is needed for any hypertext web of references to work at all. Since legacy documents are, by definition, in any old format, and since it makes no sense to type them in XML using an XML editor, the task of conversion happens through a semi-automatic operation using the converter.

The conversion is based on the idea of semi-automatic operation, i.e., it is based on an automatic process that determines as correctly as possible the actual structures, and a subsequent manual process that confirms (or, if there is an error, modifies) the inferences made by the automatic process. In fact, this application is often meant to be one of the modules of the editor, and uses the editor itself for corrections to the automatic inferences of the converter. Of course, the amount of human editing is inversely proportional to the regularity of the documents and the sophistication of the converter, so that large quantities of regular documents can usually be processed automatically with little or no manual intervention.

The converter works by examining the typographical and textual regularities of the document, and by inferring from them a structural or semantic role for each text fragment. When no structural or semantic role can be inferred automatically, the presentational characteristics will be recorded instead and the human user will be asked to provide the structural or semantic role that the tool failed to identify.
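The inference step described above might be sketched as follows. This is a toy illustration, not the converter used in the Akoma Ntoso project: the regular expressions in `PATTERNS`, the role names and the `infer_roles` function are assumptions chosen for the example, and a real converter would need rules tuned to each jurisdiction's typographic conventions.

```python
import re

# Hypothetical patterns for one imaginary drafting style.
PATTERNS = [
    ("section", re.compile(r"^Section\s+(\d+)\.?\s*(.*)$")),
    ("subsection", re.compile(r"^\((\d+)\)\s+(.*)$")),
    ("clause", re.compile(r"^\(([a-z])\)\s+(.*)$")),
]


def infer_roles(lines):
    """Assign a structural role to each non-empty line, or flag it for review."""
    results = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue
        for role, pattern in PATTERNS:
            match = pattern.match(stripped)
            if match:
                results.append({
                    "role": role,
                    "num": match.group(1),
                    "text": match.group(2),
                    "needs_review": False,
                })
                break
        else:
            # No role could be inferred: keep presentational facts only
            # and leave the decision to the human user.
            results.append({
                "role": None,
                "text": stripped,
                "presentation": {
                    "indent": len(line) - len(line.lstrip()),
                    "all_caps": stripped.isupper(),
                },
                "needs_review": True,
            })
    return results
```

Fragments returned with `needs_review` set to `True` are the ones an editor would present to the user for manual confirmation.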

Experiences with European laws show that the basic structure of the bill (sections, subsections, clauses, preambles, conclusions, attachments, etc.) can be inferred automatically with great precision and few errors. The most important semantic elements, references and dates, can also be deduced automatically with great precision as long as the human-readable text used for them uses one of a limited number of acceptable forms. More complex structural elements (explicit modifications, specialized terms, people, etc.) might be difficult to catch in a fully automatic way, but this is not impossible.

Name resolvers

The Akoma Ntoso Naming Convention is a standard mechanism for creating identifiers of documents that can be used for accessing content and metadata regardless of storage options and architecture.

Akoma Ntoso documents are stored on networked computers and are accessible by specifying their network addresses. Yet these addresses are extremely dependent on the specificities of the architecture adopted by the custodian of those documents, and on the technologies and tools that are in vogue or appropriate for the economic and technical context of the custodian. It is extremely inappropriate, therefore, for any content or structure that is planned to last for more than a short period of time to be named according to the physical address of the document in the form that is currently used to access it.

For this reason, the Akoma Ntoso Naming Convention specifies an architecture-independent network address (using a permanent family of Web-derived URI addresses) for all relevant structures of the Akoma Ntoso standard, which on the other hand is not meant to be used directly for accessing these structures.

A name resolver is a software tool that can, given an architecture-independent URI, identify the resource being sought and provide the current architecture-dependent address that needs to be used at any given time in order to perform the actual access.

Name resolvers are either indirect, in that they return to the client application the current address of the requested document and leave to the client application the task of re-requesting the document at the correct address, or direct, in that they immediately return the requested document by generating the actual physical address and requesting the document as a proxy for the initial client application.
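The difference between the two kinds of resolver can be sketched as below. Everything named here is an assumption for illustration: the URI `/akn/xx/act/2020/1`, the `archive.example` addresses, the `LOCATIONS` and `STORAGE` tables (in-memory stand-ins for a registry and for the network) and the two functions are hypothetical and do not reproduce the Akoma Ntoso Naming Convention.

```python
# Hypothetical registry: architecture-independent URI -> current physical address.
LOCATIONS = {
    "/akn/xx/act/2020/1": "https://archive.example/store/0001.xml",
}

# Stand-in for the network: physical address -> document content.
STORAGE = {
    "https://archive.example/store/0001.xml": "<act>Example text</act>",
}


def resolve_indirect(uri):
    """Indirect resolver: return the current address; the client fetches it."""
    try:
        return LOCATIONS[uri]
    except KeyError:
        raise LookupError(f"no document registered for {uri}") from None


def resolve_direct(uri, fetch=STORAGE.__getitem__):
    """Direct resolver: look up the address and fetch the document as a proxy."""
    return fetch(resolve_indirect(uri))
```

If the custodian moves the document, only the entry in `LOCATIONS` changes; every citation that uses the architecture-independent URI keeps working.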

Post-editing tools

The post-editing tools are a number of validation, enrichment, and storage tools that are used after the “legal drafter” has finished his/her editing job. All these tools require no user interface to speak of, and are managed either automatically or by the system administrator of the storage centre for all Akoma Ntoso documents. These tools include at least the following (but the list might be longer and more sophisticated):

A content and structure validator that checks the correctness of the document instance with regard to the Akoma Ntoso schema document, and to any additional rules that were added locally.
A reference validator that checks whether all references contained in the document already belong to the document collection and are correctly referenced.
A metadata validator that checks whether the metadata stored with the document are correct and complete.
A sophisticated and complex document management system, with search engines, hypertext functionalities, XSLT support and versioning facilities.
An XSLT stylesheet (or a series thereof) to create visualizations of individual documents for a number of browsers and applications that will increase and get more sophisticated in time.
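As an example of one of the tools listed above, a reference validator could work roughly as follows. This is a sketch under assumptions: it uses un-namespaced, simplified markup with a `ref` element carrying an `href` attribute, treats anything after `#` as a pointer inside the target document, and represents the document collection as a plain set of identifiers. The identifiers and the `unresolved_references` function are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document and collection, for illustration only.
SAMPLE = """
<act>
  <p>As amended by <ref href="/akn/xx/act/2019/7#sec_3">section 3 of Act 7</ref>
  and repealed by <ref href="/akn/xx/act/2021/2">Act 2 of 2021</ref>.</p>
</act>
"""
COLLECTION = {"/akn/xx/act/2019/7", "/akn/xx/act/2020/1"}


def unresolved_references(xml_text, collection):
    """Return the href of every reference whose target is not in the collection."""
    root = ET.fromstring(xml_text)
    missing = []
    for ref in root.iter("ref"):
        href = ref.get("href", "")
        # Assumed convention: the part before '#' identifies the document.
        target = href.split("#", 1)[0]
        if target not in collection:
            missing.append(href)
    return missing
```

Here `unresolved_references(SAMPLE, COLLECTION)` reports only the second reference, since the first one points into a document that is already part of the collection.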