Archetype Design Principles

Archetype definition:

  • a computable expression of a domain content model in the form of structured constraint statements, based on a reference (information) model. openEHR archetypes are based on the openEHR reference model, and all archetypes are expressed in the same formalism. In general they are defined for wide re-use; however, they can be specialised to include local particularities. They can accommodate any number of natural languages and terminologies.

Purpose of Archetypes

Archetypes are created for a number of purposes:

  • Human communication: to enable domain concepts to be modelled in a formal way by domain experts;
  • Knowledge-enabling systems: to separate information and knowledge concerns in software systems, allowing cheap, future-proof software to be built;
  • Knowledge-level interoperability: to allow systems to reliably communicate with each other at the level of knowledge concepts;
  • Domain empowerment: to empower domain specialists to define the informational concepts they work with, and to have direct control over their information systems;
  • Intelligent querying: to be used at runtime to enable the efficient querying of data based on the structure of the archetypes from which the data was created.

Archetype Design Principles

Principle 1 - Encapsulation

An archetype defines a whole, distinct, domain-level model of content. Archetypes should define coherent, whole informational concepts from the domain, in order to be useful. Archetypes enable self-standing groupings of information to be defined regardless of context. For example, there may be an archetype for "ECG result" since this is understood and used as a whole concept by clinicians, but not "ECG lead 2 result", which would only ever be understood as part of an "ECG result". The heart rate, as determined in an ECG, may be archetyped separately as this is a distinct concept that can be understood on its own. Similarly, we would not consider the heading "systolic" to be a meaningful archetype on its own; rather, it would be part of a "blood pressure" or "intravascular pressure" archetype.

Principle 2 - Archetypes constrain reference model instance structures

An archetype defines constraints on the structure of instances of a reference model. An archetype can define the valid structuring of data instances to form a logical instance of domain content. For example, the hierarchical structure of "SOAP" headings used in problem-oriented recording is definable in an archetype in terms of a structure of Section instances, assuming Section is a type from the model. The implication of this principle is that reference models do not need to supply domain-specific structures, only generic building blocks suitable for creating the latter.

Principle 3 - Archetypes constrain reference model types and values

An archetype defines constraints on types and values of instances of a reference model. Archetypes also express constraints on allowable constructions of reference model instances, e.g. on allowed types, ordering, cardinality, values and so on. The combination of structure and constraint expression means that numerous variations on a data instance may conform to a single archetype.
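
As a rough illustration of how type, cardinality and value constraints can admit many conforming instances, the following Python sketch checks candidate data against a single constraint. The class `ValueConstraint`, its fields and the numeric limits are hypothetical and chosen only for illustration; this is not the openEHR constraint formalism itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueConstraint:
    """Hypothetical constraint on a repeated numeric node."""
    allowed_type: type   # constraint on the type of each value
    min_count: int       # cardinality: fewest values allowed
    max_count: int       # cardinality: most values allowed
    low: float           # lowest allowed value (inclusive)
    high: float          # highest allowed value (inclusive)

    def conforms(self, values: list) -> bool:
        """True if the list satisfies cardinality, type and range."""
        if not (self.min_count <= len(values) <= self.max_count):
            return False
        return all(
            isinstance(v, self.allowed_type) and self.low <= v <= self.high
            for v in values
        )


# Illustrative numbers only: one to three readings between 0 and 1000.
pressure = ValueConstraint(float, 1, 3, 0.0, 1000.0)

candidates = [[120.0], [118.0, 122.5], [], [120.0, "high"], [-4.0]]
conforming = [c for c in candidates if pressure.conforms(c)]
```

Two of the five candidates conform, which shows in miniature how one constraint definition covers several distinct data instances.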

Principle 4 - Granularity of encapsulation

The granularity of an archetype corresponds to the granularity of a business concept in an information model. Archetypes are defined at the same level of granularity as the 'business' entities in the reference model, i.e. the key types that in turn may have internal finer-grained structure. For example, since the openEHR reference model includes the business concepts Composition and Observation, archetypes of these same concepts can be created, e.g. a "cholesterol result" (a kind of Observation) and an "encounter note" (a kind of Composition).

Principle 5 - Archetypes and the reference model belong to an information ontology

Since each business concept in the reference model corresponds to a particular ontological level found in the domain, archetypes based on each business concept belong to the same ontological level. Taking the openEHR reference model as an example, there are four ontological levels: the EHR, the Composition, the Section, and the Entry. Each of these defines a different category of artefact; for example, the Entry (including subtypes Observation, Evaluation etc.) defines the generic semantics of 'clinical statements'. All archetypes based on Entry or its subtypes belong to this same ontological category, defining specific kinds of clinical statement.

Principle 6 - Archetypes can be aggregated

A compositional relationship can exist between archetypes. Archetypes can be composed to express valid possibilities for larger structures of data from different levels of the ontological hierarchy of the reference model. Such compositional connections are termed 'slots'. For example, Section and Entry archetypes can be linked in a compositional way to define valid structures for the headings and data of a model of information captured in a "physical examination".
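
The idea of a slot can be sketched as a named attachment point that lists what may be plugged into it. In the Python sketch below, the `Archetype` class, the `can_fill` helper and the archetype names are all hypothetical, and matching on reference model type alone is a deliberate simplification of whatever rules a real slot would carry.

```python
from dataclasses import dataclass, field


@dataclass
class Archetype:
    """Hypothetical, minimal stand-in for an archetype."""
    archetype_id: str
    rm_type: str  # reference model type, e.g. "Section"
    # slot name -> set of reference model types allowed in that slot
    slots: dict = field(default_factory=dict)


def can_fill(parent: Archetype, slot: str, child: Archetype) -> bool:
    """True if `child` is an allowed filler for `slot` in `parent`."""
    return child.rm_type in parent.slots.get(slot, set())


# A Section archetype whose "items" slot accepts nested Sections
# and Observations (illustrative names, not real archetype ids).
physical_exam = Archetype(
    "physical_examination", "Section",
    slots={"items": {"Section", "Observation"}},
)
heart_exam = Archetype("heart_examination", "Observation")
encounter = Archetype("encounter_note", "Composition")

heart_fits = can_fill(physical_exam, "items", heart_exam)
encounter_fits = can_fill(physical_exam, "items", encounter)
```

Here the Observation archetype is accepted by the slot, while the Composition archetype, which sits at a higher ontological level, is rejected.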

Principle 7 - Archetypes can be specialised

An archetype can be a specialisation of another archetype. Archetypes can be defined at higher or lower levels of detail at a given ontological level. Thus, a "biochemistry result" archetype would define the general shape and constraints for all biochemistry results, while a "cholesterol result" archetype could be defined as a specialisation of this, in order to further constrain data to conform only to the shape of a cholesterol test.
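
One way to picture specialisation is as a subset relation between what the parent and the child allow. The sketch below assumes, purely for illustration, that a result archetype constrains only the set of analyte names it may carry; the names and helper functions are hypothetical.

```python
# Hypothetical constraint: the set of analyte names a result may carry.
biochemistry_result = frozenset({"cholesterol", "glucose", "sodium"})
cholesterol_result = frozenset({"cholesterol"})


def is_specialisation(child: frozenset, parent: frozenset) -> bool:
    """A specialisation only further constrains its parent, so
    everything the child allows must also be allowed by the parent."""
    return child <= parent


def conforms(analyte: str, allowed: frozenset) -> bool:
    """True if the analyte is permitted by the constraint."""
    return analyte in allowed


child_ok = is_specialisation(cholesterol_result, biochemistry_result)
reverse_ok = is_specialisation(biochemistry_result, cholesterol_result)
```

A consequence of the subset relation is that any data conforming to the specialised archetype also conforms to its parent, but not the other way round.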

Principle 8 - Archetype structures are hierarchical

Archetypes are internally hierarchical in structure; that is to say, the constraints in an archetype have an internal hierarchical compositional structure. This is because object models give rise to data that is inherently hierarchical in structure.

Principle 9 - Archetype nodes are identified by unique paths

Archetype nodes, including the root and all leaves, are identified by semantic identifiers which act as the basis for human-readable 'meanings', and for computational paths. Archetype node identifiers are defined within the archetype using coded terms. The definition of any such code acts as a standardised 'design-time meaning' of the node, e.g. "5 minute Apgar result". Any node in an archetype can be referenced by concatenating attribute names and node identifiers from the archetype root to the node, to form an archetype path.
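
The concatenation described above can be sketched in a few lines of Python. The sketch assumes a bracketed `attribute[node_id]` notation for each step; the function name `archetype_path` and the node codes are placeholders, not codes taken from any published archetype.

```python
def archetype_path(steps):
    """Join (attribute_name, node_id) pairs into a path string.

    A step whose node_id is None contributes only the attribute name.
    """
    parts = []
    for attribute, node_id in steps:
        if node_id is None:
            parts.append(attribute)
        else:
            parts.append(f"{attribute}[{node_id}]")
    return "/" + "/".join(parts)


# Placeholder codes, listed from the archetype root down to a leaf.
steps = [
    ("data", "at0001"),
    ("events", "at0002"),
    ("data", "at0003"),
    ("items", "at0004"),
    ("value", None),
]
path = archetype_path(steps)
# path == "/data[at0001]/events[at0002]/data[at0003]/items[at0004]/value"
```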

Principle 10 - Queries are based on paths

Archetype paths form the basis of reusable semantic queries on archetyped data. Archetype paths can be used to construct queries that specify data items at a domain level, rather than being limited to the classes and attributes of the reference model, as a conventional database query would be. For example, paths from a "blood pressure measurement" archetype may identify the systolic blood pressure (baseline), systolic pressures for other time offsets, the patient position, and numerous other data items.
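
To make path-based querying concrete, the sketch below resolves a path against data held as nested Python dictionaries. The data layout (a `node_id` key on each node), the `select` function and the node codes are assumptions made for this example; it is not a real query language implementation.

```python
import re

# One path step: an attribute name with an optional [node_id] part.
STEP = re.compile(r"^(\w+)(?:\[(\w+)\])?$")


def select(node, path):
    """Return every value in `node` reached by following `path`."""
    current = [node]
    for step in path.strip("/").split("/"):
        match = STEP.match(step)
        if match is None:
            raise ValueError(f"bad path step: {step!r}")
        attribute, node_id = match.groups()
        found = []
        for item in current:
            if not isinstance(item, dict) or attribute not in item:
                continue
            child = item[attribute]
            children = child if isinstance(child, list) else [child]
            for c in children:
                if node_id is None:
                    found.append(c)
                elif isinstance(c, dict) and c.get("node_id") == node_id:
                    found.append(c)
        current = found
    return current


# Two events created from the same archetype node (placeholder codes).
record = {
    "data": [{
        "node_id": "at0001",
        "events": [
            {"node_id": "at0002", "items": [
                {"node_id": "at0004", "value": 120},
                {"node_id": "at0005", "value": 80},
            ]},
            {"node_id": "at0002", "items": [
                {"node_id": "at0004", "value": 135},
            ]},
        ],
    }]
}

systolic = select(record, "/data[at0001]/events[at0002]/items[at0004]/value")
# systolic == [120, 135]
```

The same path returns the matching item from every event, which is what makes such a query reusable across all data built from the archetype.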

Principle 11 - Data elements can be uniquely identified via archetypes

Data generated from an archetype will have a compositional structure, in which nodes must be uniquely named, in order to be able to refer to them. Since an archetype acts as a kind of 'template', there can be repetitions of archetype structures in real data. Each node therefore needs a unique runtime name in order to be uniquely identified within the data. The archetype can be used to constrain this unique name.
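
A minimal sketch of the naming problem follows: when the same archetype node is instantiated more than once among siblings, each instance needs a distinct runtime name. The numbering scheme and the function `unique_names` are assumptions for illustration; the text above does not prescribe how unique names are formed.

```python
from collections import Counter


def unique_names(base_names):
    """Give each sibling a distinct runtime name by numbering repeats.

    Names that occur once are kept unchanged.
    """
    totals = Counter(base_names)
    seen = Counter()
    result = []
    for name in base_names:
        if totals[name] == 1:
            result.append(name)
        else:
            seen[name] += 1
            result.append(f"{name} #{seen[name]}")
    return result


siblings = ["blood pressure", "blood pressure", "pulse"]
names = unique_names(siblings)
# names == ["blood pressure #1", "blood pressure #2", "pulse"]
```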

Principle 12 - No language primacy

Archetypes have no language primacy: they are completely translatable artefacts. An archetype can be developed in any language, and have translations in other languages added a posteriori.
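
One way to picture this is to keep the text for each node code in a per-language table, so that a translation adds entries without touching the structure. The table layout, the helper names and the node code below are hypothetical.

```python
# Hypothetical term table: node code -> language code -> text.
term_definitions = {
    "at0004": {"en": "Systolic", "de": "Systolisch"},
}


def add_translation(terms, language, texts):
    """Add a language after the fact; it must cover every node code."""
    missing = set(terms) - set(texts)
    if missing:
        raise ValueError(f"untranslated codes: {sorted(missing)}")
    for code in terms:
        terms[code][language] = texts[code]


def meaning(terms, code, language):
    """Look up the text of a node code in the requested language."""
    return terms[code][language]


codes_before = set(term_definitions)
add_translation(term_definitions, "es", {"at0004": "Sistólica"})
codes_after = set(term_definitions)
```

The set of node codes is the same before and after the call, since only the text attached to each code changes.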

Principle 13 - Terminology neutrality

Archetypes are neutral with respect to terminologies. Archetypes are able to be developed with or without reference to external terminologies. Multiple external terminologies can be bound to an archetype; the definition of terms in an archetype is not predicated on the existence or use of any particular terminology.
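
The separation between internal term definitions and optional external bindings might be sketched as two independent tables. The terminology names and external codes below are placeholders, not real codes from any terminology, and the helper functions are hypothetical.

```python
from typing import Optional

# Internal definition: usable with no external terminology at all.
term_definitions = {"at0004": "Systolic"}

# Optional bindings: terminology name -> {node code -> external code}.
term_bindings: dict = {}


def bind(bindings, terminology, node_code, external_code):
    """Attach an external code to a node for one terminology."""
    bindings.setdefault(terminology, {})[node_code] = external_code


def lookup(bindings, terminology, node_code) -> Optional[str]:
    """Return the bound external code, or None if there is no binding."""
    return bindings.get(terminology, {}).get(node_code)


# Placeholder terminologies and codes, for illustration only.
bind(term_bindings, "TERMINOLOGY_A", "at0004", "A-0001")
bind(term_bindings, "TERMINOLOGY_B", "at0004", "B-9999")
```

The internal definition of `at0004` is untouched by either binding, and a lookup against an unbound terminology simply returns nothing.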

Principle 14 - Backwards compatible evolution

There is a means of evolving existing archetypes to accommodate changing requirements, without invalidating data created with earlier versions. Since archetypes are used to create data, changes to archetypes must be regarded as creating a new archetype; i.e. the identifier of an archetype must incorporate its version. The only types of change to archetypes that can be made without changing the version are those which do not invalidate previously created data. Formally, such changes must not 'narrow' constraints expressed in the existing version.
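
The "must not narrow" rule can be sketched for the simple case of a numeric range. The `Range` class, the helper functions and the integer version scheme are assumptions for this example; real archetype constraints and version identifiers are richer than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Hypothetical inclusive numeric range constraint."""
    low: float
    high: float


def narrows(old: Range, new: Range) -> bool:
    """True if `new` rejects some value that `old` accepted."""
    return new.low > old.low or new.high < old.high


def next_version(version: int, old: Range, new: Range) -> int:
    """Keep the version for a compatible change, otherwise bump it."""
    return version + 1 if narrows(old, new) else version


original = Range(0.0, 300.0)
widened = Range(0.0, 1000.0)    # accepts everything the original did
tightened = Range(10.0, 300.0)  # rejects values below 10

version_after_widening = next_version(1, original, widened)
version_after_tightening = next_version(1, original, tightened)
```

Widening leaves previously created data valid, so the version is kept; tightening could invalidate existing data, so it is treated as a new version.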