VERSIONED_COMPOSITION Archetype_node_id_valid invariant contains error
Description
Activity
Well, we’re caught between two philosophies, both of which relate to openEHR: a) version control, where rule #1 is thou shalt not touch or rewrite the repo, and b) schema migration, which is an ongoing challenge in any persistence system. Now, our ‘schema migrations’ are already a step removed from the actual DB schema: they operate on archetypes. But there is still schema migration that needs to be addressed as archetypes evolve over time.
Usually you only break the versioning rule (thou shalt not touch old versions) when you are either doing a complete technological upgrade, as we did from Subversion to Git some years ago, or you are doing built-in tricks which are bullet-proof, like Git’s version compression.
I’m not sure what you meant by ‘history’ but I am sure that there are no good half-baked solutions in this area. We have to make sure every last detail is covered. Take links, as you say: do we make the migrated form point to the pre-migrated form? I would say no; instead provide a tool that can look at a new version and find its antecedent in the old form. We need a new analysis on this.
We could contemplate this in the short term:
get rid of this invariant rule, and don’t make any requirement for all VERSIONs of a VERSIONED_OBJECT (Composition, Party, …) to use the same archetype at all; we could instead say that it just has to be the same content, i.e. medication list, allergies list, lab result etc, and the system can use whatever archetypes it likes to represent that content, including using new archetypes for new versions.
That’s a bit radical, and it means that any smart archetype-based logic can no longer assume an entire version stack is conformant to the same set of archetypes (i.e. including the ones in slots etc). Instead, a version stack potentially consists of more than one coherent series of same / compatible archetypes.
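To make the "more than one coherent series" idea concrete, here is a minimal sketch. It assumes each version is represented only by the archetype id string of its root object, and the function name `coherent_series` is hypothetical, not something from the specs.

```python
from itertools import groupby
from typing import Iterable, List, Tuple


def coherent_series(archetype_ids: Iterable[str]) -> List[Tuple[str, int]]:
    """Split a version stack into runs of consecutive versions that
    share the same archetype id.

    Assumption: ids are compared as plain strings; a real system would
    probably compare them at the major-version level instead.
    """
    return [(key, len(list(run))) for key, run in groupby(archetype_ids)]


# A stack whose third version switched to a different archetype.
stack = [
    "openEHR-EHR-COMPOSITION.medication_list.v1",
    "openEHR-EHR-COMPOSITION.medication_list.v1",
    "openEHR-EHR-COMPOSITION.medication_summary.v1",
]
series = coherent_series(stack)
```

Archetype-based logic would then have to work per series rather than assume a single archetype for the whole stack.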
We could come up with a theoretical design around this, but I think early involvement of the implementers would be worthwhile, to find out their thoughts, since versioning of both the data and the archetypes is fairly close to the core of most implementers’ software ecosystems.
Ah, yes. Of course, those conversion strategies make sense. I guess what we try to do differently is to keep old versions of the data exactly the same, without conversion, and only convert the most recent version of the data to the new form, preserving all history in the old form.
Converting all versions is possible I guess, and the old versions can even be kept as deleted compositions. But then, how to link the new versions to the old ones? Should that use links on the root locatable/composition? Or the feeder audit? Something else? Wouldn’t history be better?
I’ll try asking other vendors what they do.
Previous comments don’t mean we should not discuss a better strategy for solving this. I’m happy to look at better designs. We just have to take account of existing systems and find a solution that works in the real world.
I would say that the invariant should be satisfied for the archetype Id down to the major version, i.e. using ‘semantic_id()’ function here or something equivalent. There is more discussion here in the Identification spec.
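As a rough illustration of comparing archetype ids only down to the major version, here is a sketch. It assumes ids shaped like `openEHR-EHR-COMPOSITION.encounter.v1` or `openEHR-EHR-COMPOSITION.encounter.v1.0.4`. The helper names `major_version_id` and `same_major_version` are hypothetical stand-ins for whatever ‘semantic_id()’ equivalent an implementation provides.

```python
import re
from typing import Iterable

# Assumed id shape: '<stem>.v<major>' optionally followed by '.<minor>.<patch>'.
_ID_PATTERN = re.compile(r"^(?P<stem>.+\.v)(?P<major>\d+)(?:\.\d+){0,2}$")


def major_version_id(archetype_id: str) -> str:
    """Reduce an archetype id to its major-version form (hypothetical helper)."""
    match = _ID_PATTERN.match(archetype_id)
    if match is None:
        raise ValueError(f"unrecognised archetype id: {archetype_id!r}")
    return match.group("stem") + match.group("major")


def same_major_version(archetype_ids: Iterable[str]) -> bool:
    """True if every id reduces to the same major-version form as the first."""
    reduced = [major_version_id(i) for i in archetype_ids]
    return all(r == reduced[0] for r in reduced)
```

With this reading, a stack mixing `...encounter.v1` and `...encounter.v1.2.0` would satisfy the invariant, while one mixing `...encounter.v1` and `...encounter.v2` would not.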
The specs don’t specify any specific way to deal with breaking changes (i.e. a new major version of an archetype) - they just say that when an archetype with a new major version number goes into use, the system has to do either (or maybe both) of two things:
convert data saved with the older major version on the fly (i.e. on any retrieval) to the form required by the new major version of that archetype
migrate the older data permanently (i.e. one-off operation) to the new form.
For both of these, some conversion algorithm is needed. This usually involves filling some new field with a synthesised data item that didn’t exist in the old version, but could be anything.
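The two options can be sketched roughly as follows. Everything here is an assumption made for illustration: compositions are plain dicts, the archetype ids are invented, and the `status` field stands in for "some new field" that the new major version requires.

```python
from typing import Callable, Dict, List

Composition = Dict[str, object]  # stand-in for real composition data

OLD_ID = "openEHR-EHR-COMPOSITION.example.v1"  # invented ids
NEW_ID = "openEHR-EHR-COMPOSITION.example.v2"


def convert_v1_to_v2(old: Composition) -> Composition:
    """Hypothetical conversion: v2 requires a 'status' field that v1 lacked."""
    new = dict(old)  # copy, so the stored original is not modified
    new["archetype_id"] = NEW_ID
    new.setdefault("status", "unknown")  # synthesised data item
    return new


# Conversion algorithms keyed by the archetype id they upgrade from.
CONVERTERS: Dict[str, Callable[[Composition], Composition]] = {
    OLD_ID: convert_v1_to_v2,
}


def upgrade(comp: Composition) -> Composition:
    """Apply converters until the data is in the newest known form."""
    while comp["archetype_id"] in CONVERTERS:
        comp = CONVERTERS[comp["archetype_id"]](comp)
    return comp


def retrieve(store: List[Composition], index: int) -> Composition:
    """Option 1: convert on the fly at retrieval; stored data stays as it was."""
    return upgrade(store[index])


def migrate(store: List[Composition]) -> None:
    """Option 2: one-off permanent migration of the stored data."""
    store[:] = [upgrade(comp) for comp in store]
```

Both options share the same conversion algorithm; they differ only in whether the converted form is written back to storage.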
More on version referencing here in the Identification spec.
@Thomas Beale , you mentioned there are ways mentioned in the specification to handle this. We tried to find that, but cannot find them. Could you provide a pointer to where this is defined?
The current spec in combination with ADL 2 may not even allow template patch version updates, so following this invariant seems highly impractical at best.
The invariant reads:
Archetype_node_id_valid: for_all v in all_versions | v.archetype_node_id.is_equal (all_versions.first.archetype_node_id)
However, v in all_versions is an instance of the class VERSION, which does not have an attribute called archetype_node_id.
Is v.data.archetype_node_id intended?
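To spell out the reading being asked about, here is a minimal sketch of the check with `v.data.archetype_node_id`. The `Locatable` and `Version` classes are simplified stand-ins written for this example, not the reference model classes themselves.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Locatable:
    archetype_node_id: str


@dataclass
class Version:
    # The version itself carries no archetype_node_id; only its data does.
    data: Locatable


def archetype_node_id_valid(all_versions: List[Version]) -> bool:
    """Check the invariant as it would read with v.data.archetype_node_id."""
    if not all_versions:
        return True
    first = all_versions[0].data.archetype_node_id
    return all(v.data.archetype_node_id == first for v in all_versions)
```

Written this way the expression is at least well-typed, although it still requires exact equality of the archetype node ids across the whole version stack.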