Problems with paths to ISM_TRANSITION, need of archetype_node_id

Description

In the spec, ISM_TRANSITION is PATHABLE, so it can't have a nodeId (needs to be LOCATABLE).

When creating an ACTION archetype, sibling ISM_TRANSITION nodes need to be differentiated.

The AE is using nodeIds on ISM_TRANSITIONS to differentiate between siblings, which I think is correct, but violates the IM spec.

The solution is to make ISM_TRANSITION inherit from LOCATABLE.

In current working baseline specs, this is also an issue: ISM_TRANSITION still inherits from PATHABLE.

Ref: http://openehr.org/releases/RM/latest/docs/ehr/ehr.html#_entry_and_its_subtypes

The XSD also needs to be corrected, since it doesn't allow archetype_node_id on ISM_TRANSITION.

Sample ACTION archetype generated by the AE:

ACTION[at0000] matches { -- Sdf
    ism_transition matches {
        ISM_TRANSITION[at0003] matches { -- Scheduled
            current_state matches {
                DV_CODED_TEXT matches {
                    defining_code matches {[openehr::529]}
                }
            }
            careflow_step matches {
                DV_CODED_TEXT matches {
                    defining_code matches {[local::at0003]} -- Scheduled
                }
            }
        }
        ISM_TRANSITION[at0002] matches { -- Active
            current_state matches {
                DV_CODED_TEXT matches {
                    defining_code matches {[openehr::245]}
                }
            }
            careflow_step matches {
                DV_CODED_TEXT matches {
                    defining_code matches {[local::at0002]} -- Active
                }
            }
        }
        ISM_TRANSITION[at0004] matches { -- Active2
            current_state matches {
                DV_CODED_TEXT matches {
                    defining_code matches {[openehr::245]}
                }
            }
            careflow_step matches {
                DV_CODED_TEXT matches {
                    defining_code matches {[local::at0004]} -- Active2
                }
            }
        }
        ISM_TRANSITION[at0006] matches { -- ActiveSuspended
            current_state matches {
                DV_CODED_TEXT matches {
                    defining_code matches {[openehr::530]}
                }
            }
            careflow_step matches {
                DV_CODED_TEXT matches {
                    defining_code matches {[local::at0006]} -- ActiveSuspended
                }
            }
        }
        ISM_TRANSITION[at0005] matches { -- Completed
            current_state matches {
                DV_CODED_TEXT matches {
                    defining_code matches {[openehr::532]}
                }
            }
            careflow_step matches {
                DV_CODED_TEXT matches {
                    defining_code matches {[local::at0005]} -- Completed
                }
            }
        }
    }
    description matches {
        ITEM_TREE[at0001] matches {*}
    }
}

Environment

None

Activity

Pablo Pazos
July 9, 2017, 4:01 PM

Added the change on my Version.xsd (merge of the official XSDs):

https://github.com/ppazos/cabolabs-ehrserver/blob/master/xsd/Version.xsd#L233-L243

The XML instance changed from this:

<ism_transition>
  <current_state>
    <value>active</value>
    <defining_code>
      <terminology_id>
        <value>openehr</value>
      </terminology_id>
      <code_string>245</code_string>
    </defining_code>
  </current_state>
</ism_transition>

To this (valid with updated XSD):

<ism_transition archetype_node_id="at0004">
  <name>
    <value>start</value><!-- transition name to active is "start", page 63 http://www.openehr.org/releases/1.0.2/architecture/rm/ehr_im.pdf -->
  </name>
  <current_state>
    <value>active</value>
    <defining_code>
      <terminology_id>
        <value>openehr</value>
      </terminology_id>
      <code_string>245</code_string>
    </defining_code>
  </current_state>
</ism_transition>

There is a full, valid VERSION XML instance in our Insomnia REST Client test file: https://github.com/ppazos/cabolabs-ehrserver/blob/master/api/ehrserver_rest_insomnia.json#L482

Thomas Beale
October 26, 2018, 10:45 PM

Using node ids on ISM_TRANSITIONs in an archetype doesn't violate the RM spec, because RM classes are not required to be LOCATABLE for their archetypes to put ids on those nodes. This was modelled a long time ago, but I think the assumption was that an ISM_TRANSITION instance could easily be identified because it is 1:1 with its careflow step code.

Pablo Pazos
October 27, 2018, 4:55 AM

Thanks. My understanding was that paths in an RM object instance should exist in the corresponding AOM/TOM instance. At least that makes sense to me, but I'm not sure about the design decisions that led to ISM_TRANSITION being PATHABLE and not LOCATABLE.

In the case of ACTION.ism_transition, the path from the RM instance always exists in the AOM/TOM, but it is a path to the C_ATTRIBUTE, not the C_OBJECT. We can argue that it is a path for both the attribute and the object, but the point of a path is to identify one node in the tree, not two nodes at different hierarchical levels (the attribute is the parent of the object).

Doing a lookup works in some cases, but it's an ad-hoc workaround for something that IMO should work as it does for any other RM class instance.

Consider the use case of data validation. For instance, we check that all nodes in a COMPO instance are defined by the OPT or the underlying RM. To do this we process the instance, calculate the path of each node, and try to get the corresponding definition from the AOM/TOM using the path from the RM.

1. Data comes with the path /ism_transition for both the ACTION.ism_transition attribute and the ACTION.ism_transition object instance (if the object could hold a node_id, it would be /ism_transition[atNNNN]).

2. We use the /ism_transition path to get the corresponding node from the AOM, which returns a C_ATTRIBUTE, not a C_OBJECT. (This operation to get nodes by path from the AOM/TOM is not part of the specs, but it is part of the Java Ref Impl as ArchetypeConstraint node(String path)**, which returns a single C_XXX; maybe it should return both the attribute and the object that share the path.)

**Ref: https://github.com/openEHR/java-libs/blob/1423cc89928e1ec14552f6117c3d4ea655b8d156/openehr-aom/src/main/java/org/openehr/am/archetype/Archetype.java#L381

3. Since we need to validate the object and we have a C_ATTRIBUTE, we need an ad-hoc lookup through C_ATTRIBUTE.children to get the right C_OBJECT for the rm_type_name ISM_TRANSITION. There can be many alternatives, so to pick the right one we also need ACTION.ism_transition.careflow_step.defining_code from the data. That means an ad-hoc lookup in the RM in order to do an ad-hoc lookup in the AOM. But ISM_TRANSITION.careflow_step is not mandatory in the RM, so it can be empty in the data, and then the whole lookup fails.

So for some cases, getting the right C_OBJECT or a RM node instance is not possible, or I'm missing something obvious.

Possible solutions to this case are: 1. make ISM_TRANSITION.careflow_step mandatory, or 2. make ISM_TRANSITION extend LOCATABLE. With 2. all the ad-hoc lookups disappear.
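
The difference between the two lookups can be sketched as follows. This is only an illustration under stated assumptions: CObject, CAttribute and both find_* functions are hypothetical, simplified stand-ins, not the AOM classes or the Java Ref Impl API. The node ids are taken from the sample archetype in the description.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# Hypothetical, simplified stand-ins for AOM constraint nodes.
@dataclass
class CObject:
    rm_type_name: str
    node_id: Optional[str] = None
    # Code constrained under careflow_step.defining_code, if any.
    careflow_step_code: Optional[str] = None


@dataclass
class CAttribute:
    rm_attribute_name: str
    children: List[CObject] = field(default_factory=list)


def find_by_careflow_step(attr: CAttribute, code: Optional[str]) -> Optional[CObject]:
    """Ad-hoc lookup: choose the ISM_TRANSITION alternative by careflow_step code."""
    if code is None:
        # careflow_step is optional in the RM, so the data may give nothing to match on.
        return None
    for child in attr.children:
        if child.rm_type_name == "ISM_TRANSITION" and child.careflow_step_code == code:
            return child
    return None


def find_by_node_id(attr: CAttribute, node_id: str) -> Optional[CObject]:
    """Lookup that would be possible if the data carried archetype_node_id."""
    for child in attr.children:
        if child.node_id == node_id:
            return child
    return None


ism_transition = CAttribute("ism_transition", [
    CObject("ISM_TRANSITION", "at0003", "at0003"),  # Scheduled
    CObject("ISM_TRANSITION", "at0002", "at0002"),  # Active
    CObject("ISM_TRANSITION", "at0004", "at0004"),  # Active2
])
```

With a careflow_step code in the data, find_by_careflow_step resolves the alternative; with no careflow_step it returns None, while find_by_node_id only depends on the node id.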

I would really like to know how other implementers deal with this case, I'm sure I'm missing something here.

Thomas Beale
November 3, 2018, 11:44 AM

Without discounting the idea of making ISM_TRANSITION inherit from LOCATABLE, one thing to note is that PATHABLE.item_at_path() and items_at_path() always return an object: there are no 'attributes' in the data structure.

Functions that return C_ATTRIBUTE and C_OBJECT would be defined on ARCHETYPE (and maybe C_OBJECT etc) in the AOM.

Pablo Pazos
November 28, 2018, 6:02 AM

Thanks Thomas,

I'm reviewing the specs and your comments.

1. Both item_at_path(path) and items_at_path(path) work against data instances.

2. Both need the path to exist in the data instances (the suggested use includes a call to path_exists(path) before calling item_at_path and items_at_path).

3. If the path parameter is taken from the OPT or Archetype, then the path for ISM_TRANSITION objects defined in the OPT / Archetype will include the node_id like this /ism_transition[at0003].

4. Executing path_exists(ism_transition[at0003]) will return false, because path_exists is also executed against data instances, not the OPT or Archetype models, and that path is not present in the data. Currently the path that exists in data instances is /ism_transition, without the node_id.

5. When implementing data queries over a repository, the paths used to get data, at least in my case, are taken from the model, not from the data instances. Extracting the paths from each data instance is like regenerating the metadata (archetypes and templates) from the data, only because some paths in the data are not equal to the paths defined in the metadata.

6. As a design principle, we try to make queries based 100% on the defined models, not on the specific data instances. Maybe the specs mandate to get paths from data instead of the correspondent models in order to have the right paths.

So I'm a little confused here. From where should we get the paths to make safe queries? Should we extract queries from what is supported by instances or from the Archetype/OPT models?
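
Point 4 can be illustrated with a small sketch. It assumes a hypothetical nested-dict representation of an ACTION instance and a hypothetical path_exists helper; neither is the spec-defined PATHABLE.path_exists, they only mimic its behaviour of checking paths against data.

```python
def data_paths(node, prefix=""):
    """Yield the path of every object-valued attribute in a nested dict instance."""
    for attr, child in node.items():
        if not isinstance(child, dict):
            continue  # leaf values such as archetype_node_id are not path segments
        node_id = child.get("archetype_node_id")
        segment = f"/{attr}[{node_id}]" if node_id else f"/{attr}"
        path = prefix + segment
        yield path
        yield from data_paths(child, path)


def path_exists(instance, path):
    """Hypothetical helper: true only if the path is present in the data."""
    return path in set(data_paths(instance))


# ISM_TRANSITION is PATHABLE, so the data carries no archetype_node_id for it.
action = {
    "archetype_node_id": "at0000",
    "ism_transition": {
        "current_state": {"value": "active"},
    },
    "description": {"archetype_node_id": "at0001"},
}

archetype_path = "/ism_transition[at0003]"  # path as defined in the archetype
data_path = "/ism_transition"               # path as it appears in the data
```

Here path_exists(action, archetype_path) is false and path_exists(action, data_path) is true, which is the mismatch described in points 3 and 4.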

Another consideration when extracting paths from data instances is that nodes defined in the corresponding archetype might be missing from the data. One case we cover is this: if an archetype path is used in a query and there are no data nodes for that path, an empty result associated with that path is returned. If we can only use paths that exist in the data, paths defined in archetypes but with no data instances behind them cannot be queried to return empty results.

This makes sense in the context of databases: if an SQL query refers to a column that doesn't exist, the DB returns an error (like querying for a path that is not defined), and if the column is defined in the schema (path defined in the archetype) but there is no data, the query will succeed and return an empty result. That is the behavior we follow. Maybe this is not supported by the specs.
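
The database analogy can be reproduced with the sqlite3 module from the Python standard library. The table and column names are made up for illustration; they do not describe how any openEHR server stores data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: one column standing for a path defined in the archetype.
conn.execute("CREATE TABLE action_data (ism_transition_at0003 TEXT)")

# Column defined in the schema, no rows stored: the query succeeds and is empty.
rows = conn.execute("SELECT ism_transition_at0003 FROM action_data").fetchall()

# Column not defined in the schema: the database rejects the query.
try:
    conn.execute("SELECT ism_transition_at0099 FROM action_data")
    undefined_column_rejected = False
except sqlite3.OperationalError:
    undefined_column_rejected = True
```

The first query corresponds to a path defined in the archetype with no data behind it; the second corresponds to a path that is not defined at all.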

One last comment: path_exists, item_at_path and items_at_path return a result over a single data instance. Given, for example, a COMPOSITION instance, I can call those methods on that instance. But when querying a database, we need to check path_exists over all the COMPOSITIONs stored. First we have the issue of which path to use, as mentioned above: a path from the archetype or from the data instances. Then we run some queries to find out whether the path exists and pre-filter the results to those PATHABLEs in which it does. Then we execute the rest of the query, getting the item or items at the path; at the query level we can't check whether the path leads to a unique item, so we need to use items instead of item. Then we execute the filters (WHERE) and get the specific data for the projections (SELECT a, b, c). I'm not seeing a simple solution for all this if we use data paths instead of archetype paths.

And if we choose to keep using archetype paths for everything, then when querying PATHABLEs instead of LOCATABLEs we need to manipulate the paths defined in the archetype to remove the node_ids. I think this is the easiest solution, but it feels like a hack. From the design point of view, it is simpler to develop knowing that the paths in an archetype are a superset of the paths in the data instances, and that the path of an archetype node is the same as the path of the data node it defines. I'm not saying the current spec is wrong, just that it is easier if there is more correspondence between an archetype and the data it defines; avoiding exceptions and special cases makes development easier.
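
The path manipulation mentioned above could look like this minimal sketch. NON_LOCATABLE_ATTRIBUTES and to_data_path are hypothetical names, and keying the rule on the attribute name is a simplification made for this example: a real implementation would presumably look up the RM type of each segment instead.

```python
import re

# Assumption for illustration: attribute names whose RM type is PATHABLE but
# not LOCATABLE, so the matching data path segment carries no node id.
NON_LOCATABLE_ATTRIBUTES = {"ism_transition"}

# One path segment: /attribute_name optionally followed by a [predicate].
SEGMENT = re.compile(r"/([a-z_]+)(\[[^\]]*\])?")


def to_data_path(archetype_path):
    """Drop the node id predicate from segments that cannot hold one in data."""
    parts = []
    for name, predicate in SEGMENT.findall(archetype_path):
        if name in NON_LOCATABLE_ATTRIBUTES:
            predicate = ""
        parts.append(f"/{name}{predicate}")
    return "".join(parts)
```

For example, to_data_path("/ism_transition[at0003]/careflow_step") gives "/ism_transition/careflow_step", while a path made only of LOCATABLE segments such as "/description[at0001]" is returned unchanged.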

Reporter

Pablo Pazos

Priority

Major