Clarify LINKED_PLAN semantics - convert reference to containment
Description
Activity
Matija Polajnar November 4, 2019 at 12:49 PM
Well, anything in this area, yes. When we’re out of the openEHR model realm, the engine can do whatever restructuring is necessary to handle large work plans. As long as we’re dealing with OPTs and RM instances, we usually hold the entire OPT or Composition in memory (or both, during validation), which can become an issue with real-life plans. And if we now reduce the footprint 10-fold, we’ll hit problems again when somebody comes up with plans 10 times as large. I’d rather have a way to naturally split them into as many components (task plans) as necessary.
Thomas Beale November 4, 2019 at 11:34 AM
So just to be clear, the problem is with the size of the OPT in its XML representation instantiated in the modelling tool, not the size of Work Plans instantiated as Java objects in memory in the TP engine?
Matija Polajnar November 4, 2019 at 10:47 AM
So, yes, there are multiple levels to each plan:
The archetypes, templates and the resulting operational template(s).
Instance(s) of composition(s) of the work plan and task plan(s) (stored in an EHR, so this is for a specific patient).
For each run, the work plan is materialised (either in full, or in parts when needed; in the case of repeat_spec, materialisation in parts is unavoidable).
The current runtime state; some of it can be thrown away and reconstructed from materialised state, and some of it needs to be persisted in another way.
For the last two, we can leverage the division of the work plan into task plans (and smaller units) to avoid having to work with too much data at once. We usually only need to work with a single task plan or even less (and perhaps some top-level data on the work plan) to execute some action.
The size of operational templates and compositions, however, can in no sensible way be reduced by the implementation or by the modeller if splitting the work plan into more task plans does not result in (a greater number of) smaller templates and compositions.
While a different serialisation approach might help somewhat, we need to keep in mind that most existing software (modelling tools, CDRs, validation utilities etc.) assumes reasonably sized documents and therefore keeps each document in working memory in its entirety. Work plans break that assumption, and splitting them up into separate documents for separate task plans is a reasonable “work around” that is out of reach if we force linked plans to reside in slots.
Thomas Beale November 4, 2019 at 9:58 AM
Right, but don't forget what is actually happening at execution time: you are executing a distinct copy of a WP for each run of that plan for each subject, and that means all parts of the plan (i.e. all linked TPs) have to be instances unique to that run. So even if you wanted to use UID referencing to those linked TPs, you are still pointing to unique copies for the same {run,subject}.
Or maybe you are considering just the instantiation stage, and wanting to minimise OPT file size, but create unique joined instances when the plan is activated? I think this also won't work, because in general, a plan execution can outlive any particular computation session, and its intermediate execution state will need to be persisted until it is next needed (maybe nothing happens for 2 weeks). Then you are going to want the complete instance, and cutting it up on save, and sewing it back together on the next instantiation seems unnecessarily painful to me.
What is the concrete concern here - size of some XML files? It may be that we need to consider some different serialisation approach.
Matija Polajnar November 4, 2019 at 7:49 AM
I know this was our (Better) idea, but we actually meant to have both mechanisms available with an invariant that only one is used at a time. At the moment our models use the embedded (slot) mode exclusively, because it is much easier to deterministically create an instance autonomously from a template that way, but we are beginning to bump into problems: if a large task plan is used as a sub-plan in multiple places, the operational templates grow in size considerably. So we would actually like to have a UID reference in there as well, although we do not yet have an idea of how to encode hints to the automatic instance creation routines regarding how to connect the plans together...
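To make the size concern above concrete, here is a rough back-of-the-envelope sketch. It assumes template size can be approximated by a node count; the function names and all numbers are hypothetical and chosen only for illustration, not measured from any real OPT.

```python
def embedded_size(parent_nodes: int, sub_plan_nodes: int, uses: int) -> int:
    """Node count when the sub-plan is embedded (slot mode) at every use."""
    # Each use carries a full copy of the sub-plan inside the parent template.
    return parent_nodes + uses * sub_plan_nodes


def referenced_size(parent_nodes: int, sub_plan_nodes: int, uses: int,
                    ref_nodes: int = 1) -> int:
    """Total node count when each use is a small UID reference instead."""
    # The sub-plan is stored once; each use costs only a reference node
    # (ref_nodes is an assumed, illustrative cost per reference).
    return parent_nodes + uses * ref_nodes + sub_plan_nodes


# Hypothetical figures: a 200-node parent using a 5000-node sub-plan 4 times.
print(embedded_size(200, 5000, 4))
print(referenced_size(200, 5000, 4))
```

Under these assumed figures the embedded form grows linearly with the number of uses, while the referenced form stays close to the size of a single copy of the sub-plan.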
The LINKED_PLAN class in release 1.0.0 represents the target logical link as an id-based link (https://specifications.openehr.org/releases/PROC/Release-1.0.0/UML/diagrams/PROC-task_planning.definition-actions.svg).
This should be changed to a composition, because Task Plans are separately instantiated for each incoming reference, when going from template to instantiated definition. Doing this provides a tractable way to compute the overall Work Plan graph.
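As a minimal sketch of the containment idea described above: the classes below are simplified, hypothetical stand-ins (not the openEHR RM classes or their attribute names), and `instantiate` and `collect_plans` are illustrative helpers. The point shown is that when each link owns its own copy of the target plan, the overall Work Plan structure is a tree that can be walked directly, with no id lookups.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskPlan:
    # Simplified stand-in for a task plan; 'links' holds its linked plans.
    name: str
    links: List["LinkedPlan"] = field(default_factory=list)


@dataclass
class LinkedPlan:
    # Containment: the link owns its target instance rather than holding an id.
    target: TaskPlan


def instantiate(template: TaskPlan) -> TaskPlan:
    """Copy a template plan, creating a separate target per incoming link."""
    return TaskPlan(
        template.name,
        [LinkedPlan(instantiate(link.target)) for link in template.links],
    )


def collect_plans(root: TaskPlan) -> List[TaskPlan]:
    """Depth-first walk over the contained plans, root first."""
    found = [root]
    for link in root.links:
        found.extend(collect_plans(link.target))
    return found


# A template in which the same sub-plan is linked from two places.
shared = TaskPlan("shared")
template = TaskPlan("root", [LinkedPlan(shared), LinkedPlan(shared)])

instance = instantiate(template)
plans = collect_plans(instance)
print([p.name for p in plans])
```

In the instantiated structure the two "shared" plans are distinct objects, mirroring the statement that Task Plans are separately instantiated for each incoming reference.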