Null Flavours and Boolean data in openEHR
The Use of Null Flavor in openEHR
In the openEHR Reference Model, the low-level class ELEMENT has attributes 'value' and 'null_flavor'. The latter attribute is taken from HL7 (although used in a different way) and is used to mark a 'lack of data'. Using this attribute in openEHR was inspired by a) the need to do something about marking missing data in health information and b) the use of 'data quality markers' in SCADA control systems which show on the screen when a measured value from the field is out of date or wrong due to technical failure to obtain the current value. In the development of openEHR it was thought that some kind of data quality marker should be available for a similar reason: to indicate technical incapacity to obtain data.
The Problem of 'Boolean' or two-valued data
A general design problem in health information is when to use Boolean data values. There are many situations where a naive analysis might suggest using a Boolean (a DV_BOOLEAN in openEHR-speak), for example for a data field such as gender, or for the response to a question like 'have you eaten in the last 12 hours?'. In both cases, there are more possible values than simply yes/no or some equivalent. Gender could typically have values from a small set of codes like:
male
female
unable to determine
intersex
...
Similarly, yes/no questions in A&E might not be answered due to the patient being unconscious, which is a 'normal' occurrence in A&E. A 'don't know' answer might be perfectly sensible for many questions asked of patients.
In the HL7v3 modelling approach, Null Flavour is used to indicate 'missing' data, such as to represent situations like 'asked but not answered'.
The openEHR Approach
In openEHR, a situation in which a physician asks a question and it is not answered (e.g. because the patient is dazed or becomes unconscious) is seen as a normal medical situation. There is no technical incapacity of the physician to obtain information - he or she is in effect making a normal observation. So in modelling the information obtained in such situations, we should ensure that the value set for questions like 'have you eaten in the last 12 hours' includes yes/no/don't know, and possibly also things like maybe/most likely/unlikely etc. In an A&E (ED) situation, the responses to any question will most likely need to include 'no answer', due to unconsciousness.
In general, the value set should include values for any possible patient response - the data then correctly show that the patient was asked, but responded with some kind of 'don't know' or did not respond at all. Situations where the information could technically not be obtained, e.g. the physician was talking to the patient using an internet chat tool and the communication dropped out, or a response was technically impossible for some other reason, e.g. faulty equipment, should be marked with a null flavour. In general, null flavour is used sparingly in openEHR, and is not used for representing typical (if not necessarily common) clinical events that can be observed perfectly well by the clinician.
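The distinction drawn above can be sketched in code. This is only an illustration, not the openEHR Reference Model: the `Element` class, its field names and the string answers are hypothetical, chosen to mirror the 'value' / 'null_flavor' pair of ELEMENT. It assumes that an element carries either a value or a null flavour, never both, and the choice of code 271 ("no information") for the dropped connection is likewise an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# openEHR null flavour code for "no information" (from the table in this page).
NO_INFORMATION = 271


@dataclass(frozen=True)
class Element:
    """Illustrative stand-in for ELEMENT: a value or a null flavour, not both."""
    name: str
    value: Optional[str] = None
    null_flavour: Optional[int] = None

    def __post_init__(self) -> None:
        # Assumption for this sketch: exactly one of the two must be set.
        if (self.value is None) == (self.null_flavour is None):
            raise ValueError("set exactly one of value or null_flavour")

    @property
    def is_null(self) -> bool:
        """True when no value could be obtained."""
        return self.value is None


# The patient was asked but is unconscious: a normal clinical observation,
# so it is recorded as a value from the question's value set.
observed = Element("eaten_last_12h", value="no answer")

# The chat connection dropped before an answer arrived: technical incapacity,
# so a null flavour is recorded instead of a value.
not_obtained = Element("eaten_last_12h", null_flavour=NO_INFORMATION)
```

Here `observed.is_null` is false even though the patient said nothing, which is the point of the approach: the record still shows that the question was asked and what the clinician observed.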
The openEHR Null Flavours are currently (with HL7 mappings):
| openEHR code | Rubric | Description | HL7_NullFlavor |
|---|---|---|---|
| 271 | "no information" | No information provided; nothing can be | NI |
| 253 | "unknown" | A possible value exists but is not provided. | UNK |
| 272 | "masked" | The value has not been provided due to privacy | MSK |
| 273 | "not applicable" | No valid value exists for this data item. | NA |
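For software that exchanges data between the two models, the mapping in the table above could be held as a small lookup. The following is a sketch only: the `NullFlavour` enum and the `to_hl7` / `from_hl7` helpers are hypothetical names, not part of any openEHR or HL7 library, and the data are copied from the table.

```python
from enum import Enum


class NullFlavour(Enum):
    """openEHR null flavours from the table: (code, rubric, HL7 mnemonic)."""
    NO_INFORMATION = (271, "no information", "NI")
    UNKNOWN = (253, "unknown", "UNK")
    MASKED = (272, "masked", "MSK")
    NOT_APPLICABLE = (273, "not applicable", "NA")

    def __init__(self, code: int, rubric: str, hl7: str) -> None:
        self.code = code
        self.rubric = rubric
        self.hl7 = hl7


# Lookup tables in both directions, built once from the enum.
BY_CODE = {nf.code: nf for nf in NullFlavour}
BY_HL7 = {nf.hl7: nf for nf in NullFlavour}


def to_hl7(code: int) -> str:
    """Return the HL7 nullFlavor mnemonic for an openEHR code."""
    return BY_CODE[code].hl7


def from_hl7(mnemonic: str) -> int:
    """Return the openEHR code for an HL7 mnemonic; KeyError if unmapped."""
    return BY_HL7[mnemonic].code
```

In this sketch an HL7 mnemonic with no openEHR null flavour, such as ASKU, raises `KeyError` rather than being coerced to a nearby code, since in the openEHR approach such cases are expected to be carried as ordinary values.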
What this means for Archetyping
The implication of the above approach for templates is that the underlying archetypes should fully model the range of responses for questions and other data elements that may initially seem to be Boolean in nature.
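As a rough illustration of what "fully modelling the range of responses" means, the sketch below replaces a Boolean field with an explicit value set. Real archetypes are written in ADL, not Python; the `CodedQuestion` class and the particular answer strings are hypothetical and serve only to show the shape of the constraint.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class CodedQuestion:
    """Hypothetical archetype-like constraint: a question plus its allowed answers."""
    text: str
    allowed: FrozenSet[str]

    def validate(self, answer: str) -> str:
        """Return the answer unchanged if it is in the value set, else raise."""
        if answer not in self.allowed:
            raise ValueError(f"{answer!r} is not in the value set for {self.text!r}")
        return answer


# A naive model would use a Boolean here; the value set instead covers
# every response the clinician can actually observe.
EATEN = CodedQuestion(
    text="Have you eaten in the last 12 hours?",
    allowed=frozenset({"yes", "no", "don't know", "no answer"}),
)
```

With this shape, `EATEN.validate("don't know")` is accepted as ordinary data, while an answer outside the set is rejected at recording time instead of being forced into true/false.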
Other approaches
This analysis is not the only possible view of affairs, but it does take care of the need to know what the patient or clinician said, even if it was not definitive. Complicated null flavour approaches tend to mix up such situations with the situation where data were unavailable for a technical reason.
HL7 Approach
| Level | Code mnemonic | Rubric | Description | openEHR equivalent |
|---|---|---|---|---|
| 1 | NI | no information | The value is exceptional (missing, incomplete, improper). No information as to the reason for being an exceptional value is provided. This is the most general exceptional value. It is also the default exceptional value. | (same in openEHR) |
| 2 | INV | invalid | The value as represented in the instance is not an element in the constrained value domain of a variable. | Q: This seems very similar to OTH. |
| 3 | OTH | other | The actual value is not an element in the constrained value domain of a variable. (e.g., concept not provided by required code system). | Q: This seems to be saying that a value has been recorded in violation of the model? |
| 4 | NINF | negative infinity | Negative infinity of numbers. | Modelled in Interval<T> and DV_INTERVAL<T> classes. |
| 4 | PINF | positive infinity | Positive infinity of numbers. | Modelled in Interval<T> and DV_INTERVAL<T> classes. |
| 3 | UNC | unencoded | No attempt has been made to encode the information correctly but the raw source information is represented (usually in originalText). | Not currently handled explicitly in openEHR. |
| | DER | derived | An actual value may exist, but it must be derived from the provided information (usually an expression is provided directly). | This can happen anywhere in the data; openEHR doesn't see it as missing data / null data. The archetype shows what is derived and what is not, e.g. Apgar total from input values. |
| 2 | UNK | unknown | A proper value is applicable, but not known. | (same in openEHR) |
| 3 | ASKU | asked but unknown | Information was sought but not found (e.g., patient was asked but didn't know) | In openEHR, this is a legitimate response of the patient, and is not a case of missing data. See discussion above. |
| 4 | NAV | temporarily unavailable | Information is not available at this time but it is expected that it will be available later. | Q: What information systems can predict the future and reliably set a value like this? What use would it serve? If the data are in fact available later, they will be recorded. |
| 3 | QS | sufficient quantity | The specific quantity is not known, but is known to be non-zero and is not specified because it makes up the bulk of the material. 'Add 10mg of ingredient X, 50mg of ingredient Y, and sufficient quantity of water to 100mL.' The null flavor would be used to express the quantity of water. | In openEHR, this is not missing data. It just happens to be quantitative data that is expressed in narrative form. Modelled in archetypes by allowing a narrative alternative to a quantity. |
| 3 | NASK | not asked | This information has not been sought (e.g., patient was not asked) | Q: Is this a missing data concept, when by definition no data is expected to be there? In any case it can only apply in specific questionnaire-like situations. Should be part of the design of archetypes / templates for questionnaires. |
| 3 | TRC | trace | The content is greater than zero, but too small to be quantified. | This is a laboratory concept, and has nothing to do with missing data; labs routinely specify some amounts as 'trace'. The value is not in itself computable, but is usually part of an ordered set, e.g. trace, +, ++ etc. |
| 2 | MSK | masked | There is information on this item available but it has not been provided by the sender due to security, privacy or other reasons. There may be an alternate mechanism for gaining access to this information. Note: using this null flavor does provide information that may be a breach of confidentiality, even though no detail data is provided. Its primary purpose is for those circumstances where it is necessary to inform the receiver that the information does exist without providing any detail. | (same in openEHR) |
| 2 | NA | not applicable | No proper value is applicable in this context (e.g., last menstrual period for a male). | (same in openEHR) |
This is the HL7 nullFlavor table as of Dec 2007. (Grahame)
It would seem that some of these are not appropriate in the context of openEHR. I would think that openEHR has handled the concepts of OTH, NINF and PINF differently, and needn't support them.
I'd be interested to know how openEHR models the concepts of UNC, DER, QS and TRC.