Ontology-based approaches to identify patients with type 2 diabetes mellitus from electronic health records: development and validation

by Alireza Rahimi Khorzoughi

Institution: University of New South Wales
Department: Public Health & Community Medicine
Year: 2015
Keywords: Electronic health records; Ontology; Data quality; Type 2 diabetes mellitus
Record ID: 1057146
Full text PDF: http://handle.unsw.edu.au/1959.4/54224


Introduction Issues around the data quality (DQ) of patient registers are often raised when a data set is used for clinical or research purposes. An ontology-based approach provides a flexible semantic framework and supports the automation of data extraction from electronic health records (EHRs). This research aimed to assess the flexibility of an ontology-based approach to accurately identify patients with type 2 diabetes mellitus (T2DM) in a clinical database. It also demonstrated the role of an ontology-based approach in assessing the quality of a register.

Method A systematic review was conducted, addressing DQ, the ‘fitness for purpose’ of the data used, and ontology-based approaches. Included papers were critically appraised with a ‘context-mechanism-impacts/outcomes’ overlay. Drawing on a literature review, the Australian National Guidelines for type 2 diabetes mellitus, the Systematised Nomenclature of Medicine – Clinical Terms – Australian Release (SNOMED CT-AU) and input from health professionals, a five-stage methodology for DQ ontology (MDQO) was adopted. The methodology consisted of: (1) knowledge acquisition; (2) conceptualisation; (3) semantic modelling; (4) knowledge representation; and (5) validation. Although MDQO can be used in any validation domain, this thesis validated it in the context of T2DM diagnosis and management. The accuracy of the MDQO was validated against a manual audit of general practice EHRs, using the diabetes mellitus ontology. The gold standard was the set of T2DM cases found by manual EHR audit in a general practice that kept a diabetes register with complete and current reason for visit information. Contingency tables were prepared, and the sensitivity and specificity (accuracy) of the model in identifying T2DM were determined against this gold standard. Accuracy was determined with three attributes (reason for visit, medication and pathology), singly and in combination.
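The contingency-table step described in the Method can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the `Patient` record, the helper names and the six toy patients are hypothetical, and combining the three attributes with a logical OR is an assumption, since the abstract does not state how the attributes were combined.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass
class Patient:
    """Hypothetical record: one gold-standard flag plus three EHR attribute flags."""
    gold_t2dm: bool         # T2DM according to the manual EHR audit (gold standard)
    reason_for_visit: bool  # T2DM recorded as a reason for visit
    medication: bool        # a diabetes medication is recorded
    pathology: bool         # a pathology result consistent with T2DM is recorded


def contingency(patients: Iterable[Patient],
                predict: Callable[[Patient], bool]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for a predictor compared with the gold standard."""
    tp = fp = fn = tn = 0
    for p in patients:
        flagged = predict(p)
        if flagged and p.gold_t2dm:
            tp += 1
        elif flagged:
            fp += 1
        elif p.gold_t2dm:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def sensitivity(tp: int, fp: int, fn: int, tn: int) -> float:
    """Proportion of gold-standard cases that the predictor flags."""
    return tp / (tp + fn) if (tp + fn) else float("nan")


def specificity(tp: int, fp: int, fn: int, tn: int) -> float:
    """Proportion of gold-standard non-cases that the predictor leaves unflagged."""
    return tn / (tn + fp) if (tn + fp) else float("nan")


def any_attribute(p: Patient) -> bool:
    """Assumed combination rule: flag the patient if any attribute indicates T2DM."""
    return p.reason_for_visit or p.medication or p.pathology


# Toy data for illustration only; these are not figures from the thesis.
patients = [
    Patient(True, True, True, False),
    Patient(True, True, False, False),
    Patient(True, False, True, True),
    Patient(False, False, False, False),
    Patient(False, False, True, False),
    Patient(False, False, False, False),
]

if __name__ == "__main__":
    for name, rule in [("reason for visit", lambda p: p.reason_for_visit),
                       ("combined (any attribute)", any_attribute)]:
        table = contingency(patients, rule)
        print(name, table, sensitivity(*table), specificity(*table))
```

In this sketch each attribute, or the combined rule, is treated as a separate predictor and compared with the same gold standard, which mirrors the "singly and in combination" comparison described in the Method.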
Results The T2DM ontology included six object properties, 15 data properties, 68 concepts and 14 major themes in four main classes: actor, context, mechanism and impact. Validation showed sensitivity and specificity of 100% and 99.88%, respectively, with reason for visit; 96.55% and 98.97% with medication; and 15.6% with pathology test result. This suggests that medication and pathology test result data were not as complete as reason for visit data in the general practice audited. However, completeness was adequate for the purpose of this thesis, as confirmed by the very small relative deterioration in accuracy (sensitivity and specificity of 97.67% and 99.18%, respectively) when calculated for the combination of reason for visit, medication and pathology test result.

Discussion Current research shows a lack of comprehensive ontology-based approaches to DQ in chronic disease management, and few validation studies compare ontological and non-ontological approaches to the assessment of clinical DQ. The MDQO developed in this thesis provides a semantically flexible mechanism to capture patients’ data from EHRs. It is also designed to be…