Authors (including presenting author) :
Law MHA (1), Tong YHA (1), Cheung NT (1)(2), Fong KYB (1), Wu SY (1), Lee MWD (1), Chan WTW (2), Lee PKD (2), TSOUGENIS Efstratios (2), Li HFP (2), Wong TWR (2), Lau KLV (2), Cheung TCJ (2), Wu CWE (2)
Affiliation :
(1) Health Informatics Department, Information Technology and Health Informatics Division, Head Office, Hospital Authority, HKSAR
(2) Information Technology Department, Information Technology and Health Informatics Division, Head Office, Hospital Authority, HKSAR
Introduction :
In 2018, the Hospital Authority (HA) established the HA Data Collaboration Laboratory (HADCL), which offers external parties a big data analytics platform with innovative machine learning tools. Drawing on the big data of the HA, the HADCL aims to support the formulation of healthcare policies, facilitate biotechnological research, and improve clinical and healthcare services.
Objectives :
As a new alternative channel for more flexible and interactive data sharing in the HA, the HADCL provides diverse types of large datasets containing both structured and unstructured data, including narrative notes and images, for big data analysis. Currently, there are 11 data domains in the Lab, and the number keeps growing.
Demographics
Attendance & Appointment
A&E attendance
In-patient attendance
Out-patient Appointment
Diagnosis
Procedure
Medication
Immunization
Laboratory Result
Radiology
Appointment & Examination Result
Clinical Structured Notes
Diabetes Metabolic Complication Screening
Family Medicine
Obstetrics
Clinical Notes (Text)
Methodology :
Characteristics of HADCL Data
Health data in the HADCL are either de-identified or masked to minimize the risk of re-identification or cross-referencing by data users. To further protect security and privacy, data are not allowed to leave the physical site of the HADCL.
Rich pool of clinical data
More than 28 years of de-identified data from all hospitals
Massive continuous flow of data
Earliest data date from the 1990s; HADCL data currently run up to 2018 and are updated yearly
Heterogeneous sources of data
Structured and unstructured data (e.g. textual notes and image data)
Quality data
Data collected primarily in the course of clinical operations and used secondarily for research
Meaningful inferences
Health informatics support is provided for the use and interpretation of health data
Apart from contributing to the healthcare research community, the HADCL seeks to stimulate innovative ideas within the organization through collaboration between the HA and external parties, enabling deeper data analysis within a secure environment for health data collaboration projects.
Result & Outcome :
The Lab was piloted in 2018, when 6 collaboration projects used this novel data sharing channel in the HA to conduct big data research. After review, the Lab expanded its capacity to 10 projects in 2019.
Pilot year 2018: 6 projects
First year 2019: 10 projects