(22 November 2022) KnowLab is awarded £5,000 from UCL Global Engagement Funding.
The funding is to extend and deepen our collaboration with iris.ai - a Norway-based start-up behind the award-winning AI engine for scientific text understanding. The funded project is titled “Towards self-updatable knowledge base for evidence based medicine - join force with iris.ai (Norway) and beyond”.
(27 October 2022) The Alan Turing Health Equity Interest Group - https://www.turing.ac.uk/research/interest-groups/health-equity, co-organised by KnowLab and colleagues, is now officially online!
In an era where AI is expected to improve our daily life - particularly in health, “How can we ensure that developments and applications of data science and AI improve everyone’s health?” This is a pressing and very challenging question. Please join forces with a multidisciplinary group to form a formidable synergistic force tackling one of the biggest challenges of AI in medicine.
(22 October 2022) Dr Hang Dong’s great perspective piece on automated coding using NLP and knowledge-driven approaches has now been published in npj Digital Medicine. The work illustrates how NLP and AI can help improve the efficiency of clinical coding in healthcare - i.e., assigning ICD/SNOMED codes to hospital visits, which is currently a very inefficient and error-prone process in the NHS and, for that matter, in many other health systems across the world.
(5 October 2022) Study on The Impact of Inconsistent Human Annotations on AI driven Clinical Decision Making.
Annotation inconsistencies commonly occur when even highly experienced clinical experts annotate the same phenomenon (e.g., medical image, diagnostics, or prognostic status), due to inherent expert bias, judgements, and slips, among other factors. While their existence is relatively well-known, the implications of such inconsistencies are largely understudied in real-world settings.
Aneeta Sylolypavan did her MSc with us addressing this hugely important research question using real-world ICU datasets with annotated data from 11 ICU consultants. The results suggest that (a) there may not always be a “super expert” in acute clinical settings; and (b) standard consensus seeking (such as majority vote) consistently leads to suboptimal models. Further analysis, however, suggests that assessing annotation learnability and using only ‘learnable’ annotated datasets for determining consensus achieves optimal models in most cases.
The manuscript is now under review with npj Digital Medicine and a preprint is available at doi:10.21203/rs.3.rs-1937575/v1.
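For readers unfamiliar with the consensus strategies mentioned in this item, here is a minimal sketch of majority-vote consensus and of restricting the vote to ‘learnable’ annotators. It is an illustration only, not the study's code: the annotator names, labels, learnability scores and the 0.7 threshold are all hypothetical, and how learnability is actually measured is not described in this news item.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most common label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def consensus_labels(annotations):
    """Per-case majority vote over a dict of annotator -> list of labels."""
    per_case = zip(*annotations.values())
    return [majority_vote(case) for case in per_case]


def learnable_consensus(annotations, learnability, threshold):
    """Majority vote using only annotators whose (hypothetical)
    learnability score reaches the threshold."""
    kept = {name: labels for name, labels in annotations.items()
            if learnability[name] >= threshold}
    return consensus_labels(kept)


# Hypothetical binary annotations from five annotators on four cases.
annotations = {
    "a": [1, 0, 1, 1],
    "b": [1, 1, 0, 1],
    "c": [0, 0, 1, 1],
    "d": [0, 1, 0, 0],
    "e": [1, 1, 0, 0],
}
# Hypothetical learnability scores (assumed, not from the study).
learnability = {"a": 0.82, "b": 0.78, "c": 0.75, "d": 0.51, "e": 0.48}

full = consensus_labels(annotations)
filtered = learnable_consensus(annotations, learnability, threshold=0.7)
```

In this toy data the two strategies disagree on two of the four cases, which is the kind of difference the study examines at scale.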
(18 August 2022) Our systematic review titled “Artificial intelligence models for predicting cardiovascular diseases in people with type 2 diabetes - a systematic review”, led by Minhong Wang, has been accepted by Intelligence-Based Medicine. This study identified and reviewed existing AI models for predicting risk of cardiovascular diseases in people with type 2 diabetes. We found that AI approaches have the potential to achieve more accurate predictions than risk scores developed using conventional methods. However, none of the reviewed models is directly reusable or reproducible, due to incomplete reporting and lack of transparency. Clinically, none of the AI models includes interventions that may affect risks, such as medications and lifestyle changes. There were no indications in the studies on whether the prediction models might be able to adapt to include these factors.
(29 July 2022) Our collaboration study titled “Prediction of Five-Year Cardiovascular Disease Risk in People with Type 2 Diabetes Mellitus - Derivation in Nanjing, China and External Validation in Scotland, UK”, led by Cheng Wan from Nanjing Medical University, has been published by Global Heart. This study shows it is feasible to generate a risk prediction model using routinely collected Chinese hospital data. This indicates there is great potential to use large-scale and relatively easily accessible routine data to identify those at risk of CVD and to help significantly improve CVD prevention in people with diabetes.
(16 June 2022) Our collaboration study titled “Spine-GFlow - A Hybrid Learning Framework for Robust Multi-tissue Segmentation in Lumbar MRI without Manual Annotation”, led by Dr Teng Zhang from Hong Kong University, has been accepted by Computerized Medical Imaging and Graphics. Results of this study show that our method, without requiring manual annotation, has achieved a segmentation performance comparable to a model trained with full supervision (mean Dice 0.914 vs 0.916).
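For context on the Dice figures quoted in this item, the sketch below shows how a Dice score is typically computed for a segmentation and averaged over tissue classes. It is a generic illustration on flat label lists with made-up values, not the paper's evaluation code, and the assumption that the mean is an unweighted average over classes is ours.

```python
def dice_coefficient(pred_mask, truth_mask):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two flat binary masks (0/1)."""
    if len(pred_mask) != len(truth_mask):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred_mask, truth_mask))
    total = sum(pred_mask) + sum(truth_mask)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_dice(pred_labels, truth_labels, classes):
    """Unweighted mean of per-class Dice scores over the given class ids."""
    scores = []
    for c in classes:
        pred_mask = [int(p == c) for p in pred_labels]
        truth_mask = [int(t == c) for t in truth_labels]
        scores.append(dice_coefficient(pred_mask, truth_mask))
    return sum(scores) / len(scores)


# Made-up label maps: 0 = background, 1 and 2 = two tissue classes.
pred = [0, 1, 1, 2, 2, 2]
truth = [0, 1, 2, 2, 2, 2]
score = mean_dice(pred, truth, classes=[1, 2])
```

A score of 1.0 means the predicted and reference masks overlap exactly, and 0.0 means they do not overlap at all.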
(10 June 2022) Our work, titled COVID-19 trajectories among 57 million adults in England - a cohort study using electronic health records, is now out with Lancet Digital Health. Our analyses illustrate the wide spectrum of disease trajectories as shown by differences in incidence, survival, and clinical pathways. We have provided a modular analytical framework that can be used to monitor the impact of the pandemic and generate evidence of clinical and policy relevance using multiple EHR sources.
(2 May 2022) Our work Quantifying Health Inequalities Induced by Data and AI Models has been accepted by IJCAI-ECAI2022 ‘AI for Good track’. This work introduced a generic allocation-deterioration framework for detecting and quantifying AI induced inequality. Extensive experiments were carried out to quantify health inequalities (a) embedded in two real-world ICU datasets of HiRID and MIMIC III; (b) induced by AI models trained for two resource allocation scenarios. The results showed that compared to men, women had up to 33% poorer deterioration in markers of prognosis when admitted to HiRID ICUs. All four AI models assessed were shown to induce significant inequalities (2.45% to 43.2%) for non-White compared to White patients. The models exacerbated data embedded inequalities significantly in 3 out of 8 assessments, one of which was >9 times worse. preprint, slides, recording, repo.
(26 April 2022) Study led by Isabel Straw, Investigating for bias in healthcare algorithms - a sex-stratified analysis of supervised machine learning models in liver disease prediction, demonstrates a previously unobserved sex disparity present in published machine learning models. It suggests “To ensure sex-based inequalities do not manifest in medical AI, an evaluation of demographic performance disparities must be integrated into model development.” The work has been published in BMJ Health & Care Informatics.
(22 April 2022) Dr Honghan Wu joined the editorial board of BMC Digital Health. BMC Digital Health considers research on all aspects of the development and implementation of digital technology in both medicine and public health, such as mobile health applications, virtual healthcare and wearable technology, as well as the role of social media and other communications technology in digital health.
(25 March 2022) Study led by Huayu, Increased COVID-19 mortality rate in rare disease patients - a retrospective cohort study in participants of the Genomics England 100,000 Genomes project, has shown that rare disease patients in the Genomics England cohort, especially those affected by neurological and neurodevelopmental disorders, had an increased risk of COVID-19 related death during the first wave of the pandemic in the UK. This work has now been accepted by Orphanet Journal of Rare Diseases.
(20 March 2022) Clinical coding is the task of transforming medical information in a patient’s health records into structured codes like ICD-10 for diagnosis, which is a cognitively demanding, time-consuming, and error-prone task. In this preprint, titled Automated Clinical Coding - What, Why, and Where We Are?, Hang introduces the idea of automated clinical coding and summarises its challenges from the perspective of Artificial Intelligence (AI) and Natural Language Processing (NLP), based on the literature, our project experience over the past two and a half years (late 2019 - early 2022), and discussions with clinical coding experts in Scotland and the UK.
(18 January 2022) KnowLab was awarded an enabling grant (£29k) from the British Council to strengthen academic exchanges and deepen our collaborations with two Nanjing-based universities: Nanjing Medical University (Prof Yun Liu’s group) and Southeast University (Dr Xiang Zhang’s group). On the UCL side, in addition to KnowLab colleagues, we have Prof Paul Taylor and Dr Holger Kunz. Our research will focus on Artificial Intelligence in Medicine - tackling challenges of low generalisability and health inequality. This will involve both teaching and research activities.
(15 January 2022) Great work by Shaoxiong Ji and colleagues from Aalto University on reviewing automated coding from free-text clinical notes using a unified/abstract architecture view - now on arXiv. Great that KnowLab is part of this.
(13 January 2022) Minhong’s project “COOLNeo-an automated COOLing therapy for NEOnates” has been awarded £9,960.00 in funding from the ACCELERATE Innovation Team Challenge, financially supported through the Wellcome Trust Translational Partnership Award linked to the Translational Research Office.
(8 January 2022) New members! A very warm welcome to Xuezhe Wang and Zhaolong Wu, who are joining KnowLab for their MSc projects. Both are MSc students based in the Institute of Health Informatics, UCL. Xuezhe will be working on graph neural networks and Zhaolong will be doing clinical natural language processing.
(3 December 2021) Led by Dr Rebecca Bendayan at King’s College London, our work Investigating the Association between Physical Health Comorbidities and Disability in Individuals with Severe Mental Illness is now accepted by European Psychiatry. We found individuals with Severe Mental Illness and musculoskeletal, skin/dermatological, respiratory, endocrine, neurological, haematological or circulatory disorders are at higher risk of disability compared to those who do not have those comorbidities. There is a great and urgent need to provide targeted prevention and intervention programs for these vulnerable people.
(28 November 2021) Huayu’s work on rare disease is now out with the Lancet as a conference abstract. Common conditions are widely recognised as risk factors for COVID-19 death, but the effects of rare diseases are largely unknown. This study on Genomics England data shows significantly increased mortality risks (OR 3·47) among individuals with rare diseases.
(23 November 2021) New members! A very warm welcome to Yun-Hsuan Chang and Hengrui Zhang, who are joining KnowLab for their MSc projects. Both are MSc students based in the Institute of Health Informatics, UCL. Yun will be working on Parkinson’s Disease Modelling using multimodal data and Hengrui will work on deep learning models for automated coding from discharge summaries. Both projects are exciting!
(9 October 2021) Great to know that Emma and Claire’s work on COVID subtype identification has been featured on Health Data Research UK’s website as a case study. Health Data Research UK is the UK’s national institute for health data science.
(5 October 2021) NLP of radiology reports has wide applications. However, the current literature has suboptimal reporting quality. This impedes comparison, reproducibility, and replication. Check our systematic review on reporting quality of NLP on radiology reports on BMC Medical Imaging.
(26 September 2021) We are pleased to announce the grand opening of our KnowLab Blog. We aim to share our research from time to time in lay terms. This is to reach out to the general public and explain what we are doing and why we are doing it.
(6 September 2021) KnowLab is thrilled to be part of a new £3.9m NIHR Research Collaboration on Artificial Intelligence and Multimorbidity, called AIM-CISC. We will lead the objective 4 work in England - using machine learning and natural language processing on multimodal health data for better understanding of disease clusters.
(18 August 2021) Great news - Dr Honghan Wu has become a Turing Fellow at The Alan Turing Institute, the UK’s national institute for data science and artificial intelligence.
(1 August 2021) Dr Minhong Wang takes up a new position as a research fellow at IHI, UCL (Top 5 in the world for Public Health according to Shanghai Ranking 2021) to work on exciting projects on health data using NLP + ML! Congratulations, Minhong!
(28 July 2021) A warm welcome to Nickil Maveli who will be working with Dr Hang Dong on the automated medical coding project. Nickil will focus on (a) tackling the shortcomings of BERT models that only deal with 512 tokens or fewer; (b) utilising multiple documents in the task.
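As background on the 512-token limit mentioned in this item, one common workaround (not necessarily the approach this project takes) is to split a long document into overlapping windows that each fit the model. The sketch below works on a plain list of token ids; the function name and the stride of 256 are illustrative assumptions.

```python
def sliding_windows(tokens, max_len=512, stride=256):
    """Split a token sequence into overlapping windows of at most max_len.

    Consecutive windows start `stride` tokens apart, so with stride < max_len
    neighbouring windows overlap and no token is left uncovered.
    """
    if max_len <= 0 or not 0 < stride <= max_len:
        raise ValueError("require max_len > 0 and 0 < stride <= max_len")
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + max_len])
        # Stop once the current window reaches the end of the sequence.
        if start + max_len >= len(tokens):
            break
        start += stride
    return windows


# A stand-in for a tokenised discharge summary of 1,200 tokens.
document = list(range(1200))
chunks = sliding_windows(document)
```

Each window can then be encoded separately and the per-window outputs pooled (for example by taking the maximum score per code), which is a design choice that trades extra computation for full coverage of the document.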
(9 July 2021) Dr Honghan Wu joined the editorial board of BMC Medical Informatics and Decision Making.
(7 July 2021) Exciting news - Dr Honghan Wu has been promoted to Associate Professor at UCL! (effective 1 October 2021).
(5 July 2021) We are recruiting a Research Fellow in Health Data Research to be based at IHI, UCL. Part of the role will be conducting exciting collaborations with iris.ai on The AI Chemist project. Please apply here.
(15 June 2021) Our COVID-19 subtyping work has now been accepted by AMIA 2021 Annual Symposium. What great work, Emma and Claire! Both are joint first authors and first year PhD students of HDR UK, Turing Institute and Wellcome CDT, who worked with KnowLab on the COVID-19 project. Axes of Prognosis - Identifying Subtypes of COVID-19 Outcomes. medRxiv. Github Repo
(15 June 2021) Led by Zina, our work “A Knowledge Distillation Ensemble Framework for Predicting Short and Long-term Hospitalisation Outcomes from Electronic Health Records Data” has been published on IEEE Journal of Biomedical and Health Informatics.
(8 June 2021) Our paper “Developing automated methods for disease subtyping in UK Biobank - an exemplar study on stroke” has been accepted by BMC Medical Informatics and Decision Making. This is our first work to combine NLP + reusable domain knowledge (encoded as rules) to derive sub-phenotypes (specific conditions like intracerebral hemorrhage stroke).
(17 May 2021) Our paper “A Systematic Review of Natural Language Processing Applied to Radiology Reports” has been accepted by BMC Medical Informatics and Decision Making. Preprint arXiv. Well done, Arlene!
(5 May 2021) Our work with Dr Adam Levine on Pathology NLP has been accepted for oral presentation at the 13th Joint Meeting of the BDIAP and The Pathological Society on 6-8 July 2021, titled Natural Language Processing for the Automated Extraction of Tumour Immunohistochemical Profiles from Diagnostic Histopathology Reports. What a great start to our work on Pathology NLP!
(21 April 2021) Hang’s work - “Weakly supervised entity linking and ontology matching to enrich patients’ rare disease coding” has been accepted to the 2021 virtual workshop on Personal Health Knowledge Graphs. Well done, Hang!
(24 March 2021) Axes of Prognosis - Identifying Subtypes of COVID-19 Outcomes - a work led by Emma and Claire now on medRxiv. Both are first year PhD students of HDR UK, Turing Institute and Wellcome CDT. Great work, Emma and Claire! Github Repo
(25 February 2021) Our paper “Explainable Automated Coding of Clinical Notes using Hierarchical Label-wise Attention Networks and Label Embedding Initialisation” has been accepted by Journal of Biomedical Informatics. Well done, Hang!
(25 February 2021) Our recent work uses deep learning to automate diagnosis coding (ICD) from discharge summaries. Explainable Automated Coding of Clinical Notes using Hierarchical Label-wise Attention Networks and Label Embedding Initialisation, preprint, GitHub Repo
(20 February 2021) Invited talk from Jose Manuel Gomez-Perez “On the Role of Knowledge Graphs and Language Models in Machine Understanding of Scientific Documents”.
(1 February 2021) Excited to kick off our project with NHS Lothian to use Natural Language Processing + Machine learning to automatically triage patients on the long dermatology waiting list (a result of the impact of COVID-19). This study is funded by Data-Driven Innovation.
(1 February 2021) Many congratulations to Minhong for having successfully passed her viva with minor corrections! Huge achievement, Dr Wang!
(21 January 2021) Our paper “Evaluation of NEWS2 for predicting severe COVID outcome” is now published in BMC Medicine. Key findings - NEWS2 had poor-to-moderate discrimination for severe outcome (ICU/death) at 14 days, but discrimination improved when blood and physiological parameters were added. Mentions in News Stories - dailymail / eurekalert / KCL
(4 January 2021) Our paper “Benchmarking network-based gene prioritization methods for cerebral small vessel disease” has been accepted by Briefings in Bioinformatics.
(11 November 2020) Our ensemble learning for COVID-19 work has been accepted by Journal of the American Medical Informatics Association. This study synergises seven multinational prediction models to realise a robust and high-performing prediction model. This is the first work to use ensemble learning for risk prediction of COVID-19 and the validation cohorts are one of the most diverse international COVID-19 datasets (4 cohorts with mortality rates 2.4-45%). The ensemble model consistently outperformed any single model in all aspects validated. DOI:10.1093/jamia/ocaa295, GitHub Repo
(12 September 2020) Our study uses ensemble learning to synergise seven multinational prediction models to realise a robust and high-performing prediction model. This is the first work to use ensemble learning for risk prediction of COVID-19 and the validation cohorts are one of the most diverse international COVID-19 datasets (4 cohorts with mortality rates 2.4-45%). The ensemble model consistently outperformed any single model in all aspects validated. preprint, GitHub Repo
(11 August 2020) Great news - Huayu Zhang’s project, titled “Towards data-driven fine management of COVID-19 hospitalization risk for rare-disease patients” is awarded £1,000 by The iTPA Translational Innovation Competition. Quite an achievement for an early stage postdoctoral researcher!
(11 June 2020) Our study shows “Adding age and a minimal set of blood parameters to NEWS2 improves the detection of patients likely to develop severe COVID-19 outcomes” - “Evaluation and Improvement of the National Early Warning Score (NEWS2) for COVID-19 - a multi-hospital study”. MedRxiv.
(11 June 2020) Our study shows “CVD services have dramatically reduced across countries, leading to potential (probably avoidable) excess mortality during and after the COVID-19 pandemic” - “Excess deaths in people with cardiovascular diseases during the COVID-19 pandemic”. MedRxiv.
(3 May 2020) Our COVID-19 risk prediction preprint is out on medRxiv - “Risk prediction for poor outcome and death in hospital in-patients with COVID-19 - derivation in Wuhan, China and external validation in London, UK”. MedRxiv.
(1 May 2020) Dr Honghan Wu started his new job as a lecturer in health informatics at IHI, UCL. He will continue his personal fellowship project at both the University of Edinburgh and UCL.
(27 February 2020) Paper accepted by HealTAC 2020 - “Identifying physical health comorbidities in a cohort of individuals with severe mental illness - An application of SemEHR”
(21 February 2020) Paper accepted by ECAI 2020 - “Modeling Rare Interactions in Time Series Data Through Qualitative Change - Application to Outcome Prediction in Intensive Care Units”
(27 January 2020) Delighted to co-develop an NLP work package in the exciting Advanced Care Research Centre programme, a £20m investment dedicated to the field of ageing and care.
(20 January 2020) Great to have a short visit to Department of Orthopaedics and Traumatology, Hong Kong University, discussing the exciting opportunity of personalised pain prediction after spine treatments using multimodal data (free text + imaging).
(17 January 2020) Our paper - “On Classifying Sepsis Heterogeneity in the ICU - Insight Using Machine Learning” has been published by JAMIA. https://doi.org/10.1093/jamia/ocz211
(14 January 2020) Delighted to have a kick-off meeting with Edinburgh Innovations team for our project “Towards an AI-driven Health Informatics Platform for supporting clinical decision making in Scotland – a pilot study in NHS Lothian” funded by Wellcome iTPA 2019.
(6 December 2019) Knowledge driven phenotyping is on medRxiv now - an automated approach to translating phenotypes defined in domain vocabularies into queries executable on heterogeneous and distributed health datasets.
(28 November 2019) Using SemEHR on EHRs to answer an important clinical question – “Association of physical health multimorbidity with mortality in people with schizophrenia spectrum disorders - Using a novel semantic search system that captures physical diseases in electronic patient records” has been accepted by Schizophrenia Research. DOI:10.1016/j.schres.2019.10.061
(11 November 2019) Great to know our paper - “Semantic computational analysis of anticoagulation use in atrial fibrillation from real world data” has been accepted by PLoS One. Preprint
(22 October 2019) Our first NLP transfer learning paper for identifying phenotype mentions has been accepted by JMIR Medical Informatics, a relatively new journal (started in 2013) with an inaugural impact factor of 3.188.
(20 July 2019) Delighted to know our proposal for the HDR UK NLP Implementation Project has been awarded as one of the 3 National Implementation Projects. Look forward to working on this exciting UK-wide collaboration.
(13 January 2019) Delighted to know our “Sprint” project proposal – “Building the Knowledge Graph for UK Healthcare Data Science” is awarded by HDR UK as part of the Digital Innovation Hub Programme and is one of the ten innovative data solutions to prove the potential of health data to transform lives.
(19 October 2018) Thrilled to be invited to give a talk about our CogStack EHR platform at China’s National Centre for Cardiovascular Diseases. Great to learn about their excellent infrastructure, research and datasets, and their grand vision! Look forward to the first CogStack deployment in China for supporting EHR based research.
(15 February 2018) Proudly beginning an MRC/Rutherford Fund Fellowship of HDR UK, hosted by the Centre for Medical Informatics of the University of Edinburgh. My research focuses on “Deriving an actionable patient phenome from healthcare data”.
(10 February 2018) An application paper describing our SemEHR toolkit has been accepted by JAMIA, titled “SemEHR- A General-purpose Semantic Search System to Surface Semantic Data from Clinical Notes for Tailored Care, Trial Recruitment and Clinical Research”.