(23 November 2023) New paper - Term-BLAST-Like Alignment Tool for Concept Recognition in Noisy Clinical Texts - published on Bioinformatics
Texts from electronic health records (EHRs) frequently contain spelling errors, abbreviations, and other non-standard ways of representing clinical concepts. Here, we present a method inspired by the BLAST algorithm for biosequence alignment that screens texts for potential matches on the basis of matching k-mer counts and scores candidates based on conformance to typical patterns of spelling errors derived from 2.9 million clinical notes. Our method, the Term-BLAST-like alignment tool (TBLAT), leverages a gold standard corpus for typographical errors to implement a sequence alignment-inspired method for efficient entity linkage. We present a comprehensive experimental comparison of TBLAT with five widely-used tools. Experimental results show an increase of 10% in recall on scientific publications and a 20% increase in recall on EHR records (when compared against the next best method), hence supporting a significant enhancement of the entity linking task. The method can be used stand-alone or as a complement to existing approaches.
Read it at DOI:10.1093/bioinformatics/btad716
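As a rough illustration of the k-mer screening idea mentioned in the abstract, here is a minimal sketch in Python. It is not the TBLAT implementation: the function names, the choice of k = 3, the `min_shared` threshold and the toy vocabulary are all assumptions made for this example, and the error-pattern scoring step described in the paper is not shown.

```python
from collections import Counter


def kmer_counts(text: str, k: int = 3) -> Counter:
    """Count overlapping character k-mers in a lower-cased string."""
    s = text.lower()
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))


def shared_kmers(a: str, b: str, k: int = 3) -> int:
    """Number of k-mers two strings share, counting repeats (multiset overlap)."""
    overlap = kmer_counts(a, k) & kmer_counts(b, k)
    return sum(overlap.values())


def screen_candidates(mention: str, terms: list[str], k: int = 3,
                      min_shared: int = 3) -> list[tuple[str, int]]:
    """Keep terms sharing at least `min_shared` k-mers with the mention, best first."""
    scored = [(term, shared_kmers(mention, term, k)) for term in terms]
    kept = [(term, score) for term, score in scored if score >= min_shared]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical vocabulary and a misspelled mention, for illustration only.
vocabulary = ["pneumonia", "pneumothorax", "anaemia"]
candidates = screen_candidates("pnuemonia", vocabulary)
```

In this toy case the misspelled mention still shares the 3-mers "mon", "oni" and "nia" with "pneumonia", so only that term survives the screen; a real system would then rank the surviving candidates with a more careful alignment score.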
(27 October 2023) Two papers published on Frontiers in Digital Health
[1] Casey, Arlene, Emma Davidson, Claire Grover, Richard Tobin, Andreas Grivas, Huayu Zhang, Patrick Schrempf, Alison Q. O’Neil, Liam Lee, Michael Walsh, Freya Pellie, Karen Ferguson, Vera Cvoro, Honghan Wu, Heather Whalley, Grant Mair, William Whiteley, Beatrice Alex. “Understanding the performance and reliability of NLP tools - a comparison of four NLP tools predicting stroke phenotypes in radiology reports.” Frontiers in Digital Health 5 (2023), 1184919. Read it here
[2] Zhang, Huayu, Arlene Casey, Imane Guellil, Víctor Suárez-Paniagua, Clare Macrae, Charis Marwick, Honghan Wu, Bruce Guthrie, and Beatrice Alex. “FLAP - A framework for linking free-text addresses to the Ordnance Survey Unique Property Reference Number database.” Frontiers in Digital Health 5 (2023), 1186208. Read it here
(25 September 2023) Three new starters at KnowLab!
Yunsoo Kim is a new PhD student working on multimodal large language models for health data. His research interests also include applications of these models in the diagnosis and prognosis of neurological diseases such as dementia.
Yusuf Abdulle is a Research Assistant based at the Institute of Health Informatics at University College London. He is currently using Graph Neural Networks and Knowledge Graphs for early diagnosis of rare neurodegenerative diseases.
Yue Gao is a PhD student from Beijing University of Posts and Telecommunications. Funded by CSC, Yue is visiting KnowLab for one year doing research on human in the loop AI models for automated clinical coding.
(11 September 2023) New grant! KnowLab is awarded £649,218 by MRC for Quantifying and Mitigating Bias affecting and induced by AI in Medicine!
Artificial Intelligence (AI) has demonstrated exciting potential in improving healthcare. However, these technologies come with a big caveat. They do not work effectively for minority groups. A recent study published in Science shows that a widely used AI tool in the US concludes Black patients are healthier than equally sick White patients. Using this tool, a health system would favour White people when allocating resources, such as hospital beds. AI models like this would do more harm than good for health equity. Funded by the Medical Research Council, KnowLab is leading a 30-month research project focusing on using data science and machine learning to quantify and mitigate data-embedded and AI-induced bias and inequality. Clearly, this is a challenge too grand to be tackled by a single institute. We will be working closely with BHF Data Science Centre, University of Edinburgh, University of Birmingham, Nanjing Medical University (China), and wider communities including Health Data Research UK, the Alan Turing Institute and beyond.
Check the Project Page at UKRI
(23 July 2023) Hard exudate plays an important role in grading diabetic retinopathy (DR) as a critical indicator. Therefore, the accurate segmentation of hard exudates is of clinical importance. However, the percentage of hard exudates in the whole fundus image is relatively small, their shapes are often irregular, and their contrast is usually low. Hence, they are prone to misclassification, e.g., as part of the optic disc structure or as cotton wool spots, which results in low segmentation accuracy and efficiency. This paper proposes a novel neural network, RMCA U-net, to accurately segment hard exudates in fundus images. The network features a U-shape framework combined with a residual structure to capture the subtle features of hard exudates. A multi-scale feature fusion (MSFF) module and an improved channel attention (CA) module are designed and incorporated to effectively segment sparse small lesions. The proposed method has been trained and evaluated on three data sets - IDRID, Kaggle and one local data set. Experiments indicate that RMCA U-net is superior to the other convolutional neural networks compared. Relative to U-net, the method is 6% higher in PR-MAP on the IDRID dataset, 10% higher in recall on the Kaggle dataset and 20% higher in F1-score on the local dataset.
Read it at DOI:10.1016/j.eswa.2023.120987
(15 July 2023) This paper presents our contribution to the RadSum23 shared task organized as part of BioNLP 2023. We compared state-of-the-art generative language models in generating high-quality summaries from radiology reports. A two-stage fine-tuning approach was introduced for utilizing knowledge learnt from different datasets. We evaluated the performance of our method using a variety of metrics, including BLEU, ROUGE, BERTScore, CheXbert, and RadGraph. Our results revealed the potential of different models in summarizing radiology reports and demonstrated the effectiveness of the two-stage fine-tuning approach. We also discussed the limitations and future directions of our work, highlighting the need to better understand the effect of architecture design and the optimal way of fine-tuning accordingly in automatic clinical summarization.
Read it at DOI:10.18653/v1/2023.bionlp-1.54
(5 May 2023) Ontology-driven and weakly supervised rare disease identification from clinical notes published on BMC Medical Informatics and Decision Making
Superb work from Dr Hang Dong and colleagues, demonstrating how weakly supervised NLP + Ontology techniques can greatly facilitate the identification of rare disease mentions from electronic health records with >90% accuracy. This uses training data that need no human annotations!
Read it at DOI:10.1186/s12911-023-02181-9
(5 May 2023) New paper titled Prediction of disease comorbidity using explainable artificial intelligence and machine learning techniques - A systematic review published on International Journal of Medical Informatics
Mohanad M. Alsaleh - a PhD student at UCL - did a great systematic review on explainable AI methods for predicting comorbidity from electronic health records. It finds “(a) The use of explainable artificial intelligence (XAI) can improve predictions of comorbidities by providing a transparent understanding of the reasoning behind predictions and helping healthcare providers make informed decisions. (b) There is a great potential to uncover novel disease associations and better understand the mechanisms of diseases by integrating genetic and electronic health record (EHR) data, leading to improved quality of care and earlier diagnoses. (c) The use of AI in healthcare can improve patient outcomes and reduce healthcare costs by identifying disease risks and making personalised treatment plans.”
Read it at DOI:10.1016/j.ijmedinf.2023.105088
(7 April 2023) New paper - a systematic review on antidepressant and antipsychotic drug prescribing and diabetes outcomes
As part of her PhD research, Charlotte led this systematic review to investigate the association between antidepressant or antipsychotic drug prescribing and type 2 diabetes outcomes. It concludes - Studies of antidepressant and antipsychotic drug prescribing in relation to diabetes outcomes are scarce, with shortcomings and mixed findings. Until further evidence is available, people with diabetes prescribed antidepressants and antipsychotics should receive monitoring and appropriate treatment of risk factors and screening for complications as recommended in general diabetes guidelines.
Read it at DOI:10.1016/j.diabres.2023.110649
(20 March 2023) Workshop paper on “Ontology-driven Self-supervision for Adverse Childhood Experiences Identification using Social Media Datasets” now published! Adverse Childhood Experiences (ACEs) are defined as a collection of highly stressful, and potentially traumatic, events or circumstances that occur throughout childhood and/or adolescence. They have been shown to be associated with increased risks of mental health diseases or other abnormal behaviours in later life. In this paper, Jinge and colleagues present an ontology-driven self-supervised approach (deriving concept embeddings using an auto-encoder from baseline NLP results) for producing a publicly available resource that would support large-scale machine learning (e.g., training transformer based large language models) on social media corpora. Jinge presented this paper in summer 2022 at the 1st Workshop on Scarce Data in Artificial Intelligence for Healthcare, which was held with IJCAI 2022 in Vienna. Check - Paper, Github Repo
(7 March 2023) npj Digital Medicine’s editorial on automating clinical coding echoes our perspective
Recently, npj Digital Medicine’s editor Dr Kvedar and colleagues published a great editorial on automating clinical coding (link), pointing out the main challenges at both the technological and implementation levels: clinical documents are redundant and complex; code sets like the ICD-10 are rapidly evolving; training sets do not cover all codes; and the logic and rules of coding decisions are hard to capture. Great to see our perspectives on the research challenges and future directions of automated coding echoed in the editorial!
(21 February 2023) New paper - The impact of inconsistent human annotations on AI driven clinical decision making now published by npj Digital Medicine
Annotation inconsistencies commonly occur when even highly experienced clinical experts annotate the same phenomenon (e.g., medical image, diagnostics, or prognostic status), due to inherent expert bias, judgements, and slips, among other factors. While their existence is relatively well-known, the implications of such inconsistencies are largely understudied in real-world settings.
Aneeta Sylolypavan did her MSc with us addressing this hugely important research question using real-world ICU datasets with annotated data from 11 ICU consultants. The results suggest that (a) there may not always be a “super expert” in acute clinical settings; and (b) standard consensus seeking (such as majority vote) consistently leads to suboptimal models. Further analysis, however, suggests that assessing annotation learnability and using only ‘learnable’ annotated datasets for determining consensus achieves optimal models in most cases.
Read the paper from here
(23 January 2023) KnowLab is proud to be part of 16 projects funded by HDR UK and NIHR which will use data-driven approaches to pin-point pressures in the health care system, understand their causes and develop ways to overcome or avoid them. Particularly, we will use machine learning and rare disease phenotype models to uncover much-needed information on the added risks of severe COVID-19 in people who are clinically more vulnerable and come from disadvantaged socioeconomic backgrounds. This can then inform policy responses to provide better management and treatment for these most vulnerable groups who might have been overlooked. The team are well placed to derive quick actionable findings for the winter pressures as they have been working with CVD-COVID-UK/COVID-IMPACT on rare diseases since October 2021. HDR UK Press Release on the funded projects | HDR UK News on this project | Herald Scotland News
(21 December 2022) Our survey paper on the UK’s clinical NLP landscape is now published with npj Digital Medicine
Aiming to survey the landscape of clinical NLP in the UK, we used a relatively unusual approach - we started by finding all relevant funded projects and extracting their interlinked information, then conducted community analysis and a literature review. We described WHO (key players among funders, universities, companies, researchers), WHAT (techs, applications, disease areas, clinical questions, datasets), WHERE (the community developments, tech trends & maturity), and GAPS (barriers to unleashing the full power of NLP in health). While on the community level we focused on the UK, analyses and discussions on the research, tech and developments went beyond the country boundary. In particular, we compared tech, data, and regulatory similarities and differences between the US and the UK. This is one of the key outputs of the HDR UK funded National Text Analytics Project.
Read it at DOI:10.1038/s41746-022-00730-6
(20 December 2022) KnowLab co-edits a new cross journal collection with BMC Series on Ethics of Artificial Intelligence in Health and Medicine.
As the implementation of artificial intelligence (AI)-based innovations in health and care services becomes more and more common, it is increasingly pressing to address the ethical challenges associated with AI in healthcare to find appropriate solutions. In the cross-journal BMC collection Ethics of Artificial Intelligence in Health and Medicine, we urge the research communities, industry, policy makers and other stakeholders to join forces in tackling the grand challenges of realising ethical and fair AI in health and medicine. Check our blog article with BMC Series on the topic for what & why. Please spread the word and contribute to the collection; the current deadline is 31 Oct 2023.
(22 November 2022) KnowLab is awarded £5,000 from UCL Global Engagement Funding.
The funding is to extend and deepen our collaboration with iris.ai - a Norway based start-up behind the award-winning AI engine for scientific text understanding. The funded project is titled - Towards self-updatable knowledge base for evidence based medicine - join force with iris.ai (Norway) and beyond.
(27 October 2022) The Alan Turing Health Equity Interest Group - https://www.turing.ac.uk/research/interest-groups/health-equity, co-organised by KnowLab and colleagues, is now officially online!
In an era where AI is expected to improve our daily life - particularly in health, “How can we ensure that developments and applications of data science and AI improve everyone’s health?” This is a pressing and very challenging question. Please join forces with a multidisciplinary group to form a formidable synergistic force tackling one of the biggest challenges of AI in medicine.
(22 October 2022) Dr Hang Dong’s great perspective piece on automated coding using NLP and knowledge-driven approaches has now been published on npj Digital Medicine. The work illustrates how NLP and AI can help improve the efficiency of clinical coding in healthcare - i.e., assign ICD/SNOMED codes to hospital visits, which currently is a very inefficient/erroneous process in NHS and, for that matter, in many other health systems across the world.
(5 October 2022) Study on The Impact of Inconsistent Human Annotations on AI driven Clinical Decision Making.
Annotation inconsistencies commonly occur when even highly experienced clinical experts annotate the same phenomenon (e.g., medical image, diagnostics, or prognostic status), due to inherent expert bias, judgements, and slips, among other factors. While their existence is relatively well-known, the implications of such inconsistencies are largely understudied in real-world settings.
Aneeta Sylolypavan did her MSc with us addressing this hugely important research question using real-world ICU datasets with annotated data from 11 ICU consultants. The results suggest that (a) there may not always be a “super expert” in acute clinical settings; and (b) standard consensus seeking (such as majority vote) consistently leads to suboptimal models. Further analysis, however, suggests that assessing annotation learnability and using only ‘learnable’ annotated datasets for determining consensus achieves optimal models in most cases.
The manuscript is now under review with npj Digital Medicine and preprint is available at doi:10.21203/rs.3.rs-1937575/v1.
(18 August 2022) Our systematic review titled “Artificial intelligence models for predicting cardiovascular diseases in people with type 2 diabetes - a systematic review”, led by Minhong Wang, has been accepted by Intelligence-Based Medicine. This study identified and reviewed existing AI models for predicting risk of cardiovascular diseases in people with type 2 diabetes. We found that, compared to risk scores developed using conventional methods, AI approaches have the potential to achieve more accurate predictions. However, none of the reviewed models is directly reusable or reproducible, due to incomplete reporting and lack of transparency. Clinically, none of the AI models includes interventions that may affect risks such as medications and lifestyle changes. There were no indications in the studies on whether the prediction models might be able to adapt to include these factors.
(29 July 2022) Our collaboration study titled “Prediction of Five-Year Cardiovascular Disease Risk in People with Type 2 Diabetes Mellitus - Derivation in Nanjing, China and External Validation in Scotland, UK”, led by Cheng Wan from Nanjing Medical University, has been published by Global Heart. This study shows it is feasible to generate a risk prediction model using routinely collected Chinese hospital data. This indicates there is great potential to make use of large-scale and relatively easily accessible routine data for identifying those at risk of CVD and helping to significantly improve CVD prevention in people with diabetes.
(11 July 2022) Our health inequality studies - one led by Isabel Straw and one with Minhong, Aneeta and Prof Sarah Wild (University of Edinburgh) - are featured in a science piece on i news.
(16 June 2022) Our collaboration study titled “Spine-GFlow - A Hybrid Learning Framework for Robust Multi-tissue Segmentation in Lumbar MRI without Manual Annotation”, led by Dr Teng Zhang from Hong Kong University, has been accepted by Computerized Medical Imaging and Graphics. Results of this study show that our method, without requiring manual annotation, has achieved a segmentation performance comparable to a model trained with full supervision (mean Dice 0.914 vs 0.916).
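For readers unfamiliar with the Dice score quoted above, the following is a minimal sketch of how it is commonly computed for binary segmentation masks. It is not code from the study: the function name, the flattened-list mask representation, the empty-mask convention and the toy masks are assumptions made for illustration.

```python
def dice_coefficient(pred: list[int], truth: list[int]) -> float:
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    # A pixel counts towards the intersection only if both masks mark it as 1.
    intersection = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy flattened masks (1 = tissue, 0 = background), not data from the study.
predicted = [1, 1, 0, 0, 1, 0]
reference = [1, 0, 0, 0, 1, 1]
score = dice_coefficient(predicted, reference)
```

A score of 1.0 means the predicted and reference masks overlap perfectly, and 0.0 means they do not overlap at all; a "mean Dice" averages this value over tissues or cases.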
(10 June 2022) Our work, titled COVID-19 trajectories among 57 million adults in England - a cohort study using electronic health records, is now out with Lancet Digital Health. Our analyses illustrate the wide spectrum of disease trajectories as shown by differences in incidence, survival, and clinical pathways. We have provided a modular analytical framework that can be used to monitor the impact of the pandemic and generate evidence of clinical and policy relevance using multiple EHR sources.
(2 May 2022) Our work Quantifying Health Inequalities Induced by Data and AI Models has been accepted by IJCAI-ECAI2022 ‘AI for Good track’. This work introduced a generic allocation-deterioration framework for detecting and quantifying AI induced inequality. Extensive experiments were carried out to quantify health inequalities (a) embedded in two real-world ICU datasets of HiRID and MIMIC III; (b) induced by AI models trained for two resource allocation scenarios. The results showed that compared to men, women had up to 33% poorer deterioration in markers of prognosis when admitted to HiRID ICUs. All four AI models assessed were shown to induce significant inequalities (2.45% to 43.2%) for non-White compared to White patients. The models exacerbated data embedded inequalities significantly in 3 out of 8 assessments, one of which was >9 times worse. preprint, slides, recording, repo.
(26 April 2022) Study led by Isabel Straw, Investigating for bias in healthcare algorithms - a sex-stratified analysis of supervised machine learning models in liver disease prediction, demonstrates a previously unobserved sex disparity present in published machine learning models. It suggests “To ensure sex-based inequalities do not manifest in medical AI, an evaluation of demographic performance disparities must be integrated into model development.” The work has been published on BMJ Health & Care Informatics.
(22 April 2022) Dr Honghan Wu joined the editorial board of BMC Digital Health. BMC Digital Health considers research on all aspects of the development and implementation of digital technology in both medicine and public health, such as mobile health applications, virtual healthcare and wearable technology, as well as the role of social media and other communications technology in digital health.
(25 March 2022) Study led by Huayu, Increased COVID-19 mortality rate in rare disease patients - a retrospective cohort study in participants of the Genomics England 100,000 Genomes project, has shown that rare disease patients in the Genomics England cohort, especially those affected by neurology and neurodevelopmental disorders, had an increased risk of COVID-19 related death during the first wave of the pandemic in the UK. This work has now been accepted by Orphanet Journal of Rare Diseases.
(20 March 2022) Clinical coding is the task of transforming medical information in a patient’s health records into structured codes like ICD-10 for diagnosis, which is a cognitive, time-consuming and error-prone task. In this preprint, titled Automated Clinical Coding - What, Why, and Where We Are?, Hang introduces the idea of automated clinical coding and summarises its challenges from the perspective of Artificial Intelligence (AI) and Natural Language Processing (NLP), based on the literature, our project experience over the past two and a half years (late 2019 - early 2022), and discussions with clinical coding experts in Scotland and the UK.
(18 January 2022) KnowLab was awarded an enabling grant (£29k) from the British Council to strengthen academic exchanges and deepen our collaborations with two Nanjing-based universities, Nanjing Medical University (Prof Yun Liu’s group) and Southeast University (Dr Xiang Zhang’s group). On the UCL side, in addition to KnowLab colleagues, we have Prof Paul Taylor and Dr Holger Kunz. Our research will focus on Artificial Intelligence in Medicine - tackling challenges of low generalisability and health inequality. This will involve both teaching and research activities.
(15 January 2022) Great work by Shaoxiong Ji and colleagues from Aalto University on reviewing automated coding from free-text clinical notes using a unified/abstract architecture view - now on arXiv. Great that KnowLab is part of this.
(13 January 2022) Minhong’s project “COOLNeo-an automated COOLing therapy for NEOnates” has been awarded £9,960.00 in funding from the ACCELERATE Innovation Team Challenge, financially supported through the Wellcome Trust Translational Partnership Award linked to Translational Research Office.
(8 January 2022) New members! A very warm welcome to Xuezhe Wang and Zhaolong Wu, who are joining KnowLab to do their MSc projects. Both are MSc students based in the Institute of Health Informatics, UCL. Xuezhe will be working on graph neural networks and Zhaolong will be doing clinical natural language processing.
(3 December 2021) Led by Dr Rebecca Bendayan at King’s College London, our work Investigating the Association between Physical Health Comorbidities and Disability in Individuals with Severe Mental Illness is now accepted by European Psychiatry. We found individuals with Severe Mental Illness and musculoskeletal, skin/dermatological, respiratory, endocrine, neurological, haematological or circulatory disorders are at higher risk of disability compared to those that do not have those comorbidities. There is a great and urgent need to provide targeted prevention and intervention programs for these vulnerable people.
(28 November 2021) Huayu’s work on rare disease is now out with the Lancet as a conference abstract. Common conditions are widely recognised as risk factors for COVID-19 death, BUT effects of Rare Diseases are largely unknown. This study on Genomics England data shows significantly increased mortality risks (OR 3·47) among rare disease individuals.
(23 November 2021) New members! A very warm welcome to Yun-Hsuan Chang and Hengrui Zhang, who are joining KnowLab to do their MSc projects. Both are MSc students based in the Institute of Health Informatics, UCL. Yun will be working on Parkinson’s Disease Modelling using multimodal data and Hengrui will work on deep learning models for automated coding from discharge summaries. Both projects are exciting!
(9 October 2021) Great to know that Emma and Claire’s work on COVID subtype identification has been featured on Health Data Research UK’s website as a case study. Health Data Research UK is the UK’s national institute of health data science.
(5 October 2021) NLP of radiology reports has wide applications. However, the current literature has suboptimal reporting quality. This impedes comparison, reproducibility, and replication. Check our systematic review on reporting quality of NLP on radiology reports on BMC Medical Imaging.
(26 September 2021) We are pleased to announce the grand opening of our KnowLab Blog. We aim to share our research from time to time in lay terms. This is to reach out to the general public and explain what we are doing and why we are doing it.
(6 September 2021) KnowLab is thrilled to be part of a new £3.9m NIHR Research Collaboration on Artificial Intelligence and Multimorbidity, called AIM-CISC. We will lead the objective 4 work in England - use machine learning and natural language processing on multimodal health data for better understanding of disease clusters.
(18 August 2021) Great news - Dr Honghan Wu has become a Turing Fellow at The Alan Turing Institute, UK’s national institute for data science and artificial intelligence.
(1 August 2021) Dr Minhong Wang takes up a new position as a research fellow at IHI, UCL (Top 5 in the world for Public Health according to Shanghai Ranking 2021) to work on exciting projects on health data using NLP + ML! Congratulations, Minhong!
(28 July 2021) A warm welcome to Nickil Maveli who will be working with Dr Hang Dong on the automated medical coding project. Nickil will focus on (a) tackling the shortcomings of BERT models that only deal with 512 tokens or fewer; (b) utilising multiple documents in the task.
(15 July 2021) Paper accepted by EMBC 2021, titled “Rare Disease Identification from Clinical Notes with Ontologies and Weak Supervision”. arXiv:2105.01995
(9 July 2021) Dr Honghan Wu joined the editorial board of BMC Medical Informatics and Decision Making.
(7 July 2021) Exciting news - Dr Honghan Wu has been promoted to Associate Professor at UCL! (effective 1 October 2021).
(5 July 2021) We are recruiting a Research Fellow in Health Data Research to be based at IHI, UCL. Part of the role will be conducting exciting collaborations with iris.ai on The AI Chemist project. Please apply here.
(15 June 2021) Our COVID-19 subtyping work has now been accepted by AMIA 2021 Annual Symposium. What great work, Emma and Claire! Both are joint first authors and first year PhD students of HDR UK, Turing Institute and Wellcome CDT, who worked with KnowLab on the COVID-19 project. Axes of Prognosis - Identifying Subtypes of COVID-19 Outcomes. medRxiv. Github Repo
(15 June 2021) Led by Zina, our work “A Knowledge Distillation Ensemble Framework for Predicting Short and Long-term Hospitalisation Outcomes from Electronic Health Records Data” has been published on IEEE Journal of Biomedical and Health Informatics.
(8 June 2021) Our paper “Developing automated methods for disease subtyping in UK Biobank - an exemplar study on stroke” has been accepted by BMC Medical Informatics and Decision Making. This is our first work to combine NLP + reusable domain knowledge (encoded as rules) to derive sub-phenotypes (specific conditions like intracerebral hemorrhage stroke).
(17 May 2021) Our paper “A Systematic Review of Natural Language Processing Applied to Radiology Reports” has been accepted by BMC Medical Informatics and Decision Making. Preprint arXiv. Well done, Arlene!
(5 May 2021) Our work with Dr Adam Levine on Pathology NLP has been accepted for oral presentation at 13th Joint meeting of the BDIAP and The Pathological Society on 6-8 July 2021, titled Natural Language Processing for the Automated Extraction of Tumour Immunohistochemical Profiles from Diagnostic Histopathology Reports. What a great start to work on Pathology NLP!
(21 April 2021) Hang’s work - “Weakly supervised entity linking and ontology matching to enrich patients’ rare disease coding” has been accepted to the 2021 virtual workshop on Personal Health Knowledge Graphs. Well done, Hang!
(12 April 2021) Honghan is invited to speak at Personal Knowledge Graph Workshop 2021 about our work and thoughts on knowledge graph and particularly the “personal” aspect in health data science.
(24 March 2021) Axes of Prognosis - Identifying Subtypes of COVID-19 Outcomes - a work led by Emma and Claire now on medRxiv. Both are first year PhD students of HDR UK, Turing Institute and Wellcome CDT. Great work, Emma and Claire! Github Repo
(25 February 2021) Our paper “Explainable Automated Coding of Clinical Notes using Hierarchical Label-wise Attention Networks and Label Embedding Initialisation” has been accepted by Journal of Biomedical Informatics. Well done, Hang!
(25 February 2021) Our recent work of using deep learning to automate diagnosis coding (ICD) from discharge summaries. Explainable Automated Coding of Clinical Notes using Hierarchical Label-wise Attention Networks and Label Embedding Initialisation, preprint, GitHub Repo
(20 February 2021) Invited talk from Jose Manuel Gomez-Perez “On the Role of Knowledge Graphs and Language Models in Machine Understanding of Scientific Documents”.
(1 February 2021) Excited to kick off our project with NHS Lothian to use Natural Language Processing + Machine learning to automatically triage patients on the long dermatology waiting list (due to the impact of COVID-19). This study is funded by Data-Driven Innovation.
(1 February 2021) Many congratulations to Minhong for having successfully passed her viva with minor corrections! Huge achievement, Dr Wang!
(21 January 2021) Our paper “Evaluation of NEWS2 for predicting severe COVID outcome” is now published with BMC Medicine. Key findings - NEWS2 had poor-moderate discrimination for severe outcome (ICU/death) at 14 days, but improved with blood/physio params. Mentions in News Stories - dailymail / eurekalert / KCL
(4 January 2021) Our paper “Benchmarking network-based gene prioritization methods for cerebral small vessel disease” has been accepted by Briefings in Bioinformatics.
(16 December 2020) Evaluation and Improvement of the National Early Warning Score (NEWS2) for COVID-19 - a multi-hospital study. MedRxiv, accepted by BMC Medicine now.
(10 December 2020) Excess deaths in people with cardiovascular diseases during the COVID-19 pandemic, MedRxiv, now accepted by European Journal of Preventive Cardiology.
(11 November 2020) Our ensemble learning for COVID-19 work has been accepted by Journal of the American Medical Informatics Association. This study synergises seven multinational prediction models to realise a robust and high-performing prediction model. This is the first work to use ensemble learning for risk prediction of COVID-19 and the validation cohorts are one of the most diverse international COVID-19 datasets (4 cohorts with mortality rates 2.4-45%). The ensemble model consistently outperformed any single model in all aspects validated. DOI:10.1093/jamia/ocaa295, GitHub Repo
(16 October 2020) Many congratulations to Victor and Hang, who have each been awarded £1,000 by The iTPA Translational Innovation Competition. Details
(23 September 2020) We are hiring! Two posts (£33,797 - £40,322) in NLP / Health Data research. Deadline 14 Oct 2020, based in Usher, Edinburgh Medical School.
(12 September 2020) Our study uses ensemble learning to synergise seven multinational prediction models to realise a robust and high-performing prediction model. This is the first work to use ensemble learning for risk prediction of COVID-19, and the validation cohorts are among the most diverse international COVID-19 datasets (4 cohorts with mortality rates 2.4-45%). The ensemble model consistently outperformed any single model in all aspects validated. preprint, GitHub Repo
(11 August 2020) Great news - Huayu Zhang’s project, titled “Towards data-driven fine management of COVID-19 hospitalization risk for rare-disease patients”, has been awarded £1,000 by The iTPA Translational Innovation Competition. Quite an achievement for an early-stage postdoctoral researcher!
(11 June 2020) Our study shows “Adding age and a minimal set of blood parameters to NEWS2 improves the detection of patients likely to develop severe COVID-19 outcomes” - “Evaluation and Improvement of the National Early Warning Score (NEWS2) for COVID-19 - a multi-hospital study”. MedRxiv.
(11 June 2020) Our study shows “CVD services have dramatically reduced across countries, leading to potential (probably avoidable) excess mortality during and after the COVID-19 pandemic” - “Excess deaths in people with cardiovascular diseases during the COVID-19 pandemic”. MedRxiv.
(3 May 2020) Our COVID-19 risk prediction preprint is out on MedRxiv - “Risk prediction for poor outcome and death in hospital in-patients with COVID-19 - derivation in Wuhan, China and external validation in London, UK”. MedRxiv.
(1 May 2020) Dr Honghan Wu started his new job as a lecturer in health informatics at IHI, UCL. He will continue his personal fellowship project at both the University of Edinburgh and UCL.
(27 February 2020) Paper accepted by HealTAC 2020 - “Identifying physical health comorbidities in a cohort of individuals with severe mental illness - An application of SemEHR”
(21 February 2020) Paper accepted by ECAI 2020 - “Modeling Rare Interactions in Time Series Data Through Qualitative Change - Application to Outcome Prediction in Intensive Care Units”
(27 January 2020) Delighted to co-develop an NLP work package in the exciting Advanced Care Research Centre programme, a £20m investment dedicated to the field of ageing and care.
(20 January 2020) Great to have a short visit to the Department of Orthopaedics and Traumatology, Hong Kong University, discussing the exciting opportunity of personalised pain prediction after spine treatments using multimodal data (free text + imaging).
(17 January 2020) Our paper - “On Classifying Sepsis Heterogeneity in the ICU - Insight Using Machine Learning” has been published in JAMIA. https://doi.org/10.1093/jamia/ocz211
(14 January 2020) Delighted to have a kick-off meeting with the Edinburgh Innovations team for our project “Towards an AI-driven Health Informatics Platform for supporting clinical decision making in Scotland – a pilot study in NHS Lothian”, funded by Wellcome iTPA 2019.
(6 December 2019) Knowledge-driven phenotyping is on MedRxiv now - an automated approach to translating phenotypes defined in domain vocabularies into queries executable on heterogeneous and distributed health datasets.
(28 November 2019) Using SemEHR on EHRs to answer an important clinical question – “Association of physical health multimorbidity with mortality in people with schizophrenia spectrum disorders - Using a novel semantic search system that captures physical diseases in electronic patient records” has been accepted by Schizophrenia Research. DOI:10.1016/j.schres.2019.10.061
(11 November 2019) Great to know our paper - “Semantic computational analysis of anticoagulation use in atrial fibrillation from real world data” has been accepted by PLoS One. Preprint
(22 October 2019) Our first NLP transfer learning paper for identifying phenotype mentions has been accepted by JMIR Medical Informatics, a relatively new journal (started in 2013) with an inaugural impact factor of 3.188.
(20 July 2019) Delighted to know our proposal for the HDR UK NLP Implementation Project has been awarded as one of the 3 National Implementation Projects. Look forward to working on this exciting UK-wide collaboration.
(5 April 2019) The 4th International Workshop on Knowledge Discovery in Healthcare Data will be with IJCAI2019 in Macao, China. Website, submission and dates.
(13 January 2019) Delighted to know our “Sprint” project proposal – “Building the Knowledge Graph for UK Healthcare Data Science” has been awarded by HDR UK as part of the Digital Innovation Hub Programme and is one of the ten innovative data solutions to prove the potential of health data to transform lives.
(19 October 2018) Thrilled to be invited to give a talk about our CogStack EHR platform at China’s National Centre for Cardiovascular Diseases. Great to learn about their excellent infrastructure, research and datasets, and their grand vision! Look forward to the first CogStack deployment in China for supporting EHR-based research.
(15 February 2018) Proudly beginning an MRC/Rutherford Fund Fellowship of HDR UK, hosted by the Centre for Medical Informatics at the University of Edinburgh. My research focuses on “Deriving an actionable patient phenome from healthcare data”.
(10 February 2018) An application paper describing our SemEHR toolkit has been accepted by JAMIA, titled “SemEHR - A General-purpose Semantic Search System to Surface Semantic Data from Clinical Notes for Tailored Care, Trial Recruitment and Clinical Research”.
(27 November 2017) Our work on using knowledge graph techniques to predict adverse drug reactions has been published in Scientific Reports.
(5 July 2017) Our data harmonisation and search toolkit for EHR – CogStack – is mentioned in the Annual Report of the Chief Medical Officer 2016 by the UK Government.