Applications of Natural Language Processing in Cardiology using Text Clinical Data: A Systematic Review
Keywords:
Heart Failure, Natural Language Processing, Diagnosis, Prognosis, Text data.
Abstract
The low survival rate in heart failure (HF) is attributed to under-diagnosis, owing to the lack of a diagnostic reference standard. HF is often poorly documented in administrative databases because of inconsistent use of diagnostic codes and inter-examiner variability. Most electronic health record (EHR) databases can export certain patient characteristics, such as demographics and laboratory results, in a structured, analysis-friendly format. However, much clinical data is stored as unstructured text. Using unstructured clinical text data can substantially enhance both decision-making and clinical research. Manual extraction of unstructured data is time-consuming and costly; natural language processing (NLP) algorithms that extract and classify data automatically could therefore improve the efficiency and accuracy of the process.
This review aimed to highlight the literature addressing the application of NLP to the analysis of clinical text data related to the diagnosis and prognosis of cardiovascular diseases. A multiple-term search strategy in PubMed yielded 53 studies, of which only 20 used NLP techniques to handle text data related to HF. The included studies used NLP for different clinical purposes, such as clinical feature extraction, HF classification, and prediction of various HF outcomes. Many studies achieved early detection of HF symptoms, and in some a median of 6 months elapsed between the reporting of a symptom and the clinical diagnosis. Beyond symptoms, NLP techniques successfully identified characteristics of self-management, social determinants, and home care. Ejection fraction in clinical notes was used mainly to determine the type and severity of HF and was associated with very good performance of NLP classifiers. Semi-structured clinical data, such as radiological reports, were usually associated with better performance than unstructured data, such as nursing notes. However, combinations of different data types, particularly those supported by expert knowledge, showed promising results in HF diagnosis or prognosis. In the future, NLP techniques could reduce the underestimation of HF, particularly when computer-extracted features and expert-optimized concepts are combined.