Unlocking the Full Story: The Power of Clinical Notes in Real-World Data 

The year is 2025, more than fifteen years since the enactment of the HITECH Act and Meaningful Use. Almost all of the clinical data recorded from ordinary Americans’ physician office visits and hospital stays has now shifted to electronic format. As a result, increasing emphasis is being placed on the value of real-world data, with the hope that medical knowledge leading to care improvements can be extracted from the vast amount of information that exists in electronic health records.

Structured vs. Unstructured Clinical Data 

This electronic clinical data can be subdivided into two categories: structured and unstructured. Examples of structured data include demographic information, diagnoses and procedures (in the form of clinical codes), medication prescription information, insurance records, and vital signs, while examples of unstructured data include free-text clinical narratives and imaging and test reports.

Both types of clinical information are important and perform complementary functions in real-world data. Structured data contains many basic data elements and is traditionally easier to process because of its tabular nature. However, unstructured data has been estimated to comprise 80% of clinical data by volume, and it often provides insights that are absent from structured clinical data and claims data [1]. There is an old saying among medical professionals that “90% of diagnoses can be made using the patient history, and 10% using the physical exam” [2]. Notably, both elements are virtually absent from structured EHR data.

What are some of the details captured in the well-written clinical note that are typically excluded from structured EHR data?

Information Extraction from Unstructured Data in an LLM World

A well-written clinical note contains many details about the patient that are absent from both structured tabular data and claims data. Until just a couple of years ago, the challenge was extracting information from a clinical note into a usable format. However, with the advent of large language models (LLMs), one can present a note as context and ask an LLM questions about it, such as “Where is the location of this patient’s pain?” or “Why did the patient discontinue lisinopril?” Adapting this method so that each answer is restricted to a fixed set of categories turns the note’s contents into structured categorical data, which can then be analyzed alongside other structured data.
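As a rough illustration of this question-to-category idea, the Python sketch below builds a prompt from a note and a small schema, then validates the model's answer against the allowed categories. Everything named here is an assumption made for illustration: the schema fields, the category lists, the `extract` and `build_prompt` helpers, and the `ask_llm` callable, which stands in for whatever LLM client is actually used. It is not a description of how any particular product performs extraction.

```python
import json
from typing import Callable, Dict

FALLBACK = "unknown"  # used when the model's answer is missing or off-list

# Hypothetical schema: each output column has a question and a closed
# set of categories the model is allowed to answer with.
SCHEMA: Dict[str, dict] = {
    "pain_location": {
        "question": "Where is the location of this patient's pain?",
        "allowed": ["chest", "abdomen", "back", "head", "other", FALLBACK],
    },
    "lisinopril_stop_reason": {
        "question": "Why did the patient discontinue lisinopril?",
        "allowed": ["cough", "angioedema", "hypotension", "other", FALLBACK],
    },
}


def build_prompt(note: str, schema: Dict[str, dict]) -> str:
    """Present the note as context and ask one question per schema field."""
    lines = [
        "Answer using only the clinical note below.",
        "Return a JSON object with exactly these keys:",
    ]
    for field, spec in schema.items():
        options = ", ".join(spec["allowed"])
        lines.append(f"- {field}: {spec['question']} Answer with one of: {options}.")
    lines += ["", "Clinical note:", note]
    return "\n".join(lines)


def extract(note: str, ask_llm: Callable[[str], str],
            schema: Dict[str, dict] = SCHEMA) -> Dict[str, str]:
    """Return one structured row; anything unparseable or off-list becomes FALLBACK."""
    raw = ask_llm(build_prompt(note, schema))
    try:
        answers = json.loads(raw)
    except json.JSONDecodeError:
        answers = {}
    if not isinstance(answers, dict):
        answers = {}
    row = {}
    for field, spec in schema.items():
        value = answers.get(field)
        row[field] = value if value in spec["allowed"] else FALLBACK
    return row


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call, so the sketch runs offline."""
    return json.dumps({"pain_location": "back", "lisinopril_stop_reason": "dry cough"})


if __name__ == "__main__":
    note = "Pt reports low back pain. Lisinopril stopped after persistent dry cough."
    print(extract(note, fake_llm))
```

In this sketch the stub answers "dry cough", which is not one of the allowed categories, so that field falls back to "unknown" rather than letting free text leak into a categorical column. Whether to fall back, retry, or map synonyms is a design choice that would depend on the use case.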

OMNY Notes: A First-of-its-Kind Clinical Notes Data Product  

OMNY Notes is one of our exciting new data products, making billions of de-identified clinical notes from diverse health systems available to the end user. Researchers no longer have to rely solely on structured EHR and claims data; they can now view the full patient journey with our HIPAA-compliant, de-identified, linked structured EHR, claims, and notes solutions representing more than 75M individuals. No other solution available today provides the combined depth, breadth, and scale of OMNY structured and unstructured data to support improving the quality, safety, and efficiency of healthcare delivery and overall public health.

Contact us at info@omnyhealth.com to learn more about our OMNY Notes product and our other data products: 

  • OMNY Foundation 
  • OMNY Linked Claims 
  • OMNY MedTech

References: 

    [1] https://healthtechmagazine.net/article/2023/05/structured-vs-unstructured-data-in-healthcare-perfcon.

    [2] Tsukamoto, Tomoko, et al. “The contribution of the medical history for the diagnosis of simulated cases by medical students.” International Journal of Medical Education 3 (2012).
