Lesson Learned - Cruising the Ocean of Unstructured Case Notes

There are various approaches to dealing with the growing long-term care needs of the UK's ageing population. One in eight adults is a carer, and every day another 6,000 people take on caring responsibilities. Society relies heavily on the services of unpaid informal carers: each one saves the UK economy over £19,000 per annum (£132 billion in total), preventing thousands of people from needing social care services.

The significant demands of caring mean that 600 people give up work every day to care for an older or disabled relative. People providing high levels of care are twice as likely to be permanently sick or disabled and 61% said they had suffered physical ill health as a result of caring. Eight in 10 people caring for loved ones say they have felt lonely or socially isolated, and 72% of carers said they had suffered mental ill health as a result of caring.

Anecdotally, we know that the “Carer” label can be problematic: many people do not choose to identify as a carer and/or turn down carers' assessments. This is currently a known data gap. Knowing more about who these people are, what their relationship is to the cared-for person, and the sorts of tasks they help with allows us to take an evidence-based approach when designing services intended to reach and support them.

Identifying informal carers from unstructured case notes is not trivial, but despite the challenges it offers promising rewards and opportunities for ECC to provide support. Some of the challenges stem from the unstructured and inconsistent way the case notes are written (if there’s anything I would like to change in this world, it’s the way these records are kept).

The overarching goal was to pick out informal carers from a pool of text data containing over 300K observations. Quite overwhelming, ‘innit’? But my NLP (natural language processing) skills say otherwise! Before training our NLP model for the first time, the case note text was processed and cleaned. Model training in this context involved building a corpus, a subset of the main case notes, and making sure that corpus contained some indicators of informal care. Fast forward: once the model had been trained on the corpus, the testing phase involved applying the function/model to our main case notes and flagging each service user predicted to be receiving support from an informal carer.
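To make the clean-then-flag steps a little more concrete, here is a minimal sketch in Python. It is not the model used in this project: it swaps in plain pattern matching as a stand-in, and the indicator phrases, the function names (`clean_note`, `flag_informal_carer`, `flag_service_users`) and the `(service_user_id, note_text)` input shape are all assumptions made up for illustration.

```python
import re

# Hypothetical indicator phrases: the real indicators used in the project
# are not listed in this post, so these are illustrative assumptions only.
INDICATOR_PATTERNS = [
    r"\b(?:daughter|son|wife|husband|partner|friend|neighbour)\s+"
    r"(?:helps|supports|assists|cares for)\b",
    r"\binformal carer\b",
    r"\bunpaid carer\b",
]
INDICATOR_RE = re.compile("|".join(INDICATOR_PATTERNS))


def clean_note(text):
    """Lowercase a case note, drop non-letters and collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def flag_informal_carer(note):
    """Return True if the cleaned note contains any indicator phrase."""
    return INDICATOR_RE.search(clean_note(note)) is not None


def flag_service_users(notes):
    """Return the set of service user IDs with at least one flagged note.

    `notes` is an iterable of (service_user_id, note_text) pairs
    (an assumed input shape, not the project's actual data model).
    """
    return {user_id for user_id, text in notes if flag_informal_carer(text)}


example_notes = [
    ("A1", "Daughter   helps with shopping & meals."),
    ("B2", "Home visit completed; no concerns."),
    ("A1", "Routine review."),
]
flagged = flag_service_users(example_notes)  # {"A1"}
```

A pattern list like this is easy to audit and extend, but it will miss phrasing it has never seen, which is one reason a trained model may be preferred over hand-written rules.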


Using this brute-force approach, another 351 ‘uncounted informal carers’ were identified in the current cohort, which equates to 20% of all carers we formally assess. With 33% of carers identified as women and 44% as men, further insights confirmed that men are less likely than women to identify as an informal carer or engage in the service role. Moreover, non-family members, such as friends and neighbours, were grossly under-reported as providing informal care at one point or another.

In the course of this project, I have deepened my appreciation for NLP as one of the most exciting fields of artificial intelligence, one that enables computers to understand and process human language. Beyond unstructured case notes, NLP can also help with feedback analysis, predicting IT incident tickets, and analysing public comments and GP appointment text data. For any complex free-text problem at all, believe me when I say NLP is your go-to guy!
