The secondary use of Electronic Health Records (EHRs) has the potential to transform how healthcare research is conducted, opening new possibilities in business intelligence, observational research, clinical trial recruitment and decision support.
However, as much as 80% of the data in the EHR are known to be locked in the form of unstructured text, making this information 'invisible' to standard analysis techniques.
This two-day course will provide an introduction to the field of clinical natural language processing (NLP), from its origins before the advent of 'Big Data' to the current state of the art, in which information extraction algorithms process millions of documents on supercomputer hardware.
Through a series of talks and practical sessions, the course will also convey an appreciation of the complexity that different NLP problems pose.
For individuals wishing to participate fully in practical sessions, some basic programming experience, ideally with Java or Python, is recommended.
Planned Timetable
Day 1
Time | Session Title | Lead Tutor |
---|---|---|
09:30-10:00 | Introduction | Angus Roberts |
10:00-11:00 | Practical session: Introducing GATE Developer | Angus Roberts |
11:00-11:15 | Coffee | |
11:15-11:45 | Practical session: Information Extraction with ANNIE | Angus Roberts |
11:45-12:15 | Group discussion: Issues when building NLP IE applications | Angus Roberts |
12:15-12:45 | Practical session: Simple information extraction with pattern matching | Angus Roberts |
12:45-13:45 | Lunch | |
13:45-15:15 | Practical session (continued): Simple information extraction with pattern matching | Angus Roberts |
15:15-15:30 | Coffee | |
15:30-17:00 | Practical session: A medications example using pattern matching | Angus Roberts |
17:00 | Close | |
Day 2
Time | Session Title | Lead Tutor |
---|---|---|
09:00-10:30 | Practical session (continued): A medications example using pattern matching | Angus Roberts |
10:30-10:45 | Coffee | |
10:45-11:30 | Group discussion: Issues when building NLP IE applications - validation | Angus Roberts |
11:30-12:15 | Machine Learning for NLP: Introduction | Angus Roberts |
12:15-13:15 | Lunch | |
13:15-14:45 | Practical session: Supervised Machine Learning - classification | Angus Roberts |
14:45-15:15 | Coffee | |
15:15-17:00 | Practical session: Supervised Machine Learning - chunking | Angus Roberts |
17:00 | Close | |
Course Team
- Dr Angus Roberts (Lead tutor)
Angus is a Senior Research Fellow in The University of Sheffield's Natural Language Processing group in the Department of Computer Science. His main research interests are: extraction of meaning from biomedical texts, such as medical records and medical research papers; and text mining software and infrastructure.
He is a member of the GATE team, for which he leads life-science-related work. GATE is a widely used software platform and framework for large-scale text mining and language engineering. It is used in the life sciences by medical record software companies, pharmaceutical companies, genetics researchers, and many others.
Angus originally trained and worked as a Biomedical Scientist, before working as a software developer and development manager, mainly in the UK National Health Service. This led to an interest in medical terminologies, ontologies, and the language of medical text.