So what generic questions can we answer in gastroenterology?

The answer really applies to most areas of medicine. Many fields share the same practical, day-to-day questions. Although we may focus on specific questions for a specific field, the structure of many questions in medicine is the same.

Gastroenterology guidelines often present evidence-based aims for gastroenterologists to achieve in order to provide the best possible patient care. Many of the older guidelines did not specify deliverable metrics to audit against, which is a key part of governance. The more recent guidelines, particularly in endoscopy, do specify key performance indicators, and auditing them is therefore becoming increasingly mandatory in specific areas. An excellent example is the Global Rating Scale, a truly triumphant series of endoscopic performance indicators, drawn from both the procedure and the resulting pathology, that has undoubtedly improved endoscopists’ performance. Its method of implementation is quite rightly being used in other areas such as upper GI endoscopy.

In the end, someone has to dig out the data and make sense of it to prove that we are hitting the indicators, or to find out who isn’t and why. Some of the indicators require datasets from different software, or even from the same software, to be integrated somehow. Therein lies the problem. Gastroenterologists are not data scientists, and so the merging of data is an issue. The most common datasets are illustrated in the diagram.

You will see how the merge between endoscopy and pathology datasets is usually straightforward (apart from the date, as the pathology report date, or even the sample receipt date, is usually a few days after the endoscopy). The relationship is one to one (one endoscopy results in one pathology report, possibly with several specimens, but one report nevertheless). The problem datasets are those where the relationship is one to many, such as a patient undergoing several endoscopies or, even worse, the complications for a patient after an endoscopy. The integration of datasets needs a strategy, and this is something that still needs to be properly sorted out. We can give it a go here though…
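To make the date problem concrete, here is a minimal sketch of one possible merge strategy, using only the Python standard library. It pairs each endoscopy with the earliest unused pathology report for the same patient dated within a few days after the procedure. The field names (`patient_id`, `endoscopy_date`, `report_date`, `diagnosis`), the example records and the seven-day window are all assumptions made for illustration, not a description of any particular reporting system.

```python
from datetime import date

# Hypothetical records: field names and values are illustrative only.
endoscopies = [
    {"patient_id": "P1", "endoscopy_date": date(2020, 1, 10), "procedure": "colonoscopy"},
    {"patient_id": "P1", "endoscopy_date": date(2020, 6, 2), "procedure": "colonoscopy"},
    {"patient_id": "P2", "endoscopy_date": date(2020, 3, 5), "procedure": "gastroscopy"},
]
pathology = [
    {"patient_id": "P1", "report_date": date(2020, 1, 13), "diagnosis": "tubular adenoma"},
    {"patient_id": "P2", "report_date": date(2020, 3, 9), "diagnosis": "normal"},
]


def merge_endoscopy_pathology(endoscopies, pathology, max_lag_days=7):
    """Pair each endoscopy with at most one pathology report (one to one).

    A report matches when it belongs to the same patient and is dated
    between 0 and max_lag_days after the endoscopy. The window size is
    an assumption and should be tuned to local reporting times.
    """
    merged = []
    used = set()  # indices of reports already paired, so no report is reused
    ordered = sorted(endoscopies, key=lambda r: (r["patient_id"], r["endoscopy_date"]))
    for endo in ordered:
        best = None
        for i, report in enumerate(pathology):
            if i in used or report["patient_id"] != endo["patient_id"]:
                continue
            lag = (report["report_date"] - endo["endoscopy_date"]).days
            if 0 <= lag <= max_lag_days:
                # Prefer the earliest qualifying report.
                if best is None or report["report_date"] < pathology[best]["report_date"]:
                    best = i
        if best is not None:
            used.add(best)
        diagnosis = pathology[best]["diagnosis"] if best is not None else None
        merged.append({**endo, "diagnosis": diagnosis})
    return merged


merged = merge_endoscopy_pathology(endoscopies, pathology)
for row in merged:
    print(row["patient_id"], row["endoscopy_date"], row["diagnosis"])
```

Endoscopies with no report inside the window keep a `diagnosis` of `None`, which is itself useful: it flags procedures where no biopsy was taken or where the report could not be matched.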

Based on observation and experience I think the main questions are:

1. Analysis of patient flow

If we understand that patients flow through a system, and that the way to characterise that flow is to organise the data by the patient’s unique identifier and the episode date as a combined index, then a huge number of questions can be answered under this generic question. For example, at a population level, how many patients will need further colonoscopy surveillance in 5 years, based on the procedures already performed? At an individual patient level, perhaps you need to know who is due further endoscopic follow-up based on their previous procedure, or which patients have been lost to follow-up. Once the data is prepared in a way that satisfies surveillance tasks, the same data structure can be used for many other questions.
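As a rough illustration of the combined index, the sketch below sorts episodes by patient identifier and episode date, keeps the latest episode per patient, and works out who is overdue for follow-up. The `follow_up_years` field, the example records and the fixed reference date are assumptions for the example; in practice the interval would have to be derived from the report or from the relevant surveillance guideline.

```python
from datetime import date

# Hypothetical episodes: one row per procedure, with an assumed
# follow_up_years field recording the planned surveillance interval.
episodes = [
    {"patient_id": "P1", "episode_date": date(2015, 4, 1), "follow_up_years": 3},
    {"patient_id": "P1", "episode_date": date(2018, 5, 20), "follow_up_years": 5},
    {"patient_id": "P2", "episode_date": date(2014, 2, 11), "follow_up_years": 5},
    {"patient_id": "P3", "episode_date": date(2019, 9, 30), "follow_up_years": None},
]


def add_years(d, years):
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def latest_episode_per_patient(episodes):
    """Order by (patient_id, episode_date) and keep the last episode per patient."""
    ordered = sorted(episodes, key=lambda e: (e["patient_id"], e["episode_date"]))
    latest = {}
    for episode in ordered:
        latest[episode["patient_id"]] = episode  # later episodes overwrite earlier ones
    return latest


def surveillance_status(episodes, today):
    """Label each patient by whether their next planned procedure is overdue."""
    status = {}
    for patient_id, episode in latest_episode_per_patient(episodes).items():
        if episode["follow_up_years"] is None:
            status[patient_id] = ("no follow-up planned", None)
            continue
        due = add_years(episode["episode_date"], episode["follow_up_years"])
        label = "overdue" if due < today else "due later"
        status[patient_id] = (label, due)
    return status


# A fixed reference date keeps the example reproducible.
status = surveillance_status(episodes, today=date(2021, 1, 1))
for patient_id, (label, due) in sorted(status.items()):
    print(patient_id, label, due)
```

Patients labelled `overdue` with no later episode are candidates for having been lost to follow-up, and counting due dates by year gives a first estimate of future surveillance demand.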

2. Diagnostic yields

This relates to commonly held performance measures such as the adenoma detection rate or the pathological diagnoses per indication.
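As an example of a yield calculation, the adenoma detection rate is usually taken as the proportion of colonoscopies in which at least one adenoma was found. The sketch below computes it per endoscopist from already merged endoscopy and pathology rows. The `endoscopist` and `adenoma_found` fields and the records themselves are invented for illustration, and a real audit would also need to decide which procedures are eligible for the denominator.

```python
from collections import defaultdict

# Hypothetical merged rows: one per colonoscopy, with a boolean flag
# derived from the pathology report.
colonoscopies = [
    {"endoscopist": "A", "adenoma_found": True},
    {"endoscopist": "A", "adenoma_found": False},
    {"endoscopist": "A", "adenoma_found": True},
    {"endoscopist": "A", "adenoma_found": False},
    {"endoscopist": "B", "adenoma_found": False},
    {"endoscopist": "B", "adenoma_found": True},
    {"endoscopist": "B", "adenoma_found": False},
    {"endoscopist": "B", "adenoma_found": False},
]


def adenoma_detection_rate(records):
    """Return, per endoscopist, the fraction of procedures with an adenoma found."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for record in records:
        totals[record["endoscopist"]] += 1
        if record["adenoma_found"]:
            positives[record["endoscopist"]] += 1
    return {name: positives[name] / totals[name] for name in totals}


rates = adenoma_detection_rate(colonoscopies)
for name, rate in sorted(rates.items()):
    print(f"Endoscopist {name}: {rate:.0%}")
```

The same grouping pattern works for diagnoses per indication: replace the endoscopist with the indication as the grouping key and the adenoma flag with the diagnosis of interest.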

3. Analysis of quality

This relates to measures such as the adequacy of sampling and documentation, and adherence to minimum standards (including lesion recognition performance).

Having structured our datasets as per the previous pages, we can extract this easily as follows: