Table 5 Summary of issues observed and potential solutions

From: Design matters in patient-level prediction: evaluation of a cohort vs. case-control design when developing predictive models in observational healthcare datasets

| Issue | Observed in study: cohort | Observed in study: case-control | Solution |
| --- | --- | --- | --- |
| Subjective methodology choices | No | Yes: the case-control designs used different matching criteria | Use a cohort design |
| Selection bias | NA | Did not appear to be a problem in the two predictions investigated | NA |
| Covariate issue | NA | (1) Symptoms appeared in the diabetes model but didn’t impact performance. (2) The dementia model was unable to include variables used to match controls. | Use covariates to stratify patients and develop separate models |
| Performance metric bias | Yes: due to temporal changes, the internal validation was slightly optimistic | Yes: due to incorrect matching ratios and a potentially non-generalizable development population, the internal validation was very optimistic | Perform external validation with a cohort design to fairly assess performance; train models on more recent data; recalibrate if necessary |
| Miscalibration | Some: due to temporal changes, the risk was under-estimated | Yes: due to incorrect matching ratios, the risk was over-estimated in both examples | As for performance metric bias |
| Ill-defined time to apply model | NA | Not a problem for the two predictions investigated: the models appeared to perform reasonably when applied at the validation index event (even though they were not developed using this index) | NA |
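
The table attributes the over-estimated risk in the case-control models to incorrect matching ratios and lists recalibration as a possible remedy, without naming a method. As one illustration only, the sketch below applies a simple intercept shift on the logit scale so that predictions reflect the outcome rate of the target cohort rather than the artificial rate created by matching. The function name `recalibrate_for_prevalence`, its parameters, and the example numbers are hypothetical and are not taken from the study. The adjustment assumes that sampling depended only on outcome status, which matched designs may not satisfy exactly.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Inverse of logit: map log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def recalibrate_for_prevalence(
    p_model: float, dev_prevalence: float, target_prevalence: float
) -> float:
    """Shift a predicted risk from the development outcome rate to the target rate.

    Hypothetical helper: an intercept-only correction, assuming the
    development sample differs from the target population only in the
    ratio of outcome to non-outcome patients.
    """
    # Difference in baseline log-odds introduced by the sampling ratio.
    offset = logit(dev_prevalence) - logit(target_prevalence)
    return inv_logit(logit(p_model) - offset)


# Illustrative numbers only: 1:1 matching gives a 50% outcome rate in the
# development data, while the target cohort is assumed to have a 5% rate.
adjusted = recalibrate_for_prevalence(0.5, dev_prevalence=0.5, target_prevalence=0.05)
unchanged = recalibrate_for_prevalence(0.2, dev_prevalence=0.1, target_prevalence=0.1)
```

A prediction equal to the development outcome rate maps to the target outcome rate, and nothing changes when the two rates agree. This shift corrects only the average risk level; it does not address the temporal drift or generalizability issues that the table also lists, which is why external validation in a cohort design remains the recommended check.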