What do patient experience, genome-wide association studies and randomised controlled trials have in common? A blog about p-values

As far as I understand, genome-wide association studies (GWAS) have been a very successful approach to examining huge numbers of possible associations between genes and different medical problems. The approach is broadly “hypothesis free”: there is no specific prior reason to think that any single genetic change, out of the millions considered, might be associated. This creates a problem of multiple testing, which has been dealt with analytically by comparing the distribution of the p-values obtained with the distribution that would be expected were there no associations.
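To make that idea concrete, here is a small simulation sketch (my own illustration, not part of the GWAS literature): when the null hypothesis is true and the test is valid, p-values are uniformly distributed on [0, 1], so the observed distribution can be checked against that benchmark.

```python
# Simulate many two-sample t-tests where the null is true (no real
# difference between groups) and check that the resulting p-values
# look like draws from Uniform(0, 1).
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_tests = 2000

pvals = np.empty(n_tests)
for i in range(n_tests):
    a = rng.normal(0, 1, 50)   # group A: no true effect
    b = rng.normal(0, 1, 50)   # group B: same distribution
    pvals[i] = stats.ttest_ind(a, b).pvalue

# Kolmogorov-Smirnov comparison against Uniform(0, 1)
ks = stats.kstest(pvals, "uniform")
print(f"KS statistic vs Uniform(0, 1): {ks.statistic:.3f}")
print(f"Fraction of p-values below 0.05: {np.mean(pvals < 0.05):.3f}")
```

With the assumptions met, roughly 5% of null p-values fall below 0.05 and the KS statistic stays small, which is exactly the benchmark the GWAS comparison relies on.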

Which is actually more or less what we did in our study looking at the association between patient experience and having been treated in a hospital in London compared with the rest of England.

Martin Bland points out that the approach has also been proposed for examining the distribution of p-values from comparisons of baseline characteristics between arms in randomised controlled trials, to check that the randomisation has been adequate. But he also points out that the uniform distribution expected if there were no associations (as there would be with adequate randomisation) may not actually hold in some (fairly extreme) cases where the test assumptions are not met. Back to genetics again: where the underlying models are mis-specified the approach also falls down – it’s not something to do without a bit of thought.
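One made-up illustration of how broken assumptions distort the null distribution (not one of Bland's examples): with a discrete outcome and small samples, only a handful of p-values are even attainable, so the null distribution cannot possibly be Uniform(0, 1).

```python
# Fisher's exact test on small 2x2 tables where the null is true:
# the attainable p-values form a small discrete set, and fewer than
# 5% of them fall below 0.05 (the test is conservative).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
pvals = []
for _ in range(2000):
    # two groups of 10, same true success probability (null is true)
    a = rng.binomial(1, 0.5, 10).sum()
    b = rng.binomial(1, 0.5, 10).sum()
    table = [[a, 10 - a], [b, 10 - b]]
    pvals.append(stats.fisher_exact(table)[1])

pvals = np.array(pvals)
print(f"Distinct attainable p-values: {len(set(np.round(pvals, 6)))}")
print(f"Fraction below 0.05: {np.mean(pvals < 0.05):.3f}")
```

A plot of these p-values against a uniform distribution would flag a "problem" even though the null is true and the test is perfectly valid: the problem is with the uniformity assumption, not the data.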

So what should we do if we want to use a patient experience survey, where several different patient experience outcomes are often measured and we are interested in associations with each of them? Well, one approach argues that each test is a separate, valid hypothesis test and that each p-value should be considered independently. With patient experience we have specific questions we want to ask, in contrast to the hypothesis-free GWAS situation.

Using a Bonferroni correction would be another approach (which is what the team from CCHSR did when looking at 300+ correlations between clinical quality and patient experience). Caution may still be needed for large numbers of tests, as the expected uniform distribution of p-values under the null hypothesis may also break down at the real extremes of the distribution (vanishingly small p). But this is maybe more of an issue for the millions of tests in GWAS than for the 70 or so questions in a patient experience survey.
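The arithmetic of a Bonferroni correction is simple enough to sketch (with illustrative p-values, not the CCHSR figures): each p-value is compared against the significance level divided by the number of tests.

```python
# Bonferroni correction: with m tests at overall level alpha,
# compare each p-value against alpha / m rather than alpha.
m = 300                      # number of correlations tested
alpha = 0.05
threshold = alpha / m        # Bonferroni-adjusted threshold

example_pvals = [0.00001, 0.0004, 0.004, 0.03]  # made-up values
for p in example_pvals:
    verdict = "significant" if p < threshold else "not significant"
    print(f"p = {p:<8} -> {verdict} at threshold {threshold:.5f}")
```

With 300 tests the per-test threshold drops to about 0.00017, so even a p-value of 0.0004, comfortably "significant" on its own, fails the corrected test.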

We also know that this correction is very conservative, particularly when the tests are not strictly independent. And different domains of patient experience are certainly not independent of each other.

So was plotting the distribution of p-values and comparing it with the uniform distribution expected under the null hypothesis a reasonable approach for us to take? I think yes. We performed about 60 tests, with appropriate analysis, in large samples, where the assumptions of the tests were met. We had fairly strong priors for all of the tests, and most of them were significant at p<0.05. We reported all results, not just the significant ones. Plotting the p-values and comparing them with a uniform distribution was a simple way to illustrate that something really probably was going on, and that we hadn’t just reported a few significant results from a random trawl through multiple tests.
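The kind of comparison described above can be sketched as follows, using made-up p-values rather than the study's actual results: when most tests reflect real effects, the distribution of p-values is shifted towards zero relative to Uniform(0, 1), and a goodness-of-fit test (or simply a plot) makes that shift visible.

```python
# Compare a set of hypothetical "observed" p-values against the
# Uniform(0, 1) distribution expected under the global null.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# ~60 hypothetical p-values, mostly small, as when effects are real;
# a Beta(0.3, 3) draw stands in for the study's actual results.
observed = rng.beta(0.3, 3.0, size=60)

ks = stats.kstest(observed, "uniform")
print(f"KS p-value against Uniform(0, 1): {ks.pvalue:.2e}")
print(f"Fraction below 0.05: {np.mean(observed < 0.05):.2f}")
```

The departure from uniformity is overwhelming here, which is the pattern one would hope to see when arguing that the significant results are not just the lucky tail of a random trawl.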

  • The Cambridge Centre for Health Services Research (CCHSR) is a thriving collaboration between the University of Cambridge and RAND Europe. We aim to inform health policy and practice by conducting research and evaluation studies of organisation and delivery of healthcare, including safety, effectiveness, efficiency and patient experience.