Reasons why the public interest strongly favours disclosure of the
requested data:
(FOI 2014/F73) 18th June 2014
Assertions about complete recovery and/or functional remission from
any chronic debilitating illness with a poor prognosis that is
regarded as difficult to treat should be taken seriously and be
based on reasonably stringent definitions. However, a recent 2014
systematic review of studies on recovery from CFS (including the
PACE trial) concluded that in general what the literature defined
as 'recovery' is better described as modest clinical improvement
only. There was no guarantee of 'recovery' per se, as
classification was based on limited assessments, on less than a full
restoration of health, and on self-reports without objective measures
of function. Where objective measures were used in behavioural
intervention studies, they suggested no changes (prompting the
authors to conclude that these therapies for CFS were not
rehabilitative as often claimed). [1] An
earlier 2012 systematic review concluded that a "comprehensive
rehabilitation programme only rarely results in full recovery". [2]
The PACE trial has been repeatedly presented as offering
'definitive' answers on the controversial issue of 'rehabilitative'
treatments for CFS (e.g. in the official website FAQ, [3] in the
trial statistical analysis plan, [4] by the Science Media Centre,
[5] and on ABC radio [6]). The published definition of
recovery/remission was presented by the principal investigators
White et al. as 'comprehensive and conservative' and purported to
use stricter thresholds than a previous study on recovery from CFS
by Knoop et al. published in 2007. [7] However, multiple
significant issues have been identified with the recovery criteria
which strongly challenge or contradict these presentations and have
not been adequately addressed by the authors. [1,8,9] Disclosure of
the requested data will greatly help the resolution of these
issues.
The thresholds used for the 'normal range' score of fatigue and
physical function inappropriately overlapped with the trial
eligibility criteria for 'severe fatigue' and 'significant
disability'. The recovery definition allowed participants to be
classified as 'recovered' without reporting clinically significant
improvements in fatigue and physical function: such improvements
were not required, and the definition even allowed a 5 point decline
in physical function. No longer meeting Oxford criteria for CFS in
the trial
did not necessarily mean no longer meeting Oxford criteria or
suffering from CFS in the clinic, because additional criteria for
fatigue and physical function were required, and participants were
classified as 'no longer meeting Oxford criteria' if they failed to
meet a single one of these thresholds e.g. moving from a score of
65 to 70 points in physical function but remaining unwell. 11% of
excluded candidates failed to meet these additional criteria
despite otherwise meeting Oxford criteria. The Oxford criteria
themselves also require fatigue to be the only principal symptom,
which is not a requirement of any other CFS case definition, [10]
and 80% of candidates who were definitely or provisionally diagnosed
with CFS before the trial were excluded from participation, with the
most common reason being not meeting Oxford criteria for CFS.
Improvement on the clinical global impression scale does not
guarantee a recovery from CFS or any improvement in the primary
outcome measures of fatigue and physical function. The optional
requirements of not meeting CDC criteria for CFS or London criteria
for ME were superfluous because these were not an entry
requirement, tend to be more difficult to meet than the Oxford
criteria in the first place, and were not applied properly in the
trial. [7,11]
The relevant trial oversight bodies approved the original 2007
protocol published in BioMed Central, which included a much more
stringent definition of clinically significant improvement
('positive outcome') and complete 'recovery'. [12] According to
BioMed Central, "publishing study protocols will help to improve
the standard of medical research by ... enabling readers to compare
what was originally intended with what was actually done, thus
preventing both 'data dredging' and post-hoc revisions of study
aims". [13] The purpose of pre-publishing a protocol is to avoid
accusations of cherry picking the results, but when the protocol is
ignored this clearly cannot be guaranteed. The thresholds for
clinical improvement on an individual patient level for the primary
measures of fatigue and physical function were abandoned and
replaced with weaker thresholds which have been criticized for
being minimal. [14,15] Similarly, all components of the recovery
definition were significantly modified in a manner which made them
substantially less stringent and easier to meet. Of particular
note, the threshold for normal physical function was dropped from
85 to 60 out of 100 points, a score low enough that 13% of
participants were already within the 'normal range' at baseline
despite meeting trial eligibility criteria for 'significant
disability' (65 points or less). [16] In contrast, participants
originally had to improve by a minimum of 20 to 25 points in physical
function to be classified as recovered. Other researchers of CBT
for CFS have even classified a score of 60 to 70 points as
indicative of 'severe' impairments in physical function. [17,18]
Professor White previously requested that the threshold for a
'positive outcome' in physical function (later abandoned) be raised
from 70 to 75 points, because the entry criteria had been raised
from 60 to 65 to increase recruitment, so a 10 point gap between
entry criteria and 'positive outcome' scores was needed to avoid a
'trivial' difference. [19,20] Now there is a 5 point gap in the
opposite direction, which cannot be described as a strict or
'conservative' threshold. Although it has been argued that
protocols can change in light of new information, it is unclear how
any of these changes could "more accurately reflect recovery" as
asserted in the paper by White et al. [7] Furthermore, as the
changes to the definition of recovery published in 2013 appear to
be largely based on controversial post hoc analyses conducted for a
previous paper on the trial results published in 2011, [21] it is
unclear whether these major deviations from the protocol were
approved by the relevant trial oversight bodies, and this confusion
surrounding the timing of changes has reached the level of
parliamentary debate in the House of Lords. [22]
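The threshold arithmetic in the paragraph above can be checked with a short sketch. It uses only the physical function thresholds quoted in this request (entry criterion of 65 or less, original recovery threshold of 85, revised 'normal range' of 60 or more, 'positive outcome' of 75) and assumes the scale moves in 5 point steps from 0 to 100. The variable names are illustrative.

```python
# SF-36 physical function thresholds quoted in the text (0-100 scale).
ENTRY_MAX = 65              # eligibility: 'significant disability' = 65 or less
ORIGINAL_RECOVERY_MIN = 85  # protocol-defined threshold for recovery
REVISED_NORMAL_MIN = 60     # revised 'normal range' threshold
POSITIVE_OUTCOME_MIN = 75   # 'positive outcome' after the requested raise from 70

# Gap between the entry criterion and each outcome threshold.
gap_positive_outcome = POSITIVE_OUTCOME_MIN - ENTRY_MAX   # 10 points above entry
gap_revised_normal = REVISED_NORMAL_MIN - ENTRY_MAX       # -5: below the entry criterion
min_gain_original = ORIGINAL_RECOVERY_MIN - ENTRY_MAX     # 20 points from a baseline of 65

# Scores (assumed 5 point steps) that are both eligible for the trial
# and inside the revised 'normal range'.
overlap = [s for s in range(0, 101, 5) if REVISED_NORMAL_MIN <= s <= ENTRY_MAX]

print(gap_positive_outcome, gap_revised_normal, min_gain_original)  # 10 -5 20
print(overlap)  # [60, 65]
```

A participant entering at 65 and finishing at 60 would therefore have declined by 5 points and still be counted within the revised 'normal range'.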
As a previous claim made in the Lancet paper about the normative
dataset used from a population study had turned out to be incorrect,
[23] it seemed prudent to examine the justification behind the most
controversial change to the recovery criteria. White et al.
asserted that the change to the threshold of normal physical
function was justified because a score of ≥85 "would mean that
approximately half the general working age population would fall
outside the normal range". [7] However this is incorrect, as
independent analyses of the English normative dataset cited by
White et al. revealed that over half score the maximum of 100
points. The median(IQR) score for the general working age
population sample is 100(90-100) not about 85 as implied (which
suggested an erroneous assumption that the mean and median were
equivalent), and only about 18% of the general working age
population sample had a score under 85. [24] The original threshold
of ≥85 points appears to be reasonable and appropriate, as it
"represents the ability to carry out moderate activities, such as
lifting a table, carrying purchases, or bowling, without
limitations". [25] 92% of the healthy working age population score
85 points or more, and 61% score the maximum of 100 points. The
mean(SD) and median(IQR) scores for this population are 95.0(10.2)
and 100(95-100) respectively, with scores under 80 appearing to be
extreme outliers when defined as more than 3 x IQR below the lower
quartile. [24] It is highly unlikely that the PACE trial participants
classified as 'recovered' have a similar distribution of scores
compared to a healthy working age population.
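The outlier cut-off quoted above can be reproduced from the summary statistics alone. This is a minimal sketch, assuming the common convention that an 'extreme' outlier lies more than 3 x IQR beyond a quartile; the variable names are illustrative and only the median(IQR) of 100(95-100) comes from the text.

```python
# Healthy working age sample, SF-36 physical function: median(IQR) = 100(95-100).
median, lower_quartile, upper_quartile = 100, 95, 100

iqr = upper_quartile - lower_quartile        # width of the interquartile range
cutoff_from_quartile = lower_quartile - 3 * iqr
cutoff_from_median = median - 3 * iqr

print(iqr)                    # 5
print(cutoff_from_quartile)   # 80
print(cutoff_from_median)     # 85
```

Measured from the lower quartile the cut-off is 80, which matches the figure given in the text; measured from the median it would be 85. Either way, the revised threshold of 60 sits far below it.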
White et al. stated that "we derived a mean (S.D.) score of 84 (24)
for the whole sample, giving a normal range of 60 or above for
physical function" and asserted that this sample was
"demographically representative". [7] However, the 'whole sample'
was a general population which included the elderly and chronically
disabled, [26] with age and illness having a major impact on
physical function scores in a way which decreases the mean and
increases the standard deviation, therefore lowering the threshold
of 'normal'. The mean(SD) age was 48.3(19.0) years, 32% were aged
60 years or more, and 22% reported chronic debilitating illnesses
(many of which would have medically excluded candidates from
participating in the PACE trial). [24,26] In contrast, PACE trial
participants had a mean(SD) age of about 38(12) years at baseline,
only 3% were aged 60 years or more, [27] and all had previously been
screened for common chronic debilitating illnesses in the
population which would have excluded them from a CFS diagnosis.
Although described as a 'conventional' method, [23] White et al.
have applied a simple parametric statistical method to a dataset
without any apparent consideration for what the authors of the
cited paper (Bowling et al.) described as a heavily skewed
distribution, [26] which was specifically warned against in a paper
previously co-authored by Professor White [28] and has been
described elsewhere as a "fundamental misuse of statistics". [29]
Furthermore, the use of normative data from a general population
sample with important demographic differences to PACE trial
participants (age distribution and presence of common debilitating
illnesses) has never been justified in any of the publicly
available PACE trial literature. It is unclear why the authors did
not stop and think twice before using a 'recovery' threshold that
was unusually low and overlapped with their own trial criteria for
'significant disability'. A score of 60 points means reporting
significant limitations in multiple domains (somewhere between
minor limitations in 8/10 questions and major limitations in 4/10
questions), [30] which is unusual for healthy people of working age
and an unsuitable threshold for a genuine recovery. White et al.
[7] incorrectly claimed that their threshold was more
"conservative" i.e. stricter than the previous work of Knoop et al.
[28] The latter paper actually used the same mean plus or minus 1
SD formula as PACE did (not mean plus or minus 2 SD as claimed by
White et al.), and relied on a healthy population instead of a
general population to calculate a higher threshold of 80 points in
physical function as the lower bound of normal for recovery. Similarly,
serious questions have also been raised about the suitability of
the threshold for normal fatigue and the population used to derive
it. [8,31-33]
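The two calculations discussed above can be checked with a short sketch. The first part applies the 'mean minus 1 SD' formula to the two sets of summary statistics quoted in this request. Pairing the formula with the healthy working age figures (95.0, 10.2) is an illustration only, not a reconstruction of the calculation by Knoop et al., who used their own healthy population. The second part assumes the standard RAND scoring of the 10-item physical function sub-scale (each item scored 0, 50 or 100, then averaged). The function and variable names are hypothetical.

```python
def lower_bound_of_normal(mean, sd, k=1.0):
    """Lower bound of a 'normal range' defined as mean minus k standard deviations."""
    return mean - k * sd

# Mean and SD for SF-36 physical function, as quoted in the text.
general_population = (84, 24)        # whole sample used by White et al.
healthy_working_age = (95.0, 10.2)   # healthy working age sample [24]

print(round(lower_bound_of_normal(*general_population), 1))
print(round(lower_bound_of_normal(*healthy_working_age), 1))

# Assumed RAND item scoring for the physical function sub-scale.
ITEM_SCORE = {"limited a lot": 0, "limited a little": 50, "not limited": 100}

def physical_function_score(responses):
    """Average of the 10 item scores, giving a scale score from 0 to 100."""
    if len(responses) != 10:
        raise ValueError("the physical function sub-scale has 10 items")
    return sum(ITEM_SCORE[r] for r in responses) / len(responses)

# The two response patterns mentioned in the text.
minor_in_8 = ["limited a little"] * 8 + ["not limited"] * 2
major_in_4 = ["limited a lot"] * 4 + ["not limited"] * 6

print(physical_function_score(minor_in_8), physical_function_score(major_in_4))
```

Under these assumptions the general population figures give a lower bound of 60, the healthy working age figures give about 84.8, and both response patterns produce a score of 60.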
In response to the paper on 'recovery', Dr Esther Crawley from the
University of Bristol said that "Every patient with CFS/ME wants to
know how likely they are to recover." [34] Yet, many patients were
rather unsatisfied with the major deviations from the previously
established protocol, questioned the 'normal range' in particular,
and wanted to know the 'positive outcomes' and more importantly the
recovery rates as previously defined more stringently. A collection
of patient charities made a FOI request for this information in
2011, which included the results according to the original recovery
criteria, [35] but it was refused on the grounds that this
information was exempt under s.22 of the FOIA i.e. due for future
publication. [36] A similar FOI request in 2012 was refused on the
grounds that the information was not held in final form because the
definition of recovery had changed with a pending paper and there
was no intention of publishing the requested information in the
future (the refusal notice also incorrectly claimed that some of the
changes made the definition more stringent). [37] Another FOI
request in 2013 was refused on the grounds that, although the raw
data required to calculate these outcomes do exist, the calculations
would require over 18 hours of work. [38] Therefore, this FOI request
is for
selected raw data so that these calculations can be done without
QMUL.
ME/CFS is regarded as a controversial subject, but this controversy
is only further fuelled by the lack of transparency over trial
results presented as 'definitive' and the failure to publish the
measures specified in the original approved protocol. Given that
the published recovery thresholds appear to be fundamentally based
on previous post hoc analyses, coincide with less than expected
clinical improvements, and are generally at or below the level of
the trial entry criteria, it is difficult to believe that the
accusations of cherry picking or intentionally misleading
vulnerable patients and clinical commissioners (irrespective of
whether they are true or not) will simply go away without the
publication of these stricter outcomes. It is critical that
sufficient data is placed in the public domain to allow patients
and clinical commissioners to accurately assess recovery and the
sensitivity to any particular threshold.
There have been recent calls for medical research to be more
transparent, accessible, and accountable, as per the AllTrials
campaign (www.alltrials.net). Although this does not necessarily
mean unrestricted public access to all the data of a trial,
AllTrials states that "All trials past and present should be
registered, and the full methods and the results reported." The
Wellcome Trust goes a step further and calls for the full release
of all trial data. [39] The public interest in transparency around
drug trials has been well established by the European Medicines
Agency and the same principles should apply to psychotherapy and/or
behavioural interventions. [40] The PACE trial was publicly funded
research and the (anonymised) data should be openly available to
the maximum practical extent. Answers to remaining questions in
science are generally gained from further replication, but the PACE
trial cost taxpayers £5m, and due to its high cost and large size,
it is highly unlikely that another similar trial will be conducted
anytime soon. Therefore, the collected data should be explored to
the maximum extent possible. Without voluntary transparency, the
task of finding out the results as promised in the original PACE
trial protocol depends on members of the public, and the FOIA
appears to be the most plausible method for seeing this happen in the
foreseeable future.
The ongoing confusion and controversy is adding to the suffering of
patients, and getting to the bottom of this issue is important
whatever the outcome may be. The results and interpretations do not
just affect those who are curious about research, but have national
and perhaps even global ramifications. Patients with this chronic
debilitating illness, and clinical commissioners, have a right to
accurate information about treatments which are promoted to them as
rehabilitative and potentially curative. This is required for them
to assess and give informed consent for treatments, or make
informed decisions about health care. Lax definitions of
recovery/remission and clinical improvement lead those who provide
care to hold unreasonable expectations of patients. In a
similar study known as the FINE trial [41] (which released the
results according to its own published protocol and failed to show
significant improvements with therapies similar to and sharing
elements with CBT/GET tested in the PACE trial), some participants
had doubts about the (overly optimistic) treatment rationales, and
therapists reported becoming angry and blaming participants as "the
bastards don't want to get better". [42]
It is doubtful whether disclosure would actually deter future
research. Conversely, it could be counter-argued that research
candidates may feel discouraged from participating in controversial
research if previous trials have involved major, questionable, and
possibly unapproved, deviations from the pre-approved original
protocol which made it much easier for the tested therapies to
appear successful, coincided with less impressive than expected
results, and led to the results being exaggerated. For example, the
published rates of trial participants within the 'normal range' for
fatigue and physical function (which overlapped with trial
eligibility criteria for severe chronic fatigue and significant
disability) were presented in 2011 at a Lancet press conference with
the principal investigators as returning back to normal, [43] and
this was then widely misinterpreted as a complete recovery or cure
in the national news media e.g. [46,47] and medical journals e.g.
[44,45]. The Lancet editorial which accompanied the 2011 paper on
the PACE trial results inaccurately claimed that the 'normal range'
was a strict criterion for recovery based on scores from healthy
people, [48] but the Press Complaints Commission later ruled that
this comment was misleading and breached Clause 1 (Accuracy) of the
Code. [49] Such repeated misstatements of fact have negative
implications for how patients are treated by doctors, how funding
decisions are made, and for scientific accuracy concerning recovery
from ME/CFS. A poll conducted on the ME Association website during
March 2011 revealed that 89% of 751 respondents were significantly
concerned that the PACE trial results would adversely affect
treatment within the NHS. [50]
Unless the PACE group themselves promptly publish the original
protocol-defined 'positive outcomes', the original protocol-defined
'recovery' rates, and summary statistics on those classified as
recovered (both versions) compared with appropriate summary
statistics of healthy populations with a similar age distribution
as trial participants, then the disclosure of the requested data
allowing others to do the necessary calculations is certainly in
the public interest. Given that the lax definition of 'recovery'
fundamentally depends on a threshold for 'normal' physical function
which appears to be seriously flawed and inaccurately presented as
strict or conservative, with the reason for abandoning the original
protocol-defined threshold found to be erroneously based on a
misinterpretation of summary statistics from a population study,
the requested data will be important to help the public (patients,
carers, research community, healthcare staff, et cetera) further
assess the degree and nature of improvements in the PACE trial.
Please help resolve this controversy once and for all by granting
this FOI request.
References (abbreviated to save space):
1. http://www.ncbi.nlm.nih.gov/pubmed/24791...
2. http://www.ncbi.nlm.nih.gov/pubmed/22725...
3. http://www.pacetrial.org/faq/faq2.html
4. http://www.trialsjournal.com/content/14/...
5. http://www.sciencemediacentre.org/expert...
6. http://www.abc.net.au/radionational/prog...
7. http://www.ncbi.nlm.nih.gov/pmc/articles...
8. http://www.ncbi.nlm.nih.gov/pubmed/23363...
9. http://www.meassociation.org.uk/2013/07/...
10. http://www.iacfsme.org/BULLETINWINTER201...
11. http://www.ncbi.nlm.nih.gov/pubmed/21334...
12. http://www.biomedcentral.com/1471-2377/7/6
13. http://www.biomedcentral.com/authors/pro...
14. http://www.thelancet.com/journals/lancet...(11)60689-2/fulltext
15. http://www.thelancet.com/journals/lancet...(11)60685-5/fulltext
16. https://listserv.nodak.edu/cgi-bin/wa.ex...
17. http://www.ncbi.nlm.nih.gov/pubmed/22354...
18. http://eurpub.oxfordjournals.org/content...
19. https://listserv.nodak.edu/cgi-bin/wa.ex...
20. http://www.ico.org.uk/~/media/documents/...
21. http://www.ncbi.nlm.nih.gov/pmc/articles...
22. https://www.whatdotheyknow.com/request/t...
23. http://www.thelancet.com/journals/lancet...
24. http://dx.doi.org/10.5255/UKDA-SN-3660-1
25. http://www.ncbi.nlm.nih.gov/pubmed/9054791
26. http://www.ncbi.nlm.nih.gov/pubmed/10528...
27. http://www.bmj.com/content/347/bmj.f5963...
28. http://www.ncbi.nlm.nih.gov/pubmed/17426...
29. http://www.bmj.com/content/347/bmj.f5963...
30. http://www.rand.org/health/surveys_tools...
31. http://www.thelancet.com/journals/lancet...(11)60688-0/fulltext
32. http://evaluatingpace.phoenixrising.me/a...
33. http://www.meactionuk.org.uk/Normal-fati...
34. http://www.sciencemediacentre.org/expert...
35. http://www.meassociation.org.uk/?p=6171
36. http://www.meassociation.org.uk/wp-conte...
37. https://www.whatdotheyknow.com/request/p...
38. https://www.whatdotheyknow.com/request/p...
39. http://blog.wellcome.ac.uk/2013/08/05/fu...
40. http://www.ema.europa.eu
41. http://www.ncbi.nlm.nih.gov/pmc/articles...
42. http://www.ncbi.nlm.nih.gov/pmc/articles...
43. http://www.meactionuk.org.uk/pacepressco...
44. http://www.ncbi.nlm.nih.gov/pmc/articles...
45. http://www.bmj.com/content/342/bmj.d1168...
46. http://www.guardian.co.uk/society/2011/f...
47. http://www.dailymail.co.uk/health/articl...
48. http://www.thelancet.com/journals/lancet...
49. http://www.pcc.org.uk/news/index.html?ar...
50. http://www.meassociation.org.uk/archive-...
Mr Matthees left an annotation:
• If there is any doubt about the ease with which some trial participants could be disqualified from meeting Oxford criteria at follow-up (for failing any additional ad hoc criteria bolted onto the Oxford criteria for the purposes of the trial), QMUL previously stated: "A score of 70 or more on the SF-36 sub-scale would mean that the participant did not meet Oxford criteria. Similarly having a score of 5 or less on the Chalder fatigue questionnaire would mean that the participant did not meet Oxford criteria." [1] This is not usually part of the Oxford criteria and means that no longer meeting Oxford criteria at follow-up in the trial could be achieved by borderline cases who improved by a single increment on either scale but who could still be significantly affected by CFS and still otherwise meet Oxford criteria outside the trial setting. In the original trial protocol, a fatigue score of 5/11 on the CFQ (bimodal scoring) and a score of 70/100 on the physical function sub-scale of the SF-36 were still both regarded as abnormal scores, and a fatigue score of 3/11 and a physical function score of 85/100 were required to be classified as recovered. [2]
• Re: "11% of excluded candidates failed to meet these additional criteria despite otherwise meeting Oxford criteria". This was based on Figure 1 of the 2011 Lancet paper, [3] but it is not entirely clear how many excluded candidates were affected by each stated reason for exclusion, as only one reason is given for each exclusion and it is plausible that some candidates failed on more than one. Out of the 2517 candidates who were excluded (all of whom had previously been definitely or provisionally diagnosed with CFS), 43% were excluded for not meeting Oxford criteria, 10% were excluded for scoring over 65/100 points in physical function, and 1% were excluded for scoring under 6/11 points in fatigue, suggesting that some excluded candidates still had CFS.
• A video has been made demonstrating how a trial participant could improve a single increment in fatigue, decline a single increment in physical function, report feeling significantly better e.g. due to pain medication, and then be classified as 'recovered' despite remaining significantly unwell with CFS. [4] Some participants would have to improve more to cross the thresholds, but it demonstrates a problem with the lax recovery criteria and that combining multiple recovery criteria does not necessarily negate the weaknesses of each individual criterion. The definition of 'recovery' is simply too lax to justify the use of the word 'recovered', particularly without objective measures (which when used have generally shown a lack of clinically significant improvement).
• Re: "The thresholds for clinical improvement on an individual patient level for the primary measures of fatigue and physical function were abandoned and replaced with weaker thresholds...". To further clarify in context with the sentence which preceded it about BioMed Central's statement on post hoc revisions, the latter thresholds were the 'clinically useful difference' i.e. 2/33 points in fatigue (Likert scoring) and 8/100 points in physical function, which did not appear in the final version of the statistical analysis plan, [5] and were described as post hoc in the 2011 Lancet paper. [3] It is unclear when these were introduced and whether they were approved by the relevant trial oversight bodies (see below for why doubts have been raised).
• Re: "as the changes to the definition of recovery published in 2013 appear to be largely based on controversial post hoc analyses conducted for a previous paper on the trial results published in 2011". To further clarify, the methods and thresholds for the normal range for fatigue and physical function used in the revised definition of recovery did not appear in the final version of the statistical analysis plan [5] and are exactly the same as the post hoc analyses that were introduced during the peer review stage [6] of the fast tracked 2011 Lancet publication [3] (i.e. not introduced by the authors themselves). This is why doubts have been understandably raised over whether the changes to the recovery thresholds were approved by the relevant trial oversight bodies and whether the recovery definition was revised after rather than before these post hoc analyses were conducted, which would contradict the published impression that the changes were made to the recovery criteria before analysing any data. [7,8]
1. https://listserv.nodak.edu/cgi-bin/wa.ex...
2. http://www.biomedcentral.com/1471-2377/7/6
3. http://www.ncbi.nlm.nih.gov/pmc/articles...
4. http://www.youtube.com/watch?v=d_7J5ELjArU
5. http://www.trialsjournal.com/content/14/...
6. http://www.meactionuk.org.uk/whitereply....
7. http://www.ncbi.nlm.nih.gov/pmc/articles...
8. https://www.whatdotheyknow.com/request/t...
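The first annotation point above can be illustrated with a short sketch of the classification rule as QMUL described it. This is a simplified model, not the trial's actual code: the function name and the boolean input for meeting Oxford criteria in an ordinary clinical assessment are hypothetical, and only the two score cut-offs come from the QMUL statement quoted above.

```python
def meets_oxford_in_trial(meets_oxford_clinically: bool,
                          sf36_physical_function: int,
                          cfq_bimodal_fatigue: int) -> bool:
    """Trial version of 'meets Oxford criteria', per the QMUL statement:
    a physical function score of 70 or more, or a fatigue score of 5 or
    less, means the participant is classed as not meeting the criteria."""
    if sf36_physical_function >= 70:
        return False
    if cfq_bimodal_fatigue <= 5:
        return False
    return meets_oxford_clinically

# A borderline participant who would still meet Oxford criteria in the clinic.
at_baseline = meets_oxford_in_trial(True, 65, 6)
after_one_step_in_function = meets_oxford_in_trial(True, 70, 6)
after_one_step_in_fatigue = meets_oxford_in_trial(True, 65, 5)

print(at_baseline, after_one_step_in_function, after_one_step_in_fatigue)
# True False False
```

A single increment on either scale is enough to change the classification, even though the clinical input is unchanged.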