Alexandra Barratt, Les Irwig, Paul Glasziou, Robert Cumming, Angela Raffle, Nicholas Hicks, JA Muir Gray, Gordon H. Guyatt, and the Evidence Based Medicine Working Group
Based on the Users Guides to Evidence-based Medicine and reproduced with permission from JAMA. (1999;281(21):2029-2034). Copyright 1999, American Medical Association.
You are a family physician seeing a 47 year old woman and her husband of the same age. They are concerned because a friend recently found she had bowel cancer and has urged them both to undergo screening with fecal occult blood tests (FOBT) because, she says, prevention is much better than the cure she is now undergoing. Both your patients have no family history of bowel cancer and no change in bowel habit. They ask whether you agree that they should be screened.
You know that trials of FOBT screening have demonstrated screening can reduce mortality from colorectal cancer (CRC) but you also recall that FOBTs can have a high false positive rate which then requires investigation by colonoscopy. You are unsure whether screening these relatively young, asymptomatic people at average risk of bowel cancer is likely to do more good than harm. You decide to check the literature to see if there are any guidelines or recommendations about screening for CRC that might help you.
Since you know there is more than one randomized trial, you look first for a systematic review. Your MEDLINE search (using the terms fecal occult blood test and colorectal or colonic neoplasms and mass screening and systematic review) produces a systematic review by Towler et al in the BMJ. [1] However, there may be ancillary evidence that would influence your decision about whether to recommend screening to your patients (such as the false positive rate of the test, the side effects of subsequent investigation and treatment, and costs), so you also check for a practice guideline. You find the American Gastroenterological Association (AGA) guideline "Colorectal cancer screening: clinical guidelines and rationale" [2], which is based on the same trials as the systematic review but also provides the additional information you were hoping to find. The full text is provided, so you print off a copy to take home and read tonight.
When assessing a guideline or recommendation about screening you should apply the criteria suggested earlier in this series about assessment of therapies/interventions. [3] [4] As well, you may consider other criteria for evaluating whether screening is worthwhile. [5] [6] [7] [8] Sometimes screening is clearly effective, with large benefits and negligible harms, as is the case with phenylketonuria (PKU) screening and screening for systolic hypertension (above 160mmHg) among the elderly. [9] In other situations clinicians must often weigh up the benefits and harms when considering whether to screen.[10] This guide extends earlier approaches by providing a framework for assessing the methodological strength of guidelines on screening, and by demonstrating the importance of weighing up the benefits and harms of screening when they are closely balanced. The final decision about whether to screen is greatly influenced by the values different individuals place on each of the possible benefits and harms.
Our criteria for reviewing a guideline (or a meta-analysis) about screening follow the Users' Guides for an article about practice guidelines; in this article we will not review all the Users' Guides for guidelines, but highlight only those issues specific to screening.
Table 1 presents the possible consequences of screening. Some people will be true positives with clinically significant disease (a0): a proportion of this group will benefit according to the effectiveness of treatment and the severity of the detected disease. For example, children found to have PKU will experience large, long-lasting benefits. Other people will be "true" positives with inconsequential disease (a1): they may suffer harms of labelling, investigation and treatment for a disease or risk factor that would never have affected their lives. Consider, for instance, a man in whom screening reveals low grade prostate cancer who is destined to die from a heart attack before his prostate cancer becomes clinically manifest. He may suffer unnecessary treatment and associated adverse effects. False positive persons (b) may suffer the harms associated with investigation of the screen detected abnormality. False negative persons (c0) may experience harm if false reassurance results in delayed presentation or investigation of symptoms; some may also be angry when they discover they have disease despite having a negative screening test. In contrast though, "false" negatives with inconsequential disease (c1) are not harmed by their "disease" being missed because it was never destined to affect them. True negatives (d) may experience benefit associated with an accurate reassurance of being disease free but may also suffer inconvenience, cost and anxiety.
Table 1: Summary of benefits and harms of screening by underlying disease state
The longer the gap between possible detection and clinically important consequences the greater the number of people in the inconsequential disease category (a1). When screening for risk factors, very large numbers of people need to be screened and treated in order to prevent one adverse event years later [11] and thus most people found to have a risk factor at screening will be treated for inconsequential disease.
Guidelines recommending screening are on strong ground if they are based on randomized trials in which screening is compared to conventional care. In the past, many screening programs, some effective (such as cervical cancer screening and screening for PKU), have been implemented on the strength of observational data. When the benefits are enormous and the down sides minimal there is no need for randomized trials. More often, the benefits and harms from screening are more evenly balanced. In these situations, observational studies of screening may be misleading. Survival as measured from the time of diagnosis may be increased not because patients live longer, but because screening lengthens the time that they know they have disease (lead-time bias). Patients whose disease is discovered by screening may also appear to live longer because screening tends to detect slowly progressing disease, and miss rapidly progressive disease that becomes symptomatic between screening rounds (length-time bias). Therefore unless the evidence of benefit is overwhelming, randomized trial assessment is required.
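Lead-time bias can be made concrete with a small calculation. The sketch below is illustrative only: the ages and the four-year lead time are hypothetical values chosen for the example, not figures from any trial. It assumes screening moves the date of diagnosis forward but leaves the date of death unchanged.

```python
def survival_from_diagnosis(age_at_diagnosis, age_at_death):
    """Years lived after diagnosis."""
    return age_at_death - age_at_diagnosis


# Hypothetical patient; all three values are assumptions for illustration.
AGE_AT_DEATH = 70.0               # unchanged by screening in this scenario
AGE_SYMPTOMATIC_DIAGNOSIS = 67.0  # diagnosis when symptoms appear
LEAD_TIME_YEARS = 4.0             # how much earlier screening detects disease

unscreened = survival_from_diagnosis(AGE_SYMPTOMATIC_DIAGNOSIS, AGE_AT_DEATH)
screened = survival_from_diagnosis(
    AGE_SYMPTOMATIC_DIAGNOSIS - LEAD_TIME_YEARS, AGE_AT_DEATH
)

print(f"Survival after diagnosis, unscreened: {unscreened:.1f} years")
print(f"Survival after diagnosis, screened:   {screened:.1f} years")
print(f"Apparent gain: {screened - unscreened:.1f} years; life actually gained: 0.0 years")
```

Survival measured from diagnosis rises by exactly the lead time even though the patient dies at the same age, which is why mortality in a randomized comparison, rather than survival after diagnosis, is the outcome to look for.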
Investigators may choose one of two designs to test the impact of a screening process. The trial may assess the entire screening process (early detection and early intervention, Figure 1a), in which case people are randomised to be screened and treated if early abnormality is detected or not screened (and treated only if symptomatic disease occurs). Trials of mammographic screening have utilized this design. [12] [13] [14]
Figure 1: Designs for randomized controlled trials of screening
Alternatively everyone may participate in screening and those with positive results are randomised to be treated or not treated (Figure 1b). If those who receive treatment do better, then one can conclude that early treatment has provided some benefit. Investigators usually use this design when screening detects not the disease itself, but factors that increase the risk of disease. Tests of screening programs for hypertension and high cholesterol have utilized this design.[15] [16] The principles outlined in this paper apply to both screening for occult disease and screening for risk factors for later disease.
As for all guidelines, developers must specify the inclusion and exclusion criteria for the studies they choose to consider, conduct a comprehensive search, and assess the methodologic quality of the studies they include. The review by Towler et al [1] searched for published and unpublished trials and assessed their quality using criteria recommended by the Cochrane Collaboration. The investigators extracted data from the trials and combined them in a meta-analysis on an intention-to-screen basis.
The AGA guideline on colorectal screening used explicit inclusion and exclusion criteria and a comprehensive search to identify all the randomized trials of FOBT screening. The authors include a critical appraisal of the trials and conclude that, in general, they were rigorously conducted, though limited in that they do not consider the effect of screening on health-related quality of life.
A good guideline about a screening program should summarise the trial evidence about benefits and present data about the harms. The guideline should then provide information about how these benefits and harms can vary in subgroups of the population and under different screening strategies.
What outcomes need to be measured to estimate the benefits of a screening program? Benefits will usually be experienced by some of those who test positive, as either a reduction in mortality or an increase in quality of life. The benefit can be estimated as an Absolute Risk Reduction (ARR) or a Relative Risk Reduction (RRR) in adverse outcomes. Readers desiring a full discussion of these concepts can refer back to an earlier Users' Guide.[17] Briefly, the ARR depends on the baseline risk of disease and thus presents a more realistic estimate of the size of the mortality benefit. RRR, in contrast, is independent of baseline risk and can lead to a misleading impression of benefit (see Table 2). The number of people needed to screen to prevent an adverse outcome (NNS) provides another way of presenting benefit.
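The relationship between these measures can be sketched in a few lines. The relative risk reduction of 25% and the three baseline risks below are hypothetical values chosen for illustration; they are not taken from the guideline or from the article's Table 2. The calculation uses the standard definitions ARR = baseline risk x RRR and NNS = 1 / ARR.

```python
def absolute_risk_reduction(baseline_risk, relative_risk_reduction):
    """ARR = baseline risk multiplied by the relative risk reduction."""
    return baseline_risk * relative_risk_reduction


def number_needed_to_screen(arr):
    """NNS = 1 / ARR; only defined for a positive ARR."""
    if arr <= 0:
        raise ValueError("ARR must be positive to compute NNS")
    return 1.0 / arr


RRR = 0.25                            # hypothetical, held constant
BASELINE_RISKS = [0.10, 0.01, 0.001]  # hypothetical risks of the adverse outcome

rows = []
for risk in BASELINE_RISKS:
    arr = absolute_risk_reduction(risk, RRR)
    nns = number_needed_to_screen(arr)
    rows.append((risk, arr, nns))
    print(f"baseline risk {risk:.3f}  RRR {RRR:.0%}  ARR {arr:.5f}  NNS {nns:,.0f}")
```

With the relative risk reduction fixed, a hundredfold fall in baseline risk produces a hundredfold rise in the number needed to screen, which is the contrast the relative figure alone conceals.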
Table 2: Comparison of data presented as relative and absolute risk reductions and number needed to treat with varying baseline risks of disease and constant relative risk.
In addition to prevention of adverse outcomes, people may also regard knowledge of the presence of an abnormality as a benefit, as in antenatal screening for Down syndrome. Another potential benefit of screening comes from the reassurance afforded by a negative test, if a person is experiencing anxiety because a family member or friend has developed the target condition, or because of discussion in the media. However, if the anxiety is a result of the publicity surrounding the screening program itself, we would not view anxiety reduction as a benefit.
The AGA guideline reports that, based on a computer simulation, the relative risk reductions from three trials of FOBT screening are 33% (annual screening) and 15% and 18% (biennial screening). A measure of the uncertainty associated with these estimates (as one would get from the 95% confidence interval around a pooled relative risk reduction) would help the reader appreciate the range within which the true relative risk reduction plausibly lies. The AGA guideline estimates an absolute risk reduction of 1,330 deaths prevented per 100,000 (13.3 per 1,000) people screened annually using FOBT from 50 to 85 years of age, assuming 100% participation (Table 3).
Table 3: Clinical consequences for 1,000 people entering a program of annual FOBT screening for CRC at age 50 years and remaining in it until age 85 or death.*
*Adapted from the AGA guideline, 1997.
Among those who test positive harms may include:
The AGA guideline reports that of the patients who do not have CRC 8% to 10% will test falsely positive (specificity 90-92% using rehydrated slides). In the trials only 2 to 6% of those who tested positive actually had colon cancer (positive predictive value of 2 to 6%). Thus, of every 100 screening participants with a positive test, only 2 to 6 will have cancer, but all 100 will be exposed to colonoscopy and its attendant risks (Table 3). While the colonoscopies will reveal few cancers, they will show many polyps (25% of people aged 50 years or more have polyps some of which will be judged to need removal depending on the size of the polyp). Part of the benefit of screening will come from removal of the small proportion of polyps that would have progressed to invasive cancer. Part of the harm of screening will come from regular colonoscopies that are recommended for people who have had a benign or inconsequential polyp removed.
Among those who test negative harms may include:
Of those who have cancer, FOBT screening using rehydrated slides will correctly identify 90% and miss the other 10% (sensitivity of 90%), according to the AGA guideline. Those who present with symptoms after a false negative screen may experience a sense of anger and betrayal that they would not suffer in the absence of a screening program.
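The low positive predictive value follows directly from the test characteristics once the rarity of cancer among those screened is taken into account. The sketch below applies Bayes' theorem to a single screening round using the sensitivity (90%) and the lower bound of specificity (90%) quoted above. The prevalence of 3 undetected cancers per 1,000 people screened is an assumption made for illustration; it is not a figure from the guideline.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a person with a positive test truly has the disease."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


SENSITIVITY = 0.90   # from the text (rehydrated slides)
SPECIFICITY = 0.90   # lower end of the 90-92% range in the text
PREVALENCE = 0.003   # ASSUMPTION: hypothetical prevalence of undetected CRC

ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, PREVALENCE)
print(f"Positive predictive value: {ppv:.1%}")
print(f"Of 100 people with a positive test, about {100 * ppv:.0f} have cancer")
```

Under this assumed prevalence the result falls inside the 2 to 6% range reported from the trials; a different prevalence would shift it, which is one reason the same test performs differently in younger, lower-risk groups.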
Using the computer simulation, the AGA guideline presents data on the frequency of some of these harms. These data are summarized in Table 3, for 1,000 people participating in annual screening by fecal occult blood testing from 50 to 85 years of age. The model assumes those who test positive have a colonoscopy.
We now know the magnitude of both benefits and harms (as presented in Table 3). This balance sheet tells us that screening 1,000 people annually with FOBT from 50 years will prevent 13.3 deaths from CRC but cause 0.5 deaths from the complications of investigation and surgery. There will also be 10.4 major complications (perforations and major bleeding episodes) and 7.7 minor complications. The authors provide no data on anxiety but we could assume that some people will feel anxious prior to colonoscopy. Figure 2 presents these data as a flow diagram.
Figure 2: Flow diagram of the clinical consequences for 1,000 people entering a program of annual FOBT screening for CRC at age 50 years and remaining in it until age 85 or death
These data assume that the screening programs will deliver the same magnitude of benefit and harms as found in randomized trials; this will be true only if the program is delivered to the same standard of quality as in the trials. Otherwise, benefits will be smaller and the harms greater.
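The balance sheet figures quoted above can be set side by side with some simple arithmetic. The sketch below uses only the per-1,000 numbers given in the text; the summary ratios it prints are our own illustration rather than quantities reported by the guideline, and they ignore when events occur and how individuals value them.

```python
# Consequences per 1,000 people screened annually from age 50 to 85 (from the text).
PER_1000 = {
    "crc_deaths_prevented": 13.3,
    "deaths_from_complications": 0.5,
    "major_complications": 10.4,
    "minor_complications": 7.7,
}

# Simple difference between deaths prevented and deaths caused.
net_deaths_averted = (
    PER_1000["crc_deaths_prevented"] - PER_1000["deaths_from_complications"]
)

# Harms expressed per CRC death prevented.
major_per_death_prevented = (
    PER_1000["major_complications"] / PER_1000["crc_deaths_prevented"]
)
minor_per_death_prevented = (
    PER_1000["minor_complications"] / PER_1000["crc_deaths_prevented"]
)

print(f"Net deaths averted per 1,000: {net_deaths_averted:.1f}")
print(f"Major complications per CRC death prevented: {major_per_death_prevented:.2f}")
print(f"Minor complications per CRC death prevented: {minor_per_death_prevented:.2f}")
```

If a local program performs worse than the trials, the entries in `PER_1000` would need to be replaced with local estimates before the comparison means anything.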
The AGA guideline recommends that people at average risk and over 50 years of age be offered screening for CRC. The guideline discusses several screening strategies (FOBT, flexible sigmoidoscopy, barium enema and colonoscopy) and, in relation to FOBT, recommends offering annual screening. The magnitude of benefits and harms will vary in different patients and under different screening strategies, as the following discussion reveals.
Risk of disease: Assuming that the relative risk reduction is constant over a broad range of risk of disease, benefits will be greater for people at higher risk of disease. For example, mortality from CRC rises with age and the mortality benefit achieved by screening rises accordingly (Figure 3a). But the life years lost in the population to CRC are related both to the age at which mortality is highest and the length of life still available. Thus the number of life years which can be saved by CRC screening increases with age to about 75 years and then decreases again as life expectancy declines (Figure 3b). The number of deaths averted by screening over 10 years for those aged 40, 50 and 60 years at first screening (0.2, 1.0 and 2.4 per 1,000, respectively [1]) reflects these differences. Because of a greater benefit, it may be rational for a person aged 60 years to decide screening is worthwhile while a person aged 40 years (or 80 years) with smaller potential benefit might decide it is not worthwhile.
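The age-specific figures above can be re-expressed as the number of people who would need to be screened for 10 years to avert one CRC death. The sketch uses only the three deaths-averted figures quoted in the text; the conversion (1,000 divided by deaths averted per 1,000) is plain arithmetic, not a result reported in the review.

```python
# Deaths averted per 1,000 people screened over 10 years, by age at first
# screening (figures quoted in the text from the systematic review).
DEATHS_AVERTED_PER_1000 = {40: 0.2, 50: 1.0, 60: 2.4}

nns_by_age = {
    age: 1000.0 / averted for age, averted in DEATHS_AVERTED_PER_1000.items()
}

for age, nns in nns_by_age.items():
    print(f"Age {age} at first screen: screen about {nns:,.0f} people "
          f"for 10 years to avert one CRC death")
```

Presented this way, the twelvefold difference between starting at 40 and starting at 60 is easier to convey to a patient than a difference between two small decimals.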
Figure 3: Mortality from CRC (Fig 3a) and years of life lost due to CRC (Fig 3b) with and without screening
Risk of disease, and therefore benefits from screening, may be increased by other factors, such as a family history. The AGA guideline reports that people with one or more first-degree relatives (parent, sibling, child) with CRC, but without one of the specific genetic syndromes, have approximately twice the risk of developing CRC as average-risk individuals without a family history. This means that for people aged 40 years who have a first-degree relative with CRC, the incidence of CRC is comparable to that for people aged 50 years without a family history. The guideline also notes that within each age group the risk is greatest in those whose relatives developed cancer at a younger age.
Screening interval: As the screening interval is shortened the effectiveness of a screening program will tend to improve, although there is a limit to the amount of improvement which is possible. For example, screening twice as often could theoretically double the relative mortality reduction obtainable by screening, but in practice the effect is usually much less. Cervical cancer screening may, for instance, reduce the incidence of invasive cervical cancer by 64%, 84% and 94% if screening is conducted at 10 yearly, 5 yearly and yearly intervals respectively.[18]
The frequency of harms will also increase with more frequent screening, potentially directly in proportion to the frequency of screening. Thus we will see diminishing marginal return as the screening interval is shortened. Ultimately the marginal harms will outweigh the marginal benefit of further reductions in the screening interval.
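The diminishing return can be seen by dividing each gain in effectiveness by the number of extra screens needed to obtain it. The sketch below uses the cervical screening figures quoted above (64%, 84% and 94% reduction at 10-yearly, 5-yearly and yearly intervals). The 30-year screening horizon is an assumption made so that screens can be counted; it does not come from the source.

```python
# Reduction in invasive cervical cancer incidence by screening interval in
# years (figures quoted in the text).
REDUCTION_BY_INTERVAL = {10: 0.64, 5: 0.84, 1: 0.94}
HORIZON_YEARS = 30  # ASSUMPTION: hypothetical period over which screening is offered


def screens_over_horizon(interval_years, horizon_years=HORIZON_YEARS):
    """Number of screening rounds a person attends over the horizon."""
    return horizon_years // interval_years


steps = []
intervals = sorted(REDUCTION_BY_INTERVAL, reverse=True)  # [10, 5, 1]
for longer, shorter in zip(intervals, intervals[1:]):
    extra_screens = screens_over_horizon(shorter) - screens_over_horizon(longer)
    extra_benefit = REDUCTION_BY_INTERVAL[shorter] - REDUCTION_BY_INTERVAL[longer]
    per_screen = extra_benefit / extra_screens
    steps.append((longer, shorter, extra_screens, per_screen))
    print(f"{longer}-yearly -> {shorter}-yearly: {extra_screens} extra screens, "
          f"{per_screen * 100:.2f} percentage points gained per extra screen")
```

If harms rise roughly in proportion to the number of screens, each extra screen in the second step carries about the same harm as in the first while buying far less benefit.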
Test characteristics: If the sensitivity of a new test is greater than the test used in the trials, and is detecting significant disease earlier, the benefit of screening will increase. But it may be that the new, apparently more sensitive, test is detecting more cases of inconsequential disease (for example by detecting more low-grade prostate cancers, or more low-grade cervical epithelial abnormalities[19]), which will increase the harms. On the other hand, if specificity is improved and testing produces fewer false positives, net benefit will increase and the test may now be useful in groups in which the old test was not.
Ideally, clinicians would look to randomized trials of the new test compared to the old test. However, new tests often appear in profusion, and randomised trials are expensive and often only interpretable after long follow-up. Being pragmatic, we will usually need to accept that the trials have shown that earlier detection works and a comparison of a new versus the old test only needs to examine test characteristics. Returning to CRC screening, since we have randomized trial data of mortality reduction, we may assume that earlier detection using other methods such as flexible sigmoidoscopy will also reduce mortality from CRC even though there are no published reports of randomized trials of screening with flexible sigmoidoscopy.
People will value benefits and harms of screening differently. For example, pregnant women who are considering screening for Down syndrome may make different choices depending on the value they place on having a Down syndrome baby versus the risk of iatrogenic abortion from amniocentesis.[20]
Individuals who choose to participate in screening programs are benefiting (in their view) from screening and other individuals are benefiting (in their view) from not participating. Individuals can only make the right choice for themselves if they have access to high quality information about the benefits and harms of screening and are able to weigh up that information. This probably will require much better educational materials and decision support materials; some examples are already available.[21] [22]
There is always uncertainty about the benefits and harms of screening. The 95% confidence intervals around the magnitude of each benefit and harm provide an indication of the amount of uncertainty in each estimate. Where sample size is limited, the confidence intervals will be wide and clinicians should alert potential screening participants that the magnitude of the benefit or harm could be considerably smaller or greater than the point estimate.
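The effect of sample size on the width of a confidence interval can be illustrated with a simple normal-approximation (Wald) interval for an absolute risk reduction. The event counts below are entirely hypothetical and describe two imaginary trials with the same event rates but different sizes; this is one common approximation, not the method used by the guideline or the systematic review.

```python
import math


def arr_with_ci(events_control, n_control, events_screened, n_screened, z=1.96):
    """Absolute risk reduction with an approximate 95% Wald confidence interval."""
    p_control = events_control / n_control
    p_screened = events_screened / n_screened
    arr = p_control - p_screened
    se = math.sqrt(
        p_control * (1 - p_control) / n_control
        + p_screened * (1 - p_screened) / n_screened
    )
    return arr, arr - z * se, arr + z * se


# HYPOTHETICAL trials: identical event rates, tenfold difference in size.
small = arr_with_ci(events_control=30, n_control=5_000,
                    events_screened=20, n_screened=5_000)
large = arr_with_ci(events_control=300, n_control=50_000,
                    events_screened=200, n_screened=50_000)

for label, (arr, low, high) in (("small trial", small), ("large trial", large)):
    print(f"{label}: ARR {arr * 1000:.1f} per 1,000 "
          f"(95% CI {low * 1000:.1f} to {high * 1000:.1f} per 1,000)")
```

Both imaginary trials give the same point estimate, but only the larger one excludes the possibility of no benefit, which is the kind of caveat a clinician would want to pass on to a patient.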
While clinicians will be most interested in the balance of benefits and harms for their individual patients, policy makers must consider issues of cost effectiveness and local resources in their decisions. Clinicians can look to previous Users' Guides to help them evaluate studies addressing these economic issues.[23] [24]
The AGA guideline reports that the estimated cost-effectiveness of FOBT screening is approximately $US10,000 per life year gained among people over 50 years of age (although, like the absolute size of the benefit, it will vary with risk of disease). The AGA guideline also notes that all CRC screening strategies examined (FOBT, flexible sigmoidoscopy, barium enema, colonoscopy) cost less than $US20,000 per life year saved.
These cost-effectiveness ratios are within the range of what is currently paid in some countries for the benefits of other screening programs such as mammographic screening for women aged 50-69 years (estimated at $21,400 per life year saved[25]), ultrasound screening for carotid stenosis (incremental cost per QALY gained is estimated at $39,495[26]) and ultrasound screening for abdominal aortic aneurysm in men aged 60 to 80 years (estimated $41,550 per life-year gained[27]).
The guideline should quantify the benefit of screening according to age so you can inform your patients as accurately as possible about the benefits of screening for them. The AGA guideline does not provide age specific mortality reductions attributable to screening; therefore you cannot easily quantify the benefit for your patients. From the guideline, all you could say is that screening a group of 1,000 people with FOBT beginning at age 50 and continuing annually to age 85 will avert about 13 deaths from CRC. However, we know from the Towler et al systematic review that the mortality benefit for people between 40 and 50 years of age is about 0.2-1.0 deaths averted over 10 years per 1,000 people screened. Next you could outline the potential harms of screening. As noted earlier, the harms are mostly related to the colonoscopy. According to the AGA guideline, the risks of colonoscopy are about 0.1-0.3 per 1,000 for death and 1-3 per 1,000 for perforation and hemorrhage. In addition, there would also be issues of cost, inconvenience and anxiety.
It is up to your patients to weigh up whether the benefit of reduced risk of death from CRC is worth the risks. If they feel unable to do this, then you could consider helping them to clarify their values about the possible outcomes. For example, if they are not bothered by the prospect of a colonoscopy, they would probably choose to be screened. But if either of them places a high value on avoiding colonoscopy now, he or she may prefer to reconsider screening in a few years' time when the benefits will be greater.
1. Towler B, Irwig L, Glasziou P et al. A systematic review of the effects of screening for colorectal cancer using the faecal occult blood test, Hemoccult. BMJ 1998;317:559-565.
2. Winawer SJ, Fletcher RH, Miller L et al. Colorectal cancer screening: clinical guidelines and rationale. Gastroenterology 1997;112:594-642.
3. Hayward RSA, Wilson MC, Tunis SR et al for the Evidence Based Medicine Working Group. Users' guides to the medical literature VIII. How to use clinical practice guidelines A. Are the recommendations valid? JAMA 1995;274:570-574.
4. Wilson MC, Hayward RSA, Tunis SR et al for the Evidence Based Medicine Working Group. Users' guides to the medical literature VIII. How to use clinical practice guidelines B. What are the recommendations and will they help you in caring for your patients? JAMA 1995;274:1630-1632.
5. Wilson JMG & Jungner G. Principles and practice of screening for disease. World Health Organization, Geneva:1968.
6. Gray JA Muir. Evidence-Based Healthcare. Churchill Livingstone, 1997.
7. Sackett DL, Haynes RB, Tugwell P. Clinical Epidemiology: a basic science for clinical medicine. Boston, Little Brown & Co. 2nd edition, 1991.
8. Welch HG & Black WC. Evaluating randomized trials of screening. Journal of General Internal Medicine 1997;12:118-124.
9. SHEP Co-operative Research Group. Prevention of stroke by antihypertensive drug treatment in older persons with isolated systolic hypertension. Final results of the Systolic Hypertension in the Elderly Program (SHEP). JAMA 1991;265:3255-3264.
10. Eddy DM. Comparing benefits and harms: the balance sheet. JAMA 1990;263:2493, 2498, 2501, 2505.
11. Khaw KT, Rose G. Cholesterol screening programmes: how much benefit? BMJ 1989;299:606-607.
12. Andersson I, Aspegren K, Janzon L et al. Mammographic screening and mortality from breast cancer: the Malmo mammographic screening trial. BMJ 1988;297:943-948.
13. Tabar L, Fagerberg G, Duffy S et al. The Swedish two county trial of mammographic screening for breast cancer: recent results and calculation of benefit. J Epidemiology and Community Health 1989;43:107-114.
14. Roberts MM, Alexander FE, Anderson TJ et al. Edinburgh trial of screening for breast cancer: mortality at seven years. Lancet 1990;335:241-246.
15. Multiple Risk Factor Intervention Trial Research Group. Multiple Risk Factor Intervention Trial: Risk factor changes and mortality results. JAMA 1982;248:1465-1477.
16. Frick MH, Elo E, Haapa K et al. Helsinki Heart Study: Primary prevention trial with gemfibrozil in middle-aged men with dyslipidemia. New Engl J Med 1987;317:1237-1245.
17. Guyatt GH, Sackett DL, Cook DJ for the Evidence Based Medicine Working Group. Users' guides to the medical literature II How to use an article about therapy or prevention. B. What were the results and will they help me in caring for my patients? JAMA 1994;271:59-63.
18. IARC Working Group on Evaluation of Cervical Cancer Screening Programmes. Screening for squamous cervical cancer: duration of low risk after negative results of cervical cytology and its implication for screening policies. BMJ 1986;293:659-664.
19. Raffle A. New tests in cervical screening. Lancet 1998;351:297.
20. Fletcher J, Hicks NR, Kay JDS, Boyd PA. Using decision analysis to compare policies for antenatal screening for Down's syndrome. BMJ 1995;311:351-356.
21. Wolf A, Nasser J, Wolf AM, Schorling JB. The impact of informed consent on patient interest in prostate-specific antigen screening. Archives of Internal Medicine 1996;156:1333-1336.
22. Flood AB, Wennberg JE, Nease RF et al. The importance of patient preference in the decision to screen for prostate cancer. J Gen Intern Med 1996;11:342-349.
23. Drummond MF, Richardson WS, O'Brien B, Levine M, Heyland DK, for the Evidence-Based Medicine Working Group. Users' Guides to the Medical Literature XIII. How to use an article on economic analysis of clinical practice. A. Are the results of the study valid? JAMA 1997;277:1552-1557.
24. O'Brien BJ, Heyland DK, Richardson WS, Levine M, Drummond MF, for the Evidence-Based Medicine Working Group. Users' Guides to the Medical Literature XIII. How to use an article on economic analysis of clinical practice. B. What are the results and will they help me in caring for my patients? JAMA 1997;277:1802-1806.
25. Salzmann P, Kerlikowske K, Phillips K. Cost-effectiveness of extending screening mammography guidelines to include women 40-49 years. Ann Intern Med 1997;127:955-965.
26. Yin D, Carpenter JP. Cost-effectiveness of screening for asymptomatic carotid stenosis. Journal of Vascular Surgery 1998;27:245-255.
27. Frame PS, Fryback DG, Patterson C. Screening for abdominal aortic aneurysm in men ages 60 to 80 years. A cost-effectiveness analysis. Ann Intern Med 1993;119:411-416.
© 2001 Centre for Health Evidence.