Bernie J. O'Brien, Ph.D., Daren Heyland, M.D., W. Scott Richardson, M.D., Mitchell Levine, M.D., Michael F. Drummond, Ph.D., for the Evidence-Based Medicine Working Group
Based on the Users Guides to Evidence-based Medicine and reproduced with permission from JAMA. (1997 May 21; 277(19):1552-1557) and (1997 Jun 11; 277(22):1802-1806) (Published erratum in 1997 Oct 1; 278(13):1064). Copyright 1999, American Medical Association.
You are a general internist on the staff of a large community hospital. Your Chief of Medicine knows of your interest in evidence-based medicine, and she asks you to help her solve a problem. The hospital's Pharmacy and Therapeutics committee has been trying to decide on formulary guidelines for the use of streptokinase or tPA in the treatment of acute myocardial infarction (MI). Members of the committee have been arguing for weeks about the GUSTO trial [1] and whether the added expense of tPA is worth it. The committee has reached an impasse and has asked the Chief of Medicine for some outside help to reach a good decision. Knowing that the hospital faces pressure to keep costs down, the Chief wants good information about this question to bring to the next committee meeting later this week. She asks you to help her find out if a formal economic analysis that compares thrombolytic agents for acute MI has been done and then help her present it to the Committee.
From your office computer you enter the hospital library's CD-ROM MEDLINE system via the hospital's information network. In the current MEDLINE file, you cross the terms 'myocardial infarction' (11,099 citations), 'thrombolytic therapy' (3,350 citations) and 'cost-benefit analysis' (4,232 citations). This yields a set of only 11. Reviewing these onscreen, you find three articles directly relevant to your question. One is an economic analysis done as part of the GUSTO study [2], and another is an economic analysis using data from the GUSTO trial in a decision model [3]. Your searching program includes a 'Local Messages' field, and this field reports that both of these studies are available in your hospital's library. Your search also turns up another analysis based on modelling [4], but the 'Local Messages' note indicates that this journal is not available in your library. You request a copy via interlibrary loan, but realize it will probably arrive long after the Committee's meeting later this week. You thus turn to the first two articles, hoping to find some evidence you can use to help the Committee.
In the course of their work, clinicians make many decisions about the care of individual patients. Clinicians are also asked to participate in decisions for large groups of patients, whether to set clinical policy for an institution ("Should streptokinase or tPA be recommended routinely for patients with an acute MI who present to our hospital?"), or to set health policy at a more 'macro' level ("Which thrombolytic agents should our national or local health authority choose to purchase and provide for our citizens who suffer acute MI?"). When making decisions for such patient groups, clinicians need not only weigh the benefits and risks, but also consider whether these benefits will be worth the health care resources consumed. Resources used to provide health care are vast, but not limitless. This is particularly the case in managed care settings where, in essence, a fixed sum is available to provide care for enrolees. Thus, more and more, clinicians will have to convince colleagues and health policy makers that the benefits of their interventions justify the costs.
To inform these decisions, clinicians can use economic analyses of clinical practices. Economic analysis is a set of formal, quantitative methods used to compare alternative strategies with respect to their resource use and their expected outcomes [5] [6]. Economic evaluations seek to inform resource allocation decisions, not make them. Economic analyses have been attracting more attention in recent years and could potentially inform decisions at different levels in the health care system, such as managing major institutions like hospitals and in determining regional or national policy [7] [8] [9].
Randomized trials generate data about relative treatment efficacy, but sometimes investigators may also collect data about cost. As with other integrative studies such as decision analyses, [10] and practice guidelines, [11] economic analyses may use estimates of cost and effectiveness from summaries of several studies of therapy, diagnosis, and prognosis. Either way, the main distinction between economic analyses and other studies is the explicit measurement and valuation of resource consumption or cost. The integration of cost data often involves placing values on the health outcomes so that they can be related to the costs of alternative treatment strategies.
In helping you understand economic analyses, we will introduce you to how these analyses are conducted and review some of their strengths and weaknesses. This is not, however, an article on how to perform economic analysis; should you wish to do so, you should look elsewhere [12] [13] [14]. Since you may frequently encounter economic analyses that are based on decision models, you may also find it useful to review the earlier articles in the series on clinical decision analysis [10] when reading such studies.
We will approach articles on economic analysis of clinical strategies with the same three organizing questions introduced in earlier articles in this series:
This question addresses whether an economic analysis truly determines which of the clinical strategies would provide the most benefit for the available resources. Just as with other types of studies, the validity of an economic analysis is primarily determined by the strength of the methods used.
If the answer to the first question was yes, and the economic analysis likely yields an unbiased assessment of the costs and outcomes of the clinical strategies under study, then the results are worth examining further. The guides under this second question consider the size of the expected costs and benefits from adopting the most efficient strategy and the level of uncertainty in the results.
If the economic analysis yields valid and important results, you can then examine how to apply these results in your own clinical setting. Table 1 summarizes the specific questions you can ask in addressing these three areas. We will explore the guides by applying them to the articles we found in our search. This article will deal with the validity guides, while the next in the series will address the results and applicability.
Table 1: Users' Guide for Economic Analysis of Clinical Practice
Economic analyses compare two or more treatments, programmes or strategies. If two strategies are analysed but only costs are compared, this comparison would inform only the resource use half of the decision and is termed a 'cost analysis'. Comparing two or more strategies only by their efficacy (such as in a randomized trial) informs only the outcomes portion of the decision. A full economic comparison requires that both the costs and outcomes be analysed for each of the strategies being compared. To help you understand the structure of the comparison further, some additional questions will be useful.
a. Was a broad enough viewpoint adopted?
Costs and outcomes can be evaluated from a number of viewpoints: the patient, the hospital, the third party payer (e.g., HMO) or society at large. Each viewpoint may be relevant depending on the question being asked, but broader viewpoints are most relevant to those concerned about the overall allocation of health care resources [9]. That is, an evaluation adopting (say) the viewpoint of the hospital will be useful in estimating the budgetary impact of alternative therapies for that institution. However, economic evaluation is usually directed at informing policy from a broader societal perspective.
For example, in an evaluation of an early discharge programme, it is not sufficient to report only hospital costs, since patients discharged early may consume substantial resources in the community. These costs may not be borne by the hospital, but are likely to impact on the third party payer or the patient in some way or another. This was a weakness of the study by Topol et al. [15] which assessed the feasibility and cost savings of hospital discharge three days after acute myocardial infarction, considering only hospital and professional charges. We have no knowledge of other community services consumed and whether these differed between early discharge and conventional discharge patients.
One of the main reasons for considering narrower viewpoints in conducting an economic analysis is to assess the impact of change on the main budget holders, since budgets or payments may need to be adjusted before a new therapy can be adopted. This is particularly true in countries like the USA, where resource allocating decisions are made in a decentralized way by a range of actors rather than a health ministry. Weisbrod et al. [16] pointed out that, whilst a community-oriented mental illness programme was worthwhile from the perspective of society as a whole, it would be more costly to the organization responsible for providing the care. Even within the same institution, narrow budgetary viewpoints can prevail. In our example comparing streptokinase with tPA, it would be wrong just to focus on the relative costs of the drugs, which fall on the pharmacy budget, if there are also impacts on other hospital resource use.
The patient's perspective may also merit specific consideration, if costs (e.g. in travel) reduce access to care. Also, some patients may not be able to participate in community care programs if these impose major costs in terms of informal nursing support in the home. In some countries, most notably the USA, patients may also be responsible for a sizeable proportion of their health care bills. Many economic analysts do not track all of these costs, owing to the time and effort required. However, the patient's perspective is partially integrated into the analysis by measuring the outcomes of therapy, such as impact on quality of life.
The way in which the articles by Mark et al. [2] and Kalish et al. [3] handle these and other key methodological issues is presented in Table 2. Mark et al. [2] point out the importance of considering a broad, societal viewpoint, whereas Kalish et al. [3] do not discuss the issue. In practice, both analyses concentrate on the identification and quantification of direct medical care costs, both inside and outside the hospital. The reasons for exclusion of other cost items, such as patients' costs, are not explicitly discussed, but may relate to the practical problems of data collection.
Table 2: Key Methodological Features of the Two Studies
The breadth of outcomes considered varies according to the type of economic analysis. In cost-effectiveness analyses the health outcomes are not valued, but reported in physical units such as 'life years gained' or 'cases successfully treated'. In one variant of cost-effectiveness analysis, sometimes called cost-utility analysis, outcomes of different types are weighted to produce a composite index, such as the quality-adjusted life-year [12], or healthy years equivalent [17]. Quality adjustment involves placing a lower value on time spent with impaired physical and emotional function than time spent in full health. On a scale where 0 represents death and 1.0 full health, the greater the impairment, the lower the value of a particular health state. These approaches are particularly useful when alternative treatments produce outcomes of different types, or when increased survival is 'bought' at the expense of reduced quality of life.
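As a rough illustration of quality adjustment, the sketch below weights the time spent in each health state by a value between 0 (death) and 1.0 (full health) and sums the results. The function name `qalys` and the health-state values are hypothetical and are not taken from either study.

```python
def qalys(states):
    """Sum quality-adjusted life-years over (years, utility) pairs.

    Each utility lies on the scale described in the text:
    0 represents death and 1.0 represents full health.
    """
    total = 0.0
    for years, utility in states:
        if not 0.0 <= utility <= 1.0:
            raise ValueError("utility must lie between 0 and 1")
        total += years * utility
    return total


# Hypothetical patient: 2 years at a value of 0.8, then 3 years at 0.5.
# Five calendar years therefore count as fewer than five QALYs.
print(round(qalys([(2, 0.8), (3, 0.5)]), 2))
```

A treatment that lengthens survival but lowers the value of the time gained can, on this scale, produce fewer QALYs than a treatment that adds less time in better health.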
Finally, in cost-benefit analyses, the health consequences are valued by asking health care consumers what they would be 'willing-to-pay' for health services that achieve combinations of outcomes of particular types. This has the advantage that it would be possible to assess directly whether the intervention is worthwhile to society, as all costs and outcomes would be valued in the same units (usually dollars). However, this approach may introduce a bias towards interventions for the rich, if their willingness-to-pay were higher than that of the poor. Nevertheless, it is worth remembering that most of the methods of economic evaluation ultimately lead towards some type of social valuation, such as how much we are willing to pay to gain an extra life-year or an extra QALY. Also, the QALY approach introduces another kind of bias, in favour of those individuals with potentially more years to live in a good health state.
In the study by Mark et al. [2] the primary analysis was cost-effectiveness analysis, using the outcome 'years of life saved'. The outcome in quality-adjusted life-years (QALYs) was considered in a secondary analysis. In the study by Kalish et al. [3] the primary analysis used QALYs. In both cases the values of health states were obtained by the time trade-off approach; that is, by asking patients how many years in their current state of health they would be willing to give up in order to live their remaining years in excellent health. Mark et al. [2] obtained these values from patients in the GUSTO trial one year after treatment. Kalish et al. [3] obtained them from a subset of patients in the GISSI-2 trial.
Another type of consequence is the impact that therapy may have on the patient's ability to work and hence her or his contribution to the nation's production. These impacts are known as indirect costs and benefits in much of the health economics literature, but this terminology is falling from favor as it is at odds with the accounting use of the term 'indirect costs', to mean overheads. The issue of inclusion or exclusion of productivity changes is a frequent topic of debate. On one hand, these represent resource use changes just like those occurring in the health care system. On the other hand, production may not actually be lost if a worker is absent for a short period. Also, for longer periods of absence, a previously unemployed worker may be obtained. Furthermore, inclusion of productivity changes biases evaluations in favour of programs for those individuals that are in full-time employment. Therefore, you should be sceptical about any economic analysis that includes productivity changes without clearly presenting the implications.
Neither of the thrombolytic studies discussed here considered productivity changes. Their inclusion would be unlikely to substantially influence the comparison between SK and tPA, and may not be appropriate. The possible avoidance of lost productivity could, however, constitute another argument for thrombolysis over 'doing nothing'.
b. Were all the relevant clinical strategies compared?
The second assessment of the breadth of an economic evaluation relates to the range of alternative strategies examined. A frequently omitted strategy is that of maintaining the status quo. Another mistake is to view alternatives as being 'all or nothing'. In medicine the question is often not whether one should adopt a particular test or apply a particular therapy, but how much of it should be applied. Thus, the interesting and more clinically relevant questions often relate to whether a given procedure should be applied selectively or routinely, whether a treatment should be given to low risk patients as well as high risk, or whether the dose of a drug should be intensified.
One difficulty faced by economic analysts is that the comparisons they would like to make are to some extent limited by the availability of clinical data. A particular concern is the fact that many clinical trials of new medicines (e.g. NSAIDs) make a comparison with placebo rather than another active therapy. This means that, often, economic analyses cannot be based on either a particular clinical trial, or an overview of several trials. Rather, they become integrative studies which, of necessity, employ a number of assumptions. Users of economic analyses therefore need to check on the methodology of the studies generating the clinical data for the economic analysis and whether such studies are really comparable. They may be concerned if the clinical data used in an economic evaluation came from studies that enrolled patients of different baseline risk, or measured clinical outcomes in a slightly different way.
Both the papers by Mark et al. [2] and Kalish et al. [3] examine only the strategies compared in the GUSTO trial. This is reasonable because previous randomized trials had shown that thrombolysis was both effective and cost-effective when compared with no treatment, so the issue of a 'do nothing' strategy does not arise. However, the question of which patients should be treated with a particular therapy is likely to be important (we return to this point later).
a. Was clinical effectiveness established?
To be valid, economic evaluations require evidence on the effectiveness of the alternatives being compared. The standards for assessment of effectiveness correspond to those discussed in earlier guides in the series. Although evidence based on experiments, such as that obtained from randomized trials, is considered the best evidence for answering questions of therapy, economic evaluations are more valid if effectiveness data reflect normal clinical practice as closely as possible. Some economic evaluations are now being undertaken concurrently with randomized trials. Others are being based on systematic overviews of a number of trials. For example, Mugford et al. [18] used data from a systematic overview of 58 controlled trials to estimate the cost-effectiveness of giving prophylactic antibiotics routinely to reduce the incidence of wound infection after caesarian section.
The decision about whether to base an economic evaluation on results of a single trial, an overview of a number of trials, or a broader synthesis (in a modelling study) of trial and other evidence, is not straightforward. In principle all three approaches can be used. The considerations that guide the choice of approach in a given situation are as follows.
An evaluation based on prospective economic data collection alongside a single methodologically rigorous trial has high internal validity. However, the results may not be widely generalizable (that is, the evaluation may have low external validity) if the setting for the trial was atypical, the protocol highly prescriptive, or compliance higher than one would expect in routine clinical practice. An evaluation based on an overview of a number of trials is likely to be more precise, as the pooled estimate of effectiveness will have a narrower confidence interval, and is likely to be more widely generalizable because of a wider range of patients, practice settings and ways of administering the intervention in several trials.
Sometimes data from trials require adjustment when used in an economic analysis. In their economic evaluation of misoprostol, a drug for prophylaxis against gastric ulcer in patients on long-term NSAID use, Hillman and Bloom [19] used clinical data from a trial undertaken by Graham et al. [20]. This compared misoprostol (400 µg and 800 µg daily) with placebo in a double-blind randomized controlled trial of three months duration. An important issue for economic analysis was that ulcers prevented by misoprostol may generate savings in health care expenditure, which could balance the cost of adding the drug. However, it was not possible to use the rates of ulcer observed in the trial for the economic analysis without adjustment. First, lesions were discovered by endoscopy, which was performed monthly. Many of these 'ulcers' would not have come to the notice of the patient or her physician in regular practice. Secondly, the compliance rate observed in the trial was higher than that typically observed in patients taking NSAIDs. Therefore, Hillman and Bloom adjusted the observed ulcer rates to reflect the fact that 40% of endoscopically determined lesions remain 'silent'. They also adjusted for lower compliance by using the ulcer rates in the evaluable cohort and assuming that only 60% of this efficacy would be achieved in practice.
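The two adjustments described for the misoprostol evaluation can be sketched as follows. This is only one plausible reading of the description above and is not Hillman and Bloom's actual calculation. The 40% and 60% figures come from the text; the function name and the trial ulcer rates passed to it are hypothetical.

```python
SILENT_FRACTION = 0.40    # share of endoscopic lesions that remain 'silent' (from the text)
EFFICACY_RETAINED = 0.60  # share of trial efficacy assumed in practice (from the text)


def adjusted_rates(placebo_rate, drug_rate):
    """Return (placebo, drug) ulcer rates adjusted towards routine practice.

    Assumption: both adjustments are applied multiplicatively.
    """
    # Keep only the lesions expected to come to clinical attention.
    placebo_clinical = placebo_rate * (1 - SILENT_FRACTION)
    drug_clinical = drug_rate * (1 - SILENT_FRACTION)
    # Scale the absolute reduction seen in the trial for lower compliance.
    reduction = (placebo_clinical - drug_clinical) * EFFICACY_RETAINED
    return placebo_clinical, placebo_clinical - reduction


# Hypothetical endoscopic ulcer rates: 20% on placebo, 5% on the drug.
placebo, drug = adjusted_rates(0.20, 0.05)
print(round(placebo, 3), round(drug, 3))
```

Under these assumptions the number of clinically apparent ulcers prevented per patient is smaller than the trial's endoscopic figures suggest, which in turn reduces the savings that can be set against the cost of the drug.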
Sometimes the length of follow-up in the clinical trial may be too short for the purposes of economic evaluation, as this tends to use long-term endpoints such as survival. The problem of length of follow-up is equally relevant for both costs and benefits. In some cases an increase in length of follow-up in a clinical trial by a number of months may make a lot of sense. For example, although it is common, in trials of thrombolytic therapy, to record 30 day mortality, most major trials, such as the GUSTO study, incorporate one year follow-up.
In other fields, such as cholesterol lowering, data on final outcomes such as overall mortality may take years to obtain. Here modelling studies have been undertaken, making projections of long term outcomes from short-term trial data relating to intermediate endpoints, such as percentage cholesterol reduction. Therefore, the problem of short-term follow-up is compounded by the use of an intermediate endpoint. The wisdom of this approach depends on the validity of the hypothesis linking intermediate and final outcomes. In at least one case projections based on short-term evidence turned out to be wrong. Schulman et al. [21] concluded that early use of zidovudine therapy in asymptomatic individuals with HIV infection was cost-effective based on projections of disease progression from a clinical trial with one-year follow-up. However, a subsequent study with three-year follow-up showed that the advantages of therapy in the first year were eroded in subsequent years [22]. The authors also called into question the uncritical use of CD4 counts as a surrogate endpoint for assessment of benefit from long-term antiviral therapy.
Where long-term evidence is lacking, economists are in a quandary, particularly where the treatment concerned is already in use. Do they say nothing at all, or undertake a modelling study that may help the decision maker understand the likely range of cost-effectiveness outcomes? The same problem confronts the user of economic evaluation results. Should any decision be postponed until definitive data are available, or should an interim policy be formulated, pending further results?
Of the two thrombolysis studies discussed here, the one by Mark et al. [2] was undertaken concurrently with the clinical trial, whereas that by Kalish et al. [3] is a modelling study using the GUSTO trial results as its main source of clinical evidence. Therefore the cost-effectiveness results are likely to be more similar than in a situation, for example, where the modelling study draws on clinical data from a number of different sources.
The main methodological difference between the two studies is that the resource consumption figures (e.g. days in hospital, number of outpatient visits) in the Mark et al. [2] study are those actually observed during the trial. By contrast, the estimates in the Kalish et al. [3] study are drawn from other sources, although the probabilities of resource-consuming events (e.g. coronary artery bypass surgery) are taken from the GUSTO trial.
Finally, it should be noted that, by using observational databases, both papers extrapolated survival data beyond the one year observed in the trial. This re-affirms the point that, even when good quality clinical data are available, modelling is often necessary to conduct an economic evaluation.
b. Were costs measured accurately?
Whilst the viewpoint determines the relevant range of costs and outcomes to be included in an economic evaluation, there are many issues relating to their measurement and evaluation. First, it is useful to report the physical quantities of resources consumed or released by the treatments, separately from their prices or unit costs. Not only does this allow us to scrutinize the method of assigning monetary values to resources, it also helps us to interpret the results of a study from one setting to another, as prices are known to vary by location.
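The value of reporting quantities separately from unit costs can be illustrated with a minimal sketch. The resource items, quantities and prices below are hypothetical; the point is only that the same physical quantities can be re-priced for another setting.

```python
def total_cost(quantities, unit_costs):
    """Multiply each resource quantity by its unit cost and sum the results."""
    return sum(qty * unit_costs[item] for item, qty in quantities.items())


# Hypothetical resource use per patient, reported in physical units.
quantities = {"hospital_days": 8, "outpatient_visits": 4}

# Hypothetical unit costs in two different locations.
unit_costs_site_a = {"hospital_days": 1000, "outpatient_visits": 100}
unit_costs_site_b = {"hospital_days": 1200, "outpatient_visits": 80}

print(total_cost(quantities, unit_costs_site_a))  # 8400
print(total_cost(quantities, unit_costs_site_b))  # 9920
```

Had the study reported only the total for the first site, a reader at the second site could not have recalculated the cost using local prices.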
Secondly, there are different approaches to valuing costs or cost savings. One approach is to use published charges. However, charges may differ from real costs, depending on the sophistication of accounting systems and the relative bargaining power of health care institutions and third party payers [23]. Where there is a systematic deviation between costs and charges, the analyst may adjust the latter by a 'cost-to-charge ratio'. However, very little is currently known about how charges differ from costs, so simple adjustments may not suffice. From the third party payer's perspective, charges will bear some relation to the amounts actually paid, although in some settings payments vary by payer. From a societal perspective we would like the real costs, since these reflect what society is forgoing, in benefits elsewhere, to provide a given treatment.
For example, Cohen et al. [24] compared costs and charges for conventional angioplasty, directional atherectomy, stenting and bypass surgery. Previous studies had suggested that total hospital charges for directional coronary atherectomy or intracoronary stenting are significantly higher than those for conventional angioplasty. However, when costs were examined, by adjusting itemized patient accounts by department-specific cost/charge ratios, it was found that the in-hospital costs of angioplasty and directional coronary atherectomy were similar. Also, although the cost of coronary stenting was approximately $2,500 higher than that of conventional angioplasty, the magnitude of this difference was smaller than the $6,300 increment previously suggested on the basis of analysis of hospital charges. The implication is that we may be deterred from using coronary atherectomy or stenting because of the high 'cost', whereas this may be an artifact of hospital accounting systems or bargaining power, rather than a reflection of the real value to society of the resources consumed by those procedures.
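Adjusting itemized charges by department-specific cost/charge ratios, as in the example above, might look like the sketch below. The department names, charges and ratios are hypothetical and are not taken from Cohen et al. [24].

```python
def charges_to_costs(itemized_charges, cost_to_charge):
    """Convert each department's charge to an estimated cost."""
    return {
        dept: charge * cost_to_charge[dept]
        for dept, charge in itemized_charges.items()
    }


# Hypothetical patient account (dollars charged by department).
charges = {"cath_lab": 6000, "pharmacy": 1000, "ward": 3000}

# Hypothetical department-specific cost/charge ratios.
ratios = {"cath_lab": 0.5, "pharmacy": 0.25, "ward": 0.75}

costs = charges_to_costs(charges, ratios)
print(sum(charges.values()))  # 10000
print(sum(costs.values()))    # 5500.0
```

Because the ratios differ by department, two procedures with the same total charge can have different estimated costs, which is why a comparison based on charges alone may mislead.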
Mark et al. [2] use costs from the Duke Transition One cost-accounting system, Medicare diagnosis-related-group (DRG) reimbursement rates and Medicare physicians' fees in their estimations. Since the costs of the thrombolytic agents are an important component of the analysis, drug costs are calculated in two ways: from the Drug Topics Red Book average of 1993 wholesale prices and from the average costs of the drugs in 16 randomly selected GUSTO hospitals. They examine the impact of the different estimation methods on cost-effectiveness. Kalish et al. [3] use medication costs and Medicare DRG reimbursement rates for one hospital. They took costs of treating serious haemorrhage, and the costs of managing coronary artery disease and stroke, from the literature.
c. Were data on costs and outcomes appropriately integrated?
When making comparisons between alternatives in terms of cost per life-year gained, or cost per QALY gained, it is important to compute the incremental cost-effectiveness ratio of one therapy over another. This is because the most relevant information for the decision maker relates to the extra benefit that would be gained, compared with any extra cost. Of course, if one therapy dominates another, having both higher benefits and lower costs, then the incremental comparison is not needed. In this case both papers calculate the incremental cost per life year or QALY gained from the use of tPA, compared with SK.
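The incremental comparison and the check for dominance can be sketched as follows. The function name `compare` and all figures are hypothetical. As an assumption that goes slightly beyond the wording above, a therapy is also treated as dominant when it is equal on one dimension and better on the other.

```python
def compare(cost_new, effect_new, cost_old, effect_old):
    """Return (verdict, ratio); the ratio is None unless there is a trade-off."""
    d_cost = cost_new - cost_old
    d_effect = effect_new - effect_old
    if d_cost == 0 and d_effect == 0:
        return "no difference", None
    if d_cost <= 0 and d_effect >= 0:
        return "new dominates", None   # no more costly and no less effective
    if d_cost >= 0 and d_effect <= 0:
        return "old dominates", None   # new is no cheaper and no more effective
    # Costs and effects move in the same direction: report extra cost
    # per extra unit of effect (for example, per life-year gained).
    return "trade-off", d_cost / d_effect


# Hypothetical: new therapy costs $12,000 for 5.5 life-years,
# old therapy costs $10,000 for 5.0 life-years.
print(compare(12000, 5.5, 10000, 5.0))  # ('trade-off', 4000.0)
```

Note that dividing each therapy's own cost by its own effect (an average ratio) would answer a different question; it is the ratio of the differences that tells the decision maker what the extra benefit costs.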
One important point to note about incremental analysis is that the incremental cost-effectiveness ratio of a given intervention is critically dependent on the comparison made. The most relevant comparison is current care, which could include 'doing nothing' where this is ethically defensible. In the example discussed here, most would argue that SK is the appropriate comparison and that 'doing nothing' is not really an option. Where there are multiple interventions, each of which could be delivered at different scales or intensities, the ranking of options becomes quite complex [25].
A final issue in the measurement and valuation of costs and consequences relates to the adjustment for differences in their timing. It is normally assumed that we prefer benefits sooner, and prefer to postpone costs because of uncertainty about the future and because resources, if invested, usually yield a positive return. The accepted way of allowing for this in economic evaluations is to discount costs and benefits occurring in the future to present values [12]. The effect of this is to assign a lower weight, in the analysis, to costs and benefits occurring in the future. An annual discount rate of 5% is common in the published literature, although this choice is not necessarily theoretically or empirically justified. There are also debates about whether health outcomes should be discounted at the same rate as costs [26] [27].
In the studies considered here, both sets of authors discount costs and benefits occurring in the future at a rate of 5% per annum. Mark et al. [2] also report results for discount rates of 0% and 10%, whereas Kalish et al. [3] report results for rates of 1% and 10%.
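Discounting to present value can be sketched in a few lines. The stream of costs below is hypothetical, and the sketch assumes that amounts arising in the first year (index 0) are not discounted; published studies may use a different timing convention.

```python
def present_value(amounts_by_year, rate):
    """Discount a stream of yearly amounts to present value.

    Assumption: amounts_by_year[0] occurs now and is left undiscounted;
    the amount in year t is divided by (1 + rate) ** t.
    """
    return sum(a / (1 + rate) ** t for t, a in enumerate(amounts_by_year))


# Hypothetical costs of $1,000 in each of three successive years,
# valued at the discount rates mentioned in the text.
stream = [1000, 1000, 1000]
for rate in (0.0, 0.05, 0.10):
    print(rate, round(present_value(stream, rate), 2))
```

The higher the rate, the less weight later costs and benefits receive, which is why reporting results at several rates shows whether the conclusion depends on this choice.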
Uncertainty in economic evaluation can arise either from lack of precision in estimation or from methodological controversy. The conventional way of allowing for uncertainty in economic analyses is to undertake a sensitivity analysis (discussed in an earlier guide [10]), where the estimates for key variables are altered in order to assess what impact they have on study results.
In addition, conducting economic evaluations concurrently with clinical trials provides the opportunity to apply conventional tests of statistical significance to the resource quantities or costs [28]. Also, where measurements from a clinical trial inform us of the distribution of cost variables, it is possible to set the range of estimates for sensitivity analysis in relation to the statistical properties of the distribution (e.g. 2 SD from the mean). This raises a number of important issues, such as the size of the 'economically important difference' when comparing the cost or cost-effectiveness of two alternatives, and the appropriateness of, and methods for, statistical tests on cost-effectiveness ratios.
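A one-way sensitivity analysis whose range is set at 2 SD either side of the mean can be sketched as below. The function names, the mean and SD of the extra cost, and the extra effect are all hypothetical; the sketch varies a single cost variable and holds the extra effect fixed.

```python
def icer(extra_cost, extra_effect):
    """Extra cost per extra unit of effect."""
    return extra_cost / extra_effect


def one_way_sensitivity(mean_extra_cost, sd, extra_effect, n_sd=2):
    """Recompute the ratio at the mean and at n_sd SDs either side of it."""
    low = mean_extra_cost - n_sd * sd
    high = mean_extra_cost + n_sd * sd
    return {
        "low": icer(low, extra_effect),
        "base": icer(mean_extra_cost, extra_effect),
        "high": icer(high, extra_effect),
    }


# Hypothetical: mean extra cost $3,000 (SD $500) for 0.125 extra life-years.
print(one_way_sensitivity(3000, 500, 0.125))
```

If the preferred strategy is the same at both ends of the range, uncertainty in that variable is unlikely to be critical to the decision.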
Both papers report extensive sensitivity analyses, many of which relate to different methodological choices (e.g. source of cost estimates) rather than to observed variability in the data. Mark et al. [2] use the 95% confidence interval for the increase in one-year survival to explore the possible range in cost per life year saved. They also perform statistical tests for differences in cost but not for differences in cost-effectiveness ratios.
Because economic evaluation methods are in their infancy compared with those for randomized trials, investigators still debate many issues [29]. We have already mentioned one major issue, the appropriateness of alternative methods for valuing outcomes. Other issues relate to the appropriateness of considering some types of outcome (such as the costs of lost production if individuals are away from work because of illness), or the choice of discount rate. Some methodological uncertainties can be taken into account by sensitivity analysis (e.g. if the choice of discount rate does not affect the choice of strategy in a given situation, then this particular controversy, though important, may not be critical to the decision).
The other way in which methodological uncertainties can be accommodated is in the reporting and discussion of results. Economists are often criticised for failing to reach a firm conclusion, but if the result is truly equivocal that information will be important for the decision maker. It is important to remember that economic evaluation is no more than an aid to decision making, since there are often many difficult value-judgements in reaching a decision.
Finally, we must recognize that in clinical practice the costs and outcomes of treatment are likely to be related to the baseline risk in the treatment population. For example, the cost-effectiveness of drug therapy for elevated cholesterol, compared with no treatment, will depend on age, gender, pre-treatment cholesterol level and other risk factors; the greater the patients' risk, the lower the cost per unit of benefit [30].
Division of patients into risk categories is common in clinical practice. In a study of the cost-effectiveness of beta-blockers after acute myocardial infarction, Goldman et al. [31] found that the cost per life year gained was $2,400 for those patients at high risk, compared with $13,000 for those at low risk. The differences in the cost-effectiveness ratios were driven primarily by the patient's ability to benefit from therapy, rather than treatment cost.
Both articles investigate the impact of patient age on cost-effectiveness, as older patients have a higher mortality risk and fewer years of life left to live. In addition, Mark et al. [2] investigate the impact of infarction location on the cost-effectiveness estimates.
In this article we have outlined some of the threats to validity in economic evaluations. In the next article on economic analysis, we will show you how to determine the results and how to use them in your practice.
Let us start with the incremental costs. Look in the text and tables for the listings of all the costs considered for each treatment option and remember that costs are the product of the quantity of a resource used and its unit price. These should include the costs incurred to 'produce' the treatment such as the physician's time, nurse's time, materials, etc. which we might term the 'up-front costs'; as well as the 'downstream costs' due to resources consumed in the future and associated with clinical events that are attributable to the therapy.
The study by Mark et al [2] quantifies resources used by treatment group in three periods of time over one year: initial hospitalization, discharge to 6 months, and 6 months to one year. Both treatment groups were very similar in their use of hospital resources over the year: both experienced a mean length of stay of 8 days, of which 3.5 were in ICU, and both groups had the same rate of CABG (13%) and PTCA (31%) on initial hospitalization. As summarized in Table 2, the one-year health care costs, excluding the thrombolytic agent, were $24,990 per tPA-treated patient and $24,575 per streptokinase-treated patient. As is clear from Table 2, the main cost difference between the two groups is the cost of the thrombolytic drugs themselves: $2,750 for tPA and $320 for streptokinase. The overall difference in cost between tPA-treated and streptokinase-treated patients is therefore our incremental cost, at $2,845 over the first year. This is discounted at 5% per annum for a final figure of $2,760. The authors argue that there is no cost difference between the two groups after one year. These data for incremental costs from tPA are very similar to those estimated by Kalish [3], who found a difference of $2,535 in the use of tPA to manage MI in preference to streptokinase.
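The incremental cost arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the dollar figures are those quoted from Table 2, while the dictionary and function names are our own, and the discounting step is deliberately left out because the text does not spell out how the $2,760 figure was derived.

```python
# One-year costs per patient in US dollars, as quoted above from Mark et al. (Table 2).
OTHER_CARE_COST = {"tPA": 24_990, "streptokinase": 24_575}  # excludes the thrombolytic
DRUG_COST = {"tPA": 2_750, "streptokinase": 320}


def total_cost(arm: str) -> int:
    """Total one-year cost per patient: other health care plus the thrombolytic."""
    return OTHER_CARE_COST[arm] + DRUG_COST[arm]


def incremental_cost(new: str, control: str) -> int:
    """Cost of the new strategy minus cost of the control strategy."""
    return total_cost(new) - total_cost(control)


if __name__ == "__main__":
    # Undiscounted one-year difference between tPA and streptokinase.
    print(incremental_cost("tPA", "streptokinase"))
```

Almost all of the difference comes from the drug prices; the remaining health care costs differ by only $415 per patient.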
The measure of effectiveness chosen in the Mark et al [2] study is the gain in life expectancy associated with tPA. The available follow-up experience was to one year, with 89.9% surviving in the streptokinase group versus 91.1% in the tPA group (p < 0.001). To translate these observations into life expectancy gains, the authors project survival curves for another 30 years or more using first a 14-year MI survivorship database from Duke University and then an assumption that survivorship will follow a statistical distribution (Gompertz). Having projected two survival curves, the authors calculate the area under each curve, which represents the expected value of survival time or life expectancy. For tPA patients life expectancy was 15.41 years and for streptokinase 15.27 years. As summarized in Table 2, the difference in life expectancy is 0.14 years per patient; or phrased another way, for every 100 patients treated with tPA in preference to streptokinase we would expect to gain 14 years of life.
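To make the idea of life expectancy as the area under a survival curve concrete, here is a minimal sketch using the trapezoidal rule. It is not the authors' method, which relied on the Duke database and a Gompertz projection; the function names are hypothetical and any curve you pass in is your own assumption.

```python
def life_expectancy(times, survival):
    """Approximate the area under a survival curve by the trapezoidal rule.

    times    -- increasing time points in years, starting at 0
    survival -- proportion of the cohort alive at each time point
    """
    if len(times) != len(survival) or len(times) < 2:
        raise ValueError("need two lists of equal length with at least two points")
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        area += width * (survival[i] + survival[i - 1]) / 2.0
    return area


def years_gained_per_cohort(le_new, le_control, n_patients=100):
    """Life years gained when n_patients receive the new rather than the control therapy."""
    return (le_new - le_control) * n_patients


if __name__ == "__main__":
    # Life expectancies quoted above: 15.41 years (tPA) and 15.27 years (streptokinase).
    print(round(years_gained_per_cohort(15.41, 15.27), 2))
```

Applied to the two life expectancies quoted above, the second function reproduces the gain of about 14 life years per 100 patients treated.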
In other situations, quantifying incremental effectiveness may be more difficult. Not all treatments change survival, and those that do not may affect different dimensions of health in many ways. For example, drug treatment of asymptomatic hypertension may result in short-term health reductions from drug side-effects, in exchange for long-term expected health improvements, such as reduced risk of strokes. Note that in our tPA example the outcome is not unambiguously restricted to survival benefit because there is a small but statistically significant increased risk of non-fatal hemorrhagic stroke associated with tPA [1]. The existence of trade-offs between different aspects of health, or between length of life versus quality of life, means that to arrive at a summary measure of net effectiveness, we must implicitly or explicitly weight the 'desirability' of different outcomes relative to each other.
There is a large and growing literature on quantitative approaches for combining multiple health outcomes into a single metric using patient preferences [32]. Foremost among current practice is the construction of quality-adjusted life years (QALYs) as a measure that captures the impact of therapies in the two broad domains of survival and quality-of-life. (QALYs were described in more detail earlier in this series [10].) Alternative approaches include the Healthy Year Equivalent method [33].
Our second thrombolytic study, by Kalish et al [3], used QALYs as its primary measure of effectiveness. First they took the same one-year survival probabilities from the GUSTO study and projected them forward to estimate life expectancy using data from a different longitudinal study, the Worcester Heart Attack Study. Similar to Mark et al [2], they estimated that the average life span after MI is 14.6 years and then used GUSTO risk reductions to estimate the life expectancy difference for tPA and streptokinase patients.
To derive QALYs they applied utility weights (from death=0 to healthy=1) to patients surviving the MI but sustaining morbid events over time such as non-fatal stroke (utility of 0.79) or reinfarction (utility of 0.93). These utility weights were taken from the literature, based on preference measurements undertaken in the GISSI-2 trial [34]. However, the treatment groups differ only slightly in their risk of the morbid events that receive quality adjustment. Thus, although the total number of future QALYs is lower than the unadjusted life years (8.842 for streptokinase and 8.926 for tPA), the difference in QALYs (0.084), using 30-day GUSTO survival data, is identical to the effect calculated by Mark et al [2] using unadjusted life expectancy.
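The quality adjustment described above amounts to weighting each period of survival by the utility of the health state in which it is spent. The sketch below illustrates that weighting only; the utility values are those quoted above, but the function name and the example patient history are hypothetical and are not taken from the Kalish model.

```python
# Utility weights quoted above (death = 0, healthy = 1).
UTILITY = {
    "healthy": 1.0,
    "reinfarction": 0.93,
    "nonfatal_stroke": 0.79,
    "dead": 0.0,
}


def qalys(path):
    """Sum of years lived in each health state, weighted by that state's utility.

    path -- list of (state, years) pairs describing one patient's course
    """
    return sum(UTILITY[state] * years for state, years in path)


if __name__ == "__main__":
    # Hypothetical course: 2 healthy years, then 10 years after a non-fatal stroke.
    print(round(qalys([("healthy", 2), ("nonfatal_stroke", 10)]), 2))
    # Difference in QALYs between the two arms, from the totals quoted above.
    print(round(8.926 - 8.842, 3))
```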
In summary, both studies use the efficacy data from the GUSTO trial as their starting point to conclude that tPA treatment is more costly than streptokinase but that it provides an increase in survival (quality-adjusted or otherwise). Table 2, using Mark et al data, illustrates the next calculation in both studies which determines the incremental cost-effectiveness ratio for tPA. After discounting future costs and effects at 5% per year to reflect time preference (for rationale, see our first paper [35]), the difference (tPA minus streptokinase) in cost per patient over the year (and by extension into the future because they assume no cost differences beyond one year) is $2,760, which is divided by the difference in life expectancy per patient (0.084) to yield a ratio of $32,678 per year of life gained.
A simple interpretation of this ratio is that it is the 'price' at which we are buying additional years of life by using tPA in preference to streptokinase; the lower this price, the more attractive is the use of tPA. The Kalish study [3] reaches a similar incremental cost-effectiveness ratio (with their adjusted denominator of QALYs and using the 30-day risk reduction GUSTO data) of $30,300 per QALY. These are the main results of the studies; we will discuss their interpretation later in this article.
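The ratio itself is a single division, sketched below with a hypothetical function name. Note that the rounded inputs quoted in the text ($2,760 and 0.084 years) give roughly $32,857 rather than exactly the published $32,678; we assume the published figure was computed from unrounded values.

```python
def incremental_cost_effectiveness_ratio(delta_cost, delta_effect):
    """Extra cost per extra unit of effect (for example, dollars per life year gained)."""
    if delta_effect == 0:
        raise ValueError("no difference in effect: the ratio is undefined")
    return delta_cost / delta_effect


if __name__ == "__main__":
    # Discounted incremental cost and life expectancy gain quoted in the text.
    print(round(incremental_cost_effectiveness_ratio(2_760, 0.084)))
```

The ratio is only meaningful as a 'price' when the new treatment is both more costly and more effective; a zero or negative difference in effect calls for a different interpretation.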
Both tPA cost-effectiveness studies explore uncertainty using sensitivity analysis, examining the impact on incremental cost-effectiveness of alternative values for uncertain variables. We described one-way and multi-way sensitivity analysis in detail in the Users' Guides on decision analysis [10].
A useful starting point for a sensitivity analysis is to examine the impact of variation in the effectiveness measure on the cost-effectiveness estimates. Where effectiveness is based on clinical trial data the analyst does not have to make an additional judgement about the plausible range over which to vary the data, but can use a conventional measure of precision around a treatment effect such as the 95% confidence interval. Using data from the Mark et al [2] study we know the tPA treatment effect was a 1.1% increase in one-year survivorship with a 95% confidence interval of 0.46% to 1.74%. Applying this variation to the denominator of the incremental cost-effectiveness ratio, Mark et al [2] report a range of $71,039 per life-year gained to $18,781 around their baseline estimate of $32,678, with smaller benefit yielding a higher ratio. Both studies conclude that their estimates of cost-effectiveness are most sensitive to uncertainty in the magnitude of mortality benefit. This form of analysis, however, only partially captures the uncertainty in the cost-effectiveness ratio because it assumes the numerator (cost) does not vary. Investigators are currently developing more formal procedures for estimating confidence intervals for cost-effectiveness ratios that permit both numerator and denominator to vary [28].
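A crude way to see why a smaller benefit yields a higher ratio is to hold incremental cost fixed and scale the ratio inversely with the treatment effect. The sketch below does this with the baseline ratio of $32,678 and the confidence limits quoted above. The proportionality assumption and the function name are ours; the published range came from the authors' own survival projections, so this simplification will not reproduce $71,039 and $18,781 exactly.

```python
def rescaled_ratio(base_ratio, base_effect, alternative_effect):
    """Rescale a cost-effectiveness ratio for a different treatment effect.

    Assumes (hypothetically) that incremental cost is fixed and that life years
    gained are proportional to the difference in one-year survival.
    """
    if alternative_effect <= 0:
        raise ValueError("alternative effect must be positive")
    return base_ratio * base_effect / alternative_effect


BASE_RATIO = 32_678             # dollars per life year, baseline from Mark et al.
POINT_ESTIMATE = 1.1            # percentage-point gain in one-year survival
CI_LOWER, CI_UPPER = 0.46, 1.74  # 95% confidence limits quoted in the text

if __name__ == "__main__":
    for effect in (CI_LOWER, POINT_ESTIMATE, CI_UPPER):
        print(effect, round(rescaled_ratio(BASE_RATIO, POINT_ESTIMATE, effect)))
```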
In an editorial accompanying the GUSTO economic analysis, Lee [36] stresses that "...cost-effectiveness should focus on strategies, not drugs. The cost-effectiveness of tPA depends on how the drug is administered and to whom it is given". The first point relates mainly to the fact that the GUSTO trial had a protocol for accelerated administration of tPA; slower regimens of administration of the same drug had previously shown no clinical advantage [34]. The second point is that because some patients (e.g., the elderly) have a greater prior risk of mortality, the tPA treatment effect will likely yield a higher absolute risk reduction in mortality [1].
This second point has important implications for cost-effectiveness as can be seen in Table 3, which presents cost per life-year estimates among eight sub-groups on the basis of infarction site and patient age. Because the baseline risk of mortality in MI varies by age and infarct site, the mortality benefit from treatment with tPA also varies, and it is clear from Table 3 that tPA is more cost-effective in older patients with anterior infarcts. To take the extreme cases, the cost per life-year gained in a person aged 40 years or less with an inferior infarct is $203,071 compared to a person aged 75 years or more with an anterior infarct at only $13,410 per life-year gained.
Table 3: Incremental Cost-effectiveness of Tissue-type Plasminogen Activator vs. Streptokinase in Patient Subgroups From the Global Utilization of Streptokinase and Tissue Plasminogen Activator for Occluded Coronary Arteries (GUSTO)
In reviewing these studies you decide that the variation in yield per dollar expended may have some important implications for your P & T Committee decision, because they wish to use tPA only in selected patients.
Having established the results of the two economic studies and the precision of the estimates, we now turn to two important issues of interpretation: the first is how incremental cost-effectiveness ratios can be interpreted to help in decision making; the second is the extent to which the cost and/or effects from the study can be applied to your practice setting.
In Figure 1 we present a framework for categorising economic study results when data on incremental costs and effects have been determined. This 3x3 matrix has nine cells to categorize studies depending upon whether the new treatment is more/same/less costly than control and whether it has more/same/less effectiveness.
Figure 1. The CHE regrets that we are unable to supply this graphic image. Please refer to the printed version.
In category 1, the new treatment is both less costly and more effective than control, so the new treatment is said to be strongly 'dominant'. For example, treatment to eradicate Helicobacter pylori for duodenal ulcer is strongly dominant over acid suppression with an H2 receptor antagonist because it is both less costly and results in fewer recurrences of ulcer over a one-year period [37]. Category 2 represents strong dominance to reject a new therapy where the costs are higher and the effectiveness is worse than control. Then follow four cases of so-called weak dominance, where either costs or effectiveness are equivalent between the two therapies: category 3 indicating weak dominance to accept the treatment (equivalent cost but better effectiveness); category 4 indicating weak dominance to reject the treatment (greater cost with equivalent effectiveness). By analogy, categories 5 and 6 indicate weak dominance to reject and accept respectively.
All the shaded cells in Figure 1 indicate comparative cost and effectiveness combinations that provide evidence of strong or weak dominance. To inform decision making, no further analysis, such as calculation of cost-effectiveness ratios, is required for these shaded cells. However, further analysis is needed if results fall into the non-dominance unshaded cells of 7, 8, or 9. One possibility (category 9) is that the treatment is associated with no statistically significant or clinically important difference in either effectiveness or costs, although it should be noted that the process of implementation and change of programs will generate costs not captured in the analysis. The most common non-dominance circumstance is category 7, where the new therapy offers additional effectiveness but at an increased cost (or its mirror image in category 8). Both tPA studies fall into category 7, requiring calculation of the incremental cost-effectiveness ratios of the new therapy as we discussed above and illustrate in Table 2.
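The nine categories can be expressed as a small lookup on the signs of the incremental cost and incremental effect. This is a sketch based on the verbal description above rather than on the figure itself: the assignment of categories 5, 6, and 9 is our reading of the text, and the function name and tolerance parameters (the size of difference treated as 'equivalent') are hypothetical.

```python
def classify(delta_cost, delta_effect, cost_tol=0.0, effect_tol=0.0):
    """Return (category, interpretation) for a new treatment versus control.

    delta_cost, delta_effect -- new treatment minus control
    cost_tol, effect_tol     -- differences at or below these sizes count as equivalent
    """
    def sign(value, tol):
        if value > tol:
            return 1
        if value < -tol:
            return -1
        return 0

    table = {
        (-1, 1): (1, "strong dominance: accept"),
        (1, -1): (2, "strong dominance: reject"),
        (0, 1): (3, "weak dominance: accept"),
        (1, 0): (4, "weak dominance: reject"),
        (0, -1): (5, "weak dominance: reject"),    # assumed: same cost, less effective
        (-1, 0): (6, "weak dominance: accept"),    # assumed: less cost, same effect
        (1, 1): (7, "non-dominance: more effective at higher cost"),
        (-1, -1): (8, "non-dominance: less effective at lower cost"),
        (0, 0): (9, "non-dominance: no important difference"),  # assumed
    }
    return table[(sign(delta_cost, cost_tol), sign(delta_effect, effect_tol))]


if __name__ == "__main__":
    # tPA versus streptokinase: higher cost, longer life expectancy.
    print(classify(2_760, 0.084))
```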
Having estimated the incremental cost-effectiveness of tPA over streptokinase, and assuming for the moment that these data apply to your practice setting, how do you decide whether approximately $32,000 is an acceptable price to pay for saving one additional year of life? The first important point to note is that this question involves a value judgement and cannot be resolved using only the study data. As noted in the conclusion of the GUSTO economic analysis, the study data can inform the decision but cannot make the choice. Some appeal must be made to external criteria to ascertain whether a jurisdiction or society is willing to pay this price for this improvement in outcome.
There are a number of approaches to the interpretation of incremental cost-effectiveness ratios. In an ideal world of complete information we would have data indicating the health (or other) outcomes we would be forgoing from other interventions and programmes, within and outside health care, not funded as a consequence of using tPA. This is what economists refer to as 'opportunity cost'. However, data to accomplish this task are very limited and investigators have promulgated a variety of 'second best' interpretive strategies. One approach assumes that previous decisions to adopt new medical therapies of known cost-effectiveness reveal an underlying set of values with which to judge the acceptability of the current treatment candidate. Our two tPA cost-effectiveness studies both use this interpretive strategy to assess their $30,000 per life-year estimates: both cite the cost-effectiveness of two to three other interventions, some non-cardiac, that are currently funded and both conclude that an acceptable cost-effectiveness threshold would be $50,000 per QALY gained (for Kalish) and per life-year gained (for Mark).
Investigators have debated the validity of such interpretive strategies for incremental cost-effectiveness ratios at both a theoretical [38] [39] and practical level [40]. For example, Johannesson and Weinstein [38] maintain that prioritising resource allocations based on rank-orderings of interventions by incremental cost-effectiveness does lead to an efficient allocation of resources. However, Birch and Gafni [39] contend that this is only the case where two assumptions hold true: programs exhibit constant returns to scale and are perfectly divisible.
What do these two terms mean? Constant returns to scale implies a linear relationship between costs and outcomes at different levels of production; in many cases this may not hold true because we observe economies of scale, an example being the regionalization of cardiac surgery in one centre where high volume can produce lower cost per case and often better clinical outcomes. Divisibility of programs implies we can reallocate $1 or $1m to tPA and purchase benefits at the same rate implied by the cost-effectiveness ratio; this divisibility does not hold because to treat one additional patient with tPA we would require a block of resources equal, at least, to the cost of tPA. While this methodologic debate continues, Drummond et al [40] caution readers about the practical problems of comparisons between cost-effectiveness studies that may have used very different methods, data, and assumptions.
In summary, you should exercise caution when drawing conclusions from incremental cost-effectiveness ratios. The ultimate criterion is one of local opportunity cost; if the money for a new program will result in decreased ability to deliver other health care interventions, what are the health benefits you will no longer realize in order to have tPA available for all? The practical difficulty applying this criterion is that many existing programs or services currently provided may not have been evaluated and so the opportunity cost of reducing or removing them is unknown or speculative.
After understanding the results, you should now turn to whether they will apply to your own practice setting. There are two levels of applicability for economic appraisal to the local setting. The first is the extent to which the evidence from the clinical trial(s) which forms the basis for the estimated treatment effect can be applied to routine clinical practice in any jurisdiction. A distinction is sometimes made between the efficacy of a treatment, as observed in a highly selected and compliant clinical trial population, and its effectiveness in the real world. For economic evidence to be relevant to policy decisions we would prefer evidence to be more related to effectiveness than efficacy. The second aspect is the extent to which the observed effect and cost data are transferable between jurisdictions. Threats to the transferability of cost-effectiveness data include variation in clinical practice patterns and variation in the prices of health care resources.
The applicability of clinical data to populations other than those studied was previously discussed in our User Guide on therapy or prevention [41]. To assess whether patients in your setting can expect the same health outcomes, you must examine two factors: (1) are the patients in the study similar to your patients?; (2) is the clinical management of the study patients similar to your local practice? If your patients meet the inclusion and exclusion criteria of the primary article(s) for effectiveness used in the economic evaluation, then there is little difficulty in passing judgement that the patients are indeed similar. In many circumstances your patients may not be a perfect replicate of the study population, and you should then proceed by considering whether there are reasons to suppose your patients will respond differently to treatment than those included in the study. If the analysis is based on patients different from yours, check the sub-group and sensitivity analyses to see if relevant clinical variables were examined to permit extrapolation to your patients. Note that both of our economic studies used effectiveness data from the GUSTO trial [1] which was a large, simple trial where the inclusion and exclusion criteria were sufficiently broad that patients likely reflect the mix of those suffering an acute MI in many local settings.
Next, determine if the intervention is, or would be, used in the same way in your community. Local deviation from the observed patient management in the trial can have implications for generalizing both costs and outcomes from the study to the local setting. With respect to outcomes the key question is whether practice differs with respect to factors that will influence the magnitude of the treatment effect. First, let us consider whether these data apply to non-study hospitals in the US. Kalish et al [3] doubt whether the efficacy data from the GUSTO trial are good predictors of effectiveness in routine practice:
"It has been questioned whether the results achieved in the GUSTO trial are possible in actual practice, largely due to the small time delay between symptom onset and treatment in this trial [28] [42]. The benefit of tPA in the GUSTO trial was seen primarily among patients treated within four hours of symptom onset [1], and the majority of patients who have acute MI in the United States are not treated within four hours [43]." (p.325)
Another issue is whether the GUSTO efficacy data are applicable to centres outside the US. The GUSTO trial enrolled patients from 15 different countries; the majority of these patients (56%) were recruited from the United States. US patients were managed differently from non-US patients in a number of ways, including greater use of invasive revascularization such as PTCA and CABG, and greater use of non-protocol medications such as antiarrhythmics and calcium antagonists [44]. Statistical analysis by logistic regression reveals that, although mortality reduction with accelerated tPA versus streptokinase was greater in the US (1.2% absolute decrease versus 0.7% elsewhere), the test for treatment-by-country interaction against streptokinase was not significant (p=0.30). In other words, if the truth were that there was no difference between the U.S. and other countries, differences as great as or greater than 1.2% versus 0.7% would be found in 30% of similar trials. Thus, while the results do not exclude a difference in effect between countries, neither do they provide substantial support for this hypothesis.
In considering the transferability of cost (and cost-effectiveness) estimates between jurisdictions remember that the cost of a treatment is the summation of the product of physical resources consumed (e.g., drugs, tests) and their unit prices. Cost data may not transfer well between jurisdictions for two reasons: (1) clinical practice patterns vary in such a way that resource consumption associated with the treatment differs from that reported in the study; (2) local prices for resources differ from those used in the study. To address these points a good economic evaluation should report resource use and prices separately so that a reader can ascertain whether practice patterns and prices apply to their jurisdiction. The economic analysis by Mark et al [2] gives detailed reporting of resources and prices so the reader can judge whether, for example, the 73% rate of cardiac catheterization, 31% rate of PTCA and 13% rate of CABG are applicable to their institution.
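Reporting resource use and prices separately lets a reader re-cost the same practice pattern with local prices. The sketch below shows the idea. The procedure rates are those quoted above from Mark et al and the 8-day stay is the mean quoted earlier, but every unit price is a made-up placeholder and the function name is hypothetical.

```python
def expected_cost(resource_use, unit_prices):
    """Sum of quantity times unit price over all resources.

    resource_use -- mean quantity of each resource per patient
    unit_prices  -- price per unit of each resource, in local currency
    """
    missing = set(resource_use) - set(unit_prices)
    if missing:
        raise KeyError(f"no price given for: {sorted(missing)}")
    return sum(qty * unit_prices[name] for name, qty in resource_use.items())


# Per-patient resource use: procedure rates and mean length of stay from the text.
STUDY_RESOURCE_USE = {
    "cardiac_catheterization": 0.73,
    "ptca": 0.31,
    "cabg": 0.13,
    "hospital_day": 8,
}

# Hypothetical local unit prices; replace with figures from your own institution.
LOCAL_PRICES = {
    "cardiac_catheterization": 2_000,
    "ptca": 6_000,
    "cabg": 20_000,
    "hospital_day": 900,
}

if __name__ == "__main__":
    print(round(expected_cost(STUDY_RESOURCE_USE, LOCAL_PRICES)))
```

If local practice patterns also differ, the quantities would need replacing as well as the prices.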
As previously noted, the GUSTO economic analysis is undertaken only on the US patients from the multi-national trial, and the intensity of resource use was lower in other countries. Such resource use differences reflect a number of factors including availability of resources and financial incentives to health care providers. For example, the length of hospital stay was significantly lower in US hospitals than non-US hospitals (8 vs 10 days; p < 0.001) despite a greater incidence of complications among US patients. This difference likely reflects downward pressure exerted on length of stay in the US by the prospective payment system to hospitals based on diagnosis related groups.
Variation in the prices of health care resources can threaten the validity of cross-jurisdictional inferences about cost-effectiveness. The problem is not due to variation in overall price levels between countries but variation in the price of one health care input relative to another (i.e., relative prices). For example, in a cost-effectiveness study of misoprostol as prophylaxis against gastrointestinal events in persons taking NSAIDs for arthritis, Drummond et al [45] found that among four countries compared, the price of misoprostol was highest in the US but, surprisingly, the cost-effectiveness analysis was most favourable in the US, indicating that prophylaxis was actually cost-reducing. This result is explained largely by different prices for health care resources because the use of misoprostol reduced the risk of surgery, the relative price of which was highest in the US.
The results of the GUSTO economic analysis [2] are clearly dependent upon the relative prices of tPA and streptokinase, and furthermore we know that these relative drug prices vary between countries. For example, if the drug costs were those typical in Europe (approximately $1,000 for 100 mg of tPA and $200 for 1.5 million units of streptokinase), the cost-effectiveness ratio would be $13,943 per year of life saved.
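You can approximate this kind of re-pricing yourself by rescaling the baseline ratio in proportion to the new incremental cost. The sketch below assumes that only the two drug prices change and that the other health care costs, the discounting, and the life expectancy gain stay as in the base case; the function name is hypothetical. With the European prices it gives a figure just under $14,000, close to but not exactly the published $13,943.

```python
BASE_RATIO = 32_678                      # dollars per life year, Mark et al. baseline
BASE_INCREMENTAL_COST = 2_845            # undiscounted one-year difference, US prices
OTHER_CARE_DIFFERENCE = 24_990 - 24_575  # non-drug cost difference, tPA minus streptokinase


def repriced_ratio(tpa_price, streptokinase_price):
    """Rescale the baseline ratio for a different pair of drug prices.

    Assumes everything except the two drug prices is unchanged.
    """
    new_incremental_cost = (tpa_price - streptokinase_price) + OTHER_CARE_DIFFERENCE
    return BASE_RATIO * new_incremental_cost / BASE_INCREMENTAL_COST


if __name__ == "__main__":
    # Approximate European prices quoted in the text.
    print(round(repriced_ratio(1_000, 200)))
```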
Finally, countries may differ with respect to the value they place on health benefits versus other commodities. There is no reason why $50,000 per life-year as an acceptable cost-effectiveness threshold for the US is applicable to, say, a less-industrialized country where the opportunity cost of such resources will be much higher. Countries vary in their willingness to pay for health and health care.
Returning to our scenario and referring to the framework in Figure 1, both tPA cost-effectiveness studies indicate that tPA is not dominant over streptokinase but falls into category 7, implying that a trade-off between increased effectiveness at increased cost needs to be resolved. Since the effectiveness, resource use and price data are applicable to your hospital, you inform the committee that the analyses you have reviewed can help inform their decision but they must make the choice and decide what cost-effectiveness threshold is acceptable. You help frame this choice as one of local opportunity cost: by diverting resources to tPA, what health benefits will be forgone from other treatments or programs no longer funded?
The committee decides that universal use of tPA in all MI cases will be very costly and divert resources from other health-producing programs in the hospital (although the benefits of these programs have not been as clearly documented as the new program!). They decide that tPA should be used selectively based upon the cost-effectiveness evidence in Table 3 and adopting the cutpoint of $50,000 per life year suggested by Mark et al [2]. The committee decides that the preferred clinical strategy in their hospital is streptokinase in patients aged less than 60 years with an inferior infarct and patients aged 40 years or less with an anterior infarct; all other patients would receive tPA.
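The committee's rule is a threshold comparison applied subgroup by subgroup, which can be sketched as follows. Only the two extreme subgroup values quoted earlier from Table 3 are used; the function name is hypothetical, and treating a ratio exactly at the threshold as acceptable is our assumption.

```python
THRESHOLD = 50_000  # dollars per life year, the cutpoint suggested by Mark et al.


def preferred_agent(cost_per_life_year, threshold=THRESHOLD):
    """Choose tPA when its incremental ratio is at or below the threshold."""
    return "tPA" if cost_per_life_year <= threshold else "streptokinase"


if __name__ == "__main__":
    # The two extreme subgroups quoted from Table 3.
    print(preferred_agent(203_071))  # aged 40 years or less, inferior infarct
    print(preferred_agent(13_410))   # aged 75 years or more, anterior infarct
```

A different threshold, reflecting a different local opportunity cost, would shift the boundary between the two groups of patients.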
© 2001 Evidence-Based Medicine Informatics Project
© 2001 Centre for Health Evidence.