CardioBrief: Diving Deep into the ORBITA Trial

— William Boden, Ajay Kirtane, and Dan Mark analyze the ORBITA trial

by Larry Husten, CardioBrief November 2, 2017


Editor's note: I asked a wide variety of cardiologists for their thoughts about ORBITA, presented at the TCT meeting in Denver and published simultaneously. Three of them, William Boden, Ajay Kirtane, and Dan Mark, sent highly detailed, lengthy comments about the trial. Here are their comments in full.


William Boden (VA Boston):

The ORBITA findings are incredible, really, when one considers that stenting a coronary artery with a mean 84% stenosis should impart an immediate clinical and physiologic benefit -- yet there was none observed across the board between the PCI and sham PCI groups: CCS angina class, SAQ scores, EQ-5D QOL [quality of life], all exercise indices, and Duke Treadmill scores were neutral, though the peak stress wall motion score index did improve in the PCI group as one would expect -- meaning that the PCI was technically successful.

The post-FAME-2 paper by Dickert addressed the need for a sham-controlled trial in stable CAD and cited how this was accomplished in the late 1950s with IMA ligations to improve angina. That paper highlighted many of the shortcomings of more recent unblinded trials in the PCI era (like COURAGE, BARI-2D, FAME-2), in which preconceived treatment biases led to assumptions of procedural treatment superiority and clinical benefit. Those biases likely affected the less rigorous, soft endpoints such as "unplanned revascularization" in FAME-2, which drove the composite primary endpoint in favor of FFR-guided PCI, and angina relief in all of these trials. This same limitation will apply to the large, ongoing ISCHEMIA Trial, which is also unblinded, but won't end until 2019. Both the Dickert paper and ORBITA underscore the powerful placebo/nocebo effect on both physicians and patients who undergo a procedure that is supposed to impart significant symptomatic improvement -- yet remarkably, ORBITA found no between-group differences.

The narrative in the aftermath of COURAGE was "COURAGE taught us nothing new," we were told repeatedly. "We've known for decades that PCI doesn't reduce death or MI in stable CAD, but that's not why we do PCI in these patients -- we do it to improve symptoms and quality of life, and PCI is better than OMT for angina relief. Plus, many patients don't like to take medications (or don't want to wait for medications to work), so PCI is a more effective treatment for angina, and that's what we can offer to our patients." This prevailing view has been firmly espoused by cardiology professional society guidelines internationally for the past 8 to 10 years.

So, now along comes ORBITA, which confronted this "angina superiority with PCI hypothesis" head-on with a sham-controlled trial, the results of which emphatically underscored the powerful placebo/nocebo effect of a PCI procedure that both physicians and patients assumed (and have for 40 years) would be of unquestioned benefit. These results are a "Back to the Future" requiem for an earlier era some 60 years ago when the sham-controlled IMA ligation studies of the late 1950's showed the unexpected findings of a lack of benefit of this procedure.

How will the results of ORBITA be viewed? It will be a combination of love and hate. ORBITA was rigorously designed and undertaken with great care and painstaking attention to detail using objective exercise and physiologic outcome measures before and after stabilization on OMT, combined with the use of well-validated quality of life metrics before and after randomization. Overall, the results were stunningly negative, which ORBITA supporters will cite. By contrast, it is very likely that many in the interventional community will be ready to pounce on and discredit this study -- there certainly hasn't been an opportunity since COURAGE was published 10 years ago in 2007 to potentially discredit a trial that now confronts the sacred cow of PCI benefit for angina relief as the sole basis to justify PCI in stable CAD patients. They will likely cite the limitations of small numbers (only 200 patients), that the study was woefully underpowered, the potential ethical conundrum of subjecting subjects with significant flow-limiting CAD to a sham procedure (or deferred PCI for clinical need), that 28%-32% of randomized subjects had either normal FFR or iFR (and therefore didn't have a "physiologically significant," or flow-limiting, stenosis that PCI would otherwise benefit), that there was a low frequency of multivessel CAD, that the short duration of follow-up (only 6 weeks) was too brief to assess potential benefit (though this actually favored the PCI group) and, of course, who would have the time or patience to call patients three times/week to assess their response to intensifying medical therapy -- "not real-world," just like the OMT used in COURAGE wasn't achievable in the real world.

Will ORBITA change practice? Likely not, because the narrative will likely be that rates of PCI for SIHD are declining, while more recent data from ACC/NCDR show that, based on AUC criteria, the percentage of patients undergoing "inappropriate PCI" (or the more recently sanitized term "rarely appropriate PCI") is likewise declining. But what isn't entirely clear at present is whether, in the post-AUC era, there has been an increase in "coding creep" to up-classify (or re-classify) SIHD patients with "stable angina" to "unstable angina" so as to blur these distinctions -- is it CCS Class 2 angina, or maybe is this Class 3 angina? How can such data be ascertained reliably?

Finally, I struggle with where the authors get the estimate that there are 500,000 PCI procedures performed annually on stable CAD patients worldwide. In the U.S. alone, I would estimate this to be more like 200,000-300,000/year though, as noted above, obtaining such accurate data may be difficult because of the inherent problem of voluntary self-reporting of cath lab and PCI data.

Ajay Kirtane (Columbia University):

To my mind, ORBITA is a mechanistic study that was incredibly hard to do, and the investigators should be commended for doing it (a position I have expressed previously). It does help to assess the short-term physiologic effects of PCI among functional patients with single-vessel coronary disease, further emphasizing an already existing movement within the interventional space "to treat the patient, and not just a coronary stenosis."

However, my genuine concern is that the study findings will be over-emphasized and extrapolated to patients who are more symptomatic, have a more severe extent of disease, and actually can derive more significant (and lasting) effects from coronary revascularization. Why do I say that? Because (despite the angiographic severity of coronary lesions randomized in the study) these patients still constitute a lower-risk cohort of patients, the likes of whom it is reasonable to take off the table and try medications first. Do all interventionalists do that? No -- and that is a lesson to be learned from the study. But I (and other thoughtful colleagues of mine) certainly do.

This assessment is based upon examination of several features of the study. Only patients with single vessel disease were included. Additionally, pre-randomization stress testing was essentially low-risk (as quantified by Duke Treadmill Score, exercise duration, VO2max, and even dobutamine stress echocardiography). Do you know what the average 65-year-old man's (mean age in ORBITA) exercise duration is? It is just over 8 minutes. Average VO2max for a 65-year-old man is in the mid-20s (the baseline in ORBITA, assuming 70 kg average weight). Even if one looks at the dobutamine stress echocardiography scores, the baseline WMSI scores are only 1.11, which is a low-risk result (scores of 1.1-1.7 are considered mild-moderately abnormal).

In ORBITA, the PCI group of this trial started out at an exercise duration of 8 minutes and 48 seconds at baseline. That is really good, at a MET level that many would argue obviates the need for even diagnostic catheterization. Even though PCI was able to normalize in-lab physiology assessed by FFR/iFR, significantly reduce ischemia by stress echocardiography, and even increase exercise duration compared to baseline (by 28.4 seconds, with 95% CI not overlapping 0), the overall clinical difference in the exercise duration between PCI and placebo was not detectably different (or likely clinically relevant). But that should not come as such a surprise, because at 6 weeks (and without undergoing exercise training) it is not necessarily easy to get even an average 65 year old's exercise duration up further when it starts at that level. Also note that the placebo-treated patients' exercise times were not detectably different from their baseline (with 95% confidence intervals overlapping 0). So no placebo effect in exercise duration was detected either -- a very important point.

The quality of life data tell a similar story. If one looks at the starting Seattle Angina Questionnaire scores, these patients -- with a mean duration of angina of 9 months -- had mild symptoms at the time of randomization. For example, anginal frequency scores in the range of 60-90 are consistent with "monthly" angina. ORBITA patients started out in the upper 70s, and were only assessed at 6 weeks (e.g., with possibly only one to two episodes of angina in the prior 6 weeks). One only needs to take a look at John Spertus' original SAQ validation manuscript to see that the ORBITA population represents a very different population of patients compared to that original study, or those who really are limited by their angina as assessed by a range of SAQ indices. So it's not surprising that there were no detectable differences between groups, although both groups did improve.

At the end of the day I don't want to be perceived as criticizing the study. I am not; I am just trying to provide context. This study was very hard to do, and the group of investigators is respected and thoughtful. The reason that they enrolled this type of patient population and even could only do a study of 6 weeks (rather than longer, which might have been even more helpful) is that they no doubt feared keeping these lesions unrevascularized for longer periods of time. I would also not doubt if the site ethics committees gave them a hard time with even this current study design regarding clinical equipoise. So that is a critical lesson to be learned of the study. I do think that placebo-controlled trials are important, and this is a start.

But we already know that especially when describing the symptom benefits of coronary revascularization, we need to be realistic and focus less upon how severe the angiographic lesion is, and more upon how the patient actually feels when deciding whether to offer coronary revascularization. In appropriate patients, coronary revascularization offers both "true" anti-anginal relief as well as some placebo-related effects. It is very interesting how the ORBITA data bookend with the FAME-2 results being presented here at TCT. In FAME-2 at 3 years, many of the patients randomized to medical therapy ended up with PCI anyway, and there was still a detectable difference between randomized groups when analyzed by intent-to-treat.

It may sound obvious, but if we as physicians focus on the patients who really need our help, we ultimately will do the most good. Also - I don't think this has implications for ISCHEMIA, where patients are randomized prior to cath.

Dan Mark (Duke University):

I am sure we are going to be debating and discussing this trial for some time to come. Let me emphasize first off that my comments are intended to stimulate discussion in the quest for greater understanding. I think we need to express congratulations to the investigators for completing a very high-quality, thoughtful and challenging trial without a deep-pocket commercial sponsor.

As I understand it, the investigators had three goals in doing this trial:

1. To demonstrate that a placebo/sham control for PCI therapy is possible in the context of a clinical trial – in this regard, the trial is a significant success and the authors are to be congratulated on doing what most considered outside the realm of the possible. Will this open up an era of a lot more sham-controlled procedural trials? I think we will need to take that under advisement since there are a lot of technical aspects to this that investigators and IRBs would have to learn to do it skillfully.
2. To demonstrate that PCI causes a placebo effect – I am not sure anyone would actually dispute that this was likely. The problem is understanding what distortions in estimated outcomes the placebo effect introduces. In this regard, it is disappointing that the trial was not significantly longer, since one of the ways that has been proposed to tell a placebo effect from other effects on patient outcomes is by its durability over time. Six weeks is not close to long enough to help with this question.
3. To define the "true efficacy" of PCI for symptom relief in angina patients. A few thoughts on this last objective.

In the paper, the ORBITA investigators start their abstract by pointing out that while we commonly observe angina patients getting symptom relief with PCI, this effect has not been proven in a blinded RCT. Perhaps so, but it's unclear that everything we learn in medicine can only be considered valid if it comes from a placebo-controlled RCT. I'd refocus this to say that PCI pretty clearly does improve angina in patients with obstructive CAD, but the benefits are not uniform across all CAD patients and we need to understand when PCI added to medicine/usual care can produce an important increment in patient QOL and decrease in suffering. For a variety of reasons, cardiologists over the last 40 years have gotten more focused on ischemia than angina, perhaps because of the finding that much ischemia can take place without symptoms and ischemia seems like a more proximate pathophysiologic cause of morbidity and mortality. Since ORBITA is dealing with a very low-risk population, there is no possibility of showing an effect on events and thus the ischemia issue is less relevant than angina/QOL. Relieving ischemia that does not have a measurable effect on QOL may not benefit the low-risk CAD patient in a way we can detect, particularly in a 200-patient trial.

Prior work has specifically addressed the issue of the effect size for QOL in comparative revascularization trials, in that case PCI versus CABG. The same general considerations apply here. The anginal symptom burden at presentation is a major determinant of how responsive patients are to therapies directed at angina relief. You cannot test angina relief therapy very effectively in a population with a low burden of angina, regardless of their ischemic results on stress testing or their FFR results. Most modern revascularization trials of stable cohorts enroll patients with a low angina burden, thus constricting the opportunity to demonstrate a benefit of a better therapy on angina. Asymptomatic ischemia on a stress test is not angina. Drug companies developing anti-anginal therapies know this, so for example the ivabradine [Corlanor] trial published by Borer, et al., (Circulation 2003) enrolled patients who had exercise-limiting angina on the treadmill accompanied by ST depression on the exercise ECG. The ranolazine [Ranexa] trial MARISA published in JACC 2004 also enrolled patients with angina-limited exercise on ETT who also had ST changes.

Many clinicians reading the PCI trials think that the Canadian Class data showing patients with Class I or greater angina establishes that these subjects will benefit from anti-anginal therapy, but in ORBITA there is an important disconnect between the CCS data (98% with CCS class II or III) and the SAQ AF scale, which shows relatively high baseline scores, suggesting a low angina burden at randomization. ORBITA also has longer (better) baseline ETT times than COURAGE, but times similar to ACME. Only about one-quarter of patients apparently had exercise-induced ST depression at baseline; and if they reported exercise-induced angina rates, I missed it. But the Duke Treadmill scores at baseline border on the edge of "low risk" so I presume at least that few subjects had exercise-limiting angina as in the drug trials mentioned above. Thus, from all these data one can suppose that the opportunity for a better antianginal therapy to demonstrate its superior qualities was constrained in this trial due to the lack of anything approaching the "refractory disabling angina" that led Gruentzig to develop PTCA 40 years ago.

The decision to optimize medical therapy before randomization beyond what patients were receiving from their doctors is, I presume, an attempt to forestall criticisms that the contest did not test the best that medical therapy had to offer. Whether the PCI arm might have required less medication in follow-up with a different study design approach is difficult to discern from the data thus far available.

The concept that PCI has some "true efficacy" relative to medical therapy is appealing conceptually but clearly oversimplifies the problem when you consider the number of complex variables at play that determine what effect sizes are seen in different trials. The trial-estimated efficacy of PCI relative to medical therapy is not a fixed thing of nature but rather varies contingent on a host of relevant details.

The choice of exercise time as the primary endpoint -- rather than a measure of angina like the SAQ -- is curious because exercise time has never shown itself to be a very sensitive measure of what QOL benefits PCI provides to patients. Its objectivity may seem in its favor, but against that, one must consider that since the study enrolled a population most of whom did not have exercise-induced ST depression and we presume relatively few had exercise-limiting angina, it's unclear why PCI would produce a big change in exercise time.

The investigators indicate that they powered their study to detect a 30-second incremental change in exercise time with PCI over placebo/medical therapy. In the early ACME Trial, at 6 months the medical therapy arm had a 30-second increase in treadmill time versus a 2.1-minute increase for PTCA. Even if we assume that a 30-second increase in exercise time represents something clinically important to patients, such a small average increment will be quite difficult to detect using standard statistical testing, given the amount of background noise that can alter the exercise time independent of treatment effect. In the power calculation in ORBITA, the investigators assumed a between-patient change score SD of 75 seconds. I do not see the data on the observed SD for the change score in the paper and it's unclear what this assumption is based on, but the 30-second effect size is about one-sixth of a SD using the baseline treadmill time data. That is a pretty small effect size to be searching for.
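To make the arithmetic behind this power discussion concrete, here is a minimal sketch in Python. It uses a normal approximation to a two-sided, two-sample comparison and is not the investigators' actual calculation. The 30-second difference and the 75-second change-score SD come from the comments above. The even split of 100 patients per arm, the 5% significance level, and the 180-second baseline SD (back-calculated from "about one-sixth of a SD") are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(delta, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test for a mean difference."""
    se = sd * sqrt(2.0 / n_per_group)            # standard error of the difference
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    shift = delta / se                           # true difference in SE units
    upper = 1.0 - NormalDist().cdf(z_crit - shift)
    lower = NormalDist().cdf(-z_crit - shift)    # negligible, kept for completeness
    return upper + lower


DELTA_S = 30.0               # postulated PCI-minus-placebo difference (from the text)
SD_CHANGE_S = 75.0           # assumed SD of the change score (from the text)
N_PER_GROUP = 100            # assumption: 200 patients split evenly
SD_BASELINE_S = DELTA_S * 6  # assumption: implied by "about one-sixth of a SD"

effect_design = DELTA_S / SD_CHANGE_S        # standardized effect under the design SD
effect_baseline = DELTA_S / SD_BASELINE_S    # standardized effect under the baseline SD
power_design = two_sample_power(DELTA_S, SD_CHANGE_S, N_PER_GROUP)
power_baseline_sd = two_sample_power(DELTA_S, SD_BASELINE_S, N_PER_GROUP)

print(f"effect size with design SD:   {effect_design:.2f}")
print(f"effect size with baseline SD: {effect_baseline:.2f}")
print(f"power with design SD:         {power_design:.2f}")
print(f"power with baseline SD:       {power_baseline_sd:.2f}")
```

Under these assumptions the design reaches roughly 80% power only if the change-score SD really is as small as 75 seconds. If the noise were closer to the baseline SD, the same 200 patients would give far less power, which is why the unreported observed SD matters.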

If we look at the pattern of effect size results in ORBITA, we see small incremental increases in exercise time, SAQ AF, and Duke Treadmill scores with PCI. The CIs show that the estimates all lack precision – that is, they cannot exclude the null hypothesis of a 0 effect size. But it would be incorrect to conclude, based on the p-values, that the null case is as likely as the observed effect sizes. The lack of similar trends in peak oxygen uptake, SAQ physical function scale and EQ5D is not at all surprising. These would be much less sensitive to a small treatment effect as was sought in this trial. EQ5D in particular is sensitive to only the very largest treatment effects seen in cardiovascular medicine and is not really a QOL measure but a generic health status measure.

So the ORBITA main results are a predictable function of two main factors: a small estimated incremental effect size of PCI (about half of the postulated benefit), likely due to the relatively low angina burden study population enrolled (little opportunity for PCI to do anything measurable in the QOL space), and a lack of precision in estimating the effect sizes (as reflected in the CIs) due to a small sample size.
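The point about imprecision versus "no effect" can be illustrated with a short Python sketch. The numbers here are illustrative and are not the trial's reported results. The estimate is set to exactly half of the postulated 30-second benefit, following "about half of the postulated benefit" above. The standard error is an assumption derived from the design values (75-second SD, 100 patients per arm), and a normal likelihood is assumed.

```python
from math import exp, sqrt
from statistics import NormalDist

POSTULATED_S = 30.0              # benefit the trial was powered to detect (from the text)
ESTIMATE_S = POSTULATED_S / 2    # assumption: "about half of the postulated benefit"
SE_S = 75.0 * sqrt(2.0 / 100)    # assumption: SE implied by the design SD and 100 per arm

# 95% confidence interval around the illustrative estimate
z = NormalDist().inv_cdf(0.975)
ci_low = ESTIMATE_S - z * SE_S
ci_high = ESTIMATE_S + z * SE_S


def relative_likelihood(effect):
    """Normal likelihood of `effect` relative to the likelihood at the point estimate."""
    return exp(-0.5 * ((ESTIMATE_S - effect) / SE_S) ** 2)


rl_null = relative_likelihood(0.0)                  # support for "no effect"
rl_postulated = relative_likelihood(POSTULATED_S)   # support for the full 30 seconds

print(f"95% CI: {ci_low:.1f} to {ci_high:.1f} seconds")
print(f"relative likelihood of a 0-second effect:  {rl_null:.2f}")
print(f"relative likelihood of a 30-second effect: {rl_postulated:.2f}")
```

With these illustrative inputs the interval covers both 0 and 30 seconds, and a zero effect is clearly less well supported than the point estimate. A result of this kind is imprecise rather than evidence that the null case and the observed effect are equally likely.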