Tuesday, May 23, 2017

REMOTE Redux: DTP trials are still hard

Maybe those pesky sites are good for something after all. 

It's been six years since Pfizer boldly announced the launch of its "clinical trial in a box". The REMOTE trial was designed to be entirely online, and involved no research sites: study information and consent were delivered via the web, and medications and diaries were shipped directly to patients' homes.

Despite the initial fanfare, within a month the enrollment target in REMOTE's registration on ClinicalTrials.gov was quietly reduced from 600 to 283. The smaller trial ended not with a bang but a whimper, having randomized only 18 patients in over a year of recruiting.

Still, the allure of direct-to-patient clinical trials remains strong, due to a confluence of two factors. The first is a frenzy of interest in running "patient centric clinical trials". Sponsors are scrambling to do something – anything – to show they have shifted to a patient-centered mindset. We cannot seem to agree on what this means (as a great illustration of this, a recent article in Forbes on "How Patients Are Changing Clinical Trials" contained no specific examples of actual trials that had been changed by patients), but running a trial that directly engages patients wherever they are seems like it could work.

The less-openly-discussed other factor leading to interest in these DIY trials is sponsors' continuing willingness to heap almost all of the blame for slow-moving studies onto their research sites. If it’s all the sites’ fault – the reasoning goes – then cutting them out of the process should result in trials that are both faster and cheaper. (There are reasons to be skeptical about this, as I have discussed in the past, but the desire to drop all those pesky sites is palpable.)

However, while a few proof-of-concept studies have been done, there doesn't seem to have been another attempt at a full-blown direct-to-patient clinical trial. Other pilots have been more successful, but had fairly lightweight protocols. For all its problems, REMOTE was a seriously ambitious project that attempted to package a full-blown interventional clinical trial, not an observational study.

In this context, it's great to see published results of the TAPIR Trial in vasculitis, which as far as I can tell is the first real attempt to run a DIY trial of a similar magnitude to REMOTE.

TAPIR was actually two parallel trials, identical in every respect except for their sites: one trial used a traditional group of 8 sites, while the other was virtual and recruited patients from anywhere in the country. So this was a real-time, head-to-head assessment of site performance.

And the results after a full two years of active enrollment?
  • Traditional sites: 49 enrolled
  • Patient centric: 10 enrolled
Even though six years have passed, and online/mobile communications are even more ubiquitous, we still see the exact same struggle to enroll patients.

Maybe it’s time to stop blaming the sites? To be fair, they didn’t exactly set the world on fire – and I’m guessing the total cost of activating the 8 sites significantly exceeded the costs of setting up the virtual recruitment and patient logistics. But still, the site-less, “patient centric” approach once again came up astonishingly short.


Krischer J, Cronholm PF, Burroughs C, McAlear CA, Borchin R, Easley E, Davis T, Kullman J, Carette S, Khalidi N, Koening C, Langford CA, Monach P, Moreland L, Pagnoux C, Specks U, Sreih AG, Ytterberg S, Merkel PA, & Vasculitis Clinical Research Consortium. (2017). Experience With Direct-to-Patient Recruitment for Enrollment Into a Clinical Trial in a Rare Disease: A Web-Based Study. Journal of Medical Internet Research, 19 (2). PMID: 28246067

Thursday, March 30, 2017

Retention metrics, simplified

[Originally posted on First Patient In]

In my experience, most clinical trials do not suffer from significant retention issues. This is a testament to the collaborative good will of most patients who consent to participate, and to the patient-first attitude of most research coordinators.

However, in many trials – especially those that last more than a year – the question of whether there is a retention issue will come up at some point while the trial’s still going. This is often associated with a jump in early terminations, which can occur as the first cohort of enrollees has been in the trial for a while.

It’s a good question to ask midstream: are we on course to have as many patients fully complete the trial as we’d originally anticipated?

However, the way we go about answering the question is often flawed and confusing. Here’s an example: a sponsor came to us with what they thought was a higher rate of early terminations than expected. The main problem? They weren't actually sure.

Here’s their data. Can you tell?

Original retention graph.
If you can, please let me know how! While this chart is remarkably ... full of numbers, it provides no actual insight into when patients are dropping out, and no way that I can tell to project eventual total retention.

In addition, measuring the “retention rate” as a simple ratio of active to terminated patients will not provide an accurate benchmark until the trial is almost over. Here's why: patients tend to drop out later in a trial, so as long as you’re enrolling new patients, your retention rate will be artificially high. When enrollment ends, your retention rate will appear to drop rapidly – but this is only because of the artificial lift you had earlier.
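To make the artificial lift concrete, here is a minimal sketch of a made-up trial. Every number in it is an assumption chosen for illustration, not data from the sponsor's study: 10 patients enroll each month for 12 months, and 20% of each monthly cohort drops out after 9 months in the trial. The function name and parameters are hypothetical.

```python
def naive_retention_by_month(enroll_months=12, per_month=10,
                             dropout_month=9, dropout_fraction=0.2,
                             horizon=24):
    """Naive retention (share of enrolled patients not yet terminated)
    at the end of each calendar month, for a hypothetical trial."""
    rates = []
    for month in range(1, horizon + 1):
        # Enrollment stops growing once the enrollment period ends.
        enrolled = per_month * min(month, enroll_months)
        # Only cohorts that have been in the trial for at least
        # `dropout_month` months have had their dropouts yet.
        cohorts_past_dropout = max(0, min(month - dropout_month, enroll_months))
        terminated = cohorts_past_dropout * per_month * dropout_fraction
        rates.append(1 - terminated / enrolled)
    return rates


if __name__ == "__main__":
    for month, rate in enumerate(naive_retention_by_month(), start=1):
        print(f"month {month:2d}: naive retention {rate:.1%}")
```

In this toy setup the true eventual retention is 80% from the start, but the naive ratio stays well above that while new patients are still arriving, and only sinks to 80% several months after enrollment closes.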

In fact, that was exactly the problem the sponsor had: when enrollment ended, the retention rate started dropping. It’s good to be concerned, but it’s also important to know how to answer the question.

Fortunately, there is a very simple way to get a clear answer in most cases – one that’s probably already in use by your biostats team around the corner: the Kaplan-Meier “survival” curve.

Here is the same study data, but patient retention is simply depicted as a K-M graph. The key difference is that instead of calendar dates, we used the relative measure of time in the trial for each patient. That way we can easily spot where the trends are.


In this case, we were able to establish quickly that patient drop-outs were increasing at a relatively small constant rate, with a higher percentage of drops coinciding with the one-year study visit. Most importantly, we were able to very accurately predict the eventual number of patients who would complete the trial. And it only took one graph!
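For readers who want to try this on their own data, the K-M retention estimate described above can be sketched in a few lines. This is a generic textbook version of the estimator, not the analysis we ran for the sponsor. The function name and input layout are assumptions: each patient is a pair of (months in trial, whether they terminated early), and patients who are still active or who completed the study are treated as censored at their last observed time.

```python
from collections import Counter


def kaplan_meier(observations):
    """Kaplan-Meier retention estimate.

    observations: list of (months_in_trial, dropped_out) pairs, where
    dropped_out is True for an early termination and False for a patient
    who is still active or has completed (censored).
    Returns a list of (time, estimated_retention) at each dropout time.
    """
    drops = Counter(t for t, dropped in observations if dropped)
    exits = Counter(t for t, _ in observations)
    at_risk = len(observations)  # patients still followed just before time t
    retention = 1.0
    curve = []
    for t in sorted(exits):
        if drops[t]:
            # Multiply by the share of at-risk patients who did not drop at t.
            retention *= 1 - drops[t] / at_risk
            curve.append((t, retention))
        # Everyone whose follow-up ends at t leaves the risk set.
        at_risk -= exits[t]
    return curve


if __name__ == "__main__":
    # Hypothetical patients, for illustration only.
    patients = [(2, True), (3, False), (6, True), (6, False),
                (9, False), (12, True), (12, False), (12, False)]
    for time, estimate in kaplan_meier(patients):
        print(f"month {time:2d}: estimated retention {estimate:.3f}")
```

Because each patient is measured on time in the trial rather than calendar date, late enrollers simply contribute less follow-up instead of inflating the estimate.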




Saturday, March 18, 2017

The Streetlight Effect and 505(b)(2) approvals

It is a surprisingly common peril among analysts: we don’t have the data to answer the question we’re interested in, so we answer a related question where we do have data. Unfortunately, the new answer turns out to shed no light on the original interesting question.

This is sometimes referred to as the Streetlight Effect – a phenomenon aptly illustrated by Mutt and Jeff over half a century ago:


This is the situation that the Tufts Center for the Study of Drug Development seems to have gotten itself into in its latest "Impact Report".  It’s worth walking through the process of how an interesting question ends up in an uninteresting answer.

So, here’s an interesting question:
My company owns a drug that may be approvable through FDA’s 505(b)(2) pathway. What is the estimated time and cost difference between pursuing 505(b)(2) approval and conventional approval?
That’s "interesting", I suppose I should add, for a certain subset of folks working in drug development and commercialization. It’s only interesting to that peculiar niche, but for those people I suspect it’s extremely interesting - because it is a real situation that a drug company may find itself in, and there are concrete consequences to the decision.

Unfortunately, this is also a really difficult question to answer. As phrased, you'd almost need a randomized trial to answer it. Let’s create a version which is less interesting but easier to answer:
What are the overall development time and cost differences between drugs seeking approval via 505(b)(2) and conventional pathways?
This is much easier to answer, as pharmaceutical companies could look back on development times and costs of all their compounds, and directly compare the different types. It is, however, a much less useful question. Many new drugs are simply not eligible for 505(b)(2) approval. If those drugs are substantially different in any way (riskier, more novel, etc.), then they will change the comparison in highly non-useful ways. In fact, in 2014, only 1 drug classified as a New Molecular Entity (NME) went through 505(b)(2) approval, versus 32 that went through conventional approval. And in fact, there are many qualities that set 505(b)(2) drugs apart.

Extreme qualitative differences of 505(b)(2) drugs.
Source: Thomson Reuters analysis via RAPS

So we’re likely to get a lot of confounding factors in our comparison, and it’s unclear how the answer would (or should) guide us if we were truly trying to decide which route to take for a particular new drug. It might help us if we were trying to evaluate a large-scale shift to prioritizing 505(b)(2) eligible drugs, however.

Unfortunately, even this question is apparently too difficult to answer. Instead, the Tufts CSDD chose to ask and answer yet another variant:
What is the difference in time that it takes the FDA for its internal review process between 505(b)(2) and conventionally-approved drugs?
This question has the supreme virtue of being answerable. In fact, I believe that all of the data you’d need is contained within the approval letter that FDA publishes for each new approved drug.

But at the same time, it isn’t a particularly interesting question anymore. The promise of the 505(b)(2) pathway is that it should reduce total development time and cost, but on both those dimensions, the report appears to fall flat.
  • Cost: This analysis says nothing about reduced costs – those savings would mostly come in the form of fewer clinical trials, and this focuses entirely on the FDA review process.
  • Time: FDA review and approval is only a fraction of a drug’s journey from patent to market. In fact, it often takes up less than 10% of the time from initial IND to approval. So any differences in approval times will likely be overshadowed by differences in time spent in development.
But even more fundamentally, the problem here is that this study gives the appearance of providing an answer to our original question, but in fact is entirely uninformative in this regard. The accompanying press release states:
The 505(b)(2) approval pathway for new drug applications in the United States, aimed at avoiding unnecessary duplication of studies performed on a previously approved drug, has not led to shorter approval times.
This is more than a bit misleading. The 505(b)(2) statute does not in any way address approval timelines – that’s not its intent. So showing that it hasn’t led to shorter approval times is less of an insight than it is a natural consequence of the law as written.

Most importantly, showing that 505(b)(2) drugs had a longer average approval time than conventionally-approved drugs should in no way be interpreted as evidence that those drugs were slowed down by the 505(b)(2) process itself. Because 505(b)(2) drugs are qualitatively different from other new molecules, this study can’t claim that they would have been developed faster had their owners initially chosen to go the route of conventional approval. In fact, such a decision might have resulted in both increased time in trials and increased approval time.

This study simply is not designed to provide an answer to the truly interesting underlying question.

[Disclosure: the above review is based entirely on a CSDD press release and summary page. The actual report costs $125, which is well in excess of this blog’s expense limit. It is entirely possible that the report itself contains more-informative insights, and I’ll happily update this post if that should come to my attention.]

Wednesday, February 22, 2017

Establishing efficacy - without humans?

The decade following passage of FDAAA has been one of easing standards for drug approvals in the US, most notably with the advent of “breakthrough” designation created by FDASIA in 2012 and the 21st Century Cures Act in 2016.

Although, as of this writing, there is no nominee for FDA Commissioner, it appears to be safe to say that the current administration intends to accelerate the pace of deregulation, mostly through further lowering of approval requirements. In fact, some of the leading contenders for the position are on record as supporting a return to pre-Kefauver-Harris days, when drug efficacy was not even considered for approval.
Build a better mouse model, and pharma will beat a path to your door - no laws needed.

In this context, it is at least refreshing to read a proposal to increase efficacy standards. This comes from two bioethicists at McGill University, who make the somewhat-startling case for a higher degree of efficacy evaluation before a drug begins any testing in humans.
We contend that a lack of emphasis on evidence for the efficacy of drug candidates is all too common in decisions about whether an experimental medicine can be tested in humans. We call for infrastructure, resources and better methods to rigorously evaluate the clinical promise of new interventions before testing them on humans for the first time.
The authors propose some sort of centralized clearinghouse to evaluate efficacy more rigorously. It is unclear what they envision as this new multispecialty review body’s standards for green-lighting a drug to enter human testing. Instead they propose three questions:
  • What is the likelihood that the drug will prove clinically useful?
  • Assume the drug works in humans. What is the likelihood of observing the preclinical results?
  • Assume the drug does not work in humans. What is the likelihood of observing the preclinical results?
These seem like reasonable questions, I suppose – and are likely questions that are already being asked of preclinical data. They certainly do not rise to the level of providing a clear standard for regulatory approval, though perhaps it’s a reasonable place to start.
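One way to read the three questions, though the authors may not intend it this formally, is as the three inputs to Bayes' rule: a prior probability that the drug is clinically useful, and the likelihood of the preclinical results under each of the two assumptions. A minimal sketch follows; the function name and every number in it are hypothetical, chosen only to show the mechanics.

```python
def posterior_probability_of_benefit(prior, p_data_if_works, p_data_if_not):
    """Bayes' rule: probability the drug works, given the preclinical results.

    prior            -- question 1: likelihood the drug proves clinically useful
    p_data_if_works  -- question 2: likelihood of the results if it works
    p_data_if_not    -- question 3: likelihood of the results if it does not
    """
    numerator = prior * p_data_if_works
    evidence = numerator + (1 - prior) * p_data_if_not
    return numerator / evidence


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    informative = posterior_probability_of_benefit(0.10, 0.80, 0.40)
    uninformative = posterior_probability_of_benefit(0.10, 0.50, 0.50)
    print(f"results twice as likely if the drug works: {informative:.3f}")
    print(f"results equally likely either way:         {uninformative:.3f}")
```

The sketch also shows why the answers to the second and third questions matter so much: when the preclinical results are about as likely whether or not the drug works, the posterior stays at the prior, no matter how carefully the review is conducted.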

The most obvious counterargument here is one that the authors curiously don’t pick up on at all: if we had the ability to accurately (or even semiaccurately) predict efficacy preclinically, pharma sponsors would already be doing it. The comment notes: “More-thorough assessments of clinical potential before trials begin could lower failure rates and drug-development costs.” And it’s hard not to agree: every pharmaceutical company would love to have even an incrementally-better sense of whether their early pipeline drugs will be shown to work as hoped.

The authors note:
Commercial interests cannot be trusted to ensure that human trials are launched only when the case for clinical potential is robust. We believe that many FIH studies are launched on the basis of flimsy, underscrutinized evidence.
However, they do not produce any evidence that industry is in any way deliberately underperforming their preclinical work, merely that preclinical efficacy is often difficult to reproduce and is poorly correlated with drug performance in humans.

Pharmaceutical companies have many times more candidate compounds than they can possibly afford to put into clinical trials. Figuring out how to lower failure rates – or at least the total cost of failure - is a prominent industry obsession, and efficacy remains the largest source of late-stage trial failure. This quest to “fail faster” has resulted in larger and more expensive phase 2 trials, and even in increased efficacy testing in some phase 1 trials. And we do this not because of regulatory pressure, but because of hopes that these efforts will save overall costs. So it seems beyond probable that companies would immediately invest more in preclinical efficacy testing, if such testing could be shown to have any real predictive power. But generally speaking, it does not.

As a general rule, we don’t need regulations that are firmly aligned with market incentives; we need regulations if and when we think those incentives might run counter to the general good. In this case, there are already incredibly strong market incentives to improve preclinical assessments. Where companies have attempted to do something with limited success, it would seem quixotic to think that regulatory fiat will accomplish more.

(One further point. The authors try to link the need for preclinical efficacy testing to the 2016 Bial tragedy. This seems incredibly tenuous: the authors speculate that perhaps trial participants would not have been harmed and killed if Bial had been required to produce more evidence of BIA 10-2474’s clinical efficacy before embarking on their phase 1 trials. But that would have been entirely coincidental in this case: if the drug had in fact had more evidence of therapeutic promise, the tragedy still would have happened, because it had nothing at all to do with the drug’s efficacy.

This is to some extent a minor nitpick, since the argument in favor of earlier efficacy testing does not depend on a link to Bial. However, I bring it up because a) the authors dedicate the first four paragraphs of their comment to the link, and b) there appears to be a minor trend of using the death and injuries of that trial to justify an array of otherwise-unrelated initiatives. This seems like a trend we should discourage.)

[Update 2/23: I posted this last night, not realizing that only a few hours earlier, John LaMattina had published on this same article. His take is similar to mine, in that he is suspicious of the idea that pharmaceutical companies would knowingly push ineffective drugs up their pipeline.]

Kimmelman, J., & Federico, C. (2017). Consider drug efficacy before first-in-human trials. Nature, 542 (7639), 25-27. DOI: 10.1038/542025a

Tuesday, February 7, 2017

Jerry Matczak

Jerry Matczak passed away suddenly last Thursday at the much-too-young age of 54.

I can say, without exaggeration, that Jerry embodied pretty much everything I aspire to be in my professional life. The MedCityNews headline called him a “social media guru”, but in reality he was temperamentally the exact opposite of a "guru":

He was constantly curious; it seemed that every conversation I had with him was composed mainly of questions. Many of us try to be “listen first, talk second” types, but Jerry was a “listen first, ask questions, listen some more, then talk” type.

He also never stopped trying to figure out how to improve whatever he was working on. He participated in a lot of pilot projects, which means he was a part of a lot of projects that didn’t meet their objectives – but I never witnessed Jerry being the least bit negative or frustrated. Every project was just another opportunity to learn more.

Mostly, though, Jerry was remarkable in his ability to connect with patients, even patients who were deeply distrustful of his employer and industry. If nothing else, I hope you read the words of two such patients, coming from very different places, with remarkably similar reactions to Jerry:


Jerry, thank you for your service and your example. I carry it with me.