Interview: Unlearn.AI - AI in the Life Sciences 2020

Charles Fisher,

Founder and CEO,

Unlearn.AI

"You can cut the number of patients by half, because you are running a trial in which every single patient who enrolls receives the experimental therapy, and you are using a computer simulation to provide the control group."

What are digital twins and what are the ways in which Unlearn.AI is using this concept to address weaknesses in how clinical trials are conducted?

A digital twin is a concept that is used all the time in engineering and other kinds of applications. The idea applied in engineering is that if you are building a device, you would simultaneously create a computer simulation of that device, where you could simulate different conditions. If you are building an engine, you might want to know what would happen if you were to rev the engine up to a very high RPM. You would want to be able to do experiments on the computer that you would not want to do on the physical device, because you risk damaging it. The idea Unlearn has is very similar. When a patient enrolls in a clinical trial, we create a computer simulation of that patient. That enables us to ask what would happen to a particular patient if they were to receive a placebo. Every clinical trial is a comparison between an outcome that would happen if a patient receives a new treatment and an alternative outcome that would happen if the patient were to receive a placebo or some other control. What we do is take the patient, give them the experimental treatment, and observe what happens. Then we take the digital twin of that patient and we predict what would have happened if that patient had received a placebo or control. Then we can compare the observed and predicted potential outcomes to determine if that treatment was effective for that particular patient. This enables us to run clinical trials that require fewer people to be randomized to control groups.
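The comparison described in this answer can be illustrated with a minimal sketch. It is not Unlearn's method or code: the function name is hypothetical, the numbers are made up for illustration, and it assumes the digital twin supplies a single predicted control outcome for each treated patient.

```python
from statistics import mean


def estimated_effects(observed_treated, predicted_control):
    """Per-patient difference: observed outcome on treatment minus the
    twin-predicted outcome under placebo or control (hypothetical helper)."""
    if len(observed_treated) != len(predicted_control):
        raise ValueError("each treated patient needs exactly one twin prediction")
    return [obs - pred for obs, pred in zip(observed_treated, predicted_control)]


# Made-up illustrative scores (not trial data); lower means less decline.
observed = [2.0, 1.5, 3.0, 2.5]        # what happened on the experimental therapy
twin_predicted = [3.0, 2.5, 3.5, 4.0]  # what the twin predicts under placebo

effects = estimated_effects(observed, twin_predicted)
average_effect = mean(effects)
print(effects)         # [-1.0, -1.0, -0.5, -1.5]
print(average_effect)  # -1.0
```

A real analysis would also have to account for the uncertainty in each prediction, which this sketch ignores.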

Is the data you acquire easily available? How are you going about building robust data sets?

In every data science problem, it always seems that we end up spending the majority of our time cleaning data. That is true at Unlearn as well. We started with a foundation of data taken from previously completed clinical trials, and there are a variety of partnerships that we have entered into with pharma companies and academic centers to be able to aggregate data from many different clinical trials. We did that because it is the highest quality healthcare data that you can get. It still, however, requires years of work to integrate these different trial data sets and to clean them. On top of that data, we leverage some data from electronic health records. Unlearn has taken a different approach to thinking about data than you often see in a lot of machine learning contexts: we are very focused on quality, and we think it is more important than quantity. For example, we throw out at least half of our patient data. We are trying to build very high-quality, regulatory-grade datasets, starting with data that was collected for clinical trial regulatory purposes, cleaning it up even more, and then basing our training on those kinds of data.

Are there any big regulatory hurdles Unlearn must overcome?

We have a very different pathway through the regulators than do many other AI companies that are operating in pharmaceuticals. If you are a company that is using AI to do drug discovery, you do not have any special exposure to regulators compared to a regular pharma or biotech company. The fact that you use AI is irrelevant, because you find the compound and then you run clinical trials on it, and the FDA only cares that the compound works. For us, it is very different. We are talking about actually using AI to guide regulatory decision making. In fact, two of our main stakeholders are FDA and EMA. We have been pretty active in conversing with FDA and we are planning to make a submission to EMA soon. There are guidance and frameworks out of FDA on how AI and software are regulated when it is a medical device that is being used for guiding treatment decisions, but there is nothing like that for AI or any kind of software that is being used in drug development.

How do you convince big pharma that results from your DiGenesis platform are accurate and capable of lowering the cost of clinical trials?

If you could simulate the placebo arm of a clinical trial and could do that perfectly, then you would only need half of the patients, because it would eliminate the need for a placebo arm. It comes down to evaluating the performance of these models on either historical data or on prospective clinical trials and then demonstrating with data that the models are working well. We have done a lot of different things: we have peer-reviewed publications in scientific journals, we are doing collaborations with pharma companies to evaluate our technology, and we have presented data to FDA on the validation of our approach in Alzheimer's disease.

A lot of what we do in terms of our business model involves pharmaceutical companies. We often start with a proof of concept project where we demonstrate how our models work and they can actually see results on their own data. If that project is successful, the work moves on to later-stage projects. We have also had a lot of success with smaller companies who are thinking about innovative ways to make their trials more efficient, because as a smaller company you are so much more sensitive to time and cost.

Why are neurodegenerative diseases a good fit for your platform?

When you see a machine learning approach being applied to something, it should be a really hard problem. You should not be taking an easy problem and applying machine learning to it. We focus not just on neurosciences, but on all complex chronic diseases, the kinds of diseases where you have many symptoms and you want to understand how all of those symptoms are changing over the course of time. We started on these neurodegenerative diseases because there's a huge unmet need. If you think about Alzheimer's disease, the last major approval was in 2003, and clinical trials now have thousands of patients. They take five to ten years and cost hundreds of millions of dollars. Something must be done to speed up research in the field so that we can start getting new medicines to patients who really need them. Our initial focus was on neurosciences, in part to simplify who we have to talk to. When we are talking to pharma companies, they are normally organized by disease area, so we are selling into one particular group, and then also at FDA, there is the office of neurosciences that we can primarily interface with. The approach we take is really disease nonspecific and entirely data driven. Once we have proven the concept in these initial diseases, that allows us to easily expand into new indications.

How have clinical trials been affected by the pandemic?

The Covid-19 pandemic has caused hundreds of trials to stop. In many indications, medical research has completely ground to a halt. Clinical trials have very low enrollment. People don't want to participate in them right now, particularly if they are in an elderly population that is at high risk from Covid-19. People who are in trials are dropping out, so the result has been a lot of trials stopped and a lot of trials with very low enrollment. Unlearn’s primary value proposition is the ability to run trials with fewer patients. One of the things that we are actually looking into is ways that we can apply this approach to help salvage some of these trials that have been negatively affected by low enrollment due to Covid-19. We are looking into whether or not we can boost their statistical power back up to the range that was initially planned.

How does Unlearn’s technology help with clinical trial diversity?

Clinical trials are designed to say what the efficacy of a new therapeutic is under ideal circumstances. The result is that we are typically looking at pretty homogeneous populations. For us, the main thing that we are thinking about in this area is how we can get diverse data into our training datasets to make sure that the algorithms that we are building will be accurate for many different kinds of people. That is not necessarily easy, because we start with clinical trial data and that data tends to be biased. What we have to do is find data from many different sources and try to put it together. We then track those kinds of metrics to make sure that we are covering different groups. It is also more than just saying we want to simulate a US population. Clinical trials are being run all over the world and we would like to be able to capture those differences between different parts of the world as well.

Who do you view as your competition?

We don't have any true competitors, because no one else is doing what we are doing - using novel machine learning approaches to generate Digital Twins, or virtual patients, to accelerate clinical trials. We could view the traditional randomized controlled trial (RCT) as our biggest competitor. This is when pharmaceutical companies run trials and there is no additional data that is brought to bear on the problem. That is the way things are done today. They are done in an isolated way, which is a terribly inefficient use of patient data. We want to be able to leverage all of that patient data to make better decisions. We are proposing a new kind of RCT that is more efficient and that is better because it leverages additional data.

What is your vision for the future of clinical trials?

The goals are to have clinical trials be very efficient, very ethical and to provide individual information that is helpful for specific patients to know who those treatments are going to be effective for. I think what we need to do in terms of clinical trials is have a system. This is the whole system, where all of the data is being brought through each individual patient to make inferences so that data is aggregated and cleaned up and re-used to make better decisions in the future. That is where I think things are going to go. We are seeing a huge transformation right now in clinical trials. The move to get Covid-19 drugs to market as quickly as possible has led to some interesting, innovative developments in the use of Bayesian methods, adaptive trials and platform trials where you can compare many different treatments at the same time. All of those different methods are going to lead to success and transformation in the way trials are going to be run in the future.

To what extent could Unlearn cut down on clinical trial time?

It depends a little bit on the types of characteristics that the sponsor wants their clinical trial to control. In the most ideal circumstance you can cut the number of patients by half, because you are running a trial in which every single patient who enrolls receives the experimental therapy, and you are using a computer simulation to provide the control group. With half as many patients, a trial is at least twice as fast. In some cases you can get an even better acceleration. This allows costs to be cut dramatically. If you are going to completely eliminate a control group, your trial will no longer have some properties that you would want it to have, so that is done frequently in earlier stage, phase two trials, or in cases where maybe it is unethical to have a placebo controlled trial. However, typically you are going to want to keep some patients randomized to the placebo arm, but maybe not equally. For every one patient that is randomized to placebo, five or six patients can be randomized to treatment. That keeps the trial randomized and enables you to rule out alternative explanations that statisticians call confounding. In general, you are talking about increasing the speed of trials on the order of probably 50 to 100%.
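The enrollment arithmetic behind these figures can be sketched in a few lines. This is a simplified illustration, not Unlearn's trial design method: the function names and the figure of 300 treated patients are hypothetical, the number of treated patients is held fixed across designs, and trial duration is assumed to be proportional to total enrollment (a constant recruitment rate).

```python
import math


def total_enrollment(n_treated, treated_per_control):
    """Total patients when `treated_per_control` patients are randomized to
    treatment for every one randomized to control. None means no concurrent
    control arm (the simulation provides the whole control group)."""
    if treated_per_control is None:
        return n_treated
    return n_treated + math.ceil(n_treated / treated_per_control)


def speedup_percent(baseline_total, new_total):
    """Percent increase in speed, assuming duration scales with enrollment."""
    return 100.0 * (baseline_total / new_total - 1.0)


n_treated = 300  # hypothetical number of patients who must receive the therapy

baseline = total_enrollment(n_treated, 1)       # traditional 1:1 trial -> 600
five_to_one = total_enrollment(n_treated, 5)    # 5:1 randomization -> 360
no_control = total_enrollment(n_treated, None)  # all patients treated -> 300

print(round(speedup_percent(baseline, five_to_one)))  # 67
print(round(speedup_percent(baseline, no_control)))   # 100
```

Under these assumptions, a 5:1 design comes out roughly 67% faster than a 1:1 trial, and removing the concurrent control arm entirely comes out 100% faster, which is in line with the 50 to 100% range quoted above.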
