Sam Charrington: Today we’re excited to continue the AI for the Benefit of Society series that we’ve partnered with Microsoft to bring you. In this episode, we’re joined by Peter Lee, Corporate Vice President at Microsoft Research responsible for the company’s healthcare initiatives. Peter and I met a few months ago at the Microsoft Ignite conference, where he gave me some really interesting takes on AI development in China. We reference those in the conversation, and you can find more on that topic in the show notes. This conversation centers on three impact areas that Peter sees for AI and healthcare, namely diagnostics and therapeutics, tools, and the future of precision medicine. We dig into some examples in each area, and Peter details the realities of applying machine learning and some of the impediments to rapid scale. Before diving in, I’d like to thank Microsoft for their support of the show and their sponsorship of this series. Microsoft is committed to ensuring the responsible development and use of AI and is empowering people around the world with this intelligent technology to help solve previously intractable societal challenges spanning sustainability, accessibility, and humanitarian action. Learn more about their plan at Microsoft.ai. Enjoy.

Sam Charrington: [00:02:18] All right, everyone. I am on the line with Peter Lee. Peter is a corporate vice president at Microsoft responsible for the company’s healthcare initiatives.

Peter, it is so great to speak with you again. Welcome to This Week in Machine Learning and AI.

Peter Lee: [00:00:14] Sam, it’s great to be here.

Sam Charrington: [00:00:17] Peter, you gave a really interesting presentation to a group I was with at Ignite about some of what Microsoft was working on at Microsoft Research, as well as a really interesting take on AI development in China. That kind of piqued my interest, and we ended up sitting down to chat about that in a little bit more detail. I covered that for my blog and newsletter, and I’ll be linking to it in the show notes, but we won’t be diving into it today. It was a really, really interesting take that I reflect on often, and I think it’s an interesting setup for diving into your background, because you do have a very interesting background and an interesting perspective and set of responsibilities at Microsoft.

On that note, can you share with our audience a little bit about your background?

Peter Lee: [00:01:11] Sure, Sam. I’d love to do that. I agree it is a little bit unusual, although I think the common thread throughout has been about research and trying to bring research into the real world. I’m a computer scientist by training. I was a professor of computer science at Carnegie Mellon for a long time, actually for 24 years, and at the end of my time there I was the head of the Computer Science Department. Then I went to Washington, D.C., to serve at an agency called DARPA, which is the Defense Advanced Research Projects Agency. That’s kind of the storied research agency that built the Saturn V booster technology, invented the ARPANET, which became the Internet, developed robotics, and lots and lots of other things. I learned a lot about bringing research to life there.

Then, after a couple of years there, I was recruited to Microsoft and joined Microsoft Research. I started out managing the mothership lab at headquarters in Redmond, then a little bit later all of the U.S. research labs, and then ultimately all of Microsoft’s 13 labs around the world. Right about that time, Steve Ballmer announced his retirement. Satya Nadella took over as the CEO. Harry Shum took over all of AI and research at Microsoft and became my boss. They asked me to start a new type of research organization internally. It’s called NExT, which stands for New Experiences and Technologies, and we’ve been trying to grow and incubate new research-powered businesses ever since, most recently in healthcare.

Sam Charrington: [00:03:04] I think when I think about AI and healthcare, there’s certainly a ton of ground to cover there, but I think one of the areas that gets a lot of attention of late is all the progress that’s being made around applying neural nets, CNNs in particular, to imagery. I’m wondering from your perspective, how do you tend to think about AI applied to the healthcare space and where the big opportunities are?

Peter Lee: [00:03:37] Yeah. When I think about AI and healthcare, I’m really optimistic about the future. Not that there aren’t huge, difficult problems, and sometimes things seem to go slower than you expect. It’s a little bit like watching grass grow. It does grow and things do happen, but sometimes it’s hard to see it. But over the last 15 years, the thing that I think is underappreciated is that the entire healthcare industry has gone digital. It was only 15 years ago that, for example, in the United States, less than 10% of physicians were recording your health history in a digital electronic health record. Now, we’re up over 95%, and that’s just an amazing transformation over 15 years. It’s not like we don’t still have problems: data is siloed, it’s not in standard formats. There’s all sorts of problems, but the fact that it’s gone digital just opens up huge, huge amounts of potential.

I kind of look at the potential for AI in three areas. One is the thing that you pointed out, which are AI technologies that actually lead to better diagnostics and therapeutics, things that actually advance medical science and medical technology. A second area for AI is in the area of tools, tools that actually make doctors better at what they do, make them happier while they’re doing it, and also improve the experience for you and me as patients or consumers of healthcare. Then the third area is in this wonderful future of precision medicine that’s taking new sources of information, digital information, your genome, your proteome, your immunome, data from your fitness wearables and so on and integrating all of that together to give you a complete picture of what’s going on with your body. Those are sort of three broad areas, and they’re all incredibly exciting right now.

Sam Charrington: [00:05:51] When you think about the first two of those categories, better diagnostics and therapeutics and tools, how do you distinguish them? It strikes me that giving doctors a better way to analyze medical imagery, for example, or to use that example again, is a tool that they can use, but when you say tools, what do you specifically mean?

Peter Lee: [00:06:14] Yeah. You’re absolutely right. There’s an overlap. It’s not like the boundaries between these things are all that hardened. But think about one problem that doctors have today: by some estimates in the United States, doctors are spending 40 to 50% of their workdays entering documentation, entering notes that record what happened in their encounters with patients. That’s sometimes called an encounter note. That documentation is actually required now by various rules and regulations. It’s an incredible source of burden.

In fact, I’m guessing you’ve had this experience, most people have. You go to your doctor, I go to mine, and I like her very much, but while I’m being examined by her, she’s not looking at me. She’s actually sitting at a PC, typing in the encounter notes. The reason she’s doing that is if she doesn’t do it while she’s examining me, she’ll have to do it for a couple of hours maybe in the evening, taking time away from her own family. That burden is credited or blamed for a rise in physician burnout. Well, AI technologies today are rapidly approaching the point where ambient intelligence can just observe and listen to a doctor-patient encounter and automate the vast majority of the burden of that required clinical note-taking.

That’s an example of the kind of technology that could in a really material way just improve the lives and the workday satisfaction of doctors and nurses. I put that in a different category than technologies that actually give you more precise diagnosis of what’s ailing you or ability to target therapies that might actually attack the very specific genetic makeup, let’s say, of a cancer that’s inhabiting your body right now.

Sam Charrington: [00:08:17] Got it. Got it. Maybe let’s take each of these categories in turn. I’d love to get a perspective from you on where you see the important developments coming from, from a research perspective, and where you see the opportunities and where you see things heading in each.

Peter Lee: [00:08:42] Sure. Well, why don’t we start with your example of imaging, because computer vision based on deep neural nets has just been progressing at this stunning rate. It seems like every week you see another company, another startup, or another university research group showing off their latest advances in using deep neural net-based computer vision technologies to do various kinds of medical image diagnosis or segmentation.

Here at Microsoft, we’ve been working pretty hard on those as well. We have this wonderful program based primarily in India that’s been trained on the health records and eye images of over 200,000 patients. The idea is that from all that data, you get the signal of which of those patients have suffered from, say, diabetic retinopathy or a progression of refractive error leading to blindness. From that signal in the electronic health record, coupled with the images, we are able to train a computer vision-based model to make a prediction about whether a child whose eye image has been taken is in danger of losing eyesight. That is in deployment right now in India, and, of course, for other parts of the world like the United States and Europe, which are more regulated, these things are in various states of clinical validation so they can be more broadly deployed.

Another example is a project that we have called InnerEye that is trying to reduce the incredibly boring and mundane problem of outlining, pixel by pixel, the parts of your body that are tumor and should be attacked with the radiation beam, as opposed to healthy tissue. That problem with radiation therapy planning has to be done really perfectly, which is why it’s this sort of pixel-by-pixel process. But there is maybe five or 15 minutes of real black magic that draws on all of the intuition and experience and wisdom of a radiologist, and then two to three hours of complete drudgery, and much of that complete drudgery can just be eliminated with modern computer vision technologies.

These things are really developing rapidly and coming online. They tend not to replace completely what doctors and radiologists can do, because there is always some judgment and intuition involved in these things, but when done right, they can integrate into the workflow to really liberate clinicians from a lot of drudgery and to reduce mistakes. One other thing that’s sometimes not fully appreciated is that once you have these tools, you can take these measurements over and over and over again. When they become cheap, you can take them every day, if necessary, which allows you to track the progression of a disease or its treatment over time much more precisely. These sorts of applications in medical imaging, I think, are really promising.

One thing I … it’s a hobby horse of mine … before I pause, is that in 2015 here in Microsoft Research we invented something called deep residual networks, which are now commonly called ResNets. ResNet has become part of an industry standard and research standard in computer vision using deep neural nets. We ourselves have refrained from using ResNets for things like analyzing 3D images for the purposes of radiation therapy planning, and there are various technical reasons for that. Sometimes we have a mixture of being proud seeing the rest of the world use our invention for interesting medical imaging, but we also sometimes get worried that people don’t quite understand the failure modes in these things. But, still, the progress has just been spectacular.

Sam Charrington: [00:13:14] That’s kind of an interesting prompt. Maybe let’s take a moment to explore the failure modes, and why don’t you … It sounds like you don’t advise folks to apply ResNets to the types of images that we tend to see in medical imaging. What’s that about?

Peter Lee: [00:13:32] Yeah. It’s not advising or warning people against it. If you think about, let’s say, the problem of radiation therapy planning, it’s a 3D problem. You have a tumor that is a 3D mass in your body, and you’re trying to come up with the plan for that radiation beam to attack, ideally, as much of that tumor as possible while preserving as much healthy tissue as possible. Of course, your picture into that 3D tumor is a series of two-dimensional slices, at least with current medical imaging.

One very basic question is, as you examine that tumor slice by slice with respect to the healthy tissue, is each slice being properly and logically registered with the next one? A simple or naïve application of a convolutional neural network, like a ResNet, doesn’t automatically do that. The other problem is that it’s unclear what a bad training sample or set of training samples will do to one of these deep neural nets. In fact, just in the last few weeks and months, there have been more and more interesting academic research studies showing some interesting failure modes from a surprisingly small number of bad training samples.
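
To make the slice-registration concern concrete, here is a minimal sketch in PyTorch contrasting independent per-slice 2D segmentation with a 3D convolution that sees neighboring slices together. The shapes and layers are toy assumptions for illustration only; this is not InnerEye’s or any production model’s architecture.

```python
# Illustrative sketch (not Microsoft's InnerEye code): contrasting a naive
# per-slice 2D segmentation with a 3D convolution that sees neighboring
# slices together, so predictions stay consistent ("registered") across slices.
import torch
import torch.nn as nn

# A toy CT-like volume: batch of 1, 1 channel, 32 slices of 64x64 pixels.
volume = torch.randn(1, 1, 32, 64, 64)

# Naive approach: run an independent 2D model on every slice.
seg_2d = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 1, kernel_size=1),  # per-pixel tumor logit
)
per_slice_masks = torch.stack(
    [seg_2d(volume[:, :, z]) for z in range(volume.shape[2])], dim=2
)  # nothing ties slice z to slice z+1, so contours can jump between slices

# 3D approach: the kernel spans adjacent slices, so the prediction for each
# voxel is informed by the slices above and below it.
seg_3d = nn.Sequential(
    nn.Conv3d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv3d(8, 1, kernel_size=1),  # per-voxel tumor logit
)
voxel_masks = seg_3d(volume)

print(per_slice_masks.shape, voxel_masks.shape)  # both (1, 1, 32, 64, 64)
```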

I think that these things are changing all the time. Our algorithms and our algorithmic understanding are improving all the time, but at least within our research groups, we’ve taken pains to understand that this application of computer vision isn’t like others. It’s more in the realm of, say, driverless cars where safety is of paramount concern, and we just have to have absolute certainty that we understand the possible failure modes of these things. Sometimes with just an off-the-shelf application of ResNets or any similar deep neural net algorithm, we and now more and more other researchers at universities are finding that we don’t yet fully understand the failure modes.

Sam Charrington: [00:16:02] In some ways, there’s an opportunity beyond kind of naïve application of an algorithm that performs very well on ImageNet. Today, you can get data sets that include kind of these 2D representations of what are fundamentally 3D applications or 3D images and apply the regular 2D algorithms to them and find interesting things. But you’re saying that a) we can do better and b) we may not even be doing the right things in many cases because of these safety issues.

I’m wondering, on the first of those two points, the doing better, is there either a standard approach that’s better than ResNet for these 3D images that you’ve developed at Microsoft or have seen otherwise? Or where are we in terms of taking advantage of the 3D nature of medical images and deep learning?

Peter Lee: [00:17:06] Yeah. That’s a good question. Our InnerEye project is really run by a great set of researchers based mostly in our Cambridge, U.K. research lab and led by Antonio Criminisi. He’s really one of the preeminent authorities in computer vision. In fact, he led an effort some years ago to work out the 3D computer vision for Kinect, and so he’s really specialized in 3D. The InnerEye project, which for us is really an effort to completely understand the workflow of radiation therapy planning, actually doesn’t use residual networks.

What it does instead is use an architecture of layered what are called decision forests. That not only gives some benefits in terms of more compact representations of machine-learned models and, therefore, some performance improvements, but it also allows us to capture a kind of logical registration of the images as they go slice by slice. In other words, you’re inferring not just the segmentation of each 2D image slice, but the 3D voxel volume of the tumor that you’re trying to attack.
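
As a rough illustration of the layered-forest idea, where a later forest is trained on an earlier forest’s per-voxel outputs, here is a small scikit-learn sketch on synthetic data. The features and the two-layer cascade are assumptions made for this example; the actual InnerEye algorithm is more sophisticated than this.

```python
# Rough illustration (an assumption, not the InnerEye implementation) of
# "layered decision forests": a second forest is trained on the first
# forest's class probabilities, so later layers can refine the voxel-wise
# labels using the beliefs produced by the earlier layer.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_voxels, n_features = 5000, 10
X = rng.normal(size=(n_voxels, n_features))      # per-voxel image features (toy)
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)    # toy tumor / not-tumor label

layer1 = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
p1 = layer1.predict_proba(X)                      # stage-1 beliefs per voxel

# Layer 2 sees the original features plus layer 1's output (auto-context style).
X2 = np.hstack([X, p1])
layer2 = RandomForestClassifier(n_estimators=50, random_state=0).fit(X2, y)

print("layer-2 training accuracy:", layer2.score(X2, y))
```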

Then on top of that, there’s a process involved when you’re dealing with medical technologies. You don’t just put it out there and start applying it to people. You get it peer-reviewed, in this case, in computer science journals and in medical journals, and you go through a clinical validation and, if you’re in the United States, for example, through an FDA approval process.

For us, as we’re learning about what does our cloud, what do our AI services, what do our tools have to be in order to support this future of AI-powered healthcare, InnerEye is an example of us going end-to-end to try to build it all out and to understand all those components and to understand what has to be done to really do it right. It’s been a great learning experience. We’re now in the process not only of working with various companies who might want to integrate this InnerEye technology into their medical devices, but we’re starting to now pull apart the kind of bricks and mortar that we used in the technical architecture for InnerEye in order to expose those as APIs for other developers to use. Our intent is not to get into the radiation therapy business. Our intent is not to get into radiology. But we do want our cloud and our AI services and our algorithms to be a great place for any other company or any other startup or innovator who wants to do that and ideally do it on our cloud, using our tools.

Sam Charrington: [00:20:29] An interesting point in there. You mention the decision forests that you developed to address this problem … I guess we often think of there being this tradeoff between factors like explainability or safety, as you related in that second point, and performance, with the neural net delivering kind of the ultimate in performance in many cases. But in this case, this decision forest algorithm is outperforming at least your classic 2D ResNets, and I’m imagining also providing benefits in terms of explainability and safety. Is that correct?

Peter Lee: [00:21:21] Well, we feel very strongly that it provides benefits in terms of safety. Explainability is really another very interesting question and problem. There’s a potential for greater explainability. One of the lessons that we learned when we were working on AI for sales intelligence … We had developed a tremendous amount of AI that would ingest large amounts of data from the world as well as from customer relationship management databases, emails, and so on for our sales teams, and used that through various AI algorithms to do things like synthesize new offers to specific customers, surface new prospective customers, or suggest new discount pricing for specific customers. One of the things we learned is that no self-respecting sales executive is going to offer a 20% discount to a customer just because an algorithm says so. Typically-

Sam Charrington: [00:22:35] Doctors are probably similar?

Peter Lee: [00:22:37] That’s right. In that situation, we also moved away, in that specific case, from the pure deep neural net architecture to a kind of layered architecture of Bayesian graphical models. The reason for that was so that we could synthesize an explanation in plain English of not only the offer of a 20% discount, but why. As we move away from point solutions that are machine learning- or AI-powered toward more of that digital assistant that is the companion to a clinician and gives that clinician a second opinion or advice on a first opinion, those sorts of explanations undoubtedly are going to become important, especially at the beginning when we’re trying to establish trust in these things.

As we’ve been experimenting even with the kind of ambient intelligence to just listen in on a doctor-patient encounter and try to automate a note, one thing we’ve found is that doctors will look at the synthesized note and not trust everything in it, because they don’t quite yet have an understanding of why the note came out this way. It became important to provide tools so that when you, say, click on a specific entry in the note, it can be mapped back to the right spot in the running transcript that was recorded. These sorts of things, I think, are part of the human-computer interaction, or the human-AI interaction, that we’re having to think about pretty hard as we try to integrate these things into clinical workflow.

Sam Charrington: [00:24:30] Before we move on beyond diagnostics and therapeutics, all of the examples that you gave fell into the domain of computer vision. Are there interesting things happening in diagnostics beyond the kind of onslaught of these new computer vision-based approaches?

Peter Lee: [00:24:51] Yeah. I think actually some of the most interesting things are not in computer vision, and this maybe crosses over into the precision medicine thing. One of the projects I’m so excited about is something that we’re doing jointly with a Seattle biotech startup, Adaptive Biotechnologies. The setup is this: If you take a small blood sample from your body, in that sample, in that one-mL sample, you’ll end up capturing on the order of one million T cells. The T cells are one of the primary agents in your adaptive immune system.

About two and a half years ago, there was a major scientific breakthrough that got published that showed that the receptor … There’s a receptor on the surface of your T cells, and in that receptor, there’s a small snippet of DNA. There was strong evidence two and a half years ago that that snippet of DNA completely determines what pathogen or infectious disease agent or cancer that T cell has been programmed to seek out and destroy. That paper was very interesting because it used a simple linear regression in order to identify from a read of that little snippet of DNA on the T cell receptor whether you had CMV, cytomegalovirus, or not.
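
As a toy illustration of that kind of model, here is a sketch that featurizes T cell receptor sequences as character 3-mer counts and fits a logistic regression for CMV status. The sequences, the featurization, and the use of logistic regression (a stand-in for the published study’s exact statistical method, which the conversation describes as a simple linear model) are all assumptions made for the example.

```python
# Toy sketch of the idea (assumed featurization, not the published method):
# represent each person's T cell receptor sequences as k-mer counts and fit
# a simple linear classifier for CMV-positive vs CMV-negative status.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Pretend each subject contributes a handful of receptor CDR3-like strings
# (these sequences are made up for illustration).
subjects = [
    "CASSLGQAYEQYF CASSPGTEAFF CASRDRGYTF",      # CMV+
    "CASSLEGQGFF CASSPGTEAFF CASSQETQYF",        # CMV-
    "CASRDRGYTF CASSLGQAYEQYF CASSYSTDTQYF",     # CMV+
    "CASSQETQYF CASSLEGQGFF CASSFGREQYF",        # CMV-
]
labels = np.array([1, 0, 1, 0])

# Character 3-mer counts over the amino-acid strings serve as features.
vectorizer = CountVectorizer(analyzer="char", ngram_range=(3, 3))
X = vectorizer.fit_transform(subjects)

model = LogisticRegression().fit(X, labels)
print(model.predict(vectorizer.transform(["CASSLGQAYEQYF CASRDRGYTF"])))
```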

It was really just an impressive paper and just very recent. Well, the thing that was interesting about Adaptive Biotechnologies is Adaptive Biotechnologies was in the business of giving you a printout of that specific snippet of DNA in all the T cell receptors in a blood sample. They had a business model that would help some cancer centers titrate the amount of specific chemotherapy you were getting based on a reading of that DNA.

That raised the question, would it be possible to take that printout of those T cell receptor DNA sequences and, in essence, think of that as a language and translate it into the language of antigens? Then, if you can do that, can you take those antigens and do a kind of topic identification problem to figure out what infectious diseases, what cancers, and what autoimmune disorders your body is currently coping with right now?

It turned into this very interesting new business opportunity for Adaptive Biotechnologies that if machine learning could be used to solve those two problems, then they would have a technology that would be very similar to a universal diagnostic, a simple blood test powered by machine learning that could do early diagnosis of any infectious disease, any cancer, and any autoimmune disorder. Microsoft found that interesting enough that we actually took an investment position in Adaptive Biotechnologies and agreed to work with them on the machine learning. And Adaptive, for their part, agreed to build a bigger production pipeline in order to generate training data to power that machine learning that we’re developing at Microsoft.

What has transpired since then has been an amazing amount of progress, where we’ve added a tremendous amount of sophistication, actually using deep neural nets, and started to feed it with billions of points of training data. In fact, this year, the production facility at Adaptive will be able to generate up to a trillion points of training data. We’re now targeting five specific diseases: ovarian cancer, pancreatic cancer, type 1 diabetes, celiac disease, and Lyme disease. That’s two cancers, two autoimmune disorders, and one infectious disease with the same machine learning pipeline.

It’s still an experiment, but it kind of shows you the potential power of these advances in immunology, in genomics, and in AI all being bound together. We know the science now is valid, and if we can build the technology that ties those things together, we get the potential for a universal diagnostic, about as close to the Star Trek tricorder as anything we could imagine getting.

Sam Charrington: [00:29:31] Mm-hmm (affirmative). That was the thing that popped immediately to mind for me, the tricorder. That example, I think, captures for me really plainly both the promise of applying machine learning and AI to this healthcare domain, but also maybe a little bit of the frustration in thinking through, okay, collecting a trillion samples and you’ve got this pipeline, why does it take so long? There’s certainly regulatory and political types of reasons that maybe we’ll get into. I’m wondering if you can elaborate on with that much training data and kind of the science in place and a pipeline in place, what are the realities of applying machine learning in this type of context that impede kind of rapid scale? Why just five diseases and not 25, for example?

Peter Lee: [00:30:43] Yeah. That’s such a great question. Yeah, human biology is just so complicated. I will say there are three ways, maybe, to take a cut at that. If we take a look at the very basic science, just consider the human genome. Something that geneticists at several universities have taught me, which was really eye-opening, is that if you look at the human genome and then look at all the possible variants, the number of variants in the human genome that would still be considered Homo sapiens is just astronomically large. Yet, the total number of people on the planet relative to that number is really tiny, only, what, seven and a half billion people. In fact, even if we somehow had DNA samples from every human that has ever existed, I think most estimates say there are fewer than 106 billion people that have ever existed since Adam and Eve.

If we are using modern machine learning, which is basically looking at statistical patterns and correlations, we have an immediate problem for a lot of basic problems in genomics, because we basically don’t have a source of enough training data. The complexity of human beings, the complexity of cancer, the genetic complexity of disease, is just vastly larger than the number of people that have ever existed.

Sam Charrington: [00:32:21] Meaning relative to the possible combinations of genes-

Peter Lee: [00:32:28] That’s right.

Sam Charrington: [00:32:28] … every human is … I guess it shouldn’t be surprising that every human is unique, but even given … It’s a little counterintuitive. You’d think there’s only these four letters that were thrown together to figure all this stuff out. Right?

Peter Lee: [00:32:43] Yes. What that means is that, yes, we will and we have been making … We, meaning the scientific community and the technology community, have been making stunning advances and making really meaningful improvements for neonatal intensive care, for cancer treatments, for immunology, but fundamentally, scientifically, we still need something beyond just machine learning. We really need something that gets into the basic biology. That’s kind of one reason why this is hard.

Another reason is these are just big problems. In the project with Adaptive Biotechnologies, there are between 10 to the 15th and 10 to the 16th different T cell receptors that your body can produce, and on the order of maybe 10 to the 7th known antigens. Imagine that what we’re trying to do is fill out a gigantic Excel spreadsheet with 10 to the 16th columns and 10 to the 7th rows. That’s just a heck of a big table, and so you end up needing a large amount of training data to discern enough structure, to find enough patterns, in order to have a shot at filling in at least the useful parts of that table.
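
The arithmetic below makes the scale of that table concrete and then sketches the usual workaround of scoring receptor-antigen pairs with learned embeddings rather than filling in cells one by one. The embedding approach shown is an assumption about how such a map might be factorized, not a description of the actual Microsoft and Adaptive model.

```python
# Back-of-the-envelope arithmetic on why the "spreadsheet" can't be filled
# in cell by cell, plus a sketch (an assumption, not the actual model) of
# the usual workaround: low-dimensional embeddings for receptors and
# antigens, scored by a dot product.
import numpy as np

n_receptors = 10**16   # possible T cell receptors (upper end of the range)
n_antigens = 10**7     # known antigens
cells = n_receptors * n_antigens
print(f"table cells: {cells:.1e}")                       # ~1e23 entries
print(f"at 1 byte per cell: {cells / 1e21:.0f} zettabytes")  # far beyond any storage

# Factorized alternative: d-dimensional embeddings make the cost linear in
# the number of *observed* receptors and antigens, not their product.
d = 64
rng = np.random.default_rng(0)
receptor_embedding = rng.normal(size=d)  # would come from an encoder over the sequence
antigen_embedding = rng.normal(size=d)
binding_score = receptor_embedding @ antigen_embedding
print("score for one (receptor, antigen) pair:", round(float(binding_score), 3))
```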

The good news is everybody has T cells, and so we can take blood samples from anybody, from just ordinary, healthy people, and then we can go to research laboratories around the world that have stored libraries of antigens and start correlating those stored libraries of antigens against those what are called naïve blood samples. That’s exactly what Adaptive Biotechnologies is doing in order to generate the very large amount of training data. It’s a little bit of a good news situation there that we don’t need to find thousands or millions of sick people. We can generate the data from just ordinary samples. But it’s still a very large amount of data that we need.

Then the third kind of way that I think about this is it gets back to the safety issue. We do things a certain way because ultimately, medicine and medical science is based on causal relationships. In other words, we want to know that A causes B, but what we typically get out of machine learning is just A is correlated with B. We get those inferences, and then it takes more work and more testing under controlled circumstances to know that there’s a causal relationship.

All three of those things kind of create challenges. It does take time, but I think the good thing is as the regulatory organizations like the FDA have gotten smarter and smarter about what is machine learning, what is it good for, what are its limitations, that whole process has gotten, I think, faster and more efficient over time. Then there’s a second element, which is, of course, companies are in it to make money. At a minimum, even if they have purely humanitarian intentions, at a minimum they have to be sustained over time. That means that insurance companies and Medicare and Medicaid, they have to be willing to reimburse doctors and nurses when they actually use or prescribe these diagnostics and therapeutics. All of that takes time.

Sam Charrington: [00:36:37] At least on the second of your three points, in thinking about scaling, solving problems like this, specifically training data, do you have a rule of thumb, a chart that says, okay, one trillion training samples will get us these five diseases, but we’ll need 10 trillion to get to 10 diseases? I realize that that’s almost an asinine question and it’s much more complex than that, but does it make sense at all to think of it like that? And to think of, I guess, the impact of collecting training data and what that trajectory looks like over time, kind of like the way we thought about driving the cost of sequencing down and the downstream effects that would have?

Peter Lee: [00:37:27] Yeah. Well, when you find the answer to that question, please tell me. In my experience, I’ve seen this go two ways. One of the wonderful things about modern machine learning algorithms today is that they’re far less susceptible to problems of overfitting. They come very close to this wonderful property that the more data, the better. But it does happen that sometimes you hit a wall, that you start to see a trail-off in improvement. We really don’t know. The early results that we’ve gotten with admittedly simpler diseases like CMV (and CMV is actually not that interesting from a medical perspective) give us tremendous hope. Then other internal, more technical validations give us supreme confidence that the basic biological science is well-understood now.

Once you start really attacking much more complex diseases, like any cancer, it’s really hard. I would be unwilling personally to make a prediction about what will happen. But there’s every reason today for optimism, and I think the only unknown is whether we fall off a cliff at some point and stop finding improvements, or whether we’re going to get to a viable FDA-approved diagnostic in the near term that will be constantly improving as more and more people are diagnosed. It could really go either way. I’m really unable and actually unwilling to make a prediction about which way it will go, but we are feeling pretty confident.

Incidentally, I should say last month Adaptive Biotechnologies closed a deal with Genentech for applications of this T cell receptor antigen map in the therapeutic space, in the area of cellular therapies for targeted cancer treatments. That deal has a value of over $2 billion, so there’s also some … When you’re dealing with commercial relationships like that, there’s a tremendous amount of due diligence. These are big bets, and big pharma is accustomed to making large, risky bets like this, but I think it’s another sign that at least the leading scientists at one of the larger pharmaceutical organizations are also increasingly confident that we can fill out this map.

Sam Charrington: [00:40:38] We’ve talked about diagnostics. We’ve talked about precision medicine. What do you see happening on the tooling side, both from the doctor’s perspective as well as the patient experience perspective?

Peter Lee: [00:40:52] Yeah. One thing, it’s a simple thing, but it’s been surprising how useful it has turned out to be. We’ve been piloting chatbot technology that we call the Microsoft Health Bot. This has been sort of in a beta program with a few dozen healthcare organizations. What we’ve done is advance our cognitive services for natural language processing and conversational understanding, and build tooling with a drag-and-drop interface so that ordinary people can program these chatbots, at least for medical settings. Then we’ve improved the language models so they understand medical and healthcare concepts and terms.

We’ve been surprised at the kinds of applications that people use it for. One example is there are organizations that have made prescription bots. The idea is this. Maybe you get a prescription from your doctor or from the hospital and you go to the pharmacy, you get your prescription filled, and then a day or two later, you get a message from this intelligent chatbot that’s asking, “How’s it going? Do you have any questions? Or have you had any issues with your medication?” It invites you proactively to get into a conversation that gives the healthcare provider tremendous insight into whether you’re adhering to your prescription. That’s a huge problem. Something like 35% of people actually don’t follow through with their prescription medications.

It’s just there to answer questions. Maybe you have some stomach upset. Or some people who are on a lot of medications hate having all those bottles, so they dump all the pills into a baggie and then can’t remember which pills are which. The health bot is able to converse with you and say, “Oh, well, why don’t you point your phone camera at a bunch of pills and I’ll remind you what they are.” It uses modern computer vision, ResNets actually, to remind you what these pills are.

The kind of engagement that the healthcare providers get, and the satisfaction that people like you and me have, is really improved. Or just asking simple benefits questions, or medical triage of various sorts: these kinds of ideas have been surprisingly interesting. In fact, it’s been so surprising for us that later this week we’ll be making that product generally available for sale. You’ll be able to use the Microsoft Health Bot technology without any restriction, except for payment, of course. That is something that has gone extremely well.
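
For a sense of what a prescription follow-up bot of the kind described above might look like at its simplest, here is a toy intent-matching sketch. The intent names, trigger phrases, and replies are invented for illustration and are not the Microsoft Health Bot’s actual configuration or API.

```python
# Toy sketch of a prescription follow-up bot: a small set of intents with
# keyword triggers and canned replies. Everything here is hypothetical and
# only illustrates the shape of such a flow.
INTENTS = {
    "side_effect": (["nausea", "dizzy", "stomach", "rash"],
                    "Sorry to hear that. I'll flag this for your care team today."),
    "missed_dose": (["forgot", "missed", "skipped"],
                    "No problem. Take the next dose at the usual time; don't double up."),
    "identify_pill": (["which pill", "what pill", "can't remember"],
                      "Point your phone camera at the pills and I'll help identify them."),
}

def reply(message: str) -> str:
    text = message.lower()
    for keywords, response in INTENTS.values():
        if any(k in text for k in keywords):
            return response
    return "How has the new medication been going? Any questions for your pharmacist?"

print(reply("I felt a bit dizzy after the second dose"))
print(reply("I dumped everything in one baggie and can't remember which pill is which"))
```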

That technology is now being baked into more and more of what people will be seeing, I think. We have a collaboration hub application in Office 365 called Teams, and Teams has been this just wonderful technology for improving collaboration in all sorts of workplace settings. Well, we’ve made Teams healthcare-compliant and able to connect to electronic health record systems, and then, by integrating great collaboration intelligence tools, you can parse records, or have a newer way to find certain bits of information, or just be able to ask an intelligent agent that is part of your team, “Did so-and-so check the sutures last night?” and get a smart answer whether people are awake or not.

There are all these little ways that I think AI can be used in the workflow of healthcare delivery. One of the things that is, I think, underappreciated about healthcare delivery today, especially in acute care settings, is that it’s a super collaborative environment. Sometimes there can be as many as 20 people working together as a team delivering care to multiple patients at a time. How to keep that team of 20 people all on the same page and all coordinated is getting to be a really difficult problem, typically done with Post-it notes and half-erased whiteboards, now transitioning to pretty insecure consumer messaging apps. But the idea of having real enterprise-grade collaboration support with AI, I think, just can make all of that much better and then provide much more security and privacy for people.

A lot of these applications of AI end up being less flashy than doing some automatic radiation therapy planning of a medical image, but they really kind of help people, those people on the front lines of healthcare delivery do their jobs better.

Sam Charrington: [00:46:34] I tend to find myself having really kind of mixed feelings about conversational applications, at least from the perspective of talking about them on the podcast. There’s no question that conversational experiences and interfaces will be a huge part of the way we interact with computers in the future, and that there’s tons of work that needs to happen there, even if it’s less flashy, as you mentioned. I wonder if there’s still interesting research. At least my question to you is, are there still interesting research challenges there? Or do we have all the pieces, and it’s just kind of rolling up the sleeves and building enterprise software, which we know is hard and takes time?

Peter Lee: [00:47:21] Yeah. It’s a good question. It feels like research to me.

Sam Charrington: [00:47:27] (laughter) Elaborate.

Peter Lee: [00:47:28] Some of the problems, if anything, feel a little too difficult, honestly. If we just, say, take the problem of listening to a doctor-patient conversation and, from that, understanding what should go into the standard form of a clinical encounter note. Here’s a typical thing. There could be an exchange. Let’s say, Sam, you’re my doctor and I’m your patient. You might be asking me how I’m doing, and I might complain that the pain in my left knee hasn’t gone away.

We can have an exchange about how that goes, and ultimately, what goes into the note by you is a note about my continued lack of weight loss and that my being overweight is contributing to the lack of healing with my knee problem. That may or may not have been an explicit part of our conversation, but it’s important that the weight loss element be in that clinical note. In fact, it might even mean revenue for that doctor, because there might be a weight loss program that gets prescribed and so on. That’s important, and it’s important not to miss it. The human exchange here and the things that are implicit in those conversations, let alone the fact that I’ll say kneecap and you’ll say patella, are things that are as close to general artificial intelligence-style problems as anything.

Sam Charrington: [00:49:15] Yeah.

Peter Lee: [00:49:18] Look, we don’t kid ourselves that we’re anywhere close to solving those types of problems, but those are the kinds of problems we think about, even when we just look at the kind of day-to-day, minute-by-minute work that people do to deal with their healthcare.

Sam Charrington: [00:49:33] Right, right.

Peter Lee: [00:49:34] There’s another one that’s interesting. To really unlock the power of AI, what we would want to do is to just open up huge databases to great researchers and innovators everywhere, but, of course, we need to do that without violating anyone’s privacy. There’s one problem, something called de-identification. It would be great to be able to take a treasure trove of what’s in electronic health records and “de-identify” it.

Well, some parts of those electronic health records are easy to de-identify, because there might be a field called Social Security Number, another field called Name, another one called Address, and so on, so you can just scrub those out. But large amounts of clinical data involve unstructured notes, and getting a deep enough understanding of what’s in those notes to scrub them in a way that won’t inadvertently reveal somebody’s identity or their medical condition, again, ends up ultimately being a very general AI problem.
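
The contrast between the two cases can be sketched in a few lines. The record, the regexes, and the note below are made up for illustration and are nowhere near a complete or production PHI scrubber; the point is only that structured fields can be masked mechanically while the free-text note keeps identifying a person through context.

```python
# Minimal sketch of why de-identification is easy for structured fields and
# hard for free text. Illustrative only; not a real PHI scrubber.
import re

record = {
    "name": "Jane Doe",
    "ssn": "123-45-6789",
    "note": ("Pt is the 7-foot-tall mayor of a small Vermont town, "
             "seen after her third marathon this year; knee pain persists."),
}

# Structured fields: just mask the known-sensitive columns.
scrubbed = {k: ("[REDACTED]" if k in {"name", "ssn"} else v) for k, v in record.items()}

# Free text: pattern matching catches obvious tokens like SSNs and phone numbers...
scrubbed["note"] = re.sub(r"\b\d{3}-\d{2}-\d{4}\b", "[SSN]", scrubbed["note"])
scrubbed["note"] = re.sub(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b", "[PHONE]", scrubbed["note"])

# ...but the note still uniquely identifies the patient through context
# ("7-foot-tall mayor of a small Vermont town"), which is exactly the part
# that needs deep language understanding rather than regexes.
print(scrubbed["note"])
```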

Sam Charrington: [00:50:41] That’s a great reframing of the way to think about this. I guess most chatbots are boring because the entity-and-intent framework that most chatbots are built on is kind of table stakes relative to what we’re really trying to do with conversational experiences. That really requires a level of sophistication in our ability to use, work with, and manipulate natural language that is very much at the research frontier now. And that’s why most current in-production chatbots are kind of boring.

Peter Lee: [00:51:27] Yeah. We’ve taken the approach of trying to think of these things almost in terms of being able to play a game of 20 questions. One of the most inspiring applications of health bots that we dream about is in matching people to clinical trials. At any point, there are thousands of clinical trials. You can go to a website called clinicaltrials.gov and there’s a search bar there, and you can type in something like breast cancer. When you do that, you get this gigantic dump of every registered clinical trial going on that might be pertinent to breast cancer.

While that’s useful, the problem is that it’s hard to know which ones of those … If you are, say, someone who’s desperate to find a clinical trial to enroll in because you’ve run out of other viable options for whatever is ailing you, it’s just almost impossible to go through all of that technical information and try to understand it. Would it be possible to use an AI to read through all that technical information and then synthesize what amounts to a game of 20 questions, something that’ll converse with you and ask you questions in order to narrow down to just the one or two or three clinical trials that might be a match for you?
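
A toy sketch of that 20-questions idea: at each turn, ask the eligibility question that best splits the remaining candidate trials, so a few answers narrow a long list down to a handful. The trials, criteria, and question-selection heuristic below are invented for illustration and do not reflect clinicaltrials.gov data or the Health Bot’s actual logic.

```python
# Toy 20-questions trial matcher: greedily ask the yes/no question whose
# split of the remaining trials is most balanced, then filter on the answer.
def best_question(trials, questions):
    def imbalance(q):
        yes = sum(1 for t in trials if t["criteria"].get(q, False))
        return abs(2 * yes - len(trials))  # 0 means a perfectly even split
    return min(questions, key=imbalance)

def narrow(trials, questions, answer_fn, max_turns=20):
    for _ in range(max_turns):
        if len(trials) <= 1 or not questions:
            break
        q = best_question(trials, questions)
        questions = [x for x in questions if x != q]
        ans = answer_fn(q)  # in a real bot, this would be a conversational turn
        trials = [t for t in trials if t["criteria"].get(q, False) == ans]
    return trials

trials = [
    {"id": "NCT-A", "criteria": {"her2_positive": True,  "prior_chemo": False}},
    {"id": "NCT-B", "criteria": {"her2_positive": False, "prior_chemo": True}},
    {"id": "NCT-C", "criteria": {"her2_positive": True,  "prior_chemo": True}},
]
answers = {"her2_positive": True, "prior_chemo": True}
print([t["id"] for t in narrow(trials, list(answers), answers.get)])  # ['NCT-C']
```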

It’s that kind of thing where it’s not fully general conversation of the sort that I think you and I were talking about just a minute ago, but is slightly more structured than that in order to help you more intelligently, more efficiently find the right medical or healthcare solution for you. That kind of application is something that we’re really putting a lot of kind of heart and mind into, along with many others around the world. It’s exciting that we’re starting to see these things actually make it into clinical use today.

I kind of agree with you. I do roll my eyes sometimes at the overheated hype around intelligent agents and chatbots as well, just like anybody else, but it’s really getting somewhere in these more limited domains.

Sam Charrington: [00:53:56] I think it also says why the interesting work in domains like this is going to be … It’s not generic. You’re solving a specific problem, and there’s a lot of investment in getting the machine learning and AI right for that particular problem, as opposed to implementing a generic framework.

Peter Lee: [00:54:16] That’s right.

Sam Charrington: [00:54:17] Awesome. Well, Peter, thank you so much for taking the time to chat with me about the stuff you’re seeing and working on in the healthcare space. A ton of really interesting examples in there and I’m looking forward to following all this work and digging deeper. Thank you.

Peter Lee: [00:54:37] And we didn’t even talk about China once. That’s great.

Sam Charrington: [00:54:41] Well, you mentioned ResNet a few times kind of taunting me to dive into that conversation, but I’ll refer folks to the article and we’ll put the link in the show notes.

Peter Lee: [00:54:52] Sounds great. It was really a pleasure chatting.