Sam Charrington: Hey, what's up everyone! We are just a week away from kicking off TWIMLfest, and I'm super excited to share a rundown of what we've got in store for week 1. On deck are the Codenames Bot Competition kickoff, an Accessibility and Computer Vision panel, the first of our Wellness Wednesdays sessions featuring meditation and yoga, as well as the first block of our Unconference Sessions proposed and delivered by folks like you. The leaderboard currently includes sessions on Sampling vs Profiling for Data Logging, Deep Learning for Time Series in Industry, and Machine Learning for Sustainable Agriculture. You can check out and vote on the current proposals or submit your own by visiting twimlai.com/twimlfest/vote/. And of course, we'll have a couple of amazing keynote interviews that we'll be unveiling shortly! As if great content isn't reason enough to get registered for TWIMLfest, by popular demand we are extending our TWIMLfest SWAG BAG giveaway by just a few more days! Everyone who registers for TWIMLfest between now and Wednesday, October 7th will be automatically entered into a drawing for one of five TWIMLfest SWAG BAGs, including a mug, t-shirt, and stickers. Registration and all the action takes place at twimlfest.com, so if you have not registered yet, be sure to jump over and do it now! We'll wait here for you. Before we jump into the interview, I'd like to take a moment to thank Microsoft for their support for the show, and their sponsorship of this series of episodes highlighting just a few of the fundamental innovations behind Azure Cognitive Services. Cognitive Services is a portfolio of domain-specific capabilities that brings AI within the reach of every developer, without requiring machine-learning expertise. All it takes is an API call to embed the ability to see, hear, speak, search, understand, and accelerate decision-making into your apps. Visit aka.ms/cognitive to learn how customers like Volkswagen, Uber, and the BBC have used Azure Cognitive Services to embed services like real-time translation, facial recognition, and natural language understanding to create robust and intelligent user experiences in their apps. While you're there, you can take advantage of the $200 credit to start building your own intelligent applications when you open an Azure Free Account. That link again is aka.ms/cognitive. And now, on to the show! Sam Charrington: [00:00:00] All right, everyone. I am here with Adina Trufinescu. Adina is a Principal Program Manager at Microsoft, working on Computer Vision. Adina, welcome to the TWIML AI podcast. Adina Trufinescu: [00:00:12] Thank you so much for having me here. Sam Charrington: [00:00:14] Absolutely. I'm really looking forward to digging into our chat. We'll be spending quite a bit of time talking about some of the interesting Computer Vision stuff you're working on, in particular, the spatial analysis product that you work on, and some of the technical innovation that went into making that happen. But before we do that, I'd love for you to share a little bit about your background and how you came to work in Computer Vision. Adina Trufinescu: [00:00:40] Definitely. I joined Microsoft in 1998, so I'm a veteran here, and I started as an engineer. So, I have an engineering background, not a research background. Then after spending more than 10 years as an engineer working primarily on Windows OS, I switched to Program Management, and I worked on a bunch of products until eventually, I started working on speech recognition in Windows.
At the time I was working on Cortana speech recognition, and then, later on, I worked on speech recognition for HoloLens, the mixed reality device. Then for the past year and a half, I transitioned to computer vision. So I'm a Program Manager. I'm working with both the engineering and the research teams on shipping spatial analysis, and spatial analysis is a feature of Computer Vision in Azure Cognitive Services. It just shipped this week, at Ignite, in public preview. Sam Charrington: [00:01:37] Nice. In any other year, I'd ask you, what's it like down in Orlando? Because that's where Ignite is historically held. I've been to the last several, and I've done podcasts from Ignite, but this time, we're doing it a little bit virtually as Microsoft is with the event. But super excited to bring to our audience a little bit of this update from Ignite. Tell us a little bit about the spatial analysis work that you're doing there, and start from the top. What's the problem that spatial analysis is trying to solve? Adina Trufinescu: [00:02:13] So, before I talk about spatial analysis, let me give you a bit of background information about Azure Cognitive Services for Computer Vision because it's important to highlight the difference and the novelty that spatial analysis brings. So, the existing Computer Vision services are image-based, meaning that basically, the developer passes in an image at a time, and then the inference happens either in the cloud or in a container at the edge. Then the result of the inference, image by image, is sent back to the developer. Spatial analysis brings the innovation of actually running Computer Vision AI on video streams. So basically it analyzes live video. It can also be recorded, but primarily it was designed for live video streams and real-time analysis of these video streams, and in this case for the purpose of understanding people's movement in physical space. Then when we talk about people's movement, we're talking primarily about four things. The first one is the more basic scenario of people counting. So, basically in a video stream, we run people detection and then either periodically or when the count of people changes, we provide the insights indicating how many people there are. Then we have social distancing, which is actually called people distance, but we call it social distancing for the obvious reason. But basically you can configure the desired threshold at which you want to measure the distance between people, and then let's take the magic six feet number, right? So basically, the AI is going to detect the people in the video stream, and then every time the people are closer than the minimal threshold, an event is generated, indicating that the minimal distance has not been respected. So these are the first two, and then the next two are what we call entry and exit of physical spaces. So to actually detect when people enter or leave a physical space, we have two operations. One is called person crossing a zone - in and out of a zone - and the other is person crossing a line. Let's take the example of person crossing a line. Let's say that you have a doorway, so you can draw a directional line, and then every time the bounding box of the detected person crosses and intersects the line, we can generate that event, telling you that the person entered the space or exited the space. Sam Charrington: [00:04:43] Awesome.
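As an aside, a minimal sketch of the two kinds of events Adina describes (a pairwise distance check and a directional line crossing) might look like the following. This is purely illustrative, not the Azure Spatial Analysis implementation; distances here are in image units, whereas the real service reports calibrated physical distance, which comes up later in the conversation.

```python
# A rough, illustrative sketch (not the Azure Spatial Analysis implementation) of the
# two event types described above: a pairwise distance violation and a directional
# line crossing. Distances here are in image units; the real service reports
# calibrated physical distance.
from itertools import combinations
import math

def footprint(box):
    """Bottom-center of a bounding box (x, y, w, h), a rough proxy for where a person stands."""
    x, y, w, h = box
    return (x + w / 2.0, y + h)

def distance_violations(boxes, min_distance):
    """Return index pairs of detections closer together than `min_distance`."""
    events = []
    for (i, a), (j, b) in combinations(enumerate(boxes), 2):
        ax, ay = footprint(a)
        bx, by = footprint(b)
        if math.hypot(ax - bx, ay - by) < min_distance:
            events.append((i, j))
    return events

def _side(p, start, end):
    """Sign of the cross product: which side of the directed line start->end the point is on."""
    (x, y), (x1, y1), (x2, y2) = p, start, end
    return (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1)

def crossed_line(prev_point, curr_point, start, end):
    """True if a tracked point moved from one side of the directed line to the other."""
    return _side(prev_point, start, end) * _side(curr_point, start, end) < 0
```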
So the context in which this is being offered, as you mentioned, in comparison to the image-based services - an image-based service might be something I'm using to do object detection or segmentation of an image. I'm passing that to an API and I'm getting a result back where the service is telling me what it thinks is in the image and the probabilities, and this is extending that same general idea to video, essentially. Adina Trufinescu: [00:05:17] That's right, and we started with spatial analysis for people movement. We're looking to extend this to other domains for other relevant scenarios in the future. Sam Charrington: [00:05:28] Can you give us an example of the other types of scenarios that folks might want to perform on video? Adina Trufinescu: [00:05:36] So there are many industries where this is relevant. So, basically you can think about retail, which currently is targeted towards this person movement analysis, but think about vehicle analysis. So, that would be another kind of object that, when detected in a video, can generate interesting AI insights and interesting scenarios. Sam Charrington: [00:06:02] So, yeah, from even that explanation, I get that unlike an image-based service where generally, these work along the lines of ImageNet where you have these many classes of things that can be detected - toys and fruit and oranges, and things like that - in video, you're starting with very specific classes. Can you talk a little bit about why that is? Is it use case driven, in that counting people and vehicles, and very specific things, are more interesting in video than counting random objects? Or is it more a technical issue or limitation? Adina Trufinescu: [00:06:46] Oh, it's not a limitation. So, we started with understanding people movement because this is where the customer signal was. So, I've mentioned retail. We also have many scenarios in manufacturing or in real estate management, and the current events were also informing our decisions on where to start, but the way the detection models are inserted into the video pipeline is fairly generic, which is why we're looking at enabling other domains in the future. So basically, the detector model that we have for people today can easily be swapped with a different detector model for a different domain. Sam Charrington: [00:07:24] Okay. Okay. I'm thinking about the use cases. It sounds like the use cases that you are envisioning are camera-based video streams, as opposed to, I'm going to pipe in a stream of commercial television and ask your service to find anytime a particular can of Coca Cola shows up, or something like that. That's another use case that I see every once in a while, but clearly it's not one you're going after at this point. Adina Trufinescu: [00:07:59] Not for now, not for now. Speaking about the cameras, the cameras that we work with - we don't actually require a specific camera model. So, any camera that supports the RTSP protocol, which is like the universal protocol - well, I shouldn't say universal, but it's a common protocol for video streaming. So, you can have a camera or you can have an NVR; any video management system that is capable of streaming over the RTSP protocol, we work with that. Sam Charrington: [00:08:31] Okay. NVR being network video recorder - a surveillance use case, or a technology used in that use case. Adina Trufinescu: [00:08:41] Yeah, that's right.
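For reference, pulling frames from an RTSP-capable camera or NVR is a one-liner with OpenCV. The URL, credentials, and stream path below are placeholders, and this sketch is independent of the spatial analysis container itself.

```python
# A minimal sketch of ingesting an RTSP stream from a camera or NVR with OpenCV.
# The URL and credentials below are placeholders for a hypothetical camera.
import cv2

RTSP_URL = "rtsp://user:password@192.168.1.42:554/stream1"

cap = cv2.VideoCapture(RTSP_URL)
if not cap.isOpened():
    raise RuntimeError("Could not open RTSP stream")

while True:
    ok, frame = cap.read()        # one BGR frame from the live stream
    if not ok:
        break                     # stream dropped or ended
    # ... run person detection on `frame` here ...

cap.release()
```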
So basically we're looking not only at greenfield areas where customers install new cameras, but also at existing cameras and existing video systems. Sam Charrington: [00:08:51] So when I think about this type of use case, it makes me think of something like a Ring camera, where maybe I can grab a Raspberry Pi or something like that, and have it call out to the service, put a little USB camera on my Raspberry Pi and stick it by my door, and do a roll-your-own Ring camera and have it count people that go into some zone or something like that. Could I do that with this? Adina Trufinescu: [00:09:22] You can do something like that, but the device that we are supporting and have tested extensively for is actually more of a heavyweight device; it's an Azure Stack Edge. But the idea here is that these are spaces where you can have dozens of cameras or you can have hundreds of cameras. So, imagine a warehouse where you could potentially have hundreds of cameras. Basically, we want you to have a way where you can deploy at scale and you can manage these cameras at scale. Then because of the sensitivity around the privacy concerns and data control concerns with video, basically that's where Azure Stack Edge comes in, where you can actually keep the video on your premises. Then all the processing happens on the Azure Stack Edge device, and then only the result, the de-identified data about the people movement, is sent to the cloud, to your own service in the cloud, to your own tenant, and then you can build a solution in the cloud. Then I should be more specific that the Azure Stack Edge device that we are running with is actually the one that has the Nvidia T4 GPU. So even more of a departure from just a [Nano]. This is the initial release. This is the public preview, and then we're looking at extending the range of devices and hardware acceleration capabilities to something lower, something less than Azure Stack Edge. Sam Charrington: [00:10:55] Got it. Then for folks that aren't familiar, Azure Stack Edge is essentially a pretty heavyweight hardware setup where you're essentially running the Azure Cloud in your data center. That's the general idea, right? Adina Trufinescu: [00:11:09] Yeah, that's right, and if you have a small space where you have, let's say 20, 50 cameras, you don't really need something of the extent of a data center. You need a room, a server closet with a reasonable temperature where you can run these devices. Sam Charrington: [00:11:33] Okay. Okay. So I'm going to have to wait quite a while for this technology to be democratized, if you will, to the point where I'm running it on a Raspberry Pi with a USB camera. Adina Trufinescu: [00:11:48] I was hoping it's not quite a while, but not yet. Sam Charrington: [00:11:53] Not yet, and I think in this day and age, I think we have to talk about surveillance and the role of technologies like this, and enabling different types of surveillance use cases, some of which are problematic and some of which are necessary in the course of doing business. What's the general take on making this kind of service available for those kinds of use cases? Adina Trufinescu: [00:12:24] So when we released spatial analysis, we had in mind what Microsoft calls responsible AI and innovation. So this is where we recognize the potential of harmful use cases, and then with this release, we also released a set of responsible AI guidelines which had three things in mind.
The first one is protecting the privacy of the end user; providing transparency such that the end user and the customer understand the impact of the technology; and then, in the end, promoting trust. Then the idea there is that we want to pass this responsible AI guidance and practices to our developers and people that actually build the end-to-end solutions, such that the end users, the people actually impacted by the technology, can actually be protected, and the human dignity of these people is actually upheld. Sam Charrington: [00:13:18] So it sounds like even if I did have an Azure Stack Edge, I couldn't necessarily just turn on the service and do whatever I want with it. Adina Trufinescu: [00:13:26] So, we have a process for that that we take our customers through, at least for this public preview, where you get access to the container. I'm not sure if I mentioned this, but we started not with an Azure service in the cloud but with a Docker container that you run on your premises on Azure Stack Edge, and basically anybody can download the container, but to actually access the functionality in the container, we want you to fill in this form. You describe the use cases that you are considering for your solution and your deployment, and then we will look together at whether these use cases align with the responsible AI guidance. Then, if they do, obviously you can proceed, and if they don't, we'll have that conversation to make sure that the responsible AI guidance is upheld. Sam Charrington: [00:14:15] Okay. Well, let's maybe shift gears and talk a little bit about some of the tech that went into enabling this. In order to do what you're doing, you're doing some kind of standard things like object detection. Is this fresh out of research papers, new techniques to do the detection and classification, or what are some of the things that you're doing there and the challenges that you ran into in productizing this? Adina Trufinescu: [00:14:44] So, I think the challenges vary depending on the four use cases. So let me try to break it down and then address each one. So for instance, we are running a DNN for people detection, and we started with something more heavyweight, and then we had to transition because of the performance concerns. I'm going to come back to that in a second, but basically we had to transition to a lighter model. Sam Charrington: [00:15:09] A big ResNet...? Adina Trufinescu: [00:15:11] Let's say a big ResNet or a smaller ResNet. Sam Charrington: [00:15:16] Okay. Adina Trufinescu: [00:15:17] I'm going to leave it at that. But the idea there is that for something like people counting, initially for all operations, we started thinking that we can stream at 15 frames per second, and then we did that. Then we noticed that to get maximum usage out of that Azure Stack Edge, which is quite heavyweight, right? We want to run as many video streams as possible. So basically we tried to actually go as low as possible in terms of frame rate, and for something like person count, the person count from one second to another doesn't change dramatically. So for something like person count or person distance, we went from 15 frames per second to one frame per second. Then we were able to maximize the usage of the GPU because now the DNN runs at the lower frame rate, and this way you can fit in more video streams. The challenge we had, for instance, with social distance and person count was around generating ground truth.
So [we create] a 10 minute video. Let's say you have a point in the video and you have to annotate the distance between the people - just looking at the video, you cannot figure out the physical distance between people. So that is where we use synthetic video data. So basically, we are using the same technology that our colleague teams in mixed reality for HoloLens are using, where we generate these game scenes where we can control the positioning of the people and their relative positioning. So that was the first challenge for person distancing. The second challenge is that the DNN is going to tell you whether there are people in a frame, but it's not going to tell you the actual physical distance. So for that, you need the camera to be calibrated. So this is where the initial thinking was that we would ask the customer for the camera height, for the angle, for the focal distance, but that wasn't practical either. So this is where we had to actually come up with a calibration algorithm for the camera, such that before the actual operations, where the DNN runs for the purpose of the operation, the algorithm for calibration kicks in. We ask the customer to have at least two people in the camera field of view. Then the algorithm runs for detecting these people and makes assumptions about their positioning, and this way, the camera height and the focal distance are actually calculated. Then we pass it back to the customer as output and we want to make sure that that reflects the reality. But between the ground truth and the camera calibration, these were the two challenges for person distancing. Sam Charrington: [00:18:06] All right. So just maybe taking a step back. We started out talking about counting people and, it sounds like there's some research or work that went into getting from this big heavyweight model to the smaller model. So that was one element of it, but also, just fine tuning the end-to-end process in terms of how quickly you're able to do it. In other words, what frame rate you're using for counting people. That was part of counting people? Adina Trufinescu: [00:18:43] Yes, that's right. Sam Charrington: [00:18:44] Was it just an iterative process - keep reducing the frame rate until things start breaking and you're not able to count accurately - or was that something where you're building out models to tell you how low you can go or something? What all went into that? Adina Trufinescu: [00:19:02] So, it was a little bit of both. It was like a constant measurement of performance and accuracy. In terms of frame rate, we would go lower and lower to the point where we can maintain the accuracy and precision rates. Then you reach a breaking point and then that's how you know that you have to stop. I wouldn't say that this was exactly how it happened, but when you talk about frame rate and doing all these tests, this is where the engineering comes in. Then when you come to the performance of the DNN and the models, this is where the research teams are making progress in parallel. So basically, it was an iterative process where, between engineering and research, they both worked together to arrive at what seems to be the best balance between performance and accuracy. Sam Charrington: [00:20:01] As part of that counting people process, you've got two sub-problems there. One is identifying the people in the frame, and then you also have to know from one frame to the next, which person is which. Is that a part of the challenge here?
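Returning briefly to the calibration point above: once camera height and focal distance have been estimated, image points can be mapped to the ground plane and metric distances computed there. A minimal sketch, assuming the calibration has been summarized as an image-to-ground homography; the matrix and pixel coordinates below are made up for illustration and are not the product's actual calibration output.

```python
# A minimal sketch of the distance computation once calibration is done: image-plane
# foot points are projected to the ground plane and measured there. Calibration is
# assumed here to be summarized by a 3x3 image-to-ground homography.
import numpy as np

def to_ground(point_px, H):
    """Project an image point (pixels) onto the ground plane (e.g., metres)."""
    x, y = point_px
    gx, gy, gw = H @ np.array([x, y, 1.0])
    return np.array([gx / gw, gy / gw])

def physical_distance(foot_a_px, foot_b_px, H):
    """Metric distance between two people given their foot points in the image."""
    return float(np.linalg.norm(to_ground(foot_a_px, H) - to_ground(foot_b_px, H)))

# Illustrative only: a stand-in homography, not a real calibration result.
H_example = np.array([[0.01, 0.00, -3.0],
                      [0.00, 0.02, -5.0],
                      [0.00, 0.00,  1.0]])
print(physical_distance((640, 700), (900, 710), H_example))
```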
Adina Trufinescu: [00:20:16] Yeah, that's right. So, especially for person crossing in and out of a zone and person crossing the line, that's where the tracking part of the algorithm comes in, and to be able to tell that it's the same person from one frame to another, in addition to the DNN model, we are running a combinatorial algorithm, such that detection is telling you that I have these people. Then by extracting features, we can run the combinatorial algorithm to tell that from frame T minus one to frame T, we have the same set of people. Then as people are detected across the frames, they are getting this anonymous identifier which tells you that it is the same person from frame one to frame ten. Something like that. Sam Charrington: [00:21:09] You mentioned extracting features to help the combinatorial algorithm. Are you pulling those out of the bowels of the DNN or is this a separate pipeline or a separate flow that is identifying features in a kind of more traditional computer vision way? Adina Trufinescu: [00:21:28] So we actually pull it from the DNN and we have the typical features that you would expect, like motion vector, velocity, and direction in the 2D space, and frame by frame, we're looking at all these attributes. Then we're making the decision whether the same person shows up across the various frames. Then I should say that each person gets an identifier and that is an anonymized identifier. There is no facial recognition or anything of this sort. Sam Charrington: [00:22:03] Okay. Adina Trufinescu: [00:22:04] Then I should say that in our pursuit of performance, we started this process running at 15 frames per second because when you actually look closely at how people move in and out of a zone or cross a line, the action of crossing and the time the person crosses that line is fairly short. So we had to run it at more than one frame per second. This is where we initially started by running the DNN for the people detection every 15th frame, still keeping it at one frame per second, and running the association algorithm every frame. The problem that we had was that the accuracy and the performance had all the typical challenges where the identity of the people would be switched or the identities of two people would be merged. These are the typical fragmentation and merging challenges with association. So, if you don't actually run the detection on each frame, every time a person is occluded or every time a person disappears from the frame or a new person appears, that's when you have all these association problems of merging and fragmentation. So that was another motivation for us to go to a lighter DNN for person detection - something that we can actually run each frame at 15 frames per second. Sam Charrington: [00:23:31] Okay, but you mentioned that there are some parts of the problem that you do run at one frame per second? Adina Trufinescu: [00:23:36] Right. So, just to recap, person counting and social distancing, we keep doing at one frame per second, and then person crossing a line and person crossing in and out of a zone, we run at 15 frames per second. Sam Charrington: [00:23:51] Got it. The main idea there is that for counting people and measuring distance, it's not an association problem. You're just looking at what's in the frame. Adina Trufinescu: [00:24:03] Right, right. Sam Charrington: [00:24:04] Someone bounces in or out between frames - if they're not in the frame, you don't count them.
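As a rough illustration of the association step described here, a generic tracking-by-detection sketch follows. It uses a combinatorial (Hungarian) assignment over an IoU-based cost; the product described above additionally uses motion, velocity, and direction features pulled from the DNN, so treat this as a simplified stand-in rather than Microsoft's algorithm.

```python
# A generic tracking-by-detection sketch, not Microsoft's actual algorithm: detections
# in consecutive frames are associated with a combinatorial (Hungarian) assignment.
# The cost here is just 1 - IoU between boxes.
import itertools
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

_next_id = itertools.count()   # source of fresh anonymous identifiers

def associate(tracks, detections, min_iou=0.3):
    """Match previous-frame tracks {id: box} to current-frame detections [box, ...].

    Returns the updated {id: box}; unmatched detections get new anonymous ids,
    and tracks with no match are dropped (no re-identification, no faces).
    """
    updated, matched = {}, set()
    if tracks and detections:
        ids = list(tracks)
        cost = np.array([[1.0 - iou(tracks[i], d) for d in detections] for i in ids])
        rows, cols = linear_sum_assignment(cost)          # optimal one-to-one assignment
        for r, c in zip(rows, cols):
            if cost[r, c] <= 1.0 - min_iou:               # keep only sufficiently good matches
                updated[ids[r]] = detections[c]
                matched.add(c)
    for c, det in enumerate(detections):
        if c not in matched:
            updated[next(_next_id)] = det                 # new person, new anonymous identifier
    return updated
```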
But when you're talking about entering and exiting physical spaces, you want to keep track of who was already in the space versus who wasn't in the space in order for you to provide an accurate count. So there's a bit more accounting that has to happen, and then you get these challenges with people disappearing because they were at the edge or something like that. That's where you have to focus on these fragmentation and merging problems. Adina Trufinescu: [00:24:34] Yeah, that's right. So, imagine that for counting people or social distance, not a whole lot happens in a second. But imagine that you have a railway station and you have a doorway where a dozen people need to pass through. At that point, you have to run people detection at a higher frame rate such that you do not lose the people - you do not lose them when they show up and you do not lose them when they disappear. Sam Charrington: [00:25:00] Yeah. Yeah. Yeah. So you mentioned a bit about the training data challenge that you ran into there, and this is related to that last problem we talked about with entering and exiting physical spaces. Is that correct? Or is it--? Adina Trufinescu: [00:25:18] Yeah, that's right. So this is where ground truth was also challenging. Take videos, and these videos can be 10 minutes to one hour. Depending on which space you are using, you could have few people or you could have a dozen or you could have a hundred people, right? So annotating that data frame by frame at 15 frames per second, that's a lot of work. Not only that. You have to track that the same person in this frame, across all the 15 frames times this many minutes, is the same person. It's possible but you don't want to do that. You don't want to ask any human to do that. So this is where-- Sam Charrington: [00:26:01] If I can just jump in. If the network isn't tracking the people but it's a combinatorial type of algorithm, is that a non-learned algorithm where you don't need to train on associating people or do you also move that-- Adina Trufinescu: [00:26:22] That is not a DNN. It's an algorithm and you don't have to train it. So what we are training is the people detection model, and then we are testing independently first the people detection model, and then we are testing the tracking aspect of it, and then we are testing the combinatorial algorithm. So that's where the ground truth needs to cover all the use cases. But then the most challenging one is the one where you have to generate ground truth that annotates each person and the anonymized identity of each person across the frames. Sam Charrington: [00:27:04] Okay. Yeah. I was trying to make sure that you actually had to track that, because that would seem to make the data collection process quite a bit more challenging when you're annotating the identity of folks. That can be difficult - if we're talking about images that look like an overhead image of Grand Central Station or something, I would imagine that to be difficult for a human annotator. Adina Trufinescu: [00:27:26] Yeah, right. So this is where synthetics plays the same role as before. We are generating all these synthetic videos where, not only do we want to make sure that it's the same person across the video, but you want to make sure that the positioning of the people in physical spaces across the use cases is as realistic as possible, and then you want to annotate that. You have the different camera angles, you have the different heights and you have the lighting conditions.
So trying to go into the real world to collect all that data, and then to annotate that data, that would be a real challenge. So this is where synthetics played a huge role and was a huge time saver. Sam Charrington: [00:28:12] Where does this synthetic data come from? Did you take an Xbox game that kind of looked like it had people in a crowd and try to use that, or did you develop a custom data generator for this problem? Adina Trufinescu: [00:28:27] It's pretty much the same technology that is being used for HoloLens and for mixed reality; the same kind of the technology that powers the [same] generation. We didn't take a game but the concept is very much game-like where you can overlay an image of actual physical space, and then you can start placing all these characters into the 3D space, and then generating the video streams out of that. Then, because you can play with the physics and then with the lighting, you can have a great variation. That is actually what we need to assure the high quality of the AI models and of the combinatorial algorithm. Sam Charrington: [00:29:14] Is that synthetic data approach also related to the camera placement approach that you mentioned? Are you varying the camera angle as part of this synthetic generation? Adina Trufinescu: [00:29:25] Yeah, that's right. So, Computer Vision has a custom vision and we want people to go and create custom vision models, but to the extent where they don't have to, and then we can actually save them time by creating these high quality models which perform great in all of these conditions. We want to do that. So the goal there was that when we train and when we test, we test with data from all these various conditions. So, part of the synthetic data was to-- Like the ceiling in a retail space is different than a ceiling in a manufacturing space. So this is where you need to bring in that variation. Sam Charrington: [00:30:08] Okay. From a customer perspective, are they sending you pictures from their camera  and there's a model that figures out where their camera might be? You said that you don't want them to have to send you measurements or anything like that. What's the input to that process? Adina Trufinescu: [00:30:27] So, we do not collect data from customers.  In the product, none of the video that is being processed is used for training. So the way we are approaching this is visiting customers, looking and learning about their environment, and learning about the parameters of the environment such that we can simulate it. Then we also create simulations of the real world scenarios. Obviously not manufacturing but you might use something like a store layout. That's something that you can emulate fairly easily, and then in that scenario, you have something where the camera is at 10 feet or camera is at 20 feet. Then you're  looking at the different angles and the different areas in the store where you want to apply the person crossing zone, person crossing line. That's how you  generate the synthetics data. Sam Charrington: [00:31:24] Got it.  Okay. Finally, you started to mention a kind of measurement and some of the challenges that measurements pose for this problem. Can you elaborate on the way you score these models and how you assess their accuracy? Adina Trufinescu: [00:31:45] So we applied the MOT Challenge, and then we used the data set to track the accuracy of the person detection and the person tracking model. We applied the MOT Accuracy and precision formulas. 
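For readers unfamiliar with the MOT Challenge formulas mentioned here, the standard CLEAR MOT accuracy metric (MOTA) combines misses, false positives, and identity switches over the whole sequence. A small sketch for reference; the numbers in the example are invented.

```python
# Quick reference for the CLEAR MOT accuracy formula referenced above:
# MOTA = 1 - (misses + false positives + identity switches) / total ground-truth objects.
def mota(misses, false_positives, id_switches, num_ground_truth):
    """Multiple Object Tracking Accuracy over a whole sequence."""
    if num_ground_truth == 0:
        raise ValueError("need at least one ground-truth object")
    return 1.0 - (misses + false_positives + id_switches) / num_ground_truth

# Invented example: 120 misses, 80 false positives, 15 identity switches over 5,000 annotations.
print(mota(misses=120, false_positives=80, id_switches=15, num_ground_truth=5000))  # 0.957
```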
Sam Charrington: [00:32:04] MOT Challenge - Multi-Object Challenge, [inaudible]? Adina Trufinescu: [00:32:09] Multi-Object Tracking Challenge. So, we apply the industry standards to assess the precision and accuracy of the model. But the thing that we did a bit differently was that the actual output that goes to the customer is not actually, frame by frame, the result of the detection or the tracking. What we actually send to the customer is the count of people, the distance between people, the time they spent in a zone, or the entry and exit events in the zone, such that they can calculate the dwell time. So we looked at the use cases, and we came up with accuracy measures specific to the scenario, and then we generated ground truth such that we can test holistically, not only the tracking part of the algorithm but the end-to-end algorithm between tracking, association, and applying this logic, like person crossing in and out of the zone or person crossing in and out of the line. Sam Charrington: [00:33:11] So did you extend the challenge benchmark to your specific use cases and the higher level metrics that you're providing to customers, or did you have a separate parallel path that was more reflective of your specific kind of use case specific numbers? Adina Trufinescu: [00:33:30] It's pretty much specific to the use case. To give you an example, for the person entering and exiting the zone, we looked at what we call dwell time, which is a fairly common use case for what people want to measure. Then we looked at the timestamps for the ground truth. We created ground truth by looking at the timestamps of people entering and exiting the zone. Then we created measures for dwell time, entering, or exiting. It helped us assure that the accuracy of the end product, which is what the customer is consuming, is at a level that satisfies the customer requirements. Sam Charrington: [00:34:22] With these measurements in mind, did you give up a lot going from the huge DNNs to more compact DNNs and changing frame rates, and things like that? All these things that you needed to do to deliver a product that worked in the kind of environment that you were targeting, did you lose a lot in accuracy for the measurements that you're trying to provide? Adina Trufinescu: [00:34:49] Not really. The goal is to gain in accuracy. You have to make tradeoffs and then you have to balance. It's always like a tug of war between accuracy and performance, working with customers - that's why we have these public previews. Before the public preview, we had the private preview. So, we worked closely with a set of customers to validate the accuracy of the end-to-end algorithm for their use cases. There were some learnings that we took away and that's how we arrived at making the right trade-offs, such that the accuracy, the performance, and the cost of the end-to-end solutions all make sense. Sam Charrington: [00:35:31] Awesome. Awesome. You presented on this at Ignite this week when you unveiled the public preview release. Any takeaways from your presentation or the reception to it? Adina Trufinescu: [00:35:44] So, it was well received. I would say that you stay so focused on performance and accuracy, and then the feedback that we got was very strong feedback. For instance, the measure between people, we provided only in feet. Obviously, you have to stay focused on everything that matters.
I mean, we were trying to move fast and everything happened so fast - this is something that we planned during the pandemic months. Then the six feet that you hear every day stuck with us. Then we realized that our customers need the metric system. So we had feedback like that. But at this point, we are very excited to have the customer [stride] and I'm pretty sure that there will be more learnings. Sam Charrington: [00:36:41] Awesome. Awesome. Well, we'll be sure to link out to the service where folks can find it in the show notes, but thanks so much for taking the time to share with us an update on this new service, and what you're up to. Adina Trufinescu: [00:36:58] Yeah, it was my pleasure. Thank you for having me. Sam Charrington: [00:37:00] Thanks, Adina.
Hosting models and productionizing them is a pain point. Let’s fix that. Imagine a stream processing platform that leverages ML models and requires real-time decisions. While most solutions provide tightly coupled ML models in the use case, these may not offer the most efficient way for a data scientist to update or roll back a model. With model as a service, disrupting the flow and relying on technical engineering teams to deploy, test, and promote their models is a thing of the past. It’s time to focus on building a decoupled service-based architecture while upholding engineering best practices and delivering gains in terms of model management and deployment. Other benefits also include empowering data scientists by supporting patterns such as A/B testing, multi-armed bandits, and ensemble modeling. Sumit and John demonstrate their work with a reference architecture implementation for building the set of microservices and lay down, step by step the critical aspects of building a well-managed ML model deployment flow pipeline that requires validation, versioning, auditing, and model risk governance. They discuss the benefits of breaking the barriers of a monolithic ML use case by using a service-based approach consisting of features, models, and rules. Join in to gain insights into the technology behind the scenes that accepts models built using popular libraries like H2O, Scikit-learn, or TensorFlow and serve them via REST/gRPC which makes it easy for the models to integrate into business applications and services that need predictions.
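As a concrete illustration of the model-as-a-service pattern this abstract describes, the sketch below wraps a scikit-learn model behind a small REST endpoint so applications depend only on the service contract rather than on the model internals. The route, model path, and payload shape are hypothetical, and this is not the reference architecture from the talk.

```python
# A minimal, illustrative model-as-a-service endpoint: a pre-trained scikit-learn
# model served over REST so client applications never touch the model directly.
import joblib
from flask import Flask, jsonify, request

app = Flask(__name__)
model = joblib.load("model.pkl")   # hypothetical path to a trained scikit-learn model

@app.route("/predict", methods=["POST"])
def predict():
    # Expects a JSON body like {"instances": [[5.1, 3.5, 1.4, 0.2], ...]}
    instances = request.get_json()["instances"]
    predictions = model.predict(instances).tolist()
    return jsonify({"predictions": predictions, "model_version": "v1"})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```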
Sam Charrington: Today we're excited to present the final episode in our AI for the Benefit of Society series, in which we're joined by Mira Lane, Partner Director for Ethics and Society at Microsoft. Mira and I focus our conversation on the role of culture and human-centered design in AI. We discuss how Mira defines human-centered design, its connections to culture and responsible innovation, and how these ideas can be scalably implemented across large engineering organizations. Before diving in I'd like to thank Microsoft once again for their sponsorship of this series. Microsoft is committed to ensuring the responsible development and use of AI and is empowering people around the world with this intelligent technology to help solve previously intractable societal challenges spanning sustainability, accessibility, and humanitarian action. Learn more about their plan at Microsoft.ai. Enjoy. Mira Lane: [00:00:09] Thank you, Sam. Nice to meet you. Sam Charrington: [00:00:11] Great to meet you and I'm excited to dive into this conversation with you. I saw that you are a video artist and technologist by background. How did you come to, you're looking away, is that correct? Mira Lane: [00:00:28] No, that's absolutely true. Sam Charrington: [00:00:30] Okay. So I noted that you're a video artist. How did you come to work at the intersection of ethics and society and AI? Mira Lane: [00:00:42] For sure. So let me, Sam, let me give you a little bit of a background on how I got to this point. I actually have a mathematics and computer science background from the University of Waterloo in Canada. So I've had an interesting journey, but I've been a developer, program manager, and designer, and when I think about video art and artificial intelligence, I'll touch on artificial intelligence first and then the video art, but a few years ago I had the opportunity to take a sabbatical and I do this every few years. I take a little break, reflect on what I'm doing, retool myself as well. So I decided to spend three months just doing art. A lot of people take a sabbatical and they travel but I thought I'm just gonna do art for three months and it was luxurious and very special. But then I also thought I'm going to reflect on my career at the same time and I was looking at what was happening in the technology space and feeling really unsettled about where technology was going, how people were talking about it, the way I was seeing it affect our societies and I thought I want to get deeper into the AI space. So when I came back to Microsoft, I started poking around the company and said is there a role in artificial intelligence somewhere in the company? And something opened up for me in our AI and Research group where they were looking for a design manager. So I said absolutely. I'll run one of these groups for you, but before I take the role, I'm demanding that we have an ethics component to this work because what they were doing is they were taking research that was in the AI space and figuring out how do we productize this? Because at that point, research was getting so close to engineering that we were developing new techniques and you were actually able to take those to market fairly quickly and I thought this is a point where we can start thinking about responsible innovation and let's make that a formalized practice.
So me taking the role for the design manager was contingent on us creating a spot for ethics at the same time and so backing up a little bit, the video part comes in because I have traditionally been a really analog artist. I've been a printmaker, a painter, and during my sabbatical, I sought some more digitized, looked at digitizing some of the techniques that I was playing with on the analog side. I thought let me go play in the video space for a while. So for three months, like I said, I retooled and I started playing around with different ways of recording, editing, and teaching myself some of these techniques and one of the goals I set out at the time was well, can I get into a festival? Can I get into a music or video festival? So that was one of my goals at the end of the three months. Can I produce something interesting enough to get admitted into a festival? And I won a few, actually. Sam Charrington: [00:03:46] That's fantastic. Mira Lane: [00:03:46] So I was super pleased. I'm like okay, well that means I've got something there I need to continue practicing. But that for me opened up a whole new door and one of the things that I did a few years ago also was to explore art with AI, and could we create a little AI system that could mimic my artwork and become a little co-collaborator with myself? So we can dig into that if you want, but it was a really interesting journey around can AI actually compliment an artist or even replace an artist? So there's interesting learnings that came out of that experience. Sam Charrington: [00:04:25] Okay. Interesting, interesting. We're accumulating a nice list of things to touch on here. Mira Lane: [00:04:30] Yeah, absolutely. Sam Charrington: [00:04:31] Ethics and your views on that was at the top of my list, but before we got started, you mentioned work that you've been doing exploring culture and the intersection between culture and AI and I'm curious what that means for you. It's certainly a topic that I hear brought up quite a bit. Particularly when I'm talking to folks in enterprises that are trying to adopt AI technologies and you hear all the time well one of the biggest things we struggle with is culture. So maybe, I don't know if that's the right place to start, but maybe we'll start there. What does that mean for you when you think about culture in AI? Mira Lane: [00:05:12] Yeah, no, that's a really good question, and I agree that one of the biggest things is culture and the reason why I say that is if you look at every computer scientist that's graduating, none of us have taken an ethics class and you look at the impact of our work, it is touching the fabric of our society. Like it's touching our democracies and our freedoms, our civil liberties, and those are powerful tools that we're building, yet none of us have gone through a formal ethics course and so the discipline is not used to talking about this. It's a few years ago you're like I'm just building a tool. I'm building an app. I'm building a platform that people are using, and we weren't super introspective about that. It wasn't part of the culture, and so when I think about culture in the AI space, because we're building technologies that have scale and power, and are building on top of large amounts of data that empower people to do pretty impressive things, this whole question of culture and asking ourselves, well what could go wrong? How could this be used? Who is going to use it directly or indirectly? 
And those are parts of the culture of technology that I don't think has been formalized. You usually hear designers talking about that kind of thing. It's part of human-centered design. But even in the human-centered design space, it's really about what is my ideal user or my ideal customer and not thinking about how could we exploit this technology in a way that we hadn't really intended? We've talked about that from an engineering context the way we do threat modeling. How could a system be attacked? How do you think about denial of service attacks? Things like that. But we don't talk about it from a how could you use this to harm communities? How could you use this to harm individuals or how could this be inadvertently harmful? So those parts of cultures are things that we're grappling with right now and we're introducing into our engineering context. So my group sits at an engineering level and we're trying to introduce this new framework around responsible innovation and there's five big components to that. One is being able to anticipate, look ahead, anticipate different futures, look around corners and try to see where the technology might go. How someone could take it, insert it into larger systems, how you could do things at scale that are powerful that you may not intend to do. There's a whole component around this responsible innovation that is around reflection and looking at yourselves and saying where do we have biases? Where are we assuming things? What are our motivations? Can we have an honest conversation about our motivations? Why are we doing this and can we ask those questions? How do we create the space for that? We've been talking about diversity and inclusion like how do you bring diverse voices into the space, especially people that would really object to what you're doing and how do you celebrate that versus tolerate that? There's a big component around our principles and values and how do you create with intention and how do you ensure that they align with the principles and they align with their values and they're still trustworthy? So there's a whole framework around how we're thinking about innovation in the space and at the end of the day it comes down to the culture of the organization that you're building because if you can't operate at scale, then you end up only having small pockets of us that are talking about this versus how do we get every engineer to ask what's this going to be used for? And who's going to use it? Or what if this could happen? And we need people to start asking those types of questions and then start talking about how do we architect things in a way that's responsible. But I'd say most engineers probably don't ask those types of questions right now. So we're trying to build that into the culture of how we design and develop new technologies. Sam Charrington: [00:09:14] Mm-hmm (affirmative). One of the things that I often find frustrating about this conversation particularly when talking to technology vendors is this kind of default answer while we just make the guns, we don't shoot them. We just make the technologies. They can be used for good. They can also be used for bad, but we're focused on the good aspects. It sounds like, well, I'm curious, how do you articulate your responsibility with the tools that you're creating? Or Microsoft's responsibility with the tools it's creating. Do you have a- Mira Lane: [00:09:55] Well I have a very similar reaction to you when I hear oh, we're just making tools. I think, well, fine. 
That's one perspective, but the responsible perspective is we're making tools and we understand that they can be used in these ways and we've architected them so that they cannot be misused and we know that there will be people that misuse them. So I think you're hearing a lot of this in the technology space and every year there's more and more of it where people are saying look, we have to be responsible. We have to be accountable. So I think we'll hear fewer and fewer people saying what you're hearing, what I'm hearing as well. But one of the things we have to do is we have to avoid the ideal path and just talking only about the ideal path. Because it's really easy to just say here's the great ways that this technology is going to be used and not even talk about the other side because then, again, we fall into that pattern of well, we only thought about it from this one perspective, and so one of the things that my group is trying to do is to make it okay to talk about here's how it could go wrong so that it becomes part of our daily habit and we do it at various levels. We do it at our all hands, so when people are showing our technology, we have them show the dark side of it at the same time so that we can talk about that in an open space and it becomes okay to talk about it. No one wants to share the bad side of technology. No one wants to do that. But if we make it okay to talk about it, then we can start talking about well, how do we prevent that? So we do that at larger forums and I know this is a podcast, but I wanted to show you something. So I'll talk about it, but we created, it's almost like a game, but it's a way for us to look at different stakeholders and perspectives of what could happen. So how do we create a safe environment where you can look at one of our ethical principles. You can look at a stakeholder that is interacting with the system and then you say well if the stakeholder for example is a woman in a car and your system is a voice recognition system, what would she say if she gave it a one star review? She would probably say I had to yell a lot and it didn't recognize me because we know that most of our systems are not tuned to be diverse, right? So we start creating this environment for us to talk about these types of things so that it becomes okay again. How do we create safe spaces? Then as we develop our scenarios, how do we bring those up and track them and say, well how do we fix it now that we've excavated these issues? Well, let's fix it and let's talk about it. So that's, again, part of culture. How do we make it okay to bring up the bad parts of things, right? So it's not just the ideal path. Sam Charrington: [00:12:46] Mm-hmm (affirmative). Do you run into, or run up against engineers or executives that say, introspection, safe spaces, granola? What about the bottom line? What does this mean for us as a business? How do we think about this from a shareholder perspective? Mira Lane: [00:13:09] It's interesting, I don't actually hear a lot of that pushback because I think internally at Microsoft, there is this recognition of hey, we want to be really thoughtful and intentional and I think the bigger issue that we hear is how do we do it? It's not that we don't want to. It's well, how do we do it and how do we do it at scale? So what are the different things you can put in place to help people bring this into their practice? 
And so there isn't a pushback around well, this is going to affect my bottom line, but there's more of an understanding that yeah, if we build things that are thoughtfully designed and intentional and ethical that it's better for our customers. Our customers want that too, but then again the question is how do we do it and where is it manifest? So there's things that we're doing in that space. When you look at AI, a big part of it is data. So how do you look at the data that's being used to power some of these systems and say is this a diverse data set? Is this well rounded? Do we have gaps here? What's the bias in here? So we start looking at certain components of our systems and helping to architect it in a way that's better. I think all of our customers would want a system that recognized all voices, right? Because again, to them, they wouldn't want a system that just worked for men, it didn't work for women. So again, it's a better product as a result. So if we can couch it in terms of better product, then I think it makes sense versus if it's all about us philosophizing and only doing that, I don't know if that's the best. Only doing that is not productive, right? Sam Charrington: [00:14:59] Do you find that the uncertainty around ethical issues related to AI has been an impediment to customers adopting it? Does that get in the way? Do they need these issues to be figured out before they dive in? Mira Lane: [00:15:22] I don't think it's getting in the way, but I think what I'm hearing from customers is help us think about these issues and a lot of people, a lot of customers don't understand AI deeply, right? It's a complex space and a lot of people are ramping up in it. So the question is more about what should I be aware of? What are the questions that I should be asking and how can we do this together? We know you guys are thinking about this deeply. We're getting just involved in it, a customer might say, and so it's more about how do we educate each other? And for us if we want to understand, how do you want to use this? Because sometimes we don't always know the use case for the customer so we want to deeply understand that to make sure that what we're building actually works for what they are trying to do, and from their perspective they want to understand well how does this technology work and where will it fail and where will it not work for my customers? So the question of ethics is more about we don't understand the space well enough, help us understand it and we are concerned about what it could do and can we work together on that? So it's not preventing them from adopting it, but there's definitely a lot of dialog. It comes up quite a bit around we've heard this. We've heard bias is an issue. Well, what does that mean? Sam Charrington: [00:16:47] Right. Mira Lane: [00:16:47] So I think that's an education opportunity. Sam Charrington: [00:16:49] When you think about ethics from a technology innovation perspective, are there examples of things that you've seen either that Microsoft is doing or out in the broader world that strike you as innovative approaches to this problem? Mira Lane: [00:17:12] Yeah, I'll go back to the data side of things just briefly, but there's this concept called data sheets, which I think is super interesting. You're probably really familiar with that and- Sam Charrington: [00:17:25] I've written about some of the work that Timnit Gebru and some others with Microsoft have done around data sheets for data sets. 
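For context on the datasheets idea referenced here, a machine-readable datasheet record might look roughly like the sketch below, with fields loosely adapted from the Datasheets for Datasets proposal; the field names and example values are illustrative, not Microsoft's internal spec.

```python
# An illustrative "nutrition label for your data" record; fields loosely adapted
# from the Datasheets for Datasets idea, values entirely made up.
from dataclasses import dataclass, field

@dataclass
class Datasheet:
    name: str
    motivation: str            # why the dataset was created
    composition: str           # what is in it, and what is known to be missing
    collection_process: str    # how and under what consent it was gathered
    recommended_uses: list = field(default_factory=list)
    out_of_scope_uses: list = field(default_factory=list)
    version: str = "0.1"

voice_data = Datasheet(
    name="in-car-voice-commands (hypothetical)",
    motivation="Improve wake-word recognition in noisy cabins",
    composition="40k utterances; under-represents higher-pitched voices",
    collection_process="Opt-in recordings from study participants",
    recommended_uses=["wake-word model evaluation"],
    out_of_scope_uses=["speaker identification"],
)
```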
Mira Lane: [00:17:31] Exactly, and the interesting part for us is how do you put it into the platform? How do you bake that in? So one of the pieces of work that we're doing is we're taking this notion of data sheets and we are applying it into how we are collecting data and how we're building out our platform. So I think that that's, I don't know if it's super novel because to me it's like a nutrition label for your data. You want to understand: how is it collected? What's in it? How can you use it? But I think that that's one where now as people leave the group you want to make sure that there's some history and understanding of the composition of it. There's some regulation around how we manage it internally and how we manage data in a thoughtful way. I think that's just a really interesting concept that we should be talking about more as an industry and then can we share data between each other in a way that's responsible as well? Sam Charrington: [00:18:24] Right. I don't know that the data sheet, I think inherent to the idea was that hey, this isn't novel. In fact, look at electrical components and all these other industries that do this. It's just "common sense". But what is a little novel, I think, is actually doing it. So since that paper was published, several companies have published similar takes, model cards, and there have been a handful and every time I hear about them I ask okay, when is this? When are you going to be publishing these for your services and the data sets that you're publishing? And no one's done it yet. So it's intriguing to hear you say that you're at least starting to think in this way internally. Do you have a sense for what the path is to publishing these kinds of, whether it's a data sheet or a card or some kind of set of parameters around bias either in a data set or a model for a commercial public service? Mira Lane: [00:19:41] Yeah, absolutely. We're actually looking at doing this for facial recognition and we've publicly commented about that, we've said, hey we're going to be sharing for our services what it's great for, what it's not, and so that stuff is actually actively being worked on right now. You'll probably see more of this in the next few weeks, but there is public comment that's going to come out with more details about it and I'll say that on the data sheet side, I think a large portion of it is it needs to get implemented in the engineering systems first and you need to find the right place to put it. So that's the stuff that we're working on actively right now. Sam Charrington: [00:20:25] Can you comment more on that? It does, as you say that, it does strike me a little bit as one of these iceberg kind of problems. It looks very manageable kind of above the waterline but if you think about what goes into the creation of a data set or a model, there's a lot of complexity and certainly at the scale that Microsoft is working, it needs to be automated. What are some of the challenges that have come into play in trying to implement an idea like that? Mira Lane: [00:21:01] Well, let me think about this for a second so I can frame it the right way. The biggest challenge for us on something like that is really thinking through the data collection effort first and spending a little bit of time there. That's where we're actually spending quite a bit of time as we look at, so let me back up for a second.
I work in an engineering group that touches all the speech, language, and vision technologies and we do an enormous amount of data collection to power those technologies. One of the things that we're first spending time on is looking at exactly how we're collecting data and going through those methodologies and saying is this the right way that we should be doing this? Do we want to change it in any way? Do we want to optimize it? Then we want to go and apply that back in. So you're right, this is a big iceberg because there's so many pieces connected to it and the spec for data sheets and the ones we've seen are large and so what we've done is how do we grab the core pieces of this and implement and create the starting point for it? And then scale over time add versioning, being able to add your own custom scheme list to it and scale over time, but what is the minimum piece that we can put into this system and then make sure that it's working the way we want it to? So it's about decomposing the problem and saying which ones do we want to prioritize first? For us, we're spending a lot of time just looking at the data collection methodologies first because there's so much of that going on and at the same time, what is the minimum part of the data sheet spec that we want to go and put in and then lets start iterating together on that. Sam Charrington: [00:22:41] It strikes me that these will be most useful when there's kind of broad industry adoption or at least coalescence around some standard whether it's a standard minimum that everyone's doing and potentially growing over time. Are you involved in or aware of any efforts to create something like that? Mira Lane: [00:23:02] Well I think that that's one piece where it's important. I would say also in a large corporation, it's important internally as well because we work with so many different teams and we're interfacing with, we're a platform but we interface with large parts of our organization and being able to share that information internally, that is a really important piece to the puzzle as well. I think the external part as well, but the internal one is not any less important in my eyes because that's where we are. We want to make sure that if we have a set of data, that this group A is using it in one way. If group B wants to use it, we want to make sure they have the rights to use it. They understand what it's composed of, where it's orientation is and so that if they pick it up, they do it with full knowledge of what's in it. So for us internally it's a really big deal. Externally, I've heard pockets of this but I don't think I can really comment on that yet with full authority. Sam Charrington: [00:24:03] I'm really curious about the intersection between ethics and design and you mentioned human-centered design earlier. My sense is that that phrase kind of captures a lot of that intersection. Can you elaborate on what that means for you? Mira Lane: [00:24:20] Yeah, yeah. So when you look at traditional design functions, when we talk about human-centered design, there's lots of different human-centered design frameworks. The one I typically pick up is Don Norman's emotional design framework where he talks about behavioral design, reflective design, and visceral design. And so behavior is how is something functioning? What is the functionality of it? Reflective is how does it make you feel about yourself? How does it play to your ego and your personality? And visceral is the look and feel of that. 
That's a very individual oriented approach to design and when I think about these large systems, you actually need to bring in the ecosystem into that. So how does this object you're creating or this system you're creating, how does it fit into the ecosystem? So one of the things we've been playing around with is we've actually reached into adjacent areas like agriculture and explore how do you do sustainable agriculture? What are some of those principles and methodologies and how do you apply that into our space? So a lot of the conversations we're having is around ecosystems and how do you insert something into the ecosystem and what happens to it? What is the ripple effect of that? And then how do you do that in a way that keeps that whole thing sustainable? It's a good solution versus one that's bad and causes other downstream effects. So I think that those are changes that we have to have in our design methodology. We're looking away from the one artifact and thinking about it from a here's how the one user's going to work with it versus how is the society going to interact with it? How are different communities going to interact with it and what does it do to that community? It's a larger problem and so there's this shift in design thinking that we're trying to do with our designers. So they're not just doing UI. They're not just thinking about this one system. They're thinking about it holistically. And there isn't a framework that we can easily pick up, so we have to kind of construct one as we're going along. Sam Charrington: [00:26:28] Yeah, for a while a couple of years ago maybe I was in search of that framework and I think the motivation was just really early experiences of seeing AI shoved into products in ways that were frustrating or annoying. For example, a Nest thermostat. It's intended to be very simple, but it's making these decisions for you in a way that you can't really control and it's starting me down this path of what does it mean to really, build out a discipline of design that is aware of AI and intelligence? I've joked on the podcast before, I call it intelligent design, but that's an overloaded term. Mira Lane: [00:27:23] Totally is. Sam Charrington: [00:27:24] But is there a term for that now or people thinking about that? How far have we come in building out a discipline or a way of thinking of what it means to build intelligence into products? Mira Lane: [00:27:37] Yeah, we have done a lot of work around education for our designers because we found a big gap between what our engineers were doing and talking about and what our designers had awareness over. So we actually created a deep learning for designers workshop. It was a two day workshop and it was really intensive. So we took neural nets, convolutions, all these concepts and introduced them to designers in a way that designers would understand it. We brought it to here's how you think about it in terms of photoshop. Here's how you think about it in terms of the tools you're using and the words you use there, here’s  how it applies. Here's an exercise where people had to get out of their seats and create this really simple neural net with human beings and then we had coding as well. So they were coding in Python and in notebooks, so they were exposed to it and we exposed them to a lot of the techniques and terminology in a way that was concrete and they were able to then say oh, this is what style transfer looks like. Oh, this is how we constructed a bot. 
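The hands-on portion of the workshop Mira describes, where designers train a very small neural network in a Python notebook, can be made concrete with a rough sketch like the one below. This is illustrative only; the actual workshop materials are not public, so the framework choice (plain NumPy), the tiny XOR task, and every name and number here are assumptions rather than Microsoft's curriculum.

import numpy as np

# Tiny two-layer network trained on XOR, the sort of toy exercise described above.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # input -> hidden weights
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # hidden -> output weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for step in range(5000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    # backward pass: gradients of squared error through both layers
    dp = (p - y) * p * (1 - p)
    dh = dp @ W2.T * h * (1 - h)
    # gradient descent updates
    W2 -= lr * (h.T @ dp); b2 -= lr * dp.sum(axis=0)
    W1 -= lr * (X.T @ dh); b1 -= lr * dh.sum(axis=0)

print(np.round(p, 2))  # should move toward [[0], [1], [1], [0]]

A designer who has stepped through something like this once has a concrete anchor for terms like weights, loss, and training, which is the shared-vocabulary point Mira makes next.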
So first on the design side, I think having the vocabulary to be able to say oh, I know what this word means. Not just I know what it means, but I've experienced it, so that I can have a meaningful discussion with my engineer, I think that was an important piece, and then understanding how AI systems are just different from regular systems. They are more probabilistic in nature. The defaults matter. They can be self-learning, and so we think about these and have started to showcase case studies with our designers to understand that these types of systems are quite different from the deterministic types of systems that they may have designed for in the past. Again, I think it comes back to culture, and we keep doing these workshops. Every quarter we'll do another one because we have so much demand for it, and we found even engineers and PMs will come to our design workshops. But kind of democratizing the terminology a little bit and making it concrete to people is an important part of this. Sam Charrington: [00:29:48] It's interesting to think about what it does to a designer's design process to have more intimate knowledge of these concepts. At the same time a lot of the questions that come to mind for me are much higher level concepts in the domain of design. For example, we talk about user experience. To what degree should a user experience AI, if that makes any sense? Should we be trying to make AI or this notion of intelligence invisible to users or very visible to users? This has come up recently in, for example, I'm thinking of Google Duplex when they announced that that system was gonna be making phone calls to people and there was a big kerfuffle about whether that should be disclosed. Mira Lane: [00:30:43] Yeah. Sam Charrington: [00:30:43] I don't know that there's a right answer. In some ways you want some of this stuff to be invisible. In other ways, tying back to the whole ethics conversation, it does make sense that there's some degree of disclosure. Mira Lane: [00:30:57] Yeah, absolutely. Sam Charrington: [00:30:58] I imagine as a designer, this notion of disclosure can be a very nuanced thing. What does that even mean? Mira Lane: [00:31:03] Yeah, yeah. And it's all context dependent and it's all norm dependent as well, because if you were to look into the future and say are people more comfortable, I mean look at airports for example. People are walking through just using face ID, using the CLEAR system, and a few years ago, I think if you asked people would you feel comfortable doing that? Most people would say no, I don't feel comfortable doing that. I don't want that. So I think in this space, because it's really fluid and new norms are being established and things are being tested out, we have to be on top of how people are feeling and thinking about these technologies so that we understand where some disclosure needs to happen and where things don't. In a lot of cases you almost want to assume disclosure for things that are very consequential and high stakes, or where there is opportunity for deception. In the Duplex case you have to be thoughtful about that. So this isn't one where you can say okay, you should always disclose. It just depends on the context. So we have this notion of consequential scenarios: where there's automated decision making, or where there are high stakes. Those are the ones where we put a little bit more due diligence and start to be more thoughtful.
Then we have other types of scenarios which are more systems-oriented and here's some things that are operationally oriented and they end up having different types of scenarios, but we haven't been able to create a here's the exact way you do every single, you approach it in every single way. So it is super context dependent and expectation dependent. Maybe after a while you get used to your Nest thermostat and you're fine with the way it's operating, right? So I don't know. These social norms are interesting because they are, someone will go and establish something or they'll test the waters. Google Glass tested the waters and that was a cultural response, right? People responded and said I don't want to be surveilled. I want to be able to go to a bar and get a drink and not have someone recording me. Sam Charrington: [00:33:21] Right. Mira Lane: [00:33:22] So I think we have to understand where society is relative to what the technologies are that we're inserting into them. So again, it comes back to are we listening to users? Are we just putting tech out there? I think we really have to start listening to users. My group has a fairly large research component to it and we spend a lot of time talking to people. Especially in the places where we're going to be putting some tech and understanding what it's going to do to the dynamic and how they're reacting to it. Sam Charrington: [00:33:52] Mm-hmm (affirmative). Mm-hmm (affirmative). Yeah, it strikes me that maybe it's kind of the engineer background in me that's looking for a framework, a flowchart for how we can approach this problem and I need to embrace more of the design or it's like every product, every situation is different and it's more about a principled approach as opposed to a process. Mira Lane: [00:34:18] Absolutely. It's more about a principled and intentional approach. So what we're just talking about is everything that you're choosing, are you intentional about that choice and are you very thoughtful about things like defaults? Because we know that people don't change them and so how do you think about every single design choice and being principled and then very intentional and evidence-driven. So we pushed this onto our teams and I think some of our teams maybe don't enjoy being with us sometimes as a result but we say look, we're going to give you some recommendations that are going to be principled, intentional, and evidence-driven and we want to hear back from you if you don't agree on your evidence and why you're saying this is a good or bad idea. Sam Charrington: [00:34:59] Mm-hmm (affirmative). Mira Lane: [00:35:00] That's the way you have to operate right now because it is so context driven. Sam Charrington: [00:35:04] I wonder if you can talk through some examples of how human-centered design, AI, all these things come together in the context of kind of concrete problems that you've looked at. Mira Lane: [00:35:13] Yeah, I was thinking about this because a lot of the work that we do is fairly confidential, but there's one that I can touch on, which was shared at build earlier this year and that was a meeting room device and I don't know if you remember this, but there's a meeting room device that we're working on that recognizes who's in the room and does transcription of that meeting, and to me, as someone who is a manager, I love the idea of having a device in the room that captures action items and who was here and what was said. 
So we started looking at this and we said okay, well let's look at different types of meetings and people, and let's look at categories of people that this might affect differently. And so how do you think about introverts in a meeting? How do you think about women and minorities because there are subtle dynamics that are happening in meetings that make some of these relationships, they can reinforce certain types of stereotypes or relationships and so we started interviewing people in the context of this sort of meeting room device and this is research that is pretty well recognized. It's not novel research, but it reinforced the fact that when you start putting in things that will monitor anyone that's in a room, certain categories of people behave differently and you see larger discrepancies and impact with women, minorities, more junior people. So we said wow, this is really interesting because as soon as you put a recording device in a room, it's gonna subtly shift the dynamic where some people might talk less or some people might feel like they're observed or depending on if there's a manager in the room and there's a device in the room, they're going to behave differently and does that result in a good meeting or a bad one? We're not sure. But that will affect the dynamic. And so then we took a lot of this research and we went back to the product team and said well how do we now design this in such a way that we design with privacy first in mind? And make users feel like they're empowered to opt into it and so we've had discussions like that, especially around these types of devices where we've seen big impact to how people behave. But it's not like a hard guideline. There's not really a hard set of rules around what you have to do, but because all meetings are different. You have brainstorming ones that are more about fluid ideas. You don't really care who said what, it's about getting the ideas out. You have ones where you're shipping something important and you wanna know who said what because there are clear action items that go with them and so trying to create a system that works with so many different nuanced conversations and different scenarios is not an easy one. So what we do is we'll run alongside with a product team and while they're engineering, they're developing their work, we will take the research that we've gathered and we'll create alternatives for them at the same time so that we can run alongside of them. We can say hey, here's option A, B, C, D, and E. Let's play with these and maybe we come up with a version that mixes them all together. But it gives them options to think about. Because again, it comes back to oh, I might not have time to think about all of this. So how do we empower people with ideas and concrete things to look at? Sam Charrington: [00:38:35] Yeah, I think that example's a great example of the complexity or maybe complexity's not the right word, but the idea that your initial reaction might be like the exact opposite of what you need to do. Mira Lane: [00:38:51] Yep. Sam Charrington: [00:38:51] As you were saying this, I was just like oh, just hide the thing so no one knows it's there. It doesn't change the dynamic. It's like that's exactly wrong. Mira Lane: [00:38:58] You don't want to do that. Don't hide it. Sam Charrington: [00:38:59] Right, right. Mira Lane: [00:39:01] Yeah. And maybe that's another piece. 
I'm sorry to interrupt that, but one of the things I've noticed is our initial reaction is often wrong, and so how do we hold that at the same time that we give ourselves space to explore other things, and then keep an open mind and say okay, I have to adjust and change? Because hiding it would absolutely be an interesting option, but then you have so many issues with that, right? But again, it is about being able to have an open mindset and being able to challenge yourself in this space. Sam Charrington: [00:39:33] If we kind of buy in to the idea that folks that are working with AI need to be more thoughtful and more intentional, and maybe incorporate more of this design thinking element into their work, do you have a sense for where this does, or should, or needs to live within a customer organization? Mira Lane: [00:40:01] Yeah, I think it actually, and this is a terrible answer, but I think it needs to live everywhere in some ways, because one thing that we're noticing is we have corporate level things that happen. We have an Aether board. It's an advisory board that looks at AI technologies and advises, and that's at a corporate level and that's a really interesting way of approaching it, but it can't live alone, and so the thing that we have learned is that if we pair it with groups like mine that sit in the engineering context, that are able to translate principles, concepts, guidelines into practice, that sort of partnership has been really powerful, because we can take those principles and say well here's where it really worked and here's where it kind of didn't work, and we can also find issues and say well we're grappling with this issue that you guys hadn't thought about. How do you think about this and can we create a broader principle around it? So I think there's this strong cycle of feedback that happens. At the corporate level you establish what your values are, what your guidelines are, and what your approaches are. Then in the engineering context, you have a team that can problem solve and apply, and then you can create a really tight feedback loop between that engineering team and your corporate team so that you're continually reinforcing each other, because the worst thing would be just to have a corporate level thing that's just PR speak. You don't want that. Sam Charrington: [00:41:23] Right. Right. Mira Lane: [00:41:24] The worst thing would also be just to have it on the engineering level, because then you would have a very distributed mechanism of doing something that may not cohesively ladder up to your principles. So I think you kind of need both, and to have them work off each other, to really have something effective, and maybe there's other things as well, but so far this has been a really productive and iterative experiment that we're doing. Sam Charrington: [00:41:50] Do any pointers come to mind for folks that want to explore this space more deeply? Do you have a top three favorite resources or initial directions? Mira Lane: [00:42:02] Well, it depends on what you want to explore. So I was reading the AI Now report the other day. It's a fairly large, 65-page report around the impact of AI on different systems and different industries, and so if you're looking at getting up to speed on well, what areas is AI going to impact?
I would start with some of these types of groups, because I found that they are super thoughtful in how they're going into each space and understanding each space and then bubbling up some of the scenarios. So if you're thinking about AI from a 'how is it impacting?' perspective, those types of things are really interesting. On the engineering side, I actually spend a lot of time in a few Facebook groups. There are some big AI groups on Facebook, and they're always sharing here's the latest, here's what's going on, try this technique. So that keeps me up to speed on some of the things that are happening, and also arXiv, just to see what research is being published. On the design side I'm sort of mixed. I haven't really found a strong spot yet. I wish I had something in my back pocket that I can just refer to, but the thing that maybe has been on the theory side that has been super interesting is to go back to a few people that have made commentaries just around sustainable design. So I refer back to Wendell Berry quite a bit, the agriculturalist and poet, actually, who has really introspected how agriculture could be reframed. Ursula Franklin is also a commentator from Canada. She did a lot of podcasts or radio broadcasts a long time ago, and she has a whole series around technology and its societal impact, and if you replace a few of those words and put in some of our new age words, it would still hold true. So I think there's a lot of theory out there, but not a lot of here's really great examples of what you can do, because we're all still feeling out the space and we haven't found perfect patterns yet that you can democratize and share out broadly. Sam Charrington: [00:44:18] Well, Mira, thanks so much for taking the time to chat with us about this stuff. It's a really interesting space and one that I enjoy coming back to periodically, and I personally believe that this intersection of AI and design is one that's just wide open and should and will be further developed, and I'm kind of looking forward to keeping an eye on it, and I appreciate you taking the time to chat with me about it. Mira Lane: [00:44:49] Thank you so much, Sam. It was wonderful talking to you. Sam Charrington: [00:44:52] Thank you.
Sam Charrington: Today we're excited to continue the AI for the benefit of society series that we've partnered with Microsoft to bring to you. Today we're joined by Hanna Wallach, principal researcher at Microsoft Research. Hanna and I really dig into how bias and a lack of interpretability and transparency show up across machine learning. We discuss the role that human biases, even those that are inadvertent, play in tainting data, whether deployment of fair ML algorithms can actually be achieved in practice, and much more. Along the way, Hanna points us to a ton of papers and resources to further explore the topic of fairness in ML. You'll definitely want to check out the show notes page for this episode, which you'll find at twimlai.com/talk/232. Before diving in I'd like to thank Microsoft for their support of the show and their sponsorship of this series. Microsoft is committed to ensuring the responsible development and use of AI and is empowering people around the world with this intelligent technology to help solve previously intractable societal challenges, spanning sustainability, accessibility and humanitarian action. Learn more about their plan at Microsoft.ai. Enjoy. Sam Charrington: [00:02:18] All right everyone, I am on the line with Hanna Wallach. Hanna is a principal researcher at Microsoft Research in New York City. Hanna, welcome to This Week in Machine Learning and AI. Hanna Wallach:[00:00:11] Thanks, Sam. It's really awesome to be here. Sam Charrington: [00:00:14] It is a pleasure to have you on the show, and I'm really looking forward to this conversation. You are clearly very well known in the machine learning and AI space. Last year, you were the program chair at one of the largest conferences in the field, NeurIPS. In 2019, you'll be its general chair. But for those who don't know about your background, tell us a little bit about how you got involved and started in ML and AI. Hanna Wallach:[00:00:48] Sure. Absolutely. So I am a machine learning researcher by training, as you might expect. I've been doing machine learning for about 17 years now. So since way before this stuff was even remotely fashionable, or popular, or cool, or whatever it is nowadays. In that time, we've really seen machine learning change a lot. It's sort of gone from this weirdo academic discipline only of interest to nerds like me, to something that's so mainstream that it's on billboards, it's in TV shows, and so on and so forth. It's been pretty incredible to see that shift over that time. I got into machine learning sort of by accident, I think that's often what happens. I had taken some undergrad classes on information theory and stuff like that, found that to be really interesting, but thought that I was probably going to go into human computer interaction research. But through a research assistantship during the summer between my undergrad degree and my Master's degree, I ended up discovering machine learning, and was completely blown away by it. I realized that this is what I wanted to do. I've been focusing on machine learning in various different forms since then. My PhD was specifically on Bayesian latent variable methods, typically for analyzing text and documents. So topic models, that kind of thing. But during my PhD, I really began to realize that I'm not particularly interested in analyzing documents for the sake of analyzing documents, I'm interested in analyzing documents because humans write documents to communicate with one another.
It's really that underlying social process that I'm most interested in. So then during my postdoc, I started to shift direction from primarily looking at text and documents to thinking really about those social processes. So not just what are people saying, but also who’s interacting with whom, and thinking about machine learning methods for analyzing the structure and content of social processes in combination. I then dove into this much more when I got a faculty job, because I was hired as part of UMass Amherst’s Computational Social Science Initiative. So at that point I started focusing really in depth on this idea of using machine learning to study society. I established collaborations with a number of different social scientists, focusing on a number of different topics. Over the years, I've mostly ended up working with political scientists, and often study questions relating to government transparency, and still looking at sort of this whole idea of a social process consists of individuals, or groups of individuals interacting with one another, information that might be used in or arising from these interactions, and then the fact that these things might change over time. I often use one of these or two of these modalities, so structure, content, or dynamics, to learn about one or more of the other ones as well. As I continued to work in this space, I started to think more, not just about how we can use machine learning to study society, but the fact that machine learning is becoming much more prevalent within society. About four years ago, I started really thinking more about these issues of fairness, accountability, transparency, and ethics. It was a pretty natural fit for me to start moving in this direction. Not only was I already thinking about questions to do with people, but I've done a lot of diversity and inclusion work in my non research life. So I'm one of the co-founders of the Women in Machine Learning workshop, I also co-founded two organizations to get more women involved in free and open source software development. So issues related to fairness and stuff like that are really something that I tend to think about a lot in general. So I ended up making sort of this shift a little bit in my research focus. That's not to say that I don't still work on things to do with core computational social science, but increasingly my research is focusing on the ways that machine learning impacts society. So fairness, accountability, transparency, and ethics. Sam Charrington: [00:05:53] We will certainly dive deep into those topics. But before we do, you've mentioned a couple of times the term computational social science. That's not a term that I've heard before, I don't believe. Can you ... Is that ... I guess I'm curious how established that is as a field, or is it something that is specific to that institution that you were working at? Hanna Wallach:[00:06:19] Sure. So this is really a discipline that started emerging in maybe sort of 2009, 2008, that kind of time. By 2010, which is when I was hired at UMass, it really was sort of its own little emerging field with a bunch of different computer scientists and social scientists really committed to pushing this forward as a discipline. The basic idea, of course, is you know social scientists study society and social processes, and they've been doing this for decades. But often using qualitative methods. 
But of course, as more of society moves towards digitized interaction methods, and online platforms, and other kinds of things like that, we're beginning to see much more of this sort of digital data. At the same time, we've seen this massive increase, as I've said, in the popularity of machine learning and machine learning methods that are really suitable for analyzing data about social processes in society. So computational social science is really the sort of emerging discipline at the intersection of computer science, the social sciences, and statistics as well. The real goal is to develop and use computational and statistical methods, so machine learning methods, for example, to understand society, social processes, and answer questions that are substantively interesting to social scientists. At this point, there are people at a number of different institutions focusing on computational social science. So yes, of course, UMass, as I've mentioned before. But also Northwestern, Northeastern, and the University of Washington, which in fact has been doing this for years, and of course, Microsoft Research is no exception in this regard. Part of the reason why I joined Microsoft Research was that we have a truly exceptional group of researchers in computational social science here. That was really very appealing to me. Sam Charrington: [00:08:31] Oh, awesome, awesome. So you talked about your transition to focusing on fairness, accountability, transparency, and ethics in machine learning and AI. Can you talk a little bit about what those terms mean to you, and your broader research? Hanna Wallach:[00:08:54] Yeah, absolutely. So I think the bulk of my own research under that sort of broad umbrella falls within two categories. So the first is fairness, and the second is what I would sort of describe as interpretability of machine learning. So in that fairness bucket, really, much of my research is focused on studying the ways in which machine learning can inadvertently harm or disadvantage groups of people or individual people in various different, usually unintended, ways. I'm interested in understanding not only why this occurs, but what we can do to mitigate it, and what we can do to really develop fairer machine learning systems. So systems that don't inadvertently harm individuals or groups of people. In the intelligibility bucket, so there, I'm really interested in how we can make machine learning methods that are interpretable to humans in different roles for particular purposes. There has been a lot of research in this area over the past few years, focusing on oftentimes developing simple machine learning models that can be easily understood by humans simply by exposing their internals, and also on developing methods that can generate explanations for either entire models or the predictions of models. Those models might be potentially very complex. My own work typically focuses really more on the human side of intelligibility, so what is it that might make a system intelligible or interpretable to a human trying to carry out some particular task? I do a lot of human subjects experiments to really try and understand some of those questions with a variety of different folks here at Microsoft Research. Sam Charrington: [00:11:01] On the topic of fairness and avoiding inadvertent harm, there are a lot of examples that I think many of our audience would be familiar with, the ProPublica work into the use of machine learning systems in the justice process, and others.
Are there examples that come to mind for you that are maybe less well known, but that illustrate for you the importance of that type of work? Hanna Wallach:[00:11:36] Yes. So when I typically think about this space, I tend to think about this in terms of the types of different harms that can occur. I have some work with Aaron Shapiro, Solon Barocas, and Kate Crawford on the different types of harms that can occur. Kate Crawford actually did a fantastic job of talking about this work in her invited talk at the NeurIPS conference in 2017. But to give you some concrete examples, so many of the examples that people are most familiar with are these scenarios as you mentioned where machine learning systems are being used to allocate or withhold resources, opportunities, or information. So one example would be of the COMPAS recidivism prediction system being used to make decisions about whether people should be released on bail. Another example would be from a news story that happened in November, where Amazon revealed that it had abandoned an automated hiring tool because of fears that the tool would reinforce existing gender imbalances in the workplace. So there you're looking at these existing gender imbalances, and seeing that this tool is perhaps withholding opportunities from women in the tech industry in an undesirable way. There was a lot of coverage about this very sensible decision that Amazon made to abandon that tool. Some other examples would be more related to quality of service issues, even when no resources or opportunities are being allocated or withheld. So a great example there would be the work that Joy Buolamwini and Timnit Gebru did focusing on the ways that commercial gender classification systems might perform less well, so less accurately, for certain groups of people. Another example: you might think of, let's say, speech recognition systems. You can imagine systems that work really well for people with certain types of accents, or for people with voices at certain pitches. But less well for other people, certainly for me. I'm British, and I have a lisp. I know that oftentimes speech recognition systems don't do a great job of understanding what I'm saying. This is much less of an issue nowadays, but you know, five or so years ago, this was really frustrating for me. Some other examples are things like stereotyping. So here the most famous example of stereotyping in machine learning is Latanya Sweeney's work from 2013, where she showed that advertisements shown on web searches for different people's names would more typically be advertisements that reinforced stereotypes about black criminality when people searched for stereotypically black sounding names than when people searched for stereotypically white sounding names. So there the issue is this sort of reinforcement of these negative stereotypes within society by the placement of particular ads for particular different types of searches. So another example of stereotyping in machine learning would be the work done by Joanna Bryson and others at Princeton University on stereotypes in word embeddings. There has also been some similar work done by my colleague, Adam Kalai, here at Microsoft Research.
Both of these groups of researchers showed that if you train word embedding methods, so things like Word2Vec, that try and identify a low dimensional embedding for word types based on the surrounding words that are typically used in conjunction with them in sentences, you end up seeing that these word embeddings reinforce existing gender stereotypes. For example, so the word man ends up being embedded much closer to programmer and similarly woman ends up being embedded much closer to homemaker than vice versa. So that would be another kind of example. Then we see other kinds of examples of unfairness and harms within machine learning as well. So for example, over and under representation. So Matthew Kay and some others at the University of Washington have this really nice paper where they show that for professions with an equal or higher percentage of men than women, the image search results are much more heavily skewed towards images of men than reality. So that would be another kind of example. What you'll see from all of these examples that I've mentioned is that they affect a really wide range of systems and types of machine learning applications. The types of harms or unfairness that might occur are also pretty wide ranging as well, going from yes, sure, allocational withholding of resources, opportunities of information, but moving beyond that to stereotyping and representation and so on. Sam Charrington: [00:17:02] So often when thinking about fairness and bias in machine learning and the types of harm that can come about when unfair systems are developed, the kind of all roads lead back to the data itself, and the biases that are inherent in that data. Given that machine learning and AI is so dependent on data, and often much of the data that we have is biased, what can we do about that, and what are the kinds of things that your research is exploring to help us address these issues? Hanna Wallach:[00:17:41] Absolutely. Yeah, so you've hit on a really important point there, which is that in a lot of the sort of public discourse about fairness in machine learning, you have people making comments about algorithms being unfair, or algorithms being biased. Really, I think this misses some of the most fundamental points about why this is such a challenging landscape. So I want to just emphasize a couple of those here in response to your question. So the first thing is that machine learning is all about taking data, finding patterns in that data, and then often training systems to mimic the decisions that are represented within that data. Of course, we know that the society we live in is not fair. It is biased. There are structural disadvantages and discrimination all over the place. So it's pretty inevitable that if you take data from a society like that, and then train machine learning systems to find patterns expressed in that data, and to mimic the decisions made within that society, you will necessarily reproduce those structural disadvantages, that bias, that discrimination, and so on. So you're absolutely right that a lot of this does indeed come from data. But the other point that I want to make is that it's not just from data and it's not from algorithms per se. The issue is really, as I see it, and as my colleagues here at Microsoft Research see it, the issue is really about people and people's decisions at every point in that machine learning life cycle. 
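Earlier in that answer, Hanna mentions the finding that word embeddings trained on everyday text place "man" near "programmer" and "woman" near "homemaker". A minimal way to probe a pretrained embedding for that kind of association is sketched below; it uses gensim and a small public GloVe model as stand-ins, which are assumptions on our part rather than the specific embeddings studied in the papers she cites, and the exact numbers will vary with the training corpus.

# Probe a pretrained word embedding for gendered associations.
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")  # downloads a small public model on first use

# Compare how close "programmer" sits to "man" versus "woman"
print(vectors.similarity("man", "programmer"),
      vectors.similarity("woman", "programmer"))

# The classic analogy probe: man : programmer :: woman : ?
print(vectors.most_similar(positive=["woman", "programmer"], negative=["man"], topn=5))

Probes like this are a diagnostic, not a fix; they simply make the learned associations visible so they can be examined.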
So I've done some work on this with a number of people here at Microsoft, most recently I put together a tutorial on machine learning and fairness in collaboration with my colleague Jenn Wortman Vaughan. The way we really think about this is that you have to prioritize fairness at every stage of that machine learning lifecycle. You can't think about it as an afterthought. The reason why is that decisions that we make at every stage can fundamentally impact whether or not a system treats people fairly. So I think it's really important when we're thinking about fairness in machine learning to not just sort of make general statements about algorithms being unfair, or systems being unfair, but really to go back to those particular points and think about how unfairness can kind of creep in at any one of those stages. That might be as early as the task definition stage, so when you're sitting down to develop some machine learning system, it's really important to ask the question of who does this take power from, and who does this give power to? The answers to that question often reveal a lot about whether or not that technology should even be built in this first place. Sometimes the answer to addressing fairness in machine learning is simply, no, we should not be building that technology. But there are all kinds of other decisions and assumptions at other points in that machine learning life cycle as well. So the way we typically like to think about it is that a machine learning model, or method, is effectively an abstraction of the world. In making that abstraction, you necessarily have to make a bunch of assumptions about the world. Some of these assumptions will be more or less justified, some of these assumptions will be better fit for the reality than others. But if you're not thinking really carefully about what those assumptions are, when you are developing your machine learning system, this is one of the most obvious places that you can inadvertently end up introducing bias or unfairness. Sam Charrington: [00:21:42] Can you give us some concrete examples there? Hanna Wallach:[00:21:45] Yeah. Absolutely. One common example of this form would be stuff to do with teacher evaluation. So there have been a couple of high profile lawsuits about this kind of thing. But I think it illustrates the point nicely. So it's common for teachers to be evaluated based on a number of different factors, but including their student's test scores. Indeed, many of the methods that have been developed to analyze teacher quality using machine learning systems have really focused predominantly on student's test scores. But this assumes that student's test scores are in fact an accurate predictor of teacher quality. This isn't actually always the case. A good teacher should obviously do more than test prep. So any system that really looks just at test scores when trying to predict teacher quality is going to do a bad job of capturing these other properties. So that would be one example. Another example involves predictive policing. So a predictive policing system might make predictions about where crimes will be committed based on historic arrest data. But an implicit assumption here is that the number of arrests in an area is an accurate proxy for the amount of crime. It doesn't take into account the fact that policing practices can be racially biased, or there might be historic over policing in less affluent neighborhoods. I'll give you another example as well. 
So many machine learning methods work by defining some objective function, and then learning the parameters of the model so as to optimize that objective function. So for example, if you define an objective function in the context of, let's say, a search engine that prioritizes user clicks, you may end up with search results that don't necessarily reflect what you want them to. This is because users may click on certain types of search results over other search results, and that might not be reflective of what you want to be showing when you show users a page of search results. So as a concrete example, in many search engines, if you search for the word boy, you see a bunch of pictures of male children. But if you search for the word girl, you see a bunch of pictures of grown up women. These are pretty different to each other. This probably comes from the fact that search engines typically optimize for clicks among other metrics. This really shows how hard it can be to even address these kinds of fairness issues, because in different circumstances the word girl may be referring to a child or a woman, and users search for this term with different intentions. In this particular example, as you can probably imagine, one of these intentions might be more prevalent than the other. Sam Charrington: [00:24:57] You've identified lots of opportunities for pitfalls in the process of fielding systems, going all the way back to the way you define your system, and state your intentions, and formulate the problem that you're going after. Beyond simply being mindful of the potential for bias and unfairness, and just saying simply, I realize that that's not simple, that it's work to be mindful of this. But beyond that, what does your research offer in terms of how to overcome these kinds of issues? Hanna Wallach:[00:25:43] Yeah, this is a really good question. It's a question that I get a lot from people: what can we actually do in practice? There are a number of things that can be done in practice. Not all of them are easy things to do, as you say. So one of the most important things is that issues relating to fairness in machine learning are fundamentally socio-technical. They're not going to be addressed by computer scientists or developers alone. It's really important to involve a range of diverse stakeholders in these conversations when we're developing machine learning systems so that we have a bunch of different perspectives represented. So moving beyond just involving computer scientists and developers on teams, it's really important that we involve social scientists, lawyers, policy makers, end users, people who are going to be affected or impacted by these systems down the line, and so on and so forth. That's one really concrete thing you can do. There is a project that came out of the University of Washington called the Diverse Voices project. It provides a way of getting feedback from stakeholders on tech policy documents. It's really good, they have a great how-to guide that I definitely recommend checking out. But many of the things that they recommend doing there, you can also think about when you're trying to get feedback from stakeholders on, let's say, the definition of a machine learning system. So that task definition stage. Some of these could even potentially be expanded to consider other stages of that machine learning pipeline as well. So there are a number of things that you can do at every single stage of the machine learning pipeline.
In fact, this tutorial that I mentioned earlier, that I worked on with my colleague Jenn Wortman Vaughan, actually has guidelines for every single step of the pipeline. But to give you examples, here are some things, for instance, that you can do when you're selecting a data source. So for example, it's really important to think critically before even collecting any data. It's often very tempting to say, oh, there is already some dataset that I can probably repurpose for this. But it's really important to take that step back and, before immediately acting based on availability, to actually think about whether that data source is appropriate for the task you want to use it for. There are a number of reasons why it might not be. It could be to do with biases in the data source selection process. There might be societal biases present in the data source itself. It might be that the data source doesn't match the deployment context, that's a really important one that people really should be taking into account. Where are you thinking about deploying your machine learning system, and does the data you have available for training and development match that context? As another example, still related to data, it's really important to think about biases in the technology used to collect data. So as an example here, there was an app released in the city of Boston back in 2011, I think it was called Street Bump. The way it worked is it used iPhone data, and specifically the sort of positional movement of iPhones as people were driving around, to gather data on where there were potholes that should be repaired by the city. But pretty quickly, the city of Boston figured out that this actually wasn't a great way to get that kind of data, because back in 2011, the people who had iPhones were typically quite affluent and only lived in certain neighborhoods. So that would be an example about thinking carefully about the technology even used to collect data. It's also really important to make sure that there is sufficient representation of different subpopulations who might be ultimately using or affected by your machine learning system, so that you really do have good representation overall. Moving onto things like the model, there are a number of different things that you can do there, for instance, as well. So in the case of a model, I mentioned a bit about assumptions being really important. It's great to really clearly define all of your assumptions about the model, and then to question whether there might be any explicit or implicit biases present in those assumptions. That's a really important thing to do when you're thinking about choosing any particular model or model structure. You could even, in some scenarios, include some quantitative notion of parity, for instance, in your model objective function as well. There have been a number of academic papers that take that approach in the literature over the past few years. Sam Charrington: [00:30:43] Can you give an example of that last point? Hanna Wallach:[00:30:46] Yeah, sure. So imagine you have some kind of a machine learning classifier that's going to make decisions of the form, let's say loan, no loan, hire, no hire, bail, no bail, and so on. The way we normally develop these classifiers is to take a bunch of labeled data, so data points labeled with, let's say, loan, no loan, and then we train a model, a machine learning model, a classifier, to optimize accuracy on that training data.
So you end up setting the parameters of that model such that it does a good job of accurately predicting those labels from the training data. So the objective function that's typically used is one that considers, usually, only accuracy. But something else you can do is define some quantitative definition of fairness, some quantitative fairness metric, and then try to simultaneously optimize both of these objectives. So classifier accuracy and whatever your chosen fairness metric is. There are a number of these different quantitative metrics that have been proposed out there, and they all typically are looking at parity across groups of some sort. So I think it's really important to remember that even though these are often referred to as fairness metrics, they're really parity metrics. They neglect many of the really important other aspects of fairness, like justice, and due process, and so on and so forth. But, it is absolutely possible to take these parity metrics and to incorporate them into the objective function of, say, a classifier, and then to try and prioritize satisfying and optimizing that fairness metric at the same time as optimizing classifier accuracy. There have been a number of papers that focus on this kind of approach; many of them focus on one particular type of classifier, so like SVMs, or neural networks, or something like that, and one particular fairness metric. There are a bunch of standard fairness metrics that people like to look at. I actually have some work with some colleagues here at Microsoft where we have a slightly more general way of doing this that will work with many different types of classifiers, and many different types of fairness metrics. So there is no reason to start again from scratch if you want to switch to a different classifier or a different fairness metric. We actually have some open source Python code available on GitHub that implements our approach. Sam Charrington: [00:33:27] So you've talked about the idea that people are fundamentally the root of the issue, that these are societal issues, that they're not going to be solved by technological advancements or processes alone. At the same time, there has been a ton of new research happening in this area by folks in your group and elsewhere. Does that lead to a mismatch between what's happening in academia and on the technical side with the way this stuff actually gets put into practice? Hanna Wallach:[00:34:11] That's an awesome question. The simple answer is yes. This actually relates to one of my most recent research projects, which I'm really, really excited about. So last summer, some of my colleagues and I, specifically Jenn Wortman Vaughan, Miro Dudík, and Hal Daumé, along with our incredible intern, Ken Holstein from CMU, conducted the first systematic investigation of industry practitioners' challenges and needs for support relating to developing fairer machine learning systems. This work actually came about because we were thinking about ways of developing interfaces for that fair classification work that I mentioned a minute ago. Through a number of conversations with people in different product groups here at Microsoft and people at other companies, we realized that these kinds of classification tasks, while they're incredibly well studied within the fairness and machine learning literature, are maybe less common than we had thought in practice within industry.
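Stepping back to the fair classification work Hanna describes just above, the general pattern of training a standard classifier while also optimizing a quantitative parity metric can be sketched roughly as follows. The open-source Fairlearn package is used here as one concrete example; the transcript does not name the GitHub project, so treat this as an illustration of the approach rather than the exact code she refers to, and note that the data, group labels, and threshold are synthetic and purely illustrative.

import numpy as np
from sklearn.linear_model import LogisticRegression
from fairlearn.reductions import ExponentiatedGradient, DemographicParity
from fairlearn.metrics import demographic_parity_difference

# Synthetic data with a hypothetical binary sensitive attribute.
rng = np.random.default_rng(0)
n = 2000
group = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 3)) + group[:, None]          # features correlated with the group
y = (X[:, 0] + 0.5 * group + rng.normal(scale=0.5, size=n) > 0.7).astype(int)

# Unconstrained baseline: optimizes accuracy only.
baseline = LogisticRegression().fit(X, y)
print("baseline parity gap:",
      demographic_parity_difference(y, baseline.predict(X), sensitive_features=group))

# Constrained training: accuracy subject to a demographic parity constraint.
mitigator = ExponentiatedGradient(LogisticRegression(), constraints=DemographicParity())
mitigator.fit(X, y, sensitive_features=group)
print("mitigated parity gap:",
      demographic_parity_difference(y, mitigator.predict(X), sensitive_features=group))

As Hanna notes, demographic parity is a parity metric rather than a full account of fairness, so a smaller gap here says nothing about justice or due process in the surrounding decision.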
So that got us thinking about whether there might be, actually, a mismatch between the academic literature on fairness and machine learning, and practitioners' actual needs. What we ended up doing was this super interesting research project that was a pretty different style of research for me and for my colleagues. So I am a machine learning researcher, so is Jenn, so is Hal, and so is Miro. Ken, our intern, is an HCI researcher. What we ended up doing was this qualitative HCI work to really understand what it is that practitioners are facing in reality when they try and develop fairer machine learning systems. To do this, we conducted semi-structured interviews with 35 people, spanning 25 different teams, in 10 different companies. These people were in a number of different roles, ranging from social scientist, data labeler, product manager, program manager, to data scientist and researcher. Where possible, we tried to interview multiple people from the same team in order to get a variety of perspectives on that team's challenges and needs for support. We then took our findings from these interviews, and developed a survey which was then completed by another 267 industry practitioners, again, in a variety of different companies and a variety of different roles. What we found, at a high level, was that yes, there is a mismatch between the academic literature on fairness in machine learning and industry practitioners' actual challenges and needs for support on the ground. So firstly, much of the machine learning literature on fairness focuses on classification, and on supervised machine learning methods. In fact, what we found is that industry practitioners are grappling with fairness issues in a much wider range of applications beyond classification or prediction scenarios. In fact, many times the systems they're dealing with involve these really rich, complex interactions between users and the system. So for example, chat bots, or adaptive tutoring, or personalized retail, and so on and so forth. So as a result, they often struggle to use existing fairness research from the literature, because the things that they're facing are much less amenable to these quantitative fairness metrics. Indeed, very few teams have fairness KPIs or automated tests that they can use within their domain. One of the other things that we found is that the machine learning literature typically assumes access to sensitive attributes like race or gender, for the purpose of auditing systems for fairness. But in practice, many teams have no access to these kinds of attributes, and certainly not at the level of individuals. So they expressed needs for support in detecting biases and unfairness with access only to coarse-grained, partial, or indirect information. This is something that we've seen much less focus on in the academic literature. Sam Charrington: [00:38:41] That last point is an interesting one, and one that I've brought up on the podcast previously. In many of the places you might want to use an approach like that, it's forbidden, from a regulatory perspective, to use the information that you want to use in your classifier to achieve fairness in any part of the decisioning process. Hanna Wallach:[00:39:04] Exactly. This sets up this really difficult tension between doing the right thing in practice from a machine learning perspective, and what is legally allowed.
I'm actually working on a paper at the moment with a lawyer, Zack Conard, actually, a law student, Zack Conard, at Stanford University, on exactly this issue. This challenge between what you want to do from a machine learning perspective, and what you are required to do from a legal perspective, based on humans and how humans behave, and hundreds of years of law in that realm. It's really challenging, and there is this complicated trade off there that we really need to be thinking about. Sam Charrington: [00:39:48] It does make me wonder if techniques like or analogous to a differential privacy or something like that could be used to provide a regulatorily acceptable way to access protected attributes, so that they can be incorporated into algorithms like this. Hanna Wallach:[00:40:07] Yeah, so there was some work on exactly this kind of topic at the FAT ML Workshop colocated with ICML last year. This work was proposing the use of encryption and such like in order to collect and make available such information, but in a way that users would feel as if their privacy was being respected, and so that people who wanted to use that information would be able to use it for purposes such as auditing. I think that's a really promising approach, although there is obviously a bunch of non trivial challenges involved in thinking about how you might make that a reality. It's a really complicated landscape. But definitely one that's worth thinking about. Sam Charrington: [00:40:54] Was there a third area that you were about to mention? Hanna Wallach:[00:40:58] Yeah, so one of the main themes that we found in our work studying industry practitioners is a real mismatch between the focus on different points in the machine learning life cycle. So the machine learning literature typically assumes no agency over data collection. This makes sense, right? If you're a machine learning academic, you typically work with standard data sets that have been collected and made available for years. You don't typically think about having agency over that data collection process. But of course, in industry, that's exactly where practitioners often do have the most control. They are in charge of that data collection or data curation process, and in contrast, they often have much less control over the methods or models themselves, which often are embedded within much bigger systems. So it's much harder to intervene from a perspective of fairness with the models than it is with the data. We found that really interesting, this sort of difference in emphasis between models versus data in these different groups of people. Of course, many practitioners voiced needs for support in figuring out how to leverage that sort of agency over data collection to create fairer data sets for use in developing their systems. Sam Charrington: [00:42:20] So you mentioned the FAT ML workshop. I'm wondering as we come to a close, if there are any resources, events, pointers, I'm sure there are tons of things that you'd love to point people at. But what are your top three or four things that you would suggest people take a look at as they're trying to wrap their heads around this area, and how to either have an impact as a researcher, or how to make good use of it as a practitioner? Hanna Wallach:[00:42:55] Yeah. Absolutely. So there are a number of different places with resources to learn more about this kind of stuff. 
So first, I've mentioned a couple of times this tutorial that I put together with Jenn Wortman Vaughan, that will be available publicly online very soon. It is in fact being broadcast next week, so it should be up by the time this podcast goes live. So I would definitely recommend that people check that out to really get a sense of how we, at Microsoft, are thinking about fairness in machine learning. Then moving beyond that, and thinking specifically on more of the academic literature, the FAT ML workshop maintains a list of resources on the workshop website. That's, again, another really, really great place to look for things to read about this topic. The FAT* conference is a relatively newly created conference on fairness, accountability, and transparency, not just in machine learning, but across all of computer science and computational systems. Again, there, I recommend checking out the website to see the publications that were there last year, and also the publications that will be there this year. There are a number of really interesting papers that I haven't read yet, but I'm super excited to read, being presented at this year's conference. That conference also has tutorials on a range of different subjects. So it's also worth looking at the various different tutorials there. So at last year's conference, Arvind Narayanan presented this amazing tutorial on quantitative fairness metrics, and why they're not a one-size-fits-all solution, why there are trade-offs between them, why you can't just sort of take one of these definitions, optimize for it, and call it quits. So I definitely recommend checking that out. Some other places that are worth looking for resources on this: the AI Now Institute, which was co-founded by Kate Crawford, who is also here at Microsoft Research, and Meredith Whittaker, who is at Google, also has some incredibly awesome resources. They've put out a number of white papers and reports over the past couple of years that really get at the crux of why these are complicated socio-technical issues. So I strongly recommend reading pretty much everything that they put out. I would also recommend checking out some of the material put out by Data & Society, which is also an organization here in New York, led by danah boyd, and they too have a number of really interesting things that you can read about these different topics. Then the final thing I want to emphasize is the Partnership on AI, which was formed a couple of years ago by Microsoft and a bunch of other companies working in this space of AI to really foster cross-company collaboration and moving forward in this space when thinking about these complicated societal issues that relate to AI and machine learning. So the partnership has been really ramping up over the past couple of years, and they also have some good resources that are worth checking out. Sam Charrington: [00:46:22] Oh, that's great. That is a great list that will keep us busy for a while. Hanna, thank you so much for taking the time to chat with us. It was really a great conversation, and I appreciate it. Hanna Wallach: [00:46:34] No problem. Thank you for having me. This has been really great. Sam Charrington: [00:46:38] Awesome, thank you.
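Circling back to the earlier exchange about auditing for fairness when direct access to protected attributes is restricted: one family of techniques alluded to there collects those attributes under privacy guarantees. Below is a generic randomized-response sketch, a basic local-differential-privacy primitive, offered purely as an illustration of the idea rather than the specific encryption-based method from the workshop paper Hanna mentions; the function names and parameters are invented for the example.

```python
# Illustrative randomized response: each user reports their true binary
# attribute with probability p and a coin flip otherwise, giving plausible
# deniability while still allowing aggregate, group-level auditing.
import random

def randomized_response(true_value: bool, p: float = 0.75) -> bool:
    """Report the true value with probability p, otherwise a fair coin flip."""
    if random.random() < p:
        return true_value
    return random.random() < 0.5

def estimate_rate(reports: list[bool], p: float = 0.75) -> float:
    """Debias the observed rate to estimate the true population rate."""
    observed = sum(reports) / len(reports)
    # observed = p * true_rate + (1 - p) * 0.5, so solve for true_rate
    return (observed - (1 - p) * 0.5) / p

# Example: estimate what fraction of users belong to a protected group
# without learning any individual's attribute with certainty.
truth = [random.random() < 0.3 for _ in range(100_000)]
reports = [randomized_response(t) for t in truth]
print(estimate_rate(reports))  # should land close to 0.3
```

Each individual's report is deniable, yet group-level rates, which are exactly the quantities parity metrics need, can still be estimated in aggregate.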
Sam Charrington: Today we're excited to continue the AI for the Benefit of Society series that we've partnered with Microsoft to bring you. In this episode. We're joined by Peter Lee, Corporate Vice President at Microsoft Research responsible for the company's healthcare initiatives. Peter and I met a few months ago at the Microsoft ignite conference where he gave me some really interesting takes on AI development in China. We reference those in the conversation and you can find more on that topic in the show notes. This conversation centers on three impact areas that Peter sees for AI and healthcare, namely diagnostics and therapeutics, tools and the future of precision medicine. We dig into some examples in each area and Peter details the realities of applying machine learning and some of the impediments to rapid scale. Before diving in I'd like to thank Microsoft for their support of the show and their sponsorship of this series. Microsoft is committed to ensuring the responsible development and use of AI and is empowering people around the world with this intelligent technology to help solve previously intractable societal challenges spanning sustainability, accessibility and humanitarian action. Learn more about their plan at Microsoft.ai. Enjoy. Sam Charrington: [00:02:18] All right, everyone. I am on the line with Peter Lee. Peter is a corporate vice president at Microsoft responsible for the company's healthcare initiatives. Peter, it is so great to speak with you again. Welcome to This Week in Machine Learning and AI. Peter Lee: [00:00:14] Sam, it's great to be here. Sam Charrington: [00:00:17] Peter, you gave a really interesting presentation to a group that I was at at Ignite about what some of Microsoft was working on, at Microsoft Research as well as a really interesting take on AI development in China. That kind of piqued my interest, and we ended up sitting down to chat about that in a little bit more detail. While I did cover that for my blog and newsletter, and I'll be linking to it in the show notes, we won't be diving into that today. It was a really, really interesting take that I reflect on often, and I think it's an interesting setup for diving into your background, because you do have a very interesting background and an interesting perspective and set of responsibilities at Microsoft. On that note, can you share with our audience a little bit about your background? Peter Lee: [00:01:11] Sure, Sam. I'd love to do that. I agree it is a little bit unusual, although I think the common thread throughout has been about research and trying to bring research into the real world. I'm a computer scientist by training. I was a professor of computer science at Carnegie Mellon for a long time, actually for 24 years, and at the end of my time there was the head of the Computer Science Department. Then I went to Washington, D.C, to serve at an agency called DARPA, which is the Defense Advanced Research Projects Agency. That's kind of the storied research agency that built the Saturn V booster technology, invented the ARPANET, which became the Internet, developed robotics, lots and lots of other things. I learned a lot about bringing research to life there. Then, after a couple of years there, I was recruited to Microsoft and joined Microsoft Research. Started managing the mothership lab in Redmond, in the headquarters in Redmond, and then a little bit later all of the U.S. research labs and then ultimately, all of Microsoft's 13 labs around the world. 
Right about that time, Steve Ballmer announced his retirement. Satya Nadella took over as the CEO. Harry Shum took over all of AI and research at Microsoft and became my boss. They asked me to start a new type of research organization internally. It's called NExT, which stands for New Experiences in Technologies, and we've been trying to grow and incubate new research-powered businesses ever since, and most recently in healthcare. Sam Charrington: [00:03:04] I think when I think about AI and healthcare, there's certainly a ton of ground to cover there, but I think one of the areas that gets a lot of attention of late is all the progress that's being made around applying neural nets, CNNs in particular, to imagery. I'm wondering from your perspective, how do you tend to think about AI applied to the healthcare space and where the big opportunities are? Peter Lee: [00:03:37] Yeah. When I think about AI and healthcare, I'm really optimistic about the future. Not that there aren't huge, difficult problems and sometimes things always seem to go slower than you expect. It's a little bit like watching grass grow. It does grow and things do happen, but sometimes it's hard to see it. But over the last 15 years, the thing that I think is underappreciated is the entire healthcare industry has gone digital. It was only 15 years ago that, for example, in the United States, less that 10% of physicians were recording your health history in a digital electronic health record. Now, we're up over 95%, and that's just an amazing transformation over 15 years. It's not like we don't still have problems, data is siloed, it's not in standard formats. There's all sorts of problems, but the fact that it's gone digital just opens up huge, huge amounts of potential. I kind of look at the potential for AI in three areas. One is the thing that you pointed out, which are AI technologies that actually lead to better diagnostics and therapeutics, things that actually advance medical science and medical technology. A second area for AI is in the area of tools, tools that actually make doctors better at what they do, make them happier while they're doing it, and also improve the experience for you and me as patients or consumers of healthcare. Then the third area is in this wonderful future of precision medicine that's taking new sources of information, digital information, your genome, your proteome, your immunome, data from your fitness wearables and so on and integrating all of that together to give you a complete picture of what's going on with your body. Those are sort of three broad areas, and they're all incredibly exciting right now. Sam Charrington: [00:05:51] When you think about the first two of those categories, better diagnostics and therapeutics and tools, how do you distinguish them? It strikes me that giving doctors a better way to analyze medical imagery, for example, or to use that example again, is a tool that they can use, but when you say tools, what do you specifically mean? Peter Lee: [00:06:14] Yeah. You're absolutely right. There's an overlap. It's not like the boundaries between these things are all that hardened, but if you think about one problem that doctors have today is by some estimates in the United States, doctors are spending 40 to 50% of their workdays entering documentation, entering notes that record what happened in their encounters with patients. That's sometimes called an encounter note. That documentation is actually required now by various rules and regulations. 
It's an incredible source of burden. In fact, I'm guessing you've had this experience, most people have. You go to your doctor, I go to mine, and I like her very much, but while I'm being examined by her, she's not looking at me. She's actually sitting at a PC, typing in the encounter notes. The reason she's doing that is if she doesn't do it while she's examining me, she'll have to do it for a couple of hours maybe in the evening, taking time away from her own family. That burden is credited or blamed for a rise in physician burnout. Well, AI technologies today are rapidly approaching the point where ambient intelligence can just observe and listen to a doctor-patient encounter and automate the vast majority of the burden of that required clinical note-taking. That's an example of the kind of technology that could in a really material way just improve the lives and the workday satisfaction of doctors and nurses. I put that in a different category than technologies that actually give you more precise diagnosis of what's ailing you or ability to target therapies that might actually attack the very specific genetic makeup, let's say, of a cancer that's inhabiting your body right now. Sam Charrington: [00:08:17] Got it. Got it. Maybe let's take each of these categories in turn. I'd love to get a perspective from you on where you see the important developments coming from, from a research perspective, and where you see the opportunities and where you see things heading in each. Peter Lee: [00:08:42] Sure. Well, why don't we start with your example of imaging, because computer vision based on deep neural nets has just been progressing at this stunning rate. It seems like every week you see another company, another startup, or another university research group showing off their latest advances in using deep neural net-based computer vision technologies to do various kinds of medical image diagnosis or segmentation. Here at Microsoft, we've been working pretty hard on those as well. We have this wonderful program based primarily in India that's been trained on the health records and eye images of over 200,000 patients. That idea of taking all that data, you get the signal of which of those patients have, let's say, suffered from, say, diabetic retinopathy or a progression of refractive error leading to blindness. From that signal in the electronic health record, coupled with the images, we are able to train a computer vision-based thing to make a prediction about whether a child whose eye image has been taken is in danger of losing eyesight. That is in deployment right now in India, and, of course, for other parts of the world like the United States and Europe, which are more regulated, these things are in various states of clinical validation so they can be more broadly deployed. Another example is a project that we have called InnerEye that is trying to just reduce the incredible, kind of boring and mundane problem of just pixel-by-pixel outlining the parts of your body that are tumor and should be attacked with the radiation beam as opposed to healthy tissue. That problem with radiation therapy planning has to be done really perfectly, which is why it's this sort of pixel-by-pixel process. But there is maybe five or 15 minutes of real black magic that's drawing on all of the intuition and experience and wisdom of a radiologist and then two to three hours of complete drudgery, and much of that complete drudgery can just be eliminated with modern computer vision technologies. 
These things are really developing so rapidly and coming online. They tend not to replace completely what doctors and radiologists can do, because there is always some judgment and intuition involved in these things, but when done right, they can integrate into the workflow to really enable, to kind of liberate clinicians from a lot of drudgery and to reduce mistakes. I think one other thing that's sometimes not fully appreciated is you also, when you get these tools, you can take these measurements over and over and over again. When they become cheap, you can take them every day, if necessary, which allows you to track progression of a disease or its treatment over time much more precisely. These sorts of applications, I think, in medical imaging, I think are really promising. One thing I ... it's a hobby horse of mine ... before I pause, is in 2015 here in Microsoft Research we invented something called deep residual networks, which are now commonly called ResNets. ResNet has become part of an industry standard and research standard in computer vision using deep neural nets. We ourselves have refrained from using ResNets for doing things like imaging of 3D images for the purposes of radiation therapy planning, and there are various technical reasons for that. Sometimes we have a mixture of being proud seeing the rest of the world use our invention for interesting medical imaging, but we also sometimes get worried that people don't quite understand the failure modes in these things. But, still, the progress has just been spectacular. Sam Charrington: [00:13:14] That's kind of an interesting prompt. Maybe let's take a moment to explore the failure modes, and why don't you ... It sounds like you don't advise folks to apply ResNets to the types of images that we tend to see in medical imaging. What's that about? Peter Lee: [00:13:32] Yeah. It's not advising or warning people against it. If you think about, let's say, take the problem of radiation therapy planning, it's a 3D problem. You have a tumor that is a 3D mass in your body and you're trying to come up with the plan for that radiation beam to attack ideally as much of that tumor while preserving as much healthy tissue as possible. Of course, your picture into that 3D tumor is as a series of two-dimensional slices, at least with current medical imaging. One very basic question is, as you examine slice-by-slice that tumor with respect to the healthy tissue, is each slice being properly and logically registered with the next one? A simple or naïve application of a convolutional neural network, like a ResNet, doesn't automatically do that. The other problem is it's unclear to what extent a bad training sample or set of training samples will do to one of these deep neural nets. In fact, just in the last few weeks and months, there have been more and more interesting academic research studies showing some interesting failure modes from a surprisingly small number of bad training samples. I think that these things are changing all the time. Our algorithms and our algorithmic understanding are improving all the time, but at least within our research groups, we've taken pains to understand that this application of computer vision isn't like others. It's more in the realm of, say, driverless cars where safety is of paramount concern, and we just have to have absolute certainty that we understand the possible failure modes of these things. 
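To make the slice-versus-volume point concrete, here is a toy contrast between per-slice 2D convolution and volumetric 3D convolution. It is a PyTorch sketch with invented tensor shapes, not InnerEye's actual architecture (which, as discussed just below, is built on decision forests rather than ResNets).

```python
# Toy contrast between per-slice 2D processing and volumetric 3D processing.
# Assumes PyTorch; shapes are invented for illustration only.
import torch
import torch.nn as nn

scan = torch.randn(1, 1, 64, 128, 128)  # (batch, channel, slices, height, width)

# Naive 2D approach: process each slice independently, with no notion of how
# slice i relates to slice i+1 (the "registration" question raised above).
conv2d = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, padding=1)
per_slice = torch.stack(
    [conv2d(scan[:, :, i]) for i in range(scan.shape[2])], dim=2
)

# Volumetric approach: the 3D kernel spans adjacent slices, so the model can
# reason about the tumor as a voxel volume rather than unrelated 2D images.
conv3d = nn.Conv3d(in_channels=1, out_channels=8, kernel_size=3, padding=1)
volumetric = conv3d(scan)

print(per_slice.shape, volumetric.shape)  # both (1, 8, 64, 128, 128)
```

The 2D path treats each slice as an unrelated image, which is exactly the concern about logical registration across slices; the 3D path lets a model infer the tumor as a connected voxel volume.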
Sometimes with just an off-the-shelf application of ResNets or any similar deep neural net algorithm, we and now more and more other researchers at universities are finding that we don't yet fully understand the failure modes. Sam Charrington: [00:16:02] In some ways, there's an opportunity beyond kind of naïve application of an algorithm that performs very well on ImageNet. Today, you can get data sets that include kind of these 2D representations of what are fundamentally 3D applications or 3D images and apply the regular 2D algorithms to them and find interesting things. But you're saying that a) we can do better and b) we may not even be doing the right things in many cases because of these safety issues. I'm wondering, on the first of those two points, the doing better, is there either a standard approach that's better than ResNet for these 3D images that you've developed at Microsoft or have seen otherwise? Or where are we in terms of taking advantage of the 3D nature of medical images and deep learning? Peter Lee: [00:17:06] Yeah. That's a good question. For our InnerEye project, which is really run by a great set of researchers based mostly in our Cambridge, U.K. research lab and led by Antonio Criminisi. He's really one of the preeminent authorities in computer vision. In fact, he led an effort some years ago to work out the 3D computer vision for Kinect, and so he's really specialized in 3D. The InnerEye project, which is really for us an effort to really understand completely the workflow of radiation therapy planning, that system actually doesn't use residual network. What it does is it uses kind of an architecture of layered what are called decision forests. That gives not only some benefits in terms of more compact representations of machine-learned models and, therefore, some performance improvements, but it allows us to kind of capture a kind of logical registration of the images as they go slice-by-slice. In other words, you're inferring not just the segmentation of each 2D image slice, but you're actually trying to infer the voxel, the 3D voxel volume of the tumor that you're trying to attack. Then on top of that, there's a process involved when you're dealing with medical technologies. You don't just put it out there and start applying it on people. You get it peer-reviewed. You get it peer-reviewed, in this case, in computer science journals and in medical journals, and you go through a clinical validation, and if you're in the United States, for example, through an FDA approval process. For us, as we're learning about what does our cloud, what do our AI services, what do our tools have to be in order to support this future of AI-powered healthcare, InnerEye is an example of us going end-to-end to try to build it all out and to understand all those components and to understand what has to be done to really do it right. It's been a great learning experience. We're now in the process not only of working with various companies who might want to integrate this InnerEye technology into their medical devices, but we're starting to now pull apart the kind of bricks and mortar that we used in the technical architecture for InnerEye in order to expose those as APIs for other developers to use. Our intent is not to get into the radiation therapy business. Our intent is not to get into radiology. 
But we do want our cloud and our AI services and our algorithms to be a great place for any other company or any other startup or innovator who wants to do that and ideally do it on our cloud, using our tools. Sam Charrington: [00:20:29] An interesting point in there. You mention that the decision forests that you developed to address this problem ... I guess we often think of there being this tradeoff between factors like explainability or safety, as you related that second point, and performance, which we think of as the neural net is delivering kind of the ultimate in performance in many cases. But in this case, this decision forest algorithm is outperforming at least your classic 2D ResNets, and I'm imagining also providing benefits in terms of explainability/safety. Is that correct? Peter Lee: [00:21:21] Well, we feel very strongly that it provides benefits in terms of safety. Explainability is really another very interesting question and problem. There's a potential for greater explainability. One of the lessons that we learned when we were working on AI for sales intelligence ... We had really developed tremendous amount of AI that would ingest large amounts of data from the world as well as from customer relationship management databases, emails and so on for our sales teams and used that through various AI algorithms to do things like synthesize new offers to specific customers or to surface new prospective customers or to suggest new discount pricing for specific customers. One of the things we learned is that no self-respecting sales executive is going to offer a 20% discount to a customer just because his algorithm says so. Typically- Sam Charrington: [00:22:35] Doctors are probably similar? Peter Lee: [00:22:37] That's right. In that situation, we also moved away from, in that specific case, moved away from the pure deep neural net architecture to having a kind of layered architecture of Bayesian graphical models. The reason for that was so that we could synthesize an explanation in plain English of not only offer a 20% discount, but why. As we get into, away from more point solutions that are kind of machine learning or AI-powered to more of that digital assistant that is the companion to a clinician and gives that clinician a second opinion or advice on a first opinion, those sorts of explanations undoubtedly are going to become important, especially at the beginning when we're trying to establish trust in these things. As we've been experimenting even with the kind of ambient intelligence to just listen in on a doctor-patient encounter and try to automate a note, one thing we've found is that doctors will look at the synthesized note and not trust everything in it because they don't quite yet have the understanding of why did the note come out this way. It became important to provide tools so that when you, say, click on a specific entry in the note, that it could be mapped back to a running transcript and to the right spot in the running transcript that was recorded. These sorts of things I think are part of maybe the human-computer interaction or the human-AI interaction that we're having to think about pretty hard as we try to integrate these things into clinical workflow. Sam Charrington: [00:24:30] Before we move on beyond diagnostics and therapeutics, all of the examples that you gave fell into the domain of computer vision. Are there interesting things happening in diagnostics beyond the kind of onslaught of these new computer vision-based approaches? 
Peter Lee: [00:24:51] Yeah. I think actually some of the most interesting things are not in computer vision, and this maybe crosses over into the precision medicine thing. One of the projects I'm so excited about is something that we're doing jointly with a Seattle biotech startup, Adaptive Biotechnologies. The setup is this: If you take a small blood sample from your body, in that sample, in that one-mL sample, you'll end up capturing on the order of one million T cells. The T cells are one of the primary agents in your adaptive immune system. About two and a half years ago, there was a major scientific breakthrough that got published that showed that the receptor ... There's a receptor on the surface of your T cells, and in that receptor, there's a small snippet of DNA. There was strong evidence two and a half years ago that that snippet of DNA completely determines what pathogen or infectious disease agent or cancer that T cell has been programmed to seek out and destroy. That paper was very interesting because it used a simple linear regression in order to identify from a read of that little snippet of DNA on the T cell receptor whether you had CMV, cytomegalovirus, or not. It was really just an impressive paper and just very recent. Well, the thing that was interesting about Adaptive Biotechnologies is Adaptive Biotechnologies was in the business of giving you a printout of that specific snippet of DNA in all the T cell receptors in a blood sample. They had a business model that would help some cancer centers titrate the amount of specific chemotherapy you were getting based on a reading of that DNA. That raised the question, would it be possible to take that printout of those T cell receptor DNA sequences and, in essence, think of that as a language and translate it into the language of antigens? Then, if you can do that, can you take those antigens and do a kind of topic identification problem to figure out what infectious diseases, what cancers, and what autoimmune disorders your body is currently coping with right now? It turned into this very interesting new business opportunity for Adaptive Biotechnologies that if machine learning could be used to solve those two problems, then they would have a technology that would be very similar to a universal diagnostic, a simple blood test powered by machine learning that could do early diagnosis of any infectious disease, any cancer, and any autoimmune disorder. Microsoft found that interesting enough that we actually took an investment position in Adaptive Biotechnologies and agreed to work with them on the machine learning. And Adaptive, for their part, agreed to build a bigger production pipeline in order to generate training data to power that machine learning that we're developing at Microsoft. What has transpired since then has been an amazing amount of progress where we've added tremendous amount of sophistication actually using deep neural nets and started to feed it with billions of points of training data. In fact, this year, the production facility at Adaptive will be able to generate up to a trillion points of training data. We're now targeting five specific diseases, ovarian cancer, pancreatic cancer, type I diabetes, celiac disease, and Lyme disease. That's two cancers, two autoimmune disorders, and one infectious disease with the same machine learning pipeline. 
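As a very rough illustration of treating T cell receptor sequences as a "language" and learning a disease signal from them, the sketch below turns CDR3-like strings into k-mer counts and fits a linear classifier, in the spirit of the simple-regression CMV result Peter describes. The sequences, labels, and helper names are all invented; this is not Adaptive Biotechnologies' or Microsoft's actual pipeline, which Peter notes now uses deep neural networks at far larger scale.

```python
# Toy sketch: k-mer featurization of T cell receptor sequences plus a linear
# model predicting a disease label for a whole repertoire (one blood sample).
# Purely illustrative; sequences and labels here are synthetic.
from collections import Counter
from itertools import product
from sklearn.linear_model import LogisticRegression

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
KMERS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 2-mers

def repertoire_features(sequences: list[str]) -> list[float]:
    """Normalized 2-mer counts pooled over all receptor sequences in a sample."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values()) or 1
    return [counts[k] / total for k in KMERS]

# Each sample is a list of receptor sequences; labels might be CMV+ vs CMV-.
samples = [["CASSLGTDTQYF", "CASSQDRGYEQYF"], ["CASSPGQGAYEQYF", "CASRTGESNQPQHF"]]
labels = [1, 0]

X = [repertoire_features(s) for s in samples]
model = LogisticRegression(max_iter=1000).fit(X, labels)
print(model.predict(X))
```

The real problem is vastly harder, which is exactly Peter's point about the size of the receptor-by-antigen table and the amount of training data required.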
It's still an experiment, but it kind of shows you the potential power of these advances in immunology, in genomics, and AI all being bound together to give the possibility. We know the science now is valid, and if we can now build the technology that ties those things together, we get the potential for a universal diagnostic, but as close a thing that we could imagine getting to the Star Trek tricorder as anything. Sam Charrington: [00:29:31] Mm-hmm (affirmative). That was the thing that popped immediately to mind for me, the tricorder. That example, I think, captures for me really plainly both the promise of applying machine learning and AI to this healthcare domain, but also maybe a little bit of the frustration in thinking through, okay, collecting a trillion samples and you've got this pipeline, why does it take so long? There's certainly regulatory and political types of reasons that maybe we'll get into. I'm wondering if you can elaborate on with that much training data and kind of the science in place and a pipeline in place, what are the realities of applying machine learning in this type of context that impede kind of rapid scale? Why just five diseases and not 25, for example? Peter Lee: [00:30:43] Yeah. That's such a great question. Yeah, human biology is just so complicated. I will say there are three ways, maybe, to take a cut at that. If we took a look at the very basic science, just consider the human genome, something that geneticists at several universities have taught me which was really eye-opening, is if you look at the human genome and then look at all the possible variants, the number of variants in the human genome that would still be considered homo sapiens is just astronomically large. Yet, the total number of people on the planet relative to that number is really tiny, only, what, seven and a half billion people. In fact, if we had somehow DNA samples from every human that has ever existed, I think most estimates say there are fewer than 106 billion people that have ever existed since Adam and Eve. If we are using modern machine learning, which is basically looking at statistical patterns and correlations, we have an immediate problem for a lot of basic problems in genomics, because we basically don't have a source of enough training data. The complexity of human beings, the complexity of cancer, the genetic complexity of disease, is just vastly larger than the number of people that have ever existed. Sam Charrington: [00:32:21] Meaning relative to the possible combinations of genes- Peter Lee: [00:32:28] That's right. Sam Charrington: [00:32:28] ... every human is ... I guess it shouldn't be surprising that every human is unique, but even given ... It's a little counterintuitive. You'd think there's only these four letters that were thrown together to figure all this stuff out. Right? Peter Lee: [00:32:43] Yes. What that means is that, yes, we will and we have been making ... We, meaning the scientific community and the technology community, have been making stunning advances and making really meaningful improvements for neonatal intensive care, for cancer treatments, for immunology, but fundamentally, scientifically, we still need something beyond just machine learning. We really need something that gets into the basic biology. That's kind of one reason why this is hard. Another reason is these are just big problems. 
In the project with Adaptive Biotechnologies, there are between 10 to the 15th and 10 to the 16th different T cell receptors that your body can produce and on the order of maybe 10 to the 7th known antigens. Imagine that what we're trying to do is to fill out a gigantic Excel spreadsheet with 10 to the 16th columns and 10 to the 7th rows. That's just a heck of a big table, and so you end up needing a large amount of training data to discern enough structure, find enough patterns, in order to have a shot at filling in at least useful parts of that table. The good news is everybody has T cells, and so we can take blood samples from anybody, from just ordinary, healthy people, and then we can go to research laboratories around the world that have stored libraries of antigens and start correlating those stored libraries of antigens against what are called naïve blood samples. That's exactly what Adaptive Biotechnologies is doing in order to generate the very large amount of training data. It's a little bit of a good news situation there that we don't need to find thousands or millions of sick people. We can generate the data from just ordinary samples. But it's still a very large amount of data that we need. Then the third kind of way that I think about this is it gets back to the safety issue. We do things a certain way because ultimately, medicine and medical science is based on causal relationships. In other words, we want to know that A causes B, but what we typically get out of machine learning is just that A is correlated with B. We get those inferences, and then it takes more work and more testing under controlled circumstances to know that there's a causal relationship. All three of those things kind of create challenges. It does take time, but I think the good thing is as the regulatory organizations like the FDA have gotten smarter and smarter about what machine learning is, what it is good for, what its limitations are, that whole process has gotten, I think, faster and more efficient over time. Then there's a second element, which is, of course, companies are in it to make money. At a minimum, even if they have purely humanitarian intentions, at a minimum they have to be sustained over time. That means that insurance companies and Medicare and Medicaid have to be willing to reimburse doctors and nurses when they actually use or prescribe these diagnostics and therapeutics. All of that takes time. Sam Charrington: [00:36:37] At least on the second of your three points, in thinking about scaling, solving problems like this, specifically training data, do you have a rule of thumb, a chart that says, okay, one trillion training samples will get us these five diseases, but we'll need 10 trillion to get to 10 diseases? I realize that that's almost an asinine question and it's much more complex than that, but does it make sense at all to think of it like that? And think of, I guess, the impact of collecting training data and what the trajectory looks like over time, kind of like the way we thought, as we drove the cost of sequencing down, about the downstream effects that that would have? Peter Lee: [00:37:27] Yeah. Well, when you find the answer to that question, please tell me. In my experience, I've seen this go two ways. One of the wonderful things about modern machine learning algorithms today is that they're far less susceptible to problems of over-fitting. They come very close to this wonderful property that the more data, the better.
But it does happen that sometimes you hit a wall, that you start to see a trail-off in improvement. We really don't know. The kind of early results that we've gotten with admittedly simpler diseases like CMV (and CMV is actually not that interesting from a medical perspective) give us tremendous hope. Then other internal, more technical validations give us supreme confidence that the basic science, the biological science, is well-understood now. Once you start really attacking much more complex diseases, like any cancer, it's really hard. I would be unwilling personally to make a prediction about what will happen. But there's every reason today for optimism, and I think the only unknown is whether we fall off a cliff at some point and stop finding improvements, or whether we're going to just get to a viable FDA-approved diagnostic in the near term that will be constantly improving as more and more people are diagnosed. It could really go either way. I'm really unable and actually unwilling to make a prediction about which way it will go, but we are feeling pretty confident. Incidentally, I should say last month Adaptive Biotechnologies closed a deal with Genentech for applications of this T cell receptor antigen map in the therapeutic space, in the area of cellular therapies for targeted cancer treatments. That deal has a value of over $2 billion, so there's also some ... When you're dealing with commercial relationships like that, there's a tremendous amount of due diligence. These are big bets, and big pharma is accustomed to making large, risky bets like this, but I think it's another sign that at least leading scientists at one of the larger pharmaceutical organizations are also increasingly confident that we can fill out this map. Sam Charrington: [00:40:38] We've talked about diagnostics. We've talked about precision medicine. What do you see happening on the tooling side, both from the doctor's perspective as well as the patient experience perspective? Peter Lee: [00:40:52] Yeah. One thing, it's a simple thing, but it's been surprising how useful it has turned out to be. We've been piloting chatbot technology that we call the Microsoft Health Bot. This has been sort of in a beta program with a few dozen healthcare organizations. What it does is, we've sort of advanced our cognitive services for language processing, for natural language processing, for conversational understanding, and the tooling to provide a drag-and-drop interface so that ordinary people can program these chatbots, at least for medical settings, and then we've improved the models, the language models, so they understand medical and healthcare concepts and terms. We've been surprised at the kinds of applications that people use. One example is there are organizations that have made prescription bots. The idea is this. Maybe you get a prescription from your doctor or from the hospital and you go to the pharmacy, you get your prescription filled, and then a day or two later, you get a message from this intelligent chatbot that's asking, "How's it going? Do you have any questions? Or have you had any issues with your medication?" It invites you proactively to get into a conversation that gives the healthcare provider tremendous insight into whether you're adhering to your prescription. That's a huge problem. Something like 35% of people actually don't follow through with their prescription medications. It's just there to answer questions.
Maybe you have some stomach upsets or some people who are on a lot of medications hate having all those bottles and they put them all, dump all the pills into a baggy and then they can't remember which pills are which. The health bot is able to converse with you and say, "Oh, well, why don't you point your phone camera at a bunch of pills and I'll remind you what they are." It uses modern computer vision ResNets, actually, to remind you what these pills are. The kind of engagement that the healthcare providers get, the improvements in engagement and the satisfaction that people like you and me have is really improved. Or just asking simple benefits questions or medical triage of various sorts, these kinds of ideas have been surprisingly interesting. In fact, so surprising for us that later this week, we'll be making that product generally available for sale. You'll be able to use the Microsoft Health Bot technology without any restriction, except for payment, of course. That is something that has gone extremely well. That technology now is being baked into more and more of, I think, of what people will be seeing. We have a collaboration hub application in Office 365 called Teams, and Teams has been this just wonderful technology for improving collaboration in all sorts of workplace settings. Well, we've made Teams healthcare compliant and able to connect to electronic health record systems, and then by integrating great kind of collaboration intelligence tools, to just parse records or a newer way to go to find certain bits of information or just to be able to ask an intelligent agent that is part of your team, "Did so-and-so check the sutures last night?" and be able to get a smart answer whether people are awake or not. There are all these little ways that I think AI can be used in the workflow of healthcare delivery. One of the things that is, I think, underappreciated about healthcare delivery today, especially in acute care settings, is it's a super collaborative environment. Sometimes there can be as many as 20 people that are working together as a team delivering care to multiple patients at a time. How to keep that team of 20 people all on the same page and all coordinated is getting to be a really difficult problem, typically done with Post It notes and half-erased whiteboards now transitioning to pretty insecure consumer messaging apps. But the idea of having real enterprise-grade collaboration support with AI, I think just can make all of that much better and then provide much more security and privacy for people. A lot of these applications of AI end up being less flashy than doing some automatic radiation therapy planning of a medical image, but they really kind of help people, those people on the front lines of healthcare delivery do their jobs better. Sam Charrington: [00:46:34] I tend to find myself having really kind of mixed feelings about conversational applications, at least from the perspective of talking about them on the podcast. There's no question that conversational experiences and interfaces will be a huge part of the way we interact with computers in the future and that there's tons of work that needs to happen there because of the reasons that you mentioned, like less flashy. I wonder if there's still interesting research. At least my question to you is are there still interesting research challenges there? Or is it all, do we have all the pieces and it's just kind of rolling up the sleeves and building enterprise software, which we know is hard and takes time? 
Peter Lee: [00:47:21] Yeah. It's a good question. It feels like research to me. Sam Charrington: [00:47:27] (laughter) Elaborate. Peter Lee: [00:47:28] Some of the problems, if anything, feel a little too difficult, honestly. If we just, say, take the problem of listening to a doctor-patient conversation and, from that, understanding what should go into the standard form of a clinical encounter note. Here's a typical thing. There could be an exchange. Let's say, Sam, you're my doctor and I'm your patient. You might be asking me how I'm doing and I might complain that the pain in my left knee hasn't gone away. We can have an exchange about how that goes, and ultimately, what goes into the note by you is a note about my continued lack of weight loss and that my being overweight is contributing to the lack of healing with my knee problem. That may or may not have been a part of our conversation. While it's important that the weight loss element be in that clinical note ... In fact, it might even mean revenue for that doctor because there might be a weight loss program that gets prescribed and so on. That's important and it's important not to miss that. The human exchange here and the things that are implicit in those conversations, let alone the fact that I'll say kneecap and you'll say patella, are things that are as close to general artificial intelligence-style problems as anything. Sam Charrington: [00:49:15] Yeah. Peter Lee: [00:49:18] Look, we don't kid ourselves that we're anywhere close to solving those types of problems, but those are the kinds of problems we think about, even when we just look at the kind of day-to-day, minute-by-minute work that people do to deal with their healthcare. Sam Charrington: [00:49:33] Right, right. Peter Lee: [00:49:34] There's another one that's interesting. To really unlock the power of AI, what we would want to do is to just open up huge databases to great researchers and innovators everywhere, but, of course, we need to do that without violating anyone's privacy. There's one problem, something called de-identification. It would be great to be able to take a treasure trove of what's in electronic health records and "de-identify" it. Well, some parts of those electronic health records are easy, because there might be a field called Social Security Number, another field called Name, another one called Address, and so on, so you can just scrub those out. But large amounts of clinical data involve just unstructured notes, and to really have a deep understanding of what's in those notes in order to scrub them in a way that won't inadvertently reveal somebody's identity or their medical condition, again, is something that ultimately ends up being a very general AI problem. Sam Charrington: [00:50:41] That's a great reframing of the way to think about this. I guess most chatbots are boring because they're boring. The kind of entity-intent framework that most chatbots are built on is kind of like table stakes relative to what we're really trying to do with conversational experiences. That really requires a level of sophistication in our ability to use and work with and manipulate natural language that is very much at the research frontier now. And that's why most current in-production chatbots are kind of boring. Peter Lee: [00:51:27] Yeah. We've taken a step toward trying to think of these things almost in terms of being able to play a game of 20 questions.
One of the most inspiring applications of health bots that we dream about is in matching people to clinical trials. At any point, there are thousands of clinical trials. You can go to a website called clinicaltrials.gov and there's a search bar there, and you can type in something like breast cancer. When you do that, you get this gigantic dump of every registered clinical trial going on that might be pertinent to breast cancer. While that's useful, the problem with that is it's hard to know which ones of those ... If you are, say, someone who's desperate to find a clinical trial to enroll in because you've run out of other viable options for whatever is ailing you, it's just almost impossible to go through all of that technical information and try to understand this. Would it be possible to use an AI to read through all that technical information and then to synthesize what amounts to a game of 20 questions, something that'll converse with you and ask you questions in order to narrow down to just that one or two or three clinical trials that might be a match for you? It's that kind of thing where it's not fully general conversation of the sort that I think you and I were talking about just a minute ago, but is slightly more structured than that, in order to help you more intelligently, more efficiently find the right medical or healthcare solution for you. That kind of application is something that we're really putting a lot of kind of heart and mind into, along with many others around the world. It's exciting that we're starting to see these things actually make it into clinical use today. I kind of agree with you. I do roll my eyes sometimes at the overheated hype around intelligent agents and chatbots as well, just like anybody else, but it's really getting somewhere in these more limited domains. Sam Charrington: [00:53:56] I think it also speaks to why the interesting work in domains like this is going to be ... It's not generic. You're solving a specific problem and there's a lot of investment in getting the machine learning and AI right for this particular problem as opposed to implementing a generic framework. Peter Lee: [00:54:16] That's right. Sam Charrington: [00:54:17] Awesome. Well, Peter, thank you so much for taking the time to chat with me about the stuff you're seeing and working on in the healthcare space. A ton of really interesting examples in there and I'm looking forward to following all this work and digging deeper. Thank you. Peter Lee: [00:54:37] And we didn't even talk about China once. That's great. Sam Charrington: [00:54:41] Well, you mentioned ResNet a few times, kind of taunting me to dive into that conversation, but I'll refer folks to the article and we'll put the link in the show notes. Peter Lee: [00:54:52] Sounds great. It was really a pleasure chatting.
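As a concrete footnote to the de-identification problem Peter raised a few minutes earlier: structured fields can be scrubbed with simple rules, while free-text notes are the genuinely hard part. The sketch below shows only the easy half, with invented field names and a naive regular expression; it is nowhere close to a real de-identification system.

```python
# Naive scrubbing of structured fields and one obvious pattern in a record.
# Field names and patterns are invented; real de-identification of free-text
# notes requires far more than this, which is exactly Peter's point.
import re

DIRECT_IDENTIFIER_FIELDS = {"name", "address", "ssn", "phone"}
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def scrub_record(record: dict) -> dict:
    cleaned = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIER_FIELDS:
            cleaned[field] = "[REDACTED]"                     # easy: known fields
        elif field == "note":
            cleaned[field] = SSN_PATTERN.sub("[SSN]", value)  # hard: free text
        else:
            cleaned[field] = value
    return cleaned

record = {
    "name": "Jane Doe",
    "ssn": "123-45-6789",
    "note": "Patient (SSN 123-45-6789) reports knee pain; lives near the old mill.",
}
print(scrub_record(record))
# The regex catches the SSN, but "lives near the old mill" may still identify
# the patient, which is why unstructured notes remain an open problem.
```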
Talk 228 – AI for Earth Interview Transcript Sam Charrington: Today's episode is part of a series of shows on the topic of AI for the benefit of society, that we're excited to have partnered with Microsoft to produce. In this show, we're joined by Lucas Joppa and Zack Parisa. Lucas is the Chief Environmental Officer at Microsoft, spearheading the company's five-year, $50 million AI for Earth commitment, which seeks to apply machine learning and artificial intelligence across four key environmental areas: agriculture, water, biodiversity, and climate change. Zack is co-founder and president of SilviaTerra, a Microsoft AI for Earth grantee, whose mission is to help people use modern data sources to better manage forest habitats and ecosystems. In our conversation, we discuss the ways that machine learning and AI can be used to advance our understanding of forests and other ecosystems and support conservation efforts. We discuss how SilviaTerra uses computer vision and data from a wide array of sensors like LiDAR, combined with AI, to yield more detailed small-area estimates of the various species in our forests. And we also discuss another AI for Earth project, Wild Me, a computer vision-based wildlife conservation project that we discussed with Jason Holmberg back in episode 166. Before diving in, I'd like to thank Microsoft for their support of the show and their sponsorship of this series. Microsoft is committed to ensuring the responsible development and use of AI and is empowering people around the world with this intelligent technology to help solve previously intractable societal challenges spanning sustainability, accessibility, and humanitarian action. Learn more about their plan at Microsoft.ai. Enjoy the show. Sam Charrington: [00:02:17] All right, everyone. I am here with Lucas Joppa and Zack Parisa. Lucas is the CEO of Microsoft, no, not that CEO, but the Chief Environmental Officer. Zack is the Co-Founder and President of SilviaTerra. Lucas and Zack, welcome to This Week in Machine Learning and AI. Lucas Joppa: [00:00:22] Thanks for having us here. It's a huge pleasure. Zack Parisa: [00:00:24] Great to be here. Sam Charrington: [00:00:25] Awesome. Let's dive right in. We'll be talking about Microsoft's AI for Earth initiative, but before we jump into that, Lucas, as the CEO of Microsoft (I think I'm going to run this one all day), tell me a little bit about your background and how you came to be the CEO of Microsoft. Lucas Joppa: [00:00:48] Yeah, sure. I would say I never dreamed of being the CEO of anything, that's for sure. Particularly in the standard sense of it, much less what it means in my specific title, which is the Chief Environmental Officer. I mean, I grew up in far northern rural Wisconsin, and I was obsessed with being outside. My approach to school and life in general was, how can I get done with anything that I need to get done so I can go play out in the woods? I think I thought I was going to grow up to be a game warden or something similar to that. Technology was not a big factor in my life, either. I mean, I never had a computer growing up, or a TV, or anything else. I eventually found my way into university, started discovering that I was really interested in thinking about a career in environmental science, studied wildlife ecology. Again, not the traditional career path for somebody at Microsoft.
Went off and spent a little time in the United States Peace Corps in Malawi, working for the Department of National Parks and Wildlife, and then came back and did my PhD in ecology. It was really then that I started to put together the two kind of incredible ages that I think we're alive in today, and the way I see our world, which is that we're doing business here at the intersection of the information age and this also incredible age of negative human impacts on earth's natural systems. It was during my PhD that I was really struggling with what's the right way to do science in a way that scales with the scale of the problem. That's when computing, programming, machine learning all kind of came flooding into my life at the same time. Ended up at Microsoft and Microsoft Research leading programs in environmental and computer science, and then things just progressed from there. Sam Charrington: [00:02:41] You're actively involved in academic research and a number of organizations. Can you share a little bit about that? We talked about it a bit earlier. Lucas Joppa: [00:02:51] Sure. I mean, once you live long enough in the academic world, there's a Pavlovian response to some of the rewards that that environment instills. I mean, I'm not proud to say it, but since I'm not proud, I should just say it. I am still that academic that checks their citations every day when I wake up over breakfast. While I definitely have a much larger and more expanded purview of roles and responsibilities here at Microsoft, I still think science is important. Science is what drives all of the environmental sustainability decisions that we make here at this company. It's what ultimately led to why we invested in this program, AI for Earth. I firmly believe that you have to understand the details if you're going to try to lead an organization somewhere with a big-picture vision; if you don't understand the details, if you don't understand the science, then it's difficult to do that. Just the way my brain works, the easiest way to understand the details is to get your hands dirty and be in there with the rest of the world trying to build the solutions of the future. That's where the academic research for me comes in. It's just that opportunity to actually go really deep and work on both sides of the equation. I still publish in the environmental science literature. I still publish in the computer science literature, and the most depressing thing about that is how few of us there are that do both of those things. One of the things that I spend a lot of my time every day doing is just trying to bring those two worlds together, and publishing is a fantastic way to do that. Sam Charrington: [00:04:35] Zack, you're a forester. Zack Parisa: [00:04:37] Yeah, yeah. Sam Charrington: [00:04:38] I didn't know that was a thing beyond the Subaru. Zack Parisa: [00:04:40] Right, right, sure enough. It's absolutely a thing, and it's exciting. I think there's a rebirth in forestry now. I'm hoping that it'll become a more broadly known thing here before too long. Sam Charrington: [00:04:56] Tell us about your background and about SilviaTerra. Zack Parisa: [00:04:59] Yeah, sure. The start of my story actually isn't terribly dissimilar from Lucas's. I grew up in North Alabama, though, not Wisconsin, but in this funny place that was, like much of North Alabama, covered in woods, but also has a NASA installation, in Huntsville, Alabama. My youth was basically just spent in the woods.
When I was in first grade, I wanted to be an Entomologist. When I was in third grade, I wanted to be a Zoologist. I went through geology and so on and so forth until I finally met somebody who was a forester. Until you meet somebody and have somebody walk you through what that is, it's an obscure field. What that is to me is the confluence of economics and ecology. It was this brilliant opportunity at the time, and that's the way that I saw it, because it brought together everything that I cared about. From the ecology side, insects and soils, geology, the interconnected nature of all of those systems, but also the economic side. Not only what the forest is, but also what we want it to be and how we value that as a society, and how we mean to take it from one place now, which is where we find it today, to where we want it to be, and what we believe we need. That was my entrance into it. I believed I would carry that out. I would live and work as a forester by managing some tract of land for some owner, whether that's public or private, but I would be focused on that landscape. Going through undergrad, what I became really interested in were, oddly and as a surprise to me, the quantitative aspects of certain problems, like insects in a forest. When I first got into forestry, my freshman year, there was a massive outbreak of southern pine beetle in the U.S. South, and it was killing lots of pine trees. That was a really compelling problem to me because it relates so much not only to the trees themselves and the beetle, but also to how we've managed them historically and how that impacts local economies and that type of thing. I started into pheromone plume modeling, of all things, in a forested system, trying to take measurements of concentrations of pheromones in locations and backtrack to where that originated from in the wind, to try and deal with these beetles more effectively. What I learned from that, or what I gathered, was that there's this incredible ability to scale up my interests. To still focus on the things that I loved the most, but to look at them with a different lens and to potentially effect change in a different way than I had conceived of before. I wound up doing work in Brazil, I was really interested in Tropical Forestry. I took some time off from undergrad to do that, and worked in other areas, Bolivia in South America. There I got to see situations where people were dependent on different aspects of land in different ways, and more direct ways than I think I was familiar with from my youth in the U.S. South. They were hunting animals, they were collecting nuts, fruits, things like that. They were collecting fuel wood to stay warm, to cook. They were also wanting to sell wood into a market, and to develop as communities. Forestry is about trade-offs. There are a lot of things that we can do, and there are a lot of potential futures that we have before us, but we have to address the complexity of those systems in more comprehensive ways than we have in the past. There's far more than just a timber market now, there's far more than just a concern for delivery of wood to build houses. We spoke about this just a little bit before, but that was experienced very acutely here in the Pacific Northwest, when people were confronting the issue of whether we had enough spotted owl habitat, or spotted owls themselves, or not.
Whether we had managed appropriately in the past to accommodate those, and everything that's related to that species, or the habitats and other species that are related, or whether we haven't, whether we'd failed. Whether we needed to go back and reconsider the ways that we make decisions. That was a really fraught conversation, it brought people to boiling points, and that was before my time really, before I really entered into the profession in any meaningful way. That type of conversation goes on now and it's even more complicated, and there are more issues and more dimensions that we have to consider than there were then. To have constructive conversations, we have to have information to inform those discussions, to facilitate the communication that yields solutions that people can live with.

Sam Charrington: [00:10:40] I'm presuming that that need is what led you to found SilviaTerra?

Zack Parisa: [00:10:45] It is. Yeah. Absolutely.

Sam Charrington: [00:10:47] What is SilviaTerra, what is the company?

Zack Parisa: [00:10:48] Right, what do we do here? I keep failing to answer your questions here. At SilviaTerra we provide information, just like what I was speaking about there. Our objective is to help people use modern data sources, like remotely sensed information from satellites, from aircraft, from UAVs, and modern modeling techniques, to help get more resolution on information and get more accuracy and precision on information. Not only just about trees, but about habitats and beyond. That's the focus of our company. We've been at this for about nine years. A lot of the folks that we work with are timber companies, we also work with environmental NGOs, we work with government agencies. All of them have effectively the same questions, very similar needs. Up until now we've been providing data project-to-project to help them answer those critical questions that they confront on a regular basis. I guess the reason I'm in this room with you all here today is that we were able to start working with Microsoft AI For Earth, to begin to scale and expand that work, to build a foundational data set that we can start to use to answer these questions and to build on, to improve our ability to manage for the future.

Sam Charrington: [00:12:21] This may be a good segue to taking a step back. Lucas, what is AI For Earth?

Lucas Joppa: [00:12:29] Sure. Well, I think in the context of this conversation, you can think about it this way. What is AI For Earth? It's why a reformed forester, who's now the co-founder of a startup, and a reformed wildlife ecologist, who's now the Chief Environmental Officer at Microsoft, are at a table talking with you on TWIML.

Sam Charrington: [00:12:44] I feel like we're in this recursive loop.

Lucas Joppa: [00:12:46] That's right. I know, exactly, I can't even see you guys anymore. I'm just staring at myself in an infinity mirror here. What AI For Earth is, is, as of Tuesday of this week, a one-year-old program.

Sam Charrington: [00:13:00] Happy birthday.

Lucas Joppa: [00:13:01] Thank you. Thank you. It was fantastic. We spent it celebrating with our colleagues at National Geographic in Washington, D.C.

Sam Charrington: [00:13:08] In the woods?

Lucas Joppa: [00:13:10] Unfortunately no, but at the founders table of one of the most iconic and exploration-driven organizations in the world. It was an incredible time. What AI For Earth is, is a five-year, $50 million commitment on behalf of Microsoft to deploy our 35 years.
Actually, a little bit more than 35 years of fundamental research in the core fields of AI and Machine Learning. To deploy those to effect change in these four key areas of the environment that we care deeply about, which are agriculture, water, biodiversity, and climate change. The reason that we're doing that is because we recognize, at Microsoft, and I already spoke about this tale of two ages, this time of the information age and this time of incredible, negative impacts of human activities on earth's natural systems. You look and you realize that as a society we're facing an almost unprecedented challenge. We somehow have to figure out how to mitigate and adapt to changing climates, ensure resilient water supplies, and sustainably feed a human population rapidly growing to 10 billion people. All while stemming this ongoing and catastrophic loss of biodiversity that we see around the world. We've got to do that while ensuring that the human experience continues to improve all around the world for everybody, and that economic growth and prosperity continue to grow. That's why I say it's an unprecedented challenge. I mean, the scope and the scale are just incredible. If you look at the scope and scale of the problem and you step back and ask yourself, as a company, the same question that I asked during my PhD, which is, "Well, what are the things that are growing in the same exponential fashion as the scale and complexity of our environmental challenge?" Well, pretty much the only trends that are happening in an analogous fashion are in the tech sector, and particularly in the broader field of AI and the more narrow Machine Learning approaches that are getting a lot of attention today. That's when we decided to put together this program, to actually say, "Hey, we've been investing as a company for over a decade at the intersection of environmental science and computer science." I led research programs in our blue-sky research division, called Microsoft Research, for a fair number of years on that. But then the technology reached a point, and the criticality of the societal challenge, I think, reached a point, that it was time for a company like Microsoft to step in and actually start to deploy some of those resources. Deploy them in ways that ensure that we ultimately change the way that we monitor, model, and then ultimately manage earth's natural systems in a way that we've never been able to before. We started out, as I said, a year ago with basically nothing but aspiration. We looked back this past Tuesday, at this event that we had at National Geographic where we inducted a new set of grantees into our portfolio, and realized that in that short year we'd set up relationships with organizations all over the world. Over 200 organizations all over the world, each of which is dedicated to taking a Machine Learning-first approach to solving challenges in these four domain areas that we focus on. They're working on all seven continents now, in over 50 countries around the world, 34 states here in the United States. Today, we get the opportunity to sit down with one of the grantees, right? To hear a little bit more about just their particular experience, and talk about the ways that Machine Learning in particular can fundamentally change our ability to understand what's going on on planet earth.
Because I think most people don't take the time to step back and realize, when they hear terms like information age, just how narcissistic that really is, that almost every bit of information that we've been collecting is about ourselves, right? It's about where the nearest Starbucks is, it's about what people who searched for this also searched for, right? It's at the peril of ignoring the rest of life on earth and the ways that it supports us and our economies. That's what SilviaTerra, I think, is so focused on: using vast amounts of data and new approaches in Machine Learning to actually just ask seemingly simple questions like, where are all the trees in the United States? We don't know the answers to things like that. I mean, that just blows my mind, and so that's where a lot of this came from. It's just a fundamental desire to change our ability to monitor and model life on earth. I guess that isn't all that simple, but I also think it's completely and totally doable, right? I mean, look at where we've come from, in terms of information processing capacity, over the past 25 years to where we are today. I mean, if you would've tried to predict every little bit of it, it would have been impossible, but it seems preordained now that you look back at it.

Sam Charrington: [00:18:38] When I think about the types of systems that we've been talking about thus far, both the economic systems and political systems as well as the biological systems, it jumps out at me that there's a tremendous amount of complexity in those systems, and Machine Learning, deep learning in particular, has this great ability to pick out patterns and abstract away from complexity, which kind of says to me, "Oh, it's a no-brainer to apply Machine Learning to this." We're still very early on in our ability to put these Machine Learning techniques to work. I guess I'm curious, maybe for you Zack, where you think the opportunity is with applying Machine Learning and AI for the types of problems that concern you, in particular with regard to forests?

Zack Parisa: [00:19:43] Yeah, yeah, absolutely. I guess, listening to Lucas there, one thing that jumps out at me, from when you first spoke and from your response to the second question, is that there are lots of people that are very interested in natural resources and there are lots of people that are very interested in Machine Learning and AI, but the overlap is a very small community of people. I think it's rare that you … it's uncommon to start out believing you're going to spend all your time outside and then find yourself curled up in front of some code. The first thing, I think there's a lot of opportunity for people to make that leap and just to begin to see that as a more natural thing, because the questions are very complex. Again, just like Lucas said, most of our focus has been on how to market to somebody to buy a cup of coffee here versus there. How to think about social networks, and how to think about marketing networks and transportation networks. I think it's exciting to see that begin to percolate down and transition to the story behind how all of those materials come into our world and life. The fact is, and I think the surprising fact is, that everything around us, every little bit of technology and everything that built this room that we're in or that your listeners are in, those things were either grown or mined. Every piece of that, every little bit, has some geographic story, some physical story, some environmental story.
If we were to be confronted with all of those stories, just from one day of our consumption, one day of us interacting as we normally do, it would take us years to even sift through them. There's no way, there's no way, but those stories all amass to have a very large impact on how we all live. To me, that is the huge opportunity here. With Microsoft AI For Earth, we have worked on this data set for the continental U.S. at high resolution that informs, down to species and diameters, where trees are, what those structures and compositions are, and, moving forward, what they could be. That's not going to stop. The fact is that we are all consumers, and while we have a conservation need, we also have a consumptive need. I think there's so much opportunity to begin to investigate how we balance that and how we feel about that, and to engage in a meaningful conversation at multiple levels in society about how that can best be done. You asked about opportunities. I mean, I was never excited about AI or stats or Machine Learning for its own sake. I mean, it is awesome, I now understand that, and I do get jazzed up about exciting advances there, but it's about what it can answer. I mean, that's what drew me out of the woods and put me in front of a computer: it was the ability to start to even think about those big questions and distill it all to something simple and right in front of us. That's the opportunity. It allows us to know more about our world and ourselves, and to create a better world and a better image of ourselves.

Sam Charrington: [00:23:34] Can we maybe dig into a little bit more detail on either the data set that you just mentioned or another project, and talk through the process through which SilviaTerra uses Machine Learning, the challenges that you run into? Maybe walk us through a scenario.

Zack Parisa: [00:23:54] Sure. Absolutely. I'll just briefly tell you where we're coming from. People have been managing forests for hundreds of years, a couple hundred years, and in the U.S. for about 100-plus. They needed information then, as they do now, but to get that they would do a statistical survey: they would go and put measurements in, work up an average, and make a plan based on that average. That has been effective, and it's what people use a lot still today, but what we're focused on doing is bringing imagery to bear, and model-assisted and model-based methods, to yield small-area estimates. For us it's at a 15 meter resolution, and for a 15 meter pixel, what we're predicting is the number of stems, their sizes, and species. When I say size, I mean the diameter of the trunk of the tree at four and a half feet off the ground. From there, in a hierarchical context, we predict maybe the height of the tree or the ratio of crown to clear bole at the bottom. From there, since we can infer or predict the light conditions under that forest, maybe how much herbaceous plant matter there may be. Carrying that forward: how many herbivores could that support? Scaling that up, how many large carnivores could that support? For now, the primary piece, this foundational data set that we've worked with Microsoft on, is that tree list information for each one of those pixels, which hasn't existed before, but that opens up so many doors for what we can begin to build onto and model further down the line.

Sam Charrington: [00:25:46] At a resolution of 15 meters, a single pixel might contain how many trees?
Zack Parisa: [00:25:54] It could contain an awful lot, easily, and this is the tricky thing, because the tree could be as small as a seedling or as large as a sequoia. You could have less than one, right? Or it could have 300 small, tiny little trees packed in tight. This, to me, is the fundamental difference between what we're working on here and where we're coming from. Which is, we need to transition away from the binary or basically qualitative classifications, forest, non-forest. That's not actually that informative about what that forest can … what habitat it can provide. What maybe we need to do or not do to ensure that it's the type of forest that's going to continue providing the things we care about. Clean water, carbon out of the atmosphere, wood to build this table. Those are the types of things. Beginning to quantify those aspects is very important. When I began working with this, everything was on the table. I mean, there was a potential to use LiDAR and neural nets to try and identify discrete trees. We do not do that, for various reasons, largely bias in the results. For us, parting out species became a massive problem. If you have, let's say, 40 trees of multiple species in one pixel, how do you begin to differentiate those when you're looking at one pixel of data from lots of imagery sources? That was a technical challenge.

Lucas Joppa: [00:27:40] One of the things that I think is interesting about this is, you're talking about forestry, right? Whether or not people know it's a profession, it's an extremely old one. You don't think that you're going to be talking about Machine Learning. You also don't think that you're necessarily going to be talking about philosophy or existential questions, but you asked a question about 15 meter resolution, right? When you work with organizations like SilviaTerra that are looking down at the world and asking what is there, you end up having these existential conversations about what is a thing, right? At what level should we be taking data points to be able to feed into these Machine Learning algorithms? Because when you incorporate the zed dimension, or the Z dimension, or whatever you want to call it depending on what part of this planet earth we're from, you can be looking down at a multitude of different objects, right? Depending on what sensor you're using, you may only see one of them or you may see many of them, if you're using something like LiDAR and you're able to get your laser sensors dense enough to see enough of those things. You start struggling with all of these questions that are actually fairly unarticulated in the modern Machine Learning literature, quite frankly. Where all the standard libraries take in a 300 by 300 pixel image, and they all have these harsh expectations, and sure, maybe we think we all left the world of frequentist statistics behind, but we still carry over the ghosts of a lot of those harsh binary classification results. It's just fascinating, I think, to think about not just what's hard in the forestry space and how modern Machine Learning techniques can help transform that, but also what the problems and the applications of an organization like SilviaTerra, and the rest of our AI for Earth grantees, bring to the Machine Learning community, which is: what's hard here? Why can't we just take all the deep neural network advances that we've made and, voila, we've solved all the world's problems, right?
It's because, as you said, we're still at the infancy of a lot of what we hope to achieve in Machine Learning. We just also recognize the severely short amount of time that we have to answer some of these bigger environmental questions. We have got to take everything that we have at our disposal and start to deploy it.

Sam Charrington: [00:30:18] You mentioned sensors and LiDAR; a very specific curiosity question. I've always associated LiDAR with local, very short-range sensing. Is that not the case? Can you do LiDAR from satellites?

Lucas Joppa: [00:30:34] Yes, yes.

Sam Charrington: [00:30:35] Are we talking about satellites or planes-

Lucas Joppa: [00:30:36] Planes.

Sam Charrington: [00:30:37] What are all the sensors that come into play here?

Zack Parisa: [00:30:38] A new sensor was just launched a couple weeks ago.

Lucas Joppa: [00:30:42] Something like that.

Zack Parisa: [00:30:43] There's the GEDI sensor, it's called GEDI. I'm used to it now.

Lucas Joppa: [00:30:48] I was going to say it.

Sam Charrington: [00:30:49] Use the LiDAR?

Zack Parisa: [00:30:50] Use the LiDAR, Lucas.

Lucas Joppa: [00:30:52] GEDI, here's a …

Zack Parisa: [00:30:55] Well, it's worth [crosstalk 00:30:56]. They're strapping this thing onto the space station. It's going to be pulsing down, not the poles, but basically everything in between. I think it's full-waveform LiDAR, and so absolutely, even historically there was ICESat, which was a satellite-based LiDAR sensor. Moreover, and more commonly in forestry, and a lot even in urban areas, they're collecting LiDAR information from airplanes at different altitudes and different point densities. A common one might be like 12 or 24 points per square meter. When you see that over a forest canopy, some of those pulses reach the ground. The best elevation models that you see in the U.S. right now are LiDAR-derived elevation models. That's the source of a lot of the information that we're getting. You see it in a lot of floodplain areas, the Mississippi Delta area, so that we can better understand how flooding may occur or may not occur in certain areas.

Lucas Joppa: [00:32:02] One more thing that I'm always struck by, when you start thinking about remote sensing, and just sensing in general as applied to environmental systems, is that as we start to take a more digital or computational approach to sensing, we almost by definition have got to start taking a more Machine Learning approach to driving insights. Because, and I don't know, maybe I'm just missing the conversation or maybe the conversation isn't as fully articulated as it could be, but computers are able to sense the world in so many more dimensions than people are. Why do we model? Well, we model because we need a simplifying function to help us understand an already complex world. What was already complex according to our five senses has now become exponentially more complicated with things like hyperspectral-resolution monitoring, where you're getting thousands of bands of imagery back, plus things like LiDAR that are getting 24 points per square meter. Humans don't even know … It's interesting, people always complain that they don't understand what the layers in a deep neural network do. We also have no idea how to even interpret most of the signals that are coming back from the most advanced sensors in the world, because they don't correspond to the dimensionality that we live in.
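[Editor's note: to make the LiDAR-to-elevation-model idea Zack describes above a bit more concrete, here is a minimal, purely hypothetical sketch. It is not SilviaTerra's or Microsoft's pipeline; the inputs (a point cloud already classified into ground and non-ground returns) and the 15 meter cell size are assumptions for illustration, and real workflows typically rely on dedicated tooling such as PDAL or lidR rather than a hand-rolled loop.]

```python
import numpy as np

def canopy_height_model(points, ground_mask, cell=15.0):
    """Toy gridding of classified LiDAR returns into a canopy height model.

    points      : (N, 3) array of x, y, z coordinates for each return
    ground_mask : boolean array of length N, True where a return was
                  classified as ground
    cell        : output cell size in metres (15 m to match the pixel size
                  discussed in the conversation)
    """
    x, y, z = points.T
    ix = ((x - x.min()) // cell).astype(int)   # column index per return
    iy = ((y - y.min()) // cell).astype(int)   # row index per return
    nx, ny = ix.max() + 1, iy.max() + 1

    dtm = np.full((ny, nx), np.nan)  # digital terrain model (bare earth)
    dsm = np.full((ny, nx), np.nan)  # digital surface model (canopy top)

    for i, j, zi, is_ground in zip(ix, iy, z, ground_mask):
        if is_ground:
            # Lowest ground return per cell approximates bare earth.
            dtm[j, i] = zi if np.isnan(dtm[j, i]) else min(dtm[j, i], zi)
        # Highest return of any kind per cell approximates the canopy surface.
        dsm[j, i] = zi if np.isnan(dsm[j, i]) else max(dsm[j, i], zi)

    # Canopy height = surface minus terrain; cells with no ground return stay NaN.
    return dsm - dtm
```

Cells with no ground return stay undefined, which is one reason the point densities mentioned above (on the order of 12 or 24 points per square meter) matter for forest applications.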
Sam Charrington: [00:33:22] I was just going to ask that. When I've talked to folks that are using LiDAR in the context of self-driving vehicles, this whole idea of sensor fusion comes into play, making sense of all these disparate data sources. Those examples are very local, and now we're talking about global data sources, or at least much larger scale, with overlapping tiles and capabilities. There's a ton of complexity. Is that type of complexity some of the complexity that your company is working on managing, or do you count on upstream providers to sort a lot of that out for you?

Zack Parisa: [00:34:09] That's exactly the type of complexity that we deal with. I mean, there is an enormous pool of potential data sources that exist, and they all have potentially very useful attributes, some of them less so. They have different timestamps associated with them, and there's one very nice thing about measuring forests: as long as you don't mess with them, they tend not to move too much. Trees are pretty willing subjects to be measured, but they are always changing. There's growth, there's naturally occurring disturbance, there's human-caused disturbance, and both of those we want to keep track of. What I see our role right now as being is taking that massive pool of potential sources of remotely sensed data, and the very small and often underappreciated pool of field measurements, the things that we actually might care about, and translating between those things, creating something that is more highly resolved, more accurate, more precise, and more useful than what could otherwise be achieved. So yeah, draw the signal out of the noise, the classic tale.

Lucas Joppa: [00:35:24] If I look at kind of the full portfolio of AI For Earth grantees, well over 200, you see that, at least in my mind, SilviaTerra is, as an organization, one of the most mature, right? They're actually out of the lab, they're a startup with a business model, et cetera, et cetera. When I think about why that is in the context of Machine Learning, why they're able to take advantage of it, it's because of one thing that we just heard, which is they're taking advantage of these ground-based data points that they can use to train their models, right? That's because forestry is something that is so inherently tied to our broader economy, here in the United States and all around the world. There's a history of going out, boots on the ground, putting a tape measure around a tree and a GPS signal next to it, and saying, "This tree is here, it's this height, and it's of this species." That's so rare in the broader environmental space. One of the reasons that I think organizations like SilviaTerra are unfortunately standing alone in many respects is that there are so few data sets. It's called Machine Learning because we're teaching computers, right? To teach, you have to show examples; to be taught, you need to be shown examples. It's why we've seen such significant advances in some fields of Machine Learning but not in others. There are just so few annotations in our space. But when you come into a forestry space, where the U.S. government has paid money for the past hundred years to go out and figure all this out, companies like SilviaTerra can stand on top of that and really just kind of zoom off ahead. But they are in many ways the exception to the rule, which is unfortunate, I think.
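[Editor's note: the approach Zack and Lucas describe, field plot measurements as training labels and remotely sensed features as predictors, with predictions made for every 15 meter pixel, can be sketched very roughly as below. This is not SilviaTerra's actual model; every array, feature, and target here is hypothetical, and the random forest is just one plausible stand-in for a model-assisted estimator.]

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical training data: one row per field plot.
# X_plots: imagery / LiDAR features extracted at each plot location.
# y_plots: attributes measured on the ground, e.g. stems per hectare,
#          basal area, and mean diameter at breast height (4.5 ft).
X_plots = rng.random((500, 12))                  # stand-in for real features
y_plots = rng.random((500, 3)) * [800.0, 40.0, 35.0]

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_plots, y_plots)                      # learn plot -> attribute mapping

# Apply the trained model wall-to-wall: one row of features per 15 m pixel,
# one vector of predicted stand attributes per pixel.
X_pixels = rng.random((10_000, 12))              # stand-in for a tile of pixels
pixel_estimates = model.predict(X_pixels)        # shape: (n_pixels, 3)
```

In a real workflow the plot labels would come from long-running national forest inventories of the kind Lucas alludes to, and the per-pixel predictions would typically be aggregated and checked against plot-based totals so the small-area estimates stay statistically honest.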
Sam Charrington: [00:37:18] Do you find that the kind of work that you're doing, we talked about the sensing and pulling all that information together, does this put you at the research frontier of using Machine Learning techniques, or are you able to use off-the-shelf types of models? Where does your work fall in the spectrum of complexity?

Zack Parisa: [00:37:45] Boy.

Sam Charrington: [00:37:46] Or maybe complexity is not the right word. Just in terms of the innovation cycle, are you able to apply things that people are doing in other fields pretty readily? Or are you having to push the limits and pull right out of academic research, or things like that?

Zack Parisa: [00:38:05] It's a little bit of both. I mean, our core algorithm has matured over the last nine years of doing the work that we have, and we're a small team, we're 10 people effectively. I guess, when I got into this, when I realized this quant path was something that really resonated with me, that I connected with and saw value in, I originally thought I was going to be a professor, I would be a researcher somewhere. I would be putting papers out, because that must be how change happens. My path changed when I went around to people that I'd worked with in industry and asked them what papers they were reading that changed the way that they worked. What were the most influential journals that they were reading? The answer was that they weren't reading the journals, they were busy managing land, and that they wanted a tool, not a publication. I mean, that was a little eye-opening, and that's what Max Nova, my co-founder, and I set about to do: build tools. I don't really accept a full dichotomy between "is it research or is it just off-the-shelf type stuff?" I mean, we pride ourselves on our ability not only to understand the systems that we're working in, but also to be abreast of what's happening in modern computational techniques and modeling efforts and modeling tools. Which I imagine everybody would probably say, right? Everybody would tell you, "No, we're right on the edge." The funny thing that I learned when I got into this, and I'm on the applied side, I mean, I talk with people that are trying to figure out wildfire modeling and how to pick which communities to allocate funds and efforts to, to help manage a forest to prevent catastrophic fires. I work with people that are trying to figure out how to manage for forest carbon. I work with people that try and figure out how to manage forests to deliver wood to a mill to make paper. What's striking to me, I guess, from where I started to now, is that I thought what people needed to see was the math. I thought I would show up at their offices and be like, "Good news. We figured it out. Check this new method out. We pipe in this data. We put in these measurements from the ground. We're able to model this more effectively now." What I learned is that if I can't communicate effectively about what we've done, if it really truly seems like magic, then it is by definition incredible in the truest sense of the word: it is not credible, and credibility counts. In some cases, when we're working with people, we may not use the most fantastic new thing. We may use something that is slightly more costly in terms of the input data that it requires, or costly in terms of model fit, but that is more easily understood and explained, and more robust to, like, the boot test. You go out and it just makes sense.
Sam Charrington: [00:41:36] Lucas, does that experience ring true for the other grantees that you work with, or is there a spectrum of experiences there in terms of where they are in applying it?

Lucas Joppa: [00:41:47] Some of our grantees are using almost commodity services at this moment. I mean, Microsoft for instance has a service called Custom Vision AI, sorry, Custom Vision API. Some of our grantees want to do simple image recognition tasks, and the service works for them. They literally just drag in a whole bunch of photos of one type and a whole bunch of photos of another type, and the system learns it and produces a result for them, and that's fine. Right? That's pretty far on the one side of just commoditized services. Then there are other grantees that are out there creating exceptionally custom algorithms for their work. I think we've got a grantee called Wild Me that does basically facial recognition for species, so that they can provide better wildlife population estimates of species like giraffe and zebra, things like that. Everybody knows, or everybody has heard, that every giraffe's pattern is unique, but look at a couple of photos of giraffes and you realize just how hard it is for the human eye to spot those differences. Right? They're building algorithms to differentiate any particular zebra or giraffe and then plug those into statistical models for estimating populations. There's nothing off the shelf that does that. In fact, for most of the main libraries, they have to go back and modify the core code. So it's a full, full spectrum. We're willing to support all of it, right? Because what we're trying to get people to understand is, well, first and foremost, we're just trying to break down the access barrier, right? We want to ensure that budget isn't a barrier to getting this stuff done. Because, as I'm sure you and many of your listeners are aware, sometimes the latest Machine Learning approaches can be fairly expensive. It might be an open-source library, but somebody needs 1,000 GPUs to run the thing on, right? We make sure that the infrastructure gets in the hands of folks, et cetera, but it's also just awareness that you could be thinking about this; you don't have to be an expert. We want the world's leading Machine Learning scientists to be thinking about what they could be doing, but we don't want the rest of the world to think that they have to be one of the world's Machine Learning experts to have a crack at this, right? There's software and services that can help them as well. We see the full spectrum, and I think it's super healthy. We also see the full spectrum of, if I were to encapsulate what Zack was saying there in just two words, interest in what we would call Explainable AI, right? Do people really care why an algorithm said that this was a giraffe and that was a zebra? Not really. You don't have to explain that to them. Right? Do they want to understand why some decision support algorithm, like a spatial optimization algorithm that assigns this part of the country or this part of the county to protected land, and this part to industrial use, and this part to urban growth and expansion, works the way it does, and why people thought that this was a better policy than that?

Sam Charrington: [00:45:14] Probably so.

Lucas Joppa: [00:45:15] Yes, they do. I think there's a lot of hand-wringing and angst right now around conversations like Explainable AI and whatever.
I think it's no different than the conversation we've always had about modeling, which is: it's a model of a complex system, so why are you building it? If it's being built to just do a simple classification task and it's easy for a human to go and check the accuracy, left or right, then great, you can use some really advanced statistical techniques. If that model instead is a model of, for instance, a human decision process, then I think the onus on explainability is much higher.

Sam Charrington: [00:46:03] Along those lines, we've used computation to understand the environment and climate for a very long time. Weather, for example, has been a great focus of high-performance computing. Taking a step back from the fact that we're all really excited about AI, where do you think AI offers unique opportunities relative to the things that we've done for a long time?

Lucas Joppa: [00:46:31] Sure. Well, I think that the answer to that will be super complex, but I'll try to make it simple. You mentioned weather. Sure, there's no question that statistics and math, and then the computational platforms that started to support them over recent decades, have been used for environmental monitoring. I mean, it goes all the way back to Fisher; some of these guys were biologists, right? The bigger question is why we are excited about this today. For me it really is the full, broad definition of what we mean by AI. It's the recognition that we're finally deploying computing systems that can collect unprecedented amounts of data, and not just amounts, but, as we were talking about, the full crazy dimensionality of the data that we're starting to take on. We've got this breakthrough in data, we've got this breakthrough in infrastructure, where you can … I made a joke about needing 1,000 GPUs. Well, if you need one, 1,000, 10,000, you just have to turn a knob these days and get access to it.

Sam Charrington: [00:47:43] Wherever you are on that knob, it's still a lot cheaper than a supercomputer.

Lucas Joppa: [00:47:47] Extremely. We have made crazy advances in a whole plethora of algorithms, and for a lot of the most important ones, we've directly accelerated the compute from the perspective of those algorithms, for the first time. Then of course we've made it so easy to deploy these algorithms as web-based services, as APIs, right? And of course the software infrastructure stack and all of that is incredible. We've made it commodity-level infrastructure; anybody can get access to this stuff. You hear this term Democratizing AI; what we mean by that is bringing it all into a stack that anybody can use. You don't need access to a government-run supercomputer anymore. That's all one side of it. The other thing, with weather as a great example here, is that traditional weather forecasting was strongly numerical simulation. That's one type of math, right? But there wasn't a lot of learning in real time about what was going on. We took a physical process, we built a model that we thought strongly corresponded with it, and then we ran numerical simulations of it. Fast forward, and yeah, just from the simulation perspective, you need a lot of compute. The thing is, all sorts of crazy things happen when we do that, that we don't quite understand. Right? Little eddy fluxes happen in some atmospheric layer or whatever, and we don't really know why.
Then the weather community started using Machine Learning, not necessarily to learn why, but to be able to predict, for one reason or another, when those things were going to come, and weather forecasting got a lot better. The same thing is happening now in climate modeling as well. We know there are things that we just can't do with our traditional approach to climate modeling. There's a whole new group that just spun out that's taking a purely Machine Learning-first approach to building a new climate model for the world, and not positioning themselves as better, but positioning themselves as complementary. I think there's a lot of work that's just happened in commoditizing all of this stuff, as well as recognizing that while we've taken a hugely mathematical, statistical, and computational approach to doing some of this stuff in the past, Machine Learning is a different approach, right? It's a data-driven approach, and that can be very complementary, and we've seen it accelerate extremely economically important things like weather forecasting, forestry, agriculture, and on and on.

Sam Charrington: [00:50:31] As we wind up, Zack, can you share something that you're particularly excited about, looking forward, in terms of the application of AI to forestry?

Zack Parisa: [00:50:42] Yeah, absolutely. I mean, obviously we're excited to be releasing this data set, but it's really about what it enables. We're excited to see more nuanced and reactive markets around environmental services like species habitat, carbon, and water be informed by these types of data, and to play a part in that process of integrating these concerns into ongoing management decisions. That's the biggest piece. It's what you can do with this information, as you move it from data to information to decisions.

Sam Charrington: [00:51:29] Lucas, how about from your perspective, as you look at this from both a technical and research standpoint, but also as someone managing and interacting with this portfolio of innovators that are working in this space? What are you excited about?

Lucas Joppa: [00:51:48] Well, ultimately the future I see, and the way that we've structured the whole program, is that what we think the world fundamentally needs, what society needs, is the ability to query the planet by X, Y, and T. We need to be able to ask questions just like we ask some potentially-

Sam Charrington: [00:52:10] No zed?

Lucas Joppa: [00:52:10] What's that?

Sam Charrington: [00:52:11] No zed?

Lucas Joppa: [00:52:12] No zed. Well, I was actually speaking with my team the other day and I had sent a slide that said X, Y, T, apostrophe, Z, and I said, "Stretch goal." So yeah, when we get the zed dimension, then I can retire. But no, I think ultimately that's where we need to go. We need to be able to allow people to ask, for any particular piece of land or water, what was there? What's there now? What could be there? And empower policy makers to figure out what should be there. We're far from that. Now, Microsoft has always had an approach of empowering an ecosystem of customers and partners. We don't look at the world and say, "Oh, you have to buy into my X, Y, T vision." We don't see that as some fantastical crystal ball that the world spins around and taps on; we see it as a constellation of services and products and solutions brought by all sectors. What we're looking to do is engage with the SilviaTerras of the world; unfortunately, there are far too few at the moment.
Engage with those that are there, bring up the next generation, and the next and the next, until eventually there's a self-supporting community. We talk about being born digital; I think about being born Machine Learning: these organizations where it's just baked into their DNA, but the organization doesn't exist because of Machine Learning. It exists because of the challenges that we face in the environmental space. They're just capable of ingesting Machine Learning approaches natively and efficiently, and of treating space and time as first-class data citizens in this world of Machine Learning.

Sam Charrington: [00:54:07] Fantastic. Well, Lucas and Zack, thanks so much for taking the time to chat with me.

Lucas Joppa: [00:54:13] Thank you. It was a pleasure.

Zack Parisa: [00:54:14] Yeah. Thanks Sam. Appreciate it.