Akshita is a Senior Research Engineer on the AllenNLP team, involved in R&D for natural language processing (NLP). Most recently, Akshita has been working on the OLMo project, where she has contributed to pretraining dataset construction, model training and inference, and evaluation tools and benchmarks. She has also worked on open-source libraries such as allennlp and ai2-tango. Akshita graduated with a Master's degree in Computer Science from the University of Massachusetts Amherst in 2020, where she worked with Prof. Mohit Iyyer at the intersection of NLP and digital humanities. Previously, Akshita worked at Cerebellum Capital (Summer 2019) and at InFoCusp (2015-2018), where she worked on building a data science platform. In her spare time, Akshita enjoys reading novels, writing (especially poetry), and dancing.
Ben Zhao is the Neubauer Professor of Computer Science at the University of Chicago. He completed his PhD at Berkeley (2004) and his BS at Yale (1997). He is an ACM Distinguished Scientist and a recipient of the NSF CAREER award, MIT Technology Review's TR-35 Award (Young Innovators Under 35), ComputerWorld Magazine's Top 40 Tech Innovators award, a Google Faculty award, and the IEEE ITC Early Career Award. His work has been covered by media outlets such as Scientific American, the New York Times, the Boston Globe, the LA Times, the Wall Street Journal, MIT Tech Review, and Slashdot. He has published more than 160 papers in the areas of security and privacy, machine learning, networked systems, Internet measurement, and HCI. He served as program (co-)chair for the World Wide Web Conference (WWW 2016) and the ACM Internet Measurement Conference (IMC 2018), and as general co-chair for ACM HotNets 2020. Over the years, Ben has followed his own interests, pursuing research problems that he finds intellectually interesting and meaningful. That has led him to work on a sequence of areas spanning P2P networks, online social networks, SDR/open spectrum systems, graph mining and modeling, user behavior analysis, and adversarial machine learning. Since 2016, he has mostly worked on security and privacy problems in machine learning and mobile systems.
Richard Zhang is a Senior Research Scientist at Adobe Research, with interests in computer vision, deep learning, machine learning, and graphics. He obtained his PhD in EECS, advised by Professor Alexei A. Efros, at UC Berkeley in 2018. He graduated summa cum laude with BS and MEng degrees from Cornell University in ECE. He received an Adobe Research Fellowship in 2017 and was recognized as an Innovator Under 35 by the MIT Technology Review in 2023. More information can be found on his webpage: http://richzhang.github.io/.
Atul Deo is the General Manager of the AWS Machine Learning team. In this role, he owns product management and engineering for services based on foundation models (FMs). Atul has been with Amazon since 2014. Atul joined the Corporate Development team where he led multiple high-impact acquisitions for Amazon. In 2018, Atul joined the AWS Machine Learning team as a product management leader. Subsequently, he has helped launch and grow multiple AWS services such as Amazon CodeWhisperer, Amazon Transcribe, and Contact Lens for Amazon Connect. Prior to Amazon, Atul worked at Yahoo and started his career as a software developer.
Nicholas is a research scientist at Google Brain working at the intersection of machine learning and computer security. His most recent line of work studies properties of neural networks from an adversarial perspective. He received his Ph.D. from UC Berkeley in 2018, and his B.A. in computer science and mathematics (also from UC Berkeley) in 2013. Generally, he is interested in developing attacks on machine learning systems; most of his work develops attacks demonstrating security and privacy risks of these systems.
Today we're joined by Anima Anandkumar, Bren Professor of Computing and Mathematical Sciences at Caltech and Senior Director of AI Research at NVIDIA. In our conversation, we take a broad look at the emerging field of AI for Science, focusing on both practical applications and longer-term research areas. We discuss the latest developments in protein folding and how much the area has evolved since we first discussed it on the podcast in 2018, the impact of generative models and stable diffusion on the space, and the application of neural operators. We also explore the ways in which prediction models like weather models could be improved, how foundation models are helping to drive innovation, and finally, we dig into MineDojo, a new framework built on the popular Minecraft game for embodied agent research, which won a 2022 Outstanding Paper Award at NeurIPS.
Back in the fall of 2018, we conducted a series of interviews with some of the people behind the large-scale ML platforms at organizations like Facebook, Airbnb, LinkedIn, OpenAI and more. That series of interviews turned into the first volume of our AI Platforms podcast series, led to the publication of The Definitive Guide to Machine Learning Platforms ebook, and ultimately to us launching the first TWIMLcon: AI Platforms conference in San Francisco the following fall. The first of those interviews was with Aditya Kalro, an engineering manager at Facebook. Aditya walked us through FBLearner Flow, the home-grown machine learning platform used at the company. Sam and Aditya recently reconnected for a webcast we held on The Evolution of Machine Learning Platforms at Facebook. Check out the replay and our highlights from the discussion below. https://www.youtube.com/embed/I0E43Up2L7k

Beyond Model Training
In the early days, FBLearner was largely focused on model training. The team had a strong appreciation for the importance of experimentation for data scientists and engineers and was really focused on building solid experiment management and collaboration capabilities into the tool. Eventually, they realized a need to create infrastructure and tooling for the entire machine learning lifecycle. This meant that they had to design the platform to support everything from data ingestion to feature development to data preparation, all the way through to model serving. "The big bang for the buck in AI is really data and features - we had zero tooling for it at the time and that had to change." - Aditya Kalro. Aditya's commentary on the importance of tooling and support for data labeling and feature management echoes what we've heard so far in our recent series of interviews and panel discussion on Data-Centric AI.

Investing in the Data Side of MLOps
The team invested in building out several features in support of the data side of MLOps. They added new workflows to support both manual (human) and automated (machine-only and human-in-the-loop) labeling. They also built a "feature store," which for them was a marketplace for features that anybody in the organization could discover and use.

ML Model Deployment Strategies
In addition to working on data, they also began to put a big focus on allowing users to easily find and use specialized hardware such as GPUs for distributed training as well as production inference. Also on the deployment side, Aditya shared how they built a set of high-level abstractions that allowed them to have rules such as "if Model 2 performs better than Model 1, then promote Model 2 to succeed Model 1" (a minimal sketch of such a rule appears at the end of this recap). We could probably do a whole talk just on the mechanics of challenger models, shadow models, and model promotion and rollback procedures. If you'd be interested in learning more about this topic, hit reply below and let us know!

Applying DevOps Lessons to ML Model Development
Next, they worked on treating ML development more like the way their team was already handling software development. They built systems and processes to allow for faster model build and release cycles that could support faster retraining. They also implemented more monitoring and debugging tooling. With more insight into the data and more trackability of the build systems, they were able to achieve better data and model lineage, that is, tracking where data came from and which models were using it in what experiments. All this contributed to better auditability, reproducibility, and governance.
As these processes became more systematic, they pulled in their security teams and worked on improving data security and isolation so that models only had access to the data they needed and nothing else.

Key Lessons Learned in Four Years of Platform Evolution
Early in his presentation, Aditya shared a few key design principles that guided the team in their journey:
Reusability: Making the system, data, artifacts, and workflows more reusable and composable so that they could make use of prior work and spend less time redoing work;
Ease of use: They wanted to create tools that were easy to use, so they invested heavily in their APIs and UIs;
Scale: They focused on creating infrastructure that allowed them to train, evaluate, and run experiments at scale.
To close out the talk, Aditya shared some of the key lessons learned through their platform evolution:
ML platforms need to support the entire model development lifecycle.
ML platforms must be "modular, not monolithic."
Standardizing data and features was critical to their success.
Evolving your platform requires disrupting yourself. In their case, they did this by pairing infrastructure engineers with ML engineers, which allowed them to continuously evolve the platform to better support their users.
Aditya answered a number of excellent audience questions about containerization, challenges of data standardization, supporting research vs. production teams, building your own tooling vs. leveraging open source, an expanded discussion of their approach to labeling, and more. We want to thank Aditya and Meta for coming on the webcast and we'll look forward to another update soon!
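As a small aside, the promotion-rule abstraction Aditya described can be illustrated with a hypothetical sketch like the one below. This is not FBLearner code; the model names, metric, and threshold are invented, and a real platform would also handle shadow traffic, gradual rollout, and rollback.

from dataclasses import dataclass

@dataclass
class ModelVersion:
    name: str
    eval_metric: float  # e.g., offline AUC or an online engagement metric (hypothetical)

def decide_promotion(champion: ModelVersion, challenger: ModelVersion,
                     min_improvement: float = 0.005) -> ModelVersion:
    # Promote the challenger only if it beats the champion by a safety margin;
    # otherwise keep serving (or roll back to) the current champion.
    if challenger.eval_metric >= champion.eval_metric + min_improvement:
        return challenger
    return champion

current = ModelVersion("model_1", eval_metric=0.842)    # hypothetical numbers
candidate = ModelVersion("model_2", eval_metric=0.851)
print("serving:", decide_promotion(current, candidate).name)  # -> model_2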
Been is interested in helping humans communicate with complex machine learning models: not only by building tools (and tools to criticize them), but also by studying their nature compared to humans. A Quanta Magazine piece (written by John Pavlus) is a great description of what she does and why. She believes the language in which humans and machines communicate must be human-centered, built on higher-level, human-friendly concepts, so that it can make sense to everyone, regardless of how much they know about ML. She gave keynotes at ICLR 2022 (blog post, video TBD), ECML 2020, and the G20 meeting in Argentina in 2018. One of her works, TCAV, received the UNESCO Netexplo award and was featured at Google I/O '19 and in Brian Christian's book The Alignment Problem.
Peter Skomoroch is an entrepreneur, investor, and the former Head of Data Products at Workday and LinkedIn. He was Co-Founder and CEO of SkipFlag, a venture backed deep learning startup acquired by Workday in 2018. Peter is a senior executive with extensive experience building and running teams that develop products powered by data and machine learning. He was an early member of the data team at LinkedIn, the world's largest professional network with over 500 million members worldwide. As a Principal Data Scientist at LinkedIn, he led data science teams focused on reputation, search, inferred identity, and building data products. He was also the creator of LinkedIn Skills and Endorsements, one of the fastest growing new product features in LinkedIn's history. Before joining LinkedIn, Peter was Director of Analytics at Juice Analytics and a Senior Research Engineer at AOL Search. In a previous life, he developed price optimization models for Fortune 500 retailers, studied machine learning at MIT, and worked on Biodefense projects for DARPA and The Department of Defense. Peter has a B.S. in Mathematics and Physics from Brandeis University and research experience in Machine Learning and Neuroscience.
Rafael Gomez-Bombarelli joined the MIT faculty in January 2018. He received a B.S., M.S., and Ph.D. in Chemistry from the Universidad de Salamanca in Spain, followed by postdoctoral work at Heriot-Watt University and Harvard University, after which he was a senior researcher at Kyulux NA applying Harvard-licensed technology to create real-life commercial organic light-emitting diode (OLED) products. Dr. Gomez-Bombarelli's research trajectory has evolved from experimental mechanistic studies of organic molecules, with an emphasis on environmental toxicity, to computer-driven design of molecular materials. By combining first-principles simulation with machine learning on theoretical and experimental datasets, he aims to accelerate the discovery cycle of novel practical materials. Through his research at MIT he plans to address the role of molecular transformation in materials discovery, in areas such as catalyst design, the environmentally minded development of novel and replacement chemicals, and designing for stability in advanced materials. Rafa's work has been featured in outlets such as Technology Review and the Wall Street Journal. He was also a co-founder of Calculario, a materials discovery company that leverages quantum chemistry and machine learning to target advanced materials in a range of high-value markets.
Technical Program Manager with over 20 years of experience in the field as a developer, development lead, and program manager. I have experience presenting to large audiences, I enjoy technically challenging management roles, and I thrive in a fast-paced culture focused on building high-quality products. I started my career in the US in 1998 working on Windows 2000 as a developer. I have shipped Windows 2000, Windows XP, Windows 2003, and Windows 7, working on various technologies from the Print Spooler Service, Active Directory, 64-bit Interop and RPC, TCP/IP and USB device connectivity, and XPS document printing, to Win32 and .NET system APIs. I progressed from developer to development lead and led a team of 6 developers for 3 years. From 2009 to 2014 I worked on Windows Phone as a Senior Program Manager, driving multiple key projects, from defining the process for application certification in the Windows Phone Store to Fast Application Switching and Fast Application Resume, app pre-compilation, deployment of phone apps in the Windows Store, and multitasking for location apps, VOIP, audio, etc. As a Principal Program Manager Lead, I led a team of 5 Program Managers, driving the feature definition for the Windows Phone Execution Model and App Lifecycle, Resource Management, Multitasking, App-to-App, and the Page Navigation Model. From 2014 to 2018 I worked on speech recognition for Cortana on PC, Windows Mixed Reality, and HoloLens experiences. I worked on improving speech recognition accuracy via personalization of users' language models and acoustic models. I was on point for enabling 3rd-party skills for Cortana via voice commands and for enabling voice input for chat bots built with the Microsoft Bot Framework. Since January 2019 I have been working on Artificial Intelligence for Cognitive Services and Computer Vision at Microsoft, driving product requirements and working with engineering and research teams through the product lifecycle. I drove the product requirements and the engineering release for Computer Vision for Spatial Analysis. For more on Spatial Analysis and Azure Cognitive Services see https://azure.microsoft.com/en-us/services/cognitive-services/computer-vision/
Arul Menezes is a Distinguished Engineer at Microsoft and the founder of Microsoft Translator. He has grown it from a small research project in Microsoft Research into one of Microsoft's most successful flagship AI services within the Azure Cognitive Services family, translating 90+ languages and dialects and used by hundreds of millions of consumers and tens of thousands of developers and businesses worldwide. It is also embedded in Microsoft products such as Office, Bing, Windows, and Skype. Arul has 30+ years of deep experience in computer science and software development, and 20+ years in natural language processing and artificial intelligence. In building Microsoft Translator, Arul followed the model of a startup embedded in Microsoft, owning translation from basic research to technology productization, data acquisition, model training, web service and API (99.95% SLA), as well as consumer-facing mobile and PC applications. Neural machine translation is one of the most advanced and demanding of the current wave of AI technologies, regularly modelling terabytes of data. Arul's team recently announced several major breakthroughs. In March 2018, the Translator team announced it had reached parity with professional human translators, a first for MT technology. This was demonstrated using a standard research community test set of Chinese news (translated into English), and all data and evaluation results were released to the research community. In April 2018, the team announced neural offline translation on Android and iOS with translation quality almost matching the cloud. This is the first availability of neural MT models running locally on regular phones. In May 2018, the team announced Custom Translator, enabling, for the first time in the industry, self-service customization of neural machine translation models to customer data and domains. His team has also applied the same technology to a wide variety of AI tasks, including grammatical error correction and natural language understanding.
Naila Murray obtained a BSE in electrical engineering from Princeton University in 2007. In 2012, she received her Ph.D. from the Universitat Autonoma de Barcelona, in affiliation with the Computer Vision Center. She joined Xerox Research Centre Europe in 2013 as a research scientist in the computer vision team, working on topics including fine-grained visual categorization, image retrieval, and visual attention. From 2015 to 2019, she led the computer vision team at Xerox Research Centre Europe and continued to serve in this role after its acquisition and transition to becoming NAVER LABS Europe. In 2019, she became the director of science at NAVER LABS Europe. In 2020, she joined Meta AI's FAIR team, where she served as a senior research manager, and she now serves as the director of AI research at Meta. She has served as area chair for ICLR 2018, ICCV 2019, ICLR 2019, CVPR 2020, ECCV 2020, and CVPR 2022, and as program chair for ICLR 2021. Her current research interests include few-shot learning and domain adaptation.
I am a Research Scientist at Google. Previously, I completed my Ph.D. at Boston University, advised by Professor and Dean of the College of Arts and Sciences Stan Sclaroff. My primary research focus is computer vision and machine learning. I interned at Amazon during the summer of 2021, working with Javier Romero, Timo Bolkart, Ming C. Lin, and Raja Bala. I interned at Apple AI Research during the summers of 2019 and 2020, where I worked with Dr. Barry-John Theobald and Dr. Nicholas Apostoloff. In 2018 I was a spring/summer intern at the NEC-Labs Media Analytics Department, where I worked with Prof. Manmohan Chandraker and Dr. Samuel Schulter. I graduated from Georgia Tech in fall 2017 with an M.Sc. in Computer Science specializing in Machine Learning, advised by Prof. James Rehg at the Center for Behavioral Imaging. Recently, our work DreamBooth was selected for a Student Best Paper Honorable Mention Award at CVPR 2023 (0.25% award rate). I was selected as a Twitch Research Fellowship finalist for the year 2020 and as a second-round interviewee for the Open Phil AI Fellowship. I also appeared on the popular machine learning and AI podcast TWIML AI, talking about my recent work on defending against deepfakes. While on a 5-year valedictorian scholarship, I obtained my B.Sc. and M.Sc. from Ecole Polytechnique in Paris, France. Additionally, I worked as an intern at MIT CSAIL with Dr. Kalyan Veeramachaneni and Dr. Lalana Kagal.
Recognized worldwide as one of the leading experts in artificial intelligence, Yoshua Bengio is most known for his pioneering work in deep learning, earning him the 2018 A.M. Turing Award, “the Nobel Prize of Computing,” with Geoffrey Hinton and Yann LeCun. He is a full professor at Université de Montréal, and the founder and scientific director of Mila – Quebec AI Institute. He co-directs the CIFAR Learning in Machines & Brains program as a senior fellow and acts as scientific director of IVADO. In 2019, he was awarded the prestigious Killam Prize and in 2022, became the most-cited computer scientist in the world. He is a fellow of both the Royal Society of London and Canada, Knight of the Legion of Honor of France, Officer of the Order of Canada, Member of the UN’s Scientific Advisory Board for Independent Advice on Breakthroughs in Science and Technology since 2023 and a Canada CIFAR AI Chair. Concerned about the social impact of AI, he actively contributed to the Montreal Declaration for the Responsible Development of Artificial Intelligence.
Chris has spent over a decade applying statistical learning, artificial intelligence, and software engineering to political, social, and humanitarian efforts. He is the Director of Machine Learning at the Wikimedia Foundation. Previously, he was the Director of Data Science at Devoted Health, Director of Data Science at the Kenyan startup BRCK, co-founded the AI startup Yonder, created the data science podcast Partially Derivative, was the Director of Data Science at the humanitarian non-profit Ushahidi, and was the director of the low-resource technology governance project at FrontlineSMS. He also wrote the Machine Learning with Python Cookbook (O'Reilly, 2018) and created Machine Learning Flashcards.
Chancey Fleet is a 2018-19 Data & Society Fellow and current Affiliate-in-Residence whose writing, organizing and advocacy aims to catalyze critical inquiry into how cloud-connected accessibility tools benefit and harm, empower and expose communities of disability. Chancey is the Assistive Technology Coordinator at the New York Public Library where she founded and maintains the Dimensions Project, a free open lab for the exploration and creation of accessible images, models and data representations through tactile graphics, 3d models and nonvisual approaches to coding, CAD and "visual" arts. Chancey was recognized as a 2017 Library Journal Mover and Shaker.
I am an Assistant Professor with a joint appointment in the departments of Computing Science and Psychology at the University of Alberta. I am a fellow of the Alberta Machine Intelligence Institute and I hold a Canada CIFAR AI Chair (2018-2023). Prior to this, I was faculty at the University of Victoria. My interests are computational linguistics, machine learning, and neuroscience. My work combines all three of these areas to study the way the human brain processes language. Models of language meaning (semantics) are typically built using large bodies of text (corpora) collected from the Internet. These corpora often contain billions of words, and thus cover the majority of the ways words are used. However, to build computer programs that truly understand language, and can understand more rare and nuanced word usage, we need algorithms that can generalize beyond common word usage. By collecting brain images of people reading, we can explore how the human brain handles the complexities of language, which could inspire the next generation of semantic models.
Sarah's work focuses on research and emerging technology strategy for AI products in Azure. Sarah works to accelerate the adoption and positive impact of AI by bringing together the latest innovations in research with the best of open source and product expertise to create new tools and technologies. Sarah is currently leading Responsible AI for Azure Cognitive Services. Prior to joining Cognitive Services, Sarah led the development of responsible AI tools in Azure Machine Learning. She is an active member of the Microsoft AETHER committee, where she works to develop and drive company-wide adoption of responsible AI principles, best practices, and technologies. Sarah was one of the founding researchers in the Microsoft FATE research group and, prior to joining Microsoft, worked on AI fairness at Facebook. Sarah is an active contributor to the open-source ecosystem: she co-founded ONNX, Fairlearn, and OpenDP's SmartNoise, and was a leader in the PyTorch 1.0 and InterpretML projects. She was an early member of the machine learning systems research community and has been active in growing and forming the community. She co-founded the MLSys research conference and the Learning Systems workshops. She has a Ph.D. in computer science from UC Berkeley, advised by Dave Patterson, Krste Asanovic, and Burton Smith.
Dr. Sameer Singh is an Associate Professor of Computer Science at the University of California, Irvine (UCI). He works primarily on the robustness and interpretability of machine learning algorithms, along with models that reason with text and structure for natural language processing. Sameer was a postdoctoral researcher at the University of Washington and received his PhD from the University of Massachusetts, Amherst, during which he interned at Microsoft Research, Google Research, and Yahoo! Labs. He has received the NSF CAREER award, been selected as a DARPA Riser, received the UCI ICS Mid-Career Excellence in Research Award, and held the Hellman and Noyce Faculty Fellowships. His group has received funding from the Allen Institute for AI, Amazon, NSF, DARPA, Adobe Research, the Hasso Plattner Institute, NEC, Base 11, and FICO. Sameer has published extensively at machine learning and natural language processing venues, including paper awards at KDD 2016, ACL 2018, EMNLP 2019, AKBC 2020, and ACL 2020.
Dr. Fitter is an Assistant Professor in the School of Mechanical, Industrial, and Manufacturing Engineering at Oregon State University. Her past degrees include a B.S. and B.A. in mechanical engineering and Spanish from the University of Cincinnati and an M.S.E. and Ph.D. in robotics and mechanical engineering and applied mechanics from the University of Pennsylvania. She completed her doctoral work in the GRASP Laboratory's Haptics Group and was a Postdoctoral Scholar in the University of Southern California Interaction Lab from 2017 to 2018. Her past experiences in industry include fluid modeling and simulation for the Procter & Gamble Oral Care Division and wearable health monitoring device development and evaluation for Microsoft Research. As a member of the Collaborative Robotics and Intelligent Systems (CoRIS) Institute, Dr. Fitter aims to equip robots with the ability to engage and empower people in interactions from playful high-fives to challenging physical therapy routines. Outside of her day job, Dr. Fitter is a semi-professional stand-up comedian. Her soothing midwestern voice has been described as "sexy," "librarian-like," and "nearly inaudible." She has opened for Bil Dwyer, Laurie Kilmartin, and Whitney Cummings and performed in the All Jane Comedy Festival.
Dan has been coding all his life, but it had always been tangential to his career until early 2018 when he caught the machine learning bug and decided to make a career pivot. More recently, Dan has been focusing specifically on NLP. In his free time Dan loves playing all sorts of board games (Dice Forge, Puerto Rico, Stone Age, Seven Wonders, Dominion, etc.) and this Codenames competition ties together so many of Dan's passions!
There are few things I love more than cuddling up with an exciting new book. There are always more things I want to learn than time I have in the day, and I think books are such a fun, long-form way of engaging (one where I won't be tempted to check Twitter partway through). This book roundup is a selection from the last few years of TWIML guests, counting only the ones related to ML/AI published in the past 10 years. We hope that some of their insights are useful to you! If you liked their book or want to hear more about them before taking the leap into longform writing, check out the accompanying podcast episode (linked on the guest's name). (Note: These links are affiliate links, which means that ordering through them helps support our show!)

Adversarial ML
Generative Adversarial Learning: Architectures and Applications (2022), Jürgen Schmidhuber

AI Ethics
Sex, Race, and Robots: How to Be Human in the Age of AI (2019), Ayanna Howard
Ethics and Data Science (2018), Hilary Mason

AI Sci-Fi
AI 2041: Ten Visions for Our Future (2021), Kai-Fu Lee

AI Analysis
AI Superpowers: China, Silicon Valley, and the New World Order (2018), Kai-Fu Lee
Rebooting AI: Building Artificial Intelligence We Can Trust (2019), Gary Marcus
Artificial Unintelligence: How Computers Misunderstand the World (The MIT Press) (2019), Meredith Broussard
Complexity: A Guided Tour (2011), Melanie Mitchell
Artificial Intelligence: A Guide for Thinking Humans (2019), Melanie Mitchell

Career Insights
My Journey into AI (2018), Kai-Fu Lee
Build a Career in Data Science (2020), Jacqueline Nolis

Computational Neuroscience
The Computational Brain (2016), Terrence Sejnowski

Computer Vision
Large-Scale Visual Geo-Localization (Advances in Computer Vision and Pattern Recognition) (2016), Amir Zamir
Image Understanding using Sparse Representations (2014), Pavan Turaga
Visual Attributes (Advances in Computer Vision and Pattern Recognition) (2017), Devi Parikh
Crowdsourcing in Computer Vision (Foundations and Trends® in Computer Graphics and Vision) (2016), Adriana Kovashka
Riemannian Computing in Computer Vision (2015), Pavan Turaga

Databases
Machine Knowledge: Creation and Curation of Comprehensive Knowledge Bases (2021), Xin Luna Dong
Big Data Integration (Synthesis Lectures on Data Management) (2015), Xin Luna Dong

Deep Learning
The Deep Learning Revolution (2016), Terrence Sejnowski
Dive into Deep Learning (2021), Zachary Lipton

Introduction to Machine Learning
A Course in Machine Learning (2020), Hal Daume III
Approaching (Almost) Any Machine Learning Problem (2020), Abhishek Thakur
Building Machine Learning Powered Applications: Going from Idea to Product (2020), Emmanuel Ameisen

ML Organization
Data Driven (2015), Hilary Mason
The AI Organization: Learn from Real Companies and Microsoft's Journey How to Redefine Your Organization with AI (2019), David Carmona

MLOps
Effective Data Science Infrastructure: How to make data scientists productive (2022), Ville Tuulos

Model Specifics
An Introduction to Variational Autoencoders (Foundations and Trends® in Machine Learning) (2019), Max Welling

NLP
Linguistic Fundamentals for Natural Language Processing II: 100 Essentials from Semantics and Pragmatics (2013), Emily M. Bender

Robotics
What to Expect When You're Expecting Robots (2021), Julie Shah
The New Breed: What Our History with Animals Reveals about Our Future with Robots (2021), Kate Darling

Software How To
Kernel-based Approximation Methods Using Matlab (2015), Michael McCourt
This study group will follow the text Reinforcement Learning: An Introduction, Sutton and Barto, 2nd Edition, 2018 MIT Press. The group meets every Sunday at 10 am PT starting on April 11, 2021. The study group slack channel is #rl_2021 (you can join our slack community by clicking on "join us" at twimlai.com/community). Also, the study group leaders are looking for volunteers from the community to help lead the sessions so don’t hesitate to post a note in the channel if you’d like to volunteer.
Getting to know Andrea Banino
Andrea's background as a neuroscientist informed his work in deep learning. At DeepMind, Andrea's research falls in the realm of artificial general intelligence, specifically memory, along with investigating ways to shape deep learning systems so they better mimic the human brain. "I think for us, we have a different sort of memory. We have very long-term memory, we have short-term memory. I argue that agents should be equipped with these sort of different timescale memory."

Introduction to Human and Machine Memory
Human memory can be broadly categorized into two kinds: short-term, sometimes called "working" memory, and long-term memory. Working memory deals with immediate phenomena and manipulates them for other cognitive functions. Tasks like counting, drawing a still life, or putting together a puzzle, where you use recently encountered information to accomplish a goal, involve working memory. Recurrent neural networks and LSTMs are working-memory-equivalent models, which hold information "online" to solve a problem and then usually let it go afterwards. Long-term memory can be further subdivided into episodic and semantic memory. Episodic memory, also called autobiographical memory, catalogues personal experiences and stores them as memories. This differs from semantic memory, which generally stores knowledge and concepts. For example, knowing what a bike looks like and what it does is semantic memory, while remembering a specific bike ride with a friend is stored in autobiographical memory. Andrea's research background is in long-term episodic memory. There isn't a really good long-term memory equivalent in ML models yet, but Andrea and his team have experimented with a few different arrangements.

Long-Term Memory Models
One interesting model Andrea explored is a memory-augmented neural network. This is a neural network connected to an external memory source, which allows it to write previous computations and reuse previous computation procedures when it encounters similar problems. Retrieval-augmented models are another long-term memory equivalent; they have the ability to look things up in their memory. However, unlike human minds, they don't update or reconsolidate their memory based on new information; it's just a constant cycle of check and replicate. Transformer models also seem promising as a substrate for long-term memory. However, Andrea notes that they have so far only been used to model language, so the evidence is still limited. One downside is that transformers are computation-heavy and difficult to scale, so it's definitely an open area of research.

Overfitting, in models and humans
A common critique of deep learning models is that they have a tendency to overfit to their data set and have difficulty generalizing as a result. While this is certainly an issue, Andrea brought up another really interesting point: humans also memorize, and there's always the potential for overfitting as a person. One way evolution has helped prevent against that is by increasing the data set over time, as the set of human experiences our brains pull from increases as we age. Andrea mentioned that even humans are limited in our generalizability, limited by the data we take in. The link between memory and learning is that consistent experience enables generalization, so people take memories and use them to predict the future.
In some ways, our brains aim to minimize uncertainty, and incorporating previously-known information about the environment helps us predict what's going to happen in the future.

Neural Network Navigation Task
In 2018, Andrea and his colleagues published a paper that explored agent navigation via representation. The model they built was programmed to mimic the human hippocampus. To understand what this model looked like, Andrea explained the three types of cells in the hippocampus that work together for spatial analysis. Head direction cells fire when a person is facing a specific direction relative to their environment. Place cells, on the other hand, fire in a specific place, such as the town square or even one's own bedroom. Grid cells fire in a hexagonal lattice format and are theorized to be the cells that allow us to calculate shortcuts. Andrea et al. trained a neural network with models that mimicked each of these three traits. Via experimentation, using methods like dropout and introducing noise, Andrea and his team were able to determine that all three artificial cell types were necessary for successful shortcut navigation. "We managed to make the representation emerge in our neural network, trained it to do path integration, a navigation task. And we proved that that was the only agent able to take a shortcut. So, it was an empirical paper to prove what the grid cells are for."

Ponder Net: an algorithm that prioritizes complexity
Andrea's most recent development is an algorithm called Ponder Net. As a general rule, the amount of computational power required for a neural network to make an inference increases as the size of a model's input (like its feature dimensionality) increases, while the required computational power has no necessary relation to the complexity of a particular problem or prediction. By contrast, the amount of time it takes a human to solve a problem is directly related to the problem's complexity. Ponder Net attempts to create neural networks that budget computational resources based on problem complexity. It does so with the introduction of a halting algorithm, which helps to conserve inference time: if the network is confident about the solution, it can stop calculating early.

How does it work? Pondering steps and the halting algorithm
Ponder Net is based on previous work called adaptive computation time. Adaptive computation time (ACT) minimizes the number of pondering steps with a halting algorithm; in ACT, the algorithm outputs a weighted average of the per-step predictions rather than a single specific prediction. With Ponder Net, the probability of halting is computed for each time step in the sequence. Andrea explained that the probability of halting is a Bernoulli random variable (think coin flip) which tells you the probability of halting at the current step, given that you have not halted at a previous step. From there, Ponder Net calculates a probability distribution by multiplying these conditional probabilities across time steps, forming a proper geometric-style distribution over the halting step. Once it has that, the algorithm can calculate the loss for each prediction in the sequence and weight each loss by the probability of halting at that particular step. Andrea sees Ponder Net as a technique that can be applied in many different architectures, and he tested it on a number of different tasks. The team reported above state-of-the-art performance, and that Ponder Net was able to succeed at extrapolation tests where traditional neural networks fail.
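To make the halting bookkeeping described above concrete, here is a minimal, hypothetical Python sketch (not DeepMind's implementation). It converts per-step conditional halting probabilities into an unconditional distribution over halting steps and uses that distribution to weight the per-step losses; the numbers are invented, and the published method also adds a regularizer toward a geometric prior, which is omitted here.

import numpy as np

def halting_distribution(lambdas):
    # p_n = lambda_n * prod_{m<n} (1 - lambda_m): probability of halting exactly at step n,
    # given per-step conditional halting probabilities lambda_n.
    p, not_halted = [], 1.0
    for lam in lambdas:
        p.append(not_halted * lam)
        not_halted *= (1.0 - lam)
    p = np.array(p)
    return p / p.sum()  # renormalize the truncated distribution

def expected_loss(step_losses, lambdas):
    # Weight each step's loss by the probability of halting at that step.
    return float(np.dot(halting_distribution(lambdas), step_losses))

# Hypothetical 4-step pondering: later steps give better predictions (lower loss),
# and the network grows more willing to halt as it ponders.
step_losses = np.array([0.90, 0.40, 0.15, 0.12])
lambdas = np.array([0.1, 0.3, 0.6, 0.9])
print("halting distribution:", halting_distribution(lambdas))
print("expected loss:", expected_loss(step_losses, lambdas))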
Transformers & Reinforcement Learning
Another project Andrea mentioned was a BERT-inspired combined transformer and LSTM algorithm he published in a recent paper. While LSTMs work great for reinforcement learning tasks, they do suffer from a recency bias which makes them less suited to long-term memory problems. Transformers perform better over a long string of information; however, their reward system is more complicated and they have noisier gradients. Andrea's algorithm applied BERT-style masked training to features from a CNN, which were then reconstructed. [Figure 1 from the CoBERL paper] Combining the LSTM with a transformer reduced the size and increased the speed of the algorithm. Something clever Andrea did was letting the agent choose whether to use the LSTM alone or to combine it with the transformer. "I think there's lots of stuff we can do to improve transformers and memory in general, in reinforcement learning, especially in relation to the length of the context that we can process." Check out the podcast episode to learn more about Ponder Net and reinforcement learning!
Running out of gift ideas and need a little inspiration? The TWIML team has you covered! We put together a Holiday Gift Guide featuring some of our favorite AI-enabled products. It's probably no surprise if you listen to the podcast, but AI has found its way into a bunch of different areas. This is just a small sampling of some of the nifty gadgets and services that caught our attention this holiday season. Surprise the AI enthusiast (or non-enthusiast) in your life with:

The Drone: ActiveTrack + Advanced Pilot Assist System
Sam felt this list wouldn't be complete without at least one drone, and this is the one on his wish list. The DJI Mavic Air 2 has all the usual drone things: good sensors, a nice camera, etc., but what makes the drone unique is the stellar AI software. ActiveTrack 3.0 and Advanced Pilot Assist System 3.0 features allow you to focus on a subject and have it tracked and filmed while the drone is in flight. There's a pretty good review of the drone here.

Personal Trainer for Running
For those of us missing the gym, running is often the only refuge. We like the Vi app because it uses AI to personalize workouts, offer coaching, and give daily challenges, all helpful qualities when you're trying to build healthy habits!

Coding Robots for Kids & Adults
This one is for the kiddos! (And anyone learning to code.) We're fans of this little smart Root Coding Robot that complements any level of coding experience. It's super interactive, with 3 learning levels full of lessons, projects, and activities. If you're curious about how iRobot is using AI, check out this TWIML interview on Re-Architecting Data Science at iRobot with Angela Bassa, the company's Global Head of Analytics & Data Science. For older kids or young-at-heart adults, DJI's Robomaster S-1 is a neat choice too and allows users to program with some simple AI building blocks like person detection.

Data-based Skincare
In an effort to create a skincare program based on data science, Proven established The Skin Genome Project™, which became the most comprehensive skincare database you can find and the winner of MIT's 2018 Artificial Intelligence Award. With this database, which accounts for over 20,000 skincare ingredients, 100,000 products, 8 million testimonials, and even the climate you live in, they're able to curate skin care formulas based on your skin. We hope you enjoy our top picks!
Today we're joined by Roland Memisevic, return podcast guest and Co-Founder & CEO of Twenty Billion Neurons. We last spoke to Roland in 2018, and just earlier this year TwentyBN made a sharp pivot to a surprising use case, a companion app called Fitness Ally, an interactive, personalized fitness coach on your phone. In our conversation with Roland, we explore the progress TwentyBN has made on their goal of training deep neural networks to understand physical movement and exercise. We also discuss how they've taken their research on understanding video context and awareness and applied it in their app, including how recent advancements have allowed them to deploy their neural net locally while preserving privacy, and Roland's thoughts on the enormous opportunity that lies in the merging of language and video processing.
Today we're joined by Mike del Balso, co-Founder and CEO of Tecton. Mike, who you might remember from our last conversation on the podcast, was a foundational member of the Uber team that created their ML platform, Michelangelo. Since his departure from the company in 2018, he has been busy building up Tecton, and their enterprise feature store. In our conversation, Mike walks us through why he chose to focus on the feature store aspects of the machine learning platform, the journey, personal and otherwise, to operationalizing machine learning, and the capabilities that more mature platforms teams tend to look for or need to build. We also explore the differences between standalone components and feature stores, if organizations are taking their existing databases and building feature stores with them, and what a dynamic, always available feature store looks like in deployment. Finally, we explore what sets Tecton apart from other vendors in this space, including enterprise cloud providers who are throwing their hat in the ring.
In 2018, the Global Slavery Index found that there were 40.3 million people in modern slavery, of whom 25 million were in forced labor producing computers, clothing, agricultural products, raw materials, and more, and 15 million were in forced marriage. To facilitate measures to end modern slavery, the United Nations calls on companies to take immediate measures and state clear policies. The Future Society is an independent nonprofit think-and-do tank curating an up-to-date repository of more than 16,000 of those statements. Use your NLP skills to predict whether those statements meet the required standards!
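As a starting point, a simple supervised baseline for this task might look like the sketch below: TF-IDF features over the statement text feeding a linear classifier. This is only an illustrative sketch; the file name, column names, and label field are hypothetical stand-ins, not the actual fields of The Future Society's repository.

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical export of the statement repository with a binary label column.
df = pd.read_csv("statements.csv")
X_train, X_test, y_train, y_test = train_test_split(
    df["statement_text"], df["meets_standard"], test_size=0.2, random_state=0)

# Bag-of-words baseline: TF-IDF unigrams/bigrams followed by logistic regression.
model = make_pipeline(
    TfidfVectorizer(max_features=50000, ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))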
The issue of bias in AI was the subject of much discussion in the AI community last week. The publication of PULSE, a machine learning model by Duke University researchers, sparked a great deal of it. PULSE proposes a new approach to the image super-resolution problem, i.e. generating a faithful higher-resolution version of a low-resolution image. In short, PULSE works by using a novel technique to efficiently search the space of high-resolution artificial images generated using a GAN and identify ones that downscale to the low-resolution input. This is in contrast to previous approaches to solving this problem, which work by incrementally upscaling the low-resolution images and which are typically trained in a supervised manner with low- and high-resolution image pairs. The images identified by PULSE are higher resolution and more realistic than those produced by previous approaches, and without the latter's characteristic blurring of detailed areas. However, what the community quickly identified was that the PULSE method didn't work so well on non-white input images. An example using a low-res image of President Obama was one of the first to make the rounds, and Robert Ness used a photo of me to create this example.

I'm going to skip a recounting of the unfortunate Twitter firestorm that ensued following the model's release. For that background, Khari Johnson provides a thoughtful recap over at VentureBeat, as does Andrey Kurenkov over at The Gradient. Rather, I'm going to riff a bit on the idea of where bias comes from in AI systems. Specifically, in today's episode of the podcast featuring my discussion with AI ethics researcher Deb Raji, I note, "I don't fully get why it's so important to some people to distinguish between algorithms being biased and data sets being biased." Bias in AI systems is a complex topic, and the idea that more diverse data sets are the only answer is an oversimplification. Even in the case of image super-resolution, one can imagine an approach based on the same underlying dataset that exhibits behavior that is less biased, such as by adding additional constraints to a loss or search function or otherwise weighing the types of errors we see here more heavily. See AI artist Mario Klingemann's Twitter thread for his experiments in this direction. Not electing to consider robustness to dataset biases is a decision that the algorithm designer makes. All too often, the "decision" to trade accuracy with regard to a minority subgroup for better overall accuracy is an implicit one, made without sufficient consideration. But what if, as a community, our assessment of an AI system's performance was expanded to consider notions of bias as a matter of course?

Some in the research community choose to abdicate this responsibility by taking the position that there is no inherent bias in AI algorithms and that it is the responsibility of the engineers who use these algorithms to collect better data. However, as a community, each of us, and especially those with influence, has a responsibility to ensure that technology is created mindfully, with an awareness of its impact. On this note, it's important to ask the more fundamental question of whether a less biased version of a system like PULSE should even exist, and who might be harmed by its existence. See Meredith Whittaker's tweet and my conversation with Abeba Birhane on Algorithmic Injustice and Relational Ethics for more on this.
A full exploration of the many issues raised by the PULSE model is far beyond the scope of this article, but there are many great resources out there that might be helpful in better understanding these issues and confronting them in our work. First off there are the videos from the tutorial on Fairness Accountability Transparency and Ethics in Computer Vision presented by Timnit Gebru and Emily Denton. CVPR organizers regard this tutorial as “required viewing for us all.” Next, Rachel Thomas has composed a great list of AI ethics resources on the fast.ai blog. Check out her list and let us know what you find most helpful. Finally, there is our very own Ethics, Bias, and AI playlist of TWIML AI Podcast episodes. We’ll be adding my conversation with Deb to it, and it will continue to evolve as we explore these issues via the podcast. I'd love to hear your thoughts on this. (Thanks to Deb Raji for providing feedback and additional resources for this article!)
Stefan Lee is involved in a number of projects around emergent communication. To hear about more of them, in addition to the ViLBERT model, check out the full interview! TWiML Talk #358 with Stefan Lee. One of the major barriers keeping robots from full immersion in our daily lives is the reality that meaningful human interaction depends on the interpretation of visual and linguistic communication. Humans (and most living creatures) rely on these signals to contextualize, operate, and organize our actions in relation to the world around us. Robots are extremely limited in their capacity to translate visual and linguistic inputs, which is why it remains challenging to hold a smooth conversation with a robot. Ensuring that robots and humans can understand each other sufficiently is at the center of Stefan Lee's research. Stefan is an assistant professor at Oregon State University in the Electrical Engineering and Computer Science Department. Stefan, along with his research team, held a number of talks and presentations at NeurIPS 2019. One highlight of their recent work is the development of a model called ViLBERT (Vision and Language BERT), published in their paper, ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks.

BERT Models for Natural Language Processing
BERT (Bidirectional Encoder Representations from Transformers) is Google's popularized model that has revolutionized natural language processing (NLP). BERT is a language model, but unlike left-to-right models such as GPT-2 (which caused a controversy when released by OpenAI last year) that predict the next word from the words that came before it, BERT is bidirectional: it contextualizes language by reading data from both the left and the right, learning to predict masked-out words from their surrounding context. This builds relationships between words and helps the model make more informed predictions about related words. These models are also pre-trained on large sets of unlabeled data using a transformer architecture, so data sequences don't need to be processed in order, enabling parallel computation. Models like BERT work by taking "a large language corpus and they learn certain little things to build supervision from unlabeled data. They'll mask out a few words and have them re-predict it based on the other linguistic context, or they'll ask if a sentence follows another sentence in text."

Extending BERT to the Visual Domain
ViLBERT is an extension of the BERT technique. To apply the BERT model to vision and language, the team worked with a data set called Conceptual Captions, composed of around 3 million images paired with alt-text. Their method is to mask out random parts of the image and then ask the model to reconstruct the rest of the image given the associated alt-text. "Likewise, we're asking, does this sentence match with this image or not? Or masking out parts of the language and having it reconstruct from the image and the text. We're designing this self-supervised multi-modal task with this large weakly supervised data source."

Visual Grounding
Stefan describes that "Most of natural language processing is just learning based on its association with other words. Likewise on the visual side, you're learning to represent some sparse set of classes.
Those classes often relate to specific nouns, but they don't have a sense of closeness, so there's no idea that the feature for a cat should be close to the feature for a tiger… The point of ViLBERT is to try to learn these associations between vision and language directly. This is something we usually call visual grounding of a word." Humans do this naturally because we often have the inherent context to imagine a visual representation of something. For example, the words "wolf head" might bring up certain imagery of wolves, but machines lack the same visual associations. What Stefan is working towards with ViLBERT is to present the agent with something like "red ball" and have it interpret that to reconstruct an image of a red ball.

How ViLBERT Works: Bounding Boxes and Co-Attentional Transformer
In object detection, bounding boxes are used to describe the target area of the object being observed. For the purposes of ViLBERT, an image is dissected as a set of bounding boxes that are independent of each other. Each box is given a positional encoding that shows where the box was pulled from, but ultimately the order of the sequence is not important. BERT works in the same way: you have a sequence of words (a sentence) that are treated as independent inputs and given a positional embedding. "It's just a set. It's an unordered set. In fact, the actual input API for BERT looks the same in our model for the visual side and the linguistic side." The bounding boxes are output by an R-CNN model trained on Visual Genome. The R-CNN model "can produce quite a lot of bounding boxes and you can sample from it if you'd like to increase the randomness." Something to note is that many of the bounding boxes are not well aligned and the data can come back fairly noisy. "Sometimes you'll have an object like a road… it doesn't do a great job of honing in on specific objects." While the model is not trained on the visuals from scratch, it still has to learn the association even when it might not be obvious. To train the model, certain parts of the image (bounding boxes) are removed, and the model is asked about the alignment between the alt-text and the image. The distinction from BERT is that "In BERT, it's a sentence and the next sentence, and you're predicting whether one comes after the other, but in our case, it's an image and a sentence, and we're asking does this align or not? Is this actually a pair from Conceptual Captions?" At the end of the process, "what we get is a model that has built some representations that bridge between vision and language," which can then be fine-tuned to fit a variety of other tasks.

Applications and Limitations
Since the ViLBERT paper, the model has been applied to around a dozen different vision and language tasks. "You can pre-train this and then use it as a base to perform fairly well, fairly quickly, on a wide range of visual and language reasoning tasks." In addition to fine-tuning for specific task types, you can adjust for specific types of data or data-language relationships. One useful adaptation is for Visual Question Answering (VQA), to help those who are visually impaired ask questions about the world around them and receive descriptive answers in return. To modify the model for VQA, you could feed the questions as the text inputs and "train an output that predicts my subset of answers." ViLBERT is pre-trained on a dataset of images and captions as the text input.
For VQA, you would use the questions as the text input and have the model reconstruct answers as the output. While ViLBERT is a solid starting point for the field, Stefan notes that the grounding component of the research is still underdeveloped. For example, if the model is trained for VQA on a limited dataset like COCO images, there may be objects that are not accounted for because the machine never learned they existed. "One example that I like to show from a recent paper is that [COCO images] don't have any guns. If we've fed this caption trained on COCO with an image of a man in a red hat with a red shirt holding a shotgun, and the caption is a man in a red hat and a red shirt holding a baseball bat, because he's wearing what looks like a baseball uniform and he's got something in his hands. It might as well be a baseball bat. If we talk back to these potential applications of helping people with visual impairment, that kind of mistake doesn't seem justifiable."

Future Directions for Visual Grounding
One related area of research that Stefan has started to branch into is the interpretation of motion. The problem with images is that it can often be difficult to distinguish between active behaviors and stationary behaviors. "For a long time in the community, grounding has been on static images, but there's actually a lot of concepts that rely on motion, that rely on interaction to ground. I could give you a photo and you could tell me it looks like people are talking, but they could just be sitting there quietly as well." There is less emphasis on the interaction, which is a key element not only to understanding communication, but to accuracy in reading a social situation. Machines are not yet able to catch on to these distinctions, and it's a growing area of interest for Stefan. For more on what Stefan is up to, be sure to check out the full interview with Stefan Lee for a deeper understanding of ViLBERT and the numerous projects the team is working on.
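To make the two-stream, alignment-prediction idea above a bit more concrete, here is a toy, hypothetical sketch in Python (PyTorch). It is not the actual ViLBERT implementation: the real model uses co-attentional transformer layers between the streams and masked word/region objectives, while this sketch only encodes region features and tokens in separate streams and scores whether an (image, caption) pair is aligned. All dimensions and names are invented for illustration.

import torch
import torch.nn as nn

class TwoStreamAlignmentToy(nn.Module):
    def __init__(self, region_dim=2048, vocab_size=30522, hidden=256, layers=2):
        super().__init__()
        self.region_proj = nn.Linear(region_dim, hidden)   # project R-CNN box features
        self.token_emb = nn.Embedding(vocab_size, hidden)  # word-piece style embeddings
        make_stream = lambda: nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=hidden, nhead=4, batch_first=True),
            num_layers=layers)
        self.visual_stream = make_stream()
        self.text_stream = make_stream()
        self.align_head = nn.Linear(hidden, 1)             # logit: is this a true pair?

    def forward(self, region_feats, token_ids):
        v = self.visual_stream(self.region_proj(region_feats)).mean(dim=1)  # pool over boxes
        t = self.text_stream(self.token_emb(token_ids)).mean(dim=1)         # pool over tokens
        return self.align_head(v * t).squeeze(-1)

# Usage: a batch of 8 (image, caption) pairs, 36 boxes and 20 tokens each;
# labels mark which pairs truly belong together (random here, for illustration only).
model = TwoStreamAlignmentToy()
regions = torch.randn(8, 36, 2048)
tokens = torch.randint(0, 30522, (8, 20))
labels = torch.randint(0, 2, (8,)).float()
loss = nn.functional.binary_cross_entropy_with_logits(model(regions, tokens), labels)
loss.backward()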
We're excited to share our third annual Black in AI series! When you're done with this year's series, make sure you check out the previous BAI series: Black in AI 2019 - Black in AI 2018.
How does LinkedIn allow its data scientists to access aggregate user data for exploratory analytics while maintaining its users' privacy? That was the question at the heart of our recent conversation with Ryan Rogers, a senior software engineer in data science at the company. The answer, it turns out, is through differential privacy, a topic we've covered here on the show quite extensively over the years. Differential privacy is a system for publicly sharing information about a dataset by describing patterns of groups within the dataset; the catch is that you have to do this without revealing information about individuals in the dataset (that's the privacy part). Ryan currently applies differential privacy at LinkedIn, but he has worked in the field, and on the related topic of federated learning, for quite some time. He was introduced to the subject as a PhD student at the University of Pennsylvania, where he worked closely with Aaron Roth, who we had the pleasure of interviewing back in 2018. Ryan later worked at Apple, where he focused on the local model of differential privacy, meaning differential privacy is performed on individual users' local devices before being collected for analysis. (Apple uses this, for example, to better understand our favorite emojis 🤯 👍👏). Not surprisingly, they do things a bit differently at LinkedIn. They utilize a central model, where the user's actual data is stored in a central database, with differential privacy applied before the data is made available for analysis. (Another interesting use case that Ryan mentioned in the interview: the U.S. Census Bureau has announced plans to publish 2020 census data using differential privacy.) Ryan recently put together a research paper with his LinkedIn colleague, David Durfee, that they presented as a spotlight talk at NeurIPS in Vancouver. The title of the paper is a bit daunting, but we break it down in the interview. You can check out the paper here: Practical Differentially Private Top-k Selection with Pay-what-you-get Composition. There are two major components to the paper. First, they wanted to offer practical algorithms that you can layer on top of existing systems to achieve differential privacy for a very common type of query: the "Top-k" query, which means helping answer questions like "what are the top 10 articles that members are engaging with across LinkedIn?" Secondly, because privacy is reduced when users are allowed to make multiple queries of a differentially private system, Ryan's team developed an innovative way to ensure that their systems accurately account for the information the system returns to users over the course of a session. It's called Pay-what-you-get Composition. One of the big innovations of the paper is discovering the connection between a common algorithm for implementing differential privacy, the exponential mechanism, and Gumbel noise, which is commonly used in machine learning. "One of the really nice connections that we made in our paper was that actually the exponential mechanism can be implemented by adding something called Gumbel noise, rather than Laplace noise. Gumbel noise actually pops up in machine learning. It's something that you would do to report the category that has the highest weight, [using what is] called the Gumbel Max Noise Trick. It turned out that we could use that with the exponential mechanism to get a differentially private algorithm. [...]
Typically, to solve top-k, you would use the exponential mechanism k different times; you can now do this in one shot by just adding Gumbel noise to [existing algorithms] and report the k values that are in the top [...] which made it a lot more efficient and practical." When asked what he was most excited about for the future of differential privacy, Ryan cited the progress in open source projects. "This is the future of private data analytics. It's really important to be transparent with how you're doing things, otherwise if you're just touting that you're private and you're not revealing what it is, then is it really private?" He pointed out the open-source collaboration between Microsoft and Harvard's Institute for Quantitative Social Sciences. The project aims to create an open-source platform that allows researchers to share datasets containing personal information while preserving the privacy of individuals. Ryan expects such efforts to bring more people to the field, encouraging applications of differential privacy that work in practice and at scale. Listen to the interview with Ryan to get the full scope! And if you want to go deeper into differential privacy, check out our series of interviews on the topic from 2018. Thanks to LinkedIn for sponsoring today's show! LinkedIn Engineering solves complex problems at scale to create economic opportunity for every member of the global workforce. AI and ML are integral aspects of almost every product the company builds for its members and customers. LinkedIn's highly structured dataset gives their data scientists and researchers the ability to conduct applied research to improve member experiences. To learn more about the work of LinkedIn Engineering, please visit engineering.linkedin.com/blog.
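To make the Gumbel-noise connection above concrete, here is a small sketch of the one-shot idea: add Gumbel noise to each item's count and report the k highest noisy values. Everything here (the crude split of epsilon across the k selections, the article counts, the function name) is an illustrative assumption; the actual paper adds a threshold and the pay-what-you-get composition accounting that this sketch omits.

```python
# Simplified sketch of differentially private top-k via the Gumbel-max trick.
# This mirrors the exponential-mechanism / Gumbel-noise connection described in
# the interview; it is not the paper's full algorithm.
import numpy as np

def gumbel_top_k(counts: dict, k: int, epsilon: float, sensitivity: float = 1.0):
    """Return k item names, selected after adding Gumbel noise scaled to the budget."""
    rng = np.random.default_rng()
    items = list(counts.keys())
    values = np.array([counts[i] for i in items], dtype=float)
    # Noise scale: each of the k selections spends roughly epsilon / k
    # (a deliberately crude accounting choice, for illustration only).
    scale = 2.0 * sensitivity * k / epsilon
    noisy = values + rng.gumbel(loc=0.0, scale=scale, size=len(values))
    top = np.argsort(noisy)[::-1][:k]
    return [items[i] for i in top]

article_views = {"article_a": 1200, "article_b": 950, "article_c": 940, "article_d": 15}
print(gumbel_top_k(article_views, k=2, epsilon=1.0))
```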
Today we close out AI Rewind 2019 joined by Amir Zamir, who recently began his tenure as an Assistant Professor of Computer Science at the Swiss Federal Institute of Technology. Amir joined us on the podcast back in 2018 to discuss his CVPR Best Paper winner, and in today's conversation, we continue with the thread of Computer Vision. In our conversation, we discuss quite a few topics, including Vision-for-Robotics, the expansion of the field of 3D Vision, Self-Supervised Learning for CV Tasks, and much more! Check out the rest of the series at twimlai.com/rewind19! We want to hear from you! Send your thoughts on the year that was 2019 below in the comments, or via Twitter at @samcharrington or @twimlai!
Sam Charrington: Hey, what's up everyone? This is Sam. A quick reminder that we've got a bunch of newly formed or forming study groups, including groups focused on Kaggle competitions and the fast.ai NLP and Deep Learning for Coders part one courses. It's not too late to join us, which you can do by visiting twimlai.com/community. Also, this week I'm at re:Invent and next week I'll be at NeurIPS. If you're at either event, please reach out. I'd love to connect. All right. This week on the podcast, I'm excited to share a series of shows recorded in Orlando during the Microsoft Ignite conference. Before we jump in, I'd like to thank Microsoft for their support of the show and their sponsorship of this series. Thanks to decades of breakthrough research and technology, Microsoft is making AI real for businesses with Azure AI, a set of services that span vision, speech, language processing, custom machine learning, and more. Millions of developers and data scientists around the world are using Azure AI to build innovative applications and machine learning models for their organizations, including 85% of the Fortune 100. Microsoft customers like Spotify, Lexmark, and Airbus choose Azure AI because of its proven enterprise-grade capabilities and innovations, wide range of developer tools and services, and trusted approach. Stay tuned to learn how Microsoft is enabling developers, data scientists, and MLOps and DevOps professionals across all skill levels to increase productivity, operationalize models at scale, and innovate faster and more responsibly with Azure machine learning. Learn more at aka.ms/azureml. All right, onto the show! Erez Barak: [00:02:06] Thank you. Great to be here with you, Sam. Sam Charrington: [00:02:08] I'm super excited about this conversation. We will be diving into a topic that is generating a lot of excitement in the industry and that is Auto ML and the automation of the data science process. But before we dig into that, I'd love to hear how you got started working in ML and AI. Erez Barak: [00:02:30] It's a great question because I've been working with data for quite a while. And I think roughly about five to 10 years ago, it became apparent that the next chapter for anyone working with data has to weave itself through the AI world. The world of opportunity with AI is really only limited by the amount of data you have, the uniqueness of the data you have and the access you have to data. And once you're able to connect those two worlds, a lot of things like predictions, new insights, new directions, sort of come out of the woodwork. So seeing that opportunity, imagining that potential, has naturally led me to work with AI. I was lucky enough to join the Azure AI group, and there are really three focal areas within that group. One of them is machine learning. How do we enable data scientists of all skills to operate through the machine learning lifecycle, starting from the data to the training, to registering the models to putting them in production and managing them, a process we call ML Ops. So just looking at that end to end and understanding how we enable others to really go through that process in a responsible, trusted, and known way has been a super exciting journey so far. Sam Charrington: [00:03:56] And so do you come at this primarily from a data science perspective, a research perspective, an engineering perspective? Or none of the above? Or all of the above? Erez Barak: [00:04:07] I'm actually going to go with all of the above.
I think it'd be remiss to think that if you come at this only from a data science perspective, and you're trying to build a product and really looking to build the right set of products for people to use as they go through their AI journey, you'd probably miss out on an aspect of it. If you just think about the engineering perspective, you'll probably end up with great info that doesn't align with any of the data science. So you really have to think between the two worlds and how one empowers the other. You really have to figure out where most data scientists of all skills need the help, want the help, are looking for tools and products and services on Azure to help them out, and I think that's the part I find most compelling. Sort of figuring that out and then really going deep where you landed, right? 'Cause if we end up building a new SDK, we're going to spend a whole lot of time with our data science customers, our data science internal teams and figure out, "Well, what should our SDK look like?" But if you're building something like Auto ML that's targeted not only at the deeper data scientist, but also the deeper rooted data professionals, you're going to spend some time with them and understand not only what they need, but also how that applies to the world of data science. Sam Charrington: [00:05:27] And what were you working on before Azure AI? Erez Barak: [00:05:31] So before Azure AI, in Microsoft, I worked for a team called Share Data, which really created a set of data platforms for our internal teams. And prior to joining Microsoft, I worked in the marketing automation space, at a company called Optify. And again, the unique assets we were able to bring to the table as part of Optify in the world of marketing automation were always data based. We were always sort of looking at the data assets the marketers had and asking, "What else can we get out of it?" Machine learning wasn't as prevalent at the time, but you could track back to a lot of what we did at that time and how machine learning would've helped if it had been used on such a general basis. Sam Charrington: [00:06:12] Yeah, one of the first machine learning use cases that I worked with was with folks that were trying to do lead scoring and likelihood to buy, propensity to buy types of use cases. I mean, that's been going on for a really long time. Erez Barak: [00:06:30] So we're on a podcast so you can't see me smiling, but we did a lot of work around building lead scoring...and heuristics and manual heuristics, and general heuristics, and heuristics that the customer could customize. And today, you've seen that really evolve to a place where there's a lot of machine learning behind it. I mean, it's perfect for machine learning, right? You've got all this data. It's fresh. It's coming in new. There are insights that are really hard to find out. Once you start slicing and dicing it by regions or by size of customers, it gets even more interesting, so it has all the makings for machine learning to really make it shine. Sam Charrington: [00:07:07] Yeah, you are getting pretty excited, I think. Erez Barak: [00:07:08] Oh, no, no, no. It's a sweet spot there. Yes. Sam Charrington: [00:07:12] Nice. You want to dive into talking about Auto ML? For the level of excitement and demand for Auto ML and enthusiasm that folks have for the topic, not to mention the amount of confusion that there is around the topic, I've probably not covered it nearly enough on the podcast.
Certainly when I think of Auto ML, there's a long academic history behind the technical approaches that drive it. But it was really popularized for many with Google's Cloud Auto ML in 2018, and before that they had this New York Times PR win that was a New York Times article talking about how AI was going to create itself, and I think that contributed a lot of, for lack of a better term, hype in this space, but then we see it all over the place. There are other approaches more focused on citizen data science. I'd love to just start with how you define Auto ML and what's your take on it as a space and its role and importance, that kind of thing. Erez Barak: [00:08:42] Yeah, I really relate to many of the things you touched on. So maybe I'll start - and this is true for many things we do in Azure AI but definitely for Auto ML - on your point around academic roots. Microsoft has this division called MSR, Microsoft Research, and it's really a set of researchers who look into bleeding edge topics and drive the world of research in different areas. And that is when we first got, in our team, introduced to Auto ML. So a subset of that team has been doing research around the Auto ML area for quite a few years. They've been looking at it, they've been thinking about it. Yes, I've heard the sentence, "AI making AI." That's definitely there. But when you start reading into it, like, what does it mean? To be honest, it means a lot of things to many people. It's quite overused. I'll be quite frank. There's no one industry standard definition that says, "Hmm, here's what Auto ML is." I can tell you what it is for us. I can tell you what it is for our customers. I can tell you where we're seeing it make a ton of impact. And it comes to using machine learning capabilities in order to help you, being the data scientist, create machine learning capabilities in a more efficient, in a more accurate, in a more structured fashion. Sam Charrington: [00:10:14] My reaction to that is that it's super high level. And it leaves the door open for all of this broad spectrum of definitions that you just talked about. For example, not to over index on what Google's been doing, but Cloud Auto ML Vision when it first came out was a way for folks to do vision cognitive services, but use some of their own data to tune it. Right? Which is a lot different. In fact, they caught a lot of flack from the academic Auto ML community because they totally redefined what that community had been working on for many years and started creating confusion. Maybe a first question is, do you see it as being a broad spectrum of things, or how do we even get to a definition that separates the personalized cognitive services trained with my own data from this other set of things? Erez Barak: [00:11:30] I think you described it in more of that general sense, so I would say probably not. I see it as a much more concrete set of capabilities that adhere to a well-known process that actually is agreed upon across the industry. When you build a model, what do you do? You get data, you featurize that data. Once the features are in place, you choose a learner, you choose an algorithm. You train that algorithm with the data, creating a model. At that point, you want to evaluate the model, make sure it's accurate. You want to get some understanding of what are the underlying features that have most affected the model.
And you want to make sure, in addition, that you can explain that the model is not biased, you can explain that the model is really fair towards all aspects of what it's looking at. That's a well-known process. I think there's no argument around that in the machine learning field; that's sort of the end to end. Auto ML allows automating that process. So at its purest, you feed Auto ML the data and you get the rest for free, if you may. Okay? That would be sort of where we're heading, where we want to be. And I think that's at the heart of Auto ML. So, where does the confusion start? I could claim that what we or others do for custom vision follows that path, and it does. I can also claim that some of what we do for custom vision is automated. And then there's the short hop to say, "Well, therefore it is Auto ML." But I think that misses the general point of what we're trying to do with Auto ML. Custom vision is a great example where Auto ML can be leveraged. But Auto ML can be leveraged wherever that end to end process happens in machine learning. Sam Charrington: [00:13:27] Nice. I like it. So maybe we can walk through that end to end process and talk about some of the key areas where automation is applied to contribute to Auto ML. Erez Barak: [00:13:44] So I'd like to start with featurization. At the end of the day, we want an accurate model. A lot of that accuracy, a lot of the insights we can get, the predictions we can get, and the output we can get from any model really hinges on how effective your featurization is. So many times you hear that, "Well, data scientists spend 80% of their time on data." Can I put a pin on that: do you know where that number comes from? Oh, of course. Everyone says that's the number, everyone repeats it. It's a self-fulfilling prophecy. I'm going to say 79% of it just to be sure. But I think it's more of an urban legend at that point. I am seeing customers who do spend that kind of percentage of their time. I am seeing experiments rerun that take that amount of time. But generalizing that number is just going too far at this point. Sam Charrington: [00:14:42] I was thinking about this recently, and wondering if there's some institute for data science that's been tracking this number over time. It would be interesting to see how it changes over time, I think, is the broader curiosity. Erez Barak: [00:14:55] It would. I should go figure that out. [laughs] So anyone who builds a model can quickly see the effect of featurization on the output. Now, a lot of what's done when building features can be automated. I would even venture to say that a part of it can be easily automated. Sam Charrington: [00:15:24] What are some examples? Erez Barak: [00:15:25] Some examples are like, "I want to take two columns and bring them together into one." "I want to change a date format to better align with the rest of my columns." And even an easy one, "I'd like to enhance my data with some public holiday data when I do my sales forecasting because that's really going to make it more accurate." So it's more data enhancement, but you definitely want to build features into your data to do that. So getting that right is key. Now start thinking of data sets that have many rows, but more importantly have many columns. Okay? And then the problem gets harder and harder. You want to try a lot more options. There are a lot more ways of featurizing the data. Some are more effective than others. For example, we recently, in Auto ML, have incorporated the BERT model into our auto featurization capability.
Now that allows us to take text data we use for classification and quickly featurize it. It helps us featurize it in a way that requires less input data to come in for the model to be accurate. I think that's a great example of how deep and how far that can go. Sam Charrington: [00:16:40] You mentioned that getting that featurization right is key. To what extent is it an algorithmic and methodological challenge versus a computational challenge? If you can even separate these two. Meaning, there's this trade off between... Like we've got this catalog of recipes like combining columns and bending things and whatever that we can just throw at a data set that looks like it might fit. Versus more intelligent or selective application of techniques based on nuances, whether pre-defined or learned, about the data. Erez Barak: [00:17:28] So it extends on a few dimensions. I would say there are techniques. Some require more compute than others. Some are easier to get done. Some require a deeper integration with existing models, like the BERT model I mentioned before, to be effective. But that's only one dimension. The other dimension is the fit of the data into a specific learner. So we don't call it experiments in machine learning for nothing. We experiment, we try. Okay? Nobody really knows exactly which features would affect the model in a proper way, would drive accuracy. So there's a lot of iteration and experimentation being done. Now think of this place where you have a lot of data, creating a lot of features, and you want to try multiple learners, multiple algorithms if you may. And that quickly becomes quite a mundane process that automating can really, really help with. And then add on top of that, we're seeing more and more models created with just more and more features. The more features you have, the more nuanced you can get about describing your data. The more nuanced the model can get about predicting what's going to happen next. We're now seeing models with millions and billions of features coming out. Now, Auto ML is not yet prepared to deal with the billion feature model, but we see that dimension extend. So extend compute, extend the number of iterations you would have, extend the number of features you have. Now you've got a problem that's quickly going to be referred to as mundane. Hard to do. Repetitive. Doesn't really require a lot of imagination. Automation just sounds perfect for that. So that's why one of the things we went after in the past, I'd say, six to twelve months is how we get featurization to a place where you do a lot of auto featurization. Sam Charrington: [00:19:22] I'm trying to parse the extent to which, or whether, you agree with this dichotomy that I presented. You've got this mundane problem that, if a human data scientist were doing it, would be just extremely iterative, and certainly one way of automating is to just do that iteration a lot quicker because the machine can do that. Another way of automating is... let's call it more intelligent approaches to navigating that feature space or that iteration space, and identifying through algorithmic techniques what are likely to be the right combinations of features, as opposed to just throwing the kitchen sink at it and putting that in a bunch of loops. And certainly that's not a dichotomy, right? You do a bit of both. Can you elaborate on that trade off or the relationship between those two approaches? Is that even the right way to think about it or is that the wrong way to think about it?
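As a concrete illustration of the recipe-style transformations Erez describes a few turns back (merging two columns into one, normalizing a date, enriching sales data with public holidays), here is a toy sketch in pandas. The column names, the holiday list, and the auto_featurize helper are all hypothetical; this is not the Azure Auto ML featurizer, just the flavor of step it automates.

```python
# Toy illustration of simple, automatable featurization steps: column merging,
# date normalization, and public-holiday enrichment for sales forecasting.
# All names and values are hypothetical.
import pandas as pd

def auto_featurize(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # "I want to take two columns and bring them together into one."
    out["city_region"] = out["city"] + "_" + out["region"]
    # Normalize a date column into consistent, model-friendly parts.
    dates = pd.to_datetime(out["order_date"])
    out["order_dayofweek"] = dates.dt.dayofweek
    out["order_month"] = dates.dt.month
    # Enhance the data with public-holiday information.
    holidays = pd.to_datetime(["2019-12-25", "2019-01-01"])
    out["is_holiday"] = dates.isin(holidays).astype(int)
    return out

sales = pd.DataFrame({
    "city": ["Seattle", "Portland"],
    "region": ["WA", "OR"],
    "order_date": ["2019-12-25", "2019-12-26"],
    "units_sold": [130, 88],
})
print(auto_featurize(sales))
```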
Erez Barak: [00:20:33] I think it's definitely a way to think about it. I'm just thinking through that lens for a second. So I think you described the brute force approach to it, on one side. The other side is how nuanced can you get about it? So what we know is you can get quite nuanced. There are things that are known to work, things that are not known to work. Things that work with a certain type of data set that don't work with another. Things that work with a certain type of data set combined with a learner that don't work with others. So as we build Auto ML - I talked about machine learning used to help with machine learning - we train a model to say, "Okay, in this kind of event, you might want to try this kind of combination first." Because if you're... I talked about the number of features, brute force is not an option. So we have to get a lot more nuanced about it, so what Auto ML does is, given those conditions if you may, or those features for that model, it helps shape the right set of experiments to try before others. That's allowing you to get to a more accurate model faster. So I think that's one aspect of it. I think another aspect, which you may have touched on, and I think is really important throughout Auto ML, but definitely in featurization, is why people are excited about that. The next thing you are going to hear is, "I want to see what you did." And you have to show what kind of features you used. And what quickly follows is, "I want to change feature 950 out of the thousand features you gave me. And I want to add two more features at the end because I think they're important." That's where my innovation as a data scientist comes into play. So you've got to, and Auto ML allows you to do that, be able to open up that aspect and say, "Here's what I've come up with. Would you like to customize? Would you like to add? Would you like to remove?" Because that's where you as a data scientist shine and are able to innovate. Sam Charrington: [00:22:39] So we started with featurization. Next step is learner/model selection? Erez Barak: [00:22:45] I think it's probably the best next step to talk about. Yes. I think there's a lot of configuration that goes into this, like how many iterations do I want to do, for instance. How accurate do I want to get? What defines accuracy? But those are more manual parameters we ask the user to add to it. But then automation again comes into play in learner selection. So putting Auto ML aside, what's going to happen? Build a set of features, choose a learner, one that I happen to know is really good for this kind of problem, and try it out. See how accurate I get. If it doesn't work, but even if it works, you are going to try another. Try another few. Try a few options. Auto ML at the heart of it is what it does. Now, going back to what we talked about in featurization, we don't take a brute force approach. We have a model that's been trained over millions of experiments, that sort of knows what would be a good first choice given the data, given the type of features, given the type of outcome you want. What do we try first? Because people can't just run an endless number of iterations. It takes time, takes cost, and frankly it takes a lot of the ROI out of something you expect from Auto ML. So you want to get there as fast as possible based on learnings from the past. So what we've automated is that selection. Put in the data, set a number of iterations or don't set them. We have a default number that goes in.
And then we start using the learners based on the environment we're seeing out there, choosing them with that other model we've trained over time. By the way, that's a place where we really leaned on the outputs we got from MSR. That's a place where, as they were defining Auto ML, as they were researching it, they really went deep and created assets we were then able to leverage. A product sort of evolves over time and the technology evolves over time, but if I have to pick the most important, or the deepest rooted, area we've looked at from MSR, it's definitely the ability to choose the right learner for the right job with a minimal amount of compute associated with it, if you may. Sam Charrington: [00:24:59] And what are some of the core contributions of that research, if you go a layer deeper than that? Erez Barak: [00:25:10] Are you asking in the context of choosing a model or in general? Sam Charrington: [00:25:13] Yeah, in the context of choosing a model. For example, as you described, what is essentially a learner learning which model to use, that created a bunch of questions for me around like, "Okay, how do you represent this whole thing? What are the features of that model? And what is the structure of that model?" And I'm curious if that's something that came out of MSR or that was more from the productization, and if there are specific things that came out of that MSR research that come to mind as being pivotal to the way you think about that process. Erez Barak: [00:25:57] So I recall the first version coming out of MSR wasn't really the end to end product, but at the heart of it was this model that helps you pick learners as it relates to the type and size of data you have and the type of target you have. This is where a lot of the research went into. This is where we published papers around, "Well, which features matter when you choose that?" This is where MSR went and collected a lot of historical data around people running experiments and trained that model. So for the basis, at the heart of our earliest versions, we really leaned on MSR to get that model in place. We then added the workflow to it, the auto featurization I talked about, some other aspects we'll talk about in a minute, but at the heart of it, they did all that research to understand... well, first, to train that model. Just grabbing the data. Sam Charrington: [00:26:54] And what does that model look like? Is it a single model? Is it relatively simple? Is it fairly complex? Is it some ensemble? Erez Barak: [00:27:06] I'll oversimplify a little bit, but it profiles your data; it takes a profile of your data and a profile of your features. It looks at the kind of outcome you want to achieve. Am I doing time series forecasting here? Am I doing classification? Am I doing regression? That really matters. And based on those features, it picks the first learner to go after. Then what it does is use the result of that first iteration, which included all the features I'm talking about, but also now includes, "Hey, I also tried learner X and I got this result." And that helps it choose the next one. So what happens is you look at the base data you have, but you constantly have additional features that show you, "Well, what have I tried and what were the results?" And then the next learner gets picked based on that. And that gets you in a place where the more you iterate, the closer you get to that learner that gives you a more accurate result.
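The loop Erez describes here - profile the dataset, let a recommender trained on past experiments pick the next learner, feed each result back in - might look something like the toy sketch below. The hand-written suggest_next_learner lookup merely stands in for the trained meta-model, and everything in it (names, candidate learners, iteration budget) is an illustrative assumption rather than Microsoft's implementation.

```python
# Toy sketch of meta-model-driven learner selection: profile the data, pick a
# first learner, then use each result to choose the next candidate.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def profile(X, y):
    # A real system would compute a much richer profile of the data and target.
    return {"n_rows": X.shape[0], "n_features": X.shape[1], "task": "classification"}

def suggest_next_learner(data_profile, history):
    # Stand-in for the trained recommender: try a simple model first, then tree
    # ensembles, skipping anything already attempted in this run.
    order = [LogisticRegression(max_iter=1000),
             RandomForestClassifier(n_estimators=100),
             GradientBoostingClassifier()]
    tried = {type(m).__name__ for m, _ in history}
    for candidate in order:
        if type(candidate).__name__ not in tried:
            return candidate
    return None

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
history = []
for _ in range(3):  # a small, fixed iteration budget
    learner = suggest_next_learner(profile(X, y), history)
    score = cross_val_score(learner, X, y, cv=3).mean()
    history.append((learner, score))
    print(type(learner).__name__, round(score, 3))
```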
Sam Charrington: [00:28:14] So I'm hearing elements of both supervised learning - you have a bunch of experiments and the models that were ultimately chosen - but also elements of something more like simple reinforcement learning, contextual bandits, explore/exploit kinds of things as well. Erez Barak: [00:28:37] It definitely does both. If I could just touch on one point: reinforcement learning, as it's defined, I wouldn't say we're doing reinforcement learning there. Saying that, we're definitely... every time we have an iteration going, or every X times we have that, we do fine-tune the training of the model to learn as it runs more and more. So I think reinforcement learning is a lot more reactive. But taking that as an analogy, we do sort of continuously collect more training data and then retrain the model that helps us choose better and better over time. Sam Charrington: [00:29:15] Interesting. So we've talked about a couple of these aspects of the process. Feature engineering, model selection, next is, once you've identified the model, tuning hyper-parameters and optimization. Do you consider that its own step or is that a thing that you're doing all along? Or both? Erez Barak: [00:29:38] I consider it part of that uber process I talked about earlier. We're just delving into starting to use deep learning learners within Auto ML. So that's where we're also going to automate the parameter selection, hyper-parameter selection. A lot of the learners we have today are classic machine learning, if you may, so that's where hyper-parameter tuning is not as applicable. But saying that, every time we see an opportunity like that - I think I mentioned earlier, in our forecasting capabilities, we're now adding deep learning models - in order to make the forecasting more accurate, that's where that tuning will also be automated. Sam Charrington: [00:30:20] Okay, actually, elaborate. I think we chatted about that pre-interview, but you mentioned that you're doing some stuff with TCN and ARIMA around time series forecasting. Can you elaborate on that? Erez Barak: [00:30:36] Yeah, so I talked about this process of choosing a learner. Now you also have to consider what is your possible set of learners you can choose from. And what we've added recently are sort of deep learning models, or networks, that actually are used within that process. So TCN and ARIMA are quite useful when doing time series forecasting. They really drive the accuracy based on the data you have. So we've now embedded those capabilities within our forecasting capability. Sam Charrington: [00:31:12] So when you say within forecasting, meaning a forecasting service that you're offering as opposed to within... Erez Barak: [00:31:21] No, let me clarify. There are three core use cases we support as part of Auto ML. One for classification, the other for regression, and the third for time series forecasting. So when I refer to that, I was referring more to that use case within Auto ML. Sam Charrington: [00:31:42] Got it. So in other words, in the context of that forecasting use case, as opposed to building a system that is general and applying it to time series and using more generalized models, you're now using TCN and ARIMA as core to that, which are long-proven models for time series forecasting. Erez Barak: [00:32:07] Yeah, I would argue they're also a bit generalized, but in the context of forecasting. But let me tell you how we're thinking about it. There are generally applicable models.
Now, we're seeing different use cases; in forecasting, for example, there are generally applicable models that are really useful in that area. That's sort of the current state we're in right now. And we want to add a lot more known, generally applicable models to each area. In addition to that, sort of where we're heading and as I see this moving forward, more and more customers will want to add their own custom model. "We've done forecasting for our manufacturing. We've tuned it to a place where it's just amazing for what we do because we know a lot more about our business than anyone else. We'd like to put that in the mix every time your Auto ML considers the best option." I think we're going to see - I'm already seeing - a lot of that, sort of the 'bring your own model'. It makes sense. Sam Charrington: [00:33:06] That's an interesting extension to bring your own data, which was one of the first frontiers here. Erez Barak: [00:33:11] I mean, you're coming into a world now where it's not, "Hey, there's no data science here." There's a lot of data science going on, so the data scientist is saying, "I've worked on this model for the past, you name it, weeks? Months? Years? And now this Auto ML is really going to help me be better?" I don't think that's a claim we even want to make. I don't think that's a claim that's fair to make. The whole idea is to find the user where they are. You have a custom model? Sure, let's plug that in. It's going to be considered with the rest in a fair and visible way; maybe with the auto featurization it even goes and becomes more accurate. Maybe you'll find out something else, you want to tune your model. Maybe you have five of those models, and you're not sure which one is best, so you plug in all five. I think that's very much sort of where we're heading, plugging into an existing process that's already deep and rich wherever it lands. Sam Charrington: [00:34:07] The three areas that we've talked about, again featurization, model selection, and parameter tuning or optimization are, I think, what we tend to think of as the core of Auto ML. Do you also see it playing in the tail end of that process, like deployment, after the model's deployed? There are certainly opportunities to automate there. A lot of that is very much related to dev ops and that kind of thing, but are there elements of that that are more like what we're talking about here? Erez Barak: [00:34:48] I think there are two steps - if you don't mind, I'll talk about two steps before that. I think there's the evaluation of the model. Well, how accurate is it, right? But again you get into this world of iterations, right? So that's where automation is really helpful. That's one. The other is sort of the interpretation of the model. That's where automation really helps as well. So now, especially when I did a bunch of automation, I want to make sure, "Well, which features really did affect this thing? Explain them to me. And work that into your automated processes. Did your process provide a fair set of data for my model to learn from? Does it represent all genders properly? Does it represent all races properly? Does it represent all aspects of my problem, and use them in a fair way? Where do you see imbalance?" So I think automating those pieces comes right before we jump into deployment, and I think it's really mandatory when you do Auto ML to give that full picture.
Otherwise, you're sort of creating the right set of tools, but I feel that without doing that, you're falling a bit short of providing everyone the right checks and balances to look at the work they're doing. So when I generalize the Auto ML process, I definitely include that. Back to your question on, do I see deployment playing there? To be honest, I'm not sure. I think definitely the way we evaluate success is we look at the models deployed with Auto ML, or via Auto ML, or that were created via Auto ML and are now deployed. We look at their inferences. We look at their scoring, and we provide that view to the customer to assess the real value of their model. Automation there? I think, if I have to guess, yes, automation will stretch there. Do I see it today? Can I call it that today? Not just yet. Sam Charrington: [00:36:54] Well, there's a lot of conversation around this idea of deploying a model out into production, and thankfully I think we've convinced people that it's not just deploy once and never think about it again. You have to monitor the performance of that model, and there's a limited lifespan for most of the models that we're putting into production, and then the next thing that folks get excited about is, "Well, I can just see when my model falls out of tolerance and then auto-retrain..." It's one of those things everyone's talking about but few are actually doing. It sounds like you're in agreement with that, that we're not there yet at scale, or no? Erez Barak: [00:37:42] So I think we often refer to that world as the world of ML Ops - machine learning operations, in a more snappy way. I think there's a lot of automation there. If you look at automation, you do DevOps for just code. I mean, forget machine learning code; code, let alone models, very much needs automation. I do think there are two separate loops that have clear interface points. Like deployed models, like maybe data about data drift. But they sort of move in different cycles at different speeds. So we're learning more about this, but I suspect that iteration of training, improving accuracy, getting to a model where the data science team says, "Oh, this one's great. Let's use that" - I suspect that's one cycle, and frankly that's where we've been hyper-focused on automating with Auto ML. There's naturally another cycle of that - operations - where we're sort of looking at automation opportunities with ML Ops. Do they combine into one automation cycle? Hmm, I'm not sure. Sam Charrington: [00:38:58] But it does strike me that, for example, the decision "Do I retrain from scratch? Do I incrementally retrain? Do I start all the way over?" could be driven by some patterns or characteristics in the nature of the drift or the performance shift that a model could be applied to. And then there are aspects of what we're thinking about and talking about as Auto ML that could be applied to that dev ops-y part. Who knows? Erez Barak: [00:39:37] No, I'd say who knows. Then, listening to you, I'm thinking to myself that while we sort of have a bit of a fixed mindset on the definition, we'd definitely need to break through some of that and open up and see, "Well, what is it that we're hearing from the real world that should shape what we automate, how we automate, and under which umbrella we put it?" I think, and you will notice, it's moving so fast, evolving so fast. I think we're just at the first step of it. Sam Charrington: [00:40:10] Yeah. A couple quick points that I wanted to ask about.
Another couple of areas that are generating excitement under this umbrella are neural architecture search and neural evolution and techniques like that. Are you doing anything in those domains? Erez Barak: [00:40:30] Again, we're incorporating some of those neural architectures into Auto ML today. I talked about our deeper roots with MSR and how they got us that first model. Our MSR team is very much looking deeper into those areas. They're not things that have formulated just yet, but the feeling is that the same concepts we put into Auto ML, or automated machine learning, can be used there, can be automated there. I'm being a little vague because it is a little vague for us, but the feeling is that there is something there, and we're lucky enough to have the MSR arm so that, when there's a feeling there's something there, some research starts to pan out, and they're thinking of different ideas there. But to be frank, I don't have much to share at this point in terms of more specifics yet. Sam Charrington: [00:41:24] And my guess is, we've been focused on this Auto ML as a set of platform capabilities that helps data scientists be more productive. There's a whole other aspect of Microsoft delivering cognitive services for vision and other things, where they're using Auto ML internally and where it's primarily deep learning based, and I can only imagine that they're throwing things like architecture search and things like that at the problem. Erez Barak: [00:41:58] Yeah. So they do happen in many cases; I think custom vision is a good example. We don't see the general patterns just yet, and for the ones we do see, the means of automation haven't panned out yet. So if I look at it, where we were with the Auto ML thinking probably a few years back is where that is right now. Meaning, "Oh, it's interesting. We know there's something there." The question is how we further evolve into something more specific. Sam Charrington: [00:42:30] Well, Erez, thanks so much for taking the time to chat with us about what you're up to. Great conversation, and I learned a ton. Thank you. Erez Barak: [00:42:38] Same here. Thanks for your time, and the questions were great. Had a great time.
Bits & Bytes

Microsoft open sources Bing vector search. The company published its vector search toolkit, Space Partition Tree and Graph (SPTAG) [Github], which provides tools for building, searching and serving large scale vector indexes.

Intel makes progress toward optical neural networks. A new article on the Intel AI blog (which opens with a reference to TWIML Talk #267 guest Max Welling’s 2018 ICML keynote) describes research by Intel and UC Berkeley into new nanophotonic neural network architectures. A fault tolerant architecture is presented, which sacrifices accuracy to achieve greater robustness to manufacturing imprecision.

Microsoft Research demonstrates realistic speech with little labeled training data. Researchers have crafted an “almost unsupervised” text-to-speech model that can generate realistic speech using just 200 transcribed voice samples (about 20 minutes’ worth), together with additional unpaired speech and text data.

Google deep learning model demonstrates promising results in detecting lung cancer. The system demonstrated the ability to detect lung cancer from low-dose chest computed tomography imagery, outperforming a panel of radiologists. Researchers trained the system on more than 42,000 CT scans. The resulting algorithms turned up 11% fewer false positives and 5% fewer false negatives than their human counterparts.

Facebook open-sources Pythia for multimodal vision and language research. Pythia [Github] [arXiv] is a deep learning framework for vision and language multimodal research that helps researchers build, reproduce, and benchmark models. Pythia is built on PyTorch and designed for Visual Question Answering (VQA) research, and includes support for multitask learning and distributed training.

Facebook unveils what its secretive robotics division is working on. The company outlined some of the focus areas for its robotics research team, which include teaching robots to learn how to walk on their own, using curiosity to learn more effectively, and learning through tactile sensing.

Dollars & Sense
Algorithmia raises $25M Series B for its AI platform
Icometrix, a provider of brain imaging AI solutions, has raised $18M
Quadric, a startup developing a custom-designed chip and software suite for autonomous systems, has raised $15M in a funding round
Novi Labs, a developer of AI-driven unconventional well planning software, has raised $7M

To receive the Bits & Bytes to your inbox, subscribe to our Newsletter.
Today we're joined by Alex Ratner, Ph.D. student at Stanford, to discuss his work on Snorkel, a framework for creating training data with weak supervision techniques. With Snorkel, Alex and his team hope to tackle the ever-present challenge of assembling large labeled data sets by having users instead write a set of labeling functions, or scripts that programmatically label data. In our conversation, we discuss the original inspiration for Snorkel and some of the projects they've undertaken since its inception. We also discuss some of the papers presented at various conferences that used Snorkel for training data, including Kunle Olukotun's "Software 2.0" presentation that we broke down in our 2018 NeurIPS series.
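To give a flavor of what "scripts that programmatically label data" means, here is a minimal, self-contained sketch of the labeling-function idea. The spam example, the function names, and the simple majority vote are all illustrative assumptions; Snorkel itself learns the accuracies and correlations of the labeling functions with a generative model rather than voting naively.

```python
# Minimal sketch of weak supervision via labeling functions: small heuristics
# vote on a label (or abstain), and the votes become noisy training labels.
SPAM, NOT_SPAM, ABSTAIN = 1, 0, -1

def lf_contains_free(text):
    return SPAM if "free" in text.lower() else ABSTAIN

def lf_has_link(text):
    return SPAM if "http://" in text or "https://" in text else ABSTAIN

def lf_short_message(text):
    return NOT_SPAM if len(text.split()) < 4 else ABSTAIN

def weak_label(text, lfs):
    votes = [lf(text) for lf in lfs if lf(text) != ABSTAIN]
    if not votes:
        return ABSTAIN
    return max(set(votes), key=votes.count)  # simple majority vote

lfs = [lf_contains_free, lf_has_link, lf_short_message]
for msg in ["Click here for a FREE prize http://spam.example", "See you at 3pm"]:
    print(msg, "->", weak_label(msg, lfs))
```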
Over the next few weeks on the podcast, we're bringing you volume 2 of our AI Platforms series. You'll recall that last fall we brought you AI Platforms Volume 1, featuring conversations with platform builders from Facebook, Airbnb, LinkedIn, OpenAI, Shell and Comcast. This series turned out to be our most popular series of shows ever, and over 1,000 of you downloaded our first eBook on ML platforms: Kubernetes for Machine Learning, Deep Learning & AI. Well now it's back, and we're sharing more experiences of teams working to scale and industrialize data science and machine learning at their companies.
Bits & Bytes

Microsoft leads the AI patent race. As per EconSight research findings, Microsoft leads the AI patent race going into 2019 with 697 patents that the firm classifies as having a significant competitive impact as of November 2018. Out of the top 30 companies and research institutions as defined by EconSight in their recent analysis, Microsoft has created 20% of all patents in the global group of patent-producing companies and institutions.

AI hides data from its creators to cheat at its appointed task. Research from Stanford and Google found that an ML agent intended to transform aerial images into street maps and back was hiding information it would need later.

Tech Mahindra launches GAiA for enterprises. GAiA is the first commercial version of the open source Acumos platform, explored in detail in my conversation with project sponsor Mazin Gilbert about a year ago.

Taiwan AI Labs and Microsoft launch AI platform to facilitate genetic analysis. The new AI platform “TaiGenomics” utilizes AI techniques to process, analyze, and draw inferences from vast amounts of medical and genetic data provided by patients and hospitals.

Google to open AI lab in Princeton. The AI lab will comprise a mix of faculty members and students. Elad Hazan and Yoram Singer, who both work at Google and Princeton and are co-developers of the AdaGrad algorithm, will lead the lab. The focus of the group is developing efficient methods for faster training.

IBM designs AI-enabled fingernail sensor to track diseases. This tiny, wearable fingernail sensor can track disease progression and share details on medication effectiveness for Parkinson’s disease and cardiovascular health.

ZestFinance and Microsoft collaborate on AI solution for credit underwriting. Financial institutions will be able to use the Zest Automated Machine Learning (ZAML) tools to build, deploy, and monitor credit models using the Microsoft Azure cloud and ML Server.

Dollars & Sense
Swiss startup Sophia Genetics raises $77M to expand its AI diagnostic platform
Baraja, a LiDAR start-up, has raised $32M in a Series A round of funding
Semiconductor firm QuickLogic announced that it has acquired SensiML, a specialist in ML for IoT applications
Donnelley Financial Solutions announced the acquisition of eBrevia, a provider of AI-based data extraction and contract analytics software solutions
Graphcore, a UK-based AI chipmaker, has secured $200M in funding; investors include BMW Ventures and Microsoft
Dataiku Inc, offering an enterprise data science and ML platform, has raised $101M in Series C funding
Ada, a Toronto-based company focused on automating customer service, has raised $19M in funding

To receive the Bits & Bytes to your inbox, subscribe to our Newsletter.
Happy New Year! I spent this week at CES in Las Vegas, checking out a bunch of exciting new technology. (And a bunch of not-so-exciting technology as well.) I'll be writing a bit more about my experiences at CES on the TWIML blog, but for now I'll simply state the obvious: AI was front and center at this year's show, with many interesting applications, spanning smart home and city to autonomous vehicles (using the term vehicle very broadly) to health tech and fitness tech. I focused on making videos this time around, and we'll be adding a bunch from the show to our CES 2019 playlist over on YouTube, so be sure to check that out and subscribe to our channel while you're there. In other news, we just wrapped up our AI Rewind 2018 series in which I discussed key trends from 2018 and predictions for 2019 with some of your favorite TWIML guests. This series was a bit of an experiment for us and we're excited to have received a lot of great feedback on it. If you've had a chance to check it out, I'd love to hear your thoughts. Cheers, Sam P.S. We're always looking for sponsors to support our work with the podcast. If you think your company might benefit from TWIML sponsorship, I'd love your help getting connected to the right people. P.P.S. I'm planning a visit to the Bay Area for the week of January 21st. I've got a few open slots for briefings and meeting up with listeners. If you're interested in connecting, give me a holler. To receive this to your inbox, subscribe to our Newsletter.
In the final episode of our AI Rewind series, we're excited to have Siddha Ganju back on the show. Siddha, who is now an autonomous vehicles solutions architect at Nvidia, shares her thoughts on trends in Computer Vision in 2018 and beyond. We cover her favorite CV papers of the year in areas such as neural architecture search, learning from simulation, application of CV to augmented reality, and more, as well as a bevy of tools and open source projects.
In this episode of our AI Rewind series, we introduce a new friend of the show, Simon Osindero, Staff Research Scientist at DeepMind. We discuss trends in Deep Reinforcement Learning in 2018 and beyond. We've packed a bunch into this show, as Simon walks us through many of the important papers and developments seen last year in areas like Imitation Learning, Unsupervised RL, Meta-learning, and more.
In this episode of our AI Rewind series, we've brought back recent guest Sebastian Ruder, PhD Student at the National University of Ireland and Research Scientist at Aylien, to discuss trends in Natural Language Processing in 2018 and beyond. In our conversation, we cover a bunch of interesting papers spanning topics such as pre-trained language models, common sense inference datasets, and large document reasoning, among others, and talk through Sebastian's predictions for the new year.
In this episode of our AI Rewind series, we're back with Anima Anandkumar, Bren Professor at Caltech and now Director of Machine Learning Research at NVIDIA. Anima joins us to discuss her take on trends in the broader Machine Learning field in 2018 and beyond. In our conversation, we cover not only technical breakthroughs in the field but also those around inclusivity and diversity.
In this episode of our AI Rewind series, we're bringing back one of your favorite guests of the year, Jeremy Howard, founder and researcher at Fast.ai. Jeremy joins us to discuss trends in Deep Learning in 2018 and beyond. We cover many of the papers, tools and techniques that have contributed to making deep learning more accessible than ever to so many developers and data scientists.
To close out 2018, and open the new year, we're excited to bring you our first-ever AI Rewind series! In this series I interview friends of the show for their perspectives on the key developments of 2018, as well as a look ahead at the year to come. We'll cover a few key categories this year, namely, Computer Vision, Natural Language Processing, Deep Learning, Machine Learning, and Reinforcement Learning. Of course, we realize that there are many more possible categories than these, there's a ton of overlap between these topics, and no single interview could hope to cover everything important in any of these areas. Nonetheless, we're pleased to present these talks and invite you to share your own perspectives by commenting below.
Today we close out both our NeurIPS series and our 2018 conference coverage with this interview with Nando de Freitas, Team Lead & Principal Scientist at DeepMind and Fellow at the Canadian Institute for Advanced Research. In our conversation, we explore his interest in understanding the brain and working towards artificial general intelligence through techniques like meta-learning, few-shot learning and imitation learning. In particular, we dig into a couple of his team's NeurIPS papers: "Playing Hard Exploration Games by Watching YouTube" and "One-Shot High-Fidelity Imitation: Training Large-Scale Deep Nets with RL."
Bits & Bytes

IBM, Nvidia pair up on AI-optimized converged storage system. IBM SpectrumAI with Nvidia DGX is a converged system that combines a software-defined file system, all-flash storage, and Nvidia's DGX-1 GPU system. The storage system supports AI workloads and data tools such as TensorFlow, PyTorch, and Spark.

Google announces Cloud TPU Pods, now available in alpha. Google Cloud TPU Pods are tightly-coupled supercomputers built with hundreds of Google’s custom Tensor Processing Unit (TPU) chips and dozens of host machines. Price/performance benchmarking shows a 27x speedup for nearly 40% lower cost in training a ResNet-50 network.

MediaTek announces the Helio P90. The Helio P90 system-on-chip (SoC) uses the company's APU 2.0 AI architecture. APU 2.0 is a leading fusion AI architecture designed by MediaTek that can deliver a new level of AI experiences, 4X more powerful than the Helio P70 and Helio P60 chipsets.

Facebook open sources PyText for faster NLP development. Facebook has open sourced the PyText modeling framework for NLP experimentation and deployment. The library is built on PyTorch and supports use cases such as document classification, sequence tagging, semantic parsing, and multitask modeling.

On scaling AI training. This interesting article from OpenAI proposes that the gradient noise scale metric can be used to predict the parallelizability of training for a wide variety of tasks, and explores the relationship between gradient noise scale, batch size, and training speed.

Dollars & Sense
TechSee, a Tel Aviv-based provider of AI-powered visual customer engagement solutions, has secured $16M in Series B funding
Zesty.ai, an Oakland, CA-based AI startup, closed US$13M in Series A financing
Walmart Labs India, the product development division of the US retail giant, announced that it has acqui-hired AI and data analytics startup Int.ai
Avnet, Inc announced that it will acquire Softweb Solutions, Inc., a software and AI company that provides software solutions for IoT

Sign up for our Newsletter to receive this weekly to your inbox.
A couple of weeks ago, I had the pleasure of attending the NeurIPS conference in Montreal. I had a wonderful time there and, of course, the highlight was the chance to meet a ton of TWIML listeners. I held two fun meetups at the conference. The first, focused on AI in Production and AI Platforms and Infrastructure, attracted a nice crowd; we basically took over a dumpling restaurant in Chinatown near the convention center. I also held a listener meetup one evening and got to hang out with listeners from all over the country and the world. Of course, one of my favorite parts of the conference was the opportunity to sit down with a few of the many great researchers attending and presenting there. This week on the show, we're excited to share our 2018 NeurIPS Series.
A couple of weeks ago I spent the week in Las Vegas at the Amazon Web Services (AWS) re:Invent conference, and we shared a few of my interviews from the event in our AWS re:Invent Series. I'll share a bit of the news coming out of the event in this post, but if you're interested in machine learning and artificial intelligence, and especially in the intersection of ML/AI and cloud computing, I really recommend that you tune into our 2nd Annual re:Invent Roundup Roundtable. The Roundtable is a fun TWIML "tradition" in which a couple of panelists and I get together at re:Invent to recap the week's announcements. (Note to self: I really like this format and need to do it more often.) This year, the Roundtable included veteran participant Dave McCrory (VP of Engineering at Wise.io at GE Digital) and Val Bercovici (CEO of startup Pencil Data). Dave, Val, and I cover all of the interesting AI news and highlights from this year's conference. Here are the main machine learning and artificial intelligence announcements made by AWS around re:Invent:

New Features for Amazon SageMaker: Workflows, Algorithms, and Accreditation. In the run-up to re:Invent (aka "pre:Invent"), Amazon announced new automation, orchestration, and collaboration features to make it easier to build, manage, and share machine learning workflows.

SageMaker RL aims to bring reinforcement learning to the masses. New reinforcement learning extensions to SageMaker support 2D & 3D simulation and physics environments as well as OpenAI Gym. Robotics tools Sumerian, RoboMaker, and ROS are also supported. AWS also announced AWS DeepRacer, a new 1/18th-scale autonomous model race car for developers, driven by reinforcement learning.

SageMaker Ground Truth simplifies data labeling. The service aims to allow developers to build highly accurate training datasets using machine learning while reducing data labeling costs by up to 70% using active learning.

SageMaker Neo to optimize AI performance at the edge. Neo is an open source compiler and runtime targeting edge devices. It provides automatic optimization and will compile deep learning models to run on any edge device at up to 2x the performance and 1/10th the size.

Amazon Textract allows developers to extract text from virtually any document. The new service automatically extracts text and tabular data from scanned documents.

Amazon Transcribe now supports real-time transcription. With the new Streaming Transcription feature, Amazon Transcribe (AWS' speech-to-text service) can create text transcripts from audio in real time.

Amazon Rekognition announces updates to its face detection, analysis, and recognition capabilities. The updates enhance the ability to detect more faces in images, perform higher-accuracy face matches, and obtain improved age, gender, and emotion attributes for faces in images.

Amazon launches ML-based platform for healthcare. The Amazon Comprehend Medical platform allows developers to process unstructured medical text and identify information such as patient diagnosis, treatments, dosages, symptoms and signs, and more.

Amazon announces new ML chip. AWS' new ML chip, Inferentia, is a high-throughput, low-latency, cost-effective inference processor. It supports multiple ML frameworks, including TensorFlow, Caffe2, and ONNX.

NVIDIA AI Software is now available on AWS Marketplace. This will simplify access to NVIDIA software on Amazon ECS and Amazon EKS. Six NVIDIA containers will be available on AWS Marketplace, including CUDA, MXNet, PyTorch, TensorFlow, TensorRT, and TensorRT Inference Server.

Amazon QuickSight announces ML Insights in preview. Amazon announced that it is adding three new features to Amazon QuickSight to provide customers with ML-powered insights beyond visualizations. The new features will surface hidden insights, forecasting, and narrative descriptions of dashboards.

Amazon announced new personalization and forecasting services. AWS announced Amazon Personalize and Amazon Forecast. These fully-managed services incorporate AutoML features to put ML-powered personalization and forecasting capabilities into the hands of developers with little ML experience.

Amazon SageMaker now comes with a search feature. The SageMaker Search capability lets you find and evaluate the most relevant model training runs. The new feature accelerates model development and reduces the overall time to market for ML-based solutions.

AWS introduces dynamic training for DL with Amazon EC2. Dynamic Training (DT) for DL models allows DL practitioners to reduce model training cost and time by leveraging the cloud's elasticity and economies of scale.

Amazon's 'Machine Learning University' is now available to all developers. AWS announced that the ML courses it uses to train its own engineers are now available to all developers. There are more than 30 self-service, self-paced digital courses with more than 45 hours of courses, videos, and labs for four key groups: developers, data scientists, data platform engineers, and business professionals.

As you can see, AWS announced a ton of new ML and AI capabilities this year, making it fun, and challenging, to try to keep up with it all. In addition to the Roundtable, you should also check out my conversation with Jinho Choi of Emory University, discussing the key challenges faced by cloud-based NLP platforms and his vision for his group's ELIT platform, and my conversation with Thorsten Joachims of Cornell University, discussing the inherent and introduced biases in recommender systems, and how inference techniques can be used to make learning algorithms more robust to them.
For today's show, I'm excited to present our second annual re:Invent Roundup Roundtable. This year I'm joined by my friends Dave McCrory, VP of Software Engineering at Wise.io at GE Digital, and Val Bercovici, Founder and CEO of Pencil Data. If you missed the news coming out of re:Invent, or you want to know more about what one of the biggest AI platform providers is up to, you'll want to stay tuned, because we'll discuss many of their new offerings in this episode. We cover all of AWS' most important ML and AI announcements, including SageMaker Ground Truth, Reinforcement Learning and Neo, DeepRacer, Inferentia and Elastic Inference, the ML Marketplace, Personalize, Forecast and Textract, and more.
This week the podcast features a few of my conversations from last week's AWS re:Invent conference in Las Vegas.
Bits & Bytes

Google Cloud first to offer Tesla T4 GPUs. The NVIDIA Tesla T4 instances are currently available in alpha. They target ML inference, distributed training of models, and computer graphics workloads.

Intel unveils Neural Compute Stick 2. Intel announced an update to the Movidius Neural Compute Stick focused on AI at the network edge. The Intel NCS 2 is based on the Intel Movidius Myriad X vision processing unit (VPU), offering more compute power and greater prototyping flexibility.

Microsoft brings AI to Power BI. The new AI features bring image recognition, text analytics, and automated ML to Power BI. Integrated with Azure Machine Learning, Power BI will let users create ML models directly within the tool.

Kasisto launches second-generation conversational AI platform. Kasisto's updated conversational AI platform, KAI, will allow banks to launch new self-service features more easily.

Linux Foundation launches Acumos platform for quick AI deployment. The Athena release of the Acumos AI Project aims to make building, sharing, and deploying AI applications easy. In TWIML Talk #78, Mazin Gilbert talks about the goals and architecture of the platform.

SnapLogic introduces self-service machine learning platform. SnapLogic announced the launch of SnapLogic Data Science, which combines data integration, data transformation, and machine learning training features in a single self-service platform.

Elsevier creates AI-ready life sciences data platform. Entellect, a new cloud-based data platform, aims to make life sciences R&D more efficient by enriching and harmonizing proprietary and external data and delivering it in an AI-ready environment.

Dollars & Sense

Standard Cognition, a provider of AI-powered autonomous checkout solutions, raised $40M in Series A funding

Habana Labs, a startup in the AI processor space, announced it has raised $75M in Series B funding

Apex.ai, which is building an operating system for self-driving vehicles, has raised $15.5M in Series A funding

Microsoft announced its acquisition of conversational AI and bot development startup XOXCO

Sign up for our Newsletter to receive the Bits & Bytes weekly to your inbox.
Bits & Bytes

AWS introduces Amazon SageMaker Object2Vec. Entity embeddings are a powerful technique gaining traction among many ML users. With its new Object2Vec algorithm, AWS aims to make creating them more accessible for SageMaker users. Object2Vec is a highly customizable multi-purpose algorithm that can learn low-dimensional dense embeddings of high-dimensional objects. (An illustrative sketch of the general pattern follows after this post.)

Microsoft develops flexible AI system that can summarize the news. Summarization is a hard problem in NLP. Microsoft researchers have developed an AI framework that can reason about relationships in "weakly structured" text, enabling it to outperform conventional models on a range of text summarization tasks.

Google launches AI Hub and Kubeflow Pipelines. AI Hub provides a private, secure destination where enterprises can upload and share ML resources within their own organizations, as well as access Google-produced content such as pipelines, Jupyter notebooks, and TensorFlow modules. To support this, Kubeflow Pipelines provides a way to compose, package, deploy, and manage ML workflows for reuse.

The New York Times taps Google's AI to manage old photos. Google Cloud has teamed up with The New York Times to digitize and manage over 7 million archived photos and will use ML to find new insights in the old photos.

OpenAI releases Spinning Up in Deep RL. For reinforcement learning fans and students in the community, this new educational resource from OpenAI provides a wealth of information and references to help users understand deep reinforcement learning. Would anyone be interested in a Deep RL course/study group based on this resource in early 2019?

Dollars & Sense

Engineer.ai announced that it has raised $29.5M in Series A funding

Aiden.ai, a London-based AI analytics start-up, has raised $1.6M

Sign up for our Newsletter to receive the Bits & Bytes weekly to your inbox.
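To make the Object2Vec item above a bit more concrete, here is a small, from-scratch PyTorch sketch of the general pattern such pairwise-embedding algorithms follow: two encoders map each object in a pair to a dense vector, and a comparator network is trained on labeled pairs. This is only an illustration of the idea, not the SageMaker implementation or its API; all names, dimensions, and data below are made up.

```python
# Illustrative sketch only: a minimal two-encoder "pair embedding" model in PyTorch,
# loosely mirroring the two-encoders-plus-comparator pattern. Not the SageMaker API.
import torch
import torch.nn as nn

class PairEmbedder(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int = 64):
        super().__init__()
        # Each side of the pair gets its own encoder: an embedding lookup
        # followed by mean pooling over the token axis.
        self.enc_left = nn.EmbeddingBag(vocab_size, embed_dim, mode="mean")
        self.enc_right = nn.EmbeddingBag(vocab_size, embed_dim, mode="mean")
        # The comparator scores how related the two embeddings are.
        self.comparator = nn.Sequential(
            nn.Linear(embed_dim * 2, 64), nn.ReLU(), nn.Linear(64, 1)
        )

    def forward(self, left_tokens, right_tokens):
        left = self.enc_left(left_tokens)     # (batch, embed_dim)
        right = self.enc_right(right_tokens)  # (batch, embed_dim)
        return self.comparator(torch.cat([left, right], dim=-1)).squeeze(-1)

# Tiny synthetic training loop: pairs of token-id sequences with 0/1 relatedness labels.
model = PairEmbedder(vocab_size=1000)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

left = torch.randint(0, 1000, (32, 10))   # 32 pairs, 10 tokens per object
right = torch.randint(0, 1000, (32, 10))
labels = torch.randint(0, 2, (32,)).float()

for _ in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(left, right), labels)
    loss.backward()
    optimizer.step()
```

Once trained, the encoder weights can serve as the learned embeddings for downstream tasks such as nearest-neighbor lookup or as features in other models.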
Bits & Bytes

ONNX Runtime for ML inference now in preview. Microsoft released a preview of the ONNX Runtime, a high-performance inference engine for Open Neural Network Exchange (ONNX) models. It is compatible with ONNX version 1.2 and comes in Python packages that support both CPU and GPU. (A minimal usage sketch follows after this post.)

Uber describes new platform for rapid Python ML development. Uber shared Michelangelo PyML, an extension to its Michelangelo platform providing for faster development and experimentation based on Docker containers.

NYU and Facebook release cross-language NLU data set. As researchers look to increase the number of languages NLU systems can understand, gathering and annotating data in every language is a bottleneck. One alternative is to train a model on data in one language and then test that model in other languages. The Cross-Lingual Natural Language Inference (XNLI) data set advances this approach by providing test data in multiple languages.

Malong researchers develop a technique for training deep neural networks on noisy data. In this new paper, Malong introduces CurriculumNet, a training strategy leveraging curriculum learning to increase performance while mitigating noise when working with large data sets. The code is now available on GitHub as well.

Facebook launches Horizon reinforcement learning platform. Facebook has open-sourced Horizon, an end-to-end applied reinforcement learning platform. Unlike other open-source RL platforms focused on gameplay, Horizon targets real-world applications and is used at Facebook to optimize notifications, video streams, and chatbot suggestions.

Google launches AdaNet for combining algorithms with AutoML. Google launched AdaNet, an open-source tool for automatically creating high-quality models based on neural architecture search and ensemble learning. Users can add their own model definitions to AdaNet using high-level TensorFlow APIs.

Dollars & Sense

People.ai announced that it has raised $30M in Series B funding led by Andreessen Horowitz

DataRobot, a Boston-based automated ML company, raised $100M in Series D funding

Syntiant Corp, an Irvine-based AI semiconductor company, raised $25M in Series B funding led by M12, Microsoft's VC arm

Oracle announced that it has acquired data management and AI solutions provider DataFox

eSentire has acquired Seattle-based cybersecurity AI company Versive (formerly Context Relevant)

AppZen, an AI auditing solutions provider, announced $35 million in funding led by Lightspeed Venture Partners

Validere, which provides an AI and IoT platform for oil and gas, raised $7M in seed funding

Esperanto Technologies, a hardware company focused on energy-efficient systems for AI, ML, and DL, closed $58M in Series B funding

Conversica, offering conversational AI products for sales and marketing, announced it has secured a $31 million Series C funding round

Sign up for our Newsletter to receive the Bits & Bytes weekly to your inbox.
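For readers who want to kick the tires on the ONNX Runtime preview mentioned above, basic inference from Python looks roughly like the following. The model path, input shape, and dtype are placeholders; substitute a model exported to ONNX from your framework of choice.

```python
# Minimal ONNX Runtime inference sketch. "model.onnx" and the input shape are
# placeholders; substitute a real exported ONNX model.
import numpy as np
import onnxruntime as ort

# Newer GPU builds require an explicit provider list; early previews did not.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

# Look up the graph's declared input name rather than hard-coding it.
input_name = session.get_inputs()[0].name
dummy_input = np.random.rand(1, 3, 224, 224).astype(np.float32)

# Passing None for the output names returns every declared output.
outputs = session.run(None, {input_name: dummy_input})
print([o.shape for o in outputs])
```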
This video is a recap of our Fast.ai x TWIML Online Machine Learning Study Group. In this session, we review Lesson 4, Feature Importance, Tree Interpreter. It’s not too late to join the study group. Just follow these simple steps: Head over to twimlai.com/community, and sign up for the programs you're interested in, including the Fast.ai study groups, or either of our Monthly Meetup groups. Use the email invitation you’ll receive to join our Slack group. If you don’t receive it within a few minutes, check your spam folder. Once you’re in Slack, join the #fast_ai channel and hop over to #intros as well and introduce yourself. Use the link posted in the #meetup slack channel to add our events to your calendar. SUBSCRIBE AND TURN ON NOTIFICATIONS
This video is a recap of our Fast.ai x TWIML Online Study Group. In this session, we review lesson one of the updated 2018 Deep Learning Course, Recognizing Cats and Dogs. It’s not too late to join the study group. Just follow these simple steps: Head over to twimlai.com/community, and sign up for the programs you're interested in, including either of the Fast.ai study groups or our Monthly Meetup groups. Use the email invitation you’ll receive to join our Slack group. If you don’t receive it within a few minutes, check your spam folder. Once you’re in Slack, join the #fast_ai channel and hop over to #intros as well and introduce yourself. Use the link posted in the #meetup slack channel to add our events to your calendar. SUBSCRIBE AND TURN ON NOTIFICATIONS
This video is a recap of our Fast.ai x TWIML Online Machine Learning Study Group. In this session, we review Lesson 3, Performance, Validation and Model Interpretation. It’s not too late to join the study group. Just follow these simple steps: Sign up for the TWIML Online Meetup, noting fast.ai in the “What you hope to learn” box. Use the email invitation you’ll receive to join our Slack group. If you don’t receive it within a few minutes, check your spam folder. Once you’re in Slack, join the #fast_ai channel and hop over to #intros as well and introduce yourself. Use the link posted in the #meetup slack channel to add our events to your calendar. SUBSCRIBE AND TURN ON NOTIFICATIONS
This video is a recap of our Fast.ai x TWIML Online Deep Learning Study Group. In this session, we review lesson three, Improving Your Image Classifier. It’s not too late to join the study group. Just follow these simple steps: Sign up for the TWIML Online Meetup, noting fast.ai in the “What you hope to learn” box. Use the email invitation you’ll receive to join our Slack group. If you don’t receive it within a few minutes, check your spam folder. Once you’re in Slack, join the #fast_ai channel and hop over to #intros as well and introduce yourself. Use the link posted in the #meetup slack channel to add our events to your calendar. SUBSCRIBE AND TURN ON NOTIFICATIONS
This video is a recap of our October 2018 Americas TWIML Online Meetup. In this month's community segment, we briefly discuss Entity Embeddings, the topic of next month's meetup, along with the recent announcement that MIT has made a $1 billion commitment to address the global opportunities and challenges presented by the prevalence of computing and the rise of artificial intelligence (AI). In our presentation segment, Mark Meketon joins us to cover TWIML Talk veteran Ksenia Konyushkova's paper "Learning Active Learning from Data." For links to the papers, podcasts, and more mentioned above or during this meetup, for more information on previous meetups, or to get registered for upcoming meetups, visit twimlai.com/meetup! https://youtu.be/IiRBY7gP0II Paper: Learning Active Learning from Data
This video is a recap of our Fast.ai x TWIML Online Machine Learning Study Group. In this session, we review lesson two, Random Forest Deep Dive. It’s not too late to join the study group. Just follow these simple steps: Sign up for the TWIML Online Meetup, noting fast.ai in the “What you hope to learn” box. Use the email invitation you’ll receive to join our Slack group. If you don’t receive it within a few minutes, check your spam folder. Once you’re in Slack, join the #fast_ai channel and hop over to #intros as well and introduce yourself. Use the link posted in the #meetup slack channel to add our events to your calendar. SUBSCRIBE AND TURN ON NOTIFICATIONS
This video is a recap of our Fast.ai x TWIML Online Machine Learning Study Group. In this session, we review lesson one, Introduction to Random Forests. It’s not too late to join the study group. Just follow these simple steps: Sign up for the TWIML Online Meetup, noting fast.ai in the “What you hope to learn” box. Use the email invitation you’ll receive to join our Slack group. If you don’t receive it within a few minutes, check your spam folder. Once you’re in Slack, join the #fast_ai channel and hop over to #intros as well and introduce yourself. Use the link posted in the #meetup slack channel to add our events to your calendar. SUBSCRIBE AND TURN ON NOTIFICATIONS
A few weeks ago, machine learning & AI researchers, practitioners and students from Africa and around the world met at Stellenbosch University, South Africa, for the second annual Deep Learning Indaba, an event that aims to expand African participation in and contribution to the field. While I wasn't able to make it to Cape Town, I did have a chance to speak with some of the event's awesome speakers, and I'm excited to present our Deep Learning Indaba series.
This video is a recap of our October 2018 EMEA TWIML Online Meetup. In this month's community segment we discuss the upcoming topics for the EMEA meetup group, along with our ongoing and upcoming Fast.AI study groups. We take a quick look at the newly released fastai v1 library, Jeremy Howard's upcoming book, and the podcast we recorded with him earlier this week. Finally, we discuss a few dataset search engines, including the recently released Google Dataset Search Engine Beta. In our presentation segment, Arvind Kumar leads us in a breakdown of the paper "Including multi-feature interactions and redundancy for feature ranking in mixed datasets," which he co-authored along with other researchers from Bosch and the Hasso Plattner Institute in Germany. For links to the papers mentioned above and more information on this and previous meetups, or to get registered for upcoming meetups, visit twimlai.com/meetup!
https://youtu.be/W5-fYQ4bvVc
Paper - Including multi-feature interactions and redundancy for feature ranking in mixed datasets - Slides
Mentioned in the Community Segment:
- Jeremy Howard's upcoming book, The Mechanics of Machine Learning
- The Fastai v1 Deep Learning Framework with Jeremy Howard
- Jeremy Howard's panel discussion at the PyTorch Conference (starts at 1:18:47)
- Google Dataset Search Engine
- CV Data Search Engine
- Google Earth Engine
This video is a recap of our Fall Fast.ai x TWIML Online Deep Learning Study Group. In this session, we review lesson two, Convolutional Neural Nets. It’s not too late to join the study group. Just follow these three simple steps: Sign up for the TWIML Online Meetup, noting fast.ai in the “What you hope to learn” box. Use the email invitation you’ll receive to join our Slack group. If you don’t receive it within a few minutes, check your spam folder. Once you’re in Slack, join the #fast_ai channel and hop over to #intros as well and introduce yourself. Use the link posted in the #meetup slack channel to add our events to your calendar. SUBSCRIBE AND TURN ON NOTIFICATIONS
This video is a recap of the first session of our fall Fast.ai Deep Learning course study group. This week we review lesson one of the course, Recognizing Cats and Dogs. It’s not too late to join the study group. Just follow these three simple steps: Sign up for the TWIML Online Meetup, noting fast.ai in the “What you hope to learn” box. Use the email invitation you’ll receive to join our Slack group. If you don’t receive it within a few minutes, check your spam folder. Once you’re in Slack, join the #fast_ai channel and hop over to #intros as well and introduce yourself. Use the link posted in the #meetup slack channel to add our events to your calendar. SUBSCRIBE AND TURN ON NOTIFICATIONS
Bits & Bytes

Researchers develop AI to detect musical mood. Deezer researchers have developed a deep learning system which can identify the mood and intensity of songs based on audio and lyrics.

Microsoft announces automated machine learning service. The new service aims to identify the best machine learning pipeline for the user's labeled data. Automated ML is integrated with Azure Machine Learning and includes an SDK for integration with Python development environments including Visual Studio Code, PyCharm, Azure Databricks notebooks, and Jupyter notebooks.

Microsoft contributes $40 million for humanitarian AI. Microsoft launched a new $40-million program aimed at harnessing the power of AI for humanitarian action. The program targets causes such as disaster recovery, helping children, and protecting refugees.

Microsoft features Shell industrial AI use cases. Early in his Ignite keynote, Microsoft CEO Satya Nadella highlighted work performed at Shell by Microsoft and Bonsai, which Microsoft acquired earlier this year. The joint work aims to enhance safety by applying AI and IoT in a number of areas, including at retail gas stations.

DeepMind outlines Technical AI Safety program. Interesting post by Google DeepMind researchers outlining the key tenets of their AI safety research program: specification, robustness, and assurance.

AI's new muse: our sense of smell. Artificial neural networks have been loosely inspired by the brain, specifically the visual cortex. This article describes recent work being done to draw inspiration from our olfactory circuits as well.

Dollars & Sense

Netradyne, which applies AI to driver and fleet safety, has raised Series B funding of $21 million

Olis Robotics announced the acquisition of White Marsh Forests, a machine learning startup based in Seattle

Slack announced that it has acquired Astro, a messaging startup that applied AI to email

Marketing agency Impact Group acquired ad platform startup Clue to help connect retail brands to consumers

Sign up for our Newsletter to receive the Bits & Bytes weekly to your inbox.
In today's episode we'll be taking a break from our Strata Data conference series and presenting a special conversation with Jeremy Howard, founder and researcher at Fast.ai. Fast.ai is a company many of our listeners are quite familiar with due to their popular deep learning course. This episode is being released today in conjunction with the company's announcement of version 1.0 of their fastai library at the inaugural PyTorch DevCon in San Francisco. Jeremy and I cover a ton of ground in this conversation. Of course, we dive into the new library and explore why it's important and what's changed. We also explore the unique way in which it was developed and what it means for the future of the fast.ai courses. Jeremy shares a ton of great insights and lessons learned in this conversation, not to mention a bunch of really interesting-sounding papers. Don't forget to join our community of machine learning enthusiasts, including our study groups for the fast.ai courses, at twimlai.com/meetup.
Bits & Bytes

IBM launches tool aimed at detecting AI bias. IBM Research has launched the AI Fairness 360 Kit to scan for signs of AI bias and recommend adjustments. The open source Python package contains nine different algorithms, developed by the broader algorithmic fairness research community, to mitigate unwanted bias.

Microsoft adds TensorFlow scoring to ML.NET. The company has added TensorFlow model scoring to version 0.5 of its ML.NET open source machine learning framework, enabling the use of existing TensorFlow models in ML.NET experiments.

Tencent announces new AI services. The new Tencent Open AI Platform, called "AI.QQ.COM," aims to build a services ecosystem leveraging Tencent's various AI capabilities. The platform makes more than 100 AI APIs available to industry.

Accenture introduces new healthcare bots. The virtual-assistant bots "Ella" and "Ethan" join the Accenture Intelligent Patient Platform to make intelligent recommendations for interactions between life sciences companies, patients, health care providers, and caregivers.

Miso Robotics enhances AI platform to include frying skills. The cloud-connected Miso AI platform now enables Flippy, the company's autonomous robotic kitchen assistant, to perform frying tasks in addition to grilling.

Dollars & Sense

Ontario-based DarwinAI, which develops tools for optimizing deep neural nets, raised $3 million, led by Obvious Ventures and iNovia Capital

Leena AI, an HR bot startup, has raised $2 million in a seed funding round from a group of Silicon Valley investors

Oxbotica, an Oxford, UK-based autonomous vehicle software company, completed a £14m funding round

Orbital Insight announced its acquisition of Boston-based FeatureX, which specializes in computer vision for satellite imagery

Sign up for our Newsletter to receive the Bits & Bytes weekly to your inbox.
This video is a recap of our September 2018 Americas TWIML Online Meetup. In this month's community segment we discuss the upcoming topics for both the EMEA and Americas meetup groups, along with our recently started Fast.AI study group. We also briefly discuss episode #180 of the podcast, which featured Nick Bostrom, Professor and author of the book Superintelligence. Finally, Sam shares some interesting blog posts. In our presentation segment, David Clement leads us in a breakdown of the paper "DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills." For links to the papers mentioned above and more information on this and previous meetups, or to get registered for upcoming meetups, visit twimlai.com/meetup!
https://youtu.be/RLa5XqH36c8
Paper: DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills
The What-If Tool: Code-Free Probing of Machine Learning Models
Help! I can't reproduce a machine learning project!
SQL Query Optimization Meets Deep Reinforcement Learning
The Trinity Of Errors In Financial Models: An Introductory Analysis Using TensorFlow Probability
A few weeks ago I had the opportunity to spend some time at the Strata Data Conference, presented by O'Reilly and Cloudera, in New York. While on site, I was able to meet with some of the speakers from the conference. In this series, you'll hear a few of those great conversations.
Bits & Bytes

Executive change at Google Cloud AI. Google Cloud AI head Dr. Fei-Fei Li has left the organization and will be returning to her professorship at Stanford. Dr. Andrew Moore, Dean of the School of Computer Science at CMU, will replace her by year end.

Cisco unveils server for artificial intelligence and machine learning. Cisco has launched new servers aimed at speeding up deep learning workloads.

Facebook's 'Rosetta' can extract text from a billion images daily. Facebook has developed a machine learning system called Rosetta for contextual extraction of text in images. The system supports image search and will also help Facebook identify inappropriate or harmful content.

NVIDIA launches new data center inference platform. NVIDIA launched "TensorRT Hyperscale," offering inference acceleration for voice, video, image, and recommendation services. The new platform features NVIDIA Tesla T4 GPUs based on the Turing architecture.

Facebook developed an AI-based debugging tool. The tool, called "SapFix," aims to help programmers by finding and fixing software bugs automatically.

Google open sources 'What-If Tool' for code-free ML experiments. Google's AI research team has developed the What-If Tool, a new TensorBoard feature allowing users to analyze an ML model without writing code. The tool offers a visual interface for exploring different model results.

Dollars & Sense

Syllable.ai, which offers a healthcare chat platform, raises $13.7M

Integrate.ai, a Toronto-based AI software company, secures $30 million in Series A funding

Microsoft announced the acquisition of San Francisco-based Lobe, whose slick demo of code-free deep learning swept the Twitters a few months ago

Deloitte announced its acquisition of Magnetic Media Online's artificial intelligence platform business. Magnetic is a marketing technology company headquartered in New York City

Sign up for our Newsletter to receive the Bits & Bytes weekly to your inbox.
This video is a recap of our September 2018 EMEA TWIML Online Meetup. In this month's community segment we take a brief look at remote sensing and autoencoders for weather tracking. We also discuss the status of upcoming Fast.AI study groups, including groups looking to re-engage with Part 1, groups starting Part 2, and the release of the updated Version 2 (Lesson 1). In our presentation segment, Kai Lichtenberg leads us in a breakdown of Geoffrey Hinton's CapsNets paper. Topics covered include:
- What's wrong with CNNs
- Why our brain is probably doing "inverse graphics"
- What is a capsule?
- CapsNet and dynamic routing
https://youtu.be/G7lvnt1eRjw
For links to the papers mentioned above and more information on this and previous meetups, or to get registered for upcoming meetups, visit twimlai.com/meetup!
Paper: Dynamic Routing Between Capsules
Paper: Matrix Capsules with EM Routing
This video is a recap of our Fast.ai x TWIML Online Study Group. In this session, we review part two of lesson seven, Resnets from Scratch. It’s not too late to join the study group. Just follow these simple steps: Sign up for the TWIML Online Meetup, noting fast.ai in the “What you hope to learn” box. Use the email invitation you’ll receive to join our Slack group. If you don’t receive it within a few minutes, check your spam folder. Once you’re in Slack, join the #fast_ai channel and hop over to #intros as well and introduce yourself. Use the link posted in the #meetup slack channel to add our events to your calendar.
This video is a recap of our Fast.ai x TWIML Online Study Group. In this session, we review part one of lesson seven, Resnets from Scratch. It’s not too late to join the study group. Just follow these simple steps: Sign up for the TWIML Online Meetup, noting fast.ai in the “What you hope to learn” box. Use the email invitation you’ll receive to join our Slack group. If you don’t receive it within a few minutes, check your spam folder. Once you’re in Slack, join the #fast_ai channel and hop over to #intros as well and introduce yourself. Use the link posted in the #meetup slack channel to add our events to your calendar.
This video is a recap of our Fast.ai x TWIML Online Study Group. In this session, we review part two of lesson six, Interpreting Embeddings; RNNs from Scratch. It’s not too late to join the study group. Just follow these simple steps: Sign up for the TWIML Online Meetup, noting fast.ai in the “What you hope to learn” box. Use the email invitation you’ll receive to join our Slack group. If you don’t receive it within a few minutes, check your spam folder. Once you’re in Slack, join the #fast_ai channel and hop over to #intros as well and introduce yourself. Use the link posted in the #meetup slack channel to add our events to your calendar.
This video is a recap of our Fast.ai x TWIML Online Study Group. In this session, we review part one of lesson six, Interpreting Embeddings; RNNs from Scratch. It’s not too late to join the study group. Just follow these simple steps: Sign up for the TWIML Online Meetup, noting fast.ai in the “What you hope to learn” box. Use the email invitation you’ll receive to join our Slack group. If you don’t receive it within a few minutes, check your spam folder. Once you’re in Slack, join the #fast_ai channel and hop over to #intros as well and introduce yourself. Use the link posted in the #meetup slack channel to add our events to your calendar.
This video is a recap of our Fast.ai x TWIML Online Study Group. In this session, we review part two of lesson five, Collaborative Filtering; Inside the Training Loop. It’s not too late to join the study group. Just follow these simple steps: Sign up for the TWIML Online Meetup, noting fast.ai in the “What you hope to learn” box. Use the email invitation you’ll receive to join our Slack group. If you don’t receive it within a few minutes, check your spam folder. Once you’re in Slack, join the #fast_ai channel and hop over to #intros as well and introduce yourself. Use the link posted in the #meetup slack channel to add our events to your calendar.
Bits & Bytes

Diffbot launches knowledge graph as-a-service. The startup, whose roots are in web scraping, applied machine learning, computer vision, and natural language processing to create a database of 'all the knowledge of the Web,' spanning over 10 billion entities and 1 trillion facts.

Automatic transliteration helps Alexa find data across language barriers. Amazon researchers have developed a multilingual "named-entity transliteration system" to help Alexa overcome language barriers in multilingual environments.

Oracle open sources GraphPipe for model deployment. Though Oracle has a strained relationship with open source, it recently released a new open source tool called GraphPipe, designed to simplify and standardize the deployment of machine learning models.

Google turns datacenter cooling controls over to AI. Google was already using AI to optimize data center energy efficiency. Now it has handed over complete control of data center cooling to AI. Instead of humans implementing AI-generated recommendations, the system is now directly controlling data center cooling.

IBM researchers propose 'factsheets' for AI transparency. Expanding on ideas like the Datasheets for Datasets paper I discussed previously, an IBM Research team has suggested a factsheet-based approach for AI developers to ensure transparency.

Facebook and NYU researchers speed up MRI scans with AI. Facebook announced the fastMRI project, in collaboration with NYU, which aims to apply AI to accelerate MRI scans by up to 10 times.

Google releases Dopamine reinforcement learning research framework. Google announced the new TensorFlow-based framework, which aims to provide flexibility, stability, and reproducibility for new and experienced RL researchers.

Baidu launches EZDL, a coding-free deep learning platform. The Chinese firm launched Baidu EZDL, an online tool enabling anyone to build, design, and deploy models without writing code.

Dollars & Sense

Canvass Analytics, a Toronto-based provider of AI-enabled predictive analytics for IIOT, raised $5M in funding

Cloudalize, a cloud platform for running GPU-accelerated applications, has secured a €5 million funding round

Intel announced that it is buying Vertex.ai, a startup developing a platform-agnostic model suite, for an undisclosed amount

Zscaler announced that it has acquired AI and ML technology and the development team of stealth security startup TrustPath

New Knowledge, an Austin-based cybersecurity company that protects corporations from covert, coordinated disinformation campaigns, raised $11M in Series A funding

Phrasee, a London-based marketing technology company that uses AI to generate optimized marketing copy, closed a $4m Series A funding round

Sign up for our Newsletter to receive the Bits & Bytes weekly to your inbox.
Bits & Bytes

IBM Research presents 'DeepLocker,' AI-powered malware. IBM researchers have developed DeepLocker, a new breed of highly targeted and evasive attack tools powered by AI. The malware remains dormant until identifying its target through indicators like facial recognition, geolocation, and voice recognition. The project is meant to raise awareness of the possibilities when AI is combined with current malware techniques.

Microsoft updates ML.NET machine learning framework. Microsoft has updated ML.NET, its cross-platform, open-source machine learning framework for .NET developers. This initial preview release of ML.NET enables simple tasks like classification and regression and brings the first draft of .NET APIs for training and inference. The framework can be extended to add support for popular ML libraries like TensorFlow, Accord.NET, and CNTK.

Tesla dumps Nvidia, goes it alone on AI hardware. Tesla confirmed that it has developed its own processor for AutoPilot and other on-vehicle AI workloads. The company mentioned that the new processor is 10 times faster than the one it was using from Nvidia, and the linked article's author estimates the company could save billions by developing its own.

Doc.ai and Anthem to introduce health data trial powered by AI and blockchain. The companies have partnered to launch an AI data trial on the blockchain. doc.ai will attempt to identify models for predicting allergies based upon collected phenome (e.g., age, height, weight), exposome (e.g., exposure to weather/pollution based on location), and physiome (e.g., physical activity, daily steps) data.

NetApp and NVIDIA launch new deep learning architecture. NetApp introduced the NetApp® ONTAP® AI architecture, powered by NVIDIA DGX™ supercomputers and NetApp AFF A800 cloud-connected all-flash storage, to simplify, speed, and scale the deep learning data pipeline.

Getty Images launches AI tool for media publishers. Getty Images launched Panels by Getty Images, a new artificial intelligence tool for media publishers that recommends the best visual content to accompany a news article.

Dollars & Sense

Malong Technologies, a Shenzhen, China-based computer vision startup, received a minority investment from Accenture

Test.ai, which applies AI to the challenge of testing apps, has raised an $11 million Series A round led by Gradient Ventures

Skyline AI, a New York-based real estate invest-tech firm using AI and data science, raised $18M in Series A funding

DefinedCrowd, which provides crowd-sourced data for training AI, has raised an $11.8 million funding round led by Evolution Equity Partners

Dbrain, whose platform links crowdworkers and data scientists via the blockchain, has raised a $3 million funding round

Racetrack.ai, which applies AI to accelerating business sales, has raised $5 million in a pre-Series A round

Sign up for our Newsletter to receive the Bits & Bytes weekly to your inbox.
Bits & Bytes

Google announced a bunch of interesting ML/AI-related news at last week's Next conference. Here are the highlights, along with a few other tidbits.

Google launches new AI-powered contact center solution. The global market for cloud-based contact center solutions is expected to exceed $30B by 2023. It's no surprise that Google wants a piece of this, and to that end it launched the Contact Center AI alpha. The new offering combines Google's Dialogflow chat platform with other AI technologies, such as agent assist and a conversational topic modeler, to help customers reduce wait times, improve customer satisfaction, and gain greater insights. A host of technology and services partners was announced as well.

Furthering its edge initiatives, Google releases new Cloud IoT Edge. Cloud IoT Edge includes Edge IoT Core, which facilitates the connection of edge devices to the Google Cloud and simplifies their management, and Edge ML, which supports running pre-trained TensorFlow Lite models on edge hardware. Cloud IoT Edge is designed to take advantage of the newly announced Edge TPU as well (see below). (A rough sketch of TensorFlow Lite inference follows after this post.)

Google unveils new AI chips for edge machine learning. Google is bringing its TPU accelerator chips from the cloud to the edge with the launch of Edge TPU, currently in early access. Aiming to compete with offerings like the Nvidia Jetson and Intel Movidius product families, Edge TPU brings high-performance ML inference to small, power-constrained devices.

Google adds Natural Language and Translation services to the Cloud AutoML family. I covered the launch of Google Cloud AutoML Vision in the newsletter earlier this year. Last week Google pulled back the covers on new AutoML services for natural language classification and translation. Skip the press releases though and check out Rachel Thomas' great series of posts on these new tools.

For more from Google and Next, check out these roundups of all announcements and analytics/ML announcements.

Dollars & Sense

Snap40, which uses ML/AI for remote patient monitoring, has secured US$8 million in seed financing

Zorroa, which provides a platform for managing visual assets, has closed a $7M funding round

Shanghai-based Wayz.ai, a smart location and mapping start-up (not to be confused with Waze), announced that it has raised a US$80 million Series A

Unisound, a Chinese AI solutions provider specialized in voice recognition and language processing, has received RMB600 million ($89 million) in Series C-plus funding

Sign up for our Newsletter to receive the Bits & Bytes weekly to your inbox.
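As a point of reference for the Edge ML item above, running a pre-trained TensorFlow Lite model from Python looks roughly like the following. The model file is a placeholder, and this sketch shows only the generic TFLite interpreter flow, not anything specific to Cloud IoT Edge or the Edge TPU; note that in older TensorFlow 1.x releases the interpreter lived under tf.contrib.lite rather than tf.lite.

```python
# Rough sketch of running a pre-trained TensorFlow Lite model, the kind of
# workload Edge ML targets. "model.tflite" is a placeholder for any converted model.
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed a dummy tensor matching the model's declared input shape and dtype.
dummy = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], dummy)
interpreter.invoke()

result = interpreter.get_tensor(output_details[0]["index"])
print(result.shape)
```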
Bits & Bytes

Elon Musk, DeepMind co-founders promise never to make killer robots. The founders have signed on to the Future of Life Institute's pledge not to develop, manufacture, or use killer robots, which was published at the annual International Joint Conference on Artificial Intelligence in Stockholm, Sweden.

Huawei plans AI chips to rival Nvidia, Intel. The company is reportedly developing AI chips for both networking equipment and the datacenter in an effort to strengthen its position in the growing AI market and to compete with the likes of Nvidia and Intel.

Let the sniping continue. Facebook has hired Shahriar Rabii to lead its chip initiative. Rabii previously worked at Google, where he helped lead the team in charge of building the Visual Core chip for the company's Pixel devices. Apple has appointed former Google AI exec John Giannandrea to lead a new artificial intelligence and machine learning team, to include the Siri unit.

Interesting projects. Researchers at Nvidia, MIT, and Aalto University presented an approach to automatically removing noise, grain, and even watermarks from photos at ICML. A Google researcher, along with collaborators from academia, has developed a deep learning-based system for identifying protein crystallization, achieving a 94% accuracy rate and potentially improving the drug discovery process by making it easier to map the structures of proteins. Google revealed "Move Mirror," an ML experiment that matches users' poses with images of other people in the same pose.

Dollars & Sense

R4 Technologies, a Ridgefield, Connecticut-based AI startup created by Priceline.com founders and executives, secured $20m in Series B funding

Cambridge-based SWIM.AI, which provides edge intelligence software for IoT applications, announced $10 million in Series B funding

Viz.ai, a company applying AI in healthcare, secured $21 million in Series A funding

Computer vision technology provider AnyVision announced that it has secured $28 million in Series A financing

Salesforce has signed a definitive agreement to acquire Datorama, an AI-powered marketing intelligence platform

Workday announced that it has acquired Stories.bi, which uses AI to automate analytics and generate natural language business stories

Robotic retail inventory specialist Bossa Nova announced the acquisition of AI video surveillance company HawXeye

Self-driving car company Pony.ai raised $102 million, putting it close to a billion-dollar valuation

Box announced that it has acquired Butter.ai, a startup focused on cross-silo enterprise search

DataRobot announced that it has acquired Nexosis, an ML platform company whose founders we interviewed in TWIML Talk #69

Accenture has acquired Kogentix to strengthen Accenture Applied Intelligence's growing data engineering business

Sign up for our Newsletter to receive the Bits & Bytes weekly to your inbox.
This video is a recap of our July 2018 TWIML Online Meetup. In this month's community segment we look at the ongoing Fast.ai Study Group, the upcoming meetup presenter schedule, the recent Glow paper from the folks at OpenAI, and entity embeddings. In our presentation segment, Nicholas Teauge leads us in a discussion on the paper Quantum Machine Learning by Jacob Biamonte et al, which explores how to devise and implement concrete quantum software that outperforms classical computers on machine learning tasks. For links to the papers mentioned above and more information on this and previous meetups, or to get registered for upcoming meetups, visit twimlai.com/meetup! https://youtu.be/-ftniM7248I Paper: Quantum Machine Learning OpenAI Glow
This video is a recap of our Fast.ai x TWIML Online Study Group. In this session, we review part one of lesson five, Collaborative Filtering; Inside the Training Loop. It’s not too late to join the study group! Just follow these three simple steps: Sign up for the TWIML Online Meetup, noting fast.ai in the “What you hope to learn” box. Use the email invitation you’ll receive to join our Slack group. If you don’t receive it within a few minutes, check your spam folder. Once you’re in Slack, join the #fast_ai channel and hop over to #intros as well and introduce yourself. Use the link posted in the #meetup slack channel to add our events to your calendar. SUBSCRIBE AND TURN ON NOTIFICATIONS
This video is a recap of our Fast.ai x TWIML Online Study Group. In this session, we review lesson four, Structured, Time Series, & Language Models. It’s not too late to join the study group! Just follow these three simple steps: Sign up for the TWIML Online Meetup, noting fast.ai in the “What you hope to learn” box. Use the email invitation you’ll receive to join our Slack group. If you don’t receive it within a few minutes, check your spam folder. Once you’re in Slack, join the #fast_ai channel and hop over to #intros as well and introduce yourself. Use the link posted in the #meetup slack channel to add our events to your calendar. SUBSCRIBE AND TURN ON NOTIFICATIONS
In this episode I'm joined by Amir Zamir, Postdoctoral researcher at both Stanford & UC Berkeley. Amir joins us fresh off of winning the 2018 CVPR Best Paper Award for co-authoring "Taskonomy: Disentangling Task Transfer Learning." In this work, Amir and his coauthors explore the relationships between different types of visual tasks and use this structure to better understand the types of transfer learning that will be most effective for each, resulting in what they call a "computational taxonomic map for task transfer learning." In our conversation, we discuss the nature and consequences of the relationships that Amir and his team discovered, and how they can be used to build more effective visual systems with machine learning. Along the way Amir provides a ton of great examples and explains the various tools his team has created to illustrate these concepts.
Bits & Bytes

AI around the world. This interesting post summarizes the national AI strategies of the 15 nations that have formally published them.

Baidu unveils AI chipset. Baidu launches Kunlun, China's first "cloud-to-edge" AI chips. The chips were built to accommodate a variety of AI scenarios, such as voice recognition, search ranking, natural language processing, autonomous driving, and large-scale recommendations, and can be applied to both cloud and edge scenarios, including data centers, public clouds, autonomous vehicles, and other devices. Kunlun includes distinct chips for both training and inference.

New research explores identification of Photoshopped images. At the recent CVPR conference, University of Maryland researchers presented a method for identifying edited pictures using deep learning.

Stanford AI recreates periodic table. Applying techniques borrowed from NLP, Stanford researchers created "Atom2Vec." The tool analyzed a list of chemical compound names from an online database and proceeded to re-create the periodic table of elements in a few hours.

'AI eating software' roundup. Silicon design tools vendor NetSpeed incorporates AI features into its new SoCBuilder design and integration platform. AMFG launches a new AI software platform for industrial 3D printing. Energy industry ERP provider Quorum Software adds a new cognitive services layer providing intelligent ingest, compliance, and reporting capabilities.

Dollars & Sense

Facebook has acquired the team behind Bloomsbury AI, a London firm specializing in using ML to understand natural language documents

JDA Software to acquire Blue Yonder, a provider of AI solutions for retail

D.C. startup QxBranch closed $4.1 million in Series A funding to develop analytics software for quantum computing

JASK, an Autonomous Security Operations Center (ASOC) platform provider, announced that it has raised $25M in Series B funding

Suzhou city-based AISpeech announced $76 million in Series D funding, bringing its total funding to over $121 million

Tact.ai, a conversational AI sales platform, announced its $27M Series C raise, bringing total funding to more than $57M

Cybersecurity firm Balbix raises $20 million in a Series B round led by Singtel Innov8

Ping Identity acquires Elastic Beam, a cybersecurity startup using artificial intelligence to monitor and protect APIs

Precision Therapeutics, a company focused on AI-based personalized medicine and drug discovery, announced its merger with Helomics

Sign up for our Newsletter to receive the Bits & Bytes weekly to your inbox.
This video is a recap of our Fast.ai x TWIML Online Study Group. In this session, we review lesson three, Improving Your Image Classifier. It’s not too late to join the study group! Just follow these three simple steps: Sign up for the TWIML Online Meetup, noting fast.ai in the “What you hope to learn” box. Use the email invitation you’ll receive to join our Slack group. If you don’t receive it within a few minutes, check your spam folder. Once you’re in Slack, join the #fast_ai channel and hop over to #intros as well and introduce yourself. Use the link posted in the #meetup slack channel to add our events to your calendar. SUBSCRIBE AND TURN ON NOTIFICATIONS
Bits & Bytes

IBM hosts first AI-human debate. In a publicity stunt in the spirit of Deep Blue's match with Garry Kasparov, or IBM Watson's appearance on Jeopardy, IBM hosted the first ever live public debate between its Project Debater AI and a human in San Francisco last week. The company has published several datasets and technical papers outlining various components of the system.

Nvidia publishes "Super SloMo" for transforming standard video into slow motion. At last week's CVPR, NVIDIA researchers presented research into a deep learning model for interpolating between video frames to produce slow-motion video from a standard 30-frame-per-second video.

Amazon SageMaker now supports PyTorch and TensorFlow 1.8. In a recent update, Amazon SageMaker has added support for PyTorch deep learning models. I'm now wondering if the fast.ai course and library can be completed on SageMaker. AWS is also now supporting the latest stable TensorFlow versions.

Microsoft to acquire Bonsai, one of my favorite AI companies. Berkeley-based Bonsai, a client of mine and sponsor of last year's Industrial AI series, offers a deep reinforcement learning platform for enterprise AI. I'm super excited for them and looking forward to seeing how things evolve now that they'll be part of Microsoft.

Tracking the state-of-the-art (SOTA) in NLP. Researcher Sebastian Ruder has put together an interesting project to track the SOTA of a variety of problems in natural language processing.

Dollars & Sense

AI-powered fitness startup Vi raises $20 million

Falkonry, a provider of machine learning software for manufacturing, raised $4.6 million in Series A funding

Prifender, whose software uses AI to map PII in enterprise data systems, raised a $5M seed round

AI-as-a-service startup Noodle.ai announced a $35 million round led by Dell Technologies

WalkMe announced that it has acquired DeepUI, whose ML models seek to understand any software at the GUI level, without the need for an API

Twitter has agreed to buy San Francisco-based Smyte, which offers tools to stop online abuse, harassment, and spam, and protect user accounts

PayPal announced today that it has agreed to acquire Simility, a leading fraud prevention and risk management platform provider, for $120 million

Sign up for our Newsletter to receive the Bits & Bytes weekly to your inbox.
Perhaps especially appropriate given that much of the globe is glued to the World Cup at the moment, this week we're excited to kick off a series of shows on AI applications in the realm of sports. While I'm not personally the biggest sports fan, my producer Imari is a huge sports follower, and this series has been something he's wanted to see since we started working together. So, if you like these shows, be sure to hit him up on Twitter at @twiml_imari.
One of the pleasures of my frequent travel is the opportunity to meet with TWIML listeners at conferences and in their home cities. During one such informal meetup last year, I sat at a sidewalk cafe in Toronto sipping a coffee and chatting with a listener about his search for his next data science job. As we talked, it occurred to me that many of the listeners I’ve met have mentioned either looking for their next gig or recently switching jobs. Apparently, this isn’t just the folks I hang out with; a recent survey showed that nearly a quarter of data scientists changed jobs in 2017! I realized at that point that TWIML could provide a valuable resource to members of the community by connecting them with high-quality career opportunities, and the creation of a TWIML job board was added to our to-do list.
Why yet another job board? That’s a really good question. There are a ton of job boards, so what would be different about a TWIML job board? Well, first of all, most of them suck and we think we can do better. Here are some of the worst problems with current job boards:
Shadow jobs. It turns out that a lot of the jobs on job boards don’t actually exist. They’re scraped by aggregators with no relation to the hiring companies, maybe months ago, to create the illusion of breadth and drive SEO traffic. Or maybe they’re posted just to collect resumes for future use. They’re rarely vetted for existence or quality.
Black-hat tactics. Unscrupulous agency recruiters have a bad reputation among technical professionals. Part of the reason is the use of “black hat” techniques like misleading job descriptions or bait-and-switch tactics that frustrate job seekers, waste their time, and lead them on.
Poor differentiation. Too often, listings feel generic, not offering enough detail to help job seekers understand the role or the hiring organization, or to differentiate it from other positions or companies.
By accepting only quality listings from vetted hiring organizations, we’ll eliminate the worst of these problems. We’ll work with hiring companies to create better listings, and we’ll continue to innovate over time to improve the way our community members gain access to the best ML and AI jobs.
We’re excited about the opportunity to help the TWIML community advance their careers. Right now we’re reaching out to hiring organizations to identify our first set of partners. If you’re a hiring manager or inside recruiter and you’re looking to hire some of the best ML and AI talent in the industry, send me an email and let’s talk. And if you’re actively or passively job hunting, I’d also love to hear from you in the comments below on what you’re looking for and how we might help.
Sign up for our Newsletter to receive this weekly to your inbox.
Bits & Bytes
DeepMind AI needs just a few images to construct a 3D model. Google DeepMind has developed the Generative Query Network, a new algorithm that can render simple 3-dimensional scenes from static images.
Amazon ships DeepLens; adds support for TensorFlow and Caffe. AWS DeepLens is now shipping to a developer near you. The company has added some new capabilities, including TensorFlow and Caffe support. DeepLens also now supports the Deconvolution, L2Normalization, and LRN layers provided by MXNet.
Oracle advances AI effort to automate DevOps. Oracle announced the availability of its new Oracle Cloud Platform services, which tout new AI and machine learning capabilities. The new platform aims to automate operational, security, and recovery tasks for cloud users.
Dollars & Sense
Eigen Technologies, an applied AI company focused on document data mining for finance, law, and professional services, raised a £13m ($17.5m) Series A
Care robotics firm Embodied raised a $22 million Series A
Wave Computing, maker of a deep learning hardware appliance, announced that it has acquired MIPS Tech
Tableau Software has acquired Empirical Systems, a software company seeking to automate analytics
AR/VR studio Recall Studios acquired NLP tools provider Evolution AI
Sign up for our Newsletter to receive the Bits & Bytes weekly to your inbox.
This video is a recap of our June 2018 TWIML Online Meetup. In this month’s community segment, we briefly cover the use of machine learning in agriculture use cases, a recent Google “AI Principles” blog post, and recently released research from our friends at OpenAI: Improving language understanding with unsupervised learning. In our presentation segment, Kelvin Ross, director with IntelliHQ, a healthcare AI firm based in Queensland, Australia, joins us to present the paper Cardiologist-Level Arrhythmia Detection with Convolutional Neural Networks, which describes an algorithm developed by Stanford researchers that has been able to exceed human expert performance in identifying cardiac arrhythmias from raw ECG readings. We also covered the broader issues associated with data capture and labelling, as well as longer-term heart-rate variability, which can be used as a predictor or early warning for sepsis, fatigue, shock, concussion, heart attack, stroke, and more.
https://youtu.be/k3cCgh5WZ7I
For links to the papers mentioned above and more information on this and previous meetups, or to get registered for upcoming meetups, visit twimlai.com/meetup!
Cardiologist-Level Arrhythmia Detection with Convolutional Neural Networks Presentation Slides
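The Stanford model discussed in the presentation isn’t reproduced here, but for readers curious about the general shape of the approach (a convolutional network operating directly on raw ECG samples), here is a minimal, hypothetical PyTorch sketch. The window length, layer sizes, and class list are illustrative assumptions, not the paper’s actual architecture.

```python
# Minimal, hypothetical sketch -- NOT the Stanford model from the paper.
# A small 1D CNN that maps fixed-length windows of raw single-lead ECG
# samples to rhythm classes. Window length, channel counts, and the class
# count are illustrative assumptions.
import torch
import torch.nn as nn

NUM_CLASSES = 4     # e.g. normal / AF / other / noise -- assumed label set
WINDOW_LEN = 2560   # a few seconds of ECG at a few hundred Hz -- assumed

class ECGConvNet(nn.Module):
    def __init__(self, num_classes: int = NUM_CLASSES):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=16, stride=2, padding=7),
            nn.BatchNorm1d(32),
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=16, stride=2, padding=7),
            nn.BatchNorm1d(64),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # collapse the time axis to one value per channel
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):                  # x: (batch, 1, WINDOW_LEN)
        h = self.features(x).squeeze(-1)   # (batch, 64)
        return self.classifier(h)          # per-window class logits

model = ECGConvNet()
logits = model(torch.randn(8, 1, WINDOW_LEN))  # dummy batch of ECG windows
print(logits.shape)                            # torch.Size([8, 4])
```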
This video is a recap of our Fast.ai x TWIML Online Study Group. In this session, we review lesson two, Convolutional Neural Nets. Topics covered include:
• Learning Rate (LR)
• Data Augmentation
• Stochastic Gradient Descent (SGD) with Restart approach
• Using Test Time Augmentation
A minimal code sketch of two of these topics appears after this recap. It’s not too late to join the study group. Just follow these three simple steps:
1. Sign up for the TWIML Online Meetup, noting fast.ai in the “What you hope to learn” box.
2. Use the email invitation you’ll receive to join our Slack group. If you don’t receive it within a few minutes, check your spam folder.
3. Once you’re in Slack, join the #fast_ai channel, hop over to #intros to introduce yourself, and use the link posted in the #meetup Slack channel to add our events to your calendar.
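Two of the topics above are easy to illustrate in code. The sketch below is not the fast.ai library’s implementation; it is a minimal PyTorch example of SGD with cosine warm restarts (via torch.optim.lr_scheduler.CosineAnnealingWarmRestarts) and a simple averaged test-time-augmentation prediction. The toy model and dummy data are placeholders standing in for a real image classifier and dataset.

```python
# Minimal PyTorch sketch (not fast.ai's own implementation) of two ideas from
# the lesson: SGD with cosine warm restarts and test-time augmentation (TTA).
# The model and data below are toy placeholders.
import torch
import torch.nn as nn
import torchvision.transforms.functional as TF
from torch.utils.data import DataLoader, TensorDataset

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 10))  # toy stand-in
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
# Anneal the learning rate toward zero, then "restart" it every T_0 epochs.
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=1)

def train_one_epoch(train_loader, epoch):
    for i, (x, y) in enumerate(train_loader):
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        # Passing a fractional epoch lets the schedule anneal within the epoch.
        scheduler.step(epoch + i / len(train_loader))

def predict_with_tta(image):
    """Average predictions over simple augmented views of a single image."""
    views = [image, TF.hflip(image)]  # original + horizontal flip
    with torch.no_grad():
        probs = [model(v.unsqueeze(0)).softmax(dim=-1) for v in views]
    return torch.stack(probs).mean(dim=0)  # averaged class probabilities

# Toy data just to exercise the training loop and the TTA helper.
data = TensorDataset(torch.randn(32, 3, 224, 224), torch.randint(0, 10, (32,)))
train_one_epoch(DataLoader(data, batch_size=8), epoch=0)
print(predict_with_tta(torch.randn(3, 224, 224)).shape)  # torch.Size([1, 10])
```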
https://youtu.be/e9RwPTNyaDM
This video is a recap of the second session of our Fast.ai Deep Learning course study group. This week we review lesson one of the course, Recognizing Cats and Dogs. It’s not too late to join the study group. Just follow these three simple steps:
1. Sign up for the TWIML Online Meetup, noting fast.ai in the “What you hope to learn” box.
2. Use the email invitation you’ll receive to join our Slack group. If you don’t receive it within a few minutes, check your spam folder.
3. Once you’re in Slack, join the #fast_ai channel, hop over to #intros to introduce yourself, and use the link posted in the #meetup Slack channel to add our events to your calendar.
A couple of weeks ago I spent some time at the PegaWorld conference in Las Vegas. The theme of the conference was automation, particularly in service of the customer experience, and I had a great time seeing all the advancements coming into this field by way of machine learning and AI.