AI Agents: Substance or Snake Oil with Arvind Narayanan
EPISODE 704 | OCTOBER 7, 2024
About this Episode
Today, we're joined by Arvind Narayanan, professor of Computer Science at Princeton University, to discuss his recent works: the paper “AI Agents That Matter” and the book AI Snake Oil. In “AI Agents That Matter,” we explore the range of agentic behaviors, the challenges of benchmarking agents, and the “capability and reliability gap,” which creates risks when deploying AI agents in real-world applications. We also discuss the importance of verifiers as a technique for safeguarding agent behavior. We then dig into AI Snake Oil, which uncovers examples of problematic and overhyped claims about AI. Arvind shares various examples of failed AI applications, outlines a taxonomy of AI risks, and offers his perspective on AI’s catastrophic risks. Finally, we touch on different approaches to LLM-based reasoning, his views on tech policy and regulation, and his work on CORE-Bench, a benchmark designed to measure AI agents’ accuracy on computational reproducibility tasks.
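For listeners curious what the verifier idea from the conversation might look like in practice, here is a minimal, hypothetical Python sketch of a generate-then-verify loop: an agent proposes an action, and an independent check must pass before the action is released. Every name in it (propose_action, verify, run_with_verifier, MAX_ATTEMPTS) is illustrative, not taken from the paper.

```python
# Minimal, hypothetical sketch of a generate-then-verify loop for an AI
# agent. All names here are illustrative assumptions, not from the paper.

from dataclasses import dataclass
from typing import Optional

MAX_ATTEMPTS = 3  # bound retries so an unreliable agent cannot loop forever


@dataclass
class Action:
    description: str


def propose_action(task: str, attempt: int) -> Action:
    # Stand-in for an LLM agent call; a real agent would query a model here.
    return Action(description=f"attempt {attempt}: plan for {task!r}")


def verify(action: Action) -> bool:
    # Stand-in for a verifier: e.g., run the agent's output against unit
    # tests, validate a schema, or have a second model audit the action.
    return "drop table" not in action.description.lower()


def run_with_verifier(task: str) -> Optional[Action]:
    for attempt in range(1, MAX_ATTEMPTS + 1):
        action = propose_action(task, attempt)
        if verify(action):
            return action  # only verified actions are released downstream
    return None  # escalate to a human or fall back to a safe default


if __name__ == "__main__":
    print(run_with_verifier("summarize the quarterly sales data"))
```

The design point of this sketch is that the agent’s capability (proposing actions) is decoupled from the check that gates reliability (verification), which echoes the capability-and-reliability distinction raised in the episode.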
About the Guest
Arvind Narayanan
Princeton University
Resources
- AI Agents That Matter
- AI Snake Oil Newsletter
- AI Snake Oil: Start reading the AI Snake Oil book online
- Amazon: AI Snake Oil: What Artificial Intelligence Can Do, What It Can’t, and How to Tell the Difference
- CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark
- On the Societal Impact of Open Foundation Models
- Center for Information Technology Policy (CITP)
- The MNIST database of handwritten digits
- OpenAI o1
- The Web Robots Pages
- Competition-Level Code Generation with AlphaCode
- Abstraction and Reasoning Corpus for Artificial General Intelligence (ARC-AGI)
- Public Safety Assessment: A Risk Tool That Promotes Safety, Equity, and Justice | Arnold Ventures
- The Epic Sepsis Model Falls Short—The Importance of External Validation
- FTC Announces Crackdown on Deceptive AI Claims and Schemes
- Assessing the Risks of Open AI Models with Sayash Kapoor - #675