Reasoning Over Complex Documents with DocLLM with Armineh Nourbakhsh
EPISODE 672 | FEBRUARY 19, 2024
About this Episode
Today we're joined by Armineh Nourbakhsh of J.P. Morgan AI Research to discuss the development and capabilities of DocLLM, a layout-aware large language model for multimodal document understanding. Armineh provides a historical overview of the challenges of document AI and an introduction to the DocLLM model. She explains how this model, distinct from both traditional LLMs and document AI models, incorporates textual semantics as well as spatial layout when processing enterprise documents like reports and complex contracts. We dig into her team's approach to training DocLLM, their choice of a generative model over an encoder-based approach, the datasets they used to build the model, their approach to incorporating layout information, and the various ways they evaluated the model's performance.
About the Guest
Armineh Nourbakhsh
J.P. Morgan AI Research
Resources
- Paper: DocLLM: A layout-aware generative language model for multimodal document understanding
- Paper: DocGraphLM: Documental Graph Language Model for Information Extraction
- Paper: BizGraphQA: A Dataset for Image-based Inference over Graph-structured Diagrams from Business Domains
- Paper: Synthetic Document Generator for Annotation-free Layout Recognition
- BloombergGPT – an LLM for Finance with David Rosenberg - #639
- AI Research at JPMorgan Chase with Manuela Veloso - #371

