Data Rights, Quantification and Governance for Ethical AI with Margaret Mitchell
EPISODE 572 | MAY 12, 2022
About this Episode
Today we close out our ICLR coverage, joined by Meg Mitchell, chief ethics scientist and researcher at Hugging Face. In our conversation with Meg, we discuss her participation in the WikiM3L Workshop, as well as her transition into her new role at Hugging Face, which has let her prioritize coding in her work on AI ethics. We explore her thoughts on the work happening in data curation and data governance, her interest in the inclusive sharing of datasets and the creation of models that don't disproportionately underperform on or exploit subpopulations, and how data collection practices have changed over the years.
We also touch on changes to data protection laws happening in some pretty uncertain places, the evolution of her work on Model Cards, and how she’s using this and recent Data Cards work to lower the barrier to entry for responsible, informed development and sharing of data.
About the Guest
Margaret Mitchell
Hugging Face
Resources
- Paper: Data and its (dis)contents: A survey of dataset development and use in machine learning research
- Paper: Whose Language Counts as High Quality? Measuring Language Ideologies in Text Data Selection
- Paper: “Everyone wants to do the model work, not the data work”: Data Cascades in High-Stakes AI
- Paper: Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus
- Paper: Towards Accountability for Machine Learning Datasets: Practices from Software Engineering and Infrastructure
- Paper: On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?
- Video: Workshop on Foundation Models
- Can Language Models Be Too Big? with Emily Bender and Meg Mitchell - #467
- Daring to DAIR: Distributed AI Research with Timnit Gebru - #568
- Big Science and Embodied Learning at Hugging Face with Thomas Wolf - #564
