Data Augmentation and Optimized Architectures for Computer Vision with Fatih Porikli
EPISODE 635 | JUNE 26, 2023
About this Episode
Today we kick off our coverage of the 2023 CVPR conference joined by Fatih Porikli, a Senior Director of Technology at Qualcomm. In our conversation with Fatih, we covered quite a bit of ground, touching on a total of 12 papers and demos focused on topics like data augmentation and optimized architectures for computer vision. We explore advances in optical flow estimation networks, cross-modal and cross-stage knowledge distillation for efficient 3D object detection, and zero-shot learning via language models for fine-grained labeling. We also discuss generative AI advancements and computer vision optimizations for running large models on edge devices. Finally, we discuss objective functions, architecture design choices for neural networks, and the efficiency and accuracy improvements these papers' techniques bring to AI models.
About the Guest
Fatih Porikli
Qualcomm
Thanks to our sponsor Qualcomm AI Research
Qualcomm AI Research is dedicated to advancing AI to make its core capabilities — perception, reasoning, and action — ubiquitous across devices. Their work makes it possible for billions of users around the world to have AI-enhanced experiences on devices powered by Qualcomm Technologies. To learn more about what Qualcomm Technologies is up to on the research front, visit twimlai.com/qualcomm.
Resources
- Paper: Grounded Language-Image Pre-training (GLIP)
- Paper: DistractFlow: Improving Optical Flow Estimation Models via Realistic Distractions and Pseudo-Labeling
- Paper: Déjà vu: Regenerative Learning to Enhance Dense Prediction
- Paper: X3-KD: Cross-modal Cross-stage Cross-task Knowledge Distillation for 3D Object Detection
- Paper: Revisiting Random Convolutions for Single Domain Generalization
- Paper: EcoTTA: Memory-Efficient Continual Test-time Adaptation via Self-distilled Regularization
- Paper: 3D Part Segmentation with Pretrained Vision-Language Models
- Paper: ReDirTrans: Latent-to-Latent Translation for Gaze and Head Redirection
- Paper: Dense Network Expansion for Class Incremental Learning
- Paper: Efficient Guided Attention for Self-Supervised Multi-Camera Depth Estimation
- Paper: DIFT: Dynamic Iterative Field Transforms for Memory Efficient Optical Flow
- Paper: QuickSRNet
- Paper: Neural Transformation Network to Generate Diverse Views for Contrastive Learning
- CVPR 2023 Mobile AI Workshop
- L3D-IVU 2023
- Optical Flow Estimation, Panoptic Segmentation, and Vision Transformers with Fatih Porikli - #579

