Chao Feng
I am a first-year CSE PhD student at the University of Michigan (UMich).
Email: chfeng at umich dot edu
Google Scholar / GitHub
Research
I'm interested in computer vision and multimodal learning. My publications are listed on Google Scholar.
GPS as a Control Signal for Image Generation
Chao Feng, Ziyang Chen, Aleksander Holynski, Alexei A. Efros, Andrew Owens
In submission
Coming soon!
Binding Touch to Everything: Learning Unified Multimodal Tactile Representations
Fengyu Yang*, Chao Feng*, Ziyang Chen*, Hyoungseob Park, Daniel Wang, Yiming Dou, Ziyao Zeng, Xien Chen, Rit Gangopadhyay, Andrew Owens, Alex Wong
CVPR, 2024
project page / paper
We introduce UniTouch, a unified tactile representation for vision-based tactile sensors that is aligned with multiple modalities. We show that powerful models trained on other modalities (e.g., CLIP, LLMs) can then be used to perform tactile sensing tasks zero-shot.
This&That: Language-Gesture Controlled Video Generation for Robot Planning
Boyang Wang, Nikhil Sridhar, Chao Feng, Mark Van der Merwe, Adam Fishman, Nima Fazeli, Jeong Joon Park
In submission
project page / paper
We introduce This&That, a framework that generates videos from text instructions and gestures for robot planning.
Vision-Flan: Scaling Human-Labeled Tasks in Visual Instruction Tuning
Zhiyang Xu, Chao Feng, Rulin Shao, Trevor Ashby, Ying Shen, Di Jin, Yu Cheng, Qifan Wang, Lifu Huang
ACL, 2024 (Findings)
project page / paper
We construct Vision-Flan, the most diverse publicly available visual instruction tuning dataset to date.
Self-Supervised Video Forensics by Audio-Visual Anomaly Detection
Chao Feng, Ziyang Chen, Andrew Owens
CVPR, 2023 (Highlight)
project page / arXiv / code
We learn several feature sets in a self-supervised manner from an audio-visual synchronization task, then apply an autoregressive model on top of each feature set to detect anomalies for video forensics.
AVA-AVD: Audio-Visual Speaker Diarization in the Wild
Eric Zhongcong Xu, Zeyang Song, Satoshi Tsutsui, Chao Feng, Mang Ye, Mike Zheng Shou
ACM Multimedia, 2022
project page / arXiv / code
We create the AVA Audio-Visual Diarization (AVA-AVD) dataset to develop diarization methods for in-the-wild videos.
Service
CVPR 2022/2024, WACV 2023, ACM MM 2023, ICCV 2023, ECCV 2024, NeurIPS 2024, ICRA 2025, ICLR 2025, AISTATS 2025, TPAMI.