About

Yunzhong Hou (侯云钟) is a research fellow at the Australian National University (ANU). Prior to that, he completed his PhD in Computer Science and Engineering at ANU in 2023 under the guidance of Prof. Liang Zheng, and his bachelor’s degree in electronic engineering at Tsinghua University in 2018.

He specializes in computer vision and deep learning, with current research interests in AI videography and photography, embodied and LLM agents, and 3D understanding and generation. Yunzhong is actively involved in the academic community, serving as a conference reviewer for CVPR, ECCV, ICCV, NeurIPS, AAAI, and ACM MM, and as a journal reviewer for IEEE TPAMI, IEEE TIP, and Nature Communications. He also serves as an area chair for ICASSP and ACM MM, and on the organizing committee of the WWW 2025 workshop on Multimedia Object Re-ID (MORE’25).

For more details, please find his CV here.

News

2025.07 Invited talk at Monash University, “From Aggregation to Planning: a Study on Camera Views”, hosted by Prof. Mehrtash Harandi.

2025.06 Two papers accepted to ICCV 2025! Big congrats to @Xingjian Leng and @Yuwei Yang.

2025.05 Named outstanding reviewer for CVPR 2025!

2025.04 Very glad to have chaired the WWW 2025 (the ACM Web Conference) workshop on Multimedia Object Re-ID (MORE’25 homepage)! Thank you to all the speakers and to those of you who joined us online or offline! Special thanks to all the collaborators! Recorded talks are available online: youtube, bilibili.

2025.02 Honored to serve as Area Chair for ACM MM 2025.

2024.12 Happy to announce our WWW 2025 (the ACM Web Conference) workshop on Multimedia Object Re-ID (MORE’25), where I’m serving on the organizing team. Open to submissions of up to four pages. homepage, submission site.

2024.12 Check out our latest work on drone videography, “Learning Camera Movement Control from Real-World Drone Videos”, where we take a different approach to AIGC and record the scene as is rather than creating it from scratch. paper, project page, code, Twitter.

2024.10 I am serving as an Area Chair for ACM Multimedia 2024 [full program]. An honor to serve as Session Chair and host Oral Session 13 - Machine Learning for Multimedia with Prof. Chang Xu. Excited to present our latest work on AI drone videography at the ACM MM 2024 Area Chair Workshop.

2024.09 Honored to serve as Area Chair for ICASSP 2025.

2024.07 Glad to present our latest work on AI drone videography at the International Research Workshop Data Science and AI & Robotics (DSAIR24) at the University of Canberra, following the invitation from Prof. Shuangzhe Liu.

2024.05 Our paper on color quantization and pixel art creation, “Scalable Deep Color Quantization: A Cluster Imitation Approach”, is accepted by IEEE Transactions on Image Processing. paper, code.

2024.02 Our paper on camera configuration optimization, “Learning to Select Views for Efficient Multi-View Understanding”, is accepted by CVPR 2024. See you in Seattle! paper, code.

2023.12 Our grant proposal “Privacy-Preserving Perception for Robotics” is awarded an HMI seed grant of 25,000 AUD! Big thank you to Dylan, Rahul, and Mike!

2023.12 Check out our latest research on camera layout optimization, “Optimizing Camera Configurations for Multi-View Pedestrian Detection”. arXiv.

2023.11 Our paper “View-Coherent Correlation Consistency for Semi-Supervised Semantic Segmentation” is accepted by Pattern Recognition. paper.

2023.06 I was named outstanding reviewer for CVPR 2023!

2023.04 Joined ANU as a research fellow, working with Prof. Tom Gedeon and Dr. Liang Zheng. Excited!

2023.03 Check out our latest research “Learning to Select Camera Views: Efficient Multiview Understanding at Few Glances” on arXiv. paper, code.

2022.07 Internship at Amazon Web Services as a research scientist on vision-language tasks. Hello Bay Area!

2021.12 Our paper “Adaptive Affinity for Associations in Multi-Target Multi-Camera Tracking” is accepted by IEEE Transactions on Image Processing. paper, code.

2021.07 Our paper “Ranking Models in Unlabeled New Environments” is accepted by ICCV 2021. paper, code.

2021.07 Our paper “Multiview Detection with Shadow Transformer (and View-Coherent Data Augmentation)” is accepted by ACM MM 2021. paper, code.

2021.03 Our paper “Visualizing Adapted Knowledge in Domain Transfer” is accepted by CVPR 2021. paper, code, 知乎-UDA可视化, 知乎-无需风格图像的风格迁移.

2020.07 Our paper “Multiview Detection with Feature Perspective Transformation” is accepted by ECCV 2020. paper, code, 知乎, MultiviewX dataset download.

2020.03 Our paper “Learning to Structure an Image with Few Colors” is accepted by CVPR 2020. paper, code, 知乎.

2019.11 A new paper “Locality aware appearance metric for multi-target multi-camera tracking” is released on arXiv. paper, code, 知乎.

2019.06 Won 5th place out of 22 participants in multi-target multi-camera tracking in the CVPR 2019 AI-City Challenge. paper, code.

2019.06 Won 3rd place out of 84 participants in vehicle re-identification in the CVPR 2019 AI-City Challenge. paper, code.

2019.03 Our paper “Improving Device-Edge Cooperative Inference of Deep Learning via 2-Step Pruning” is accepted by Infocom workshop on IECOO 2019. paper, code.

Research Papers

Effective Training Data Synthesis for Improving MLLM Chart Understanding
Yuwei Yang, Zeyu Zhang, Yunzhong Hou, Zhuowan Li, Gaowen Liu, Ali Payani, Yuan-Sen Ting, Liang Zheng
ICCV, 2025

REPA-E: Unlocking VAE for End-to-End Tuning of Latent Diffusion Transformers
Xingjian Leng, Jaskirat Singh, Yunzhong Hou, Zhenchang Xing, Saining Xie, Liang Zheng
ICCV, 2025

Learning Camera Movement Control from Real-World Drone Videos
Yunzhong Hou, Liang Zheng, Philip Torr
arXiv preprint, 2024

Learning to Select Camera Views: Efficient Multiview Understanding at Few Glances
Yunzhong Hou, Stephen Gould, Liang Zheng
Conference on Computer Vision and Pattern Recognition (CVPR), 2024

Scalable Deep Color Quantization: A Cluster Imitation Approach
Yunzhong Hou, Stephen Gould, Liang Zheng
IEEE Transactions on Image Processing (IEEE TIP), 2024

View-Coherent Correlation Consistency for Semi-Supervised Semantic Segmentation
Yunzhong Hou, Stephen Gould, Liang Zheng
Pattern Recognition (PR), 2024

Optimizing Camera Configurations for Multi-View Pedestrian Detection
Yunzhong Hou, Xingjian Leng, Tom Gedeon, Liang Zheng
arXiv preprint, 2023

Adaptive Affinity for Associations in Multi-Target Multi-Camera Tracking
Yunzhong Hou, Zhongdao Wang, Shengjin Wang, Liang Zheng
IEEE Transactions on Image Processing (IEEE TIP), 2022

Multiview Detection with Shadow Transformer (and View-Coherent Data Augmentation)
Yunzhong Hou, Liang Zheng
ACM Multimedia (ACM MM), 2021

Ranking Models in Unlabeled New Environments
Xiaoxiao Sun, Yunzhong Hou, Weijian Deng, Hongdong Li, Liang Zheng
International Conference on Computer Vision (ICCV), 2021

Visualizing Adapted Knowledge in Domain Transfer
Yunzhong Hou, Liang Zheng
Conference on Computer Vision and Pattern Recognition (CVPR), 2021

Multiview Detection with Feature Perspective Transformation
Yunzhong Hou, Liang Zheng, Stephen Gould
European Conference on Computer Vision (ECCV), 2020

Learning to Structure an Image with Few Colors
Yunzhong Hou, Liang Zheng, Stephen Gould
Conference on Computer Vision and Pattern Recognition (CVPR), 2020

Contact

Please contact me via e-mail.