Siyu Huang (黄思羽)

Assistant Professor at Clemson University
Research Interests: Computer Vision, Deep Learning, Generative Models

Address: 218 McAdams Hall, Clemson, SC 29631
Email: siyuh@clemson.edu

I lead the Vision and Learning Lab (ViL) at Clemson University.

I am looking for motivated PhD students and interns.

Education

Ph.D. at College of Information Science and Electronic Engineering, Zhejiang University
Sept 2014 - July 2019 Hangzhou, China
Supervised by Prof. Zhongfei (Mark) Zhang and Prof. Xi Li.
Visiting Scholar at Language Technologies Institute, School of Computer Science, Carnegie Mellon University
Jan 2018 - Jan 2019 Pittsburgh, USA
Supervised by Prof. Alexander G. Hauptmann.
Bachelor at College of Information Science and Electronic Engineering, Zhejiang University
Sept 2010 - June 2014 Hangzhou, China
Chu Kochen Honors College

Work Experience

Tenure-Track Assistant Professor at Clemson University
School of Computing
Aug 2023 - Present Clemson, SC, USA
Postdoctoral Fellow at Harvard University
Visual Computing Group (VCG), School of Engineering and Applied Sciences
Nov 2021 - Aug 2023 Boston, USA
Working with Prof. Hanspeter Pfister
Research Fellow at Nanyang Technological University
Rapid-Rich Object Search (ROSE) Lab, School of Electrical and Electronic Engineering
Feb 2021 - Oct 2021 Singapore
Working with Dr. Bihan Wen
Research Scientist at Baidu Research
Big Data Lab
July 2019 - Jan 2021 Beijing, China
Working with Prof. Dejing Dou and Prof. Jun Huan
Research Intern at Baidu Research
Big Data Lab
Mar 2019 - June 2019 Beijing, China
Supervised by Prof. Jun Huan.

Teaching

CPSC 8810: Machine Learning-based Image Synthesis [2025 Fall] [2024 Fall] [2023 Fall]
This course offers a comprehensive exploration of machine learning techniques for synthesizing visual data (e.g., images and videos). It covers a range of topics, from classical algorithms (e.g., image filtering and transformation) to deep generative models (e.g., VAEs, GANs, and diffusion models). Participants will learn to implement image synthesis algorithms, understand cutting-edge image synthesis techniques, and explore intriguing research questions. This course will be of particular interest to students seeking to delve into the fields of generative AI, computer vision, and deep learning.
CPSC 4070/6070: Applied Computer Vision [2026 Spring] [2025 Spring] [2024 Spring]
This course offers an introduction to the fundamental principles and real-world applications of 2D, 3D, and deep learning-based computer vision. Major topics include image filtering, feature detection and matching, recognition and tracking, scene understanding, camera imaging geometry, stereo vision, and deep learning-based vision. Students will learn to implement computer vision algorithms in a series of well-designed projects and will explore intriguing research questions in a final project. This course will be of particular interest to students seeking to delve into the fields of image processing and computer vision.
CPSC 8440: Generative AI: Technical and Human Perspectives
This online Coursera course discusses cutting-edge generative AI techniques from both technical and human-centered perspectives. Across three parts, Large Language Models (LLMs), AI-Generated Content (AIGC), and Human-Centric Generative AI, students will gain a basic understanding of the methods and ethical considerations driving this transformative field. The course is designed to accommodate students with diverse technical backgrounds and will be of particular interest to students seeking to delve into the fields of generative AI and human-centered AI.

Publications

HAD: Hallucination-Aware Diffusion Priors for 3D Reconstruction
Xi Liu, Weiwei Sun, Zhou Ren, Chris Broaddus, Siyu Huang, Laurent Guigues
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2026
3D Graphics Diffusion Model Image Synthesis
FF3R: Feedforward Feature 3D Reconstruction from Unconstrained Views
Chaoyi Zhou, Run Wang, Feng Luo, Mert D. Pesé, Zhiwen Fan, Yiqi Zhong, Siyu Huang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Findings, 2026
3D Graphics
Bonnet: Ultra-Fast Whole-Body Bone Segmentation from CT Scans [pdf] [code]
Hanjiang Zhu, Pedro Martelleto Rezende, Zhang Yang, Tong Ye, Bruce Z. Gao, Feng Luo, Siyu Huang, Jiancheng Yang
IEEE International Symposium on Biomedical Imaging (ISBI), 2026
Medical Visual Understanding
Bézier Splatting for Fast and Differentiable Vector Graphics Rendering [pdf] [project page]
Xi Liu, Chaoyi Zhou, Nanxuan Zhao, Siyu Huang
Advances in Neural Information Processing Systems (NeurIPS), 2025
Graphics
A Multimodal Visual–Language Foundation Model for Computational Ophthalmology [pdf]
Danli Shi, Weiyi Zhang, Jiancheng Yang, Siyu Huang, Xiaolan Chen, Mayinuer Yusufu, Kai Jin, Shan Lin, Shunming Liu, Qing Zhang, Mingguang He
npj Digital Medicine, 2025
Medical Multimodal Foundation Model
AutoAL: Automated Active Learning with Differentiable Query Strategy Search [pdf]
Yifeng Wang, Xueying Zhan, Siyu Huang
International Conference on Machine Learning (ICML), 2025
Active Learning AutoML
Few-Shot Generalized Category Discovery With Retrieval-Guided Decision Boundary Enhancement
Yunhan Ren, Feng Luo, Siyu Huang
ACM International Conference on Multimedia Retrieval (ICMR), 2025
Few-Shot Learning
LoRD: A Low-Rank Defense Method for Adversarial Attack on Diffusion Models
Jiaxuan Zhu, Siyu Huang
IEEE International Conference on Multimedia and Expo (ICME), 2025
Diffusion Model Security
Securing the Skies: A Comprehensive Survey on Anti-UAV Methods, Benchmarking, and Future Directions [pdf]
Yifei Dong, Fengyi Wu, Sanjian Zhang, Guangyu Chen, Yuzhi Hu, Masumi Yano, Jingdong Sun, Siyu Huang, Feng Liu, Qi Dai, Zhi-Qi Cheng
CVPR 4th Anti-UAV Workshop, 2025 (Best Paper)
Survey Security
SoftShadow: Leveraging Soft Masks for Penumbra-Aware Shadow Removal [pdf] [code]
Xinrui Wang, Lanqing Guo, Xiyu Wang, Siyu Huang, Bihan Wen
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2025
Image Restoration
Latent Radiance Fields with 3D-aware 2D Representations [pdf] [project page] [code]
Chaoyi Zhou*, Xi Liu*, Feng Luo, Siyu Huang
International Conference on Learning Representations (ICLR), 2025
3D Graphics
3DGS-Enhancer: Enhancing Unbounded 3D Gaussian Splatting with View-consistent 2D Diffusion Priors [pdf] [project page] [code]
Xi Liu*, Chaoyi Zhou*, Siyu Huang
Advances in Neural Information Processing Systems (NeurIPS), 2024 (Spotlight)
3D Graphics Diffusion Model Image Synthesis
Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation [pdf] [project page] [code]
Lanqing Guo, Yingqing He, Haoxin Chen, Menghan Xia, Xiaodong Cun, Yufei Wang, Siyu Huang, Yong Zhang, Xintao Wang, Qifeng Chen, Ying Shan, Bihan Wen
European Conference on Computer Vision (ECCV), 2024
Diffusion Model Image Synthesis
Fundus2Video: Cross-Modal Angiography Video Generation from Static Fundus Photography with Clinical Knowledge Guidance [pdf]
Weiyi Zhang, Siyu Huang, Jiancheng Yang, Ruoyu Chen, Zongyuan Ge, Yingfeng Zheng, Danli Shi, Mingguang He
Medical Image Computing and Computer Assisted Intervention (MICCAI), 2024
Medical Multimodal Image Synthesis
MTPret: Improving X-ray Image Analytics with Multi-Task Pre-training [pdf]
Weibin Liao, Qingzhong Wang, Xuhong Li, Yi Liu, Zeyu Chen, Siyu Huang, Dejing Dou, Yanwu Xu, Haoyi Xiong
IEEE Transactions on Artificial Intelligence (TAI), 2024
Medical Foundation Model
Learning Gaze-aware Compositional GAN from Limited Annotations [pdf] [code]
Nerea Aranjuelo Ansa, Siyu Huang, Ignacio Arganda-Carreras, Luis Unzueta Irurtia, Oihana Otaegui Madurga, Hanspeter Pfister, Donglai Wei
ACM Symposium on Eye Tracking Research & Applications (ETRA), 2024
GAN Face Image Synthesis
S3-TTA: Scale-Style Selection for Test-Time Augmentation in Biomedical Image Segmentation [pdf] [code]
Kangxian Xie, Siyu Huang, Sebastian Andres Cajas Ordonez, Hanspeter Pfister, Donglai Wei
IEEE International Symposium on Biomedical Imaging (ISBI), 2024
Biomedical Image Synthesis
Towards Robust Image Denoising via Flow-based Joint Image and Noise Model [pdf]
Lanqing Guo, Siyu Huang, Haosen Liu, Bihan Wen
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2023
Image Restoration Flow-based Model
ContRE: A Complementary Measure for Robustness Evaluation of Deep Networks via Contrastive Examples [pdf]
Xuhong Li, Xuanyu Wu, Linghe Kong, Xiao Zhang, Siyu Huang, Dejing Dou, Haoyi Xiong
IEEE International Conference on Data Mining (ICDM), 2023
Explainable AI
Domain-Scalable Unpaired Image Translation via Latent Space Anchoring [pdf] [code]
Siyu Huang*, Jie An*, Donglai Wei, Zudi Lin, Jiebo Luo, Hanspeter Pfister
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Image Synthesis
3D Domain Adaptive Instance Segmentation via Cyclic Segmentation GANs [pdf] [project page] [code]
Leander Lauenburg, Zudi Lin, Ruihan Zhang, Marcia dos Santos, Siyu Huang, Ignacio Arganda-Carreras, Edward S. Boyden, Hanspeter Pfister, Donglai Wei
IEEE Journal of Biomedical and Health Informatics (JBHI), 2023
3D Biomedical GAN Image Synthesis
QuantArt: Quantizing Image Style Transfer Towards High Visual Fidelity [pdf] [code] [slides]
Siyu Huang*, Jie An*, Donglai Wei, Jiebo Luo, Hanspeter Pfister
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Style Transfer Image Synthesis
ShadowDiffusion: When Degradation Prior Meets Diffusion Model for Shadow Removal [pdf] [code]
Lanqing Guo, Chong Wang, Wenhan Yang, Siyu Huang, Yufei Wang, Hanspeter Pfister, Bihan Wen
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Image Restoration Diffusion Model
Making Your First Choice: To Address Cold Start Problem in Vision Active Learning [pdf] [code]
Liangyu Chen, Yutong Bai, Siyu Huang, Yongyi Lu, Bihan Wen, Alan Yuille, Zongwei Zhou
Medical Imaging with Deep Learning (MIDL), 2023
Active Learning Medical
Cross-Model Consensus of Explanations and Beyond for Image Classification Models: An Empirical Study [pdf] [code]
Xuhong Li, Haoyi Xiong, Siyu Huang, Shilei Ji, Dejing Dou
Machine Learning, European Conference on Machine Learning 2022 journal track (MLJ), 2023
Explainable AI
ShadowFormer: Global Context Helps Shadow Removal [pdf] [code]
Lanqing Guo, Siyu Huang, Ding Liu, Hao Cheng, Bihan Wen
AAAI Conference on Artificial Intelligence (AAAI), 2023
Image Restoration
Temporal Output Discrepancy for Loss Estimation-based Active Learning [pdf] [code]
Siyu Huang, Tianyang Wang, Haoyi Xiong, Bihan Wen, Jun Huan, Dejing Dou
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022
Active Learning
MUSCLE: Multi-task Self-supervised Continual Learning to Pre-train Deep Models for X-ray Images of Multiple Body Parts [pdf]
Weibin Liao, Haoyi Xiong, Qingzhong Wang, Yan Mo, Xuhong Li, Yi Liu, Zeyu Chen, Siyu Huang, Dejing Dou
Medical Image Computing and Computer Assisted Intervention (MICCAI), 2022
Medical Foundation Model
A Unified Framework for Bidirectional Prototype Learning from Contaminated Faces across Heterogeneous Domains [pdf] [code]
Meng Pang, Binghui Wang, Siyu Huang, Yiu-ming Cheung, Bihan Wen
IEEE Transactions on Information Forensics and Security (TIFS), 2022
Image Synthesis Security Face
Parameter-Free Style Projection for Arbitrary Style Transfer [pdf] [code]
Siyu Huang, Haoyi Xiong, Tianyang Wang, Bihan Wen, Qingzhong Wang, Zeyu Chen, Jun Huan, Dejing Dou
International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Style Transfer Image Synthesis
BM-NAS: Bilevel Multimodal Neural Architecture Search [pdf] [code] [video]
Yihang Yin, Siyu Huang, Xiang Zhang
AAAI Conference on Artificial Intelligence (AAAI), 2022 (Oral)
AutoML Multimodal
AutoGCL: Automated Graph Contrastive Learning via Learnable View Generators [pdf] [code] [video]
Yihang Yin, Qingzhong Wang, Siyu Huang, Haoyi Xiong, Xiang Zhang
AAAI Conference on Artificial Intelligence (AAAI), 2022
AutoML Graph
Boosting Active Learning via Improving Test Performance [pdf] [code]
Tianyang Wang, Xingjian Li, Pengkun Yang, Guosheng Hu, Xiangrui Zeng, Siyu Huang, Cheng-Zhong Xu, Min Xu
AAAI Conference on Artificial Intelligence (AAAI), 2022
Active Learning
Semi-Supervised Active Learning with Temporal Output Discrepancy [pdf] [code] [slides] [supp] [poster] [video]
Siyu Huang, Tianyang Wang, Haoyi Xiong, Jun Huan, Dejing Dou
International Conference on Computer Vision (ICCV), 2021
Active Learning
ReLLIE: Deep Reinforcement Learning for Customized Low-Light Image Enhancement [pdf] [code]
Rongkai Zhang, Lanqing Guo, Siyu Huang, Bihan Wen
ACM International Conference on Multimedia (ACM MM), 2021
Image Restoration
An Investigation of Containment Measure Implementation and Public Responses to the COVID-19 Pandemic in Mainland China [pdf]
Ji Liu, Haoyi Xiong, Xiakai Wang, Jizhou Huang, Qiaojun Li, Tongtong Huang, Siyu Huang, Haifeng Wang, Dejing Dou
IEEE International Conference on Digital Health (ICDH), 2021
Medical
ArtFlow: Unbiased Image Style Transfer via Reversible Neural Flows [pdf] [code] [supp]
Jie An*, Siyu Huang*, Yibing Song, Dejing Dou, Wei Liu, Jiebo Luo
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021
Style Transfer Image Synthesis Flow-based Model
Dual Low-Rank Multimodal Fusion [pdf]
Tao Jin*, Siyu Huang*, Yingming Li, Zhongfei Zhang
Findings of the Association for Computational Linguistics: EMNLP (EMNLP Findings), 2020
Multimodal Visual Understanding
Neighbours Matter: Image Captioning with Similar Images [pdf]
Qingzhong Wang, Jiuniu Wang, Antoni Chan, Siyu Huang, Haoyi Xiong, Xingjian Li, Dejing Dou
British Machine Vision Conference (BMVC), 2020
Multimodal Visual Understanding
Generating Person Images with Appearance-aware Pose Stylizer [pdf] [code]
Siyu Huang, Haoyi Xiong, Zhi-Qi Cheng, Qingzhong Wang, Xingran Zhou, Bihan Wen, Jun Huan, Dejing Dou
International Joint Conference on Artificial Intelligence (IJCAI), 2020
Image Synthesis GAN
SBAT: Video Captioning with Sparse Boundary-Aware Transformer [pdf]
Tao Jin, Siyu Huang, Ming Chen, Yingming Li, Zhongfei Zhang
International Joint Conference on Artificial Intelligence (IJCAI), 2020
Multimodal Visual Understanding
Stacked Pooling for Boosting Scale Invariance of Crowd Counting [pdf] [code]
Siyu Huang, Xi Li, Zhi-Qi Cheng, Zhongfei Zhang, Alexander Hauptmann
International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Visual Understanding
Low-Rank HOCA: Efficient High-Order Cross-Modal Attention for Video Captioning [pdf]
Tao Jin, Siyu Huang, Yingming Li, Zhongfei Zhang
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Multimodal Visual Understanding
Text Guided Person Image Synthesis [pdf] [supp]
Xingran Zhou, Siyu Huang, Bin Li, Yingming Li, Jiachen Li, Zhongfei Zhang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019
Image Synthesis GAN
User-Ranking Video Summarization with Multi-Stage Spatio-Temporal Representation [pdf] [demo]
Siyu Huang, Xi Li, Zhongfei Zhang, Fei Wu, Junwei Han
IEEE Transactions on Image Processing (TIP), 2019
Visual Understanding
Perceiving Physical Equation by Observing Visual Scenarios [pdf]
Siyu Huang*, Zhi-Qi Cheng*, Xi Li, Xiao Wu, Zhongfei Zhang, Alexander Hauptmann
NeurIPS Workshop on Modeling the Physical World, 2018
Physics Visual Understanding
TVT: Two-View Transformer Network for Video Captioning [pdf]
Ming Chen, Yingming Li, Zhongfei Zhang, Siyu Huang
Asian Conference on Machine Learning (ACML), 2018
Multimodal Visual Understanding
GNAS: A Greedy Neural Architecture Search Method for Multi-Attribute Learning [pdf] [slides] [poster]
Siyu Huang, Xi Li, Zhi-Qi Cheng, Zhongfei Zhang, Alexander Hauptmann
ACM International Conference on Multimedia (ACM MM), 2018 (Oral)
AutoML
Learning to Transfer: Generalizable Attribute Learning with Multitask Neural Model Search [pdf]
Zhi-Qi Cheng, Xiao Wu, Siyu Huang, Jun-Xiu Li, Alexander Hauptmann, Qiang Peng
ACM International Conference on Multimedia (ACM MM), 2018
AutoML
Body Structure Aware Deep Crowd Counting [pdf] [demo]
Siyu Huang, Xi Li, Zhongfei Zhang, Fei Wu, Shenghua Gao, Rongrong Ji, Junwei Han
IEEE Transactions on Image Processing (TIP), 2018
Visual Understanding
Deep Learning Driven Visual Path Prediction From a Single Image [pdf]
Siyu Huang, Xi Li, Zhongfei Zhang, Zhouzhou He, Fei Wu, Wei Liu, Jinhui Tang, Yueting Zhuang
IEEE Transactions on Image Processing (TIP), 2016
Visual Understanding

Professional Services

Grant panel:

Conference organizing chair:

Area chair/meta reviewer:

Journal reviewer:

Conference reviewer: