Siyu Huang  (黄思羽)

Researcher at Baidu Research, Beijing, China
Research Interests: Computer Vision, Deep Learning, Multimedia Analysis

Address: Baidu Technology Park Building No.2, Beijing, China, 100193
My Google Scholar



Publications


  1. Dual Low-Rank Multimodal Fusion [pdf]
    Tao Jin*, Siyu Huang*, Yingming Li, Zhongfei Zhang.
    EMNLP Findings, 2020.

  2. Neighbours Matter: Image Captioning with Similar Images [pdf]
    Qingzhong Wang, Jiuniu Wang, Antoni Chan, Siyu Huang, Haoyi Xiong, Xingjian Li, Dejing Dou.
    BMVC, 2020.

  3. Generating Person Images with Appearance-aware Pose Stylizer [pdf] [code]
    Siyu Huang, Haoyi Xiong, Zhi-Qi Cheng, Qingzhong Wang, Xingran Zhou, Bihan Wen, Jun Huan, Dejing Dou.
    IJCAI, 2020.

  4. SBAT: Video Captioning with Sparse Boundary-Aware Transformer [pdf]
    Tao Jin, Siyu Huang, Ming Chen, Yingming Li, Zhongfei Zhang.
    IJCAI, 2020.

  5. Stacked Pooling for Boosting Scale Invariance of Crowd Counting [pdf] [code]
    Siyu Huang, Xi Li, Zhi-Qi Cheng, Zhongfei Zhang, Alexander Hauptmann.
    ICASSP, 2020.

  6. Low-Rank HOCA: Efficient High-Order Cross-Modal Attention for Video Captioning [pdf]
    Tao Jin, Siyu Huang*, Yingming Li, Zhongfei Zhang.
    EMNLP, 2019.

  7. Text Guided Person Image Synthesis [pdf] [supp]
    Xingran Zhou, Siyu Huang*, Bin Li, Yingming Li, Jiachen Li, Zhongfei Zhang.
    CVPR, 2019.

  8. User-Ranking Video Summarization with Multi-Stage Spatio-Temporal Representation [pdf] [demo]
    Siyu Huang, Xi Li, Zhongfei Zhang, Fei Wu, Junwei Han.
    IEEE T-IP, 2019.

  9. Perceiving Physical Equation by Observing Visual Scenarios [pdf]
    Siyu Huang*, Zhi-Qi Cheng*, Xi Li, Xiao Wu, Zhongfei Zhang, Alexander Hauptmann.
    NIPS Workshop, 2018.

  10. TVT: Two-View Transformer Network for Video Captioning [pdf]
    Ming Chen, Yingming Li, Zhongfei Zhang, Siyu Huang.
    ACML, 2018.

  11. GNAS: A Greedy Neural Architecture Search Method for Multi-Attribute Learning [pdf] [slides] [poster]
    Siyu Huang, Xi Li, Zhi-Qi Cheng, Zhongfei Zhang, Alexander Hauptmann.
    ACM MM, 2018 (Oral).

  12. Learning to Transfer: Generalizable Attribute Learning with Multitask Neural Model Search [pdf]
    Zhi-Qi Cheng, Xiao Wu, Siyu Huang, Jun-Xiu Li, Alexander Hauptmann, Qiang Peng.
    ACM MM, 2018.

  13. Body Structure Aware Deep Crowd Counting [pdf] [demo]
    Siyu Huang, Xi Li, Zhongfei Zhang, Fei Wu, Shenghua Gao, Rongrong Ji, Junwei Han.
    IEEE T-IP, 2018.

  14. Deep Learning Driven Visual Path Prediction From a Single Image [pdf] [dataset]
    Siyu Huang, Xi Li, Zhongfei Zhang, Zhouzhou He, Fei Wu, Wei Liu, Jinhui Tang, Yueting Zhuang.
    IEEE T-IP, 2016.


Chinese Invention Patents

  • A video summarization algorithm based on one-dimensional sequence learning. Siyu Huang, Xi Li, Zhongfei Zhang. Approved, 2020. #201710888621.1
  • A crowd counting algorithm based on pedestrian body appearance structure. Siyu Huang, Xi Li, Zhongfei Zhang. Approved, 2020. #201611225785.8
  • A high-resolution person pose transfer system. Siyu Huang, Haoyi Xiong, Dejing Dou. Filed, 2020. #202010507406.4
  • A federated learning-based data mining system. Ji Liu, Haoyi Xiong, Siyu Huang, Dejing Dou. Filed, 2020. #202010339533.8
  • A text-guided person image generation algorithm. Xingran Zhou, Siyu Huang, Bin Li, Yingming Li, Zhongfei Zhang. Filed, 2020. #201910257463.9
  • A neural network architecture search algorithm for predicting image attributes. Siyu Huang, Xi Li, Zhongfei Zhang. Filed, 2018. #201810802108.0

Invited Talks and Presentations

  1. Invited Talk: "Greedy Neural Architecture Search for Multi-Attribute Learning".
    State University of New York (SUNY) at Binghamton, Binghamton, USA. Sept 2018.
  2. Invited Presentation: "GNAS: A Greedy Neural Architecture Search Method for Multi-Attribute Learning".
    ACM International Conference on Multimedia (ACM MM). Seoul, Korea. Oct 2018. [photo]
  3. Invited Talk: "Text Guided Person Image Synthesis".
    Baidu Research. Beijing, China. Mar 2019.
  4. Invited Talk: "Deep Cross-Modal Knowledge Mining".
    Forum for Talented Young Scholar in Computer Science. Zhejiang University, Hangzhou, May 2019.
  5. Open Course: "NAS with RL and Differentiable NAS in AutoDL".
    Baidu Create 2019 - Baidu AI Developer Conference. Beijing, July 2019.
  6. Conference Tutorial: "A Tutorial on Neural Architecture Search".
    IEEE International Conference on Data Mining (ICDM) Tutorial on Automated Deep Learning: Theory, Algorithms, Platforms, and Applications. Beijing, Nov 2019.
  7. Open Course: "Automated Deep Learning: Theory and Applications".
    Baidu World Congress, Branch Forum. Beijing, Sept 2020.

Honors and Awards


  • Outstanding Research. Baidu Research. 2020.
  • Excellent Postgraduate Students' Award. Zhejiang University. 2019.
  • ACM MM Student Travel Award. ACM SIGMM. 2018.
  • National Scholarship for PhD Student. Ministry of Education of the P.R.C. 2017. Awarded to the top 1% of PhD students.
  • Postgraduate of Merit. Zhejiang University. 2017, 2018.
  • Award of Honor for Postgraduate. Zhejiang University. 2017, 2018.
  • Graduate of Chu Kochen Honors College. Zhejiang University. 2014.
  • Excellent Bachelor’s Thesis "Feature Learning and Its Applications". Zhejiang University. 2014.

Professional Services

    PC member: ICDM'20
    Journal reviewer: IEEE T-PAMI, IEEE T-NNLS, IEEE T-IP, IEEE T-MM, Neurocomputing
    Conference reviewer: AAAI'21, IJCAI'21, BMVC'20, ICPR'20, ICMLA'20, AAAI'19, ICDM'18
