Biography

I’m a deep learning researcher, currently working as a research assistant at Westlake University with Prof. Stan Z. Li on AI-driven molecule and protein design. Previously, I spent a year at the Shanghai Artificial Intelligence Laboratory with Prof. Yu Cheng, working on Mixture-of-Experts (MoE) architectures in large language models. Before that, I received my BS in Computer Science & Mathematics from the University of Electronic Science and Technology of China (UESTC).

My research spans domains such as ML, NLP, and CV, and I have a strong passion for uncovering the intrinsic properties of neural networks with theoretical guarantees. My primary research interests include, but are not limited to:

  1. Representation Learning: Learning abstract data representations that improve generalizability and interpretability and expand model capacity, thereby avoiding representation degradation.
  2. Neural Network Architecture: Discovering general structures that enhance model efficiency or achieve mathematical completeness (e.g., MoE, GNN).
  3. AI for Biology / Psychology: Leveraging AI to accelerate scientific progress for humanity.

News

Selected Publications

  1. Zhangyang Gao*, Daize Dong*, Cheng Tan, Jun Xia, Bozhen Hu, Stan Z. Li, A Graph is Worth K Words: Euclideanizing Graph using Pure Transformer, The 41st International Conference on Machine Learning (ICML 2024). [Paper]
  2. Jiacheng Ruan, Jingsheng Gao, Mingye Xie, Daize Dong, Suncheng Xiang, Ting Liu, Yuzhuo Fu, iDAT: inverse Distillation Adapter-Tuning, The IEEE International Conference on Multimedia and Expo (ICME 2024). [Paper] (Oral)
  3. Shwai He, Liang Ding, Daize Dong, Boan Liu, Fuqiang Yu, Dacheng Tao, PAD-Net: An Efficient Framework for Dynamic Networks, Proceedings of The 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023). [Paper]
  4. Shwai He, Liang Ding, Daize Dong, Miao Zhang, Dacheng Tao, SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters, Findings of The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022). [Paper]
  5. Shwai He, Chenbo Jiang, Daize Dong, Liang Ding, SD-Conv: Towards the Parameter-Efficiency of Dynamic Convolution, The IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2023). [Paper]

Projects

  • LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training. [Paper] [Code]