Yiren Jian

I am a research scientist at TikTok and ByteDance Inc, specializing in multimodal foundation models. I earned my PhD in Computer Science from Dartmouth College in January 2024, where I worked on machine learning research under the mentorship of Soroush Vosoughi. During my PhD, I was a research scientist intern at NEC Labs America (2021), Snap Research (2022), and ByteDance AML (2023).

During the first three years of my program at Dartmouth, I collaborated closely with Lorenzo Torresani on computer vision. Prior to attending Dartmouth, I earned an MS degree in Biophysics with Chen Zeng at The George Washington University.

Research

My current research focuses on multimodal large language models (MLLMs) for advanced understanding. I have contributed to the pretraining of generative vision-language models [NeurIPS '23, ACL '24, Findings of ACL '24], and I am actively working on the development of multimodal language models for math and STEM domains [InfiMM-WebMath-40B].

In my prior research, I have explored a wide range of topics within machine learning, with a focus on computer vision, natural language processing, and computational science. For a full list of my publications, please refer to my Google Scholar profile.

Working Memory Identifies Reasoning Limits in Language Models
Chunhui Zhang, Yiren Jian, Zhongyu Ouyang, Soroush Vosoughi
The Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
code / bibtex

InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning
Xiaotian Han#, Yiren Jian#, Xuefeng Hu#, Haogeng Liu#, Yiqi Wang#, Qihang Fan, Yuang Ai, Huaibo Huang, Ran He, Zhenheng Yang, Quanzeng You
preprint, 2024
data / bibtex

InfiMM: Advancing Multimodal Understanding with an Open-Sourced Visual Language Model
Haogeng Liu, Quanzeng You, Yiqi Wang, Xiaotian Han, Bohan Zhai, Yongfei Liu, Wentao Chen, Yiren Jian, Yunzhe Tao, Jianbo Yuan, Ran He, Hongxia Yang
Findings of the Association for Computational Linguistics (Findings of ACL), 2024
code / bibtex

Expedited Training of Visual Conditioned Language Generation via Redundancy Reduction
Yiren Jian, Tingkai Liu, Yunzhe Tao, Chunhui Zhang, Soroush Vosoughi, Hongxia Yang
Annual Meeting of the Association for Computational Linguistics (ACL), 2024 (Oral Presentation)
code / poster / slides / bibtex

GEM: Generating Engaging Multimodal Content
Chongyang Gao, Yiren Jian, Natalia Denisenko, Soroush Vosoughi, V.S. Subrahmanian
International Joint Conference on Artificial Intelligence (IJCAI), 2024
code / bibtex

Knowledge from Large-Scale Protein Contact Prediction Models Can be Transferred to the Data-Scarce RNA Contact Prediction Task
Yiren Jian, Chongyang Gao, Chen Zeng, Yunjie Zhao, Soroush Vosoughi
International Conference on Pattern Recognition (ICPR), 2024
code / bibtex

Bootstrapping Vision-Language Learning with Decoupled Language Pre-training
Yiren Jian, Chongyang Gao, Soroush Vosoughi
Neural Information Processing Systems (NeurIPS), 2023 (Spotlight)
code / poster / slides / bibtex

Evaluating Native-like Structures of RNA-protein Complexes Through the Deep Learning Method
Chengwei Zeng#, Yiren Jian#, Soroush Vosoughi, Chen Zeng, Yunjie Zhao
Nature Communications, 2023
code / bibtex

Non-Linguistic Supervision for Contrastive Learning of Sentence Embeddings
Yiren Jian, Chongyang Gao, Soroush Vosoughi
Neural Information Processing Systems (NeurIPS), 2022
code / poster / slides / video / bibtex

T-Cell Receptor-Peptide Interaction Prediction with Physical Model Augmented Pseudo-Labeling
Yiren Jian, Erik Kruus, Martin Renqiang Min
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2022 (Oral Presentation)
code / poster / slides / bibtex

Embedding Hallucination for Few-shot Language Fine-tuning
Yiren Jian, Chongyang Gao, Soroush Vosoughi
Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL), 2022
code / poster / slides / video / bibtex

Contrastive Learning for Prompt-based Few-shot Language Learners
Yiren Jian, Chongyang Gao, Soroush Vosoughi
Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL), 2022
code / poster / slides / video / bibtex

Label Hallucination for Few-shot Classification
Yiren Jian, Lorenzo Torresani
AAAI Conference on Artificial Intelligence (AAAI), 2022
code / poster / slides / video / appendix / bibtex

MetaPix: Domain Transfer for Semantic Segmentation by Meta Pixel Weighting
Yiren Jian, Chongyang Gao
Image and Vision Computing, 2021
code / bibtex

Task Meta-Transfer from Limited Parallel Labels
Yiren Jian, Karim Ahmed, Lorenzo Torresani
NeurIPS (Meta-Learning Workshop), 2020
code / poster / video / bibtex

DIRECT: RNA Contact Predictions by Integrating Structural Patterns
Yiren Jian, Xiaonan Wang, Jiadi Qiu, Huiwen Wang, Zhichao Liu, Yunjie Zhao, Chen Zeng
BMC Bioinformatics, 2019
bibtex

Teaching
Graduate Teaching Assistant, Deep Learning, CS78/278, Winter 2021
Graduate Teaching Assistant, Deep Learning, CS78/178, Winter 2019
Graduate Teaching Assistant, Machine Learning, CS74/274, 2023
Graduate Teaching Assistant, Machine Learning, CS74/274, 2022
Graduate Teaching Assistant, Machine Learning, CS74/274, 2021
Graduate Teaching Assistant, Machine Learning, CS74/174, 2018
Reviewing service
  • European Conference on Computer Vision (ECCV)
  • International Conference on Machine Learning (ICML)
  • International Conference on Learning Representations (ICLR)
  • Annual AAAI Conference on Artificial Intelligence (AAAI)
  • IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
  • Conference on Neural Information Processing Systems (NeurIPS)
  • Annual Meeting of the Association for Computational Linguistics (ACL)

Education

Dartmouth College
Sep 2018 - Jan 2024

PhD in Computer Science

The George Washington University
Sep 2015 - May 2017

MS in Physics

Huazhong University of Science and Technology
Sep 2011 - June 2015

BS in Physics

