Survey preprint · 2026

AI4AIR: A Comprehensive Survey on Large Language Models for AI Research

A comprehensive survey of how LLMs support data engineering, model design and optimization, model evaluation, and closed-loop automation across AI research.

Xiang Ao¹,²,*, Junhong Lian¹,², Hanyang Li³, Siyi Wang¹,², Yiran Qiao¹,², Yi Qiao¹,², Jiaqi Xu¹,², Qing He¹,², and Xueqi Cheng¹,²,*

¹State Key Laboratory of AI Safety, Institute of Computing Technology, Chinese Academy of Sciences · ²University of Chinese Academy of Sciences · ³National University of Singapore · *Corresponding authors

Fig. 1 (AI4AIR survey taxonomy overview). Illustration of the multi-faceted roles of LLMs across the iterative ML-centered AI research lifecycle.

Abstract

Language-mediated automation is beginning to complement the human-centered trial-and-error process in AI research. Among current AI tools, large language models (LLMs) have become a central interface for generation, knowledge synthesis, and reasoning in research workflows. While LLMs are now widely used to support general scientific workflows such as literature review and scientific writing, their specific roles in, and deeper contributions to, the core lifecycle of AI research itself have not been examined systematically. To bridge this gap, this survey introduces AI4AIR (short for AI for AI Research), which comprehensively reviews LLMs as pivotal components within machine learning research pipelines. We construct a structured two-dimensional taxonomy. One axis spans major research domains, including natural language processing, computer vision, data mining, and general machine learning. The other follows the stages of the research pipeline: data engineering, model design and optimization, model evaluation, and the cross-stage closed-loop automation that connects them. Within this framework, we identify five recurring roles through which LLMs contribute to AI research workflows: annotator, synthesizer, optimizer, evaluator, and orchestrator. We further discuss bottlenecks such as contamination, hallucination, and reliability under feedback-driven use, and outline future directions for improving both the efficiency and the reliability of AI research and discovery.

Highlights

Systematic Survey of AI4AIR

To the best of our knowledge, this is the first systematic survey that summarizes the utility of LLMs in ML-centered AI research, providing a structured perspective on their roles across the AI research lifecycle.

Role-Based Two-Dimensional Taxonomy

We construct a two-dimensional taxonomy, spanning research domains and pipeline stages, based on LLMs' roles in machine learning model design and development, and use it to systematically categorize existing related work.

Future Directions

Based on our review, we identify the bottlenecks of current research and discuss possible future directions for leveraging advanced AI tools to improve both the efficiency and effectiveness of AI research and discovery.

Taxonomy of AI4AIR

Resources

Curated paper entries, topic tags, and resource metadata are coming soon.

Citation

If this survey or repository is useful for your research, please cite the survey. The formal citation will be updated after the archival version is available.

@article{ao2026ai4air,
  title   = {AI4AIR: A Comprehensive Survey on Large Language Models for AI Research},
  author  = {Ao, Xiang and Lian, Junhong and Li, Hanyang and Wang, Siyi and Qiao, Yiran and Qiao, Yi and Xu, Jiaqi and He, Qing and Cheng, Xueqi},
  year    = {2026},
  note    = {Preprint, under review}
}