Agents for Data Science: From Raw Data to AI-generated Notebooks Using LLMs and Code Execution
Data science tasks involve a complex interplay of datasets, code and code outputs for answering questions, deriving insights, or building models from data. Tasks and chosen methods may require specialized data domain or scientific domain knowledge. Queries range from high-level (low-code) or highly technical (high-code). Code execution results, such as plots and tables are artifacts used by data scientists to interpret and reason about the current and future states of a solution towards completing the task. This presents unique challenges in designing, deploying and evaluating LLM-based agents for automating data science workflows. In this talk we will introduce an end-to-end, autonomous Data Science Agent (DSA) built around Gemini and available as an experiment at labs.google/code. DSA leverages agentic flows, planning and orchestration to tackle open-ended data science explorations. It uses LLMs for planning, task decomposition, code generation, reasoning and error-correction through code execution. DSA is designed to streamline the entire data science process, enabling users to query data in natural language, and get from a dataset and prompt to a fully AI-generated, populated notebook. We’ll discuss design choices (prompting, SFT, orchestration), iterative development cycles, evaluation, lessons learned and future challenges. Where applicable, we will showcase real-world case studies demonstrating how DSA can assist with bootstrapping the analysis of data from complex scientific domains.