AIware 2024
Mon 15 - Tue 16 July 2024 Porto de Galinhas, Brazil
co-located with FSE 2024

This paper presents a comprehensive comparative analysis of Large Language Models (LLMs) for generating code documentation. Code documentation is an essential part of the software writing process. The paper evaluates GPT-3.5, GPT-4, Bard, Llama 2, and StarChat on parameters such as Accuracy, Completeness, Relevance, Understandability, Readability, and Time Taken for different levels of code documentation. Our evaluation employs a checklist-based system to minimize subjectivity, providing a more objective assessment. We find that, barring StarChat, all LLMs consistently outperform the original documentation. Notably, the closed-source models GPT-3.5, GPT-4, and Bard exhibit superior performance across various parameters compared to the open-source/source-available LLMs, namely Llama 2 and StarChat. In terms of generation time, GPT-4 took the longest by a significant margin, followed by Llama 2 and Bard, while GPT-3.5 and StarChat had comparable generation times. Additionally, file-level documentation performed considerably worse across all parameters (except for time taken) than inline and function-level documentation.

Tue 16 Jul

Displayed time zone: Brasilia, Distrito Federal, Brazil

14:00 - 15:30
Industry Talk 4 + AIware for Software Lifecycle Activities
Main Track / Industry Statements and Demo Track / Late Breaking Arxiv Track at Mandacaru
Chair(s): Filipe Cogo Centre for Software Excellence, Huawei Canada
14:00
20m
Industry talk
AI in Software Engineering at Google: Progress and the Path Ahead
Industry Statements and Demo Track
Satish Chandra Google, Inc
14:20
10m
Paper
A Comparative Analysis of Large Language Models for Code Documentation Generation
Main Track
Shubhang Shekhar Dvivedi IIIT Delhi, Vyshnav Vijay IIIT Delhi, Sai Leela Rahul Pujari IIIT Delhi, Shoumik Lodh IIIT Delhi, Dhruv Kumar Indraprastha Institute of Information Technology, Delhi
14:30
10m
Paper
AI-Assisted Assessment of Coding Practices in Modern Code Review
Main Track
Manushree Vijayvergiya Google, Malgorzata Salawa Google, Ivan Budiselic Google, Dan Zheng Google DeepMind, Pascal Lamblin Google, Marko Ivanković Google; Universität Passau, Juanjo Carin Google, Mateusz Lewko Google Inc, Jovan Andonov Google, Goran Petrović Google Inc, Danny Tarlow Google, Petros Maniatis Google DeepMind, René Just University of Washington
14:40
10m
Paper
The Role of Generative AI in Software Development Productivity: A Pilot Case Study
Main Track
Mariana Coutinho CESAR School, Lorena Marques CESAR School, Anderson Santos CESAR School, Marcio Dahia CESAR School, Cesar França CESAR School, Ronnie de Souza Santos University of Calgary
14:50
10m
Paper
Effectiveness of ChatGPT for Static Analysis: How Far Are We?
Main Track
Mohammad Mahdi Mohajer York University, Reem Aleithan York University, Canada, Nima Shiri Harzevili York University, Moshi Wei York University, Alvine Boaye Belle York University, Hung Viet Pham York University, Song Wang York University
15:00
5m
Paper
Addressing Compiler Errors: Stack Overflow or Large Language Models?
Late Breaking Arxiv Track
Patricia Widjojo The University of Melbourne, Christoph Treude Singapore Management University
15:05
25m
Live Q&A
Session Q&A and topic discussions
Main Track