AIware 2024
Mon 15 - Tue 16 July 2024, Porto de Galinhas, Brazil
co-located with FSE 2024

The advent of advanced AI underscores the urgent need for comprehensive safety evaluations, necessitating collaboration across communities (i.e., AI, software engineering, and governance). However, divergent practices and terminologies across these communities, combined with the complexity of AI systems (of which models are only a part) and their environmental affordances (e.g., access to tools), obstruct effective communication and comprehensive evaluation. This paper proposes a framework for AI system evaluation comprising three components: 1) harmonised terminology to facilitate communication across communities involved in AI safety evaluation; 2) a taxonomy identifying essential elements for AI system evaluation; 3) a mapping between the AI lifecycle, stakeholders, and requisite evaluations for an accountable AI supply chain. This framework catalyses a deeper discourse on AI system evaluation beyond model-centric approaches.

Mon 15 Jul

Displayed time zone: Brasilia, Distrito Federal, Brazil

16:00 - 18:00
Security and Safety + Round Table + Day 1 Closing (Main Track / Late Breaking Arxiv Track) at Mandacaru
Chair(s): Thomas Zimmermann Microsoft Research, Ahmed E. Hassan Queen’s University
16:00
5m
Paper
An AI System Evaluation Framework for Advancing AI Safety: Terminology, Taxonomy, Lifecycle Mapping
Main Track
Boming Xia CSIRO's Data61 & University of New South Wales, Qinghua Lu CSIRO's Data61, Liming Zhu CSIRO's Data61, Zhenchang Xing CSIRO's Data61
DOI
16:05
5m
Paper
Measuring Impacts of Poisoning on Model Parameters and Embeddings for Large Language Models of Code
Main Track
Aftab Hussain University of Houston, Md Rafiqul Islam Rabin University of Houston, Amin Alipour University of Houston
DOI
16:10
10m
Paper
A Case Study of LLM for Automated Vulnerability Repair: Assessing Impact of Reasoning and Patch Validation Feedback
Main Track
Ummay Kulsum North Carolina State University, Haotian Zhu Singapore Management University, Bowen Xu North Carolina State University, Marcelo d'Amorim North Carolina State University
DOI
16:20
5m
Paper
Trojans in Large Language Models of Code: A Critical Review through a Trigger-Based Taxonomy
Late Breaking Arxiv Track
Aftab Hussain University of Houston, Md Rafiqul Islam Rabin University of Houston, Toufique Ahmed University of California at Davis, Bowen Xu North Carolina State University, Premkumar Devanbu UC Davis, Amin Alipour University of Houston
Pre-print
16:25
25m
Live Q&A
Session Q&A and topic discussions
Main Track

16:50
60m
Panel
Round Table
Main Track

17:50
10m
Day closing
Day 1 summary and closing
Main Track