APR workshop - February 2024
Venue
Date | Thursday, February 8th, 2024 |
---|---|
Time | 9am to 4pm |
Room | MR20@SoC (COM3-02-59), National University of Singapore |
Directions | Getting Here, Map COM3 |
Program
(all times are in the local timezone: Singapore Standard Time, i.e., UTC+8)
Start | End | Agenda Item |
---|---|---|
09:00 AM | 09:30 AM | Breakfast |
09:30 AM | 09:40 AM | Abhik Roychoudhury “Introduction to the APR Programme” |
09:40 AM | 10:00 AM | Ridwan Shariffdeen “Towards Trustworthy Systems with Artificial Intelligence” |
10:00 AM | 12:00 PM | Prem Devanbu “Naturalness & Bimodality of Software: Theory, Applications and Risks” |
12:00 PM | 01:00 PM | Lunch |
01:00 PM | 03:00 PM | Peter Müller “Automated Modular Program Verification Using Viper” |
03:00 PM | 03:20 PM | Ziyi Yang “Synthesising Separation Logic Predicates by Heap Memory Graphs” |
03:20 PM | 03:40 PM | Li Wenhua “Inferring Incorrectness Specifications for Object-Oriented Programs” |
03:40 PM | 04:00 PM | Tea Break |
Scheduled Talks
Tutorial 1: Naturalness & Bimodality of Software: Theory, Applications and Risks
Speaker: Prem Devanbu
Abstract: At UC Davis, we discovered (a decade ago) that, despite the considerable power and flexibility of programming languages, large software corpora are actually even more repetitive than natural-language corpora. We went on to show that this “naturalness” of code could be captured in statistical models and exploited within software tools. This line of work enjoyed a tremendous boost from the high capacity and flexibility of deep learning models. Numerous other creative and interesting applications of naturalness have ensued, from colleagues around the world. More recently, we have focused on another property of software: it is bimodal. Software is written not only to be run on machines, but also to be read by humans; this makes it amenable to both formal analysis and statistical prediction. Bimodality allows new ways of training machine learning models, and new ways to understand the human experience of programming.
Despite all these positive scientific and engineering developments, as we venture ever deeper into the practice of “LLMs for Code”, major concerns are emerging about the quality of generated code, and about how developers actually end up using generated code of questionable quality. I will conclude this “Good News” story on a rather pessimistic note, with the issue of generated code quality and our limited progress in this area.
Speaker bio: Prem Devanbu holds a B.Tech from IIT Madras and a Ph.D from Rutgers University. After working in industrial software development at Bell Laboratories and its offshoots in New Jersey, he joined the faculty of the University of California, Davis, where he is now a Research Professor in software engineering. For nearly two decades, his primary research interest has been exploiting the abundant data and metadata available in software repositories to improve programming practice. Prem won the ACM SIGSOFT Outstanding Research Award in 2021, the Alexander von Humboldt Research Award in 2022, and the IEEE Computer Society Harlan Mills Award in 2024. He is an ACM Fellow.
Tutorial 2: Automated Modular Program Verification Using Viper
Speaker: Peter Müller
Abstract: Imperative programs are notoriously difficult to reason about. Shared mutable state may lead to subtle interactions between program components and cause memory errors, unintended side effects, data races, and other bugs.
This talk introduces Viper, a verification infrastructure that automates the modular verification of imperative programs. Viper provides an intermediate language to express verification problems in a variation of separation logic, and two verification back-ends that automate verification using an SMT solver. Program verifiers for mainstream languages are implemented as front-ends, which translate an input program and specification into Viper.
The talk will provide an overview of the Viper language, demonstrate its use on a series of examples, and discuss how it can be used to prove the correctness of programs written in Rust.
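For a flavour of what such front-ends do, the sketch below uses Prusti, the Viper-based verifier for Rust, with contract attributes from the prusti_contracts crate; the example is purely illustrative and not taken from the tutorial.

```rust
// Illustrative sketch only: a contract of the kind a Viper front-end
// (here, Prusti for Rust) translates into the Viper intermediate language
// and discharges with an SMT solver.
use prusti_contracts::*;

// Postconditions: the result is no smaller than either argument,
// and it is equal to one of the two arguments.
#[ensures(result >= a && result >= b)]
#[ensures(result == a || result == b)]
fn max(a: i32, b: i32) -> i32 {
    if a >= b { a } else { b }
}

fn main() {
    assert_eq!(max(2, 5), 5);
}
```

The verifier checks the postconditions modularly, i.e., callers of max rely only on its contract rather than its body.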
Speaker bio: Peter Müller has been Full Professor and head of the Programming Methodology Group at ETH Zurich since August 2008. His research focuses on languages, techniques, and tools for the development of correct software. His previous appointments include a position as Researcher at Microsoft Research in Redmond, an Assistant Professorship at ETH Zurich, and a position as Project Manager at Deutsche Bank in Frankfurt. Peter Müller received his PhD from the University of Hagen.
Talk 1: Towards Trustworthy Systems with Artificial Intelligence
Speaker: Ridwan Shariffdeen
Abstract: The use of Large Language Models for Software Engineering is a double-edged sword: it has favorable applications, but also undesirable consequences if used naively. Rapid advances in Artificial Intelligence make it possible to build better software systems that address limitations of existing state-of-the-art techniques. However, naively using LLMs could compromise the integrity and security of existing software systems. In this talk, I will present the ongoing efforts of the NUS TSS group on how we leverage advances in AI to develop trustworthy software systems, and how we address trust-enhancement issues in using AI for SE. Overall, AI presents a duality: it aids the creation of trustworthy systems while simultaneously creating new challenges for ensuring the trustworthiness of existing systems.
Speaker bio: Ridwan Shariffdeen holds a BSc (Hons) from the University of Moratuwa, Sri Lanka, and a Ph.D from the National University of Singapore. He is currently a Research Fellow at NUS, working at the intersection of software security, automated program repair, software testing, and trustworthy software systems. Prior to his Ph.D, he worked as a Senior Software Engineer focusing on Cloud Security and Web Application Security. He has published papers in top-tier conferences and journals such as ICSE, PLDI, TOSEM, and IEEE CLOUD. He is a Member of the ACM TOSEM Board of Distinguished Reviewers and has served as a reviewer for several journals such as TECS, TDSC, and EMSE. Moreover, he serves as a program committee member for several flagship conferences such as ICSE, ASE, and FSE.
Talk 2: Synthesising Separation Logic Predicates by Heap Memory Graphs, [Slides]
Speaker: Ziyi Yang
Abstract: We present an approach to automatically synthesise Separation Logic (SL) predicates from examples using Inductive Logic Programming (ILP) techniques. The main challenges to make such inductive synthesis effective in practice are (1) to avoid using negative examples that are required in ILP but are difficult to construct for heap-based structures in an automated fashion, and (2) to be capable of summarising not just the shape of a heap (e.g., it is a linked list), but also the properties of the data it stores (e.g., it is a sorted linked list). We tackle these challenges with a new approach for predicate learning. The key novel contributions of our work are (a) the formulation of the ILP-based learning procedure using “positive-only” examples and (b) an algorithm that harnesses a state-of-the-art ILP inference procedure to synthesise human-readable SL predicates from concrete memory graphs.
We show that our framework can efficiently and correctly synthesise SL predicates for structures that were beyond the reach of the state-of-the-art tools, including those featuring non-trivial payload constraints (e.g., binary search trees) and nested recursion (e.g., n-ary trees). Finally, we demonstrate the utility of our tool Sippy by using the synthesised predicates for deductive synthesis of heap-manipulating programs.
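To give a concrete feel for the target of the synthesis, a predicate combining heap shape with payload constraints typically looks like the textbook sorted-list definition below (illustrative only; the exact predicates produced by Sippy may differ):

\[
\mathsf{srtl}(x, lo) \;\triangleq\; (x = \mathsf{null} \wedge \mathsf{emp}) \;\vee\; \exists v, y.\; lo \le v \,\wedge\, x \mapsto (v, y) \,*\, \mathsf{srtl}(y, v)
\]

Here the points-to assertion \(x \mapsto (v, y)\) captures the shape of the list, while \(lo \le v\) captures the sortedness of the stored data.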
Speaker bio: Ziyi is a third-year PhD student in the VERSE Lab. His main research interests are program synthesis, program transformation, and logic programming.
Talk 3: Inferring Incorrectness Specifications for Object-Oriented Programs
Speaker: Li Wenhua
Abstract: Incorrectness logic (IL), based on under-approximation (UX), is effective at finding real program bugs. Nevertheless, formalizing IL for object-oriented programs (OOP) is challenged by the need to support class inheritance and method overriding. Previously, a principled approach for verifying OOP had been formulated via a novel UX proof system; however, its UX specifications were expected to be provided manually, which hampers usability. To make current UX verification techniques more practical for OOP, in this work we design specification inference rules via bi-abduction and specialise the inter-procedural call rules for OOP. In particular, we infer static specifications for calls that can be resolved statically (static dispatching), and address inheritance and method overriding (dynamic dispatching) via dynamic specifications. At its core, we formalise a principle that guarantees the validity of dynamic specifications, together with a type-constraint propagation mechanism that selectively trims dynamic specifications for sound reasoning about method calls and supports class-cast-exception (CCE) detection. We prototype the proposed UX specification inference for OOP in ToolX to demonstrate its feasibility. Our experimental results show that ToolX can detect more errors (null-pointer exceptions) compared with the state-of-the-art tool. In addition, to the best of our knowledge, ToolX can also report CCEs in real-world projects where no other tool performs CCE analysis.
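As background, an under-approximate (incorrectness) triple, written here in the style of O'Hearn's incorrectness logic rather than the talk's exact notation, asserts that every state in the result is reachable by running the program from some state in the presumption; a null-pointer error can thus be summarised as:

\[
[\, p \,]\; C \;[\, \epsilon :\; q \,] \qquad\text{e.g.}\qquad [\, x = \mathsf{null} \,]\;\; x.f := 1 \;\;[\, er :\; x = \mathsf{null} \,]
\]

where \(\epsilon \in \{ok, er\}\) records normal or erroneous termination; because such triples under-approximate program behaviour, every reported bug corresponds to a real execution.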
Speaker bio: Li Wenhua is a fourth-year PhD student in the Department of Computer Science at the School of Computing, National University of Singapore. He is under the supervision of Prof. Wei-Ngan Chin. His research focuses on program logic, automated bug finding/verification and program repair.
Guests
We will have Peter Müller and Prem Devanbu as guests.
Contact
For any questions, please contact Andreea Costea.