We are happy to announce the launch event of our MoE Tier 3 program on Automated Program Repair. The event hosted several researchers from programming languages (PL) and software engineering (SE), who gave talks focusing on testing, analysis, and repair. We also organized a poster session for our student researchers and a joint lunch.
UPDATES: Photos of the event can be found here and below is the recording of Abhik Roychoudhury’s welcome message and program overview:
Date and Location
| | |
| --- | --- |
| Date | Friday, November 11, 2022 |
| Time | 9 am to 4 pm |
| Room | Cerebro@SoC (COM1-02-05), National University of Singapore |
| Directions | Getting Here, Map COM1, Floor Plan Level 2 |
(all times are in the local timezone: Singapore Standard Time, i.e., UTC+8)
| Start | End | Session |
| --- | --- | --- |
| 09:00 | 09:05 | Welcome Message by Abhik Roychoudhury |
| 09:05 | 09:25 | Overview of the APR program and inauguration of our logo |
| 09:25 | 10:10 | Invited Talk by Andreas Zeller (CISPA): “Semantic Debugging” |
| 10:30 | 11:15 | Invited Talk by Sumit Gulwani (Microsoft): “Program repair: applications and neuro-symbolic techniques” |
| 11:15 | 12:00 | Invited Talk by Cristian Cadar (ICL): “Dynamic Symbolic Execution for Continuous Testing” |
| 12:00 | 13:30 | Lunch Break (along with Poster Sessions and open discussions) |
| 13:30 | 14:15 | Invited Talk by Satish Chandra (Google): “Counterfactual Explanations for Models of Code” |
| 14:15 | 15:00 | Invited Talk by Miryung Kim (UCLA): “Automated Program Repair for Democratizing Heterogeneous Computing Applications” |
| 15:00 | 15:45 | Invited Talk by Martin Monperrus (KTH): “Sequence-to-sequence learning for program repair” |
| 15:45 | | Closing Message by Abhik Roychoudhury |
| | | Open End with Coffee/Tea and Poster Discussions |
Cristian Cadar (Imperial College London): Dynamic Symbolic Execution for Continuous Testing
Abstract: Dynamic symbolic execution has attracted significant attention in recent years as an effective technique for generating high-coverage test suites and finding deep errors in complex software applications. Most work, however, has focused on whole-program testing. In this talk, I will discuss recent efforts on adapting dynamic symbolic execution for continuous testing, where the analysis effort is focused on recently changed code, i.e., on software patches.
Bio: Cristian Cadar is a Professor in the Department of Computing at Imperial College London, where he leads the Software Reliability Group (http://srg.doc.ic.ac.uk), working on automatic techniques for increasing the reliability and security of software systems. Cristian’s research has been recognised by several prestigious awards, including the EuroSys Jochen Liedtke Award, HVC Award, BCS Roger Needham Award, IEEE TCSE New Directions Award, and two test of time awards. Many of the research techniques he co-authored have been open-sourced and used in both academia and industry. In particular, he is co-author and maintainer of the KLEE symbolic execution system, a popular system with a large user base. Cristian has a PhD in Computer Science from Stanford University, and undergraduate and Master’s degrees from the Massachusetts Institute of Technology.
Satish Chandra (Google): Counterfactual Explanations for Models of Code
Abstract: Machine learning (ML) models play an increasingly prevalent role in many software engineering tasks. However, because most models are now powered by opaque deep neural networks, it can be difficult for developers to understand why the model came to a certain conclusion and how to act upon the model’s prediction. Motivated by this problem, this talk explores counterfactual explanations for models of source code. Such counterfactual explanations constitute minimal changes to the source code under which the model “changes its mind”. We integrate counterfactual explanation generation into models of source code in a real-world setting. We describe considerations that impact both the ability to find realistic and plausible counterfactual explanations, as well as the usefulness of such explanations to the developers that use the model. In a series of experiments we investigate the efficacy of our approach on three different models, each based on a BERT-like architecture operating over source code.
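The core idea of the abstract, finding a minimal change to the input under which a model “changes its mind”, can be sketched in a few lines. This is a hypothetical illustration only: the toy `predict` function and the `perturbations` table stand in for the BERT-like models and code edits used in the actual work.

```python
# Hypothetical sketch of counterfactual explanation search for a code model:
# greedily try single-token replacements until the model's prediction flips.
# `predict` is a toy stand-in, not the model from the talk.

def predict(tokens):
    # Toy "model": flags code as buggy if it compares with `=` instead of `==`.
    return "buggy" if "=" in tokens else "ok"

def counterfactual(tokens, perturbations):
    """Return a one-token change that flips the prediction, if any exists."""
    original = predict(tokens)
    for i, tok in enumerate(tokens):
        for alt in perturbations.get(tok, []):
            candidate = tokens[:i] + [alt] + tokens[i + 1:]
            if predict(candidate) != original:
                return candidate  # minimal (single-token) counterfactual
    return None

tokens = ["if", "x", "=", "0"]
print(counterfactual(tokens, {"=": ["=="]}))  # ['if', 'x', '==', '0']
```

The single-token change that flips the prediction is exactly the kind of actionable, minimal explanation the abstract describes; the real setting additionally has to ensure the change is realistic and plausible code.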
Bio: Satish Chandra is a researcher at Google, where he applies machine learning techniques to improve developer productivity. Prior to that, he has worked – in reverse chronological order – at Meta, Samsung Research, IBM Research, and Bell Laboratories. His work has spanned many areas of programming languages and software engineering, including program analysis, type systems, software synthesis, bug finding and repair, software testing and test automation, and web technologies. His research has been widely published in leading conferences in his field, including POPL, PLDI, ICSE, FSE and OOPSLA. The projects he has led have had significant industrial impact: in addition to his work on developer productivity at Facebook, his work on bug finding tools shipped in IBM’s Java static analysis product, his work on test automation was adopted in IBM’s testing services offering, and his work on memory profiling of web apps was included in Samsung’s Tizen IDE.
Satish Chandra obtained a PhD from the University of Wisconsin-Madison, and a B.Tech from the Indian Institute of Technology-Kanpur, both in computer science. He is an ACM Distinguished Scientist and an elected member of WG 2.4.
Sumit Gulwani (Microsoft): Program repair: applications and neuro-symbolic techniques
Abstract: I will describe 4 interesting applications of program repair: (a) unblock low-code users who get stuck with last-mile formula-repair errors, (b) enhance productivity of developers who spend a significant fraction of their time in software evolution, (c) provide feedback/hints to students to improve their learning experiences in intro programming courses, and (d) improve precision of NL2Code systems based on large language models.
The task of synthesizing an intended repair is both a search and a ranking problem. Search is required to discover candidate repairs that correspond to the (often ambiguous) intent, and ranking is required to pick the best repair from multiple plausible alternatives. This creates a fertile playground for combining grammar-based symbolic-reasoning techniques (which can encode correctness constraints and also guide efficient enumeration of repair candidates) and machine-learning techniques (which can model human preferences in programming). Recent advances in large language models like Codex offer further promise to advance such neuro-symbolic techniques.
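The search-and-ranking framing above can be sketched concretely: a symbolic component enumerates candidate patches and filters them against tests, and a ranking component picks the best survivor. The buggy function, candidate set, and scoring heuristic below are all illustrative stand-ins, not from any specific system.

```python
# Hypothetical sketch of repair as search (enumerate + test) plus ranking
# (pick the preferred survivor). A learned ranker would replace `score`.

def buggy_abs(x):
    return x if x > 0 else x  # bug: the else branch should negate x

# Candidate repairs for the return expression, as some enumerator might emit.
CANDIDATES = {
    "x if x > 0 else -x": lambda x: x if x > 0 else -x,
    "x if x >= 0 else -x": lambda x: x if x >= 0 else -x,
    "x * x": lambda x: x * x,
}

TESTS = [(3, 3), (-2, 2), (0, 0)]  # (input, expected output)

def plausible(fix):
    """Search filter: does the candidate pass every test?"""
    return all(fix(inp) == out for inp, out in TESTS)

def score(expr):
    """Stand-in for a learned ranker: prefer shorter, simpler patches."""
    return -len(expr)

survivors = [expr for expr, fix in CANDIDATES.items() if plausible(fix)]
best = max(survivors, key=score)  # "x if x > 0 else -x"
```

Note that two plausible candidates survive the tests; this is exactly the ambiguity the abstract mentions, and why ranking is needed to model human preferences among equally test-passing repairs.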
Bio: Sumit Gulwani is a computer scientist connecting ideas, people, and research with practice. He is the inventor of many intent-understanding, programming-by-example, and programming-by-natural-language technologies including the popular Flash Fill feature in Excel used by hundreds of millions of people. He founded and currently leads the PROSE research and engineering team at Microsoft that develops APIs for program synthesis and has incorporated them into various Microsoft products including Office, Visual Studio, PowerQuery, PowerApps, PowerShell, and SQL. He has co-authored 10 award-winning papers (including 3 test-of-time awards from ICSE and POPL) amongst 140+ research publications across multiple computer science areas and delivered 50+ keynotes/invited talks. He was awarded the Max Planck-Humboldt medal in 2021 and the ACM SIGPLAN Robin Milner Young Researcher Award in 2014 for his pioneering contributions to program synthesis and intelligent tutoring systems. He obtained his PhD in Computer Science from UC-Berkeley, and was awarded the ACM SIGPLAN Outstanding Doctoral Dissertation Award. He obtained his BTech in Computer Science and Engineering from IIT Kanpur, and was awarded the President’s Gold Medal.
Miryung Kim (UCLA): Automated Program Repair for Democratizing Heterogeneous Computing Applications
Abstract: Specialized hardware accelerators like GPUs or FPGAs have become a prominent part of the current computing landscape. However, developing heterogeneous applications is limited to a small subset of programmers with specialized hardware knowledge. To democratize heterogeneous computing, it is time for the software engineering community to design new waves of testing, debugging, and repair tools for heterogeneous application development.
In this talk, I will first describe technical challenges in making heterogeneity widely accessible to software developers. I will showcase my group’s recent work (HeteroGen) on automated program repair and test input generation for heterogeneous application development with FPGA. HeteroGen takes C/C++ code as input and automatically generates an HLS (high-level synthesis) version with test behavior preservation and better performance.
Bio: Miryung Kim is a Professor and Vice Chair of Graduate Studies in the Department of Computer Science at UCLA. She has taken a leadership role in defining the emerging area of software engineering for data analytics (SE4DA and SE4ML). She conducted the first systematic study of refactoring practices in industry and quantified refactoring benefits using Windows version history at Microsoft. Her group created automated testing and debugging for Apache Spark and conducted the largest scale study of data scientists in industry. Her current research focuses on developer tools for heterogeneous computing applications with FPGA.
She is a Program Co-Chair of ESEC/FSE 2022. She was a Keynote Speaker at ASE 2019 and a Distinguished Speaker at UIUC, UMN, and UC Irvine. She received an NSF CAREER Award, an ICSME 10-Year Most Influential Paper Award, an ACM SIGSOFT Distinguished Paper Award, an Okawa Foundation Award, a Google Faculty Award, a Microsoft Software Engineering Innovations Foundation Award, and a Humboldt Fellowship. Six of her former students have become professors (at Columbia, Purdue, Virginia Tech, and elsewhere). For her impact on nurturing the next generation of academics, she received the ACM SIGSOFT Influential Educator Award.
Martin Monperrus (KTH Royal Institute of Technology): Sequence-to-sequence learning for program repair
Abstract: Neural program repair has achieved good results in a recent series of papers. Yet, we observe that it fails to repair some bugs because of a lack of knowledge about 1) the program being repaired, and 2) the actual fault being repaired. Martin Monperrus presents KTH’s recent results on neural program repair.
Bio: Martin Monperrus is a Professor of Software Technology at KTH Royal Institute of Technology. His research lies in the field of software engineering with a current focus on automatic program repair, AI on code and program hardening. Homepage: https://www.monperrus.net/martin/
Andreas Zeller (CISPA Helmholtz Center for Information Security): Semantic Debugging
Joint work with Martin Eberlein, Dominic Steinhöfel, and Lars Grunske
Abstract: Why does my program fail? We present AVICENNA, a novel and general technique to automatically determine failure causes and conditions, using logical properties over input elements: “The program fails if and only if int(<length>) > len(<payload>) holds – that is, the given <length> is larger than the length of the <payload>.” In the context of software repair, such failure conditions (and automatically derived test cases) ensure that the fix neither overspecializes (i.e., addresses only a part of the problem) nor overgeneralizes (impacting runs that did not fail in the first place). AVICENNA and its techniques can also be applied to infer failure-related properties of output elements, effectively serving as test oracles.
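A failure condition like the one quoted in the abstract can be illustrated with a small sketch. The toy length-prefixed input format and the `process` function below are illustrative assumptions; AVICENNA itself infers such conditions automatically from passing and failing runs rather than being given them.

```python
# Hypothetical sketch of a failure condition over input elements:
# the program fails if and only if int(<length>) > len(<payload>).

def process(inp):
    """Toy length-prefixed input: '<length>:<payload>'."""
    length, payload = inp.split(":", 1)
    n = int(length)
    if n > len(payload):
        raise ValueError("declared length exceeds payload")  # the failure
    return payload[:n]

def condition(inp):
    """Inferred diagnosis: fails iff int(<length>) > len(<payload>)."""
    length, payload = inp.split(":", 1)
    return int(length) > len(payload)

def fails(inp):
    try:
        process(inp)
        return False
    except ValueError:
        return True

# The condition predicts failure and success exactly:
for inp in ["3:abcde", "9:abc", "0:", "1:"]:
    assert fails(inp) == condition(inp)
```

Because the condition holds exactly on the failing runs (an “if and only if”), a fix validated against it can neither overspecialize (miss failing inputs like "9:abc") nor overgeneralize (disturb passing inputs like "3:abcde").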
Bio: Andreas Zeller is faculty at the CISPA Helmholtz Center for Information Security and professor for Software Engineering at Saarland University, both in Saarbrücken, Germany. His research on automated debugging, mining software archives, specification mining, and security testing has won several awards for its impact in academia and industry. Zeller is an ACM Fellow, an IFIP Fellow, an ERC Advanced Grant Awardee, and holds an ACM SIGSOFT Outstanding Research Award.
For any questions, please contact Yannic Noller.