Advancing Compiler Technology
Eight papers reflecting multidisciplinary Northwestern Computer Science collaborations in compilers have been accepted into prestigious conferences this year.
When a novice is learning a new language, they will begin by translating a vocabulary word into their first language to understand its meaning and use it in a sentence.
Compilers perform a similar function. A compiler is a programming tool that transforms source code written in a high-level language — such as C, C++, Rust, or Fortran — into a low-level language optimized for execution on a particular system architecture.
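For readers who have not watched this translation happen, a minimal sketch below shows the idea: a few lines of high-level C, with the kind of machine-level instructions an optimizing compiler might produce shown as a comment. The assembly is illustrative only; actual output depends on the compiler, optimization level, and target architecture.

```c
/* High-level source: sum the elements of an array.
   A compiler lowers this loop into machine-specific instructions. */
int sum(const int *a, int n) {
    int total = 0;
    for (int i = 0; i < n; i++)
        total += a[i];
    return total;
}
/* An x86-64 compiler might emit roughly the following
   (illustrative only; a real compiler may vectorize the loop):
       xor  eax, eax          ; total = 0
   .loop:
       add  eax, [rdi]        ; total += *a
       add  rdi, 4            ; a++
       dec  esi               ; n--
       jnz  .loop
       ret
*/
```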
Northwestern Computer Science faculty and student teams are collaborating at the intersection of research in systems and networking, programming languages, computer engineering, machine learning, and security to advance compiler technology. Eight papers reflecting partnerships both within the department and with associates at other universities and institutions have been accepted into prestigious conferences this year.
“The collaborative environment we aim to foster leads faculty members to explore ideas that go beyond their comfort zone, which leads to great research solutions across areas,” said Northwestern Engineering’s Simone Campanoni, associate professor of computer science and (by courtesy) electrical and computer engineering.
Heartbeat compiler
Joint work by Campanoni’s ARCANA lab, which rethinks compiler abstractions for new applications, Peter Dinda’s Prescience Lab, and Umut Acar’s team at Carnegie Mellon University introduced the first compiler capable of fully automating heartbeat scheduling, a granularity control mechanism used to adjust parallelism at run-time.
The paper “Compiling Loop-Based Nested Parallelism for Irregular Workloads” was accepted to the ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2024), which will be held in San Diego from April 27 to May 1.
The team’s heartbeat compiler, supported by a co-designed operating system component, automatically translates unmodified OpenMP code — a parallelism extension to C, C++, and Fortran — into binaries that effectively implement the heartbeat scheduling execution model for popular languages on widely used hardware.
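The sketch below illustrates the kind of unmodified OpenMP input the compiler works from. It is not drawn from the paper: the Collatz-step loop is simply a convenient example of an irregular workload, because iterations do wildly different amounts of work, so statically dividing them among cores balances poorly; heartbeat scheduling instead decides at run-time when spawning additional parallelism is worth its overhead.

```c
/* Count the Collatz steps for one seed -- the cost varies sharply
   with n, which is what makes the loop below "irregular". */
long collatz_steps(long n) {
    long steps = 0;
    while (n != 1) {
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        steps++;
    }
    return steps;
}

long total_steps(long count) {
    long total = 0;
    /* Unmodified OpenMP: without OpenMP support the pragma is
       ignored and the loop simply runs serially. */
    #pragma omp parallel for reduction(+ : total) schedule(dynamic)
    for (long i = 1; i <= count; i++)
        total += collatz_steps(i);
    return total;
}
```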
“By democratizing heartbeat scheduling, this project can make applications built on hard-to-optimize, irregular, unbalanced parallel computing systems automatically run faster, tackle larger problems, and do so more energy-efficiently,” said first author Yian Su, a PhD student in computer science at the McCormick School of Engineering advised by Campanoni.
“The outcome of this project shifts the burden of handling the complexity of heartbeat scheduling from the shoulders of programmers,” said Dinda, professor of computer science and (by courtesy) electrical and computer engineering at Northwestern Engineering. “This will help heartbeat scheduling to be widely adopted.”
Additional co-authors include Nadharm Dhiantravan, a fourth-year student earning a combined bachelor’s degree and master’s degree in computer science at Northwestern Engineering; Jasper Liang (MS ’23), currently a software engineer at Altair; Nick Wanninger, a PhD candidate in computer science at Northwestern Engineering advised by Dinda; and Umut Acar and Mike Rainey (Carnegie Mellon University).
Neurosymbolic learned transpilation
Retargeting a compiler for a machine that uses a different instruction set architecture (ISA) is prohibitively time-consuming to hand-engineer, but the process is necessary to maintain legacy systems.
Campanoni reunited with his postdoctoral advisers — Harvard University’s David Brooks and Gu-Yeon Wei — and a team from Cornell University to develop a learned transpilation system that automatically translates assembly code between ISAs by combining the strengths of probabilistic language models and symbolic solvers.
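A minimal sketch of the guess-and-check idea follows. Everything in it is a hypothetical stand-in, not the paper's system: the "language model" is a caller-supplied function that proposes candidate translations, and the symbolic solver is replaced by simple differential testing of behavior on sample inputs.

```c
#include <stdbool.h>
#include <stddef.h>

typedef long (*binary_fn)(long);        /* behavior of a code fragment */
typedef binary_fn (*guesser_fn)(int);   /* stand-in for the language model */

/* Stand-in check: the real system asks a symbolic solver to prove
   equivalence; here we only compare behavior on a few sample inputs. */
bool behaves_identically(binary_fn a, binary_fn b) {
    for (long x = -8; x <= 8; x++)
        if (a(x) != b(x))
            return false;
    return true;
}

/* Guess, then check: keep asking the model for candidate
   translations until one is verified against the source. */
binary_fn transpile(binary_fn source, guesser_fn guess, int max_guesses) {
    for (int i = 0; i < max_guesses; i++) {
        binary_fn candidate = guess(i);
        if (candidate && behaves_identically(source, candidate))
            return candidate;           /* verified translation */
    }
    return NULL;                        /* fall back to manual porting */
}

/* Toy scenario: "translate" a doubling routine; the first guess is wrong. */
long source_fn(long x)   { return 2 * x; }
long guess_wrong(long x) { return x + 2; }
long guess_right(long x) { return x + x; }
binary_fn toy_guesser(int i) {
    binary_fn guesses[] = { guess_wrong, guess_right };
    return i < 2 ? guesses[i] : NULL;
}
```

In this toy run, the checker rejects the first candidate (it disagrees with the source on most inputs) and accepts the second.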
The collaborators demonstrated their novel neurosymbolic approach in the paper “Guess and Sketch: Language Model Guided Transpilation,” which was accepted to the International Conference on Learning Representations, scheduled May 7-11 in Vienna.
“The outcome of this project allows new ISAs to be rapidly adopted in existing compilation pipelines,” Campanoni said. “This will also enable ISA designers to explore the design space of an ISA with their compiler support automatically generated.”
Additional co-authors include Celine Lee and Alexander Rush (Cornell University) and Stephen Chong, Michal Kurek, and Abdulrahman Mahmoud (Harvard University).
Heap sanitizer
A complex region of memory that is dynamically allocated at runtime, the heap is a frequent target for security exploits.
Campanoni and Dinda teamed up with associate professor of computer science Xinyu Xing and his lab members — Ziyi Guo, a PhD student in computer science co-advised by professor of computer science Yan Chen; Zhenpeng Lin (PhD ’23), currently a security researcher at Apple; and Zheng Yu, a PhD student in computer science — to build a heap sanitizer to detect and capture spatial and temporal heap errors.
The team’s compiler and allocator-based heap memory protection (CAMP) tool leverages novel compilation techniques to protect memory stored in the heap of a program against potential security attacks. The paper was accepted to the 33rd USENIX Security Symposium, which will be held in Philadelphia this August.
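The sketch below is illustrative only and is not CAMP's design; it shows the two error classes a heap sanitizer must catch. Spatial errors (out-of-bounds accesses) are caught by tracking each allocation's size, and temporal errors (use-after-free) by tracking whether it has been released. A real sanitizer inserts such metadata and checks automatically via the compiler and allocator; here they are hand-written.

```c
#include <stdbool.h>
#include <stdlib.h>

typedef struct {
    unsigned char *data;
    size_t size;     /* spatial metadata: how many bytes are valid */
    bool freed;      /* temporal metadata: has the block been released? */
} tracked_buf;

tracked_buf tracked_alloc(size_t size) {
    tracked_buf b = { malloc(size), size, false };
    return b;
}

void tracked_free(tracked_buf *b) {
    free(b->data);
    b->freed = true;   /* later accesses are now temporal errors */
}

/* Refuses a bad access instead of corrupting memory. */
bool tracked_store(tracked_buf *b, size_t index, unsigned char value) {
    if (b->freed)         return false;  /* temporal: use-after-free */
    if (index >= b->size) return false;  /* spatial: out of bounds   */
    b->data[index] = value;
    return true;
}
```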
“Memory heap corruption is an age-old problem in software security,” Xing said. “We discovered that compiler optimizations and system co-design could greatly reduce the overhead needed for memory security guarantees.”
Additional accepted papers
Five additional compilers papers were also accepted to top conferences this year:
- The paper “Representing Data Collections in an SSA Form” was accepted to the International Symposium on Code Generation and Optimization. Co-authors include Simone Campanoni; computer science PhD students Nathan Greiner, Tommy McMichen, Atmn Patel, and Federico Sossai; and Peter Zhong (’22), now a PhD student at Carnegie Mellon University.
- The paper “TrackFM: Far-out Compiler Support for a Far Memory World” was accepted to ASPLOS 2024. Collaborators include Simone Campanoni and Peter Dinda (Northwestern Engineering); co-first author Brian Suchy (PhD ’22), currently a software engineer at Google; and Kyle Hale and co-first author Brian Tauro (Illinois Institute of Technology).
- The paper “Getting a Handle on Unmanaged Memory” was also accepted to ASPLOS 2024. Collaborators include Northwestern Engineering’s Simone Campanoni, Peter Dinda, and computer science PhD students Tommy McMichen and Nick Wanninger.
- The paper “PROMPT: A Fast and Extensible Memory Profiling Framework” was accepted to the Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA). Co-authors include Simone Campanoni and Yian Su (Northwestern Engineering) and Sotiris Apostolakis, David August, Yebin Chon, Zujun Tan, and Ziyang Xu (Princeton University).
- The paper “GhOST: a GPU Out-of-Order Scheduling Technique for Stall Reduction” was accepted to the International Symposium on Computer Architecture. Co-authors include Simone Campanoni (Northwestern Engineering); Tor Aamodt (University of British Columbia); David August, Ishita Chaturvedi, Bhargav Reddy Godala, Yucan Wu, and Ziyang Xu (Princeton University); Konstantinos Iliakis, Dimitrios Soudris, and Sotirios Xydis (National Technical University of Athens); and Tyler Sorensen (University of California, Santa Cruz).