Advancing Compiler Technology

Eight papers reflecting multidisciplinary Northwestern Computer Science collaborations in compilers have been accepted to prestigious conferences this year.

A novice learning a new language begins by translating each vocabulary word into their first language to understand its meaning and use it in a sentence.

Compilers perform a similar function. A computer programming tool, a compiler transforms source code written in a high-level programming language — such as C, C++, Rust, or Fortran — into a low-level language optimized for execution on a particular system architecture.

Northwestern Computer Science faculty and student teams are collaborating at the intersection of research in systems and networking, programming languages, computer engineering, machine learning, and security to advance compiler technology. Eight papers reflecting partnerships both within the department and with collaborators at other universities and institutions have been accepted to prestigious conferences this year.

“The collaborative environment we aim to foster leads faculty members to explore ideas that go beyond their comfort zone, which leads to great research solutions across areas,” said Northwestern Engineering’s Simone Campanoni, associate professor of computer science and (by courtesy) electrical and computer engineering.

Heartbeat compiler

Joint work by Campanoni’s Rethinks Compiler Abstractions for New Applications (ARCANA) lab, Peter Dinda’s Prescience Lab, and Umut Acar’s team at Carnegie Mellon University introduced the first compiler capable of fully automating heartbeat scheduling, a granularity control mechanism that adjusts parallelism at runtime.

The paper “Compiling Loop-Based Nested Parallelism for Irregular Workloads” was accepted to the ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2024), which will be held in San Diego from April 27-May 1.

The team’s heartbeat compiler, supported by a co-designed operating system component, automatically translates unmodified OpenMP code — a parallelism extension to C, C++, and Fortran — into binaries that effectively implement the heartbeat scheduling execution model for popular languages on widely used hardware.

“By democratizing heartbeat scheduling, this project can make applications built on hard-to-optimize, irregular, unbalanced parallel computing systems automatically run faster, tackle larger problems, and do so more energy-efficiently,” said first author Yian Su, a PhD student in computer science at the McCormick School of Engineering advised by Campanoni.

“The outcome of this project shifts the burden of handling the complexity of heartbeat scheduling from the shoulders of programmers,” said Dinda, professor of computer science and (by courtesy) electrical and computer engineering at Northwestern Engineering. “This will help heartbeat scheduling to be widely adopted.”

Additional co-authors include Nadharm Dhiantravan, a fourth-year student earning a combined bachelor’s degree and master’s degree in computer science at Northwestern Engineering; Jasper Liang (MS ’23), currently a software engineer at Altair; Nick Wanninger, a PhD candidate in computer science at Northwestern Engineering advised by Dinda; and Umut Acar and Mike Rainey (Carnegie Mellon University).

Neurosymbolic learned transpilation

Retargeting a compiler for a machine that uses a different instruction set architecture (ISA) is prohibitively time-consuming to hand-engineer, but the process is necessary to maintain legacy systems.

Campanoni reunited with his postdoctoral advisers — Harvard University’s David Brooks and Gu-Yeon Wei — and a team from Cornell University to develop a learned transpilation system that automatically translates assembly code between ISAs by leveraging the strengths of probabilistic language models and symbolic solvers.

The collaborators demonstrated their novel neurosymbolic approach in the paper “Guess and Sketch: Language Model Guided Transpilation,” which was accepted to the International Conference on Learning Representations, scheduled May 7-11 in Vienna.

“The outcome of this project allows new ISAs to be rapidly adopted in existing compilation pipelines,” Campanoni said. “This will also enable ISA designers to explore the design space of an ISA with their compiler support automatically generated.”

Additional co-authors include Celine Lee and Alexander Rush (Cornell University) and Stephen Chong, Michal Kurek, and Abdulrahman Mahmoud (Harvard University).

Heap sanitizer

A complex region of memory that is dynamically allocated during runtime, the heap is a frequent target for security exploits.

Campanoni and Dinda teamed up with associate professor of computer science Xinyu Xing and his lab members — Ziyi Guo, a PhD student in computer science co-advised by professor of computer science Yan Chen; Zhenpeng Lin (PhD ’23), currently a security researcher at Apple; and Zheng Yu, a PhD student in computer science — to build a heap sanitizer to detect and capture spatial and temporal heap errors.

The team’s compiler and allocator-based heap memory protection (CAMP) tool leverages novel compilation techniques to protect memory stored in the heap of a program against potential security attacks. The paper was accepted to the 33rd USENIX Security Symposium, which will be held in Philadelphia this August.

“Memory heap corruption is an age-old problem in software security,” Xing said. “We discovered that compiler optimizations and system co-design could greatly reduce the overhead needed for memory security guarantees.”

Additional accepted papers

Five additional compiler papers were also accepted to top conferences this year:
