IDEAL Hosts Theory-in-Practice Workshop
An IDEAL special workshop examined practical, robust solutions to big data challenges from an algorithmic perspective.
The ability to store, process, and manage large amounts of data is key to extracting knowledge and insights from it. Since the decline of Moore’s Law (the prediction that the number of transistors in a dense integrated circuit doubles approximately every two years, which long drove corresponding gains in processor speed), researchers and practitioners have turned to software innovations to solve big data challenges.
On September 12-13, the Institute for Data, Econometrics, Algorithms, and Learning (IDEAL) hosted a special workshop focused on practical, robust solutions from an algorithmic perspective. The event was organized by Quanquan Liu, a postdoctoral scholar in the Northwestern CS Theory Group, and Samir Khuller, Peter and Adrienne Barris Chair of Computer Science at Northwestern Engineering.
At the intersection of theory and practice, speakers addressed topics including high performance computing, large-scale graph algorithms, and parallel algorithms and data structures.
“As we encounter the challenges associated with ever-growing amounts of data, it is important for our practical solutions to have robust theoretical foundations. Likewise, it is also important for our theoretical solutions to be practical,” Liu said.
Launched this month, IDEAL Phase II aims to accelerate transformative advances in the theoretical foundations of data science through research and education programs on machine learning and optimization; high-dimensional data analysis and inference; and emerging topics including reliability, interpretability, privacy, and fairness.
Investigators in computer science, economics, electrical engineering, law, mathematics, operations research, and statistics will collaborate across Northwestern, Google Research, the Illinois Institute of Technology (IIT), the Toyota Technological Institute at Chicago, the University of Illinois at Chicago, and the University of Chicago.
“Parallel computing is now a key component in our ability to read in, digest and process massive amounts of data,” Khuller said. “This wonderful workshop brought together researchers ranging from systems to languages to theory to have a discussion of some of the key challenges ahead, and how decades of work in parallel algorithms can provide some of the needed insights.”
On the first day of the event, guest speakers presented on topics including efficient algorithms for web-scale nearest-neighbor search, parallel computing and dependency graphs, real-time systems and scheduling algorithms, clustering algorithms, and parallel algorithms.
The speakers were:
- Kunal Agrawal (Washington University in St. Louis) – “Real-time Scheduling”
- Jakub Łącki (Google Research – New York) – “Scaling Hierarchical Agglomerative Clustering to Trillion-Edge Graphs”
- Stefan Muller (IIT Chicago) – “Static Prediction of Parallel Computation Graphs”
- Harsha Simhadri (Microsoft Research Lab – India) – “Approximate Nearest Neighbor Search Algorithms for Web-scale Search and Recommendation”
- Uzi Vishkin (University of Maryland, College Park) – “The Unwitting Decline of CPUs: Can Parallel Algorithms Reverse It?”
“Working on the intersection of theory and practice is an area that I am very excited about, as it often leads to developing algorithms that are both elegant and impactful,” Łącki said. “The workshop was a great venue for exchanging ideas and connecting to other researchers working in the area.”
The second day of the workshop featured discussions on paging, systems for graph processing, and directed graphs, with presentations from:
- Michael Bender (Stony Brook University) – “Tight Bounds for Online Parallel Paging and Green Paging”
- Laxman Dhulipala (University of Maryland, College Park and Google Research – New York) – “Efficient Algorithms and Systems for Dynamic and Streaming Graphs”
- Jeremy Fineman (Georgetown University) – “Parallel Algorithms for Directed Graphs”
“Members of the IDEAL and Northwestern communities enjoyed the variety of talks that span the bridge between theory and practice,” Liu said. “We hope that our workshop kickstarts more conversations and collaborations that will lead to interesting and impactful research work.”
The workshop schedule also included time for informal interaction among guests as well as meetings between guest speakers and Northwestern CS students, faculty, and postdocs. The event was held in Mudd Hall on Northwestern’s Evanston campus.
“A wonderfully stimulating workshop with a well-balanced technical program across complementary topics and great questions and discussions,” said Vishkin.
Fall 2022 IDEAL Special Quarter on Data Economics
To foster interdisciplinary, inter-institute collaborative research, IDEAL will run thematically focused “special quarters.” The first Phase II program, the “Special Quarter on Data Economics,” will start this fall. It is organized by Northwestern Engineering’s Jason Hartline, professor of computer science, and Zhaoran Wang, assistant professor of industrial engineering and management sciences and (by courtesy) computer science, in collaboration with Ian Kash (University of Illinois at Chicago), Nicolas Lambert (Massachusetts Institute of Technology), Grant Schoenebeck (University of Michigan), Bo Waggoner (University of Colorado Boulder), and Haifeng Xu (University of Chicago).
Workshops, including the “Elicitation Mechanisms in Practice Workshop” on October 14, the “Elicitation and Evaluation Workshop” on October 28, and the “Challenges in Data Economics Workshop” on November 11, will explore topics such as valuing data, eliciting data, incentivizing data collection and sharing, adaptive data analysis, and game theory with data.
In addition, two graduate courses will be offered during the special quarter. Hartline is leading a course on data economics (COMP_SCI 497: Data Economics), and Annie Liang, assistant professor of economics at Northwestern’s Weinberg College of Arts and Sciences, is teaching a course on the economics of information (ECON 414: Economics of Information).