Learning the Fundamentals of Academic Research
On April 3, the cohort of Northwestern Computer Science Research Track program students presented projects they developed over fall and winter quarters.
Looking back on their experience in Northwestern Computer Science’s two-quarter course sequence, COMP_SCI 298: Introduction to Research Track and COMP_SCI 398: Research Track Practicum, Rachana Aluri appreciated the opportunity to explore the research process in a “no barriers to entry” environment.

“I always wanted to do research, but I had no idea what I wanted to work on,” said Aluri, who, along with Sydney Hoppenworth and Anitej Siluveru, examined persuasive techniques in data visualization with computer science PhD student Lily Ge and Professor Matthew Kay. “This track was really nice because it connects you with a faculty member and lab and lets you jump right in.”
Third-year students in the Northwestern CS Research Track program learn the fundamentals of academic research through structured, mentored, and collaborative projects. In the fall quarter Introduction to Research Track, program leader and CS 298 instructor Sruti Bhagavatula assigns students to teams based on their research interests and experience from prior coursework. Teams start their projects by conducting literature reviews, gathering data or resources, and gaining project-specific skills. Bhagavatula, CS faculty members, and graduate students provide guidance and project mentorship throughout the process. Teams continue progress in the winter Research Track Practicum.

“Through research, students acquire valuable skills such as ideating and formulating problems, determining solutions to problems to which there is no ‘right’ answer, and directing a project in the face of unknowns or challenges,” said Bhagavatula, assistant professor of instruction at Northwestern Engineering. “It has been wonderful seeing the growth of these students, and I am excited that some of them are interested in pursuing graduate school or a research career path in the future.”
Analyzing Legal AI Models, Security Vulnerabilities
For their project, students Yong-Yu Huang and Bryan Karanja (CS) examined ChatGPT’s ability to reliably answer questions about Chicago’s Residential Landlord and Tenant Ordinance. By integrating retrieval-augmented generation via LightRAG into the OpenAI o1 API, the team improved both the accuracy and groundedness of the legal AI model. The team was mentored by Mohammed Alam, assistant professor of instruction and deputy director of the Master of Science in Artificial Intelligence Program; and Daniel W. Linna Jr., a senior lecturer and director of law and technology initiatives at the Northwestern Pritzker School of Law.

“This experience taught me a lot about narrowing down questions and understanding what scope is feasible with a limited amount of time,” Huang said.
“I learned tenacity,” Karanja said. “We were in the weeds, and being able to come out with results was really satisfying.”
The literature review process gave Ruhama Tesfa (CS) more confidence in reading technical papers to gain insights on the state of the field and potential research directions.

Mentored by Bhagavatula, Tesfa examined the prevalence of security vulnerabilities in Stack Overflow’s developer Q&A platform, including published JSON Web Tokens used for authorization or data transmission, the usage of insecure websockets that could lead to eavesdropping and man-in-the-middle attacks, and the sharing of hashes that could expose password data. She identified a significant number of vulnerable posts with enough context for malicious actors to gain unauthorized access to systems.
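
To illustrate why a published JSON Web Token is a risk, the following minimal Python sketch flags JWT-shaped strings in a block of text and decodes their header and payload, which are only base64url-encoded, not encrypted. It is an illustration under stated assumptions, not Tesfa’s detection method: the regular expression and the `find_jwts` helper are hypothetical names introduced for this example.

```python
import base64
import json
import re

# A JWT is three base64url segments separated by dots. A JSON header such as
# {"alg": ...} encodes to a segment beginning with "eyJ" (assumed heuristic).
JWT_PATTERN = re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*")


def _b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring any stripped padding."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def find_jwts(text: str) -> list:
    """Return the decoded header and payload of each JWT-shaped string in text."""
    found = []
    for match in JWT_PATTERN.finditer(text):
        header_seg, payload_seg, _signature = match.group(0).split(".")
        try:
            header = json.loads(_b64url_decode(header_seg))
            payload = json.loads(_b64url_decode(payload_seg))
        except ValueError:
            continue  # looked like a token but did not decode to JSON
        found.append({"token": match.group(0), "header": header, "payload": payload})
    return found
```

Anyone who can read the post can run the same decoding, so a token pasted into a question exposes its claims even before the signature is considered.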
Applying self-supervised learning models and machine unlearning techniques
Mona Gomaa (computer engineering) and Marija Jovic (CS and cognitive science) collaborated with CS PhD student Tianao Li and assistant professor of computer science Emma Alexander on a physics-based framework for joint reconstruction of initial pressure and speed-of-sound images from non-invasive medical imaging modalities. Gomaa and Jovic extended the model to support 3D imaging and reconstruction of human tissues or organs.

“Seeing the tangible results of our code was a big goal of mine in research,” Jovic said. “The fact that we could actually simulate these images and achieve these results definitely made the challenge worth it.”
Amplifying the impact of their results, Research Track teams contributed their work to the open-source community.
Computer science students Andre Shportko, Matthew Khoriaty, and Gustavo Mercier explored machine unlearning, the process of removing the influence of specific training data from a trained model. Employing the Weapons of Mass Destruction Proxy benchmark dataset of hazardous knowledge in biosecurity, cybersecurity, and chemical security, the team developed a sparse autoencoder-based unlearning methodology that uses conditional steering to improve the removal of this harmful information while retaining safe information. The team published an arXiv preprint with adviser Zach Wood-Doughty, assistant professor of instruction.

“Instead of trying to code from scratch, I wanted to build upon the EleutherAI Language Model Evaluation Harness library,” Khoriaty said. “My functionality was added to the official repository and now other researchers can use my code. I am really proud of that.”
A pipeline to grad school
Through research talks and panel discussions on graduate school with CS PhD students, Research Track students also got an inside look at the post-graduate path to help them make a more informed decision about whether grad school is the right fit.
“The track did a great job of teaching us what a PhD would look like, setting us up for long-term success by showing us the highs and lows of a research-oriented career path,” said Bennett Lindberg (CS), who plans to pursue a career in industry for a year or two after graduation and then revisit the opportunity of grad school.

Lindberg and Andrew Li (integrated science, CS, and chemistry), guided by assistant professor of computer science Christos Dimoulas, worked to bridge the trade-off between easier-to-use dynamically typed languages and type-safe, more maintainable statically typed languages. They developed a proof of concept to automate the generation of types using symbolic evaluation and SMT solvers.
Mercier plans to pursue a PhD and has started preparing his application. He said that one of his biggest takeaways from the Research Track program is the importance of continuous learning.
“I learned a lot through this project, not only in findings, but also in terms of background research,” Mercier said. “I also learned how much I still need to learn, and this idea of continuously pushing my knowledge right up to the edge and beyond.”
Additional Research Track projects

Anthony Alvarez and Jason Hu – “RP2040 WiFi + SPI Driver for TockOS”
Adviser: Professor Branden Ghena
Alvarez and Hu developed a fully functional communication interface to enable WiFi functionality in the TockOS embedded operating system on the Raspberry Pi Pico WH microcontroller. Their driver was officially merged into the TockOS code base after a few rounds of peer review.

Kevin Fan – “AI and Mathematical Theorem Proving”
Adviser: Postdoctoral fellow Dmitrii Avdiukhin
Fan synthesized existing techniques for automated theorem proving in the Lean 4 programming language, including fine-tuning, reinforcement learning, retrieval augmentation, and expert iteration.

Eliot Lee and Evan Smith – “LLM Agents in the Tragedy of the Commons: Does Memory Foster Cooperation?”
Adviser: Zach Wood-Doughty
Lee and Smith used a large language model (LLM) as a role-playing agent in a “Tragedy of the Commons” simulation in which parties acting in their own self-interest deplete a shared resource like fossil fuels, ultimately to the detriment of all parties. They determined that introducing a memory mechanism to the limited context window of LLMs did not improve the agent’s cooperation in collective action problems.
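
The dynamic behind a “Tragedy of the Commons” can be sketched with a toy model in Python. This is not the students’ simulation, which used LLM agents. It is a minimal illustration in which every agent takes a fixed harvest each round and the remaining stock regrows logistically. The function name `simulate_commons` and all parameter values are assumptions chosen for the example.

```python
def simulate_commons(num_agents, harvest_per_agent, stock=100.0,
                     regrowth_rate=0.25, capacity=100.0, max_rounds=50):
    """Return the stock level after each round of a shared-resource game.

    Each round the agents collectively take num_agents * harvest_per_agent
    units (or whatever is left), then the remaining stock regrows
    logistically toward capacity.
    """
    history = []
    for _ in range(max_rounds):
        demand = num_agents * harvest_per_agent
        stock -= min(demand, stock)  # agents cannot take more than exists
        stock += regrowth_rate * stock * (1 - stock / capacity)
        history.append(stock)
        if stock <= 0:
            break  # the commons has collapsed
    return history


greedy = simulate_commons(num_agents=4, harvest_per_agent=10)
restrained = simulate_commons(num_agents=4, harvest_per_agent=1)
```

With these assumed numbers, the greedy run empties the resource by the third round, while the restrained run is still well stocked after all 50 rounds.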

Aanand Patel – “Exploring Basis Functions in Kolmogorov-Arnold Networks”
Advisers: Professor Emma Alexander and CS PhD student Yi-Chun Hung
Patel examined Kolmogorov-Arnold Networks (KANs), a neural network architecture that can model complex relationships more efficiently by learning activation functions. He tested five basis functions to determine their impact on KAN performance.
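
As a rough illustration of what “learning activation functions” means, the Python sketch below represents a single learnable one-dimensional function as a weighted sum of basis functions and fits the weights by gradient descent. The Gaussian basis, the function names, and the hyperparameters are assumptions made for this example. The article does not name the five basis functions Patel tested, and this is not his implementation.

```python
import math


def gaussian_basis(x, centers, width):
    """Evaluate one Gaussian bump per center at the scalar input x."""
    return [math.exp(-((x - c) / width) ** 2) for c in centers]


def edge_activation(x, coeffs, centers, width=0.5):
    """Learnable 1-D function phi(x) = sum_i coeffs[i] * B_i(x)."""
    basis = gaussian_basis(x, centers, width)
    return sum(w * b for w, b in zip(coeffs, basis))


def mse(xs, ys, coeffs, centers, width=0.5):
    """Mean squared error of the activation against target values."""
    errors = [edge_activation(x, coeffs, centers, width) - y for x, y in zip(xs, ys)]
    return sum(e * e for e in errors) / len(xs)


def fit_step(xs, ys, coeffs, centers, width=0.5, lr=0.1):
    """One gradient-descent step on the mean squared error w.r.t. coeffs."""
    grads = [0.0] * len(coeffs)
    for x, y in zip(xs, ys):
        basis = gaussian_basis(x, centers, width)
        err = sum(w * b for w, b in zip(coeffs, basis)) - y
        for i, b in enumerate(basis):
            grads[i] += 2 * err * b / len(xs)
    return [w - lr * g for w, g in zip(coeffs, grads)]
```

Swapping `gaussian_basis` for a different family of functions changes what shapes the activation can take, which is the kind of choice a comparison of basis functions would examine.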

Joseph Shim – “Hyperspectral Image Denoising”
Advisers: Professor Emma Alexander and CS PhD student Kerem Aydin
Hyperspectral images capture detailed spectral data but are highly sensitive to noise. Shim compared two denoising methods to evaluate which is more effective in balancing image quality and computational efficiency. He determined that the hyperspectral denoising transformer improves detail preservation and enables real-time applications in fields like remote sensing and medical imaging.