I am a PhD candidate at the Center for Cybersecurity, New York University (NYU), advised by Prof. Brandon Reagen. My research lies at the intersection of deep learning and applied cryptography (homomorphic encryption and multiparty computation), with a focus on cryptographically secure privacy-preserving machine learning (PPML). As part of the DARPA DPRIVE program, I develop novel architectures and algorithms to optimize neural network computations on encrypted data.
In the early stages of my PhD, I led the design of nonlinearity-efficient CNNs, introducing ReLU-optimization techniques (DeepReDuce, ICML'21) and methods for redesigning existing CNNs for private-inference efficiency (DeepReShape, TMLR'24), including a family of architectures called HybReNets.
My current research focuses on making private LLM inference more practical through both architectural optimizations and algorithmic innovations. Specifically, we examine the functional role of nonlinearities from an information-theoretic perspective and develop AERO, a framework that designs nonlinearity-reduced architectures with entropy-guided attention mechanisms. Our preliminary findings have been accepted at ATTRIB@NeurIPS'24 and PPAI@AAAI'25.
Besides research, I have served as an invited reviewer for NeurIPS (2023, 2024), ICLR (2024, 2025), ICML (2024), CVPR (2024), AISTATS (2025), AAAI (2025), and TMLR.
I am currently on the job market, graduating in the summer of 2025, and seeking research scientist roles at the intersection of LLM science, architectural optimization, and privacy-preserving AI. Feel free to reach out!
Ph.D. in Privacy-preserving Deep Learning, 2020 - present
New York University
M.Tech. (Research Assistant) in Computer Science and Engineering, 2017 - 2020
Indian Institute of Technology Hyderabad
B.Tech. in Electronics and Communication Engineering, 2009 - 2013
National Institute of Technology Surat
In this work, we present a comprehensive analysis of the role of nonlinearities in transformer-based, decoder-only language models. We introduce AERO, a four-step architectural optimization framework that refines existing LLM architectures for efficient private inference (PI) by systematically removing nonlinearities such as LayerNorm and GELU and reducing the FLOPs count. For the first time, we propose a Softmax-only architecture with significantly fewer FLOPs, tailored for efficient PI. Furthermore, we devise a novel entropy regularization technique to improve the performance of Softmax-only models. AERO achieves up to 4.23× lower communication and 1.94× lower latency.
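To make the entropy-regularization idea concrete, here is a minimal PyTorch sketch, not the paper's actual formulation: it computes the Shannon entropy of each attention row and penalizes deviation from a target entropy, discouraging both entropy collapse and near-uniform saturation. The function names and hyperparameters (`entropy_reg`, `lam`, `target_frac`) are illustrative assumptions.

```python
import math
import torch
import torch.nn.functional as F

def attention_entropy(attn: torch.Tensor, eps: float = 1e-9) -> torch.Tensor:
    """Shannon entropy (in nats) of each attention row.

    attn: softmax output of shape (batch, heads, queries, keys);
    rows sum to 1 along the last dimension.
    """
    return -(attn * (attn + eps).log()).sum(dim=-1)

def entropy_reg(attn: torch.Tensor, lam: float = 1e-3,
                target_frac: float = 0.5) -> torch.Tensor:
    """Illustrative regularizer: keep each row's entropy near a target
    fraction of the maximum (uniform-attention) entropy, log(num_keys)."""
    max_ent = math.log(attn.size(-1))
    return lam * (attention_entropy(attn) - target_frac * max_ent).pow(2).mean()

# Toy usage: regularize one layer's attention map during training.
scores = torch.randn(2, 8, 16, 16, requires_grad=True)  # (B, H, Q, K) logits
attn = F.softmax(scores, dim=-1)
loss = entropy_reg(attn)  # in practice, added to the usual LM loss
loss.backward()
```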
DeepReDuce is a set of optimizations for the judicious removal of ReLUs to reduce private inference latency, leveraging the heterogeneity of ReLUs in classical networks. DeepReDuce strategically reduces the ReLU count of ResNet18 by up to 4.9× (on CIFAR-100) and 5.7× (on TinyImageNet) with no loss in accuracy. Compared with the state of the art in private inference, DeepReDuce improves accuracy by up to 3.5% (iso-ReLU) and reduces ReLU count by up to 3.5× (iso-accuracy).
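As a rough illustration of ReLU dropping, the PyTorch sketch below replaces the ReLU modules of selected ResNet18 stages with identity and counts the scalar ReLU evaluations that remain per inference. This is not DeepReDuce's actual selection algorithm, which decides which ReLUs to keep via its own criticality analysis; the stage choice and helper names here are assumptions for illustration.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

def drop_relus(module: nn.Module) -> None:
    """Recursively replace every nn.ReLU in `module` with nn.Identity().
    Note: torchvision's BasicBlock reuses one ReLU module twice in its
    forward pass, so replacing it drops both applications."""
    for name, child in module.named_children():
        if isinstance(child, nn.ReLU):
            setattr(module, name, nn.Identity())
        else:
            drop_relus(child)

def count_relu_ops(model: nn.Module, x: torch.Tensor) -> int:
    """Count scalar ReLU evaluations in one forward pass via hooks."""
    total = 0
    def hook(mod, inp, out):
        nonlocal total
        total += out.numel()
    handles = [m.register_forward_hook(hook)
               for m in model.modules() if isinstance(m, nn.ReLU)]
    with torch.no_grad():
        model(x)
    for h in handles:
        h.remove()
    return total

model = resnet18(num_classes=100).eval()
x = torch.randn(1, 3, 32, 32)           # CIFAR-sized input
print("ReLU ops before:", count_relu_ops(model, x))
drop_relus(model.layer1)                # illustrative: drop stages 1 and 2
drop_relus(model.layer2)
print("ReLU ops after: ", count_relu_ops(model, x))
```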