Nandan Kumar Jha
Circa: Stochastic ReLUs for Private Deep Learning
Circa reduces the runtime overhead of the ReLU operation by 1.9x, with no loss in accuracy, by decoupling the sign evaluation and multiplication steps in the garbled circuit. By further approximating the sign evaluation in the garbled circuit and exploiting the error tolerance of neural networks, it achieves a total runtime reduction of 4.7x within a 1% accuracy margin. A plaintext sketch of the decomposition follows this entry.
Zahra Ghodsi
,
Nandan Kumar Jha
,
Brandon Reagen
,
Siddharth Garg
PDF
Cite
Poster
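A minimal plaintext sketch of the idea, assuming a fixed-point encoding and an illustrative number of dropped low-order bits; the cryptographic protocol, share representation, and exact truncation rule of Circa are not reproduced here:

```python
import torch

def circa_style_relu(x, frac_bits=12, drop_bits=4):
    """Plaintext sketch: ReLU(x) = sign_bit(x) * x, with the sign test
    decoupled from the multiplication and performed on a truncated
    fixed-point encoding (low-order bits dropped), which is cheaper but
    occasionally wrong for values very close to zero."""
    fp = torch.round(x * (1 << frac_bits)).to(torch.int64)  # fixed-point encode
    sign_bit = ((fp >> drop_bits) >= 0).to(x.dtype)         # approximate sign test
    return sign_bit * x                                      # separate multiplication

x = torch.randn(8)
print(torch.relu(x))
print(circa_style_relu(x))  # matches ReLU except for inputs near zero
```

In the paper, the sign test and the multiplication are evaluated with different cryptographic primitives; here both are plaintext and only illustrate the error-tolerant sign approximation.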
DeepReDuce: ReLU Reduction for Fast Private Inference
DeepReDuce is a set of optimizations for the judicious removal of ReLUs to reduce private inference latency, leveraging the heterogeneity of ReLU criticality in classical networks. It strategically reduces the ReLU count of ResNet18 by up to 4.9x (on CIFAR-100) and 5.7x (on TinyImageNet) with no loss in accuracy. Compared to the state of the art for private inference, DeepReDuce improves accuracy by up to 3.5% (iso-ReLU) and reduces ReLU count by up to 3.5x (iso-accuracy). A sketch of module-level ReLU dropping follows this entry.
Nandan Kumar Jha
,
Zahra Ghodsi
,
Siddharth Garg
,
Brandon Reagen
PDF
Cite
Poster
Slides
ICML video
Long video
Press release
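A minimal sketch of what dropping ReLUs means operationally, replacing selected ReLU modules with identities in a torchvision ResNet-18. Which ReLUs to keep would come from the criticality analysis in the paper, so the `keep` set below is purely illustrative:

```python
import torch.nn as nn
from torchvision.models import resnet18

def drop_relus(model, keep=()):
    """Replace every nn.ReLU whose qualified name is not in `keep`
    with nn.Identity. Choosing `keep` (i.e., which ReLUs are critical)
    is the paper's contribution and is not reproduced here."""
    for parent_name, parent in list(model.named_modules()):
        for child_name, child in list(parent.named_children()):
            full = f"{parent_name}.{child_name}" if parent_name else child_name
            if isinstance(child, nn.ReLU) and full not in keep:
                setattr(parent, child_name, nn.Identity())
    return model

model = drop_relus(resnet18(num_classes=100), keep={"layer3.0.relu", "layer4.0.relu"})
print(sum(isinstance(m, nn.ReLU) for m in model.modules()))  # remaining ReLU modules
```

Note that torchvision's BasicBlock reuses a single ReLU module for both of its activations, so this sketch works at module granularity rather than the per-activation granularity considered in the paper.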
DRACO: Co-optimizing hardware utilization, and performance of DNNs on systolic accelerator
In this work, we address the low PE utilization and low data reuse of memory-bound DNNs, which stem either from depthwise convolution or from a very small number of channels per group. We propose DNN optimization techniques that strike a balance between PE utilization, energy efficiency, and accuracy. A rough utilization estimate is sketched after this entry.
Nandan Kumar Jha
,
Shreyas Ravishankar
,
Sparsh Mittal
,
Arvind Kaushik
,
Dipan Mandal
,
Mahesh Chandra
PDF
Cite
Slides
Video
DOI
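A back-of-the-envelope utilization model for the problem the paper targets, assuming a weight-stationary systolic array whose rows map to input channels per group and columns to output channels per group; the mapping and array size are illustrative assumptions, not DRACO's exact dataflow:

```python
def pe_utilization(in_ch, out_ch, groups, rows=32, cols=32):
    """Fraction of PEs kept busy by one (grouped) convolution layer under
    the assumed mapping: rows <- input channels per group,
    columns <- output channels per group."""
    in_per_group = in_ch // groups
    out_per_group = out_ch // groups
    row_util = min(in_per_group, rows) / rows
    col_util = min(out_per_group, cols) / cols
    return row_util * col_util

print(pe_utilization(128, 128, groups=1))    # standard conv: 1.0 (fully utilized)
print(pe_utilization(128, 128, groups=32))   # 4 channels per group: ~1.6% of PEs busy
print(pe_utilization(128, 128, groups=128))  # depthwise conv: ~0.1% of PEs busy
```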
ULSAM: Ultra-lightweight subspace attention module for compact convolutional neural networks
The capability of the self-attention mechanism to model long-range dependencies has catapulted its deployment in vision models. …
Rajat Saini
,
Nandan Kumar Jha
,
Bedanta Das
,
Sparsh Mittal
,
C Krishna Mohan
PDF
Cite
Code
Poster
Slides
Video
DOI
E2GC: Energy-efficient group convolution in deep neural networks
The number of groups (g) in group convolution (GConv) is selected to boost the predictive performance of deep neural networks (DNNs) in … A toy calculation of how g trades off parameters and MACs follows this entry.
Nandan Kumar Jha
,
Rajat Saini
,
Subhrajit Nag
,
Sparsh Mittal
PDF
Cite
Slides
DOI
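A toy calculation of how the group count g affects cost, using the standard parameter and MAC formulas for a k x k group convolution; the layer shape below is arbitrary, and the paper's energy model is not reproduced:

```python
def gconv_cost(in_ch, out_ch, k, g, out_h, out_w):
    """Parameters and MACs of a k x k group convolution with g groups
    (bias ignored): params = (in_ch / g) * out_ch * k^2."""
    params = (in_ch // g) * out_ch * k * k
    macs = params * out_h * out_w
    return params, macs

for g in (1, 2, 4, 8, 16):
    params, macs = gconv_cost(256, 256, 3, g, 14, 14)
    print(f"g={g:2d}  params={params:,}  MACs={macs:,}")
```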
The Ramifications of Making Deep Neural Networks Compact
The recent trend in deep neural network (DNN) research is to make the networks more compact. The motivation behind designing compact …
Nandan Kumar Jha
,
Sparsh Mittal
,
Govardhan Mattela
PDF
Cite
Slides
DOI
Characterizing and Optimizing End-to-End Systems for Private Inference
In two-party machine learning prediction services, the client’s goal is to query a remote server’s trained machine learning …
Karthik Garimella
,
Zahra Ghodsi
,
Nandan Kumar Jha
,
Siddharth Garg
,
Brandon Reagen
PDF
Cite
Code
Poster
DeepReShape: Redesigning Neural Networks for Efficient Private Inference
DeepReShape is the first work to conduct a rigorous characterization of the network attributes desirable for efficient Private Inference (PI). We discovered that distinct network attributes are required at different ReLU counts: wider networks are beneficial only at higher ReLU counts, whereas networks with a greater proportion of least-critical ReLUs are desirable at lower ReLU counts. Further, we introduced a novel network design principle, “ReLU-equalization,” to strategically allocate channels within the network and optimize ReLU and FLOP efficiency simultaneously. DeepReShape outperforms the current SOTA (SENets, ICLR'23), achieving a 2.1% accuracy gain and a 5.2x faster runtime at iso-ReLU on CIFAR-100, and an 8.7x faster runtime at iso-accuracy on TinyImageNet. A toy ReLU/FLOPs accounting follows this entry.
Nandan Kumar Jha
,
Brandon Reagen
PDF
Cite
Slides
Video
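A sketch of the accounting that a ReLU/FLOPs-aware channel allocation has to balance, for a toy four-stage network; the stage shapes and the one-convolution-per-stage simplification are assumptions for illustration, not DeepReShape's ReLU-equalization procedure:

```python
def stage_costs(channels, spatial, k=3, in_ch=3):
    """Per-stage ReLU count (one per activation) and conv FLOPs for a toy
    network with a single k x k convolution per stage."""
    relus, flops = [], []
    prev = in_ch
    for c, s in zip(channels, spatial):
        relus.append(c * s * s)                    # activations, hence ReLUs, at this stage
        flops.append(2 * prev * c * k * k * s * s)  # FLOPs of the stage's convolution
        prev = c
    return relus, flops

print(stage_costs([64, 128, 256, 512], [32, 16, 8, 4]))    # baseline-style channel widths
print(stage_costs([16, 64, 256, 1024], [32, 16, 8, 4]))    # reallocating channels shifts where ReLUs are spent
```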