Nandan Kumar Jha
Circa: Stochastic ReLUs for Private Deep Learning
Circa reduces the runtime overhead of the ReLU operation by 1.9x, with no loss in accuracy, by decoupling the sign evaluation and multiplication steps in the garbled circuit. By further approximating the sign evaluation in the garbled circuit and exploiting the error tolerance of neural networks, it achieves a total runtime reduction of 4.7x within a 1% accuracy margin. A plaintext sketch of the decomposition follows this entry.
Zahra Ghodsi
,
Nandan Kumar Jha
,
Brandon Reagen
,
Siddharth Garg
PDF
Cite
Poster
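A minimal plaintext sketch of the idea, assuming a fixed-point encoding and an illustrative number of dropped low-order bits; the cryptographic protocol, share representation, and exact truncation rule of Circa are not reproduced here:

```python
import torch

def circa_style_relu(x, frac_bits=12, drop_bits=4):
    """Plaintext sketch: ReLU(x) = sign_bit(x) * x, with the sign test
    decoupled from the multiplication and performed on a truncated
    fixed-point encoding (low-order bits dropped), which is cheaper but
    occasionally wrong for values very close to zero."""
    fp = torch.round(x * (1 << frac_bits)).to(torch.int64)  # fixed-point encode
    sign_bit = ((fp >> drop_bits) >= 0).to(x.dtype)         # approximate sign test
    return sign_bit * x                                      # separate multiplication

x = torch.randn(8)
print(torch.relu(x))
print(circa_style_relu(x))  # matches ReLU except for inputs near zero
```

In the paper, the sign test and the multiplication are evaluated with different cryptographic primitives; here both are plaintext and only illustrate the error-tolerant sign approximation.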
DeepReDuce: ReLU Reduction for Fast Private Inference
DeepReDuce is a set of optimizations for the judicious removal of ReLUs to reduce private inference latency, leveraging the heterogeneity of ReLU criticality in classical networks. It strategically reduces the ReLU count of ResNet18 by up to 4.9x (on CIFAR-100) and 5.7x (on TinyImageNet) with no loss in accuracy. Compared to the state of the art for private inference, DeepReDuce improves accuracy by up to 3.5% (iso-ReLU) and reduces ReLU count by up to 3.5x (iso-accuracy). A sketch of module-level ReLU dropping follows this entry.
Nandan Kumar Jha
,
Zahra Ghodsi
,
Siddharth Garg
,
Brandon Reagen
PDF
Cite
Poster
Slides
ICML video
Long video
Press release
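A minimal sketch of what dropping ReLUs means operationally, replacing selected ReLU modules with identities in a torchvision ResNet-18. Which ReLUs to keep would come from the criticality analysis in the paper, so the `keep` set below is purely illustrative:

```python
import torch.nn as nn
from torchvision.models import resnet18

def drop_relus(model, keep=()):
    """Replace every nn.ReLU whose qualified name is not in `keep`
    with nn.Identity. Choosing `keep` (i.e., which ReLUs are critical)
    is the paper's contribution and is not reproduced here."""
    for parent_name, parent in list(model.named_modules()):
        for child_name, child in list(parent.named_children()):
            full = f"{parent_name}.{child_name}" if parent_name else child_name
            if isinstance(child, nn.ReLU) and full not in keep:
                setattr(parent, child_name, nn.Identity())
    return model

model = drop_relus(resnet18(num_classes=100), keep={"layer3.0.relu", "layer4.0.relu"})
print(sum(isinstance(m, nn.ReLU) for m in model.modules()))  # remaining ReLU modules
```

Note that torchvision's BasicBlock reuses a single ReLU module for both of its activations, so this sketch works at module granularity rather than the per-activation granularity considered in the paper.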
DRACO: Co-optimizing hardware utilization, and performance of DNNs on systolic accelerator
In this work, we address the low PE utilization and low data reuse of memory-bound DNNs, which stem either from depthwise convolution or from a very small number of channels per group. We propose DNN optimization techniques that strike a balance between PE utilization, energy efficiency, and accuracy. A rough utilization estimate is sketched after this entry.
Nandan Kumar Jha
,
Shreyas Ravishankar
,
Sparsh Mittal
,
Arvind Kaushik
,
Dipan Mandal
,
Mahesh Chandra
PDF
Cite
Slides
Video
DOI
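A back-of-the-envelope utilization model for the problem the paper targets, assuming a weight-stationary systolic array whose rows map to input channels per group and columns to output channels per group; the mapping and array size are illustrative assumptions, not DRACO's exact dataflow:

```python
def pe_utilization(in_ch, out_ch, groups, rows=32, cols=32):
    """Fraction of PEs kept busy by one (grouped) convolution layer under
    the assumed mapping: rows <- input channels per group,
    columns <- output channels per group."""
    in_per_group = in_ch // groups
    out_per_group = out_ch // groups
    row_util = min(in_per_group, rows) / rows
    col_util = min(out_per_group, cols) / cols
    return row_util * col_util

print(pe_utilization(128, 128, groups=1))    # standard conv: 1.0 (fully utilized)
print(pe_utilization(128, 128, groups=32))   # 4 channels per group: ~1.6% of PEs busy
print(pe_utilization(128, 128, groups=128))  # depthwise conv: ~0.1% of PEs busy
```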
ULSAM: Ultra-lightweight subspace attention module for compact convolutional neural networks
The capability of the self-attention mechanism to model long-range dependencies has catapulted its deployment in vision models. …
Rajat Saini
,
Nandan Kumar Jha
,
Bedanta Das
,
Sparsh Mittal
,
C Krishna Mohan
PDF
Cite
Code
Poster
Slides
Video
DOI
E2GC: Energy-efficient group convolution in deep neural networks
The number of groups (g) in group convolution (GConv) is selected to boost the predictive performance of deep neural networks (DNNs) in … A toy calculation of how g trades off parameters and MACs follows this entry.
Nandan Kumar Jha
,
Rajat Saini
,
Subhrajit Nag
,
Sparsh Mittal
PDF
Cite
Slides
DOI
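A toy calculation of how the group count g affects cost, using the standard parameter and MAC formulas for a k x k group convolution; the layer shape below is arbitrary, and the paper's energy model is not reproduced:

```python
def gconv_cost(in_ch, out_ch, k, g, out_h, out_w):
    """Parameters and MACs of a k x k group convolution with g groups
    (bias ignored): params = (in_ch / g) * out_ch * k^2."""
    params = (in_ch // g) * out_ch * k * k
    macs = params * out_h * out_w
    return params, macs

for g in (1, 2, 4, 8, 16):
    params, macs = gconv_cost(256, 256, 3, g, 14, 14)
    print(f"g={g:2d}  params={params:,}  MACs={macs:,}")
```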
The Ramifications of Making Deep Neural Networks Compact
The recent trend in deep neural network (DNN) research is to make the networks more compact. The motivation behind designing compact …
Nandan Kumar Jha
,
Sparsh Mittal
,
Govardhan Mattela
PDF
Cite
Slides
DOI
Characterizing and Optimizing End-to-End Systems for Private Inference
In two-party machine learning prediction services, the client’s goal is to query a remote server’s trained machine learning …
Karthik Garimella
,
Zahra Ghodsi
,
Nandan Kumar Jha
,
Siddharth Garg
,
Brandon Reagen
PDF
Cite
Code
Poster
DeepReShape: Redesigning Neural Networks for Efficient Private Inference
DeepReShape is the first work to conduct a rigorous characterization of the network attributes desirable for efficient Private Inference (PI). We discovered that distinct network attributes are required at different ReLU counts: wider networks are beneficial only at higher ReLU counts, whereas networks with a greater proportion of least-critical ReLUs are desirable at lower ReLU counts. Further, we introduced a novel network design principle, “ReLU-equalization,” to strategically allocate channels within the network and optimize ReLU and FLOP efficiency simultaneously. DeepReShape outperforms the current SOTA (SENets, ICLR'23), achieving a 2.1% accuracy gain and a 5.2x faster runtime at iso-ReLU on CIFAR-100, and an 8.7x faster runtime at iso-accuracy on TinyImageNet. A toy ReLU/FLOPs accounting follows this entry.
Nandan Kumar Jha
,
Brandon Reagen
PDF
Cite
Slides
Video
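A sketch of the accounting that a ReLU/FLOPs-aware channel allocation has to balance, for a toy four-stage network; the stage shapes and the one-convolution-per-stage simplification are assumptions for illustration, not DeepReShape's ReLU-equalization procedure:

```python
def stage_costs(channels, spatial, k=3, in_ch=3):
    """Per-stage ReLU count (one per activation) and conv FLOPs for a toy
    network with a single k x k convolution per stage."""
    relus, flops = [], []
    prev = in_ch
    for c, s in zip(channels, spatial):
        relus.append(c * s * s)                    # activations, hence ReLUs, at this stage
        flops.append(2 * prev * c * k * k * s * s)  # FLOPs of the stage's convolution
        prev = c
    return relus, flops

print(stage_costs([64, 128, 256, 512], [32, 16, 8, 4]))    # baseline-style channel widths
print(stage_costs([16, 64, 256, 1024], [32, 16, 8, 4]))    # reallocating channels shifts where ReLUs are spent
```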