Autonomous NIC Offloads

Boris Pismenny, Haggai Eran, Aviad Yehezkel, Liran Liss, Adam Morrison, Dan Tsafrir

Published in ASPLOS '21: ACM International Conference on Architectural Support for Languages and Operating Systems, 2021

CPUs routinely offload to NICs network-related processing tasks like packet segmentation and checksum. NIC offloads are advantageous because they free CPU cycles. But they are usually applicable to layer≤4 protocols, and not to layer-5 protocols (L5Ps) built on top of TCP. This limitation is due to a misfeature we call ``offload dependence,’’ which dictates that L5P offloads also require offloading the underlying layer≤4 protocols and related functionality—TCP, IP, firewall, tunneling, etc. Offload dependence hinders innovation because it hard-wires a complicated, ever-changing implementation.

We propose ``autonomous NIC offloads,’’ which eliminate offload dependence. Autonomous offloads provide a stateful and lightweight software-device architecture that accelerates L5Ps without having to migrate the entire layer≤4 TCP/IP stack into the NIC. A main challenge that autonomous offloads address is coping with out-of-sequence packets. We implement autonomous offloads for two L5Ps: (i) NVMe-over-TCP zero-copy and CRC computation, and (ii) https authentication, encryption, and decryption. Our autonomous offloads improve throughput by up to 3.3x and lower CPU consumption and latency by as much as 2.4x and 1.4x. Their implementation is already upstreamed in the Linux kernel, and they will be supported in the next-generation of Mellanox NICs.