
Azure Accelerated Networking: SmartNICs in the Public Cloud

Daniel Firestone, Andrew Putnam, Sambhrama Mundkur, Derek Chiou, Alireza Dabagh, Mike Andrewartha, Hari Angepat, Vivek Bhanu, Adrian Caulfield, Eric Chung, Harish Kumar Chandrappa, Somesh Chaturmohta, Matt Humphrey, Jack Lavier, Norman Lam, Fengfen Liu, Kalin Ovtcharov, Jitu Padhye, Gautham Popuri, Shachar Raindel, Tejas Sapre, Mark Shaw, Gabriel Silva, Madhan Sivakumar, Nisheeth Srivastava, Anshuman Verma, Qasim Zuhair, Deepak Bansal, Doug Burger, Kushagra Vaid, David A. Maltz, Albert Greenberg

Microsoft

Abstract

Modern cloud architectures rely on each server running its own networking stack to implement policies such as tunneling for virtual networks, security, and load balancing. However, these networking stacks are becoming increasingly complex as features are added and as network speeds increase. Running these stacks on CPU cores takes away processing power from VMs, increasing the cost of running cloud services, and adding latency and variability to network performance. We present Azure Accelerated Networking (AccelNet), our solution for offloading host networking to hardware, using custom Azure SmartNICs based on FPGAs.

We define the goals of AccelNet, including programmability comparable to software, and performance and efficiency comparable to hardware. We show that FPGAs are the best current platform for offloading our networking stack, as ASICs do not provide sufficient programmability, and embedded CPU cores do not provide scalable performance, especially on single network flows.

SmartNICs implementing AccelNet have been deployed on all new Azure servers since late 2015, in a fleet of >1M hosts. The AccelNet service has been available for Azure customers since 2016, providing consistent <15 μs VM-VM TCP latencies and 32 Gbps throughput, which we believe represents the fastest network available to customers in the public cloud. We present the design of AccelNet, including our hardware/software co-design model, performance results on key workloads, and experiences and lessons learned from developing and deploying AccelNet on FPGA-based Azure SmartNICs.

Introduction

The public cloud is the backbone behind a massive and rapidly growing percentage of online software services [1, 2, 3].

In the Microsoft Azure cloud alone, these services consume millions of processor cores, exabytes of storage, and petabytes of network bandwidth. Network performance, both bandwidth and latency, is critical to most cloud workloads, especially interactive workloads.

As a large public cloud provider, Azure has built its cloud network on host-based software-defined networking (SDN) technologies, using them to implement almost all virtual networking features, such as private virtual networks with customer supplied address spaces, scalable L4 load balancers, security groups and access control lists (ACLs), virtual routing tables, bandwidth metering, QoS, and more. These features are the responsibility of the host platform, which typically means software running in the hypervisor.

The cost of providing these services continues to increase. In the span of only a few years, we increased networking speeds by 40x and more, from 1 GbE to 40 GbE+, and added countless new features. And while we built increasingly well-tuned and efficient host SDN packet processing capabilities, running this stack in software on the host requires additional CPU cycles.

Burning CPUs for these services takes away from the processing power available to customer VMs, and increases the overall cost of providing cloud services.

Single Root I/O Virtualization (SR-IOV) [4, 5] has been proposed to reduce CPU utilization by allowing direct access to NIC hardware from the VM. However, this direct access would bypass the host SDN stack, making the NIC responsible for implementing all SDN policy. Since these policies change rapidly (weeks to months), we required a solution that could provide software-like programmability while providing hardware-like performance.

In this paper we present Azure Accelerated Networking (AccelNet), our host SDN stack implemented on the FPGA-based Azure SmartNIC. AccelNet provides near-native network performance in a virtualized environment, offloading packet processing from the host CPU to the Azure SmartNIC. Building upon the software-based VFP host SDN platform [6], and the hardware and software infrastructure of the Catapult program [7, 8], AccelNet provides the performance of dedicated hardware, with the programmability of software running in the hypervisor. Our goal is to present both our design and our experiences running AccelNet in production at scale, and lessons learned.

Traditional Host Network Processing

In the traditional device sharing model of a virtualized environment such as the public cloud, all network I/O to and from a physical device is exclusively performed in the host software partition of the hypervisor.

Figure 1: An SR-IOV NIC with a PF and VFs.

Every packet sent and received by a VM is processed by the Virtual Switch (vSwitch) in the host networking stack. Receiving packets typically involves the hypervisor copying each packet into a VM-visible buffer, simulating a soft interrupt to the VM, and then allowing the VM's OS stack to continue network processing. Sending packets is similar, but in the opposite order. Compared to a non-virtualized environment, this additional host processing reduces performance, requires additional changes in privilege level, lowers throughput, increases latency and latency variability, and increases host CPU utilization.
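
To make the per-packet cost concrete, here is a minimal, purely illustrative Python sketch of the receive path just described. The class and function names are hypothetical, not Hyper-V or VFP APIs; it is only meant to show the steps that run on a host CPU core for every received packet.

```python
# Minimal, purely illustrative model of the software (non-SR-IOV) receive path
# described above. Every step runs on a host CPU core for every received packet.

class Vm:
    def __init__(self, name):
        self.name = name
        self.rx_buffer = []            # VM-visible receive buffer

    def soft_interrupt(self):
        # Stand-in for the hypervisor signalling the guest; the guest OS
        # network stack would continue processing the packet from here.
        print(f"{self.name}: soft interrupt, {len(self.rx_buffer)} packet(s) queued")

def host_receive(packet, vm, sdn_policy):
    """vSwitch receive path: policy, copy, notify -- all in host software."""
    packet = sdn_policy(packet)        # per-packet host SDN processing (ACLs, NAT, encap, ...)
    if packet is None:                 # e.g. dropped by an ACL
        return
    vm.rx_buffer.append(packet)        # hypervisor copies the packet into a VM-visible buffer
    vm.soft_interrupt()                # simulated interrupt so the guest stack runs

# Example with a trivial "allow everything" policy.
host_receive(b"raw-ethernet-frame", Vm("vm0"), sdn_policy=lambda p: p)
```

The send path performs the same steps in the opposite order; every copy, policy lookup, and privilege-level transition in this sketch is work that the offloads described later remove from the host CPU.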

Host SDN

In addition to selling VMs, cloud vendors selling Infrastructure-as-a-Service (IaaS) have to provide rich network semantics, such as private virtual networks with customer supplied address spaces, scalable L4 load balancers, security groups and ACLs, virtual routing tables, bandwidth metering, QoS, and more. These semantics are sufficiently complex and change too frequently that it isn't feasible to implement them at scale in traditional switch hardware. Instead, these are implemented on each host in the vSwitch. This scales well with the number of servers, and allows the physical network to be simple, scalable and very fast.

The Virtual Filtering Platform (VFP) is our cloud-scale programmable vSwitch, providing scalable SDN policy for Azure. It is designed to handle the programmability needs of Azure's many SDN applications, providing a platform for multiple SDN controllers to plumb complex, stateful policy via match-action tables. Details about VFP and how it implements virtual networks in software in Azure can be found in [6].
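
The paper describes VFP only at the level of match-action tables; as a purely illustrative sketch (hypothetical names, not VFP's actual API; see [6] for the real design), layered match-action processing might look like this:

```python
# Hypothetical sketch of layered match-action processing in the spirit of VFP [6].
# Each SDN controller programs its own layer of (match, action) rules, and every
# packet is evaluated against the layers in order; names here are illustrative only.

def process(packet, layers):
    """Run a packet (a dict of header fields) through each policy layer in order."""
    for layer in layers:                 # e.g. ACL layer, load-balancer NAT layer, VNET encap layer
        for match, action in layer:
            if match(packet):
                packet = action(packet)  # an action may rewrite headers...
                if packet is None:       # ...or drop the packet entirely
                    return None
                break                    # first matching rule in a layer wins
    return packet

acl_layer  = [(lambda p: p["dst_port"] == 22, lambda p: None),       # example: block SSH
              (lambda p: True,                lambda p: p)]           # default: allow
vnet_layer = [(lambda p: True, lambda p: {**p, "encap": "VXLAN"})]    # example: encapsulate for the VNET

print(process({"src_ip": "10.0.0.4", "dst_ip": "10.0.0.5", "dst_port": 443},
              [acl_layer, vnet_layer]))
```

Evaluating layers like this in software for every packet of every flow is the per-packet CPU cost that the rest of the paper works to remove from the host.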

SR-IOV

Many performance bottlenecks caused by doing packet processing in the hypervisor can be overcome by using hardware that supports SR-IOV. SR-IOV-compliant hardware provides a standards-based foundation for efficiently and securely sharing PCI Express (PCIe) device hardware among multiple VMs. The host connects to a privileged physical function (PF), while each virtual machine connects to its own virtual function (VF). A VF is exposed as a unique hardware device to each VM, allowing the VM direct access to the actual hardware, yet still isolating VM data from other VMs. As illustrated in Figure 1, an SR-IOV NIC contains an embedded switch to forward packets to the right VF based on the MAC address. All data packets flow directly between the VM operating system and the VF, bypassing the host networking stack entirely. This provides improved throughput, reduced CPU utilization, lower latency, and improved scalability.

However, bypassing the hypervisor brings on a new set of challenges since it also bypasses all host SDN policy such as that implemented in VFP. Without additional mechanisms, these important functions cannot be performed as the packets are not processed by the SDN stack in the host.
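
As a toy illustration of the forwarding model described above (not any real NIC's implementation), the embedded switch can be modeled as a MAC-to-VF lookup; note that nothing on this path ever consults host SDN policy, which is exactly the gap the next section addresses.

```python
# Toy model of the SR-IOV forwarding path, not any real NIC implementation.
# The embedded switch picks a VF by destination MAC and delivers the packet
# straight to that VM's queue -- the host SDN stack (VFP) is never consulted.

class SriovNic:
    def __init__(self):
        self.vf_by_mac = {}                      # embedded switch forwarding table

    def assign_vf(self, mac, vf_queue):
        self.vf_by_mac[mac] = vf_queue           # one VF per VM, exposed as its own device

    def receive(self, packet):
        vf_queue = self.vf_by_mac.get(packet["dst_mac"])
        if vf_queue is not None:
            vf_queue.append(packet)              # DMA directly into the VM's buffers
        # No call into host software here: ACLs, metering, encapsulation, etc.
        # would all be skipped unless the NIC itself can apply them.

nic = SriovNic()
vm0_queue = []
nic.assign_vf("00:11:22:33:44:55", vm0_queue)
nic.receive({"dst_mac": "00:11:22:33:44:55", "payload": b"hello"})
print(len(vm0_queue))   # 1 -- delivered without touching the host CPU datapath
```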

Generic Flow Table Offload

One of AccelNet's goals was to find a way to make VFP's complex policy compatible with SR-IOV. The mechanism we use in VFP to enforce policy and filtering in an SR-IOV environment is called Generic Flow Tables (GFT). GFT is a match-action language that defines transformation and control operations on packets for one specific network flow. Conceptually, GFT is comprised of a single large table that has an entry for every active network flow on a host. GFT flows are defined based on the VFP unified flows (UF) definition, matching a unique source and destination L2/L3/L4 tuple, potentially across multiple layers of encapsulation, along with a header transposition (HT) action specifying how header fields are to be added/removed.

When the GFT table does not contain an entry for a network flow (such as when a new network flow is started), the flow can be vectored to the VFP software running on the host. VFP then processes all SDN rules for the first packet of a flow, using a just-in-time flow action compiler to create stateful exact-match rules for each UF (e.g. each TCP/UDP flow), and creating a composite action encompassing all of the programmed policies for that flow. VFP then populates the new entry in the GFT table and delivers the packet for processing. Once the actions for a flow have been populated in the GFT table, every subsequent packet will be processed by the GFT hardware, providing the performance benefits of SR-IOV, but with full policy and filtering enforcement of VFP's software SDN stack.
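
The paper describes GFT conceptually rather than as code. The following minimal Python sketch (hypothetical names, a simplified flow key, and the fast path shown in software rather than in SmartNIC hardware) illustrates the control flow just described: an exact-match table keyed on the flow tuple, a slow-path miss to VFP, and a just-in-time compiled composite action that later packets hit directly.

```python
# Minimal conceptual model of GFT as described above: an exact-match table keyed
# on the flow tuple, a composite action per flow, and a slow path to VFP on miss.
# All names are illustrative; in AccelNet the fast path runs in SmartNIC hardware.

gft_table = {}   # flow tuple -> composite action (the "single large table")

def flow_key(packet):
    # Simplified unified-flow key; GFT also matches L2 fields and outer
    # encapsulation headers, omitted here for brevity.
    return (packet["src_ip"], packet["dst_ip"],
            packet["proto"], packet["src_port"], packet["dst_port"])

def vfp_slow_path(packet):
    """Stand-in for VFP: evaluate all SDN layers for the first packet of a flow
    and compile the result into one composite header-transposition action."""
    def composite_action(p):
        return {**p, "outer_dst_ip": "100.64.1.7", "encap": "VXLAN"}   # e.g. VNET encap (+ NAT, metering, ...)
    return composite_action

def process_packet(packet):
    key = flow_key(packet)
    action = gft_table.get(key)
    if action is None:                      # table miss: first packet of the flow
        action = vfp_slow_path(packet)      # vector to host VFP, JIT-compile the policy
        gft_table[key] = action             # populate GFT; later packets stay on the fast path
    return action(packet)                   # exact-match hit: apply the composite action

pkt = {"src_ip": "10.0.0.4", "dst_ip": "10.0.0.5",
       "proto": "TCP", "src_port": 50000, "dst_port": 443}
print(process_packet(pkt))                  # first packet takes the slow path, then...
print(len(gft_table))                       # ...the flow is cached for subsequent packets
```

In AccelNet the table lookup and composite action run in the SmartNIC, so only the first packet of each flow touches the host CPU.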

Design Goals and Rationale

We defined the GFT model in 2013-2014, but there are numerous options for building a complete solution across hardware and software. We began with the following goals and constraints as we set out to build hardware offloads for host SDN:

1. Don't burn host CPU cores

Azure, like its competitors, sells VMs directly to customers as an IaaS offering, and competes on the price of those VMs. Our profitability in IaaS is the difference between the price a customer pays for a VM and what it costs us to host one. Since we have fixed costs per server, the best way to lower the cost of a VM is to pack more VMs onto each host server. Thus, most clouds typically deploy the largest number of CPU cores reasonably possible at a given generation of 2-socket (an economical and performant standard) blades.

At the time of writing this paper, a physical core (2 hyperthreads) sells for $0.10-0.11/hr, or a maximum potential revenue of around $900/yr, and $4500 over the lifetime of a server (servers typically last 3 to 5 years in our datacenters). Even considering that some fraction of cores are unsold at any time and that clouds typically offer customers a discount for committed capacity purchases, using even one physical core for host networking is quite expensive compared to dedicated hardware. Our business fundamentally relies on selling as many cores per host as possible to customer VMs, and so we will go to great lengths to minimize host overhead. Thus, running a high-speed SDN datapath using host CPU cores should be avoided (see the back-of-the-envelope sketch after goal 2 below).

2. Maintain host SDN programmability of VFP

VFP is highly programmable, including a multi-controller model, stateful flow processing, complex matching capabilities for large numbers of rules, complex rule-processing and match actions, and the ability to easily add new rules.
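
As a back-of-the-envelope check of the core-revenue figures in goal 1 (the hourly price is approximate and purely illustrative):

```python
# Back-of-the-envelope check of the core-revenue figures quoted in goal 1.
# The hourly price is approximate; real pricing varies by region and VM series.

price_per_core_hour = 0.10                      # ~$0.10-0.11/hr per physical core
hours_per_year = 24 * 365                       # 8760
server_lifetime_years = 5                       # servers last roughly 3-5 years

revenue_per_year = price_per_core_hour * hours_per_year
revenue_per_lifetime = revenue_per_year * server_lifetime_years

print(f"~${revenue_per_year:.0f}/yr per core")            # ~$876/yr, i.e. "around $900/yr"
print(f"~${revenue_per_lifetime:.0f} per server lifetime") # ~$4380, i.e. roughly $4500
```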

