
Title of Presentation - Flash Memory Summit




Transcription of Title of Presentation - Flash Memory Summit

OpenCAPI Overview: the Open Coherent Accelerator Processor Interface
Flash Memory Summit 2017, Santa Clara, CA

Accelerated Computing and the High Performance Bus

Attributes driving accelerators:
- Emergence of complex storage and memory solutions
- Introduction of device coherency requirements (IBM's introduction in 2013)
- Growing demand for network performance
- Various form factors (e.g., GPUs, FPGAs, ASICs)

Driving factors for a high performance bus; consider the environment:
- Increased industry dependence on hardware acceleration for performance
- Hyperscale datacenters and HPC are driving the need for much higher network bandwidth
- Deep learning and HPC require more bandwidth between accelerators and memory, for both computation and data access
- New memory/storage technologies are increasing the need for bandwidth with low latency

Why a high performance coherent bus is needed:
- Hardware acceleration will become commonplace. If you are going to use advanced memory/storage technologies and accelerators, you need to get data in and out very quickly.
- Today's system interfaces are insufficient to address this requirement.
- Systems must be able to integrate multiple memory technologies with different access methods.

Coherency and Performance Attributes
- Traditional I/O architecture results in very high CPU overhead when applications communicate with I/O or accelerator devices.
- These challenges must be addressed in an open architecture allowing full industry participation.
- The architecture must be agnostic, to enable ecosystem growth and adoption.
- Establish a sufficient volume base to drive cost down.
- Support a broad ecosystem of software and attached devices.

OpenCAPI Advantages for Storage Class Memories
- An open standard interface enables attachment of a wide range of devices.
- Supports a wide range of access models, from byte-addressable load/store to block access.
- Extreme bandwidth, beyond classical storage interfaces.
- The OpenCAPI Home Agent Memory feature is geared specifically for storage class memory paradigms.
- An agnostic interface allows extension to evolving memory technologies in the future (e.g., compute-in-memory).
- A common physical interface between non-memory and memory devices.

Where are we coming from today?
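The "byte-addressable load/store to block" range of access models mentioned above can be illustrated with ordinary operating-system primitives. This is only an analogy sketched in Python, not the OpenCAPI interface itself: a memory-mapped region stands in for load/store access, and `read`/`write` at block granularity stands in for block access; the file path and sizes are invented for the example.

```python
# Sketch: two access models for an attached memory/storage region,
# illustrated with OS primitives (an analogy, not OpenCAPI APIs).
import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "device.img")
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)  # stand-in for a 4 KiB device region

# Block access model: data moves in whole blocks via explicit I/O calls.
with open(path, "r+b") as f:
    f.seek(512)
    f.write(b"B" * 512)      # transfer one 512-byte block

# Byte-addressable load/store model: map the region and touch single bytes.
with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 4096) as m:
        m[0] = ord("A")      # a single-byte "store"
        first = m[0]         # a single-byte "load"

print(first)  # 65
```

The point of the contrast: in the load/store model the CPU (or accelerator) addresses individual bytes directly, while the block model always pays for a full-block transfer and a call into the I/O stack.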

CAPI Technology Unlocks the Next Level of Performance for Flash

Identical hardware with three different paths to data:
- FlashSystem with conventional I/O (Fibre Channel)
- Legacy CAPI external Flash drawer (IBM POWER S822L)
- Legacy CAPI integrated card

IBM's legacy CAPI NVMe Flash accelerator is almost 5X more efficient at performing I/O than traditional storage. Relative instruction counts (kernel plus user) per I/O:
- CAPI NVMe: 21%
- Traditional NVMe: 35%
- Traditional storage, direct I/O: 56%
- Traditional storage, filesystem: 100%

Legacy CAPI-accelerated NVMe Flash can issue more I/Os per CPU thread than regular NVMe, improving scaling and resiliency, enabling caching with persistent data frames, and opening new solutions via large scaling.

Comparison of memory paradigms (each attached to the processor chip through OpenCAPI DLx/TLx logic):
- Basic DDR attach: DDR4/5 main memory
- Emerging storage class memory: SCM
- Tiered memory: DDR4/5 plus SCM

OpenCAPI wins due to bandwidth, best-of-breed latency, and the flexibility of an open architecture. JOIN TODAY!
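The "almost 5X" claim above follows directly from the slide's relative instruction counts; a quick arithmetic check:

```python
# Relative instruction counts per I/O from the slide
# (traditional filesystem path normalized to 100%).
counts = {
    "CAPI NVMe": 21,
    "Traditional NVMe": 35,
    "Traditional storage, direct I/O": 56,
    "Traditional storage, filesystem": 100,
}

# Efficiency of the CAPI NVMe path relative to the filesystem path.
speedup = counts["Traditional storage, filesystem"] / counts["CAPI NVMe"]
print(round(speedup, 2))  # 4.76, i.e. "almost 5X"
```

The same data also gives roughly 1.7X fewer instructions per I/O than regular (non-CAPI) NVMe (35 / 21).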

Acceleration Paradigms with Great Performance
- Basic work offload: the processor chip hands data to an accelerator over DLx/TLx.
- Egress transform: encryption, compression, erasure coding prior to network or storage.
- Ingress transform: video analytics, HFT, VPN/IPsec/SSL, deep packet inspection (DPI), data plane acceleration (DPA), video encoding, etc.
- Bi-directional transform: NoSQL workloads such as Neo4j graph node traversals.
- Memory transform: machine or deep learning, potentially using OpenCAPI-attached memory.
- Needle-in-a-haystack engine: database searches, joins, intersections, and merges, where the accelerator sifts haystack data for the needles.

Data Centric Computing with OpenCAPI
Allan Cantle, CTO & Founder, Nallatech

Nallatech builds server-qualified accelerator cards featuring FPGAs, network I/O, and an open architecture software/firmware framework.

Nallatech at a Glance
- Nallatech, a Molex company, with 24 years of FPGA computing heritage
- Design services and application optimisation
- Data centric, high performance heterogeneous computing
- Real-time, low latency network and I/O processing
- Intel PSG (Altera) OpenCL and Xilinx Alliance partner
- Member of OpenCAPI, Gen-Z, and OpenPOWER
- Server partners: Cray, Dell, HPE, IBM, Lenovo
- Application porting and optimization services
- High volumes of FPGA accelerators successfully deployed

Data Centric Architectures: Fundamentals
- Zero power when data is at rest
- Don't move the data unless you absolutely have to
- If data has to move, move it as efficiently as possible
Our guiding value is in the data, and the CPU core can often be effectively free!

Data Center Architectures: Blending Evolutionary with Revolutionary
- Existing data center infrastructure: CPUs with their attached memory
- Emerging data centric enhancements: OpenCAPI-attached FPGAs with SCM/Flash

Nallatech HyperConverged and Disaggregatable Server
- Leverages Google and Rackspace's OCP Zaius/Barreleye G2 platform
- Reconfigurable FPGA fabric with balanced bandwidth to CPU, storage, and the data plane network (200 GBytes/s over four OpenCAPI channels; 170 GB/s to memory)
- OpenCAPI provides a low latency, coherent accelerator/processor interface
- The Gen-Z memory-semantic fabric provides addressable shared memory up to 32 zettabytes
- Xilinx Zynq US+ based High Storage Accelerator Blade

Four FSAs in a 2OU Rackspace Barreleye G2 OCP storage drawer deliver:

- 152 GByte/s PFD* bandwidth to 1 TB of DDR4 memory
- 256 GByte/s PFD* bandwidth to 64 TB of Flash
- 200 GByte/s PFD* bandwidth through the OpenCAPI channels
- 200 GByte/s PFD* bandwidth through the Gen-Z fabric I/O
- An open architecture software/firmware framework
*PFD = Peak Full Duplex

Reconfigurable Hardware Dataplane: the Flash Storage Accelerator (FSA)
- Xilinx Zynq US+ ZU19EG (FFVC1760) MPSoC with 8 GByte DDR4
- 128 GByte RDIMM DDR4 memory @ 2400 MT/s (x72)
- PCIe Gen3 switch fanning out 8x PCIe x4 Gen3 to eight M.2 22110 SSDs
- PCIe Gen2 x4 control plane interface
- OpenCAPI interface on a PCIe x16 Gen3 SlimSAS connector
- 100 GbE QSFP28 network interface
- Gen-Z data plane I/O (x72)

Summary: OpenCAPI Accelerator to Processor Interface Benefits
- Coherency
- Lowest latency
- Highest bandwidth
- Open standard
- A perfect bridge to blend CPU centric and data centric architectures

Join the open community, where independent experts innovate together and you can help decide on big topics, such as whether separate control and data planes are better than converged ones.
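The 200 GByte/s Peak Full Duplex figure for the OpenCAPI channels is consistent with a simple back-of-envelope check. The lane rate and lane count below are assumptions about an OpenCAPI 3.0-class link (eight 25 Gbit/s lanes per channel), not values stated on the slide:

```python
# Back-of-envelope check of the 200 GByte/s PFD OpenCAPI figure.
# Assumed link parameters (not from the slide): 8 lanes/channel at 25 Gbit/s.
lanes_per_channel = 8
lane_rate_gbit = 25          # Gbit/s per lane (assumed)
channels = 4                 # "4x OpenCAPI Channels" from the slide

# Raw one-direction bandwidth across all channels, in GByte/s.
one_direction_gbyte = channels * lanes_per_channel * lane_rate_gbit / 8

# Peak Full Duplex counts both directions at once.
pfd_gbyte = 2 * one_direction_gbyte
print(pfd_gbyte)  # 200.0
```

This ignores link-layer and protocol overheads, so it is an upper bound in the same "peak" sense the slide uses.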

