Speed Up Training and Inference with NetApp ONTAP AI



  • Artificial intelligence (AI) and deep learning (DL) enable enterprises to detect fraud, improve customer relationships, optimize the supply chain, and deliver innovative products and services in an increasingly competitive marketplace. Yours may be one of the many organizations that are leveraging new DL approaches to drive digital transformation and gain a competitive advantage.

    Now you can fully realize the promise of AI and DL by simplifying, accelerating, and integrating your data pipeline with the NetApp<sup>®</sup> ONTAP<sup>®</sup> AI proven architecture, powered by NVIDIA DGX supercomputers and NetApp cloud-connected all-flash storage. Streamline the flow of data reliably, and speed up training and inference with the Data Fabric that spans from edge to core to cloud.

    The NetApp ONTAP AI architecture was developed and verified by NetApp and NVIDIA. It gives your organization a prescriptive architecture that:

    • Eliminates design complexities
    • Allows independent scaling of compute and storage
    • Enables you to start small and scale seamlessly
    • Offers a range of storage options for various performance and cost points

    Powerful Performance

    ONTAP AI testing using ImageNet data with a NetApp AFF A800 system and NVIDIA DGX-1 servers in a 1:4 storage-to-compute configuration achieved training throughput of 23,000 training images per second (TIPS) and inference throughput of 60,000 TIPS. With this configuration, you can expect to get more than 2GBps of sustained throughput (5GBps peak) with well under 1ms of latency, while the GPUs operate at more than 95% utilization. A single AFF A800 system supports throughput of 25GBps for sequential reads and 1 million IOPS for small random reads at under 500-microsecond latencies for NAS workloads. These results demonstrate available performance headroom that can support many more DGX-1 servers as requirements increase.
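The throughput figures above are simple ratios of images processed to wall-clock time. As a minimal sketch, here is one way such a number could be measured in a training loop; the helper `measure_tips` is hypothetical and not part of any NetApp or NVIDIA tooling:

```python
import time

def measure_tips(process_batch, batches, batch_size):
    """Time a run of training batches and return throughput in
    training images per second (TIPS)."""
    start = time.perf_counter()
    for batch in batches:
        process_batch(batch)
    elapsed = time.perf_counter() - start
    return (len(batches) * batch_size) / elapsed

# Sanity check of the arithmetic behind the reported figure:
# 23,000 TIPS corresponds to 1,150,000 images in 50 seconds.
assert 1_150_000 / 50 == 23_000
```

In practice `process_batch` would be a forward/backward pass on the GPUs; the ratio itself is all the TIPS metric expresses.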

    Start Small, Grow Nondisruptively

    Start with a 1:1 storage-to-compute configuration and scale out as your data grows to a 1:5 configuration and beyond. NetApp’s rack-scale architecture allows organizations to start with an AFF A220 and grow as needed, scaling from hundreds of terabytes to tens of petabytes with an all-flash array. And with NetApp ONTAP FlexGroup, a single namespace of up to 20 petabytes can handle more than 400 billion files.

    Intelligently Manage Your Data with an Integrated Pipeline

    ONTAP AI leverages the NetApp Data Fabric to unify data management across the pipeline with a single platform. Use the same tools to securely control and protect your data in flight, in use, or at rest, from edge to core to cloud, and meet compliance requirements with confidence.

    With ONTAP AI, you can simplify deployment and management with single-point-of-contact support for your NVIDIA, NetApp, and Cisco proven architecture.

    To learn more, attend the ONTAP AI CrowdChat on August 9 hosted by theCube, and check out the NetApp AI and DL website.

    The post Speed Up Training and Inference with NetApp ONTAP AI appeared first on NetApp Blog.

    https://blog.netapp.com/speed-up-training-and-inference-with-netapp-ontap-ai/



© Lightnetics 2024