Improving Application Performance with NVMe Storage - Part 3
NVMe Storage Use Cases and Summary: Benefits of NVMe storage for AI/ML
May 01, 2019

Zivan Ori
E8 Storage

Start with Part 1: The Rise of AI and ML Driving Parallel Computing Requirements

Start with Part 2: Local versus Shared Storage for Artificial Intelligence (AI) and Machine Learning (ML)

NVMe Storage Use Cases

NVMe storage's high performance, combined with the capacity and data availability benefits of shared NVMe storage over local SSD, makes it a strong fit for AI / ML infrastructures of any size. Several AI / ML use cases are worth highlighting.

■ Financial Analytics – Financial services and financial technology (FinTech) companies are increasingly turning to automation and artificial intelligence to fuel their investment decision-making processes. Using a mix of historical data and financial modeling, one platform can provide the horsepower required to predict future investment strategies for its financial customers.

■ Image Recognition in Manufacturing – Manufacturers have long used automation in their production lines to increase output capacity, scaling from hundreds of units to thousands or even millions of units per hour. The financial impact of a quality issue on the production line can be devastating if it is not caught in time. Real-time image recognition of photos of manufactured parts is essential both for determining whether a part meets the required quality standards and for catching systematic quality issues as they occur.

■ Car Services – Ride-sharing apps have given rise to a new paradigm in public transit, allowing users and drivers to connect quickly and easily as needed. Ride-sharing companies use AI / ML for traffic modeling to position drivers where they are most needed, based on both past and current ride requests. Shorter drive times increase drivers' potential revenue, and shorter wait times increase customer satisfaction; both improve the revenue potential for the ride-sharing company.

Beyond AI / ML, one vendor also provides more generalized computing services for its customers. It provides storage capacity for cloud services, using OpenStack and Kubernetes in conjunction with NVMe storage for high-performance storage. It also leverages NVMe storage for big data analytics, using Spark applications to perform multiple types of data analytics tasks, such as SQL, data mining and more.

Summary: Benefits of NVMe storage for AI/ML

NVMe storage is well suited to a wide range of AI / ML workloads, particularly machine learning across multiple applications. With NVMe storage, you can:

■ Create and manage larger shared datasets for training – By separating storage capacity from the compute nodes, datasets for machine learning training can scale up to 1 PB. As the dataset grows and more NVMe storage is brought online, performance grows as well, rather than being limited by legacy storage controller bottlenecks.

■ Overcome the capacity limitations of local SSDs in GPU nodes – With little physical space for SSD media, GPU nodes have limited capacity to hold larger datasets. With NVMe storage, NVMe volumes can instead be dynamically provisioned over high-performance Ethernet or InfiniBand networks.

■ Accelerate epoch time of machine learning by as much as 10x – By leveraging high-performance NVMe-oF (NVMe over Fabrics), NVMe storage eliminates the latency bottlenecks of older storage protocols and unleashes the parallelism inherent to the NVMe protocol. Every GPU node has direct, parallel access to the media at the lowest possible latency.

■ Improve the utilization of GPUs – Leaving GPUs idle while they wait on slow data access is costly. By offloading storage access to the idle CPUs, and delivering storage performance at the speed of local SSD, NVMe storage ensures that the GPU nodes are kept busy with fast access to data.
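
The link between storage throughput, epoch time, and GPU utilization described in the last two points can be illustrated with a rough back-of-the-envelope model. The Python sketch below is only an illustration under simplifying assumptions: the function names, the dataset size, the throughput figures, and the compute time are all hypothetical and are not measurements from any product. It assumes each epoch reads the whole dataset once, and that reads either fully overlap with GPU compute or do not overlap at all.

```python
def epoch_time_seconds(dataset_gb, storage_gb_per_s, compute_seconds, overlap=True):
    """Estimate wall-clock time for one training epoch.

    dataset_gb        -- data read per epoch, in GB (hypothetical figure)
    storage_gb_per_s  -- sustained read throughput seen by the GPU node
    compute_seconds   -- pure GPU compute time per epoch
    overlap           -- True if reads are prefetched while the GPU computes
    """
    io_seconds = dataset_gb / storage_gb_per_s
    if overlap:
        # With perfect prefetching, the slower of the two stages sets the pace.
        return max(io_seconds, compute_seconds)
    # Without overlap, the GPU waits for every read to finish.
    return io_seconds + compute_seconds


def gpu_utilization(dataset_gb, storage_gb_per_s, compute_seconds, overlap=True):
    """Fraction of the epoch during which the GPU is doing useful work."""
    total = epoch_time_seconds(dataset_gb, storage_gb_per_s, compute_seconds, overlap)
    return compute_seconds / total


if __name__ == "__main__":
    # Hypothetical scenario: 1,000 GB read per epoch, 200 s of GPU compute.
    for label, throughput in [("slower storage", 1.0), ("faster storage", 10.0)]:
        t = epoch_time_seconds(1000, throughput, 200)
        u = gpu_utilization(1000, throughput, 200)
        print(f"{label}: epoch {t:.0f} s, GPU utilization {u:.0%}")
```

In this model, faster storage shortens the epoch only until GPU compute becomes the limiting stage; beyond that point, additional throughput no longer changes epoch time. The actual speedup for a given workload depends on how I/O-bound it is to begin with.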

Zivan Ori is CEO and Co-Founder of E8 Storage