Improving Application Performance with NVMe Storage - Part 3

NVMe Storage Use Cases and Summary: Benefits of NVMe storage for AI/ML
Zivan Ori

Start with Part 1: The Rise of AI and ML Driving Parallel Computing Requirements

Start with Part 2: Local versus Shared Storage for Artificial Intelligence (AI) and Machine Learning (ML)

NVMe Storage Use Cases

NVMe storage's performance, combined with the capacity and data availability benefits of shared NVMe storage over local SSD, makes it a strong solution for AI / ML infrastructures of any size. Several use cases focused on AI / ML stand out.

■ Financial Analytics – Financial services and financial technology (FinTech) firms are increasingly turning to automation and artificial intelligence to fuel their investment decision-making. Using a mix of historical data and financial modeling, one platform can provide the horsepower required to predict future investment strategies for its financial customers.

■ Image Recognition in Manufacturing – Manufacturers have long used automation in their production lines to increase output, scaling from hundreds of units to thousands or even millions of units per hour. The financial impact of a quality issue on the production line can be devastating if it is not caught in time. Real-time image recognition on photos of manufactured parts is essential for determining whether a part meets the required quality standards, as well as for catching systematic quality issues as they occur.

■ Car Services – Ride-sharing apps have given rise to a new paradigm in public transit, allowing users and drivers to connect quickly and easily as needed. Ride-sharing companies use AI / ML for traffic modeling to position drivers where they are most needed, based on both past and current ride requests. This increases drivers' potential revenue by reducing drive times and increases customer satisfaction through reduced wait times, both of which improve the revenue potential for the ride-sharing company.

Beyond AI / ML, one vendor also provides more general computing services for its customers. It provides storage capacity for cloud services, using OpenStack and Kubernetes together with NVMe storage for high-performance storage. It also uses NVMe storage for big data analytics, running Spark applications for multiple types of analytics tasks, such as SQL, data mining and more.

Summary: Benefits of NVMe storage for AI/ML

NVMe storage is a strong fit for many AI / ML workloads, especially machine learning across multiple applications. With NVMe storage, you can:

■ Create and manage larger shared datasets for training – By separating storage capacity from the compute nodes, datasets for machine learning training can scale up to 1 PB. As the dataset grows and more NVMe storage is brought online, performance grows as well, rather than being limited by legacy storage controller bottlenecks.

■ Overcome the capacity limitations of local SSDs in GPU nodes – With limited physical space for SSD media, GPU nodes have limited capacity for larger datasets. With NVMe storage, NVMe volumes can be dynamically provisioned over high-performance Ethernet or InfiniBand networks.

■ Accelerate epoch time of machine learning by as much as 10x – By leveraging high-performance NVMe over Fabrics (NVMe-oF), NVMe storage eliminates the latency bottlenecks of older storage protocols and unleashes the parallelism inherent in the NVMe protocol. Every GPU node has direct, parallel access to the media at the lowest possible latency.

■ Improve the utilization of GPUs – Leaving GPUs idle because data arrives too slowly for processing is costly. By offloading storage access to otherwise idle CPUs and delivering storage performance at the speed of local SSD, NVMe storage keeps the GPU nodes busy with fast access to data.
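
The epoch time and GPU utilization points above can be illustrated with a back-of-the-envelope model. The sketch below is not taken from any vendor benchmark: it assumes that data loading overlaps perfectly with GPU compute, so the slower of the two stages sets the epoch time. The function name `estimate_epoch`, the `EpochEstimate` type, and all throughput figures are hypothetical and chosen only to show the shape of the trade-off.

```python
from dataclasses import dataclass


@dataclass
class EpochEstimate:
    io_seconds: float        # time to read the dataset once from storage
    compute_seconds: float   # time the GPUs need to process the dataset once
    epoch_seconds: float     # wall-clock time for one epoch
    gpu_utilization: float   # fraction of the epoch the GPUs spend computing


def estimate_epoch(dataset_gb: float,
                   storage_gb_per_s: float,
                   gpu_gb_per_s: float) -> EpochEstimate:
    """Estimate one training epoch, assuming reads overlap fully with compute.

    Under that assumption, the slower stage determines the epoch time.
    """
    if min(dataset_gb, storage_gb_per_s, gpu_gb_per_s) <= 0:
        raise ValueError("all inputs must be positive")
    io_seconds = dataset_gb / storage_gb_per_s
    compute_seconds = dataset_gb / gpu_gb_per_s
    epoch_seconds = max(io_seconds, compute_seconds)
    return EpochEstimate(io_seconds, compute_seconds, epoch_seconds,
                         compute_seconds / epoch_seconds)


# Hypothetical numbers: a 10 TB dataset and GPUs that can consume 20 GB/s.
slow = estimate_epoch(dataset_gb=10_000, storage_gb_per_s=2.0, gpu_gb_per_s=20.0)
fast = estimate_epoch(dataset_gb=10_000, storage_gb_per_s=20.0, gpu_gb_per_s=20.0)

for label, est in (("slow storage", slow), ("fast storage", fast)):
    print(f"{label}: {est.epoch_seconds:.0f} s per epoch, "
          f"GPUs busy {est.gpu_utilization:.0%} of the time")
```

In this toy model, once storage throughput matches what the GPUs can consume, further storage speed no longer shortens the epoch. Real gains depend on the workload, the network, and how well the training pipeline overlaps I/O with compute.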
