Pepperdata Introduces Observability and Optimization for GPUs Running Big Data Applications
August 17, 2021

Pepperdata announced that the Pepperdata product portfolio now includes the ability to monitor Graphics Processing Units (GPUs) running big data applications like Spark on Kubernetes.

Workloads that harness tremendous amounts of data, such as machine learning (ML) and artificial intelligence (AI) applications, require GPUs, which were originally designed to accelerate graphics rendering. That extra processing power comes with a high price tag, so GPU usage needs near-constant monitoring for resource waste to get the best performance at the lowest possible cost.

Pepperdata now monitors GPU performance, giving visibility into Spark applications that run on Kubernetes and use the processing power of GPUs. With this new visibility, companies can improve the performance of their Spark apps running on those GPUs and manage costs at a granular level.

Unlike traditional infrastructure monitoring, which is limited to the platform, the Pepperdata solution provides visibility into GPU resource utilization at the application level. Pepperdata also provides instant recommendations for optimization. Features include:

- Visibility into GPU memory usage and waste

- Fine-tuning of GPU usage through end-user recommendations

- Ability to attribute usage and cost to specific end-users

“Spark on Kubernetes is quickly becoming a dominant part of the compute infrastructure as data-intensive ML and AI applications proliferate,” said Ash Munshi, CEO, Pepperdata. “GPUs can handle these workloads, but they are expensive to buy and are power-intensive. Until now, there hasn’t been a way to view and manage the infrastructure and applications, which can lead to unnecessary waste and overspending for big data workloads. With Pepperdata, organizations can properly size their GPU hardware investments and have the confidence that they are utilizing them well.”

There are products on the market for monitoring GPUs, but they typically lack long-term storage and the ability to scale, and they often do not correlate infrastructure metrics to applications. Pepperdata solves these problems with insight for data center operators, data scientists, and ML/AI developers. They can now understand who is using what resources, eliminate waste so jobs can be tuned and prioritized, and make sure costs are assigned to the right users or groups across the enterprise.
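To make the idea of attributing GPU usage, waste, and cost to end users concrete, here is a minimal Python sketch. It is not Pepperdata's implementation or API: the `GpuSample` record, the `summarize` function, the per-GPU-hour price, and the sample figures are all hypothetical, chosen only to illustrate the arithmetic.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class GpuSample:
    """Hypothetical per-application usage record (not a Pepperdata data model)."""
    user: str
    app: str
    gpu_hours: float
    mem_allocated_gib: float
    mem_used_gib: float


def summarize(samples, cost_per_gpu_hour):
    """Roll up GPU hours, cost, and unused GPU memory per user."""
    usage = defaultdict(
        lambda: {"gpu_hours": 0.0, "cost": 0.0, "wasted_gib_hours": 0.0}
    )
    for s in samples:
        row = usage[s.user]
        row["gpu_hours"] += s.gpu_hours
        row["cost"] += s.gpu_hours * cost_per_gpu_hour
        # Waste here means memory that was allocated but never used,
        # weighted by how long the application held the GPU.
        unused_gib = max(s.mem_allocated_gib - s.mem_used_gib, 0.0)
        row["wasted_gib_hours"] += unused_gib * s.gpu_hours
    return dict(usage)


# Illustrative data only; the price of 2.5 per GPU hour is an assumption.
samples = [
    GpuSample("alice", "etl", 2.0, 16.0, 4.0),
    GpuSample("alice", "train", 1.0, 16.0, 16.0),
    GpuSample("bob", "score", 4.0, 8.0, 6.0),
]
report = summarize(samples, cost_per_gpu_hour=2.5)
for user, row in sorted(report.items()):
    print(user, row)
```

A per-user rollup of this kind is what allows costs to be charged back to the right team and shows which applications hold GPU memory they never touch.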
