The State of Cloud Costs 2024

Containers are a common source of wasted spend among organizations, according to the State of Cloud Costs 2024 report from Datadog.

In fact, 83% of container costs were associated with idle resources. About 54% of this wasted spend was cluster idle, the cost of overprovisioning cluster infrastructure, while 29% was workload idle, which comes from resource requests that are larger than the workloads actually need. This waste comes as organizations allocate more of their EC2 compute to running containers: 35%, up from 30% a year ago.
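
To make the two idle categories concrete, the following is a minimal Python sketch, not taken from the report, that splits a cluster's compute bill into a used portion, workload idle (requests above actual usage), and cluster idle (provisioned capacity above requests). All figures in the example are hypothetical.

```python
# Illustrative breakdown of container compute spend into used, workload-idle,
# and cluster-idle buckets. All numbers are hypothetical examples,
# not figures from the Datadog report.

def idle_breakdown(provisioned_cores, requested_cores, used_cores,
                   cost_per_core_hour, hours):
    """Split compute cost into used, workload-idle, and cluster-idle buckets."""
    total_cost = provisioned_cores * cost_per_core_hour * hours
    used_cost = used_cores * cost_per_core_hour * hours
    # Workload idle: container requests exceed what the workloads actually consume.
    workload_idle = (requested_cores - used_cores) * cost_per_core_hour * hours
    # Cluster idle: provisioned node capacity exceeds what the workloads request.
    cluster_idle = (provisioned_cores - requested_cores) * cost_per_core_hour * hours
    return {
        "total": total_cost,
        "used": used_cost,
        "workload_idle": workload_idle,
        "cluster_idle": cluster_idle,
    }

if __name__ == "__main__":
    costs = idle_breakdown(
        provisioned_cores=100,    # node capacity in the cluster
        requested_cores=60,       # sum of container CPU requests
        used_cores=25,            # average CPU actually consumed
        cost_per_core_hour=0.04,  # hypothetical on-demand rate
        hours=730,                # roughly one month
    )
    for bucket, dollars in costs.items():
        print(f"{bucket:>14}: ${dollars:,.2f}")
```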

Other report findings include:

GPU Spend Increasing

The report found organizations that use graphics processing unit (GPU) instances have increased their average spending on those instances by 40% in the last year. This growth in spend on GPU instances comes as more companies are experimenting with AI and large language models (LLMs). GPUs' capacity for parallel processing makes them critical for training LLMs and executing other AI workloads, where they can be more than 200% faster than CPUs.

"Today, the most widely used type of GPU-based instance is also the least expensive. This suggests that many customers are still in the experimentation phase with AI and applying the GPU instance to their early efforts in adaptive AI, machine learning inference and small-scale training," said Yrieix Garnier, VP of Product at Datadog. "We expect that as organizations expand their AI activities and move them into production, they will be spending a larger proportion of their cloud compute budget as they use more expensive types of GPU-based instances."

Outdated Technologies Are Widely Used

AWS's current-generation infrastructure offerings typically both outperform their previous-generation counterparts and cost less, yet 83% of organizations still spend an average of 17% of their EC2 budget on previous-generation technologies.

Cross-AZ Traffic Makes Up Half of Data Transfer Costs

The report states that "On average, organizations spend almost as much on sending data from one availability zone (AZ) to another as they do on all other types of data transfer combined — including VPNs, gateways, ingress, and egress."

The report found that 98% of organizations are affected by cross-AZ charges, representing an opportunity to optimize cloud costs, such as by colocating related resources within a single AZ whenever availability requirements allow.

"In some cases, cloud providers have stopped charging for certain types of data transfer. It's difficult to predict how these changes might evolve, but if providers relax data transfer costs further, future cross-AZ traffic may become less of a factor in cloud cost efficiency," the report adds.

Fewer Organizations Taking Advantage of Discounts

Cloud service providers offer commitment-based discounts on many of their services — for example, AWS has discount programs for Amazon EC2, Amazon RDS, Amazon SageMaker and others — but only 67% of organizations are participating in these discounts, down from 72% last year.
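
A simple way to reason about whether a commitment pays off is to compare committed and on-demand spend at different utilization levels. The sketch below is an illustration with made-up rates and a hypothetical 40% discount, not an actual AWS price calculation.

```python
# Hypothetical comparison of on-demand vs. commitment-based pricing.
# The hourly rate and the 40% discount are illustrative assumptions,
# not actual AWS Savings Plans or Reserved Instance prices.

ON_DEMAND_RATE = 0.10   # assumed USD per instance-hour
DISCOUNT = 0.40         # assumed commitment discount (e.g., a 1-year term)
HOURS_PER_MONTH = 730

def monthly_cost(utilization: float, committed: bool) -> float:
    """Monthly cost for one instance; utilization is the fraction of the month it runs."""
    if committed:
        # Commitments bill for the full term regardless of actual usage.
        return ON_DEMAND_RATE * (1 - DISCOUNT) * HOURS_PER_MONTH
    # On demand, you only pay for the hours the instance actually runs.
    return ON_DEMAND_RATE * utilization * HOURS_PER_MONTH

if __name__ == "__main__":
    # Break-even utilization equals (1 - DISCOUNT), i.e., 60% in this example.
    for util in (0.4, 0.6, 0.8, 1.0):
        od = monthly_cost(util, committed=False)
        co = monthly_cost(util, committed=True)
        cheaper = "commitment" if co < od else "on-demand"
        print(f"utilization {util:.0%}: on-demand ${od:.2f} vs commitment ${co:.2f} -> {cheaper}")
```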

Green Technology on the Rise

On average, organizations that use Arm-based instances spend 18% of their EC2 compute budget on them — twice as much as they did a year ago. Arm-based instance types use up to 60% less energy than comparable EC2 instances and often deliver better performance at a lower cost.
