Grafana Labs Launches Mimir 3.0

Grafana Labs announced the launch of Grafana Mimir 3.0, the latest evolution of its open-source, horizontally scalable metrics backend. 

Mimir 3.0 marks an architectural milestone, delivering new levels of reliability, performance, and cost efficiency for Prometheus-compatible monitoring at enterprise scale.

Built on open source, open standards, and open ecosystems, Grafana Labs helps organizations innovate without lock-in and move fast without compromise. At KubeCon, the company also announced updates across its open-source ecosystem, including Grafana Tempo 2.9 with AI-assisted tracing, continued Kubernetes Monitoring enhancements, and deeper Prometheus and OpenTelemetry support to help teams simplify observability and gain more value from their data.

“From open source and open standards to open ecosystems and open minds – building in the open is core to our philosophy at Grafana Labs,” said Myrle Krantz, Senior Director of Engineering, Grafana Labs. “That’s why we’re continuing to invest in open source, like adding AI-assisted tracing in Tempo and making it easier to get the most out of OpenTelemetry and Prometheus. We’re continuously improving Kubernetes Monitoring. And with the new Mimir 3.0 release, we’re helping teams scale even more reliably, expanding what’s possible for open observability in 2026 and beyond.”

Three years in development, Grafana Mimir 3.0 introduces a new decoupled architecture that separates the read and write paths for more reliable, large-scale metrics operations:

  • Reliability: Decoupling reads and writes through an asynchronous Kafka-based ingest layer eliminates cross-path dependencies, keeping queries fast and stable even under heavy ingestion loads.
  • Performance: The new Mimir Query Engine (MQE) streams query results instead of loading entire datasets into memory, improving execution speed and reducing memory usage by up to 92%.
  • Cost efficiency: Early testing reports up to 15% lower resource usage while achieving higher throughput and consistency across large clusters.

Together, these innovations make Mimir 3.0 the most resilient, high-performing, and cost-efficient metrics backend for Prometheus and OpenTelemetry data – now available on Grafana Cloud and for self-managed users via open source. 
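To make the streaming idea behind the Performance point concrete, here is a minimal sketch in Python. It is not MQE's implementation or API. The names `read_samples`, `sum_materialized`, and `sum_streaming` are hypothetical and exist only for illustration. The sketch contrasts an aggregation that loads every sample into memory first with one that folds samples into a running total as they arrive.

```python
from typing import Iterable, Iterator, List, Tuple

Sample = Tuple[int, float]  # (timestamp, value); a simplified stand-in for a metric sample


def read_samples(n: int) -> Iterator[Sample]:
    """Hypothetical sample source that yields one sample at a time."""
    for t in range(n):
        yield (t, float(t % 10))


def sum_materialized(samples: Iterable[Sample]) -> float:
    """Load the whole dataset into a list, then aggregate it."""
    loaded: List[Sample] = list(samples)  # memory grows with the number of samples
    return sum(value for _, value in loaded)


def sum_streaming(samples: Iterable[Sample]) -> float:
    """Fold each sample into a running total as it is read."""
    total = 0.0
    for _, value in samples:  # only the current sample is held at any time
        total += value
    return total


if __name__ == "__main__":
    # Both approaches agree on the result; they differ in peak memory use.
    print(sum_materialized(read_samples(1000)))
    print(sum_streaming(read_samples(1000)))
```

Both functions return the same total. The streaming version never holds more than one sample at a time, which is the general property a streaming query engine relies on to keep memory use low.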

Grafana Tempo 2.9, the latest release of the open source distributed tracing backend, introduces new capabilities to speed up trace analysis and bring AI into the observability workflow.

  • MCP server support: An experimental Model Context Protocol (MCP) server allows AI assistants like Claude Code and Cursor to query distributed tracing data with TraceQL, enabling natural-language debugging and faster root cause analysis.
  • TraceQL metrics sampling: New probabilistic query hints accelerate analysis in high-volume environments, returning approximate results faster without losing visibility.
  • Multi-tenant and operational improvements: New metrics for query I/O, span timing, and usage tracking improve observability and performance visibility at scale.

Tempo 2.9 also deepens OpenTelemetry support by aligning with newer OpenTelemetry semantic conventions, reaffirming Grafana Labs’ commitment to open, composable observability.
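The trade-off behind the TraceQL metrics sampling point above, approximate results returned faster, can be illustrated with a small Python sketch. It does not use Tempo's query hints or TraceQL syntax, and the function `approx_count_slow_spans` and its parameters are hypothetical. It assumes a simple scheme in which each span is inspected with a fixed probability and the resulting count is scaled back up.

```python
import random
from typing import Iterable


def approx_count_slow_spans(
    durations_ms: Iterable[float],
    threshold_ms: float,
    rate: float,
    seed: int = 0,
) -> float:
    """Estimate how many spans exceed threshold_ms by inspecting roughly
    a `rate` fraction of them and scaling the count by 1 / rate."""
    if not 0.0 < rate <= 1.0:
        raise ValueError("rate must be in (0, 1]")
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    hits = 0
    for duration in durations_ms:
        # Keep each span with probability `rate`; skip the rest entirely.
        if rng.random() < rate and duration > threshold_ms:
            hits += 1
    return hits / rate  # scale the sampled count up to the full population


if __name__ == "__main__":
    # Synthetic data: every tenth span is slow (900 ms), the rest take 50 ms.
    spans = [900.0 if i % 10 == 0 else 50.0 for i in range(100_000)]
    print(approx_count_slow_spans(spans, threshold_ms=500.0, rate=1.0))  # exact count
    print(approx_count_slow_spans(spans, threshold_ms=500.0, rate=0.1))  # estimate
```

With `rate=1.0` every span is inspected and the count is exact. Lower rates inspect fewer spans and return an estimate whose error shrinks as the data volume grows, which is why this kind of sampling suits high-volume environments.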

Building on the success of Grafana Cloud Kubernetes Monitoring, Grafana Labs has introduced powerful new capabilities that simplify observability across even the most complex Kubernetes environments. This is especially timely as a recent survey by the CNCF found that 80% of respondents work for IT organizations that have deployed Kubernetes in a production environment.

Kubernetes Monitoring in Grafana Cloud has evolved into an observability solution that doesn’t just visualize telemetry but interprets it, automates insights, and guides teams to action. New updates include:

  • Grafana Assistant integration: Teams can now interact with Kubernetes Monitoring using Grafana Assistant (now generally available), an AI-powered agent built into Grafana Cloud that can read dashboards, drill into panels, and summarize results in real time. Using natural language, users can ask how a workload is behaving, what’s impacting performance, or where costs are trending.
  • GPU monitoring: Available at both the Node and Cluster level, new GPU utilization panels help detect overheating, power drain, or underuse before they impact performance, ensuring AI workloads remain stable and efficient.
  • Automated root cause analysis: Now integrated with the generally available Grafana Knowledge Graph, Kubernetes Monitoring gives you automatic RCA and Insight Rings.
  • Expanded workload support: Kubernetes Monitoring now provides full visibility into CronJobs, Argo Rollouts, Bare Pods, Static Pods, Strimzi Pod Sets, and other nonstandard workloads, ensuring comprehensive coverage across diverse infrastructure types.
  • Monitor cron jobs and other job types: Get full visibility into all cron and manual jobs across clusters. Instantly see status, distribution, and missed runs to ensure automation reliability and quick issue detection.
  • CPU and memory panels: New CPU and Memory tabs provide clear, layered views of compute usage – from cluster to container – with efficiency graphs and CPU distribution analysis that help optimize capacity, cost, and performance.
  • Cloud provider nodes: One-click correlation between AWS EC2 instances and Kubernetes workloads enables unified troubleshooting across cloud and container layers, reducing context-switching and mean time to resolution. And for teams on AWS, CloudWatch metric streams in Grafana Cloud can cut metric pipeline costs, including storage and agent infrastructure, by up to 10x while delivering near-real-time metrics.

Together, these updates make Kubernetes Monitoring in Grafana Cloud an intelligent, automated, and AI-capable solution for today’s dynamic, large-scale environments.

Grafana Labs continues to invest in open standards and community-led innovation across its ecosystem:

  • Beyla donation complete: Earlier in 2025, Grafana Labs donated Grafana Beyla, its eBPF-based, zero-code auto-instrumentation agent, to OpenTelemetry. Renamed OpenTelemetry eBPF Instrumentation, the project just marked its first official release under the OpenTelemetry umbrella. The donation reinforces Grafana Labs’ long-standing commitment to advancing open, vendor-neutral observability.
  • Grafana Alloy: Grafana Labs’ distribution of the OpenTelemetry Collector, Grafana Alloy is now the default data pipeline layer across Grafana Cloud and open source deployments. Alloy unifies metrics, logs, and traces collection while supporting both Prometheus and OpenTelemetry pipelines.
  • Prometheus 3.0 and OpenTelemetry interoperability: Grafana engineers contributed to the introduction of profiling signal support, new semantic conventions, and Prometheus 3.0 compatibility, strengthening cross-project interoperability.

