
OpenTelemetry: A Complete Telemetry Framework that Grows with You

Juraci Paixão Kröhling
OllyGarden

Most organizations approach OpenTelemetry as a collection of individual tools they need to assemble from scratch. This view misses the bigger picture. OpenTelemetry is a complete telemetry framework with composable components that address specific problems at different stages of organizational maturity. You start with what you need today and adopt additional pieces as your observability practices evolve.

The framework includes everything from automatic instrumentation that provides immediate value to sophisticated governance tools for enterprise-scale deployments. Each component solves a specific problem that emerges as teams scale their observability practices. The key insight is recognizing which components address your current challenges and understanding what becomes available as your needs grow.

The Starting Point: Immediate Value with Auto-Instrumentation

Organizations beginning their observability journey need results quickly. OpenTelemetry provides automatic instrumentation libraries for major programming languages that capture traces, metrics, and logs without requiring code changes. For a Java application, getting started takes two commands:

[Image: the two getting-started commands]
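The original article shows the commands as an image. A typical sequence looks like the following sketch — the application jar name is a placeholder, and the download URL points at the latest release of the OpenTelemetry Java instrumentation agent:

```shell
# Download the OpenTelemetry Java agent
curl -L -o opentelemetry-javaagent.jar \
  https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/latest/download/opentelemetry-javaagent.jar

# Attach the agent to an existing application -- no code changes required
java -javaagent:opentelemetry-javaagent.jar -jar my-app.jar
```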

The application now emits traces, metrics, and logs using the OpenTelemetry Protocol (OTLP). Point it at any compatible backend and you have telemetry flowing. For local testing, a single container provides a complete environment:

[Image: the local testing command]
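The original shows this command as an image, and does not name the container. One popular option (an assumption here, not necessarily the author's choice) is Grafana's all-in-one otel-lgtm image, which bundles an OTLP endpoint with storage and visualization in a single container:

```shell
# OTLP endpoints on 4317 (gRPC) and 4318 (HTTP), Grafana UI on 3000
docker run -p 3000:3000 -p 4317:4317 -p 4318:4318 grafana/otel-lgtm
```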

Auto-instrumentation solves the cold start problem. Teams get immediate visibility into application behavior without extensive instrumentation work. This quick win demonstrates value to stakeholders while teams learn what additional instrumentation would provide the most benefit. The instrumentation uses OpenTelemetry's SDK, which provides a stable foundation for future customization.

Adding Flexibility: The Collector as a Central Processing Hub

As observability practices mature, teams encounter new requirements. They need to send telemetry to multiple backends for different teams. They want to enrich data with environment metadata before it leaves the cluster. They need to sample high-volume traces to control costs. The OpenTelemetry Collector addresses these challenges through a central processing pipeline.

The Collector receives telemetry data, processes it through configurable pipelines, and exports it to one or more destinations. This decouples instrumentation from backend decisions. Applications send data to the Collector using OTLP, and infrastructure teams configure where that data ultimately goes. Backend migrations become configuration changes rather than application redeployment projects.

Teams typically introduce the Collector when they need capabilities beyond simple data forwarding. Processors transform data, add attributes, perform sampling decisions, or batch data for efficient transmission. Receivers accept data in various formats, allowing gradual migration from legacy instrumentation. Exporters send data to commercial platforms or open source backends without requiring instrumentation changes.
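The pattern above can be sketched as a Collector configuration. This is an illustrative fragment, not from the article; the backend endpoints and the `deployment.environment` value are placeholders:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
      http:

processors:
  # Enrich data before it leaves the cluster
  attributes:
    actions:
      - key: deployment.environment
        value: production
        action: upsert
  # Batch for efficient transmission
  batch:

exporters:
  otlphttp/team-a:
    endpoint: https://backend-a.example.com
  otlphttp/team-b:
    endpoint: https://backend-b.example.com

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [attributes, batch]
      exporters: [otlphttp/team-a, otlphttp/team-b]
```

Swapping or adding a backend means editing the `exporters` section — the applications themselves are untouched.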

Scaling to Kubernetes: Automation Through the Operator

Managing instrumentation across hundreds or thousands of services running in Kubernetes presents operational challenges. The OpenTelemetry Operator automates instrumentation injection and Collector lifecycle management at cluster scale. Teams define instrumentation policies once, and the Operator ensures they apply consistently across workloads.

The Operator eliminates manual instrumentation configuration for each deployment. It watches for new pods and injects auto-instrumentation based on defined policies. This approach scales instrumentation practices to large environments while maintaining consistency. The Operator also manages Collector deployments, handling upgrades and configuration distribution across the cluster.
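As a sketch of how this looks in practice (resource names and the Collector endpoint are illustrative), a cluster-wide policy is an `Instrumentation` custom resource, and workloads opt in with an annotation:

```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: default-instrumentation
spec:
  exporter:
    endpoint: http://otel-collector:4317
```

A Java deployment then requests injection with the pod annotation `instrumentation.opentelemetry.io/inject-java: "true"`, and the Operator attaches the auto-instrumentation agent to new pods automatically.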

Organizations adopt the Operator when manual instrumentation management becomes operationally expensive. The component represents a solved automation problem, applying proven Kubernetes patterns to observability instrumentation.

Customizing Instrumentation: The API and Semantic Conventions

Auto-instrumentation provides broad coverage, but application-specific insights require custom instrumentation. The OpenTelemetry API enables developers to create spans for business-critical operations, record custom metrics, and emit structured logs with correlation context. These APIs work alongside auto-instrumentation, supplementing automatic coverage with application-specific detail.

Semantic conventions provide standardized attribute names and values for common concepts. Rather than each team inventing attribute names for HTTP requests, database queries, or message queue operations, semantic conventions establish shared vocabulary. This consistency enables observability tools to understand telemetry data semantically, regardless of which team or service generated it.

Teams introduce custom instrumentation when auto-instrumentation does not capture critical business workflows. The API provides the mechanism, while semantic conventions ensure the resulting data remains interoperable across the organization. This combination supports both immediate instrumentation needs and long-term observability platform evolution.
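A minimal Java sketch of this combination — the class, method, and attribute name are hypothetical, but the API calls are the standard OpenTelemetry tracing API, and the tracer comes from the same SDK the auto-instrumentation configures:

```java
import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.context.Scope;

public class CheckoutService {
    // Tracer from the globally configured SDK (set up by auto-instrumentation)
    private static final Tracer tracer =
        GlobalOpenTelemetry.getTracer("com.example.checkout");

    void processOrder(String orderId) {
        // Custom span for a business-critical operation that
        // auto-instrumentation would not capture on its own
        Span span = tracer.spanBuilder("process-order").startSpan();
        try (Scope scope = span.makeCurrent()) {
            // An organization's semantic conventions would standardize
            // attribute names like this one across teams
            span.setAttribute("app.order.id", orderId);
            // ... business logic ...
        } finally {
            span.end();
        }
    }
}
```

Because the span is made current, any auto-instrumented work inside the `try` block (HTTP calls, database queries) nests under it automatically.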

Enterprise Governance: Weaver for Organizational Standards

Large organizations need to enforce instrumentation standards across teams while allowing flexibility for specific use cases. Weaver generates type-safe instrumentation code from semantic convention definitions, ensuring teams use standardized attributes correctly. This governance tool bridges the gap between organizational standards and implementation reality.

Weaver takes YAML definitions of semantic conventions and generates code in multiple languages. Developers use generated types that enforce attribute naming, typing, and documentation standards automatically. This approach scales organizational governance without creating bottlenecks or requiring constant code review of instrumentation details.
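As a rough sketch of such a definition — the group and attribute here are invented for illustration, following the shape of the semantic convention registry format Weaver consumes:

```yaml
groups:
  - id: registry.app.order
    type: attribute_group
    brief: Attributes describing an order in the checkout flow.
    attributes:
      - id: app.order.id
        type: string
        stability: development
        brief: Unique identifier of the order.
        examples: ["ord-12345"]
```

From definitions like this, Weaver can generate constants and typed helpers in each language, so a misspelled or mistyped attribute becomes a compile-time error rather than a data quality incident.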

Organizations adopt Weaver when inconsistent attribute usage across teams creates data quality problems. The tool represents a solution to the governance challenge that emerges at enterprise scale, where manual enforcement of standards becomes impractical.

The Vendor Neutrality Advantage: Backend Agility

Every component in the OpenTelemetry framework reinforces a core principle: instrumentation should outlive backend decisions. Applications emit telemetry in a standardized format, infrastructure routes it through configurable pipelines, and backends consume what they need. Backend selection becomes a runtime decision rather than an instrumentation commitment.

This separation has practical implications for organizations. Evaluating new observability platforms does not require re-instrumenting applications. Cost management becomes a matter of adjusting Collector configurations to sample or filter data appropriately. Acquisitions or organizational changes that consolidate observability platforms do not trigger instrumentation projects.

The framework's vendor neutrality protects instrumentation investments while maintaining flexibility as requirements evolve. Teams instrument once using stable APIs and collect the benefits across multiple backend generations.

A Framework That Scales with Organizational Maturity

OpenTelemetry provides a complete telemetry framework where each component addresses specific challenges that emerge as practices mature. Organizations start with auto-instrumentation for quick wins, add the Collector for processing flexibility, introduce the Operator for Kubernetes automation, layer in custom instrumentation where needed, and adopt governance tools like Weaver at enterprise scale.

The framework does not require adopting every component immediately. Each piece represents a solution waiting on the shelf for when specific problems arise. This modularity allows organizations to grow their observability practices at their own pace while maintaining a consistent technical foundation.

The investment in OpenTelemetry-based instrumentation compounds over time. Early instrumentation remains valuable as new components address evolving requirements. The framework scales from a single service sending traces to a backend, to thousands of services across multiple clusters with sophisticated processing pipelines and governance controls. Organizations choose which components to deploy based on current needs, confident that additional capabilities remain available as requirements change.

Juraci Paixão Kröhling is a Software Engineer at OllyGarden, an OpenTelemetry Governing Board Member, and a CNCF Ambassador.

