
OpenTelemetry: A Complete Telemetry Framework that Grows with You

Juraci Paixão Kröhling
OllyGarden

Most organizations approach OpenTelemetry as a collection of individual tools they need to assemble from scratch. This view misses the bigger picture. OpenTelemetry is a complete telemetry framework with composable components that address specific problems at different stages of organizational maturity. You start with what you need today and adopt additional pieces as your observability practices evolve.

The framework includes everything from automatic instrumentation that provides immediate value to sophisticated governance tools for enterprise-scale deployments. Each component solves a specific problem that emerges as teams scale their observability practices. The key insight is recognizing which components address your current challenges and understanding what becomes available as your needs grow.

The Starting Point: Immediate Value with Auto-Instrumentation

Organizations beginning their observability journey need results quickly. OpenTelemetry provides automatic instrumentation libraries for major programming languages that capture traces, metrics, and logs without requiring code changes. For a Java application, getting started takes two commands:

[Image: OTel]
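The commands in the image above are not captured in this text. As a rough sketch of what such a setup usually involves, and not a reconstruction of the author's exact commands: the OpenTelemetry Java agent is a single JAR attached to the JVM with the standard `-javaagent` flag and configured through `OTEL_*` environment variables. The Python helper below only assembles that launch line. The function name, file names, service name, and endpoint are placeholders chosen for illustration.

```python
import os
import shlex


def build_agent_launch(app_jar, agent_jar="opentelemetry-javaagent.jar",
                       service_name="my-service",
                       otlp_endpoint="http://localhost:4318"):
    """Return (argv, env) for starting a Java app with the OTel Java agent.

    Every default value here is a placeholder, not a recommendation.
    """
    # -javaagent is a standard JVM option; the agent JAR itself is published
    # by the opentelemetry-java-instrumentation project.
    argv = ["java", f"-javaagent:{agent_jar}", "-jar", app_jar]

    # OTEL_SERVICE_NAME and OTEL_EXPORTER_OTLP_ENDPOINT are environment
    # variables defined by the OpenTelemetry SDK configuration specification.
    env = dict(os.environ)
    env["OTEL_SERVICE_NAME"] = service_name
    env["OTEL_EXPORTER_OTLP_ENDPOINT"] = otlp_endpoint
    return argv, env


if __name__ == "__main__":
    argv, env = build_agent_launch("app.jar")
    # Print the command line that would be executed.
    print(shlex.join(argv))
```

Passing the result to `subprocess.run(argv, env=env)` would start the application. No application code changes are involved, which is the point of the agent approach.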

The application now emits traces, metrics, and logs using the OpenTelemetry Protocol (OTLP). Point it at any compatible backend and you have telemetry flowing. For local testing, a single container provides a complete environment:

[Image: OTel]

Auto-instrumentation solves the cold start problem. Teams get immediate visibility into application behavior without extensive instrumentation work. This quick win demonstrates value to stakeholders while teams learn what additional instrumentation would provide the most benefit. The instrumentation uses OpenTelemetry's SDK, which provides a stable foundation for future customization.

Adding Flexibility: The Collector as a Central Processing Hub

As observability practices mature, teams encounter new requirements. They need to send telemetry to multiple backends for different teams. They want to enrich data with environment metadata before it leaves the cluster. They need to sample high-volume traces to control costs. The OpenTelemetry Collector addresses these challenges through a central processing pipeline.

The Collector receives telemetry data, processes it through configurable pipelines, and exports it to one or more destinations. This decouples instrumentation from backend decisions: applications send data to the Collector over OTLP, and infrastructure teams configure where that data ultimately goes. Backend migrations become configuration changes rather than application redeployment projects.

Teams typically introduce the Collector when they need capabilities beyond simple data forwarding. Processors transform data, add attributes, perform sampling decisions, or batch data for efficient transmission. Receivers accept data in various formats, allowing gradual migration from legacy instrumentation. Exporters send data to commercial platforms or open source backends without requiring instrumentation changes.
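To make the receiver, processor, and exporter wiring concrete, here is a minimal sketch. The real Collector reads a YAML file; the Python dictionary below mirrors that file's top-level layout so the example stays dependency-free. The component types (`otlp`, `resource`, `probabilistic_sampler`, `batch`, `otlphttp`) are real Collector components, with `resource` and `probabilistic_sampler` shipped in the contrib distribution. The endpoints, the `/team-a` and `/team-b` instance names, and the two helper functions are assumptions made for illustration; they are not the Collector's own validation logic.

```python
import json

# Mirrors the top-level layout of a Collector configuration file.
# Endpoints and the "/team-a" and "/team-b" instance names are made up.
config = {
    "receivers": {
        "otlp": {"protocols": {"grpc": {}, "http": {}}},
    },
    "processors": {
        "resource": {
            "attributes": [
                {"key": "deployment.environment.name",
                 "value": "staging", "action": "upsert"},
            ],
        },
        "probabilistic_sampler": {"sampling_percentage": 10},
        "batch": {},
    },
    "exporters": {
        "otlp/team-a": {"endpoint": "backend-a.example.com:4317"},
        "otlphttp/team-b": {"endpoint": "https://backend-b.example.com"},
    },
    "service": {
        "pipelines": {
            "traces": {
                "receivers": ["otlp"],
                "processors": ["resource", "probabilistic_sampler", "batch"],
                "exporters": ["otlp/team-a", "otlphttp/team-b"],
            },
        },
    },
}


def undefined_components(cfg):
    """List (pipeline, kind, name) for each pipeline reference with no definition."""
    missing = []
    for pipeline, spec in cfg["service"]["pipelines"].items():
        for kind in ("receivers", "processors", "exporters"):
            for name in spec.get(kind, []):
                if name not in cfg.get(kind, {}):
                    missing.append((pipeline, kind, name))
    return missing


def switch_backend(cfg, pipeline, old, new, settings):
    """Return a copy of cfg with one exporter swapped; receivers stay untouched."""
    updated = json.loads(json.dumps(cfg))  # deep copy of plain JSON-like data
    updated["exporters"].pop(old)
    updated["exporters"][new] = settings
    exporters = updated["service"]["pipelines"][pipeline]["exporters"]
    exporters[exporters.index(old)] = new
    return updated
```

Calling `switch_backend(config, "traces", "otlp/team-a", "otlp/team-c", {...})` changes only the exporter side of the pipeline. Applications keep sending OTLP to the same receiver, which is what turns a backend migration into a configuration change.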

Scaling to Kubernetes: Automation Through the Operator

Managing instrumentation across hundreds or thousands of services running in Kubernetes presents operational challenges. The OpenTelemetry Operator automates instrumentation injection and Collector lifecycle management at cluster scale. Teams define instrumentation policies once, and the Operator ensures they apply consistently across workloads.

The Operator eliminates manual instrumentation configuration for each deployment. It watches for new pods and injects auto-instrumentation based on defined policies. This approach scales instrumentation practices to large environments while maintaining consistency. The Operator also manages Collector deployments, handling upgrades and configuration distribution across the cluster.
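The policy-driven injection described above can be pictured with a small model. The Operator lets workloads opt in through pod annotations of the form `instrumentation.opentelemetry.io/inject-<language>`, whose value is `"true"`, `"false"`, or the name of an Instrumentation resource. The sketch below is a deliberately simplified stand-in for that decision, not the Operator's code: the real component also mutates the pod spec, and the `default_instrumentation` name is hypothetical.

```python
# Annotation prefix used by the OpenTelemetry Operator for opting pods in.
ANNOTATION_PREFIX = "instrumentation.opentelemetry.io/inject-"
LANGUAGES = ("java", "python", "nodejs", "dotnet", "go")


def injection_plan(pod_annotations, default_instrumentation="default"):
    """Map each requested language to the Instrumentation resource to apply.

    Simplified: "true" selects a default resource, "false" or a missing
    annotation opts out, and any other value is treated as a resource name.
    `default_instrumentation` is a hypothetical name used only in this sketch.
    """
    plan = {}
    for language in LANGUAGES:
        value = pod_annotations.get(ANNOTATION_PREFIX + language)
        if value is None or value.lower() == "false":
            continue
        if value.lower() == "true":
            plan[language] = default_instrumentation
        else:
            plan[language] = value
    return plan
```

For example, a pod annotated with `instrumentation.opentelemetry.io/inject-java: "true"` yields a plan for Java only, while a pod without any of these annotations is left alone.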

Organizations adopt the Operator when manual instrumentation management becomes operationally expensive. The component represents a solved automation problem, applying proven Kubernetes patterns to observability instrumentation.

Customizing Instrumentation: The API and Semantic Conventions

Auto-instrumentation provides broad coverage, but application-specific insights require custom instrumentation. The OpenTelemetry API enables developers to create spans for business-critical operations, record custom metrics, and emit structured logs with correlation context. These APIs work alongside auto-instrumentation, supplementing automatic coverage with application-specific detail.

Semantic conventions provide standardized attribute names and values for common concepts. Rather than each team inventing attribute names for HTTP requests, database queries, or message queue operations, semantic conventions establish shared vocabulary. This consistency enables observability tools to understand telemetry data semantically, regardless of which team or service generated it.
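A small illustration of why the shared vocabulary matters. The attribute names `http.request.method` and `http.response.status_code` come from the OpenTelemetry HTTP semantic conventions; the two teams, their ad-hoc names, and the helper functions are invented for this sketch. In practice teams would emit the conventional names directly rather than renaming after the fact.

```python
# Ad-hoc names two hypothetical teams might have invented, mapped to
# attribute names from the OpenTelemetry HTTP semantic conventions.
TEAM_ALIASES = {
    "checkout": {"method": "http.request.method",
                 "status": "http.response.status_code"},
    "search": {"httpVerb": "http.request.method",
               "resp_code": "http.response.status_code"},
}


def normalize(team, attributes):
    """Rename a team's ad-hoc attribute keys to the shared vocabulary."""
    aliases = TEAM_ALIASES[team]
    return {aliases.get(key, key): value for key, value in attributes.items()}


def server_errors(spans):
    """One query that works for every team once the names are shared."""
    return [s for s in spans if s.get("http.response.status_code", 0) >= 500]


spans = [
    normalize("checkout", {"method": "POST", "status": 502}),
    normalize("search", {"httpVerb": "GET", "resp_code": 200}),
    normalize("search", {"httpVerb": "GET", "resp_code": 503}),
]
```

Without the shared names, `server_errors` would need one branch per team. With them, a single filter covers both services.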

Teams introduce custom instrumentation when auto-instrumentation does not capture critical business workflows. The API provides the mechanism, while semantic conventions ensure the resulting data remains interoperable across the organization. This combination supports both immediate instrumentation needs and long-term observability platform evolution.

Enterprise Governance: Weaver for Organizational Standards

Large organizations need to enforce instrumentation standards across teams while allowing flexibility for specific use cases. Weaver generates type-safe instrumentation code from semantic convention definitions, ensuring teams use standardized attributes correctly. This governance tool bridges the gap between organizational standards and implementation reality.

Weaver takes YAML definitions of semantic conventions and generates code in multiple languages. Developers use generated types that enforce attribute naming, typing, and documentation standards automatically. This approach scales organizational governance without creating bottlenecks or requiring constant code review of instrumentation details.
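The idea of generating typed helpers from a convention definition can be sketched in a few lines. This is not Weaver's template system, command line, or output format. The registry below is a Python dictionary loosely shaped like a semantic-convention YAML file, and `generate_module` is a hypothetical generator written only for this illustration.

```python
# A registry fragment loosely shaped like a semantic-convention definition
# (the real files are YAML; a dict keeps this sketch dependency-free).
REGISTRY = {
    "groups": [
        {
            "id": "registry.http",
            "attributes": [
                {"id": "http.request.method", "type": "string",
                 "brief": "HTTP request method."},
                {"id": "http.response.status_code", "type": "int",
                 "brief": "HTTP response status code."},
            ],
        }
    ]
}

PY_TYPES = {"string": "str", "int": "int", "double": "float", "boolean": "bool"}


def generate_module(registry):
    """Emit Python source with one constant and one typed setter per attribute."""
    lines = ["# Generated file: do not edit by hand.", ""]
    for group in registry["groups"]:
        for attr in group["attributes"]:
            const = attr["id"].replace(".", "_").upper()
            py_type = PY_TYPES[attr["type"]]
            message = f"{attr['id']} expects {py_type}"
            lines += [
                f"{const} = {attr['id']!r}",
                "",
                "",
                f"def set_{const.lower()}(attributes: dict, value: {py_type}) -> None:",
                f"    # {attr['brief']}",
                f"    if not isinstance(value, {py_type}):",
                f"        raise TypeError({message!r})",
                f"    attributes[{const}] = value",
                "",
                "",
            ]
    return "\n".join(lines)


source = generate_module(REGISTRY)
namespace = {}
exec(source, namespace)  # stand-in for importing the generated file
```

Developers would then call `set_http_response_status_code(attrs, 200)` instead of typing the attribute name by hand, so a misspelled key or a string where an integer belongs fails early rather than surfacing later as a data quality problem.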

Organizations adopt Weaver when inconsistent attribute usage across teams creates data quality problems. The tool represents a solution to the governance challenge that emerges at enterprise scale, where manual enforcement of standards becomes impractical.

The Vendor Neutrality Advantage: Backend Agility

Every component in the OpenTelemetry framework reinforces a core principle: instrumentation should outlive backend decisions. Applications emit telemetry in a standardized format, infrastructure routes it through configurable pipelines, and backends consume what they need. Backend selection becomes a runtime decision rather than an instrumentation commitment.

This separation has practical implications for organizations. Evaluating new observability platforms does not require re-instrumenting applications. Cost management becomes a matter of adjusting Collector configurations to sample or filter data appropriately. Acquisitions or organizational changes that consolidate observability platforms do not trigger instrumentation projects.
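As one example of the cost lever mentioned above, percentage-based trace sampling is usually made deterministic on the trace ID, so that every span belonging to a trace receives the same keep-or-drop decision. The sketch below shows that idea with a SHA-256 hash. It is an assumption-laden illustration, not the hashing scheme used by the Collector's sampling processors, and `keep_trace` is a hypothetical helper.

```python
import hashlib


def keep_trace(trace_id: str, percentage: float) -> bool:
    """Decide from the trace ID alone, so all spans of a trace agree."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    # Map the trace ID to a stable bucket in the range 0..9999.
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    # percentage=10 keeps buckets 0..999, roughly 10% of all traces.
    return bucket < percentage * 100
```

Because the bucket depends only on the trace ID, raising the percentage never drops a trace that a lower percentage would have kept, and changing the rate is a configuration change rather than an instrumentation change.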

The framework's vendor neutrality protects instrumentation investments while maintaining flexibility as requirements evolve. Teams instrument once using stable APIs and reap the benefits across multiple backend generations.

A Framework That Scales with Organizational Maturity

OpenTelemetry provides a complete telemetry framework where each component addresses specific challenges that emerge as practices mature. Organizations start with auto-instrumentation for quick wins, add the Collector for processing flexibility, introduce the Operator for Kubernetes automation, layer in custom instrumentation where needed, and adopt governance tools like Weaver at enterprise scale.

The framework does not require adopting every component immediately. Each piece represents a solution waiting on the shelf for when specific problems arise. This modularity allows organizations to grow their observability practices at their own pace while maintaining a consistent technical foundation.

The investment in OpenTelemetry-based instrumentation compounds over time. Early instrumentation remains valuable as new components address evolving requirements. The framework scales from a single service sending traces to a backend, to thousands of services across multiple clusters with sophisticated processing pipelines and governance controls. Organizations choose which components to deploy based on current needs, confident that additional capabilities remain available as requirements change.

Juraci Paixão Kröhling is a Software Engineer at OllyGarden, an OpenTelemetry Governing Board member, and a CNCF Ambassador.

