Tag Archives: OpenTelemetry

OpenTelemetry’s First Release Candidates

OpenTelemetry has hit another milestone with the tracing specification reaching release candidate status.

With the specification now ready to go, expect to see tracing release candidates of the official APIs and SDKs over the next few weeks, along with updated exporters for Cloud Trace. In the coming months the same will follow for the metrics specification, followed by metrics release candidates of the APIs and SDKs and Cloud Monitoring exporters, followed by the project’s general availability. At this point we’ll switch our default application metrics and distributed tracing instrumentation from OpenCensus to OpenTelemetry.

This is exciting news for Google Cloud customers, as OpenTelemetry will enable even better observability experiences, both with Cloud Monitoring and Cloud Trace, or the third party monitoring and operations tools of your choice.

Originally posted on the on the OpenTelemetry blog.


As we’ve discussed in past announcements, we’re hard at work building OpenTelemetry’s first GA quality release. Today marks another milestone in this journey, with the freezing and first release candidate of the tracing specification.
Tracing Spec Release Candidate

The tracing specification is now considered to be a release candidate (RC) and is frozen, and the OpenTelemetry APIs and SDKs have a stable specification to build their own release candidates against. This means:
  • API, SDK, and Collector release candidates will appear within the next few weeks.
  • No breaking spec changes are allowed between now and the final GA specification, beyond any showstopper (P1) issues that are revealed in the RC period. We don’t expect any of these to appear, but the purpose of the RC period is for us to validate that we have a GA-worthy spec.
  • Some non-breaking changes will be allowed during the RC period. Most of these are clarifications of existing behaviour or are pure editorial updates.
The release candidate sections of the specification include all tracing related dependencies, specifically the following sections: Trace, Baggage, Resource, Context Propagation, Environment Variables, Exporters (for traces). You can view the progress of each OpenTelemetry component’s implementation in the project status matrix.

What’s Coming Next?

Achieving a release candidate of the tracing specification has been the top priority of OpenTelemetry since releasing our beta in March. With this completed, our focus now shifts to tracing release candidates of the APIs, SDKs, Collector, and auto instrumentation components, and producing a release candidate of the metrics specification.

RC Tracing Implementations

Most OpenTelemetry APIs and SDKs are close to completing their tracing RC implementations, and we expect the first wave of these to arrive within the next two weeks. Contributors who are looking to provide instrumentation (for various web frameworks, storage clients, etc.) can start building against release candidate APIs once they arrive. While the APIs may change in response to issues discovered during RC usage and testing (which will result in multiple pre-GA release candidates for these components), these will be extremely constrained.

Several SDKs will have two waves of release candidate milestones: the first will contain functionality from the tracing and context propagation sections of the specification, and the second will include release candidate implementations for baggage, exporters, resources, and environment variables.

Metrics

In parallel to the tracing RC component releases, we will apply the focus that we’ve had on tracing to the metrics specification. Starting this week, we will categorize which work items are required for GA, which can be optionally allowed in GA (non-breaking), and which will be shifted to post-GA. After completing this, we will track our burndown progress, and lock the metrics specification and publish a metrics specification release candidate once all P1 items are complete. Shortly after this, the APIs, SDKs, Collector, and other components will publish release candidates with RC-quality tracing and metrics functionality.

Productionization and GA Readiness Work

Once the metrics specification, SDKs, Collector, and other components reach release candidate status, we will focus on productionization tasks like writing documentation, producing a post-GA versioning strategy, building additional automated tests, etc. Once we are satisfied with each component’s adoptability and reliability, we will announce their general availability.

Overall Timeline

  1. Components (APIs, SDKs, Collector, auto instrumentation, etc.) issue release candidates with RC-quality tracing functionality.
  2. The metrics section of the specification achieves RC quality and is frozen.
  3. Components issue release candidates with RC-quality tracing and metrics functionality.
  4. Once we are satisfied with our metrics + tracing release candidates, OpenTelemetry goes GA.
  5. Logging enters beta, then issues an RC specification, followed by RC-quality logging functionality in each component, followed by a GA for logging.
We will have a better understanding of our GA release timeline in the coming weeks once outstanding work on the metrics specification is fully accounted for.

Tracking a Language’s Progress

As mentioned above, you can view the progress of a particular component (API, SDK, etc.) in the project status matrix. Each component’s implementation has their own timeline, though a core set (the JavaScript, Java, Go, Python, and .Net APIs + SDKs, the Collector, and Java auto instrumentation) are all tracking well. Each component has its own GA burndown board.

FAQ

I want to use OpenTelemetry on my production services; what’s the impact of today’s announcement?

SDKs with release candidate quality tracing support will be available in a few weeks. Release candidates are not recommended for critical production services, however they are functional and are intended to offer APIs that are compatible with their upcoming GA counterparts.

I want to write instrumentation for OpenTelemetry; what’s the impact of today’s announcement?

APIs with release candidate quality tracing support will be available shortly (prior to the SDKs). You can bind against these to produce traces that will be picked up by the OpenTelemetry SDKs or any other implementations that implement the OpenTelemetry APIs.

When will OpenTelemetry offer drop-in replacements for OpenCensus and OpenTracing?

Work is currently underway on bridge APIs that allow OpenTelemetry SDKs to seamlessly replace OpenCensus libraries or OpenTracing implementations. While the delivery date of this functionality is not tied to OpenTelemetry’s GA goals, we expect this to arrive between each API + SDK’s release candidate and GA milestones.

Wrapping Up

Producing a specification release candidate is an important milestone for the OpenTelemetry community, and it took significant effort on the part of our contributors to make this happen. We’d like to thank every person and every organization that was a part of this release, and to recognize that their contributions are laying the groundwork for the project's long term success.

If you haven’t been a part of the OpenTelemetry community but would like to join, now is the perfect time! OpenTelemetry is now in the top three CNCF projects by weekly and cumulative commits, and no matter your level of commitment (ha!) to the project, contributions are always welcome. If you have a particular area that you’re interested in (for example, the Python API + SDK), the best way to get involved is to join the relevant weekly SIG meetings or interact with other contributors on Gitter.

By Morgan McLean, Google Cloud

Java zPages for OpenTelemetry

What is OpenTelemetry?

OpenTelemetry is an open source project aimed at improving the observability of our applications. It is a collection of cloud monitoring libraries and services for capturing distributed traces and metrics and integrates naturally with external observability tools, such as Prometheus and Zipkin. As of now, OpenTelemetry is in its beta stage and supports a few different languages.

What are zPages?

zPages are a set of dynamically generated HTML web pages that display trace and metrics data from the running application. The term zPages was coined at Google, where similar pages are used to view basic diagnostic data from a particular host or service. For our project, we built the Java /tracez and /traceconfigz zPages, which focus on collecting and displaying trace spans.

TraceZ

The /tracez zPage displays span data from the instrumented application. Spans are split into two groups: spans that are still running and spans that have completed.

TraceConfigZ

The /traceconfigz zPage displays the currently active tracing configuration and allows users to change the tracing parameters. Examples of such parameters include the sampling probability and the maximum number of attributes.

Using the zPages

This section describes how to start and use the Java zPages.

Add the dependencies to your project

First, you need to add OpenTelemetry as a dependency to your Java application.

Maven

For Maven, add the following to your pom.xml file:
<dependencies>
    <dependency>
        <groupId>io.opentelemetry</groupId>
        <artifactId>opentelemetry-api</artifactId>
        <version>0.7.0</version>
    </dependency>
    <dependency>
        <groupId>io.opentelemetry</groupId>
        <artifactId>opentelemetry-sdk</artifactId>
        <version>0.7.0</version>
    </dependency>
    <dependency>
        <groupId>io.opentelemetry</groupId>
        <artifactId>opentelemetry-sdk-extension-    zpages</artifactId>
        <version>0.7.0</version>
    </dependency>
</dependencies>

Gradle

For Gradle, add the following to your build.gradle dependencies:
implementation 'io.opentelemetry:opentelemetry-api:0.7.0'
implementation 'io.opentelemetry:opentelemetry-sdk:0.7.0'
implementation 'io.opentelemetry:opentelemetry-sdk-extension-zpages:0.7.0'

Register the zPages

To set-up the zPages, simply call startHttpServerAndRegisterAllPages(int port) from the ZPageServer class in your main function:
import io.opentelemetry.sdk.extensions.zpages.ZPageServer;

public class MyMainClass {
    public static void main(String[] args) throws Exception {
        ZPageServer.startHttpServerAndRegisterAllPages(8080);
        // ... do work
    }
}
Note that the package com.sun.net.httpserver is required to use the default zPages setup. Please make sure your version of the JDK includes this package if you plan to use the default server.

Alternatively, you can call registerAllPagesToHttpServer(HttpServer server) to register the zPages to a shared server:
import io.opentelemetry.sdk.extensions.zpages.ZPageServer;

public class MyMainClass {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new                     InetSocketAddress(8000), 10);
        ZPageServer.registerAllPagesToHttpServer(server);
        server.start();
        // ... do work
    }
}

Access the zPages

View all available zPages on the index page

The index page (at /) lists all available zPages with a link and description.


View trace spans on the /tracez zPage

The /tracez zPage displays information about running and completed spans, with completed spans further organized into latency and error buckets. The data is aggregated into a summary-level table:


You can click on each of the counts in the table cells to access the corresponding span details. For example, here are the details of the ChildSpan latency sample (row 1, col 4):


View and update the tracing configuration on the /traceconfigz zPage.

The /traceconfigz zPage provides an interface for users to modify the current tracing parameters:


Design

This section goes into the underlying design of our code.

Frontend


The frontend consists of two main parts: HttpHandler and HttpServer. The HttpHandler is responsible for rendering the HTML content, with each zPage implementing its own ZPageHandler. The HttpServer, on the other hand, is responsible for listening to incoming requests, obtaining the requested data, and then invoking the aforementioned ZPageHandlers. The HttpServer class from com.sun.net is used to construct the default server and to handle http requests on different routes.

Backend





The backend consists of two components as well: SpanProcessor and DataAggregator. The SpanProcessor watches the lifecycle of each span, invoking functions each time a span starts or ends. The DataAggregator, on the other hand, restructures the data from the SpanProcessor into an accessible format for the frontend to display. The class constructor requires a TracezSpanProcessor instance, so that the TracezDataAggregator class can access the spans collected by a specific TracezSpanProcessor. The frontend only needs to call functions in the DataAggregator to obtain information required for the web page.

Conclusion

We hope that this blog post has given you a little insight into the development and use cases of OpenTelemetry’s Java zPages. The zPages themselves are lightweight performance monitoring tools that allow users to troubleshoot and better understand their applications. Once OpenTelemetry is officially released, we hope that you try out and use the /tracez and /traceconfigz zPages!

By William Hu and Terry Wang – Software Engineering Interns, Core Compute Observability

Java zPages for OpenTelemetry

What is OpenTelemetry?

OpenTelemetry is an open source project aimed at improving the observability of our applications. It is a collection of cloud monitoring libraries and services for capturing distributed traces and metrics and integrates naturally with external observability tools, such as Prometheus and Zipkin. As of now, OpenTelemetry is in its beta stage and supports a few different languages.

What are zPages?

zPages are a set of dynamically generated HTML web pages that display trace and metrics data from the running application. The term zPages was coined at Google, where similar pages are used to view basic diagnostic data from a particular host or service. For our project, we built the Java /tracez and /traceconfigz zPages, which focus on collecting and displaying trace spans.

TraceZ

The /tracez zPage displays span data from the instrumented application. Spans are split into two groups: spans that are still running and spans that have completed.

TraceConfigZ

The /traceconfigz zPage displays the currently active tracing configuration and allows users to change the tracing parameters. Examples of such parameters include the sampling probability and the maximum number of attributes.

Using the zPages

This section describes how to start and use the Java zPages.

Add the dependencies to your project

First, you need to add OpenTelemetry as a dependency to your Java application.

Maven

For Maven, add the following to your pom.xml file:
<dependencies>
    <dependency>
        <groupId>io.opentelemetry</groupId>
        <artifactId>opentelemetry-api</artifactId>
        <version>0.7.0</version>
    </dependency>
    <dependency>
        <groupId>io.opentelemetry</groupId>
        <artifactId>opentelemetry-sdk</artifactId>
        <version>0.7.0</version>
    </dependency>
    <dependency>
        <groupId>io.opentelemetry</groupId>
        <artifactId>opentelemetry-sdk-extension-    zpages</artifactId>
        <version>0.7.0</version>
    </dependency>
</dependencies>

Gradle

For Gradle, add the following to your build.gradle dependencies:
implementation 'io.opentelemetry:opentelemetry-api:0.7.0'
implementation 'io.opentelemetry:opentelemetry-sdk:0.7.0'
implementation 'io.opentelemetry:opentelemetry-sdk-extension-zpages:0.7.0'

Register the zPages

To set-up the zPages, simply call startHttpServerAndRegisterAllPages(int port) from the ZPageServer class in your main function:
import io.opentelemetry.sdk.extensions.zpages.ZPageServer;

public class MyMainClass {
    public static void main(String[] args) throws Exception {
        ZPageServer.startHttpServerAndRegisterAllPages(8080);
        // ... do work
    }
}
Note that the package com.sun.net.httpserver is required to use the default zPages setup. Please make sure your version of the JDK includes this package if you plan to use the default server.

Alternatively, you can call registerAllPagesToHttpServer(HttpServer server) to register the zPages to a shared server:
import io.opentelemetry.sdk.extensions.zpages.ZPageServer;

public class MyMainClass {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new                     InetSocketAddress(8000), 10);
        ZPageServer.registerAllPagesToHttpServer(server);
        server.start();
        // ... do work
    }
}

Access the zPages

View all available zPages on the index page

The index page (at /) lists all available zPages with a link and description.


View trace spans on the /tracez zPage

The /tracez zPage displays information about running and completed spans, with completed spans further organized into latency and error buckets. The data is aggregated into a summary-level table:


You can click on each of the counts in the table cells to access the corresponding span details. For example, here are the details of the ChildSpan latency sample (row 1, col 4):


View and update the tracing configuration on the /traceconfigz zPage.

The /traceconfigz zPage provides an interface for users to modify the current tracing parameters:


Design

This section goes into the underlying design of our code.

Frontend


The frontend consists of two main parts: HttpHandler and HttpServer. The HttpHandler is responsible for rendering the HTML content, with each zPage implementing its own ZPageHandler. The HttpServer, on the other hand, is responsible for listening to incoming requests, obtaining the requested data, and then invoking the aforementioned ZPageHandlers. The HttpServer class from com.sun.net is used to construct the default server and to handle http requests on different routes.

Backend





The backend consists of two components as well: SpanProcessor and DataAggregator. The SpanProcessor watches the lifecycle of each span, invoking functions each time a span starts or ends. The DataAggregator, on the other hand, restructures the data from the SpanProcessor into an accessible format for the frontend to display. The class constructor requires a TracezSpanProcessor instance, so that the TracezDataAggregator class can access the spans collected by a specific TracezSpanProcessor. The frontend only needs to call functions in the DataAggregator to obtain information required for the web page.

Conclusion

We hope that this blog post has given you a little insight into the development and use cases of OpenTelemetry’s Java zPages. The zPages themselves are lightweight performance monitoring tools that allow users to troubleshoot and better understand their applications. Once OpenTelemetry is officially released, we hope that you try out and use the /tracez and /traceconfigz zPages!

By William Hu and Terry Wang – Software Engineering Interns, Core Compute Observability

OpenTelemetry is now beta!

OpenTelemetry and OpenCensus have been a critical part of our goal of making platforms like Kubernetes more observable and more manageable. This has been a multi-year journey for us, from creating OpenCensus and growing it into a core part of major web services’ observability stack, to our announcement of OpenTelemetry last year and the rapid growth of the OpenTelemetry community.

Beta is a big milestone for OpenTelemetry, as developers can now use the SDKs, integrations, and Collector to capture distributed traces and metrics from their applications and send them to backends like Prometheus, Jaeger, Cloud Monitoring, Cloud Trace, and others for analysis. This is a great time to try out OpenTelemetry and get involved in the observability community— whether you’re looking to improve your visibility into production services, giving your users performance data from client libraries that you maintain—or want to join a rapidly-growing open source project!

To learn more, please read our official community announcement, which copied below:

Co-authored by maintainers, community contributors, and members of the OpenTelemetry governance committee.

OpenTelemetry has just begun its first wave of beta releases, starting with the Collector and the Erlang, Go, Java, JavaScript, and Python SDKs, followed by the .Net SDK and Java auto-instrumentation agent. This means that you can begin integrating OpenTelemetry into your applications and client libraries to capture app-level metrics and distributed traces.

If you’re not already familiar with OpenTelemetry, the project provides a single set of language-specific APIs, SDKs, agents, and other components that you can use to collect distributed traces, metrics, and related metadata from your applications. In addition to its core capabilities, much of OpenTelemetry’s utility comes from integrations for HTTP and RPC libraries, storage clients, etc. that allow developers to capture critical observability data from their applications with almost zero effort. After capturing these signals, each OpenTelemetry component can export them to your backends of choice, including Prometheus, Jaeger, Zipkin, Azure Monitor, Dynatrace, Google Cloud Monitoring + Trace, Lightstep, New Relic, and Splunk.

This first beta release includes:
  • APIs and SDKs for Erlang, Go, Java, JavaScript, and Python, which include the interfaces and implementations that you need to define and create distributed traces and metrics, manage sampling and context propagation, etc. The .Net API + SDK will follow shortly.
  • Language-specific API integrations for at least one popular HTTP framework, gRPC, and at least one popular storage client, which can be enabled with one line of code, and will automatically capture relevant traces and metrics and handle context propagation.
  • Language-specific exporters that allow SDKs to send captured traces and metrics to any supported backends.
  • The OpenTelemetry Collector, which can receive data from OpenTelemetry SDKs and other sources, and then export this telemetry to any supported backend.
  • Auto-Instrumentation for Java that captures telemetry from 47 Java libraries and frameworks without requiring any modification to your application.
  • Documentation for each component including getting started guides.
As these and subsequent OpenTelemetry components enter beta (requirements and release plan), we are declaring that they are ready to start integrating with. This means that service developers can begin to include OpenTelemetry in their applications and that maintainers of storage, RPC, etc. clients should start testing the OpenTelemetry APIs to provide better observability of their users.

However, this does come with some caveats:
  • Each OpenTelemetry component will likely undergo several beta releases in the coming weeks — this is simply the first.
  • While functional, beta components have not gone through thorough testing or benchmarking and they are not intended for production workloads.
  • While we aim to avoid any major changes to the OpenTelemetry APIs between beta and GA release candidates, we cannot guarantee that there will not be any changes during this period.
  • Some functionality is still missing from the first beta and will be added in subsequent releases; this is documented in each component’s GitHub repository.
In the coming weeks, you can expect additional beta releases from the first wave of OpenTelemetry components and others. In particular, we expect the API + SDK for .Net and the Java auto-instrumentation agent to be ready soon. Eventually, components will reach a level of maturity and testing where we’ll feel confident in naming them a release candidate (RC), after which we will not make any breaking changes to the APIs for that component.

This beta milestone is a huge accomplishment for the OpenTelemetry community, and every contributor should be proud of the fact that OpenTelemetry is now working and ready to integrate with. This is a great opportunity for the maintainers of client libraries to begin integrating with the OpenTelemetry APIs, for end-users to start integrating it into their services, and for anyone interested in contributing to join our rapidly growing community by joining our mailing lists, Gitter chats, and the monthly community meeting!

By Morgan McLean, Product Manager

OpenTelemetry: The Merger of OpenCensus and OpenTracing

We’ve talked about OpenCensus a lot over the past few years, from the project’s initial announcement, roots at Google and partners (Microsoft, Dynatrace) joining the project, to new functionality that we’re continually adding. The project has grown beyond our expectations and now sports a mature ecosystem with Google, Microsoft, Omnition, Postmates, and Dynatrace making major investments, and a broad base of community contributors.

We recently announced that OpenCensus and OpenTracing are merging into a single project, now called OpenTelemetry, which brings together the best of both projects and has a frictionless migration experience. We’ve made a lot of progress so far: we’ve established a governance committee, a Java prototype API + implementation, workgroups for each language, and an aggressive implementation schedule.

Today we’re highlighting the combined project at the keynote of Kubecon and announcing that OpenTelemetry is now officially part of the Cloud Native Computing Foundation! Full details are available in the CNCF’s official blog post, which we’ve copied below:

A Brief History of OpenTelemetry (So Far)

After many months of planning, discussion, prototyping, more discussion, and more planning, OpenTracing and OpenCensus are merging to form OpenTelemetry, which is now a CNCF sandbox project. The seed governance committee is composed of representatives from Google, Lightstep, Microsoft, and Uber, and more organizations are getting involved every day.

And we couldn't be happier about it – here’s why.

Observability, Outputs, and High-Quality Telemetry

Observability is a fashionable word with some admirably nerdy and academic origins. In control theory, “observability” measures how well we can understand the internals of a given system using only its external outputs. If you’ve ever deployed or operated a modern, microservice-based software application, you have no doubt struggled to understand its performance and behavior, and that’s because those “outputs” are usually meager at best. We can’t understand a complex system if it’s a black box. And the only way to light up those black boxes is with high-quality telemetry: distributed traces, metrics, logs, and more.

So how can we get our hands – and our tools – on precise, low-overhead telemetry from the entirety of a modern software stack? One way would be to carefully instrument every microservice, piece by piece, and layer by layer. This would literally work, it’s also a complete non-starter – we’d spend as much time on the measurement as we would on the software itself! We need telemetry as a built-in feature of our services.

The OpenTelemetry project is designed to make this vision a reality for our industry, but before we describe it in more detail, we should first cover the history and context around OpenTracing and OpenCensus.

OpenTracing and OpenCensus

In practice, there are several flavors (or “verticals” in the diagram) of telemetry data, and then several integration points (or “layers” in the diagram) available for each. Broadly, the cloud-native telemetry landscape is dominated by distributed traces, timeseries metrics, and logs; and end-users typically integrate with a thin instrumentation API or via straightforward structured data formats that describe those traces, metrics, or logs.



For several years now, there has been a well-recognized need for industry-wide collaboration in order to amortize the shared cost of software instrumentation. OpenTracing and OpenCensus have led the way in that effort, and while each project made different architectural choices, the biggest problem with either project has been the fact that there were two of them. And, further, that the two projects weren’t working together and striving for mutual compatibility.

Having two similar-yet-not-identical projects out in the world created confusion and uncertainty for developers, and that made it harder for both efforts to realize their shared mission: built-in, high-quality telemetry for all.

Getting to One Project

If there’s a single thing to understand about OpenTelemetry, it’s that the leadership from OpenTracing and OpenCensus are co-committed to migrating their respective communities to this single and unified initiative. Although all of us have numerous ideas about how we could boil the ocean and start from scratch, we are resisting those impulses and focusing instead on preparing our communities for a successful transition; our priorities for the merger are clear:
  • Straightforward backwards compatibility with both OpenTracing and OpenCensus (via software bridges)
  • Minimizing the time where OpenTelemetry, OpenTracing, and OpenCensus are being co-developed: we plan to put OpenTracing and OpenCensus into “readonly mode” before the end of 2019.
  • And, again, to simplify and standardize the telemetry solutions available to developers.
In many ways, it’s most accurate to think of OpenTelemetry as the next major version of both OpenTracing and OpenCensus. Like any version upgrade, we will try to make it easy for both new and existing end-users, but we recognize that the main benefit to the ecosystem is the consolidation itself – not some specific and shiny new feature – and we are prioritizing our own efforts accordingly.

How you can help

OpenTelemetry’s timeline is an aggressive one. While we have many open-source and vendor-licensed observability solutions providing guidance, we will always want as many end-users involved as possible. The single most valuable thing any end-user can do is also one of the easiest: check out the actual work we’re doing and provide feedback. Via GitHub, Gitter, email, or whatever feels easiest.

Of course we also welcome code contributions to OpenTelemetry itself, code contributions that add OpenTelemetry support to existing software projects, documentation, blog posts, and the rest of it. If you’re interested, you can sign up to join the integration effort by filling in this form.

By Ben Sigelman, co-creator of OpenTracing and member of the OpenTelemetry governing committee, and Morgan McLean, Product Manager for OpenCensus at Google since the project’s inception