appender-tracing: Include trace ID and span ID in logs when nested in tracing crate's spans #2438
Conversation
Requesting changes to discuss #2438 (comment)
@@ -17,6 +17,7 @@ tracing = { workspace = true, features = ["std"]}
 tracing-core = { workspace = true }
 tracing-log = { version = "0.2", optional = true }
 tracing-subscriber = { workspace = true, features = ["registry", "std"] }
+tracing-opentelemetry = { version = "0.28", optional = true }
Will there be a conflict if the application uses version 0.29 of tracing-opentelemetry, while the appender depends on version 0.28 as specified in this configuration?
tracing-opentelemetry v0.29 hasn't been released, but I think I see what you are getting at.
tracing-opentelemetry v0.28 (latest release) depends on opentelemetry v0.27, so it won't be compatible with opentelemetry v0.28 when that is released. It looks like all the opentelemetry crates are typically bumped together, so there is a version of all these crates that are compatible with each other. However, this means that at the point when opentelemetry-appender-tracing v0.28 is released, there wouldn't be a compatible version of tracing-opentelemetry without coordination with that external crate.
Given that this is an experimental feature, I'm guessing it doesn't make sense to wait for tracing-opentelemetry v0.29 to be released before releasing opentelemetry-appender-tracing.
Removing the version constraint would probably work in practice with these unstable versions, given that the constraint on the opentelemetry crates should ensure the correct version is used. However, tracing-opentelemetry could potentially introduce a breaking change that is incompatible with opentelemetry-appender-tracing without changing the opentelemetry v0.x version it depends on, which I think would result in cargo choosing the incompatible version of tracing-opentelemetry even when a compatible version exists.
I think the simplest thing to do for now (while things are unstable) when releasing a new v0.x version of the opentelemetry crates would be to just optimistically bump the version of this optional tracing-opentelemetry dependency to its next v0.y version. E.g. when releasing opentelemetry v0.28.0, depend on tracing-opentelemetry v0.29, so that it can be made compatible with opentelemetry-appender-tracing without any further changes to opentelemetry-appender-tracing. If tracing-opentelemetry v0.29.0 isn't compatible, then opentelemetry-appender-tracing v0.28.1 could be released to fix this feature. It also won't result in an incompatible v0.30 version of tracing-opentelemetry being used when v0.29 is already working with this feature.
Functionally, you could think of a release of a new v0.x version of opentelemetry crates (including opentelemetry-appender-tracing) as removing this feature, then automatically reintroducing the new feature when the next v0.y version of tracing-opentelemetry is made. This could be done manually, which would allow for extra testing after the v0.y release of tracing-opentelemetry is made, but it just seems like it introduces unnecessary work and complicates the opentelemetry-appender-tracing release process.
Once things stabilize, this could be coordinated more carefully between opentelemetry-appender-tracing & tracing-opentelemetry using RC releases, such that the new major version of opentelemetry-appender-tracing is made compatible with an RC version of tracing-opentelemetry.
Thanks @dylanahsmith. I agree that optimistically bumping the version of tracing-opentelemetry to the next v0.y (e.g., v0.29 when releasing opentelemetry v0.28) seems like a practical approach for now, given the experimental nature of the feature. If compatibility issues arise, a quick patch release can address them.
Please merge this, it's working perfectly for me. Using these dependencies on my actix-web app:

And this setup:

```rust
use std::sync::LazyLock;
use opentelemetry::KeyValue;
use opentelemetry::trace::{TraceError, TracerProvider};
use opentelemetry_appender_tracing::layer::OpenTelemetryTracingBridge;
use opentelemetry_otlp::{LogExporter, MetricExporter, SpanExporter, WithExportConfig};
use opentelemetry_sdk::{logs::{self as sdklogs}, runtime, trace::{self as sdktrace}, Resource};
use opentelemetry_sdk::logs::LoggerProvider;
use opentelemetry_sdk::metrics::{MetricError, SdkMeterProvider};
use opentelemetry_sdk::propagation::TraceContextPropagator;
use opentelemetry_semantic_conventions::resource;
use tracing_bunyan_formatter::{JsonStorageLayer};
use tracing_subscriber::{EnvFilter, Layer};
use tracing_subscriber::layer::SubscriberExt;
use crate::error::CustomError;
use crate::tracing_utils;
const APP_NAME: &str = crate::built_info::PKG_NAME;
static RESOURCE: LazyLock<Resource> =
// LazyLock::new(|| Resource::new(vec![KeyValue::new(resource::SERVICE_NAME, APP_NAME)]));
LazyLock::new(|| {
Resource::builder()
.with_service_name(APP_NAME)
.build()
});
pub fn init_telemetry() -> Result<(), CustomError> {
opentelemetry::global::set_text_map_propagator(TraceContextPropagator::new());
let log_tracer = tracing_utils::init_logs()?;
let span_tracer = tracing_utils::init_traces()?;
let metric_tracer = tracing_utils::init_metrics()?;
opentelemetry::global::set_meter_provider(metric_tracer.clone());
let filter_otel = EnvFilter::new("info")
.add_directive("hyper=off".parse().unwrap())
.add_directive("opentelemetry=off".parse().unwrap())
.add_directive("opentelemetry_sdk=off".parse().unwrap())
.add_directive("tonic=off".parse().unwrap())
.add_directive("h2=off".parse().unwrap())
.add_directive("reqwest=off".parse().unwrap());
let otel_log_layer = OpenTelemetryTracingBridge::new(&log_tracer).with_filter(filter_otel);
let otel_span_layer = tracing_opentelemetry::layer().with_tracer(span_tracer.tracer(APP_NAME));
let env_filter = EnvFilter::try_from_default_env().unwrap_or(EnvFilter::new("info")
.add_directive("opentelemetry=debug".parse().unwrap())
.add_directive("opentelemetry_sdk=off".parse().unwrap())
);
let format = tracing_subscriber::fmt::format().pretty();
let formatting_layer = tracing_subscriber::fmt::layer().event_format(format);
let subscriber = tracing_subscriber::Registry::default()
.with(env_filter)
.with(otel_span_layer)
.with(otel_log_layer)
.with(JsonStorageLayer)
.with(formatting_layer);
tracing::subscriber::set_global_default(subscriber).map_err(|err| {
CustomError::TracingGenericError(format!("Failed to set up global tracing config: {}", err))
})?;
Ok(())
}
fn init_logs() -> Result<sdklogs::LoggerProvider, opentelemetry_sdk::logs::LogError> {
let exporter = LogExporter::builder()
.with_tonic()
.with_endpoint("http://127.0.0.1:44317")
.build()?;
Ok(LoggerProvider::builder()
// .with_batch_exporter(exporter, runtime::TokioCurrentThread)
.with_batch_exporter(exporter)
.with_resource(RESOURCE.clone())
.build())
}
fn init_traces() -> Result<sdktrace::TracerProvider, TraceError> {
let exporter = SpanExporter::builder()
.with_tonic()
.with_endpoint("http://127.0.0.1:44317")
.build()?;
Ok(sdktrace::TracerProvider::builder()
.with_batch_exporter(exporter, runtime::TokioCurrentThread)
.with_resource(RESOURCE.clone())
.build())
}
fn init_metrics() -> Result<opentelemetry_sdk::metrics::SdkMeterProvider, MetricError> {
let exporter = MetricExporter::builder()
.with_tonic()
.with_endpoint("http://127.0.0.1:44317")
.build()?;
// let reader = opentelemetry_sdk::metrics::PeriodicReader::builder(exporter, runtime::TokioCurrentThread)
let reader = opentelemetry_sdk::metrics::PeriodicReader::builder(exporter)
.with_interval(std::time::Duration::from_secs(1))
.build();
Ok(SdkMeterProvider::builder()
.with_reader(reader)
.with_resource(RESOURCE.clone())
.build())
}
```

and set this on the actix-web HttpServer app setup: .wrap(TracingLogger::default())
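A minimal sketch of how that wrapping might look in the actix-web server, assuming tracing-actix-web's TracingLogger and the init_telemetry function from the snippet above (the index handler, route, and bind address are placeholders, not part of this PR):

```rust
use actix_web::{web, App, HttpServer, Responder};
use tracing_actix_web::TracingLogger;

async fn index() -> impl Responder {
    "hello"
}

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    // Install the tracing/metrics/logging pipeline before starting the server.
    init_telemetry().expect("failed to initialize telemetry");

    HttpServer::new(|| {
        App::new()
            // Each request runs inside a tracing span, so logs emitted from
            // handlers can pick up that span's trace/span IDs via the bridge.
            .wrap(TracingLogger::default())
            .route("/", web::get().to(index))
    })
    .bind(("127.0.0.1", 8080))?
    .run()
    .await
}
```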
This correction is very nice. I'm waiting for it to be merged.
Tagging for consideration in 0.28, though not strictly required for 0.28.
@maximebedard - Would it be possible for you to help with the review, as the changes are similar to #1394?
The feature flag has been added now, so removing the block.
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@ Coverage Diff @@
##            main   #2438    +/-  ##
======================================
+ Coverage    77.8%   79.5%   +1.6%
======================================
  Files         122     118      -4
  Lines       23030   22495    -535
======================================
- Hits        17931   17884     -47
+ Misses       5099    4611    -488

☔ View full report in Codecov by Sentry.
@TommyCpp Could you help review this?
LGTM. Looks safe being behind a feature flag and validated by multiple users.
One thing we need to take care of is bumping the tracing-opentelemetry version, as discussed here, as part of every release. But good to have more eyes/approvals.
LGTM.
.get::<OtelData>()
.and_then(|otd| otd.builder.span_id);

let opt_trace_id = span.scope().last().and_then(|root_span| {
Is it really necessary to look for the root span to get the trace_id here? Shouldn't the current span have the trace_id set as well?
Typically the current span won't have the trace id explicitly set on it, so it won't actually store the trace id itself; instead, it will be inherited when the span is built for sending.
I suppose going straight to the root span wouldn't support an explicitly set trace id on a non-root span. Is there a non-contrived example of when this might be done that I should add a test for and support? Or should this just be left as a limitation to avoid extra code and runtime overhead?
I looked at the listener code and yeah, it's only root spans that get the trace_id set in the builder. Someone would have to manually override that and set a trace_id on the builder to get that behavior.
Still, I think that we should pull out the trace_id from the builder and use that if it's available, or fall back to the root one.
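A minimal sketch of that suggestion, assuming tracing-opentelemetry's public OtelData extension and a plain tracing-subscriber Registry (the helper name ids_for_span is hypothetical and not part of this PR):

```rust
use opentelemetry::trace::{SpanId, TraceId};
use tracing_opentelemetry::OtelData;
use tracing_subscriber::registry::SpanRef;
use tracing_subscriber::Registry;

fn ids_for_span(span: &SpanRef<'_, Registry>) -> (Option<TraceId>, Option<SpanId>) {
    // The span id is read from the current span's builder, as in this PR.
    let span_id = span
        .extensions()
        .get::<OtelData>()
        .and_then(|otel| otel.builder.span_id);

    // Prefer a trace_id recorded on the nearest enclosing span's builder;
    // scope() walks from the current span up to the root, which is where
    // tracing-opentelemetry normally stores it, so the root acts as the fallback.
    let trace_id = span.scope().find_map(|s| {
        s.extensions()
            .get::<OtelData>()
            .and_then(|otel| otel.builder.trace_id)
    });

    (trace_id, span_id)
}
```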
Can't wait for this PR to be merged, it's really helpful.
Will take a look tomorrow!
Fixes #1378
Problem Summary
opentelemetry-appender-tracing was only including the trace & span IDs from scopes opened using the opentelemetry tracing API. However, given that opentelemetry-appender-tracing is meant to provide support for getting these events from the tracing crate, it seems more likely that the spans would also be created through the tracing crate, yet the trace & span IDs from those spans weren't being included in the logs.
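For illustration, a hedged sketch of the scenario this addresses (names are placeholders; it assumes the OpenTelemetryTracingBridge layer and a tracing-opentelemetry layer are installed on the subscriber, as in the setup snippet above):

```rust
use tracing::{info, info_span};

// The span is created via the tracing crate, not the OpenTelemetry tracing API.
fn handle_request() {
    let _guard = info_span!("handle_request").entered();

    // With this change, the exported LogRecord carries the trace and span IDs
    // of the enclosing tracing span instead of leaving them unset.
    info!("processing request");
}
```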
Changes
Similar to the previous fix attempt (#1394), the span/trace IDs are obtained from the OtelData span extension that tracing-opentelemetry adds.
Issues resolved from the previous fix attempt (#1394):
Additionally, I've made the following changes:
Merge requirement checklist
CHANGELOG.md files updated for non-trivial, user-facing changes