[question] No OpenTelemetry-based tracing data captured when proxies are not instrumented #5919
Comments
I got another idea when writing the issue: enabling the console exporter for tracing only, and seeing what it emits. Here's the outcome:
On the deployed environment
No output whatsoever (when looking with
In the local environment
In addition to the above logging (and some other more detailed logging in general, enabled via
Will update the issue title accordingly. The problem seems to be with collection of tracing activity data in general; it's not isolated to the OTLP exporter per se. |
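For readers who want to repeat the console-exporter check described above, here is a minimal sketch. It assumes a minimal-hosting `Program.cs` (the issue itself uses a `Startup` class), the `OpenTelemetry.Extensions.Hosting`, `OpenTelemetry.Instrumentation.AspNetCore` and `OpenTelemetry.Exporter.Console` packages, and a placeholder service name `my-service`; none of these details come from the original report.

```csharp
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddOpenTelemetry()
    // "my-service" is a placeholder name, not taken from the issue.
    .ConfigureResource(resource => resource.AddService("my-service"))
    .WithTracing(tracing => tracing
        // Creates spans for incoming HTTP requests.
        .AddAspNetCoreInstrumentation()
        // Writes each exported span to stdout instead of sending it over OTLP.
        .AddConsoleExporter());

var app = builder.Build();

// Hypothetical endpoint, only here so there is something to request.
app.MapGet("/", () => "ok");

app.Run();
```

If requests reach the application but nothing is printed, the spans are probably not being recorded at all, which points away from the OTLP exporter and towards sampling or instrumentation.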
For the record, we haven't found the exact source of this but... modifying our YARP-based reverse proxy, which sits in front of the application, makes things work. 🤯 I.e. adding basically this to our reverse proxy code:

builder.Services.AddOpenTelemetry()
    .ConfigureResource(resource => resource.AddService(serviceName))
    .WithTracing(tracing => tracing
        .AddAspNetCoreInstrumentation()
        .AddOtlpExporter()
    );

...makes OTLP tracing work (with some caveats) both for the reverse proxy and for the service behind the reverse proxy. 🤔 I'm utterly perplexed by this. Could
Will close this issue soon unless anyone wants to keep it open for further debugging. |
@perlun, I suppose that your proxy was creating a non-recorded span, which was then propagated to the application. What is more, AspNetCore is instrumented by default, so if you do not record Activities via .AddAspNetCoreInstrumentation(), it will produce non-recorded spans. |
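To illustrate the explanation above: as far as I understand the SDK defaults, the sampler is `ParentBasedSampler` wrapping `AlwaysOnSampler`, so a request arriving with a non-recorded parent span is not recorded downstream either. Besides instrumenting the proxy, one possible workaround is to make the backend application ignore the parent's sampling decision. The sketch below assumes a minimal-hosting `Program.cs`, the `OpenTelemetry.Exporter.OpenTelemetryProtocol` package and a placeholder service name `backend-app`; it is not the configuration used in the issue.

```csharp
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddOpenTelemetry()
    // "backend-app" is a placeholder name, not taken from the issue.
    .ConfigureResource(resource => resource.AddService("backend-app"))
    .WithTracing(tracing => tracing
        // Record every span, even when the incoming trace context
        // says that the parent span was not sampled.
        .SetSampler(new AlwaysOnSampler())
        .AddAspNetCoreInstrumentation()
        .AddOtlpExporter());

var app = builder.Build();
app.Run();
```

The trade-off is that the backend's spans will then reference a parent span from the proxy that was never exported, so traces may show a missing parent. Instrumenting the proxy, as described earlier in the thread, avoids that.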
@Kielek Ah, that makes a bit of sense, thanks for the reply. 👍 I'll readily admit that I'm pretty much of a noob when it comes to OpenTelemetry in general. I guess the ParentBased-stuff is documented somewhere? |
Thanks @Kielek and @cijothomas. 👍 I feel like it would be worth mentioning these semantics somewhere. At least to me/us, it was quite a bit of a gotcha. We spent literal days debugging this before we (more or less by coincidence) found that it suddenly started working when we added telemetry to our YARP-based reverse proxy. It would be nice to help others avoid falling into this pit. We could add it to https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/src/OpenTelemetry/README.md#troubleshooting, but it's a very specific problem, so I'm not sure that it properly belongs there. I guess there isn't any form of a "FAQ" for the project or similar? |
Hey Bro! @perlun You're great. I've been struggling with this problem for a whole day; my setup is Client -> YARP -> Nginx -> APP. |
@perlun, @a35506322 if you think that it is worth improving the documentation, we are always open to that, especially to PRs :-). |
Package
OpenTelemetry
Package Version
Runtime Version
net8.0
Description
Hi,
We have been debugging a really weird problem for quite some time now, so I thought I'd reach out and ask for help. The problem is with our ASP.NET Core application not producing any tracing-related data, despite being enabled in our Startup class.
What's even more weird: the exact same application works fine when running locally (both in Docker and on the Linux host). In the deployed environment, it runs inside Docker and this is where it doesn't work.
I am not necessarily saying this is a bug in OpenTelemetry, but something in our application or elsewhere is causing the OpenTelemetry-based instrumentation to malfunction.
Steps to Reproduce
Unfortunately, I don't currently have a minimal reproducible example; in an isolated ASP.NET application based on the "getting started" example (https://opentelemetry.io/docs/languages/net/getting-started/), everything works as intended. The problem is isolated to our production code.
Setup code
We set up OpenTelemetry in a method called from ConfigureServices in our Startup class:
Expected Result
Tracing data from ASP.NET being sent to the configured OTLP exporter.
Actual Result
No tracing data emitted whatsoever.
Additional Context
NuGet package references
In addition to the package versions listed above, we also tested with 1.10.0-beta.1 + 1.9.0 of OpenTelemetry.Instrumentation.AspNetCore, with no difference.
What we have tested
Set up the OpenTelemetry Collector using these instructions, for easy(er) debugging. Presume the name of the container is otel-collector.
Run the application (inside Docker). Provoke some HTTP requests that produce tracing-related data. Check the logs of the OpenTelemetry Collector using this command:
docker logs otel-collector 2>&1 | grep data_type
On the environments where this doesn't work, the command outputs data roughly like this. As can be seen, no traces-related data is emitted from the application. (logs was also missing at one point, but I think this was because of a misconfiguration in our app.)
The OTEL_DIAGNOSTICS.json-generated diagnostics
When it works (on my local machine), the log looks roughly like this. The "Activity stopped" events contain the path to the route being traced (GET api/path/to/resource-1 etc).
When it doesn't work, it looks like this. Note how the "Activity stopped" entries are lacking the URL paths.
More details
I've debugged this to the best of my ability, and I suspected that the if (this.IsEnabled(EventLevel.Verbose, EventKeywords.All)) call here returned false:
opentelemetry-dotnet/src/OpenTelemetry/Internal/OpenTelemetrySdkEventSource.cs, lines 61 to 65 in 5dff99f
Because of limitations in my IDE, I was unable to place a breakpoint in 3rd party code when attaching to the process running inside the Docker container on the machine where we saw the problem, so I couldn't confirm this. Also, now when writing this, I am thinking: if the event source is somehow disabled, would we even get any "Activity started" events being logged at all? 🤔
I am very much at the end of the road here; we don't know how to debug this further. Any ideas/suggestions are greatly appreciated. 🙏