01:11 | <littledan> | Sounds like a good idea to me. As an OpenTelemetry maintainer, I can say that zone.js is the only option on the web platform that we can rely on to propagate tracing contexts. And I've received constant complaints about the fragility of zone.js and its lack of support for async functions. |
02:06 | <Chengzhong Wu> | Is there an issue you can point to with more details about how this came up? |
02:41 | <littledan> | Looks like these issues are on the server side? |
02:45 | <littledan> | Anyway great to see a reference to the original Angular issue at https://github.com/angular/angular/issues/31730 |
02:47 | <littledan> | Yoav, there is a lot of context in the presentation at https://docs.google.com/presentation/d/1yw4d0ca6v2Z2Vmrnac9E9XJFlC872LDQ4GFR17QdRzk/edit?usp=sharing (but my mastodon client keeps crashing when I try to respond to the thread) |
02:52 | <Chengzhong Wu> | OpenTelemetry can also run in browsers to trace user interactions and page navigation. |
02:52 | <littledan> | Oh I see, they are about https://github.com/open-telemetry/opentelemetry-js/blob/main/packages/opentelemetry-context-zone/README.md |
02:54 | <Chengzhong Wu> | yeah, the OpenTelemetry context manager implementation built on top of zone.js |
02:57 | <littledan> | What does it use that for? |
02:59 | <Chengzhong Wu> | Well, that's the story of platform-agnostic context propagation. We need an abstract interface so that we can inject a platform-provided async context implementation like AsyncLocalStorage, zone.js (and potentially CF Workerd's AsyncLocalStorage and Deno's). |
03:00 | <Chengzhong Wu> | This is the API definition of the OpenTelemetry ContextManager: https://github.com/open-telemetry/opentelemetry-js/blob/main/api/src/context/types.ts#L43. It defines the operations that are required to trace the application. |
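To make the idea of an injectable context manager concrete, here is a minimal sketch. The names `ContextManagerLike`, `SyncStackContextManager`, and `describeActive` are hypothetical and are not the OpenTelemetry API; the real `ContextManager` interface linked above defines more operations. The sketch only shows the shape: library code depends on a small interface, and a platform-specific implementation is plugged in behind it.

```typescript
// Hypothetical, simplified interface (an assumption for illustration,
// not the real OpenTelemetry ContextManager definition).
interface ContextManagerLike<T> {
  // Returns the context value that is active right now, if any.
  active(): T | undefined;
  // Runs fn with value as the active context, then restores the previous one.
  with<R>(value: T, fn: () => R): R;
}

// Naive implementation that only follows the synchronous call stack.
// A platform-backed implementation (zone.js, AsyncLocalStorage, ...) could
// satisfy the same interface while also surviving async boundaries.
class SyncStackContextManager<T> implements ContextManagerLike<T> {
  private current: T | undefined = undefined;

  active(): T | undefined {
    return this.current;
  }

  with<R>(value: T, fn: () => R): R {
    const previous = this.current;
    this.current = value;
    try {
      return fn();
    } finally {
      // Always restore, even if fn throws.
      this.current = previous;
    }
  }
}

// Library code only sees the interface, never the concrete implementation.
function describeActive(manager: ContextManagerLike<string>): string {
  return manager.active() ?? "(no active context)";
}

const manager = new SyncStackContextManager<string>();
const inside = manager.with("request-1", () => describeActive(manager));
const nested = manager.with("outer", () =>
  manager.with("inner", () => describeActive(manager)),
);
const outside = describeActive(manager);
```

Swapping the implementation does not require touching `describeActive`, which is the point of keeping the interface abstract.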
03:01 | <littledan> | So I found these docs for Node, are there analogous docs for the client side? https://opentelemetry.io/docs/instrumentation/js/context/ |
03:08 | <littledan> | The sample code there says that using the zone context provider is optional. What breaks if it is missing? What does “supports asynchronous operations” mean? It would be great to have this whole concrete case concisely described in our README. |
03:11 | <Chengzhong Wu> | Yeah, definitely. Without zone.js in the browser, context propagation relies on manual propagation or on the sync call stack, identical to the example described in https://docs.google.com/presentation/d/1yw4d0ca6v2Z2Vmrnac9E9XJFlC872LDQ4GFR17QdRzk/edit#slide=id.g198251ee25f_2_6. |
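A rough sketch of what "manual propagation or the sync call stack" means in practice. Everything here is hypothetical (`runWith`, `bindContext`, and the simulated task queue are not from zone.js or OpenTelemetry); a hand-rolled queue stands in for timers and promise jobs so that the behavior is deterministic. A context stored in a plain variable is visible to synchronous code, lost by a callback that runs later, and kept only if the callback was explicitly bound.

```typescript
// Hypothetical context store that only tracks the synchronous call stack.
let currentContext: string | undefined = undefined;

function runWith<R>(value: string, fn: () => R): R {
  const previous = currentContext;
  currentContext = value;
  try {
    return fn();
  } finally {
    currentContext = previous;
  }
}

// Manual propagation: capture the context at bind time and
// re-install it whenever the wrapped callback eventually runs.
function bindContext<A extends unknown[], R>(
  fn: (...args: A) => R,
): (...args: A) => R {
  const captured = currentContext;
  return (...args: A) => {
    const previous = currentContext;
    currentContext = captured;
    try {
      return fn(...args);
    } finally {
      currentContext = previous;
    }
  };
}

// Simulated task queue standing in for setTimeout / promise jobs.
const pendingTasks: Array<() => void> = [];
function scheduleTask(task: () => void): void {
  pendingTasks.push(task);
}
function drainTasks(): void {
  while (pendingTasks.length > 0) {
    const task = pendingTasks.shift();
    if (task) task();
  }
}

const observed: string[] = [];
runWith("click-span", () => {
  observed.push(`sync: ${currentContext}`);
  // Runs after runWith has returned, so the context is already gone.
  scheduleTask(() => observed.push(`unbound: ${currentContext}`));
  // Explicitly bound, so the captured context is restored when it runs.
  scheduleTask(bindContext(() => observed.push(`bound: ${currentContext}`)));
});
drainTasks();
// observed is ["sync: click-span", "unbound: undefined", "bound: click-span"]
```

Every scheduling site has to remember to call something like `bindContext`, which is the burden that zone.js (or a built-in AsyncContext) would remove.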
03:13 | <littledan> | Sorry the part I am missing is how this comes up as important in OpenTelemetry as I am not so familiar with that library |
03:13 | <littledan> | What sorts of traces do you end up wanting to take on the client side? |
03:17 | <littledan> | This is about building a causal chain that spans several server and client exchanges, explaining just part of what is going on in the page, passing along a context id in an AsyncContext variable? |
03:20 | <littledan> | I would say that the task priority use case, the timing case in the slide deck, and the context id propagation use case are all very interesting and different. (I honestly don’t understand the cache case yet) |
03:24 | <littledan> | The fact that we have 3-4 very different, real and meaningful use cases that are all on the client and solved by AsyncContext should be a strong argument |
03:34 | <littledan> | Can you walk me through a basic case where the context is used? It is great to have all these references but I want to understand if the context is only used in these contrib plugins or also in further even more simple cases. |
03:36 | <littledan> | Maybe the User interaction instrumentation is this basic case? |
03:42 | <Chengzhong Wu> | yeah |
03:47 | <littledan> | There are a lot of details here about how it hooks into events in the case that zone.js is missing! |
03:48 | <littledan> | Perhaps parallel to that, Yoav commented that it was a bit subtle which events make sense to propagate async context over |
03:49 | <Chengzhong Wu> | So taking the screenshots in the readme as the example, when a button is clicked, a new trace span is created and saved to the async context as the current active span (code). When fetch is invoked as a result of the click event, the fetch instrumentation takes the current active span from the async context (code), creates a child span of it, and injects the necessary trace ids and span ids into the request to be sent to the server. |
03:50 | <Chengzhong Wu> | Basically, every time we create a span, the span will be marked as a child span of the active span saved in the async context. |
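The click-then-fetch flow described above can be sketched roughly as follows. This is not the OpenTelemetry implementation: `DemoSpan`, `startSpan`, `withActiveSpan`, `instrumentedRequest`, and the `/api/items` URL are all hypothetical, and ids come from a counter so the output is reproducible (real trace ids are effectively random). The header follows the W3C `traceparent` layout (`version-traceid-spanid-flags`) as an assumption about what "injecting trace ids and span ids" could look like.

```typescript
// Hypothetical span shape, for illustration only.
interface DemoSpan {
  name: string;
  traceId: string;
  spanId: string;
  parentSpanId?: string;
}

// Stand-in for the async context slot that holds the active span.
let activeSpan: DemoSpan | undefined = undefined;
let nextId = 1;

// Deterministic ids for the sketch; real ids would be random.
function newId(hexLength: number): string {
  let hex = (nextId++).toString(16);
  while (hex.length < hexLength) hex = "0" + hex;
  return hex;
}

// Every new span becomes a child of whatever span is currently active.
function startSpan(spanName: string): DemoSpan {
  const parentSpan = activeSpan;
  return {
    name: spanName,
    traceId: parentSpan ? parentSpan.traceId : newId(32),
    spanId: newId(16),
    parentSpanId: parentSpan ? parentSpan.spanId : undefined,
  };
}

function withActiveSpan<R>(span: DemoSpan, fn: () => R): R {
  const previous = activeSpan;
  activeSpan = span;
  try {
    return fn();
  } finally {
    activeSpan = previous;
  }
}

// Hypothetical fetch instrumentation: creates a child span and builds the
// headers that would carry the ids to the server (no network call is made).
function instrumentedRequest(url: string): {
  span: DemoSpan;
  headers: Record<string, string>;
} {
  const span = startSpan(`HTTP GET ${url}`);
  const headers = { traceparent: `00-${span.traceId}-${span.spanId}-01` };
  return { span, headers };
}

// Hypothetical click handler: the click span is active while the handler runs.
function onButtonClick() {
  const clickSpan = startSpan("click");
  const request = withActiveSpan(clickSpan, () =>
    instrumentedRequest("/api/items"),
  );
  return { clickSpan, request };
}

const clickResult = onButtonClick();
```

In this sketch the request is issued synchronously inside the click handler, so a plain variable is enough. Once the fetch happens after an `await` or a timer, the active span has to survive the async boundary, which is the part that needs zone.js or AsyncContext.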
03:52 | <littledan> | And it will start a new span when, e.g. there is a click or a long task detected? |
03:52 | <Chengzhong Wu> | yeah, exactly |
03:54 | <littledan> | Ah I see, thanks for explaining. This sounds extremely similar to what I think Yoav might be trying to accomplish—to the point where we should probably try to understand whether such a higher level construct might be sufficient for the OpenTelemetry case |
03:56 | <littledan> | Didn’t we just cancel React over this? ;) |
04:10 | <littledan> | Yeah, when considering the aspect of generating these IDs and sending them to the server, the behavior and usage are sufficiently different from anything that could be built into the browser, so this motivates exposing the “lower level” AsyncContext API that OpenTelemetry can use, possibly alongside some higher level built-in purely client-side tracing/metrics |
04:28 | <Chengzhong Wu> | Sorry if I'm misunderstanding. The trace id in OpenTelemetry is not the lower level async id James mentioned. The trace id is just a random value. |
04:36 | <littledan> | Right, I got that, for this reason it would be sort of inappropriate for the built in browser api to generate it and pass it on to fetch |
04:38 | <littledan> | The question I am trying to answer is: for Yoav, do we actually need to expose AsyncContext to JS, or could we just have built in browser mechanisms to handle these cases? For example, main thread scheduling priority propagation could be something totally built-in, so it provides relatively weak motivation for exposing AsyncContext to JS. |