02:10
<littledan>
* I'm not even sure if per-store configurability (if it's a global default to push things in the call/resolve-time vs registration-time direction in the cases where both are available) will be a solution; sometimes you want the registration-time ancestor of the call-time ancestor
04:32
<Justin Ridgewell>
Wow, this really blew up over the last few days
04:34
<Justin Ridgewell>
It's not just the timing of enabling PromiseHook. It is that the portion of an async function before the first await behaves differently from the rest of the function because the context capture will have already happened around the outer await before passing over the inner awaits. It's confusing to users that the context is not flowing out.
I don’t understand this. What are the inner and outer awaits? In my mind, the async context is bound when we enter the function, it doesn’t matter where the awaits are.
04:34
<Justin Ridgewell>
The proposal as it current stands is maintaining the lexical consistent value inside a async function body across await for AsyncContext.Variable, which is part of the approach aiming to address the dynamic scope concerns from the previous meeting's feedback.
Exactly.
04:35
<Justin Ridgewell>
Yes. That is actually what users keep telling me they are expecting.
Are there links to issue reports for this?
04:37
<Justin Ridgewell>
(See the warnings about rust tracing::Span in async code https://docs.rs/tracing/latest/tracing/span/struct.Span.html#in-asynchronous-code for some consequences of this sort of design in the wild -- but they're better off because Futures are expressive enough to write Future::instrument without needing built-in help)
This example doesn’t apply in our case, since any cooperate threading (another promise resuming in JS) would have restored its own context.
04:44
<Justin Ridgewell>
sure, people rewriting code from callbacks to promises might prefer resolution time, but what about people writing async/await code from scratch? Do we know what they need? Would we be breaking them?
And it’s not just web APIs, anything trying to cache fetches or other async behavior, any module level initializion using a promise, will break any user.
04:47
<Justin Ridgewell>
Flowing around is essentially just local variable scope which, to me, doesn't really seem to have much value. I've asked a few times for an explanation or some use case why anyone would want that flow but have not yet got any answer on that. 🤔
It’s a local variable scope that propagates to child calls automatically. I explicitly want this becasue it’s understandable, and easy to debug.
04:49
<Justin Ridgewell>
even with .set(), I don't think this would be dynamic scoping, at least not in the sense that https://github.com/tc39/proposal-async-context/blob/master/SCOPING.md discusses
Unfortunately it is a form of dynamic scoping, but from callee to caller. This was another one of the points I had to discuss with the SES folks to get them on board with the proposal. The way they reasoned about it is complicated, but essentially boils down to an implicit param to every function that is the async context mapping.
04:51
<Justin Ridgewell>
One concern I have with the mutable variable idea is how one is supposed to bound the mutation - I think it's common to make a change with a lifetime, which is why we've gravitated toward the run() encapsulation. As you've pointed out, with the flow-through semantics, run is very clearly the wrong name, and set is much more appropriate. But your example also hints at some mechanism whereby changes to the variable would still somehow go out of scope? I think that's important, but it seems like it would lead to lots of confusion. If some other scope is entered internally, would that inner scope accidentally prune off a branch of var mutations? That also doesn't seem right.
I think allowing unbounded sets opens up the same problems as scoping, and introduces a global leak that is unfixable (how do you ever know when the last set value is not needed anymore?).
04:57
<Justin Ridgewell>
There's not really any writing on it presently, but basically because we don't flow into promise continuations we need to keep an object at the top-level of the request which we push spans into and basically guess what the parent is because, unlike with callbacks, the parent does not flow into the continuation. This means we have to look at what has been pushed into that list in the past and try to guess what the parent would be. Sometimes it's simply the last span in the list, but not always, so we need to do some additional analysis to figure out what was the last span which could be a realistic parent, which requires we also hold a bunch of the span metadata. Holding all this metadata is expensive given that we could have hundreds of thousands of concurrent spans. What we want is for the correct context to just be able to flow through the resolve path so we only need to store the IDs and can entirely eliminate that big bag of data which lives for the whole life of the request.
I think this is what I’m missing. I don’t understand why could would need to guess, I’m naively assuming its always follows-from the last child and child-of the current parent. What’s a case where this isn’t correct? How does the flows-through solve holding holding the hundreds of thousands of concurrent spans?
04:57
<Justin Ridgewell>
(Also, sorry for not respondning a whole lot, normal coprorate BS is taking up all my free time)
08:49
<Andreu Botella>
Unfortunately it is a form of dynamic scoping, but from callee to caller. This was another one of the points I had to discuss with the SES folks to get them on board with the proposal. The way they reasoned about it is complicated, but essentially boils down to an implicit param to every function that is the async context mapping.
is there any documentation anywhere about what SES needs from new proposals?
09:01
<Stephen Belanger>
Honestly I have to agree with others who pointed out that some in TC39 will see this version to be "not well behaved"
In my view, considering in-place mutation as "not well behaved" because of unintended flow consequences just says to me that the flow model is not clearly defined and correct enough to be certain of what the behaviour will be. If the model is actually correct and consistent then there should be no reason that setting partway through a scope should produce any sort of unexpected behaviour.
09:03
<Stephen Belanger>
I just have to admit the use cases are valid and there are many places where We Need Both .... makes things way more complicated tho
Yes, particularly with async/await the callback-scoped version starts to make less and less sense. This was the whole reason that way back before we settled on AsyncLocalStorage I was backing the AsyncLocal proposal which only did the set/get style and left the flow scoping semantics up to the runtime to get right.
09:06
<Andreu Botella>
I'm not actually sure I understand this. As with the current proposal, you would still need to have access to an object in order to set or get, so isn't it exactly parallel to setting a variable defined in an outer scope?
09:08
<Stephen Belanger>
my experience with mutable dynamically bound variables in Factor was negative. It can be surprising how much or little the write to the variable is propagated up/backwards and then read by someone else. .run has a simple answer to this question: it isn't propagated up/backwards.
It doesn't mutate the slot, but if you store an object in it you can propagate mutations up the tree. I think all we can really do is make captures be a copy of what the reference points to at the time and not be the same binding, so at least that top-level mapping of store to value doesn't propagate upward. This is also partly why I think it's important that it be treated as a variable and not a bag of data, because data bags are problematic when it comes to leaking mutations.
09:10
<Stephen Belanger>
so we ended up rediscovering that setting a dynamically scoped variable was an anti-pattern, and we moved towards doing .run instead
This is what I lean toward at this point, personally. Though I do see that there are a few scenarios where it doesn't quite work the way you want it unless you do things like breaking out of awaits. Unfortunately this is how the OTel context works--you can activate a context within the current scope.
09:11
<littledan>
What do you mean by breaking out of awaits?
09:17
<littledan>
Yes, basically every other language flows this way with OTel, but other languages also provide things like bytecode manipulation so it's able to do this externally. With Node.js there is no solution.
Comparison with other languages would be really useful if we can figure out where to look
09:22
<Stephen Belanger>
I don’t understand this. What are the inner and outer awaits? In my mind, the async context is bound when we enter the function, it doesn’t matter where the awaits are.

The spec, as it is presently, calls for capturing the context at the point where an await happens, not where the promise being awaited resolves. The context flows into that promise, and it would flow all the way to the point of resolving, but then we orphan that branch and return the context back to the captured value of that outer await.

In this particular example though, it's using store.enterWith(...) which effectively sets the value in the current scope without creating a new scope first. Because the initial segment of an async function executes synchronously, if that occurs before the first await of that internal function then the point at which the outer function captures its await will have already change its context and so will adopt that inner context. Whereas if that inner function modifies the current context after the first await within that inner function it will have already passed the point where the outer function captured the await context and so it would not use that value. The fact this sometimes changes the outer function context and sometimes doesn't is extremely confusing. Users are expecting it to always change the outer context, as indicated in the issue, because it is logically a continuation and should therefore behave the same as it would were it written with callbacks instead.

09:26
<Stephen Belanger>
Are there links to issue reports for this?
The issue I just posted yesterday was exactly this, and we get numerous reports expecting this behaviour in Datadog support escalations. I've seen it come up many times before in many places too--Slack conversations, conversations at collab summits...I've never seen anyone that expected the behaviour as it is presently. It's possible that no one talks about it because people don't complain about what they think is already working, but given that the vast majority of context management usage is APMs and all the APM devs I've talked to have complained about these context flows as problematic I feel like there's a pretty good chance we're getting it wrong.
09:27
<Stephen Belanger>
It’s a local variable scope that propagates to child calls automatically. I explicitly want this becasue it’s understandable, and easy to debug.
Yes, I get that people want this "around" flow. I'm fine with people wanting it. I'm just not clear what exactly the reason is why they want it. What specifically is the use case for that flow?
09:36
<Stephen Belanger>
I think this is what I’m missing. I don’t understand why could would need to guess, I’m naively assuming its always follows-from the last child and child-of the current parent. What’s a case where this isn’t correct? How does the flows-through solve holding holding the hundreds of thousands of concurrent spans?

Consider this code:

async function doSomething() {
  await doAThing() // first span
  await doAnotherThing() // no span
  await doYetAnotherThing() // second span should be follows-from first
}

If we have call doSomething() twice in a Promise.all(...), which we do not instrument or have any particular awareness of--maybe it's user code--then we'll get two spans being created for the first awaits of each, then when the second spans come in they will no have a way to differentiate because the doSomething() itself would have the same context when it runs both times and would be restoring that over the awaits, blowing away the context we had from the inner function calls which we otherwise could associate with if we had flow-through semantics.

09:39
<Stephen Belanger>
I'm not actually sure I understand this. As with the current proposal, you would still need to have access to an object in order to set or get, so isn't it exactly parallel to setting a variable defined in an outer scope?
Yeah, I really don't get what the security concerns are here. You need access to the variable instance to retrieve the data. Sharing any normal variable would have the same security implications.
09:41
<Stephen Belanger>
What do you mean by breaking out of awaits?
I mean the flow-through semantics. The store.run(...) form is not particularly usable in async functions, having an in-scope mutation would make more sense, or at least some way to inform the context to not bind around the await when what you actually want is the flow-through semantics.
09:43
<Stephen Belanger>
Most other languages I've seen don't have a user-provided scoping mechanism for context, the runtime decides it themselves.
09:43
<Stephen Belanger>
Like Ruby does that with fiber locals as just a simple map that it manages on its own without any user-provided scopes.
09:44
<Stephen Belanger>
https://docs.ruby-lang.org/en/master/Fiber.html#method-i-storage
09:45
<Stephen Belanger>
And go has its explicit Context object which you have to pass through manually, but you basically do clones every time you want to modify it and it flows downward through the calls you make passing in the modified version.
09:46
<Stephen Belanger>
.NET also has AsyncLocal, which is basically exactly what was originally proposed in Node.js before AsyncLocalStorage was chosen instead, because people didn't trust the Node.js runtime to manage scopes correctly. https://learn.microsoft.com/en-us/dotnet/api/system.threading.asynclocal-1?view=net-8.0
09:47
<Stephen Belanger>
Again, doing the simple key/value hash map type structure with no user-supplied scoping mechanism.
09:47
<Chengzhong Wu>
The issue was confusing for it's use on enterWith and performance optimization that make an AsyncLocalStorage behaves differently when modification of AsyncLocalStorage was performed at different positions in an async function body. Let's wait to see if Matteo wants to expand on his original requirement on the issue.
09:50
<Stephen Belanger>
Python also does the plain map with a set/get and manages context scopes on its own. https://docs.python.org/3/library/contextvars.html
09:51
<Stephen Belanger>
As far as I'm aware, JavaScript is the only language that thinks (for some reason) that this needs to be handed over to the user rather than just letting them modify the current scope.
09:52
<Stephen Belanger>
Probably because we had bad experiences with domains in Node.js. But I think that's less a problem of set/get and more that domains itself was just bad. 😬
09:52
<Andreu Botella>
As far as I'm aware, JavaScript is the only language that thinks (for some reason) that this needs to be handed over to the user rather than just letting them modify the current scope.
in Rust, there's tokio's task_local! / LocalKey
09:52
<Andreu Botella>
which doesn't let you modify the current scope
09:53
<Chengzhong Wu>
.NET also has AsyncLocal, which is basically exactly what was originally proposed in Node.js before AsyncLocalStorage was chosen instead, because people didn't trust the Node.js runtime to manage scopes correctly. https://learn.microsoft.com/en-us/dotnet/api/system.threading.asynclocal-1?view=net-8.0
https://dotnetfiddle.net/7ulHg0 AFAICT dotnet doesn't flow-through values even it allows .set pattern
09:54
<Stephen Belanger>
Yep, flow-through is a different issue from the matter of if set/get are reasonable. I mostly brought that up because of James talking about that split idea.
09:55
<Chengzhong Wu>
That's great. I really appreciate distinguishing the two topics
09:55
<Stephen Belanger>
But I think what would probably actually make sense is to just have multiple context types which work in the set/get mode and use the two types to differentiate the flows.
09:56
<Stephen Belanger>
One can be a "modify current context and flow into children only" type while the other can be "modify for any logically following execution" type.
10:28
<Stephen Belanger>
I think ContinuationContext clearly describes the logical execution flow style. I feel like the AsyncContext name is a bit vague about exactly how it flows through async behaviour though. I wonder if we should consider a name which more clearly communicates exactly how it flows through async code so we aren't creating confusion?
10:31
<Stephen Belanger>
I wonder if we need the "async" in the name at all. CallContext might communicate the flow semantics better? Or even ImplicitCallContext to make it even more clear that this is data which gets passed around without needing to do so explicitly? 🤔
10:44
<Stephen Belanger>

Another reason the set/get model kind of makes more sense is you can build the store.run(...) model on top of it whereas the reverse is not possible.

run(value, scope) {
  const previous = this.get()
  try {
    this.set(value)
    return scope()
  } finally {
    this.set(previous)
  }
}
11:02
<Andreu Botella>
I've been looking through https://github.com/endojs/endo/blob/markm-fluid-scopes/packages/eventual-send/src/async-contexts/README.md and I think I more or less understand the security concerns, and set() definitely violates the "unobservable to anything that cannot run in the spawned turn" condition
11:03
<Andreu Botella>
I'm not yet sure whether the resolution context behavior of promises would, though
11:04
<Andreu Botella>
I think it would as well
11:05
<Andreu Botella>
so either of those behaviors would probably get strong opposition from the SES folks
11:06
<Andreu Botella>
the thing that breaks confidentiality is global wrap, though
11:06
<Andreu Botella>
but we can't remove it (in favor of instance wrap) without making a bunch of use cases impossible or impractical
11:13
<Andreu Botella>
actually, no, even the behavior of then would allow that kind of information leaks
11:16
<Chengzhong Wu>
Yes, then is basically exposing global wrap as defined
11:19
<Andreu Botella>
well, I guess in this case it wouldn't properly be then that wraps, but resolve/reject
11:20
<Stephen Belanger>
I don't get SES opposition. The data related to a variable is completely inaccessible without the variable instance reference. A value is already being exposed, it's just being exposed through essentially managed binding update of an earlier defined variable. The patterns of context management through a variable store are not insecure unless you're sharing the store object in the first place which obviously should not be done with unknown/untrusted things. But opposing context management designs because this is possible makes no sense when you can do the same thing with any form of variable reference.
11:24
<Andreu Botella>
the thing they're concerned with is that global wrap would let two components establish a channel of communication by letting one have a local AsyncContext.Variable and another have global wrap
11:26
<Andreu Botella>
the idea is that, if the component with global wrap can have access to multiple contexts that are set by the other component, then they can reset one such context at some later point, which would communicate information to the other component if that one is called later
11:27
<Stephen Belanger>
Ehhh...sort of. They can "communicate" in the intended sense that a global wrap can change the behaviour in a way that it is intending to be global.
11:27
<Stephen Belanger>
But this is also why I prefer instance-scoped bind as the default--you're not messing with a global graph and therefore influencing unrelated things.
11:28
<Stephen Belanger>
But sometimes you need to influence the global graph.
11:28
<Stephen Belanger>
That forced bind though is why I made WindowChannel--so you can explicitly accept a suggested context change rather than having it forced on you.
11:29
<Andreu Botella>
I don't necessarily disagree, but set() establishes a communications channel that's not limited to one microtask and its outgoing scoped data flow
11:29
<Andreu Botella>
and that seems to be the thing they're concerned about
11:30
<Stephen Belanger>
I don't necessarily disagree, but set() establishes a communications channel that's not limited to one microtask and its outgoing scoped data flow
How so? The runtime can define that the scope of a set alteration lives to the end of the current microtask.
11:30
<Stephen Belanger>
This is what most other languages do.
11:31
<Stephen Belanger>
When the current tick ends and it would transition to an async continuation it terminates the context and sets whatever would get restored for the continuation.
11:31
<Stephen Belanger>
Or explicitly null/undefined/empty if nothing was captured for that continuation.
11:31
<Stephen Belanger>
But the point is, a set(...) never crosses over into unrelated activity.
11:33
<Stephen Belanger>

Basically this, but with the scope.run(...) being implicitly provided by the runtime:

// Any calls to set(...) get cleared when scope ends
scope.run(() => {
  store.set(1)
})
11:34
<Stephen Belanger>
And, to be clear, we are essentially already doing this with propagation.
11:34
<Stephen Belanger>
All we need to do is raise the set one level out to whatever the current thing which could have had a scope is, and set there rather than introducing a new scope.
11:38
<Stephen Belanger>
The only real concern is that the store needs copy-on-write semantics so that changes don't leak backward in time within the same scope by modifying the same map. The set(...) should clone the map and future captures within the same scope should see the new map.
11:40
<Andreu Botella>
I think I misspoke
11:40
<Stephen Belanger>
I would argue the set/get model is actually easier to reason about because users can't produce strange scopes that don't behave quite the way they expect, like const gen = (function*() { ... })(); store.run(1, () => gen.next()).
11:40
<Andreu Botella>
and this whole thing is somewhat hurting my brain
11:41
<Andreu Botella>

So if I understand correctly the readme in this PR, a new turn can learn explicit information to where it came from, but it cannot use that information to covertly communicate or affect the integrity of other turns that were also spawned at the same time, if it didn't have a way to do so without async context.

11:41
<Andreu Botella>
I understand "turn" to mean "(micro)task" here
11:43
<Andreu Botella>
in the end, I don't know how much of this is needed, because the current proposal already establishes a new overt channel, and set would be no different
11:43
<Andreu Botella>
I don't fully understand the fact that the SES folks are fine with having it be scoped to within the (micro)task
11:44
<Andreu Botella>
but I was trying to explain what I understand to be their concerns
11:45
<Stephen Belanger>

Yep, so that would be true of the flow-around semantics. A caller can't see into a callee, but a callee can receive context from their caller.

That is not true of the flow-through semantics as they can set context which can be viewed in a callee via part of it being continuation of that call resolving. However, I don't think that is a security concern either as:

  • It's intentional that it has that communication flow
  • That data is still not accessible without sharing the variable instance reference. All bets are off on security once you start explicitly sharing data stores.
11:46
<Andreu Botella>
I'd like to hear Mathieu Hofman's thoughts on this
11:47
<Stephen Belanger>
You have to both explicitly pass that data out of the callee via the set(...) and explicitly retrieve it by having access to the store in caller when it calls get(...).
11:47
<Stephen Belanger>
So really the communication channel between the things is already quite explicitly defined.
11:48
<Stephen Belanger>
It's no different than if I store an event emitter globally, attached a listener in the caller to receive events, and then emit an event inside the callee from that same global event emitter.
11:51
<Andreu Botella>
I think you're missing the point: the communications channel they're worried about is one enabled by global wrap
11:52
<Andreu Botella>
which can be used by a component that doesn't have access to the variable to set the context in which the component with access to it is called
11:52
<Stephen Belanger>
But this is also why I prefer instance-scoped bind as the default--you're not messing with a global graph and therefore influencing unrelated things.
See:
11:52
<Andreu Botella>
without the knowledge of the component that sets both of those components up
11:52
<Andreu Botella>
See:
but global wrap would still be available in your model
11:52
<Andreu Botella>
the fact that it's available is the confidentiality risk
11:53
<Andreu Botella>
and you could do the same with then, I think
11:53
<Stephen Belanger>
You can do the same thing with event emitters though. If you make a global you can emit from anywhere, you can subscribe from anywhere and you can unsubscribe from anywhere.
11:54
<Andreu Botella>
but here there isn't a global
11:54
<Stephen Belanger>
And yes, global wrap would still be available but, as I've said, I would strongly advise against its use.
11:55
<Stephen Belanger>
Though the WindowChannel can replace global bind so you always have to opt-in to forced graph changes.
11:55
<Stephen Belanger>
Which can eliminate that problem.
11:55
<Stephen Belanger>
But depends on WindowChannel.
11:56
<Stephen Belanger>
So unless we do that (or something like it) we're stuck with a global bind. I don't think global binding is avoidable when things like connection pools can exist. We need a way for users to be able to communicate that something should be routed around in a particular way.
11:57
<Stephen Belanger>
But we could be doing it in a way which the author of that code can provide a suggestion that something should be routed around and a store owner can provide some sort of acceptance of such changes.
11:58
<Stephen Belanger>
That sort of suggest/accept model adds a bunch of complexity though.
12:03
<Andreu Botella>
And yes, global wrap would still be available but, as I've said, I would strongly advise against its use.
but resolve and then would use global wrap, and it seems like they would still allow this channel
12:05
<Andreu Botella>
and those are fundamental to promises, so it would make promises and async/await impossible in a SES environment which doesn't prevent use of AsyncContext
12:13
<Stephen Belanger>
but resolve and then would use global wrap, and it seems like they would still allow this channel
They would in the current model. But potentially they could do the same suggest/accept thing. That would both give us control to have the different flows we expect, and resolve that security concern.
12:15
<Stephen Belanger>
Though arguably it's not actually an issue if the behaviour is to always do a particular thing. It'd just be an observable side effect of how execution works, same as how microtask timing is an observable side effect.
12:34
<littledan>
is there any documentation anywhere about what SES needs from new proposals?
lol no it’s more of a vibe. We have asked them for these docs several times over the years and their response is “yes that is a great idea, someone should definitely write that”
12:43
<Andreu Botella>
lol no it’s more of a vibe. We have asked them for these docs several times over the years and their response is “yes that is a great idea, someone should definitely write that”
and yet in https://github.com/endojs/endo/pull/1424 some of them were saying that it should be up to proposal authors to make sure their proposals work with SES??
12:45
<nicolo-ribaudo>
is there any documentation anywhere about what SES needs from new proposals?
The guiding principle I have been following is "the final goal is to be able to run/wrap some code in a sandbox that emulates any other JS environment, without that code being able to neither detect it not affect the outer environment in any way".
JS is not there (yet? They hope that one day it will be), but anything that moves us further away from that goal is bad
12:46
<nicolo-ribaudo>
With the exception of infinite loops, that everybody pretends do not exist
12:49
<Andreu Botella>
with something like compartments, you would be able to detect that you're in one because the prototypes are frozen, but I guess the important point is that you can't tell which sandbox you're in, right?
13:09
<littledan>
and yet in https://github.com/endojs/endo/pull/1424 some of them were saying that it should be up to proposal authors to make sure their proposals work with SES??
yes, that's how it's generally done
13:19
<Stephen Belanger>
Seems odd to me to block a specification proposal on something else that is apparently unspecified. 🤔
13:19
<Stephen Belanger>
If they expect conformance they should be responsible for clearly communicating what conformance means.
14:43
<littledan>
well, they are available for conversations
18:44
<Justin Ridgewell>
No, this is just recollection from the meetings I had with them. That dynamic scoping doc in the proposal was written by @Chengzhong Wu and me before meeting with them, based on our own understanding of their objections from the first plenary meeting where they rejected the proposal.
18:48
<Andreu Botella>
Based on https://github.com/endojs/endo/blob/markm-fluid-scopes/packages/eventual-send/src/async-contexts/README.md, it seems like the current spec text does open a communications channel, but it's one they don't think breaks SES because you can't observe it from outside of the task a .run() call happens in
18:48
<Andreu Botella>
I don't really see why that makes it not break SES, but that seems to be the reasoning
18:49
<Andreu Botella>
and that doesn't apply with resolution time or set()
18:49
<littledan>
I’m telling you, it’s a vibe
18:52
<littledan>
There’s no single thing that the SES things are going for; it is very loosely around enabling membranes but not just that
18:53
<Justin Ridgewell>
I'm not actually sure I understand this. As with the current proposal, you would still need to have access to an object in order to set or get, so isn't it exactly parallel to setting a variable defined in an outer scope?
There’s two directions. If foo calls bar, and it’s completely normal that foo can change a parameter to bar, but it’s not normal if bar can change one of foo’s parameters. If both share a mutable object, that’s fine to mutate it in bar, but bar can’t replace foos reference. This was one of the things we discussed based on the OCAP language meeting.
19:05
<Justin Ridgewell>

The spec, as it is presently, calls for capturing the context at the point where an await happens, not where the promise being awaited resolves. The context flows into that promise, and it would flow all the way to the point of resolving, but then we orphan that branch and return the context back to the captured value of that outer await.

In this particular example though, it's using store.enterWith(...) which effectively sets the value in the current scope without creating a new scope first. Because the initial segment of an async function executes synchronously, if that occurs before the first await of that internal function then the point at which the outer function captures its await will have already change its context and so will adopt that inner context. Whereas if that inner function modifies the current context after the first await within that inner function it will have already passed the point where the outer function captured the await context and so it would not use that value. The fact this sometimes changes the outer function context and sometimes doesn't is extremely confusing. Users are expecting it to always change the outer context, as indicated in the issue, because it is logically a continuation and should therefore behave the same as it would were it written with callbacks instead.

What is the example code? What is the indicated issue URL? We’d previosuly discussed why we don’t have an enterWith API because of cases like this, it needs to properly restore the context after the execution of the inner async function pauses its await. Part of the reason we discussed using declarations with modifications to await keyword to restore the previous context at the suspend point.
19:06
<Justin Ridgewell>
The issue I just posted yesterday was exactly this, and we get numerous reports expecting this behaviour in Datadog support escalations. I've seen it come up many times before in many places too--Slack conversations, conversations at collab summits...I've never seen anyone that expected the behaviour as it is presently. It's possible that no one talks about it because people don't complain about what they think is already working, but given that the vast majority of context management usage is APMs and all the APM devs I've talked to have complained about these context flows as problematic I feel like there's a pretty good chance we're getting it wrong.
What issue? There were like 400 messages in this channel over the last week.
19:06
<Andreu Botella>
https://github.com/nodejs/node/issues/53037
19:15
<Justin Ridgewell>

Consider this code:

async function doSomething() {
  await doAThing() // first span
  await doAnotherThing() // no span
  await doYetAnotherThing() // second span should be follows-from first
}

If we have call doSomething() twice in a Promise.all(...), which we do not instrument or have any particular awareness of--maybe it's user code--then we'll get two spans being created for the first awaits of each, then when the second spans come in they will no have a way to differentiate because the doSomething() itself would have the same context when it runs both times and would be restoring that over the awaits, blowing away the context we had from the inner function calls which we otherwise could associate with if we had flow-through semantics.

So it’s the case where doSomething doesn’t create a parent span for its children? And we have to infer order of sibilngs when two doSomething calls are interleaved. Is this not just user error?
22:17
<Steve Hicks>

I spoke with the tracing experts I'm working with and they strongly prefer "flow around" semantics over "flow through" await (with a caveat that their primary experience is in other languages). In particular, the concern is that the ability to do an unbounded set() isn't a good situation. Further, they weren't at all concerned with the visibility issues from pruning off subtasks by binding around them, for two reasons: (1) exact follows-from relationships are primarily relevant for critical path analysis, which already has numerous other issues making it intractable, and (2) we can get a reasonable enough picture with the mutable trace objects referenced in a "lexically scoped" async store (i.e. what's currently proposed here). The rough structure we use (IIUC) is that the async-local variables are trivially cheap span IDs and the global store is a log of events that includes those IDs. When a span begins/ends, it gets a new ID the start/end times and parents are appended to the log (along with any additional metadata). The graph can then be reconstructed later from that log.

I agree completely with Justin's assessment of the doSomething() example - it wouldn't be an issue if doSomething were properly instrumented. If you branch into two parallel subtasks without making a child span, you're asking for a bad time.

22:21
<Steve Hicks>
All that said, even if flows-around is the correct (default) behavior for await (and don't forget about yield), the question of accessing causal context in callbacks passed to builtin and userland scheduling APIs is still relevant, and needs to be addressed (in a way that hopefully doesn't break "lexical encapsulation"). So we do still need to be working on the question of callingContext or something like it.
22:29
<Andreu Botella>
Do you think that the default context for events (and other APIs that could have more than one context) should be the registration time one?
22:31
<Andreu Botella>
I've been rethinking this, but I don't trust my intuitions on this
23:44
<littledan>
What are your current thoughts, Andreu?
23:44
<littledan>
Even if you aren’t sure of them
23:47
<Andreu Botella>
I think that maybe the only cases where events should be registration time are the ones where there isn't any possible JS origin for the event
23:48
<Andreu Botella>
even in things that are parallel to scheduling API, such as the message event, the thing that is parallel to calling e.g. setTimeout is the message sending
23:48
<Andreu Botella>
not the registration
23:48
<Andreu Botella>
but I suspect that would lead to a lot of use of global wrap
23:49
<littledan>
Yes, I am coming around to that point of view as well. You could think of it like, it should give “as much information as possible” about the cause.
23:49
<littledan>
Along the lines of the arguments that we’ve been discussing, but just applied to callbacks
23:50
<Andreu Botella>
but we'd have to look into why zone.js went with registration for all events
23:51
<Andreu Botella>
and whether that works for its users
23:59
<littledan>
Yeah have you managed to get in touch with them?