TC39 Async Context on 2024-05-27

17:56	<Andreu Botella>	I think we have more `unhandledrejection` bugs that we haven't caught yet
17:57	<Andreu Botella>	this was in part because `unhandledrejection` is tested with WPT, not test262, and I haven't gotten around to writing WPT tests yet
17:58	<Andreu Botella>	but it's looking like with the current spec, `someAsyncApi().then()` would lose track of the rejection context
17:59	<Andreu Botella>	in Stephen's proposal of preserving the resolution context, this would just work I think
18:02	<Stephen Belanger>	This seems funcionally equivalent to https://github.com/tc39/proposal-async-context/issues/60#issuecomment-2050457550, which solves the global leak issue by having a wrapping try-catch reset the vaules after exiting the closure. It’s still not my favorite implementation becuase of mutability and odering of async functions causing unrelated breakages, eg if main started with an `await 0` before entering `’bar’`, the outside caller sees different behavior. It feels like Zalgo-lite. Similar, but the intent is that the runtime would generally be defining those scopes rather than only ever being handled by a user-facing API. We have a bunch of discrete execution concepts which can be used as boundaries for these scopes such as a promise continuation or a callback. We could also probably extend the Function type with something like a `func.bindToContext(value)` which could be similar to having that function call `run(...)` internally. My point is, we can have the nice DX of the set/get interface without too heavily exposing the scoping problem to users. Decoupling the scoping also makes it potentially reusable, which would be beneficial if we do decide to make separate types for the two around and through flows. It would simplify things for users if they can describes scopes with a single API and have both flow systems just make different decisions about how to link the scopes together in the graph.
18:03	<Stephen Belanger>	Everywhere except if it’s set within the global context, which persists beyond just the current sync execution. Your above `defineScope(() => {})` solves this by definining an exit point that cleans the global context. If you mean the top-level of the file/module, it'd just live as long as that script/module does, which seems to make sense to me.
18:04	<Stephen Belanger>	The risk of things living super long though is also easily mitigated by just emptying the context in some way, such as setting it to undefined when you've decided you're done with it. In a flows-through system, I think you also need to free every cached promise that holds that context? They would strongly hold their resolution context. The engine wouldn’t mutate user’s context automatically, and without a library API to know when the current exeuction is finalized, you’re left guessing when you can mutate the context or drop all promises. Yes, promises suck a little bit in that they would need to hold the reference alive as long as the promise remains alive. This could increase memory pressure, but is also the expected behaviour as any future continuation attached to that promise should restore that context value.
18:14	<Stephen Belanger>	in Stephen's proposal of preserving the resolution context, this would just work I think Yes, `unhandledrejection` is one particular case where the desired context is always the flows-through context. You want to capture the context as it was when the rejection happened, even if it flows through a bunch of intermediate layers. The difference between through and around flows is basically the same conceptually as subject and predicate. The through path gives you the subject of the failure so you can actually see what's going on. Whereas the around path gives you the predicate which is only describing what is being done with that subject, but that's not relevant in the case of a failure as it's what would have been done but will not be because of the failure. This also matches that prior description of around flow being like parameter flow. It describes where execution is going and not so much where it came from, which is a bit counter-intuitive from the flow users have generally expected from AsyncLocalStorage where they want to know where something came from so they can acquired stored information about that originating execution.
18:15	<Andreu Botella>	so if we have both types of async context variables, then the only one that would have a relevant context for unhandledrejection would be the get/set one?
18:16	<Andreu Botella>	no, you still need some value for the other variables
18:16	<Stephen Belanger>	get/set is a separate thing.
18:16	<Andreu Botella>	oh, I might have been conflating both proposals
18:17	<Stephen Belanger>	The get/set thing is just about that you don't actually need the `store.run(...)` if you have the runtime provide scopes.
18:18	<Stephen Belanger>	Well, you still might want to have `run(..)` sometimes, but it becomes less important if the context flow model is clear and correct from the core.
18:18	<Stephen Belanger>	Which is where integration with things like promises comes in.
18:19	<Andreu Botella>	well, I think the idea I had was James Snell's distinction between a run API that flows around and a set API that flows through
18:19	<Andreu Botella>	I might have missed some of the intervening discussion because I was sick
18:19	<Stephen Belanger>	So it's a little weird to me that we're pursuing both defining the correct flow model and still using the `run(...)` scoping when we could just safely modify the current scope if we actually have a clearly defined "current scope" to be modifying.
18:20	<Stephen Belanger>	I may have missed that too. 🤔
18:21	<Andreu Botella>	would sync function boundaries be part of that scope?
18:22	<Stephen Belanger>	Yes, it should be only sync segments of execution.
18:22	<Stephen Belanger>	And then the `run` function would essentially just be: `run(value, scope) { const prev = this.get() this.set(value) try { return scope() } finally { this.set(prev) } }`
18:25	<Stephen Belanger>	I think one of the hurdles we have at the implementor level is that we really want to think about async functions and generators as singular functions and not chains of segments of synchronous execution managed by a scheduler.
18:25	<Andreu Botella>	so IIUC: `asyncVar.set("foo"); await (async () => { asyncVar.set("bar"); })(); console.log(asyncVar.get()); // bar` but `asyncVar.set("foo"); // No await! (async () => { asyncVar.set("bar"); })(); console.log(asyncVar.get()); // foo`
18:25	<Andreu Botella>	wait
18:26	<Stephen Belanger>	Both would actually be bar.
18:26	<Andreu Botella>	do things change if there's some await inside the async function, before the set?\
18:26	<Stephen Belanger>	And the first is bar now because that first segment of the inner async function is sync.
18:27	<Stephen Belanger>	Yes, so currently a huge source of confusion in Node.js is that `store.enterWith(...)` before the first await behaves very differently from `store.enterWith(...)` after the first await.
18:27	<Stephen Belanger>	Because that first segment of an async function is actually executed synchronously rather than lazily.
18:28	<Stephen Belanger>	I think that's more a problem with how async functions are defined than with AsyncLocalStorage though. You can get the same unexpected behaviour just modifying globals in an async function.
18:29	<Stephen Belanger>	I really wish async functions were actually lazy started and would not run anything until they are awaited or a continuation is attached. A bit late to change the past though. 😐️
18:35	<Stephen Belanger>	To distill things down to the essence of context management, all that is actually required to fully implement context flow is to capture three points: When a call schedules an async task the current context needs to be captured. When top-level or any nested continuations of async tasks would begin sync execution the context needs to be restored. And if context is not flowed completely into all paths then the end of sync execution may also be required as the next sync execution to begin may not apply a context swap before it runs if it is not something which context normally flows into. This point is why I think it's generally best to consider all execution as logically descending from the runtime starting and branch contexts off from that accordingly.
18:40	<Stephen Belanger>	The window between a sync execution beginning and either that sync execution ending or the next sync execution beginning is the exact scope for which you can apply a `set` in the `set`/`get` model. You would just need copy-on-write semantics so changes don't impact execution already passed in that sync window. And then the `get` is all you need for async task scheduling to capture the value to be restored in the sync execution of its continuation using `set` again before beginning that execution.
18:43	<Stephen Belanger>	So in theory you actually only need the two points: `set` corresponding to a sync execution starting or a desired change partway through a sync execution to segment out the remainder, and the `get` corresponding to capturing the context when an async task is scheduled or when the user would like to retrieve the value themselves.
18:45	<Andreu Botella>	Yes, so currently a huge source of confusion in Node.js is that `store.enterWith(...)` before the first await behaves very differently from `store.enterWith(...)` after the first await. This wouldn't be an issue with `run`. I suspect you think that removing this footgun makes things so much less convenient that it's not worth it?
18:47	<Stephen Belanger>	It's not an issue, but `run(...)` is even harder to figure out how to use correctly in an async function.
18:47	<Stephen Belanger>	Do you wrap the call to that function? Do you nest its entire body in another async function within a nested run? Either way the DX is pretty bad.
18:48	<Stephen Belanger>	Whereas a simple `store.set(...)` is a lot more straightforward.
18:49	<Stephen Belanger>	And also, importantly, is not yet defined, so we could specify that calling `set(...)` inside an async function is treated as if that async function had a scope around it, if we want to do the around scope and not leak out even if calling it before that first await.
18:49	<Stephen Belanger>	So the semantics of the initial sync segment would not leak at all.
18:50	<Andreu Botella>	that might work
18:50	<Stephen Belanger>	We have the power to make it a lot more clear.
18:50	<Andreu Botella>	I guess it could come down to implementation complexity, because now it would no longer be a simple pointer swap
18:50	<Stephen Belanger>	The `enterWith(...)` only leaks because we don't have any mechanism of doing that auto-scoping.
18:51	<Stephen Belanger>	It's just userland being less powerful.
18:52	<Stephen Belanger>	Well, it could still be a pointer swap, just whenever you call an async function you could just add an additional scope around that initial segment to separate out any context values set there.
18:53	<Stephen Belanger>	So it'd create a new scope when starting the async function, then pointer-swap back after it returns its promise.
18:54	<Stephen Belanger>	And that first await would have linked its context to whatever the context was inside that async function.
18:58	<Stephen Belanger>	The neat thing though, which is partly what I'm trying to define in that context management RFC I'm working on myself which I shared recently, is that if you treat async functions as just a chain of sync segments, you can apply different types of propagation semantics to each segment. The around scope could link each segment directly, flowing context from the point an await starts to the point where it ends, giving you that lexical scope style some want for those things like signal-sharing for cancellation. And the through scope would simply include the nested segment sequences in the flow chain, so context would flow out through the branches and come back out the await. You can use the same scope definitions for both systems though.
22:58	<Chengzhong Wu>	I think the flow-through pattern doesn't answer the similar problem of https://github.com/tc39/proposal-async-context/issues/90. The originating context could be a stack of contexts and use inner-most context would discard all outer contexts. Each `await` creates resolution handler on a potentially rejected promise, use the inner-most context would lose the context when the rejection was handled multiple times. `let aGlobalPromise = asyncVar.run('global', () => { return Promise.reject() }) async function someAsyncApi() { await asyncVar.run('async-inner', async () => { try { await aGlobalPromise } catch (e) { throw e } }) } asyncVar.run("foo", async () => { await someAsyncApi() }); asyncVar.run("bar", async () => { await someAsyncApi() }); window.addEventListener("unhandledrejection", () => { console.log(asyncVar.get()); // 'foo' or 'bar' or 'global' or 'async-inner'? });`
23:35	<Chengzhong Wu>	A single promise could be branched multiple times, and `unhandledrejection` events are dispatched for each "unhandled" promise, rather than a single source of rejection