TC39 Structs and Shared Structs on 2024-07-02

16:24	<rbuckton>	FYI, I have someone coming by today to repair some siding that came loose during a thunderstorm and it looks like they were delayed and will be here around 1pm EST (the start of the meeting today), so I may be delayed by a few minutes or interrupted.
17:55	<rbuckton>	iain: Regarding your comment that the intuition is that "`async` affects the type", I disagree. Both `async` (and `*`) imply a syntactic transformation in the function body. Non-`async` code can still return a `Promise`, and non-`*` code can still return a generator.
17:58	<iain>	(I think you tagged the wrong Iain)
17:59	<iain>	I don't think that tracks. If I add `async` or `*` to a function definition, it returns a different sort of thing, in a way that my caller needs to know about. If I add `unsafe`, it doesn't change anything from the caller's perspective.
18:02	<iain>	Functions don't have typed signatures in raw JS, but if you return a promise or a generator from a function then that is reflected in the implicit return type. The same is not true for `unsafe`.
18:03	<rbuckton>	I did, oops.
18:03	<iain>	The distinction is whether the property is important to the caller, and since we're explicitly avoiding function colouring, I claim that `unsafe` is only relevant to the code inside the function.
18:05	<iain>	My anecdotal evidence that this is potentially confusing is that I was personally confused by this while reading the explainer.
18:05	<rbuckton>	We tried to make this distinction in TypeScript fairly clear. While you can write `async function f(): Promise<void> { ... }` in your code, the output declaration is `declare function f(): Promise<void>;`, as `async` only performs a syntactic transformation. It does certainly inform the return type, but it is not part of the function signature from a type checking perspective.
18:05	<iain>	(Although it is certainly plausible that I am too easily confused!)
18:07	<rbuckton>	Decorators will further complicate that mental model, though, as a decorator could affect the return type of a function as well. At one point (after we had already shipped `async`/`await`), there were comments that we could have just used generators and `@async function* f() { ... }`.
18:08	<rbuckton>	Yes, `async` and `*` do imply a specific return type, but that is purely a result of the syntactic transformation. In the same way, `accessor` is also a syntactic transformation.
18:09	<rbuckton>	It could even be argued that `static` is a syntactic transformation insomuch as it applies to where a method or field is placed on a class. All of these potentially affect the type, but the type produced is purely a result of the transformation itself.
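A minimal sketch of the point both sides accept here: the keyword determines what the function returns, but a caller cannot tell whether `async` or `*` was written on the callee. All function names below are made up for illustration.

```javascript
// All function names here are hypothetical, chosen only for this sketch.
async function viaAsync() {
  return 1;
}
function viaPlain() {
  // No `async` keyword, but the caller still receives a Promise.
  return Promise.resolve(1);
}

function* viaStar() {
  yield 1;
  yield 2;
}
function viaPlainIterator() {
  // No `*`, but the caller still receives a generator object.
  return viaStar();
}

const isPromise = (v) => v instanceof Promise;
console.log(isPromise(viaAsync()), isPromise(viaPlain())); // true true

// Both results iterate the same way from the caller's side.
console.log([...viaStar()].join(), [...viaPlainIterator()].join()); // 1,2 1,2
```

In both pairs the caller has to handle a Promise or an iterator, which is the sense in which the property matters to callers regardless of how the body was written.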
18:13	<rbuckton>	I'd also like to point out that `unsafe`, as I've proposed, is generally consistent with Rust as prior art. Rust allows `unsafe {}`, but also `unsafe fn`, `unsafe trait`, and `unsafe impl`: "By default, `unsafe fn` also acts like an `unsafe {}` block around the code inside the function. This means it is not just a signal to the caller, but also promises that the preconditions for the operations inside the function are upheld. Mixing these two meanings can be confusing, so the `unsafe_op_in_unsafe_fn` lint can be enabled to warn against that and require explicit unsafe blocks even inside `unsafe fn`." In Rust, disallowing `unsafe fn` in favor of a nested unsafe block is specified as a lint rule.
18:14	<rbuckton>	https://doc.rust-lang.org/std/keyword.unsafe.html
18:15	<rbuckton>	Or am I misinterpreting? Does Rust require an `unsafe` block around an unsafe function call?
18:15	<iain>	In general, I would say that a function signature (broadly waving at all the parts of a function declaration outside the body) provides information that is important to the caller. This is especially true in statically typed languages, but even in JS I think it holds. By putting `unsafe` in such a prominent location, we imply that it is similarly important to the caller, which is not the case here.
18:15	<iain>	You are misinterpreting: Rust requires an unsafe block around calls to unsafe functions.
18:15	<rbuckton>	Ah, thanks. My mistake.
18:15	<iain>	That's a big part of why I misread your explainer.
18:17	<iain>	The purpose of the Rust lint is to encourage code to be precise about which parts of a function body are unsafe, even if the entire function must be called in an unsafe context.
18:17	<rbuckton>	An alternative to `unsafe function f() {}` that I'd also put on the explainer might be `function f() unsafe { }`. My concern is that this isn't obvious that it also affects the parameter list. Then again `function f() { "use strict"; }` affects the parameter list as well.
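A small sketch of the `"use strict"` precedent mentioned above: a directive in the body changes how the parameter list is checked. The `Function` constructor is used so the strictness of each function depends only on its own body, whether this runs as a script or a module. The last case is also relevant to the default-parameter examples in this discussion: a `"use strict"` directive is rejected outright when the parameter list is not simple.

```javascript
// Sloppy body: duplicate parameter names are tolerated, the last one wins.
const sloppy = new Function("a", "a", "return a;");
console.log(sloppy(1, 2)); // 2

// Strict body: the directive reaches back into the parameter list,
// so the same duplicate names are now an early error.
let duplicateError = null;
try {
  new Function("a", "a", '"use strict"; return a;');
} catch (e) {
  duplicateError = e;
}
console.log(duplicateError instanceof SyntaxError); // true

// A "use strict" directive is a SyntaxError when the parameter list has
// a default value (a non-simple parameter list).
let nonSimpleError = null;
try {
  new Function("a = 1", '"use strict"; return a;');
} catch (e) {
  nonSimpleError = e;
}
console.log(nonSimpleError instanceof SyntaxError); // true
```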
18:19	<iain>	The equivalent in JS of the Rust lint would be to have function colouring (where `unsafe function foo()` can only be called from inside an unsafe block) and also require explicit unsafe blocks inside the body of the function, which is the opposite of what you are proposing.
18:21	<rbuckton>	function coloring is a major DX pain. I see it as a necessity for `async` and `*` given that the syntactic transformations affect the return type, but it's not a practice I'm fond of continuing with new syntax if it isn't warranted.
18:22	<rbuckton>	So, `class C unsafe { }` to make a class body unsafe, or `shared struct S unsafe { }` to make a shared struct body unsafe. We probably wouldn't do `unsafe const`/`unsafe let` in that case because it would be mixing up suffix vs. prefix, so we would need to depend on an unsafe IIFE or `unsafe do`. I suppose it would be `do unsafe {}`, to maintain the suffix position.
18:26	<iain>	Function colouring in this case allows for the more nuanced expression of safety invariants. So for example you could have `function foo() { unsafe {...} }` and `unsafe function foo_AlreadyHoldingLock() {...}`, in which case `unsafe function` does not do a syntactic transformation, but it does impose restrictions on the callers to maintain invariants.
18:27	<iain>	I'm not convinced we want that, and I think adding it might impose a small performance overhead on unrelated code, but it's a point in design space.
18:27	<rbuckton>	There is one thing about function coloring an `unsafe` function has over `async`/`await` that makes it somewhat more palatable, which is that you can introduce an `unsafe {}` block in safe code to perform the operation. That almost makes me want to have both `unsafe function` ("it is unsafe to call me and my contents are unsafe") and `function () unsafe { }` ("my contents are unsafe, but it is safe to call me as I have done my due diligence to ensure I am safe at the boundaries"), mostly because I really am not a fan of the namespace nesting style often seen in C++, i.e. `function f() { unsafe { ... } }`
18:28	<iain>	Yeah, given my previous experience in Rust, that's what I thought you were proposing initially. The problem is that then every call that is not in an unsafe context is responsible for checking that the callee is not an unsafe function, which potentially slows down polymorphic code.
18:28	<rbuckton>	Where `... unsafe { }` is just syntactic sugar for `... { unsafe { } }`
18:29	<iain>	(Although there's a chance that we could fold it into checks that we already have to do to ensure that you don't call a derived constructor without `new`)
18:31	<rbuckton>	Could that slow down be handled via a function stub, such that "safe" code has no overhead (if it calls the stub, the stub throws), while "unsafe" code has overhead as it must check for the stub to step over it, or to pass the stub a flag indicating safety?
18:33	<rbuckton>	We already expect "unsafe" code will have some additional complexity even without the notion of an `unsafe {}` block, purely because reads and writes potentially require agent coordination
18:35	<iain>	At a hardware level there isn't really any way to pass a flag that doesn't require the safe caller to do at least a little bit of work to not pass it
18:37	<iain>	(That's maybe not true if you imagine that we have some sort of global "are we in an unsafe block" flag that gets cleared when unsafe code calls into safe code and reset when we return, but keeping that flag set correctly seems potentially complicated.)
18:38	<rbuckton>	So "safe code just calls the function" as normal (which throws for the stub), and "unsafe code first checks if the function is an unsafe function stub and then calls the underlying function" isn't an option?
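The stub idea can be simulated in today's JavaScript to make the cost split concrete. This is only a sketch under assumptions: `makeUnsafeFunction`, `unsafeCall`, and the `WeakMap` registry are hypothetical names invented here, not proposed API, and a real engine would do this below the language level.

```javascript
// Hypothetical simulation; none of these names are proposed API.
const underlying = new WeakMap(); // stub -> real implementation

// What declaring `unsafe function` would produce in this sketch:
// the visible function value is a stub that always throws.
function makeUnsafeFunction(impl) {
  const stub = function () {
    throw new TypeError("unsafe function called from safe code");
  };
  underlying.set(stub, impl);
  return stub;
}

// What a call site inside `unsafe {}` would do in this sketch: check for a
// stub and step over it. Only unsafe call sites pay for the lookup.
function unsafeCall(fn, thisArg, ...args) {
  const impl = underlying.get(fn) ?? fn;
  return Reflect.apply(impl, thisArg, args);
}

const readValue = makeUnsafeFunction((box) => box.value);

// "Unsafe" call site: steps over the stub.
console.log(unsafeCall(readValue, undefined, { value: 7 })); // 7

// Ordinary functions work unchanged through the same path.
console.log(unsafeCall(Math.max, undefined, 1, 2)); // 2

// "Safe" call site: an ordinary call with no extra check, which hits the stub.
try {
  readValue({ value: 7 });
} catch (e) {
  console.log(e instanceof TypeError); // true
}
```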
18:38	<iain>	The overall performance cost here is pretty small
18:39	<rbuckton>	I'll admit, I'm primarily coming at this from the spec perspective, and not the perspective of an implementer or optimizing compiler.
18:40	<iain>	Yeah, I guess I can see some ways of making that work.
18:41	<iain>	Although they end up adding a fair bit of complexity to some already very complicated code
18:41	<rbuckton>	But I wouldn't expect a global flag is necessary given that `unsafe {}` is purely syntactic and could be used to drive transformations or optimizations based on its presence in the parse tree.
18:44	<iain>	Taking a step back: this can all be implemented, and with sufficient elbow grease the overhead could be minimized. The question is whether coloured functions provide enough value to justify engines spending their limited elbows on this instead of the million other things we could be implementing / optimizing.
18:51	<shu>	i don't think function coloring is problematic from engines' perspectives, but it is pretty bad for usability, especially since we already have async/non-async
18:55	<iain>	Actually, now that I'm thinking through the implementation, even normal `unsafe` blocks are at least a little annoying to implement, because it means that every GetProperty needs to know its location in the source. Or else you use a global flag, and clear it around calls?
18:56	<shu>	i was actually imagining something even dumber, like outputting different bytecode
18:57	<shu>	since it's lexical
18:57	<iain>	Oh, yeah, maybe that works too
18:57	<littledan>	could still be slightly annoying maybe to maintain two types of property access, with their ICs and such
18:58	<rbuckton>	My biggest concern was `unsafe` having `async`/`await`-like poisoning effects. Introducing `async` to a sync function normally poisons its callers if they must maintain sequential execution. Given that you can nest an `unsafe {}` block in safe code, the concern is lessened somewhat. In the call I said that an `unsafe function` doesn't perform any implicit synchronization or coordination, so it's up to the author to implement any necessary coordination, including none at all. The "none at all" coordination was meant as a way for you to decompose an `unsafe` function into multiple `unsafe` functions without having to guard against "safe" code invoking them unintentionally by leveraging scoping. Function coloring at this level isn't quite as bad as I'd feared, and has the benefit of pushing the user to implement safety in a function not marked `unsafe`.
18:58	<shu>	> <littledan> could still be slightly annoying maybe to maintain two types of property access, with their ICs and such
18:58	<shu>	V8 bytecodes at least can have immediate arguments. it could be a Get with an "in-unsafe-block" bit
18:59	<shu>	like, the same way "should throw" flags are threaded through for strict code
19:00	<iain>	SM has SetProp/StrictSetProp and so on
19:00	<iain>	Although most of the code is shared
19:00	<shu>	yeah, same
19:00	<iain>	It ends up being similar in practice
19:00	<shu>	same to "most of the code is shared"
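A toy sketch of the "different bytecode, same handler" idea discussed above. Everything here is invented for illustration (the AST shape, the `GetProp` opcode name, the `unsafeBit` field, and the `SHARED` marker) and does not reflect how V8 or SpiderMonkey are actually implemented. It only shows that a lexical property can be resolved at compile time into an immediate bit on the instruction, with no runtime flag to maintain.

```javascript
// Hypothetical toy compiler and interpreter; all names are assumptions.
const SHARED = Symbol("stand-in for a shared struct");

// Lower a tiny AST to "bytecode". Whether a property read sits lexically
// inside an unsafe block is known while compiling, so it becomes an
// immediate bit on the instruction rather than runtime state.
function compile(node, inUnsafe = false) {
  switch (node.type) {
    case "seq":
      return node.body.flatMap((child) => compile(child, inUnsafe));
    case "unsafe":
      return node.body.flatMap((child) => compile(child, true));
    case "get":
      return [{ op: "GetProp", key: node.key, unsafeBit: inUnsafe }];
    default:
      throw new Error(`unknown node type: ${node.type}`);
  }
}

// One shared GetProp handler; the bit only decides whether to throw.
function run(code, target) {
  const values = [];
  for (const ins of code) {
    if (target[SHARED] && !ins.unsafeBit) {
      throw new TypeError(`read of "${ins.key}" outside an unsafe block`);
    }
    values.push(target[ins.key]);
  }
  return values;
}

const struct = { [SHARED]: true, x: 1 };
const safeRead = compile({ type: "get", key: "x" });
const unsafeRead = compile({ type: "unsafe", body: [{ type: "get", key: "x" }] });

console.log(run(unsafeRead, struct)[0]); // 1
try {
  run(safeRead, struct);
} catch (e) {
  console.log(e instanceof TypeError); // true
}
```

As with the strict and sloppy set operations mentioned above, the two instruction variants share nearly all of their handler code.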
19:01	<rbuckton>	In other words, this: `unsafe function readMessage(lck, workArea) { ... } unsafe function writeMessage(workArea, message) { ... } unsafe function processMessage(message) { ... } function processWorkArea(mut, workArea) unsafe { using lck = new UniqueLock(mut); let message; while (message = readMessage(lck, workArea)) { const result = processMessage(message); writeMessage(workArea, result); } }` doesn't seem quite so bad to me (though I still prefer `function() unsafe { }` to `function() { unsafe { } }`)
19:04	<rbuckton>	It has the upside of preventing users from inadvertently invoking unsafe code from safe code and allows you to declare your function as not only containing unsafe code, but also indicating that it doesn't internally perform any coordination.
19:08	<rbuckton>	In C#, `unsafe` can apply to a function/method, but does not affect callers: https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/unsafe
19:09	<rbuckton>	Though `unsafe` in C# is primarily around direct access to pointers (which Rust also shares).
19:10	<iain>	For me the uncertainty about the value of function colouring implies strongly that we should leave out `unsafe function` syntax for now. In the future we will have much more user experience to help determine what that syntax should mean.
19:14	<rbuckton>	I'd really like to be able to write conventional JS with shared structs when I know I already have exclusive access to an object. If we only have `unsafe {}`, I can't write this `unsafe function doWork(task, timeout = task.timeout ?? 1000) { ... }` and instead must write this `function doWork(task, timeout) { unsafe { timeout ??= task.timeout ?? 1000; ... } }`
19:16	<rbuckton>	However, I do see the potential value of safe code erroring if you invoke `doWork`, since `doWork` here doesn't implement a coordination mechanism as it's intended to be used from another function that does. Instead, I must indicate it by convention, i.e. `function doWorkUnsafe()` to draw attention to its use.
19:21	<rbuckton>	Let me be clear on my position though. If we must have `unsafe`, but can only have `unsafe {}` for now, I'm fine with that. I do think the lack of an `unsafe` marker for functions and class/struct bodies is a major DX wart that will very likely need to be addressed at some point, function coloring or not. I just don't want to go down a road of allowing `import`/`export` inside of an `unsafe` block as it would likely be a long term aesthetic wart on the language after we introduce an `unsafe` marker in other contexts.
19:23	<rbuckton>	We could consider an alternative to make `import`/`export` work, by declaring the entire module as `unsafe` via something like `unsafe module;` (or some other incantation) at the top level.
19:25	<rbuckton>	Or use module blocks, e.g.: `unsafe module M { export shared struct AtomicValue { ... } } export * from M;`
19:26	<rbuckton>	(though that still uses `unsafe` as a prefix)
19:32	<rbuckton>	I'd also be fine with postfix-`unsafe` markers for declarations (`function f() unsafe { }`) and potentially allowing `unsafe` in both positions (`unsafe function f() {}` and `function f() unsafe {}`, though `unsafe function f() unsafe {}` is redundant). I'd also be fine with `unsafe` markers for parameters much like I suggested for variable and field initializers in a world where we either can't have postfix-`unsafe` or if postfix-`unsafe` can't include parameters, e.g.: `function doWork(task, unsafe timeout = task.timeout ?? 1000) { unsafe { ... } }`
19:35	<rbuckton>	IMO, only having `unsafe {}` is not ideal, though `do unsafe {}` would make that somewhat more bearable, e.g.: `function doWork(task, timeout = do unsafe { task.timeout } ?? 1000) { unsafe { } }` But for that we would need `do {}` to advance.
19:36	<rbuckton>	Or we would have to advance `unsafe {}` as an expression as well, which would be confusing if we do end up advancing `do`.
22:08	<littledan>	It would be great if someone brought do expressions back to committee. My understanding is that bakkot is leaving that for others to champion. (Maybe there is some remaining controversy but I don’t know what it is)
22:29	<iain>	It looks like we decided in March 2021 that we were going to do some sort of user study. Did anything ever come of that?
23:24	<Mathieu Hofman>	> <rbuckton> An alternative to `unsafe function f() {}` that I'd also put on the explainer might be `function f() unsafe { }`. My concern is that this isn't obvious that it also affects the parameter list. Then again `function f() { "use strict"; }` affects the parameter list as well.
23:24	<Mathieu Hofman>	I am honestly suspicious of any code that attempts to do anything with a shared struct passed in arguments without first satisfying whatever synchronization mechanism is appropriate to access that shared struct. As such I suspect that only allowing unsafe blocks is actually a benefit as it would force authors to consider whether they've first satisfied the synchronization responsibility they're supposed to take on, and which seem hard to satisfy within the parameters list alone.
23:35	<Mathieu Hofman>	> <rbuckton> I'd really like to be able to write conventional JS with shared structs when I know I already have exclusive access to an object. If we only have `unsafe {}`, I can't write this `unsafe function doWork(task, timeout = task.timeout ?? 1000) { ... }` and instead must write this `function doWork(task, timeout) { unsafe { timeout ??= task.timeout ?? 1000; ... } }`
23:35	<Mathieu Hofman>	Can't you define your `doWork` function inside an unsafe block instead?
23:35	<rbuckton>	In an earlier example I showed how you might decompose a series of `unsafe` operations into multiple functions, where only the entrypoint function would perform any coordination, i.e.: `unsafe function readMessage(...) { ... } unsafe function writeMessage(...) { ... } unsafe function processMessage(...) { ... } function processWorkArea(workArea) { unsafe { /* performs locking */ /* calls readMessage/writeMessage/processMessage */ } }` If we have `unsafe function`, we can enforce that safe code cannot invoke an `unsafe` function directly, or inadvertently. If we do not have `unsafe function` and only have `unsafe {}`, then we cannot perform such enforcement and there is no clear delineation between a safe entrypoint and unsafe code: `function readMessage(...) { unsafe { ... } } function writeMessage(...) { unsafe { ... } } function processMessage(...) { unsafe { ... } } function processWorkArea(workArea) { unsafe { /* performs locking */ /* calls readMessage/writeMessage/processMessage */ } }` Here, `readMessage` will not perform any independent coordination or locking as it expects to be called by `processWorkArea`, which is the function that would actually perform locking. A user could inadvertently invoke `readMessage` from "safe" code, resulting in a data race. The only way to enforce this is by convention, thus you would instead want to write this as: `function readMessageUnsafe(...) { unsafe { ... } } function writeMessageUnsafe(...) { unsafe { ... } } function processMessageUnsafe(...) { unsafe { ... } } function processWorkArea(workArea) { unsafe { /* performs locking */ /* calls readMessageUnsafe/writeMessageUnsafe/processMessageUnsafe */ } }`
23:37	<rbuckton>	> <Mathieu Hofman> Can't you define your `doWork` function inside an unsafe block instead?
23:37	<rbuckton>	It's not quite so easy if I want to make `doWork` available to code outside of the block: `let doWork; unsafe { doWork = function() { ... }; }` This would be a regular frustration developers would encounter, both here and with `import`/`export`, or shared struct bodies, etc.
23:38	<rbuckton>	Blocks are best for localizing the transition from safe to unsafe. They're terrible for encapsulating declarations since you generally want at least one declaration to escape the block to be actually usable.
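The escape pattern described above can be seen with an ordinary block today. This sketch uses a plain `{ }` as a stand-in for the proposed `unsafe { }` block, on the assumption (argued for later in the log) that it would be a lexical scope like any other block; `defaultTimeout` is a made-up helper.

```javascript
// A plain block stands in for the proposed `unsafe { }` block.
let doWork; // must be declared outside so it can escape the block
{
  const defaultTimeout = 1000; // block-scoped: invisible outside the braces
  doWork = function (task) {
    return task.timeout ?? defaultTimeout;
  };
}

console.log(doWork({})); // 1000
console.log(doWork({ timeout: 5 })); // 5
console.log(typeof defaultTimeout); // undefined
```

The declaration has to be split into an outer `let` and an inner assignment, which is the friction being described.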
23:38	<Mathieu Hofman>	I do find interesting the proposition that the user could define unsafe functions that, like shared struct fields, need to be called from an unsafe context. As mentioned, that seems to suggest we could for now reserve that space in the syntax for later.
23:39	<iain>	Note that we could also simply allow `unsafe { let doWork = ...; }`
23:40	<iain>	An unsafe block doesn't have to be a separate lexical scope of its own
23:41	<shu>	i would strongly prefer that something that looks like `{ }` be its own lexical scope
23:41	<rbuckton>	lexical scoping should never escape a `{}`, that would be a terrible precedent.
23:41	<shu>	that is a pretty deep affordance
23:41	<shu>	yeah
23:41	<rbuckton>	We don't even let class decorators access lexically scoped private names since they're outside of the class body
23:41	<iain>	I point to the parallel of namespace blocks in C++, where indenting them like: `unsafe { let doWork = ... }` makes it less confusing.
23:42	<shu>	> <Mathieu Hofman> I do find interesting the proposition that the user could define unsafe functions that, like shared struct fields, need to be called from an unsafe context. As mentioned, that seems to suggest we could for now reserve that space in the syntax for later.
23:42	<shu>	runtime enforcement of colored functions like that is probably a no-go
23:42	<shu>	> <iain> I point to the parallel of namespace blocks in C++, where indenting them like: `unsafe { let doWork = ... }` makes it less confusing.
23:42	<shu>	nty :)
23:43	<rbuckton>	I maintain that C++ `namespace`-like indentation is a terrible aesthetic that we should not go out of our way to replicate.
23:43	<shu>	there is the worse-is-worse alternative of `"use unsafe"` which doesn't imply anything about scoping
23:44	<shu>	however, i find directives bad precisely because of that
23:44	<Mathieu Hofman>	All this now makes me realize something. What is the compatibility story of shared structs (and I suppose unsafe functions in the future) with Proxy? I don't think that we should prevent constructing a proxy with such a target, but I also assume a proxy trap implementation wouldn't be exempted from unsafe checks when accessing the target, even if the trap was triggered from an unsafe block. Is the only option that proxy traps be updated to become unsafe themselves? Is there a way to dynamically test whether an object has an unsafe color?
23:44	<shu>	there is no function coloring
23:44	<shu>	proxies just work?
23:45	<rbuckton>	No, they wouldn't.
23:45	<shu>	why wouldn't proxies just work?
23:45	<iain>	You need to have an unsafe block inside the proxy trap, don't you?
23:45	<rbuckton>	They would work as long as you don't have a proxy trap for `get` or `set`
23:46	<rbuckton>	But I don't imagine that `unsafe` magically carries through to proxies via the `get` and `set` traps.
23:46	<shu>	sorry, that's what i mean. proxies "just compose", unless there's interposed user code like a trap
23:46	<Mathieu Hofman>	Also would the Reflect intrinsics be "forwarding" the unsafe environment? Aka throw if not called from an unsafe block when bottoming out in accessing an unsafe receiver?
23:46	<shu>	in which case, exactly as ron says, they'd need their own `unsafe { }` marker
23:46	<shu>	it works exactly like strict mode throwing
23:47	<rbuckton>	If you have a shared struct `s` and you need an `unsafe` block to read `s.x`, then `new Proxy(s, { get(target, key, receiver) { return Reflect.get(target, key, receiver); } }).x` would throw because neither the `get` trap nor `Reflect.get` can read/write the struct's fields.
23:49	<rbuckton>	e.g., we might need a `Reflect.unsafeGet` and a `{ unsafeGet }` trap, or we'd need to be able to pass `unsafe` as a flag to the trap/Reflect.get
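A sketch of why a lexical `unsafe` would not carry through a trap. Here `s` is an ordinary object standing in for a shared struct, so nothing throws today; the trace only shows where the field read actually happens. Under the design being sketched in this discussion, that location is inside the trap and `Reflect.get`, not at the `p.x` call site.

```javascript
// `s` is a plain object used as a stand-in for a shared struct.
const s = { x: 1 };
const trace = [];

const p = new Proxy(s, {
  get(target, key, receiver) {
    trace.push(`get trap ran for ${String(key)}`);
    // The read of the target's field happens here, in the trap's own
    // lexical context, regardless of where `p.x` was written.
    return Reflect.get(target, key, receiver);
  },
});

// Imagine this line sitting inside `unsafe { }`: the block would cover the
// `p.x` expression, but not the body of the trap above.
const value = p.x;

console.log(value); // 1
console.log(trace.length); // 1
```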
23:50	<rbuckton>	Would you want `Reflect.get(s, "x")` to work outside of an `unsafe` context?
23:51	<Mathieu Hofman>	> <shu> runtime enforcement of colored functions like that is probably a no-go
23:51	<Mathieu Hofman>	How is calling different from field access? Doesn't the receiver need to perform some check in both cases?
23:51	<shu>	i feel like it really shouldn't?
23:51	<rbuckton>	`"use strict"` applies mostly to `set`, and informs how to react to the boolean return value of `Reflect.set()` or the `set` trap. It doesn't impact the `get` trap at all.
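The strict-mode behaviour described above can be checked directly. In this sketch the `Function` constructor is used so each function's strictness depends only on its own body; the proxy and function names are made up for illustration.

```javascript
// A proxy whose `set` trap always reports failure.
const rejecting = new Proxy({}, {
  set() {
    return false;
  },
  get(target, key) {
    return `got ${String(key)}`;
  },
});

const sloppySet = new Function("o", "o.x = 1; return 'no throw';");
const strictSet = new Function("o", '"use strict"; o.x = 1; return "no throw";');
const strictGet = new Function("o", '"use strict"; return o.x;');

// Reflect.set just surfaces the boolean result.
console.log(Reflect.set(rejecting, "x", 1)); // false

// Sloppy code ignores the false result.
console.log(sloppySet(rejecting)); // no throw

// Strict code turns the same false result into a TypeError.
try {
  strictSet(rejecting);
} catch (e) {
  console.log(e instanceof TypeError); // true
}

// Strictness has no effect on the `get` path.
console.log(strictGet(rejecting)); // got x
```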
23:52	<shu>	> <Mathieu Hofman> How is calling different from field access? Doesn't the receiver need to perform some check in both cases?
23:52	<shu>	it's different in that Ron's sketch is completely lexical, so all property access lexically contained within `unsafe { }` can generate a different bytecode at parse time. there is no propagation from frame to frame
23:52	<rbuckton>	We won't need an `unsafe` block to use `Atomics.load(s, "x")`, since that already has implications around memory order. I'm not sure where I stand on whether `Reflect.get` observes `unsafe`
23:53	<rbuckton>	My design sketch is more loosely based on C#'s interpretation of `unsafe` than Rust's in that C# doesn't require `unsafe` functions be invoked from within an `unsafe` block, while Rust does.
23:54	<shu>	> <rbuckton> We won't need an `unsafe` block to use `Atomics.load(s, "x")`, since that already has implications around memory order. I'm not sure where I stand on whether `Reflect.get` observes `unsafe`
23:54	<shu>	i'm not sure mark would agree to that, actually. while it's true `Atomics.load` can't exhibit a data race, it can still exhibit races. so if mark's desired guarantee is "no non-deterministic races arising from shared memory at all", then it should also require `unsafe`. otherwise it can be outside of `unsafe`
23:54	<iain>	While I'm agnostic about the value of function colouring, I don't see why you can't generate different bytecode for calls in the same way you do for property access.
23:55	<iain>	It is definitely unfortunate that it would require calls to perform an extra check in safe contexts (aka normal code that isn't touching any of this stuff), but it seems technically feasible to enforce.
23:55	<shu>	> <iain> While I'm agnostic about the value of function colouring, I don't see why you can't generate different bytecode for calls in the same way you do for property access.
23:55	<shu>	i guess i don't know how the unsafe propagation works. if i have `unsafe { safeFunction(); } function safeFunction() { unsafeFunction(); } unsafe function unsafeFunction() { ... }`, does that work or does that throw?
23:55	<Mathieu Hofman>	> <shu> it's different in that Ron's sketch is completely lexical, so all property access lexically contained within `unsafe { }` can generate a different bytecode at parse time. there is no propagation from frame to frame
23:55	<Mathieu Hofman>	Can't you generate a different byte code for unsafeCall? An unsafe function would throw on regular call. A safe function would accept both call and unsafeCall
23:56	<iain>	That throws for the same reason as anything else
23:56	<shu>	okay, then yes, we can also generate a different bytecode
23:56	<shu>	and then it comes down to do we really want another function color
23:57	<rbuckton>	> <shu> i'm not sure mark would agree to that, actually. while it's true `Atomics.load` can't exhibit a data race, it can still exhibit races. so if mark's desired guarantee is "no non-deterministic races arising from shared memory at all", then it should also require `unsafe`. otherwise it can be outside of `unsafe`
23:57	<rbuckton>	I don't see a way to have `Atomics.load` be aware of `unsafe` unless we start treating it like we do direct vs. indirect eval? Otherwise we essentially would have function coloring, but only for `Atomics` methods and only when they receive a `shared struct` argument.
23:58	<shu>	good point. for Atomics.load to require `unsafe` would require an `UnsafeCall` internal bytecode as we've been discussing
23:58	<rbuckton>	So would it be better to special case function coloring purely for the `Atomics` methods, or just make it a more general mechanism?
23:58	<shu>	but that'll be an implementation detail, and is orthogonal to whether we expose that coloring to user code
23:59	<iain>	I don't see any backwards-compatible way to make Atomics methods usefully unsafe
23:59	<shu>	well, Atomics currently don't work on field names, only TAs and indices
23:59	<shu>	that will remain usable everywhere
23:59	<shu>	and there will be magic to make the new forms throw outside of `unsafe`
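For reference, this is the existing typed-array-and-index form of `Atomics` that the last messages say would remain usable everywhere. The field-name form (`Atomics.load(s, "x")`) discussed above is part of the proposal and is not shown because it does not run today.

```javascript
// Existing Atomics API: operates on an integer typed array and an index.
const ta = new Int32Array(new SharedArrayBuffer(8));

Atomics.store(ta, 0, 42);
console.log(Atomics.load(ta, 0)); // 42

// Atomics.add returns the value that was there before the addition.
console.log(Atomics.add(ta, 0, 1)); // 42
console.log(Atomics.load(ta, 0)); // 43
```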