TC39 Structs and Shared Structs on 2024-10-11

00:13	<shu>	"WASM integration: I doubt an FFI style approach and wasm having to go through JS would be acceptable for every shared field access?" privacy is a source-language level concept, not a target language level concept. how JS structs are reflected in wasm is just an independent discussion IMO.
00:14	<shu>	"shared struct extend: the semantics of privates in JS is lexical. I think we may need a notion of protected here. Or at least a way to extract and represent the power to access private fields (which could be used to solve the wasm issue as well)" yes, the existing semantics of #-names binds us in a way. i don't have a good answer
00:14	<shu>	like wasmgc structs don't have names, just offsets, right?
00:14	<shu>	privacy is just not a question there
00:30	<rbuckton>	"for unsafe {}, mark is also happy with 'all shared struct fields always private', then we don't need unsafe {}, because the idea is the public API must be written by the developer, which should be threadsafe. the implementation constraint for me with that world is that we must be able to compile #-names in shared structs just as normal property slots. because of the '#-names have different identities depending on the evaluation of the class declaration' thing, they are wildly inefficient in engines. this problem might not exist at all if we have the restriction that struct decls are top-level only and thus can't be reevaluated anyways. so that sounds like a promising thing for us to explore" If struct fields are always private, how would you even use them with Atomics? I'm not sure I'm in favor of that direction, for more reasons than that.
00:58	<shu>	"If struct fields are always private, how would you even use them with Atomics? I'm not sure I'm in favor of that direction, for more reasons than that." you expose methods that do the atomics thing
00:58	<shu>	the "always private" direction has a strong dependency on per-realm prototypes with methods
00:59	<shu>	oh you mean, like, you can't reify #-names so how do you even call Atomics.load
00:59	<shu>	i misunderstood
00:59	<shu>	yeah idk
00:59	<shu>	we're gonna need to come up with something there
01:00	<shu>	i am of course more than happy with a direction with neither unsafe{} nor always private, but i think agoric still can't live with that
01:03	<shu>	the atomics thing is one to be solved regardless of always-private, i suppose, because rbuckton you wanted to be able to use private state eventually anyway
01:04	<rbuckton>	"you expose methods that do the atomics thing" This doesn't make sense. Atomics requires a field name, but you cannot reference a private field
01:04	<shu>	i know, i misunderstood the question
01:06	<rbuckton>	I agree, and I already have a solution for this, but I also am not partial to private fields only as a direction. I could have multiple structs representing a complex data structure that are fully guarded, but I would always have to access their fields indirectly through accessors? IMO, that is not a great solution.
01:07	<shu>	i think if you have a bunch of friend classes, i agree friend classes ought to be able to see each other somehow, but i don't know how complex we wanna make it
01:07	<shu>	but we still need to thread the needle between a guardrail that Agoric requires for this to progress
01:08	<shu>	that's the most viable option i see, if choosing between this and no guardrails at all
01:08	<shu>	are there other ideas?
01:08	<shu>	(assuming unsafe{} is dead in the water, given how much opposition we heard during plenary)
01:10	<rbuckton>	We're making this so much more complicated than it needs to be
01:10	<shu>	i agree personally
01:10	<rbuckton>	What opposition, aside from Waldemar's?
01:10	<shu>	oh, your preference is to have unsafe{}?
01:11	<rbuckton>	I'd much rather have no guardrails than only private fields.
01:11	<rbuckton>	I'd rather have `unsafe` than only private fields.
01:11	<shu>	not just Waldemar, you also said TS gave negative feedback (i saw ryanc hating on it for the same reasons), i saw matrix conversations that were very much against the idea
01:12	<shu>	okay, it'd be good to get a ranking and take stock of the options at the next meeting. i've invited Waldemar but i don't know if he'll attend
01:12	<shu>	the known options so far are: (1) no guardrails at all, (2) volatile {} (people definitely hated unsafe), (3) private only
01:12	<rbuckton>	One suggestion that seemed interesting was using a different punctuator for field access. Something like `obj->x`, but maybe without the pointer dereferencing baggage.
01:13	<rbuckton>	Like, we have `.id` and `.#id` we could just have something else
01:13	<shu>	that seems also like a lot of complexity to me
01:13	<shu>	but sure, we can also talk about it
01:13	<shu>	my preference is 1 > 3 > 2, ron's is 1 > 2 > 3, agoric's AFAIU is 2 = 3 (?) > 1
01:14	<rbuckton>	I think private only is a non-starter because it requires so much additional complexity.
01:15	<rbuckton>	As far as atomic access to privates, there's always https://github.com/rbuckton/proposal-refs
01:15	<Mathieu Hofman>	one advantage of new syntax is that it opens up the possibility of not allowing access to these fields through reflection, the same way private fields cannot be accessed through reflection
01:15	<rbuckton>	I haven't proposed it, but I've been thinking about it for years.
01:15	<Mathieu Hofman>	it also means that private shared fields don't need to have exactly the same semantics as private fields
01:16	<shu>	"it also means that private shared fields don't need to have exactly the same semantics as private fields" the Atomics methods problem is a hard requirement though
01:16	<shu>	to not be able to do seq-cst, or acq/rel (as waldemar is proposing now) on those fields would be fatal
01:16	<rbuckton>	This is supposed to be a perf-critical feature. Routing everything through accessors is antithetical to that
01:16	<shu>	well, it's only via public use is the point
01:16	<shu>	i'm actually not super concerned for routing through accessors because that only happens externally to the class's methods
01:16	<Mathieu Hofman>	yeah Atomics becomes difficult to explain with new syntax if there is no reflection
01:17	<shu>	which would only be a perf problem in your cluster-of-friends case
01:17	<Mathieu Hofman>	accessors on fixed shape objects, especially if they're possibly autogenerated, don't have to have any overhead, right ?
01:17	<rbuckton>	the refs proposal would introduce a mechanism for reified References.
01:17	<shu>	"accessors on fixed shape objects, especially if they're possibly autogenerated, don't have to have any overhead, right ?" oh they have super overhead as a function call
01:17	<rbuckton>	So you could do `Atomics.load(ref struct.#x)`
01:17	<shu>	that is a huge overhead compared to a load on a struct
01:18	<shu>	but i can live with it if it's for public users only
01:18	<shu>	because that shouldn't happen frequently for public users
01:19	<shu>	"the refs proposal would introduce a mechanism for reified References." that's a big dependency to take :(
01:19	<rbuckton>	"which would only be a perf problem in your cluster-of-friends case" Which is something I'm already doing in the TS experiment to build the lock free data structures I need to actually get work done.
01:19	<shu>	interesting, what is your cluster of friend classes in that case?
01:20	<rbuckton>	I agree, but it has other uses. It's had use cases since Decorators was proposed.
01:20	<shu>	just pick a concrete one, just curious
01:20	<rbuckton>	Writing a Deque
01:20	<shu>	and the subclasses that comprise it that need to see each others' state?
01:20	<rbuckton>	there are about 4-5 data structures that are all interrelated.
01:20	<rbuckton>	The `Deque` needs to be able to access the ring buffer's state.
01:21	<rbuckton>	This is all lock free, so only the Deque public methods need to be guarded.
01:21	<shu>	okay so you have like a RingBuffer and you want Deque to just get ._buffer, not RingBuffer.p.getBufferElementAt(n) or whatever
01:21	<rbuckton>	And you need to use objects for ringbuffer entries for appropriate CAS operations.
01:21	<rbuckton>	roughly.
01:21	<shu>	right, or .compareExchangeElementAt
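A rough sketch of the shape being discussed here, assuming "always private" means literal #-names. `RingBuffer`, `Deque` and `compareExchangeElementAt` are the names used in the messages above; `loadElementAt`, `tryClaimFirstSlot`, `peekFirstSlot` and the `Int32Array` storage are hypothetical stand-ins, since shared structs are not available in shipping engines. It is not the code from the TypeScript experiment.

```javascript
// Hypothetical stand-in: a plain class over an Int32Array backed by a
// SharedArrayBuffer, used here in place of a real shared struct.
class RingBuffer {
  #slots; // private: only RingBuffer's own methods can touch it

  constructor(capacity) {
    this.#slots = new Int32Array(new SharedArrayBuffer(capacity * 4));
  }

  // Every access from outside the class is a method call, not a field load.
  loadElementAt(n) {
    return Atomics.load(this.#slots, n);
  }

  compareExchangeElementAt(n, expected, replacement) {
    return Atomics.compareExchange(this.#slots, n, expected, replacement);
  }
}

class Deque {
  #ring = new RingBuffer(8);

  // Claims slot 0 if it is still empty (0); returns true on success.
  tryClaimFirstSlot(value) {
    // Deque cannot see RingBuffer's #slots, so it has to go through the method.
    return this.#ring.compareExchangeElementAt(0, 0, value) === 0;
  }

  peekFirstSlot() {
    return this.#ring.loadElementAt(0);
  }
}

const deque = new Deque();
```

With friend classes like these, every `Deque` operation pays for a call into `RingBuffer`, which is the overhead the "cluster of friends" concern is about.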
01:22	<rbuckton>	https://github.com/microsoft/TypeScript/blob/shared-struct-test/src/compiler/sharing/collections/deque.ts
01:22	<shu>	right, if we take always-private to mean literal #-names with their semantics today, this works very poorly
01:23	<rbuckton>	(though that's using `class` and TS legacy decorators to give me something to attach types to, the constructors actually return structs)
01:23	<shu>	man why is this so hard
01:24	<shu>	i am really skeptical about adding a new kind of property after private names
01:24	<shu>	also for a new pointer-field-access syntax, but i'd need to see the specifics i guess
01:25	<rbuckton>	Everyone I've spoken to on my team seems to be in agreement that this additional guardrail is wholly unnecessary.
01:26	<rbuckton>	JS has getters and setters, so an object's properties can change underneath you unexpectedly already, and reads and writes don't tear.
01:26	<shu>	this was waldemar's core point iirc
01:27	<shu>	i confess i am also confused why data races are categorically different from "wacky getters". mark's explanation to me in the hallway was, getters are the public API of the class designer, so it is intentional by the designer. whereas fields are just fields, so i guess to expose racy access on them by default without additional signal of intent is the contagion he wants to prevent
01:28	<shu>	i didn't have time to really dig in, but a struct designer still has to type `fieldName;`
01:30	<rbuckton>	`shared struct` is in the public API of the struct author as well. I don't see how this is different.
01:30	<shu>	it would be good to dig into more there, for sure
01:30	<rbuckton>	What's not in the public API of the class designer are Proxies, and Proxies can change values on you even for fields
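To make the comparison concrete, here is a small sketch (all names hypothetical) of how a getter and a Proxy can each make two consecutive reads of the same property disagree in single-threaded code today.

```javascript
// A getter is part of the class author's declared API, yet two consecutive
// reads of the same property can still return different values.
class Counter {
  #reads = 0;
  get value() {
    return ++this.#reads;
  }
}

const counter = new Counter();
const firstRead = counter.value;
const secondRead = counter.value;

// A Proxy can do the same to what the target's author declared as a plain
// data field, without the author being involved at all.
const point = { x: 1 };
let proxyReads = 0;
const wackyPoint = new Proxy(point, {
  get(target, key, receiver) {
    const real = Reflect.get(target, key, receiver);
    return key === "x" ? real + proxyReads++ : real;
  },
});

const proxyFirst = wackyPoint.x;
const proxySecond = wackyPoint.x;
```

The underlying `point.x` never changes; only what the consumer observes does.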
01:31	<shu>	because mark's actual goal as articulated is "how do we better ensure a world where by default devs interact with shared structs only via their public interface, which should be threadsafe"
01:33	<rbuckton>	If you don't have COOP/COEP enabled, then it is threadsafe because you can't actually share it?
01:55	<rbuckton>	Stage 2 is a little early, but I'm experimenting with a TypeScript downleveling for shared structs that uses the dev trial constructor:
01:57	<rbuckton>	No `shared struct`-specific type checking though, it piggybacks on `class` type checking.
01:57	<rbuckton>	i.e., it doesn't disallow non-static methods/accessors at present.
01:58	<rbuckton>	and the [[Prototype]] chain of the struct constructor doesn't match spec since the actual struct type gets stuffed in there.
02:05	<rbuckton>	it also doesn't allow subclassing
02:07	<Mathieu Hofman>	"If you don't have COOP/COEP enabled, then it is threadsafe because you can't actually share it?" My understanding is that it's mostly in a world where we accept shared memory concurrency, and where it's available. What can we do to steer authors towards designing their shared data so that race conditions are contained and not exposed to users.
02:10	<Mathieu Hofman>	For example most host APIs are internally multi-threaded with shared memory, but thankfully they do not expose that to the JS program. Host object rarely expose objects with data properties that change without local action
04:37	<shu>	Mathieu Hofman: i want to get to a shared understanding in this group (and in committee) of the way in which a shared memory data race in shared struct public fields is a categorically different kind of unsafe than getters and proxies doing arbitrary things
04:39	<shu>	namely, the assertion from mark seems to have been "those getters are the intended public API provided by the author", while the contention is "from the consumer's point of view, unclear why public fields don't constitute the same intent". is the main issue perhaps that there can't be private fields currently? so the conscientious shared struct author has no way to express the intent?
04:39	<shu>	like, if we had both public and private fields in shared structs (assuming we can somehow make Atomics work, no idea how), what is the issue?
06:41	<nicolo-ribaudo>	I see discussion about how to use private fields and atomics — both Justin and littledan had proposals to reify private names as first-class values, you should probably also get in touch with them
19:45	<rbuckton>	True, which is why I've held off on proposing it until there were more motivating use cases. I initially wrote up the proposal to address a concern around decorators surfaced by the Angular team:
`class A { @dec(B) b; }
class B { @dec(A) a; }`
This throws because `B` is not yet defined (even w/o TDZ it would have been `undefined`). You can address this with arrow functions, but it's easy to accidentally pass an actual function and forget the arrow as you cannot readily distinguish between function types. With references, this might be:
`class A { @dec(&B) b; }
class B { @dec(&A) a; }`
A second motivator is to support both conditional and unconditional output values w/o the overhead of destructuring:
`Map.prototype.tryGet = function (key, &value) {
  // naive implementation
  if (this.has(key)) {
    value = this.get(key);
    return true;
  }
  return false;
}
let value;
if (map.tryGet(key, &value) && value > 0) { ... }`
And refs could be used to pass a reference to a variable, field, or array element:
`// private field and array element refs for atomics
Atomics.load(&this.#x);
Atomics.store(&ar[0], value);

// private field refs for friend classes
class Source {
  signal;
  #waiters;
  constructor() {
    this.signal = new Signal(&this.#waiters);
  }
}
class Signal {
  #waiters;
  constructor(&waiters) { // deref/bind
    this.#waiters = &waiters; // ref, reified
  }
  subscribe(cb) {
    let &waiters = this.#waiters; // deref
    waiters ??= new Set(); // assigns to source.#waiters
    waiters.add(cb);
  }
}`
A reified reference is roughly akin to a closure:
`let a = 1;
let b = &a;
b.value; // 1
b.value++;
a; // 2

// rough desugaring
let a = 1;
let b = { get value() { return a; }, set value(v) { a = v; } };
b.value; // 1
b.value++;
a; // 2`
Except that for fields it's actually more like a Reference record in that it holds the object and property key, which is how something like `Atomics` could reach in and touch the actual field value. This is the only behavior that refs provide that isn't already possible in the language as a closure. My hope is, though, that engines could potentially optimize ref/deref pairs and avoid reification when it isn't necessary:
`let a = 1;
let &b = &a; // ref/deref. 'b' and 'a' point to same binding
function &f(&x) { return &x; } // statically indicates ref return
let x = 1;
let &y = &f(&x); // ref/deref/ref/deref/ref/deref. 'y' and 'x' point to the same binding`
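The friend-class example can be approximated in today's JavaScript by applying the "rough desugaring" by hand. The sketch below is only an illustration under that assumption: the closure object stands in for `&this.#waiters`, and `waiterCount` is a hypothetical addition so the effect can be observed from outside.

```javascript
class Source {
  signal;
  #waiters; // stays undefined until the first subscription

  constructor() {
    const self = this;
    // Stand-in for `&this.#waiters`. Only code inside Source's class body can
    // build this, because #waiters is lexically scoped to that body.
    const waitersRef = {
      get value() { return self.#waiters; },
      set value(v) { self.#waiters = v; },
    };
    this.signal = new Signal(waitersRef);
  }

  // Hypothetical helper, not in the original example.
  get waiterCount() {
    return this.#waiters === undefined ? 0 : this.#waiters.size;
  }
}

class Signal {
  #waitersRef;

  constructor(waitersRef) {
    this.#waitersRef = waitersRef;
  }

  subscribe(cb) {
    const ref = this.#waitersRef;
    ref.value ??= new Set(); // assigns through to source.#waiters
    ref.value.add(cb);
  }
}

const source = new Source();
source.signal.subscribe(() => {});
```

What a closure like this cannot do is hand `Atomics` the object and key, which is the one capability attributed above to a Reference-record-style ref.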
19:51	<rbuckton>	This also came up in the Extractors proposal when comparing how we would need to use array destructuring, while C#'s `Deconstruct` uses `out` (same as `ref`, but statically enforced to ensure you assign to it before returning)
19:54	<rbuckton>	I've gone back and forth on whether to use `ref` or `&`. I've felt `&` seems too pointer-y, but `ref` is ambiguous with call when the operand is parenthesized, e.g. `ref (x).y`.
20:01	<rbuckton>	You could potentially split up something like refs into two parts: `&` for ref/deref but no reification (i.e., either statically an error if not derefing if that's feasible, or produces an opaque object), with reification coming later (if at all).
20:08	<rbuckton>	Prior to C# 8, you could only take a ref of an argument and pass it to a ref parameter, which is a further potential restriction.
20:11	<rbuckton>	That would still cover the main motivations I'd have for the refs proposal, though without some of the conveniences it offers. A decorator could take a ref as a parameter and internally close over it. The only problem being that the parameter could not be optional (or could be but it would be an error to access it, as if it had TDZ, which I know is not a good direction)
20:14	<rbuckton>	There was also some interest in using refs with the signals proposal. I spoke with Dominic Gannaway a few months back about that, as it could potentially allow for reactive variables similar to Svelte.
20:23	<rbuckton>	For example:
const nameState = new Signal.State("Bob");
const helloComputed = new Signal.Computed(() => `Hello ${name}`);
let &name = &nameState.getMutableRef();
const &hello = &helloComputed.getImmutableRef();
name; // "Bob";
hello; // "Hello Bob";
name = "Alice";
hello; // "Hello Alice";
20:25	<nicolo-ribaudo>	That seems exactly like the code that made the committee reject `import defer` without a namespace import 😅
20:25	<nicolo-ribaudo>	i.e. that it was introducing side-effects in binding access
20:26	<rbuckton>	Yeah, I am aware it's a hard sell. This is fairly explicit at the declaration site, on a per-binding case.
20:27	<nicolo-ribaudo>	I guess it's no worse than the alternative (using an object and property access)
20:27	<nicolo-ribaudo>	Given that it's statically analyzable
20:27	<rbuckton>	With the exception of how Atomics would work, this is fully transpilable.
20:33	<rbuckton>	Another use case is conditional references. We have functions in the TS compiler where we do some work with a secondary optional output parameter. To make it optional, we just either pass `undefined` when it's not needed or a `{ value: undefined }` object when it is needed. Passing a ref would be far more convenient.
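The pattern described here, an optional output parameter that is either `undefined` or a `{ value: undefined }` box, might look roughly like the following. `tryGet` echoes the earlier `Map.prototype.tryGet` example but is written as a free function; `scores` and the other names are hypothetical, and this is not the TS compiler's code.

```javascript
// `out` is either undefined (the caller does not need the value) or a
// `{ value }` box that receives the result.
function tryGet(map, key, out) {
  if (!map.has(key)) return false;
  if (out !== undefined) out.value = map.get(key);
  return true;
}

const scores = new Map([["a", 3]]);

// Caller that wants the value allocates a box for it.
const found = { value: undefined };
const hit = tryGet(scores, "a", found);

// Caller that only wants the boolean passes undefined.
const miss = tryGet(scores, "b", undefined);
```

A ref parameter would replace the explicit box allocation and the `.value` access at the call site.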
20:40	<rbuckton>	shu: I have a thought, though maybe this isn't enough. What if shared structs can have private fields and public fields, but you must declare the public fields as `volatile`? If the answer to "how do you make the field accessible w/o a guardrail" is to use `get`/`set` (or `accessor`?), which has overhead, then perhaps marking the field with `volatile` would be acceptable without the overhead.
21:55	<shu>	to make sure i understand, something like `shared struct S { publicField; }` doesn't parse, but `shared struct S { volatile publicField; }` does
21:56	<shu>	i get the intent and i can live with it, but it does feel confusing with conditional requirement for public fields and not private fields
21:57	<shu>	private names aren't any less volatile because they're private
21:58	<shu>	i guess i can also live with requiring every field to be tagged `volatile`, but that also feels kinda like we're just requiring devs to type a behaviorless incantation, the purpose of which is lost
22:07	<shu>	rbuckton: an actually orthogonal way to solve the Atomics-on-private issue is to give up on Atomics overloads, and go with Atomics objects which AFAICT is what most languages do anyway
22:08	<shu>	i.e. `#privateFoo = Atomics.Value()` or something, which you can then use `Atomics.load` on
22:08	<shu>	the problem is that this incurs an extra layer of indirection and an extra layer heap allocation, since i don't think it can be zero cost
22:09	<rbuckton>	"i guess i can also live with requiring every field to be tagged `volatile`, but that also feels kinda like we're just requiring devs to type a behaviorless incantation, the purpose of which is lost" I'd rather do that than be required to write `accessor` or `get`, the behavior of which is unnecessary and pure overhead.
22:09	<shu>	unless we decide it's worth inlining into struct declarations with special syntax, like `atomic #foo` or `atomic foo`
22:10	<shu>	but even then i don't know how to avoid materializing it into an actual object if the goal is to do Atomics access on them
22:10	<rbuckton>	"i.e. `#privateFoo = Atomics.Value()` or something, which you can then use `Atomics.load` on" I'm doing something similar in the shared structs experiment in the TS compiler, mostly to work around TypeScript's soft `private`. The `AtomicValue` isn't strictly necessary, but it type checks better than a bunch of `// @ts-ignore` comments.
22:11	<shu>	i think ergonomically the idea is pretty good and fits well with the language
22:11	<rbuckton>	In the case of the `AtomicValue` I wrote, it has all of the atomics methods itself.
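A minimal sketch of what an atomic box with its own methods could look like. This is not the `AtomicValue` from the TS experiment and not a proposed `Atomics.Value`; it assumes 32-bit integer storage in a one-element `Int32Array` over a `SharedArrayBuffer`, and `SharedCounter` is a hypothetical consumer.

```javascript
// Hypothetical AtomicValue: a heap-allocated box that carries the Atomics
// operations as methods. Holds 32-bit integers only (sketch assumption).
class AtomicValue {
  #cell;

  constructor(initial = 0) {
    this.#cell = new Int32Array(new SharedArrayBuffer(4));
    Atomics.store(this.#cell, 0, initial);
  }

  load() { return Atomics.load(this.#cell, 0); }
  store(v) { return Atomics.store(this.#cell, 0, v); }
  // Returns the previous value, like Atomics.add.
  add(delta) { return Atomics.add(this.#cell, 0, delta); }
  // Returns the value that was in the cell before the attempt.
  compareExchange(expected, replacement) {
    return Atomics.compareExchange(this.#cell, 0, expected, replacement);
  }
}

class SharedCounter {
  // The private field holds the box, so no reference to #count is needed.
  #count = new AtomicValue(0);

  increment() { return this.#count.add(1) + 1; }
  get current() { return this.#count.load(); }
}
```

The box sidesteps the question of naming a private field for `Atomics`, at the price of one extra allocation and one extra indirection per field.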
22:11	<shu>	right
22:11	<shu>	but indirection and the always on-heap boxing are antithetical to the performance needs of lock-free algs
22:11	<shu>	so if we can thread that needle somehow that also sidesteps the problem
22:11	<rbuckton>	Agreed
22:12	<shu>	i think it's probably doable if we have special syntax marker like `atomic #foo`
22:13	<shu>	then it can be treated like primitive prototype lookups when you access, say, `42.toString` or whatever
22:13	<shu>	you don't really need to materialize the box in all cases
22:13	<rbuckton>	If Mark's concern is that volatile fields on a shared struct could be surfaced as part of a public API, could that not be guarded against via a type-aware linter like `eslint` w/`ts-eslint` instead?
22:14	<shu>	perhaps, that's up to mark and i think the highest order thing we should talk through before we return to investigating correlation
22:14	<rbuckton>	i.e., if you are concerned about it, you lint for it?
22:14	<shu>	i guess it comes down to how is mark running the linter?
22:15	<shu>	like since the concern is about other libraries
22:15	<shu>	is the linter being run on the whole project, across all its deps?
22:15	<shu>	if so that seems like it should point it out
22:15	<rbuckton>	I don't think the average dev is just going to start using `shared struct` over something like `class`.
22:15	<rbuckton>	No, but you should be vetting your dependencies anyways.
22:15	<shu>	right, but the worry was something like
22:16	<rbuckton>	your dependencies could also start using Proxies and report things that are fields that could change on you at a moment's notice.
22:16	<shu>	i mean yeah
22:16	<shu>	i have the same confusion
22:16	<shu>	anyway i gotta go pack and check out
22:16	<shu>	i'll also noodle more on the atomic value idea
22:17	<shu>	the Atomics methods with reflected property access really was like, forced on us by SABs
22:17	<shu>	perhaps we should dream a little bigger here anyway
22:20	<rbuckton>	for what it's worth, `&` refs would work for array elements, so `Atomics.load(&u32array[0])` would also work.
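For array elements, a ref can be modeled today as a record of the object and key, since `Atomics.load` and `Atomics.store` already take a typed array and an index. In this sketch `elementRef`, `atomicLoad` and `atomicStore` are hypothetical helpers standing in for `&u32array[0]` and ref-taking `Atomics` overloads; they are not part of any proposal text.

```javascript
// Hypothetical stand-in for `&typedArray[index]`: holds the object and key.
function elementRef(typedArray, index) {
  return { object: typedArray, key: index };
}

// Hypothetical ref-taking wrappers over the existing Atomics API.
function atomicLoad(ref) {
  return Atomics.load(ref.object, ref.key);
}

function atomicStore(ref, value) {
  return Atomics.store(ref.object, ref.key, value);
}

const u32array = new Uint32Array(new SharedArrayBuffer(8));
const first = elementRef(u32array, 0);
atomicStore(first, 42);
const loaded = atomicLoad(first);
```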