TC39 Structs and Shared Structs on 2023-01-30

05:11	<littledan>	Right, this is why I concluded that the only reference point we could use is the module system
06:07	<ljharb>	just like builtin modules, there'd always have to be a way to access the same functionality in Scripts - how would that work at all with the module system?
08:39	<Ashley Claymore>	Automatically de-duping types based on module could also be difficult when the bundler inlines the defining module into separate main and worker bundles. It would need to know this module is an entry-point that should not be inlined
08:43	<Ashley Claymore>	For manually registering a prototype: `StructClass.adopt(orphanedInstance)` is potentially the simplest to explain on its own but per-instance-per-agent uses the most memory, and requires the most orchestration as each individual instance needs to be adopted before methods can be called
08:48	<Ashley Claymore>	registering once per type, reduces memory and is a one time orchestration (per type,per non-original-agent) which can then "just work" from that point on. But needs to answer: what carries the "type" being registered? Is the registry per-agent, or per-realm, or per-comms-channel? ( a bit like the "transferable objects" list)
08:51	<Ashley Claymore>	or is everything automatic, for the use case rbuckton mentioned where an npm package of shared structs "just works across workers" with no orchestration (ignoring ensuring the bundler&server are configured as required)
14:53	<rbuckton>	> For manually registering a prototype: `StructClass.adopt(orphanedInstance)` is potentially the simplest to explain on its own but per-instance-per-agent uses the most memory, and requires the most orchestration as each individual instance needs to be adopted before methods can be called
		This seems infeasible for any complex, nested set of structs
14:57	<rbuckton>	> or is everything automatic, for the use case rbuckton mentioned where an npm package of shared structs "just works across workers" with no orchestration (ignoring ensuring the bundler&server are configured as required)
		While this would be nice to have as a default, I don't think "automatic only" is feasible w/o making structs unbundleable.
15:02	<rbuckton>	> registering once per type, reduces memory and is a one time orchestration (per type,per non-original-agent) which can then "just work" from that point on. But needs to answer: what carries the "type" being registered? Is the registry per-agent, or per-realm, or per-comms-channel? ( a bit like the "transferable objects" list)
		Once per realm is consistent with primitives like `Number.prototype`, `Boolean.prototype`, etc. The question is whether structs are more "object"-like or more "primitive"-like. Objects are effectively per-Agent, since you can hand one to another realm and it still walks the original prototype chain.
15:09	<Ashley Claymore>	Once per-realm has the consequence that the value would not meet the current 'sealed' guarantees. Maybe this is OK, but also maybe not
15:10	<Mathieu Hofman>	I think regarding dynamic prototype lookup we agreed this would be a per-realm behavior like for primitives, which means whatever registry must be at most per realm
15:11	<Ashley Claymore>	while that is 'like' primitives, it's more because primitives are not objects in the first place
15:12	<Mathieu Hofman>	If the only realms that existed were shadow realms, objects would effectively be per realm and not per agent
15:12	<Ashley Claymore>	so they go via toObject, an object doesn't change via toObject
15:12	<Ashley Claymore>	if only!
15:13	<Mathieu Hofman>	And because of the existence of Shadow Realm, you cannot have a mechanism which would expose objects from another realm across the callable boundary
15:13	<Mathieu Hofman>	which a per agent registry would effectively do
15:14	<Ashley Claymore>	but because of same-origin-iframes and node 'vm' there would still need to be defined behavior for same-agent-cross-realm semantics
15:14	<Ashley Claymore>	ShadowRealms is the easy case :)
15:15	<Ashley Claymore>	easy ~= can't pass objects, so can't pass structs
15:18	<Mathieu Hofman>	well technically can only pass struct between agents with host APIs, but we're still discussing how that works here ;) A host could very well add an API to pass structs between shadow realms
15:18	<Mathieu Hofman>	(I really want to get my structured clone extension mechanism ironed out, as it'd work to pass objects between shadow realms)
15:27	<rbuckton>	As I said earlier, if we had a restriction that a shared struct must be a top-level declaration then most type identity registration would occur during application start up (with the exception of portions of the module graph loaded dynamically via `import()`), which means conflicts (be they intentional or unintentional) would primarily occur early. Malicious code wouldn't be able to hijack an already-registered type identity.
15:28	<ljharb>	how would that restriction work in Script?
15:30	<rbuckton>	> how would that restriction work in Script?
		They would also need to be at the top-level of a Script, so no using a function as a factory for shared struct types. When evaluation of the module/script body completes, the file cannot produce new struct types. Yes, this isn't 100% reliable, but is at least as reliable as the current mechanisms used today to capture intrinsics before they can be patched/modified by other code.
15:32	<Mathieu Hofman>	I think that restriction is completely unrealistic, and makes it impossible to use this feature in a lot of programs
15:33	<rbuckton>	I only suggested the restriction as a possible remediation for the concern about type identity registration being forgeable.
15:34	<rbuckton>	Realistically, even without this restriction, a type identity registry that depends on a first-in wins mechanism is still as reliable as existing intrinsic-capturing mechanisms.
15:34	<Mathieu Hofman>	Javascript is a dynamic language. That suggestion is equivalent to forbidding features from the dynamic part of the language
15:34	<rbuckton>	We already do that with features like `export`
15:35	<Mathieu Hofman>	And I have explained why first win semantics for a global registry is a non-starter
15:35	<rbuckton>	I don't want that restriction.
15:35	<rbuckton>	> And I have explained why first win semantics for a global registry is a non-starter
		Per-agent global, or cross-agent global?
15:35	<Mathieu Hofman>	even per realm
15:36	<Mathieu Hofman>	`export` being first win is not a precedent, like object spread is not a precedent. Both have a local effect (module or object)
15:37	<rbuckton>	I still am not clear on why you believe a per-agent (or per-realm) write-only registry is a non-starter. If you don't want adversarial code to register first, you register first.
15:37	<Mathieu Hofman>	who is "you"
15:37	<rbuckton>	You, the application developer.
15:38	<Mathieu Hofman>	Ok what about 2 libraries that the application load. Should lib1 be able to interfere with lib2 ?
15:39	<rbuckton>	If you are a middleware, or a plugin, or something else running in a host environment where you don't control the environment, then you ensure your type identity is sufficiently unique. If you are the host environment loading a middleware, or plugin, etc., you architect your environment to be resilient to such a conflict.
15:39	<Mathieu Hofman>	there is no observable global mutable state in 262, and we would 100% block anything that introduces anything like it
15:39	<ljharb>	the global object isn't that?
15:39	<ljharb>	or do you mean across realms (bigger than global, "universal" maybe)
15:39	<Ashley Claymore>	minus the intrinsics/primordial objects themselves
15:39	<Mathieu Hofman>	hidden *
15:40	<Mathieu Hofman>	sorry I forgot the hidden part
15:40	<ljharb>	there's a few of those too already, but they're all freezeable ofc
15:40	<Mathieu Hofman>	where ?
15:40	<ljharb>	`AsyncFunction` is one, no?
15:40	<Ashley Claymore>	hidden = internal
15:40	<Ashley Claymore>	i.e. not impacted by being frozen
15:41	<Mathieu Hofman>	yes the hidden intrinsics, which we're trying to fix
15:41	<Mathieu Hofman>	ashley has the correct wording there, AsyncFunction is accessible to JS code
15:42	<rbuckton>	> Ok what about 2 libraries that the application load. Should lib1 be able to interfere with lib2 ?
		If lib2 doesn't want unintentional interference, they should use sufficiently unique type identities. If lib1 performs intentional interference, that's up to the application developer/host environment to mitigate, and a best effort from the runtime to surface that information early, hence the first-in wins registry. If you can only write new unique identities to the registry, either lib1 or lib2 will fail rather than be ignorant of hijacking.
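A minimal sketch of the "first-in wins", write-only registry described above, in plain JavaScript. The `WriteOnceRegistry` class, its `register` method, and the string identities are hypothetical names used only for illustration; none of them come from the proposal or the origin trial.

```javascript
// Hypothetical sketch: a registry that accepts each type identity exactly once.
class WriteOnceRegistry {
  #entries = new Map(); // typeId -> prototype; there is no public read API

  // Registers `prototype` under `typeId`. A second registration of the
  // same identity throws instead of silently replacing the first one.
  register(typeId, prototype) {
    if (this.#entries.has(typeId)) {
      throw new TypeError(`type identity already registered: ${typeId}`);
    }
    this.#entries.set(typeId, prototype);
  }
}

const registry = new WriteOnceRegistry();
registry.register("lib2/Vector2", { owner: "lib2" });

// A later, conflicting registration fails loudly.
let conflict = null;
try {
  registry.register("lib2/Vector2", { owner: "lib1" });
} catch (err) {
  conflict = err;
}
console.log(conflict instanceof TypeError); // true
```

Note that the thrown error is itself an indirect answer to "do you have this key", which is the property objected to later in this discussion (side channel and/or denial of service).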
15:43	<ljharb>	> i.e. not impacted by being frozen
		good thing RegExp.prototype isn't a regex anymore :-p
15:43	<rbuckton>	If the concern is about the registry not being freezable, that could be made possible via an API as well.
15:43	<Mathieu Hofman>	But again that would prevent the feature from working
15:44	<rbuckton>	Yes, if malicious code were to freeze the registry your application would stop working. Which would be a fairly good indication that there is malicious code freezing the registry (at least, with a sufficiently clear error message).
15:45	<Mathieu Hofman>	I want any JS feature where I can dynamically load code, that will keep working regardless of previous code that was loaded and executed (if you ignore mutation to the global object and intrinsics)
15:45	<rbuckton>	Just like some packages not working with SES when it locks down the environment.
15:46	<rbuckton>	It feels like you are arguing two opposing positions: Such a registry must be mutable so that code keeps working, but malicious code shouldn't be able to mutate it to hijack it.
15:48	<Mathieu Hofman>	No, I'm arguing this feature cannot use a global registry with forgeable keys because it'd be impossible to make it safe
15:48	<rbuckton>	Perhaps it would be better to outline the specific capabilities we want, the limitations of the environment and ecosystem, and the concerns we have for any given solution. From there we can better determine hard and fast requirements and find potential compromises.
15:49	<rbuckton>	> No, I'm arguing this feature cannot use a global registry with forgeable keys because it'd be impossible to make it safe
		I'm more than willing to entertain other solutions, but it's important that the developer experience doesn't make the feature wholly unusable.
15:49	<Mathieu Hofman>	you can have a global registry, as long as it cannot be used by code that hasn't been previously introduced to each other to interact.
15:50	<rbuckton>	Needing to patch or wire up the prototype in `onmessage` is terrible DX.
15:51	<rbuckton>	How would you describe how these introductions should work?
15:51	<Mathieu Hofman>	This is the reason I would block any string based registry that you can ask the question (directly or indirectly) "do you have this key"
15:52	<rbuckton>	> Needing to patch or wire up the prototype in `onmessage` is terrible DX.
		I say this as someone who wrote a package that implements struct-like behavior backed by `SharedArrayBuffer` that does this. The only reason it works is that the "structs" you create are fully typed, so you don't have to walk a graph to wire up prototypes because it does that for you.
15:54	<rbuckton>	> This is the reason I would block any string based registry that you can ask the question (directly or indirectly) "do you have this key"
		How deep does this concern extend? Agent-scoped registries, apparently, but what about realm-scoped registry?
15:55	<Mathieu Hofman>	realm scoped registries too
15:58	<rbuckton>	To clarify, your concern is that such a string-based registry could be used as a side-channel for communication (i.e., it could be used to exfiltrate data)?
15:59	<shu>	hm there seems to be a lot of back and forth i've missed here, is there a tl;dr (no rush)?
15:59	<Mathieu Hofman>	In the solution I presented, the application either has to do a manual bootstrap (sending once over postMessage the unforgeable type identifier, and wire it to the expected implementation), or give up on prototype continuity (aka an incoming shared struct's dynamic proto would not be the same object as the dynamic proto of an "equivalent" shared struct created locally)
15:59	<Mathieu Hofman>	side channel and/or denial of service
16:01	<Mathieu Hofman>	If we had module block with stable identity over postMessage, the discontinuity could be mitigated
16:01	<rbuckton>	> In the solution I presented, the application either has to do a manual bootstrap (sending once over postMessage the unforgeable type identifier, and wire it to the expected implementation), or give up on prototype continuity (aka an incoming shared struct's dynamic proto would not be the same object as the dynamic proto of an "equivalent" shared struct created locally)
		What if such a registry were per-channel? i.e., a `MessagePort` might have an internal registry of type-identity to prototype, and you'd have to set up that registry on each side?
16:02	<rbuckton>	> If we had module block with stable identity over postMessage, the discontinuity could be mitigated
		Because you send the module block over `postMessage` to evaluate so that both sides have the same representation?
16:02	<Mathieu Hofman>	I think a per-channel registry would likely exacerbate identity discontinuity issues, but since the registration would be tied to a non-global capability, it would be acceptable to use forgeable identifiers
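A rough sketch of what a per-channel registry could look like, simulated synchronously in plain JavaScript. `StructPort`, `register`, and `post` are hypothetical names, the "structs" are ordinary objects, and no real `MessagePort` is involved; the point is only that each end wires a forgeable string identity to its own prototype, and that the mapping is reachable only through the channel object.

```javascript
// Hypothetical sketch: each end of a channel holds its own private registry.
class StructPort {
  #registry = new Map(); // typeId -> prototype, private to this end
  peer = null;
  onmessage = null;

  register(typeId, prototype) {
    this.#registry.set(typeId, prototype);
  }

  // Sends plain field data tagged with a type id (stands in for sharing a struct).
  post(typeId, fields) {
    this.peer.#receive(typeId, { ...fields });
  }

  // The receiving end attaches whatever prototype it registered locally,
  // or no prototype at all if the identity is unknown on this end.
  #receive(typeId, fields) {
    const proto = this.#registry.get(typeId) ?? null;
    const value = Object.assign(Object.create(proto), fields);
    this.onmessage?.(value);
  }
}

const mainEnd = new StructPort();
const workerEnd = new StructPort();
mainEnd.peer = workerEnd;
workerEnd.peer = mainEnd;

// The worker side registers its own behavior for the "Vector2" identity.
const workerVector2Proto = {
  lengthSq() { return this.x * this.x + this.y * this.y; },
};
workerEnd.register("Vector2", workerVector2Proto);

let received;
workerEnd.onmessage = (v) => { received = v; };
mainEnd.post("Vector2", { x: 3, y: 4 });
console.log(received.lengthSq()); // 25
```

Because the registry lives on the port rather than on the realm or agent, code that was never handed the port cannot probe or pre-empt its identities.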
16:06	<rbuckton>	My concerns with bootstrapping with a module block is how much other information would need to be bootstrapped along with it to support code-sharing (i.e., imports and package dependencies) making it harder for the other side to ensure initialization is consistent (i.e., any registration that needs to happen, or perhaps conditional imports depending on whether the code is running in the browser main thread or in a worker, etc.). And if that were easily solvable, we could also potentially solve that by bootstrapping just with the struct type definition itself (i.e., can we do this without depending on module blocks?).
16:08	<rbuckton>	> I think a per-channel registry would likely exacerbate identity discontinuity issues, but since the registration would be tied to a non-global capability, it would be acceptable to use forgeable identifiers
		From an API standpoint, I could potentially see there being some kind of built-in `SharedStructTypeIdentityRegistry` that a package could export, such that a host could import the registry and use it to configure a `MessagePort`.
16:10	<rbuckton>	> hm there seems to be a lot of back and forth i've missed here, is there a tl;dr (no rush)?
		How to manage type identity across Agents/Realms/etc. in a way that doesn't violate security concerns, but works for bundlers (so solely relying on module id/file path isn't viable).
16:10	<Mathieu Hofman>	> Because you send the module block over `postMessage` to evaluate so that both sides have the same representation?
		Kinda. Each agent could have its own type for an "equivalent" shared struct, but if the code declaring the struct uses a module instance to define the struct's behavior / dynamic prototype, that module instance can be automatically loaded by other realms. When receiving a struct of that type, the observed prototype object would be the same as the prototype of the locally defined struct, if the local definition used the same module instance (which requires module instances to be stable across agents). The "mostly mitigated" part is because the constructor couldn't be on the prototype, as that is realm specific of course
16:12	<Mathieu Hofman>	> From an API standpoint, I could potentially see there being some kind of built-in `SharedStructTypeIdentityRegistry` that a package could export, such that a host could import the registry and use it to configure a `MessagePort`.
		I did not understand this suggestion
16:16	<rbuckton>	I'm mostly suggesting a way to simplify the DX around registration, making it easier to combine registries from multiple packages, and to define the composite registry on each side of a message port.
16:17	<rbuckton>	The code that runs on each side could be different for the same struct type given practices such as bundling and tree shaking, and in some cases a struct type may need a method tailored to an environment (i.e., can its methods access node-native bindings, can this method only be run in the browser), etc.
16:18	<rbuckton>	Having an observably similar prototype is profoundly useful, but mandating a similar prototype limits flexibility.
16:20	<Mathieu Hofman>	TLDR here is that registration would be implicit the first time a struct of a given type is shared with another realm/agent if the struct definition used a module instance at declaration. If the same module instance loaded in 2 different agents has a stable identity, aka if `receivedModuleInstance === importedModuleInstance`, then you don't have any prototype discontinuity issues. If the bundler messes up these module identities, then the program has to first send the type identity explicitly over a `postMessage` and the receiving code has to manually do `SharedStructType.register(receivedModuleType, importedModuleInstance)`
16:25	<Mathieu Hofman>	Btw, you can use string identifier over `postMessage` for the application to know what the opaque type identifier is about, restoring your string based semantics. That would be scoped to whatever is holding the string identifier registry (possibly the channel itself).
16:27	<rbuckton>	> Btw, you can use string identifier over `postMessage` for the application to know what the opaque type identifier is about, restoring your string based semantics. That would be scoped to whatever is holding the string identifier registry (possibly the channel itself).
		That's still a poor DX as it would significantly overcomplicate `onmessage` handlers by requiring custom scaffolding in every project to bootstrap.
16:28	<rbuckton>	The `SharedStructTypeIdentityRegistry` suggestion above would handle the per-channel string identifier registry without requiring prior communication between each side of a channel.
16:38	<rbuckton>	Consider, for example:
		// node_modules/threejs/src/registry.js
		export const structRegistryWriter = new StructTypeRegistryWriter();
		export const structRegistry = structRegistryWriter.registry;
		// node_modules/threejs/src/math/Vector2.js
		import { structRegistryWriter } from "../registry.js";
		@structRegistryWriter.register("http://threejs.com/structs/Vector2")
		shared struct Vector2 { ... }
		// main.js
		import { Vector2, structRegistry } from "threejs";
		const worker = new Worker("./worker.js", { structRegistry });
		worker.postMessage(new Vector2(0, 0));
		// worker.js
		import { structRegistry } from "threejs";
		import { parentPort } from "worker_threads";
		parentPort.addRegistry(structRegistry);
		parentPort.on("message", v => { v.whatever(); });
16:39	<rbuckton>	Both the main and worker threads load the same struct type, though they could be in different bundles and tree shaking could remove some members. They independently associate type registries with their sides of the channel.
16:40	<rbuckton>	While not automatic, the registration mechanism stays as out of the way as possible to simplify the developer experience.
16:41	<Mathieu Hofman>	Basically it'd look like
		// Done by all workers independently
		import Vector2DPrototype from "vector2dBehavior.js" with { reflect: "module" };
		const { constructor: Vector2D, type: Vector2DType } = SharedStructType.prepare(["x", "y"]);
		SharedStructType.register(Vector2DType, Vector2DPrototype);
		const Vector2DUUID = '...';
		const structAwareChannel = new MessageChannel();
		structAwareChannel.register(Vector2DUUID, Vector2DType, Vector2DPrototype);
		// worker1.js
		const v1 = new Vector2D({x: 1, y: 2});
		structAwareChannel.postMessage({v1});
		// worker2.js
		channel.on("message", ({v1}) => {
		  console.log(v1 instanceof Vector2D); // true because prototype match even though type instance different
		});
16:44	<rbuckton>	That's definitely not great for the actual proposal; that seems far too easy to get wrong. Maybe ok in the prototyping stage, but not long term.
16:45	<Mathieu Hofman>	I actually don't think our approaches diverge that much
16:45	<rbuckton>	What happens if `vector2dBehavior.js` requires a package already loaded on the worker? Will it import the worker version, or carry along its transitive dependencies to be re-evaluated on the worker?
16:46	<Mathieu Hofman>	if the struct types are declared independently, they will have different types, and thus different constructors
16:47	<Mathieu Hofman>	> What happens if `vector2dBehavior.js` requires a package already loaded on the worker? Will it import the worker version, or carry along its transitive dependencies to be re-evaluated on the worker?
		Doesn't matter?
16:48	<rbuckton>	Your suggestion depends on features not present in the origin trial, but uses an API that is likely unique to the origin trial. @shu can clarify, but I believe the reason the trial uses a `SharedStructType` constructor in place of syntax is that it's much easier to add behind a flag to get early feedback on. I'm not sure how likely we could depend on a proposal like `with { reflect: "module" }` or module blocks, which are nowhere near ready for adoption.
16:49	<Mathieu Hofman>	Yes I agree a dependency on module instances is not great, hence why I removed that from my original proposal. We can skip it here, the only effect is that you won't get implicit fallback registration
16:49	<rbuckton>	> Doesn't matter?
		It matters if the dependent module has side-effects, or if it doubles the amount of runtime memory used because it depends on a large package.
16:50	<Mathieu Hofman>	the behavior of loading a module instance is a general question for module import to answer, and is not really relevant for this proposal
16:50	<Mathieu Hofman>	I only included it so that a behavior can be implicitly shared with other realms/agent without having to rely on a per channel registration on each side
16:53	<Mathieu Hofman>	It would be great to find an approach that allows for the implicit import of behavior for cases that don't care about prototype continuity, yet not force a dependency on module instances
17:01	<shu>	> Your suggestion depends on features not present in the origin trial, but uses an API that is likely unique to the origin trial. @shu can clarify, but I believe the reason the trial uses a `SharedStructType` constructor in place of syntax is that it's much easier to add behind a flag to get early feedback on. I'm not sure how likely we could depend on a proposal like `with { reflect: "module" }` or module blocks, which are nowhere near ready for adoption.
		that is accurate
17:12	<rbuckton>	I'm of two opinions on prototype continuity, each based on the overall direction we take for the proposal:
		- If the methods of shared structs only had access to a thread-safe locked-down global in a globally shared realm where they can only access other shared functions or shared structs or imports of the same (i.e., something like the "shared module" approach), then I favor prototype continuity because it's easy and has an already limited surface area.
		- If the methods of shared structs are derived from code independently run in each Agent, then I only care about prototype continuity insomuch as a well-written program should be importing the same shared struct definition from the same file/package in each agent, and that any prototype continuity that exists will fall out naturally from that. By loosening any restriction on prototype continuity, bundlers can take advantage of tree shaking, renaming, inlining, etc.
		Since we seem to be leaning towards the latter approach, I'm less concerned about prototype continuity.
17:44	<Mathieu Hofman>	I think we have a different definition of continuity here. I probably should have said "stability"
17:45	<Mathieu Hofman>	All your examples seem to rely on prototype stability by requiring `instanceof` checks to pass between received structs and the local constructor
17:46	<Mathieu Hofman>	This is the difficult part
17:57	<rbuckton>	`instanceof` is a function of the constructor, not the prototype. In my origin-trial derived example, I mistakenly included the constructor in the `register` API, though the runtime doesn't really need to know about the constructor. It would be better written as:
		const { type: _Vector2D, register } = SharedStructType.prepare(["x", "y"]);
		export function Vector2D(x = 0, y = 0) {
		  const _this = Reflect.construct(_Vector2D, [], new.target);
		  _this.x = x;
		  _this.y = y;
		  return _this;
		}
		Vector2D.prototype.distanceTo = function (v) {
		  const dx = this.x - v.x;
		  const dy = this.y - v.y;
		  return Math.sqrt(dx * dx + dy * dy);
		};
		register("e3a9bd1f-0f64-4848-b255-3c629d0c44a3", Vector2D.prototype);
		You bind the type identity to the prototype to use for that struct type. `x instanceof Vector2D` then falls out of `Vector2D` checking whether `x` has `Vector2D.prototype` in its prototype chain.
18:14	<Mathieu Hofman>	`instanceof` is a check on the prototype object as retrieved from a Constructor. The constructor is the way to express the check, but has nothing to do with the check itself
18:16	<Mathieu Hofman>	I think what I'm trying to say, is that if you want to make prototype match on different constructors, you can't really have a `constructor` property on the prototype object
18:23	<rbuckton>	`instanceof` doesn't check `prototype.constructor` though, so the absence or presence of `constructor` seems like a separate issue.
18:26	<rbuckton>	In my example, you want the local version of `Vector2D` to use the custom constructor to perform field assignments or any other custom logic. So `prototype.constructor` locally would refer to `Vector2D`. Also, I'm not necessarily saying that construction needs to occur this way, but it seemed the simplest way to express the concept and aligns with a known metaphor (ES5 class-likes).
18:27	<Mathieu Hofman>	Right, I'm just saying having `prototype.constructor` would be difficult to have correct if you share a prototype between multiple struct types (imported vs locally defined)
18:28	<rbuckton>	Why would you share a prototype? That generally wouldn't be the case in a post-origin-trial syntactic `shared struct` declaration.
18:29	<Mathieu Hofman>	if you want `instanceof` to return true
18:29	<rbuckton>	I mean, you can share a prototype between multiple classes today, but that's not a common occurrence.
18:29	<Mathieu Hofman>	I'm very very confused
18:29	<Mathieu Hofman>	do you care about `receivedStruct instanceof LocalConstructor` being true or not ?
18:30	<Mathieu Hofman>	if you do, you need either the receivedStruct to be the same underlying type as the locally defined one, which is impossible without global registries, or you have to have the different struct types share a prototype
18:30	<rbuckton>	`x instanceof Y` doesn't care about `prototype.constructor`, it asks `Y` "is `x` an instance of you?". `Y` then looks at `Y.prototype` and recursively checks to see if it is in `x`'s prototype chain.
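The default `instanceof` behavior both sides are describing can be checked with ordinary functions. This is a small illustration in plain JavaScript, not shared-struct code; `OtherVector2D` is a made-up second constructor standing in for an "equivalent" type defined elsewhere.

```javascript
function Vector2D(x = 0, y = 0) {
  this.x = x;
  this.y = y;
}
const v = new Vector2D(1, 2);

// Removing `prototype.constructor` has no effect on the check.
delete Vector2D.prototype.constructor;
const withoutConstructorProp = v instanceof Vector2D; // true

// A different constructor that shares the same prototype object also matches.
function OtherVector2D() {}
OtherVector2D.prototype = Vector2D.prototype;
const sharedPrototype = v instanceof OtherVector2D; // true

// Pointing the original constructor at a new prototype object breaks the
// match for existing instances, even though the constructor is unchanged.
Vector2D.prototype = {};
const afterSwap = v instanceof Vector2D; // false
const otherStillMatches = v instanceof OtherVector2D; // true
```

So the check is expressed through the right-hand constructor but decided by prototype object identity, which is why two independently declared types only match if they share a prototype object.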
19:16	<shu>	instanceof is actually hookable
19:16	<shu>	which i honestly did not know until like a month ago
19:17	<ljharb>	sadly yes, which makes it even less reliable
19:18	<rbuckton>	via `[Symbol.hasInstance]`, yes
19:18	<Mathieu Hofman>	yeah with the Symbol.hasInstance on the RHS, but that would require the local constructor to be aware of the other "equivalent" types, which may be knowledge scoped to a local registry
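A sketch of the `Symbol.hasInstance` hook mentioned above, in plain JavaScript. The `equivalentPrototypes` set is a hypothetical stand-in for the "local registry" of equivalent types; it is an assumption for illustration and not part of any proposed API.

```javascript
// Hypothetical: prototypes that this realm treats as equivalent to LocalVector2D.
const equivalentPrototypes = new Set();

class LocalVector2D {
  constructor(x = 0, y = 0) {
    this.x = x;
    this.y = y;
  }

  // Called for `value instanceof LocalVector2D`; walks the prototype chain
  // and accepts either the local prototype or any registered equivalent.
  static [Symbol.hasInstance](value) {
    if (value === null || typeof value !== "object") return false;
    for (let p = Object.getPrototypeOf(value); p !== null; p = Object.getPrototypeOf(p)) {
      if (p === LocalVector2D.prototype || equivalentPrototypes.has(p)) return true;
    }
    return false;
  }
}

// Stand-in for a struct received from elsewhere with a different prototype.
const receivedProto = { distanceTo() {} };
const received = Object.assign(Object.create(receivedProto), { x: 1, y: 2 });

const beforeRegistration = received instanceof LocalVector2D; // false
equivalentPrototypes.add(receivedProto);
const afterRegistration = received instanceof LocalVector2D; // true
const localInstance = new LocalVector2D() instanceof LocalVector2D; // true
```

The hook only works if the local constructor can see the set of equivalent prototypes, which is the scoping problem raised in the message above.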
19:18	<Mathieu Hofman>	basically it all boils down to resolving scope conflicts