TC39 Structs and Shared Structs on 2023-01-29

22:17	<rbuckton>	> I fail to see how a user supplied value provided after shared struct creation would work. Unless you'd somehow remap types you may have already seen. IMO you'd at least need to pass your unique ID as part of the `SharedStructType.prepare` call. But in general I don't like strings for unique IDs, and since we already have objects that carry identity across agents, I thought it'd make sense to reuse them.
The API I suggested would only be for the origin trial, not the final proposal. The user supplied type identity and prototype isn't "provided after shared struct creation", but rather, the shared struct constructor isn't usable until after it is registered. This is why shared struct type creation that depends on a prototype is different from data-only shared structs. You would have to call `register` to make the struct type valid.
22:18	<rbuckton>	> I fail to see how a user supplied value provided after shared struct creation would work. Unless you'd somehow remap types you may have already seen. IMO you'd at least need to pass your unique ID as part of the `SharedStructType.prepare` call. But in general I don't like strings for unique IDs, and since we already have objects that carry identity across agents, I thought it'd make sense to reuse them.
"Objects that carry identity across agents" doesn't help with the model I was proposing. In my model, each agent must independently register the type associated with the struct.
22:20	<Mathieu Hofman>	> The API I suggested would only be for the origin trial, not the final proposal. The user supplied type identity and prototype isn't "provided after shared struct creation", but rather, the shared struct constructor isn't usable until after it is registered. This is why shared struct type creation that depends on a prototype is different from data-only shared structs. You would have to call `register` to make the struct type valid.
That doesn't answer my concern. What would happen if 2 types with different shapes are registered with the same user supplied identity?
22:20	<rbuckton>	>     // vector2d.js
>     // Each shared struct type, whether data only or "prepared", has its own unique type
>     export const vector2DType = SharedStructType.prepare(["x", "y"]);
>     const _Vector2D = SharedStructType.getConstructor(vector2DType);
>
>     // custom construction behavior
>     export function Vector2D(x = 0, y = 0) {
>       const _this = Reflect.construct(_Vector2D, [], new.target);
>       _this.x = x;
>       _this.y = y;
>       return _this;
>     }
>
>     // prototype methods
>     Vector2D.prototype.distanceTo = function (v) {
>       const dx = this.x - v.x;
>       const dy = this.y - v.y;
>       return Math.sqrt(dx * dx + dy * dy);
>     };
>     SharedStructType.registerPrototype(vector2DType, Vector2D.prototype);
>
>     // main.js
>     import { Vector2D, vector2DType } from "./vector2d.js";
>     const v1 = new Vector2D(1, 2);
>     const worker = new Worker("worker.js");
>     worker.postMessage([vector2DType, v1]);
>
>     // worker.js
>     // worker imports Vector2D, which causes registration as a side-effect.
>     import { Vector2D, vector2DType } from "./vector2d.js";
>     const v2 = new Vector2D(3, 4);
>     parentPort.on("message", ([mainVector2DType, v1]) => {
>       SharedStructType.registerPrototype(mainVector2DType, Vector2D.prototype);
>       assert(mainVector2DType !== vector2DType);
>       assert(
>         SharedStructType.getConstructor(mainVector2DType) !==
>           SharedStructType.getConstructor(vector2DType)
>       );
>       assert(v1 instanceof Vector2D); // by virtue of sharing a prototype
>       v1.x; // 1
>       v1.distanceTo(v2); // ok
>       v1.toString(); // ok
>     });
>
> SharedStructType.getConstructor() is needed to allow the program to avoid duplicating type definitions in each agent/realm. I think the burden of such deduplication should be on the program, not on the engine. If the engine had to deduplicate itself, you'd either need a user provided type identifier at "prepare" time, and a global lock around such type definitions, or you'd need to somehow be able to collapse separate definitions into a single one.
> In either case you'd have to figure out what to do if the shape definition does not match, and in the case of definition collapse, how do you communicate the error to the program? By having the type definition generate the unique type identifier, you avoid all those complications in the engine, at the cost of putting more type hydration burden on the program.
This assumes the worker can be sure the incoming message is actually for that prototype, which might not be the case if there are multiple possible values for the message. This design seems very hard to implement in user code, while my suggestion is more reliable for write-once and reuse.
22:20	<rbuckton>	> That doesn't answer my concern. What would happen if 2 types with different shapes are registered with the same user supplied identity?
An error. If the user defined identity is already registered to a different shape, it should throw.
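The "throw on a conflicting shape" rule can be sketched in plain JavaScript. This is only an illustration: `TypeRegistry`, its `register` method, and the `"example:Vector2D"` identity are hypothetical names, not part of the proposal or of any engine, and accepting a repeated registration with an identical shape is an assumption of this sketch.

```javascript
// Hypothetical per-agent registry; none of these names come from the proposal.
class TypeRegistry {
  #shapes = new Map(); // user-supplied id -> array of field names

  register(id, fields) {
    const existing = this.#shapes.get(id);
    if (existing === undefined) {
      this.#shapes.set(id, [...fields]);
      return;
    }
    // Assumption: re-registering the identical shape is tolerated.
    const sameShape =
      existing.length === fields.length &&
      existing.every((name, i) => name === fields[i]);
    if (!sameShape) {
      throw new TypeError(
        `type identity "${id}" is already registered with shape [${existing}]`
      );
    }
  }
}

const registry = new TypeRegistry();
registry.register("example:Vector2D", ["x", "y"]);
registry.register("example:Vector2D", ["x", "y"]); // same shape: accepted

let conflict = null;
try {
  registry.register("example:Vector2D", ["x", "y", "z"]); // different shape
} catch (err) {
  conflict = err;
}
console.log(conflict instanceof TypeError); // true
```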
22:23	<rbuckton>	The bigger question is, what do we want the final, shipping version of this to look like? Do we want package authors to be able to define structs that package consumers can just use, or do we want package consumers to need to register shared structs themselves in any messaging scaffolding?
22:23	<Mathieu Hofman>	> This assumes the worker can be sure the incoming message is actually for that prototype, which might not be the case if there are multiple possible values for the message. This design seems very hard to implement in user code, while my suggestion is more reliable for write-once and reuse.
Right, it requires more setup dance from the program, and it's that setup dance that we should try to make easier. But I'm very doubtful we can overcome the issue with a user supplied identity.
22:28	<rbuckton>	If we want to make this simple for application developers, then they should be able to install a package, import the struct type (or at least use a side-effecting import for the file containing the struct type to make it visible to the agent), and just use it. This means some kind of per-agent registration would need to occur within the file that contains the struct itself. One mechanism we discussed for that was to depend on the resolved module ID (i.e., if we were depending on the module loader cache). However, that doesn't work well with bundling scenarios where I have one bundle for the main thread, and another bundle for a worker.
22:29	<rbuckton>	So there would need to be some way to uniquely identify a struct type regardless of path, such that the same struct types in each bundle can be associated with each other.
22:31	<rbuckton>	User-defined IDs are used everywhere for this in many languages and runtimes: UUIDs, URNs, DTDs, etc.
22:33	<rbuckton>	My main concern is that any burden for registration should fall on the package developer, not the application developer, whatever the design looks like in the end.
22:36	<Mathieu Hofman>	I think the question raised by Ashley Claymore is relevant: how do you handle the fan-in case where multiple agents set up a shared type before being introduced to each other? A forgeable type identifier in my opinion is not safe, and I'm pretty sure it'd never make it through committee. However I'm not convinced we need prototype/constructor continuity here.
22:36	<rbuckton>	And that whatever that design is should be able to take into account bundling, be that with module declarations/module expressions, or traditional bundlers
22:38	<rbuckton>	> I think the question raised by Ashley Claymore is relevant: how do you handle the fan-in case where multiple agents set up a shared type before being introduced to each other? A forgeable type identifier in my opinion is not safe, and I'm pretty sure it'd never make it through committee. However I'm not convinced we need prototype/constructor continuity here.
Since this wouldn't be a global registry, it wouldn't matter. Your code defines the relationship between an id and a constructor/prototype on your agent. If your code tries to register the same unique id twice, it should be an error so you can isolate the problem early. If the constructor/prototype on one agent doesn't match the constructor/prototype on another agent, that's fine. In fact, that may even be a value add, since bundlers could potentially tree-shake away prototype methods that aren't used.
22:42	<Mathieu Hofman>	How is it not a global registry?
22:42	<Mathieu Hofman>	You're assuming there is a single author to code running in an agent
22:42	<rbuckton>	All that matters is that a given struct type has the same type identity on multiple Agents, so that each Agent can bind a prototype to that type identity. Producing an instance of a struct type should be possible on any Agent, such that I could do:
    // main.js
    const v1 = new Vector2D(1, 2);
    worker.postMessage(v1);
    worker.on("message", e => console.log(e instanceof Vector2D));

    // worker.js
    const v2 = new Vector2D(3, 4);
    remotePort.postMessage(v2);
    remotePort.on("message", e => console.log(e instanceof Vector2D));
22:43	<Mathieu Hofman>	And a single author to code running in communicating agents
22:43	<rbuckton>	> How is it not a global registry?
I don't understand. Each Agent would have an independent registry mapping a type identity to a prototype.
22:44	<rbuckton>	> You're assuming there is a single author to code running in an agent
I'm assuming that if you want to consistently use the same data in multiple agents, with the same underlying behavior, then you should use a consistent definition.
22:44	<Mathieu Hofman>	> All that matters is that a given struct type has the same type identity on multiple Agents, so that each Agent can bind a prototype to that type identity. Producing an instance of a struct type should be possible on any Agent, such that I could do:
>     // main.js
>     const v1 = new Vector2D(1, 2);
>     worker.postMessage(v1);
>     worker.on("message", e => console.log(e instanceof Vector2D));
>
>     // worker.js
>     const v2 = new Vector2D(3, 4);
>     remotePort.postMessage(v2);
>     remotePort.on("message", e => console.log(e instanceof Vector2D));
Do you expect these console logs to print true or false?
22:44	<rbuckton>	I expect them to print `true`.
22:45	<Mathieu Hofman>	Then you expect continuity of constructor / prototype across agents
22:45	<rbuckton>	If both the main thread and worker register their version of `Vector2D.prototype` for the same type identity, then creating those structs on either side and sending them to the other side should be consistent.
22:46	<rbuckton>	> Then you expect continuity of constructor / prototype across agents
No. I would recommend continuity of the prototype across agents. Constructor doesn't actually matter.
22:46	<rbuckton>	As I said, a bundler could potentially tree-shake away prototype methods on either side based on use, and it should still work.
22:47	<Mathieu Hofman>	> I don't understand. Each Agent would have an independent registry mapping a type identity to a prototype.
Yes, that we agree on. The question is who generates the key of that mapping. If it's user controlled, you have code running in an agent that can interfere with other code running in the same agent.
22:50	<Mathieu Hofman>	> No. I would recommend continuity of the prototype across agents. Constructor doesn't actually matter.
By continuity I mean recognition of the prototype object's identity. It doesn't need to be the same in multiple agents, obviously, but an object of the type registered in agent1 and sent to agent2 is expected to have the same prototype as an object of the "same type" created in agent2. I am not convinced we need that.
22:51	<Mathieu Hofman>	I believe we could get away with duck typing here
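One reading of "duck typing" here is that behavior lives in free functions that only depend on the fields a value carries, so no prototype has to be recognized across agents. The sketch below is an assumption about what that could look like; `distanceBetween` and the plain objects standing in for struct values are hypothetical.

```javascript
// Free function that only relies on the `x` and `y` fields being present.
function distanceBetween(a, b) {
  const dx = a.x - b.x;
  const dy = a.y - b.y;
  return Math.sqrt(dx * dx + dy * dy);
}

// Stand-in for a struct received from another agent: data only, no prototype.
const received = Object.assign(Object.create(null), { x: 0, y: 0 });
// Stand-in for a value created locally.
const local = { x: 3, y: 4 };

console.log(distanceBetween(received, local)); // 5
```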
22:52	<rbuckton>	Basically, imagine this: A Struct Type has a type identity (be it system or user produced). A Struct instance has a private slot containing that type identity. An Agent has a mapping of type identity to a prototype. Agent A registers a Struct type (`Foo`) for a given type identity and a prototype defined in a realm in Agent A. Agent B registers a Struct type (`Foo`) for the same type identity and a prototype defined in a realm in Agent B. Agent A constructs an instance of Agent A's struct type `Foo`, sends it to Agent B. Agent B receives the struct value. When Agent B performs ToObject on the struct value, it looks up the type identity in its internal slot in Agent B's registry to find the prototype. Agent B constructs an instance of Agent B's struct type `Foo`, sends it to Agent A. Agent A receives the struct value. When Agent A performs ToObject on the struct value, it looks up the type identity in its internal slot in Agent A's registry to find the prototype.
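The flow described above can be modeled without any engine support. In this sketch each "agent" is just an object holding a private `Map`, a struct value is a plain object carrying a `typeId`, and `view` stands in for the prototype lookup that the message places in ToObject. All of these names, and the string identity `"example:Vector2D"`, are illustrative assumptions. The view copies the fields, whereas a real shared struct would expose the shared storage itself.

```javascript
// Each simulated agent owns a private Map from type identity to prototype.
function createAgent(name) {
  const prototypes = new Map();
  return {
    name,
    register(typeId, prototype) {
      prototypes.set(typeId, prototype);
    },
    // Stand-in for the per-agent prototype lookup on a received struct value.
    view(structValue) {
      const proto = prototypes.get(structValue.typeId) ?? null;
      return Object.assign(Object.create(proto), structValue.data);
    },
  };
}

const VECTOR_ID = "example:Vector2D"; // hypothetical type identity

const agentA = createAgent("A");
const agentB = createAgent("B");

// Each agent binds its own prototype object to the same identity.
const protoA = { describe() { return `A sees (${this.x}, ${this.y})`; } };
const protoB = { describe() { return `B sees (${this.x}, ${this.y})`; } };
agentA.register(VECTOR_ID, protoA);
agentB.register(VECTOR_ID, protoB);

// "Sending" is modeled as handing the same value to the other agent.
const sent = { typeId: VECTOR_ID, data: { x: 1, y: 2 } };
const seenByB = agentB.view(sent);
const seenByA = agentA.view(sent);

console.log(seenByB.describe()); // "B sees (1, 2)"
console.log(Object.getPrototypeOf(seenByA) === protoA); // true
```

Neither registry is ever sent to the other agent; only the identity stored on the value crosses the boundary.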
22:53	<rbuckton>	The type identity we transfer from A to B, or vice versa, could also encode the expected shape of the struct type. That way, if Agent A and Agent B disagree on the shape associated with the type identity, that error would be thrown on prototype lookup in ToObject.
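A rough model of a shape-carrying identity, assuming the identity is an `{ id, shape }` pair where `shape` is the comma-joined field list. The representation, `createAgent`, and `lookupPrototype` are hypothetical; the message does not say how the shape would be encoded.

```javascript
// Hypothetical model: a type identity carries both an id and a shape signature.
function createAgent(name) {
  const entries = new Map(); // id -> { shape, prototype }
  return {
    register(id, fields, prototype) {
      const shape = fields.join(",");
      entries.set(id, { shape, prototype });
      return { id, shape }; // identity that travels with instances
    },
    lookupPrototype(identity) {
      const entry = entries.get(identity.id);
      if (entry === undefined) return null;
      if (entry.shape !== identity.shape) {
        throw new TypeError(
          `agent ${name}: "${identity.id}" expected shape [${entry.shape}] ` +
            `but received [${identity.shape}]`
        );
      }
      return entry.prototype;
    },
  };
}

const agentA = createAgent("A");
const agentB = createAgent("B");
const identityFromA = agentA.register("example:Point", ["x", "y"], {});
agentB.register("example:Point", ["x", "y", "z"], {}); // disagrees on the shape

let mismatch = null;
try {
  agentB.lookupPrototype(identityFromA); // fails at lookup time, not at send time
} catch (err) {
  mismatch = err;
}
console.log(mismatch instanceof TypeError); // true
console.log(agentA.lookupPrototype(identityFromA) !== null); // true
```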
22:54	<rbuckton>	Neither agent needs to communicate their registry to the other agents, thus no global registry.
22:55	<rbuckton>	If Agent A sends a struct value to Agent B that B doesn't have registered, Agent B could still allow access to the data, just not a prototype walk.
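The unregistered case can be sketched the same way: the receiving side finds no prototype for the identity and produces a data-only view. Modeling "no prototype walk" as a `null` prototype is an assumption of this sketch, as are the `view` helper and the `"example:Unregistered"` identity.

```javascript
// This agent has not registered any struct types.
const localRegistry = new Map();

// Hypothetical lookup: fall back to a null prototype when nothing is registered.
function view(structValue) {
  const proto = localRegistry.get(structValue.typeId) ?? null;
  return Object.assign(Object.create(proto), structValue.data);
}

const unknown = view({ typeId: "example:Unregistered", data: { x: 1, y: 2 } });

console.log(unknown.x + unknown.y); // 3
console.log(Object.getPrototypeOf(unknown)); // null
console.log("toString" in unknown); // false
```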
22:57	<rbuckton>	> By continuity I mean recognition of the prototype object's identity. It doesn't need to be the same in multiple agents, obviously, but an object of the type registered in agent1 and sent to agent2 is expected to have the same prototype as an object of the "same type" created in agent2. I am not convinced we need that.
The prototype shape on each side doesn't matter. Just the type identity. I could have a `Foo { x, y, bar() {} }` on A and a `Bar { x, y, quxx() {} }` on B registered to the same type identity. If I create a `Foo` on A and send it to B, B will see it as a `Bar`. If I construct a `Bar` on B and send it to A, A will see it as a `Foo`.
22:58	<rbuckton>	What I care about is consistency in round-tripping.
22:58	<Mathieu Hofman>	My concern is with regards to who mints the type identity, as that grants the right to register it. A forgeable value allows code that doesn't trust each other to interfere with each other. That code could be running in the same realm. If the type is minted by the engine, how is it recovered by the code?
22:58	<rbuckton>	And the `Foo` vs `Bar` idea isn't farfetched, bundlers can and do tree shake methods, and can and do rename classes.
23:00	<rbuckton>	> My concern is with regards to who mints the type identity, as that grants the right to register it. A forgeable value allows code that doesn't trust each other to interfere with each other. That code could be running in the same realm. If the type is minted by the engine, how is it recovered by the code?
In the real proposal structs are top-level declarations. Evaluation and registration of structs occurs mostly during initial module loading. Defensive JavaScript does what it always does today, which is to ensure the code it trusts runs first.
23:01	<Mathieu Hofman>	Why would a struct declaration be top level?
23:01	<rbuckton>	You error on conflicts rather than silently allowing them, so most applications will fail immediately if malicious code tries to forge an identity.
23:01	<Mathieu Hofman>	And how does that even matter to type identity
23:01	<rbuckton>	It would have to be if you want to use module id + offset or something as a default type identity.
23:02	<Mathieu Hofman>	How would that be compatible with bundlers?
23:02	<rbuckton>	If a function can return a `struct` declaration that produces a different object identity for each function call, there'd be no way to differentiate them.
23:03	<rbuckton>	How would that be compatible with bundlers? It isn't, that's why you'd also need a way to override the type identity. But having a working default behavior is also a good idea.
23:04	<Mathieu Hofman>	I'm just saying. I don't see how you can have an unforgeable type identity generated or derived by the engine, and have automatic mapping of these types across agents in a way that allows for prototype continuity as you described earlier
23:05	<Mathieu Hofman>	Specifying the type identity when registering must require an unforgeable value, that obviously must have a stable identity when sent between agents
23:05	<rbuckton>	I don't think the type identity being "unforgeable" is important. If you error on an attempted redefinition, and those errors occur during application startup, then we do what we always do and run code we rely on first.
23:07	<rbuckton>	Each struct type needs a type identity. For the bundler case, you could supply one. For the "I'm just loading the same module from the same path in the main thread and the worker" case, you could rely on module id and source text offset within the module as a stable identity.
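The two cases can be illustrated with a small helper that prefers an explicit identity and otherwise derives one from the module id and source offset. The helper name `typeIdentityFor`, the `moduleId#offset` string format, and every path and offset below are made up for illustration.

```javascript
// Hypothetical helper: choose a type identity for a struct declaration.
function typeIdentityFor({ moduleId, sourceOffset, explicitId }) {
  if (explicitId !== undefined) return explicitId; // override, e.g. set by a bundler
  return `${moduleId}#${sourceOffset}`; // default derived from the declaration site
}

// Same module loaded from the same path on the main thread and in a worker.
const onMain = typeIdentityFor({ moduleId: "file:///app/vector2d.js", sourceOffset: 120 });
const onWorker = typeIdentityFor({ moduleId: "file:///app/vector2d.js", sourceOffset: 120 });
console.log(onMain === onWorker); // true

// Separate bundles place the declaration at different paths and offsets,
// so the default identities no longer line up.
const inMainBundle = typeIdentityFor({ moduleId: "file:///dist/main.js", sourceOffset: 4096 });
const inWorkerBundle = typeIdentityFor({ moduleId: "file:///dist/worker.js", sourceOffset: 812 });
console.log(inMainBundle === inWorkerBundle); // false

// Supplying the same explicit identity in both bundles restores the association.
const mainOverride = typeIdentityFor({
  moduleId: "file:///dist/main.js",
  sourceOffset: 4096,
  explicitId: "example:Vector2D",
});
const workerOverride = typeIdentityFor({
  moduleId: "file:///dist/worker.js",
  sourceOffset: 812,
  explicitId: "example:Vector2D",
});
console.log(mainOverride === workerOverride); // true
```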
23:07	<Mathieu Hofman>	A forgeable value will not be acceptable. An error is at best a denial of service for code that doesn't trust each other
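The denial-of-service concern can be made concrete with a registry that rejects any second registration of an identity. In this sketch the `register` function and the guessable string identity are hypothetical; the point is only that whichever code runs first owns the identity.

```javascript
const registry = new Map(); // type identity -> prototype

function register(id, prototype) {
  if (registry.has(id)) {
    throw new TypeError(`type identity "${id}" is already registered`);
  }
  registry.set(id, prototype);
}

// Untrusted code that happens to run first claims a guessable identity.
register("example:Vector2D", { distanceTo() { return NaN; } });

// The legitimate package loads later and can no longer bind its own prototype.
let blocked = false;
try {
  register("example:Vector2D", {
    distanceTo(v) { return Math.hypot(this.x - v.x, this.y - v.y); },
  });
} catch (err) {
  blocked = err instanceof TypeError;
}
console.log(blocked); // true
```

The error surfaces the conflict early, but the legitimate code still cannot proceed.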
23:08	<rbuckton>	Then the registry need not be per-agent, but per-realm, just like for `Number`, `String`, etc. Have untrusted code run in another realm/compartment/etc.
23:08	<Mathieu Hofman>	Again there can be code that doesn't trust each other in the same realm
23:09	<Mathieu Hofman>	At best you could have this registry per compartment
23:12	<Mathieu Hofman>	This type registry you propose is novel in the 262 world, and it does break precedent.