TC39 Structs and Shared Structs on 2023-01-27

01:17	<Mathieu Hofman>	So a shared struct instance has a stable identity across `postMessage`, right? To expand on my earlier idea, I think we don't need any stable identity for module blocks or even symbols; we can just use shared structs themselves to dynamically attach behavior to a shared struct kind, given some built-in wiring. Here is a gist where I explore that approach: https://gist.github.com/mhofman/aa23fcc88e1ccd031a3c34f88577eaf7 It does not require any new syntax, or extra magic in postMessage (like module blocks, or symbol identities being preserved). It only requires an automatically generated static property on shared struct classes that represents the kind, and some built-in behavior, plus the dynamic prototype lookup we discussed of course. I'm actually wondering if this could be prototyped (pun intended) in the current experiment.
10:39	<Ashley Claymore>	Right, that would be similar to having 'known' static functions that operate on the type, where each agent individually imports the module to use those functions, except with the power to register those functions per-realm/per-agent as the prototype to use for that 'type', which gives the capability of method dispatch (also method chaining, but the `|>` operator would also give that). 'type' here being the identity created by the `shared struct class` syntax (the fan-out case). For the fan-in case, where a farm of workers start up on their own, each creating their own separate `shared struct class` (from the same module URL), as instances are sent to a sink, the sink would need to register that one 'prototype' with the multiple 'type's received from the workers.
11:54	<Mathieu Hofman>	Right, the fan-in case would work too; there just would be different constructors and kinds, each of which would have to be set to use the same prototype maker, or prototype implementation if the prototype doesn't care about exposing the realm-local constructor. You're right, at the end of the day this boils down to the capability of setting the dynamic prototype to use for instances.
12:21	<Ashley Claymore>	and to also capture a bit of what was discussed on the call last night:
12:21	<Ashley Claymore>	the goal is to have: the instance of a shared struct to exist in shared memory, and the reference to this is passed around directly. There is not a per-agent wrapper adding a layer of indirection for prop access.
12:24	<Ashley Claymore>	if there is a dynamic lookup when [[Prototype]] is accessed, and the returned functions come from the calling realm, then for a shared struct passed between realms (node VM, same-origin iframe, etc.) the value returned by `getPrototypeOf` can observably change, which violates the current description of the sealed integrity level.
12:26	<Ashley Claymore>	cont: if the lookup is 'cached' per-agent, maybe on first read, then this implies there is additional memory usage for a per-agent-per-instance cache
12:26	<Mathieu Hofman>	"the goal is to have: the instance of a shared struct to exist in shared memory, and the reference to this is passed around directly. There is not a per-agent wrapper adding a layer of indirection for prop access." I have actually been wondering about that, and whether that's an observable thing from the 262 spec point of view. The only program-observable aspect of this is the preservation of identity through cross-agent interactions, which is host-defined anyway.
12:27	<Ashley Claymore>	yes unlikely to be observable, but if that is an implementation goal then it limits which semantics are performant/simple/memory-efficient etc
12:28	<Mathieu Hofman>	"cont: if the lookup is 'cached' per-agent, maybe on first read, then this implies there is additional memory usage for a per-agent-per-instance cache" Not really. If the prototype is set once, and maybe throws when accessed before it is set, then it's arguably sealed for that agent. As I mentioned, the fact there is a single reference shared between agents is an implementation detail IMO
12:28	<Mathieu Hofman>	Quoted the wrong message
12:29	<Mathieu Hofman>	As for per-instance memory, I believe it'd only be per-kind / per-type memory, not per-instance
15:46	<rbuckton>	"So a shared struct instance has a stable identity across `postMessage`, right? To expand on my earlier idea, I think we don't need any stable identity for module blocks or even symbols; we can just use shared structs themselves to dynamically attach behavior to a shared struct kind, given some built-in wiring. Here is a gist where I explore that approach: https://gist.github.com/mhofman/aa23fcc88e1ccd031a3c34f88577eaf7 It does not require any new syntax, or extra magic in postMessage (like module blocks, or symbol identities being preserved). It only requires an automatically generated static property on shared struct classes that represents the kind, and some built-in behavior, plus the dynamic prototype lookup we discussed of course. I'm actually wondering if this could be prototyped (pun intended) in the current experiment." Is there a reason the identity needs to be a static property, or surfaced to the user at all, as opposed to an internal slot?
15:58	<Ashley Claymore>	allows more userland solutions/experimentation maybe?
15:59	<Ashley Claymore>	though I guess that itself can be done in userland: `const id = new shared struct class ID {}; shared struct class SSC { __type__ = id }`
15:59	<rbuckton>	I'm not certain that's necessary, at least not for an MVP.
15:59	<Mathieu Hofman>	I did this so the program could attach the dynamic prototype without magic
16:00	<Mathieu Hofman>	We can always try to find more ergonomic ways, but this is flexible for experimenting
16:00	<rbuckton>	It just seems a bit like an overcomplication, IMO.
16:01	<Mathieu Hofman>	If you have an alternative I'm all ears
16:02	<Mathieu Hofman>	All the solutions I've heard so far rely on more syntax that doesn't exist today
16:43	<Mathieu Hofman>	"though I guess that itself can be done in userland: `const id = new shared struct class ID {}; shared struct class SSC { __type__ = id }`" Btw, setting the shared/dynamic prototype is the critical part, which cannot be done in userland. And since this is leveraging shared structs themselves to describe the type, for their identity-preserving feature, the whole thing needs to be bootstrapped in the engine too
16:47	<Ashley Claymore>	yep. I was imagining a userland experimental library where the prototype is registered with the library, and then the shared struct is passed to a `wrap` function that returns a proxy wrapper for it which adds the proto lookup (but loses the ability to be structured cloned)
16:48	<Mathieu Hofman>	Oh yeah you can totally do this in userland with Proxies
16:48	<Mathieu Hofman>	at the expense of per realm proxy instances
16:49	<Ashley Claymore>	and code wouldn't be able to magically pass that proxied wrapper to another agent, they would know it needs to be unwrapped again
16:49	<Mathieu Hofman>	minus the postMessage identity preserving logic, but that can be emulated by wrapping postMessage, which gets hairy quickly, and makes it impossible to do cross agent gc of course
16:50	<Ashley Claymore>	or structuredClone would need a new handler similar to `toJSON` where it can extract out the struct automatically
16:50	<Mathieu Hofman>	aka no cycle collection. if no cycle, you can use weakrefs
16:50	<Mathieu Hofman>	nah that really needs to be in the engine (at least until I get to propose my API to support distributed GC)
16:51	<Mathieu Hofman>	I have been toying with identity preservation through postMessage for a few years now, that's what got me interested in TC39 in the first place
16:52	<Mathieu Hofman>	(yes I know different standard groups, but the gc API needs to be in the language)
16:52	<Mathieu Hofman>	wrapping postMessage is no fun, it's very inception
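The userland Proxy approach discussed above (register a prototype with a library, then `wrap` a struct to add the prototype lookup) might be sketched roughly as follows. This is only an illustration under stated assumptions: shared structs are not available in standard engines, so sealed plain objects stand in for them, and `registerPrototype`, `wrap`, `unwrap`, and the `__type__` field are hypothetical names, not part of any proposal or library.

```javascript
// Hypothetical userland library; sealed plain objects stand in for shared structs.
const prototypes = new WeakMap(); // type identity -> realm-local prototype
const wrappers = new WeakMap();   // struct -> its proxy in this realm
const targets = new WeakMap();    // proxy -> underlying struct

function registerPrototype(type, prototype) {
  prototypes.set(type, prototype);
}

function wrap(struct) {
  const existing = wrappers.get(struct);
  if (existing !== undefined) return existing; // one proxy per struct per realm
  const proxy = new Proxy(struct, {
    get(target, key, receiver) {
      // Own data fields are read straight from the struct.
      if (Object.hasOwn(target, key)) return Reflect.get(target, key);
      // Everything else is looked up on the registered prototype, if any.
      const proto = prototypes.get(target.__type__);
      return proto === undefined ? undefined : Reflect.get(proto, key, receiver);
    },
  });
  wrappers.set(struct, proxy);
  targets.set(proxy, struct);
  return proxy;
}

// The proxy itself cannot be sent to another agent; unwrap it first.
function unwrap(proxy) {
  return targets.get(proxy);
}

const vectorType = Object.freeze({}); // stand-in for a shared type identity
registerPrototype(vectorType, {
  distanceTo(v) {
    const dx = this.x - v.x;
    const dy = this.y - v.y;
    return Math.sqrt(dx * dx + dy * dy);
  },
});

const v1 = Object.seal({ __type__: vectorType, x: 1, y: 2 });
const v2 = Object.seal({ __type__: vectorType, x: 3, y: 4 });
const distance = wrap(v1).distanceTo(wrap(v2));
```

The sketch shows the costs named in the discussion: each realm holds its own proxy per struct, and code has to call `unwrap` before handing the struct to `postMessage`. It makes no attempt at cross-agent identity preservation or garbage collection.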
16:53	<Ashley Claymore>	right now I kinda like the idea that shared structs that want a prototype are top-level-const exported `export shared struct class Foo {}`, and the module they are declared in is what is 'attached' to the 'type' as an internal slot, for other agents to load. It could have an overlap with the import-defer-eval proposal, where the module is sync loaded on the first prototype access, to be lazy and reduce the cost when the methods are not accessed by other agents
16:55	<Mathieu Hofman>	yeah maybe that's what the ergonomic solution ends up looking like, but module blocks and/or import defer do not exist today if shu wants to experiment with something right away. My proposal is about enabling this with what we have today, and the dynamic prototype lookup we'll need anyway
16:55	<Mathieu Hofman>	no need to mess with module logic
16:56	<Ashley Claymore>	(I don't think) this would need module blocks, the part that is attached to the internal slot can be, as littledan said, a URL string
16:58	<Mathieu Hofman>	I'm personally not a fan of tying module specifier strings into the solution
17:21	<rbuckton>	Is there any reason the origin trial API couldn't be extended to provide a registration mechanism for the shared struct type with a user-provided unique ID (just a string, but could contain a URI, UUID, etc.) and the thread-local prototype to use?
    // + fairly flexible
    // - requires setup on receiver end, correlation of struct type identity.
    // - requires encapsulation for constructor logic

    // data.js
    // data only structs don't require registration and have no shared type identity.
    export const DataOnly = new SharedStructType(["foo"]);

    // vector2d.js
    // data+behavior structs require registration to define type identity
    const { type: _Vector2D, register } = SharedStructType.prepare(["x", "y"]);

    // custom construction behavior
    export function Vector2D(x = 0, y = 0) {
      const _this = Reflect.construct(_Vector2D, [], new.target);
      _this.x = x;
      _this.y = y;
      return _this;
    }

    // prototype methods
    Vector2D.prototype.distanceTo = function (v) {
      const dx = this.x - v.x;
      const dy = this.y - v.y;
      return Math.sqrt(dx * dx + dy * dy);
    };

    // Register the type identity, constructor, and prototype to use for this struct type in this thread.
    // NOTE: register(id, constructor [, prototype])
    register("e3a9bd1f-0f64-4848-b255-3c629d0c44a3", Vector2D, Vector2D.prototype);

    // main.js
    import { Worker } from "node:worker_threads";
    import { DataOnly } from "./data.js";
    import { Vector2D } from "./vector2d.js";

    const data = new DataOnly();
    data.foo = "data only, no behavior";
    const v1 = new Vector2D(1, 2);
    const v2 = new Vector2D(3, 4);

    const worker1 = new Worker("./worker1.js");
    worker1.postMessage([data, v1, v2]);
    const worker2 = new Worker("./worker2.js");
    worker2.postMessage([data, v1, v2]);

    // worker1.js
    // worker1 does not import Vector2D, so can only access its data.
    import { parentPort } from "node:worker_threads";

    parentPort.on("message", ([data, v1, v2]) => {
      data.foo; // "data only, no behavior"
      v1.x; // 1
      v2.y; // 4
      // NOTE: prototype not registered. This could mean an invalid prototype chain (thus every non-data
      // member access would throw), or a default prototype chain (where prototype methods throw by nature
      // of them just being `undefined`).
      v1.distanceTo(v2); // error
      v1.toString(); // error?
    });

    // worker2.js
    // worker2 imports Vector2D, which causes registration as a side-effect.
    import { parentPort } from "node:worker_threads";
    import "./vector2d.js";

    parentPort.on("message", ([data, v1, v2]) => {
      data.foo; // "data only, no behavior"
      v1.x; // 1
      v2.y; // 4
      v1.distanceTo(v2); // ok
      v1.toString(); // ok
    });
The user-supplied type identity would allow the user to define the same struct in different bundles, and the registry wouldn't be global, but would rather be thread-local, so there'd be no global mutable registry to worry about. It also doesn't really matter if the prototypes differ slightly between realms/threads, since they all access the same underlying data. We could eventually extend this to syntax, possibly even using decorators:
    // data.js
    export shared struct DataOnly {
      foo;
      // NOTE: constructors don't necessarily require registration
      constructor(foo) { this.foo = foo; }
    }

    // vector2d.js
    // a syntactic declaration doesn't necessarily need to set a type identity, since
    // we could infer one for it based on path to the containing file and its offset within
    // the file. However, we could still allow the user to explicitly specify type identity
    // for use with bundlers, or other cases where the inferred type identity might not
    // be sufficient.
    @struct.id("e3a9bd1f-0f64-4848-b255-3c629d0c44a3")
    // or: @struct.id("https://babylonjs.com/5.0/Vector2D")
    export shared struct Vector2D {
      constructor(x, y) { this.x = x; this.y = y; }
      distanceTo(v) {
        const dx = this.x - v.x;
        const dy = this.y - v.y;
        return Math.sqrt(dx * dx + dy * dy);
      }
    }
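The thread-local, string-keyed registration described above can be simulated in plain JavaScript to show the difference between the two workers. This is a sketch under loud assumptions: `ThreadRegistry`, its methods, and the `typeId` field are hypothetical stand-ins, plain objects play the role of shared structs, and each registry instance plays the role of one thread. It also picks the "default prototype chain" option, where an unregistered method is simply `undefined`.

```javascript
// Simulation only: each ThreadRegistry instance stands in for one thread.
class ThreadRegistry {
  #prototypes = new Map(); // user-supplied type id (string) -> thread-local prototype

  register(id, prototype) {
    this.#prototypes.set(id, prototype);
  }

  // Resolve a member the way a dynamic prototype lookup might: own data
  // first, then the prototype registered in *this* thread, if any.
  get(struct, key) {
    if (Object.hasOwn(struct, key)) return struct[key];
    const proto = this.#prototypes.get(struct.typeId);
    return proto?.[key];
  }

  // Call a method resolved through the thread-local prototype.
  call(struct, key, ...args) {
    const fn = this.get(struct, key);
    if (typeof fn !== "function") {
      throw new TypeError(`${String(key)} is not a function`);
    }
    return fn.apply(struct, args);
  }
}

const VECTOR2D_ID = "e3a9bd1f-0f64-4848-b255-3c629d0c44a3";
const vectorPrototype = {
  distanceTo(v) {
    const dx = this.x - v.x;
    const dy = this.y - v.y;
    return Math.sqrt(dx * dx + dy * dy);
  },
};

// Stand-ins for two shared struct instances visible to every thread.
const v1 = { typeId: VECTOR2D_ID, x: 1, y: 2 };
const v2 = { typeId: VECTOR2D_ID, x: 3, y: 4 };

const worker1 = new ThreadRegistry(); // never registers: data access only
const worker2 = new ThreadRegistry(); // registers: data plus behavior
worker2.register(VECTOR2D_ID, vectorPrototype);

const dataSeenByWorker1 = worker1.get(v1, "x");
const methodSeenByWorker1 = worker1.get(v1, "distanceTo");
const distanceInWorker2 = worker2.call(v1, "distanceTo", v2);
```

Because the key is a plain string, two bundles that define the same struct separately can agree on behavior, but nothing here stops two unrelated types from claiming the same ID.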
18:02	<Mathieu Hofman>	I fail to see how a user-supplied value provided after shared struct creation would work. Unless you'd somehow remap types you may have already seen. IMO you'd at least need to pass your unique ID as part of the `SharedStructType.prepare` call. But in general I don't like strings for unique IDs, and since we already have objects that carry identity across agents, I thought it'd make sense to reuse them.
18:03	<Mathieu Hofman>	Btw, the prototype attaching I suggested can be made to have a shape more similar to your suggestion above
18:49	<Mathieu Hofman>	// vector2d.js
    // Each shared struct type, whether data only or "prepared" has its own unique type
    export const vector2DType = SharedStructType.prepare(["x", "y"]);
    const _Vector2D = SharedStructType.getConstructor(vector2DType);

    // custom construction behavior
    export function Vector2D(x = 0, y = 0) {
      const _this = Reflect.construct(_Vector2D, [], new.target);
      _this.x = x;
      _this.y = y;
      return _this;
    }

    // prototype methods
    Vector2D.prototype.distanceTo = function (v) {
      const dx = this.x - v.x;
      const dy = this.y - v.y;
      return Math.sqrt(dx * dx + dy * dy);
    };

    SharedStructType.registerPrototype(vector2DType, Vector2D.prototype);

    // main.js
    import { Worker } from "node:worker_threads";
    import { Vector2D, vector2DType } from "./vector2d.js";

    const v1 = new Vector2D(1, 2);
    const worker = new Worker("./worker.js");
    worker.postMessage([vector2DType, v1]);

    // worker.js
    // worker imports Vector2D, which causes registration as a side-effect.
    import assert from "node:assert";
    import { parentPort } from "node:worker_threads";
    import { Vector2D, vector2DType } from "./vector2d.js";

    const v2 = new Vector2D(3, 4);

    parentPort.on("message", ([mainVector2DType, v1]) => {
      SharedStructType.registerPrototype(mainVector2DType, Vector2D.prototype);
      assert(mainVector2DType !== vector2DType);
      assert(
        SharedStructType.getConstructor(mainVector2DType) !==
          SharedStructType.getConstructor(vector2DType)
      );
      assert(v1 instanceof Vector2D); // by virtue of sharing a prototype
      v1.x; // 1
      v1.distanceTo(v2); // ok
      v1.toString(); // ok
    });
`SharedStructType.getConstructor()` is needed to allow the program to avoid duplicating type definitions in each agent/realm. I think the burden of such deduplication should be on the program, not on the engine. If the engine had to do the deduplication itself, you'd either need a user-provided type identifier at "prepare" time, and a global lock around such type definitions, or you'd need to somehow be able to collapse separate definitions into a single one. In either case you'd have to figure out what to do if the shape definition does not match, and in the case of definition collapse, how do you communicate the error to the program? By having the type definition generate the unique type identifier, you avoid all those complications in the engine, at the cost of putting more type hydration burden on the program.
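The identity-based variant, where the engine never deduplicates and the program registers one prototype against several distinct types, can likewise be simulated in plain JavaScript. This is a sketch under stated assumptions: `prepare`, `construct`, `registerPrototype`, and `prototypeOf` here are hypothetical stand-ins written for this example, a frozen object plays the role of an engine-created type whose identity would survive `postMessage`, and a single `WeakMap` plays the role of one agent's table.

```javascript
// Simulation only: a frozen object stands in for an engine-created type object.
function prepare(fields) {
  return Object.freeze({ fields: Object.freeze([...fields]) });
}

// Build a sealed stand-in instance that remembers which type created it.
function construct(type, values) {
  const struct = { type };
  for (const field of type.fields) struct[field] = values[field];
  return Object.seal(struct);
}

// One table per agent, keyed by type identity rather than by a string.
const agentPrototypes = new WeakMap();

function registerPrototype(type, prototype) {
  agentPrototypes.set(type, prototype);
}

function prototypeOf(struct) {
  return agentPrototypes.get(struct.type);
}

const vector2DPrototype = {
  distanceTo(v) {
    const dx = this.x - v.x;
    const dy = this.y - v.y;
    return Math.sqrt(dx * dx + dy * dy);
  },
};

// Two separately prepared types with the same shape (the fan-in case).
const mainType = prepare(["x", "y"]);
const workerType = prepare(["x", "y"]);

// The receiving agent maps both identities to the same local prototype.
registerPrototype(mainType, vector2DPrototype);
registerPrototype(workerType, vector2DPrototype);

const v1 = construct(mainType, { x: 1, y: 2 });
const v2 = construct(workerType, { x: 3, y: 4 });
const distance = prototypeOf(v1).distanceTo.call(v1, v2);
```

The two types stay distinct, so there is no shape-mismatch case for the engine to resolve. The cost is the extra `registerPrototype` call for every type identity an agent receives.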