2023-01-19 [16:31:10.0048] i do plan to have the call tomorrow if anyone's interested [16:31:19.0088] on my agenda is to talk about method sharing and prototype lookup [16:33:45.0175] I mentioned to Ashley the other day it might be worth for him to join to discuss if there may be any similarities with the R&T proposal as was hinted by some during the last plenary. Will remind him of the call tomorrow [16:36:24.0319] indeed, that sounds great, thanks! [04:38:07.0364] Thanks Mathieu Hofman ! I am likely to be on a train home during the call, but will try and listen in on my mobile [07:22:46.0511] doh, a last minute conflict arose (apartment repair people) [07:23:55.0649] Ashley Claymore: if i move this one to the 26th 1 hour earlier (9am PT), would that work better for you? [09:00:24.0264] OK, are we meeting now? [09:02:00.0657] shu: ^ [09:06:39.0301] OK, I take it not [09:08:36.0005] Moved to next week on the TC39 events cal now [09:13:28.0619] Also feel free to have at the original time next week (10am PT), I'm getting an earlier train than I would usually today :) [10:28:55.0651] littledan: sorry about short notice, last minute conflict 2023-01-26 [17:17:59.0236] planned agenda for tomorrow's meeting: - method sharing and prototype lookup - property redefinition (and freezing) - R&T interaction [17:25:48.0362] Unfortunately I don't think I will be able to attend tomorrow due to a conflicting appointment. I don't have much to report, Wasm GC is moving towards getting a formal spec (& JS API spec) so if anything relevant comes out of that I'll report in future meetings. [17:28:52.0881] thanks asumu [18:14:18.0164] > <@aclaymore:matrix.org> Also feel free to have at the original time next week (10am PT), I'm getting an earlier train than I would usually today :) ^^ if missed before. 
Feel free to keep the 10am time, if that works better for asumu . My clash last week was not reoccurring [07:54:34.0449] > <@shuyuguo:matrix.org> planned agenda for tomorrow's meeting: > > - method sharing and prototype lookup > - property redefinition (and freezing) > - R&T interaction I'm also curious if we can leverage `using` with Mutex/ConditionVariable, since that was one of the reasons I focused on getting that proposal to Stage 3. [08:50:52.0388] sure, sounds good [09:04:29.0034] Ashley Claymore: the call is now, btw [09:05:07.0721] omw [09:05:24.0982] sorry, actually 1 min [09:05:29.0556] np [10:39:16.0884] Regarding the "shared modules" suggestion. I may have jumped a few steps ahead in my reasoning without explaining how I got there, so I'll take a few steps back. If we imagine an implementation of shared structs that contains some form of methods, what can those methods close over? Only globals? What about functions that are siblings to the shared struct? What about imports? If we do not close over these things, the methods of shared structs won't be able to reuse useful utilities such as common vector math operations you might use in a 3D graphics library, that aren't somehow patched into `globalThis`. If we do close over these things, how do we ensure the module graph has been instantiated on the worker thread? What about initialization logic that might be needed that wires together some of these modules, applies polyfills, etc.? What if these modules contain top-level `await` or must otherwise be loaded asynchronously? How and when do we instantiate the thread-local (or per-realm?) prototype for each struct, especially if doing so might kick off this kind of module loading in the worker? The leap in logic I took to the "shared module" approach was to address these concerns: - "Shared modules" would have an identity in the module cache that could be used as part of the type identity. - "Shared modules" would promote code reuse. 
- "Shared modules" would be restricted to containing only those things that can safely be shared, i.e. references to globals (which would be re-bound per realm), local functions, variables, shared structs, and imports/exports from other "shared modules". - Due to the nature of the above restrictions, "shared modules" avoid accidental references to non-shared code. It could be that the idea of isolating shared code to its own file is overkill, but many developers are already used to doing this with things like `protobuf` today (i.e., maintaining their `protobuf` schema in a separate file). [10:52:26.0457] My answer to this closing over/loading question was, the other side which receives the object has three options for handling module loading: - The receiving side already expected that the object would come, so the module where the shared struct is defined has already been loaded, and the receiving module can start using it immediately. - [I honestly can't think of a use case for this, but ] The receiving side sets itself up to handle objects dynamically, so it queries the object it received for the module specifier, `await import()`s that, and then can use the methods. - The receiving side just wants to use the plain old data, and can do so without importing anything. In all cases, there are no particular restrictions in what is closed over (just by construction because we do this whole dance per module map). And there just is no such thing as shared code, no limitations on mutating the local copy of the shared classes, or on TLA (because no synchronous module loading is ever used, just normal async). It does depend on one or other type of identity (which could be URL, or module block, or symbol if we have a global mapping). What do you see as the downsides of this option? [10:53:03.0871] > What about initialization logic that might be needed that wires together some of these modules, applies polyfills, etc.? 
I don't have a solution to this; I was assuming that you could somehow bake this into the module. [10:53:28.0662] (which doesn't mean necessarily bundling all recursive dependencies! it can have `import` statements like normal.) [10:57:10.0628] I honestly don't understand how "shared modules" would work in detail--how they would differ from this, beyond being a subset requiring only recursive use of shared modules [11:53:32.0432] > The receiving side already expected that the object would come, so the module where the shared struct is defined has already been loaded, and the receiving module can start using it immediately. This is probably reasonable, as long as you can reliably correlate a shared struct type in both realms by module id and export name. However, that would potentially restrict shared struct definitions to only be at the top level of a _Module_, since returning them from a function call might not necessarily result in the same identity being valid. [11:54:11.0635] > [I honestly can't think of a use case for this, but ] The receiving side sets itself up to handle objects dynamically, so it queries the object it received for the module specifier, await import()s that, and then can use the methods. This seems like a poor developer experience. [12:04:32.0903] The "shared module" I was imagining would be fairly restrictive so as to have the declarations only really be resident in memory once, and not reparsed/linked/evaluated per-realm. No per-realm initialization, variables limited to constant, primitive values (and trivially reduceable expressions containing primitives), only top-level declarations: structs, functions, vars, imports/exports, maybe enums (if we can find a version of that proposal that might be accepted). Such a module could be accessed via module id, and reachable from any worker/realm. Evaluating functions/methods/constructors/etc. from a "shared module" would use the current realm. 
Struct type identity would be trivially resolvable via _module id_+_export name_, producing the correct prototype in each realm. [12:05:57.0709] A fourth approach, which I'm trying to enable with this design, is that the receiving end doesn't need to worry about running code to support the struct since it's easily reachable. [12:10:04.0496] Yes, the setup is more restrictive due to the limitations imposed by a "shared module", but it also avoids many pitfalls like developers inadvertently depending on thread-local or realm-local state in shared code. The benefit being that _consuming_ shared structs is simple and intuitive. You just send the value via `postMessage` and can use it immediately in the worker without any added fuss. [12:12:05.0182] OK, so this is trying to solve the stronger version of the problem that you explained [12:13:09.0999] > <@rbuckton:matrix.org> This is probably reasonable, as long as you can reliably correlate a shared struct type in both realms by module id and export name. However, that would potentially restrict shared struct definitions to only be at the top level of a _Module_, since returning them from a function call might not necessarily result in the same identity being valid. Yes, shared structs which define methods that are supposed to be accessible from other agents need to be defined at the top level of a module. I agree that this is a significant restriction. [12:13:38.0472] > <@rbuckton:matrix.org> This seems like a poor developer experience. It's hard for me to evaluate how bad it is without understanding the use cases for this scenario. [12:13:39.0686] Yes. It's trying to impose restrictions on what a shared struct can reference so as to make the rest of the system simple and intuitive. 
[12:14:37.0267] I just can't construct the scenario in my head where it wouldn't be natural to directly import the module defining the shared struct, when you expect to receive it in postMessage [12:15:36.0735] I came at this from the perspective of: "Let's say we wanted to implement `Number` as a shared struct, from the ground up, what would we need to do?" (excl. operator overloading) [12:16:32.0406] Would you want everyone to need to write `import "std:number";` in their module to receive a number via `postMessage`? [12:17:29.0256] While that's an interesting lens, I like to think of those things being in an implicit "prelude". (The same logic applies for the operator overloading usage declarations, for example) [12:18:02.0685] so I guess I would go for, "let's see if we can implement `Number` *except* for that specific import statement" [12:18:04.0414] You'd need to use a side-effecting `import "structModule"` if you never access the constructor yourself, otherwise minifiers will tree shake it away. [12:19:22.0280] Which just adds one more source of potential confusion when things don't work in your bundled, minified release build. [12:19:34.0191] Ah, I hadn't really considered tree shaking [12:21:21.0780] are there any other problems that come to mind for you besides tree shaking? [12:21:40.0762] As I mentioned in the thread above, depending on an `import` is one more shaky foundation to build on that is a potential pit of failure for developers. A tree shaking minifier might remove the `import`, or would need to perform additional static analysis to know whether it's actually safe to remove the import. [12:22:26.0646] I mean, I think we could teach tree shakers this particular thing: You can't just eliminate running an export of a shared struct, since executing that has a side effect. 
(We'd have to teach the tree shaker about that syntactic construct anyway!) [12:23:19.0710] OK, thanks for explaining; I hadn't considered that [12:23:32.0353] - Remembering to include the `import` - The main process changing the data it sends to the worker (depending on how the app is structured) - Middleware that might run before application code is loaded. [12:24:54.0248] It's not the tree shaking of the `export`, it's the tree shaking of the `import`. That requires looking across files to say "oh, this `import` is from a module that transitively imports a module containing a shared struct that I might potentially receive", which is far more complicated. [12:25:32.0784] > <@rbuckton:matrix.org> It's not the tree shaking of the `export`, it's the tree shaking of the `import`. That requires looking across files to say "oh, this `import` is from a module that transitively imports a module containing a shared struct that I might potentially receive", which is far more complicated. oh, I guess I assumed that this was normal stuff for tree shakers [12:25:53.0602] that a module execution may be known to have a side effect and that that shouldn't be removed [12:25:57.0065] If your app/package contains a single shared struct definition, imports become un-tree-shakable. [12:26:04.0614] right [12:26:40.0538] if this is an issue you could break up the module [12:26:46.0841] That seems bad. [12:27:10.0284] > <@littledan:matrix.org> if this is an issue you could break up the module Which is basically what a "shared module" enforces. [12:27:24.0082] > <@rbuckton:matrix.org> - Remembering to include the `import` > - The main process changing the data it sends to the worker (depending on how the app is structured) > - Middleware that might run before application code is loaded. OK, I guess the badness of that, together with the badness of this comment I'm replying to, is something which I don't have sufficient intuition into. 
[12:27:34.0513] or sufficient practical experience of the negative consequences [12:29:13.0733] Due to their restrictions, "shared modules" have no Evaluation step when the module is loaded. Dependency order becomes far less important (excluding decorators, which I'd have to think more on), so you could just surface the shared module imports in place of whatever other import you might have used that was otherwise removed. [12:44:23.0311] computed property names also do stuff when evaluated, as do, you know, the RHS of an `export const`... I'm pretty skeptical that it'd be practical to articulate a usable-enough subset of JS which doesn't have side effects when loaded. This would be very useful if possible, of course! It'd handle the lazy module loading issue [15:04:47.0631] unrelated sidebar: matrix has threads?? [15:04:49.0620] is this a new feature? [15:05:04.0960] i was wondering why it was showing the channel as having 25 unread messages until i found the thread above [15:43:20.0745] It’s had threads experimentally for a while (opt in) but they recently turned it on by default. 2023-01-27 [17:17:40.0197] So a shared struct instance has a stable identity across `postMessage`, right? To expand on my earlier idea, I think we don't need any stable identity for module blocks or even symbols, we can just use shared structs themselves to dynamically attach behavior to a shared struct kind, given some built-in wiring. Here is a gist where I explore that approach: https://gist.github.com/mhofman/aa23fcc88e1ccd031a3c34f88577eaf7 It does not require any new syntax, or extra magic in postMessage (like module blocks, or symbol identities being preserved). It only requires an automatically generated static property on shared struct classes that represent the kind, and some built-in behavior, plus the dynamic prototype lookup we discussed of course. I'm actually wondering if this could be prototyped (pun intended) in the current experiment. 
[02:39:13.0491] Right, that would be similar to having 'known' static functions that operate on the type and each agent individually imports the module to use those functions, except with the power to per-realm/per-agent register those functions as the prototype to use for that 'type' to get the capability of method dispatch. (also method chaining, but `|>` operator would also give that). 'type' here being the identity created by the `shared struct class` syntax (the fan-out case). For the fan-in case, where a farm of workers start up on their own, each creating their own separate `shared struct class` (from the same module URL), as they are sent to a sink, the sink would need to register that one 'prototype' with the multiple 'type's that are received from each worker. [03:54:30.0010] Right the fan in case would work too, there just would be different constructors and kinds, that each would have to be set to use the same prototype maker, or prototype implementation if the prototype doesn't care about exposing the realm local constructor. You're right at the end of the day this boils down to the capability of setting the dynamic prototype to use for instances. 
[04:21:01.0015] and to also capture a bit of what was discussed on the call last night: [04:21:05.0879] - the goal is to have: the instance of a shared struct to exist in shared memory, and the reference to this is passed around directly. There is not a per-agent wrapper adding a layer of indirection for prop access. [04:24:17.0314] - if there is a dynamic lookup when [[Prototype]] is accessed, and the returned functions come from the calling realm, then for a shared struct passed between realms (node-VM, or same-origin-iframe, etc) the value returned by `getPrototypeOf` can observably change, which violates the current description of the sealed integrity level. [04:26:13.0169] - cont: if the lookup is 'cached' per-agent, maybe on first read, then this implies there is additional memory usage for a per-agent-per-instance cache [04:26:34.0204] > <@aclaymore:matrix.org> - the goal is to have: the instance of a shared struct to exist in shared memory, and the reference to this is passed around directly. There is not a per-agent wrapper adding a layer of indirection for prop access. I have actually been wondering about that, and whether that's an observable thing from the 262 spec point of view. The only program observable aspect of this is the preservation of identity through cross agent interactions, which is host defined anyway. 
[04:27:40.0352] yes unlikely to be observable, but if that is an implementation goal then it limits which semantics are performant/simple/memory-efficient etc [04:28:21.0568] > <@aclaymore:matrix.org> - cont: if the lookup is 'cached' per-agent, maybe on first read, then this implies there is additional memory usage for a per-agent-per-instance cache Not really. If the prototype is set once and maybe throws when accessed before, then it's arguably sealed for that agent. As I mentioned, the fact there is a single reference shared between agents is an implementation detail IMO [04:28:58.0036] Quoted the wrong message [04:29:36.0116] As for per instance memory, i believe it'd only be per kind / type memory, not per instance [07:46:04.0610] > <@mhofman:matrix.org> So a shared struct instance has a stable identity across `postMessage`, right? To expand on my earlier idea, I think we don't need any stable identity for module blocks or even symbols, we can just use shared structs themselves to dynamically attach behavior to a shared struct kind, given some built-in wiring. Here is a gist where I explore that approach: https://gist.github.com/mhofman/aa23fcc88e1ccd031a3c34f88577eaf7 > > It does not require any new syntax, or extra magic in postMessage (like module blocks, or symbol identities being preserved). It only requires an automatically generated static property on shared struct classes that represent the kind, and some built-in behavior, plus the dynamic prototype lookup we discussed of course. I'm actually wondering if this could be prototyped (pun intended) in the current experiment. Is there a reason the identity needs to be a static property, or surfaced to the user at all, as opposed to an internal slot? [07:58:00.0843] allows more userland solutions/experimentation maybe? 
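A rough sketch of the "set once, keyed by kind" idea from the messages above, in plain JavaScript. Nothing here is a real shared-struct API: `makeAgent`, `registerPrototype`, `prototypeOf` and the `kind` field are made-up names, and ordinary objects stand in for shared structs and agents. It only illustrates that the bookkeeping can be one entry per kind per agent, and that a prototype which is set once (and throws if read before that) never observably changes within an agent.

```javascript
// Hypothetical model: each "agent" keeps its own kind -> prototype table.
// Plain objects stand in for shared structs; `kind` plays the role of the
// per-type identity object that would be shared across agents.
function makeAgent() {
  const prototypeByKind = new WeakMap(); // one entry per kind, not per instance

  return {
    // Set-once: a second registration for the same kind is rejected, so the
    // prototype this agent observes for a kind never changes.
    registerPrototype(kind, prototype) {
      if (prototypeByKind.has(kind)) {
        throw new TypeError("prototype already registered for this kind");
      }
      prototypeByKind.set(kind, prototype);
    },
    // Throws when asked before registration instead of returning a default.
    prototypeOf(instance) {
      const prototype = prototypeByKind.get(instance.kind);
      if (prototype === undefined) {
        throw new TypeError("no prototype registered for this kind");
      }
      return prototype;
    },
  };
}

const pointKind = {}; // stands in for the shared kind/type object
const p1 = { kind: pointKind, x: 1, y: 2 };
const p2 = { kind: pointKind, x: 3, y: 4 };

// Two agents see the same data but attach their own local prototype.
const agentA = makeAgent();
const agentB = makeAgent();
const protoA = { describe() { return "A"; } };
const protoB = { describe() { return "B"; } };
agentA.registerPrototype(pointKind, protoA);
agentB.registerPrototype(pointKind, protoB);
```

Both `p1` and `p2` resolve through the single `pointKind` entry in each agent, which is the "per kind / type memory, not per instance" point; whether an engine would implement it this way is an open question in the discussion.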
[07:59:00.0390] though I guess that itself can be done in userland
```
const id = new shared struct class ID{}

shared struct class SSC {
  __type__ = id
}
```
[07:59:06.0900] I'm not certain that's necessary, at least not for an MVP. [07:59:41.0625] I did this so the program could attach the dynamic prototype without magic [08:00:08.0095] We can always try to find more ergonomic ways, but this is flexible for experimenting [08:00:41.0259] It just seems a bit like an overcomplication, IMO. [08:01:30.0464] If you have an alternative I'm all ears [08:02:31.0153] All the solutions I've heard so far rely on more syntax that doesn't exist today [08:43:52.0229] > <@aclaymore:matrix.org> though I guess that itself can be done in userland
>
> ```
> const id = new shared struct class ID{}
>
> shared struct class SSC {
>   __type__ = id
> }
> ```
Btw, the set shared/dynamic prototype is the critical part, which cannot be done in userland. And since this is leveraging shared struct itself to describe the type for its identity preserving feature, the whole thing needs to be bootstrapped in the engine too [08:47:54.0839] yep. 
I was imagining a userland experimental library where the prototype is registered with the library, and then the shared-struct is passed to a `wrap` function that returns a proxy wrapper for it which adds the proto look up (but loses the ability to be structured clone) [08:48:23.0122] Oh yeah you can totally do this in userland with Proxies [08:48:42.0274] at the expense of per realm proxy instances [08:49:34.0811] and code wouldn't be able to magically pass that proxied wrapper to another agent, they would know it needs to be unwrapped again [08:49:59.0554] minus the postMessage identity preserving logic, but that can be emulated by wrapping postMessage, which gets hairy quickly, and makes it impossible to do cross agent gc of course [08:50:11.0219] or structuredClone would need a new handler similar to `toJSON` where it can extract out the struct automatically [08:50:17.0952] aka no cycle collection. if no cycle, you can use weakrefs [08:50:48.0668] nah that really needs to be in the engine (at least until I get to propose my API to support distributed GC) [08:51:35.0618] I have been toying with identity preservation through postMessage for a few years now, that's what got me interested in TC39 in the first place [08:52:09.0209] (yes I know different standard groups, but the gc API needs to be in the language) [08:52:46.0512] wrapping postMessage is no fun, it's very inception [08:53:13.0745] right now I kinda like the idea that shared structs that want a prototype are top-level-const exported `export shared struct class Foo {}` , and the module they are declared is what is 'attached' to the 'type' as an internal slot. for other agents to load. 
It could have an overlap with the import-defer-eval proposal, where the module is sync loaded on the first prototype access to be lazy and reduce the cost when the methods are not accessed by other agents [08:55:38.0741] yeah maybe that's what the ergonomic solution ends up looking like, but module blocks and/or import defer do not exist today if shu wants to experiment with something right away. My proposal is about enabling this with what we have today, and the dynamic prototype lookup we'll need anyway [08:55:53.0438] no need to mess with module logic [08:56:15.0249] (I don't think) this would need module blocks, the part that is attached to the internal slot can be, as littledan said, a URL string [08:58:37.0349] I'm personally not a fan of tying module specifier strings into the solution [09:21:25.0615] Is there any reason the origin trial API couldn't be extended to provide a registration mechanism for the shared struct type with a user-provided unique ID (just a string, but could contain a URI, UUID, etc.) and the thread-local prototype to use?
```js
// + fairly flexible
// - requires setup on receiver end, correlation of struct type identity.
// - requires encapsulation for constructor logic

// data.js
// data only structs don't require registration and have no shared type identity.
export const DataOnly = new SharedStructType(["foo"]);

// vector2d.js
// data+behavior structs require registration to define type identity
const { type: _Vector2D, register } = SharedStructType.prepare(["x", "y"]);

// custom construction behavior
export function Vector2D(x = 0, y = 0) {
  const _this = Reflect.construct(_Vector2D, [], new.target);
  _this.x = x;
  _this.y = y;
  return _this;
}

// prototype methods
Vector2D.prototype.distanceTo = function (v) {
  const dx = this.x - v.x;
  const dy = this.y - v.y;
  return Math.sqrt(dx * dx + dy * dy);
};

// Register the type identity, constructor, and prototype to use for this struct type in this thread.
// NOTE: register(id, constructor [, prototype])
register("e3a9bd1f-0f64-4848-b255-3c629d0c44a3", Vector2D, Vector2D.prototype);

// main.js
import { DataOnly } from "./data.js";
import { Vector2D } from "./vector2d.js";

const data = new DataOnly();
data.foo = "data only, no behavior";

const v1 = new Vector2D(1, 2);
const v2 = new Vector2D(3, 4);

const worker1 = new Worker("worker1.js");
worker1.postMessage([data, v1, v2]);

const worker2 = new Worker("worker2.js");
worker2.postMessage([data, v1, v2]);

// worker1.js
// worker1 does not import Vector2D, so can only access its data.
parentPort.on("message", ([data, v1, v2]) => {
  data.foo; // "data only, no behavior"
  v1.x; // 1
  v2.y; // 4

  // NOTE: prototype not registered. This could mean an invalid prototype chain (thus every non-data
  // member access would throw), or a default prototype chain (where prototype methods throw by nature
  // of them just being `undefined`).
  v1.distanceTo(v2); // error
  v1.toString(); // error?
});

// worker2.js
// worker2 imports Vector2D, which causes registration as a side-effect.
import "./vector2d.js";

parentPort.on("message", ([data, v1, v2]) => {
  data.foo; // "data only, no behavior"
  v1.x; // 1
  v2.y; // 4
  v1.distanceTo(v2); // ok
  v1.toString(); // ok
});
```
The user-supplied type identity would allow the user to define the same struct in different bundles, and the registry wouldn't be global, but would rather be thread-local, so there'd be no global mutable registry to worry about. It also doesn't really matter if the prototypes differ slightly between realms/threads, since they all access the same underlying data. We could eventually extend this to syntax, possibly even using decorators:
```js
// data.js
export shared struct DataOnly {
  foo;

  // NOTE: constructors don't necessarily require registration
  constructor(foo) {
    this.foo = foo;
  }
}

// vector2d.js
// a syntactic declaration doesn't necessarily need to set a type identity, since
// we could infer one for it based on path to the containing file and its offset within
// the file. However, we could still allow the user to explicitly specify type identity
// for use with bundlers, or other cases where the inferred type identity might not
// be sufficient.
@struct.id("e3a9bd1f-0f64-4848-b255-3c629d0c44a3") // or: @struct.id("https://babylonjs.com/5.0/Vector2D")
export shared struct Vector2D {
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }

  distanceTo(v) {
    const dx = this.x - v.x;
    const dy = this.y - v.y;
    return Math.sqrt(dx * dx + dy * dy);
  }
}
```
[10:02:40.0619] I fail to see how a user supplied value provided after shared struct creation would work. Unless you'd somehow remap types you may have already seen. IMO you'd at least need to pass your unique ID as part of the `SharedStructType.prepare` call. But in general I don't like strings for unique IDs, and since we already have object that carry identity across agents, I thought it'd make sense to reuse them. 
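A minimal sketch of the thread-local registry idea above, emulated with plain objects so it runs today. `makeThreadRegistry`, `register`, `invoke` and the `typeId` field are illustrative assumptions, not part of `SharedStructType` or the origin trial; each registry instance stands in for one thread, and a record tagged with a string ID stands in for a shared struct.

```javascript
// Hypothetical stand-in for a thread-local registry: type id string -> prototype.
function makeThreadRegistry() {
  const prototypesById = new Map();
  return {
    register(id, prototype) {
      prototypesById.set(id, prototype);
    },
    // Dispatch a method through whatever prototype this thread registered
    // for the value's type id; unregistered types only expose their data.
    invoke(value, methodName, ...args) {
      const prototype = prototypesById.get(value.typeId);
      if (!prototype || typeof prototype[methodName] !== "function") {
        throw new TypeError(`${methodName} is not available in this thread`);
      }
      return prototype[methodName].apply(value, args);
    },
  };
}

// Same id string as in the example above, reused purely for illustration.
const VECTOR2D_ID = "e3a9bd1f-0f64-4848-b255-3c629d0c44a3";

const vectorMethods = {
  distanceTo(v) {
    const dx = this.x - v.x;
    const dy = this.y - v.y;
    return Math.sqrt(dx * dx + dy * dy);
  },
};

// Records that would be shared structs in the real design.
const v1 = { typeId: VECTOR2D_ID, x: 1, y: 2 };
const v2 = { typeId: VECTOR2D_ID, x: 4, y: 6 };

const worker1 = makeThreadRegistry(); // never registers: data only
const worker2 = makeThreadRegistry(); // registers behavior for the id
worker2.register(VECTOR2D_ID, vectorMethods);

const distance = worker2.invoke(v1, "distanceTo", v2); // 5
```

In this model `worker1` can still read `v1.x`, but `worker1.invoke(v1, "distanceTo", v2)` throws, which corresponds to the "prototype not registered" case in the `worker1.js` comments. It takes the "throw" reading of that case; the alternative of a default prototype chain is not modeled here.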
[10:03:07.0028] Btw, the prototype attaching I suggested can be made to have a shape more similar to your suggestion above [10:49:12.0002] ```js // vector2d.js // Each shared struct type, whether data only or "prepared", has its own unique type export const vector2DType = SharedStructType.prepare(["x", "y"]); const _Vector2D = SharedStructType.getConstructor(vector2DType); // custom construction behavior export function Vector2D(x = 0, y = 0) { const _this = Reflect.construct(_Vector2D, [], new.target); _this.x = x; _this.y = y; return _this; } // prototype methods Vector2D.prototype.distanceTo = function (v) { const dx = this.x - v.x; const dy = this.y - v.y; return Math.sqrt(dx * dx + dy * dy); }; SharedStructType.registerPrototype(vector2DType, Vector2D.prototype); // main.js import { Vector2D, vector2DType } from "./vector2d.js"; const v1 = new Vector2D(1, 2); const worker = new Worker("worker.js"); worker.postMessage([vector2DType, v1]); // worker.js // worker imports Vector2D, which causes registration as a side-effect. import { Vector2D, vector2DType } from "./vector2d.js"; const v2 = new Vector2D(3, 4); parentPort.on("message", ([mainVector2DType, v1]) => { SharedStructType.registerPrototype(mainVector2DType, Vector2D.prototype); assert(mainVector2DType !== vector2DType); assert( SharedStructType.getConstructor(mainVector2DType) !== SharedStructType.getConstructor(vector2DType) ); assert(v1 instanceof Vector2D); // by virtue of sharing a prototype v1.x; // 1 v1.distanceTo(v2); // ok v1.toString(); // ok }); ``` SharedStructType.getConstructor() is needed to allow the program to avoid duplicating type definitions in each agent/realm. I think the burden of such deduplication should be on the program, not on the engine. If the engine had to deduplicate itself, you'd either need a user provided type identifier at "prepare" time, and a global lock around such type definitions, or you'd need to somehow be able to collapse separate definitions into a single one.
In either case you'd have to figure out what to do if the shape definition does not match, and in the case of definition collapse, how do you communicate the error to the program? By having the type definition generate the unique type identifier, you avoid all those complications in the engine, at the cost of putting more type hydration burden on the program.
2023-01-29 [14:17:24.0023] > <@mhofman:matrix.org> I fail to see how a user supplied value provided after shared struct creation would work. Unless you'd somehow remap types you may have already seen. IMO you'd at least need to pass your unique ID as part of the `SharedStructType.prepare` call. But in general I don't like strings for unique IDs, and since we already have object that carry identity across agents, I thought it'd make sense to reuse them. The API I suggested would only be for the origin trial, not the final proposal. The user supplied type identity and prototype isn't "provided after shared struct creation", but rather, the shared struct constructor isn't usable until after it is registered. This is why shared struct type creation that depends on a prototype is different from data-only shared structs. You would have to call `register` to make the struct type valid. [14:18:48.0064] > <@mhofman:matrix.org> I fail to see how a user supplied value provided after shared struct creation would work. Unless you'd somehow remap types you may have already seen. IMO you'd at least need to pass your unique ID as part of the `SharedStructType.prepare` call. But in general I don't like strings for unique IDs, and since we already have object that carry identity across agents, I thought it'd make sense to reuse them.
"Objects that carry identity across agents" doesn't help with the model I was proposing. In my model, each agent must independently register the type associated with the struct. [14:20:16.0004] > <@rbuckton:matrix.org> The API I suggested would only be for the origin trial, not the final proposal. The user supplied type identity and prototype isn't "provided after shared struct creation", but rather, the shared struct constructor isn't usable until after it is registered. This is why shared struct type creation that depends on a prototype is different from data-only shared structs. You would have to call `register` to make the struct type valid. That's doesn't answer my concern. What would happen if 2 types with different shapes are registered with the same user supplied identity? [14:20:17.0823] > <@mhofman:matrix.org> ```js > // vector2d.js > // Each shared struct type, whether data only or "prepared" has its own unique type > export const vector2Dtype = SharedStructType.prepare(["x", "y"]); > > const _Vector2D = SharedStructType.getConstructor(vector2Dtype); > > // custom construction behavior > export function Vector2D(x = 0, y = 0) { > const _this = Reflect.construct(_Vector2D, [], new.target); > _this.x = x; > _this.y = y; > return _this; > } > > // prototype methods > Vector2D.prototype.distanceTo = function (v) { > const dx = this.x - v.x; > const dy = this.y - v.y; > return Math.sqrt(dx * dx + dy * dy); > }; > > SharedStructType.registerPrototype(vector2Dtype, Vector2D.prototype); > > // main.js > import { Vector2D, vector2DType } from "./vector2d.js"; > const v1 = new Vector2D(1, 2); > const worker = new Worker("worker.js"); > worker.postMessage([vector2DType, v1]); > > // worker.js > // worker imports Vector2D, which causes registration as a side-effect. 
> import { Vector2D, vector2DType } from "./vector2d.js"; > > const v2 = new Vector2D(3, 4); > > parentPort.on("message", ([mainVector2DType, v1]) => { > SharedStructType.registerPrototype(mainVector2DType, Vector2D.prototype); > assert(mainVector2DType !== vector2DType); > assert( > SharedStructType.getConstructor(mainVector2DType) !== > SharedStructType.getConstructor(vector2Dtype) > ); > assert(v1 instanceof Vector2D); // by virtue of sharing a prototype > v1.x; // 1 > v1.distanceTo(v2); // ok > v1.toString(); // ok > }); > > ``` > > SharedStructType.getConstructor() is needed to allow the program to avoid duplicating type definitions in each agent/realm. I think the burden of such deduplication should be on the program, not on the engine. > > If the engine had to deduplicate itself, you'd either need a user provided type identifier at "prepare" time, and a global lock around such type definitions, or you'd need to somehow be able to collapse separate definitions into a single one. In either case you'd have to figure out what to do if the shape definition does not match, and in the case of definition collapse, how do you communicate the error to the program? > > By having the type definition generate the unique type identifier, you avoid all those complications in the engine, at the cost of putting more type hydration burden on the program. This assumes the worker can be sure the incoming message is actually for that prototype, which might not be the case if there are multiple possible values for the message. This design seems very hard to implement in user code, while my suggestion is more reliable for write-once and reuse. [14:20:52.0690] > <@mhofman:matrix.org> That's doesn't answer my concern. What would happen if 2 types with different shapes are registered with the same user supplied identity? An error. If the user defined identity is already registered to a different shape, it should throw. 
[14:23:13.0461] The bigger question is, what do we want the final, shipping version of this to look like? Do we want package authors to be able to define structs that package consumers can just use, or do we want package consumers to need to register shared structs themselves in any messaging scaffolding? [14:23:38.0071] > <@rbuckton:matrix.org> This assumes the worker can be sure the incoming message is actually for that prototype, which might not be the case if there are multiple possible values for the message. This design seems very hard to implement in user code, while my suggestion is more reliable for write-once and reuse. Right it requires more setup dance from the program, and it's that setup dance that we should try to make easier. But I'm very doubtful we can overcome the issue with a user supplied identity. [14:28:04.0034] If we want to make this simple for application developers, then they should be able to install a package, import the struct type (or at least use a side-effecting import for the file containing the struct type to make it visible to the agent), and just use it. This means some kind of per-agent registration would need to occur within the file that contains the struct itself. One mechanism we discussed for that was to depend on the resolved module ID (i.e., if we were depending on the module loader cache). However, that doesn't work well with bundling scenarios where I have one bundle for the main thread, and another bundle for a worker. [14:29:00.0167] So there would need to be some way to uniquely identify a struct type regardless of path, such that the same struct types in each bundle can be associated with each other. [14:31:04.0823] User-defined IDs are used _everywhere_ for this in many languages and runtimes: UUIDs, URNs, DTDs, etc.
[14:33:23.0933] My main concern is that any burden for registration should fall on the package developer, not the application developer, whatever the design looks like in the end. [14:36:02.0527] I think the question raised by Ashley Claymore is relevant: how do you handle the fan-in case where multiple agents set up a shared type before being introduced to each other? A forgeable type identifier in my opinion is not safe, and I'm pretty sure it'd never make it through committee. However I'm not convinced we need prototype/constructor continuity here. [14:36:05.0085] And that whatever that design is should be able to take into account bundling, be that with module declarations/module expressions, or traditional bundlers [14:38:56.0259] > <@mhofman:matrix.org> I think the question raised by Ashley Claymore is relevant: how do you handle the fan-in case where multiple agents set up a shared type before being introduced to each other? A forgeable type identifier in my opinion is not safe, and I'm pretty sure it'd never make it through committee. However I'm not convinced we need prototype/constructor continuity here. Since this wouldn't be a global registry, it wouldn't matter. Your code defines the relationship between an id and a constructor/prototype on your agent. If your code tries to register the same unique id twice, it should be an error so you can isolate the problem early. If the constructor/prototype on one agent doesn't match the constructor/prototype on another agent, that's fine. In fact, that may even be a value add, since bundlers could potentially tree-shake away prototype methods that aren't used. [14:42:13.0632] How is it not a global registry? [14:42:57.0419] You're assuming there is a single author to code running in an agent [14:42:59.0625] All that matters is that a given struct type has the same type identity on multiple Agents, so that each Agent can bind a prototype to that type identity.
Producing an instance of a struct type should be possible on any Agent, such that I could do: ```js // main.js const v1 = new Vector2D(1, 2); worker.postMessage(v1); worker.on("message", e => console.log(e instanceof Vector2D)); // worker.js const v2 = new Vector2D(3, 4); remotePort.postMessage(v2); remotePort.on("message", e => console.log(e instanceof Vector2D)); ``` [14:43:13.0360] And a single author to code running in communicating agents [14:43:29.0422] > <@mhofman:matrix.org> How is it not a global registry? I don't understand. Each Agent would have an independent registry mapping a type identity to a prototype. [14:44:25.0264] > <@mhofman:matrix.org> You're assuming there is a single author to code running in an agent I'm assuming that if you want to consistently use the same data in multiple agents, with the same underlying behavior, then you should use a consistent definition. [14:44:50.0931] > <@rbuckton:matrix.org> All that matters is that a given struct type has the same type identity on multiple Agents, so that each Agent can bind a prototype to that type identity. Producing an instance of a struct type should be possible on any Agent, such that I could do: > > ```js > // main.js > const v1 = new Vector2D(1, 2); > worker.postMessage(v1); > worker.on("message", e => console.log(e instanceof Vector2D)); > > // worker.js > const v2 = new Vector2D(3, 4); > remotePort.postMessage(v2); > remotePort.on("message", e => console.log(e instanceof Vector2D)); > ``` Do you expect these console logs to print true or false? [14:44:59.0317] I expect them to print `true`. [14:45:55.0135] Then you expect continuity of constructor / prototype across agents [14:45:57.0661] If both the main thread and worker register their version of `Vector2D.prototype` for the same type identity, then creating those structs on either side and sending them to the other side should be consistent.
[14:46:38.0598] > <@mhofman:matrix.org> Then you expect continuity of constructor / prototype across agents No. I would _recommend_ continuity of the prototype across agents. Constructor doesn't actually matter. [14:46:57.0243] As I said, a bundler could potentially tree-shake away prototype methods on either side based on use, and it should still work. [14:47:21.0361] > <@rbuckton:matrix.org> I don't understand. Each Agent would have an independent registry mapping a type identity to a prototype. Yes that we agree on. The question is who generates the key of that mapping. If it's user controlled, you have code running in an agent that can interfere with other code running in the same agent. [14:50:04.0572] > <@rbuckton:matrix.org> No. I would _recommend_ continuity of the prototype across agents. Constructor doesn't actually matter. By continuity I mean recognition of the prototype objects identity. It doesn't need to be the same in multiple agents, obviously, but an object of the type registered in agent1 and sent to agent2 is expected to have the same prototype as an object of the "same type" created in agent2. I am not convinced we need that [14:51:32.0853] I believe we could get away with duck typing here [14:52:28.0455] Basically, imagine this: - A Struct Type has a type identity (be it system or user produced). - A Struct instance has a private slot containing that type identity. - An Agent has a mapping of type identity to a prototype. - Agent A registers a Struct type (`Foo`) for a given type identity and a prototype defined in a realm in Agent A. - Agent B registers a Struct type (`Foo`) for the same type identity and a prototype defined in a realm in Agent B. - Agent A constructs an instance of Agent A's struct type `Foo`, sends it to Agent B. - Agent B receives the struct value. When Agent B performs ToObject on the struct value, it looks up the type identity in its internal slot in Agent B's registry to find the prototype. 
- Agent B constructs an instance of Agent B's struct type `Foo`, sends it to Agent A. - Agent A receives the struct value. When Agent A performs ToObject on the struct value, it looks up the type identity in its internal slot in Agent A's registry to find the prototype. [14:53:40.0930] The type identity we transfer from A to B, or vice versa, could also encode the expected shape of the struct type. That way, if Agent A and Agent B disagree on the shape associated with the type identity, that error would be thrown on prototype lookup in ToObject. [14:54:15.0275] Neither agent needs to communicate their registry to the other agents, thus no global registry. [14:55:14.0916] If Agent A sends a struct value to Agent B that B doesn't have registered, Agent B could still allow access to the data, just not a prototype walk. [14:57:58.0117] > <@mhofman:matrix.org> By continuity I mean recognition of the prototype objects identity. It doesn't need to be the same in multiple agents, obviously, but an object of the type registered in agent1 and sent to agent2 is expected to have the same prototype as an object of the "same type" created in agent2. I am not convinced we need that The prototype shape on each side doesn't matter. Just the type identity. I could have a `Foo { x, y, bar() {} }` on A and a `Bar { x, y, quxx() {} }` on B registered to the same type identity. If I create a `Foo` on A and send it to B, B will see it as a `Bar`. If I construct a `Bar` on B and send it to A, A will see it as a `Foo`. [14:58:12.0291] What I care about is consistency in round-tripping. [14:58:20.0897] My concern is with regards to who mints the type identity, as that grants the right to register it. A forgeable value allows code that doesn't trust each other to interfere with each other. That code could be running in the same realm. If the type is minted by the engine, how is it recovered by the code.
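The lookup flow in the bullet list above (a type identity carried by the instance, plus a per-agent map from identity to prototype) can be modeled roughly in plain JavaScript. This is only a sketch: `AgentRegistry`, `register`, `create`, and `receive` are hypothetical names that appear nowhere in the proposal, and the model copies data on receipt, whereas real shared structs would share the underlying storage.

```js
// Hypothetical model only: AgentRegistry and its methods are invented for
// illustration and are not part of the shared structs proposal.
class AgentRegistry {
  #types = new Map(); // type identity -> { shape, prototype }

  // First-in wins: registering the same identity twice on one agent throws.
  register(id, fields, prototype) {
    if (this.#types.has(id)) {
      throw new TypeError(`type identity already registered: ${id}`);
    }
    this.#types.set(id, { shape: fields.join(","), prototype });
  }

  // Stand-in for what crosses agents: the identity, the shape, and the data.
  // Real shared structs would share storage; this model copies on receipt.
  create(id, values) {
    const entry = this.#types.get(id);
    if (entry === undefined) throw new TypeError(`unknown type identity: ${id}`);
    return { id, shape: entry.shape, data: { ...values } };
  }

  // Models the prototype lookup the receiving agent performs.
  receive(wire) {
    const entry = this.#types.get(wire.id);
    if (entry === undefined) {
      // Unregistered on this agent: data stays readable, no methods.
      return Object.assign(Object.create(null), wire.data);
    }
    if (entry.shape !== wire.shape) {
      throw new TypeError(`shape mismatch for type identity: ${wire.id}`);
    }
    return Object.assign(Object.create(entry.prototype), wire.data);
  }
}

const ID = "e3a9bd1f-0f64-4848-b255-3c629d0c44a3";
const agentA = new AgentRegistry();
const agentB = new AgentRegistry();
const agentC = new AgentRegistry(); // never registers the type

// Different prototypes per agent, as in the `Foo` / `Bar` example.
const fooProto = { bar() { return this.x + this.y; } };
const barProto = { quxx() { return this.x * this.y; } };
agentA.register(ID, ["x", "y"], fooProto);
agentB.register(ID, ["x", "y"], barProto);

const wire = agentA.create(ID, { x: 1, y: 2 });
const seenByB = agentB.receive(wire); // B's prototype applies
const seenByC = agentC.receive(wire); // data only, no prototype
const backOnA = agentA.receive(wire); // round trip restores A's prototype
```

Nothing in the model addresses who is allowed to mint `ID`, which is the open question in the message just above.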
[14:58:39.0122] And the `Foo` vs `Bar` idea isn't farfetched, bundlers can and do tree shake methods, and can and do rename classes. [15:00:15.0004] > <@mhofman:matrix.org> My concern is with regards to who mints the type identity, as that grants the right to register it. A forgeable value allows code that doesn't trust each other to interfere with each other. That code could be running in the same realm. If the type is minted by the engine, how is it recovered by the code. In the real proposal structs are top-level declarations. Evaluation and registration of structs occurs mostly during initial module loading. Defensive JavaScript does what it always does, today, which is ensure code they trust runs first. [15:01:02.0067] Why would a struct declaration be top level? [15:01:05.0028] You error on conflicts rather than silently allowing them, so most applications will fail immediately if malicious code tries to forge an identity. [15:01:15.0514] And how does that even matter to type identity [15:01:34.0425] It would have to be if you want to use module id + offset or something as a default type identity. [15:02:27.0084] How would that be compatible with bundlers? [15:02:29.0627] If a function can return a `struct` declaration that produces a different object identity for each function call, there'd be no way to differentiate them. [15:03:08.0722] > <@mhofman:matrix.org> How would that be compatible with bundlers? It isn't, that's why you'd also need a way to override the type identity. But having a working default behavior is also a good idea. [15:04:10.0044] I'm just saying.
I don't see how you can have an unforgeable type identity generated or derived by the engine, and have automatic mapping of these types across agents in a way that allows for prototype continuity as you described earlier [15:05:45.0978] Specifying the type identity when registering must require an unforgeable value, that obviously must have a stable identity when sent between agents [15:05:47.0136] I don't think the type identity being "unforgeable" is important. If you error on an attempted redefinition, and those errors occur during application startup, then we do what we always do and run code we rely on first. [15:07:06.0448] Each struct type needs a type identity. For the bundler case, you could supply one. For the "I'm just loading the same module from the same path in the main thread and the worker" case, you could rely on module id and source text offset within the module as a stable identity. [15:07:17.0948] A forgeable value will not be acceptable. An error is at best a denial of service for code that doesn't trust each other [15:08:14.0791] Then the registry need not be per-agent, but per-realm, just like for `Number`, `String`, etc. Have untrusted code run in another realm/compartment/etc. [15:08:46.0912] Again there can be code that doesn't trust each other in the same realm [15:09:56.0172] At best you could have this registry per compartment [15:12:05.0853] This type registry you propose is novel in the 262 world, and it does break precedent. 2023-01-30 [21:11:38.0484] Right, this is why I concluded that the only reference point we could use is the module system [22:07:30.0422] just like builtin modules, there'd always have to be a way to access the same functionality in Scripts - how would that work at all with the module system? [00:39:53.0169] Automatically de-duping types based on module could also be difficult when the bundler inlines the defining module into separate main and worker bundles.
It would need to know this module is an entry-point that should not be inlined [00:43:25.0930] For manually registering a prototype: - `StructClass.adopt(orphanedInstance)` is potentially the simplest to explain on its own but per-instance-per-agent uses the most memory, and requires the most orchestration as each individual instance needs to be adopted before methods can be called [00:48:49.0041] - registering once per type, reduces memory and is a one time orchestration (per type, per non-original-agent) which can then "just work" from that point on. But needs to answer: what carries the "type" being registered? Is the registry per-agent, or per-realm, or per-comms-channel\*? (\* a bit like the "transferable objects" list) [00:51:47.0875] * or is everything automatic, for the use case rbuckton mentioned where an npm package of shared structs "just works across workers" with no orchestration (ignoring ensuring the bundler&server are configured as required)
[06:53:09.0344] > <@aclaymore:matrix.org> For manually registering a prototype: > > - `StructClass.adopt(orphanedInstance)` is potentially the simplest to explain on its own but per-instance-per-agent uses the most memory, and requires the most orchestration as each individual instance needs to be adopted before methods can be called This seems infeasible for any complex, nested set of structs [06:57:25.0305] > <@aclaymore:matrix.org> * or is everything automatic, for the use case rbuckton mentioned where an npm package of shared structs "just works across workers" with no orchestration (ignoring ensuring the bundler&server are configured as required) While this would be nice to have as a default, I don't think "automatic only" is feasible w/o making structs unbundleable. [07:02:58.0441] > <@aclaymore:matrix.org> - registering once per type, reduces memory and is a one time orchestration (per type, per non-original-agent) which can then "just work" from that point on. But needs to answer: what carries the "type" being registered? Is the registry per-agent, or per-realm, or per-comms-channel\*? (\* a bit like the "transferable objects" list) Once per realm is consistent with primitives like `Number.prototype`, `Boolean.prototype`, etc.
The question is whether structs are more "object"-like or more "primitive"-like, since Objects are effectively per-Agent since you can hand one to another realm and it still walks the original prototype chain. [07:09:32.0707] Once per-realm has the consequence that the value would not meet the current 'sealed' guarantees. Maybe this is OK, but also maybe not [07:10:01.0229] I think regarding dynamic prototype lookup we agreed this would be a per-realm behavior like for primitives, which means whatever registry must be at most per realm [07:11:56.0510] while that is 'like' primitives, it's more because primitives are not objects in the first place [07:12:08.0378] If the only realms that existed were shadow realms, objects would effectively be per realm and not per agent [07:12:17.0665] so they go via toObject, an object doesn't change via toObject [07:12:36.0266] if only! [07:13:20.0799] And because of the existence of Shadow Realm, you cannot have a mechanism which would expose objects from another realm across the callable boundary [07:13:35.0281] which a per agent registry would effectively do [07:14:51.0090] but because of same-origin-iframes and node 'vm' there would still need to be defined behavior for same-agent-cross-realm semantics [07:14:58.0802] ShadowRealms is the easy case :) [07:15:25.0103] easy ~= can't pass objects, so can't pass structs [07:18:07.0278] well technically can only pass struct between agents with host APIs, but we're still discussing how that works here ;) A host could very well add an API to pass structs between shadow realms [07:18:37.0515] (I really want to get my structured clone extension mechanism ironed out, as it'd work to pass objects between shadow realms) [07:27:07.0769] As I said earlier, if we had a restriction that a shared struct must be a top-level declaration then most type identity registration would occur during application start up (with the exception of portions of the module graph loaded dynamically via 
`import()`), which means conflicts (be they intentional or unintentional) would primarily occur early. Malicious code wouldn't be able to hijack an already-registered type identity. [07:28:16.0862] how would that restriction work in Script? [07:30:55.0965] > <@ljharb:matrix.org> how would that restriction work in Script? They would also need to be at the top-level of a Script, so no using a function as a factory for shared struct _types_. When evaluation of the module/script body completes, the file cannot produce new struct types. Yes, this isn't 100% reliable, but is at least as reliable as the current mechanisms used today to capture intrinsics before they can be patched/modified by other code. [07:32:15.0024] I think that restriction is completely unrealistic, and makes it impossible to use this feature in a lot of programs [07:33:03.0623] I only suggested the restriction as a possible remediation for the concern about type identity registration being forgeable. [07:34:06.0611] Realistically, even without this restriction, a type identity registry that depends on a first-in wins mechanism is still as reliable as existing intrinsic-capturing mechanisms. [07:34:20.0601] JavaScript is a dynamic language. That suggestion is equivalent to forbidding features from the dynamic part of the language [07:34:45.0812] We already do that with features like `export` [07:35:02.0563] And I have explained why first win semantics for a global registry is a non-starter [07:35:08.0184] I don't want that restriction. [07:35:27.0544] > <@mhofman:matrix.org> And I have explained why first win semantics for a global registry is a non-starter Per-agent global, or cross-agent global? [07:35:41.0942] even per realm [07:36:44.0364] `export` being first win is not a precedent, like object spread is not a precedent. Both have a local effect (module or object) [07:37:03.0357] I still am not clear on why you believe a per-agent (or per-realm) write-only registry is a non-starter.
If you don't want adversarial code to register first, you register first. [07:37:24.0424] who is "you" [07:37:36.0740] You, the application developer. [07:38:09.0622] Ok what about 2 libraries that the application load. Should lib1 be able to interfere with lib2 ? [07:39:11.0738] If you are a middleware, or a plugin, or something else running in a host environment where you don't control the environment, then you ensure your type identity is sufficiently unique. If you are the host environment loading a middleware, or plugin, etc., you architect your environment to be resilient to such a conflict. [07:39:24.0120] there is no observable global mutable state in 262, and we would 100% block anything that introduces anything like it [07:39:39.0385] the global object isn't that? [07:39:49.0271] or do you mean across realms (bigger than global, "universal" maybe) [07:39:55.0465] minus the intrinsics/primordial objects themselves [07:39:57.0788] hidden * [07:40:12.0535] sorry I forgot the hidden part [07:40:25.0196] there's a few of those too already, but they're all freezeable ofc [07:40:30.0537] where ? [07:40:41.0497] `AsyncFunction` is one, no? [07:40:48.0503] hidden = internal [07:40:59.0208] i.e. not impacted by being frozen [07:41:02.0976] yes the hidden intrinsics, which we're trying to fix [07:41:53.0426] ashley has the correct wording there, AsyncFunction is accessible to JS code [07:42:03.0365] > <@mhofman:matrix.org> Ok what about 2 libraries that the application load. Should lib1 be able to interfere with lib2 ? If lib2 doesn't want _unintentional_ interference, they should use sufficiently unique type identities.
If lib1 performs _intentional_ interference, that's up to the application developer/host environment to mitigate, and a best effort from the runtime to surface that information early, hence the first-in wins registry. If you can only write new unique identities to the registry, either lib1 or lib2 will fail rather than be ignorant of hijacking. [07:43:28.0848] > <@aclaymore:matrix.org> i.e. not impacted by being frozen good thing RegExp.prototype isn't a regex anymore :-p [07:43:32.0050] If the concern is about the registry not being freezable, that could be made possible via an API as well. [07:43:54.0925] But again that would prevent the feature from working [07:44:59.0068] Yes, if malicious code were to freeze the registry your application would stop working. Which would be a fairly good indication that there is malicious code freezing the registry (at least, with a sufficiently clear error message). [07:45:14.0475] I want any JS feature where I can dynamically load code, that will keep working regardless of previous code that was loaded and executed (if you ignore mutation to the global object and intrinsics) [07:45:18.0897] Just like some packages not working with SES when it locks down the environment. [07:46:33.0313] It feels like you are arguing two opposing positions: Such a registry must be mutable so that code keeps working, but malicious code shouldn't be able to mutate it to hijack it. [07:48:23.0226] No, I'm arguing this feature cannot use a global registry with forgeable keys because it'd be impossible to make it safe [07:48:37.0145] Perhaps it would be better to outline the specific capabilities we want, the limitations of the environment and ecosystem, and the concerns we have for any given solution. From there we can better determine hard and fast requirements and find potential compromises. 
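The "first-in wins" behavior argued for above can be illustrated with a minimal sketch. `WriteOnceRegistry`, `register`, and `lookup` are hypothetical names invented for this illustration and are not part of the proposal or the origin trial; ordinary objects stand in for struct prototypes. The point is only that a second registration under the same identity fails loudly instead of silently replacing the first.

```js
// Hypothetical write-once ("first-in wins") registry. Names are illustrative only.
class WriteOnceRegistry {
  #entries = new Map();

  // Registers a prototype under a type identity; a duplicate identity throws.
  register(typeId, prototype) {
    if (this.#entries.has(typeId)) {
      throw new TypeError(`type identity already registered: ${typeId}`);
    }
    this.#entries.set(typeId, prototype);
  }

  // Returns the prototype registered first, or undefined if none was registered.
  lookup(typeId) {
    return this.#entries.get(typeId);
  }
}

const registry = new WriteOnceRegistry();
const typeId = "http://threejs.com/structs/Vector2";
const lib1Proto = { owner: "lib1" };
const lib2Proto = { owner: "lib2" };

// lib1 registers first and wins.
registry.register(typeId, lib1Proto);

// lib2 (or malicious code) cannot hijack the identity; the conflict surfaces as an error.
let conflict = null;
try {
  registry.register(typeId, lib2Proto);
} catch (err) {
  conflict = err;
}
```

Note that the failure is also what makes such a registry observable: whether `register` throws tells the caller that someone else already used the key.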
[07:49:38.0732] > <@mhofman:matrix.org> No, I'm arguing this feature cannot use a global registry with forgeable keys because it'd be impossible to make it safe I'm more than willing to entertain other solutions, but it's important that the developer experience doesn't make the feature wholly unusable. [07:49:56.0769] you can have a global registry, as long as it cannot be used by code that hasn't been previously introduced to each other to interact. [07:50:34.0459] Needing to patch or wire up the prototype in `onmessage` is terrible DX. [07:51:06.0321] How would you describe how these introductions should work? [07:51:34.0116] This is the reason I would block any string based registry that you can ask the question (directly or indirectly) "do you have this key" [07:52:15.0082] > <@rbuckton:matrix.org> Needing to patch or wire up the prototype in `onmessage` is terrible DX. I say this as someone who wrote a package that implements struct-like behavior backed by `SharedArrayBuffer` that does this. The only reason it works is that the "structs" you create are fully typed, so you don't have to walk a graph to wire up prototypes because it does that for you. [07:52:39.0031] > <@mhofman:matrix.org> you can have a global registry, as long as it cannot be used by code that hasn't been previously introduced to each other to interact. * How would you describe how these introductions should work? [07:54:51.0017] > <@mhofman:matrix.org> This is the reason I would block any string based registry that you can ask the question (directly or indirectly) "do you have this key" How deep does this concern extend? Agent-scoped registries, apparently, but what about realm-scoped registry? [07:55:17.0766] realm scoped registries too [07:58:22.0001] To clarify, your concern is that such a string-based registry could be used as a side-channel for communication (i.e., it could be used to exfiltrate data)?
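To make the "do you have this key" question concrete, here is a small sketch of how a shared, string-keyed, write-once registry could carry information between two pieces of code that were never introduced to each other. Everything here is an assumption for illustration: a plain `Set` stands in for the registry, and `tryRegister`, `sendBit`, and `receiveBit` are hypothetical helpers, not proposed APIs.

```js
// Stand-in for a realm-wide (or agent-wide) string-keyed, first-in-wins registry.
const sharedRegistry = new Set();

// Returns true if the key was free and is now taken, false if it was already taken.
function tryRegister(key) {
  if (sharedRegistry.has(key)) return false;
  sharedRegistry.add(key);
  return true;
}

// Sender: encodes a 1 by occupying an agreed-upon key, a 0 by leaving it free.
function sendBit(slot, bit) {
  if (bit) tryRegister(`covert:${slot}`);
}

// Receiver: probes the key. A failed registration means the sender occupied it.
function receiveBit(slot) {
  return tryRegister(`covert:${slot}`) ? 0 : 1;
}

const secret = [1, 0, 1, 1];
secret.forEach((bit, slot) => sendBit(slot, bit));

// The receiver never held a reference to the sender, only to the shared registry.
const leaked = secret.map((_, slot) => receiveBit(slot));
```

The same mechanism doubles as a denial of service: any code that occupies a key first makes a later legitimate registration under that key fail.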
[07:59:02.0354] hm there seems to be a lot of back and forth i've missed here, is there a tl;dr (no rush)? [07:59:06.0508] In the solution I presented, the application either has to do a manual bootstrap (sending once over postMessage the unforgeable type identifier, and wire it to the expected implementation), or give up on prototype continuity (aka an incoming shared struct's dynamic proto would not be the same object as the dynamic proto of an "equivalent" shared struct created locally) [07:59:47.0538] side channel and/or denial of service [08:01:16.0426] If we had module block with stable identity over postMessage, the discontinuity could be mitigated [08:01:22.0367] > <@mhofman:matrix.org> In the solution I presented, the application either has to do a manual bootstrap (sending once over postMessage the unforgeable type identifier, and wire it to the expected implementation), or give up on prototype continuity (aka an incoming shared struct's dynamic proto would not be the same object as the dynamic proto of an "equivalent" shared struct created locally) What if such a registry were per-channel? i.e., a `MessagePort` might have an internal registry of type-identity to prototype, and you'd have to set up that registry on each side? [08:02:09.0396] > <@mhofman:matrix.org> If we had module block with stable identity over postMessage, the discontinuity could be mitigated Because you send the module block over `postMessage` to evaluate so that both sides have the same representation? 
[08:02:58.0646] I think a per-channel registry would likely exacerbate identity discontinuity issues, but since the registration would be tied to a non-global capability, it would be acceptable to use forgeable identifiers [08:06:54.0381] My concern with bootstrapping with a module block is how much other information would need to be bootstrapped along with it to support code-sharing (i.e., imports and package dependencies) making it harder for the other side to ensure initialization is consistent (i.e., any registration that needs to happen, or perhaps conditional imports depending on whether the code is running in the browser main thread or in a worker, etc.). And if that were easily solvable, we could also potentially solve that by bootstrapping just with the struct type definition itself (i.e., can we do this without depending on module blocks?). [08:08:36.0865] > <@mhofman:matrix.org> I think a per-channel registry would likely exacerbate identity discontinuity issues, but since the registration would be tied to a non-global capability, it would be acceptable to use forgeable identifiers From an API standpoint, I could potentially see there being some kind of built-in `SharedStructTypeIdentityRegistry` that a package could export, such that a host could import the registry and use it to configure a `MessagePort`. [08:10:09.0321] > <@shuyuguo:matrix.org> hm there seems to be a lot of back and forth i've missed here, is there a tl;dr (no rush)? How to manage type identity across Agents/Realms/etc. in a way that doesn't violate security concerns, but works for bundlers (so solely relying on module id/file path isn't viable). [08:10:18.0092] > <@rbuckton:matrix.org> Because you send the module block over `postMessage` to evaluate so that both sides have the same representation? Kinda.
Each agent could have its own type for an "equivalent" shared struct, but if the code declaring the struct uses a module instance to define the struct's behavior / dynamic prototype, that module instance can be automatically loaded by other realms. When receiving a struct of that type, the observed prototype object would be the same as the prototype of the locally defined struct, if the local definition used the same module instance (which requires module instances to be stable across agents). The "mostly mitigated" part is because the constructor couldn't be on the prototype, as that is realm specific of course [08:12:56.0519] > <@rbuckton:matrix.org> From an API standpoint, I could potentially see there being some kind of built-in `SharedStructTypeIdentityRegistry` that a package could export, such that a host could import the registry and use it to configure a `MessagePort`. I did not understand this suggestion [08:16:01.0455] I'm mostly suggesting a way to simplify the DX around registration, making it easier to combine registries from multiple packages, and to define the composite registry on each side of a message port. [08:17:40.0454] The code that runs on each side could be different for the same struct type given practices such as bundling and tree shaking, and in some cases a struct type may need a method tailored to an environment (i.e., can its methods access node-native bindings, can this method only be run in the browser), etc. [08:18:43.0447] Having an observably similar prototype is profoundly useful, but mandating a similar prototype limits flexibility. [08:20:33.0152] TLDR here is that registration would be implicit the first time a struct of a given type is shared with another realm/agent if the struct definition used a module instance at declaration. If the same module instance loaded in 2 different agents has a stable identity, aka if `receivedModuleInstance === importedModuleInstance`, then you don't have any prototype discontinuity issues.
If the bundler messes up these module identities, then the program has to first send the type identity explicitly over a `postMessage` and the receiving code has to manually do `SharedStructType.register(receivedModuleType, importedModuleInstance)` [08:25:40.0850] Btw, you can use string identifier over `postMessage` for the application to know what the opaque type identifier is about, restoring your string based semantics. That would be scoped to whatever is holding the string identifier registry (possibly the channel itself).
[08:27:18.0839] > <@mhofman:matrix.org> Btw, you can use string identifier over `postMessage` for the application to know what the opaque type identifier is about, restoring your string based semantics. That would be scoped to whatever is holding the string identifier registry (possibly the channel itself). That's still a poor DX as it would significantly overcomplicate `onmessage` handlers by requiring custom scaffolding in every project to bootstrap. [08:28:43.0142] The `SharedStructTypeIdentityRegistry` suggestion above would handle the per-channel string identifier registry without requiring prior communication between each side of a channel. [08:38:05.0242] Consider, for example:
```js
// node_modules/threejs/src/registry.js
export const structRegistryWriter = new StructTypeRegistryWriter();
export const structRegistry = structRegistryWriter.registry;

// node_modules/threejs/src/math/Vector2.js
import { structRegistryWriter } from "../registry.js"
@structRegistryWriter.register("http://threejs.com/structs/Vector2")
shared struct Vector2 { ... }

// main.js
import { Vector2, structRegistry } from "threejs";
const worker = new Worker("./worker.js", { structRegistry });
worker.postMessage(new Vector2(0, 0));

// worker.js
import { structRegistry } from "threejs";
import { parentPort } from "worker_threads";
parentPort.addRegistry(structRegistry);
parentPort.on("message", v => { v.whatever(); });
```
[08:39:19.0419] Both the main and worker threads load the same struct type, though they could be in different bundles and tree shaking could remove some members. They independently associate type registries with their sides of the channel. [08:40:54.0466] While not automatic, the registration mechanism stays as out of the way as possible to simplify the developer experience. [08:41:42.0995] Basically it'd look like
```js
// Done by all workers independently
import Vector2DPrototype from "vector2dBehavior.js" with { reflect: "module" };
const { constructor: Vector2D, type: Vector2DType } = SharedStructType.prepare(["x", "y"]);
SharedStructType.register(Vector2DType, Vector2DPrototype);
const Vector2DUUID = '...';
const structAwareChannel = new MessageChannel();
structAwareChannel.register(Vector2DUUID, Vector2DType, Vector2DPrototype);

// worker1.js
const v1 = new Vector2D({x: 1, y: 2});
structAwareChannel.postMessage({v1});

// worker2.js
channel.on("message", ({v1}) => {
  console.log(v1 instanceof Vector2D); // true because prototype match even though type instance different
});
```
[08:44:56.0334] That's definitely not great for the actual proposal, that seems far too easy to get wrong. Maybe ok in the prototyping stage, but not long term. [08:45:23.0084] I actually don't think our approaches diverge that much [08:45:46.0644] What happens if `vector2dBehavior.js` requires a package already loaded on the worker? Will it import the worker version, or carry along its transitive dependencies to be re-evaluated on the worker? [08:46:13.0735] if the struct types are declared independently, they will have different types, and thus different constructors [08:47:20.0222] > <@rbuckton:matrix.org> What happens if `vector2dBehavior.js` requires a package already loaded on the worker?
Will it import the worker version, or carry along its transitive dependencies to be re-evaluated on the worker? Doesn't matter? [08:48:07.0353] Your suggestion depends on features not present in the origin trial, but uses an API that is likely unique to the origin trial. @shu can clarify, but I believe the reason the trial uses a `SharedStructType` constructor in place of syntax as that's much easier to add behind a flag to get early feedback on. I'm not sure how likely we could depend on a proposal like `with { reflect: "module" }` or module blocks, which are nowhere near ready for adoption. [08:48:44.0496] > <@mhofman:matrix.org> I actually don't think our approaches diverge that much * Your suggestion depends on features not present in the origin trial, but uses an API that is likely unique to the origin trial. @shu can clarify, but I believe the reason the trial uses a `SharedStructType` constructor in place of syntax as that's much easier to add behind a flag to get early feedback on. I'm not sure how likely we could depend on a proposal like `with { reflect: "module" }` or module blocks, which are nowhere near ready for adoption. [08:49:26.0184] Yes I agree a dependency on module instances is not great, hence why I removed that from my original proposal. We can skip it here, the only effect is that you won't get implicit fallback registration [08:49:26.0229] > <@mhofman:matrix.org> Doesn't matter? It matters if the dependent module has side-effects, or if it doubles the amount of runtime memory used because it depends on a large package. 
[08:50:05.0410] the behavior of loading a module instance is a general question for module import to answer, and is not really relevant for this proposal [08:50:44.0666] I only included it so that a behavior can be implicitly shared with other realms/agent without having to rely on a per channel registration on each side [08:53:23.0660] It would be great to find an approach that allows for the implicit import of behavior for cases that don't care about prototype continuity, yet not force a dependency on module instances [09:01:32.0758] > <@rbuckton:matrix.org> Your suggestion depends on features not present in the origin trial, but uses an API that is likely unique to the origin trial. @shu can clarify, but I believe the reason the trial uses a `SharedStructType` constructor in place of syntax as that's much easier to add behind a flag to get early feedback on. I'm not sure how likely we could depend on a proposal like `with { reflect: "module" }` or module blocks, which are nowhere near ready for adoption. that is accurate [09:12:24.0275] I'm of two opinions on prototype continuity, each based on the overall direction we take for the proposal: If the methods of shared structs only had access to a thread-safe locked-down global in a globally shared realm where they can only access other shared functions or shared structs or imports of the same (i.e., something like the "shared module" approach), then I favor prototype continuity because it's easy and has an already limited surface area.
If the methods of shared structs are derived from code independently run in each Agent, then I only care about prototype continuity insomuch as a well-written program should be importing the same shared struct definition from the same file/package in each agent, and that any prototype continuity that exists will fall out naturally from that. By loosening any restriction on prototype continuity, bundlers can take advantage of tree shaking, renaming, inlining, etc. Since we seem to be leaning towards the latter approach, I'm less concerned about prototype continuity. [09:44:39.0756] I think we have a different definition of continuity here. I probably should have said "stability" [09:45:45.0000] All your examples seem to rely on prototype stability by requiring `instanceof` checks between received structs and the local constructor to pass [09:46:09.0332] This is the difficult part [09:57:21.0751] `instanceof` is a function of the constructor, not the prototype. In my origin-trial derived example, I mistakenly included the constructor in the `register` API, though the runtime doesn't really need to know about the constructor. It would be better written as:
```js
const { type: _Vector2D, register } = SharedStructType.prepare(["x", "y"]);

export function Vector2D(x = 0, y = 0) {
  const _this = Reflect.construct(_Vector2D, [], new.target);
  _this.x = x;
  _this.y = y;
  return _this;
}

Vector2D.prototype.distanceTo = function (v) {
  const dx = this.x - v.x;
  const dy = this.y - v.y;
  return Math.sqrt(dx * dx + dy * dy);
};

register("e3a9bd1f-0f64-4848-b255-3c629d0c44a3", Vector2D.prototype);
```
You bind the type identity to the prototype to use for that struct type. `x instanceof Vector2D` then falls out of `Vector2D` checking whether `x` has `Vector2D.prototype` in its prototype chain. [10:14:06.0513] `instanceof` is a check on the prototype object as retrieved from a Constructor.
The constructor is the way to express the check, but has nothing to do with the check itself [10:16:05.0603] I think what I'm trying to say, is that if you want to make prototype match on different constructors, you can't really have a `constructor` property on the prototype object [10:23:57.0738] `instanceof` doesn't check `prototype.constructor` though, so the absence or presence of `constructor` seems like a separate issue. [10:26:10.0858] In my example, you want the local version of `Vector2D` to use the custom constructor to perform field assignments or any other custom logic. So `prototype.constructor` locally would refer to `Vector2D`. Also, I'm not necessarily saying that construction needs to occur this way, but it seemed the simplest way to express the concept and aligns with a known metaphor (ES5 class-likes). [10:27:25.0484] Right, I'm just saying having `prototype.constructor` would be difficult to have correct if you share a prototype between multiple struct types (imported vs locally defined) [10:28:12.0309] Why would you share a prototype? That generally wouldn't be the case in a post-origin-trial syntactic `shared struct` declaration. [10:29:07.0865] if you want `instanceof` to return true [10:29:09.0232] I mean, you _can_ share a prototype between multiple classes today, but that's not a common occurrence. [10:29:14.0165] I'm very very confused [10:29:47.0280] do you care about `receivedStruct instanceof LocalConstructor` being true or not ? [10:30:50.0287] if you do, you need either the receivedStruct to be the same underlying type as the locally defined one, which is impossible without global registries, or you have to have the different struct types share a prototype [10:30:59.0646] `x instanceof Y` doesn't care about `prototype.constructor`, it asks `Y` "is `x` an instance of you?". `Y` then looks at `Y.prototype` and recursively checks to see if it is in `x`'s prototype chain. 
[11:16:31.0631] instanceof is actually hookable [11:16:38.0478] which i honestly did not know until like a month ago [11:17:59.0866] sadly yes, which makes it even less reliable [11:18:05.0869] via `[Symbol.hasInstance]`, yes [11:18:06.0383] yeah with the Symbol.hasInstance on the RHS, but that would require the local constructor to be aware of the other "equivalent" types, which may be knowledge scoped to a local registry [11:18:58.0541] basically it all boils down to resolving scope conflicts
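The `instanceof` points made in this exchange can be checked with ordinary objects. The sketch below uses plain functions and a class rather than shared structs (which are not available outside the origin trial), and the names `Local`, `Remote`, and `Hooked` are made up for illustration. It shows that `instanceof` ignores `prototype.constructor`, that two constructors sharing one prototype object both accept the same instance, and that `Symbol.hasInstance` on the right-hand side can override the check entirely.

```js
function Local() {}
function Remote() {}

const value = Object.create(Local.prototype);

// Default behavior: is Local.prototype on value's prototype chain?
const beforeSwap = value instanceof Local;

// Repointing prototype.constructor does not change the result.
Local.prototype.constructor = Remote;
const afterSwap = value instanceof Local;
const remoteBeforeShare = value instanceof Remote;

// Two constructors sharing one prototype object both report true.
Remote.prototype = Local.prototype;
const remoteAfterShare = value instanceof Remote;

// Symbol.hasInstance lets the right-hand side decide, independent of prototypes.
class Hooked {
  static [Symbol.hasInstance](candidate) {
    return typeof candidate === "object" && candidate !== null && candidate.tag === "vector2";
  }
}
const hookedMatch = { tag: "vector2" } instanceof Hooked;
const hookedMiss = value instanceof Hooked;

console.log({ beforeSwap, afterSwap, remoteBeforeShare, remoteAfterShare, hookedMatch, hookedMiss });
```

The shared-prototype case is the one that makes a meaningful `prototype.constructor` hard to maintain, since a single prototype object can only point back at one constructor.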