12:00
<rbuckton>

rbuckton: i applaud the scope of "shared modules" but i feel that is tantamount to designing a new language, which is not incremental and significantly increases risk of adoption or shipping anything.

put another way, IMO the only realistic way to move the needle for multithreading for JS is to have an opt-in carve out for shared memory. shared structs is that opt-in. to have "shared modules" seems to require the capability to write code that is actual parallel and threadsafe in general, including a threadsafe stdlib. were i to do a greenfield project i'd design a stdlib with that in mind but i feel like that would be too much to bite off at the moment?

I'm coming at this from two different directions, with two different outcomes:

  1. Design something transformative for the language, adding cohesive and comprehensive new capabilities.
  2. Design something tacked on to the language, adding a minimal set of capabilities necessary to solve a specific problem.

Option 1 is complex, takes longer, and requires a fair amount of "big design upfront", all to support capabilities we may or may not need. However, the end goal is to have something cohesive that is future-leaning with few "warts". That's what "shared modules" is, or is intended to be.

Option 2 is far simpler (though not trivial), shorter term, and focused. While this approach allows for incremental change, we may later find that related capabilities are harder to implement because of design decisions we make now, potentially leaving more "warts" in the design over time.

I'm not opposed to either direction, but it's worth considering the former even if only to inform the latter.

12:01
<rbuckton>
more i think about it, i think the non-threadsafe stdlib thing is the actual showstopper for me
The "shared modules" approach would have required a threadsafe stdlib subset (i.e., make operations threadsafe if possible, and throw when not).
12:02
<rbuckton>
I find the "only share data, but register per-thread behavior" approach acceptable, so long as we have a reasonable way to register per-thread behavior.
12:04
<rbuckton>
That's the approach I took with https://esfx.js.org/esfx/api/struct-type.html?tabs=ts, though it's not so much "registration" and is more of a "wrap an ArrayBuffer with a DataView"-like approach.
14:30
<shu>
i'm saying something stronger for (1). my intuition is that "transformative" for JS actually means "not adoptable and unrealistic"
17:42
<littledan>
Yeah, I think it's good to consider 1, but after years of considering it, I haven't seen a realistic path for it. Do you? This informs my agreement with Shu's choice of 2.
17:43
<littledan>
Are there other things about it we should consider? Other potential takeaways to inform things further?
20:29
<Mathieu Hofman>
I remain convinced that sharing initialization code between realms, and in this case between agents, is also a general problem that is relevant for this proposal and for others (extensible cloning for example, or recursive initialization of ShadowRealms). I believe this problem can be solved without being transformative to the language, just with targeted API additions and relying on other proposals in progress like module expressions. So going towards (2) does not mean the solution has to be specific to this proposal.
20:30
<littledan>
We're all agreed on sharing code, the question is whether it has to be the same copy of the code or not
20:30
<littledan>
my idea with module expressions has always been, it's the same code but different copies of it; different instances of the same module
20:32
<Mathieu Hofman>
Right, I agree. I think multiple copies is sufficient, we just need to solve the registration ergonomics, which has different approaches possible, and where rbuckton and I disagree.
20:33
<littledan>
yeah I agree registration ergonomics is fraught. Could you elaborate on the disagreement?
20:33
<littledan>
bterlson: I think I saw you typing?
20:34
<Mathieu Hofman>
Basically Ron suggested an implicit shared cross agent registry, which is a non starter in my book.
20:34
<bterlson>
littledan: I was just catching up and typed a bit :-D Just got back from paternity and have been out for many an eon, have no idea what's going on
20:34
<bterlson>
(and I want this feature in my project I'm working on)
20:35
<littledan>
Basically Ron suggested an implicit shared cross agent registry, which is a non starter in my book.
Huh, I don't see why we'd bother with a new registry when we already have the module map
20:35
<littledan>
(and I want this feature in my project I'm working on)
Oh! Could you say more?
20:36
<bterlson>
wasm stuff, can't say more yet :P
20:37
<littledan>
are we going to start living the multithreaded wasm gc dream?
20:37
<Mathieu Hofman>
The module map is agent specific, different agents may have a different module map. The main question is independent initialization and what happens when you receive a shared struct from another agent, whether there is a relation to the "same" shared struct that may have been declared in the local agent
20:38
<Mathieu Hofman>
basically identity discontinuity
20:39
<littledan>
wait, I thought a limited form of "identity discontinuity" was a given: We're taking it that we have different copies of the functions and prototype (which ideally do the same thing)
20:39
<littledan>
do you mean, the risk that it won't just be identity discontinuity, but a greater level where the actual behavior won't match up?
20:40
<littledan>
for example, I imagined that each thread has its own global object which is mutable and all, and so different prototypes on different threads will access different things and experience different behavior
20:40
<littledan>
but that this is smoothed over because [[GetPrototype]]() finds the "current" prototype given your agent
20:41
<Mathieu Hofman>
if you receive Vector from agent 1, and yourself define and instantiate Vector, should these 2 objects share the same implementation locally?
20:41
<littledan>
they should, but defining "same implementation" is a bit complicated
20:42
<littledan>
and I guess it's a question of whether "should" or "must" is what we're going for
20:42
<Mathieu Hofman>
should you see the same prototype object ?
20:42
<littledan>
oh sorry I misread the question... yes they should have the identical prototype identity IMO
20:42
<Mathieu Hofman>
or is it acceptable for these 2 objects to have equivalent prototype objects that are not the same
20:42
<littledan>
what difficulties would we have in getting there?
20:44
<Mathieu Hofman>
How do you make that happen when both agents have independently defined their own Vector shared struct? I suppose the question is how they defined the behavior of those shared structs
20:45
<Mathieu Hofman>
what is the identity used to say they're the "same"
20:45
<littledan>
I think this identity could be keyed by a module specifier. This could be either a string specifier or module block
20:45
<littledan>
that is, if you want to make a shared struct with a non-null prototype, it has to be exported from a module
20:46
<littledan>
the pair of the (absolute) module specifier + export name is the key
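A minimal sketch of that keying scheme, under stated assumptions: `structKey` is a hypothetical helper (not part of any proposal or platform API), and the example URLs are made up. It resolves a specifier against the importing module's URL and pairs the result with the export name. Two agents only agree on the key if the specifier resolves identically in both.

```javascript
// Hypothetical helper: builds a key from an absolute module specifier plus an
// export name. The "#" separator is an arbitrary choice for this sketch.
function structKey(specifier, baseURL, exportName) {
  const absolute = new URL(specifier, baseURL).href;
  return `${absolute}#${exportName}`;
}

// Same directory in both agents: the relative specifier resolves identically.
const keyInMain = structKey("./vector.js", "https://example.com/app/main.js", "Vector");
const keyInWorker = structKey("./vector.js", "https://example.com/app/worker.js", "Vector");

// A differently laid out agent (for example, a bundled one) resolves elsewhere.
const keyInBundle = structKey("./vector.js", "https://example.com/dist/bundle.js", "Vector");

console.log(keyInMain === keyInWorker); // true
console.log(keyInMain === keyInBundle); // false
```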
20:46
<Mathieu Hofman>
if it's a string, you're dealing with module maps that may not resolve the same way between agents
20:46
<Mathieu Hofman>
module blocks do not currently preserve their identity through structured cloning
20:46
<littledan>
both good points
20:46
<littledan>
module blocks do not currently preserve their identity through structured cloning
we would have to switch this attribute of module blocks if we wanted to enable this usage
20:47
<littledan>
if it's a string, you're dealing with module maps that may not resolve the same way between agents
I'm willing to take this risk, but it's a value judgement
20:47
<Ashley Claymore>
I'm curious where we imagine the module loading would be awaited
20:47
<littledan>
re: the string risk, this is not a risk for meeting this property of the shared struct prototypes not matching. It just means different methods would be available on the different sides of the boundary
20:48
<littledan>
this is also an important problem. My suggestion would be, [[GetPrototype]]() throws if the module isn't already loaded.
20:48
<Mathieu Hofman>
module blocks is however my suggestion, and it doesn't solve the independent initialization use case. By definition one agent has to init first, and share the module block definition through postMessage, then agent 2 has to define the shared struct
20:49
<littledan>
module blocks is however my suggestion, and it doesn't solve the independent initialization use case. By definition one agent has to init first, and share the module block definition through postMessage, then agent 2 has to define the shared struct
well, what if agent 1 doesn't bother sending over the module block, and just starts by sending the shared struct?
20:49
<Mathieu Hofman>
I believe however that having the same prototype object is not strictly necessary for most use cases
20:49
<littledan>
I worry that forcing module blocks rather than also allowing string specifiers will make initialization of programs too awkward and therefore impractical
20:51
<Mathieu Hofman>
it's possible that the answer is indeed to do both
20:52
<rbuckton>
The module map is agent specific, different agents may have a different module map. The main question is independent initialization and what happens when you receive a shared struct from another agent, whether there is a relation to the "same" shared struct that may have been declared in the local agent
I'm not so concerned about identity discontinuity. If each agent/realm is required to load its own copy of the behavior for a struct, the code defining that behavior could be different per agent/realm because of bundling/minification/tree shaking.
20:53
<Mathieu Hofman>
How is it a problem that it's different?
20:53
<rbuckton>
I'm not so concerned about identity discontinuity. If each agent/realm is required to load its own copy of the behavior for a struct, the code defining that behavior could be different per agent/realm because of bundling/minification/tree shaking.
I may have misinterpreted "identity discontinuity". I'm thinking about behavior, not identity.
20:54
<littledan>
Good, we are all trying to accomplish the same thing then
20:55
<Mathieu Hofman>
by identity discontinuity I meant that the prototype object for a vector received from another agent may not be the same as the prototype object for a vector object defined and instantiated in the local agent.
20:56
<rbuckton>
What matters to me is that there is a way to define Vector in two threads (A and B) such that a Vector created in A is also a Vector in B, and vice versa. Also, that a Vector created in A, sent to B, and then sent back to A is still a Vector in A.
20:57
<rbuckton>
A and B may have subtly different implementations of Vector (due to tree shaking, etc.), so there needs to be some way for A and B to coordinate what a Vector is.
20:59
<rbuckton>
The tree shaking concern isn't conjecture either. If "the way" to do multithreading in JS is going to require duplicating runtime code in each thread, developers are going to want to find ways to minimize the memory footprint of short-lived threads by tree shaking away unused functionality.
20:59
<Mathieu Hofman>
Right, but the main question is whether this should be possible without an explicit "synchronization" message between A and B
21:00
<littledan>
A and B may have subtly different implementations of Vector (due to tree shaking, etc.), so there needs to be some way for A and B to coordinate what a Vector is.
so, I guess the question is, whether we especially want to permit this difference
21:00
<rbuckton>
Right, but the main question is whether this should be possible without an explicit "synchronization" message between A and B
Or a way to register the prototype via a Worker
21:01
<rbuckton>
so, I guess the question is, whether we especially want to permit this difference
I think it's important to permit it. If we can't share code, we need to be able to reduce overhead in other ways.
21:01
<littledan>
The tree shaking concern isn't conjecture either. If "the way" to do multithreading in JS is going to require duplicating runtime code in each thread, developers are going to want to find ways to minimize the memory footprint of short-lived threads by tree shaking away unused functionality.
I can see how this is nice to have; what I have trouble understanding is whether it's an absolute blocker. There are reasons, on the other hand, to prefer that this correspondence in the code loaded is mandatory
21:01
<Mathieu Hofman>
Aka if and how can A and B define their Vector independently and for the vector instance sent by A to B to have the same prototype as the one created independently by B
21:03
<littledan>
Actually: I think it should work just fine to use a small module which just contains the field definitions, and then have other chunks of code (different in different workers/agents) which go back and install methods on the class.
21:03
<Mathieu Hofman>
My understanding is that module identifiers are somewhat problematic with bundlers
21:03
<Mathieu Hofman>
and I don't see a way to accomplish this with module blocks
21:04
<rbuckton>
Bundlers, minifiers, and treeshakers exist for a reason. Performance is important, so whatever solution we come up with must be able to handle web reality.
21:04
<littledan>
Actually: I think it should work just fine to use a small module which just contains the field definitions, and then have other chunks of code (different in different workers/agents) which go back and install methods on the class.
these later "installation" chunks of code could use various different techniques to minimize the memory footprint, as rbuckton mentioned
21:04
<littledan>
the must-be-the-same parts are really limited to, what field names are there (which is mandatory to correspond anyway)
21:05
<Mathieu Hofman>
How do you "attach" the behavior to a struct definition? What is the shared identity of the struct definition
21:05
<littledan>
How do you "attach" the behavior to a struct definition? What is the shared identity of the struct definition
by mutating the prototype
21:06
<littledan>
this is just about making sure that we can call methods on these instances
21:06
<littledan>
you'd just be mutating the prototype on your own agent-local copy of course
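A rough sketch of the split described above, assuming an ordinary class as a stand-in for a shared struct (the `shared struct` syntax is not available in any shipping engine). `installMethods` is a hypothetical helper name; the point is only that the field layout is fixed in one small piece of code, while each agent installs its own methods on its local prototype.

```javascript
// Stand-in for the small, must-match module: it only fixes the field names.
class Vector {
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }
}

// Hypothetical per-agent "installation" chunk. Each agent may ship a different
// (for example, tree-shaken) set of methods and mutates only its local prototype.
function installMethods(proto, methods) {
  Object.defineProperties(proto, Object.getOwnPropertyDescriptors(methods));
}

installMethods(Vector.prototype, {
  length() {
    return Math.hypot(this.x, this.y);
  },
});

const v = new Vector(3, 4);
console.log(v.length()); // 5
```

An agent that never calls `length()` could simply skip that installation chunk; the instances would still carry the same `x` and `y` fields.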
21:07
<littledan>
Bundlers, minifiers, and treeshakers exist for a reason. Performance is important, so whatever solution we come up with must be able to handle web reality.
I think we don't quite know yet how much of this will need to differ per agent in a single cooperating application. But generally I see your point. But I think the technique I explained above should be enough.
21:07
<rbuckton>
My suggestion is to use a registry, possibly scoped to the Worker. Rather than message synchronization, the code loaded in the worker thread must register an association between a string key and a prototype. The thread that creates the Worker must also register the same key with a prototype, otherwise the behavior isn't available.
21:07
<littledan>
Ron, you know that TC39 doesn't like this kind of registry. Let's try to think of something else.
21:08
<littledan>
I don't yet understand the problem with my suggestion to use module specifiers, given that the weighty part of the class (the method definitions) can be factored out
21:09
<Mathieu Hofman>
as I've mentioned before a mutable registry with forgeable string keys scoped to an agent or realm is a non starter for me
21:10
<littledan>
I see how the registry solves the problem, but I don't understand why the other solution doesn't work as well.
21:10
<rbuckton>

Something like:

// main.js
import { Worker } from "worker_threads";
import { Vector } from "./vector.js";

const worker = new Worker("./worker.js", { structTypes: { "some-string-for-vector": Vector.prototype } });
worker.postMessage(new Vector());

// worker.js
import { parentPort } from "worker_threads";
import { Vector } from "./vector.js";

parentPort.addStructType("some-string-for-vector", Vector.prototype);
parentPort.onmessage = msg => {
  msg // a Vector
};
21:11
<Mathieu Hofman>
A per channel registry however like above is fine
21:11
<rbuckton>
What I'm suggesting is a per-channel registry.
21:12
<littledan>
huh, I don't understand how this would be implemented. It has to be that we don't have wrapper objects per instance. So the channel can't be transforming what goes across it.
21:12
<rbuckton>

An alternative, which makes things even less mutable might be:

import { parentPort, MessagePortWrapper } from "worker_threads";
const parent = new MessagePortWrapper(parentPort, { structTypes: { ... } });
parent.onmessage = msg => { ... };
21:13
<littledan>
I don't see how a registry not scoped to the agent could satisfy the no-linear-work-across-postmessage property
21:14
<rbuckton>
Ah, I think I see what you mean. You want the remote agent to define these types once for the agent.
21:16
<littledan>
well, I don't necessarily want that particular thing. For this no-linear-work requirement, an agent-scoped registry would work if defined inside that agent (it just wouldn't meet Mathieu's goals)
21:16
<rbuckton>
But the same problem exists in the main thread as well. We wouldn't be able to handle per-channel prototype associations either.
21:16
<littledan>
also using the module map would work
21:17
<rbuckton>
also using the module map would work
Except for bundling?
21:17
<rbuckton>
brb, meeting.
21:17
<littledan>
But the same problem exists in the main thread as well. We wouldn't be able to handle per-channel prototype associations either.
That's right. The prototype has to be per-agent (resolved in GetPrototype), not per-channel
21:18
<littledan>
Except for bundling?
Yes, that's right, doing something based on the module map would require usage of native ESM. That's an argument against this approach. It's also a reason why module expressions might be relevant (if we fixed the cloning behavior)
21:19
<littledan>
I guess I just sort of believe that we're getting to a point where it can be OK to bet on native ESM
21:19
<Mathieu Hofman>
Module expressions require an explicit introduction however before defining the struct
21:20
<littledan>
Module expressions require an explicit introduction however before defining the struct
yes, definitely introduces its own awkwardness
21:21
<shu>
Mathieu Hofman: to clarify, an additional mutable registry is a non-starter for you, is that right?
21:22
<shu>
like, one rather crude thing is to use an existing mutable registry, like, the global scope
21:22
<littledan>
not the global scope!!!
21:22
<shu>
bro why
21:23
<shu>
it's already there
21:23
<shu>
love to use things that are already there!
21:23
<Mathieu Hofman>
depends on what is used as keys. if using forgeable string keys, it is. if using unforgeable objects, then it's equivalent to a WeakMap, which is fine
21:23
<littledan>
it's icky
21:24
<shu>
i'm gonna make a shirt that says "globals lover"
21:24
<shu>
the o in "global" will be a heart
21:24
<Mathieu Hofman>
oh the global scope is not shared across agents, the problem is that the registry would have to be shared between realms/agents for it to be useful, no ?
21:25
<shu>
i don't think it's a hard requirement for me that the registry itself be shared
21:25
<shu>
it's already part of the model that the app is responsible for ensuring the same code is loaded across agents, and hopefully we'll make that not too hard
21:25
<rbuckton>
Just to clarify, per-channel prototypes are a non-starter, which means a per-channel registry is probably a non-starter.
21:26
<shu>
ensuring the same key is used for the copies of the code would be part of that responsibility
21:26
<littledan>
shu: By that, do you mean, of course there needs to be a happy path where you load the same-acting prototype in each agent, but it's OK if you're able to load non-matching things?
21:26
<shu>
yes
21:26
<littledan>
(if so I agree; I don't see it being necessary that we force it to be the same code)
21:26
<rbuckton>
Which means the struct value->prototype relationship must be per-thread.
21:27
<littledan>
Which means the struct value->prototype relationship must be per-thread.
yes this is a given at the outset of the conversation; I thought we've been talking in those terms this whole time
21:28
<Mathieu Hofman>
In this "use the global scope" idea, what happens if the registration changes, aka the global property gets redefined. Do existing objects have their proto automagically change to the new object ?
21:28
<rbuckton>
yes this is a given at the outset of the conversation; I thought we've been talking in those terms this whole time
I'm just clarifying things, for myself if for no one else
21:29
<littledan>
In this "use the global scope" idea, what happens if the registration changes, aka the global property gets redefined. Do existing objects have their proto automagically change to the new object ?
It would be like, each time you do a property access resulting in [[GetPrototypeOf]] (so it's not an own property), the global would be looked up to see how it resolves
21:29
<Mathieu Hofman>
My concern with a forgeable key global registry was about libraries fighting to define the same thing
21:29
<shu>
why is that different than loading competing polyfills?
21:30
<shu>
or rather, is it different? maybe it is
21:30
<littledan>
I mean, I think TC39 added modules to solve this problem
21:30
<rbuckton>
So, value->prototype is per-thread, and n threads need to coordinate this value->prototype relationship to properly communicate, which means there must be some type of unique identity associated with the value->prototype relationship that can be reproduced in each thread.
21:30
<littledan>
because competing over globals was a real problem in JS
21:30
<Mathieu Hofman>
Shared struct objects wouldn't be allowed to be frozen, would they?
21:31
<shu>
no, they can't be frozen without violating the "immutable shapes" requirement
21:31
<shu>
else all property accesses would need to synchronize
21:31
<littledan>
sure but there can be born-frozen shared structs (hypothetically--not arguing it should be prioritized)
21:31
<shu>
they should be able to be made frozen-from-construction, however
21:31
<shu>
yes
21:31
<littledan>
(I don't see a reason to add this, but I also don't see what it would break)
21:31
<shu>
oops, editor call and other mtgs, bbl
21:32
<Mathieu Hofman>
the prototype of a frozen object cannot change
21:32
<Mathieu Hofman>
so it can't be looked up dynamically for these frozen structs
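The constraint can be seen with ordinary objects today. The sketch below uses plain objects as stand-ins for shared structs (an assumption, since shared structs are not implemented): once an object is frozen it is non-extensible, its prototype can no longer be changed, and even a Proxy is not allowed to report a different prototype for it.

```javascript
const protoA = { describe() { return "A"; } };
const protoB = { describe() { return "B"; } };

// A frozen object is non-extensible, so its [[Prototype]] is fixed.
const frozen = Object.freeze(Object.create(protoA));

// Reflect.setPrototypeOf reports failure by returning false rather than throwing.
const changed = Reflect.setPrototypeOf(frozen, protoB);

// A Proxy cannot fake a different answer either: the getPrototypeOf trap must
// report the target's real prototype when the target is non-extensible.
const dynamic = new Proxy(frozen, { getPrototypeOf() { return protoB; } });
let trapError = null;
try {
  Object.getPrototypeOf(dynamic);
} catch (err) {
  trapError = err;
}

console.log(changed, trapError instanceof TypeError); // false true
```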
21:32
<littledan>
well, it would never change within an agent
21:32
<littledan>
oh, right
21:32
<littledan>
yeah, that's something we could accomplish with modules but not globals
21:32
<littledan>
it'd have to be a const module export!
21:36
<Mathieu Hofman>
I also have got to get back to work. littledan hopefully that highlights where the remaining problems are with different copies of behavior code
21:42
<littledan>
I have to work too! I think this helped advance all of our understanding and we're at a good place to stop
21:42
<littledan>
I'm more OK with different copies/behaviors of code than I am with using the global object, which leads to classic namespace management problems
21:43
<rbuckton>

I wonder if we could accomplish a per-thread registry with workers via a preload mechanism? Would that be an acceptable compromise for that approach?

Consider:

  • The main thread must load/evaluate all struct types it intends to use to communicate with workers.
  • Struct types with behavior must have an associated unique identity (possibly user defined).
  • When the main thread constructs a Worker, it must specify a preload module in addition to the regular worker script.
  • The preload module is loaded in the Worker first, and should load/evaluate all struct types so that their type->prototype mapping is loaded. This is to be considered a privileged operation, so developers should take care with how third-party code is loaded at this time.
  • After the preload module has been evaluated, the Worker's struct type registry is locked down and the regular worker script/module is evaluated. It reuses the same module cache as the preload script, so module identities and reference identities are consistent between preload and normal execution.
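The lock-down step in the list above could look roughly like the following. This is only a sketch under stated assumptions: `StructTypeRegistry`, its method names, and the identity strings are all hypothetical, and a real mechanism would live inside the engine or host rather than in user code.

```javascript
// Hypothetical per-thread registry: identity string -> prototype.
class StructTypeRegistry {
  #types = new Map();
  #locked = false;

  register(identity, prototype) {
    if (this.#locked) {
      throw new TypeError(`registry is locked; cannot register "${identity}"`);
    }
    if (this.#types.has(identity)) {
      throw new TypeError(`"${identity}" is already registered`);
    }
    this.#types.set(identity, prototype);
  }

  // Called by the host once the preload module has finished evaluating.
  lock() {
    this.#locked = true;
  }

  lookup(identity) {
    return this.#types.get(identity);
  }
}

const registry = new StructTypeRegistry();
const pointPrototype = { toString() { return `${this.x},${this.y}`; } };

// Preload phase: registration is allowed.
registry.register("example-point-id", pointPrototype);
registry.lock();

// Regular worker script: registration is rejected.
let lateError = null;
try {
  registry.register("late-type", {});
} catch (err) {
  lateError = err;
}
console.log(lateError instanceof TypeError); // true
```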
21:44
<rbuckton>
This approach is similar to Electron's preload mechanism
21:46
<rbuckton>
https://www.electronjs.org/docs/latest/tutorial/tutorial-preload, for reference.
21:50
<rbuckton>

In that approach, you might write something like:

// point.js
@Reflect.StructIdentity("796eb01e-70d2-42c6-a30f-8bdce572db3d")
export shared struct Point {
  x;
  y;
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }
  toString() {
    return `${this.x},${this.y}`;
  }
}

// main.js
import { Point } from "./point.js";
import { Worker } from "worker_threads";

const worker = new Worker("worker.js", { type: "module", preload: "preload.js" });
worker.postMessage(new Point(0, 0));

// preload.js
import "./point.js";

// worker.js
import { parentPort } from "worker_threads";
parentPort.onmessage = msg => {
  console.log(msg.toString()); // prints: 0,0
}
21:51
<littledan>
I think this requires a kind of centralized coordination that I'd prefer to not require
21:53
<rbuckton>
It's pretty much a given that we must perform some kind of coordination. I'm borrowing from electron's model because it's fairly widely adopted.
21:53
<littledan>
sure there's coordination across threads to load the corresponding code, but coordinating all the "preload" code to be packaged up together seems like a different kind of thing
21:54
<littledan>
also Electron is using this for privileged APIs, but I'd prefer that this is conceptually unprivileged
21:54
<rbuckton>
A different approach might be something fully statically analyzable during module import evaluation, so long as it still allows bundling/minification/tree shaking.
21:55
<Mathieu Hofman>
Skimming back really quick, but the ability to define modules that are evaluated when constructing a new realm is exactly what I'd want.
21:56
<rbuckton>
Skimming back really quick, but the ability to define modules that are evaluated when constructing a new realm is exactly what I'd want.
Are you talking about the preload mechanism?
21:57
<Mathieu Hofman>
well more general, but yeah, modules that are executed right after the realm is instantiated by the engine, but before other code runs
21:58
<Mathieu Hofman>
can be used to seamlessly apply transformations to the realm
21:59
<Mathieu Hofman>
any realm created from your realm applies the list of registered modules
22:02
<littledan>
But do you want that mechanism plus not being able to define new shared structs later?
22:17
<Mathieu Hofman>
I don't see why we need to disallow the definition of new shared structs, but I haven't fully thought it through in this context
22:32
<rbuckton>
I would allow them to be defined, but they wouldn't be registered so a prototype walk from a foreign struct value would fail.
22:35
<Mathieu Hofman>
I'd still like a mechanism to manually synchronize so that you can use the prototype of a foreign struct that isn't pre-registered. It just means you can share the prototype with the "equivalent" struct declared locally
22:35
<rbuckton>

How would you do this in a way that doesn't involve:

  • A mutable shared registry, or
  • A wrapper object per instance
22:36
<Mathieu Hofman>
and as littledan said, maybe it's just by creating an empty prototype object the first time, which props can then be attached to
22:36
<Mathieu Hofman>
so seeing a struct the first time would create a registration, but it's not mutable
22:40
<Mathieu Hofman>
or as I had in my earlier suggestion in january, a way to share over post message an unforgeable representative of the struct kind, allowing to define the local prototype object for structs of that kind
22:41
<shu>
oh creating an empty proto is an interesting alternative i hadn't thought about before
22:41
<rbuckton>
So, if it's registered you get the registered prototype, if it's unregistered you get an empty prototype? How would you warn a user that they tried to walk the prototype of a foreign struct value that has no associated behavior?
22:41
<shu>
the warning is that the app doesn't work, i feel like?
22:42
<Mathieu Hofman>
yes that's for allowing inband synchronization.
22:42
<Mathieu Hofman>
You send a post message saying, here is an object of kind "foo", please init its proto as needed
22:42
<Mathieu Hofman>
and you attach a dummy instance of the struct
22:42
<rbuckton>
I was hoping for a better developer experience, and by "better developer experience" I mean "throw an error with a message I can search for on stackoverflow"
22:44
<Mathieu Hofman>
well my suggestion for having a kind representative and having to register using that would allow to throw an explicit message if you walk the proto of a struct that wasn't registered
22:44
<Mathieu Hofman>
it is however a little more API complexity
22:44
<rbuckton>
Like, an exotic object that throws on property access. That's still achievable with the "empty prototype" idea, except you have to do: value -> empty prototype -> throw-on-access-object
22:45
<rbuckton>
Where the "throw-on-access-object" is analogous to a revoked proxy.
22:45
<Mathieu Hofman>
please no more exotic object
22:46
<Mathieu Hofman>
I'd prefer normal objects for the prototype
22:46
<Mathieu Hofman>
the instances are already exotic enough ;)
22:46
<rbuckton>
It doesn't have to be exotic
22:47
<rbuckton>
IIRC, a revoked Proxy isn't exotic, so it could be spec'd similarly.
22:50
<rbuckton>
Alternatively, the "empty prototype" could itself be an ordinary object with hooks similar to a proxy that throws on attempts to read/write to non-existent properties, but still allow you to use Object.defineProperty.
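For illustration, the chain "value -> empty prototype -> throw-on-access-object" can be approximated in today's JavaScript. This sketch uses a Proxy purely as a runnable stand-in (whether such an object should be exotic is exactly what is being debated here), ordinary objects stand in for struct instances, and the error message is made up. It covers reads only.

```javascript
// Terminal object: any property read that reaches it throws a descriptive error.
const throwOnAccess = new Proxy(Object.create(null), {
  get(_target, key) {
    throw new ReferenceError(
      `No behavior registered for this struct (looked up "${String(key)}")`
    );
  },
});

// "Empty prototype" handed out for an unregistered foreign struct.
const emptyPrototype = Object.create(throwOnAccess);

// Stand-in for a foreign struct instance with own data fields.
const point = Object.create(emptyPrototype, {
  x: { value: 0, writable: true, enumerable: true },
  y: { value: 0, writable: true, enumerable: true },
});

let message = null;
try {
  point.toString(); // not found on point or emptyPrototype, so the lookup throws
} catch (err) {
  message = err.message;
}

// Behavior can still be attached later with Object.defineProperty.
Object.defineProperty(emptyPrototype, "toString", {
  value() { return `${this.x},${this.y}`; },
  configurable: true,
  writable: true,
});
console.log(point.toString()); // 0,0
```

Own fields such as `point.x` never reach the terminal object, so only missing behavior produces the error.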
22:55
<rbuckton>

What would the API look like? Something like this?

// worker.js
import { parentPort } from "worker_threads";
import { Point } from "./point.js";

parentPort.onunhandledstruct = (id, prototype) => {
  switch (id) {
    case "796eb01e-70d2-42c6-a30f-8bdce572db3d":
      Object.defineProperties(prototype, Object.getOwnPropertyDescriptors(Point.prototype));
      break;
  }
};

That kind of registration could have issues if shared structs are allowed to have private fields/methods (and I'd like us to consider them in the future).

22:56
<Mathieu Hofman>
a proxy object is the definition of exotic
22:56
<Mathieu Hofman>
a proxy object is how user land can make exotic objects
22:56
<rbuckton>
a proxy object is the definition of exotic
ProxyCreate calls MakeBasicObject, which creates an ordinary object.
22:57
<rbuckton>
I think having a way to signal to a developer why a value isn't behaving appropriately is important, especially considering the complex mechanisms in play necessary to make this work.
22:58
<rbuckton>
If our answer is "you just get a plain old ReferenceError because it's a plain old object", we're not doing the development community any favors.
22:58
<Mathieu Hofman>
I'd agree. But I'd really prefer avoiding more exotic like behavior
22:59
<rbuckton>
The whole idea of shared structs is exotic-like behavior.
23:01
<Mathieu Hofman>
on the instances. Is the argument that we already have exotic behavior there so, exotic behavior on the prototype is ok?
23:03
<rbuckton>

Also, is there any reason we can't do an API like this instead:

// called _before_ the runtime assigns a prototype to an unregistered foreign struct for the first time.
parentPort.onunhandledstruct = (id) => {
  switch (id) {
    case "796eb01e-70d2-42c6-a30f-8bdce572db3d":
      return Point.prototype;
  }
};
23:03
<rbuckton>
Though both of the API sketches above don't seem like they'd work per-realm in the same thread (if that was a requirement?)
23:05
<rbuckton>
on the instances. Is the argument that we already have exotic behavior there so, exotic behavior on the prototype is ok?
All I care about regarding this point is that we want a good developer experience, or at least the best we can offer.
23:07
<rbuckton>
Though, if we have an API design like one that has return Point.prototype above, we could potentially have unregistered and unhandled foreign struct values just have a null prototype, and tailor the ReferenceError message based on the fact the value itself is a struct value.
23:08
<rbuckton>
That's still tantamount to a mutable registry though, even if it is first-in-wins.
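
A rough single-threaded model of why first-in-wins still behaves like a mutable registry. The class ForeignStructResolver and the choice to cache a null result are assumptions for illustration; only the onunhandledstruct name comes from the sketches above:

```javascript
// Models the runtime asking a handler for a prototype the first time it
// sees an unregistered foreign struct id, then never asking again.
class ForeignStructResolver {
  #assigned = new Map(); // id -> prototype (or null), fixed once set
  onunhandledstruct = null;

  prototypeFor(id) {
    if (!this.#assigned.has(id)) {
      const proto = this.onunhandledstruct?.(id) ?? null;
      this.#assigned.set(id, proto);
    }
    return this.#assigned.get(id);
  }
}

const PointProto = { kind: "point" };
const OtherProto = { kind: "other" };

const resolver = new ForeignStructResolver();
resolver.onunhandledstruct = (id) => (id === "point-id" ? PointProto : undefined);
const first = resolver.prototypeFor("point-id");

// Swapping the handler later has no effect on ids that were already seen...
resolver.onunhandledstruct = () => OtherProto;
const second = resolver.prototypeFor("point-id");

// ...but it does decide the outcome for ids that have not been seen yet.
const late = resolver.prototypeFor("late-id");
```

Which prototype an id ends up with depends on which handler happened to be installed when the first value arrived, so the mapping is still shared mutable state.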
23:10
<rbuckton>
I'm not really sure how any dynamic "cure an unhandled foreign struct value's prototype" mechanism isn't just another way to say "mutable shared registry"
23:18
<Mathieu Hofman>
It really feels like this pre-registration stuff really is realm "arguments". With preload or other generic realm init, you can only express the identity of a struct kind as a string in code, where if you had a way to pass arguments to that init logic, you could pass handles to the struct definitions.
23:19
<Mathieu Hofman>
at the end of the day, the "parentPort" is an implicit argument of the new Worker
23:20
<Mathieu Hofman>
maybe we should have a global called "initArgs"
23:22
<rbuckton>

Well, parentPort in this case is already a thing in NodeJS. So is something like initArgs (i.e., workerData)

https://nodejs.org/dist/latest-v20.x/docs/api/worker_threads.html#workerparentport

https://nodejs.org/dist/latest-v20.x/docs/api/worker_threads.html#workerworkerdata

23:23
<Mathieu Hofman>
TIL about workerData. I guess yes
23:27
<rbuckton>
It really feels like this pre-registration stuff really is realm "arguments". With preload or other generic realm init, you can only express the identity of a struct kind as a string in code, where if you had a way to pass arguments to that init logic, you could pass handles to the struct definitions.
What do you mean by handles?
23:29
<Mathieu Hofman>
// vector2d.js
// Each shared struct type, whether data only or "prepared" has its own unique type
export const vector2DType = SharedStructType.prepare(["x", "y"]);

const _Vector2D = SharedStructType.getConstructor(vector2DType);

// custom construction behavior
export function Vector2D(x = 0, y = 0) {
  const _this = Reflect.construct(_Vector2D, [], new.target);
  _this.x = x;
  _this.y = y;
  return _this;
}

// prototype methods
Vector2D.prototype.distanceTo = function (v) {
  const dx = this.x - v.x;
  const dy = this.y - v.y;
  return Math.sqrt(dx * dx + dy * dy);
};

SharedStructType.registerPrototype(vector2Dtype, Vector2D.prototype);

// main.js
import { Worker } from "worker_threads";
import { Vector2D, vector2DType } from "./vector2d.js";
const v1 = new Vector2D(1, 2);
const worker = new Worker("worker.js");
worker.postMessage([vector2DType, v1]);

// worker.js
import assert from "assert";
import { parentPort } from "worker_threads";
// worker imports Vector2D, which causes registration as a side-effect.
import { Vector2D, vector2DType } from "./vector2d.js";

const v2 = new Vector2D(3, 4);

parentPort.on("message", ([mainVector2DType, v1]) => {
  SharedStructType.registerPrototype(mainVector2DType, Vector2D.prototype);
  assert(mainVector2DType !== vector2DType);
  assert(
    SharedStructType.getConstructor(mainVector2DType) !==
      SharedStructType.getConstructor(vector2DType)
  );
  assert(v1 instanceof Vector2D); // by virtue of sharing a prototype
  v1.x; // 1
  v1.distanceTo(v2); // ok
  v1.toString(); // ok
});

SharedStructType.getConstructor() is needed to allow the program to avoid duplicating type definitions in each agent/realm. I think the burden of such deduplication should be on the program, not on the engine.

If the engine had to deduplicate itself, you'd either need a user provided type identifier at "prepare" time, and a global lock around such type definitions, or you'd need to somehow be able to collapse separate definitions into a single one. In either case you'd have to figure out what to do if the shape definition does not match, and in the case of definition collapse, how do you communicate the error to the program?

By having the type definition generate the unique type identifier, you avoid all those complications in the engine, at the cost of putting more type hydration burden on the program.
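
To make the trade-off concrete, here is a single-threaded sketch contrasting the two options. The function names prepareWithUserId and prepareMinted are hypothetical, the handle objects are plain stand-ins, and a real engine would additionally need the global lock mentioned above:

```javascript
const typesById = new Map();

function sameShape(a, b) {
  return a.length === b.length && a.every((name, i) => name === b[i]);
}

// Option A: the program supplies an identifier, so the registry has to
// deduplicate and decide what to do when the shapes disagree.
function prepareWithUserId(id, fieldNames) {
  const existing = typesById.get(id);
  if (existing === undefined) {
    const type = Object.freeze({ id, fieldNames: Object.freeze([...fieldNames]) });
    typesById.set(id, type);
    return type;
  }
  if (!sameShape(existing.fieldNames, fieldNames)) {
    throw new TypeError(
      `Struct type '${id}' is already defined with fields [${existing.fieldNames}]`
    );
  }
  return existing;
}

// Option B: every call mints a fresh, unforgeable handle. Nothing to
// deduplicate and no mismatch case, but the program must pass handles around.
function prepareMinted(fieldNames) {
  return Object.freeze({ fieldNames: Object.freeze([...fieldNames]) });
}

const a1 = prepareWithUserId("vector2d", ["x", "y"]);
const a2 = prepareWithUserId("vector2d", ["x", "y"]); // same type object as a1

const b1 = prepareMinted(["x", "y"]);
const b2 = prepareMinted(["x", "y"]); // distinct type, even with the same shape
```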

rbuckton: this
23:35
<rbuckton>
That doesn't clarify things for me. Are you saying the "handle" is the thing produced by .prepare? I don't think that works without a user-defined type identifier (which you call out yourself), so the handle is meaningless on its own.
23:36
<Mathieu Hofman>
Yes the handle is what I called the type in that example.
23:36
<rbuckton>
I need to break for the day and make dinner. I'll come back to this tomorrow
23:46
<Mathieu Hofman>

And the prepare minting the handle is sufficient if you have workerData and preload:

  • Main thread initially defines the struct using prepare, and defines its local behavior (registers its local prototype)
  • When creating a worker, the main thread puts the struct handle in workerData, like { handles: { foo: fooHandle } }
  • The preload code in the worker looks up the handles in the workerData and defines its local behavior. This is app-specific init behavior dealing with an app-specific workerData structure.
  • Until registered, a struct's prototype is null, and can throw with an explicit message if accessed.
  • You can always postMessage a struct handle at any point, and a realm can update the prototype for that struct kind using its handle
  • For frozen-by-construction structs, you can only register the prototype if no structs of that kind have been seen by the local realm
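
The flow in the list above can be modeled in one thread roughly as follows. Everything here is a stand-in: RealmBehavior, receive, invoke, and the frozen flag on handles are hypothetical names, and plain objects play the role of handles and struct values:

```javascript
// Per-realm behavior table keyed by struct handle.
class RealmBehavior {
  #prototypes = new Map(); // handle -> prototype
  #seen = new Set(); // handles whose structs have reached this realm

  registerPrototype(handle, prototype) {
    if (handle.frozen && this.#seen.has(handle)) {
      throw new TypeError("Frozen struct kind was already seen in this realm");
    }
    this.#prototypes.set(handle, prototype);
  }

  receive(struct) {
    this.#seen.add(struct.handle);
    return struct;
  }

  invoke(struct, method, ...args) {
    const prototype = this.#prototypes.get(struct.handle) ?? null;
    if (prototype === null) {
      throw new TypeError(`No behavior registered for this struct kind; cannot call '${method}'`);
    }
    return prototype[method].apply(struct, args);
  }
}

const fooHandle = { frozen: false }; // stand-in for what prepare would mint
const workerData = { handles: { foo: fooHandle } }; // app-specific structure
const fooStruct = { handle: fooHandle, x: 1, y: 2 };

const workerRealm = new RealmBehavior();
workerRealm.receive(fooStruct);

let beforeRegistration = null;
try {
  workerRealm.invoke(fooStruct, "sum"); // prototype is still null here
} catch (e) {
  beforeRegistration = e;
}

// App-specific init: look up the handle and define local behavior.
workerRealm.registerPrototype(workerData.handles.foo, {
  sum() { return this.x + this.y; },
});
const afterRegistration = workerRealm.invoke(fooStruct, "sum");
```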
23:50
<rbuckton>
You're still using a string key (here "foo") to communicate. Otherwise the worker doesn't know one handle from another, so the handle itself isn't meaningful
23:51
<rbuckton>
At least, not if the key is encoded into the struct value itself (like with my decorator-based example above).
23:53
<rbuckton>
That design also seems to imply you can do { handles: { foo: fooHandle, bar: fooHandle } }, but the same struct type can't really be used twice.
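
A tiny illustration of that aliasing concern, assuming behavior is looked up by handle in a per-realm Map. The Symbol handle and the two behavior objects are stand-ins:

```javascript
const fooHandle = Symbol("handle minted for the Foo struct type");
const workerData = { handles: { foo: fooHandle, bar: fooHandle } };

const FooBehavior = { describe() { return "foo"; } };
const BarBehavior = { describe() { return "bar"; } };

// The worker registers behavior under each app-level key it was given.
const prototypesByHandle = new Map();
for (const [name, behavior] of [["foo", FooBehavior], ["bar", BarBehavior]]) {
  prototypesByHandle.set(workerData.handles[name], behavior);
}

// Both keys point at the same handle, so only one registration survives.
const registeredCount = prototypesByHandle.size;
const survivor = prototypesByHandle.get(fooHandle);
```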
23:55
<rbuckton>
It also requires coordinating a struct type with an identity where you create the worker, which could happen in more than one place in your code, which increases the likelihood of a mistranscription. If the identity is defined on the struct type at the struct declaration site, then it only needs to be written once per thread, and maybe even once for the whole program if importing the same file in multiple threads
23:55
<Mathieu Hofman>
yes but the string key is only in app logic, it's not part of the shared struct API
23:56
<rbuckton>
That just increases the potential for mistakes
23:56
<Mathieu Hofman>
my concern is with enshrining a forgeable key in a built-in registry
23:57
<Mathieu Hofman>
I don't see how it creates more potential for mistakes. it just moves the knowledge of string identifiers to the application layer
23:58
<Mathieu Hofman>
and it makes it possible to define new behavior after init the same way behavior during init is defined
23:59
<rbuckton>
I don't see how it creates more potential for mistakes. it just moves the knowledge of string identifiers to the application layer
This makes it hard for multiple packages to share a common shared struct definition without that common definition also enshrining the key as an exported const or some such.