TC39 Structs and Shared Structs on 2023-09-19

00:25	<rbuckton>	I could possibly model this in terms of `attachBehavior` and abstract it away, assuming some other information is available. I can't emulate the thread-localness I'm describing in quite the same way, but could emulate it with a lock-free data structure. I threw together a bunch of pseudocode for this to get an idea of what's needed. You couldn't support the synchronous case without some kind of synchronous notification occurring when an Agent encounters a shared struct with a previously unseen type identity, but that callback would be something like:
    setFindMissingPrototypeCallback((exemplar, agentId) => {
      // Find the registry of the agent that produced the exemplar (0 is the root agent).
      const agentRegistry = agentId === 0
        ? registry.root
        : ConcurrentList.find(registry.children, child => child.agentId === agentId);
      if (!agentRegistry) { return false; }
      // Find that agent's entry whose exemplar has the same type identity.
      const exemplarTypeIdentity = getTypeIdentity(exemplar);
      const agentEntry = Array.prototype.find.call(agentRegistry.entries, entry => getTypeIdentity(entry.exemplar) === exemplarTypeIdentity);
      if (!agentEntry) { return false; }
      // Map it to this agent's entry with the same key, and attach its prototype.
      const thisAgentEntry = Array.prototype.find.call(perAgentRegistry.entries, entry => entry.key === agentEntry.key);
      if (!thisAgentEntry || !thisAgentEntry.prototype) { return false; }
      attachBehavior(exemplar, thisAgentEntry.prototype);
      return true;
    });
And something similar would be wired up on the main thread when constructing the `Worker`.
00:27	<rbuckton>	Without the synchronous case, you could achieve this via `postMessage` if the worker/port checked each shared struct being sent out to see if it had already seen its type identity, and then posting a handshake message before posting the actual message.
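The async handshake described above could look roughly like the sketch below. Everything here is an assumption for illustration: plain objects stand in for shared structs, a `typeId` property stands in for the engine's type identity, `createHandshakingSender` and the message shapes are hypothetical, and a mock port records messages instead of crossing a thread boundary.

```javascript
// Hypothetical stand-in: a real implementation would obtain an unforgeable
// type identity from the engine; here it is just a property on a plain object.
const getTypeIdentity = (struct) => struct.typeId;

// Wraps a port-like object so that the first time a given type identity is
// sent, a handshake message carrying an exemplar is posted before the payload.
function createHandshakingSender(port) {
  const seenTypeIdentities = new Set();
  return function send(struct) {
    const typeIdentity = getTypeIdentity(struct);
    if (!seenTypeIdentities.has(typeIdentity)) {
      seenTypeIdentities.add(typeIdentity);
      port.postMessage({ kind: "handshake", key: struct.key, exemplar: struct });
    }
    port.postMessage({ kind: "data", value: struct });
  };
}

// Mock port that records messages in order.
const sent = [];
const mockPort = { postMessage: (message) => sent.push(message) };

const send = createHandshakingSender(mockPort);
send({ typeId: 1, key: "Point", x: 0, y: 0 });
send({ typeId: 1, key: "Point", x: 1, y: 2 });
console.log(sent.map((message) => message.kind));
// logs [ 'handshake', 'data', 'data' ]
```

This sketch only inspects the top-level value. Checking "each shared struct being sent out" would also require walking any structs reachable from it.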
00:28	<Mathieu Hofman>	right, there has to be something that triggers when another agent registers an exemplar
00:28	<rbuckton>	But this is much simpler if we do all this work on the user's behalf.
00:29	<Mathieu Hofman>	for the async case you don't really need to check every shared struct being sent, I'll send some code later
00:29	<rbuckton>	An async-only case doesn't really exist though, since any thread could set data on a shared struct visible by any other thread.
00:31	<rbuckton>	And this `setFindMissingPrototypeCallback` only needs to be invoked lazily when performing `[[GetPrototypeOf]]`
00:32	<rbuckton>	You could theoretically shim all of this with the current shared structs trial if you want to use `Proxy` and patch a bunch of globals and imports.
00:32	<rbuckton>	but it would be abysmally slow.
00:46	<rbuckton>	And this lazy operation doesn't necessarily require blocking. By the time threads A and B can communicate, they would both have already filled out their side of the shared registry.
01:06	<Mathieu Hofman>	I'm really not good at multi-threaded code, but I was thinking of something along the lines of:
    shared struct StructRegistryEntry {
      name;
      exemplar;
      next;
    }

    shared struct StructRegistry {
      head;
      names;
      nonshared lastAttached;
      nonshared prototypes;

      nonshared constructor(structs = {}) {
        const names = Object.keys(structs);
        this.names = new SharedFixedArray(names.length);
        for (const [i, name] of names.entries()) {
          this.names[i] = name;
        }
        this.prepare(structs);
      }

      nonshared prepare(structs) {
        const prototypes = new Map([...this.names].map(name => [name, null]));
        const entries = [];
        for (const [name, constructor] of Object.entries(structs)) {
          if (!prototypes.has(name)) {
            throw new Error(`Undeclared struct name ${name}`);
          }
          prototypes.set(name, constructor.prototype);
          entries.push([name, new constructor()]);
        }
        this.prototypes = prototypes;
        for (const [name, exemplar] of entries) {
          this.register(name, exemplar);
        }
      }

      nonshared register(name, exemplar) {
        if (!this.prototypes.has(name)) {
          throw new Error(`Undeclared struct name ${name}`);
        }
        const entry = new StructRegistryEntry();
        entry.name = name;
        entry.exemplar = exemplar;
        entry.next = this.head;
        // Lock-free prepend: retry until the head is swapped without interference.
        while (true) {
          const oldHead = Atomics.compareExchange(this, 'head', entry.next, entry);
          if (oldHead === entry.next) {
            break;
          } else {
            entry.next = oldHead;
          }
        }
        updateRegistrations(this);
      }
    }

    // Attach behavior to every entry added since this agent last checked.
    function updateRegistrations(structRegistry) {
      const head = structRegistry.head;
      let entry = head;
      while (entry !== structRegistry.lastAttached) {
        const behavior = structRegistry.prototypes.get(entry.name);
        if (behavior) {
          attachBehavior(entry.exemplar, behavior);
        }
        entry = entry.next;
      }
      structRegistry.lastAttached = head;
    }
01:07	<Mathieu Hofman>	`updateRegistrations` would have to be triggered anytime there is some unattached struct, or eagerly for every message received. I'm not sure how you trigger it in the sync case
01:08	<Mathieu Hofman>	anyway I need to head out, hopefully that pseudocode conveys how I thought of the StructRegistry that Ron suggested
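The lock-free prepend and the per-agent `lastAttached` cursor from the pseudocode above can be approximated in today's JavaScript. The sketch below is an emulation under stated assumptions, not the proposed API: an `Int32Array` over a `SharedArrayBuffer` stands in for shared struct fields, entries are identified by integer slot index, and the `attach` callback stands in for `attachBehavior`.

```javascript
// Slot 0 holds the index of the list head (0 means "empty"); slot i (i >= 1)
// holds the `next` index for entry i. Entry payloads are omitted for brevity.
const links = new Int32Array(new SharedArrayBuffer(4 * 8));
const HEAD = 0;

// Lock-free prepend: retry the compare-and-swap until no other agent has
// changed the head between our read and our write.
function register(entryIndex) {
  let expectedHead = Atomics.load(links, HEAD);
  while (true) {
    Atomics.store(links, entryIndex, expectedHead);
    const actualHead = Atomics.compareExchange(links, HEAD, expectedHead, entryIndex);
    if (actualHead === expectedHead) return;
    expectedHead = actualHead;
  }
}

// Agent-local cursor: visit only the entries added since the last call.
let lastAttached = 0;
function updateRegistrations(attach) {
  const head = Atomics.load(links, HEAD);
  for (let entry = head; entry !== lastAttached; entry = Atomics.load(links, entry)) {
    attach(entry);
  }
  lastAttached = head;
}

register(1);
register(2);
const firstPass = [];
updateRegistrations((entry) => firstPass.push(entry));

register(3);
const secondPass = [];
updateRegistrations((entry) => secondPass.push(entry));

console.log(firstPass, secondPass);
// logs [ 2, 1 ] [ 3 ]
```

The second pass visits only entry 3 because the cursor stops at the head recorded by the first pass. The sketch does not address the open question in the chat, namely what triggers `updateRegistrations` in an agent that is busy and never yields.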
13:40	<Mathieu Hofman>	Thinking more about it, one way to have all threads process the types of any other thread is to: (1) block completion of registering a new thread's exemplar until all other existing threads connected to the registry have signaled they have attached behaviors to the new exemplar; (2) somehow be able to have existing threads process new exemplars while they're currently executing. There doesn't seem to be a good way to explain, in terms of initialization and messaging, the kind of preemption required by introducing a new thread's types to other connected threads that are potentially in busy loops. Maybe it demonstrates that "attach behavior" is not sufficient, and it likely means the registry mechanism has to be language specified instead, which kinda saddens me.
14:44	<rbuckton>	What if we only support wiring up exemplars between A and B when they have a matching key in M? The shared registry would just track the type identities of each registered exemplar in one place during preload, so you wouldn't need to process new exemplars:
    //
    // main.js
    //
    import { Foo, Bar, Baz } from "./structs.js";
    const structs = new StructRegistry({ Foo, Bar, Baz });
    const data = new (new SharedStructType(["mut", "cond", "ready", "value"]))();
    data.mut = new Atomics.Mutex();
    data.cond = new Atomics.Condition();
    data.ready = false;
    const A = new Worker("A.js", { preload: "preloadA.js", structs, workerData: data });
    const B = new Worker("B.js", { preload: "preloadB.js", structs, workerData: data });

    //
    // preloadA.js
    //
    import { Foo, Bar, Quxx } from "./structs.js";
    import { prepareWorker } from "worker_threads";
    prepareWorker({ structs: { Foo, Bar, Quxx } });

    //
    // preloadB.js
    //
    import { Foo, Baz, Quxx } from "./structs.js";
    import { prepareWorker } from "worker_threads";
    prepareWorker({ structs: { Foo, Baz, Quxx } });

    //
    // A.js
    //
    import { Foo, Bar, Baz, Quxx } from "./structs.js";
    import { workerData } from "worker_threads";
    Atomics.Mutex.lock(workerData.mut, () => {
      function waitForB() {
        while (!workerData.ready) Atomics.Condition.wait(workerData.cond, workerData.mut);
      }
      function sendToB(value) {
        workerData.value = value;
        workerData.ready = false;
        Atomics.Condition.notify(workerData.cond);
        waitForB();
      }
      function receiveFromB() {
        waitForB();
        return workerData.value;
      }

      waitForB();

      // send our `Foo`
      sendToB(new Foo());
      // Check whether the `Foo` sent by B shares the same prototype as our `Foo`.
      // This works because both A and B have registered a `Foo` entry that maps to `Foo` in the main thread.
      console.log(receiveFromB() instanceof Foo); // prints: true

      // send our `Bar`
      sendToB(new Bar());
      // Check whether the `Bar` sent by B shares the same prototype as our `Bar`.
      // This does not work because preloadB.js did not register `Bar`.
      console.log(receiveFromB() instanceof Bar); // prints: false

      // send our `Baz`
      sendToB(new Baz());
      // Check whether the `Baz` sent by B shares the same prototype as our `Baz`.
      // This does not work because preloadA.js did not register `Baz`.
      console.log(receiveFromB() instanceof Baz); // prints: false

      // send our `Quxx`
      sendToB(new Quxx());
      // Check whether the `Quxx` sent by B shares the same prototype as our `Quxx`.
      // This does not work because main.js did not register `Quxx`.
      console.log(receiveFromB() instanceof Quxx); // prints: false
    });

    //
    // B.js
    //
    import { Foo, Bar, Baz, Quxx } from "./structs.js";
    import { workerData } from "worker_threads";
    Atomics.Mutex.lock(workerData.mut, () => {
      function waitForA() {
        while (workerData.ready) Atomics.Condition.wait(workerData.cond, workerData.mut);
      }
      function sendToA(value) {
        workerData.value = value;
        workerData.ready = true;
        Atomics.Condition.notify(workerData.cond);
        waitForA();
      }
      function receiveFromA() {
        waitForA();
        return workerData.value;
      }

      // signal to A that we're ready
      sendToA(undefined);

      // Check whether the `Foo` sent by A shares the same prototype as our `Foo`.
      // This works because both A and B have registered a `Foo` entry that maps to `Foo` in the main thread.
      console.log(receiveFromA() instanceof Foo); // prints: true
      // send our `Foo`
      sendToA(new Foo());

      // Check whether the `Bar` sent by A shares the same prototype as our `Bar`.
      // This does not work because preloadB.js did not register `Bar`.
      console.log(receiveFromA() instanceof Bar); // prints: false
      // send our `Bar`
      sendToA(new Bar());

      // Check whether the `Baz` sent by A shares the same prototype as our `Baz`.
      // This does not work because preloadA.js did not register `Baz`.
      console.log(receiveFromA() instanceof Baz); // prints: false
      // send our `Baz`
      sendToA(new Baz());

      // Check whether the `Quxx` sent by A shares the same prototype as our `Quxx`.
      // This does not work because main.js did not register `Quxx`.
      console.log(receiveFromA() instanceof Quxx); // prints: false
      // send our `Quxx`
      sendToA(new Quxx());
    });
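The matching rule in the example above reduces to a set intersection: behavior is wired between two workers only for keys that the parent and both workers registered. The sketch below models just that rule. The `registered` table and `isWired` helper are hypothetical, and the key sets are copied from main.js, preloadA.js, and preloadB.js.

```javascript
// Hypothetical model: each agent's registration is just the set of keys it declared.
const registered = {
  M: new Set(["Foo", "Bar", "Baz"]),
  A: new Set(["Foo", "Bar", "Quxx"]),
  B: new Set(["Foo", "Baz", "Quxx"]),
};

// Behavior is wired between workers only when every worker registered the
// key and the parent that created the shared registry declared it too.
function isWired(key, parent, ...workers) {
  return registered[parent].has(key) &&
    workers.every((worker) => registered[worker].has(key));
}

const results = {};
for (const key of ["Foo", "Bar", "Baz", "Quxx"]) {
  results[key] = isWired(key, "M", "A", "B");
}
console.log(results);
// logs { Foo: true, Bar: false, Baz: false, Quxx: false }
```

The results line up with the `instanceof` comments in A.js and B.js: only `Foo` is registered by all three agents.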
14:45	<rbuckton>	When A and B receive something they don't share a mapping for, you just get data and no behavior.
14:46	<rbuckton>	In that way it's still useful for read/write and for sending it along to another thread that might be able to interpret it.
14:53	<rbuckton>	In the same vein, if `main.js` starts two workers that don't share the same registry, they can't wire up behavior at all.
16:03	<Mathieu Hofman>	I was assuming only matching keys in the registry in the first place, but I don't think that solves the problem. For example:
    1. M creates the registry.
    2. M creates A with the shared registry. A can block during prepare until it has attached behaviors for M's exemplars, and M can block until A has shared its exemplars and M has attached behavior.
    3. M shares a container struct with A.
    4. M subsequently creates B with the same shared registry. B can block during prepare until it has attached behaviors for both M's and A's exemplars, and M can block until B has shared its exemplars and M has attached behavior.
    5. M shares the previously created container with B (possibly in the init params of the worker).
    6. B adds some shared structs it creates to the container.
    7. A attempts to read from the container.
How do we make sure that A has had the opportunity to process B's exemplars, to attach behavior to B's types, before A encounters the B struct types in the shared container? A may be doing a busy loop we cannot preempt. I can probably imagine patching all atomics operations to interleave the attachment check, but that feels gross. Or maybe there's something simple I'm overlooking.
17:52	<rbuckton>	I don't think we need to block on attaching behavior to exemplars until we do `[[GetPrototypeOf]]`, at which time we can look up the matching exemplars from the registry. By the time A communicates with B, or either communicates with M, their registries would already be connected.
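Deferring the lookup until the prototype is actually requested can be illustrated with a `Proxy`, in the spirit of the shim mentioned earlier in the chat. This is a minimal sketch under assumptions: `localPrototypes` is a hypothetical agent-local registry keyed by a string standing in for type identity, and a plain object stands in for a received shared struct.

```javascript
// Hypothetical agent-local registry: type identity -> prototype object.
const localPrototypes = new Map();
let lookups = 0;

// Wraps a data-only object so that its prototype is resolved from the
// registry only when [[GetPrototypeOf]] is actually performed on it.
function withLazyPrototype(data, typeIdentity) {
  return new Proxy(data, {
    getPrototypeOf() {
      lookups++;
      return localPrototypes.get(typeIdentity) ?? null;
    },
  });
}

class Point {}
const received = withLazyPrototype({ x: 1, y: 2 }, "Point");

// No behavior is registered yet, so the value is just data.
const before = received instanceof Point;

// Once this agent registers behavior, the same value picks it up lazily.
localPrototypes.set("Point", Point.prototype);
const after = received instanceof Point;

console.log(before, after, lookups);
// logs false true 2
```

Only `instanceof` and `Object.getPrototypeOf` go through the `getPrototypeOf` trap. Method lookup would additionally need a `get` trap, which is part of why a full shim along these lines would be slow.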
17:56	<Mathieu Hofman>	Right, that's what I mean, it requires the concept of the registry to be known to the spec so that `[[GetPrototypeOf]]` can do the necessary lookup. I was still trying to explain the registry in terms of simpler attach behavior semantics, but that doesn't seem to be possible
17:57	<rbuckton>	Even for attachBehavior to work there has to be some behind-the-scenes work in the spec to generate a prototype based on the type identity of a shared struct type.
19:09	<rbuckton>	shu: yesterday we were discussing marking methods as `nonshared`, are you anticipating these methods would be attached to the instance as `nonshared` fields, or to an agent-local `[[Prototype]]`?
20:26	<Mathieu Hofman>	(Replying to rbuckton: "Even for attachBehavior to work there has to be some behind-the-scenes work in the spec to generate a prototype based on the type identity of a shared struct type.") Sure, but while that's also technically an internal registry, it maps from an internal and non-forgeable type identity to a local behavior object. Your proposed registry maps from a string, which, to prevent introducing a realm- or agent-wide communication channel, has to be connection specific, or the registry state cannot be observable by the program in any way, neither of which I am convinced is the case yet.
20:28	<rbuckton>	Even the `on("message", ...)` + `attachBehavior` mechanism uses a string key, it's just that the string key you used was `"registerPoint"`.
20:29	<rbuckton>	And in earlier discussions with shu he'd suggested something like "you send an array of exemplars", in which case the key you use is an integer. What the key is doesn't matter.
20:30	<rbuckton>	Everything I'm suggesting is basically just a layer of abstraction above the same capabilities you're proposing.
20:32	<rbuckton>	The initiating thread needs to pass a message containing one or more exemplars to a child thread, keyed in some way as to be interpreted as a way to identify which exemplar is an example of which known thing we want to associate it with.
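The point that the kind of key is incidental can be shown with two receiving-side handlers that end in the same state. This is a sketch under assumptions: `attachBehavior` here is a hypothetical stand-in that records a mapping in a `Map`, a `typeId` property stands in for type identity, and the message shapes are invented for illustration.

```javascript
// Hypothetical stand-in for the `attachBehavior` discussed above: records
// which local prototype goes with an exemplar's type identity.
const behaviors = new Map();
const attachBehavior = (exemplar, prototype) => behaviors.set(exemplar.typeId, prototype);

// Local definitions in the receiving thread, addressable by either kind of key.
class Point {}
class Rect {}
const byName = { Point: Point.prototype, Rect: Rect.prototype };
const byIndex = [Point.prototype, Rect.prototype];

// Handler for a message that names each exemplar with a string key.
function onRegisterByName(message) {
  for (const [key, exemplar] of Object.entries(message.exemplars)) {
    attachBehavior(exemplar, byName[key]);
  }
}

// Handler for a message that sends a bare array; the key is the position.
function onRegisterByIndex(message) {
  message.exemplars.forEach((exemplar, index) => attachBehavior(exemplar, byIndex[index]));
}

onRegisterByName({ exemplars: { Point: { typeId: 1 }, Rect: { typeId: 2 } } });
const viaName = new Map(behaviors);

behaviors.clear();
onRegisterByIndex({ exemplars: [{ typeId: 1 }, { typeId: 2 }] });

console.log(viaName.get(1) === behaviors.get(1), viaName.get(2) === behaviors.get(2));
// logs true true
```

In both cases the sender and receiver must agree out of band on what each key means, whether it is a name or an array position.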
20:32	<shu>	rbuckton: the former, though there's nothing precluding an agent-local [[Prototype]] either
20:32	<shu>	it is slightly more difficult to implement the latter so that's not what the dev trial does
20:32	<shu>	you should probably be able to express it both ways
20:33	<rbuckton>	Does this process even work if I have to attach agent local values for every method every time I receive a new instance of an existing struct type?
20:33	<shu>	sorry i think i misread
20:34	<shu>	the two choices are: (1) a shared struct instance's [[Prototype]] is a shared field and holds a shared struct, with `nonshared` fields, into which you assign methods; (2) a shared struct instance's [[Prototype]] is a `nonshared` field and points to a per-agent local struct
20:34	<shu>	i think you want `nonshared` fields regardless
20:34	<shu>	and maybe (2) as well
20:35	<shu>	but that one's less clear to me
20:35	<shu>	i am prototyping (1) in the dev trial
20:35	<shu>	in either case you don't have to attach methods for every new instance
20:36	<Mathieu Hofman>	(Replying to rbuckton: "Everything I'm suggesting is basically just a layer of abstraction above the same capabilities you're proposing.") Right, but that is clearly and explicitly scoped to the connection. I'm struggling to think of a way to specify the registry that remains fully connection oriented.
20:37	<rbuckton>	(1) works, I suppose. What's important is that for a given struct type, I only need to establish the `[[Prototype]]` relationship once in a given thread, not once every time a new instance is observed.
20:38	<shu>	(1) has some advantages, like, `instanceof` just works with the usual semantics
20:38	<shu>	since all instances have the same prototype object
20:38	<rbuckton>	What I suggested is connection oriented. The main thread doesn't have a global registry shared across all workers. It has a specific registry you hand to individual workers on creation. The child thread associated with that worker will always be able to refer to its parent, thus the registry will always be reachable.
20:42	<rbuckton>	Proxies are extremely frustrating, by the way. It's very difficult to actually build a membrane with them due to some of the invariants.
20:42	<rbuckton>	I'm trying to model some of what we've been discussing using the current origin trial + some proxies and shims
21:03	<rbuckton>	shu: Do you expect the `nonshared` fields to be fixed per-instance as well?
21:03	<rbuckton>	as in, predefined with `{ configurable: false }` like shared fields are
21:12	<shu>	rbuckton: the fields themselves, yes
21:13	<shu>	the fixed layout constraint applies to all fields
23:25	<Mathieu Hofman>	(Replying to rbuckton: "What I suggested is connection oriented. The main thread doesn't have a global registry shared across all workers. It has a specific registry you hand to individual workers on creation. The child thread associated with that worker will always be able to refer to its parent, thus the registry will always be reachable.") What I'm wondering about is the relation between types and registries. A thread / agent is able to create registries and pass/associate them to workers it creates. That means there is really a many-to-many relationship between agents and registries. When a type is received from a postMessage, it's logical to look up a behavior mapping in the registry associated with that connection. However, when a type is read from a value of another shared struct, how does the agent decide where to look up an associated behavior? Does each type keep an association to the connection it originated from, so that further types encountered through it resolve using the same registry? What happens if a type associated with one registry is shared over a connection using another registry? Or for that matter, which registry is a locally defined type associated with?
23:35	<Mathieu Hofman>	To put it another way, what happens in the following case:
    1. M defines Point and Rect structs.
    2. M creates registry RA, used with worker A.
    3. M creates registry RB, used with worker B.
    4. Both A and B define their own Point and Rect, and prepare the registry they received from M with those definitions.
    5. M creates rect1 and shares it with A and B.
    6. A sets rect1.topLeft to a Point it creates.
    7. B sets rect1.bottomRight to a Point it creates.
Questions:
    - M should be able to find a behavior for both rect1.topLeft and rect1.bottomRight, but what spec logic should it follow that accomplishes that?
    - Should B be able to find a behavior for rect1.topLeft? (Corollary: should A be able to find a behavior for rect1.bottomRight?)