TC39 Structs and Shared Structs on 2024-05-30

02:39	<Mathieu Hofman>	I chatted with Mark this afternoon. First he's sorry about not being able to make it this morning. From what I understand the biggest concern with adding prototype methods to shared structs is that it makes it too easy to transform existing single threaded code into shared memory multi-threaded code without the author realizing the implication of such a transformation. This is especially true with non-shared structs also existing as you're roughly a "shared" keyword away from transforming into multithreaded existing but non thread safe code. Apparently this is an issue that Java and C# both suffered from.
03:54	<Chris de Almeida>	(in reply to Mathieu Hofman: "[...] you're roughly a "shared" keyword away from transforming into multithreaded existing but non thread safe code. Apparently this is an issue that Java and C# both suffered from.") is that in reference to `static`?
03:55	<Mathieu Hofman>	static?
04:07	<Mathieu Hofman>	I think the problem is that code written without specific handling of shared memory access is unlikely to be safe when running in multiple threads. Java and C# do not prevent object instances from being shared in the first place, so the problem in these languages is arguably worse since it's pretty much not up to the implementer of the class to enable multi-threading (at best it can document that the class is not thread safe). The current shared struct proposal does require opt-in by marking the struct type as shared, but we consider that to not be a sufficient friction point in transforming non multi-threaded code, as it's highly unlikely that simply marking a struct as shared is sufficient, and explicit locking logic is likely to be required as well in the methods.
04:24	<Mathieu Hofman>	here's a wild idea, probably misguided as I arguably don't fully grasp the complexities of properly implementing safe shared memory concurrency. Would it make sense that by default (without some kind of explicit opt-out), all methods of a shared struct would take a thread local lock on the instance. By that I mean every time a shared struct method is invoked, it'd check if the thread already has a lock on the object (in case of local re-entrancy or simply the method being called from another method), and if not, acquire a lock on the object. While that's unlikely to be sufficient to reliably protect the users of the object, it should at least make the methods implementations thread safe by default.
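The "lock on every method call, re-entrant for the owning thread" idea above can be approximated with today's `SharedArrayBuffer` and `Atomics`. This is only a rough sketch, not anything from the proposal: the helper names (`createLockState`, `acquire`, `release`, `locked`) and the explicit numeric `threadId` (which must be non-zero) are assumptions made for illustration.

```javascript
// Hypothetical sketch: a re-entrant lock kept in shared memory.
// Slot 0 holds the owner's id (0 = unlocked); slot 1 holds the re-entrancy depth.
const OWNER = 0;
const DEPTH = 1;

function createLockState() {
  return new Int32Array(new SharedArrayBuffer(2 * Int32Array.BYTES_PER_ELEMENT));
}

function acquire(state, threadId) {
  // Re-entrant case: this thread already owns the lock, so just go deeper.
  if (Atomics.load(state, OWNER) === threadId) {
    state[DEPTH] += 1; // only the owner ever touches the depth slot
    return;
  }
  for (;;) {
    const owner = Atomics.compareExchange(state, OWNER, 0, threadId);
    if (owner === 0) break; // we took the lock
    // Sleep while the lock is still held by `owner`.
    // Note: browsers do not allow Atomics.wait on the main thread.
    Atomics.wait(state, OWNER, owner);
  }
  state[DEPTH] = 1;
}

function release(state, threadId) {
  if (Atomics.load(state, OWNER) !== threadId) {
    throw new Error("release called by a thread that does not own the lock");
  }
  state[DEPTH] -= 1;
  if (state[DEPTH] === 0) {
    Atomics.store(state, OWNER, 0);
    Atomics.notify(state, OWNER, 1); // wake one waiter, if any
  }
}

// Wrap a method so that every call takes the instance lock first.
function locked(state, threadId, method) {
  return function (...args) {
    acquire(state, threadId);
    try {
      return method.apply(this, args);
    } finally {
      release(state, threadId);
    }
  };
}

// Usage: a method calling another locked method does not deadlock.
const state = createLockState();
const counter = { value: 0 };
counter.increment = locked(state, 1, function () {
  this.value += 1;
  return this.value;
});
counter.incrementTwice = locked(state, 1, function () {
  this.increment();
  return this.increment();
});
const afterTwo = counter.incrementTwice();
```

Every call pays for at least one atomic load plus the wrapper, and every instance needs its own lock state, which is the cost raised later in the discussion.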
04:30	<Chris de Almeida>	(in reply to Mathieu Hofman: "static?") you mentioned java and c# -- I was asking if you are referring to the `static` keyword from those languages
04:31	<Mathieu Hofman>	from what I understand there are plenty of ways in those languages to make object instances available to multiple threads, not just the `static` keyword
04:32	<Chris de Almeida>	sure. contextually, it seemed it was in reference to 'you're roughly a "shared" keyword away from transforming into multithreaded existing but non thread safe code'
04:34	<Mathieu Hofman>	In general, we remain skeptical about introducing complexity just to enable developers to use shared objects as regular objects with methods
04:34	<Chris de Almeida>	what I am trying to understand is what specific comparisons are being made to java and c#
04:35	<Mathieu Hofman>	(in reply to Chris de Almeida: "sure. contextually, it seemed it was in reference to 'you're roughly a "shared" keyword away from transforming into multithreaded existing but non thread safe code'") ah yeah. I think the point I was trying to make is that it's just too easy to cause code that isn't written with thread safety in mind to execute in multiple threads
04:36	<Chris de Almeida>	it certainly can be... ask me some time about how an errant `static` nearly brought down a company
04:36	<Chris de Almeida>	although java/c# folks will probably tell you that the ease of doing that is a feature rather than a bug
04:37	<Mathieu Hofman>	the JS proposal is marginally better as it requires an opt-in from the object's implementor, but the "opt-in" is still too easy in our opinion
04:58	<Chris de Almeida>	the headers you mean?
05:17	<shu>	the headers are extremely hard to opt into, i don't understand
05:17	<shu>	mark would like more syntactic friction?
05:18	<shu>	i don't really understand how someone can accidentally opt into multithreading
05:18	<shu>	like, making the struct shared is a necessary but insufficient condition to actually opt into the style
05:18	<shu>	you have to communicate it to another thread, set up the code to receive it, etc
05:20	<shu>	this argument seems very weak to me
05:21	<shu>	(in reply to Mathieu Hofman: "ah yeah. I think the point I was trying to make is that it's just too easy to cause code that isn't written with thread safety in mind to execute in multiple threads") this is true, and is not a goal of this proposal
05:23	<shu>	that is, it is not a goal of this proposal to be opinionated about a particular style of thread safety
05:25	<shu>	the syntactic friction argument doesn't hold water. if the headers aren't considered enough friction, i don't know what would be. if the headers are considered enough friction but the ask is to have it reflected at the engine level, we can choose to spec an opt-in gate that the host has to trigger, and it'll be up to Node and other runtimes to understand the intention here is that it's an opt-in feature
05:26	<shu>	(in reply to Mathieu Hofman: "here's a wild idea [...] Would it make sense that by default (without some kind of explicit opt-out), all methods of a shared struct would take a thread local lock on the instance. [...]") that's a non-starter
05:26	<shu>	it is too costly
05:28	<Mathieu Hofman>	Does it matter if the default is costly as long as there is a way to opt out of the default safety and gain performance?
05:29	<shu>	well, yes, the default is already safe (the headers aren't present by default)
05:29	<shu>	it also puts a requirement on implementations that there be a lock per object
05:31	<Mathieu Hofman>	The concern in this case is not how hard it is for the application as a whole to adopt shared memory multithreading, but how easy it is to mark code that is not thread safe as "supporting" shared memory access: namely, by adding a `shared` keyword to a struct declaration.
05:31	<shu>	what's the counterargument to what i said above?
05:31	<shu>	adding the shared keyword is a necessary but insufficient condition
05:31	<shu>	you still have to write code to communicate a shared struct
05:32	<Mathieu Hofman>	It's sufficient from the point of view of the struct's implementor. Your argument assumes the author of the app and the author of the struct are the same.
05:33	<shu>	the worry is the app author downloads a library, sees that it's marked as a shared struct, and assumes it's threadsafe, but the library is buggy and it is not threadsafe?
05:34	<shu>	what's different in this case vs an otherwise buggy library?
05:34	<Mathieu Hofman>	The worry is that the library authors could believe they can support multithreading by simply adding a keyword to their objects, without taking time to understand what they're actually doing
05:35	<shu>	that is a fully generic argument that can apply to anything that requires expertise...?
05:36	<shu>	i'm on board with safe by default. i consider the status quo to have that because it requires the app author to do the opt in, not the library authors
05:37	<shu>	if the app author trusts the library authors, and that trust turned out incorrect, i see that as the normal cost of doing software development
05:37	<Mathieu Hofman>	I don't know of programming concepts that are similarly hard to get right if not extremely careful.
05:38	<shu>	i can think of several
05:38	<shu>	manual memory management, asynchrony
05:38	<shu>	JITs (dynamic codegen)
05:39	<shu>	also, what's the cost to getting it wrong?
05:39	<shu>	it's not crashes
05:39	<Mathieu Hofman>	JS doesn't really have manual memory management, and I'd argue that it's maybe too easy to shoot yourself in the foot with array buffers.
05:39	<shu>	it's something like "undefined values"
05:40	<shu>	what i'm trying to get at is: i don't see a principle at work here for how many layers/kinds of friction is enough, if the opt-in headers aren't
05:41	<Mathieu Hofman>	I agree that asynchrony and in particular re-entrancy during suspension is not always sufficiently understood. But it's easier to reason about thanks to the explicitness of await points
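The re-entrancy hazard at suspension points mentioned above can be shown without any threads. The following is a minimal sketch with made-up names (`withdraw`, `demo`, `nextTask`); it assumes nothing beyond standard promises and `setTimeout`.

```javascript
// Yield to the event loop so other pending tasks can run.
const nextTask = () => new Promise((resolve) => setTimeout(resolve, 0));

async function withdraw(account, amount) {
  const seen = account.balance;    // read before the suspension point
  await nextTask();                // explicit point where other code may run
  account.balance = seen - amount; // write based on a possibly stale read
}

async function demo() {
  const account = { balance: 100 };
  // Both calls read 100 before either one writes.
  await Promise.all([withdraw(account, 30), withdraw(account, 50)]);
  return account.balance;
}
```

Here `demo()` resolves to 50 rather than 20, because the second write overwrites the first. The bug is real, but the only place interleaving can happen is the visible `await`, which is the explicitness being contrasted with shared memory, where any field access may interleave.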
05:41	<shu>	i don't think "appeasing mark" is a good design principle for how much friction something should have
05:43	<shu>	i'd also like to better understand the consequences of getting this wrong
05:43	<shu>	this = a buggy library
05:43	<shu>	why is that assumed to be a categorically worse kind of "wrong" than today's bugs?
05:43	<Mathieu Hofman>	I have to go, sorry
05:44	<shu>	all right well, i'm pretty disappointed in the state of affairs
06:05	<shu>	Mathieu Hofman: here's a hypothetical when you're back. would making shared structs inaccessible outside of `shared { }` code blocks (a la `unsafe { }` blocks in rust) be considered sufficient syntactic friction?
06:06	<shu>	and if it isn't, i'd like to understand the reasoning
06:29	<Ashley Claymore>	(in reply to Mathieu Hofman: "I don't know of programming concepts that are similarly hard to get right if not extremely careful.") `FinalizationRegistry` comes to my mind
06:40	<Mathieu Hofman>	(in reply to shu: "and if it isn't, i'd like to understand the reasoning") I'll chat more with Mark
06:42	<Mathieu Hofman>	(in reply to Ashley Claymore: "`FinalizationRegistry` comes to my mind") That's actually a good example of a safer abstraction compared to destructors. Sure it's advanced, and still possible to create situations that are not optimal, but unlike destructors, it's a lot harder to cause critical bugs.
07:35	<Ashley Claymore>	SharedStructs are a safer abstraction than direct shared memory because there is no type-confusion as the fields don't overlap.
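The type confusion that raw shared memory allows can be seen with two typed-array views over the same bytes. This is a small illustration using only standard `SharedArrayBuffer` and typed arrays; the variable names are arbitrary.

```javascript
// Two views over the same 8 bytes of shared memory.
const bytes = new SharedArrayBuffer(8);
const asFloat = new Float64Array(bytes);
const asInts = new Int32Array(bytes);

// Code that thinks of the memory as a double writes 1.5 ...
asFloat[0] = 1.5;
// ... and code that thinks of it as two int32 fields sees unrelated bit patterns.
const halves = [asInts[0], asInts[1]];

// The reverse direction: writing both "int fields" corrupts the double.
asInts[0] = -1;
asInts[1] = -1;
const corrupted = asFloat[0]; // all bits set is a NaN bit pattern
```

With fixed-layout struct fields that never overlap, a write to one field cannot be reinterpreted as part of another field in this way.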
14:30	<littledan>	FinalizationRegistry got right the thing where it prevents you from resurrecting dead objects, but it still seems to be abused most of the time :(
14:32	<littledan>	(in reply to Mathieu Hofman: "In general, we remain skeptical about introducing complexity just to enable developers to use shared object as regular objects with methods") This is a pretty broad thing to be skeptical of. How does this fit together with rbuckton's feedback that methods were important for usability? Also, are you considering that the fundamental technology ("TLS") is needed for Wasm anyway, so most of the complexity will be there in the system either way?
14:33	<littledan>	also curious how this relates to having syntax for shared struct classes, which is all about reducing friction and something proposed to enhance usability
14:35	<littledan>	this sort of "discourage people from using the feature" feedback seems to be pushing in the direction that the proposal was originally shaped in, where it was just some function calls that made some weird objects with null prototypes. I think that would be a worse design for JavaScript and I'm a big fan of the changes that have come over the past couple years.
14:36	<littledan>	even though FinalizationRegistry uses a similarly function/constructor-based API with no syntax, that doesn't really provide any meaningful friction to prevent abuse. The motivation for abuse doesn't come from convenient syntax but rather useful semantics that people misunderstand and want to get at.
15:01	<shu>	(in reply to littledan: "[...] The motivation for abuse doesn't come from convenient syntax but rather useful semantics that people misunderstand and want to get at.") this rings pretty true to me
22:41	<rbuckton>	I haven't had a chance to catch up on this conversation since it started last night. I'll try to read through it and provide my thoughts tomorrow.
23:17	<rbuckton>	If the concern is that there needs to be some kind of artificial barrier to using shared structs to discourage less-experienced developers from writing bad code, then one already exists. It is far more complex than just having a `shared` keyword, it's completely out of band from the JS code itself, it's something that requires domain knowledge to use correctly, and it already acts as a barrier against a number of different types of insecure code. You will need to enable COOP/COEP to be able to even use this feature on the web, just as you do for `SharedArrayBuffer`. That's a level of complexity far outside the domain of the average JS developer.
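For reference, the out-of-band opt-in described above is a pair of HTTP response headers on the top-level document. The sketch below shows the two header values that enable cross-origin isolation, plus a feature check; the helper names `withIsolationHeaders` and `sharedMemoryAvailable` are hypothetical, and the assumption that shared structs would sit behind the same gate comes from the message above rather than from any shipped implementation.

```javascript
// Response headers that make a page cross-origin isolated.
const ISOLATION_HEADERS = Object.freeze({
  "Cross-Origin-Opener-Policy": "same-origin",
  "Cross-Origin-Embedder-Policy": "require-corp",
});

// Hypothetical helper: merge the isolation headers into a plain header map.
function withIsolationHeaders(headers = {}) {
  return { ...headers, ...ISOLATION_HEADERS };
}

// Hypothetical helper: report whether shared memory looks usable in `scope`.
function sharedMemoryAvailable(scope = globalThis) {
  if (typeof scope.SharedArrayBuffer !== "function") return false;
  // In browsers, `crossOriginIsolated` is false unless both headers were sent.
  // Where the property is absent (non-browser hosts), it is treated as not applicable.
  return scope.crossOriginIsolated !== false;
}

// Usage: headers a server would attach to the HTML response.
const responseHeaders = withIsolationHeaders({ "Content-Type": "text/html" });
```

None of this is visible in the application's JavaScript source, which is the sense in which the barrier is "out of band".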
23:18	<rbuckton>	Somehow special-casing shared struct methods to require a mandatory locking mechanism does nothing to ensure thread safety since it only affects shared struct methods, not the fields that are the actual unsafe things.
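A single-threaded stand-in can illustrate why locking only the methods leaves the fields exposed. In this sketch the `Counter` class and its boolean `held` flag are invented for illustration and merely simulate a lock; the direct field write inside the callback plays the role of another thread that never calls a method.

```javascript
class Counter {
  constructor() {
    this.value = 0;
    this.held = false; // simulated per-instance lock
  }

  // Method-level locking: only callers that go through `update` are serialized.
  update(fn) {
    if (this.held) throw new Error("lock already held");
    this.held = true;
    try {
      this.value = fn(this.value);
    } finally {
      this.held = false;
    }
  }
}

const counter = new Counter();
let observedInsideLock;

counter.update((current) => {
  // Stand-in for another thread: it writes the public field directly,
  // and nothing about the method's lock stops it.
  counter.value = 1000;
  observedInsideLock = counter.value;
  return current + 1; // computed from the stale read, so the direct write is lost
});
```

The direct write is accepted while the "lock" is held and is then silently overwritten, so the methods being locked says nothing about the consistency of the field itself.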
23:21	<rbuckton>	I also absolutely do not want a repeat of `async`. While I absolutely love `async`/`await`, it is well established that introducing `await` often poisons your entire execution path with `async`.
23:27	<rbuckton>	I also am very concerned about repeating the mistake of C#'s `lock` and Java's `synchronized` as they are both sledgehammers in a space where finesse is the correct approach, and both are often huge performance bottlenecks.
23:31	<rbuckton>	That said, I'd find it perfectly reasonable to require something like an `unsafe` block/method/function to read or write from a struct field that is `writable: true`, such that the thread-safety risk is immediately observable to the author of the block/method/function, as it becomes up to the author of that code to reconcile how their code interacts with the surrounding code outside of that marked block.
23:51	<rbuckton>	For example: `function doWork(sharedObj) unsafe { // allow unsafe read/write anywhere in the body const x = sharedObj.x; // ok sharedObj.x = y; // ok } function doWork2(sharedObj) { unsafe { // allow unsafe read/write anywhere in the block const x = sharedObj.x; // ok sharedObj.x = y; // ok } } function doWork3(sharedObj) { const x = unsafe sharedObj.x; // ok, but no parentheses allowed unsafe sharedObj.x = y; // ok, but no parentheses allowed }`
23:53	<rbuckton>	For even more artificial ceremony, you could have a lint rule that banned `unsafe` so you would be forced to disable the rule when needed (and hopefully document why).
23:59	<rbuckton>	And we can make the basic Mutex easy to use with `using` if you really want/need the sledgehammer approach: `const mut = new Atomics.Mutex(); function doWork(sharedObj, mut) unsafe { using void = new UniqueLock(mut); // lock taken until unsafe block exits }` or even: `shared struct SharedObj { readonly mut = new Atomics.Mutex(); ... } function doWork(sharedObj) unsafe { using void = new UniqueLock(sharedObj.mut); }` or `shared struct SharedObj { readonly #mut = new Atomics.Mutex(); #x; #y; // using encapsulation, all access is governed by the lock doWork() unsafe { using void = new UniqueLock(this.#mut); const x = this.#x; const y = this.#y; return { x, y }; } }`
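The scoped "lock held until the block exits" pattern above can be approximated with APIs that exist today, using `try`/`finally` in place of `using`. This is a sketch under stated assumptions: `createMutex`, `lock`, `unlock`, and `withLock` are hypothetical helpers built on `SharedArrayBuffer` and `Atomics`, not the `Atomics.Mutex` or `UniqueLock` named in the message.

```javascript
const UNLOCKED = 0;
const LOCKED = 1;

// One int32 of shared memory is enough for a simple (non-re-entrant) mutex.
function createMutex() {
  return new Int32Array(new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT));
}

function lock(mutex) {
  // Try to flip UNLOCKED -> LOCKED; if someone else holds it, sleep and retry.
  while (Atomics.compareExchange(mutex, 0, UNLOCKED, LOCKED) !== UNLOCKED) {
    // Note: browsers do not allow Atomics.wait on the main thread.
    Atomics.wait(mutex, 0, LOCKED);
  }
}

function unlock(mutex) {
  Atomics.store(mutex, 0, UNLOCKED);
  Atomics.notify(mutex, 0, 1); // wake one waiter, if any
}

// Scoped locking: the lock is released however `body` exits.
function withLock(mutex, body) {
  lock(mutex);
  try {
    return body();
  } finally {
    unlock(mutex);
  }
}

// Usage: take a consistent snapshot of two fields under the lock.
const mutex = createMutex();
const point = { x: 1, y: 2 };
const snapshot = withLock(mutex, () => ({ x: point.x, y: point.y }));
```

As with the private-field variant in the message, this only protects the data if every access goes through `withLock`.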