13:52
<ryzokuken>
good morning/evening/night everyone!
13:52
<ryzokuken>
meeting starts in ~7
14:29
<littledan>
I have trouble understanding the motivation for this null change, but also it doesn't seem harmful
14:29
<littledan>
(or rather, I have trouble understanding the motivation for the null skipping in the first place)
14:30
<nicolo-ribaudo>
Random fact: Babel accidentally ignores nulls while guaranteeing at least one await -- we have a PR open to match the behavior of the current (pre-this-change) proposal
14:35
<hax (HE Shi-Jun)>
It seems this PR tries to avoid nulls introducing extra awaits; am I understanding this correctly?
14:35
<nicolo-ribaudo>
Yes
14:37
<shu>
this is a small thing, but i'd be interested in people's thoughts on banning using inside base switch cases: https://github.com/tc39/proposal-explicit-resource-management/issues/215#issuecomment-2040486653
14:39
<nicolo-ribaudo>
Babel's using support in switch is currently completely broken
14:39
<nicolo-ribaudo>
Because everything in switch is incredibly annoying to compile
14:39
<shu>
i wanna ban it but recognize that'll break symmetry with let/const
14:40
<nicolo-ribaudo>
Is it for implementation reasons?
14:41
<shu>
it's laid out in the comment -- saves some codegen complexity
14:42
<shu>
we can unroll the dispose loop in all cases except switch cases
14:44
<nicolo-ribaudo>
I'm trying to think of any use case for using in switch, but I cannot think of one where I would want using's scope to fall through multiple cases
14:46
<nicolo-ribaudo>
(Babel doesn't unroll the loop anymore, so for us it wouldn't actually be difficult to fix using in switch)
14:46
<shu>
we don't unroll currently, but would like to
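For context on what "unrolling the dispose loop" means, here is a hedged sketch of the two codegen strategies (the resource array is an illustration, not the spec's actual machinery, and real semantics also aggregate disposal errors, which this sketch skips):

```javascript
// Sketch: two ways a compiler can emit disposal for `using a = ...; using b = ...`.
Symbol.dispose ??= Symbol("Symbol.dispose"); // polyfill for older engines

const log = [];
const resource = (name) => ({ [Symbol.dispose]() { log.push(`dispose ${name}`); } });

// Strategy 1: a dispose loop over a runtime stack. Works even when it is not
// statically known which `using` declarations were reached (e.g. switch cases).
function withLoop() {
  const stack = [];
  try {
    const a = resource("a"); stack.push(a);
    const b = resource("b"); stack.push(b);
    log.push("body");
  } finally {
    for (let i = stack.length - 1; i >= 0; i--) stack[i][Symbol.dispose]();
  }
}

// Strategy 2: unrolled. One direct dispose call per resource, reverse order,
// with no runtime stack to allocate or iterate.
function unrolled() {
  const a = resource("a"), b = resource("b");
  try {
    log.push("body");
  } finally {
    b[Symbol.dispose]();
    a[Symbol.dispose]();
  }
}

withLoop();
unrolled();
console.log(log); // both strategies: body, dispose b, dispose a
```

The switch-case fall-through problem is that whether a given `using` executed depends on which case was entered, so strategy 2's fixed sequence of dispose calls no longer matches the set of acquired resources.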
14:50
<rbuckton>
I'm trying to think of any use case for using in switch, but I cannot think of one where I would want using's scope to fall through multiple cases
The most relevant case I can see would be something like manual loop unrolling, where you skip over the using for a chunk of operations when it isn't needed, such as when a lock is needed for a 16k chunk of data but not for a smaller chunk. The point of manual loop unrolling is to avoid a bunch of comparisons and branches based on a single condition such as input size: a smaller chunk of data skips over the using (and avoids the extraneous null/undefined check), while a larger chunk enforces the using and holds the lock until the current iteration of the loop ends.
14:52
<nicolo-ribaudo>
Oh I see, like for transpiling generators
14:53
<nicolo-ribaudo>
(in a world where generators are transpiled and using is not I guess)
14:53
<rbuckton>
for (let start = 0; start < len; start += 8) {
    // full chunks enter at case 0; a final partial chunk of r bytes
    // enters at case 8 - r, skipping the lock
    switch (len - start >= 8 ? 0 : 8 - (len - start)) {
        case 0:
          // full chunk, perform lock
          using lck = new UniqueLock(mut);
          readByte();
        case 1: readByte();
        case 2: readByte();
        case 3: readByte();
        case 4: readByte();
        case 5: readByte();
        case 6: readByte();
        case 7: readByte();
    } // lock released if taken
}
14:56
<rbuckton>
not for transpiling generators, no. that would be using a switch as a finite state machine, but an FSM would result in the using terminating early. Loop unrolling in this case is to handle chunks of data without continuously spinning and checking in the for to process one element at a time, cutting the number of comparisons/branches by a factor of the chunk size n (from one check per element to one per n elements).
15:25
<littledan>
If we could get to Ashley's point, he can explain why TLA running eagerly is useful within Bloomberg
15:26
<hax (HE Shi-Jun)>
tla already can break expectations...
15:29
<ljharb>
certainly a TLA being added anywhere in your graph (that's not an entry point) is a breaking change
15:29
<littledan>
The thing is: in practice, you really shouldn't use JSON or CSS modules either, due to the lack of built-in bundling. The limited utility of built-in modules is not at all limited to this feature
15:29
<bakkot>
this strongly suggests to me that we need a path to standardize bundler-only syntax
15:29
<littledan>
until then, we'll be talking about features that are mostly useful for bundlers and non-web environments
15:30
<littledan>
this strongly suggests to me that we need a path to standardize bundler-only syntax
I disagree; I think we should fix the issues to allow native modules to be usable in browsers
15:30
<shu>
i disagree with that disagree
15:30
<shu>
doesn't seem a good use of resources tbh
15:31
<bakkot>
I disagree; I think we should fix the issues to allow native modules to be usable in browsers
That is also good! And then, after that, standardizing things like this everywhere instead of bundler-only. But not before that.
15:31
<littledan>
these are all arguments against ES6 modules in the first place
15:31
<bakkot>
cough
15:31
<bakkot>
I mean, yes.
15:31
<bakkot>
But that ship has sailed.
15:31
<ljharb>
you're soooo close
15:31
<rbuckton>
bundler-related, but could import defer potentially avoid fetch and parse as well if we could serialize the import graph on the server side ahead of time?
15:32
<hax (HE Shi-Jun)>
In Jack's "assert sync" proposal I suggested we could also add a "use async" directive, which might help make TLA explicit. Maybe it would also help with this issue.
15:32
<littledan>
well, yeah, so this would be good input to the module harmony group, if we don't like modules features anymore...
15:32
<bakkot>
I think standardizing ES6 modules as bundler-only syntax would have been better than the current world
15:32
<rbuckton>
or is that not an option because fetch is async
15:33
<bakkot>
well, yeah, so this would be good input to the module harmony group, if we don't like modules features anymore...
import source is useful on the web!
15:33
<bakkot>
so it's not "module features" in general, just those specific ones which aren't useful on the web
15:34
<Justin Ridgewell>
/me Taking over Chair responsibilities for a bit
15:34
<bakkot>
(possibly "[...] aren't useful on the web yet")
15:34
<littledan>
I think the Matrix results show that it's useful on the web in practice
15:34
<littledan>
It's a good thing that bundlers are aligned to TC39 and browsers in their syntax and semantics. It will be good for us to preserve that.
15:35
<Ashley Claymore>
bundler-related, but could import defer potentially avoid fetch and parse as well if we could serialize the import graph on the server side ahead of time?
yep. That's what we do at Bloomberg: we find the TLA at build time and add a simplified module graph of which imports have TLA deps to our equivalent of 'package.json'
15:35
<bakkot>
IIUC the Matrix results show that it's useful on the web to have bundler-level syntax for this, but not that it's useful to have in browsers, right?
15:35
<littledan>
we've made tons of progress by actually building things into JS; the code splitting situation was a mess before import(), and it resulted in code splitting not occurring.
15:35
<bakkot>
I am not proposing "give up on this"; I am proposing "have a path to standardize bundler-level syntax that isn't actually in browsers"
15:35
<littledan>
TC39 is the JavaScript standards committee--our role is to find this common syntax
15:36
<littledan>
once we find common syntax, it's OK if some implementations fall behind and don't implement everything, but it'd be a big cost to split the language in two
15:36
<bakkot>
my response is the thing Shu is saying out loud right now
15:36
<littledan>
we should have this as an actual agenda item; that'd make it easier to discuss
15:39
<littledan>
IMO Stage 3 serves this purpose of "it's in tools and not necessarily in browsers". Maybe we just want to leave things in Stage 3 for longer.
15:39
<bakkot>
fwiw eslint refuses to implement features prior to stage 4, though that's maybe a them problem
15:41
<ljharb>
stage 3 is when it gets shipped in (most) browsers tho
15:42
<shu>
John-David Dalton: why is it disingenuous? this is arguably better handled by tools. tools have a different view (they can see the whole app) and optimization opportunities than a VM
15:43
<shu>
we designed a super static module system. it is not surprising to me that it has been leveraged to success with ahead-of-time tooling
15:44
<littledan>
fwiw eslint refuses to implement features prior to stage 4, though that's maybe a them problem
well, there's babel-eslint
15:46
<ljharb>
(@babel/eslint-parser as of babel 7)
15:47
<littledan>
I'd be happiest if browsers ship features like this at that point, but if they want to leave some features out of browsers for now (but we still assess that the feature is solid enough for Stage 3), maybe they can just let them sit in that bucket. Doesn't require holding back 2.7 or 3.
15:47
<ljharb>
3 is a pretty clear signal, that's the big purpose of 2.7. i agree it wouldn't require holding back 2.7
15:48
<Michael Ficarra>
so is this feature going to encourage the dreaded "barrel" module pattern?
15:48
<littledan>
so is this feature going to encourage the dreaded "barrel" module pattern?
it's about optimizing the feature--no encouragement is needed
15:48
<bakkot>
While that's technically true, I would be happier revising our process if we want to do that - I would prefer that stage 2.7/3 be taken as a commitment from browsers to implement a feature at some point. I don't want to get to a point where browsers allow features to reach 2.7 that they intend to never implement.
15:49
<shu>
agree
15:49
<shu>
the two design spaces are honestly different
15:49
<shu>
the constraints are different
15:49
<ljharb>
so is this feature going to encourage the dreaded "barrel" module pattern?
oof, i hadn't thought about that.
15:49
<littledan>
While that's technically true, I would be happier revising our process if we want to do that - I would prefer that stage 2.7/3 be taken as a commitment from browsers to implement a feature at some point. I don't want to get to a point where browsers allow features to reach 2.7 that they intend to never implement.
ah OK well I'm happy to get browser commitments like that; I don't want to discourage that. I haven't thought this through enough.
15:49
<Michael Ficarra>
it's about optimizing the feature--no encouragement is needed
well it doesn't provide any benefit to those not using the pattern, so it seems like encouragement to use it
15:50
<shu>
littledan: i have been thinking about for most of this year, i'll give you something more thought out soon
15:51
<littledan>
well it doesn't provide any benefit to those not using the pattern, so it seems like encouragement to use it
I think you're overthinking it... there's a lot of existing usages of this pattern, and it's slow, and it is hard to migrate away from (a lot of effort has gone into this migration)
15:51
<littledan>
it's not about newly incentivizing usage
15:51
<Michael Ficarra>
so someone who wants to get similar tree-shakeability can just dump this pattern?
15:52
<littledan>
sure, if you can get all your dependencies to dump this pattern, then you'll have no reason to use this feature in your dependency tree
15:52
<ljharb>
there's a growing group of vocal folks in the ecosystem telling people to get rid of barrel modules, fwiw
15:53
<ljharb>
(coinbase's RN app's binary size dropped 71% when we/they banned barrel exports back in 2021)
15:53
<littledan>
there's a growing group of vocal folks in the ecosystem telling people to get rid of barrel modules, fwiw
yes, they're doing good work; I think those two groups (the ecosystem effort and this proposal) should be understood to be supporting each other
15:54
<ljharb>
even with import defer, barrel exports will still result in a larger app than "just import what you need directly", no?
15:54
<bakkot>
I don't think "this feature is not useful in browsers" is a new category of objection?
15:54
<bakkot>
that's like... the main objection
15:55
<Michael Ficarra>
can someone remind me why people ever did that in the first place?
15:55
<ljharb>
i don't know why people tend to like "god objects", deep imports is The Way. but they do tend to like them.
15:56
<ptomato>
maybe the pattern was copied from python modules and __init__.py?
15:56
<littledan>
can someone remind me why people ever did that in the first place?
I always understood it to be a response to named exports being a thing, making use of that syntactic space, whereas require returning a single function was more natural
15:56
<ljharb>
i've seen people have a visceral preference for one import keyword that has N identifiers, instead of N import keywords
15:56
<shu>
that's like... the main objection
yeah i feel like most of our concerns that have caused feature compromises and redesigns boil down to this. sometimes it's couched in more specific terms, like "performance footgun" etc
15:56
<Michael Ficarra>
I'm finding it very hard to find sympathy for these people...
15:57
<ljharb>
also i think that a number of major IDEs implemented auto-refactoring features for named exports but ignored default exports (for no technical reason) which drove increased usage of named exports
15:57
<littledan>
I'm finding it very hard to find sympathy for these people...
why does it matter whose fault this is? it's currently a performance issue, and this is a solution.
15:58
<ljharb>
but is "don't use barrel files" perhaps a better solution? (genuine question)
15:58
<Justin Ridgewell>
Barrels are so convenient, and so awful for performance (both browsers and bundlers)
15:58
<littledan>
I dunno, there's a lot of recommendations about performance that people aren't taking up...
15:59
<Justin Ridgewell>
@guybedford Your mic is echoing
15:59
<bakkot>
but wouldn't this be a recommendation for performance in exactly the same way? it requires work to adopt; it doesn't just give you performance for free
15:59
<ryzokuken>
sorry guybedford bad timing
16:00
<Justin Ridgewell>
Timebox!
16:00
<bakkot>
and if the people in question were willing to do work to get performance they could already do so, right?
16:01
<Michael Ficarra>
@nicolo-ribaudo your async feedback from me is that, for now, I'm unconvinced by the motivation
16:01
<ljharb>
convenience often makes for better road-paving than good intentions
16:01
<littledan>
but wouldn't this be a recommendation for performance in exactly the same way? it requires work to adopt; it doesn't just give you performance for free
The uptake is a "semver-minor" change--it doesn't require that importers update
16:03
<ljharb>
so importers keep thinking they're doing a fine thing, when they're doing a horrible thing, and the entire ecosystem has to move to import defer to avoid the education problem?
16:03
<ljharb>
education's super hard so maybe that's the right tradeoff to make, tbf
16:10
<nicolo-ribaudo>
but wouldn't this be a recommendation for performance in exactly the same way? it requires work to adopt; it doesn't just give you performance for free
It moves the changes from the consumers to the library itself
16:10
<nicolo-ribaudo>
So 1 instead of N
16:10
<nicolo-ribaudo>
so importers keep thinking they're doing a fine thing, when they're doing a horrible thing, and the entire ecosystem has to move to import defer to avoid the education problem?
There is currently a choice to make between convenience and performance, and the goal here is to not make them exclusive
16:43
<ljharb>
does using import defer on a barrel file with a bundler that supports it give the ~same results as using deep imports in the first place? because treeshaking so far still doesn't fully achieve "only importing what you need in the first place" afaik.
16:46
<Michael Ficarra>
in a language without so many hidden effects all over the place, it would be
16:49
<nicolo-ribaudo>
does using import defer on a barrel file with a bundler that supports it give the ~same results as using deep imports in the first place? because treeshaking so far still doesn't fully achieve "only importing what you need in the first place" afaik.
Sorry no, it's export defer/export optional that helps with barrel files
16:49
<nicolo-ribaudo>
And export defer would have the same result as using deep imports
16:50
<ljharb>
ok, so that's a compelling argument in favor of that one
16:51
<ljharb>
so import defer would be to lessen the downside of importing a barrel file that did not use export defer?
16:52
<nicolo-ribaudo>

And the boundary is still at the export defer level, so if you do

// main.js
import { x } from "library";

// library
export defer { x, y } from "./x-and-y";
export defer { z } from "./z";

// library/x-and-y
export const x = 1;
export const y = 2;

it will also load/run the code for const y = 2 (so it's not complete dead code elimination), but gives the building block to remove code if you put unrelated code in separate modules

16:52
<nicolo-ribaudo>
Exactly like deep imports
16:53
<nicolo-ribaudo>
so import defer would be to lessen the downside of importing a barrel file that did not use export defer?
import defer is not really related to barrel file
16:53
<nicolo-ribaudo>
It's in general for "big modules subgraphs"
16:53
<nicolo-ribaudo>
Even if you have a file with a single export but many dependencies, import defer is useful there
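For readers unfamiliar with the proposal, the deferred-evaluation semantics can be simulated in userland (a sketch only; the real feature is the syntax `import defer * as ns from "./mod.js"`, which defers evaluation of the module's subgraph until a property of the namespace is first accessed):

```javascript
// Userland simulation of the import defer idea (not the real syntax).
function deferModule(evaluate) {
  let ns = null;
  return new Proxy({}, {
    get(_, prop) {
      ns ??= evaluate(); // evaluate the "module" on first property access
      return ns[prop];
    },
  });
}

const log = [];
// stands in for: import defer * as charts from "./charts.js"
const charts = deferModule(() => {
  log.push("evaluated charts.js");
  return { render: () => "rendered" };
});

log.push("app started");
charts.render(); // evaluation of the deferred subgraph happens here
console.log(log); // ["app started", "evaluated charts.js"]
```

Note the real feature still fetches and parses the deferred subgraph eagerly; only evaluation is deferred, which is why Ashley's build-time TLA analysis above matters.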
16:54
<littledan>
One thing that's important to understand is, the barrel file stuff doesn't relate to the motivation for import defer--they solve for totally unrelated issues, just both about loading less code
17:14
<littledan>
Agree with the various arguments for minimalism here, we just don't have any reason to go into this fractional stuff or BigInts, and IsSafeInteger is a good test. I don't have a strong opinion about any of this and am just happy for range to happen
17:16
<Michael Ficarra>
I agree with Kevin that this isn't especially confusing
17:18
<Michael Ficarra>
is there maybe justification to have this produce its own iterator subclass that can have optimised helpers implemented on it?
17:18
<Michael Ficarra>
like .drop(1e300) could be implemented very efficiently for this special iterator
17:19
<bakkot>
I think drop and take are the only things which could have better implementations and I don't think it's worth doing just for those
17:19
<bakkot>
and engines could do this optimization anyway
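For reference, a userland approximation of the range semantics under discussion (a sketch, not the spec algorithm; argument validation here is minimal):

```javascript
// Hypothetical userland range(): yields start, start + step, ... up to (exclusive) end.
function* range(start, end, step = 1) {
  if (!Number.isFinite(start) || step === 0) throw new RangeError("invalid range arguments");
  for (let i = 0; ; i++) {
    const value = start + i * step; // multiply by the index rather than accumulating
    if (step > 0 ? value >= end : value <= end) return;
    yield value;
  }
}

console.log([...range(0, 5)]);     // [0, 1, 2, 3, 4]
console.log([...range(5, 0, -2)]); // [5, 3, 1]
```

An optimized `.drop(1e300)` as Michael suggests would just bump the starting index `i` instead of iterating, which is easy precisely because each value is computed from the index.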
17:20
<hax (HE Shi-Jun)>
The original proposal included BigInt.range(); I don't remember why it was added in the first place
17:20
<Bradford Smith>
I might expect to be able to use Iterator.range() to generate unique identifiers, in which case possibly a BigInt type is desirable.
17:21
<hax (HE Shi-Jun)>
I might expect to be able to use Iterator.range() to generate unique identifiers, in which case possibly a BigInt type is desirable.
seems a valid use case :P
17:21
<Michael Ficarra>
to waldemar's point, yes, there are some editorial bugs, but nothing that's not fixable
17:21
<Michael Ficarra>
it hasn't passed editorial review yet
17:21
<shu>
I might expect to be able to use Iterator.range() to generate unique identifiers, in which case possibly a BigInt type is desirable.
oh yeah?
17:21
<shu>
like, give me monotonic ids starting with <large number> or something?
17:22
<ljharb>
i have no bigint range use cases, but "works for one kind of numeric primitive and not the others" seems like a big problem
17:22
<Bradford Smith>
like, give me monotonic ids starting with <large number> or something?
yes, something like that
17:23
<Justin Ridgewell>
Are bigints still massively slower than numbers?
17:23
<rbuckton>
I'd prefer it work for bigint just so I'm not having to coerce using BigInt() when I need to do math with other bigints.
17:23
<shu>
i have no bigint range use cases, but "works for one kind of numeric primitive and not the others" seems like a big problem
why is that a big problem?
17:23
<shu>
Are bigints still massively slower than numbers?
haha yes
17:23
<danielrosenwasser>
Mathieu Hofman: can you add yourself to the notes?
17:23
<ljharb>
it's an inconsistency. one of those warts we'd inevitably have to make a "fill in the table" proposal for in a few years.
17:23
<eemeli>
Timestamps in microseconds as a use case?
17:24
<ljharb>
it's fine if we want to wait for that, but is there a reason why it's beneficial to defer it?
17:24
<shu>
it's an inconsistency. one of those warts we'd inevitably have to make a "fill in the table" proposal for in a few years.
not everyone shares that goal though
17:24
<ljharb>
that is true of most of the goals we all have :-)
17:25
<shu>
right, which is why i pushed back on "seems like a big problem"
17:25
<bakkot>
it's not that inconsistent - the Math methods do not and never will take BigInts
17:25
<shu>
in any case i was given a concrete use case i found plausible
17:25
<bakkot>
in fact I don't think we have anything that takes only Number or BigInt?
17:25
<shu>
so i'm happy with bigints being accepted here
17:25
<bakkot>
but yeah I also prefer accepting BigInt here
17:26
<ptomato>
Timestamps in microseconds as a use case?
I think the idea was to introduce Temporal.Instant ranges for this use case in the future
17:28
<nicolo-ribaudo>
Maybe Iterator.range(instant1, instant2, duration) if range is overloaded anyway?
17:28
<Jack Works>
I might expect to be able to use Iterator.range() to generate unique identifiers, in which case possibly a BigInt type is desirable.
can you explain this use case more? I think unique id should be something like crypto.getRandomUUID()
17:29
<shu>
might not need to be unique
17:29
<ljharb>
also autoincrementing is fine sometimes, not everything has the german tank problem
17:30
<nicolo-ribaudo>
Python bans floats in range() right?
17:30
<ptomato>
right
17:32
<ptomato>
python's popular computing library has numpy.arange() which does suffer from unexpected iterations due to floating point errors, and numpy.linspace() which does not
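The floating-point issue ptomato mentions is easy to reproduce: repeatedly adding a fractional step accumulates rounding error that per-index multiplication avoids.

```javascript
// Accumulating 0.1 ten times drifts; multiplying by the index does not.
let acc = 0;
for (let i = 0; i < 10; i++) acc += 0.1;

console.log(acc === 1);      // false (acc is 0.9999999999999999)
console.log(0.1 * 10 === 1); // true
```

This drift is what makes a naive fractional range yield an unexpected extra or missing value near the endpoint, the numpy.arange problem referenced above.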
17:33
<Jesse>
...decimal.
17:33
<bakkot>
I do like linspace
17:34
<bakkot>
the options object does leave room for it
17:34
<nicolo-ribaudo>
You can just use a range of integers and .map for what Matthew is proposing, right?
17:35
<Michael Ficarra>
FWIW it would be nice to have extra time to iron out all the editorial issues as well
17:35
<Michael Ficarra>
then we can have a really solid 2.7 advancement
17:36
<Bradford Smith>
I'll definitely +1 this feature either with or without fractional values.
17:36
<bakkot>
same though I do prefer accepting fractional values
17:36
<bakkot>
but if left out it's not the end of the world
17:41
<hax (HE Shi-Jun)>
the options object does leave room for it
I think linspace should be a separate method?
17:41
<Michael Ficarra>
even 2-4x slower must still be faster than .reduce((a, b) => a + b, -0)
17:43
<hax (HE Shi-Jun)>
As the original author of the range floating-point issue, I prefer not accepting fractional values, and changing the name from Iterator.range to Iterator.integers, which makes it even clearer to developers 😉
17:44
<Michael Ficarra>
intRange
17:46
<hax (HE Shi-Jun)>
Do people also need ProductExact() ?
17:46
<littledan>
I can't understand the difference in precision/exactitude between precise and exact in this case... can anyone else?
17:47
<eemeli>
I think "precise" applies to all numbers. So 0.1 + 0.2 has a precise value, it's just not exactly 0.3.
17:47
<shu>
kinda, i guess there is a notion of "low precision" vs "high precision" that's more commonly used than "low exactitude" vs "high exactitude"
17:47
<shu>
but in this particular case i don't really understand
17:47
<eemeli>
"Accurate" would be more... accurate.
17:48
<hax (HE Shi-Jun)>
I can't understand the difference in precision/exactitude between precise and exact in this case... can anyone else?
It just avoids some loss and gets closer to the exact number?
17:48
<littledan>
I think "precise" applies to all numbers. So 0.1 + 0.2 has a precise value, it's just not exactly 0.3.
huh? but we're starting with Numbers, not those funny things that we don't have a representation of in JS
17:49
<littledan>
kinda, i guess there is a notion of "low precision" vs "high precision" that's more commonly used than "low exactitude" vs "high exactitude"
OK but it's not like we have a precision arg here... it's just supposed to get the right answer
17:49
<littledan>
right?
17:49
<shu>
right, in this particular case i don't understand the difference
17:49
<shu>
if it were up to me i'd name it sumSlow
17:50
<shu>
like i would totally believe that if kevin's slides said "this is named sumPrecise, but it may give the wrong impression", someone will then say "sumExact is better"
17:50
<eemeli>
To be clear, I'm fine with sumExact as a least worst option.
17:50
<Jack Works>
Yes, a follow-on proposal might add a new Symbol.rangeTo; then range(a, b, options) where a is an object would do the following: yield* a[Symbol.rangeTo](b, options). Then you could add it on Temporal, Decimal (they're not primitives now) and your own classes.
17:51
<rbuckton>
Doesn't temporal use from ?
17:51
<littledan>
what name are we considering?
17:52
<Bradford Smith>
So, I guess we should go with Math.sumPreciseFrom()?
17:52
<Bradford Smith>
or not...
17:52
<bakkot>
it sounds like we are not doing From, since everyone else doesn't like it
17:52
<bakkot>
so sumExact or sumPrecise
17:54
<hax (HE Shi-Jun)>
if it were up to me i'd name it sumSlow
As I understand, it is not necessarily slow.
17:54
<danielrosenwasser>
dminor: can you clarify your last point in the notes?
17:54
<shu>
yeah it is?
17:54
<shu>
it's necessarily slow_er_
17:54
<nicolo-ribaudo>
I was going to say that I dislike From because it currently means "I'm creating something of this type from these other values". Iterator.from creates an iterator, Array.from an array, Object.fromEntries an object. If we wanted this to contain from it should be on number (Number.fromSum), but also Number is not a collection
17:55
<rbuckton>
Temporal uses from in many places and those functions do not take iterables, so I disagree with the Michael Ficarra 's statement that "everything named from takes an iterable"
17:56
<Mathieu Hofman>
You can just use a range of integers and .map for what Matthew is proposing, right?
It's not ergonomic at all: range(start * stepDivider, end * stepDivider, step * stepDivider).map(x => x / stepDivider)
17:56
<rbuckton>
Agreed. "Foo.from" does not imply "takes an iterable", it imples "make a Foo from these inputs"
17:57
<littledan>
I'm having trouble understanding why we're bothering with an optimization for an error case
17:57
<Michael Ficarra>
I don't think arithmetic commutativity comes into play when we're talking about throwing, which is inherently not arithmetic
17:58
<Michael Ficarra>
@littledan an iterable of Number producing a NaN is not an error case?
17:58
<littledan>
I think NaN is kinda usually for error cases?
17:59
<littledan>
I get that it's within the domain of Numbers but.... when was the last time you wanted NaN to come up for you?
17:59
<Michael Ficarra>
at the business logic layer, maybe, but at this data processing layer, no
17:59
<dminor>
I have to drop, Eemeli and Matthew Gaudet will represent SpiderMonkey for the rest of the day.
18:00
<littledan>
at the business logic layer, maybe, but at this data processing layer, no
obviously we need a well-defined answer, but is this going to provide a meaningful speedup meaningfully often?
18:00
<Michael Ficarra>
yes, iterators can yield very many values
18:01
<Michael Ficarra>
a multi-hours' batch could be short-circuited immediately after starting instead of waiting until the end
18:03
<hax (HE Shi-Jun)>
it's necessarily slow_er_
As https://en.wikipedia.org/wiki/Pairwise_summation notes: "Pairwise summation is the default summation algorithm in NumPy[8] and the Julia technical-computing language,[9] where in both cases it was found to have comparable speed to naive summation"
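A minimal sketch of the pairwise algorithm hax links (the base-case size is illustrative; note this only reduces rounding error, it does not give the correctly rounded exact sum that the proposed method specifies):

```javascript
// Pairwise (cascade) summation: split, sum the halves, add. Worst-case error
// grows O(log n) instead of O(n) for naive left-to-right accumulation.
function pairwiseSum(xs, lo = 0, hi = xs.length) {
  const n = hi - lo;
  if (n <= 8) {   // small base case: naive sum
    let s = -0;   // -0 is the identity of float addition
    for (let i = lo; i < hi; i++) s += xs[i];
    return s;
  }
  const mid = lo + (n >> 1);
  return pairwiseSum(xs, lo, mid) + pairwiseSum(xs, mid, hi);
}

console.log(pairwiseSum([1, 2, 3, 4]));      // 10
console.log(Object.is(pairwiseSum([]), -0)); // true
```

The recursion adds the same number of terms as the naive loop, which is why its speed is comparable; only the association order changes.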
18:04
<danielrosenwasser>
eemeli: can you add yourself to the notes doc?
18:09
<Justin Ridgewell>
I had to miss the presentation, but why is empty list -0? (no opinion, just curious)
18:10
<shu>
it is the identity of floating point addition
18:10
<Michael Ficarra>
@Justin Ridgewell Object.is(-0 + -0, -0)
18:10
<shu>
(while +0 + -0 = +0)
18:11
<Justin Ridgewell>
Why is -0 the starting point?
18:11
<Justin Ridgewell>
If I’m interpreting those correctly....
18:11
<shu>
what does starting point mean?
18:11
<Justin Ridgewell>
let sum = -0; for (const i of array) sum += i
18:12
<Michael Ficarra>
so that when array is [-0] you get the correct answer
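The identity property is easy to check directly:

```javascript
// -0 is the additive identity for floats: x + -0 preserves x's sign bit,
// while starting from +0 loses the sign of a summed -0.
let fromNegZero = -0;
let fromPosZero = +0;
for (const x of [-0]) {
  fromNegZero += x;
  fromPosZero += x;
}

console.log(Object.is(fromNegZero, -0)); // true:  -0 + -0 is -0
console.log(Object.is(fromPosZero, -0)); // false: +0 + -0 is +0
```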
18:17
<Duncan MacGregor>
The equality question feels very like the question of == on value types in Java.
18:19
<bakkot>
the slides will reference project Valhalla IIRC
18:19
<Michael Ficarra>
value types being the ones with lowercase names?
18:19
<bakkot>
(i.e. value objects for java)
18:19
<bakkot>
https://openjdk.org/projects/valhalla/
18:24
<Michael Ficarra>
this presentation is so well structured 😍
18:25
<bakkot>
I really like the composite object approach though I would not call them "CompositeKey"
18:27
<bakkot>
I linked a few userland implementations here: https://github.com/tc39/proposal-record-tuple/issues/387#issuecomment-2033531920 though I am sure there are others
18:29
<rbuckton>
Composite keys don't seem like a solution to case-insensitive Maps. I still strongly favor equals/hash.
18:29
<bakkot>
I think they solve a different problem - equals/hash doesn't let me write groupBy with the result of my comparator being a composite key (without doing a bunch of work)
18:29
<ptomato>
but uniqBy without CompositeKey would cover that use case, no?
18:30
<rbuckton>
i.e., new Map([], { comparer: caseInsensitiveStringComparer }
18:30
<Michael Ficarra>
@rbuckton it works fine, Unicode has case folding for that
18:30
<rbuckton>
I think they solve a different problem - equals/hash doesn't let me write groupBy with the result of my comparator being a composite key (without doing a bunch of work)
Wouldn't you just do groupBy over anything you want?
18:31
<rbuckton>
@rbuckton it works fine, Unicode has case folding for tht
It's an extra allocation, likely thrown away, for any given key comparison. equals/hash is defined once and reused
18:32
<bakkot>
I don't know what that means? I am asking about, for example, I have a list of { name, employer, city } objects, and I want to collect them by { employer, city }. with composite key that's just Map.groupBy(vals, x => Tuple(x.employer, x.city)). with equals/hash there's a bunch more ceremony
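A userland sketch of what a composite key buys in that groupBy case (interning by stringification is purely illustrative and only safe for JSON-serializable parts; a real proposal would intern structurally):

```javascript
// Intern tuples of parts so equal parts yield the *same* object, which then
// works as a Map key under ordinary SameValueZero semantics.
const interned = new Map();
function compositeKey(...parts) {
  const id = JSON.stringify(parts); // illustration only: assumes JSON-able parts
  if (!interned.has(id)) interned.set(id, Object.freeze(parts));
  return interned.get(id);
}

const people = [
  { name: "Ann", employer: "Acme", city: "NYC" },
  { name: "Bob", employer: "Acme", city: "NYC" },
  { name: "Cam", employer: "Acme", city: "LA" },
];

const groups = new Map();
for (const p of people) {
  const key = compositeKey(p.employer, p.city);
  if (!groups.has(key)) groups.set(key, []);
  groups.get(key).push(p.name);
}
console.log(groups.size); // 2: Ann and Bob share the (Acme, NYC) key
```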
18:33
<rbuckton>
I have already written a groupBy that uses equals/hash, and an equaler that already does tuple structural equality.
18:34
<rbuckton>
That library also supports [Equatable.equals](obj) and [Equatable.hash]() that can be used if an equaler is not provided, so a given composite key just builds on top of that.
18:35
<bakkot>
sure, it can certainly be done, there's just a bunch more ceremony
18:36
<rbuckton>
It is far more flexible, IMO.
18:36
<bakkot>
it's just different
18:36
<bakkot>
it solves different problems
18:36
<rbuckton>
Having to turn everything into a CompositeKey seems like a lot more ceremony to me.
18:37
<eemeli>
Map.p.getImprecise, anyone?
18:37
<rbuckton>
CompositeKey can be built on equals/hash, the other is not true.
18:37
<bakkot>
Map.groupBy(vals, x => Tuple(x.employer, x.city)) is very close to zero ceremony
18:37
<bakkot>
that's exactly how I think about the problem
18:38
<rbuckton>
Plus, AFAIK equals/hash is how every implementation implements maps natively, it's just not exposed to user code.
18:40
<rbuckton>
For that case, maybe. What about map.set(key, value) though? You have to write map.set(new CompositeKey(foo.a, foo.b), value) or map.get(new CompositeKey(foo.a, foo.b)). With equals/hash, you can set it up once and just do map.set(foo, value) or map.get(foo).
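A sketch of the set-it-up-once shape rbuckton describes (the constructor option and class are hypothetical, not a proposed API):

```javascript
// Minimal map with pluggable equals/hash: buckets by hash value,
// linear scan within a bucket using equals.
class HashedMap {
  constructor({ hash, equals }) {
    this.hash = hash;
    this.equals = equals;
    this.buckets = new Map(); // hash value -> array of [key, value] pairs
  }
  set(key, value) {
    const h = this.hash(key);
    let bucket = this.buckets.get(h);
    if (!bucket) this.buckets.set(h, (bucket = []));
    const entry = bucket.find(([k]) => this.equals(k, key));
    if (entry) entry[1] = value;
    else bucket.push([key, value]);
    return this;
  }
  get(key) {
    const bucket = this.buckets.get(this.hash(key));
    return bucket?.find(([k]) => this.equals(k, key))?.[1];
  }
}

// Comparer configured once; afterwards plain keys work with no per-call allocation:
const insensitive = new HashedMap({
  hash: (s) => s.toLowerCase(),
  equals: (a, b) => a.toLowerCase() === b.toLowerCase(),
});
insensitive.set("Content-Type", "text/html");
console.log(insensitive.get("CONTENT-TYPE")); // "text/html"
```

Mathieu's objection applies here: if the user-supplied hash and equals disagree, or a key mutates what they observe, lookups silently break.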
18:40
<bakkot>
but if my map is not keyed by foo, I don't want it to behave that way?
18:41
<bakkot>
I want to use a composite key for that case
18:41
<ljharb>
with a Map constructor hook, you could set it up once too with composite keys
18:41
<rbuckton>
You can use a composite key for that case, I'd like to not have to use composite key for all of the other cases.
18:41
<Mathieu Hofman>
It is far more flexible, IMO.
it's a lot more risk prone
18:41
<rbuckton>
with a Map constructor hook, you could set it up once too with composite keys
The same is true for equals/hash.
18:41
<ljharb>
right. which is why it's not an argument in favor of either one.
18:42
<rbuckton>
it's a lot more risk prone
It's far more efficient. If I want to write a custom collection that employs a hashtable, I'd rather have it easier to compute a hash.
18:43
<sffc>
For the set comparator, it seems like a prototype method [Symbol.setCompare] would work, and if Record/Tuple implement that function, then R&T Set/Map semantics should just work
18:43
<bakkot>
You can use a composite key for that case, I'd like to not have to use composite key for all of the other cases.
right that's what I mean by "they are different"
18:43
<Mathieu Hofman>
JS is a high-level language; I don't want low-level concepts like computing hashes exposed to programmers, especially if they can cause erroneous executions when mishandled
18:43
<bakkot>
I'm not saying that we should have composite keys instead of hash/equals, just that they solve different problems
18:45
<bakkot>
(I am separately somewhat skeptical of hash/equals because of the concerns about inconsistency, especially since that seems like it might be a bug farm in engines themselves and not just programs. but that concern is unrelated to this.)
18:50
<Mathieu Hofman>
That means the collection correctness is dependent on the stable behavior of its values. I want the ability to create a collection that is resilient against misbehaving values
18:55
<Duncan MacGregor>
+1 on pretty much everything rbuckton said.
18:58
<rbuckton>
That means the collection correctness is dependent on the stable behavior of its values. I want the ability to create a collection that is resilient against misbehaving values
If you do not trust the stability of your keys, then you could use a CompositeKey and pay the overhead for every get/set. If you do trust the stability of your keys, equals/hash lets you avoid that overhead.
19:00
<rbuckton>
I have a strong preference for fast, efficient JS. CompositeKey overhead does not seem fast/efficient.
19:01
<Mathieu Hofman>
If you do not trust the stability of your keys, then you could use a CompositeKey and pay the overhead for every get/set. If you do trust the stability of your keys, equals/hash lets you avoid that overhead.
Not if the only way is for the collection to ask the value. My answer was to sffc who was proposing that
19:01
<rbuckton>
equals/hash does not preclude CompositeKey, but rather is the building block a CompositeKey could be built on. It's also the building block using a Uri as a key could be built on too.
19:02
<bakkot>
It's not something you could build CompositeKey on because CompositeKey has === equality
19:02
<bakkot>
and === will never invoke a userland equals
19:03
<shu>
i missed that there was a configuration of the new direction that has new objects that have special ===?
19:03
<rbuckton>
If we could have CompositeKey have === equality, why would we not have R&T have === equality?
19:03
<shu>
that isn't any more acceptable than the R&T's overloading of ===
19:03
<bakkot>
i missed that there was a configuration of the new direction that has new objects that have special ===?
the CompositeKey thing he presented, with interning, which is already being done in userland
19:03
<littledan>
can we change indexOf just for these new objects?
19:04
<bakkot>
it was not one ACE advocated for
19:04
<shu>
the CompositeKey thing he presented, with interning, which is already being done in userland
tools compile away === or something?
19:04
<bakkot>
no I mean they do interning
19:05
<Ashley Claymore>
can we change indexOf just for these new objects?
I assumed 'no', but never asked
19:05
<shu>
no I mean they do interning
oh by "===" you mean they do interning?
19:05
<shu>
i see, ok
19:05
<bakkot>
https://github.com/benjamn/immutable-tuple/blob/485b32326349cb0329c749090cebf43f8359fa12/src/tuple.js#L12-L19
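[For context on the link above, the interning technique it points at can be sketched as a trie of Maps, one level per element. This toy version holds every entry forever — exactly the leak the real library avoids by using weak references for object elements.]

```javascript
// Toy interning sketch: equal inputs return the same (===) object.
// Entries are never released here; a production version uses WeakMaps
// for object elements so keys can be garbage collected.
const root = new Map();
const RESULT = Symbol("result");

function tuple(...items) {
  let node = root;
  for (const item of items) {
    if (!node.has(item)) node.set(item, new Map());
    node = node.get(item);
  }
  if (!node.has(RESULT)) node.set(RESULT, Object.freeze([...items]));
  return node.get(RESULT);
}
```

[Because the result is deduplicated, ordinary object === "just works" on it, which is the interning semantics shu describes below.]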
19:06
<rbuckton>
Isn't the discussion around CompositeKey predicated on R&T not having === equality? I have concerns about the GC overhead associated with CompositeKey using something like weakmaps/maps, as either CompositeKey("a") leaks forever, or it requires a FinalizationRegistry and the GC overhead of tracking/collecting the object.
19:06
<littledan>
Isn't the discussion around CompositeKey predicated on R&T not having === equality? I have concerns about the GC overhead associated with CompositeKey using something like weakmaps/maps, as either CompositeKey("a") leaks forever, or it requires a FinalizationRegistry and the GC overhead of tracking/collecting the object.
yes, it's all predicated on === not being supported for R&T
19:06
<shu>
rbuckton: that was also my confusion. the resolution is that CompositeKeys have interning semantics built in, they return deduplicated objects such that === just works like object ===, because it's literally deduplicated
19:06
<bakkot>
or you forbid having a CompositeKey that doesn't contain an object, which is what userland does
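[The userland restriction bakkot mentions can be sketched roughly as follows. The name compositeKey, the registry layout, and the JSON-based identity are all illustrative assumptions; a real implementation walks a trie of Maps/WeakMaps instead of serializing, and supports multiple object parts.]

```javascript
// Toy sketch: a composite key that must contain at least one object.
// The registry hangs off that object via WeakMap, so the interned key
// can be collected together with it. Assumes the remaining parts are
// JSON-safe primitives (a real implementation walks a trie instead).
const registries = new WeakMap();
const HOLE = "\u0000lifetime"; // placeholder for the object's position

function compositeKey(...parts) {
  const lifetime = parts.find(
    p => (typeof p === "object" && p !== null) || typeof p === "function"
  );
  if (lifetime === undefined) {
    throw new TypeError("composite key needs at least one object part");
  }
  if (!registries.has(lifetime)) registries.set(lifetime, new Map());
  const registry = registries.get(lifetime);
  const id = JSON.stringify(parts.map(p => (p === lifetime ? HOLE : p)));
  if (!registry.has(id)) registry.set(id, Object.freeze({ __proto__: null }));
  return registry.get(id);
}
```

[This is why `compositeKey(obj, "a")` works but `compositeKey("a", "b")` throws — the point rbuckton pushes back on just below.]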
19:06
<rbuckton>
Interning semantics will have GC overhead
19:06
<shu>
yes indeedy
19:07
<shu>
but it's like, deterministic overhead that's always there, instead of non-deterministic overhead that engines have to tune for and hope they get it right for the largest swath of the code that runs
19:07
<rbuckton>
or you forbid having a CompositeKey that doesn't contain an object, which is what userland does
CompositeKey(obj, "a") works, but CompositeKey("a", "b") does not? That seems like a very unstable design.
19:07
<shu>
which is a better place to be in, but... definitely still overhead, yes
19:07
<Ashley Claymore>
I dropped it after presenting the slides to some other people, they felt it complicated the question I was asking
19:08
<Mathieu Hofman>
I believe that is NOT what was presented. #[1] !== #[1]
19:08
<rbuckton>
I'm concerned about the overhead of 1000s of map.get(CompositeKey(a, b)) in a loop.
19:08
<eemeli>
I kinda like the idea of being able to do something like ``` new Map([], { compare: 'duck' }) ``` and have that "just work".
19:08
<Mathieu Hofman>
but when used in collections, the equality used is the one supporting R&T equality
19:08
<bakkot>
I believe that is NOT was was presented. #[1] !== #[1]
CompositeKey was presented, it's just not what was proposed
19:10
<rbuckton>
I kinda like the idea of being able to do something like
```
new Map([], { compare: 'duck' })
```
and have that "just work".

@esfx/collections-hashmap and @esfx/equatable do this. You can do, say:

import { HashMap } from "@esfx/collections-hashmap";
import { Uri } from "./uri.js";

const map = new HashMap(undefined, { equaler: Uri.equaler });

where Uri.equaler is just an object with { equals(a, b), hash(obj) }.

19:11
<Mathieu Hofman>
I'm confused about the discussion about CompositeKey here, as it's not what was proposed, but presented as background / historical context. An R/T would, unlike CompositeKey, not have === equality.
19:13
<rbuckton>
I was under the impression these were discussed as possible directions to consider, with historical context as to prior discussions within TC39.
19:14
<Mathieu Hofman>
rbuckton: I don't want to build R&T on top of a equals / hash mechanism IF there is any way that values can influence the hashing mechanism or any part of the program at the time they're added to a collection I created. I am fine if the collection defers to a construction time provided config.
19:14
<Ashley Claymore>
I should have been more clear. I was trying to say that Key(v1, v2) === Key(v1, v2) is good in that it doesn't introduce a new form of equality. But from what I have heard from engines re R&T, and from my own research, it seems difficult to make them perform well at scale: the whole app has contention on one global interning table that requires GC hooks to clean up the unused keys.
19:16
<rbuckton>
The upside of equals/hash is that it is a known quantity. Numerous languages and runtimes use this, thus the caveats are known.
19:16
<rbuckton>
A CompositeKey based on equals/hash doesn't require complex GC semantics.
19:19
<rbuckton>
If you have a native CompositeKey with equals/hash, you just make the "default equaler" behavior used by a Map have a specific case for CompositeKey vs ===. But then you can make that "default equaler" an actual object from which to build other equalers from.
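[A sketch of the layering rbuckton describes above. Both CompositeKey and the { equals, hash } equaler shape are hypothetical here — the stub only shows how a "default equaler" that special-cases composite keys could be the base other equalers delegate to.]

```javascript
// CompositeKey is a stand-in stub; the point is the layering, not the type.
class CompositeKey {
  constructor(...parts) {
    this.parts = parts;
  }
}

// Hypothetical "default equaler": structural for CompositeKey,
// SameValueZero (like Map's built-in key comparison) for everything else.
const defaultEqualer = {
  equals(a, b) {
    if (a instanceof CompositeKey && b instanceof CompositeKey) {
      return (
        a.parts.length === b.parts.length &&
        a.parts.every((p, i) => defaultEqualer.equals(p, b.parts[i]))
      );
    }
    return a === b || (a !== a && b !== b); // NaN equals NaN
  },
};

// Other equalers build on the default rather than replacing it:
const caseInsensitiveEqualer = {
  equals(a, b) {
    if (typeof a === "string" && typeof b === "string") {
      return a.toLowerCase() === b.toLowerCase();
    }
    return defaultEqualer.equals(a, b);
  },
};
```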
19:19
<Mathieu Hofman>
To be more clear:
new Map([], {compare: Record.equals}) is fine, and so is new Map([], {compare: customHashCodeThing}), but not new Map([], {compareWithHashCode: true}), which would assume looking up Record.prototype[Symbol.hashCode]
19:21
<Ashley Claymore>
In other plenary sessions it has come up that the language should be as close to fully deterministic as we can. e.g. not exposing a browser-sniffing API. Object.hash feels like it would need to be fully deterministic across engines to meet these goals.
Perhaps I have mis-read committee comments with this goal.
19:22
<rbuckton>
To be more clear:
new Map([], {compare: Record.equals}) is fine, and so is new Map([], {compare: customHashCodeThing}), but not new Map([], {compareWithHashCode: true}), which would assume looking up Record.prototype[Symbol.hashCode]
I've experimented with something like [Symbol.hashCode]. In well-written applications it's perfectly fine and would serve most users well. If you don't trust your inputs, you could opt out with a custom equaler that doesn't call @@hashCode. That said, I'm more interested in a general-purpose { equals, hash } mechanism for Map/Set and various other methods that do equality checks than I am in instance-overrideable hash codes.
19:23
<Ashley Claymore>
I also wonder if even string hashing might be hard to standardize in 262, due to the spec being in WTF-16 and runtimes potentially using different encodings?
19:32
<Mathieu Hofman>
We don't automatically need to expose the result of the internal hashing. We could have special values for compare
19:33
<rbuckton>
In other plenary sessions it has come up that the language should be as close to fully deterministic as we can. e.g. not exposing a browser-sniffing API. Object.hash feels like it would need to be fully deterministic across engines to meet these goals.
Perhaps I have mis-read committee comments with this goal.

Object.hash() shouldn't be fully deterministic, at least not for strings. String hash code generators benefit from randomness as it evens out clumping due to collisions over the course of various app restarts. An implementation would also want to be able to swap from one hash algorithm to another from version to version as hash algorithms improve.

In .NET, you can control whether to use a randomized string hashing algorithm via configuration: https://learn.microsoft.com/en-us/dotnet/framework/configure-apps/file-schema/runtime/userandomizedstringhashalgorithm-element.

19:34
<rbuckton>
We don't automatically need to expose the result of the internal hashing. We could have special values for compare
I considered describing it as an opaque value, but you need to be able to do math on it.
19:35
<Mathieu Hofman>
To follow up on Ashley Claymore's comment, I would be opposed to any new APIs that expose non-determinism to the program. It's fine for these things to remain internal to the engine, but I don't want them to be observable
19:37
<Mathieu Hofman>
Or should I say, the new API would have to be extremely well motivated and fairly self contained, like FR/WeakRef, but I don't think hashcode would clear that line
19:38
<rbuckton>

For example

class Point {
  #x;
  #y;
  constructor(x, y) {
    this.#x = x;
    this.#y = y;
  }
  get x() { return this.#x; }
  get y() { return this.#y; }
  static equaler = {
    equals(a, b) {
      return #x in a && #x in b && a.#x === b.#x && a.#y === b.#y;
    },
    hash(obj) {
      let hc = Object.hash(obj.#x);
      hc = ((hc << 7) | (hc >>> 25)) ^ Object.hash(obj.#y);
      return hc;
    }
  }
}
19:40
<Mathieu Hofman>
I honestly don't understand why hashLike(obj) { return #[Point, obj.#x, obj.#y]; } is not an acceptable approach. Engines are definitely capable of optimizing this
19:41
<rbuckton>
Capable does not mean willing or likely to do so.
19:43
<rbuckton>
And that still doesn't let me write a hashtable. I need something I can use as a numeric index with good avalanche properties. Maybe 95% of users don't need this, but it's a headache for the small percentage that do need it for a good reason.
19:48
<rbuckton>

For example, I had to do quite a bit to workaround this limitation when experimenting with the shared structs dev trial. I had a very strong need for a concurrent Map, which required implementing a hash table. It would have been far easier if I could actually generate a hash for a string or an object: https://github.com/microsoft/TypeScript/blob/shared-struct-test/src/compiler/sharing/collections/concurrentMap.ts

Yes, maybe we would want to add a ConcurrentMap after we add shared structs, but that's certainly not going to be in the MVP.

19:55
<Mathieu Hofman>
generating a non-deterministic value from a basic value like an object or a string feels like a non-starter for me. I'd claim it's not needed by 99.9999% of JS authors
19:56
<rbuckton>
Hash bucket lookup is approximately O(1), while a user-built collection relying on manually comparing tuples is O(n), and if you can't compare tuples using ===, it's O(n*m) where m is the number of keys in the tuple.
20:01
<rbuckton>
Then call the method Object.nondeterministicHash(obj) or Object.randomHash(obj) or document the non-deterministic nature on MDN. If you were an implementer that internally used murmur3 for string hashing and found you could speed up all applications by a significant percentage by switching to xxhash64, wouldn't you do so? Object.hash() must be explicitly "non-deterministic" from the get-go just so anyone consuming it could reap the same benefits if a better hash algorithm comes along.
20:03
<rbuckton>
Case in point, even if you're using the non-random string hash algorithm in .NET, Object.GetHashCode() is still considered to be non-deterministic as upgrading the version can change the hash algorithm to one that is more efficient.
20:03
<Mathieu Hofman>
I just don't think there is sufficient motivation to expose this in the language
20:09
<rbuckton>
I'd argue that the number of times we've discussed mechanisms for customizing equality indicates it is sufficient motivation. The main reason I want custom equality for Map/Set at all is related to performance. You can use them as-is as long as you're willing to accept the performance and memory overhead introduced by the alternatives. I want custom Map equality so that I don't have to .toString() every Uri I use as a key in a map.get(). Forcing custom collections to operate at O(n) while native collections can have O(1) because they can cheat and do the hashtable lookup is not the answer when your algorithms are performance-critical.
20:18
<rbuckton>

I'm not saying that Object.hash() must return a number, but that you could conceivably do all the things you would need to do to implement a hash table for an alternative to be a reasonable compromise. For example, let's say we had an API like this:

interface Equaler<T> {
  equals(a: T, b: T): boolean;
  hash(a: T): opaque;
}

where hash returns an opaque value. I'd need, at a minimum, something like this as well:

declare const defaultEqualer: Equaler<unknown>;
declare function combineHash(a: opaque, b: opaque): opaque;
interface HashArray<T> {
  length: number;
  [hash: opaque]: T;
  ...
}
interface SharedHashArray<T> {
  length: number;
  [hash: opaque]: T;
  ...
}

to have any possibility of achieving similar perf to it being a number.

20:18
<Mathieu Hofman>
You can customize equality using the hashLike approach I suggested above. It is also optimizable by engines.
20:19
<Mathieu Hofman>
The combine hash is exactly what creating a R/T with your components does
20:20
<rbuckton>
But not for a custom collection. As I said, native Map/Set could cheat and unwrap a composite key to do a hashtable lookup for O(1), but a custom collection would be O(n) at best.
20:21
<rbuckton>
I think a design that ignores custom collections is too short sighted.
20:22
<Mathieu Hofman>
Sorry I don't understand how your opaque hash is any different than a R/T
20:22
<rbuckton>
The opaque hash is just a number you can't actually see into.
20:23
<rbuckton>
The only reason it would be opaque is so that users can't depend on the actual value.
20:23
<rbuckton>
If it were a tuple, you couldn't use it as an index into an array of hash buckets.
20:24
<rbuckton>
If it were a number, or an opaque value that's actually just a number, then you conceivably could.
20:24
<rbuckton>
Numbers are just far easier to reason over since we already have Array and shared structs would have SharedArray.
20:25
<rbuckton>
To use a hash code efficiently, I need to be able to compare it using ===, shift it using << and >>>, combine it using ^, and use it as the index into an array. All of these operations are extremely fast using 32-bit integers.
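[The operations listed above can be sketched concretely. The FNV-1a constants and the rotate-xor combiner below are illustrative choices by the editor, not anything proposed; the point is that every intermediate stays in int32 range, which engines keep unboxed.]

```javascript
// FNV-1a string hashing: xor each code unit, multiply by the FNV prime.
// Math.imul keeps the multiply in 32-bit integer arithmetic.
function hashString(s) {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h | 0;
}

// Rotate-left by 7, then xor: the combining step from the Point example.
function combineHash(a, b) {
  return (((a << 7) | (a >>> 25)) ^ b) | 0;
}

// Using the hash as an index: mask into a power-of-two bucket array.
const bucketCount = 16;
const hc = combineHash(hashString("employer"), hashString("city"));
const bucket = (hc >>> 0) & (bucketCount - 1); // index in [0, bucketCount)
```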
20:28
<rbuckton>
#[a, b, c] has no avalanche properties I can use to ensure proper distribution in a hashtable, and cannot be used as an index, so comparing keys using #[a, b, c] in a custom collection is at least O(n*3), since I must compare the elements of every key in the map to the elements in the provided key. It's terribly inefficient.
20:32
<Mathieu Hofman>
If it were a tuple, you couldn't use it as an index into an array of hash buckets.
You could use it as a key in a Map, how is it different?
20:33
<rbuckton>
not across threads
20:33
<Mathieu Hofman>
if the "opaque" number value is actually observable by the program, even indirectly, it defeats the purpose of being opaque
20:34
<rbuckton>
And that assumes that all JS devs will only ever need Map and Set, and that there are no other collection classes JS does not implement that will ever be needed.
20:34
<Mathieu Hofman>
when you said opaque, I had assumed actually opaque, like a unique symbol, or empty object
20:35
<Mathieu Hofman>
anything else is not opaque by my definition
20:35
<rbuckton>
I don't need it to be opaque. I'm saying the only way an opaque value would work for me is if it could have the same properties I would need from an integer hash code.
20:36
<rbuckton>
The problem is that if it's not a 32-bit integer, it becomes a boxed value in most implementations, and those are going to be slower.
20:49
<rbuckton>

Let me summarize my thoughts:

  • Any mechanism to customize equality for Map and Set should be done in a way that could be leveraged by a custom collection. Anything less is too short sighted.
  • Using a composite key/tuple will not give me the performance characteristics I would need in a custom collection, as it would result in O(n*m) lookup time.
  • Equals/hash can be employed by a hashtable in a custom collection for O(1) lookup time.
  • Using a composite key/tuple would not be compatible with custom collections under shared memory multithreading, which are likely to be necessary since concurrent/synchronized collections are not part of the shared structs MVP.
  • Composite keys require an allocation that is likely to be repeated for every call to map.get. Engines could optimize, but likely won't do so immediately, if ever.
  • A composite key mechanism can be implemented on top of equals/hash such that a WeakMap/Map-based registry is unnecessary. equals/hash cannot be implemented on top of a composite key.
  • Equals/hash can use 32-bit integers, <<, >>> and ^, which are all already fast and optimized. Opaque values require boxing, making them slow (see bigint performance as an example)
20:52
<Mathieu Hofman>
I'm saying that any value that is a source of observable non-determinism does not seem sufficiently motivated as a "keying" mechanism when there are alternatives that cover most of the use cases — except performance, which can be optimized, and shared memory, which is a hypothetical future and could have its own collections API. JS is a high-level language that does not expose low-level / internal details of its objects or memory layout; exposing hash codes would change that
20:54
<Ashley Claymore>

Composite keys require an allocation that is likely to be repeated for every call to map.get. Engines could optimize, but likely won't do so immediately, if ever.

The idea with R&T is that, being objects and arrays, in many cases they are the data structure, so there is no extra allocation for those cases

20:56
<rbuckton>
As I've discovered while discussing array iterator performance, "can be optimized" isn't a useful metric. Implementations won't optimize unless they have very good reason to. Array destructuring has been around for ~9 years and still isn't optimized.
20:59
<rbuckton>

Composite keys require an allocation that is likely to be repeated for every call to map.get. Engines could optimize, but likely won't do so immediately, if ever.

The idea with R&T is that, being objects and arrays, in many cases they are the data structure, so there is no extra allocation for those cases

@bakkot's example in matrix earlier shows this to not be true. The keys are often a subset of the data structure, and will very likely be generated on the fly.
20:59
<Ashley Claymore>
If the data model of the application is already using R&T, inserting these into a map could be the fastest case, because all the hashing and equality operations are entirely native, with zero entry into userland, minimizing guards and allowing JIT.
20:59
<Ashley Claymore>

The keys are often a subset of the data structure

If this really is common, then apps could keep those parts of the data structure separate

21:00
<rbuckton>
If the data model of the application is already using R&T, inserting these into a map could be the fastest case, because all the hashing and equality operations are entirely native, with zero entry into userland, minimizing guards and allowing JIT.
You're only proving my point. Map and Set can only have O(1) performance for a tuple as a composite key if they use equals/hash internally. Custom collections cannot do the same.
21:00
<Mathieu Hofman>
Custom collections can use Map/Set internally for the vast majority of custom use cases
21:00
<Ashley Claymore>
Yes. I said earlier that the implementations must use hash + equals at some layer. I 100% agree that the pattern is efficient
21:01
<rbuckton>
custom collections aren't an ephemeral thing, there is production code that uses them. The perf hit is likely the only reason they aren't used more.
21:01
<Ashley Claymore>
The custom collections I've needed were wrappers around Maps
21:01
<Ashley Claymore>
I'd be interested in hearing about other cases. I'm familiar with the ConcurrentMap one.
21:07
<rbuckton>
The custom collections I've needed were wrappers around Maps
This is often the case, but these wrappers are often still inefficient. You often have to do some type of conversion for every get/set, which can result in unnecessary allocations. For example, you could build a custom map that does case-insensitive key comparisons by passing every key through a case-folding algorithm, but each comparison then produces another string in memory. If you also want to preserve the input key as written, you have to store both the original key and the value as the map entry, resulting in another allocation and duplication, plus the need to wrap keys(), values(), etc. This is all unnecessary overhead.
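[The wrapper pattern described above, sketched minimally by the editor. toLowerCase stands in for a real case-folding algorithm, and the commented lines mark the extra allocations under discussion.]

```javascript
// Case-insensitive map built as a wrapper around Map: the pattern works,
// but every operation pays for a folded string and/or an entry object.
class CaseInsensitiveMap {
  #map = new Map();
  set(key, value) {
    // Extra allocations per set: folded string + entry object
    // (the entry preserves the key as originally written).
    this.#map.set(key.toLowerCase(), { key, value });
    return this;
  }
  get(key) {
    return this.#map.get(key.toLowerCase())?.value; // extra string per get
  }
  has(key) {
    return this.#map.has(key.toLowerCase()); // extra string per has
  }
  *keys() {
    // Must unwrap entries to yield keys as written.
    for (const { key } of this.#map.values()) yield key;
  }
}
```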
21:09
<Ashley Claymore>
Gotcha. That matches my understanding. It's not that lots of custom collections can't be built, but they will need to do more allocations.
21:11
<rbuckton>
Map/Set wrappers can be built with some allocation overhead, but only if their keys are easily translated to something that works as an existing key. Complex objects end up requiring O(n)
21:12
<rbuckton>
Even if you are using weakmap/map registries to ensure a composite key has the same identity, that's still O(n) if you're writing map.get(CompositeKey(a, b)). You're just paying the O(n) cost walking the registry.
21:13
<rbuckton>
i.e., for n steps, you're doing an O(1) operation for each step.
21:13
<Ashley Claymore>
what is n here?
21:13
<rbuckton>
n is the number of elements in the composite key.
21:14
<Ashley Claymore>
ok. I thought you were referring to collection size and got confused
21:14
<rbuckton>
https://docs.google.com/presentation/d/1JfChmW8tQ2_mrFDynosNqa1tjJ2j-qX6WoKm8vc_tkY/edit#slide=id.g2c6eebea946_0_40
21:14
<rbuckton>
Sorry, I should have said m instead of n for that case.
21:17
<littledan>

Let me summarize my thoughts:

  • Any mechanism to customize equality for Map and Set should be done in a way that could be leveraged by a custom collection. Anything less is too short sighted.
  • Using a composite key/tuple will not give me the performance characteristics I would need in a custom collection, as it would result in O(n*m) lookup time.
  • Equals/hash can be employed by a hashtable in a custom collection for O(1) lookup time.
  • Using a composite key/tuple would not be compatible with custom collections under shared memory multithreading, which are likely to be necessary since concurrent/synchronized collections are not part of the shared structs MVP.
  • Composite keys require an allocation that is likely to be repeated for every call to map.get. Engines could optimize, but likely won't do so immediately, if ever.
  • A composite key mechanism can be implemented on top of equals/hash such that a WeakMap/Map-based registry is unnecessary. equals/hash cannot be implemented on top of a composite key.
  • Equals/hash can use 32-bit integers, <<, >>> and ^, which are all already fast and optimized. Opaque values require boxing, making them slow (see bigint performance as an example)
we actually added Map and Set as built-ins without having a way that what they do could be done in a custom collection. Of course it would be nice to support efficient maps defined in JS if possible given all of the other goals and constraints, but I don't see why it makes sense to say that no advances should be made for Map and Set without managing to support this other goal.
21:18
<littledan>
the composite-ness of composite keys or R&T doesn't actually depend on any new built-in thing, but identity hashcodes continue to not be exposed
21:20
<rbuckton>
we actually added Map and Set as built-ins without having a way that what they do could be done in a custom collection. Of course it would be nice to support efficient maps defined in JS if possible given all of the other goals and constraints, but I don't see why it makes sense to say that no advances should be made for Map and Set without managing to support this other goal.
We did, and this has continued to come up regularly since. I'm not opposed to things that make Map and Set more efficient. I'm concerned with choosing an API design that caters to native maps and sets that could never be efficiently implemented in a custom collection. We either end up never improving the story for custom collections, or end up with two independent mechanisms for customizing equality.
21:22
<rbuckton>
A composite key that uses a weakmap registry just to ensure reference identity for equivalent keys is entirely unnecessary with hash/equals. If we were to then implement hash/equals to improve the custom collection story, then composite keys become an evolutionary dead end for the language.
21:23
<littledan>
Is anyone defending CompositeKey?
21:24
<littledan>
Object-based R&T seems like a cleaner mechanism and doesn't do anything which should make things harder for custom collections, I think
21:24
<bakkot>
I'm not sure what distinction between CompositeKey and object-based R&T you're drawing
21:24
<rbuckton>
How is object-based R&T not a composite key?
21:25
<rbuckton>
The distinction I would make would be between a WeakMap-registry based composite key, or a Map/Set privileged CompositeKey.
21:27
<rbuckton>
A Map/Set-privileged CompositeKey has a way forward to equals/hash, but then why not have equals/hash instead (or at the same time).
21:29
<littledan>
I'm not sure what distinction between CompositeKey and object-based R&T you're drawing
well, the basic version of R&T that we were discussing is, there isn't support for being keys in WeakMap, so the whole registry thing just goes away
21:30
<littledan>
and there's no support for ===
21:30
<littledan>
I guess the registry was more for ===; WeakMap doesn't actually need it
21:31
<littledan>
so I guess you're calling this Map/Set privileged? But I don't see what Map/Set are doing that other collections couldn't also do
21:32
<littledan>
A Map/Set-privileged CompositeKey has a way forward to equals/hash, but then why not have equals/hash instead (or at the same time).
This is something I have trouble understanding. Even if we do want to generalize to hash, the question of "why not instead" was answered--you want a nice default mechanism for the normal case, as other languages tend to have
21:32
<rbuckton>
I'd also like more details from the folks that are concerned about non-determinism as to why that should matter. It would be guaranteed to be deterministic so long as the app/browser is executing. Assuming no string-random-hash algorithm at the outset, the only thing that would change the hash result would probably be upgrading the browser as a newer runtime could choose a more efficient algorithm.
21:32
<bakkot>
my preferred approach is to allow composite objects (i.e. interned objects, that give you ===) as keys in WeakMaps iff they contain at least one thing which could itself be in a WeakMap, and to introduce a canBeWeaklyHeld predicate so that's easier to determine.
21:32
<littledan>
why not at the same time was also answered: because there is not consensus to expose hashcodes, due to interop risk
21:33
<littledan>
so I guess you want more details... what do you mean by details?
21:33
<rbuckton>
so I guess you're calling this Map/Set privileged? But I don't see what Map/Set are doing that other collections couldn't also do
No, Map/Set privileged would mean a Map would treat a CompositeKey as a special value and essentially do equals/hash natively. Without runtime support for equals/hash, a custom collection cannot do the same.
21:34
<Mathieu Hofman>

if you're writing map.get(CompositeKey(a, b)). You're just paying the O(n) cost walking the registry.

I do not understand. How is it O(n) if there is a Map that supports a native composite keying ?

21:34
<rbuckton>
This is something I have trouble understanding. Even if we do want to generalize to hash, the question of "why not instead" was answered--you want a nice default mechanism for the normal case, as other languages tend to have
Other languages tend to have equal/hash, with a few built-in equalers you can readily reference. I would certainly want those as well.
21:34
<littledan>
No, Map/Set privileged would mean a Map would treat a CompositeKey as a special value and essentially do equals/hash natively. Without runtime support for equals/hash, a custom collection cannot do the same.
hmm, what kind of custom collection are you imagining? I guess I'm not picturing it properly
21:35
<rbuckton>

if you're writing map.get(CompositeKey(a, b)). You're just paying the O(n) cost walking the registry.

I do not understand. How is it O(n) if there is a Map that supports a native composite keying ?

That was in reference to the WeakMap registry mechanism on https://docs.google.com/presentation/d/1JfChmW8tQ2_mrFDynosNqa1tjJ2j-qX6WoKm8vc_tkY/edit#slide=id.g2c6eebea946_0_40
21:40
<Mathieu Hofman>
I'd also like more details from the folks that are concerned about non-determinism as to why that should matter. It would be guaranteed to be deterministic so long as the app/browser is executing. Assuming no string-random-hash algorithm at the outset, the only thing that would change the hash result would probably be upgrading the browser as a newer runtime could choose a more efficient algorithm.
I want a language where I can deterministically reproduce execution. If you exclude I/O (which encompasses Date.now and Math.random), we currently have that language
21:41
<rbuckton>
hmm, what kind of custom collection are you imagining? I guess I'm not picturing it properly
https://github.com/esfx/esfx/blob/main/packages/collections-hashmap/src/index.ts, though it's essentially just a Map that takes an { equaler } option, as I'm suggesting.
https://github.com/microsoft/TypeScript/blob/shared-struct-test/src/compiler/sharing/collections/concurrentMap.ts, which is a lock-free concurrent map that cannot use Map as a backing store.
https://github.com/microsoft/TypeScript/blob/shared-struct-test/src/compiler/sharing/collections/sharedMap.ts, which is a coarse-grained synchronized map.
to name a few. In other languages you're likely to see custom collections where specific semantics for get/set are necessary, such as with observable collections in .NET.
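The shape of such an equaler-based map can be sketched in a few lines; this is an illustrative reimplementation of the idea, not the esfx code:

```javascript
// A minimal hash map taking an { equaler } option with `equals` and
// `hash` methods. Without a native Object.hash, `hash` must come from
// the caller (the default below only special-cases strings).
class HashMap {
  #buckets = new Map(); // hash code -> array of [key, value] entries
  #equaler;

  constructor({ equaler } = {}) {
    this.#equaler = equaler ?? {
      equals: (a, b) => Object.is(a, b),
      hash: (v) => (typeof v === "string" ? fnv1a(v) : 0),
    };
  }

  set(key, value) {
    const h = this.#equaler.hash(key);
    let bucket = this.#buckets.get(h);
    if (!bucket) this.#buckets.set(h, (bucket = []));
    const entry = bucket.find(([k]) => this.#equaler.equals(k, key));
    if (entry) entry[1] = value;
    else bucket.push([key, value]);
    return this;
  }

  get(key) {
    const bucket = this.#buckets.get(this.#equaler.hash(key));
    return bucket?.find(([k]) => this.#equaler.equals(k, key))?.[1];
  }
}

// FNV-1a: a simple deterministic string hash for the default equaler.
function fnv1a(s) {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}
```

The caller supplies domain-specific equals/hash; note that without something like Object.hash, object keys in the default equaler all land in one degenerate bucket, which is exactly the gap being discussed.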
21:44
<rbuckton>
I want a language where I can deterministically reproduce execution. If you exclude I/O (which encompasses Date.now and Math.random), we currently have that language
But for what purpose is this important? Is it for testing? If you're going to exclude Date.now and Math.random, why would you not exclude Object.hash? Could an implementation not have a --hash-algorithm=... flag to ensure a stable hash algorithm for tests?
21:45
<rbuckton>
If it's for security, wouldn't a randomized Object.hash be preferred over a deterministic one?
21:46
<rbuckton>
Are there any "implementation defined" behaviors you also have to discount? Object.hash would essentially be implementation defined.
21:48
<Mathieu Hofman>
But for what purpose is this important? Is it for testing? If you're going to exclude Date.now and Math.random, why would you not exclude Object.hash? Could an implementation not have a --hash-algorithm=... flag to ensure a stable hash algorithm for tests?
We actually use this determinism for replay of program execution, sometimes across engines. Yes, we could exclude it, but then it increases the difference between the language and what programs may expect of it.
21:48
<rbuckton>
There's also the possibility we could define hashing as Object.hash(obj, algorithm?) where you can optionally specify "stable" or "fast" (or similar).
21:50
<rbuckton>
We actually use this determinism for replay of program execution, sometimes across engines. Yes, we could exclude it, but then it increases the difference between the language and what programs may expect of it.
And how does this differ from the unique reference identity that a CompositeKey might have at runtime?
21:52
<Mathieu Hofman>
And how does this differ from the unique reference identity that a CompositeKey might have at runtime?
all usages of such unique identity would be deterministic.
21:52
<rbuckton>
It should be clearly called out in documentation that Object.hash() is not stable across application restarts and shouldn't be serialized. I would expect that code that respects that would work just fine when replaying execution.
21:52
<rbuckton>
It isn't deterministic.
21:52
<Mathieu Hofman>
how isn't it? observably?
21:52
<rbuckton>
It's unobservably non-deterministic. An object reference at runtime can exist in a different memory address each time you execute.
21:53
<Mathieu Hofman>
sure but I don't care about internals of the engine that are not observable
21:53
<rbuckton>
If your code treats a hash as unobservably non-deterministic, there is no difference.
21:53
<Mathieu Hofman>
it's like saying a JS object representation may live at a different pointer, that is just not relevant
21:53
<rbuckton>
The problem is that you can't make a hash code opaque without severely impacting the performance.
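To illustrate that cost: the closest userland approximation of an opaque identity hash keeps a side table, paying a WeakMap lookup on every hash, whereas an engine could read a field straight out of the object's hidden layout. A sketch, using nothing beyond standard WeakMap:

```javascript
// Userland identity hash: deterministic sequential ids assigned on first
// request, so the varying memory address never becomes observable. The
// per-call side-table lookup is the kind of overhead being referred to.
const idTable = new WeakMap();
let nextId = 1;

function identityHash(obj) {
  let id = idTable.get(obj);
  if (id === undefined) idTable.set(obj, (id = nextId++));
  return id;
}
```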
21:54
<rbuckton>
And the whole point is the performance.
21:54
<Mathieu Hofman>
I care about what the language exposes: that it does not allow JS programs to have non-deterministic behaviors. I do not control the programs that run in my environment
21:55
<rbuckton>
JS allows programs to have non-deterministic behaviors, hence Date.now(), Math.random(). You can only make that assertion about a subset of the language.
21:57
<rbuckton>
Would control over the algorithm assuage that concern? Be it from a CLI option, an option to a ShadowRealm, or an optional argument to Object.hash()?
21:57
<Mathieu Hofman>
right, and any further non-deterministic behaviors would IMO have to be extremely well motivated, and I am not seeing that from hashCode
22:13
<rbuckton>
I would find a custom equality solution that cannot be implemented efficiently in a pure-JS collection to be a serious design flaw in the language. If Map/Set have privileged support for a composite key, that will end up poisoning other APIs over time. It will inevitably be reused by groupBy and distinct, or possibly a future join/crossJoin/leftJoin for iterators. We will move further and further away from the ability for such features to evolve in userland, as no userland implementation could have the same performance characteristics. And we regularly suggest that some proposals instead evolve in userland.
22:14
<rbuckton>
I'd like to point out one specific bit. Object.hash() is essentially deterministic for everything except strings. The only non-determinism in strings would be if a more efficient hashing algorithm were to come along.
22:18
<rbuckton>
It's possible that we could handle that case with a subtle-like API: get consensus on a base algorithm for string hashing, and allow users to opt into a faster algorithm as it comes along. For example, were Object.hash to use murmur3 by default, we could later support xxhash via Object.hash.xxhash32() or Object.hash(v, "xxhash32").
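A sketch of that shape, with a hypothetical hashString standing in for the string case of Object.hash and FNV-1a standing in for whatever consensus default would be chosen (none of these names are proposed API):

```javascript
// Hypothetical: one consensus default string-hash algorithm plus named
// opt-in alternatives, in the spirit of Object.hash(v, "xxhash32").
function fnv1a(s) {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0; // unsigned 32-bit result
}

const algorithms = new Map([
  ["default", fnv1a],
  // ["xxhash32", xxhash32], // a faster algorithm registered later, opt-in
]);

function hashString(value, algorithm = "default") {
  const fn = algorithms.get(algorithm);
  if (!fn) throw new RangeError(`unknown hash algorithm: ${algorithm}`);
  return fn(value);
}
```

Because the default is pinned rather than implementation-chosen, a program that never opts into another name hashes identically across engines and across runs.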
22:20
<rbuckton>
Then the program remains deterministic. If randomized string hashing becomes important, make it an option, and add that narrow bit to the set of things you don't support when you want determinism, rather than throwing out the whole feature.
22:27
<rbuckton>
Correction, strings are not the only case. Object.hash({}) might also need to be non-deterministic to avoid introducing a side-channel. I have to think on that a bit more.
22:28
<rbuckton>
It could just be that Object.hash({}) always returns zero or throws, and you have to work around that yourself, but that would be unfortunate.
22:40
<rbuckton>
IMO, I'd rather have a non-deterministic Object.hash() and better support for time-travel debugging across engines to handle the "replayable program execution" case.
23:12
<Bradford Smith>
Mathieu Hofman: It's my understanding that iteration over Map and Set is always in the order in which items were added. This wouldn't be affected by allowing a userland-provided hashCode form of equals, would it? Are you thinking of some other way in which non-determinism would creep in? Or are you concerned about userland bugs causing the hashCode for a mutable object to change when the object is mutated?
23:16
<Mathieu Hofman>
Mathieu Hofman: It's my understanding that iteration over Map and Set is always in the order in which items were added. This wouldn't be affected by allowing a userland-provided hashCode form of equals, would it? Are you thinking of some other way in which non-determinism would creep in? Or are you concerned about userland bugs causing the hashCode for a mutable object to change when the object is mutated?
I'm not saying that the builtin collections would use hashCode for order. I'm saying exposing more non-determinism would intrinsically risk causing more non-deterministic execution of JS programs.
23:28
<Bradford Smith>

Ok, so the issue isn't that the proposed use of Symbol.hashCode and Symbol.equals properties (or something like that) for Map/Set equality is itself a cause of non-determinism. It is rather that any userland implementation of Symbol.hashCode is likely to be non-deterministic in order to be functional, and other uses of Symbol.hashCode that crop up in userland could then lead to non-deterministic behavior. The existence of this API practically requires that userland code produce non-deterministic values in order to use it.

Is that right?

23:50
<Mathieu Hofman>
My understanding is that for hashCode to be effective, some non-determinism will become observable. That non-determinism comes from APIs that would need to be introduced to hash "root" values, like strings or the identity of some values (e.g. unique symbols)