13:52 | <ryzokuken> | good morning/evening/night everyone! |
13:52 | <ryzokuken> | meeting starts in ~7 |
14:29 | <littledan> | I have trouble understanding the motivation for this null change, but also it doesn't seem harmful |
14:29 | <littledan> | (or rather, I have trouble understanding the motivation for the null skipping in the first place) |
14:30 | <nicolo-ribaudo> | Random fact: Babel accidentally ignores nulls while guaranteeing at least one await -- we have a PR open to match the behavior of the current (pre-this-change) proposal |
14:35 | <hax (HE Shi-Jun)> | It seems this PR tries to avoid nulls introducing extra awaits, am I understanding this correctly? |
14:35 | <nicolo-ribaudo> | Yes |
14:37 | <shu> | this is a small thing, but i'd be interested in people's thoughts on banning using inside base switch cases: https://github.com/tc39/proposal-explicit-resource-management/issues/215#issuecomment-2040486653 |
14:39 | <nicolo-ribaudo> | Babel's using support in switch is currently completely broken |
14:39 | <nicolo-ribaudo> | Because everything in switch is incredibly annoying to compile |
14:39 | <shu> | i wanna ban it but recognize that'll break symmetry with let/const |
14:40 | <nicolo-ribaudo> | Is it for implementation reasons? |
14:41 | <shu> | it's laid out in the comment -- saves some codegen complexity |
14:42 | <shu> | we can unroll the dispose loop in all cases except switch cases |
14:44 | <nicolo-ribaudo> | I'm trying to think of any usecase for using in switch, but I cannot think of any where I would want using's scope to fall through multiple cases |
14:46 | <nicolo-ribaudo> | (Babel doesn't unroll the loop anymore, so for us it wouldn't actually be difficult to fix using in switch) |
14:46 | <shu> | we don't unroll currently, but would like to |
14:50 | <rbuckton> | One usecase: skipping using for a chunk of operations when it isn't needed, such as when a lock is needed for a 16k chunk of data but not for a smaller chunk. The point of manual loop unrolling is to avoid a bunch of comparisons and branches based on a single condition such as input size, so a smaller chunk of data skips over the using (and avoids the extraneous null/undefined check), while a larger chunk of data enforces the using and holds the lock until the current iteration of the loop ends. |
14:52 | <nicolo-ribaudo> | Oh I see, like for transpiling generators |
14:53 | <nicolo-ribaudo> | (in a world where generators are transpiled and using is not I guess) |
14:56 | <rbuckton> | not for transpiling generators, no. that would be using a switch as a finite state machine, but an FSM would result in the using terminating early. Loop unrolling in this case is to handle chunks of data without continuously spinning and checking in the for to process one element at a time, cutting the number of comparisons/branches down from one per element to one per chunk. |
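A sketch of the manual unrolling being described (an editor's illustration, not code from the discussion; the `using`/lock variant is omitted since explicit resource management isn't broadly shipped, and plain summation stands in for the per-chunk work):

```javascript
// Manual loop unrolling: handle elements in chunks of 4 so the loop
// condition is checked once per chunk instead of once per element
// (n comparisons become roughly n/4).
function sumUnrolled(arr) {
  let sum = 0;
  let i = 0;
  const limit = arr.length - (arr.length % 4);
  while (i < limit) { // one comparison covers four elements
    sum += arr[i] + arr[i + 1] + arr[i + 2] + arr[i + 3];
    i += 4;
  }
  while (i < arr.length) sum += arr[i++]; // leftover tail, one at a time
  return sum;
}

console.log(sumUnrolled([1, 2, 3, 4, 5])); // 15
```

In rbuckton's scenario, the body of the unrolled chunk would additionally acquire a lock via `using`, while the one-at-a-time tail skips it.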
15:25 | <littledan> | If we could get to Ashley's point, he can explain why TLA running eagerly is useful within Bloomberg |
15:26 | <hax (HE Shi-Jun)> | tla already can break expectations... |
15:29 | <ljharb> | certainly a TLA being added anywhere in your graph (that's not an entrypoint) is a breaking change |
15:29 | <littledan> | The thing is: in practice, you really shouldn't use JSON or CSS modules either, due to the lack of built-in bundling. The limited utility of built-in modules is not at all limited to this feature |
15:29 | <bakkot> | this strongly suggests to me that we need a path to standardize bundler-only syntax |
15:29 | <littledan> | until then, we'll be talking about features that are mostly useful for bundlers and non-web environments |
15:30 | <littledan> | this strongly suggests to me that we need a path to standardize bundler-only syntax |
15:30 | <shu> | i disagree with that disagree |
15:30 | <shu> | doesn't seem a good use of resources tbh |
15:31 | <bakkot> | I disagree; I think we should fix the issues to allow native modules to be usable in browsers |
15:31 | <littledan> | these are all arguments against ES6 modules in the first place |
15:31 | <bakkot> | cough |
15:31 | <bakkot> | I mean, yes. |
15:31 | <bakkot> | But that ship has sailed. |
15:31 | <ljharb> | you're soooo close |
15:31 | <rbuckton> | bundler-related, but could import defer potentially avoid fetch and parse as well if we could serialize the import graph on the server side ahead of time? |
15:32 | <hax (HE Shi-Jun)> | In Jack's "assert sync" proposal I suggested we could also add a "use async" directive which might help to make TLA explicit. Maybe it also helps this issue. |
15:32 | <littledan> | well, yeah, so this would be good input to the module harmony group, if we don't like modules features anymore... |
15:32 | <bakkot> | I think standardizing ES6 modules as bundler-only syntax would have been better than the current world |
15:32 | <rbuckton> | or is that not an option because fetch is async |
15:33 | <bakkot> | well, yeah, so this would be good input to the module harmony group, if we don't like modules features anymore... import source is useful on the web! |
15:33 | <bakkot> | so it's not "module features" in general, just those specific ones which aren't useful on the web |
15:34 | <Justin Ridgewell> | /me Taking over Chair responsibilities for a bit |
15:34 | <bakkot> | (possibly "[...] aren't useful on the web yet") |
15:34 | <littledan> | I think the Matrix results show that it's useful on the web in practice |
15:34 | <littledan> | It's a good thing that bundlers are aligned to TC39 and browsers in their syntax and semantics. It will be good for us to preserve that. |
15:35 | <Ashley Claymore> | bundler-related, but could |
15:35 | <bakkot> | IIUC the Matrix results show that it's useful on the web to have bundler-level syntax for this, but not that it's useful to have in browsers, right? |
15:35 | <littledan> | we've made tons of progress by actually building things into JS; the code splitting situation was a mess before import(), and it resulted in code splitting not occurring. |
15:35 | <bakkot> | I am not proposing "give up on this"; I am proposing "have a path to standardize bundler-level syntax that isn't actually in browsers" |
15:35 | <littledan> | TC39 is the JavaScript standards committee--our role is to find this common syntax |
15:36 | <littledan> | once we find common syntax, it's OK if some implementations fall behind and don't implement everything, but it'd be a big cost to split the language in two |
15:36 | <bakkot> | my response is the thing Shu is saying out loud right now |
15:36 | <littledan> | we should have this as an actual agenda item; that'd make it easier to discuss |
15:39 | <littledan> | IMO Stage 3 serves this purpose of "it's in tools and not necessarily in browsers". Maybe we just want to leave things in Stage 3 for longer. |
15:39 | <bakkot> | fwiw eslint refuses to implement features prior to stage 4, though that's maybe a them problem |
15:41 | <ljharb> | stage 3 is when it gets shipped in (most) browsers tho |
15:42 | <shu> | John-David Dalton: why is it disingenuous? this is arguably better handled by tools. tools have a different view (they can see the whole app) and optimization opportunities than a VM |
15:43 | <shu> | we designed a super static module system. it is not surprising to me that it has been leveraged to success with ahead-of-time tooling |
15:44 | <littledan> | fwiw eslint refuses to implement features prior to stage 4, though that's maybe a them problem |
15:46 | <ljharb> | (@babel/eslint-parser as of babel 7) |
15:47 | <littledan> | I'd be happiest if browsers ship features like this at that point, but if they want to leave some features out of browsers for now (but we still assess that the feature is solid enough for Stage 3), maybe they can just let them sit in that bucket. Doesn't require holding back 2.7 or 3. |
15:47 | <ljharb> | 3 is a pretty clear signal, that's the big purpose of 2.7. i agree it wouldn't require holding back 2.7 |
15:48 | <Michael Ficarra> | so is this feature going to encourage the dreaded "barrel" module pattern? |
15:48 | <littledan> | so is this feature going to encourage the dreaded "barrel" module pattern? |
15:48 | <bakkot> | While that's technically true, I would be happier revising our process if we want to do that - I would prefer that stage 2.7/3 be taken as a commitment from browsers to implement a feature at some point. I don't want to get to a point where browsers allow features to reach 2.7 that they intend to never implement. |
15:49 | <shu> | agree |
15:49 | <shu> | the two design spaces are honestly different |
15:49 | <shu> | the constraints are different |
15:49 | <ljharb> | so is this feature going to encourage the dreaded "barrel" module pattern? |
15:49 | <littledan> | While that's technically true, I would be happier revising our process if we want to do that - I would prefer that stage 2.7/3 be taken as a commitment from browsers to implement a feature at some point. I don't want to get to a point where browsers allow features to reach 2.7 that they intend to never implement. |
15:49 | <Michael Ficarra> | it's about optimizing the feature--no encouragement is needed |
15:50 | <shu> | littledan: i have been thinking about this for most of this year, i'll give you something more thought out soon |
15:51 | <littledan> | well it doesn't provide any benefit to those not using the pattern, so it seems like encouragement to use it |
15:51 | <littledan> | it's not about newly incentivizing usage |
15:51 | <Michael Ficarra> | so someone who wants to get similar tree-shakeability can just dump this pattern? |
15:52 | <littledan> | sure, if you can get all your dependencies to dump this pattern, then you'll have no reason to use this feature in your dependency tree |
15:52 | <ljharb> | there's a growing group of vocal folks in the ecosystem telling people to get rid of barrel modules, fwiw |
15:53 | <ljharb> | (coinbase's RN app's binary size dropped 71% when we/they banned barrel exports back in 2021) |
15:53 | <littledan> | there's a growing group of vocal folks in the ecosystem telling people to get rid of barrel modules, fwiw |
15:54 | <ljharb> | even with import defer, barrel exports will still result in a larger app than "just import what you need directly", no? |
15:54 | <bakkot> | I don't think "this feature is not useful in browsers" is a new category of objection? |
15:54 | <bakkot> | that's like... the main objection |
15:55 | <Michael Ficarra> | can someone remind me why people ever did that in the first place? |
15:55 | <ljharb> | i don't know why people tend to like "god objects", deep imports is The Way. but they do tend to like them. |
15:56 | <ptomato> | maybe the pattern was copied from python modules and __init__.py ? |
15:56 | <littledan> | can someone remind me why people ever did that in the first place? |
15:56 | <ljharb> | i've seen people have a visceral preference for one import keyword that has N identifiers, instead of N import keywords |
15:56 | <shu> | that's like... the main objection |
15:56 | <Michael Ficarra> | I'm finding it very hard to find sympathy for these people... |
15:57 | <ljharb> | also i think that a number of major IDEs implemented auto-refactoring features for named exports but ignored default exports (for no technical reason) which drove increased usage of named exports |
15:57 | <littledan> | I'm finding it very hard to find sympathy for these people... |
15:58 | <ljharb> | but is "don't use barrel files" perhaps a better solution? (genuine question) |
15:58 | <Justin Ridgewell> | Barrels are so convenient, and so awful for performance (both browsers and bundlers) |
15:58 | <littledan> | I dunno, there's a lot of recommendations about performance that people aren't taking up... |
15:59 | <Justin Ridgewell> | @guybedford Your mic is echoing |
15:59 | <bakkot> | but wouldn't this be a recommendation for performance in exactly the same way? it requires work to adopt; it doesn't just give you performance for free |
15:59 | <ryzokuken> | sorry guybedford bad timing |
16:00 | <Justin Ridgewell> | Timebox! |
16:00 | <bakkot> | and if the people in question were willing to do work to get performance they could already do so, right? |
16:01 | <Michael Ficarra> | @nicolo-ribaudo your async feedback from me is that, for now, I'm unconvinced by the motivation |
16:01 | <ljharb> | convenience often makes for better road-paving than good intentions |
16:01 | <littledan> | but wouldn't this be a recommendation for performance in exactly the same way? it requires work to adopt; it doesn't just give you performance for free |
16:03 | <ljharb> | so importers keep thinking they're doing a fine thing, when they're doing a horrible thing, and the entire ecosystem has to move to import defer to avoid the education problem? |
16:03 | <ljharb> | education's super hard so maybe that's the right tradeoff to make, tbf |
16:10 | <nicolo-ribaudo> | but wouldn't this be a recommendation for performance in exactly the same way? it requires work to adopt; it doesn't just give you performance for free |
16:10 | <nicolo-ribaudo> | So 1 instead of N |
16:10 | <nicolo-ribaudo> | so importers keep thinking they're doing a fine thing, when they're doing a horrible thing, and the entire ecosystem has to move to |
16:43 | <ljharb> | does using import defer on a barrel file with a bundler that supports it give the ~same results as using deep imports in the first place? because treeshaking so far still doesn't fully achieve "only importing what you need in the first place" afaik. |
16:46 | <Michael Ficarra> | in a language without so many hidden effects all over the place, it would be |
16:49 | <nicolo-ribaudo> | It's export defer / export optional that helps with barrel files |
16:49 | <nicolo-ribaudo> | And export defer would have the same result as using deep imports |
16:50 | <ljharb> | ok, so that's a compelling argument in favor of that one |
16:51 | <ljharb> | so import defer would be to lessen the downside of importing a barrel file that did not use export defer ? |
16:52 | <nicolo-ribaudo> | And the boundary is still at the … it will also load/run the code for … |
16:52 | <nicolo-ribaudo> | Exactly like deep imports |
16:53 | <nicolo-ribaudo> | so import defer is not really related to barrel file |
16:53 | <nicolo-ribaudo> | It's in general for "big modules subgraphs" |
16:53 | <nicolo-ribaudo> | Even if you have a file with a single export but many dependencies, import defer is useful there |
16:54 | <littledan> | One thing that's important to understand is, the barrel file stuff doesn't relate to the motivation for import defer --they solve for totally unrelated issues, just both about loading less code |
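An editor's sketch of the distinction drawn in this thread, using the proposal's `import defer` syntax (Stage 2.7, not broadly shipped; file names are hypothetical): a deferred namespace import still fetches and parses the subgraph up front, but postpones evaluation until the namespace is first touched, whereas a deep import avoids loading the unused modules entirely.

```js
// utils/index.js — a "barrel" file re-exporting everything:
export { parse } from "./parse.js";
export { format } from "./format.js";

// main.js, option 1 — deep import: only ./utils/parse.js is loaded at all.
import { parse } from "./utils/parse.js";

// main.js, option 2 — import defer: the whole barrel subgraph is still
// fetched and parsed, but not *evaluated* until the namespace is touched.
import defer * as utils from "./utils/index.js";
function onClick() {
  utils.parse("..."); // evaluation of the subgraph happens here
}
```

This matches the points above: only the (separate) export defer / export optional idea would make a barrel behave like deep imports, while import defer targets evaluation cost of big module subgraphs in general.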
17:16 | <Michael Ficarra> | I agree with Kevin that this isn't especially confusing |
17:18 | <Michael Ficarra> | is there maybe justification to have this produce its own iterator subclass that can have optimised helpers implemented on it? |
17:18 | <Michael Ficarra> | like .drop(1e300) could be implemented very efficiently for this special iterator |
17:19 | <bakkot> | I think drop and take are the only things which could have better implementations and I don't think it's worth doing just for those |
17:19 | <bakkot> | and engines could do this optimization anyway |
17:20 | <hax (HE Shi-Jun)> | The original proposal includes BigInt.range() , don't remember why it was added in the first place |
17:20 | <Bradford Smith> | I might expect to be able to use Iterator.range() to generate unique identifiers, in which case possibly a BigInt type is desirable. |
17:21 | <hax (HE Shi-Jun)> | I might expect to be able to use |
17:21 | <Michael Ficarra> | to waldemar's point, yes, there are some editorial bugs, but nothing that's not fixable |
17:21 | <Michael Ficarra> | it hasn't passed editorial review yet |
17:21 | <shu> | I might expect to be able to use |
17:21 | <shu> | like, give me monotonic ids starting with <large number> or something? |
17:22 | <ljharb> | i have no bigint range use cases, but "works for one kind of numeric primitive and not the others" seems like a big problem |
17:22 | <Bradford Smith> | like, give me monotonic ids starting with <large number> or something? |
17:23 | <Justin Ridgewell> | Are bigints still massively slower than numbers? |
17:23 | <rbuckton> | I'd prefer it work for bigint just so I'm not having to coerce using BigInt() when I need to do math with other bigints. |
17:23 | <shu> | i have no bigint range use cases, but "works for one kind of numeric primitive and not the others" seems like a big problem |
17:23 | <shu> | Are bigints still massively slower than numbers? |
17:23 | <danielrosenwasser> | Mathieu Hofman: can you add yourself to the notes? |
17:23 | <ljharb> | it's an inconsistency. one of those warts we'd inevitably have to make a "fill in the table" proposal for in a few years. |
17:23 | <eemeli> | Timestamps in microseconds as a use case? |
17:24 | <ljharb> | it's fine if we want to wait for that, but is there a reason why it's beneficial to defer it? |
17:24 | <shu> | it's an inconsistency. one of those warts we'd inevitably have to make a "fill in the table" proposal for in a few years. |
17:24 | <ljharb> | that is true of most of the goals we all have :-) |
17:25 | <shu> | right, which is why i pushed back on "seems like a big problem" |
17:25 | <bakkot> | it's not that inconsistent - the Math methods do not and never will take BigInts |
17:25 | <shu> | in any case i was given a concrete use case i found plausible |
17:25 | <bakkot> | in fact I don't think we have anything that takes only Number or BigInt? |
17:25 | <shu> | so i'm happy with bigints being accepted here |
17:25 | <bakkot> | but yeah I also prefer accepting BigInt here |
17:26 | <ptomato> | Timestamps in microseconds as a use case? |
17:28 | <nicolo-ribaudo> | Maybe Iterator.range(instant1, instant2, duration) if range is overloaded anyway? |
17:28 | <Jack Works> | I might expect to be able to use |
17:29 | <shu> | might not need to be unique |
17:29 | <ljharb> | also autoincrementing is fine sometimes, not everything has the German tank problem |
17:30 | <nicolo-ribaudo> | Python bans floats in range() right? |
17:30 | <ptomato> | right |
17:32 | <ptomato> | Python's popular computing library NumPy has numpy.arange() which does suffer from unexpected iterations due to floating point errors, and numpy.linspace() which does not |
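The hazard referenced here (and the reason Python bans floats in range()) reproduces directly in JS doubles; an editor's illustrative sketch:

```javascript
// Naively iterating 0 → 1 by 0.1 accumulates rounding error: the tenth
// accumulated value is 0.9999999999999999, which is still < 1, so the loop
// yields 11 values instead of the expected 10.
function naiveRange(start, stop, step) {
  const vals = [];
  for (let x = start; x < stop; x += step) vals.push(x);
  return vals;
}

const vals = naiveRange(0, 1, 0.1);
console.log(vals.length);               // 11, not 10
console.log(vals[vals.length - 1]);     // 0.9999999999999999
```

linspace-style APIs avoid this by computing `start + i * step / n` from the index rather than accumulating.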
17:33 | <Jesse> | ...decimal. |
17:33 | <bakkot> | I do like linspace |
17:34 | <bakkot> | the options object does leave room for it |
17:34 | <nicolo-ribaudo> | You can just use a range of integers and .map for what Matthew is proposing, right? |
17:35 | <Michael Ficarra> | FWIW it would be nice to have extra time to iron out all the editorial issues as well |
17:35 | <Michael Ficarra> | then we can have a really solid 2.7 advancement |
17:36 | <Bradford Smith> | I'll definitely +1 this feature either with or without fractional values. |
17:36 | <bakkot> | same though I do prefer accepting fractional values |
17:36 | <bakkot> | but if left out it's not the end of the world |
17:41 | <hax (HE Shi-Jun)> | the options object does leave room for it |
17:41 | <Michael Ficarra> | even 2-4x slower must still be faster than .reduce((a, b) => a + b, -0) |
17:43 | <hax (HE Shi-Jun)> | As the original author of the range floating issue, I prefer not accepting fractional values, and changing the name from Iterator.range to Iterator.integers which makes it even more clear to developers 😉 |
17:44 | <Michael Ficarra> | intRange |
17:46 | <hax (HE Shi-Jun)> | Do people also need ProductExact() ? |
17:46 | <littledan> | I can't understand the difference in precision/exactitude between precise and exact in this case... can anyone else? |
17:47 | <eemeli> | I think "precise" applies to all numbers. So 0.1 + 0.2 has a precise value, it's just not exactly 0.3 . |
17:47 | <shu> | kinda, i guess there is a notion of "low precision" vs "high precision" that's more commonly used than "low exactitude" vs "high exactitude" |
17:47 | <shu> | but in this particular case i don't really understand |
17:47 | <eemeli> | "Accurate" would be more... accurate. |
17:48 | <hax (HE Shi-Jun)> | I can't understand the difference in precision/exactitude between precise and exact in this case... can anyone else? |
17:48 | <littledan> | I think "precise" applies to all numbers. So |
17:49 | <littledan> | kinda, i guess there is a notion of "low precision" vs "high precision" that's more commonly used than "low exactitude" vs "high exactitude" |
17:49 | <littledan> | right? |
17:49 | <shu> | right, in this particular case i don't understand the difference |
17:49 | <shu> | if it were up to me i'd name it sumSlow |
17:50 | <shu> | like i would totally believe that if kevin's slides said "this is named sumPrecise, but it may give the wrong impression", someone will then say "sumExact is better" |
17:50 | <eemeli> | To be clear, I'm fine with sumExact as a least worst option. |
17:50 | <Jack Works> | Yes, next follow-on proposal might be adding a new Symbol.rangeTo, and then range(a, b, options) where a is an object will do the following: yield* a[Symbol.rangeTo](b, options) , then you can add it on Temporal, Decimal (they're not primitive now) and your own class. |
17:51 | <rbuckton> | Doesn't temporal use from ? |
17:51 | <littledan> | what name are we considering? |
17:52 | <Bradford Smith> | So, I guess we should go with Math.sumPreciseFrom() ? |
17:52 | <Bradford Smith> | or not... |
17:52 | <bakkot> | it sounds like we are not doing From since everyone else doesn't like it |
17:52 | <bakkot> | so sumExact or sumPrecise |
17:54 | <hax (HE Shi-Jun)> | if it were up to me i'd name it sumSlow |
17:54 | <danielrosenwasser> | dminor: can you clarify your last point in the notes? |
17:54 | <shu> | yeah it is? |
17:54 | <shu> | it's necessarily slower |
17:54 | <nicolo-ribaudo> | I was going to say that I dislike From because it currently means "I'm creating something of this type from these other values". Iterator.from creates an iterator, Array.from an array, Object.fromEntries an object. If we wanted this to contain from it should be on number (Number.fromSum ), but also Number is not a collection |
17:55 | <rbuckton> | Temporal uses from in many places and those functions do not take iterables, so I disagree with Michael Ficarra's statement that "everything named from takes an iterable" |
17:56 | <Mathieu Hofman> | You can just use a range of integers and .map for what Matthew is proposing, right? range(start * stepDivider, end * stepDivider, stepMultipler) | map(x => x / stepDivider) |
17:56 | <rbuckton> | Agreed. "Foo.from" does not imply "takes an iterable", it implies "make a Foo from these inputs" |
17:57 | <littledan> | I'm having trouble understanding why we're bothering with an optimization for an error case |
17:57 | <Michael Ficarra> | I don't think arithmetic commutativity comes into play when we're talking about throwing, which is inherently not arithmetic |
17:58 | <Michael Ficarra> | @littledan an iterable of Number producing a NaN is not an error case? |
17:58 | <littledan> | I think NaN is kinda usually for error cases? |
17:59 | <littledan> | I get that it's within the domain of Numbers but.... when was the last time you wanted NaN to come up for you? |
17:59 | <Michael Ficarra> | at the business logic layer, maybe, but at this data processing layer, no |
17:59 | <dminor> | I have to drop, Eemeli and Matthew Gaudet will represent SpiderMonkey for the rest of the day. |
18:00 | <littledan> | at the business logic layer, maybe, but at this data processing layer, no |
18:00 | <Michael Ficarra> | yes, iterators can yield very many values |
18:01 | <Michael Ficarra> | a multi-hour batch could be short-circuited immediately after starting instead of waiting until the end |
18:03 | <hax (HE Shi-Jun)> | it's necessarily slower |
18:04 | <danielrosenwasser> | eemeli: can you add yourself to the notes doc? |
18:09 | <Justin Ridgewell> | I had to miss the presentation, but why is empty list -0 ? (no opinion, just curious) |
18:10 | <shu> | it is the identity of floating point addition |
18:10 | <Michael Ficarra> | @Justin Ridgewell Object.is(-0 + -0, -0) |
18:10 | <shu> | (while +0 + -0 = +0) |
18:11 | <Justin Ridgewell> | Why is -0 the starting point? |
18:11 | <Justin Ridgewell> | If I’m interpreting those correctly.... |
18:11 | <shu> | what does starting point mean? |
18:11 | <Justin Ridgewell> | let sum = -0; for (const i of array) sum += i |
18:12 | <Michael Ficarra> | so that when array is [-0] you get the correct answer |
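The identity point above, spelled out (editor's illustrative snippet):

```javascript
// -0 is the identity of IEEE-754 addition: adding it to any value, including
// -0 itself, preserves that value. A +0 accumulator would erase the sign.
console.log(Object.is(-0 + -0, -0)); // true
console.log(Object.is(+0 + -0, +0)); // true — +0 as a start loses the sign

// So a sum over [-0] only returns -0 if the accumulator starts at -0:
console.log(Object.is([-0].reduce((a, b) => a + b, -0), -0)); // true
console.log(Object.is([-0].reduce((a, b) => a + b, +0), -0)); // false (+0)
```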
18:17 | <Duncan MacGregor> | The equality question feels very like the question of == on value types in Java. |
18:19 | <bakkot> | the slides will reference project Valhalla IIRC |
18:19 | <Michael Ficarra> | value types being the ones with lowercase names? |
18:19 | <bakkot> | (i.e. value objects for java) |
18:19 | <bakkot> | https://openjdk.org/projects/valhalla/ |
18:24 | <Michael Ficarra> | this presentation is so well structured 😍 |
18:25 | <bakkot> | I really like the composite object approach though I would not call them "CompositeKey" |
18:27 | <bakkot> | I linked a few userland implementations here: https://github.com/tc39/proposal-record-tuple/issues/387#issuecomment-2033531920 though I am sure there are others |
18:29 | <rbuckton> | Composite keys don't seem like a solution to case-insensitive Maps. I still strongly favor equals/hash. |
18:29 | <bakkot> | I think they solve a different problem - equals/hash doesn't let me write groupBy with the result of my comparator being a composite key (without doing a bunch of work) |
18:29 | <ptomato> | but uniqBy without CompositeKey would cover that use case, no? |
18:30 | <rbuckton> | i.e., new Map([], { comparer: caseInsensitiveStringComparer }) |
18:30 | <Michael Ficarra> | @rbuckton it works fine, Unicode has case folding for that |
18:30 | <rbuckton> | I think they solve a different problem - equals/hash doesn't let me write |
18:31 | <rbuckton> | @rbuckton it works fine, Unicode has case folding for tht |
18:32 | <bakkot> | I don't know what that means? I am asking about, for example, I have a list of { name, employer, city } objects, and I want to collect them by { employer, city } . with composite key that's just Map.groupBy(vals, x => Tuple(x.employer, x.city)) . with equals/hash there's a bunch more ceremony |
18:33 | <rbuckton> | I have already written a groupBy that uses equals/hash, and an equaler that already does tuple structural equality. |
18:34 | <rbuckton> | That library also supports [Equatable.equals](obj) and [Equatable.hash]() that can be used if an equaler is not provided, so a given composite key just builds on top of that. |
18:35 | <bakkot> | sure, it can certainly be done, there's just a bunch more ceremony |
18:36 | <rbuckton> | It is far more flexible, IMO. |
18:36 | <bakkot> | it's just different |
18:36 | <bakkot> | it solves different problems |
18:36 | <rbuckton> | Having to turn everything into a CompositeKey seems like a lot more ceremony to me. |
18:37 | <eemeli> | Map.p.getImprecise, anyone? |
18:37 | <rbuckton> | CompositeKey can be built on equals/hash, the other is not true. |
18:37 | <bakkot> | Map.groupBy(vals, x => Tuple(x.employer, x.city)) is very close to zero ceremony |
18:37 | <bakkot> | that's exactly how I think about the problem |
18:38 | <rbuckton> | Plus, AFAIK equals/hash is how every implementation implements maps natively, it's just not exposed to user code. |
18:40 | <rbuckton> | For that case, maybe. What about map.set(key, value) though? You have to write map.set(new CompositeKey(foo.a, foo.b), value) or map.get(new CompositeKey(foo.a, foo.b)) . With equals/hash, you can set it up once and just do map.set(foo, value) or map.get(foo) . |
18:40 | <bakkot> | but if my map is not keyed by foo , I don't want it to behave that way? |
18:41 | <bakkot> | I want to use a composite key for that case |
18:41 | <ljharb> | with a Map constructor hook, you could set it up once too with composite keys |
18:41 | <rbuckton> | You can use a composite key for that case, I'd like to not have to use composite key for all of the other cases. |
18:41 | <Mathieu Hofman> | It is far more flexible, IMO. |
18:41 | <rbuckton> | with a Map constructor hook, you could set it up once too with composite keys |
18:41 | <ljharb> | right. which is why it's not an argument in favor of either one. |
18:42 | <rbuckton> | it's a lot more risk prone |
18:43 | <sffc> | For the set comparator, it seems like a prototype method [Symbol.setCompare] would work, and if Record/Tuple implement that function, then R&T Set/Map semantics should just work |
18:43 | <bakkot> | You can use a composite key for that case, I'd like to not have to use composite key for all of the other cases. |
18:43 | <Mathieu Hofman> | JS is a high-level language, I don't want low-level concepts like computing hashes exposed to programmers, especially if they can cause erroneous executions if mishandled |
18:43 | <bakkot> | I'm not saying that we should have composite keys instead of hash/equals, just that they solve different problems |
18:45 | <bakkot> | (I am separately somewhat skeptical of hash/equals because of the concerns about inconsistency, especially since that seems like it might be a bug farm in engines themselves and not just programs. but that concern is unrelated to this.) |
18:50 | <Mathieu Hofman> | That means the collection correctness is dependent on the stable behavior of its values. I want the ability to create a collection that is resilient against misbehaving values |
18:55 | <Duncan MacGregor> | +1 on pretty much everything rbuckton said. |
18:58 | <rbuckton> | If you do not trust the stability of your keys, then you could use a CompositeKey and pay the overhead for every get/set. If you do trust the stability of your keys, equals/hash lets you avoid that overhead. |
19:00 | <rbuckton> | I have a strong preference for fast, efficient JS. CompositeKey overhead does not seem fast/efficient. |
19:01 | <Mathieu Hofman> | If you do not trust the stability of your keys, then you could use a |
19:01 | <rbuckton> | equals/hash does not preclude CompositeKey , but rather is the building block a CompositeKey could be built on. It's also the building block using a Uri as a key could be built on too. |
19:02 | <bakkot> | It's not something you could build CompositeKey on because CompositeKey has === equality |
19:02 | <bakkot> | and === will never invoke a userland equals |
19:03 | <shu> | i missed that there was a configuration of the new direction that has new objects that have special ===? |
19:03 | <rbuckton> | If we could have CompositeKey have === equality, why would we not have R&T have === equality? |
19:03 | <shu> | that isn't any more acceptable than the R&T's overloading of === |
19:03 | <bakkot> | i missed that there was a configuration of the new direction that has new objects that have special ===? |
19:03 | <littledan> | can we change indexOf just for these new objects? |
19:04 | <bakkot> | it was not one ACE advocated for |
19:04 | <shu> | the CompositeKey thing he presented, with interning, which is already being done in userland |
19:04 | <bakkot> | no I mean they do interning |
19:05 | <Ashley Claymore> | can we change indexOf just for these new objects? |
19:05 | <shu> | no I mean they do interning |
19:05 | <shu> | i see, ok |
19:05 | <bakkot> | https://github.com/benjamn/immutable-tuple/blob/485b32326349cb0329c749090cebf43f8359fa12/src/tuple.js#L12-L19 |
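A simplified editor's sketch of the interning technique in the userland implementation linked above (not the library's actual code): tuples are deduplicated through a trie of Maps, so plain === works on the returned objects.

```javascript
// Intern tuples in a trie keyed by their elements. The same sequence of
// items always walks to the same node, which caches one frozen array, so
// tuple(1, "a") === tuple(1, "a") by ordinary object identity.
// Note: with plain Maps every interned tuple is retained forever — the GC
// concern raised in the discussion (a real implementation would need
// WeakMaps and/or a FinalizationRegistry).
const root = { children: new Map(), result: null };

function tuple(...items) {
  let node = root;
  for (const item of items) {
    if (!node.children.has(item)) {
      node.children.set(item, { children: new Map(), result: null });
    }
    node = node.children.get(item);
  }
  if (node.result === null) node.result = Object.freeze(items.slice());
  return node.result; // same object for the same sequence of items
}

console.log(tuple(1, "a") === tuple(1, "a")); // true
console.log(tuple(1, "a") === tuple(1, "b")); // false
```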
19:06 | <rbuckton> | Isn't the discussion around CompositeKey predicated on R&T not having === equality? I have concerns about the GC overhead associated with CompositeKey using something like weakmaps/maps as either CompositeKey("a") leaks forever, or it requires a FinalizationRegistry and the GC overhead for tracking/collection the object. |
19:06 | <littledan> | > Isn't the discussion around CompositeKey predicated on R&T not having === equality? |
19:06 | <shu> | rbuckton: that was also my confusion. the resolution is that CompositeKeys have interning semantics built in, they return deduplicated objects such that === just works like object ===, because it's literally deduplicated |
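For illustration, a minimal sketch of the interning behavior shu describes, in the spirit of the userland libraries linked above. The name `compositeKey` and the Map-trie layout are invented for this sketch; note that this naive version retains components forever, which is exactly the GC concern raised in the surrounding discussion.

```javascript
// Interning sketch: a trie of Maps keyed on each component. Equivalent
// component lists reach the same leaf, so the returned object is
// deduplicated and plain === "just works".
// (Real userland versions use WeakMaps for object components so entries
// can be collected; this strong-Map version leaks, as discussed.)
const root = new Map();
const LEAF = Symbol("leaf");

function compositeKey(...components) {
  let node = root;
  for (const c of components) {
    if (!node.has(c)) node.set(c, new Map()); // Map keys use SameValueZero
    node = node.get(c);
  }
  if (!node.has(LEAF)) node.set(LEAF, Object.freeze({})); // one object per path
  return node.get(LEAF);
}

console.log(compositeKey("a", 1) === compositeKey("a", 1)); // true
console.log(compositeKey("a", 1) === compositeKey("a", 2)); // false
```

Because the leaf object is literally the same object for equivalent inputs, it works unchanged as a key in today's Map, Set, or WeakMap.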
19:06 | <bakkot> | or you forbid having a CompositeKey that doesn't contain an object, which is what userland does |
19:06 | <rbuckton> | Interning semantics will have GC overhead |
19:06 | <shu> | yes indeedy |
19:07 | <shu> | but it's like, deterministic overhead that's always there, instead of non-deterministic overhead that engines have to tune for and hope they get it right for the largest swath of the code that runs |
19:07 | <rbuckton> | > or you forbid having a CompositeKey that doesn't contain an object, which is what userland does
So CompositeKey(obj, "a") works, but CompositeKey("a", "b") does not? That seems like a very unstable design. |
19:07 | <shu> | which is a better place to be in, but... definitely still overhead, yes |
19:07 | <Ashley Claymore> | I dropped it after presenting the slides to some other people, they felt it complicated the question I was asking |
19:08 | <Mathieu Hofman> | I believe that is NOT what was presented. #[1] !== #[1] |
19:08 | <rbuckton> | I'm concerned about the overhead of 1000s of map.get(CompositeKey(a, b)) in a loop. |
19:08 | <eemeli> | I kinda like the idea of being able to do something like `new Map([], { compare: 'duck' })` and have that "just work". |
19:08 | <Mathieu Hofman> | but when used in collections, the equality used is the one supporting R&T equality |
19:08 | <bakkot> | > I believe that is NOT what was presented. #[1] !== #[1] |
19:10 | <rbuckton> | > I kinda like the idea of being able to do something like `new Map([], { compare: 'duck' })`
where … |
19:11 | <Mathieu Hofman> | I'm confused about the discussion about CompositeKey here as it's not what was proposed, but presented as background / historical context. A R/T would, unlike CompositeKey, not have === equality. |
19:13 | <rbuckton> | I was under the impression these were discussed as possible directions to consider, with historical context as to prior discussions within TC39. |
19:14 | <Mathieu Hofman> | rbuckton: I don't want to build R&T on top of an equals/hash mechanism IF there is any way that values can influence the hashing mechanism or any part of the program at the time they're added to a collection I created. I am fine if the collection defers to a construction-time provided config. |
19:14 | <Ashley Claymore> | I should have been more clear. I was trying to say Key(v1, v2) === Key(v1, v2) is good in that it doesn't introduce a new form of equality. But from what I have heard from engines re R&T, and from my own research, it seems difficult to make them perform well at scale: the whole app has contention on one global interning table that requires GC hooks to clean up the unused keys. |
19:16 | <rbuckton> | The upside of equals/hash is that it is a known quantity. Numerous languages and runtimes use this, thus the caveats are known. |
19:16 | <rbuckton> | A CompositeKey based on equals/hash doesn't require complex GC semantics. |
19:19 | <rbuckton> | If you have a native CompositeKey with equals/hash, you just make the "default equaler" behavior used by a Map have a specific case for CompositeKey vs === . But then you can make that "default equaler" an actual object from which to build other equalers. |
19:19 | <Mathieu Hofman> | To be more clear: new Map([], { compare: Record.equals }) is fine, and so is new Map([], { compare: customHashCodeThing }), but not new Map([], { compareWithHashCode: true }), which would assume looking up a Record.prototype[Symbol.hashCode] |
19:21 | <Ashley Claymore> | In other plenary sessions it has come up that the language should be as close to fully deterministic as we can, e.g. not exposing a browser-sniffing API. Object.hash feels like it would need to be fully deterministic across engines to meet these goals. Perhaps I have misread committee comments on this goal. |
19:22 | <rbuckton> | > To be more clear: new Map([], { compare: Record.equals }) is fine …
Regarding [Symbol.hashCode]: in well-written applications it's perfectly fine and would serve most users well. If you don't trust your inputs, you could opt out with a custom equaler that doesn't call an @@hashCode. That said, I'm more interested in a general-purpose { equals, hash } mechanism for Map/Set and various other methods that do equality checks than I am in instance-overrideable hash codes. |
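For illustration, a sketch of the `{ equals, hash }` ("equaler") shape being discussed, fed into a toy bucket-based map. `HashMap` and `caseInsensitiveEqualer` are invented names for this sketch, not proposed API.

```javascript
// Illustrative equaler. Contract: if equals(a, b) is true, then
// hash(a) === hash(b) must also hold.
const caseInsensitiveEqualer = {
  equals: (a, b) => a.toLowerCase() === b.toLowerCase(),
  hash: (s) => {
    let h = 0;
    for (const ch of s.toLowerCase()) {
      h = (Math.imul(h, 31) + ch.codePointAt(0)) | 0; // 31*h + c, 32-bit
    }
    return h >>> 0;
  },
};

// Toy chained hash map parameterized over an equaler: the bucket index
// comes from hash(), candidate comparison from equals().
class HashMap {
  constructor(equaler, nBuckets = 64) {
    this.equaler = equaler;
    this.buckets = Array.from({ length: nBuckets }, () => []);
  }
  #bucket(key) {
    return this.buckets[this.equaler.hash(key) % this.buckets.length];
  }
  set(key, value) {
    const bucket = this.#bucket(key);
    const entry = bucket.find((e) => this.equaler.equals(e.key, key));
    if (entry) entry.value = value;
    else bucket.push({ key, value });
  }
  get(key) {
    const entry = this.#bucket(key).find((e) => this.equaler.equals(e.key, key));
    return entry && entry.value;
  }
}

const m = new HashMap(caseInsensitiveEqualer);
m.set("Content-Type", "text/html");
console.log(m.get("CONTENT-TYPE")); // "text/html"
```

The same two-function shape is what .NET's `IEqualityComparer<T>` and Java's `equals`/`hashCode` provide, which is the "known quantity" point made above.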
19:23 | <Ashley Claymore> | I also wonder whether even string hashing is as easy to standardize in 262 as it seems, given that the spec is defined over WTF-16 while runtimes may use different internal encodings |
19:32 | <Mathieu Hofman> | We don't automatically need to expose the result of the internal hashing. We could have special values for compare |
19:33 | <rbuckton> | > In other plenary sessions it has come up that the language should be as close to fully deterministic as we can. e.g. not exposing a browser-sniffing API.
In .NET, you can control whether to use a randomized string hashing algorithm via configuration: https://learn.microsoft.com/en-us/dotnet/framework/configure-apps/file-schema/runtime/userandomizedstringhashalgorithm-element. |
19:34 | <rbuckton> | > We don't automatically need to expose the result of the internal hashing. We could have special values for compare |
19:35 | <Mathieu Hofman> | To follow up on Ashley Claymore 's comment, I would be opposed to any new APIs to expose non determinism to the program. It's fine for these things to remain internal to the engine, but I don't want them to be observable |
19:37 | <Mathieu Hofman> | Or should I say, the new API would have to be extremely well motivated and fairly self contained, like FR/WeakRef, but I don't think hashcode would clear that line |
19:38 | <rbuckton> | For example … |
19:40 | <Mathieu Hofman> | I honestly don't understand why hashLike(obj) { return #[Point, obj.#x, obj.#y]; } is not an acceptable approach. Engines are definitely capable of optimizing this |
19:41 | <rbuckton> | Capable does not mean willing or likely to do so. |
19:43 | <rbuckton> | And that still doesn't let me write a hashtable. I need something I can use as a numeric index with good avalanche properties. Maybe 95% of users don't need this, but it's a headache for the small percent that do need it for a good reason. |
19:48 | <rbuckton> | For example, I had to do quite a bit to work around this limitation when experimenting with the shared structs dev trial. I had a very strong need for a concurrent Map, which required implementing a hash table. It would have been far easier if I could actually generate a hash for a string or an object: https://github.com/microsoft/TypeScript/blob/shared-struct-test/src/compiler/sharing/collections/concurrentMap.ts. Yes, maybe we would want to add a … |
19:55 | <Mathieu Hofman> | Generating a non-deterministic value from a basic value like an object or a string feels like a non-starter for me. I'd claim it's not needed by 99.9999% of JS authors |
19:56 | <rbuckton> | Hash bucket lookup is approximately O(1), while a user-built collection relying on manually comparing tuples is O(n), and if you can't compare tuples using === , it's O(n*m) where m is the number of keys in the tuple. |
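For illustration, the O(n*m) shape being described: a hypothetical custom collection with tuple-like array keys and no hash support has to compare every element of every stored key on each lookup. `LinearTupleMap` is an invented name for this sketch.

```javascript
// Without === on tuples and without a hash, equality is element-by-element.
function tupleEquals(a, b) {
  return a.length === b.length && a.every((x, i) => x === b[i]);
}

// n stored entries * m elements per key, scanned on every get/set.
class LinearTupleMap {
  #entries = []; // array of [tupleKey, value] pairs
  set(key, value) {
    const entry = this.#entries.find(([k]) => tupleEquals(k, key));
    if (entry) entry[1] = value;
    else this.#entries.push([key, value]);
  }
  get(key) {
    const entry = this.#entries.find(([k]) => tupleEquals(k, key));
    return entry && entry[1];
  }
}

const lookup = new LinearTupleMap();
lookup.set(["a", 1], "first");
console.log(lookup.get(["a", 1])); // "first"
```

A hash function would let the collection jump straight to one bucket instead of scanning `#entries`, which is the O(1)-vs-O(n*m) contrast being drawn here.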
20:01 | <rbuckton> | Then call the method Object.nondeterministicHash(obj) or Object.randomHash(obj) or document the non-deterministic nature on MDN. If you were an implementer that internally used murmur3 for string hashing and found you could speed up all applications by a significant percentage by switching to xxhash64, wouldn't you do so? Object.hash() must be explicitly "non-deterministic" from the get-go just so anyone consuming it could reap the same benefits if a better hash algorithm comes along. |
20:03 | <rbuckton> | Case in point, even if you're using the non-random string hash algorithm in .NET, Object.GetHashCode() is still considered to be non-deterministic as upgrading the version can change the hash algorithm to one that is more efficient. |
20:03 | <Mathieu Hofman> | I just don't think there is sufficient motivation to expose this in the language |
20:09 | <rbuckton> | I'd argue that the number of times we've discussed mechanisms for customizing equality indicates sufficient motivation. The main reason I want custom equality for Map/Set at all is performance. You can use them as-is, as long as you're willing to accept the performance and memory overhead introduced by the alternatives. I want custom Map equality so that I don't have to .toString() every Uri I use as a key in a map.get(). Forcing custom collections to operate at O(n), while native collections can have O(1) because they can cheat and do the hashtable lookup, is not the answer when your algorithms are performance-critical. |
20:18 | <rbuckton> | I'm not saying that …
where …
to have any possibility of achieving similar perf to it being a number. |
20:18 | <Mathieu Hofman> | You can customize equality using the hashLike approach I suggested above. It is also optimizable by engines. |
20:19 | <Mathieu Hofman> | The combine hash is exactly what creating a R/T with your components does |
20:20 | <rbuckton> | But not for a custom collection. As I said, native Map /Set could cheat and unwrap a composite key to do a hashtable lookup for O(1), but a custom collection would be O(n) at best. |
20:21 | <rbuckton> | I think a design that ignores custom collections is too short sighted. |
20:22 | <Mathieu Hofman> | Sorry I don't understand how your opaque hash is any different than a R/T |
20:22 | <rbuckton> | The opaque hash is just a number you can't actually see into. |
20:23 | <rbuckton> | The only reason it would be opaque is so that users can't depend on the actual value. |
20:23 | <rbuckton> | If it were a tuple, you couldn't use it as an index into an array of hash buckets. |
20:24 | <rbuckton> | If it were a number, or an opaque value that's actually just a number, then you conceivably could. |
20:24 | <rbuckton> | Numbers are just far easier to reason over since we already have Array and shared structs would have SharedArray . |
20:25 | <rbuckton> | To use a hash code efficiently, I need to be able to compare it using === , shift it using << and >>> , combine it using ^ , and use it as the index into an array. All of these operations are extremely fast using 32-bit integers. |
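The operations listed above compose into the standard hash-combine pattern. A minimal sketch, where the boost-style mixing constant `0x9e3779b9` and the function names are illustrative choices, not from any proposal:

```javascript
// Combine two 32-bit hashes. The add/shift/xor steps provide the
// "avalanche" mixing so nearby inputs scatter across buckets.
function combineHashes(h1, h2) {
  h1 = (h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >>> 2))) | 0; // keep int32
  return h1 >>> 0; // normalize to unsigned 32-bit
}

// Fold a sequence of element hashes into one composite hash.
function hashTuple(hashes) {
  let h = 0;
  for (const x of hashes) h = combineHashes(h, x);
  return h;
}

// The result is directly usable as an array index into hash buckets.
const nBuckets = 256;
const index = hashTuple([0x1234, 0x5678, 0x9abc]) % nBuckets;
console.log(index >= 0 && index < nBuckets); // true
```

Every step here stays in the 32-bit integer range (`| 0`, `>>> 0`, `Math.imul`-style arithmetic), which is what keeps this fast in engines, the point being made about boxed values below.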
20:28 | <rbuckton> | #[a, b, c] has no avalanche properties I can use to ensure proper distribution in a hashtable, and cannot be used as an index, so comparing keys using #[a, b, c] in a custom collection is at least O(n*3), since I must compare the elements of every key in the map to the elements in the provided key. It's terribly inefficient. |
20:32 | <Mathieu Hofman> | > If it were a tuple, you couldn't use it as an index into an array of hash buckets. |
20:33 | <rbuckton> | not across threads |
20:33 | <Mathieu Hofman> | if the "opaque" number value is actually observable by the program, even indirectly, it defeats the purpose of being opaque |
20:34 | <rbuckton> | And that assumes that all JS devs will only ever need Map and Set , and that there are no other collection classes JS does not implement that will ever be needed. |
20:34 | <Mathieu Hofman> | when you said opaque, I had assumed actually opaque, like a unique symbol, or empty object |
20:35 | <Mathieu Hofman> | anything else is not opaque by my definition |
20:35 | <rbuckton> | I don't need it to be opaque. I'm saying the only way an opaque value would work for me is if it could have the same properties I would need from an integer hash code. |
20:36 | <rbuckton> | The problem is that if it's not a 32-bit integer, it becomes a boxed value in most implementations, and those are going to be slower. |
20:49 | <rbuckton> | Let me summarize my thoughts: … |
20:52 | <Mathieu Hofman> | I'm saying that any value that is a source of observable non-determinism does not seem sufficiently motivated for a "keying" mechanism when there are alternatives that cover most of the use cases, except performance, which can be optimized, or shared memory, which is a hypothetical future and could have its own collections API. JS is a high-level language that does not expose low-level/internal details of its objects or memory layout; exposing hash codes would change that |
20:54 | <Ashley Claymore> | The idea with R&T is that, being objects and arrays, in many cases they *are* the data structure, so there is no extra allocation in those cases |
20:56 | <rbuckton> | As I've discovered while discussing array iterator performance, "can be optimized" isn't a useful metric. Implementations won't optimize unless they have very good reason to. Array destructuring has been around for ~9 years and still isn't optimized. |
20:59 | <Ashley Claymore> | If the data model of the application is already using R&T. Inserting these into a map could be the fastest case. Because all the hashing and equality operations are entirely native, with zero entry into userland minimizing guards and allowing JIT. |
20:59 | <Ashley Claymore> | If this really is common, then apps could keep those parts of the data structure separate |
21:00 | <rbuckton> | > If the data model of the application is already using R&T. Inserting these into a map could be the fastest case. Because all the hashing and equality operations are entirely native, with zero entry into userland minimizing guards and allowing JIT.
Map and Set can only have O(1) performance for a tuple as a composite key if they use equals/hash internally. Custom collections cannot do the same. |
21:00 | <Mathieu Hofman> | Custom collections can use Map/Set internally for the vast majority of custom use cases |
21:00 | <Ashley Claymore> | Yes. I said earlier. The implementations must use hash + equals at some layer. I 100% agree that the pattern is efficent |
21:01 | <rbuckton> | custom collections aren't an ephemeral thing, there is production code that uses them. The perf hit is likely the only reason they aren't used more. |
21:01 | <Ashley Claymore> | The custom collections I've needed were wrappers around Maps |
21:01 | <Ashley Claymore> | I'd be interested in hearing about other cases. I'm familiar with the ConcurrentMap one. |
21:07 | <rbuckton> | > The custom collections I've needed were wrappers around Maps
Wrappers have to translate keys in get/set, which can result in unnecessary allocations. You could build a custom map that does case-insensitive string comparisons on keys by passing them through a case-folding algorithm, but that's terribly inefficient. For a case-insensitive comparison you might case-fold the string to compare, but then you're also producing another string in memory. If you want to preserve the input key as written, you have to store both the original key and value as the map entry, resulting in another allocation and duplication, plus the need to wrap keys(), values(), etc. This is all unnecessary overhead. |
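A sketch of the wrapper pattern being described, showing where the extra allocation and the doubled storage come in. `CaseInsensitiveMap` is an invented name for this illustration.

```javascript
// Case-insensitive Map wrapper. Every get/set must allocate a folded copy
// of the key, and preserving the caller's original spelling forces each
// entry to store [originalKey, value] -- the overhead described above.
class CaseInsensitiveMap {
  #map = new Map(); // foldedKey -> [originalKey, value]
  set(key, value) {
    this.#map.set(key.toLowerCase(), [key, value]); // allocates folded string
    return this;
  }
  get(key) {
    const entry = this.#map.get(key.toLowerCase()); // allocates again
    return entry && entry[1];
  }
  *keys() {
    // must unwrap to recover the original spelling
    for (const [orig] of this.#map.values()) yield orig;
  }
}

const headers = new CaseInsensitiveMap();
headers.set("Content-Type", "text/html");
console.log(headers.get("CONTENT-TYPE")); // "text/html"
console.log([...headers.keys()]); // ["Content-Type"]
```

With an `{ equals, hash }` hook on the backing Map, the folded copies and the `[originalKey, value]` wrapper entries would both disappear, which is the comparison being made here.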
21:09 | <Ashley Claymore> | Gotcha. That matches my understanding. It's not that lots of custom collections can't be built, but they will need to do more allocations. |
21:11 | <rbuckton> | Map/Set wrappers can be built, with allocation overhead, but only if their keys are easily translated to something that works as an existing key. Complex objects end up requiring O(n) |
21:12 | <rbuckton> | Even if you are using weakmap/map registries to ensure a composite key has the same identity, that's still O(n) if you're writing map.get(CompositeKey(a, b)) . You're just paying the O(n) cost walking the registry. |
21:13 | <rbuckton> | i.e., for n steps, you're doing an O(1) operation for each step. |
21:13 | <Ashley Claymore> | what is n here? |
21:13 | <rbuckton> | n is the number of elements in the composite key. |
21:14 | <Ashley Claymore> | ok. I thought you were referring to collection size and got confused |
21:14 | <rbuckton> | https://docs.google.com/presentation/d/1JfChmW8tQ2_mrFDynosNqa1tjJ2j-qX6WoKm8vc_tkY/edit#slide=id.g2c6eebea946_0_40 |
21:14 | <rbuckton> | Sorry, I should have said m instead of n for that case. |
21:18 | <littledan> | the composite-ness of composite keys or R&T doesn't actually depend on any new built-in thing, but identity hashcodes continue to not be exposed |
21:20 | <rbuckton> | > we actually added Map and Set as built-ins without having a way that what they do could be done in a custom collection. Of course it would be nice to support efficient maps defined in JS if possible given all of the other goals and constraints, but I don't see why it makes sense to say that no advances should be made for Map and Set without managing to support this other goal.
… Map and Set more efficient. I'm concerned with choosing an API design that caters to native maps and sets that could never be efficiently implemented in a custom collection. We either end up never improving the story for custom collections, or end up with two independent mechanisms for customizing equality. |
21:22 | <rbuckton> | A composite key that uses a weakmap registry just to ensure reference identity for equivalent keys is entirely unnecessary with hash/equals. If we were to then implement hash/equals to improve the custom collection story, then composite keys become an evolutionary dead end for the language. |
21:23 | <littledan> | Is anyone defending CompositeKey? |
21:24 | <littledan> | Object-based R&T seems like a cleaner mechanism and doesn't do anything which should make things harder for custom collections, I think |
21:24 | <bakkot> | I'm not sure what distinction between CompositeKey and object-based R&T you're drawing |
21:24 | <rbuckton> | How is object-based R&T not a composite key? |
21:25 | <rbuckton> | The distinction I would make would be between a WeakMap-registry based composite key, or a Map/Set privileged CompositeKey. |
21:27 | <rbuckton> | A Map/Set-privileged CompositeKey has a way forward to equals/hash, but then why not have equals/hash instead (or at the same time). |
21:29 | <littledan> | > I'm not sure what distinction between CompositeKey and object-based R&T you're drawing |
21:30 | <littledan> | and there's no support for === |
21:30 | <littledan> | I guess the registry was more for ===; WeakMap doesn't actually need it |
21:31 | <littledan> | so I guess you're calling this Map/Set privileged? But I don't see what Map/Set are doing that other collections couldn't also do |
21:32 | <littledan> | > A Map/Set-privileged CompositeKey has a way forward to equals/hash, but then why not have equals/hash instead (or at the same time). |
21:32 | <rbuckton> | I'd also like more details from the folks that are concerned about non-determinism as to why that should matter. It would be guaranteed to be deterministic so long as the app/browser is executing. Assuming no string-random-hash algorithm at the outset, the only thing that would change the hash result would probably be upgrading the browser as a newer runtime could choose a more efficient algorithm. |
21:32 | <bakkot> | my preferred approach is to allow composite objects (i.e. interned objects, that give you === ) as keys in WeakMaps iff they contain at least one thing which could itself be in a WeakMap, and to introduce a canBeWeaklyHeld predicate so that's easier to determine. |
21:32 | <littledan> | why not at the same time was also answered: because there is not consensus to expose hashcodes, due to interop risk |
21:33 | <littledan> | so I guess you want more details... what do you mean by details? |
21:33 | <rbuckton> | > so I guess you're calling this Map/Set privileged? But I don't see what Map/Set are doing that other collections couldn't also do |
21:34 | <Mathieu Hofman> | I do not understand. How is it … |
21:34 | <littledan> | This is something I have trouble understanding. Even if we do want to generalize to hash, the question of "why not instead" was answered -- you want a nice default mechanism for the normal case, as other languages tend to have |
21:34 | <rbuckton> | No, Map/Set privileged would mean a Map would treat a CompositeKey as a special value and essentially do equals/hash natively. Without runtime support for equals/hash, a custom collection cannot do the same. |
21:40 | <Mathieu Hofman> | > I'd also like more details from the folks that are concerned about non-determinism as to why that should matter. It would be guaranteed to be deterministic so long as the app/browser is executing. Assuming no string-random-hash algorithm at the outset, the only thing that would change the hash result would probably be upgrading the browser as a newer runtime could choose a more efficient algorithm.
I want a language where I can deterministically reproduce execution. If you exclude I/O (which encompasses Date.now and Math.random), we currently have that language |
21:41 | <rbuckton> | > hmm, what kind of custom collection are you imagining? I guess I'm not picturing it properly
A Map that takes an { equaler } option, as I'm suggesting. Also https://github.com/microsoft/TypeScript/blob/shared-struct-test/src/compiler/sharing/collections/concurrentMap.ts, which is a lock-free concurrent map that cannot use Map as a backing store, and https://github.com/microsoft/TypeScript/blob/shared-struct-test/src/compiler/sharing/collections/sharedMap.ts, which is a coarse-grained synchronized map, to name a few. In other languages you're likely to see custom collections where specific semantics for get/set are necessary, such as with observable collections in .NET. |
21:44 | <rbuckton> | > I want a language where I can deterministically reproduce execution. If you exclude I/O (which encompasses Date.now and Math.random), we currently have that language
If you're going to exclude Date.now and Math.random, why would you not exclude Object.hash? Could an implementation not have a --hash-algorithm=... flag to ensure a stable hash algorithm for tests? |
21:45 | <rbuckton> | If it's for security, wouldn't a randomized Object.hash be preferred over a deterministic one? |
21:46 | <rbuckton> | Are there any "implementation defined" behaviors you also have to discount? Object.hash would essentially be implementation defined. |
21:48 | <Mathieu Hofman> | But for what purpose is this important? Is it for testing? If you're going to exclude … |
21:48 | <rbuckton> | There's also the possibility we could define hashing as Object.hash(obj, algorithm?) where you can optionally specify "stable" or "fast" (or similar). |
21:50 | <rbuckton> | > We actually use this determinism for replay of program execution, sometimes across engines. Yes we could remove, but then it increases the difference with what programs may expect being in the language. |
21:52 | <Mathieu Hofman> | And how does this differ from the unique reference identity that a CompositeKey might have at runtime? |
21:52 | <rbuckton> | It should be clearly called out in documentation that Object.hash() is not stable across application restarts and shouldn't be serialized. I would expect that code that respects that would work just fine when replaying execution. |
21:52 | <rbuckton> | It isn't deterministic. |
21:52 | <Mathieu Hofman> | how isn't it? observably? |
21:52 | <rbuckton> | It's unobservably non-deterministic. An object reference at runtime can exist in a different memory address each time you execute. |
21:53 | <Mathieu Hofman> | sure but I don't care about internals of the engine that are not observable |
21:53 | <rbuckton> | If your code treats a hash as unobservably non-deterministic, there is no difference. |
21:53 | <Mathieu Hofman> | it's like saying a JS object representation may live at a different pointer, that is just not relevant |
21:53 | <rbuckton> | The problem is that you can't make a hash code opaque without severely impacting the performance. |
21:54 | <rbuckton> | And the whole point is the performance. |
21:54 | <Mathieu Hofman> | I care about what the language exposes, that it not allow JS program to have non deterministic behaviors. I do not control the programs that run in my environment |
21:55 | <rbuckton> | JS allows programs to have non-deterministic behaviors, hence Date.now(), Math.random(). You can only make that assertion about a subset of the language. |
21:57 | <rbuckton> | Would control over the algorithm assuage that concern? Be it from a CLI option, an option to a ShadowRealm, or an optional argument to Object.hash()? |
21:57 | <Mathieu Hofman> | right, and any further non-deterministic behaviors would IMO have to be extremely well motivated, and I am not seeing that from hashCode |
22:13 | <rbuckton> | I would find a custom equality solution that cannot be implemented efficiently in a pure-JS collection to be a serious design flaw in the language. If Map/Set have privileged support for a composite key, that will end up poisoning other APIs over time. It will inevitably be reused by groupBy and distinct, or possibly a future join/crossJoin/leftJoin for iterators. We will move further and further away from the ability for the language to evolve in userland, as no userland implementation could have the same performance characteristics. And we regularly suggest that some proposals instead evolve in userland. |
22:14 | <rbuckton> | I'd like to point out one specific bit. Object.hash() is essentially deterministic for everything except strings. The only non-determinism in strings would be if a more efficient hashing algorithm were to come along. |
22:18 | <rbuckton> | It's possible that we could handle that case with a subtle-like API. Get consensus on a base algorithm for string hashing, and allow users to use a faster algorithm as it comes along. For example, were Object.hash to use murmur3 by default, we could later support xxhash via Object.hash.xxhash32() or Object.hash(v, "xxhash32") . |
22:20 | <rbuckton> | Then the program remains deterministic. If randomized string hashing becomes important, make that an option but add that narrow bit to the set of things you don't support if you want determinism instead of throwing out the whole feature. |
22:27 | <rbuckton> | Correction, strings are not the only case. Object.hash({}) might also need to be non-deterministic to avoid introducing a side-channel. I have to think on that a bit more. |
22:28 | <rbuckton> | It could just be that Object.hash({}) always returns zero or throws, and you have to work around that yourself, but that would be unfortunate. |
22:40 | <rbuckton> | IMO, I'd rather have a non-deterministic Object.hash() and better support for time-travel debugging across engines to handle the "replayable program execution" case. |
23:12 | <Bradford Smith> | Mathieu Hofman: It's my understanding that iteration over Map() and Set() is always in the order in which items were added. This wouldn't be affected by allowing a userland-provided hashCode form of equals, would it? Are you thinking of some other way in which non-determinism would creep in? Are you concerned about userland bugs causing the hashCode for a mutable object to change when the object is mutated? |
23:16 | <Mathieu Hofman> | > Mathieu Hofman: It's my understanding that iteration over Map() and Set() is always in the order in which items were added. This wouldn't be affected by allowing a userland-provided hashCode form of equals, would it? Are you thinking of some other way in which non-determinism would creep in? Are you concerned userland bugs causing the hashCode for a mutable object to change when the object is mutated? |
23:28 | <Bradford Smith> | Ok, so the issue isn't that the proposed use of … Is that right? |
23:50 | <Mathieu Hofman> | My understanding is that for hashCode to be effective, some non-determinism will become observable. That non-determinism comes from APIs that would need to be introduced to hash "root" values, like strings or the identity of some values (e.g. unique symbols) |