TC39 Delegates on 2024-02-14

02:21	<rbuckton>	I just ran across a strange case while writing additional tests for RegExp Modifiers. I've found exactly two cases where `/\b/u` and `/\b/ui` disagree for the same character: U+017F (ſ, LATIN SMALL LETTER LONG S) and U+212A (K, KELVIN SIGN). A quick test of the same patterns and inputs in C# shows no disagreement, so it's not clear to me whether this is expected or possibly a bug in `\b`.
02:31	<rbuckton>	possibly having to do with how Unicode case folding for those characters produces an ASCII character. It just seems strange to have something that is not considered a word character when preserving case, but is considered a word character when ignoring case.
02:39	<bakkot>	the original sin here is that `\b` and `\w` are not unicode-aware even in `u` mode
02:40	<bakkot>	this behavior follows immediately from that: `U+017f` is not an ascii word character, but it case-folds to `s`, which is, and `i` means that the regex operates on case-folded characters
02:40	<bakkot>	the decision to make `\b` and `\w` not unicode-aware predates me, unfortunately, so I cannot tell you why this is. it does seem... bad.
02:40	<bakkot>	(`\d` too but that one matters a lot less.)
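The disagreement described above can be reproduced with a short script. This is only a sketch: `samples`, `probe`, and `results` are names made up for illustration, and the values in the comments are what the ECMAScript rules explained above imply (ASCII-only `\w` and `\b`, with matching on case-folded characters under `ui`), so an engine that deviates from the spec may print something else.

```javascript
// The two characters reported above:
// U+017F LATIN SMALL LETTER LONG S and U+212A KELVIN SIGN.
const samples = ["\u017f", "\u212a"];

// Hypothetical helper (not from the discussion): test one character
// against \w and \b, with and without the `i` flag, always in `u` mode.
function probe(ch) {
  return {
    codePoint: "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0"),
    wordU: /\w/u.test(ch),       // false: not an ASCII word character
    wordUI: /\w/ui.test(ch),     // true: case-folds to "s" or "k"
    boundaryU: /\b/u.test(ch),   // false: no word character on either side
    boundaryUI: /\b/ui.test(ch), // true: the character now counts as a word character
  };
}

const results = samples.map(probe);
for (const r of results) {
  console.log(r.codePoint, r);
}
```

Both characters are non-ASCII but case-fold to an ASCII letter, which is why they are the only two for which `u` and `ui` disagree.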
03:19	<Justin Ridgewell>	Time to introduce a `w` flag for very very unicode mode?
03:27	<bakkot>	we actually did specifically discuss and reject the possibility of making `\b` etc unicode-aware in `v`-mode https://github.com/tc39/notes/blob/2fccc7f7a38201354a007394ab867ec7b245b464/meetings/2021-08/aug-31.md#regexp-set-notation--properties-of-strings
04:59	<Justin Ridgewell>	"JRL: Also voicing support, I would not change these shorthands." I do not remember this
05:31	<rbuckton>	I think waldemar's concern at the time was that changing `\b`, `\w`, and `\d` shouldn't be tied to the mode that adds set notation. We'd need to opt in either with a new mode or a `{u}` suffix. Either are fine so long as the new mode could be included in the modifiers list, i.e., `\b{u}` or `(?w:\b)` (or whatever flag we'd use) would work for those cases.
05:38	<rbuckton>	Oh, I guess I mentioned modifiers during that discussion as well.
15:40	<Richard Gibson>	"the decision to make `\b` and `\w` not unicode-aware predates me, unfortunately, so I cannot tell you why this is. it does seem... bad." https://github.com/tc39/proposal-regexp-unicode-property-escapes/issues/22#issuecomment-279930140 There was a pre-ES6 proposal to change the meaning of `\w`, `\d`, and `\b` in Unicode mode. It was ultimately rejected out of fear it would hurt adoption of the `u` flag. (https://github.com/tc39/proposal-regexp-unicode-property-escapes/issues/22 is the [failed] attempt to make those escapes Unicode-aware under the `v` flag)
15:57	<shu>	who can add new members to the tc39 organization on GH?
15:57	<shu>	i'd like to add a V8 bot account for the purposes of test262 2-way sync. i can add the account to the right teams but first it has to be part of the tc39 organization, apparently
16:07	<ljharb>	done.