WHATWG on 2024-03-13

01:24	<Domenic>	Seems like the sort of thing where if your concern was interop, step 1 would be an exhaustive test suite.
01:29	<sideshowbarker>	I guess my biggest concern is whether we can maybe prevent other implementors from spending time evaluating that algorithm, if it’s not actually important or useful for it to actually be implemented as-is to the letter of the spec.
01:29	<sideshowbarker>	It’s not clear to me at least what the algorithm is actually based on
01:31	<sideshowbarker>	For example, was it written based on reading the double definition in the IEEE 754 spec and then attempting to put together an algorithm for parsing that? Or else was it written by looking at existing parsing code for double-parsing functions (`strtod` or whatever)?
01:35	<sideshowbarker>	I would personally be happy with us just adding a non-normative Note to the end of that algorithm, saying something like: “Note: In practice, rather than handcrafting an implementation of the above algorithm, most existing implementations use double-parsing functions from libraries such as double-conversion and fast_float.” …or whatever similar wording we might be able to get agreement on.
01:46	<sideshowbarker>	I think it’s also worth noting that the ES spec doesn’t rely on the HTML floating-point algorithm for double parsing, and I think the CSS spec doesn’t either. And so also worth noting that because of that, implementations do double-parsing in places in their code other than just for HTML attribute values — notably, in the JavaScript-handling sources, and in the CSS sources. And so, in the engine sources, implementations have common/shared code for double-parsing that’s called into from the HTML-attribute parsing code, and the JavaScript parsing code, and the CSS parsing code. Given all that, it seems very unlikely that any engine over the long run is going to have a specific implementation of the HTML floating-point algorithm that’s separate from their shared double-parsing code. (I realize that Ladybird does now, but I think that’s likely to change eventually — for various reasons, maybe including performance.)
01:53	<sideshowbarker>	I’m personally happy with the existing level of WPT coverage that we have for this — with https://github.com/web-platform-tests/wpt/pull/44355 now merged. What I’m less happy about is the effect it may have in causing implementors to be unaware that existing engines don’t implement the algorithm as-is, and in causing implementors to potentially waste time.
01:55	<Domenic>	I mean, in general it's pretty rare to implement spec algorithms as-is, especially for low-level stuff like numbers and strings. https://infra.spec.whatwg.org/#algorithm-conformance and all that.
01:58	<sideshowbarker>	True, but in most cases what’s implemented in engines is an algorithm that’s handcrafted to be a workalike that’s functionally equivalent to the spec algorithm — rather than instead being implemented by just calling some code in a third-party library that you don’t know actually fully conforms to the requirements in the spec as written.
02:00	<sideshowbarker>	Anyway, I don’t mean to beat this into the ground and I’m not bringing it up to be pedantic about it — instead, I’m just wondering whether it’s a place where we might be able to save implementors some trouble by putting a little more information in the spec, even if just a non-normative note.
02:00	<sideshowbarker>	And if so, I’d be very happy to raise a PR for it.
02:01	<Domenic>	I guess I'd personally like to hear if the implementers of the relevant parts of the browser were confused by the spec, or not. I guess we have one testimonial from yourself, but more would be helpful before concluding it's a problem.
02:05	<sideshowbarker>	Fair enough
08:54	<Ms2ger>	I assume all the implementations besides Servo and Ladybird long predate the spec
08:55	<sideshowbarker>	yeah I reckon so
09:01	<annevk>	I think what is first- and third-party code can shift over time and it's not really the job of the specification to go into the weeds about that. If you find a library or function call that happens to match the requirements in the specification and passes all the tests, and is better in some measurable way over what you had before, more power to you. From what Jeffrey wrote about ISO C, that at least doesn't match the spirit of the HTML language, as it allows for less precision. HTML in theory also allows for that due to the overarching "limits may apply", but also encourages implementers to push those limits. I'm not sure a note would really help with this as it would have to go into the weeds as I have done here to properly convey all the nuances.
09:02	<sideshowbarker>	Yeah, as far as a note, I can imagine that it would be challenging to get the wording right
09:11	<sideshowbarker>	Also by the way, I realize I misspoke a bit about something: While it’s true that Blink and Gecko and WebKit use double-conversion or fast_float — they don’t _just_ use those. Instead they have to do preprocessing to skip ASCII whitespace — not Unicode whitespace, and specifically not U+000B, and maybe to skip/ignore any leading plus sign. fast_float uses a `from_chars` implementation rather than `strtod` — and `from_chars` on its own per-spec doesn’t skip/ignore leading whitespace. The docs say it also doesn’t skip/ignore a leading plus sign, but it seems to me that maybe the fast_float `from_chars` at least actually does. And I’m not sure if the double parser in double-conversion skips leading plus signs and whitespace — but if it does skip whitespace, it would do it for Unicode whitespace, not the ASCII whitespace subset. So anyway, to conform to the HTML algorithm, engines using any third-party libraries would need to do preprocessing on the strings — to skip/ignore the right kind of whitespace and (possibly) the plus sign.
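[Editor's sketch] The preprocessing described above could look roughly like the following. This is a minimal sketch, not engine code: `parseHtmlFloat` is a hypothetical name, JavaScript's built-in `Number()` stands in for a third-party parser such as fast_float, and the sketch only illustrates the whitespace and sign handling rather than claiming to cover every step of the HTML algorithm.

```typescript
// Hypothetical helper, for illustration only.
// ASCII whitespace is TAB, LF, FF, CR, and SPACE. U+000B (vertical tab) and
// Unicode spaces such as U+00A0 are deliberately NOT skipped.
const ASCII_WHITESPACE_PREFIX = /^[\t\n\f\r ]+/;

// Optional sign, then digits with optional fraction (or a leading-dot
// fraction), then an optional exponent. Trailing characters are ignored.
const FLOAT_PREFIX =
  /^([+-]?)((?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)(?:[eE][+-]?[0-9]+)?)/;

function parseHtmlFloat(input: string): number | null {
  // Step 1: skip leading ASCII whitespace only.
  const trimmed = input.replace(ASCII_WHITESPACE_PREFIX, "");

  // Step 2: isolate the sign and the numeric prefix ourselves, so the
  // underlying parser never sees whitespace, a plus sign, or trailing junk.
  const match = FLOAT_PREFIX.exec(trimmed);
  if (match === null) return null;

  // Step 3: hand the bare magnitude to the general-purpose double parser.
  // Number() is an assumption standing in for a library routine.
  const magnitude = Number(match[2]);
  if (!isFinite(magnitude)) return null;

  return match[1] === "-" ? -magnitude : magnitude;
}
```

With this split, an input that starts with U+000B fails at step 2 instead of being accepted by a parser that uses a broader whitespace definition.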
09:13	<sideshowbarker>	(and with that I’ll be quiet, and go back to trying to figure out how to correctly handle find-in-page for closed `details` that are nested…)
13:18	<Dominic Farolino>	How does one "copy" or "clone" an infra struct? There are several definitions for "clone" in infra, but none specifically for structs. Can we just say "copy" or "clone" manually?
13:21	<Ms2ger>	I'd ask infra for a definition
13:23	<annevk>	Hmm, maybe URL.parse() should be added: https://twitter.com/kilianvalkhof/status/1765312128188088454 (I disagree with the assertion there, but it seems reasonable to have a URL-or-null abstraction)
15:08	<Noam Rosenthal>	Dominic Farolino: usually you clone a struct manually. Often enough some special processing needs to be done on one or more of the items (e.g. if one of the items is a list, do you want to clone the list or pass it by reference?)
15:11	<annevk>	It seems reasonable to define a shallow clone for structs. We have that for lists and maps. Thus far nobody needed it for structs I guess. Should be a fairly straightforward PR.
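[Editor's sketch] To make the shallow-versus-manual distinction concrete, here is a minimal sketch that models an Infra struct as a plain TypeScript object. The `Entry` shape and the `shallowCloneStruct` name are hypothetical, and Infra defined no struct clone operation at the time of this log.

```typescript
// Hypothetical struct with one scalar item and one list item.
interface Entry {
  name: string;
  tags: string[];
}

// Shallow clone: a new struct whose items are the same values as the
// original's items. A list item is therefore shared, not copied.
function shallowCloneStruct<T extends object>(struct: T): T {
  return { ...struct };
}

// Manual clone: the caller decides, per item, whether to copy or share.
// Here the list item is copied so the two structs no longer alias it.
function cloneEntryWithOwnTags(entry: Entry): Entry {
  return { name: entry.name, tags: entry.tags.slice() };
}
```

The second function is the kind of per-item decision that a generic shallow clone cannot make on the caller's behalf.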
15:36	<TabAtkins>	[in reply to annevk's 13:23 message suggesting URL.parse()] Yes, having to use a try block every time you want to parse a URL is indeed very frustrating. If we had an expression-level way to catch an error and return a value, it wouldn't be as big of an issue, but in the absence of JS having that, we absolutely should have a non-throwing way to parse a URL (returning null on failure, definitely).
15:41	<annevk>	Reopened https://github.com/whatwg/url/issues/372 cc Adam Rice
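[Editor's sketch] The "URL-or-null abstraction" discussed above can be approximated in userland by wrapping the throwing `URL` constructor. `parseUrlOrNull` is a hypothetical name chosen for this sketch; it is not the API under discussion, only an illustration of the behavior being asked for.

```typescript
// Hypothetical helper: returns a URL on success and null on failure,
// so callers can stay in expression position instead of writing try/catch.
function parseUrlOrNull(input: string, base?: string): URL | null {
  try {
    return new URL(input, base);
  } catch {
    // The URL constructor throws a TypeError for unparseable input.
    return null;
  }
}
```

A caller can then write something like `parseUrlOrNull(userInput)?.hostname ?? "unknown"` without a statement-level `try` block.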
15:57	<Jeffrey Yasskin>	[in reply to sideshowbarker's 01:46 message about engines sharing one double-parsing routine across HTML, JS, and CSS] FWIW, +1 to having a single double-parsing algorithm that all of ES, CSS, and HTML can use ... if that's web-compatible. See also https://github.com/whatwg/infra/issues/189.
15:59	<annevk>	I'm not sure we should share with ES until we know how the long term number types thing plays out. It came up before and the main reason not to do it was to preserve infinite precision, which seems like a worthwhile goal for a high-level language.
16:00	<annevk>	CSS & HTML I can see though.
16:15	<Jeffrey Yasskin>	I don't feel strongly about the details, but we could explicitly divide the algorithm into 2 pieces: First we parse the string into an infinite-precision real number, which pins down syntax like whitespace and leading-plus behavior. Then we define the Real->IEEE 754 conversion, which establishes that 0 ULPs of error are allowed, and the round-to-even behavior. If ES wants to preserve infinite-precision arithmetic for a while after the string is parsed, that's fine; they just only call the first algorithm.
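[Editor's sketch] The two-piece split proposed above might be shaped like this. Everything here is an assumption made for illustration: the `ExactDecimal` type, both function names, and the grammar. The first function keeps the value exact as a digit string plus a power of ten. The second delegates rounding to JavaScript's `Number()`, which stands in for a real conversion step and is not itself a specification of the 0-ULP, round-to-even requirement.

```typescript
// Exact value = (negative ? -1 : 1) * digits * 10^exponent, with no rounding.
interface ExactDecimal {
  negative: boolean;
  digits: string; // decimal digits, arbitrary length
  exponent: number; // power of ten applied to digits
}

// Groups: 1 sign, 2 integer part, 3 fraction after integer part,
// 4 fraction after a leading dot, 5 signed exponent.
const REAL_SYNTAX =
  /^[\t\n\f\r ]*([+-]?)(?:([0-9]+)(?:\.([0-9]*))?|\.([0-9]+))(?:[eE]([+-]?[0-9]+))?/;

// Piece 1: syntax only. Pins down whitespace and leading-plus behavior
// and produces an exact (infinite-precision) value.
function parseExactDecimal(input: string): ExactDecimal | null {
  const m = REAL_SYNTAX.exec(input);
  if (m === null) return null;
  const intPart = m[2] ?? "";
  const fracPart = m[3] ?? m[4] ?? "";
  const writtenExponent = m[5] === undefined ? 0 : Number(m[5]);
  return {
    negative: m[1] === "-",
    digits: intPart + fracPart,
    // Moving the fraction digits into the integer shifts the exponent down.
    exponent: writtenExponent - fracPart.length,
  };
}

// Piece 2: exact value to IEEE 754 double. Number() is a stand-in here.
function exactDecimalToDouble(real: ExactDecimal): number {
  const sign = real.negative ? "-" : "";
  return Number(sign + real.digits + "e" + real.exponent);
}
```

A consumer that wants to keep exact arithmetic for longer would call only the first function, which is the arrangement suggested for ES in the message above.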
18:01	<judge_sour_dough_bread>	Hi all. The HTML spec says, under §4.13.4, "The `CustomElementRegistry` interface", for the "Element definition" list of steps, specifically step 18: "Let upgrade candidates be all elements that are shadow-including descendants of document, whose namespace is the HTML namespace and whose local name is localName, in shadow-including tree order. Additionally, if extends is non-null, only include elements whose is value is equal to name." This suggests that e.g. `define("foo-bar", class FooBarElement extends HTMLElement { /* ... */ })` (autonomous custom element) will *not* "upgrade" elements like `<span is="foo-bar"><!-- ... --></span>` (`span` can be replaced with any other known HTML element for the sake of the example), correct? To explain how I have assumed this: the `name` for the element is `foo-bar`, after all (established at the outset of the aforementioned list of steps), while `localName` is the same as `name` (step 5) since `extends` is null, and so only `<foo-bar><!-- ... --></foo-bar>` element(s) in the document will be upgraded. Can someone tell me whether my reading of the spec is correct?
20:30	<Noam Rosenthal>	judge_sour_dough_bread: seems right, you need to have an `extends` option to make this into a customized built-in element.
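[Editor's sketch] The reading confirmed above can be modeled as a small predicate. This is a simplified model of the quoted step only: `ElementLike` and `isUpgradeCandidate` are hypothetical names, and the namespace check and the shadow-including tree walk are left out.

```typescript
// Minimal stand-in for an element: its local name and its "is value"
// (the value of the is attribute at creation, or null if absent).
interface ElementLike {
  localName: string;
  isValue: string | null;
}

// name: the custom element name passed to define().
// extendsName: the `extends` option, or null for an autonomous element.
function isUpgradeCandidate(
  element: ElementLike,
  name: string,
  extendsName: string | null
): boolean {
  // localName is `extends` when given, otherwise the same as name.
  const localName = extendsName ?? name;
  if (element.localName !== localName) return false;
  // Only customized built-in definitions look at the is value.
  if (extendsName !== null && element.isValue !== name) return false;
  return true;
}
```

Under this model, `<span is="foo-bar">` matches only when the definition was registered with `extends: "span"`, which is the conclusion reached in the exchange above.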