WHATWG on 2024-02-02

06:43	<annevk>	mfreed: I think it would be nice if you or we together drafted a small blog post around the shadow tree changes
08:07	<sideshowbarker>	Regarding trailing (not leading) characters in floating-point number values: Does the parsing algorithm at https://html.spec.whatwg.org/multipage/common-microsyntaxes.html#rules-for-parsing-floating-point-number-values require implementations to ignore all trailing characters that are not ASCII digits?
08:35	<annevk>	sideshowbarker: yeah, it ends up ignoring essentially everything at a certain point. Whereas before that it is quite strict.
08:36	<sideshowbarker>	OK, thanks
08:37	<sideshowbarker>	That complicates dealing with U+000B… Implementation-wise, it would be easier to just reject it everywhere
08:46	<annevk>	sideshowbarker: I left a comment on your implementation. You need to go deeper. :-)
08:47	<annevk>	The source of the problem is WebKit's (copied by Chromium) string to double operations. They do a bit too much.
08:51	<sideshowbarker>	Yeah, I’m discovering that now…
08:52	<annevk>	sideshowbarker: It might be a bit much though so not handling U+000B correctly for now would be reasonable too.
08:52	<sideshowbarker>	OK
08:53	<annevk>	Ideally string to double would just do the minimal thing. It would progress some character pointer and return failure or a number. And then after that the caller gets to decide whether to ignore trailing characters or not.
08:54	<annevk>	And the caller also gets to decide where the character pointer starts (i.e., whether to skip whitespace and what type of whitespace beforehand).
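A minimal sketch of the split annevk describes in the two messages above: a core routine that only advances a position and returns failure or a number, with each caller choosing its own whitespace and trailing-character policy. All names here (`scan_double`, `skip_chars`, `parse_lenient`, `parse_strict`) are hypothetical and are not WebKit's `parseDouble()` API. The number grammar is only loosely modeled on HTML's floating-point rules and is not the spec algorithm.

```python
import re
from typing import Optional, Tuple

# Assumed grammar for the numeric core only: optional sign, digits with an
# optional fraction (or a fraction alone), optional exponent. No whitespace.
_NUMBER = re.compile(r"[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)(?:[eE][+-]?[0-9]+)?")

# HTML's ASCII whitespace: tab, LF, FF, CR, space. U+000B is not in this set.
HTML_ASCII_WHITESPACE = "\t\n\f\r "


def scan_double(text: str, pos: int = 0) -> Optional[Tuple[float, int]]:
    """Minimal core: parse a number starting exactly at pos.

    Returns (value, position just past the number), or None on failure.
    Skips no whitespace and makes no judgment about what follows.
    """
    match = _NUMBER.match(text, pos)
    if match is None:
        return None
    return float(match.group()), match.end()


def skip_chars(text: str, pos: int, chars: str) -> int:
    """Advance pos past any run of characters drawn from chars."""
    while pos < len(text) and text[pos] in chars:
        pos += 1
    return pos


def parse_lenient(text: str) -> Optional[float]:
    """Caller policy 1: skip HTML ASCII whitespace, ignore trailing characters."""
    result = scan_double(text, skip_chars(text, 0, HTML_ASCII_WHITESPACE))
    return None if result is None else result[0]


def parse_strict(text: str) -> Optional[float]:
    """Caller policy 2: no leading whitespace, whole input must be consumed."""
    result = scan_double(text)
    if result is None or result[1] != len(text):
        return None
    return result[0]
```

With this split, `parse_lenient(" 1.5abc")` returns `1.5`, while `parse_lenient("\x0b1.5")` and `parse_strict("1.5 ")` both return `None`. The U+000B decision lives entirely in the caller's whitespace set rather than in the shared core.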
08:57	<sideshowbarker>	I had already thought about changing `parseDouble()` to disallow U+000B as leading whitespace, but figured the problem with hard-coding it that way is that it could regress other existing code that expects `parseDouble()` to allow U+000B. So yeah, ideally it would need to be made configurable, as you said in your comment.
09:02	<annevk>	sideshowbarker: It looks like the other callers are in JSC. You could ask in the WebKit JSC channel maybe.
09:02	<sideshowbarker>	OK
09:03	<sideshowbarker>	For now I guess I may also go ahead and hard-code the `parseDouble()` code to disallow U+000B, and see what that breaks
09:03	<sideshowbarker>	(just locally, I mean)
09:04	<annevk>	Yeah seems reasonable to try. JavaScript should be concerned about more whitespace than just ASCII anyway, but maybe that's handled separately?
09:04	<sideshowbarker>	maybe so… we’ll see, anyway
09:05	<annevk>	CSS seems to have mostly its own conversion, which seems suboptimal.
12:53	<hsivonen>	annevk: Am I counting correctly that there are 16 ASCII characters that the URL Standard allows in a domain but STD3 does not? The list is surprising. Also, some characters on that list don't go to DNS resolution in Firefox but to the search engine if typed into the URL bar. How did you derive the forbidden domain code point list?
12:57	<annevk>	hsivonen: it's an attempt to be as reasonable as possible to non-DNS systems which Ryan Sleevi deemed important (and probably are in certain deployments, though unclear to what extent they have non-DNS names as that is hard to find out)
12:57	<annevk>	I don't know the exact numbers offhand though, definitely not at this point
12:59	<hsivonen>	annevk: I see. I'm wondering if it's a good idea for an IDNA library to have the UTS 46 flag of `UseSTD3ASCIIRules` where `false` means anything goes, or if an IDNA library should have an ASCIIRules parameter that takes STD3 or WHATWG.
13:00	<annevk>	hsivonen: https://github.com/whatwg/url/issues/397
13:02	<annevk>	That's a good question and I'm not sure. I don't really know what email does for instance. It would be very nice if the library could just be "domain to ASCII" and "domain to Unicode" without any kind of configuration. But we might still be too much in a state of flux.
13:02	<hsivonen>	Today, I've been wondering if I should ask UTS 46 to document the use cases for its tunables.
13:03	<hsivonen>	I'm rather unhappy about how the spec definition of the UTS 46 STD3 stuff is so much more complicated than what ICU4C does. I've spent way too much time designing data structures from the spec.
13:04	<hsivonen>	I'm also a bit unhappy about it taking me so long to realize that the next step in the URL Standard after the UTS 46 integration point provides a filter somewhat similar to ICU4C's STD3 filter. (But I didn't notice it by reading just the UTS 46 integration language in the URL Standard.)
13:05	<annevk>	I've attempted to influence UTS 46 quite a bit, but the process is still quite opaque to me and I don't always understand the decisions they make. Nor are they explained to me.
13:06	<annevk>	hsivonen: that's good feedback. We should probably move step 7 of the host parser to domain to ASCII.
13:07	<hsivonen>	I'll file a couple of URL Standard issues.
14:34	<emilio>	annevk: anything I need to do to move https://github.com/whatwg/html/pull/10067 forward?
14:34	<emilio>	It blocks some other fixes I want to do in that area
15:12	<annevk>	emilio: was that blocked on jarhar agreeing perhaps?
15:13	<annevk>	emilio: yeah I would like jarhar to agree so we don't need to go back and forth
15:14	<annevk>	But maybe two weeks is sufficient time. emilio could you wait until Monday morning? If not, happy to merge it now I suppose.
15:31	<emilio>	Sure, that's alright with me :)
15:45	<jarhar>	done, sorry that took so long
15:46	<jarhar>	also fyi i am adding more tests to the user-valid tests for checkboxes here: https://github.com/web-platform-tests/wpt/pull/44354