11:56 | <annevk> | emilio: so :active/:hover use the flat tree, but :has() uses the node tree? Is using the flat tree for selector matching not really expensive? |
12:26 | <emilio> | Not that it would be impossible to do but it'd be weird |
12:28 | <annevk> | emilio: yeah never mind, I guess it all makes sense. And :has can't really use the flat tree as that'd break encapsulation, I think. |
12:30 | <emilio> | Right |
15:48 | <Domenic> | Thinking of trying to tackle https://github.com/whatwg/dom/issues/849 again, or at least make progress ... is there any easy way to know what element names the HTML parser accepts? Or do I have to walk through various parser states? |
15:49 | <Domenic> | I guess looking at https://html.spec.whatwg.org/#tag-open-state + tag name state is not too bad... |
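For reference, a rough Python sketch of which names the tag open state and tag name state would produce, as read from those two states. The helper name `parser_tag_name` is hypothetical. The sketch ignores EOF handling and assumes the input has already been through newline preprocessing, so it never sees a raw CR.

```python
import string

# Characters that end the tag name in the tag name state.
TERMINATORS = {"\t", "\n", "\f", " ", "/", ">"}


def parser_tag_name(after_lt: str):
    """Return the tag name built from the text following '<', or None.

    Hypothetical helper: models only the tag open and tag name states.
    """
    # Tag open state: only an ASCII alpha starts a start tag.
    if not after_lt or after_lt[0] not in string.ascii_letters:
        return None
    name = []
    for ch in after_lt:
        if ch in TERMINATORS:
            break
        if ch == "\0":
            # NULL is a parse error and becomes U+FFFD.
            name.append("\ufffd")
        elif ch in string.ascii_uppercase:
            # Only ASCII uppercase letters are lowercased.
            name.append(ch.lower())
        else:
            # Anything else is appended unchanged.
            name.append(ch)
    return "".join(name)
```

Under these assumptions the first character is restricted to ASCII alpha, while later characters are nearly unrestricted: `parser_tag_name("a$b c")` gives `"a$b"`, but `parser_tag_name("$a>")` gives `None`.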
16:14 | <Sam Sneddon [:gsnedders]> | can someone who understands how event dispatch is specified comment on https://bugs.webkit.org/show_bug.cgi?id=234730? because I've utterly confused myself now. |
16:27 | <annevk> | Domenic: can element names contain > today? That seems problematic |
16:27 | <annevk> | Domenic: I doubt we want to allow CR |
16:27 | <Domenic> | > is excluded from LenientElementNameStartChar and LenientElementNameChar in my sketch |
16:28 | <Domenic> | I think CR is probably disallowed by the parser but as preprocessing, so I didn't see it when reading. Good catch. |
16:28 | <Domenic> | Although what about entities hmm |
16:29 | <Domenic> | Entities don't work in tag names, huh |
16:30 | <annevk> | Oh I missed > there |
16:30 | <annevk> | Yeah entities only work inside attributes or between tags |
16:30 | <annevk> | (Context is https://github.com/whatwg/dom/issues/849 fwiw.) |
16:31 | <annevk> | The other thing I wonder about is whether we should only add leniency for the HTML namespace |
16:32 | <annevk> | But maybe it doesn't matter so much as you can already create trees that cannot serialize as XML so simplicity ought to win |
16:33 | <Domenic> | I cannot find what in the spec disallows the parser from creating elements with CR |
16:33 | <Domenic> | But browsers do not allow it http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=9940 |
16:34 | <Domenic> | I think https://html.spec.whatwg.org/#tag-name-state is missing CR |
16:34 | <Andreu Botella (he/they)> | CR is handled in the preprocessing stage |
16:34 | <annevk> |
|
16:34 | <Domenic> | I couldn't find that Ctrl+Fing for "CARRIAGE RETURN", and lots of parser steps actually look for CR... |
16:35 | <Domenic> | OK, so why does CR appear explicitly in places like https://html.spec.whatwg.org/#the-initial-insertion-mode |
16:35 | <annevk> | Domenic: I think that might be due to an entity reference? |
16:36 | <Domenic> | Seems plausible |
16:37 | <annevk> | Yeah, it's a conformance error, but it will get through |
16:37 | <Domenic> | Yeah because the tokenizer converts them then returns to the state it was in previously |
16:37 | <annevk> | No idea why that was not normalized as well... |
16:37 | <Domenic> | OK, updating whatwg/dom thread to exclude CR, and it looks like there are no spec bugs around CR |
16:38 | <annevk> | I guess it wasn't normalized because you can also get there through JS and guarding all entry points would be somewhat pointless overhead |
16:39 | <annevk> | (Not that specific point, but as an attribute value, say.) |
16:39 | <Domenic> | Well I think the idea is if you do &#xD; in certain places then you actually should end up with a CR in the resulting parsed data |
16:39 | <Domenic> | And so e.g. if you do that in early parts of the document then the initial insertion mode state will actually see the CR and ignore it, not normalize it |
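The distinction being drawn here (a raw CR is normalized away before tokenization, while a CR written as a numeric character reference survives) can be sketched in Python. This is an illustration only: `preprocess` and `text_after_parsing` are hypothetical names, and Python's `html.unescape` stands in for the tokenizer's character reference handling in text content.

```python
import html


def preprocess(raw: str) -> str:
    """Newline normalization: CRLF and lone CR both become LF."""
    return raw.replace("\r\n", "\n").replace("\r", "\n")


def text_after_parsing(raw: str) -> str:
    """Normalize newlines first, then resolve character references."""
    return html.unescape(preprocess(raw))


# A raw CR never reaches the tokenizer.
assert "\r" not in text_after_parsing("a\r\nb\rc")

# A CR spelled as a character reference is resolved after preprocessing,
# so it ends up in the parsed data.
assert text_after_parsing("a&#xD;b") == "a\rb"
```

This ordering is why later parser states can still encounter CR even though preprocessing removes every literal one.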
16:40 | <annevk> | Right, though then the question is why � doesn't work (but JS equivalents do) |
16:40 | <Domenic> | https://html.spec.whatwg.org/#parsing-main-incolgroup is a better example where it inserts the CR instead of ignoring it. |
16:41 | <Domenic> | Hmmm |
16:41 | <annevk> | Finding logic in the parser might not be the best use of our time |
16:42 | <Domenic> | Yeah OK good point |
16:42 | <annevk> | For strictly split on : it might be worth clarifying you'd split on the first or concatenate return values 1...N |
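The ambiguity in "strictly split on :" can be made concrete with a small Python sketch. Both helpers are hypothetical and do not come from the issue thread. They only show the two readings: split on the first colon, or split on every colon and rejoin parts 1...N.

```python
def split_first(qualified_name: str):
    """Split on the first ':' only; everything after it is the local name."""
    prefix, sep, local = qualified_name.partition(":")
    return (prefix, local) if sep else (None, qualified_name)


def split_all_then_join(qualified_name: str):
    """Split on every ':' and concatenate parts 1...N back together."""
    parts = qualified_name.split(":")
    if len(parts) == 1:
        return (None, qualified_name)
    return (parts[0], ":".join(parts[1:]))
```

Both readings give the same answer for `"a:b:c"`, namely prefix `"a"` and local name `"b:c"`, so the clarification mostly matters for how the steps are worded.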
16:45 | <annevk> | For Prefix there might still be some edge cases I suspect due to XML 4th/5th edition divide, where browsers didn't uniformly stick with the 4th (not entirely sure if some updated the parser, but not the corresponding DOM methods) |
16:47 | <Domenic> | Yeah I wonder about tests, I wonder if we can apply https://randomascii.wordpress.com/2014/01/27/theres-only-four-billion-floatsso-test-them-all/ to this |
16:47 | <Domenic> | Probably not :) |
16:48 | <annevk> | For PCENChar I think the banning of noncharacters is a bit dumb and removing that would simplify the production a lot |
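For context on what "banning noncharacters" costs in range checks: Unicode noncharacters are U+FDD0 through U+FDEF plus the last two code points of every plane. A minimal Python sketch (`is_noncharacter` is a hypothetical helper name) shows why they punch holes into otherwise contiguous ranges.

```python
def is_noncharacter(cp: int) -> bool:
    """True for U+FDD0..U+FDEF and for any code point ending in FFFE or FFFF."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


# Enumerate the whole code space (U+0000..U+10FFFF).
NONCHARACTERS = [cp for cp in range(0x110000) if is_noncharacter(cp)]

# 32 in the BMP block plus 2 per plane across 17 planes.
assert len(NONCHARACTERS) == 32 + 2 * 17
```

Excluding these is what forces a production to stop at U+FDCF, resume at U+FDF0, and stop again at U+FFFD rather than using one range.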
16:48 | <Domenic> | Yeah I don't have strong feelings there, happy to take a new suggestion. |
16:54 | <annevk> | Oh I see, that came from XML and we'll preserve some of that through NameStartChar. I guess I'd consider simplifying that as well to C0 and above or even A0 and above (like URL code points), but I'm not sure how much we want to go for |
16:55 | <Domenic> | On the one hand, it's pretty separable. On the other hand, maybe we should do this all at once, since it's hard to get momentum for these sorts of things. |
16:56 | <Domenic> | Oh, or you mean just making LenientNameStartChar even more lenient |
16:56 | <annevk> | Well all of them I suppose. Less range checks ftw |
16:57 | <annevk> | Nobody has ever proven the value of segmenting Unicode in such a way to my knowledge and most things work fine without it |
16:58 | <Domenic> | Unpaired surrogates? |
16:59 | <annevk> | Hmm that's a good point, you included them but does that actually work? |
17:00 | <annevk> | Oh wait, URLs do consider noncharacters non-conforming, but they do work. Surrogates cannot work there however. |
17:06 | <Domenic> | I think they work http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=9941 |
17:06 | <annevk> | Domenic: as for testing, given Unicode is 2^21 if I'm not mistaken that might actually be feasible? Element-creation is a bit more expensive than floats though I suppose |
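A quick back-of-the-envelope check of the sizes involved, as a Python sketch. A code point fits in 21 bits, but the code space itself is somewhat smaller than 2^21, and far smaller than the 2^32 floats in the linked article.

```python
import sys

CODE_POINTS = sys.maxunicode + 1       # U+0000..U+10FFFF
SURROGATES = 0xDFFF - 0xD800 + 1       # U+D800..U+DFFF
SCALAR_VALUES = CODE_POINTS - SURROGATES
FLOAT32_PATTERNS = 2 ** 32             # every 32-bit float bit pattern

assert CODE_POINTS == 0x110000 == 1_114_112
assert SURROGATES == 2_048
assert SCALAR_VALUES == 1_112_064
assert CODE_POINTS < 2 ** 21 < FLOAT32_PATTERNS
```

So an exhaustive single-character test has roughly a million cases. The open question raised in the chat is the per-case cost of creating an element, not the size of the input space.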
17:06 | <Domenic> | Yeah that's my worry. |
17:07 | <Domenic> | IIRC we already have a cloneNode test that can cause timeouts just when creating + cloning one instance of every existing/historical HTML tag name |
17:07 | <Domenic> | C++ unit tests in browsers could probably be exhaustive though |
17:09 | <annevk> | We could have a manual test for us to verify things on the side and for when computers get fast |
17:10 | <annevk> | It does seem that surrogates are fair game, hurray |
17:15 | <Domenic> | I guess the question is whether we want to make it easier to create non-serializable DOMs via DOM APIs. I think that's slightly bad? So maybe sticking with the union of current DOM API values + HTML parser values would be better than just allowing the DOM APIs to be maximally free. |
17:28 | <annevk> | Domenic: that would somewhat argue for branching on the HTML/SVG/MathML namespaces which is a bit odd |
17:29 | <annevk> | But I guess it still works if the approach is minimal set of steps starting from the status quo, or some such |
17:29 | <Domenic> | Well, within reason, I guess :) |
17:29 | <Domenic> | I'll add a comment with your approach |
17:30 | <annevk> | Thanks, I guess I mainly want to hear from someone that these range checks are negligible, since otherwise we might as well improve that while we're there |
17:35 | <Domenic> | In your version what is the justification for excluding tab, LF, CR, FF, space, /, >, and NULL? If we are no longer concerned about serializability seems like they could be allowed... |
17:42 | <annevk> | Domenic: I think it would all still work in the HTML parser when serialized? It's mainly XML that's affected for worse |
17:42 | <Domenic> | Oh right |
17:42 | <Domenic> | Not sure what I was thinking |
17:42 | <Domenic> | OK I am more in favor of your proposal now |
17:50 | <Domenic> | OK no I see what I was saying. Consider the following element local name: "$a". This is currently disallowed by createElement(). And the parser cannot parse it. But if we allow a larger set for createElement(), then the resulting serialization is something the parser cannot parse. |
17:51 | <Domenic> | Like ideally createElement() would only accept ASCII alpha for the first character; that would guarantee it only ever creates elements which serialize in a parseable way. But we already accept, for whatever reason, NameStartChar. The conclusion then is we should probably not expand beyond NameStartChar. |
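The gap described here can be sketched in Python by comparing the two first-character rules. The ranges below are the XML 1.0 (fifth edition) NameStartChar ranges as best recalled and should be checked against the XML spec. The helper names are hypothetical.

```python
# XML NameStartChar, as inclusive (low, high) code point ranges.
NAME_START_RANGES = [
    (0x3A, 0x3A), (0x41, 0x5A), (0x5F, 0x5F), (0x61, 0x7A),
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF), (0x370, 0x37D),
    (0x37F, 0x1FFF), (0x200C, 0x200D), (0x2070, 0x218F),
    (0x2C00, 0x2FEF), (0x3001, 0xD7FF), (0xF900, 0xFDCF),
    (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]


def is_name_start_char(ch: str) -> bool:
    """First-character rule assumed for createElement() (XML NameStartChar)."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in NAME_START_RANGES)


def is_ascii_alpha(ch: str) -> bool:
    """First-character rule of the HTML parser's tag open state."""
    return ch.isascii() and ch.isalpha()


# "$a": rejected by both rules, so no serialization problem today.
assert not is_name_start_char("$") and not is_ascii_alpha("$")

# "_a" or "\u00e9a": accepted by NameStartChar, but the parser cannot
# parse them back as start tags.
assert is_name_start_char("_") and not is_ascii_alpha("_")
assert is_name_start_char("\u00e9") and not is_ascii_alpha("\u00e9")
```

Under these assumptions NameStartChar already admits first characters the parser cannot round-trip, which is the argument for not widening the set further.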
18:54 | <ntim> | Domenic: Hi! I'm curious about how the <popup> anchoring works, and how/if it affects the containing block of the element? <popup> are designed to be in the top layer which forces the containing block to the viewport, so it's a question i'm wondering. |
18:54 | <Domenic> | ntim: I don't know much about <popup>; mfreed is the person to ask there. |
18:55 | ntim | hopes it's not yet another special positioning algorithm that isn't CSS describable |
19:04 | <Alan Stearns> | As far as I know, anchoring of popups is still under discussion. This has some history, and there's a current proposal linked at the end: https://github.com/w3ctag/design-reviews/issues/599 |