06:58
<annevk>
I think we have categorized that set of issues as out-of-scope, but I guess it's worth pointing out somewhere once it's no longer a draft.
12:59
<littledan>
> I think we have categorized that set of issues as out-of-scope, but I guess it's worth pointing out somewhere once it's no longer a draft.
oh interesting, is there discussion of that somewhere?
13:00
<littledan>
I think a possible change for JS is to allow more BIDI markers where they might not be allowed today. Maybe they are already allowed in those places in HTML.
13:00
<littledan>
Another possible change is to equate certain identifiers with each other (which could be meaningful in JS as well as HTML and CSS)
14:13
<annevk>
littledan: I don't recall a place unfortunately; HTML has some restrictions, but they're all within the ASCII range
18:09
<TabAtkins>
For CSS we've largely just chosen to avoid all the complexity by relying solely on codepoint comparisons. We do limit a few weird codepoints (matching HTML's restrictions on element names) but that's it.
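The "codepoint comparisons" approach can be sketched roughly as follows. This is a minimal illustration in Python, not anything taken from the CSS specification, and the helper names `codepoint_equal` and `nfc_equal` are hypothetical. It shows two spellings of the same visible word that differ under plain code point comparison but match after Unicode NFC normalization, which is the kind of equivalence a language would have to define if it wanted to equate identifiers with each other.

```python
import unicodedata


def codepoint_equal(a: str, b: str) -> bool:
    """Compare two identifiers code point by code point (no normalization)."""
    return a == b


def nfc_equal(a: str, b: str) -> bool:
    """Compare two identifiers after NFC normalization (hypothetical alternative)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# "café" written with a precomposed U+00E9 ...
composed = "caf\u00e9"
# ... and with "e" followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = "cafe\u0301"

print(codepoint_equal(composed, decomposed))  # False: the code point sequences differ
print(nfc_equal(composed, decomposed))        # True: NFC composes e + U+0301 into U+00E9
```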
18:11
<annevk>
Yeah, neither really avoids the kind of "source code attacks" this document outlines, but it's really up to tooling to surface those kinds of code points
18:13
<TabAtkins>
Right, whether we restrict some things or not in identifiers, they'll be allowed in strings, and editors need to be smart about it. (Which is definitely a non-trivial task, granted.)
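As a rough sketch of what "tooling surfacing those code points" could look like, the Python snippet below scans source text and reports every bidirectional control character with its line and column. The function name `find_bidi_controls` is hypothetical, and the set of code points is an assumption based on the Unicode bidirectional formatting characters (marks, embeddings, overrides, and isolates); a real editor would need a more careful policy, since these characters are legitimate in some strings and comments.

```python
from typing import List, Tuple

# Assumed set of bidi control characters to flag.
BIDI_CONTROLS = frozenset(
    "\u061c"                      # ARABIC LETTER MARK
    "\u200e\u200f"                # LRM, RLM
    "\u202a\u202b\u202c\u202d\u202e"  # LRE, RLE, PDF, LRO, RLO
    "\u2066\u2067\u2068\u2069"    # LRI, RLI, FSI, PDI
)


def find_bidi_controls(source: str) -> List[Tuple[int, int, str]]:
    """Return (line, column, code point) for each bidi control, 1-based."""
    hits = []
    for line_no, line in enumerate(source.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                hits.append((line_no, col, f"U+{ord(ch):04X}"))
    return hits


# A string literal hiding a right-to-left override, and a stray isolate.
sample = 'a = "x\u202e"\nb = 1\u2066'
for hit in find_bidi_controls(sample):
    print(hit)
```

The scan deliberately does not distinguish identifiers from strings, which matches the point that restricting identifiers alone would not be enough.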
18:13
<TabAtkins>
Would be interesting to pursue some way for CSS to display things in a smart way for this, tho.
18:18
<littledan>
> For CSS we've largely just chosen to avoid all the complexity by relying solely on codepoint comparisons. We do limit a few weird codepoints (matching HTML's restrictions on element names) but that's it.
I'm not sure if this is a sufficient mitigation for the risks described in that document, but I haven't fully understood this whole space yet
18:18
<TabAtkins>
Oh it's not
18:26
<littledan>
> Yeah, neither really avoids the kind of "source code attacks" this document outlines, but that's really up to tooling to surface those kinds of code points
Oh, interesting, I wonder if this is how other programming languages are looking at this issue
18:27
<littledan>
> I think a possible change for JS is to allow more BIDI markers where they might not be allowed today. Maybe they are already allowed in those places in HTML.
This then is the remaining issue (of the subset of issues that I understand)
18:30
<TabAtkins>
fwiw, the CSS allowed codepoints for all ident-ish things is https://drafts.csswg.org/css-syntax/#non-ascii-ident-code-point
18:31
<TabAtkins>
You can put anything in a string (tho nulls and surrogates will get censored) tho.
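The "censoring" of nulls and surrogates can be sketched as below. This is a minimal Python illustration that assumes the replacement is U+FFFD REPLACEMENT CHARACTER and that it happens as a preprocessing pass over the input; the function name `censor_nulls_and_surrogates` is hypothetical, and other preprocessing such as newline normalization is left out.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER (assumed substitute)


def censor_nulls_and_surrogates(text: str) -> str:
    """Replace U+0000 and lone surrogate code points with U+FFFD."""
    return "".join(
        REPLACEMENT if ch == "\x00" or 0xD800 <= ord(ch) <= 0xDFFF else ch
        for ch in text
    )


# Bidi controls pass through untouched; only NULL and surrogates are replaced.
cleaned = censor_nulls_and_surrogates("a\x00b\ud800c\u202e")
print([f"U+{ord(ch):04X}" for ch in cleaned])
```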
18:31
<TabAtkins>
it is, explicitly, a larger list than what JS allows in idents
18:32
<TabAtkins>
(there's an explanatory block after the list talking about this)
18:32
<annevk>
HTML parser is also very YOLO apart from U+0000 and surrogates, iirc
18:33
<TabAtkins>
yeah, our list matches HTML on purpose
18:33
<TabAtkins>
well, the html parser is more yolo than this list, but this matches valid html element names
18:34
<TabAtkins>
for general parsing css is also extremely yolo
18:35
<annevk>
I noticed the other day that apparently WGSL is introducing a different kind of whitespace from JS so I guess they might care more about this as well
18:36
<annevk>
Now I wonder how much overlap there is between WGSL, Wasm, and JS people
19:05
<littledan>
can you use BIDI markers outside of strings in CSS and within an HTML tag, and have them just be ignored? This is one of the recommendations of that document.
19:25
<TabAtkins>
Not validly, but the stylesheet will just keep on trucking when it's encountered. It'll just cause something to get ignored (a property, or an entire rule, depending on where it is).
19:25
<TabAtkins>
HTML will really just keep trucking, it drops very little when it finds something weird.
19:55
<littledan>
well, causing stuff to be ignored would be contrary to the intentions here
19:55
<littledan>
I mean, beyond the BIDI mark itself
19:58
<Michael Ficarra>
littledan: Not really? If you add a LTR override, it doesn't render, it doesn't affect surrounding rendering (assuming it's already LTR), but it changes behaviour by disabling the rule/property/whatever
20:07
<littledan>
yeah the idea is to not disable the rule...
20:18
<TabAtkins>
Why would you not want to disable the rule? It's violating the grammar.
20:33
<littledan>
I think the Unicode folks are proposing that smart editors and tools should be able to add these BIDI marks in various places, so that the RTL display comes out better in simpler situations, but that these shouldn't change semantics (whether by causing errors or by causing things to be ignored)
20:33
<littledan>
so the idea would be to add this to the grammar
20:35
<littledan>
or maybe no one is suggesting that and I just misunderstood
20:45
<TabAtkins>
Oh, I haven't fully read the document yet
20:45
<TabAtkins>
But I got the vibe from what I did read that the opposite was the case, and they're recommending that editors should display code as logically as possible, not affected by bidi marks and the like?
20:54
<littledan>
There are multiple conformance classes... Yes, editors should use their new BIDI rules to make things display right, but they also describe an algorithm for inserting more marks so that simpler consumers can get it right