00:42 | <Hixie> | Philip`: i just used your code :-P |
00:44 | <Philip`> | Hixie: I never claimed that my code was a suitable general-purpose replacement :-) |
00:49 | <Hixie> | :-P |
00:50 | <Hixie> | ok i replaced it with the old code |
00:50 | <Hixie> | and put an ie8.html file there for IE |
01:38 | <Philip`> | http://realtech.burningbird.net/standards/xhtmlate-wordpress-comments/ seems good, though it fails to ban invalid characters from author names |
01:43 | <Philip`> | (and I can't go in to edit my post and fix the totally accidental inclusion of invalid characters because I get a YSoD :-( ) |
02:46 | <Philip`> | Does anyone know how to make an exploit based on http://golem.ph.utexas.edu/instiki/new/Testing%20%3Ciframe%20onload=alert(document%26%23x2e;cookie)%3E work in XHTML? |
02:46 | <Philip`> | The difficulty is that it's seemingly impossible to use a / character |
03:19 | <Hixie> | Philip`: escape/encode the exploit somehow in the js, e.g. %-encoded, then just have the exploit do something like eval(unescape(...exploit...)) and %-encode that and put it in the uri |
03:21 | <Philip`> | Hixie: The JS won't get executed unless the XHTML is well-formed, and I can't see how to make it well-formed |
03:21 | <Hixie> | oh yeah to make something well-formed you need a slash |
03:22 | <Hixie> | can you use the query component? |
03:22 | <Philip`> | In some cases it might be possible to do something with <![CDATA[ and somehow make it line up well-formedly with an existing ]]> somewhere later in the document, but I don't think that can work here |
03:24 | <Philip`> | As far as I can see, the query string is just ignored |
03:54 | Hixie | snaps |
04:02 | Philip` | wonders if he can calculate the tensile strength of a Hixie |
10:17 | hsivonen | notes that the serialization algorithm now escapes " to &quot; outside attributes. has it always been that way? |
11:48 | Xiven- | conveys his thanks to Philip` for the lesson in Unicode safety |
12:01 | <Philip`> | Xiven-: My pleasure :-) |
12:01 | Philip` | hopes the comments etc don't allow in invalid characters |
12:04 | <Xiven-> | it now removes invalid characters (based on the list at http://www.whatwg.org/specs/web-apps/current-work/#preprocessing) from all GET and POST data |
12:07 | Xiven- | also thanks annevk and Hixie for passing the message on to him :) |
12:08 | <hsivonen> | now that Hixie made astal non-characters parse errors, a conforming XHTML text content might not be conforming HTML text content |
12:08 | <hsivonen> | astral even |
12:48 | <Xiven-> | but yes, the disturbing part was that PHP's XML parser appeared to run out of memory on some invalid characters |
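A minimal sketch of the kind of filtering Xiven- describes, written in Python rather than the PHP the site uses. The code point ranges are an assumption taken from later revisions of the HTML specification's input-stream preprocessing list and may not match the list as it stood at the time of this log; `is_invalid` and `strip_invalid` are hypothetical names.

```python
def is_invalid(cp: int) -> bool:
    """Return True for code points treated here as invalid in HTML input.

    Assumption: ranges follow later revisions of the HTML spec's
    preprocessing list, not necessarily the list linked in the log.
    """
    return (
        0x0001 <= cp <= 0x0008          # C0 controls (excluding tab, LF)
        or cp == 0x000B                 # vertical tab
        or 0x000E <= cp <= 0x001F       # remaining C0 controls
        or 0x007F <= cp <= 0x009F       # DEL and C1 controls
        or 0xD800 <= cp <= 0xDFFF       # lone surrogates
        or 0xFDD0 <= cp <= 0xFDEF       # BMP noncharacters
        or (cp & 0xFFFE) == 0xFFFE      # U+xFFFE / U+xFFFF in every plane
    )


def strip_invalid(text: str) -> str:
    """Drop every character whose code point is_invalid() rejects."""
    return "".join(ch for ch in text if not is_invalid(ord(ch)))
```

Applied to each GET and POST value before it reaches an XML parser, a filter like this would keep the flagged characters out of stored comments.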
13:11 | <hsivonen> | aargh. it is way too easy to accidentally move an IMAP mailbox inside another in Mail.app |
13:11 | <hsivonen> | a small slip of the pointing device and a click turns into a drag |
13:16 | <hsivonen> | Over the years, having an "are you sure" dialog for mailbox move and rename would save me non-trivial time |
13:18 | hsivonen | wonders why Unicode introduced REPLACEMENT CHARACTER when ASCII had 1A SUBSTITUTE |
13:20 | <zcorpan_> | hsivonen: the spec has always escaped " outside attributes, yes. but i've complained about that on the list |
13:25 | <hsivonen> | zcorpan_: ok |
13:26 | <zcorpan_> | at least i remember having complained about it, i can't find it on google or in the issues list |
13:27 | <zcorpan_> | ah http://lists.w3.org/Archives/Public/public-html/2007Jul/1030.html |
13:28 | <zcorpan_> | (though, not sure if it's in the issues list) |
13:29 | zcorpan_ | had notes in http://simon.html5.org/tools/js/innerhtml-viewer/getInnerHTML.js |
13:31 | <hsivonen> | annevk, jgraham__: do you have an opinion on how test cases should test that forbidden characters emit parse errors? |
13:32 | <hsivonen> | (since the relative order of errors and tokens is not well-defined in that case) |
13:32 | <hsivonen> | for example, I implement the check as part of reading the next character from the stream |
13:34 | <Philip`> | hsivonen: The tests currently use something like ignoreErrorOrder:true for the cases where the order is undefined |
13:36 | <hsivonen> | Philip`: oh. I have to look into that |
13:36 | <hsivonen> | Philip`: thanks |
13:38 | <hsivonen> | it's crazy how long the read() method has become... |
13:38 | <annevk> | "HTML is tough" |
13:39 | <a-ja> | hsivonen: henri, whatcha think about that uF v2 idea of daniel's? |
13:41 | <hsivonen> | a-ja: I think it won't work because microformat producers are going to be sloppy and, therefore, consumers are going to be able to extract more data if they extract anything that looks like a microformat regardless of <meta> or profile='' or whatever |
13:41 | <a-ja> | breaks html5? or requires meta's to be allowed in standalone articles? just in head wouldn't seem to cut it |
13:42 | <annevk> | http://www.mnot.net/drafts/draft-nottingham-http-link-header-01.txt |
13:42 | <hsivonen> | a-ja: it isn't as much about breaking HTML5. I just think that microformat consumers will find ignoring the meta more valuable |
13:44 | <a-ja> | i think head profile is gonna have enough issues....especially with list of long url's. could easily break content-type sniff .5k / 1k limits |
13:45 | <annevk> | there's no such limit anymore though |
13:45 | <a-ja> | no? |
13:45 | <annevk> | right |
13:46 | <a-ja> | guess that's A Good Thing...at least in some ways |
13:49 | <a-ja> | FYI: |
13:49 | a-ja | offers FREE as in BEER bug bounty - A case to whoever gets bug 311366 patch checked in before b5 code freeze, and gets it to stick for FF3 final |
13:49 | <a-ja> | not takers yet ^ |
13:49 | <a-ja> | s/not/no/ |
13:55 | <Philip`> | http://golem.ph.utexas.edu/instiki/show/Sandbox - alas |
14:00 | <hsivonen> | is there an easy online tool for converting astral chars into surrogate pairs? |
14:01 | <hsivonen> | http://rishida.net/scripts/uniview/conversion.php |
14:02 | <Philip`> | hsivonen: http://www.fileformat.info/info/unicode/char/10000/index.htm |
14:02 | <hsivonen> | Philip`: thanks |
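For reference, the conversion hsivonen asks about is standard UTF-16 arithmetic and can be sketched in a few lines of Python. The function name `to_surrogate_pair` is hypothetical and is not taken from either linked tool.

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split an astral code point (U+10000..U+10FFFF) into UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not an astral code point")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits -> low surrogate
    return high, low
```

For example, `to_surrogate_pair(0x10000)` yields the pair `(0xD800, 0xDC00)`.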
14:34 | <hsivonen> | http://odfalliance.org/resources/google-response-post-brm.pdf contains a couple of HTML-relevant points |
14:34 | <hsivonen> | 1) Requiring vendor extensions to be documented in order to conform to the base language. |
14:34 | <hsivonen> | and |
14:34 | <hsivonen> | 2) that Transitional isn't |
15:25 | <hsivonen> | I regret suggesting that 0x80-0x9F bytes be errors when ISO-8859-1 is declared |
15:27 | <annevk> | i don't think it makes much sense |
15:27 | <annevk> | iso-8859-1 is just another alias |
15:29 | <hsivonen> | Hixie: can I just say I was wrong and ask this detail to be reversed? (especially since doing the consistent thing with GBK would be a PITA) |
15:29 | <annevk> | hsivonen, can't you just use the declared encoding instead? |
15:29 | <annevk> | that way they'll automatically be errors |
15:30 | <hsivonen> | annevk: then the parser could no longer be used as an error reporting general-purpose parser |
15:31 | <hsivonen> | annevk: also, then show source would be wrong assuming the 1252 interpretation is right |
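The difference under discussion can be illustrated with Python's built-in codecs: bytes 0x80-0x9F decode to C1 control characters under strict ISO-8859-1 but to printable characters under windows-1252. This is only a sketch of the two interpretations, not of any parser in the log; `c1_byte_offsets` is a hypothetical helper showing how a checker might flag such bytes.

```python
from typing import List


def c1_byte_offsets(data: bytes) -> List[int]:
    """Return offsets of bytes in the 0x80-0x9F range."""
    return [i for i, b in enumerate(data) if 0x80 <= b <= 0x9F]


# 0x93 and 0x94 are curly double quotes in windows-1252.
sample = bytes([0x93, 0x68, 0x69, 0x94])

as_latin1 = sample.decode("iso-8859-1")    # C1 controls U+0093 / U+0094
as_cp1252 = sample.decode("windows-1252")  # U+201C / U+201D curly quotes
flagged = c1_byte_offsets(sample)          # positions a checker could report
```

Decoding as windows-1252 while separately reporting the flagged offsets is one way to keep both the rendering and the error reporting, though whether that matches what the spec required at the time is not stated in the log.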
21:09 | <hsivonen> | whoa. the Talk:Acid2 really is sad. |
21:09 | <hsivonen> | is the irc log at krijnhoetmer.nl a verifiable source? |