| 00:02 | <sicking> | Hixie, btw, I also suggest that we support <![CDATA[ ]]> everywhere where we parse PCDATA and RCDATA. Due to goal 2 above. Sounded like Opera's experiment in this area is that it doesn't break the web. But it's an orthogonal discussion to the one we're having now so I wasn't going to raise it until after |
| 00:06 | <annevk3> | sicking, actually, we encountered issues with supporting <![CDATA[ |
| 00:06 | <sicking> | annevk3, big enough that you removed the support? |
| 00:06 | <annevk3> | sicking, I don't think we have yet, but I think we should |
| 00:07 | <sicking> | annevk3, why not, and why? |
| 00:07 | <annevk3> | because it causes issues and supporting CDATA has no real benefit |
| 00:07 | Philip` | hopes that if Opera doesn't remove CDATA support, they at least make it less insane |
| 00:08 | <sicking> | annevk3, i doubt you'll ever find a feature that you can deploy without any issues. The most recent example we had was a site breaking because we implemented document.readyState |
| 00:08 | <annevk3> | sicking, but CDATA is not a feature, it's near useless |
| 00:08 | <sicking> | annevk3, since we'll have to support <![CDATA[]]> inside SVG, I think the consistency would be nice |
| 00:09 | <annevk3> | maybe, I'd rather remove it there too |
| 00:09 | <annevk3> | (also Opera's current CDATA support doesn't match the HTML5 spec) |
| 00:10 | <sicking> | annevk3, i'd be supportive of that if we think there isn't much existing SVG content that uses it. In other words, if we don't think it'd affect the ability to copy SVG into HTML |
| 00:12 | <annevk3> | there's very little SVG so that's pretty hard to tell |
| 00:13 | <annevk3> | I don't recall ever needing it in any SVG content I've created, but then I mostly do simple things |
| 00:14 | <Philip`> | http://philip.html5.org/misc/spec-links-anim.svgz |
| 00:14 | <Philip`> | That's got a script, but doesn't use CDATA |
| 00:16 | <Philip`> | Hmm, why does View Source in Firefox 3 go unbelievably slowly on that SVG file? |
| 00:16 | <Philip`> | If the script had to use < or & then I probably would have put it in <![CDATA[]]> |
| 00:16 | <Philip`> | but it didn't so I didn't |
| 00:17 | <sicking> | have a good weekend people |
| 00:17 | <Philip`> | Yikes, FF3 uses about 400MB of RAM to view-source on that page :-/ |
| 00:18 | olliej | opens in S4 |
| 00:18 | olliej | wonders how badly it will fare |
| 00:18 | <sicking> | Philip`, view-source will always use more memory than the actual page |
| 00:19 | <olliej> | memory seems to have peaked at ~600mb :-O |
| 00:19 | olliej | wonders wtf is happening |
| 00:19 | <sicking> | Philip`, it doesn't use any entities either though, so should work find |
| 00:19 | <sicking> | fine even |
| 00:20 | <annevk3> | what's the advantage of toTempURL over toDataURL? working around IE bugs? |
| 00:21 | <Dashiva> | That's how I understood it |
| 00:21 | <annevk3> | feature design 101: don't propose a new feature to work around UA bugs in an existing feature |
| 00:21 | <Dashiva> | Not sure why supporting toTempURL would be easier than fixing data urls |
| 00:22 | <Philip`> | I think the idea is that you should be able to say img.src = canvas.toSomeKindOfThingThatWorksInImgSrc() and have it work in browsers that don't support data URIs |
| 00:22 | <annevk3> | Philip`, right, see above |
| 00:23 | <annevk3> | nn |
| 00:23 | <Philip`> | Dashiva: Because you can't fix data URIs in IE6 |
| 00:23 | <Philip`> | (I assume) |
| 00:24 | <Dashiva> | But can you fix toTempURL in IE6? |
| 00:24 | <Philip`> | If you're writing a plugin or add-on or whatever it is, then you presumably can |
| 00:25 | <Philip`> | since you can save the canvas data to disk and then use file:/// to refer to it |
| 00:25 | <Philip`> | or you can register a protocol handler for customHandler:// |
| 00:26 | <Dashiva> | Register a protocol handler for data:, feed the contents? |
| 00:26 | <Philip`> | I've got no idea whether you can do that |
| 00:26 | <Philip`> | (If you could, it'd be bad in terms of IE-compatibility with other sites that use data: URIs) |
| 00:42 | <hdh> | (defun last-heading () (search-backward-regexp "C-x C-f") (match-string 0)) (setq mode-line-format '(:eval (last-heading))) |
| 00:42 | <hdh> | idk how to plug the eval part into the existing modeline |
| 00:42 | <hdh> | the text seems to keep its formatting in the matched buffer |
| 01:02 | <Hixie> | hdh: interesting |
| 11:21 | <benh_> | Historical question: does anyone know why HTML4 deprecated u, s, strike, but not b/i/big/small, even though it discouraged the latter? What was special about u/s/strike? |
| 12:55 | <annevk3> | Hixie, s/buu can/but can/ |
| 12:55 | <annevk3> | Hixie, s/is can be/can be/ |
| 12:57 | <annevk3> | Hixie, s/sections that cause/sections cause/ (I think, the sentence does not seem correct otherwise) |
| 12:59 | <annevk3> | Hixie, "For each <span>cache host</span> associated with an <span>application cache</span>" Isn't a cache host always associated with one? |
| 15:02 | <annevk3> | Gmail on a message I sent: "4:06 PM (-1 minutes ago)" |
| 15:50 | <Andrii> | annevk3: google invented the time machine, cutting edge technology |
| 16:51 | <Hixie> | annevk3: please send mail for feedback, pleeeeease. :-) |
| 17:25 | <annevk3> | Hixie, next time or also for those four lines? |
| 17:25 | <annevk3> | IRC is convenient, there's less UI involved |
| 17:26 | <Dashiva> | I'm betting he wants the paper trail |
| 17:28 | <Philip`> | He prints out emails? |
| 17:29 | <Philip`> | Someone needs to make an IRC bot which lets you say "!feedback s/buu can/but can/" and it will automatically send an email to Hixie |
| 17:35 | <krijnh> | Doesn't need to be an IRC bot, since all the logs are available as HTML as well :) |
| 17:36 | <Philip`> | krijnh: Good point :-) |
| 17:37 | <Philip`> | krijnh: If you could change your log processor to let us embed RDF in our IRC messages that will get translated into RDFa in the logs, then we could write a simple RDFa-based tool to automatically extract all the feedback and email it |
| 17:37 | <annevk3> | yeah, and with enough annotation everything gets done automatically |
| 17:38 | <Dashiva> | That reminds me of joel's post about spec writing |
| 17:39 | <Dashiva> | "And then some people go into a dark place where they imagine automatically generating implementations from specs, and think they have invented a way to program without programming" (by memory) |
| 17:39 | gsnedders | starts implementing parse errors in html5lib php |
| 17:40 | <takkaria> | parse errors are overrated |
| 17:52 | <annevk3> | gsnedders, why do you need parse errors? isn't a treebuilder more useful? |
| 17:52 | <gsnedders> | annevk3: Trying to finish Tokenizer first before moving on |
| 17:59 | gsnedders | returns to the point where more than 50% of tests pass |
| 18:40 | <gsnedders> | Weeee… 23 tests failing now |
| 18:41 | <Philip`> | Delete those tests, then you'll pass 100% |
| 18:46 | <gsnedders> | 14% perf. regression from throwing parse errors |
| 18:47 | <Philip`> | In a document that has no parse errors? |
| 18:47 | <gsnedders> | Yeah |
| 18:47 | <gsnedders> | (i.e., the spec) |
| 18:48 | <takkaria> | ouch |
| 18:49 | <takkaria> | I'm not sure hubbub is ever going to have parse error reporting |
| 18:50 | <gsnedders> | 12.0s is still a massive improvement over the 48s it was a week ago |
| 18:50 | <takkaria> | sure :) |
| 18:52 | gsnedders | runs with profiler |
| 18:52 | <gsnedders> | (This takes it back to around 50s :P) |
| 18:54 | <Philip`> | You need to multithread your tokeniser |
| 18:55 | <gsnedders> | Multi-threading in PHP? :P |
| 18:55 | <Philip`> | Shouldn't be too hard to just split the input document into n pieces, and speculatively parse the last n-1, and discard any results that are invalidated by the tokeniser state at the end of the previous section |
| 19:00 | <jgraham> | Philip`: presumably you would end up being wrong a lot of the time |
| 19:00 | <jgraham> | Which seems bad |
| 19:01 | <Philip`> | jgraham: You could scan forwards to the next '>' and assume you're now going to be in the data state, which is likely to be right quite often |
| 19:02 | <takkaria> | is it? |
| 19:02 | <Philip`> | and if you were in a <script> or something then you make sure you've kept enough state so you can sync up once you've reached the </script> |
| 19:02 | <jgraham> | Hmm. You're making this sound surprisingly reasonable |
| 19:02 | <Philip`> | Really? |
| 19:02 | <Philip`> | That wasn't my intent |
| 19:02 | <jgraham> | Which suggests that you're misleading me somehow |
| 19:05 | <Philip`> | takkaria: I suppose it should be fairly easy to instrument a tokeniser to report how often it sees '>' when it's in the data state (and PCDATA, and no escape flag) |
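
Editorial sketch of the splitting heuristic Philip` describes above: cut the input into roughly equal pieces, then move each cut forward to just after the next '>' so that each later piece can be guessed to start in the data state. The function name `split_at_tag_ends` is hypothetical, and this only shows the splitting step; it does not tokenise anything or reconcile tokeniser state between pieces, which is where the guess could turn out to be wrong (inside a script or comment, for example).

```python
def split_at_tag_ends(text, n):
    """Split text into at most n chunks, each cut placed just after a '>'.

    Hypothetical helper: it only picks candidate split points. A real
    speculative tokeniser would still have to check that each chunk
    really started in the data state and discard results otherwise.
    """
    chunks = []
    start = 0
    approx = max(1, len(text) // n)  # rough target size per chunk
    while start < len(text) and len(chunks) < n - 1:
        # Look for the first '>' at or beyond the target boundary.
        cut = text.find(">", start + approx - 1)
        if cut == -1:
            break  # no further '>' to cut at; the rest is one chunk
        chunks.append(text[start:cut + 1])
        start = cut + 1
    if start < len(text):
        chunks.append(text[start:])
    return chunks


if __name__ == "__main__":
    sample = "<p>one</p><p>two</p><p>three</p>"
    for piece in split_at_tag_ends(sample, 3):
        print(repr(piece))
```
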
| 19:09 | <Philip`> | Does Python have a multiprocessing thing nowadays that isn't unbearably hard to use efficiently? |
| 19:09 | <gsnedders> | Philip`: Yes, multiprocessing |
| 19:09 | <Philip`> | Ah, sounds good |
| 19:09 | <Philip`> | Maybe html5lib should do this! :-) |
| 19:10 | <gsnedders> | Will all zero users of php-html5lib kill me if I make 23 test cases fail? |
| 19:11 | <annevk3> | we should make error reporting optional in the tests |
| 19:11 | <annevk3> | or flag tests that rely on error reporting |
| 19:11 | <Philip`> | That's easy |
| 19:11 | <Philip`> | if 'ParseError' in expected_tokens: it relies on error reporting |
| 19:11 | <annevk3> | it seems to me that the PHP parser is not intended for building a validator so it should just not do it and be fast :) |
| 19:12 | <Philip`> | and you could just strip out all the ParseErrors when comparing your tokeniser against the test result |
| 19:12 | <annevk3> | yeah, I guess that's the best way |
| 19:12 | <annevk3> | you still want to test error handling |
| 19:13 | <gsnedders> | Yeah, that's what it currently does, for a few more minutes at least |
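
Editorial sketch of the approach Philip` and annevk3 settle on above: detect tests that rely on error reporting, and otherwise strip parse errors from both sides before comparing. It assumes, following Philip`'s one-liner, that a parse error appears in a token list as the bare string "ParseError" while other tokens are lists; the helper names and the sample tokens are hypothetical, not taken from any real test harness.

```python
PARSE_ERROR = "ParseError"  # assumed marker, as in the one-liner above


def relies_on_error_reporting(expected_tokens):
    """True if the expected output mentions at least one parse error."""
    return PARSE_ERROR in expected_tokens


def strip_parse_errors(tokens):
    """Return the token list with every parse-error marker removed."""
    return [t for t in tokens if t != PARSE_ERROR]


def tokens_match(expected_tokens, actual_tokens, check_errors=False):
    """Compare tokeniser output against a test's expected tokens.

    With check_errors=False, parse errors are ignored on both sides, so a
    tokeniser that never reports errors can still pass the test.
    """
    if check_errors:
        return expected_tokens == actual_tokens
    return strip_parse_errors(expected_tokens) == strip_parse_errors(actual_tokens)


if __name__ == "__main__":
    # Made-up token lists for illustration only.
    expected = [["StartTag", "p", {}], "ParseError", ["Character", "x"]]
    actual = [["StartTag", "p", {}], ["Character", "x"]]
    print(relies_on_error_reporting(expected))
    print(tokens_match(expected, actual))
    print(tokens_match(expected, actual, check_errors=True))
```
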
| 19:14 | <gsnedders> | annevk3: I'd disagree that it is irrelevant. You might want to only allow valid comments on my blog. Oh, wait, you already do. |
| 19:15 | <annevk3> | i wouldn't mind syntax errors actually |
| 19:15 | <takkaria> | gsnedders: testcases failing is bad, mmkay |
| 19:15 | <annevk3> | i'd just validate the tree |
| 19:15 | <annevk3> | validate the tree based on some whitelists |
| 19:15 | <Philip`> | Someone should make a blog comment CAPTCHA system which presents you with a random word (just in plain text) and requires you to use it in a grammatically-correct sentence in your comment |
| 19:16 | <gsnedders> | takkaria: All failures are due to the parse errors, and one of them is somewhat questionable (I'd argue that the test case relies on impl. specific behaviour) |
| 19:38 | <olliej> | Philip`: heheh |
| 19:39 | <gsnedders> | Philip`: How do you determine whether a sentence is grammatically correct? |
| 19:44 | <Philip`> | gsnedders: Mechanical Turk |
| 19:53 | <Niictar> | But... couldm |
| 19:54 | <Niictar> | But... couldn't a bot be programmed to generate a nonsensical but grammatically correct sentence after doing a dictionary lookup of the word in question? |
| 19:55 | <Niictar> | Or, just quote a sentence example straight out of said dictionary? :P |
| 20:00 | <olliej> | Niictar: sssh |
| 20:00 | <olliej> | Niictar: although that captcha might be fairly good for filtering out most reddit/digg/youtube commenters :D |
| 20:01 | <Niictar> | =D |
| 20:35 | <gsnedders> | "It quite clearly shows that Humbert Hubmert does \emph{not} love Lolita." — Discuss. |
| 20:35 | <gsnedders> | :P |
| 20:36 | <Philip`> | s/Hubmert/Humbert/ |
| 20:36 | gsnedders | is typing quickly damnit! |
| 20:38 | <Dashiva> | If you reversed the grammar captcha, it could work well for sites where real users are unable to write properly, whereas a bot would be too successful |
| 20:57 | gsnedders | makes another tweet exactly 140 characters long |
| 20:57 | <gsnedders> | (And 142 bytes) |
| 21:10 | <jwalden> | gsnedders: is the limit 140 characters or 140 code points? |
| 21:12 | <gsnedders> | jwalden: It appears to apply NFC |
| 21:13 | <gsnedders> | (So my first test failed) |
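
Editorial sketch of why "140 characters", "140 code points" and "142 bytes" can all differ, and why NFC matters to the count. It uses Python's standard `unicodedata.normalize`; the function name `length_report` is hypothetical, and nothing here reflects how Twitter actually counts, only the distinction being discussed above.

```python
import unicodedata


def length_report(text):
    """Return (code points as typed, code points after NFC, UTF-8 bytes as typed)."""
    nfc = unicodedata.normalize("NFC", text)
    return (len(text), len(nfc), len(text.encode("utf-8")))


if __name__ == "__main__":
    # 'e' followed by U+0301 COMBINING ACUTE ACCENT: two code points as
    # typed, but NFC composes them into the single code point U+00E9.
    decomposed = "e\u0301"
    print(length_report(decomposed))
    # Plain ASCII is unchanged by NFC and takes one byte per code point.
    print(length_report("abc"))
```

A limit applied after NFC would therefore accept some strings that look too long when counted as typed, which is consistent with a first test failing.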