| 00:12 | Philip` | tries implementing a very extremely primitive version of cInputStream, which only returns the string "Hello world" (repeating infinitely) and whose C code calls the class "Noddy" because it's copied from the Python documentation |
| 00:12 | <Philip`> | and it seems to run char() about three times faster than a pure Python version, which makes it seem actually worthwhile |
| 00:16 | <Philip`> | I believe cInputStream only needs to implement three functions, so it should be pretty straightforward... |
| 00:32 | <annevk> | only three times faster? |
| 00:32 | <annevk> | still quite a lot I suppose |
| 00:33 | <annevk> | I guess all the other Python bits impact the perf as well then. I would expect a C version to be a 100 times faster or so. |
| 00:33 | <Philip`> | All the method's doing is returning the next character from a string and increasing the offset counter, so there isn't a huge scope for improvements |
| 00:36 | <Philip`> | http://krijnhoetmer.nl/irc-logs/html-wg/20070710#l-239 |
| 00:37 | <Philip`> | (100x seems about right) |
| 00:37 | <annevk> | (I got that number from a colleague who did some comparative testing.) |
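
For reference, the pure-Python method being timed above (return the next character, bump the offset) might look roughly like the sketch below. The class name `HelloStream` and its attributes are hypothetical; this is not html5lib's actual input stream, only a toy with the same shape as the "Hello world" experiment.

```python
class HelloStream:
    """Toy stream that repeats a fixed string forever (hypothetical name)."""

    def __init__(self, data="Hello world"):
        self.data = data
        self.offset = 0

    def char(self):
        # Return the character at the current offset, then advance,
        # wrapping around so the string repeats indefinitely.
        c = self.data[self.offset]
        self.offset = (self.offset + 1) % len(self.data)
        return c


stream = HelloStream()
first = "".join(stream.char() for _ in range(13))
```

With so little work per call, most of the time is likely spent on call overhead, which would be consistent with a C version gaining only a small factor here.
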
| 01:13 | <Hixie> | most uses of alt=<...> seem to be pretty serious errors (people putting markup in alt="") |
| 01:13 | <Hixie> | so we could probably turn alt=<...> into the magic "not a description but a type of image" mode |
| 01:14 | <Hixie> | (as in, instead of <img important alt=...>, we would have <img alt="<...>">) |
| 01:16 | <Hixie> | or we could use alt={...}, the only use of that seems to be for cases where alt=&...; would have been better |
| 01:16 | <Hixie> | we could even say that in {...} if the ... matches an entity name then it's treated as an entity, otherwise it's treated as an important image... |
| 01:19 | <Philip`> | I'm not sure what you mean about alt=&...; |
| 01:19 | <Philip`> | ALT="{short description of image}" seems the most common value in my data |
| 01:20 | <Philip`> | (<img alt="<...>"> is nasty because everyone will write it as <img alt="&lt;...&gt;"> and it'll be ugly) |
| 01:21 | <Hixie> | yeah i agree with <> |
| 01:21 | <Hixie> | alt={alpha} is the most common {...} value i see |
| 01:22 | <Philip`> | I don't see that at all |
| 01:22 | <Hixie> | which would probably be best as alt=&alpha; |
| 01:22 | <Hixie> | i expect my sample has a lot more scientific documents |
| 01:25 | <Philip`> | http://philip.html5.org/data/alt-in-braces.txt |
| 01:27 | <Philip`> | It'd be kind of annoying for people who do <img src="rendered-latex.gif" alt="latex source">, since that'd occasionally be {...} |
| 01:29 | <Hixie> | i dropped anything that had fewer than 10,000 pages, and my list was: http://damowmow.com/temp/alt-in-braces.txt |
| 01:29 | <Philip`> | Might all the alphas come from one site with lots of pages? |
| 01:30 | <Hixie> | quite possible |
| 01:31 | <Hixie> | (or from one tool) |
| 01:33 | <Philip`> | http://www.cmaj.ca/cgi/content/full/173/12/1441 has alt="{dagger}" |
| 01:33 | <Philip`> | http://jcem.endojournals.org/cgi/content/abstract/87/4/1687 has alt="{chi}" |
| 01:33 | <Philip`> | Those seem to be the only entity-like things I got |
| 01:34 | Philip` | notices some similarity in their URLs |
| 01:34 | <Hixie> | here are 15 pages that used alt={alpha}, selected at random (so if there's one site drowning the results, you should see a lot of that site): |
| 01:35 | <Hixie> | unlinked alt={alpha},http://agron.scijournals.org/cgi/content-nw/full/91/6/928/FIG6?ck=nck |
| 01:35 | <Hixie> | unlinked alt={alpha},http://www.good-sa.com.tw/97.html |
| 01:35 | <Hixie> | unlinked alt={alpha},http://hyper.ahajournals.org/cgi/collection/other_vasc_bio?notjournal=ahajournals&page=156 |
| 01:35 | <Hixie> | unlinked alt={alpha},http://www.pnas.org/papbysection.shtml |
| 01:35 | <Hixie> | unlinked alt={alpha},http://toxsci.oxfordjournals.org/cgi/content/full/69/2/354/F2 |
| 01:35 | <Hixie> | unlinked alt={alpha},http://journals.asm.org/cgi/figsearch?FIRSTINDEX=1540&SEARCHID=1&hits=10&RESULTFORMAT=&FULLTEXT=embryos&andorexactfulltext=&resourcetype=HWFIG |
| 01:35 | <Hixie> | unlinked alt={alpha},http://atvb.ahajournals.org/cgi/content/full/26/1/143 |
| 01:35 | <Hixie> | unlinked alt={alpha},http://jb.asm.org/cgi/content/full/185/1/89?maxtoshow=&HITS=10&hits=10&RESULTFORMAT=&fulltext=lacz&searchid=1&FIRSTINDEX=1400&resourcetype=HWFIG |
| 01:35 | <Hixie> | unlinked alt={alpha},http://www.jimmunol.org/cgi/content/full/174/1/205?ck=nck |
| 01:35 | <Hixie> | unlinked alt={alpha},http://ajpheart.physiology.org/papbyrecent.shtml |
| 01:35 | <Hixie> | unlinked alt={alpha},http://bloodjournal.hematologylibrary.org/cgi/content/abstract/103/4/1286 |
| 01:35 | <Hixie> | unlinked alt={alpha},http://circ.ahajournals.org/cgi/collection/endo_vastype_no?notjournal=ahajournals&page=236&ck=nck |
| 01:35 | <Hixie> | unlinked alt={alpha},http://ejcts.ctsnetjournals.org/cgi/content/full/25/3/352?ck=nck |
| 01:35 | <Hixie> | unlinked alt={alpha},http://stemcellbiology.blogspot.com/2007_09_01_archive.html |
| 01:35 | <Hixie> | unlinked alt={alpha},http://bloodjournal.hematologylibrary.org/cgi/content/abstract/106/7/2302 |
| 01:35 | <Hixie> | looks like ahajournals.org might be overrepresented |
| 01:35 | <Hixie> | but that's still a broad selection |
| 01:37 | <Philip`> | All but 2 of those are running the same software |
| 01:39 | <Philip`> | (e.g. breaking the URLs makes all of them (but 2) show pretty much identical 404 pages) |
| 01:40 | <Hixie> | makes sense |
| 03:08 | <Philip`> | Hmm... cInputStream reduces spec parse time from 16.8s by 20% to 13.5s |
| 03:08 | <Philip`> | which is alright but not fantastic |
| 03:11 | <Philip`> | (That code could probably be optimised to get another 5% or so) |
| 03:12 | <Philip`> | (mainly by caching the charsUntil patterns) |
| 07:39 | gsnedders | yawns |
| 07:42 | <gsnedders> | I need test cases for the spec-gen |
| 07:43 | <gsnedders> | I don't really want to fix bugs before that |
| 08:20 | <gsnedders> | hmm. All the broken xrefs seem to be those with <dfn><code>foo</code></dfn> |
| 08:25 | <virtuelv> | shouldn't those really have been <code><dfn>foo</dfn></code>? |
| 08:31 | <hsivonen> | Hixie: should id="" be checked for uniqueness? |
| 08:31 | <hsivonen> | Hixie: <foo id=""><bar id=""> should that only whine about empty string ids or also about duplicate ids? |
| 08:36 | <hsivonen> | othermaciej: Duplicate IDs are now checked. Thanks. |
| 08:37 | <hsivonen> | in other news, Validator.nu now checks meta refresh values |
| 08:37 | <othermaciej> | hsivonen: cool, glad you added it |
| 08:42 | <hsivonen> | Philip`: I rephrased the non-streamability message as you requested |
| 08:42 | <hsivonen> | thanks |
| 08:55 | <annevk> | http://developers.facebook.com/fbopen/ |
| 08:56 | <jgraham> | Has anyone ever checked how AT sees <figure><legend>? Apparently screenreaders have some special behaviour for <legend> that might prevent graceful degradation |
| 09:01 | <mpt> | hsivonen, do you check uniqueness of, and existence of the elements for, for= values? |
| 09:03 | <annevk> | for= also needs to do type checking |
| 09:03 | <hsivonen> | mpt: I think I do (for HTML5--not HTML4) |
| 09:03 | <hsivonen> | let's see |
| 09:03 | <jwalden> | http://svn.facebook.com/svnroot/platform/fbopen/lib/fbml/fbjs.php is the real meat as far as I can tell there |
| 09:03 | <annevk> | <label for=x>...</label> <input type=hidden id=x> should not be conforming |
| 09:04 | <annevk> | <label for=x>...</label> <div id=x></div> should not be conforming |
| 09:04 | <annevk> | etc. |
| 09:04 | <hsivonen> | mpt: yeah, I check for it |
| 09:04 | <mpt> | cool |
| 09:05 | <hsivonen> | the XHTML 1.0 / HTML 4 support in V.nu really sucks compared to the (X)HTML5 support |
| 09:05 | <hsivonen> | I wonder if I should just remove XHTML 1.0 / HTML 4 support or keep fixing the breakage |
| 09:05 | <hsivonen> | or leave it in a sucking state |
| 09:06 | <annevk> | hsivonen, it doesn't catch the type=hidden above |
| 09:06 | <annevk> | hsivonen, it does catch the <div> element |
| 09:06 | <hsivonen> | annevk: good catch. |
| 09:07 | <hsivonen> | annevk: is referring to type=hidden forbidden in the spec? (it probably should be) |
| 09:07 | <annevk> | might not be forbidden just yet |
| 09:08 | <annevk> | in fact, I don't think WF2 makes additional requirements about for |
| 09:11 | <annevk> | a more complicated check might be not allowing referencing a <datalist> child unless the <label> is itself a child of the same <datalist> |
| 09:11 | <hsivonen> | annevk: I found another type=hidden bug while I was at it. thanks |
| 09:11 | hsivonen | has totally forgotten about datalist |
| 09:12 | <hsivonen> | why would labels refer to datalists? |
| 09:12 | <annevk> | <label for=x> ... </label> <datalist> <select id=x> </select> </datalist> |
| 09:12 | <hsivonen> | eww. |
| 09:13 | <annevk> | the conforming case would be <datalist> <label for=x> </label> <select id=x> </select> </datalist> |
| 09:13 | <annevk> | it seems kind of nasty to check |
| 09:14 | <hsivonen> | consider http://bugzilla.validator.nu/show_bug.cgi?id=232 postponed |
| 09:14 | <hsivonen> | (deploying the easier fix now...) |
| 09:14 | <annevk> | yeah, fair enough |
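
The simpler for= and id= checks discussed above (duplicate ids, for= pointing at a non-control, for= pointing at a hidden input) could be sketched with Python's standard `html.parser`. This is not Validator.nu's implementation; the `LabelChecker` class, the `LABELABLE` set and the message texts are assumptions, and as noted in the log the spec may not forbid the type=hidden case yet. The datalist case is left out, as in the postponed bug.

```python
from html.parser import HTMLParser

# Assumption: elements a <label> may point at; not taken from the spec text.
LABELABLE = {"input", "select", "textarea", "button"}


class LabelChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.targets = {}     # id -> (tag name, type attribute or None)
        self.label_refs = []  # for= values in document order
        self.errors = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "id" in attrs:
            if attrs["id"] in self.targets:
                self.errors.append("duplicate id: %s" % attrs["id"])
            else:
                self.targets[attrs["id"]] = (tag, attrs.get("type"))
        if tag == "label" and "for" in attrs:
            self.label_refs.append(attrs["for"])

    def check(self, markup):
        # for= may refer forwards, so resolve references after parsing.
        self.feed(markup)
        self.close()
        for ref in self.label_refs:
            if ref not in self.targets:
                self.errors.append("for=%s: no such id" % ref)
                continue
            tag, type_ = self.targets[ref]
            if tag not in LABELABLE:
                self.errors.append("for=%s: <%s> is not a form control" % (ref, tag))
            elif tag == "input" and (type_ or "").lower() == "hidden":
                self.errors.append("for=%s: hidden input" % ref)
        return self.errors


errors = LabelChecker().check("<label for=x>...</label> <input type=hidden id=x>")
```
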
| 09:15 | <annevk> | othermaciej, any progress on your Forms TF stuff? |
| 09:15 | <annevk> | othermaciej, the Forms WG people have stopped radio silence, maybe we should say something back |
| 09:16 | <othermaciej> | annevk: yeah, we should |
| 09:16 | <othermaciej> | annevk: I think I can get something written in the next few days |
| 09:16 | <othermaciej> | though if I don't get it done before next week, then I'll be out of action for a bit due to WWDC |
| 09:17 | <annevk> | better get on it then ;) |
| 10:26 | <Dashiva> | Just mark it as beta :) |
| 10:26 | <Dashiva> | oops. that was one scrollback too little |
| 10:37 | <Hixie> | man, some people are whiny |
| 10:39 | <annevk> | maybe that's how things worked out for them when they were little :) |
| 10:40 | <Lachy> | who's whining? |
| 10:40 | <Dashiva> | The editorial comments on xhtml and stuff, maybe |
| 10:41 | <Lachy> | ah, I still have hundreds of mails to catch up on from last weekend |
| 10:43 | <Dashiva> | I spent most of the weekend wondering why there wasn't any activity on public-html. Then I realized it was no longer May. |
| 10:43 | <hsivonen> | I'm quite happy to find less email |
| 10:44 | <Hixie> | nah, direct mail to me |
| 10:45 | <Hixie> | though i am getting tired of having to deal with the w3c |
| 10:46 | <Hixie> | hsivonen: i recommend that you make the validator collapse all the presentational attributes used with the value 0 or equivalent into one error "your document contains obsolete presentational markup" |
| 10:48 | <Hixie> | this global <a> thing is a bit irritating too |
| 10:48 | <Hixie> | could someone please explain to me what is wrong with onclick="getElementsByTagName('a')[0].click()" ? |
| 10:48 | <annevk> | it's more typing than href |
| 10:48 | <Dashiva> | It requires scripting |
| 10:48 | <Hixie> | waah. |
| 10:48 | <Philip`> | It needs CSS to give the right cursor, and JS to update the status bar |
| 10:49 | <Philip`> | (Er, do browsers still let you update the status bar?) |
| 10:49 | <Dashiva> | Sometimes, some of them |
| 10:49 | <annevk> | Hixie, maybe it's a CSS matter |
| 10:49 | <annevk> | Hixie, in that CSS could provide a way to make the target area of the link span the entire table row the link is in |
| 10:50 | <Hixie> | yeah well |
| 10:50 | <Hixie> | would be nice if the csswg was doing anything |
| 10:50 | Dashiva | gasps |
| 10:50 | <annevk> | i'm doing something |
| 10:50 | <annevk> | i moved the namespaces module to CR :) |
| 10:50 | <annevk> | next is media queries and cssom view |
| 10:51 | <Hixie> | uh huh |
| 10:51 | <Hixie> | i mean something useful :-P |
| 10:51 | <Hixie> | well, cssom view is useful |
| 10:52 | <annevk> | media queries is something html5 depends on too |
| 10:53 | <Hixie> | media queries is fine already |
| 10:53 | <Hixie> | i mean it has some minor issues |
| 10:53 | <Hixie> | but there are bigger fish to fry |
| 10:53 | <Hixie> | the other part of cssom, for instance |
| 10:53 | <annevk> | hah |
| 10:53 | <Hixie> | and the animation proposals from apple |
| 10:54 | <Hixie> | the animation stuff is probably the number one priority right now |
| 10:54 | <annevk> | dean jackson will be working on that, but the CSS WG is sort of slow setting it all up, not sure what's holding everything back |
| 10:54 | <Hixie> | if i wasn't committed to finishing html5 i'd be doing that i expect |
| 10:54 | <Hixie> | sadly to do a good job requires a rewrite of css from the ground up |
| 10:54 | <Hixie> | since css2.1 is so not well written |
| 10:54 | <annevk> | (media queries parsing was defined in the same sense html4 parsing was defined so it needed some fixup) |
| 10:54 | <Hixie> | (and i say that as one of the editors) |
| 10:55 | <annevk> | 2012 |
| 10:56 | <Hixie> | yeah |
| 10:56 | <Hixie> | well |
| 10:56 | <Hixie> | dom core first |
| 10:56 | <Hixie> | then svg |
| 10:57 | <Hixie> | and maybe http if gsnedders hasn't done it by then |
| 10:57 | <annevk> | dom core is zcorpan |
| 10:57 | <Hixie> | well he has til 2012 i guess :-) |
| 10:57 | <hsivonen> | Hixie: hmm. I was thinking about lobbying to allow the presentational stuff and offering warnings for them as a checkbox |
| 10:58 | <takkaria> | "then svg", heh |
| 10:59 | <Hixie> | hsivonen: i don't expect such lobbying to be fruitful |
| 10:59 | <Hixie> | hsivonen: the only use case seems to be "silence the validator" |
| 10:59 | <Hixie> | hsivonen: and that's easier done in ui |
| 10:59 | <Hixie> | annevk: is chaals around? |
| 11:00 | <Hixie> | oh man what's this geo wg crap |
| 11:01 | <Hixie> | just as we're fixing the waf+webabi mess of two wgs, we're splitting the new wg into two wgs again. |
| 11:01 | <Philip`> | You require e.g. frameborder="0" to make content work correctly in some current UAs (i.e. IE), and it'd be nice if valid HTML5 content could work in current UAs |
| 11:01 | <Hixie> | is the w3c simply unable to learn from its mistakes? |
| 11:02 | takkaria | wonders why "CSS Marquee" is a high priority spec of the csswg |
| 11:02 | <annevk> | Hixie, chaals is in Brasil last I heard. You could e-mail him I suppose |
| 11:03 | <Philip`> | takkaria: Presumably because it's a very popular feature in China |
| 11:03 | <annevk> | Hixie, I don't get the geo stuff either, all browser vendors + google indicated a preference for the new WA WG as venue |
| 11:06 | <Philip`> | (Oh, I guess my presumption was wrong, since the thingy says it's mainly for mobile browsers) |
| 11:18 | <Hixie> | wtf does this mean: |
| 11:18 | <Hixie> | svn: REPORT request failed on '/svn/!svn/vcc/default' |
| 11:18 | <Hixie> | svn: Target path does not exist |
| 11:18 | <Hixie> | ...when i try to "svn up" the html5lib directory |
| 11:21 | <Hixie> | ooo, it works if i do it when i'm in the directory at its "real" path instead of a symlinked path |
| 11:21 | <Hixie> | weird |
| 11:22 | <Philip`> | If it's a symlinked subdirectory, I guess it'd be looking for ../../.svn/ and would get unhappy because that doesn't exist |
| 11:23 | <Hixie> | i don't think the real location has one of those either |
| 11:23 | <Hixie> | but oh well |
| 11:23 | <Hixie> | whatever |
| 11:23 | <Hixie> | hey, hsivonen fixed dup id detection |
| 11:23 | Hixie | fixes the dup ids in the spec |
| 11:24 | <Hixie> | annevk: when are we publishing again? |
| 11:24 | <annevk> | the plan is Thursday |
| 11:24 | <Hixie> | k |
| 11:24 | <annevk> | it largely depends on the W3C getting its act together though, offline-webapps was scheduled for last Friday... |
| 11:25 | <Hixie> | so when do i have to have the boilerplate updated? |
| 11:25 | <Hixie> | is now ok? |
| 11:26 | <annevk> | you mean making it WD-ready? I guess they want it as late as possible. Personally I'd say that now is ok |
| 11:26 | <Hixie> | k |
| 11:29 | <Hixie> | um |
| 11:29 | <Hixie> | looks like the multipage script broke when i updated it |
| 11:29 | <Hixie> | guess i'd better look into that |
| 11:29 | <Philip`> | Broke in which ways? |
| 11:30 | <Philip`> | ("Updated" as in "updated to the latest version from SVN"?) |
| 11:32 | <Hixie> | ImportError: No module named serializer |
| 11:32 | <Hixie> | yes |
| 11:33 | <Philip`> | Sounds like an old version of html5lib |
| 11:33 | <Hixie> | updated that too |
| 11:33 | <Hixie> | svn up |
| 11:34 | <Philip`> | Hmm |
| 11:34 | <Philip`> | Certain? :-) |
| 11:34 | <Hixie> | yes |
| 11:34 | <Hixie> | what is python setup.py install going to do? |
| 11:35 | <Philip`> | Probably install into /usr/python2.5/etc |
| 11:36 | <Philip`> | (Is it using an installed old version of html5lib?) |
| 11:36 | <Hixie> | that's not going to work so well here. |
| 11:36 | <Hixie> | yeah, looks like it might be |
| 11:36 | <Hixie> | oh i see |
| 11:36 | <Hixie> | i symlinked to the wrong place |
| 11:37 | <Hixie> | for some definition of wrong |
| 11:40 | Philip` | goes away for half an hour |
| 11:40 | <Philip`> | Hixie: If it fails when calling some xpath method, that's because it needs lxml 2.0 |
| 11:41 | <Philip`> | Otherwise it ought to work, hopefully :-) |
| 11:41 | <Hixie> | now i get: |
| 11:41 | <Hixie> | ImportError: No module named lxml |
| 11:41 | <annevk> | dependencies suck |
| 11:41 | <Hixie> | yes. |
| 11:41 | <Philip`> | Ah, in that case you also need lxml 2.0 |
| 11:41 | <Hixie> | html5lib used to be much easier to use :-) |
| 11:42 | <Hixie> | now i have to install it and add libraries... |
| 11:42 | <Philip`> | That's not html5lib's fault :-) |
| 11:42 | <Philip`> | It's just because I chose to use the lxml treebuilder (because that was nicer and faster than the DOM-ish one) |
| 11:42 | <Philip`> | It should be just "easy_install lxml" except lack of root probably makes that harder |
| 11:42 | <Hixie> | i have no idea how to install that dependency |
| 11:43 | <Hixie> | i don't have anything resembling a work environment here |
| 11:43 | <Hixie> | i have a directory. |
| 11:43 | <Hixie> | i'm lucky to have python. |
| 11:44 | <Philip`> | Do you have easy_install? |
| 11:44 | <Hixie> | no |
| 11:44 | <hsivonen> | IIRC, getting easy_install to do the right thing under Debian doesn't exactly qualify as 'easy' |
| 11:45 | <Hixie> | i think this is not debian |
| 11:45 | <Hixie> | i think it's fedora core 5 but i'm not 100% sure |
| 11:45 | <Philip`> | Hmm, that makes it much more of a pain... |
| 11:46 | <annevk> | no apt-get ? |
| 11:47 | <Hixie> | i don't have root |
| 11:47 | <Philip`> | Fedora 7 only has lxml 1.3 packages |
| 11:47 | <Hixie> | so even if it was debian, apt-get wouldn't help |
| 11:47 | <Philip`> | (though you can --enablerepo=development to get 2.0 on there) |
| 11:47 | <Hixie> | also, i don't have libxml2 or libxslt |
| 11:47 | <Hixie> | so i can't compile lxml2 |
| 11:47 | <Philip`> | or python-dev? |
| 11:47 | <Hixie> | python-dev? |
| 11:48 | <Philip`> | like the CPython header files and stuff |
| 11:48 | <Hixie> | $ python setup.py build |
| 11:48 | <Hixie> | Building lxml version 2.1.beta3-55506. |
| 11:48 | <Hixie> | NOTE: Trying to build without Cython, pre-generated 'src/lxml/lxml.etree.c' needs to be available. |
| 11:48 | <Hixie> | ERROR: /bin/sh: xslt-config: command not found |
| 11:48 | <Hixie> | ** make sure the development packages of libxml2 and libxslt are installed ** |
| 11:48 | <Hixie> | ... |
| 11:48 | <Hixie> | gcc: src/lxml/lxml.etree.c: No such file or directory |
| 11:48 | <Hixie> | gcc: no input files |
| 11:48 | <Hixie> | error: command 'gcc' failed with exit status 1 |
| 11:48 | <annevk> | maybe use the Web service from Philip` instead? |
| 11:48 | <Hixie> | that may be the better idea at this point |
| 11:49 | <Philip`> | Maybe me switching to lxml was a bad idea |
| 11:49 | <Hixie> | what's the uri for your cgi app? |
| 11:49 | <Philip`> | But it's nice when it works :-) |
| 11:49 | <annevk> | (the scary thing about installing is that it always leaves a lot of cruft behind; at least, I'm afraid of that) |
| 11:49 | <Philip`> | http://krijnhoetmer.nl/irc-logs/html-wg/20080601#l-16 |
| 11:50 | <Hixie> | can you make it ping http://www.whatwg.org/specs/web-apps/current-work/do-pubrules-update when it's done? i don't trust my end to keep a connection open for two minutes, long running jobs have a tendency to get killed when load gets high. |
| 11:50 | Philip` | is too used to being able to install dependencies trivially |
| 11:51 | <Philip`> | Hixie: Like sending a GET request? |
| 11:51 | <Hixie> | yeah |
| 11:51 | Philip` | wonders what'll happen now that Googlebot knows that URL and will keep pinging it :-) |
| 11:51 | <Hixie> | i guess the multipage url will get refreshed more oten! |
| 11:51 | <Hixie> | often |
| 11:52 | <annevk> | prolly the referer is checked |
| 11:52 | <annevk> | or maybe not :) |
| 11:52 | <Philip`> | I could do that in, uh, the next hour or so |
| 11:52 | <Hixie> | cool |
| 11:52 | <Hixie> | annevk: it's a two line shell script that just downloads the file and unzips it |
| 11:52 | <Hixie> | i wonder how i make wget do a post |
| 11:52 | <Philip`> | (Is it fine if my script still attempts to keep the connection open for a couple of minutes?) |
| 11:52 | <Hixie> | aha, --post-data |
| 11:52 | <Hixie> | Philip`: sure |
| 11:52 | <Philip`> | Hixie: Ah, good, since I'm not sure how to close it |
| 11:53 | <Hixie> | you just run another script in the background to do the real work |
| 11:53 | <Hixie> | :-) |
| 11:53 | <Philip`> | I've not quite worked out how to do that so the caller returns and the background process doesn't die |
| 11:54 | <Philip`> | But anyway I need to go now :-) |
| 11:54 | <Hixie> | with apache i've never had a problem just doing "foo.sh&" from within a bash shell script cgi |
| 11:55 | <Hixie> | aw man, Philip` used bz2 |
| 11:55 | <Hixie> | tar doesn't support bz2 last i checked |
| 11:55 | <Hixie> | ooh, -j |
| 11:56 | <hsivonen> | gnutar support everything |
| 11:56 | <hsivonen> | +s |
| 11:56 | <hsivonen> | whew. I've now flushed zcorpan's IRC bug reports to bugzilla |
| 12:00 | <Hixie> | oh right, Philip`'s script doesn't add all the magic symlinks i had in my version |
| 12:03 | <Hixie> | ok |
| 12:03 | <Hixie> | all fixed |
| 12:05 | <zcorpan> | hsivonen: i guess i could report to bugzilla directly in the future :) |
| 12:05 | <hsivonen> | zcorpan: that would be nice |
| 12:06 | <hsivonen> | zcorpan: thanks for the reports, btw. Fixing now. |
| 12:06 | <zcorpan> | hsivonen: np |
| 12:15 | <Hixie> | i guess i should sleep |
| 12:26 | <Hixie> | annevk: any chance i can throw scrollIntoView() into an actively maintained spec you're working on? |
| 12:27 | <annevk> | i've looked into adding it to CSSOM View |
| 12:27 | <annevk> | as that'd make the most sense |
| 12:27 | <Hixie> | yeah |
| 12:27 | <Hixie> | i have an XXX in the html5 spec about doing that |
| 12:27 | <annevk> | i wonder if there are many outstanding issues |
| 12:27 | <Hixie> | i'm going through them now |
| 12:27 | <annevk> | cause cssom view is more or less done apart from the insane thing that is offset* |
| 12:28 | <Hixie> | only cos you took out all the other hard bits and put them into a separate spec :-P |
| 12:28 | <annevk> | i'm trying to round up xhr1 stuff now, not sure when cross-site stuff is going to be done but I suppose I should start pushing CSSOM View again as the CSSWG resolved to publish something and it hasn't happened yet |
| 12:29 | <Hixie> | the most important part -- how css gets bootstrapped from html5 -- is still undefined, right? or rather, is in the unmaintained other cssom spec? |
| 12:29 | <annevk> | Hixie, oh, alternate style sheets etc., yeah |
| 12:29 | <Hixie> | (offset* is pretty important too) |
| 12:30 | <Hixie> | (and desperately needs a thorough spec) |
| 12:30 | <annevk> | i agree and offset* is specced |
| 12:30 | <annevk> | the problem is deciding which of the various options we have is the best |
| 12:31 | <Hixie> | yeah, i know the feeling |
| 12:31 | <annevk> | because offset* is completely broken :) |
| 12:31 | <Hixie> | you have to study existing content, read bugs in the various UAs' bug databases, etc |
| 12:31 | <Hixie> | it's a lot of work :-( |
| 12:32 | <annevk> | yeah, i did all the research, but there's no clear answer |
| 12:32 | <Hixie> | the html parser section was the worst so far for me |
| 12:32 | <Hixie> | talk about no clear answer :-) |
| 12:32 | <annevk> | hehe |
| 12:32 | <annevk> | i like the html parser |
| 12:32 | <annevk> | it's one of the better parts of html5 despite not being one of the original goals |
| 12:32 | <Hixie> | heh |
| 12:33 | <Hixie> | it's so complicated it isn't susceptible to bike shed colour discussions |
| 12:33 | <Hixie> | which helps |
| 12:35 | <virtuelv> | Hixie: don't misunderestimate the tendency people have to look at any building as if it was a bikeshed |
| 12:35 | <hsivonen> | I like how my one-sentence spec comments on the parsing section get answered with "Done" without a bikeshed thread in between |
| 12:36 | <Hixie> | virtuelv: there have been virtually no bikesheds on the parser section, so i think my statement is true |
| 12:42 | hsivonen | thinks it's sad that Adobe keeps breaking their greatest hit (PDF) by mixing Flash with it |
| 12:43 | <Hixie> | lol i can't validate the spec using validator.w3.org, it crashes |
| 12:44 | <hsivonen> | the spec is a good stress test case |
| 12:47 | <Hixie> | ok annevk, i checked in a pubrules-compliant version |
| 12:47 | <Philip`> | Hixie: Might it be nice/possible for the 404 page to include link-fixup.js so that e.g. http://www.whatwg.org/specs/web-apps/current-work/multipage/outdated.html#introduction sends you to the right place? |
| 12:48 | <Hixie> | i'm about to go to bed |
| 12:48 | <Hixie> | mail me the details |
| 12:49 | Philip` | wonders why http://www.whatwg.org/specs/web-apps/current-work/multipage/stdout.txt is text/html |
| 12:49 | <Philip`> | Hixie: Okay, will do |
| 12:49 | <Hixie> | i have a default text/html mime type iirc |
| 12:49 | <Hixie> | feel free to include a .htaccess file in the bz2 file |
| 12:50 | <annevk> | and stuff that steals cookies :D |
| 12:51 | <Hixie> | he could do far worse than that |
| 12:51 | <Hixie> | he could put arbitrary binary code in there |
| 12:51 | <Philip`> | Hmm, I could use that to make everyone who views the multipage spec unknowingly vote for my favourite issues in http://www.whatwg.org/issues/ |
| 12:51 | <Hixie> | and then run it |
| 12:52 | <Hixie> | and steal my e-mail |
| 12:52 | <Philip`> | Stop giving me ideas :-( |
| 12:52 | <Hixie> | and anything on any of my sites |
| 12:52 | <Hixie> | you could just log into the database and set the votes to whatever you want :-) |
| 12:52 | <zcorpan> | doesn't <video> work on safari for windows? |
| 12:52 | <roc> | it works |
| 12:52 | <Hixie> | Philip`: of course if you did any of these things and got caught, it would probably not do your career much good :-) |
| 12:53 | <roc> | however, it only integrates with Quicktime |
| 12:53 | <Philip`> | Hixie: I'll have to make sure I don't get caught, then |
| 12:53 | <Hixie> | :-) |
| 12:53 | Hixie | sets up his script to automatically send him diffs of every update |
| 12:58 | <jwalden> | the XSS might make someone cross |
| 12:58 | <jwalden> | the ensuing spectacle would be quite a site |
| 12:58 | jwalden | decides not to try for the second S |
| 12:59 | <Hixie> | there |
| 12:59 | <Hixie> | philip can't run arbitrary code anymore |
| 12:59 | <Hixie> | i force the permissions to non-executable |
| 13:00 | <annevk> | so much for trust :p |
| 13:00 | <Hixie> | it's not really about trusting him |
| 13:01 | <Hixie> | so much as reducing the number of hosts that can be used as attack vectors into my site |
| 13:01 | Philip` | doesn't trust himself |
| 13:04 | annevk | wanted to thank jgraham for http://meyerweb.com/eric/thoughts/2008/06/02/the-missing-link/?#comment-382177 but he's not around |
| 13:07 | <Hixie> | nn |
| 13:26 | <hsivonen> | what's the <> tag called in SGML? |
| 13:27 | <hasather> | hsivonen: empty start-tag |
| 13:28 | <hsivonen> | hasather: thanks |
| 13:28 | <annevk> | but why? |
| 13:28 | <annevk> | nobody wants to be bothered with SGML phrases :o |
| 13:29 | <Lachy> | hsivonen, http://www.is-thought.co.uk/book/sgml-9.htm#SHORTTAG is useful if you ever need to lookup SGML stuff |
| 13:34 | <hsivonen> | annevk: I'm trying to improve error messages |
| 13:34 | <hsivonen> | annevk: so I thought I should say on <> and </> that SGML features foo and bar aren't permitted in HTML |
| 13:35 | <zcorpan> | hsivonen: i'm not sure that's an improvement over saying that < must be written as &lt; in html |
| 13:35 | <annevk> | I guess I'm glad I asked since I agree with zcorpan |
| 13:36 | <hsivonen> | ok |
| 13:36 | <annevk> | pretending that SGML still exists or that HTML is somehow related to it doesn't seem useful |
| 13:36 | <hsivonen> | ok |
| 13:36 | <hsivonen> | would you agree, though, that <? should say XML processing instructions aren't permitted? |
| 13:38 | <annevk> | yeah, that seems ok |
| 13:38 | <annevk> | XML exists and people are mixing it and HTML all the time (or at least pretending to do that) |
| 13:38 | <zcorpan> | indeed |
| 13:39 | <zcorpan> | <?xml ... ?> is common in text/html |
| 13:56 | <hsivonen> | I changed various parser error messages |
| 13:56 | <hsivonen> | please let me know if it was an improvement |
| 14:20 | <gsnedders> | virtuelv: Hixie wrote them, not me! :P |
| 14:21 | <virtuelv> | gsnedders: ? |
| 14:21 | <gsnedders> | virtuelv: <dfn><code>h1</code></dfn> |
| 14:21 | <virtuelv> | ah |
| 14:21 | <gsnedders> | virtuelv: The point is it is the defining instance of that code block |
| 14:23 | <zcorpan> | hsivonen: is there an easy way to check what the new messages are? :) |
| 14:23 | <hsivonen> | zcorpan: I'll run a diff |
| 14:28 | <annevk> | gsnedders, in theory specific handling for <hx>... <dfn> ... <dfn> ... </hx> would be nice |
| 14:29 | <gsnedders> | annevk: Specific in what way? |
| 14:29 | <annevk> | it would be nice if the <dfn> didn't get an id= in that case and references to the <dfn> would instead point to the <hx> |
| 14:32 | <gsnedders> | Ah. That would be nice. |
| 14:33 | gsnedders | currently just wants it to work right now :P |
| 14:34 | <gsnedders> | OK, I've just got running the spec-gen down from 137.302s to 16.986s on HTML 5 :P |
| 14:34 | <hsivonen> | zcorpan: http://pastebin.ca/1037720 |
| 14:38 | <zcorpan> | hsivonen: i think "unescaped logical not" goes over the heads of most authors since most authors don't know programming |
| 14:41 | <Philip`> | It seems pretty uncommon for "<>" to mean "logical not", since people don't write web pages in BASIC |
| 14:41 | <hsivonen> | should I mention only mistyped start tag? |
| 14:41 | <Philip`> | Oh, that's not logical not anyway, it's not-equal |
| 14:42 | <hsivonen> | umm right |
| 14:42 | <hsivonen> | I'll assume <> and </> are typoed tags |
| 14:43 | <annevk> | Probable cause: typo |
| 14:43 | <Philip`> | Almost all the <> I can see are in JS regexps |
| 14:43 | <annevk> | <> is acceptable within <script>, no? |
| 14:43 | <Philip`> | except |
| 14:43 | <Philip`> | http://geocities.com/scerez/: <td rowspan="7"><> |
| 14:43 | <Philip`> | http://members.aol.com/_ht_a/swinggaits/: <div align=left><font face='Arial,Helvetica,adobe-helvetica,Arial Narrow' size=3 color='#99cc33'><i> </i></font><font face='Arial,Helvetica,adobe-helvetica,Arial Narrow' size=3 color='#99cc33'><b>><><><><><><><><><><><><><><><><><><><><><><><><><><><</b></font><font face='Arial,Helvetica,adobe-helvetica,Arial Narrow' size=3 color='#99cc33'><i> </i></font></div> |
| 14:44 | <Philip`> | http://www.hakurai.ne.jp/: <param name="yradius" value="0"><> |
| 14:44 | <Philip`> | http://www.angelfire.com/in/HorseAndCarriageSo/: <><br> |
| 14:44 | <Philip`> | and so on |
| 14:44 | <annevk> | fun |
| 14:45 | <Philip`> | http://www.rad.de/: Auf rad.de bestimmt du selbst, was rad.de-Mitglieder lesen. Ob´s um Fahrräder, Touren, Tipps oder News geht <<>> die rad.de-Route ist alles andere als eine Einbahnstraße.</center> |
| 14:45 | <annevk> | hsivonen, maybe it should say "Use &lt;&gt; instead." |
| 14:45 | <Philip`> | I'm not sure how many are typos and how many are decorative |
| 14:45 | <hsivonen> | annevk: that's not good advice if the probable cause is a tag typo |
| 14:46 | <gsnedders> | http://hg.gsnedders.com/spec-gen/file/tip/specGen/utils.py#l65 — that gives "" for <dfn><code>foo</code></dfn> |
| 14:47 | <annevk> | hsivonen, true |
| 14:47 | <zcorpan> | hsivonen: "Either you typoed a tag or you should replace < with &lt;" |
| 14:47 | <zcorpan> | (or something) |
| 14:48 | <Philip`> | gsnedders: Do you mean repr(Element.getroottree()) gives ""? |
| 14:48 | <gsnedders> | Philip`: Line 65 |
| 14:48 | <gsnedders> | Something is wrong with it, obviously |
| 14:48 | <gsnedders> | Wait, I haven't reloaded that in ages |
| 14:49 | <gsnedders> | Line 70 |
| 14:49 | <gsnedders> | My XPath is broken :( |
| 14:49 | <hsivonen> | I wonder how I should deal with bugs blocking on spec issues |
| 14:49 | <hsivonen> | LATER seems like a bugzilla anti-pattern |
| 14:50 | <annevk> | gsnedders, also, the <hx> in question should probably use the first <dfn> child for creating the id in question (unless id is already set) |
| 14:50 | <gsnedders> | Oh duh. descendant::text() is what I want |
| 14:50 | <annevk> | gsnedders, so <h4>The <dfn><code>em</code></dfn> element</h4> ends up as <h4 id=em> |
| 14:50 | <zcorpan> | hsivonen: the text you have for </> would work for <> too, i think (with s/end/start/) |
| 14:50 | <Philip`> | gsnedders: child::text() returns text children, and <dfn> has no text children |
| 14:50 | <gsnedders> | annevk: That's hard to do without forward scanning, which I don't really want to do |
| 14:51 | <gsnedders> | Philip`: Yeah, I just realised looking at the spec |
| 14:51 | <Philip`> | That's why it's called "child::text" :-) |
| 14:51 | <annevk> | gsnedders, else it would be something like <h4 id=the-em> right? what's the big difference? |
| 14:51 | <gsnedders> | Philip`: Just the difference between direct and indirect children :) |
| 14:52 | <gsnedders> | annevk: I'd rather it just used the textContent of the hx |
| 14:52 | <Philip`> | gsnedders: etree.tostring(Elements, method='text') works (at least in lxml 2.0) at finding the concatenated descendant text |
| 14:52 | <hsivonen> | zcorpan: ok. tweaking again |
| 14:52 | <Philip`> | (or at least I hope it does, since that's what I'm using) |
| 14:54 | <gsnedders> | http://stuff.gsnedders.com/html5.html — find broken xrefs that I've introduced (again)! |
| 14:54 | <zcorpan> | hsivonen: a likely cause of " in attribute name is that there's a " missing somewhere else, i think |
| 14:55 | <hsivonen> | zcorpan: which state? |
| 14:55 | <hsivonen> | I mean: at start? surely not inside? |
| 14:56 | <zcorpan> | @@ -2187,7 +2187,7 @@ |
| 14:56 | <zcorpan> | * (') Parse error. |
| 14:56 | <zcorpan> | */ |
| 14:56 | <zcorpan> | err("Quote \u201C" + c |
| 14:56 | <zcorpan> | - + "\u201D in attribute name."); |
| 14:56 | <zcorpan> | + + "\u201D in attribute name. Probable cause: \u201C=\u201D missing immediately before."); |
| 14:57 | <zcorpan> | the diff doesn't give much context :( |
| 14:57 | <zcorpan> | <p title=my title"> |
| 14:58 | <hsivonen> | zcorpan: If an attribute name starts with ", isn't the likely cause something like: name "value"? |
| 14:58 | <hsivonen> | zcorpan: that's a different state |
| 14:58 | <zcorpan> | hsivonen: ah |
| 14:58 | <hsivonen> | I don't remember what I wrote |
| 14:59 | <hsivonen> | oops. you are right. |
| 14:59 | <hsivonen> | my message is bogus |
| 14:59 | <hsivonen> | thanks |
| 15:00 | <zcorpan> | hsivonen: hmm. <!DOCTYPE html><title></title><embed src "> |
| 15:00 | <zcorpan> | gives no errors |
| 15:02 | <zcorpan> | or had i reported that before? |
| 15:02 | <annevk> | i think you did |
| 15:02 | <zcorpan> | yeah. sorry for the noise :) |
| 15:02 | <annevk> | it seems that " is not caught on the tokenizing level which is why it is not caught |
| 15:02 | <annevk> | (you can figure that out by looking at <p title "> |
| 15:02 | <hsivonen> | zcorpan: It's a spec bug |
| 15:03 | <annevk> | ) |
| 15:05 | hsivonen | has still 50 bytes to spare in the tokenizer loop |
| 15:08 | <gsnedders> | Hixie: What should the target time to run the spec-gen on HTML 5 be? |
| 15:09 | <Philip`> | gsnedders: Zero seconds |
| 15:09 | <gsnedders> | Philip`: html5lib takes 12s to parse it alone! :P |
| 15:10 | <gsnedders> | Philip`: (thanks for all the work you've done on perf. recently, though) |
| 15:11 | <Philip`> | gsnedders: It's not necessarily a reachable target, but it's the best one to aim at :-) |
| 15:11 | <takkaria> | I was reading about how python implements dictionaries internally last night, quite interesting from a performance point of view |
| 15:12 | <zcorpan> | hsivonen: isn't it better to write out the full doctype in the missing SI warnings? |
| 15:13 | <gsnedders> | It's so much nicer to work on something when you can actually see the results of what you've done :) |
| 15:13 | <zcorpan> | hsivonen: as in "The doctype did not contain the system identifier prescribed by the HTML 4.01 specification. Expected \u201C<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\" \"http://www.w3.org/TR/html4/loose.dtd\">\u201D." |
| 15:15 | <hsivonen> | zcorpan: I suppose. thanks |
| 15:16 | <zcorpan> | hsivonen: otherwise it looks good i think |
| 15:17 | <hsivonen> | zcorpan: thanks |
| 15:17 | <annevk> | gsnedders, less than a second would be nice |
| 15:17 | <gsnedders> | annevk: Give me chtml5lib first :) |
| 15:18 | <gsnedders> | I think 1.5s is realistically just about as quick as we'll get with current computers with chtml5lib to be fair |
| 15:21 | <Philip`> | chtml5lib needs to be multithreaded! |
| 15:22 | <takkaria> | I'm not sure there's much to parallelise |
| 15:22 | <annevk> | yeah, so it can do tokenizing before having an input stream and building a tree while downloading a resource |
| 15:22 | <takkaria> | I think my sense of humour was lacking there. oops |
| 15:23 | <hsivonen> | takkaria: if tokens were objects (as opposed to callbacks), the tokenizer and the tree builder could run on different threads |
| 15:23 | <hsivonen> | probably not useful |
| 15:24 | <takkaria> | (btw, hubbub (incomplete C html parser) takes 1.7s just to tokenise the current spec) |
| 15:24 | <hsivonen> | with Java and the V.nu parser it would be relatively easy to try two-threaded performance |
| 15:25 | <hsivonen> | zcorpan: fixed |
| 15:25 | <jmb> | takkaria: most of that's probably in the input stream handling. |
| 15:25 | <hsivonen> | Philip`: fixed image report on fatal error |
| 15:26 | <gsnedders> | takkaria: Well, WebKit is quicker than that :P |
| 15:27 | Philip` | gets another ~5% from improving cInputStream |
| 15:27 | gsnedders | fears the commit when this lands |
| 15:27 | <Philip`> | so it now parses the spec in about 12.6s, vs 16.5s for pure Python |
| 15:28 | <jmb> | s/probably// |
| 15:28 | <takkaria> | heh |
| 15:29 | <hsivonen> | Philip`: from what data source? (disk, RAM?) |
| 15:29 | <Philip`> | takkaria: Via http://krijnhoetmer.nl/irc-logs/html-wg/20070710#l-239 I got 0.35s for tokenising in C++ |
| 15:29 | <Philip`> | (on a slightly old version of the spec) |
| 15:29 | <Philip`> | (on a P4 3.0GHz, I think) |
| 15:29 | annevk | wonders how long Validator.nu takes |
| 15:30 | <Philip`> | annevk: That's the "Java" one in those results |
| 15:30 | hsivonen | prepares to run a benchmark |
| 15:30 | <Philip`> | hsivonen: Disk (but cached) |
| 15:31 | <hsivonen> | Philip`: crossing to the native file system seems expensive in Java |
| 15:31 | <Philip`> | hsivonen: It's relatively inexpensive in Python, just because the rest of the parser is a hundred times more expensive :-) |
| 15:31 | <annevk> | Philip`, how about parsing? |
| 15:32 | <annevk> | for validating I get "Total execution time 23573 milliseconds." |
| 15:32 | <Philip`> | annevk: I've never had a working C++ parser so I've never had anything interesting to compare |
| 15:33 | <Philip`> | cInputStream seems kind of worthwhile for performance - does anyone know what would be the nicest way to try including it in html5lib? |
| 15:34 | <Philip`> | (It just defines a class with char() and charsUntil(), and you stick it together with the normal HTMLInputStream (currently via inheritance) and then it goes faster) |
| 15:34 | <Philip`> | (or it crashes) |
| 15:35 | <gsnedders> | annevk: http://bugs.gsnedders.com/issues/show/5 — happy? :) |
| 15:37 | <annevk> | ah, feature request 5, now you have to fix it :p |
| 15:37 | <annevk> | (or maybe it's that it will never be finished :D ) |
| 15:38 | <hsivonen> | Parsing the spec from an in-memory UTF-16 buffer (but using all the Reader cruft in between) takes 0.12 seconds on average in the true streaming mode with a content handler that doesn't do anything |
| 15:38 | <gsnedders> | annevk: Yeah, but it's only down for 1.1 :P |
| 15:38 | <gsnedders> | annevk: Though there are no open issues on 1.0 :D |
| 15:39 | <gsnedders> | (1.0 is mainly just finish it) |
| 15:40 | <hsivonen> | I'm too lazy to test how much faster the parser would be without a Reader in there |
| 15:41 | <Philip`> | Almost exactly two orders of magnitude faster than Python |
| 15:41 | <hsivonen> | oh and this was HotSpot from Java 5 32 bit client on 2.4 GHz Core 2 Duo |
| 15:41 | <takkaria> | we should record benchmarks somewhere, probably |
| 15:42 | Philip` | could only get another 10% if he made char and charsUntil take zero time |
| 15:44 | gsnedders | wonders how to implement the TOC algorithm |
| 15:47 | gsnedders | adds something for 1.0 |
| 15:49 | <hsivonen> | in case anyone feels like running better benchmarks, please use V.nu parser trunk instead of release |
| 16:57 | <Philip`> | Manual reference counting in C is not fun |
| 16:59 | <annevk> | fun vs fast |
| 16:59 | <Philip`> | It's not particularly fast either :-) |
| 16:59 | <annevk> | :p |
| 17:00 | <Philip`> | The fastest approach is to just leak memory |
| 17:00 | <Philip`> | and the second fastest approach is to just leak memory and then scramble around to detect all the leaks once you're about to run out |
| 17:00 | <Philip`> | *run out of memory |
| 17:03 | <hsivonen> | the autoreleasepool solution is pretty cool (out of solutions that don't involve a proper garbage collector) |
| 17:03 | <annevk> | Philip`, OOM-safe should probably be a requirement |
| 17:04 | <hsivonen> | (the Mozilla 2 OOM approach looks interesting) |
| 17:06 | <Philip`> | if (!(m = malloc(size))) alert("Buy more RAM") |
| 17:27 | hsivonen | tries to understand what the W3C AB does |
| 17:28 | <hsivonen> | are there examples of past cases where AB has given the W3C an opinion on something? |
| 17:29 | <annevk> | see November/December 2007 in http://lists.w3.org/Archives/Member/process-issues/#latest |
| 17:29 | <annevk> | (minus #latest) |
| 17:30 | <hsivonen> | annevk: thanks |
| 18:40 | Philip` | gives up trying to optimise cInputStream, since there's not a lot left to gain :-( |
| 18:40 | <zcorpan_> | why is pixelratio='' only on <source>, not on <video>? |
| 20:35 | <hsivonen> | whoa! according to Simon Willison, libfbml depends on Firefox |
| 20:37 | <hsivonen> | hmm. http://wiki.developers.facebook.com/index.php/FBML_DTD |
| 20:37 | <hsivonen> | XSD is the new DTD |
| 21:00 | <zcorpan_> | hsivonen: is utf-8 assumed before the encoding decl is found? |
| 21:01 | zcorpan_ | looks at http://validator.nu/?doc=http%3A%2F%2Fwww.accessify.com%2F&showsource=yes |
| 21:06 | <hsivonen> | zcorpan_: windows-1252 is assumed |
| 21:07 | <zcorpan_> | hsivonen: ok |
| 21:07 | <zcorpan_> | hsivonen: what's up with "Stray end tag noscript." in http://validator.nu/?doc=http%3A%2F%2Fwww.accessify.com%2F&charset=UTF-8&showsource=yes ? |
| 21:08 | <hsivonen> | interesting. |
| 21:08 | <hsivonen> | I'll have to investigate in the morning |
| 21:08 | <hsivonen> | thanks |
| 21:08 | hsivonen | notes the alt on the stat single-pixel image on an accessibility site... |
| 21:09 | <hsivonen> | nn |
| 21:09 | <zcorpan_> | nn |
| 21:22 | <Dashiva> | annevk, zcorpan_: around? |
| 21:23 | <zcorpan_> | Dashiva: yep |
| 21:24 | <Dashiva> | I just noticed <a href> elements tostring as their href, has that always been the case? |
| 21:24 | <zcorpan_> | yeah |
| 21:25 | <zcorpan_> | dunno if it's specced anywhere |
| 21:25 | <Dashiva> | Strange how I've never noticed before |
| 21:25 | <Dashiva> | That was my second question :) |
| 21:25 | <zcorpan_> | i think html5 is waiting for webidl to get fancy features to support this sort of thing |
| 21:26 | <Hixie> | html5 does spec it |
| 21:26 | <zcorpan_> | oh |
| 21:26 | <Hixie> | btw zcorpan_ regarding your earlier question, it's on <source> only to discourage its use. |
| 21:26 | <zcorpan_> | [Stringifies=href] interface HTMLAnchorElement |
| 21:26 | <zcorpan_> | Hixie: ok |
| 21:27 | <gsnedders> | 0.924s for the spec-gen (admittedly, with only xref) with XML source/output of the HTML 5 spec (which is the sort of speed we should be able to get with a C impl. of html5lib) |
| 21:27 | <gsnedders> | Hixie: What sort of speed do you want the spec-gen to run in, at the very slowest? |
| 21:28 | zcorpan_ | 'd say 0.924s |
| 21:28 | <Hixie> | i WANT it to |
| 21:28 | <Hixie> | silly cat |
| 21:28 | <gsnedders> | Hixie: You see my latest tweet? |
| 21:28 | <Hixie> | i WANT it to run in 500ms or so |
| 21:29 | <gsnedders> | "I HAS SPECIFICASHUN GENERATOR! " |
| 21:29 | <Hixie> | but i'll put up with whatever i can get :-) |
| 21:29 | <gsnedders> | Hixie: How quick is the W3C spec-gen? |
| 21:29 | Hixie | checks |
| 21:30 | <Hixie> | (and no, i don't follow twitter) |
| 21:30 | <gsnedders> | (I expected that) |
| 21:30 | <gsnedders> | (Which is why I quoted it here) |
| 21:31 | <Hixie> | timing... |
| 21:31 | <Hixie> | 14 seconds |
| 21:31 | <Hixie> | an eternity |
| 21:32 | <Hixie> | well, 28 seconds since i do it twice, but i run them in parallel at this point |
| 21:32 | <Philip`> | My partly-optimised-in-C html5lib takes, uh, 12.3 seconds to parse the spec :-( |
| 21:32 | <gsnedders> | OK, if someone writes a fully-optimised-in-C html5lib, then we can be under 3s, I expect, without much more optimisation of the spec-gen itself |
| 21:34 | <Philip`> | I kind of like the approach of rewriting small pieces of the Python functionality into a C extension, but keeping all the complex rarely-called fast-enough bits in Python |
| 21:34 | <Philip`> | but I'm not sure how far that approach could be used in html5lib |
| 21:34 | gsnedders | wonders whether to switch to doing AH Computing from AH English just so he can use the AH Computing project to learn C and write an html5lib parser :P |
| 21:36 | <Philip`> | (Probably the next step would be to replace the entire tokeniser class with a pure C implementation, but the treebuilders are still kind of slow too) |
| 21:38 | <gsnedders> | 4.4s to serialize, 11.3s to parse here |
| 21:38 | <gsnedders> | leaving the other 0.8s for me to waste myself :) |
| 21:40 | <gsnedders> | Philip`: "Cue ranges" is missing a </dl> because my Feb 13th source is missing one. Damned Hixie! |
| 21:44 | <gsnedders> | Most of the bugs are down to Hixie making my input buggy :( |
| 21:45 | <Philip`> | Most of the bugs are down to you not using an up-to-date copy of the spec :-p |
| 21:45 | <gsnedders> | Philip`: If someone hadn't broken it in the first place… :P |
| 21:46 | <Hixie> | in february i had something like 3000 known bugs with the spec |
| 21:46 | <Hixie> | i still have over 2000 now! |
| 21:46 | <gsnedders> | whatwg.org is down, seemingly :( |
| 21:46 | <Hixie> | yeah |
| 21:46 | <Hixie> | not sure why |
| 21:46 | Philip` | looks innocent |
| 21:46 | <gsnedders> | Hixie: I mean bugs like <a href="#appcache-history-1">change</span> (in the source) |
| 21:53 | gsnedders | follows Philip`'s suggestion to get rid of the last bit of XPath, making it .3s quicker |
| 21:55 | <Philip`> | I never suggested getting rid of the last bit of XPath |
| 21:55 | <Philip`> | I actually suggested the exact opposite |
| 21:55 | <gsnedders> | pedantic sod :) |
| 21:55 | <gsnedders> | http://hg.gsnedders.com/hgwebdir.cgi/spec-gen/rev/1a00bbac777e |
| 21:55 | <Philip`> | (keeping it to find all elements with ids) |
| 21:56 | <gsnedders> | Did you? I forgot that. But that's already gone. :P |
| 21:59 | <gsnedders> | Philip`: 0.718s (for //*[@id]) v. 0.185s (for iter over all elements, and manually checking if they have an @id) |
| 22:07 | <gsnedders> | Hixie: can I still `cat header-whatwg source` and just ignore the *.inc files? |
| 22:09 | <gsnedders> | Hixie: Can I get you to fix a rather major bug in the spec source, that the real spec-gen silently changes? |
| 22:10 | <gsnedders> | '<a href="#appcache-history-1">change</span>' ought to have a closing </a>, not a closing </span> |
| 22:10 | <annevk> | oh, whatwg.org is back up |
| 22:10 | <gsnedders> | oh wait |
| 22:10 | <gsnedders> | I'm being dumb |
| 22:10 | annevk | thought it was down from the logs and hesitated to check |
| 22:11 | <gsnedders> | I'm just looking at html5.old, my old copy |
| 22:11 | <gsnedders> | Hixie: Do nothing. |
| 22:11 | <Hixie> | i can do that |
| 22:12 | <gsnedders> | Hixie: BTW, you've made it run slower :( |
| 22:12 | <gsnedders> | Hixie: Though that was just by making the spec longer. |
| 22:12 | <Hixie> | heh |
| 22:13 | <gsnedders> | Hixie: Less detail, plz |
| 22:13 | <gsnedders> | Oh dear. I don't think this is good. id="prose-content." |
| 22:14 | <annevk> | Hixie, hey, we managed to publish something today :) |
| 22:14 | <annevk> | Hixie, http://www.w3.org/TR/offline-webapps/ |
| 22:14 | <gsnedders> | http://validator.nu/?=&doc=http%3A%2F%2Fstuff.gsnedders.com%2Fhtml5.html — they're all down to html5lib's brokenness, as far as I can see |
| 22:15 | <annevk> | given that I don't see myself doing anything else useful today I'll write a short blog entry for blog.whatwg.org |
| 22:16 | <jgraham_> | Oh, blogging. Yeah I was going to write something about @media. I wonder if Lachy put the slides up yet |
| 22:17 | <jgraham_> | gsnedders: That looks simple to fix. Patches welcome :) |
| 22:17 | <annevk> | Lachy, you around? Can you make me admin or something on the blog? |
| 22:17 | jgraham_ | could just not be lazy and fix it himself |
| 22:17 | <gsnedders> | jgraham_: Or just commits :) |
| 22:17 | <annevk> | Lachy, I can't even add a category |
| 22:18 | <jgraham_> | annevk: I can do that, I think |
| 22:18 | <annevk> | cool |
| 22:18 | <gsnedders> | I think I managed to break stuff, which is odd. |
| 22:19 | <jgraham_> | Er, I don't seem to be able to log in at the moment |
| 22:20 | <gsnedders> | jgraham_: We can allow < in unquoted attributes too |
| 22:21 | <gsnedders> | jgraham_: Can I change a test without being killed? |
| 22:21 | <jgraham_> | gsnedders: Which test? And why? |
| 22:21 | <gsnedders> | jgraham_: Change: u'<span title="foo<bar">' to u'<span title=foo<bar>' |
| 22:21 | <gsnedders> | jgraham_: < is allowed in unquoted attributes |
| 22:22 | <jgraham_> | That sounds reasonable |
| 22:22 | jgraham_ | is not too familiar with the serializer tests |
| 22:23 | gsnedders | isn't either. |
| 22:23 | <gsnedders> | I just changed the code to match the spec and looked at what tests it caused to fail :) |
| 22:24 | <jgraham_> | When I try to log in to the WHATWG blog it tells me that I got my password wrong. When I try to reset my password, it tells me that the link it sent me has an invalid key |
| 22:24 | <Hixie> | uh |
| 22:24 | <Hixie> | did we get owned or something |
| 22:25 | <annevk> | it worked for me, but I don't have an admin account |
| 22:25 | gsnedders | commits to html5lib |
| 22:25 | <jgraham_> | (it is possible that I really got my password wrong of course) |
| 22:26 | <jgraham_> | (but the reset thing should work) |
| 22:27 | <Hixie> | man the wordpress ui changed since the last time i was here |
| 22:27 | <Hixie> | annevk: made you admin |
| 22:27 | <Hixie> | annevk: can you work how jgraham_'s problem? :-) |
| 22:27 | <Hixie> | work out, even |
| 22:28 | <annevk> | hehe |
| 22:28 | <annevk> | lots of new features |
| 22:28 | <gsnedders> | http://stuff.gsnedders.com/html5.html — now valid! |
| 22:28 | <annevk> | jgraham_, i'll pm you on w3.org |
| 22:29 | <Hixie> | oh i see how to reset the password |
| 22:29 | <Hixie> | i can reset it right here if you want jgraham_ |
| 22:29 | <Hixie> | or anne can do it |
| 22:29 | <Hixie> | whichever :- |
| 22:29 | <Hixie> | ) |
| 22:29 | <annevk> | i just did |
| 22:29 | <Hixie> | cool |
| 22:30 | <Philip`> | gsnedders: When making changes like allowing =, it would be good to add a test that fails in the old code and passes in the new code |
| 22:30 | <gsnedders> | Philip`: That is true, actually |
| 22:31 | <gsnedders> | Philip`: And fixing the description for the change |
| 22:31 | <gsnedders> | jgraham_: Should I change the order to make it more logical, or keep it verbatim? |
| 22:32 | <jgraham_> | gsnedders: Of the tests? Change the order. Our tests should really be sorted into some sort of structure so tiny baby steps can't hurt |
| 22:33 | gsnedders | commits |
| 22:39 | <Hixie> | Philip`: k, the 404 thing should work now, let me know if it is broken |
| 22:40 | <Philip`> | Hixie: Oops |
| 22:40 | <Philip`> | Hixie: It needs <body onload="fixBrokenLink()"> |
| 22:40 | Philip` | forgot about that part |
| 22:41 | <annevk> | http://blog.whatwg.org/offline-webapps |
| 22:41 | <Hixie> | try now? |
| 22:42 | <Philip`> | Hixie: Works now - thanks! |
| 22:42 | <Hixie> | cool |
| 22:42 | <Hixie> | it runs a couple of sed commands on the original whatwg 404 page |
| 22:42 | <Hixie> | and creates the .htaccess file to use it |
| 22:42 | <Hixie> | so hopefully if i modify the main whatwg 404 page, it'll keep track |
| 22:42 | <Philip`> | Hmm, doesn't work in IE because IE replaces the whole error page |
| 22:43 | <Hixie> | sucks to be an IE user |
| 22:43 | <Hixie> | bbiab |
| 22:44 | <Philip`> | Oh, real IE7 (vs Wine IE6) does display the error page |
| 22:44 | <Philip`> | but has a "permission denied" JS error while running the script :-/ |
| 22:45 | <Philip`> | Oh, it worked this time |
| 22:46 | <Philip`> | That's good enough for me |
| 22:48 | Philip` | will ignore the problem that old links to .../multipage/section-foo.html (with no #) are broken and don't automatically fix themselves |
| 22:53 | <annevk> | Philip`, that should be fixable too without too much trouble, no? |
| 22:53 | <annevk> | Philip`, given that the file names are based on id= too |
| 22:54 | <Philip`> | annevk: Hmm... I suppose if I knew when it was 404, then I could do that |
| 22:55 | <Philip`> | Hixie: Would it be possible to change to <body onload="fixBrokenLink(404)"> ? |
| 22:58 | <Philip`> | Hixie: (By the way, I've added the do-pubrules-update call now) |
| 23:13 | annevk | adds <p> and </p> to his blog entry |
| 23:31 | <annevk> | ARIA expressed in CSS? Hmm. I guess I better not reply to that as I might offend someone... |
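
The `child::text()` versus `descendant::text()` mix-up discussed between 14:48 and 14:52 can be illustrated with a minimal sketch. It uses Python's standard-library `xml.etree.ElementTree` rather than lxml, which the discussion mentions. The helper names `direct_text` and `descendant_text` and the sample heading are assumptions for illustration, not code taken from spec-gen.

```python
import xml.etree.ElementTree as ET

# Sample heading modelled on the <h4> example from the log (hypothetical input).
heading = ET.fromstring("<h4>The <dfn><code>em</code></dfn> element</h4>")
dfn = heading.find("dfn")


def direct_text(element):
    """Concatenate only the text nodes that are direct children of element.

    Roughly what child::text() selects: the element's leading text plus the
    tail text that follows each child element.
    """
    parts = [element.text or ""]
    parts.extend(child.tail or "" for child in element)
    return "".join(parts)


def descendant_text(element):
    """Concatenate every text node below element, in document order.

    Roughly what descendant::text() selects.
    """
    return "".join(element.itertext())


# <dfn> contains only a <code> element, so it has no direct text children.
print(repr(direct_text(dfn)))
# The text inside <code> is reachable only through the descendant axis.
print(repr(descendant_text(dfn)))
# For the heading, the descendant text is its full textContent.
print(repr(descendant_text(heading)))
```

Under these assumptions, `direct_text(dfn)` is empty, which matches the `""` result reported for `<dfn><code>foo</code></dfn>`, while `descendant_text` recovers the nested text.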