| 00:30 | <Hixie_> | mjs, others, any input on this? http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2007-October/012683.html |
| 00:30 | <Hixie_> | should i include such an API? |
| 00:32 | <othermaciej> | Hixie_: hadn't thought about it deeply - I can see how it might be useful, the only concern I have is that preflighting for the availability of a resource in the cache instead of looking for an error on access introduces possible races |
| 00:32 | <othermaciej> | (in other words the cache item could expire between when you ask and when you try to access it) |
| 00:32 | <Hixie_> | indeed |
| 00:33 | <Hixie_> | and vice versa, could be added in between |
| 00:33 | <othermaciej> | true |
| 00:33 | <othermaciej> | or you could go offline/online in between, thus changing the rules of what resources can be served even expired |
| 00:33 | <othermaciej> | but I'm not sure how to achieve the desired result without preflighting |
| 00:34 | <Hixie_> | yeah |
| 00:35 | <kingryan> | could the API take a function that is executed iff the resource is available? then the impl could synchronize on the cache |
| 00:35 | <Hixie_> | maybe |
| 00:35 | kingryan | doesn't know how hard his suggestion would be to implement |
| 00:37 | <othermaciej> | the hard part to implement would be ensuring whatever load you may trigger uses that cached copy |
| 00:37 | <othermaciej> | it could be setting a frame's window.location, it could be setting an img src, it could be an XMLHttpRequest... |
| 00:37 | <othermaciej> | there's lots of ways to trigger a resource load |
| 00:37 | <othermaciej> | in a sense the most robust API would be to add a "try if local, otherwise error" version of all of those |
| 00:38 | <othermaciej> | or at least the ones that are considered important for this purpose |
| 08:14 | <Lachy> | good morning :-) |
| 08:29 | <hsivonen> | Lachy: morning |
| 09:34 | <hsivonen> | I wonder what the point of the NCNameness of xml:id is or the point of the normalization |
| 09:36 | <othermaciej> | architecture |
| 09:36 | <othermaciej> | *xml* architecture |
| 09:39 | <hsivonen> | off-hand, I don't see what architectural problem the NCNameness solves here |
| 09:40 | <hsivonen> | it's just an arbitrary "we like this kind of strings but not other kinds of strings" |
| 09:40 | <othermaciej> | my implication is that everything about xml boils down to that sort of thing |
| 09:41 | <hsivonen> | yeah |
| 09:42 | <hsivonen> | so far, xml:id is taking me more than thrice the # of lines of code needed for XHTML5 ids. |
| 09:46 | <hsivonen> | for now, I'm going to pretend that the conditional IDness of SVG 1.2 id doesn't exist and SVG ids are unconditionally IDs. |
| 09:47 | <othermaciej> | conditional idness? |
| 09:48 | <othermaciej> | are id attributes not ID when an xml:id is also present or something? |
| 09:48 | <hsivonen> | othermaciej: according to SVG 1.2, IIRC, yes |
| 09:48 | <hsivonen> | not cool |
| 09:48 | <othermaciej> | they probably did that to work around the fact that you can't have id="x" xml:id="x" |
| 09:49 | <othermaciej> | which you would need to both work with older versions of SVG and to drink the latest kool-aid |
| 09:51 | <hsivonen> | othermaciej: it is easy to guess why they did it, but it is the wrong fix |
| 09:51 | <hsivonen> | othermaciej: the right fix is to use only id='x' without xml:id='x' |
| 09:51 | <hsivonen> | anyway, last bullet point under http://www.w3.org/TR/SVGMobile12/struct.html#Core.attrib |
| 09:51 | <othermaciej> | hsivonen: but that would entail not using xml:id, which is the cool new thing, and thus obviously right to use |
| 09:51 | <hsivonen> | so my memory didn't fail me |
| 09:55 | <zcorpan> | hsivonen: would you be shot down for not supporting xml:id at all? :) |
| 10:05 | <hsivonen> | zcorpan: I don't know |
| 10:05 | <hsivonen> | I feel like venting on www-svg, but it would probably be wasted effort |
| 10:43 | <hsivonen> | I figured that now that the idea of href='' instead of xlink:href='' is no longer outrageous, I might as well try suggesting not working around the problems xml:id creates |
| 10:46 | <OmegaJunior> | Where would href i.o. xlink:href be outrageous? Not in html5, I suppose. In xhtml5 perhaps? |
| 10:48 | <hsivonen> | OmegaJunior: SVG |
| 10:48 | <OmegaJunior> | Ah |
| 10:57 | <hsivonen> | Hixie_: is there a good reason why ID assignment shouldn't happen if the ID candidate is the empty string? why the exception? |
| 11:22 | <zcorpan> | http://software.hixie.ch/utilities/js/live-dom-viewer/?%3Cp%20id%3D%22%22%3E%3Cscript%3Ew(document.getElementById(%22%22))%3C%2Fscript%3E |
| 11:22 | <zcorpan> | null in ie, firefox, safari |
| 11:23 | <hsivonen> | zcorpan: ok. thanks |
| 11:44 | <Lachy> | Hixie_, yt? |
| 11:45 | <Lachy> | Hixie_, the data URI kitchen is broken http://software.hixie.ch/utilities/cgi/data/data |
| 11:46 | <Lachy> | Hixie_, it seems to only work for file uploads, not text input or http URIs |
| 11:59 | <jwalden> | nice, roc followed up on isLocallyAvailable so I don't have to feel obligated to do so :-) |
| 13:36 | <zcorpan> | was something interesting said during the telecon? reading the log it seemed pretty hollow |
| 13:39 | <hsivonen> | I suppose attributes that have IDness should have IDness and remain in the infoset even if the value is "". so all code everywhere that is ID-sensitive needs to check both for IDness and the value emptystringness :-( |
| 13:42 | <zcorpan> | why do you suppose so? |
| 13:43 | <hsivonen> | zcorpan: it seems wrong to change the IDness based on value. moreover, I'm not sure if non-validating DTD processing could inject ""-valued IDs into the pipeline |
| 13:44 | <zcorpan> | ok |
| 13:45 | <zcorpan> | then html5 shouldn't say that empty id="" doesn't do ID assignment, but dom core should say that the empty string as argument to getElementById() should return null |
| 13:46 | <zcorpan> | correct? |
| 13:55 | <hsivonen> | zcorpan: depends on whether ID assignment and IDness assignment mean different things |
| 13:55 | <hsivonen> | IDness assignment clearly means that querying the attribute for its type returns "ID" |
| 13:56 | <hsivonen> | I'm not sure what ID assigment exactly means but I understood it to mean: put in a hashtable with the ID value as the key |
| 13:56 | hsivonen | keeps mistyping assignment over and over |
| 14:25 | <zcorpan> | i should write an xml-stylesheet spec that doesn't have fatal error requirements |
| 14:26 | <zcorpan> | which would also benefit xbl2 |
| 14:51 | <zcorpan> | i wonder how i should approach that... |
| 14:52 | <zcorpan> | i mean, to make progress, i should just go ahead and reverse engineer relevant implementations, write test cases and a spec, then ask for feedback |
| 14:53 | <zcorpan> | but politically that might not be the best way to do it |
| 14:55 | <zcorpan> | perhaps i should start out with reverse engineering and demos, and point out the problems to the relevant WG(s), and if i don't get a response then i go ahead and write a spec and then ask for feedback |
| 15:02 | <zcorpan> | any thoughts? which are the relevant WGs for xml-stylesheet? |
| 15:04 | <heycam> | zcorpan, xml core wg seems most appropriate |
| 15:04 | <zcorpan> | heycam: ok |
| 15:05 | <heycam> | on http://www.w3.org/XML/Group/Core they list working on that document again as a "future task" |
| 15:07 | heycam | sleeps |
| 15:16 | <hsivonen> | zcorpan: fwiw, the idea that XML Core WG has about "relevant implementations" is likely to be radically different from yours |
| 15:32 | <ROBOd> | hello guys! pardon me barging into the discussion. i have one quick question: did microsoft (via chris wilson?) publish the complete html5 review? iirc, it was supposed they'll publish a review. i haven't seen it yet |
| 15:33 | <Philip`> | They haven't done, though ChrisW mentioned it yesterday |
| 15:34 | <Philip`> | 00:29 < DanC> ChrisW: yes, I'm working with the IE team on our review... |
| 15:34 | <Philip`> | 00:30 < DanC> ... I'll have some stuff sent out prior to the ftf meeting. |
| 15:35 | <ROBOd> | aha, thanks Philip` |
| 15:37 | <zcorpan> | hsivonen: i'd like to learn which implementations they consider relevant |
| 16:47 | <zcorpan> | http://blogs.s60.com/browser/2007/10/coring_the_browser.html -- hmm, so they support SVG via plugin instead of using WebCore's native support? (remember that they parse xml with the html parser) |
| 16:50 | <zcorpan> | "The difference with browsing is that HTML, CSS and ECMAScript create a really complex system, and the standards can never exactly specify the "correct" behavior in every case." |
| 16:51 | <zcorpan> | oh? |
| 19:32 | <kingryan> | is thomas broyer around here? |
| 19:56 | <Hixie_> | Lachy: odd |
| 19:56 | <Hixie_> | oh i know why |
| 19:56 | <Hixie_> | issues with the content-type sanitation |
| 19:58 | <Hixie_> | fixed |
| 21:22 | <Vito`> | hello, we're using html5lib to generate plain text from archived html pages. we're finding html5lib bombs out with maximum recursion errors on some pages. |
| 21:22 | <kingryan> | Vito`: please give an example |
| 21:24 | <Vito`> | http://mavra.perilith.com/~vito/html5lib/2002-0919-120019.html |
| 21:25 | <Vito`> | this is the python html5lib, and we're just doing parser=html5lib.HTMLParser();dom=parser.parse(filecontents); |
| 21:25 | <kingryan> | backtrace? |
| 21:25 | <Vito`> | maximum recursion depth reached, or some such |
| 21:26 | <Vito`> | the backtrace is accordingly > 1000 lines |
| 21:26 | <kingryan> | that'd be the error message. do you have backtrace? |
| 21:26 | <kingryan> | ah |
| 21:26 | <kingryan> | can you at least give us an idea of where the recursion is happening? |
| 21:26 | <Vito`> | lines 273 and 866 are repeated |
| 21:26 | <kingryan> | of what file? |
| 21:27 | <kingryan> | and which version of html5lib are you using? |
| 21:27 | <Vito`> | html5parser.py, in endTagHtml, self.parser.phase.processEndTag(name) and self.endTagHandler[name](name) |
| 21:28 | <kingryan> | I can't reproduce the error in trunk in either python or ruby |
| 21:28 | <Vito`> | hm |
| 21:29 | <kingryan> | which version are you using? |
| 21:29 | <Philip`> | I get no error in the Python trunk version either |
| 21:29 | <Vito`> | trying to find out |
| 21:29 | <Philip`> | and http://james.html5.org/cgi-bin/parsetree/parsetree.py?uri=http%3A%2F%2Fmavra.perilith.com%2F%7Evito%2Fhtml5lib%2F2002-0919-120019.html looks alright |
| 21:44 | <Vito`> | I thought it was the latest, but I guess we're running 0.9 or something. Installing 0.10 locally got the first batch of failures passing. |
| 21:44 | <Vito`> | I'll let you know if anything new comes up. Thanks for the sanity check. |
| 21:51 | <jgraham> | Hmm. In principle there are places where html5lib could have problems with recursion as it uses recursive algorithms in some places where iterative ones could be used |
| 21:51 | <jgraham> | But AFAIK there was at least one infinite loop bug fixed since 0.9 |
| 21:52 | <Vito`> | we have a bit over 23k archived pages and we were hitting something pretty frequently |
| 21:52 | <Vito`> | but it's all OKs so far with 0.10 |
| 21:57 | <Philip`> | If you're parsing lots of pages, it may be worth looking at hsivonen's Java HTML5 parser since it's around a hundred times faster than the Python one |
| 21:59 | <Vito`> | alright, I've an AssertionError using 0.10. Should I try with trunk or are they the same? |
| 22:01 | <kingryan> | they're mostly the same |
| 22:03 | <Vito`> | happens with trunk as well |
| 22:04 | <Vito`> | http://mavra.perilith.com/~vito/html5lib/2003-0701-120001.html |
| 22:04 | <Vito`> | I can save these and stuff them all into the issue tracker as well, of course |
| 22:04 | <Vito`> | given that many more are passing than failing now |
| 22:28 | <jgraham> | Vito`: That looks like a recent regression. I'll have to investigate further |
| 22:29 | <Vito`> | I've had a handful of those now, plus one with an encoding error. I'll just put them all in the tracker. |
| 22:30 | <jgraham> | (in the meantime you can try removing the assertion that fires; I _think_ the only bad side effect is that the source position reported for errors might be wrong) |
| 22:30 | <jgraham> | Thanks |
| 22:32 | <jgraham> | Vito`: Add me as the owner for the bugs you file (jgraham.html) |
| 22:33 | <Vito`> | k |
| 22:40 | <Hixie_> | othermaciej: do you mind if i skip replying to <video>-related e-mails from you if the spec already does everything you asked for in those e-mails, or would you rather have replies to all your mails? (either is fine, just checking which you prefer) |
| 22:40 | <othermaciej> | Hixie_: if the emails predate the current version of <video> then I can do without such replies |
| 22:41 | <othermaciej> | Hixie_: I will have new feedback from Apple soon relative to the current spec as a baseline, so I am not worried about things getting lost |
| 22:41 | <Hixie_> | k |
| 22:42 | <Hixie_> | these predate the <video> dinner at google |
| 22:49 | <jgraham> | Vito`: I think I have fixed one of your issues. I'll update svn in a few minutes once I fix a few issues with my working copy |
| 22:55 | <Vito`> | ooh |
| 22:55 | <Vito`> | "Warning: Undefined behaviour for end tag section" |
| 22:55 | <Vito`> | didn't log the file for that one, darn |
| 22:55 | <jgraham> | Vito`: That's expected |
| 22:56 | <Vito`> | ah |
| 22:56 | <jgraham> | <section> is a new HTML 5 tag but its parsing isn't yet defined (we treat it like a generic unknown element). However you will sometimes find authors inventing tags in the wild |
| 22:57 | <jgraham> | So you could have encountered a rouge <section> |
| 22:57 | <Vito`> | I wonder what page that was. There were ~30 of them. |
| 22:57 | <Hixie_> | a red section? Is that, like, a porn site? |
| 22:58 | <jgraham> | Hixie_: ? |
| 22:58 | <Vito`> | rouge |
| 22:58 | <hober> | maybe you mean rogue, Vito` |
| 22:58 | <jgraham> | Sorry being slow |
| 22:59 | <Hixie_> | sorry :-) |
| 22:59 | <jgraham> | :-p |
| 23:00 | <Vito`> | I assume "Warning: Undefined behaviour for end tag header" is the same sort of thing? |
| 23:01 | <jgraham> | Uh hu. |
| 23:01 | <Vito`> | fascinating |
| 23:03 | <jgraham> | Hmm. Odd. html5lib is failing unit tests even without my change. This isn't supposed to happen :-| |
| 23:07 | <jgraham> | Maybe it's just kingryan's extra tests |
| 23:17 | <Philip`> | Incidentally, RIP has slightly interesting error handling - it has a version field, and v1 of the spec defines draconian error handling for packets with version=1 (e.g. reserved fields must contain zero, else the data is rejected), but non-draconian handling if the packet has version >= 2 |
| 23:17 | <Hixie_> | what's RIP? And is that actually implemented? |
| 23:18 | <Hixie_> | (and do people ever give a version field?) |
| 23:18 | <Philip`> | so v2 of the protocol can start using the reserved fields, being certain that v1 implementations won't be sneakily using those fields anyway, while still being backward-compatible with v1 implementations |
| 23:18 | <Philip`> | It's a routing protocol |
| 23:18 | <Philip`> | (RFC1058) |
| 23:18 | <Hixie_> | interesting |
| 23:18 | <Hixie_> | oh, that RIP |
| 23:19 | <Hixie_> | i shoulda recognised the name |
| 23:21 | <Philip`> | As far as I'm aware, people do actually use it (on very small networks, since it's very simple (which is why I've been looking at RIP and not at anything more interesting and complex :-) )) |
| 23:22 | <Hixie_> | :-) |
| 23:23 | <Vito`> | jgraham... I'm most of the way through this corpus now, and that assertion error and the unicode error are the only two unique failures I've seen |
| 23:23 | <Philip`> | (I think I'll end up having to work with BGP, which looks much scarier) |
| 23:24 | <jgraham> | Vito`: That sounds like it could be worse (did you get the message that I checked in a fix that I think helps with the assertion to trunk?) |
| 23:25 | <Vito`> | if it's in trunk I'll update and check the pages against it |
| 23:29 | <Vito`> | also html5lib can't handle GIF files with inappropriate MIME types |
| 23:30 | <Vito`> | just... so you know |
| 23:30 | <Philip`> | Hmm, it worked fine when I passed a PDF through it once |
| 23:30 | <Philip`> | What kind of problem did you get? |
| 23:31 | <Vito`> | unicodedecodeerror |
| 23:32 | <Philip`> | Ah |
| 23:32 | <Vito`> | awesome |
| 23:32 | <Vito`> | recursion error |
| 23:32 | <Vito`> | and it didn't log |
| 23:32 | <Vito`> | argh |
| 23:34 | <jgraham> | Vito`: I don't know what test data you're using but it clearly rocks :) |
| 23:35 | <Vito`> | it's just our group's cache of bookmarked sites, crawled over the past few years |
| 23:35 | <kingryan> | jgraham: yes, I've added some test which may break the python impl |
| 23:35 | <kingryan> | jgraham: in ruby it was mostly a matter of adding error messages to the parserError() calls |
| 23:35 | <jgraham> | kingryan: It mostly seems to be small things; I'm just working through it now |
| 23:36 | <kingryan> | cool |
| 23:36 | <kingryan> | I meant to write a note to the ML about it, but forgot |
| 23:37 | <Vito`> | jgraham... testing against your updated trunk now |
| 23:40 | <Vito`> | jgraham... looks good against a couple of the failing pages |
| 23:45 | <Vito`> | jgraham... I can't seem to mark you as owner of the unicode issue I just reported |
| 23:45 | <jgraham> | Vito`: Not to worry; I'll do it |
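
The preflighting race raised at 00:32 and kingryan's callback idea at 00:35 can be illustrated with a small sketch. This is not the API proposed on the list: `ResourceCache`, `with_resource`, and `evict` are hypothetical names, and the cache is a toy in-memory dictionary. It only shows why a separate availability check can go stale, while a callback run under the cache's lock cannot see the entry vanish in between.

```python
import threading
import time


class ResourceCache:
    """Toy in-memory cache with per-entry expiry (hypothetical, for illustration)."""

    def __init__(self):
        # Re-entrant lock so a callback may call back into the cache safely.
        self._lock = threading.RLock()
        self._entries = {}  # url -> (body, expires_at)

    def put(self, url, body, ttl_seconds):
        with self._lock:
            self._entries[url] = (body, time.monotonic() + ttl_seconds)

    def evict(self, url):
        with self._lock:
            self._entries.pop(url, None)

    def is_locally_available(self, url):
        # Preflight check: the answer can be stale as soon as it is returned,
        # because another thread may evict or add the entry afterwards.
        with self._lock:
            entry = self._entries.get(url)
            return entry is not None and entry[1] > time.monotonic()

    def with_resource(self, url, callback):
        # Callback variant: the check and the use of the cached body happen
        # under one lock, so the entry cannot disappear between the two.
        with self._lock:
            entry = self._entries.get(url)
            if entry is None or entry[1] <= time.monotonic():
                return False
            callback(entry[0])
            return True
```

The sketch sidesteps the hard part othermaciej points out at 00:37: here the callback receives the cached body directly, whereas in a browser the callback would trigger a load (a frame navigation, an img src, an XMLHttpRequest) that would itself have to be pinned to the cached copy.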