| 03:57 | <markp> | jgraham: you around? |
| 03:57 | <markp> | or any other html5lib hackers? |
| 14:14 | <virtuelv> | is anyone going to be offended if I refer to the selectors API naming debate as a bikeshed problem? |
| 14:19 | <zcorpan> | not me |
| 14:19 | <zcorpan> | and i participated in it :( |
| 14:20 | zcorpan | will stay away from naming debates in the future |
| 14:20 | <Dashiva> | It was a bikeshed problem where all the colors were ugly shades of purple and beige |
| 14:22 | <virtuelv> | fwiw, http://programming.reddit.com/info/2jrrg/comments |
| 14:27 | <zcorpan> | accessing radiobuttons with the [[Get]] method is funny -- if there's only one control that matches, it returns it directly, but if there are more it returns a nodelist |
| 14:27 | <zcorpan> | document.forms[0]["foo"] |
| 14:28 | <zcorpan> | and undefined if 0 match |
| 14:36 | <Dashiva> | Same with checkboxes |
| 14:37 | <zcorpan> | yeah |
| 14:42 | <Dashiva> | And we should do our best to prevent the same behavior from happening to responsebodies in multipart xhr responses :) |
| 15:59 | <gsnedders> | is there any way to check WF2 support through JS? |
| 16:03 | <gsnedders> | document.implementation.hasFeature('WebForms', '2.0')? |
| 16:16 | <krijnh> | gsnedders: Yeah, that returns true in Opera |
| 16:18 | <zcorpan> | gsnedders: it's probably safer to check specific methods etc before using them |
| 16:18 | <gsnedders> | zcorpan: need some way to check if |input|@type=date has a native controller before falling back to a JavaScript one |
| 16:19 | <gsnedders> | (which will just be a text field on something with neither WF2 nor JS) |
| 16:19 | <zcorpan> | .type == "date" ? |
| 16:19 | <gsnedders> | does that work? |
| 16:19 | <zcorpan> | think so |
| 16:19 | <gsnedders> | does that set itself to "text" in browsers that don't support it? |
| 16:20 | <zcorpan> | exactly |
| 16:20 | <zcorpan> | so you know if it's not supported |
| 16:20 | <zcorpan> | if (input.type != "date") { // not supported |
| 16:20 | <gsnedders> | falls back in Saf at least |
| 16:22 | <Dashiva> | falls back in ie and ff too |
| 16:22 | <Dashiva> | ff2, ie7 |
| 16:22 | <zcorpan> | there you go |
| 16:23 | <gsnedders> | zcorpan: thanks |
| 16:23 | <zcorpan> | welcome :) |
| 18:14 | <Lachy> | http://ajaxian.com/archives/selectors-api-method-names-selectelement-and-selectallelements |
| 18:14 | <Lachy> | http://dean.edwards.name/weblog/2007/08/names/ |
| 18:14 | <Lachy> | http://lachy.id.au/log/2007/06/naming-debate |
| 18:17 | <Lachy> | oops, wrong link. http://lachy.id.au/log/2007/08/naming-debate-revisited |
| 19:05 | <gsnedders> | kingryan: what did you do regarding text/plain sniffing? |
| 19:05 | <kingryan> | gsnedders: what do you mean? |
| 19:06 | <gsnedders> | kingryan: in your implementation of feed/html, did you just count every text/plain document as such, and not do what browsers do, and sniff it? |
| 19:06 | <kingryan> | the sniffing impl I've added to html5lib/ruby isn't actually integrated w/ the rest of the parser |
| 19:07 | <kingryan> | in other words, I haven't implemented the part of the decision tree that switches on content-type |
| 19:08 | <kingryan> | ... if that answers your question, gsnedders |
| 19:08 | <gsnedders> | ya |
| 19:08 | <kingryan> | I've also implemented it at technorati, where we don't really care about text/plain |
| 19:08 | <kingryan> | we only really care whether something is a feed or html |
| 19:08 | <kingryan> | or other |
| 19:09 | <gsnedders> | kingryan: but what if text/plain _is_ a feed or HTML? |
| 19:09 | <kingryan> | I wasn't clear... we don't care if people *say* it's text/plain |
| 19:09 | <gsnedders> | you just sniff it regardless of the claim? |
| 19:10 | <kingryan> | yup |
| 19:11 | <kingryan> | but we're only indexing URLs that either 1) people pinged us with or 2) were discovered through link[@rel=~alternate] |
| 19:13 | <kingryan> | so, it works well enough for us |
| 19:13 | <gsnedders> | the feed/html algorithm currently concludes "otherwise return text/html". How do you tell apart HTML from the rest? |
| 19:14 | <kingryan> | because of where it sits in the decision tree, it doesn't matter |
| 19:14 | markp | goes off to start a text/plain blog |
| 19:14 | <kingryan> | if someone says "this is a feed", but it's not, we give up on processing it |
| 19:14 | <markp> | does that happen a lot? |
| 19:14 | <kingryan> | the common case is an http 500 which has html |
| 19:15 | <markp> | heh |
| 19:15 | <kingryan> | or 503 |
| 19:15 | <gsnedders> | 503? |
| 19:15 | <kingryan> | 503 = "temporarily unavailable" |
| 19:15 | <markp> | http error 503: my web hosting sucks |
| 19:15 | <gsnedders> | markp: I was wondering whether I'd get such a response from you :) |
| 19:15 | <kingryan> | the more common case is that someone says their feed is rss, but it's really atom |
| 19:16 | <gsnedders> | I don't even differentiate between the two. Just check if we have a feed or not. |
| 19:16 | <markp> | kingryan: do you use feedparser to determine feed version? |
| 19:17 | <markp> | just curious |
| 19:19 | <kingryan> | markp: our old spider uses UFP, but is slowly being replaced by one written in ruby |
| 19:20 | <kingryan> | for which the parsing is mostly based on html5lib |
| 19:33 | Philip` | tries running the html5lib validator on his documentation examples |
| 19:34 | <Philip`> | http://james.html5.org/cgi-bin/parsetree/parsetree.py?source=%3C%21DOCTYPE+HTML%3E%0D%0A%3Ctitle%3E%3Cb%3E+%26amp%3B+%3Ci%3E%3C%2Ftitle%3E - how come that's sensible there, but gives different results in SVN html5lib? |
| 19:40 | <zcorpan> | Philip`: what does the latter give? |
| 19:41 | <zcorpan> | shouldn't the innerHTML view escape < and > in title, btw? |
| 19:41 | <zcorpan> | and & |
| 19:42 | <Philip`> | >>> print p.parse('<!doctype html><title><b> & <i></title>').printTree() |
| 19:42 | <Philip`> | #document |
| 19:42 | <Philip`> | | <!DOCTYPE html> |
| 19:43 | <Philip`> | | <html> |
| 19:43 | <Philip`> | | <head> |
| 19:43 | <Philip`> | | <title> |
| 19:43 | <Philip`> | | <body> |
| 19:43 | <Philip`> | | <b> |
| 19:43 | <Philip`> | | " & " |
| 19:43 | <Philip`> | | <i> |
| 19:43 | <zcorpan> | aha. |
| 19:44 | <zcorpan> | regression then |
| 19:45 | <Philip`> | Looks like <title> is meant to be parsed as RCDATA |
| 19:46 | <zcorpan> | yes |
| 21:50 | <jgraham> | Oh attribute values |
| 21:50 | <jgraham> | I didn't do those |
| 21:50 | <zcorpan> | right |
| 21:52 | <jgraham> | That looks a bit more promising |
| 21:54 | <zcorpan> | jgraham: <plaintext> is also a cdata element |
| 21:54 | <zcorpan> | for this purpose anyway |
| 21:55 | <zcorpan> | otherwise looks right |
| 21:56 | <zcorpan> | hmm, http://james.html5.org/cgi-bin/parsetree/parsetree.py?source=%3Cnoscript%3E doesn't look like it's parsed correctly (or there's a bug in the spec) |
| 21:56 | <Philip`> | markp: Oh, I'm happy to slack - I just wanted to see if it worked already, and only found that one problem, so I'm content to wait until it works even better in the future ;-) |
| 22:00 | <zcorpan> | "<noscript>" is supposed to be parsed into <html><head><noscript></noscript></head><body></body></html> if i read the spec right |
| 22:01 | <jgraham> | zcorpan: That looks more plausible |
| 22:02 | jgraham | is wondering why html5lib has constants named rcdataElements and cdataElements that seem to be almost exactly back to front |
| 22:31 | <jgraham> | zcorpan: Where should the </noscript> be generated in the spec? |
| 22:33 | <zcorpan> | jgraham: i don't understand the question |
| 22:34 | <zcorpan> | jgraham: you mean as an implied token in the tree construction? |
| 22:34 | <jgraham> | Yeah |
| 22:34 | <zcorpan> | they aren't |
| 22:35 | <zcorpan> | the cdata parsing algorithm just looks until it finds a token that's not a character token |
| 22:35 | <zcorpan> | if that's the end tag it's ignored |
| 22:35 | <zcorpan> | otherwise you carry on |
| 22:35 | <zcorpan> | so say the tokens are: start tag "noscript", character "X", end-of-file |
| 22:36 | <jgraham> | Oh, I see |
| 22:36 | <jgraham> | We're mistakenly inserting the token for <noscript> into the stack of open elements |
| 22:36 | <zcorpan> | when you get to the end-of-file it's a parse error and you get back to "in head" |
| 22:37 | <zcorpan> | seems so |
| 22:39 | <zcorpan> | it should get the same treatment as <style> i think |
| 22:39 | <zcorpan> | basically |
| 22:40 | <zcorpan> | or exactly :) |
| 22:41 | <zcorpan> | if scripting is enabled |
| 22:41 | <jgraham> | zcorpan: I think we get <style> wrong too |
| 22:41 | <zcorpan> | http://james.html5.org/cgi-bin/parsetree/parsetree.py?source=%3Cstyle%3E |
| 22:41 | <jgraham> | Eh, that doesn't look like it should work |
| 22:51 | <jgraham> | zcorpan: OK I have a fix but I think this code needs a little more love (tomorrow :) ) |
| 22:56 | <zcorpan> | jgraham: ok :) |
| 23:13 | <zcorpan> | wonder if looking at the link's href="" would yield better results than the image's src="" for <a href=...><img src=...></a> |
| 23:13 | <zcorpan> | perhaps href more often contains things like query strings |
| 23:14 | <Philip`> | <a href="page18.html"><img src="/images/next.gif"></a> wouldn't do so well |
| 23:15 | <zcorpan> | indeed |
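
Editor's sketch for the 19:05 to 19:14 exchange: kingryan describes sniffing every document as "feed or html" regardless of the Content-Type the server claims, and gsnedders notes that the feed/html algorithm ends with "otherwise return text/html". A rough Python illustration of that idea might look like the following. The function name `sniff_feed_or_html`, the list of feed root elements, and the skipping rules are all assumptions made for this sketch; it is not the html5lib/ruby code and not the spec's exact algorithm.

```python
import re

# Root element prefixes treated as feeds. This list is an assumption for the
# sketch; a prefix test is deliberately crude (it would also accept "<rssx").
FEED_ROOTS = (b"<rss", b"<feed", b"<rdf:RDF")

# Leading material skipped before the first real tag: whitespace, comments,
# <!...> declarations and <?...?> processing instructions.
_SKIP = re.compile(rb"(?:\s+|<!--.*?-->|<![^>]*>|<\?.*?\?>)*", re.DOTALL)


def sniff_feed_or_html(body, declared_type=None):
    """Classify `body` (bytes) as "feed" or "html".

    `declared_type` is accepted but intentionally ignored, mirroring
    "we don't care if people *say* it's text/plain".
    """
    if body.startswith(b"\xef\xbb\xbf"):  # drop a UTF-8 byte order mark
        body = body[3:]
    rest = body[_SKIP.match(body).end():]
    if rest.startswith(FEED_ROOTS):
        return "feed"
    # Fallback, in the spirit of "otherwise return text/html".
    return "html"
```

With this sketch an HTML error page served for a feed URL (the "http 500 which has html" case) comes back as "html", so a caller expecting a feed can give up on it.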
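
Editor's sketch for the 22:35 to 22:36 exchange: zcorpan describes the CDATA parsing step as collecting character tokens until a non-character token appears, ignoring that token if it is the matching end tag, and otherwise treating it as a parse error and carrying on in the previous mode. A minimal Python illustration follows. The `(kind, value)` token tuples and the helper name `collect_cdata` are hypothetical and do not reflect html5lib's real token format or API.

```python
def collect_cdata(tokens, name):
    """Consume the tokens that follow a start tag such as <noscript>.

    `tokens` is an iterator of (kind, value) tuples (an assumed shape).
    Returns (text, parse_error, pending), where `pending` is the token the
    caller should reprocess in the previous insertion mode, or None if the
    matching end tag was consumed.
    """
    text = []
    for kind, value in tokens:
        if kind == "Characters":
            # Keep gathering character data for a single text node.
            text.append(value)
            continue
        if kind == "EndTag" and value == name:
            # The matching end tag is simply ignored.
            return "".join(text), False, None
        # Anything else (for example end-of-file) is a parse error; hand the
        # token back so the caller can carry on, e.g. back in "in head".
        return "".join(text), True, (kind, value)
    # The token stream ran out without an explicit end-of-file token.
    return "".join(text), True, None
```

For zcorpan's example stream (start tag "noscript", character "X", end-of-file), the caller would handle the start tag itself and pass the remaining two tokens to the helper, which yields the text "X", flags a parse error, and returns the end-of-file token for reprocessing.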