03:57
<markp>
jgraham: you around?
03:57
<markp>
or any other html5lib hackers?
14:14
<virtuelv>
is anyone going to be offended if I refer to the selectors API naming debate as a bikeshed problem?
14:19
<zcorpan>
not me
14:19
<zcorpan>
and i participated in it :(
14:20
zcorpan
will stay away from naming debates in the future
14:20
<Dashiva>
It was a bikeshed problem where all the colors were ugly shades of purple and beige
14:22
<virtuelv>
fwiw, http://programming.reddit.com/info/2jrrg/comments
14:27
<zcorpan>
accessing radiobuttons with the [[Get]] method is funny -- if there's only one control that matches, it returns it directly, but if there are more it returns a nodelist
14:27
<zcorpan>
document.forms[0]["foo"]
14:28
<zcorpan>
and undefined if 0 match
14:36
<Dashiva>
Same with checkboxes
14:37
<zcorpan>
yeah
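A toy Python model of the lookup described above, to make the three return shapes explicit. This is not browser code: `Control` and `named_get` are hypothetical names, and `None` stands in for JavaScript's `undefined`.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Control:
    """Hypothetical stand-in for a form control (radio button, checkbox, ...)."""
    name: str
    type: str
    value: str


def named_get(controls: List[Control], name: str) -> Union[None, Control, List[Control]]:
    """Mimic form["foo"]: nothing, the control itself, or a list of controls."""
    matches = [c for c in controls if c.name == name]
    if not matches:
        return None          # "undefined if 0 match"
    if len(matches) == 1:
        return matches[0]    # a single match is returned directly
    return matches           # several matches come back as a list (the "nodelist")


form = [
    Control("foo", "radio", "a"),
    Control("foo", "radio", "b"),
    Control("bar", "text", "x"),
]
print(type(named_get(form, "bar")).__name__)   # one match
print(type(named_get(form, "foo")).__name__)   # two matches
print(named_get(form, "missing"))              # no match
```

Callers have to branch on all three shapes, which is the awkwardness being pointed out.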
14:42
<Dashiva>
And we should do our best to prevent the same behavior from happening to responsebodies in multipart xhr responses :)
15:59
<gsnedders>
is there any way to check WF2 support through JS?
16:03
<gsnedders>
document.implementation.hasFeature('WebForms', '2.0')?
16:16
<krijnh>
gsnedders: Yeah, that returns true in Opera
16:18
<zcorpan>
gsnedders: it's probably safer to check specific methods etc before using them
16:18
<gsnedders>
zcorpan: need some way to check if |input|@type=date has a native controller before falling back to a JavaScript one
16:19
<gsnedders>
(which will just be a text field on something with neither WF2 nor JS)
16:19
<zcorpan>
.type == "date" ?
16:19
<gsnedders>
does that work?
16:19
<zcorpan>
think so
16:19
<gsnedders>
does that set itself to "text" in browsers that don't support it?
16:20
<zcorpan>
exactly
16:20
<zcorpan>
so you know if it's not supported
16:20
<zcorpan>
if (input.type != "date") { /* not supported */ }
16:20
<gsnedders>
falls back in Saf at least
16:22
<Dashiva>
falls back in ie and ff too
16:22
<Dashiva>
ff2, ie7
16:22
<zcorpan>
there you go
16:23
<gsnedders>
zcorpan: thanks
16:23
<zcorpan>
welcome :)
18:14
<Lachy>
http://ajaxian.com/archives/selectors-api-method-names-selectelement-and-selectallelements
18:14
<Lachy>
http://dean.edwards.name/weblog/2007/08/names/
18:14
<Lachy>
http://lachy.id.au/log/2007/06/naming-debate
18:17
<Lachy>
oops, wrong link. http://lachy.id.au/log/2007/08/naming-debate-revisited
19:05
<gsnedders>
kingryan: what did you do regarding text/plain sniffing?
19:05
<kingryan>
gsnedders: what do you mean?
19:06
<gsnedders>
kingryan: in your implementation of feed/html, did you just count every text/plain document as such, and not do what browsers do, and sniff it?
19:06
<kingryan>
the sniffing impl I've added to html5lib/ruby isn't actually integrated w/ the rest of the parser
19:07
<kingryan>
in other words, I haven't implemented the part of the decision tree that switches on content-type
19:08
<kingryan>
... if that answers your question, gsnedders
19:08
<gsnedders>
ya
19:08
<kingryan>
I've also implemented it at technorati, where we don't really care about text/plain
19:08
<kingryan>
we only really care whether something is a feed or html
19:08
<kingryan>
or other
19:09
<gsnedders>
kingryan: but what if text/plain _is_ a feed or HTML?
19:09
<kingryan>
I wasn't clear... we don't care if people *say* it's text/plain
19:09
<gsnedders>
you just sniff it regardless of the claim?
19:10
<kingryan>
yup
19:11
<kingryan>
but we're only indexing URLs that either 1) people pinged us with or 2) were discovered through link[@rel=~alternate]
19:13
<kingryan>
so, it works well enough for us
19:13
<gsnedders>
the feed/html algorithm currently concludes "otherwise return text/html". How do you tell apart HTML from the rest?
19:14
<kingryan>
because of where it sits in the decision tree, it doesn't matter
19:14
markp
goes off to start a text/plain blog
19:14
<kingryan>
if someone says "this is a feed", but it's not, we give up on processing it
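A minimal Python sketch of "sniff it regardless of the claim", assuming a crude first-tag heuristic. It is neither the spec's feed/html algorithm nor the Technorati code; `sniff`, `FEED_ROOTS` and `HTML_TAGS` are made-up names, and the claimed content type is accepted only to show that it is ignored.

```python
import re

FEED_ROOTS = {b"rss", b"feed", b"rdf:rdf"}            # RSS 2.0, Atom, RSS 1.0 root elements
HTML_TAGS = {b"html", b"head", b"body", b"title", b"meta"}

COMMENT = re.compile(rb"<!--.*?-->", re.S)
FIRST_TAG = re.compile(rb"<([A-Za-z][A-Za-z0-9:]*)")  # skips <?xml ...?> and <!DOCTYPE ...>
DOCTYPE_HTML = re.compile(rb"<!doctype\s+html", re.I)


def sniff(body: bytes, claimed_type: str = "") -> str:
    """Classify a response body as 'feed', 'html' or 'other'.

    claimed_type is deliberately unused: the body decides.
    """
    head = COMMENT.sub(b"", body[:1024])   # only look at the start of the document
    match = FIRST_TAG.search(head)
    root = match.group(1).lower() if match else b""
    if root in FEED_ROOTS:
        return "feed"
    if root in HTML_TAGS or DOCTYPE_HTML.search(head):
        return "html"
    return "other"


# A URL advertised as a feed that turns out to be an HTML error page
error_page = b"<!DOCTYPE html><title>500 Internal Server Error</title>"
if sniff(error_page, "application/rss+xml") != "feed":
    print("claimed to be a feed but is not: giving up")
```

Because only pinged or link-discovered URLs are fetched, a mismatch between the claim and the sniffed type can simply be dropped rather than repaired.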
19:14
<markp>
does that happen a lot?
19:14
<kingryan>
the common case is an http 500 which has html
19:15
<markp>
heh
19:15
<kingryan>
or 503
19:15
<gsnedders>
503?
19:15
<kingryan>
503 = "Service Unavailable" (usually temporary)
19:15
<markp>
http error 503: my web hosting sucks
19:15
<gsnedders>
markp: I was wondering whether I'd get such a response from you :)
19:15
<kingryan>
the more common case is that someone says their feed is rss, but it's really atom
19:16
<gsnedders>
I don't even differentiate between the two. Just check if we have a feed or not.
19:16
<markp>
kingryan: do you use feedparser to determine feed version?
19:17
<markp>
just curious
19:19
<kingryan>
markp: our old spider uses UFP, but is slowly being replaced by one written in ruby
19:20
<kingryan>
for which the parsing is mostly based on html5lib
19:33
Philip`
tries running the html5lib validator on his documentation examples
19:34
<Philip`>
http://james.html5.org/cgi-bin/parsetree/parsetree.py?source=%3C%21DOCTYPE+HTML%3E%0D%0A%3Ctitle%3E%3Cb%3E+%26amp%3B+%3Ci%3E%3C%2Ftitle%3E - how come that's sensible there, but gives different results in SVN html5lib?
19:40
<zcorpan>
Philip`: what does the latter give?
19:41
<zcorpan>
shouldn't the innerHTML view escape < and > in title, btw?
19:41
<zcorpan>
and &
19:42
<Philip`>
>>> print p.parse('<!doctype html><title><b> &amp; <i></title>').printTree()
19:42
<Philip`>
#document
19:42
<Philip`>
| <!DOCTYPE html>
19:43
<Philip`>
| <html>
19:43
<Philip`>
| <head>
19:43
<Philip`>
| <title>
19:43
<Philip`>
| <body>
19:43
<Philip`>
| <b>
19:43
<Philip`>
| " & "
19:43
<Philip`>
| <i>
19:43
<zcorpan>
aha.
19:44
<zcorpan>
regression then
19:45
<Philip`>
Looks like <title> is meant to be parsed as RCDATA
19:46
<zcorpan>
yes
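A small Python illustration of what RCDATA means for the example above: inside <title>, markup is not recognised as tags, but character references are still decoded. `rcdata_text` is a hypothetical helper, not part of html5lib; it leans on the standard library's `html.unescape`.

```python
import html


def rcdata_text(raw: str) -> str:
    """Text content of an RCDATA element, given its raw source.

    Tags are left as literal text; only character references are decoded.
    """
    return html.unescape(raw)


# Raw content of <title> in the test document
print(rcdata_text("<b> &amp; <i>"))   # prints: <b> & <i>
```

Under this reading, the expected tree has a single text node `<b> & <i>` inside <title>, with no <b> or <i> elements in <body>.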
21:50
<jgraham>
Oh attribute values
21:50
<jgraham>
I didn't do those
21:50
<zcorpan>
right
21:52
<jgraham>
That looks a bit more promising
21:54
<zcorpan>
jgraham: <plaintext> is also a cdata element
21:54
<zcorpan>
for this purpose anyway
21:55
<zcorpan>
otherwise looks right
21:56
<zcorpan>
hmm, http://james.html5.org/cgi-bin/parsetree/parsetree.py?source=%3Cnoscript%3E doesn't look like it's parsed correctly (or there's a bug in the spec)
21:56
<Philip`>
markp: Oh, I'm happy to slack - I just wanted to see if it worked already, and only found that one problem, so I'm content to wait until it works even better in the future ;-)
22:00
<zcorpan>
"<noscript>" is supposed to be parsed into <html><head><noscript></noscript></head><body></body></html> if i read the spec right
22:01
<jgraham>
zcorpan: That looks more plausible
22:02
jgraham
is wondering why html5lib has constants named rcdataElements and cdataElements that seem to be almost exactly back to front
22:31
<jgraham>
zcorpan: Where should the </noscript> be generated in the spec?
22:33
<zcorpan>
jgraham: i don't understand the question
22:34
<zcorpan>
jgraham: you mean as an implied token in the tree construction?
22:34
<jgraham>
Yeah
22:34
<zcorpan>
they aren't
22:35
<zcorpan>
the cdata parsing algorithm just looks until it finds a token that's not a character token
22:35
<zcorpan>
if that's the end tag it's ignored
22:35
<zcorpan>
otherwise you carry on
22:35
<zcorpan>
so say the tokens are: start tag "noscript", character "X", end-of-file
22:36
<jgraham>
Oh, I see
22:36
<jgraham>
We're mistakenly inserting the token for <noscript> into the stack of open elements
22:36
<zcorpan>
when you get to the end-of-file it's a parse error and you get back to "in head"
22:37
<zcorpan>
seems so
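The steps described above can be sketched in Python roughly as follows. This follows the wording in this discussion rather than the spec text, and the token tuples and `collect_cdata` are invented for illustration; html5lib's real token format differs.

```python
from typing import List, Tuple

Token = Tuple[str, ...]   # e.g. ("StartTag", "noscript"), ("Characters", "X"), ("EOF",)


def collect_cdata(tokens: List[Token], start: int, tag_name: str) -> Tuple[str, int, bool]:
    """Collect the text of a CDATA-style element (style, noscript, ...).

    start is the index just after the start tag token.
    Returns (text, index of the next token to process, parse_error).
    The element is never pushed onto the stack of open elements.
    """
    text = []
    i = start
    # Keep going until a token that is not a character token
    while i < len(tokens) and tokens[i][0] == "Characters":
        text.append(tokens[i][1])
        i += 1
    if i < len(tokens) and tokens[i] == ("EndTag", tag_name):
        # The matching end tag is consumed and otherwise ignored
        return "".join(text), i + 1, False
    # Anything else (including end-of-file) is a parse error; the caller
    # carries on and reprocesses tokens[i] in the previous mode, e.g. "in head"
    return "".join(text), i, True


tokens = [("StartTag", "noscript"), ("Characters", "X"), ("EOF",)]
print(collect_cdata(tokens, 1, "noscript"))   # prints: ('X', 2, True)
```

Since no end tag token is ever implied, the closing of the element in the serialised tree comes from the tree itself, not from a generated </noscript> token.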
22:39
<zcorpan>
it should get the same treatment as <style> i think
22:39
<zcorpan>
basically
22:40
<zcorpan>
or exactly :)
22:41
<zcorpan>
if scripting is enabled
22:41
<jgraham>
zcorpan: I think we get <style> wrong too
22:41
<zcorpan>
http://james.html5.org/cgi-bin/parsetree/parsetree.py?source=%3Cstyle%3E
22:41
<jgraham>
Eh, that doesn't look like it should work
22:51
<jgraham>
zcorpan: OK I have a fix but I think this code needs a little more love (tomorrow :) )
22:56
<zcorpan>
jgraham: ok :)
23:13
<zcorpan>
wonder if looking at the link's href="" would yield better results than the image's src="" for <a href=...><img src=...></a>
23:13
<zcorpan>
perhaps href more often contains things like query strings
23:14
<Philip`>
<a href="page18.html"><img src="/images/next.gif"></a> wouldn't do so well
23:15
<zcorpan>
indeed