11:15
<jgraham>
Hixie: Re: an alt+etc. respecting version of textContent; yes please for use cases such as making a table of contents (or trying to extract any user-displayed string from a non-CDATA element, really)
13:11
<zcorpan>
hsivonen: http://validator.nu/?doc=http://forums.whatwg.org/&showsource=yes has some extracts missing and others ending at the wrong place
13:12
<zcorpan>
hsivonen: (or starting at the wrong place)
13:13
<zcorpan>
hsivonen: starting with error 95
14:42
<hsivonen>
zcorpan: thanks. I'll investigate.
15:51
<zcorpan>
hsivonen: xml:lang attributes didn't use to be fatal in v.nu, did they?
15:53
<gsnedders>
The Settlers II is awesome.
15:57
<zcorpan>
yeah i remember that game
15:59
zcorpan
looks up Widelands
16:00
<gsnedders>
I've just gone back to playing it for more or less the first time since the '90s.
16:01
<gsnedders>
Really primitive graphics, but it's hard to find much flaw with the gameplay itself
16:01
<gsnedders>
anyone played The Settlers II 10th Anniversary Edition?
16:03
zcorpan
downloads the demo
16:36
<gsnedders>
ack, I'll go back to not writing docs, and playing Settlers II
18:44
gsnedders
finally drags himself away from it
19:58
<gsnedders>
Is the behaviour described at <http://hg.gsnedders.com/spec-gen/raw-file/tip/README.html#table-of-contents/section-numbering> for when there is no heading sane enough?
19:59
<gsnedders>
Hixie: ^^
20:03
<Hixie>
looks fine to me
20:24
<gsnedders>
Wow.
20:25
<gsnedders>
608 sections in HTML 5
20:27
<gsnedders>
(approx)
20:28
<gsnedders>
Yeah, 609.
20:29
gsnedders
finds he can save a couple of thousandths of a second by using a deque
20:29
gsnedders
thinks that's ever so slightly pointless
20:31
gsnedders
does a different optimization, and cuts buildToc on HTML 5 down from 16.702s to 0.635s
20:31
<gsnedders>
That's probably more noticeable
20:33
<gsnedders>
Running an XPath statement once is not noticeable. Running it 608 times is.
20:35
<hsivonen>
SAX for perf!
20:36
<gsnedders>
hsivonen: I'm not even doing that. I'm just manually iterating over the element and doing what the XPath did manually :P
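A minimal sketch of the kind of change described above: replacing a per-section path query with a hand-written loop over the children. The log does not show the actual XPath or the buildToc code, so the heading lookup, the `HEADINGS` set, and the `first_heading` helper are all hypothetical, and the standard library's `xml.etree.ElementTree` stands in for whatever tree API spec-gen uses.

```python
import xml.etree.ElementTree as ET

# Assumption: the per-section query looked for a heading child.
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

def first_heading(section):
    """Return the first heading child of section, or None if there is none."""
    # A plain loop over direct children avoids compiling and evaluating
    # a path expression once per section.
    for child in section:
        if child.tag in HEADINGS:
            return child
    return None

doc = ET.fromstring(
    "<body>"
    "<section><p>intro</p><h2>Title</h2></section>"
    "<section><p>no heading here</p></section>"
    "</body>"
)
sections = doc.findall("section")
titles = [first_heading(s) for s in sections]
```

Whether this is faster than the query depends on the tree library; the timings quoted in the log are for gsnedders's own code, not for this sketch.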
20:38
<gsnedders>
Besides, the thing that I really need to make a large difference to perf is a quicker parser/serializer :P
20:44
<hsivonen>
gsnedders: if your tree API works with Jython...
20:45
<hsivonen>
gsnedders: (adding tree API glue layers to the validator.nu parser is easy)
20:46
<gsnedders>
hsivonen: For most purposes libxml2's HTML parser works fine, and I'd rather work on finishing the spec-gen than getting more parsers working
20:47
<gsnedders>
Hmm. I can't see any easy way to remove //text()[contains(normalize-space(translate(., 'AEILNORSTV', 'aeilnorstv')), 'latest version') or contains(., 'http://www.w3.org/TR/')]
20:48
<gsnedders>
I can't check .text and .tail at the same time, as I need the text nodes in document order
20:49
<gsnedders>
I can't use etree.iterwalk() because I need the tail of comments, PIs, and the like
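For reference, the "iterate by hand" alternative to that XPath could look roughly like this. It is only a sketch: it uses the standard library's `xml.etree.ElementTree` rather than lxml, the names `text_nodes` and `matching_text_nodes` are made up for illustration, and Python's `str.split()` treats more characters as whitespace than XPath's `normalize-space()` does. It walks `.text` and `.tail` together so the strings come out in document order, including the tail of a comment.

```python
import xml.etree.ElementTree as ET

# Same character mapping as translate(., 'AEILNORSTV', 'aeilnorstv').
LOWER = str.maketrans("AEILNORSTV", "aeilnorstv")

def text_nodes(elem):
    """Yield (owner, kind, string) for each text node under elem, in document order."""
    # Comments and PIs have a non-string tag; their own .text is not a text node.
    if isinstance(elem.tag, str) and elem.text:
        yield elem, "text", elem.text
    for child in elem:
        yield from text_nodes(child)
        # The tail follows the child's end tag, so it comes after the child's content.
        if child.tail:
            yield child, "tail", child.tail

def matching_text_nodes(root):
    """Filter text nodes with an approximation of the XPath predicate."""
    for owner, kind, s in text_nodes(root):
        normalized = " ".join(s.translate(LOWER).split())
        if "latest version" in normalized or "http://www.w3.org/TR/" in s:
            yield owner, kind, s

root = ET.fromstring(
    "<div><p>Latest   Version: <a>here</a> see http://www.w3.org/TR/foo</p></div>"
)
comment = ET.Comment("note")
comment.tail = "LATEST VERSION after comment"
root.append(comment)

matches = [(kind, s) for _, kind, s in matching_text_nodes(root)]
```

Returning the owner and the kind ("text" or "tail") along with the string lets the caller rewrite the matched text in place, which a flat list of strings would not allow.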
21:13
gsnedders
could do with annevk being on IRC
21:14
<jgraham>
gsnedders: What's wrong with using d.iter() (assuming d is an etree tree)
21:14
<gsnedders>
jgraham: I need text nodes in document order. I can't get that from d.iter()
21:14
<jgraham>
(as opposed to iterwalk, not as opposed to annevk who is on holiday for a while)
21:17
<jgraham>
gsnedders: Oh well you'd have to iterate by hand I guess, or you could use the html5lib lxml treewalker which does what you want
21:17
<gsnedders>
jgraham: Or just use XPath, which is what I do now
21:17
<gsnedders>
jgraham: I probably won't get quicker than that :(
21:17
<jgraham>
gsnedders: I thought you were trying to replace the XPath
21:18
<gsnedders>
jgraham: For the sake of being quicker. Iterating by hand will be slow.
21:18
<zcorpan>
gsnedders: use regexp on the source
21:19
<gsnedders>
zcorpan: It has to only be text nodes for compat. with existing docs
21:19
<jgraham>
gsnedders: I thought you had claimed that iterating by hand was surprisingly fast
21:20
<jgraham>
but maybe I was mistaken
21:20
<zcorpan>
gsnedders: which docs?
21:20
<gsnedders>
jgraham: I claimed that the built-in iterators could be surprisingly quick compared with things like .find
21:21
<jgraham>
Oh OK. I guess using a built in iterator + a stack wouldn't help?
21:22
<gsnedders>
Not really
21:35
gsnedders
goes against his original plan and implements something for anne in spec-gen 1.0
21:49
<gsnedders>
http://stuff.gsnedders.com/spec-gen/html5.html#the-a-element — note where the xref for "a" now points
22:03
<gsnedders>
(i.e., not #a)