| 11:15 | <jgraham> | Hixie: Re: an alt+etc. respecting version of textContent; yes please for use cases such as making a table of contents (or trying to extract any user-displayed string from a non-CDATA element, really) |
| 13:11 | <zcorpan> | hsivonen: http://validator.nu/?doc=http://forums.whatwg.org/&showsource=yes has some extracts missing and others ending at the wrong place |
| 13:12 | <zcorpan> | hsivonen: (or starting at the wrong place) |
| 13:13 | <zcorpan> | hsivonen: starting with error 95 |
| 14:42 | <hsivonen> | zcorpan: thanks. I'll investigate. |
| 15:51 | <zcorpan> | hsivonen: xml:lang attributes didn't use to be fatal in v.nu, did they? |
| 15:53 | <gsnedders> | The Settlers' II is awesome. |
| 15:57 | <zcorpan> | yeah i remember that game |
| 15:59 | zcorpan | looks up Widelands |
| 16:00 | <gsnedders> | I've just gone back to playing it for more or less the first time since the '90s. |
| 16:01 | <gsnedders> | Really primitive graphics, but it's hard to find much flaw with the gameplay itself |
| 16:01 | <gsnedders> | anyone played The Settlers II 10th Anniversary Edition? |
| 16:03 | zcorpan | downloads the demo |
| 16:36 | <gsnedders> | ack, I'll go back to not writing docs, and playing Settlers II |
| 18:44 | gsnedders | finally drags himself away from it |
| 19:58 | <gsnedders> | Is the behaviour described at <http://hg.gsnedders.com/spec-gen/raw-file/tip/README.html#table-of-contents/section-numbering> for when there is no heading sane enough? |
| 19:59 | <gsnedders> | Hixie: ^^ |
| 20:03 | <Hixie> | looks fine to me |
| 20:24 | <gsnedders> | Wow. |
| 20:25 | <gsnedders> | 608 sections in HTML 5 |
| 20:27 | <gsnedders> | (approx) |
| 20:28 | <gsnedders> | Yeah, 609. |
| 20:29 | gsnedders | finds he can save a couple of thousandths of a second by using a deque |
| 20:29 | gsnedders | thinks that's ever so slightly pointless |
| 20:31 | gsnedders | does a different optimization, and cuts buildToc on HTML 5 down from 16.702s to 0.635s |
| 20:31 | <gsnedders> | That's probably more noticeable |
| 20:33 | <gsnedders> | Running an XPath statement once is not noticeable. Running it 608 times is. |
| 20:35 | <hsivonen> | SAX for perf! |
| 20:36 | <gsnedders> | hsivonen: I'm not even doing that. I'm just manually iterating over the element and doing what the XPath did manually :P |
| 20:38 | <gsnedders> | Besides, the thing that I really need to make a large difference to perf is a quicker parser/serializer :P |
| 20:44 | <hsivonen> | gsnedders: if your tree API works with Jython... |
| 20:45 | <hsivonen> | gsnedders: (adding tree API glue layers to the validator.nu parser is easy) |
| 20:46 | <gsnedders> | hsivonen: For most purposes libxml2's HTML parser works fine, and I'd rather work on finishing the spec-gen than getting more parsers working |
| 20:47 | <gsnedders> | Hmm. I can't see any easy way to remove //text()[contains(normalize-space(translate(., 'AEILNORSTV', 'aeilnorstv')), 'latest version') or contains(., 'http://www.w3.org/TR/')] |
| 20:48 | <gsnedders> | I can't check .text and .tail at the same time, as I need the text nodes in document order |
| 20:49 | <gsnedders> | I can't use etree.iterwalk() because I need the tail of comments, PIs, and the like |
| 21:13 | gsnedders | could do with annevk being on IRC |
| 21:14 | <jgraham> | gsnedders: What's wrong with using d.iter() (assuming d is an etree tree) |
| 21:14 | <gsnedders> | jgraham: I need text nodes in document order. I can't get that from d.iter() |
| 21:14 | <jgraham> | (as opposed to iterwalk, not as opposed to annevk who is on holiday for a while) |
| 21:17 | <jgraham> | gsnedders: Oh well you'd have to iterate by hand I guess, or you could use the html5lib lxml treewalker which does what you want |
| 21:17 | <gsnedders> | jgraham: Or just use XPath, which is what I do now |
| 21:17 | <gsnedders> | jgraham: I probably won't get quicker than that :( |
| 21:17 | <jgraham> | gsnedders: I thought you were trying to replace the XPath |
| 21:18 | <gsnedders> | jgraham: For the sake of being quicker. Iterating by hand will be slow. |
| 21:18 | <zcorpan> | gsnedders: use regexp on the source |
| 21:19 | <gsnedders> | zcorpan: It has to only be text nodes for compat. with existing docs |
| 21:19 | <jgraham> | gsnedders: I thought you had claimed that iterating by hand was surprisingly fast |
| 21:20 | <jgraham> | but maybe I was mistaken |
| 21:20 | <zcorpan> | gsnedders: which docs? |
| 21:20 | <gsnedders> | jgraham: I claimed that the built-in iterators could be surprisingly quick compared with things like .find |
| 21:21 | <jgraham> | Oh OK. I guess using a built in iterator + a stack wouldn't help? |
| 21:22 | <gsnedders> | Not really |
| 21:35 | gsnedders | goes against his original plan and implements something for anne in spec-gen 1.0 |
| 21:49 | <gsnedders> | http://stuff.gsnedders.com/spec-gen/html5.html#the-a-element — note where the xref for "a" now points |
| 22:03 | <gsnedders> | (i.e., not #a) |
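
Note on the 20:47 to 21:22 exchange: the "iterate by hand" option jgraham mentions can be sketched roughly as below. This is not spec-gen's code. It uses the standard library's `xml.etree.ElementTree` rather than lxml so that it runs without extra dependencies, and the helper names (`parse_keeping_comments`, `iter_text_nodes`, `mentions_latest_version`) and the sample markup are made up for illustration. The matcher only approximates the XPath predicate: `str.lower()` folds more characters than the `translate()` call, and `str.split()` treats more characters as whitespace than `normalize-space()` does. It also says nothing about speed, which was gsnedders' actual concern.

```python
import xml.etree.ElementTree as ET


def parse_keeping_comments(source):
    """Parse XML text, keeping comments and PIs in the tree (Python 3.8+).

    Without this, the stdlib parser drops comments, and the text that
    follows a comment is merged into a neighbouring text node.
    """
    builder = ET.TreeBuilder(insert_comments=True, insert_pis=True)
    parser = ET.XMLParser(target=builder)
    return ET.fromstring(source, parser=parser)


def iter_text_nodes(element):
    """Yield text nodes below `element` in document order.

    Each node's .text comes first, then every child subtree followed by
    that child's .tail. Comments and PIs have a non-string tag: their
    .text is not a text node, but their .tail is.
    """
    if isinstance(element.tag, str) and element.text:
        yield element.text
    for child in element:
        yield from iter_text_nodes(child)
        if child.tail:
            yield child.tail


def mentions_latest_version(text):
    """Rough Python equivalent of the XPath predicate (an approximation)."""
    normalized = " ".join(text.lower().split())
    return "latest version" in normalized or "http://www.w3.org/TR/" in text


# Hypothetical sample document, not taken from any real spec.
SAMPLE = (
    "<div><p>This is the LATEST   Version<!-- note --> of the spec.</p>"
    "<p>See <a>http://www.w3.org/TR/html5/</a> for details.</p></div>"
)

root = parse_keeping_comments(SAMPLE)
nodes = list(iter_text_nodes(root))
matches = [text for text in nodes if mentions_latest_version(text)]
```

The recursive generator keeps the sketch short. For very deep trees an explicit stack would avoid Python's recursion limit, which is presumably what the "built in iterator + a stack" suggestion at 21:21 had in mind.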