11:40
<annevk>
http://www.phpied.com/wp-content/uploads/2008/02/canvas-pie.html is nice
11:42
<Hixie>
hard to argue against the .style in that
11:46
<annevk>
heh, i wasn't even thinking along those lines
11:47
<annevk>
just that it was a nice example of <canvas> usage that degrades pretty well in UAs that don't support <canvas> and in non-visual UAs
15:47
Philip`
doesn't like how the parser spec uses "... and ...", "..., and ...", "..., and then ...", "..., then ..." for exactly equivalent sentences in different places
15:48
<Philip`>
So far I've got /Act as if an? (start|end) tag (?:token )?with the tag name "(\S+)" (?:and no attributes )?had been seen(?:, and then|, and| and|, then) reprocess the current token\./ to handle all the variations I've seen
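As a rough illustration of the pattern Philip` quotes, here is a minimal Python sketch that applies the same regular expression to a few sentence variants. The names `ACT_AS_IF` and `parse_instruction` are hypothetical, and the sample sentences are made-up variants in the style he describes, not verified quotations from the spec.

```python
import re

# The regular expression quoted in the log, split across lines for readability.
ACT_AS_IF = re.compile(
    r'Act as if an? (start|end) tag (?:token )?with the tag name "(\S+)" '
    r'(?:and no attributes )?had been seen'
    r'(?:, and then|, and| and|, then) reprocess the current token\.'
)


def parse_instruction(sentence):
    """Return (tag_kind, tag_name) if the sentence matches, else None.

    Hypothetical helper: the whole sentence must match, so any wording
    variant the pattern does not cover is reported as None.
    """
    match = ACT_AS_IF.fullmatch(sentence)
    if match is None:
        return None
    return match.groups()


# Illustrative sentences only; each uses a different connective.
samples = [
    'Act as if a start tag token with the tag name "head" and no attributes '
    'had been seen, then reprocess the current token.',
    'Act as if an end tag with the tag name "p" had been seen, and then '
    'reprocess the current token.',
    'Act as if an end tag with the tag name "head" had been seen and '
    'reprocess the current token.',
]

for sentence in samples:
    print(parse_instruction(sentence))
```

All four connectives collapse to the same `(kind, name)` pair, which is why the inconsistent wording costs the pattern an extra alternation but nothing more.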
15:48
<gsnedders>
Philip`: I don't think that's in the en-gb-x-Hixie spec, better complain to the editor
15:49
<Philip`>
I guess the language was designed for humans, which makes it harder for me :-(
15:49
<zcorpan_>
Philip`: since you're inhuman?
15:50
<annevk>
heh, you're actually trying to generate code from the English text? :)
15:50
<Philip`>
zcorpan_: No, but I'm trying to create an inhuman :-)
15:52
<Philip`>
annevk: It seems to be working alright so far - mostly it's a big table of English->code, but there are lots of repeated and nearly-repeated phrases so it saves a lot of code writing
15:52
<Philip`>
and I can be fairly sure it precisely matches the spec, including when the spec changes in the future, which is good since I'm lazy and don't want to have to check it carefully :-)
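The "big table of English->code" idea might look something like the following Python sketch. This is an assumption about the general shape, not Philip`'s actual implementation (which the log suggests is in OCaml): the phrases, the emitted function names, and `PHRASE_TABLE` / `translate` are all hypothetical.

```python
import re

# Hypothetical phrase table: each entry pairs a pattern for one spec-style
# phrase with a function that renders the corresponding code as a string.
PHRASE_TABLE = [
    (re.compile(r'Switch the insertion mode to "([^"]+)"\.'),
     lambda m: f'set_insertion_mode({m.group(1)!r})'),
    (re.compile(r'Parse error\.'),
     lambda m: 'parse_error()'),
    (re.compile(r'Reprocess the current token\.'),
     lambda m: 'reprocess_current_token()'),
]


def translate(sentence):
    """Translate one sentence into a code string using the phrase table.

    Raises ValueError for any sentence with no table entry, so wording
    that changes in a later spec revision fails loudly instead of being
    silently skipped.
    """
    for pattern, emit in PHRASE_TABLE:
        match = pattern.fullmatch(sentence)
        if match:
            return emit(match)
    raise ValueError(f'no table entry for: {sentence!r}')


print(translate('Switch the insertion mode to "in body".'))
print(translate('Parse error.'))
```

Failing on unmatched sentences is the design choice that gives the "precisely matches the spec" property: the generator either covers every sentence or stops.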
15:53
<annevk>
i guess you should comment on the inconsistencies
15:54
<Philip`>
I could add a code->English translation function, to rewrite that section of the spec with no inconsistencies :-)
16:00
<zcorpan_>
we should define how Hixie English should be parsed
16:00
<gsnedders>
is http://ian.hixie.ch/bible/english not the place for that?
16:01
<zcorpan_>
yes but it's very incomplete
16:01
<gsnedders>
en-gb-x-hixie 5?
16:02
Philip`
wonders if it's sane to convert all the "generic R?CDATA parsing algorithm" bits into separate new insertion modes, so he can get rid of the bits where it needs to wait for the next token
16:02
<annevk>
heh
16:02
<Philip`>
...and the skip-next-token-if-it's-a-newline bit too
16:02
<annevk>
the /TR/ version actually uses <html lang=en-US-x-Hixie>
16:02
Philip`
hopes there aren't any other such places
16:02
<gsnedders>
annevk: US!?
16:02
<annevk>
and the WHATWG version has en-GB-Hixie
16:03
<gsnedders>
so the /TR version is in an unspecified language. yuk.
16:05
<annevk>
en-US-x-Hixie is more correct than en-GB-Hixie
16:05
<gsnedders>
annevk: "more correct" makes no sense.
16:05
<gsnedders>
correctness is a binary value, you are either correct or you are not
16:05
<gsnedders>
annevk: but why?
16:05
<Philip`>
The English words "abort", "about", "above", "accepting", ... "youngest", "your" are to be interpreted as described in the Oxford English Dictionary [OED].
16:06
<Ketsuban>
You forgot "sausage".
16:06
Ketsuban
promptly apologises for the Blackadder reference.
16:08
<zcorpan_>
gsnedders: hixie english is both en-GB-x-Hixie and en-US-x-Hixie
16:08
<zcorpan_>
gsnedders: en-GB-hixie and en-US-hixie are the old tags
16:08
<gsnedders>
zcorpan_: but there's no spec for en-us-[x-]hixie! :P
16:09
<gsnedders>
hmm, en-us-hixie is mentioned once in it
16:09
<zcorpan_>
point that out to Hixie :P
16:09
<gsnedders>
and not even with an -x :P
16:10
gsnedders
notes you need an en-gb-x-hixie parser to parse the en-gb-x-hixie spec
16:11
<Philip`>
You need an HTML parser to parse the HTML spec too
16:11
<gsnedders>
yeah, I know that.
16:12
<Ketsuban>
gsnedders: a standard en-GB parser is sufficient.
16:12
<gsnedders>
Ketsuban: most people have some variant of en-gb, though
16:14
<Ketsuban>
True. Language codes don't take into account idiolects (excepting of course en-GB-x-Hixie).
16:56
<Philip`>
Hooray, now my parser does <title> and implies <html> and <head> and <body> etc, and I think it might even do that correctly
16:57
<annevk>
we should change handling of <title> and </head> in HTML5 though
16:57
<Philip`>
If that can be done by copying-and-pasting text from other sections, then I'll be happy since I won't have to do anything
16:59
<annevk>
i guess it comes down to </head> being ignored but only being a parse error sometimes and <title> being treated like a normal element except for its content model
17:22
Philip`
wishes he could use the tree construction tests without first having to work out how to glue his tokeniser and tree-constructor together
17:24
<Philip`>
...and fortunately I can, since I can just use some other tokeniser to convert the tests so they give the token stream instead of the source text
17:25
<Philip`>
...but unfortunately I don't have a JSON parser in OCaml
18:00
<kig>
there is a json parser for ocaml, json-static or whatever it was
18:01
<kig>
(or at least there's one in our work repo)
18:18
<Philip`>
kig: Ah, thanks - json-static looks a bit scary with camlp4, but json-wheel looks like it ought to work without pain
20:22
<gsnedders>
<http://www.w3.org/TR/2008/PER-xml-20080205/> - PER of Fifth Edition of XML 1.0
20:24
<Philip`>
XML 1.0 5?
20:25
<AwayEagle>
That's purely editorial, I imagine?
20:25
<gsnedders>
AwayEagle: no, it isn't
20:25
<Philip`>
No
20:25
<Philip`>
http://recycledknowledge.blogspot.com/2008/02/justice-at-last-part-two.html
20:27
<gsnedders>
so _that's_ what that change is all about
20:33
<AwayEagle>
interesting. Doesn't XML NS specify QName, etc., grammar in terms of unicode categories anyway?
20:33
<gsnedders>
AwayEagle: depends whether you look at NS for XML 1.0 or NS for XML 1.1
20:34
<AwayEagle>
gsnedders: sounds like I looked at the wrong one then :-)
20:34
<gsnedders>
AwayEagle: NS for XML 1.0 will have to be revised, FWIW
20:35
<gsnedders>
actually, it won't
20:35
<gsnedders>
actually, yes it will.
20:35
<gsnedders>
it refers to stuff in appendix B in XML 1.0
23:48
Philip`
starts passing a few tests