#whatwg on 2008-06-07

00:11	<annevk>	Hixie, lol
00:11	annevk	finds 'An end tag whose tag name is "sarcasm"'
00:18	<Dashiva>	annevk: How many angry outbursts for not taking the w3c seriously will that trigger, I wonder :)
00:19	<annevk>	I'm sure people will complain, because as defined it does exactly nothing
00:24	<Hixie>	it requires you to take a deep breath!
00:24	Hixie	applies the first sticker to his laptop
00:25	<Dashiva>	Hixie: Is that a normative requirement?
00:25	<Dashiva>	Because I think it will be exceedingly difficult to add breathing capabilities to most user agents
00:26	<annevk>	that's just a platform limitation
00:26	<Hixie>	Dashiva: yup
00:26	<Hixie>	annevk: indeed
00:26	Hixie	is happy with his sticker ("My other computer is a data center")
00:27	<Dashiva>	Not a beowulf cluster of toasters?
00:27	<Hixie>	no, my other computer really is a data center, that's why i like it :-)
00:28	<Philip`>	But you have to share it with a hundred million other users, whereas my computers are all mine
00:29	<Hixie>	no comment
00:29	<Dashiva>	My other computer is standing in the hallway, waiting for the electrician to show up and make a properly earthed power outlet
00:29	<Hixie>	heh
00:30	<Hixie>	hey
00:30	<Hixie>	this <li> problem is just because i'm not faking an end tag
00:34	<annevk>	so <datalist> requires a new insertion model
00:34	<Hixie>	why?
00:34	<annevk>	for <option>?
00:34	<Hixie>	i just plan to make <option> into a phrasing element with some magic for optional end tags
00:35	<annevk>	Opera treats <option> "like" <option/>
00:37	<annevk>	IE does something similar (plus the /OPTION weirdness for the end tag)
00:37	<annevk>	Firefox just drops it
00:37	<Hixie>	yeah
00:38	<Hixie>	but i can't see any way that treating it like <span> could break anything
00:38	<annevk>	option { background:lime }
00:39	<annevk>	but maybe it's feasible
00:39	<Hixie>	yeah but do any pages do that while also having one in the middle of nowhere?
00:39	<annevk>	for <datalist> Opera seems to do something similar to <select> except that <select> is allowed as child
00:39	<annevk>	that doesn't seem good
00:40	<Hixie>	indeed
01:19	<Hixie>	ok well i think this fixes the main problems
01:25	<Philip`>	jgraham_: test and install works for me on that version
01:30	<Philip`>	jgraham_: but it fails when I use something with a different version of simplejson
01:34	<Philip`>	(In particular: Works with simplejson 1.7.1 in python2.5; fails with 1.9.1 in python2.4)
01:36	<Philip`>	(It doesn't like non-ASCII in test1.test)
01:38	<Philip`>	Lots of tests fail with BeautifulSoup-3.0.6 in python2.4
01:49	<Philip`>	Hmm, how nice - BeautifulSoup has decl.string == 'html', but unicode(decl.string) == '<!html>'
01:50	<Philip`>	except in Python 2.4 unicode(decl.string) == 'html'
02:22	<Philip`>	jgraham_: I've committed some changes that are needed for some versions of Python I have; otherwise it seems to work fine in my 2.3/2.4/2.5 Pythons, except for the title=foo=bar test that you seem to have fixed in your zipped version, and except for some hex overflow warnings in Python 2.3
03:21	<MikeSmith>	"thanks for waiting"
03:22	<MikeSmith>	what the hell else choice do we have?
03:22	<MikeSmith>	wait, I guess we could offer to write the "paper" for them
03:22	<MikeSmith>	I think I'll do that
03:22	<MikeSmith>	this weekend
03:27	<Philip`>	<datatemplate> scares me :-(
03:32	<Philip`>	Mutual recursion is great
03:32	<Philip`>	s/great/evil/
03:39	<MikeSmith>	heh
03:40	<Dashiva>	Mutual recursion is recursing mutually is mutual recursion?
04:00	<Philip`>	Maybe I shouldn't send emails trying to explain nasty complicated algorithms at 4am :-/
04:01	<Dashiva>	Maybe I should be asleep at 5 am...
04:01	<Philip`>	I reason that the birds outside aren't asleep, so I don't see why I ought to be
04:03	<Dashiva>	Too bad you didn't reason like that when they were asleep :)
04:04	<Philip`>	I was too busy with <datatemplate>s to be able to reason :-)
04:05	<Philip`>	(Also playing PAA:OtRSPoD:E1, which is possibly less useful)
05:57	<takkaria>	Philip`: I do often wonder if you ever sleep at all
06:47	heycam`	curses HttpURLConnection's refusal to accept a non-standard HTTP method
08:26	<annevk>	Philip`, I found http://en.wikipedia.org/wiki/Levenberg–Marquardt_algorithm
08:27	<annevk>	but then I don't know about either that algorithm or <datagrid> to connect the dots
08:28	<annevk>	know enough*
08:35	<annevk>	http://simonwillison.net/2008/Jun/6/patent/ :o
08:49	<Hixie>	that algorithm has nothing to do with it
08:49	<Hixie>	the "levenberg" algorithm in the spec is just what josh came up with when i asked him how to solve the problem
08:50	<annevk>	ok, so what Philip` said :)
08:50	<Hixie>	yup
09:15	<hsivonen>	hmm. the 2005 DOM Core threads about aligning the spec with the real world are sad
09:21	<annevk>	we should just rebrand the DOM specs to Web DOM specs so there's no confusion
09:21	<annevk>	Web DOM -> Web, DOM specs -> well...
09:25	<annevk>	Hmm, they're standardizing UA sniffing with a really complicated server side API... http://www.w3.org/2005/MWI/DDWG/drafts/api/080602
11:13	<annevk>	view source: http://jarvklo.se/ :)
11:16	<annevk>	http://cafe.elharo.com/web/refactoring-html/why-xhtml/#comment-235640 "They attempt to destroy standards by insisting on mindless conformance to them, all practical experience to the contrary." yup...
11:17	annevk	was hoping nobody would figure it out
11:49	<jgraham_>	Does anyone have a strong opinion on whether html5lib should accept e.g. utf8 as a synonym for utf-8?
11:50	<annevk>	html5 says we should I think
11:50	<gsnedders>	yeah, they're identical per html5
11:50	<jgraham_>	Oh, that must have changed since I last looked at this
11:50	<gsnedders>	http://www.whatwg.org/specs/web-apps/current-work/#charset
11:50	<annevk>	likely
11:51	<gsnedders>	It was a fairly recent change
11:51	<gsnedders>	actually, that's not it
11:51	<annevk>	jgraham_, when is your blogpost going online?
11:51	<jgraham_>	annevk: About @media? After Lachy puts the slides online... (hint ;) )
11:51	<gsnedders>	http://www.whatwg.org/specs/web-apps/current-work/#character0 — that's the right link
11:52	<annevk>	Lachy, ^^
11:52	<annevk>	or http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#character0
11:52	<gsnedders>	http://bugs.simplepie.org/repositories/entry/sp1/trunk/create.php — that's what I use now
11:53	<annevk>	I wonder if we can make it a fixed list of character encodings
11:53	<gsnedders>	annevk: in the spec?
11:53	<annevk>	yes
11:53	<gsnedders>	That would be nice.
11:53	<annevk>	a list of Web character encodings
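
The utf8 / utf-8 synonym question from 11:49 can be illustrated with Python's own codec registry. This is only a sketch: the alias table is Python's, not the one in the HTML5 draft linked above, and `same_encoding` is a hypothetical helper name, not anything from html5lib.

```python
import codecs


def same_encoding(label_a: str, label_b: str) -> bool:
    """True if Python's codec registry resolves both labels to the same codec."""
    # codecs.lookup() normalizes case and known aliases before resolving.
    return codecs.lookup(label_a.strip()).name == codecs.lookup(label_b.strip()).name


utf8_matches = same_encoding("utf8", "UTF-8")
latin1_matches = same_encoding("utf-8", "iso-8859-1")
```

A fixed list of Web encodings, as annevk suggests, would replace the registry lookup with an explicit table.
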
11:54	gsnedders	notes in his new impl. he does support UTF-32 (but on PHP 5 everything is stored as UTF-32 binary strings)
11:55	<annevk>	omg, idiots
11:55	<gsnedders>	Simply because perf. of things like substr() would be terrible using anything that sometimes used more than one word per codepoint
11:55	<gsnedders>	annevk: ?
11:55	<annevk>	storing things in UTF-32 is a design flaw
11:56	<virtuelv>	anyone looked at eric meyer's linking proposal?
11:56	<gsnedders>	annevk: Find any other way to do stuff performantly :(
11:56	<annevk>	gsnedders, even with UTF-32 you can have a single character spanning multiple code points
11:56	<jgraham_>	gsnedders: the substring argument may be optimizing for the wrong thing
11:57	<gsnedders>	annevk: As far as I can see substr() natively on PHP 6 just does it to codepoints, not characters
11:57	<annevk>	in that case basing it on UTF-16 would be better
11:57	<gsnedders>	jgraham_: It optimises for most optimisations, though, doing that (it's also the cheapest to decode/encode to)
11:57	<hsivonen>	annevk: no, you can't, but you can have a grapheme cluster spanning code points
11:57	<annevk>	as that's at least compatible with the DOM/JavaScript etc.
11:57	<jgraham_>	gsnedders: http://weblogs.mozillazine.org/roc/archives/2008/01/string_theory.html
11:58	<gsnedders>	annevk: Why? Then I have to scan through the string trying to find surrogates
11:58	<annevk>	hsivonen, is that the same as combined characters?
11:59	<hsivonen>	annevk: yeah
12:00	<annevk>	(that's what I meant)
12:00	<gsnedders>	annevk: UTF-16 will be slower. What might be quicker is having duplicate code paths, one for ASCII only UTF-8, and another for UTF-8 (when using the actual UTF-8 codepath it'll be slower than UTF-32 though)
12:01	<annevk>	UTF-32 doesn't handle combined characters
12:01	<gsnedders>	I know.
12:01	<annevk>	and eats up way too much memory and is more or less obsolete anyway as far as anyone is concerned
12:01	<gsnedders>	PHP is just too slow :(
12:08	<gsnedders>	If I were to use UTF-8 or UTF-16 I need to manually iterate over the string for things like substr to count the number of characters I'd got past, taking into account multi-byte sequences, and surrogates. With UTF-32, I can just use PHP's built-in substr function, as each codepoint is four bytes.
12:08	<gsnedders>	iterating over a string is VERY slow in PHP when you're having to watch it as a unicode string (decoding is far too slow)
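
The trade-off gsnedders describes can be sketched in Python rather than PHP (an assumption made for illustration; his implementation is not shown in this log). The helper names `utf32_substr` and `utf8_substr` are hypothetical. With BOM-less UTF-32 a code point substring is plain byte arithmetic, while UTF-8 needs a scan for code point boundaries. The last lines show hsivonen's point that a code point slice can still split a grapheme cluster.

```python
def utf32_substr(data: bytes, start: int, length: int) -> bytes:
    """Slice code points out of BOM-less UTF-32 bytes with plain byte arithmetic."""
    return data[4 * start : 4 * (start + length)]


def utf8_substr(data: bytes, start: int, length: int) -> bytes:
    """Same slice over UTF-8 bytes; every lead byte has to be located first."""
    # Continuation bytes look like 10xxxxxx; anything else starts a code point.
    offsets = [i for i, b in enumerate(data) if (b & 0xC0) != 0x80]
    offsets.append(len(data))
    last = len(offsets) - 1
    return data[offsets[min(start, last)] : offsets[min(start + length, last)]]


text = "na\u00efve \U0001D11E clef"
from_utf32 = utf32_substr(text.encode("utf-32-le"), 2, 5).decode("utf-32-le")
from_utf8 = utf8_substr(text.encode("utf-8"), 2, 5).decode("utf-8")

# Fixed width does not help with combining sequences: "e" + U+0301 is one
# visible character but two code points, so a code point slice can split it.
combined = "e\u0301"
split = utf32_substr(combined.encode("utf-32-le"), 0, 1).decode("utf-32-le")
```
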
12:12	<hsivonen>	if you aren't Mozilla or Opera, you should probably just use whatever UTF-* your language + its built-in string library use
12:14	<gsnedders>	hsivonen: In the case of PHP 5, that's binary strings, and no Unicode.
12:17	<hsivonen>	I don't know about PHP 5, but in PHP 4 it means byte strings with UTF-8 in them or int arrays with UTF-32 code unit per array slot
12:18	<Philip`>	gsnedders: Perl stores UTF-8 internally, and its substr() performance doesn't seem to be unacceptable
12:18	<gsnedders>	hsivonen: PHP 5 is the same as PHP 4
12:19	<gsnedders>	Philip`: It has the advantage of not needing to try and operate it in the interpreted userland, though, but in the compiled interpreter
12:19	<gsnedders>	(PHP 6, however, has Unicode support)
12:19	<gsnedders>	I want something that can behave identically on PHP 5 and PHP 6, while using PHP 6's native support
12:20	<Philip`>	gsnedders: Oh, by "on PHP 5 everything is stored as UTF-32 binary strings" do you mean that you chose to store everything as UTF-32 because PHP doesn't do Unicode strings at all, as opposed to meaning that PHP itself stores Unicode strings as UTF-32?
12:21	<gsnedders>	Philip`: PHP < 6 (which isn't out yet) doesn't do any Unicode
12:21	<Philip`>	Ah
12:21	<gsnedders>	Philip`: Well, there are some extensions that cope, one that is enabled by default, but it stops processing on the first invalid byte
12:21	<gsnedders>	and it has all kinds of bugs
12:22	<annevk>	good times
12:22	<annevk>	i'm glad my blog doesn't require the complex PHP bits
12:22	<annevk>	such as substr() :D
12:22	<Philip`>	It's strange how a toy language can become so popular ;-)
12:22	<gsnedders>	(and as PHP 6 is even less backwards compatible than PHP 5 was, and PHP 5 had slow uptake)
12:23	<hsivonen>	PHP is aimed at people who should be given all the high-level library support imaginable, but then what PHP does with strings is the kind of thing portable C does
12:23	<gsnedders>	hsivonen: PHP just generally sucks.
12:23	<hsivonen>	i.e. giving just a run of bytes without any Unicode libraries
12:23	<hsivonen>	yes, PHP sucks
12:23	<hsivonen>	I walked away from PHP years ago
12:23	<gsnedders>	It's not just strings. Everything sucks.
12:24	<hsivonen>	my latest encounter with PHP was patching the WHATWG blog and the experience made me unhappy
12:24	<hsivonen>	I should have known better and stayed away
12:24	<annevk>	yet for the simple stuff it's easier than anything else
12:24	<gsnedders>	Most common bug report on SP: it just stops output! (this happens to normally be PHP crashing Apache)
12:24	<hsivonen>	annevk: if "simple" means "I don't know yet that I need to care about Unicode"
12:24	<gsnedders>	annevk: I'd say stuff like Python and Ruby is just as easy
12:24	<annevk>	though maybe that's because there's so much to copy and paste from
12:25	<annevk>	hsivonen, right
12:25	<annevk>	people have written all kinds of simple software in PHP you can learn how to do things from and improve upon
12:25	<annevk>	I haven't found nearly as much for Python
12:26	<annevk>	Python documentation is also quite crappy
12:26	<gsnedders>	annevk: Yeah, but at least with Python most of the docs there are _is_ right
12:26	<gsnedders>	annevk: With PHP, half of it is wrong
12:27	<hsivonen>	Java seems to have the best docs
12:27	gsnedders	will likely have no real exposure to Java till he's at uni
12:27	<Philip`>	Java documentation seems to always tell me how stuff works, but never tells me how to do stuff
12:28	<hsivonen>	right, Java has great per method docs. but often it is hard to learn the big picture for a new library
12:28	<Philip`>	If I don't know the class name I'm looking for, I end up having to search random web pages to find relevant links
12:32	<Dashiva>	Vector vs ArrayList, discuss
12:32	<hsivonen>	deprecade Vector. end of discussion :-)
12:32	<hsivonen>	deprecate
12:35	<annevk>	what Philip` says about Java is probably true for Python as well
12:36	<annevk>	Python should have some clear docs on dealing with HTML form submission / dealing with request URIs / dealing with databases
12:42	<Philip`>	I think it's probably true for all languages that I'm familiar with
12:43	<Dashiva>	I suppose it'd be false for languages with no standard library, though :)
12:43	<annevk>	might be, for me it was just that there was enough sample PHP code around to figure out how to make blog software / simple cms software, etc.
12:44	<Philip`>	Dashiva: Do any such languages exist? Everything seems to at least have standard string functions, and it'd be pretty useless without those :-)
12:44	<jgraham_>	The python documentation for 2.6 is getting better http://docs.python.org/dev/
12:45	<Dashiva>	Philip`: Well, you'd define functions required for non-uselessness as part of the language, not the library
12:45	<Philip`>	The 2.6 docs look less like latex2html, which is nice
12:45	<Philip`>	Dashiva: I wouldn't :-p
12:46	<Dashiva>	Besides, strings are overrated.
12:46	<Philip`>	Dashiva: e.g. C has a standard library, and without it the language would be pretty useless until you built up everything yourself from scratch, but that's still a standard library
12:46	<Philip`>	Indeed, we should use ropes instead
12:48	jgraham_	suggests that GUI design 101 should cover not putting "Copy" and "Close Window" next to each other
12:48	<hsivonen>	modern C software has to use a string library that isn't provided by the "standard" library
12:48	Philip`	suggests the OS should have an "undo" button, that can undo the closing of windows
12:50	<annevk>	jgraham_, still not quite focused on Web programming it seems
12:51	<jgraham_>	annevk: Sure. But better than the old documentation
12:52	<annevk>	take http://docs.python.org/dev/howto/index.html for instance
12:52	<annevk>	where's HTTP and python, HTML forms and python, MySQL and python ?
12:52	<Dashiva>	Do you really have to use urllib to urlencode something for use in urllib2?
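
For reference, a sketch of the split Dashiva is asking about, written against Python 3, where the old urllib and urllib2 modules were reorganized into `urllib.parse` and `urllib.request`. The URL and parameters are made up for illustration, and nothing is sent over the network.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Encoding lives in urllib.parse; request objects live in urllib.request.
query = urlencode({"q": "caf\u00e9 au lait", "page": 2})
request = Request("http://example.org/search?" + query)
full_url = request.full_url
```
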
12:53	<jgraham_>	annevk: That section's new. I guess contributions are welcome :)
12:54	<annevk>	once i figure it out i'll let them know :p
12:54	<jgraham_>	http://wiki.python.org/moin/WebProgramming might be more informative
12:54	<annevk>	well, i have figured it out to some extent, but what i'm doing feels rather clumsy
12:58	<annevk>	hah, that wiki page endorses my naive approach
13:03	<annevk>	http://webpython.codepoint.net/introduction is quite nice
13:03	<annevk>	(links at the bottom)
13:05	<hsivonen>	hmm. If I read a text node from the DOM in IE8 mode, line breaks are normalized to single spaces
13:05	<hsivonen>	what's up with that?
13:07	<Philip`>	What if you set white-space:pre on that element?
13:10	<Philip`>	Seems the same as IE6/7 to me - whitespace is normalised except on elements where it's preserved (like <pre>, or <... style="white-space:pre">)
13:11	<hsivonen>	style="white-space:pre" helps in IE7
13:11	<roc>	freaky
13:12	<Philip`>	(innerHTML does the same normalisation)
13:14	<hsivonen>	Philip`: doesn't help in IE8 mode
13:17	Philip`	encounters a way to get IE stuck in an infinite loop
13:17	<Philip`>	The Live DOM Viewer ought to have an autosave feature :-(
13:19	<annevk>	and community features so you can rate each others code and comment on it
13:19	<Philip`>	hsivonen: http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!DOCTYPE%20html%3E%0A%3Cmeta%20http-equiv%3Dx-ua-compatible%20content%3Die%3D8%3E%0A%3Cdiv%20style%3Dwhite-space%3Apre%3Etest%0Atest%3C%2Fdiv%3E in IE8 in IE8 mode shows the non-normalised newline in the DOM view
13:22	<annevk>	http://lxyedwarddudley.wordpress.com/2008/06/05/authoring-html-5-a-supplicate-in-warp-and-woof-professionals/ :o
13:22	<gsnedders>	You do realise the p element exists for a reason?
13:23	<hsivonen>	Philip`: I'm looking at http://hsivonen.iki.fi/test/moz/document-write-cr.html
13:24	<hsivonen>	Philip`: the last script is followed by CR
13:24	<hsivonen>	Philip`: otherwise the line breaks in the source are LFs
13:24	Philip`	guesses that's based on http://www.w3.org/QA/2007/06/html5-call-to-web-professionals.html
13:26	<Dashiva>	That page makes my head hurt, annevk
13:32	Dashiva	wonders what the Functional Ashcan school is
13:39	<Philip`>	Oh, great, IE changes the DOM when you access .firstChild on certain elements
13:44	<hsivonen>	Philip`: are the assumptions in my test wrong?
13:45	<Philip`>	hsivonen: You may be assuming that IE is sane :-)
13:47	Philip`	gives up trying to understand what it's doing
13:49	<Philip`>	http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!DOCTYPE%20html%3E%0A%3Cmeta%20http-equiv%3Dx-ua-compatible%20content%3Die%3D8%3E%0A%3Cdiv%20style%3Dwhite-space%3Apre%3Etest%0Atest%3C%2Fdiv%3E%0A%3Cscript%3Ewindow.onload%3Dfunction()%7B%0Avar%20d%20%3D%20document.getElementsByTagName('div')%5B0%5D%3B%0Aw(d.childNodes.length)%0Aw(d.childNodes%5B0%5D.nodeValue.replace(%2F(.)%2Fg%2Cfunction(m)%7Breturn%20m.charCodeAt(0)%2B'%20'%7D))%3B%0AsetTime
13:49	<Philip`>	Urgh
13:49	<Philip`>	http://tinyurl.com/66d3mz
13:50	<Philip`>	In IE8, that logs "1", then "116 101 115 116 32 116 101 115 116", then "2"
13:50	<Philip`>	and the DOM view shows the text being split into two text nodes
13:50	<Philip`>	and if I remove the second w() line then it stays as a single text node
13:52	<Philip`>	But I get different behaviour in my fork of the DOM viewer - it says "13" instead of "32", and "1" instead of "2"
14:07	jgraham_	guesses someone has been reading too much Joyce or Pynchon
14:24	<annevk>	http://www.theologyonline.com/newgod/
14:25	<Dashiva>	god5?
14:25	<Dashiva>	Oh, 6
14:25	<Dashiva>	Must be Dmitry up to his old tricks again
14:31	<annevk>	I still wonder who did Bible5
14:31	<annevk>	it seems to be copy and pasted all over the place, e.g. http://www.biroblu.info/2007/05/例如-对于后点-现在简化汉字主要通行于中国大陆/
14:33	<annevk>	in other news, that press release is supposed to go out this month, we better hurry up with XML5, SVG5, etc. :D
14:34	<Philip`>	It doesn't matter if we miss the deadline - we can just define Calendar5 at some point in the future, so dates can mean whatever we want them to mean
14:35	<Dashiva>	Or we could just use existing implementations
14:35	<Dashiva>	Valve has a good one, I hear
14:36	<annevk>	must. be. backwards. compatible
14:36	<Dashiva>	Just define everything before Anno Valvensis as "long ago"
15:18	<Dashiva>	I find it amusing that python raises an exception on 'raise', for not being a valid raise statement
15:52	<Philip`>	Dashiva: It is a valid raise statement, as long as it's in the dynamic scope of an except block
15:52	<Dashiva>	Yeah, but not outside
15:53	<Dashiva>	So I say "Give me an exception" and it goes "No, you can't have one. I'm going to make an exception because you did it wrong."
15:57	<Philip`>	It can be valid if it's outside an except block
15:57	<Philip`>	as long as there's another except block which it's still inside :-)
15:59	<Philip`>	Hmm, it appears it always re-throws the most recently thrown exception, rather than actually doing dynamic scoping, which seems kind of like a bug in Python
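
A minimal sketch of the bare `raise` behaviour discussed above, written for Python 3 (an assumption: the Python 2 versions current at the time of this log reported the "nothing to re-raise" case with a different exception type, and the quirk Philip` describes at 15:59 may not reproduce). The function names are hypothetical.

```python
def reraise():
    # A bare raise re-raises whatever exception is currently being handled.
    raise


def called_outside_handler():
    """No exception is being handled, so the bare raise itself fails."""
    try:
        reraise()
    except RuntimeError as error:
        return type(error)


def called_inside_handler():
    """reraise() is outside the except block lexically, but inside it dynamically."""
    try:
        try:
            raise ValueError("original")
        except ValueError:
            reraise()
    except ValueError as error:
        return str(error)


outside_result = called_outside_handler()
inside_result = called_inside_handler()
```
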
17:00	<hsivonen>	hmm. when pronouncing WCAG, is the vowel (i vs. e) inserted between W and C freeform or dependent on en-AU, en-GB, en-US, etc.?
19:03	<Lachy>	hsivonen, it's not really dependent on anything like that. It's just the preference of whoever says it
19:07	<hsivonen>	Lachy: ok. (I've now also heard recordings where the vowel varies even away from the e-i space)
19:14	<Lachy>	I've not heard anyone use an e sound. I've heard i, y and uh
19:15	<hsivonen>	Lachy: I used e to denote the way you pronounce it :-)
19:16	<Lachy>	huh? I pronounce it like WhyCAG
19:17	<hsivonen>	hmm. somehow I got an e-like impression from the standards suck episode
19:17	<hsivonen>	well, doesn't matter
19:19	<gsnedders>	I pronounce the WC like the word wick
19:20	<Dashiva>	wickag?
19:20	<gsnedders>	yeah
19:20	<hsivonen>	gsnedders: is the 'i' as in en-US 'list'?
19:21	<gsnedders>	hsivonen: No
19:21	<Dashiva>	I mentally equalize the w in wcag with x in xwhatever, so I would always pronounce it separately
19:41	<gsnedders>	Oh brilliant.
19:41	<gsnedders>	I click the "confirm your purchase" button and nothing happens. yay.
19:41	<Philip`>	I avoid the pronunciation problem by never talking about it
19:41	<gsnedders>	(odd sidenote on that matter: Air France is cheaper than easyjet)
21:58	<jgraham_>	Hmm. I wonder if Laura missed my point about the Python process being "unfair" but arguably producing good results because of it