#whatwg on 2008-10-02

01:15	<Hixie>	so.......
01:15	<Hixie>	if you have an <input> element
01:16	<Hixie>	type=text
01:16	<Hixie>	and you type in some text
01:16	<Hixie>	and then the script changes it to type=checkbox
01:16	<Hixie>	and you submit
01:16	<Hixie>	what value should be submitted?
01:16	<Hixie>	opera and mozilla say "on", safari says whatever you typed, IE says you can't change the type.
01:19	<gsnedders>	Hixie: off.
01:21	<Hixie>	what.
01:21	<Hixie>	the hell.
01:21	<Hixie>	setting .value when type=checkbox causes the value attribute to change?!?!?!
01:22	<gsnedders>	heh.
01:22	<gsnedders>	Hixie: You obviously forgot to leave your sense of logic at the door.
01:22	<gsnedders>	Hixie: What's happening about Anolis, BTW?
01:23	<Hixie>	it's live
01:23	<gsnedders>	Hixie: But not committed.
01:23	<Hixie>	and committeed
01:23	<Hixie>	committed
01:23	<gsnedders>	It is?
01:23	<gsnedders>	I haven't seen that.
01:23	<gsnedders>	:P
01:23	<Hixie>	i checked it in last night
01:24	<Hixie>	r2256
01:24	<gsnedders>	no email on the list
01:25	<gsnedders>	(like the commit watchers list)
01:25	<Hixie>	probably was too long and hit the limit
01:25	<gsnedders>	Heh.
01:25	<gsnedders>	Like, almost every line changed, I expect.
01:25	<gsnedders>	Now, what need I do to get a special thanks (:)?
01:25	<gsnedders>	:P
01:25	<Hixie>	:-P
01:25	<gsnedders>	(I mean, why do you think I did this anyway)
01:26	<Hixie>	bert didn't get special thanks either :-)
01:26	<gsnedders>	multi-file Anolis?
01:26	<gsnedders>	:P
01:26	<Hixie>	:-P
01:26	<gsnedders>	Yeah, but most of what you did about that was complain about its speed :P
01:28	<gsnedders>	Hixie: http://www.w3.org/html/wg/html5/ — the TOC list has the ol numbers :(
01:32	<gsnedders>	Hixie: Should I send a email to the list saying IDs are completely different now?
01:37	<aboodman>	Hixie: I don't see the craziness in the value attribute changing when the value property is changed
01:37	<aboodman>	am I missing something?
01:38	<Hixie>	aboodman: the value attribute doesn't change when the value property is changed for any other type value
01:38	<Hixie>	aboodman: normally the value content attribute maps to the defaultValue DOM attribute
01:43	<Hixie>	man this really screws things up
01:45	<Hixie>	wtf
01:46	<Hixie>	hm, at least this is scoped to the input element
01:47	<aboodman>	out of curiosity, are most attributes in html5 defined to reflect or not reflect
01:47	<Hixie>	most reflect
01:48	<aboodman>	yipp33
01:48	<aboodman>	argh
01:48	<aboodman>	i really must learn to type at some point.
01:48	<gsnedders>	Type? Why do that?
02:11	gsnedders	changes topic to 'WHATWG (HTML5) -- http://www.whatwg.org/ -- Logs: http://krijnhoetmer.nl/irc-logs/ -- Please leave your sense of logic at the door, thanks! -- gsnedders had green hair, photos coming soon :-)'
02:15	<othermaciej>	the weird situation with .value, .defaultValue and value="" is kind of weird and pretty suboptimal
02:21	<Hixie>	othermaciej: and now standardised.
03:49	<Lachy>	Hixie, http://blog.whatwg.org/demos-2008-sept#comment-27669
06:00	<Hixie>	what does a radio button represent?
06:01	<othermaciej>	one of several mutually exclusive choices
06:02	<Hixie>	what if there aren't any others?
06:03	<othermaciej>	what does an <li> represent if there aren't any others?
06:09	<MikeSmith>	what is the sound of one hand clapping?
06:16	<Hixie>	othermaciej: a list item
06:17	<othermaciej>	then you could say a radio button represents a radio button
06:17	<othermaciej>	because in fact what it represents is a particular UI control
06:17	<othermaciej>	and the semantic of how such controls are used is dictated by convention
06:17	<Hixie>	fair enough
08:26	<Hixie>	well radio buttons turned out to be more of a pain than i expected
08:36	<jruderman_>	radio groups represent mutually exclusive choices, but so do <select size=1> and <select size=10>, and all three have different UI (by convention)
08:37	<jruderman_>	it annoys me when i get paper forms that say "check one" and have empty squares instead of empty circles
08:37	<jruderman_>	seems so wrong
08:40	<Hixie>	heh
08:45	<Hixie>	ok, only file, hidden, submit, image, reset, and button to do now.
08:47	<hsivonen>	http://www.w3.org/2008/10/TPAC/TPDay-Agenda
08:49	<hsivonen>	I wonder if Hixie has been already invited as a panelist
08:50	<virtuelv>	hmph
08:50	<virtuelv>	Google is a harsh mistress
08:51	<Hixie>	hsivonen: for which panel?
08:51	<hsivonen>	this year it's no longer html5 vs. xhtml2. now it's html vs. xml
08:51	<virtuelv>	(completely off-topic, of course)
08:51	<Hixie>	virtuelv: hm?
08:51	<Hixie>	there's a topic here?
08:51	Hixie	looks at gsnedders :-P
08:52	<virtuelv>	Hixie: yeah, when Dreamhost was mass-compromised last year, my account were among the unfortunate 3500
08:52	<hsivonen>	Hixie: Architecture and html&xml
08:53	<virtuelv>	I had one page left that I didn't actually notice when I cleaned up all of the injected spam, and for that, I'm now invisible in Google for the next couple of weeks
08:53	<Hixie>	aah
08:53	<Hixie>	hsivonen: session 2?
08:53	<Hixie>	or 7?
08:54	<Hixie>	someone should stand by the mic during the TP and pipe up briefly every time someone says something not true
08:54	<Hixie>	e.g. "The HTML5 work on the other hand uses a centralised extensibility mechanism based on formalized tagsoup parsing." "Actually that's not an accurate description."
08:55	<Hixie>	actually reading that abstract more carefully i don't even know what it means
08:55	<Hixie>	sounds like something palin might say
09:08	<heycam>	http://www.theage.com.au/photogallery/2008/10/02/1222651246843.html -- hmm, is that IE being used in the A380's cockpit?
09:08	<hsivonen>	Hixie: I was wondering about your participation in both 2 and 7
09:08	<heycam>	regardless, it looks like a terrible interface for presenting that information (with the scrolling)
09:12	<virtuelv>	hsivonen: this? http://images.theage.com.au/ftage/ffximage/2008/10/02/qantas__16__gallery__539x400.jpg
09:12	<heycam>	virtuelv, yeah
09:13	<virtuelv>	heycam: how can you tell?
09:13	<virtuelv>	I can clearly see it's windows, but IE?
09:13	<virtuelv>	the table?
09:13	<heycam>	yeah the table is what made me think that
09:15	<virtuelv>	looks like
09:15	<heycam>	windows in the cockpit is probably enough to make me worry though :)
09:16	<heycam>	i noticed linux booting up on a plane's entertainment system recently
09:16	<hsivonen>	seems like a very bad idea to put Windows in the cockpit
09:16	<hsivonen>	I'd want a small well debugged real time OS in the cockpit
09:17	<heycam>	i assume that windows thing is isolated from any flight controlling systems, but still...
09:20	<hsivonen>	Sun had the "high risk activities" clause in the non-GPL JDK licenses forever even though JDK on Solaris is probably more stable than IE on Windows
09:20	<virtuelv>	heycam: I'd assume so
09:21	<virtuelv>	but you're still pretty screwed for the 30 seconds it takes rebooting
09:22	<hsivonen>	why doesn't http://www.p01.org/releases/Demoscene/files/pNebula_canvas_256b_valid.htm work in Firefox/Minefield?
09:23	<virtuelv>	hsivonen: I guess mathieu optimized away a few bytes too many
09:23	<virtuelv>	it worked a couple of days ago
09:25	<zcorpan>	hsivonen: this is why: http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!DOCTYPE%20html%3E%0D%0A%3Ccanvas%20id%3DR%3E%3C%2Fcanvas%3E%3Cscript%3Ew(R)%3C%2Fscript%3E
09:33	<hsivonen>	zcorpan: that's unintuitive
09:38	<zcorpan>	hsivonen: but it works in ie/opera/webkit
09:39	<zcorpan>	oh actually
09:39	<zcorpan>	http://software.hixie.ch/utilities/js/live-dom-viewer/?%3Ccanvas%20id%3DR%3E%3C%2Fcanvas%3Ex%3Cscript%3Ew(R)%3C%2Fscript%3E
09:40	<zcorpan>	i forgot about the move-script-to-head bug
09:41	<zcorpan>	oh firefox only does that in quirks mode
09:41	<hsivonen>	zcorpan: I meant the Gecko tree building behavior is unintuitive
09:41	<zcorpan>	defining R in the global scope from id that is
09:41	<zcorpan>	hsivonen: yeah, but that wasn't why the demo wasn't working
09:41	<hsivonen>	oh
09:41	<zcorpan>	hsivonen: the demo should work if you remove the doctype
09:41	<hsivonen>	now I see what you meant
09:53	<virtuelv>	zcorpan: just asked p01, and he indicated it chokes on something in the script
09:53	<virtuelv>	<p01> ff3.0.3 chokes on "R", pbly expecting "document.R"
09:54	<zcorpan>	virtuelv: yes it doesn't do <elm id=foo> -> window.foo in standards mode
09:54	<zcorpan>	virtuelv: only in quirks mode
09:54	<zcorpan>	while other browsers do it in both
09:57	<zcorpan>	i think he needs R=document.body.lastChild
09:58	<zcorpan>	and then he can drop id=R
10:07	<Philip`>	He could save another 6 bytes just by dropping the quotes around onload, I think
10:08	<Philip`>	Oh, = is not allowed in unquoted attributes? How annoying :-(
10:08	Hixie	has started using the convention of omitting quotes around attribute values for attributes that take keywords or numbers, and having them for everything else
10:09	<Hixie>	(strings, urls, script, css, etc)
10:10	<virtuelv>	zcorpan: his goal is to stay under 256 bytes and be in standards mode
10:10	<virtuelv>	if he drops being in standards mode, he can chop it down to 220+something, I think
10:11	<Philip`>	and be conforming?
10:11	<gsnedders>	Hixie: Are you trying to say I often take us off-topic?
10:11	<gsnedders>	:P
10:12	<Hixie>	i think i'm going to change how type=file handles min/max
10:15	<gsnedders>	Hixie: If you are, I guess I can't really deny it.
10:16	<Hixie>	gsnedders: no comment :-P
10:16	<gsnedders>	Hixie: I mean, you looked at me when talking about going off-topic :P
10:44	<zcorpan>	hmm anolis cuts off web-dom-core after the comment "<!-- TypeInfo dropped -->"
10:44	<zcorpan>	wonder what's up with that
10:45	<virtuelv>	Philip`: no
10:46	<virtuelv>	but we managed to chop off a version that runs in all browsers, is valid and conforming to 248 bytes
10:46	<virtuelv>	fun way to spend half of your lunch
11:46	<zcorpan>	can i make Node.prefix readonly?
11:47	<zcorpan>	webkit and firefox don't check against Name or NCName (or anything) and opera doesn't do anything on setting
11:47	<zcorpan>	so you can end up with spaces in tags in webkit and firefox by setting .prefix = '1 2'
11:48	<zcorpan>	though maybe the serializer would change prefixes again
11:50	<zcorpan>	nope at least innerHTML in firefox doesn't
11:51	<zcorpan>	data:text/xml,<html xmlns='http://www.w3.org/1999/xhtml'><script>var e = document.createElementNS('a','b');e.prefix = '1>2';document.documentElement.appendChild(e); alert(document.documentElement.innerHTML)</script></html>
11:51	<zcorpan>	same with webkit
11:51	zcorpan	makes it readonly
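
The problem zcorpan describes comes from the prefix setter accepting any string at all. A minimal sketch of the kind of check he says is missing, assuming a simplified ASCII-only approximation of the XML NCName production (the real production also admits many non-ASCII characters). `looksLikeNCName` is a hypothetical helper, not a DOM API:

```javascript
// Hypothetical helper: ASCII-only approximation of the XML NCName production.
// The real production admits many non-ASCII name characters; this sketch does not.
function looksLikeNCName(prefix) {
  return /^[A-Za-z_][A-Za-z0-9._-]*$/.test(prefix);
}

console.log(looksLikeNCName("svg")); // true
console.log(looksLikeNCName("1 2")); // false: starts with a digit and contains a space
console.log(looksLikeNCName("1>2")); // false: starts with a digit and contains ">"
```

Making the attribute readonly, as zcorpan does here, sidesteps the need for any such check.
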
12:30	zcorpan	mumbles something about isSameNode being useless
12:36	<hsivonen>	is it just an == check for languages that don't support == or that let tearoff implementation details leak instead of overloading ==?
12:39	<zcorpan>	i'm not sure why it was added to dom3
12:39	<zcorpan>	but it's the same as ==
12:41	<zcorpan>	considering that it's bogus for Web DOM, i'll drop it and see if someone complains
12:41	<zcorpan>	hmm there's also isEqualNode
12:44	<hsivonen>	zcorpan: well, back when distinguishing "" vs. null was considered, someone said that Delphi can't distinguish those...
12:44	<zcorpan>	that's unfortunate
12:45	<hsivonen>	yes, but as a reason for Web API design completely pointless
12:45	<zcorpan>	indeed
12:46	<hsivonen>	so, I expect you to get complaints :-(
12:46	<zcorpan>	i wonder if isEqualNode is useful for something
13:10	<BenMillard>	krijnh, how about linking data: urls? From http://krijnhoetmer.nl/irc-logs/whatwg/20081002#l-282: data:text/xml,<html xmlns='http://www.w3.org/1999/xhtml'><script>var e = document.createElementNS('a','b');e.prefix = '1>2';document.documentElement.appendChild(e); alert(document.documentElement.innerHTML)</script></html>
13:10	<BenMillard>	krijnh, also irc: URLs like those at and just after http://krijnhoetmer.nl/irc-logs/whatwg/20080226#l-321.
13:11	<Philip`>	BenMillard: If you don't escape spaces/etc in data URLs, how is any algorithm meant to work out where they end?
13:12	<BenMillard>	Philip`, I guess they normally end at the end of the line in this channel?
13:12	<BenMillard>	or at </html> or something
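
Philip`'s point is that a linkifier can only find where a URL ends if the URL contains no raw spaces. A small sketch of escaping the payload before pasting it, assuming the standard `encodeURIComponent` function; `toDataUrl` is a hypothetical helper name:

```javascript
// Hypothetical helper: build a data: URL whose payload is percent-encoded,
// so the whole URL is a single whitespace-free token.
function toDataUrl(mimeType, text) {
  return "data:" + mimeType + "," + encodeURIComponent(text);
}

const payload = "<a b='1 2'/>";
const url = toDataUrl("text/xml", payload);

console.log(url);
console.log(/\s/.test(url)); // false: no whitespace left in the URL

// The original text is recoverable by decoding everything after the first comma.
console.log(decodeURIComponent(url.slice(url.indexOf(",") + 1)) === payload); // true
```
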
13:13	<BenMillard>	interestingly, Opera's IRC includes the : at the end of #l-282: in the link but doesn't include the . at the end of #l-321.
13:14	<BenMillard>	(I use Opera 9.52 for Windows XP)
13:32	annevk2	finds http://lastweekinhtml5.blogspot.com/
13:32	<annevk2>	:)
13:38	<annevk2>	hsivonen, fwiw, the tactic with "Web DOM" is that it's just for the Web browser environment and that we don't seek to obsolete other DOM specifications in any way
13:43	<hsivonen>	ok
13:46	<annevk2>	that said, it might not mitigate getting people upset as browsers would no longer follow the DOM ... series of specifications
14:06	<annevk2>	hsivonen, btw, did you see my semi bug report yesterday?
14:07	<annevk2>	hsivonen, http://krijnhoetmer.nl/irc-logs/whatwg/20081001#l-626
15:11	<hsivonen>	annevk2: thanks. now recorded: http://bugzilla.validator.nu/show_bug.cgi?id=317
15:12	<annevk2>	the other bug was that the empty string is a valid contenteditable value
15:12	<annevk2>	iirc
15:12	<annevk2>	but i'll check that now in the spec
15:13	<annevk2>	yup: "The contenteditable attribute is an enumerated attribute whose keywords are the empty string, true, and false."
15:36	<Philip`>	Opera really doesn't work well when I run out of disk space
15:38	Philip`	is glad he has reasonably-recent backups of all the .ini files
16:43	<annevk2>	zcorpan, "The hasChildNodes() method must return false if the context node's firstChild is null, and true otherwise." what if I have a custom firstChild?
16:44	<annevk2>	zcorpan, for textContent you have getting and setting reversed
16:45	<annevk2>	(re http://simon.html5.org/specs/web-dom-core for those reading the logs)
17:06	<gsnedders>	BenMillard: First lap at full speed in S2000 LM round High Speed Ring Reverse: 1:12
17:17	<gsnedders>	BenMillard: I just can't find a good line through the double S bend
17:29	<gsnedders>	BenMillard: OK, I basically consistently do around 1:10.8
17:36	<BenMillard>	gsnedders, yeah that S-bend is tough
17:37	<BenMillard>	you have to favour the 2nd part...go deep into the first part so you are on the right-hand as you start the 2nd half
17:38	<BenMillard>	gsnedders, I'm helping my dad put pictures onto eBay now, chat later :)
17:38	<takkaria>	make sure you add alt text to all of them
17:49	Philip`	wonders how one should handle fatal errors in a streaming HTTP processor
17:49	<Philip`>	i.e. where you start sending the response before you've done all the processing (hence before you've detected all possible errors)
17:50	<Philip`>	(and maybe before you've even received the whole request)
19:18	annevk2	wonders when Bugzilla will gain OpenID support
19:18	<annevk2>	(and hopefully support for linking the OpenID to an existing account)
19:40	virtuelv	wants something better than OpenID
19:40	<virtuelv>	don't misunderstand me, openid is great within the current constraints
19:40	<virtuelv>	I'd just want browsers to act like ssh
19:50	<annevk2>	how would that work well if you use 10 different browsers on 10 different platforms?
19:57	<virtuelv>	that is an open question
19:57	<virtuelv>	your question is one of key distribution
19:58	<virtuelv>	more specifically, private key distribution
20:06	<Hixie>	ssh works without any private key distribution
20:23	<Philip`>	Only if you always SSH from the same place
20:23	<Philip`>	or use lots of independent pairs of keys
20:23	<Philip`>	or just use passwords instead
20:24	Philip`	wonders how many people have their SSH private keys written on a Post-it note stuck on their monitor
20:27	<Hixie>	you're supposed to use independent keys
20:27	<Hixie>	it lets you knock one client out if it is compromised, without having to change the rest of your config
20:30	<Philip`>	But if one private key is compromised, the attacker can log in to all the other machines you have access to, which probably contain all your other 'independent' private keys, so the whole network of keys is compromised
20:31	<Hixie>	compromised was the wrong word
20:31	<Hixie>	i mean like if you sell your hard disk
20:32	<Hixie>	if you are compromised then yes, you have to nuke everything
20:32	<Philip`>	It seems much easier to remove the private key from that disk before you sell it, rather than deleting the corresponding public key from every other machine in the entire internet that your public key has been copied to
20:33	<Philip`>	(and it's not useful to remove the private key after you sell it, because the recipient will already have compromised all your accounts)
20:33	<Philip`>	Uh
20:33	<Philip`>	s/private key/public keys/
20:34	<Hixie>	wiping hard disks is non-trivial and assumes you can access the disk
20:34	<takkaria>	wiping hard disks only requires ddd
20:34	<takkaria>	*dd
20:34	<Hixie>	anyway. to each his own. :-)
20:34	<Philip`>	Wiping disks only requires a screwdriver and sandpaper
20:34	<Hixie>	not if you're going to sell it :-)
20:35	<takkaria>	though I guess the number of people who know about dd is probably lower than 1 in 1000
20:35	<Hixie>	same with ssh!
20:35	<Philip`>	If it's a disk you've used already, it's tiny and obsolete and would have negligible resale value :-)
20:36	<Philip`>	Nobody would want to buy an old 250GB disk nowadays, you couldn't even fit six months of MP3s on it
20:46	<Philip`>	hsivonen: http://about.validator.nu/htmlparser/ says "ALTER_INFOSET is now the default", but http://about.validator.nu/htmlparser/apidocs/nu/validator/htmlparser/sax/HtmlParser.html says "By default ... the this parser treats XML 1.0-incompatible infosets as fatal errors" which seems inconsistent
20:46	<Philip`>	(Also the latter says "the this parser")
20:46	<Philip`>	(which seems not grammar)
21:34	<sicking>	Hixie, ping
21:43	<Hixie>	hey
21:51	<annevk2>	hmm, web-apps-tracker cache is now ~1.4GiB
22:01	<gsnedders>	Anyone know of any way to contact markp without waiting months for him to check his email?
22:02	<Dashiva>	Commenting on his blog?
22:02	<annevk2>	!summon markp
22:02	<gsnedders>	Dashiva: Comments are only open for like two days on his blog
22:02	<gsnedders>	Dashiva: The later feedback form just goes to email, AFAIK
22:03	<Dashiva>	gsnedders: Wait for him to post a new one :)
22:03	<gsnedders>	http://diveintomark.org/tests/client/autodiscovery/ — tests 47 to 50 are broken :\
22:04	<annevk2>	just make a blog post and hope he does an ego search
22:31	<Hixie>	gsnedders: what do you want to ask him?
22:41	<sicking>	Hixie, so we need to figure out this how-to-compare-Origin-to-Access-Control-Allow-Origin thing
22:42	<Hixie>	i thought it was a string comparison
22:42	<Hixie>	no?
22:43	<sicking>	Hixie, yes, i question that that is wise
22:43	<Hixie>	for security-related things, it seems the most conservative option is best
22:43	<Hixie>	it's hard to get a subtle security bug when you do a strict string comparison
22:44	<Hixie>	if we do something else, say partially case-insensitive IDN/punycode-equivalent full-URL comparison, we're setting ourself up for serious pain
22:44	<Hixie>	ourselves
22:44	<sicking>	everyone has to have code to do that anyway
22:44	<Hixie>	yeah and everyone has had at least one security bug with it
22:45	<sicking>	so you can compare if two frames are same-origin or not
22:45	<sicking>	sure, but it needs to get fixed
22:45	<Hixie>	the frame origin comparison isn't a url comparison
22:45	<Hixie>	it's a tuple comparison of exact strings
22:45	<sicking>	so why reinvent the wheel, even if the new wheel is simpler
22:45	<sicking>	yup
22:45	<sicking>	well
22:45	<Hixie>	i'd rather not have a wheel at all, i'd rather have just a twig, or whatever is the right analogy here
22:45	<sicking>	strings and numbers
22:46	<Hixie>	string and numbers can be compared reasonably safely
22:46	<sicking>	it seems very confusing to have urls that are case sensitive though, when they aren't case sensitive anywhere else
22:46	<Hixie>	urls are a whole entire other ball game of extreme danger
22:46	<Hixie>	URLs are partially case-sensitive
22:47	<Hixie>	which is FAR more confusing than just comparing a string
22:47	<Hixie>	especially given that in most cases the string will either be hard-coded or echoed
22:47	<sicking>	but this must be a solved problem already
22:47	<sicking>	for everyone
22:48	<sicking>	no implementation i've talked to has seen any risk security wise
22:48	<Hixie>	the latest security bug with URL parsing was _last month_
22:48	<sicking>	me included
22:48	<Hixie>	18 years into the life of urls
22:48	<Hixie>	and you want to rely on that?
22:48	<sicking>	would it have affected parsing origins?
22:48	<sicking>	i am already relying on url parsing
22:48	<sicking>	so ues
22:48	<sicking>	yes
22:49	<Hixie>	i have no idea (i'm thinking of the :% -> crash in chrome)
22:49	<Hixie>	how are you relying on url parsing?
22:49	<sicking>	when comparing if two frames are same-origin
22:50	<sicking>	all urls ultimately start as strings that are parsed
22:50	<Hixie>	that's not a URL comparison, it's an exact tuple comparison of strings and numbers
22:50	<Hixie>	sure
22:50	<Hixie>	but by the time they are parsed if you got the parsing wrong you went to the wrong place
22:50	<sicking>	that's what the origin to ac-allow-origin will be too
22:50	<Hixie>	so you can't get the wrong security
22:50	<Hixie>	context
22:50	<sicking>	i'll reuse the same string->url parsing code
22:51	<sicking>	and then reuse the same url to url same origin code
22:51	<Hixie>	i really don't see any advantage to doing that
22:51	<Hixie>	it seems like asking to use a tightrope to cross a ravine when there's a perfectly good concrete bridge right next to it
22:51	<Hixie>	what's the problem with comparing strings?
22:51	<sicking>	except that the tightrope is used everywhere in the already
22:52	<sicking>	in the product
22:52	<sicking>	how do you mean?
22:52	<Hixie>	what is the problem you want to solve by not doing a string comparison? (the url parsing having happened, as with origin checking, before having a security context)
22:52	<sicking>	the problem is everywhere there is a difference from url comparison
22:52	<sicking>	which are 3 things:
22:53	<sicking>	1. scheme in url is case insensitive
22:53	<sicking>	2. domain in url is case insensitive
22:53	<sicking>	3. explicit default port is same as no port
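
The three differences sicking lists can be seen side by side in a short sketch. It assumes the `URL` class found in current JavaScript runtimes, which postdates this discussion, and uses an illustrative pair of strings; `sameOriginStrict` and `sameOriginParsed` are hypothetical names, not anything from the Access Control draft:

```javascript
// Strict comparison: the two serialisations must match character for character.
function sameOriginStrict(a, b) {
  return a === b;
}

// Parsed comparison: let a URL parser canonicalise both sides first.
// The parser lowercases the scheme and host and drops an explicit default port.
function sameOriginParsed(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

const sent = "http://example.org";
const echoed = "HTTP://Example.Org:80";

console.log(sameOriginStrict(sent, echoed)); // false
console.log(sameOriginParsed(sent, echoed)); // true
```

Which of the two behaviours is preferable is exactly what the rest of this exchange argues about.
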
22:53	<Hixie>	yes but the UA generates the string it is comparing to, so why does it matter? i mean, we could be generating hashes here too, why does it matter?
22:53	<sicking>	why aren't we using hashes?
22:54	<sicking>	or http.org.example.www(80)
22:54	<Hixie>	so that the user can see what is being compared more easily
22:54	<sicking>	user won't see this
22:54	<sicking>	my mom never will
22:54	<Hixie>	s/user/author/, sorry
22:55	<sicking>	that author will also be generating the urls for ac-allow-origin
22:55	<Hixie>	(i'm fine with using a new syntax too, it just seems gratuitous)
22:55	<sicking>	exactly, and doing string comparison is using a new syntax
22:55	<sicking>	because we no longer treat them as urls
22:56	<sicking>	i say that if something looks like a url, we should treat it as such
22:56	<Hixie>	and i'm saying that's a security nightmare waiting to happen
22:56	<sicking>	can you name any problems with url parsing that would have been made worse by using it for AC?
22:57	<sicking>	any problem with url parsing is already likely fatal
22:59	<gsnedders>	Hixie: Oh, just to tell him those tests that I mentioned above are broken
23:02	<Hixie>	sicking: no, not off-hand. But defense in depth isn't done by making sure we don't do something that could be vulnerable to an already-fixed problem, it's done by making the attack surface smaller.
23:02	<sicking>	Hixie, it's a very gratuitous defense you are inventing
23:03	<sicking>	Hixie, if you have no idea what you are protecting yourself from, how do we know what you are inventing is better
23:03	<Hixie>	it's simpler
23:03	<sicking>	Hixie, you are also adding risks by inventing new formats
23:03	<Hixie>	simpler is better :-)
23:03	<Hixie>	it's not a new format, it's a canonicalised URL
23:03	<sicking>	simpler how? for who?
23:04	<Hixie>	(s1 == s2) is simpler than (ParseURL(s1) == ParseURL(s2))
23:04	<sicking>	simpler for who?
23:04	<Hixie>	and it's simpler for authors and implementors. for authors because once they have found the string that works, it'll always work, and for implementors because, well, (s1 == s2) is simpler than (ParseURL(s1) == ParseURL(s2))
23:05	<sicking>	it's not simpler for me, it's about the same number of lines of code (about 5 vs 8), it's not simpler for authors as they have a surprising inconsistency with all other urls in the product
23:05	<othermaciej>	as an implementor I'd say neither is particularly harder or easier
23:06	<Hixie>	seriously? doing a security audit of the url parsing is as simple as a security audit of just comparing two strings?
23:06	<Hixie>	those must be some damn complicated string comparisong algorithms y'all are using
23:06	<Hixie>	comparison
23:06	<sicking>	the code is already audited in both cases
23:07	<sicking>	Hixie, i don't have two strings, i have a string (the ac-allow-origin header) and a url (the requesting site)
23:07	<othermaciej>	url parsing and comparison (and specifically of a url string converted to our SecurityOrigin class) is already audited
23:07	<othermaciej>	and already heavily exposed
23:07	<Hixie>	sicking: you have two strings, the string you sent the server, and the string it sent back
23:08	<othermaciej>	I would be less nervous security-wise about a comparison of two SecurityOrigin objects than two Strings, but seems to me either could be sound
23:08	<sicking>	Hixie, i don't have the string sent to the server saved. I could save it, but it'd add about the same number of lines of code that parsing to a url adds
23:09	<sicking>	yeah, i think both these solutions are simple and both are secure. The difference is that one seems more surprising to authors
23:11	<othermaciej>	sicking: what's an example of a case where the author would see different behavior between the two?
23:11	<sicking>	othermaciej, HTTP://Example.Org:80 vs http://example.org contains all 3 differences that i can think of
23:12	<othermaciej>	sicking: ok, but walk me through the scenario
23:12	<othermaciej>	so let's say you loaded from a page of HTTP://Example.Org:80
23:12	<virtuelv>	sicking: what about http://example.org./ vs http://example.org/
23:12	<Hixie>	virtuelv: those are different
23:13	<sicking>	virtuelv, what hixie said
23:13	<Hixie>	my concern is with things like http:///example.com vs http://example.com, where one UA has a bug and treats it differently than other UAs
23:13	<sicking>	othermaciej, well, the Origin header that the site is going to receive is always going to be on the "canonical" form (per spec)
23:14	<Hixie>	or http:://, or http://foo.com:80@bc:81/ or whatever
23:15	<Hixie>	seems like for something this important you shouldn't need to be a spec lawyer to be able to tell if the string is going to work or not
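
Hixie's worry about lenient parsing can be illustrated with his own triple-slash example. This is a sketch assuming a parser that follows the WHATWG URL Standard (as the `URL` class in current runtimes does); that standard postdates this discussion, and parsers of the time differed, which is the point being made. `strictMatch` and `parsedMatch` are hypothetical names:

```javascript
// Strict comparison of the two header strings.
function strictMatch(sent, received) {
  return sent === received;
}

// Comparison after parsing; input that fails to parse never matches.
function parsedMatch(sent, received) {
  try {
    return new URL(sent).origin === new URL(received).origin;
  } catch (e) {
    return false;
  }
}

const sent = "http://example.com";
const malformed = "http:///example.com"; // one slash too many

console.log(strictMatch(sent, malformed)); // false: always fails, so easy to catch
console.log(parsedMatch(sent, malformed)); // true here: this parser skips the extra slash
```

A parser that handled the extra slash differently would give a different second answer, while the first answer is the same everywhere.
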
23:15	<sicking>	othermaciej, in most cases the server is going to have some sort of list of origins they want to approve. For example some list like ".company.com", ".partner.com"
23:15	<sicking>	othermaciej, and then match the Origin against that
23:16	<sicking>	othermaciej, and then send back a matching origin if one is found
23:16	<sicking>	othermaciej, (they're basically forced to do this given how simple the syntax of the AC-Allow-Origin header is)
23:17	<sicking>	othermaciej, so if they generate the send-back string with a uppercase Company.com because that's how they normally type the company url internally, they'll currently fail
23:17	<Hixie>	but they'll _always_ fail, so it'll be trivial to catch
23:18	<Hixie>	with url parsing, they might find it works in the UAs they test, but some syntax error means it fails in the rest
23:18	<Hixie>	and they won't know until the deploy
23:18	<Hixie>	they
23:18	<othermaciej>	sicking: so if you can generate the
23:19	<othermaciej>	"canonical" form to send it
23:19	<othermaciej>	you can generate it for the comparison too
23:19	<othermaciej>	I see the point for the send-back
23:19	<othermaciej>	requires a canonicalization on the server side, possibly
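
One way a server could avoid the canonicalisation problem under strict string comparison is to echo back the Origin string it received rather than a hand-typed one. A minimal sketch, assuming the allow-list of host suffixes sicking mentions and the `URL` class of current JavaScript runtimes; `allowOriginFor` is a hypothetical function, not part of any specification:

```javascript
// Allow-list of host suffixes, taken from sicking's example.
const ALLOWED_SUFFIXES = [".company.com", ".partner.com"];

// Returns the value to send back in the allow-origin header, or null for no match.
// The received string is echoed verbatim, so a strict comparison in the UA succeeds.
function allowOriginFor(originHeader) {
  let host;
  try {
    host = new URL(originHeader).hostname;
  } catch (e) {
    return null; // not parseable as a URL: never allowed
  }
  const allowed = ALLOWED_SUFFIXES.some((suffix) => host.endsWith(suffix));
  return allowed ? originHeader : null;
}

console.log(allowOriginFor("http://www.company.com")); // "http://www.company.com"
console.log(allowOriginFor("http://evil.example"));    // null
```

Because the suffixes start with a dot, a host such as `evilcompany.com` does not match; the bare host `company.com` does not match either and would need its own entry.
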
23:20	<sicking>	right
23:20	<sicking>	Hixie, differences like that are already breaking things and needs to be fixed
23:21	<Hixie>	not necessarily
23:21	<Hixie>	but sure, any interop issues should be fixed
23:21	<Hixie>	my point is just that this is too important to elevate all url parsing interop issues to that level
23:21	<sicking>	i don't think this is any more important than any other place
23:22	<Hixie>	it'd be like saying "well a lot of our houses are made of wood, and wood is quite flammable, so we have to fix that anyway, it's not a reason to not build the dam using wood"
23:22	<sicking>	but this is just another house, this is no more important than anything else in the browser
23:22	<Hixie>	?!
23:23	<sicking>	if two strings parsed into urls are considered same-origin in one UA but not in another we're already in the same trouble that would happen here
23:23	<Hixie>	this controls the ability for any random site to transfer all my money out of my bank account
23:23	<othermaciej>	I don't think URL canonicalization is risky
23:23	<Hixie>	it's FAR more important than, say, whether i can type a URL safely in the address bar
23:23	<othermaciej>	I think having different definitions of same-origin is more risky
23:23	<Hixie>	i agree with maciej
23:23	<sicking>	Hixie, so does the same-origin-url-comparison code that checks if cross-frame-scripting is allowed
23:24	<othermaciej>	Hixie: the upshot of what I'm saying is that Access-Control defining same-origin differently is arguably bad
23:24	<Hixie>	othermaciej: this isn't defining same-origin differently
23:24	<othermaciej>	since other same-origin checks canonicalize
23:25	<Hixie>	othermaciej: it's serialising the origin, and then comparing it to someone else's attempt at doing the same thing
23:25	<othermaciej>	I guess you could see it that way
23:25	<Hixie>	origin comparisons aren't url comparisons, they're the same thing as here, except without serialising the origin, just holding it as a tuple
23:25	<sicking>	so for what it's worth, IE does have a different concept of what is same-origin than everyone else
23:26	<sicking>	they don't consider ports to be part of the origin
23:26	<Hixie>	(they could as easily be done by serialising, except for the unique ID case)
23:26	<Hixie>	anyway
23:26	<sicking>	apparently they do for same-site XHR starting with IE7, but nowhere else
23:26	<Hixie>	i'm not the editor of the AC spec, so i'm not the one you have to convince :-)
23:28	<sicking>	anne has said he'll follow what websockets do
23:28	<sicking>	though technically speaking its not up to the editor but the group as a whole
23:28	<sicking>	i'd just rather it not get to that of course
23:29	<Hixie>	well technically it's up to Tim
23:29	<Hixie>	but sure
23:29	<sicking>	this is far too unimportant to raise to a vote even
23:29	<sicking>	imho
23:29	<sicking>	i just don't see any advantages to the current behavior
23:29	<Philip`>	hsivonen: When using your HTML2XML, any spaces in attributes get turned into \x20\xEF\BF\xBD (or whatever that is in UTF-8), which seems kind of weird and undesired
23:30	<Hixie>	sicking: so where do you stand on WebSocket-Location?
23:30	<sicking>	Hixie, what is that?
23:31	<Hixie>	should that be parsed and components recompared too?
23:31	<sicking>	yeah, in general i think that when we compare url like things, we should compare them as urls
23:31	<Hixie>	cripes that makes things a lot more complicated for websocket
23:32	<Hixie>	so should relative URLs here be resolved ?
23:32	<sicking>	i don't know anything about websockets
23:32	<Hixie>	how should I handle IDN/punycode?
23:32	<sicking>	i was just told they had something similar as Access-Control-Allow-Origin
23:33	<Hixie>	another reason to not do URL comparison is that we might not always be dealing with URLs, e.g. if bz gets his way and we serialise the unique ID origins too
23:33	<sicking>	so IDN is a good point actually
23:34	<Hixie>	for WebSocket I really fear the mess that will occur if we start reparsing the output from the server
23:34	<Hixie>	the whole point is that we want to make the handshake as hard as possible to fake
23:34	<sicking>	i don't know if www.Å.com is the same as www.å.com
23:34	<sicking>	(not sure if that comes out correct, should be uppercase a-ring vs lowercase a-ring)
23:35	<sicking>	if the web in general consider those same-origin
23:35	<sicking>	but string comparison wouldn't, that seems bad
23:35	<Hixie>	string comparison compares the ASCII serialisation
23:35	<Hixie>	so you would compare the punycode version
23:35	<Hixie>	you can't send non-ASCII over HTTP
23:35	<Hixie>	HTTP doesn't define an encoding beyond ASCII iirc
23:36	<sicking>	are the punycode of the two the same?
23:39	<Hixie>	no idea, but that's not our problem at that point
23:41	<sicking>	well, it sort of is if string-comparison results in different behavior than url comparison
23:43	<Hixie>	how can it result in different behaviour?
23:43	sicking	ponders
23:45	<sicking>	so if the requesting page is www.å.com
23:46	<sicking>	actually, looks like they produce the same punycode
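
sicking's observation can be checked with a short sketch. It assumes a runtime whose `URL` parser applies IDNA case mapping before Punycode encoding (the WHATWG URL Standard requires this, but that standard postdates the discussion):

```javascript
// Host names with uppercase and lowercase a-ring, written as escapes.
const upper = new URL("http://www.\u00C5.com").hostname;
const lower = new URL("http://www.\u00E5.com").hostname;

console.log(lower);           // the ASCII (Punycode) form of the host
console.log(upper === lower); // true: both map to the same ASCII form
```

Under that assumption a strict comparison of the ASCII serialisations treats the two spellings as the same host.
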