#whatwg on 2009-04-27

01:11	<hsivonen>	Hixie: Re: http://lists.w3.org/Archives/Public/public-html/2009Mar/0503.html
01:12	<hsivonen>	Hixie: Firefox appends more text to the last text node it was appending to
01:12	<hsivonen>	Hixie: IE appends more text as child of the element it previously appended text as a child
01:13	<hsivonen>	Hixie: coalescing the text nodes if the existing last child is a text node
01:14	<hsivonen>	Hixie: so Firefox and IE are different if you insert an element from script between parser text flushes
01:41	<Hixie>	hsivonen: can you give an example of what you mean?
01:41	<hsivonen>	Hixie: I had an example in the email
01:41	<hsivonen>	looking it up
01:43	<hsivonen>	Hixie: http://hsivonen.iki.fi/test/moz/flushing-document-written-text-no-alert.html is different for you in Firefox 3.5 and IE8, right?
01:43	<Hixie>	i meant like a tiny example :-)
01:44	<Hixie>	is your example basically <script>append a text node</script>x -- are there two text nodes or one?
01:45	<hsivonen>	Hixie: no, the crux is the line document.getElementById("foo").appendChild(document.createElement("span"))
01:45	<Hixie>	my confusion is with "Firefox appends more text to the last text node it was appending to", since if a script is running, the last text node it inserted into was the <script>'s child node
01:46	<Hixie>	you're saying you want <script>write 'aaa'; append element node</script>bbb to result in "aaabbb element node" rather than "aaa element node bbb"?
01:46	<Hixie>	i'm very confused
01:47	<hsivonen>	Hixie: former in Firefox, latter in IE
01:47	<Hixie>	well clearly imho aaabbb element is outright incorrect.
01:48	<hsivonen>	Hixie: you want to spec what IE does?
01:48	<hsivonen>	WebKit is element aaabbb, IIRC
01:48	<hsivonen>	and Opera
01:48	<Hixie>	the spec already requires the text after the <script> to come after whatever state the DOM is in after the script executes.
01:50	<Hixie>	element aaa bbb is definitely wrong per spec, it means the parser isn't re-entrant.
01:50	<Hixie>	(or acts as if it's not, which is worse)
01:50	<Hixie>	http://www.hixie.ch/tests/adhoc/html/parsing/026-demo.html
01:51	<hsivonen>	Hixie: the webkit/opera behavior flows naturally from a late-flushed accumulation buffer
01:51	<Hixie>	webkit's behaviour is clearly wrong.
01:51	<Hixie>	then a late-flushed accumulation buffer is a bad dea.
01:51	<Hixie>	idea
01:51	<hsivonen>	Hixie: it's 2 out of 4 :-)
01:52	<hsivonen>	I'll have to get back on this when I have a decent text input method
01:52	<Hixie>	IE's behaviour in http://www.hixie.ch/tests/adhoc/html/parsing/026-demo.html is what the spec requires
01:52	<Hixie>	and is the only behaviour i've found in the four browsers i tested that is even remotely sane.
01:53	<Hixie>	anyway, as i noted in my e-mail reply, just use the text node's own buffer as the buffer
01:53	<hsivonen>	Hixie: you have an interesting notion of sane
01:53	<Hixie>	and then you get IE's behaviour for free
01:53	<Hixie>	really? you don't think having things come out in the order they were written is sane?
01:53	<hsivonen>	Hixie: nope, then you get the Firefox behavior
01:54	<Hixie>	you only get firefox's behaviour if you don't check to make sure the text node is still the last child each time a script runs
01:54	<Hixie>	(which is a trivial pointer test)
01:54	<hsivonen>	Hixie: my independent implementation matched webkit and opera, and I like to think I'm sane
01:54	<Hixie>	i'm not saying the implementor isn't sane
01:55	<Hixie>	i'm saying that having nodes come out in an order different than the order they went in is unintuitive
01:55	<hsivonen>	depends on whether your intuition is buffered :-)
01:56	<Hixie>	reload http://www.hixie.ch/tests/adhoc/html/parsing/026-demo.html
01:56	<Hixie>	i really think that only IE's behaviour here matches author expectations
01:56	<Hixie>	and i don't buy that it's hard to implement.
01:57	<hsivonen>	I guess I'll have to implement it then, even though WebKit and Opera have gotten away with the simple behavior
02:07	<Hixie>	hsivonen: simpler for authors is more important than simpler for implementors :-)
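The three behaviours discussed above can be compared with a toy model. This is only a sketch: the `Text` and `Element` classes and the strategy names `late_flush`, `remember_node` and `check_last_child` are hypothetical labels for the WebKit/Opera, Firefox and IE behaviours as described in the log, not code from any browser. The scenario is the one from 01:46: the parser has seen "aaa", a script appends an element to the same parent, then the parser sees "bbb".

```python
# Toy model only: Text/Element and the strategy names are hypothetical,
# not taken from any browser's source.

class Text:
    def __init__(self, data):
        self.data = data

class Element:
    def __init__(self, name):
        self.name = name
        self.children = []

def describe(parent):
    """Render the children as a list of strings for easy comparison."""
    return [c.data if isinstance(c, Text) else "<" + c.name + ">"
            for c in parent.children]

def run(strategy):
    parent = Element("div")

    def script_appends_element():
        # Stands in for foo.appendChild(document.createElement("span")).
        parent.children.append(Element("span"))

    if strategy == "late_flush":
        # Text sits in an accumulation buffer and reaches the DOM late.
        pending = "aaa"
        script_appends_element()
        pending += "bbb"
        parent.children.append(Text(pending))
    elif strategy == "remember_node":
        # Keep appending to the text node that was last appended to.
        node = Text("aaa")
        parent.children.append(node)
        script_appends_element()
        node.data += "bbb"
    elif strategy == "check_last_child":
        # Coalesce only if the parent's last child is (still) a text node.
        parent.children.append(Text("aaa"))
        script_appends_element()
        last = parent.children[-1]
        if isinstance(last, Text):
            last.data += "bbb"
        else:
            parent.children.append(Text("bbb"))
    else:
        raise ValueError(strategy)
    return describe(parent)

for name in ("late_flush", "remember_node", "check_last_child"):
    print(name, run(name))
```

Only `check_last_child` keeps the nodes in the order they were written, which is the property Hixie argues for; the extra cost over `remember_node` is the single last-child test.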
02:50	<hsivonen>	http://lists.xml.org/archives/xml-dev/200904/msg00062.html
02:51	<hsivonen>	http://lists.xml.org/archives/xml-dev/200904/msg00061.html
04:03	<Hixie>	heycam: be snarky here instead :-)
04:04	<heycam>	#whatwg are indeed the go-to guys for snark!
04:05	<heycam>	if i'm going to spend some time doing something, it may as well be to fix things rather than snark about them, anyway
04:08	<Hixie>	boring!
04:10	<heycam>	:)
06:39	<heycam>	http://www.unicode.org/charts/PDF/UBOOP.pdf
07:03	<Hixie>	http://lists.w3.org/Archives/Member/chairs/2009AprJun/0035.html
07:03	<Hixie>	i wonder if at our next rechartering we can get the htmlwg chartered to 2022 on the same basis
07:04	<Hixie>	"here are some realistic times"
09:57	<annevk2>	Hixie, I e-mailed some comments on webstorage last Friday and it seems they had some delay while getting to the list
09:57	<annevk2>	Hixie, do you have them?
12:07	<zcorpan>	yay http://wiki.whatwg.org/wiki/Web_ECMAScript#HTML_comments
12:09	<annevk2>	yay as in mess?
12:17	<jgraham>	annevk2: What do you mean? The IE behaviour is totally sane and I can't see why everyone doesn't do that
12:18	<jgraham>	(beware: low-flying sarcasm)
12:21	<Philip`>	SpiderMonkey's behaviour is different if you use ;version=something-large-enough or ;e4x=1
12:22	<jgraham>	The e4x thing makes sense
12:22	<jgraham>	But the version thing?
12:22	<Philip`>	I might be lying about that
12:22	<zcorpan>	i don't find any differences with version=1.8
12:23	<Philip`>	Oh, I'm not lying
12:23	<Philip`>	http://philip.html5.org/demos/js/jsversion.html - see the "<!-- 1 -->" line
12:23	<Philip`>	(in Firefox)
12:24	<zcorpan>	but <!-- 1 --> is not "line that starts with just whitespace and comments is treated as a line comment" is it?
12:25	<zcorpan>	s//a/
12:25	<Philip`>	No, but it's an HTML comment in the script, and the behaviour is different between ;versions
12:47	<hendry>	what's the issue with DOMTimeStamp? http://lists.whatwg.org/pipermail/commit-watchers-whatwg.org/2009/001994.html
12:50	<hendry>	http://www.w3.org/TR/DOM-Level-3-Core/core.html#Core-DOMTimeStamp DOMTimeStamp is long for non-JS
12:50	<Philip`>	hendry: http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-February/018487.html and associated thread seems relevant
12:51	<hendry>	urgh
13:12	<zcorpan>	Hixie: "The rules right now are really simple -- you only need quotes if your value includes spaces, quotes, or equal signs." ...or greater than
13:15	<Philip`>	or if your value is the empty string
13:23	<zcorpan>	there you go, it's not actually really simple :)
13:24	<zcorpan>	and if you care about compat with firefox or safari or ie or opera then it becomes more complicated still
13:25	<zcorpan>	which is not documented in the spec
13:25	<zcorpan>	s/opera/old versions of opera/
13:26	<jgraham>	So it's only simple if you don't learn any rules except to always quote attribute values
13:27	<jgraham>	Which coincidentally is the fashion amongst hip web developers
13:28	<Philip`>	It's the fashion amongst pretty much all web developers, not just hip ones, as far as I can see
13:29	<Philip`>	(judging by http://canvex.lazyilluminati.com/misc/stats/tokeniser.html having 6902804 double-quoted characters, vs 387236 single-quoted and 376258 unquoted
13:29	<Philip`>	)
13:30	<jgraham>	Oh well maybe it's only the hip web developers who go around telling other people how uncool they are for not using quotes
13:31	<zcorpan>	so unquoted attributes are as common as single quoted attributes?
13:33	<zcorpan>	i.e. about 5%
13:33	<Philip`>	jgraham: It's the web developers who have gone beyond hip, who omit needless tags and quotes and everything
13:33	<Philip`>	and often omit compatibility with real web browsers too
13:33	<Philip`>	zcorpan: Only if you trust my data, which you shouldn't
13:34	<Philip`>	because it's counting numbers of tokeniser state transitions that are not uppercase letters or ampersands etc
13:34	<zcorpan>	Philip`: it was the only data i had at hand :)
13:34	<Philip`>	rather than counting something useful like number of attributes
13:35	<Philip`>	(But that's the only data I've acquired from an instrumented tokeniser, so I don't have anything better)
13:35	Philip`	tries to find the quote he's thinking of
13:36	<zcorpan>	Philip`: can unquoted attributes go through more state transitions than a single quoted attribute because of entities?
13:36	<Philip`>	jgraham: It's only the web developers who are so hip they can hardly see over their pelvises
13:37	<Philip`>	zcorpan: I don't think entities make a difference; I was just counting the number of times the "Anything else" transition is encountered
13:37	<Philip`>	Uh
13:37	<Philip`>	Did I say something about uppercase letters?
13:37	<Philip`>	Please ignore that entirely
13:39	<Philip`>	Ooh, actually my data probably does say how many attributes there were
13:40	<Philip`>	BeforeAttributeValueState: currentCharacter == 34 -- 366746
13:40	<Philip`>	BeforeAttributeValueState: true -- 52451
13:40	<Philip`>	BeforeAttributeValueState: currentCharacter == 39 -- 17591
13:40	<Philip`>	which are (respectively) the starts of double-quoted, unquoted and single-quoted attribute values
13:40	<Philip`>	("true" means the "Anything else" case)
13:41	<Philip`>	One can thus conclude that double-quoted attribute values have average length 19 characters, single-quoted 22, and unquoted 7
13:44	<Philip`>	jgraham: Oh, actually it might be pelves
13:49	<zcorpan>	so 84% double-quoted, 12% unquoted and 4% single-quoted
13:50	<Philip`>	Something like that
13:50	<Philip`>	based on a small, unspecified, biased sample of pages
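The percentages and average lengths quoted above can be rechecked from the raw figures in the log. A minimal sketch, assuming the character counts pasted at 13:29 and the state-transition counts pasted at 13:40 describe the same sample; the dictionary names are hypothetical.

```python
# Figures copied from the log; the variable names are illustrative only.
value_starts = {"double": 366746, "unquoted": 52451, "single": 17591}
value_chars = {"double": 6902804, "unquoted": 376258, "single": 387236}

total_starts = sum(value_starts.values())

# Share of attribute values by quoting style, as whole percentages.
share = {k: round(100 * v / total_starts) for k, v in value_starts.items()}

# Average number of characters per attribute value, rounded.
avg_len = {k: round(value_chars[k] / value_starts[k]) for k in value_starts}

print(share)
print(avg_len)
```

The caveat from the log still applies: the sample is small, unspecified and biased, so these are rough proportions rather than measurements of the web.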
14:00	<Lachy>	The rules for unquoted attributes are relatively simple because most of the special characters that require quotes are rather intuitive because they're the characters that have some other special meaning in the tag
14:03	<Philip`>	Is it intuitive that you have to include quotes if the value contains U+000C?
14:04	<Philip`>	where "intuitive" means you don't have to read the spec in detail to discover it
14:10	<Philip`>	(And if you're implementing an HTML5 serialiser, and even if you do read the spec in detail, is it intuitive that you also have to include quotes when the value contains ` or < ?)
14:15	<Lachy>	Philip`, yes, because '>' is the character used to end the tag, so it seems fairly obvious that in <span class=foo>bar..., the attribute only contains "foo"
14:33	<zcorpan>	Lachy: he didn't ask about ">"
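The conditions collected in this exchange (spaces, quotes, equals signs, greater-than, the empty string, U+000C, and for serialisers also ` and <) can be gathered into one check. A minimal sketch: `needs_quotes` and `serialize_attribute` are hypothetical helper names, and treating "spaces" as tab, line feed, form feed, carriage return and space is an assumption about what the spec means, not something stated in the log.

```python
# Assumption: "spaces" means tab, LF, FF (U+000C), CR and space.
WHITESPACE = "\t\n\x0c\r "
# Characters named in the discussion as forcing quotes.
SPECIAL = "\"'=<>`"

def needs_quotes(value: str) -> bool:
    """Return True if an attribute value cannot safely be left unquoted."""
    if value == "":
        return True
    return any(ch in WHITESPACE or ch in SPECIAL for ch in value)

def serialize_attribute(name: str, value: str) -> str:
    """Serialise name/value, quoting only when the check above requires it."""
    escaped = value.replace("&", "&amp;")
    if needs_quotes(value):
        return name + '="' + escaped.replace('"', "&quot;") + '"'
    return name + "=" + escaped

print(serialize_attribute("class", "foo"))
print(serialize_attribute("title", 'a "b"'))
print(serialize_attribute("alt", ""))
```

A serialiser that always quotes avoids the whole check, which is the point jgraham makes at 13:26.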
14:49	<drostie>	so, an idea that came up in developer.mozilla.org/Talk:En/JavaScript intrigued me, although the way that the poster presented it was ill-defined.
14:51	<drostie>	A lot of people spend a lot of time with bad or generic parsers against XSS. HTML 5 could really reduce this problem if it included a <sandbox> element or so, which restricted the sorts of things you find in XSS exploits anyway. The parser would only have to check for <sandbox> and </sandbox> tags instead of everything that could launch a script.
14:51	<drostie>	The browser would make sure, e.g., that a <b> tag started inside of a sandbox didn't leave it, and that scripts didn't execute within the sandbox, and so forth.
14:53	<jgraham>	Philip`: What was the sample size for the comment data that you got for me (i.e. the total number of pages)
14:53	<drostie>	The only thing I'm thinking is that it would probably play not-nice with DOM methods.
14:55	<jgraham>	drostie: That has been discussed before and it turns out to not be quite as simple as you would hope. I'll let someone else explain though (or you can check the mail archives)
14:58	drostie	goes a-searching ^_^
14:58	<jgraham>	Philip`: Oh it says. I am blind :)
14:59	<Philip`>	jgraham: I guess it was the 425K pages?
15:00	<Philip`>	jgraham: Being blind is not an excuse for failing to read the plain text that says where the data came from :-p
15:01	<Philip`>	I suppose one problem with this dataset is it's going to be heavily biased towards sites with many pages
15:01	<Philip`>	e.g. a forum which exposes a million URLs will be a million times more likely to be included in the sample than a site that's just a single page
15:02	<Philip`>	(though the population is large enough that a single site doesn't occur more than dozens of times, if I remember correctly)
15:02	<zcorpan>	drostie: if those pages would use HTML5 parsers instead of "generic parsers", wouldn't the XSS problems go away?
15:04	<drostie>	zcorpan: that would depend -- but it's much easier to use a generic parser that replaces < and > with character entities than it is to write an HTML5 parser.
15:05	<annevk2>	drostie, there's a plan to address that through <iframe doc="string of html" seamless sandbox></iframe> iirc
15:05	<drostie>	But jgraham is right, there's a lot of discussion about this idea on the mailing list; and one good reason it will never work is because you will always need to parse the content anyway, for legacy clients.
15:06	<drostie>	anne: I'll look at that as well. :D
15:06	<annevk2>	drostie, the fallback would be a src=cross-origin
15:06	<annevk2>	drostie, note that doc= is not defined currently
15:08	<Philip`>	zcorpan: They would have to use HTML5 parsers plus sanitisers that have zero bugs
15:08	<Philip`>	plus serialisers that have zero bugs
15:08	<jgraham>	Philip`: Only zero bugs that result in exploitable holes
15:08	<Philip`>	and also browsers must not be buggy or non-standard
15:08	<Philip`>	Then you'll be secure
15:10	<Philip`>	In the meantime, the 'defence in depth' thing seems like a good idea
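The "generic" approach drostie mentions at 15:04, replacing < and > with character entities, looks roughly like the following. A minimal sketch using Python's standard `html.escape`; the wrapper name `render_comment` and the sample input are hypothetical. It treats user input as plain text only, so it does not cover the case where some markup must be allowed through, which is where the sanitiser bugs Philip` describes come in.

```python
import html

def render_comment(user_text: str) -> str:
    """Embed untrusted text in a paragraph, escaping all markup characters."""
    # html.escape replaces &, <, > and (with quote=True) " and '.
    return "<p>" + html.escape(user_text, quote=True) + "</p>"

print(render_comment("<script>alert(1)</script>"))
```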
16:24	<gsnedders>	Oh dear
16:24	gsnedders	finds photos of him
20:37	<jgraham>	Note to the people who designed github: I am really stupendously uninterested in whether I get a zip file or a tar.gz file
21:14	<Philip`>	jgraham: I imagine other people might care, since Windows users can only read zip, and users of more functional OSes can read both but .tar.gz is often much more efficient
21:19	<jgraham>	Philip`: There is no excuse for making the UI "click a link marked download, get a modal overlay thing asking what type of file I want". In almost all cases the extra efficiency of tar.gz will save much less download time than the extra time taken to deal with an unexpected popup
21:20	<jgraham>	So just going with zip would be fine
21:20	<jgraham>	Or if they really think people need a choice it should be two different links
21:23	<Philip`>	jgraham: Oh, that does sound a bit annoying
21:33	<gsnedders>	Somebody teach me how to solve first and second order differential equations.
21:33	Philip`	forgets what first and second order differential equations are
21:34	<gsnedders>	Things in the Advanced Higher maths course :)
21:35	<Philip`>	That's not an entirely helpful description
21:44	<jgraham>	gsnedders: In general it's not possible or at least not easy, I think. But you only need to know special forms, not in general
21:47	<jgraham>	gsnedders: So if you really want help you'll need to be more specific about what you want to know
21:47	<gsnedders>	jgraham: What is required in the AH course.
21:47	<gsnedders>	:P
21:53	<jgraham>	gsnedders: The bit of the syllabus that I found so far talks about equations in the form dy/dx = F(x)/G(y)
21:54	<gsnedders>	No, not them
21:54	gsnedders	googles for ah maths and finds his school's website
21:56	<jgraham>	gsnedders: you mean ones like a(x)dy/dx + b(x)y = f(x)?
21:56	<gsnedders>	yeah
21:57	<gsnedders>	Search http://www.sqa.org.uk/files_ccc/Maths_AH_5th_ed.pdf for differential equations
21:57	<jgraham>	Ah, well you do what it says then. You rearrange it to the form dy/dx + F(x)y = G(x)
21:58	<jgraham>	and notice that multiplying both sides by exp(int(F(x), dx)) gives you something that you can write as an exact differential
22:00	<jgraham>	since d/dx(y*exp(int(F(x), dx))) = F(x)*exp(int(F(x), dx))*y + dy/dx*exp(int(F(x), dx))
22:01	<jgraham>	(excuse the probably mismatched brackets)
22:02	<jgraham>	That's applying the chain rule to d/dx(exp(g(x))) where g(x) happens to be an integral
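Written out, the integrating-factor step jgraham describes goes as follows. The symbols F, G and y are from the log; the name \mu for the integrating factor, the constant C and the worked example dy/dx + y = x are illustrative additions, not from the conversation or the syllabus.

```latex
% Standard form and integrating factor
\frac{dy}{dx} + F(x)\,y = G(x), \qquad \mu(x) = \exp\!\left(\int F(x)\,dx\right)

% Product rule plus chain rule: \mu'(x) = F(x)\,\mu(x), so
\frac{d}{dx}\bigl(\mu(x)\,y\bigr) = \mu(x)\,\frac{dy}{dx} + F(x)\,\mu(x)\,y = \mu(x)\,G(x)

% Integrate both sides and divide by \mu
y = \frac{1}{\mu(x)}\left(\int \mu(x)\,G(x)\,dx + C\right)

% Illustrative example: F(x) = 1, G(x) = x, so \mu = e^{x}
\frac{d}{dx}\bigl(e^{x} y\bigr) = x e^{x}
\;\Longrightarrow\; e^{x} y = (x - 1)e^{x} + C
\;\Longrightarrow\; y = x - 1 + C e^{-x}
```

Substituting back gives dy/dx + y = (1 - C e^{-x}) + (x - 1 + C e^{-x}) = x, as required.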
22:02	gsnedders	badly needs to learn maths
22:03	<jgraham>	gsnedders: The important thing to realise is that a) you can't do everything "forwards"; sometimes you need to know/intuit the answer
22:03	<jgraham>	and b) you can get a long way with practice
22:03	<gsnedders>	Well, for b) I have four weeks until the exam
22:04	<jgraham>	gsnedders: Well make good use of it :)
22:05	<gsnedders>	This week I'm kinda stuck with second prelims and ball, so only three weeks really
22:05	<gsnedders>	ahhhh
22:06	Philip`	notes that balls aren't really very productive, so you could spend that time doing fun maths instead
22:06	<gsnedders>	And ditch the gorgeous girl I'm meant to be going with now? :P
22:06	<Philip`>	Sure
22:07	<Philip`>	It's statistically unlikely that you'll marry her, so it's just a waste of time in the long run
22:07	<gsnedders>	LOL
22:09	jgraham	sees a flaw in that argument
22:09	<jgraham>	Well I say one, but that's really an insult to arguments with just one flaw
22:11	<gsnedders>	Philip`: I am, however, avoiding being overly unproductive by not going to the afterball
22:11	<Philip`>	I suppose it depends on how much value you place on fleeting happiness
22:14	<jgraham>	Philip`: Not at all. It assumes that the experience will have a negligible impact on your future life
22:14	<jgraham>	Which seems like an entirely unsupportable hypothesis
22:15	<jgraham>	(it also assumes that marriage is an end unto something, which is rather silly)
22:32	jgraham	notes that gsnedders should in no way use his previous argument to prioritise the ball over maths revision :)
23:02	<gsnedders>	jgraham: Sorry, but, girl I'm going with > you.
23:08	jgraham	has no idea what that has to do with anything
23:09	jgraham	decides it is bed time