#whatwg on 2008-08-13

01:36	<Hixie>	i accidentally wrote </li> instead of </ol> and the validator said "error fatal: Too many messages" after spewing about one error per paragraph in html5
01:36	<Hixie>	i may have to admit that the html5 parser's recovery is not necessarily what a validator should try to do
06:55	<heycam>	"A person's name is not the title of a work — even if people call that person a piece of work" -- heh
08:09	<hsivonen>	Hixie: I found obsolete attributes useful today: http://hsivonen.iki.fi/performance-mistake/
08:10	<hsivonen>	Hixie: what would have been the right way to do the table cell alignment?
08:11	<hsivonen>	huh? is the W3C establishing a new Activity for EOT?
08:22	<virtuelv>	hsivonen: url? (that eot thing)
08:22	<hsivonen>	virtuelv: http://lists.w3.org/Archives/Public/www-archive/2008Aug/att-0010/EOT-charter-draft-1.html
08:24	virtuelv	shall refrain from commenting further
09:08	<zcorpan>	Hixie: "and required that if two of these attributes are specified (or if all three are specified, in text/html), they have the same value." -- afaict it's not allowed to specify all 3 per the spec
09:09	<zcorpan>	"Authors must not use the xml:lang attribute (that is, the lang attribute with the xml prefix in the http://www.w3.org/XML/1998/namespace namespace) in HTML documents."
09:10	<hsivonen>	xml:lang causes so much trouble
09:11	<hsivonen>	in retrospect, XML 1.0 should have reserved id, class and lang
09:13	<zcorpan>	hsivonen: i've came to that conclusion too
09:13	<zcorpan>	s/came/come/
09:14	hsivonen	decides to start measuring the performance cost of the HTML5 tree construction layer
10:44	<Hixie>	hsivonen: i think we may want to expose charoff on col or td
10:44	<Hixie>	hsivonen: but there's not much point before UAs have any intention of implementing it
10:45	<Hixie>	zcorpan: yeah i guess you can't include all three on one element without violating another rule before worrying about the values anyway
10:45	<hsivonen>	Hixie: how about getting to use align conformingly in the meantime?
10:45	<Hixie>	what's wrong with the css equivalent?
10:46	<Hixie>	td + td { text-align: right; }
10:47	<Hixie>	anyway bed time now
10:55	<hsivonen>	Hixie: that violates the separation of content and style. I think the rules in the style sheet should not have to depend on what number of columns my tables have and which columns have which alignment
12:47	<takkaria>	hsivonen: interesting the results you got with your performance changes
12:48	<takkaria>	hsivonen: because in hubbub, I was briefly toying with having a buffer only when required and switching to it if NULs/CRs/entities were encountered
12:49	<takkaria>	hsivonen: turns out it added a lot of extra macros, removed clarity, and it ended up easier and about as fast to do unconditional buffer writes
12:50	<hsivonen>	takkaria: I guess it's comforting that C is like that, too, and this isn't just a Java thing ;-)
12:51	<takkaria>	I wish I knew x86 well enough to speculate on why it's the case
12:52	<hsivonen>	today, I measured the perf cost of the HTML5 tree builder compared to an XML-ish tree builder
12:52	<hsivonen>	I'll blog about that later
12:53	<hsivonen>	executive summary: the Validator.nu HTML Parser's tokenizer is almost as fast as Xerces without the HTML tree builder complexity
12:55	<takkaria>	that's pretty speedy
12:56	<takkaria>	Hubbub's tokeniser tokenises HTML5 in about 0.5s
12:56	<takkaria>	which is something like 5MB/s
12:58	<takkaria>	however, libxml tokenises and produces a tree in 0.8s. so there's quite a lot of scope for improvement, or so I hope
13:03	<hsivonen>	takkaria: do you count IO when benchmarking the tokenizer?
13:04	<takkaria>	I have some perf tests around which basically mmap() a file into memory and then pass the memory thus mapped to libxml2/hubbub
13:05	<takkaria>	I guess I could read an entire file properly rather than letting the OS do it, but I don't think it would make that much difference
13:06	<takkaria>	at some point it would be nice to do proper somewhat-scientific benchmarks, though right now getting hubbub fully functional takes more priority
13:11	<hsivonen>	with Java, IO details make a huge difference
13:12	<hsivonen>	which is why I run benchmarks from RAM when I don't want to benchmark IO
13:14	<hsivonen>	(I already tweaked IO earlier)
13:14	<hsivonen>	(It might not be a bad idea to rerun tests with IO included)
13:27	<takkaria>	my benchmarking is really unscientific, and consists of writing test apps and running them through `time` on the command-line
13:28	<takkaria>	otoh, mmap() after a couple of runs results in fairly unchanging data, so it's good enough for me
13:32	<hsivonen>	I do timing inside Java and warm the VM up first
13:32	<takkaria>	yeah, you have the advantage there :)
13:36	<hsivonen>	performance tuning on HotSpot is pretty annoying
13:37	<hsivonen>	because doing the wrong thing can lead to HotSpot not JITting stuff
13:37	<hsivonen>	which means about a tenth of the performance
13:54	<hsivonen>	I wonder how much I could speed up the tokenizer if I removed all the run-time configurability
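
The benchmarking approach described in this exchange (input held in RAM so IO is not timed, a warm-up phase before measuring, timing inside the process rather than with `time`) can be sketched roughly as follows. This is an illustrative Python sketch, not code from Hubbub or the Validator.nu HTML Parser; `tokenize`, `benchmark`, and `throughput_mb_per_s` are hypothetical names, and the workload is a trivial stand-in for a real tokenizer.

```python
import time


def tokenize(data: bytes) -> int:
    """Hypothetical stand-in workload: count '<' bytes as a crude proxy for tags."""
    return data.count(b"<")


def benchmark(func, data: bytes, warmup: int = 3, runs: int = 5) -> float:
    """Return the best wall-clock time in seconds over `runs` timed calls."""
    for _ in range(warmup):
        func(data)  # warm-up calls: results and timings are discarded
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        func(data)
        best = min(best, time.perf_counter() - start)
    return best


def throughput_mb_per_s(num_bytes: int, seconds: float) -> float:
    """Convert a byte count and a duration into MB/s (1 MB = 1,000,000 bytes)."""
    return num_bytes / seconds / 1_000_000


# The input lives in memory, so file IO is not part of the measurement.
data = b"<p>hello</p>" * 100_000
elapsed = benchmark(tokenize, data)
print(f"{len(data)} bytes, best run {elapsed:.6f}s")
```

As a sanity check on the figures quoted earlier, 0.5s at roughly 5MB/s would correspond to an input of about 2.5MB, which is what `throughput_mb_per_s(2_500_000, 0.5)` works out to.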
14:37	<Lachy>	JohnResig, yt?
14:39	<Lachy>	JohnResig, your selectors api testsuite is trying to load "data/iframe.html" in an iframe, but it's returning 404. http://ejohn.org/apps/selectortest/data/iframe.html
14:42	<JohnResig>	Lachy: correct - the URL wasn't important to the test itself
14:47	<Lachy>	JohnResig, ok. So can the iframe be removed, or the src attribute removed?
14:48	<JohnResig>	Lachy: the src is probably safe to remove
14:48	<Lachy>	ok. It still works when I remove it.
14:49	<Lachy>	the test suite is actually working in gogi now, so it appears we fixed whatever bug was causing it to abort the test earlier
14:49	<JohnResig>	Lachy: cool
14:49	<Lachy>	except, there's some weird bug with the stylesheet
14:50	<Lachy>	If I leave this style in the page: .unitTest, .test { height: 10px; } then I don't get any scrollbar
14:50	<JohnResig>	weird
14:50	<Lachy>	it doesn't appear to be important, so I just removed it so I could see all the results
14:51	<Lachy>	oh, it's because body has a class of unitTest, so it's getting set to a height of 10px.
14:52	<Lachy>	oh, no, that's not it.
14:59	<Lachy>	JohnResig, was the test suite updated after support for namespaces was removed from the spec?
15:00	<JohnResig>	Lachy: I'm updating it now
15:00	<Lachy>	ok, thanks.
15:00	<JohnResig>	Lachy: we had a copy that we were using, I'm pushing itlive
15:00	<JohnResig>	*it live
15:00	<zcorpan>	interesting, http://simon.html5.org/test/html/semantics/video/events.htm doesn't finish loading in firefox (the parser stops after the first script block)
15:01	<Lachy>	alright. Let me know when it's up
15:07	<jgraham>	zcorpan: It seems to load OK for me, but then it crashes the browser
15:09	<jgraham>	when you click play
15:19	<Lachy>	JohnResig, can you add a feature that allows me to show only failed test results? If you add class="pass" and class="fail" to each <li>, then add a checkbox or something that applies the style: .pass {display: none;}, or similar, that would work
15:26	<JohnResig>	Lachy: done: http://ejohn.org/apps/selectortest/#target
15:27	<Lachy>	thanks
15:28	<zcorpan>	jgraham: i updated my firefox and now it loads and plays but it seems to play the video too fast
15:30	<jgraham>	zcorpan: Which platform?
15:30	jgraham	is using today's Linux build
15:31	<zcorpan>	jgraham: windows
15:31	<Lachy>	JohnResig, that scrolling bug in gogi no longer occurs, and it looks like you changed the offending css. Did that turn out to be a bug in your CSS, or is it still something wrong with gogi that I should get fixed?
15:31	<zcorpan>	jgraham: but i got the same results on ubuntu yesterday
15:31	<zcorpan>	(played too fast)
15:32	jgraham	goes to file a bug
15:32	<JohnResig>	Lachy: probably something wrong in Gogi, but it wasn't necessary in my tests
15:34	<Lachy>	ok. I still have a copy of the old file, so I'll investigate it later
15:38	<Lachy>	cool, looks like we're failing the following: stringifying null/undefined, various attribute selectors, :enabled, :disabled, :checked, and all DocumentFragment tests
15:41	<Lachy>	JohnResig, can you add <label> around the Show Failing Tests checkbox
15:43	<JohnResig>	Lachy: done
17:03	<virtuelv>	question (barely related to anything, really)
17:03	<virtuelv>	I have an element with top:0 and height:100%
17:04	<virtuelv>	fixed position
17:04	<virtuelv>	now, assume that I use full-screen zoom
17:05	<virtuelv>	If I now zoom, should offsetHeight of the element change?
17:10	<virtuelv>	nevermind, what should happen to window.innerHeight?
17:39	hsivonen	finds http://apache.org/xml/features/standard-uri-conformant Default: false
17:54	<hsivonen>	whoa! Namespace processing in Xerces is relatively more expensive than HTML5 tree building in the Validator.nu HTML Parser
20:19	<jgraham>	Wow I like the comment "anything [complex enough to require workers] is [...] valuable enough to be commercial software - and therefore requiring protection against illicit copying"
20:20	<jgraham>	Because no one ever wrote any open source software that needs threads...
20:23	<Lachy>	jgraham, who said that?
20:23	<jgraham>	Lachy: Shannon on the whatwg list. You may have stopped reading the thread...
20:23	<Lachy>	yeah, I haven't read much on whatwg lately
20:25	<csarven>	What do you guys think: 10 GETs vs. 1 GET + 50k
20:25	<Lachy>	?
20:25	<jgraham>	I would reply and mention the existence of Erlang and use cases for long running background tasks for things like web-based image editors but I don't think it would help
20:25	<jgraham>	50kb?
20:26	jgraham	suspects that 1 get + 50kb would be faster for most people
20:26	<jgraham>	If that was the question
20:26	<jgraham>	But I have no reason for thinking that
20:33	<virtuelv>	csarven: ? use case?
20:34	<virtuelv>	but in general, unless the 1 request locks up the browser/page UI, I'd propose going for it
20:34	<virtuelv>	10 requests add at the very least 3kb extra of data to download
20:35	<csarven>	jgraham virtuelv I'm applying data: URI scheme to images in CSS.
20:35	<roc>	every time someone mentions Erlang I reach for my gun
20:36	<jgraham>	roc: ?
20:36	jgraham	is quite scared
20:37	<roc>	too many people seem to believe that Erlang invented message passing
20:37	<virtuelv>	csarven: I'm probably slow, given that my body is in a different timezone than the one I'm actually in, without having travelled across timezones
20:37	<virtuelv>	csarven: but you're including images using data:
20:38	<virtuelv>	if you have 50 K of image data in data: URIs, I'd say you're doing it wrong
20:38	<jgraham>	roc: I don't know who invented message passing but Erlang is probably the most famous example of a system that uses it heavily, right?
20:38	<roc>	wrong
20:38	<jgraham>	So what is right?
20:39	<roc>	well, thanks to the Erlang fanboys you may in fact be right at this point, but that's self-fulfilling ignorance
20:39	<jgraham>	roc: A lot of fame seems to be based around that concept...
20:40	<jgraham>	Anyway, I am curious about the other, better, examples
20:40	<virtuelv>	csarven: if your stylesheet is big, as 50 kB is, you are at some point going to end up with FOUC on a slow connection
20:41	<virtuelv>	csarven: I'd rather see if the images could be combined, using some variant of CSS sprites, to keep the number of requests down
20:41	<roc>	Mach was extremely famous and uses message passing
20:42	<csarven>	virtuelv I'm already using CSS sprites
20:43	<csarven>	It appears to be that the resulting total weight of the page is 50k more when I use data:
20:43	<jgraham>	roc: That's a microkernel, right? I guess it wouldn't be the right example to use if you were trying to convince a skeptic that message passing is a good idea on the web (but I don't disagree that it's pretty famous)
20:43	<roc>	but well before that, there were well-known programming models like actors and CSP that are focused on shared-nothing message passing
20:44	<virtuelv>	csarven: then you have a lot of image data, ~250kb, or so?
20:45	<virtuelv>	a possible optimization at that point is to store the images at a different domain/subdomain
20:45	<csarven>	Actually not that much no.
20:46	<csarven>	I must be incorrectly calculating this.
20:51	<virtuelv>	hm, or I might be off here, the overhead is 3/2
20:51	<virtuelv>	or so
20:57	<csarven>	Images weigh-in about: 45k
20:58	<csarven>	The ones that get to be used for data:
20:59	<csarven>	With data:uri ~200k total page weight, without data:uri ~150k
20:59	<csarven>	Firebug results.
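
For reference, the size cost of base64 encoding in a data: URI can be worked out directly: every 3 input bytes become 4 output characters, so the encoded form is about 4/3 the original size. The sketch below is a minimal illustration in Python, assuming the images are base64-encoded and using the 45k figure quoted above; `base64_length` and `data_uri` are hypothetical helper names.

```python
import base64


def base64_length(num_bytes: int) -> int:
    """Length in characters of padded base64 output for num_bytes of input."""
    return 4 * ((num_bytes + 2) // 3)


def data_uri(payload: bytes, mime: str = "image/png") -> str:
    """Build a base64 data: URI for the given payload (MIME type is an assumption)."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime};base64,{encoded}"


image_bytes = 45_000  # image weight quoted in the log
encoded_chars = base64_length(image_bytes)
print(encoded_chars, encoded_chars - image_bytes)
```

Under these assumptions, 45,000 bytes of image data encode to 60,000 characters, an increase of about 15k before any compression. That alone would not account for the roughly 50k difference reported above; the log does not say where the remainder comes from.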
21:09	<zcorpan>	hmm.. it would be nice if it wasn't possible to set website and signature until you have at least 1 post on the forums
21:12	<zcorpan>	Hixie: you think you could comment out the "Website" and "Signature" rows in http://forums.whatwg.org/profile.php?mode=register&agreed=true ?
21:46	<takkaria>	hsivonen: you may find http://takkaria.org/dmoz/ useful
21:50	<Hixie>	zcorpan: done
21:51	<Hixie>	hsivonen: separation of content and style doesn't imply that the style is in a vacuum, the style always depends on the structure of the content.