#whatwg on 2008-06-02

07:41	<Hixie>	alt="<%plugin_first_title>" and things like that are sadly common
07:42	<Hixie>	also things like alt="<b>Sabit</b>"
07:44	<Hixie>	this page contains an alt="" whose value itself contains two <img> elements, both with alt=...
07:44	<Hixie>	http://www.modifiyem.com/forum/f51/lastik-basinclari-41393/
08:02	<gsnedders>	Someone please tell me my computing exam will be all right because they'll be technically right, so me not having learnt the wrong answers doesn't matter.
08:03	<zcorpan_>	"An ASCII-compatible character encoding is one that is a superset of US-ASCII (specifically, ANSI_X3.4-1968) for bytes in the range 0x09 - 0x0D, 0x20, 0x21, 0x22, 0x26, 0x27, 0x2C - 0x3F, 0x41 - 0x5A, and 0x61 - 0x7A."
08:04	<zcorpan_>	Hixie: shouldn't that be s/a superset/identical/ ?
08:04	<zcorpan_>	and s/of/to/
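The quoted definition can be tried out with a short check. This is a minimal sketch that follows zcorpan_'s "identical to" reading: each listed byte, decoded on its own, must yield the same character as in US-ASCII. The function names `ascii_compatible_bytes` and `is_ascii_compatible` are made up for illustration and are not from the spec.

```python
def ascii_compatible_bytes():
    """Return the byte values listed in the quoted definition."""
    values = set(range(0x09, 0x0D + 1))
    values |= {0x20, 0x21, 0x22, 0x26, 0x27}
    values |= set(range(0x2C, 0x3F + 1))
    values |= set(range(0x41, 0x5A + 1))
    values |= set(range(0x61, 0x7A + 1))
    return sorted(values)


def is_ascii_compatible(encoding):
    """True if every listed byte decodes to the same character as in US-ASCII."""
    for value in ascii_compatible_bytes():
        raw = bytes([value])
        try:
            if raw.decode(encoding) != raw.decode("ascii"):
                return False
        except UnicodeDecodeError:
            # A lone byte is not a complete character in this encoding.
            return False
    return True
```

Under this reading an EBCDIC code page such as `cp037` fails the check, while `utf-8` and `windows-1252` pass.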
09:19	<zcorpan>	are there <video> tests anywhere? i've found http://hsivonen.iki.fi/test/moz/video-selection/
09:23	<Lachy>	zcorpan, I have http://lachy.id.au/dev/markup/tests/html5/video/
09:24	<zcorpan>	doesn't webkit have any tests?
09:25	<zcorpan>	othermaciej: ^
09:27	<Philip`>	zcorpan: http://trac.webkit.org/browser/trunk/LayoutTests/media
09:27	<zcorpan>	Philip`: cheers
10:04	<annevk>	FWIW, Opera is removing UTF-7 support from the Web side of the product. It will continue to work in e-mail, of course.
10:04	<annevk>	(We are also in the process of removing UTF-32 support altogether.)
10:09	<hsivonen>	Hixie: since even IE and Safari don't expose EBCDIC encodings in the UI, EBCDIC could only be used by declaring it on the HTTP level, so discovering usage in content should be possible from a Web crawl
10:12	<Philip`>	hsivonen: You can declare it in <meta charset> too
10:13	<hsivonen>	Philip`: how does that work?
10:13	<Philip`>	hsivonen: Like http://philip.html5.org/demos/charset/ebcdic/meta.html
10:14	<Philip`>	IE seems to detect the ASCII <meta charset> then reparses the document
10:14	<zcorpan>	hmm, if innerHTML in xml needs to take into account doms that are parsed from text/html, the cases that throw are not complete. e.g. <foo 123>
10:14	<hsivonen>	Philip`: seems more like a bug in IE and Safari than a working feature
10:15	zcorpan	should get started with dom5core soon
10:15	<Philip`>	hsivonen: Bugs are working features, and people often rely on them :-)
10:15	<hsivonen>	Philip`: I wonder if there's an exploitable security hole here
10:16	<Philip`>	hsivonen: There probably is if e.g. you have a blog that uses EBCDIC, and you allow comments and escape all '<' characters to make them safe
10:17	<Philip`>	because someone can write something that gets encoded to <meta charset=us-ascii><script>...</script>
10:17	<Philip`>	but you'd have to be incredibly dumb to run your blog in EBCDIC, so that doesn't seem like a real problem
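The scenario Philip` describes can be sketched as follows. This is an illustration under assumptions, not a tested exploit: it takes Python's `cp037` codec as the EBCDIC flavour, and `escape_lt` is a hypothetical stand-in for the blog's comment filter. The point is that the submitted text contains no '<' character, so the filter leaves it alone, yet its stored bytes read as markup to a parser that treats them as ASCII.

```python
# The markup as an ASCII-sniffing parser would see it.
payload = b"<meta charset=us-ascii><script>alert(1)</script>"

# The characters a commenter would submit: whatever those bytes mean in EBCDIC.
comment_text = payload.decode("cp037")


def escape_lt(text):
    """Hypothetical comment filter that escapes every '<' character."""
    return text.replace("<", "&lt;")


# The blog escapes the comment and stores the page as EBCDIC bytes.
stored = escape_lt(comment_text).encode("cp037")
```

Here `stored` comes out byte-for-byte equal to `payload`, because the filter never saw a '<' to escape.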
10:18	<hsivonen>	it seems dumb to publish any Web content in EBCDIC as it doesn't work in Gecko and Opera
10:18	<annevk>	maybe it's real enough for them to stop supporting ebcdic
10:18	<Philip`>	Unfortunately people are dumb
10:20	<Philip`>	(I suppose it'll be worse if browsers ever autodetect EBCDIC)
10:23	<annevk>	it would sure be nice though if we can limit the Web to a finite set of character encodings
10:25	<Philip`>	(http://philip.html5.org/demos/charset/ebcdic/meta-autodetect.html - ah, good, looks like they don't autodetect)
10:29	<annevk>	(from internal IRC: http://kitenet.net/~joey/blog/entry/thread_patterns/ )
12:45	gsnedders	should probably bring the computing exam paper up here so we can laugh at it
12:48	<Philip`>	hsivonen: Validator.nu says "Warning: Using x-ibm-1252_p100-2000 instead of the declared encoding iso-8859-1." which seems unnecessarily more confusing than "windows-1252"
12:50	zcorpan	found a bug in webkit.. https://bugs.webkit.org/show_bug.cgi?id=19355
12:51	<hsivonen>	Philip`: Yeah, I just noticed myself. deploying a supposed fix now. will take a while as the validator rebuilds itself
12:51	<hsivonen>	Philip`: thanks
12:53	<hsivonen>	the wonders of HashMaps that harmless-looking key changes make precedence in duplicate cases change
12:55	<hsivonen>	aaargh. now I don't know what caused the encoding weirdness or why my fix isn't working
12:57	<hsivonen>	I also managed to break my instant rollback ability
12:58	<Philip`>	I suppose that means you can't instantly rollback to a version in which instant rollback worked, which sounds like a pain
13:03	<hsivonen>	Philip`: fixed
13:04	<Philip`>	hsivonen: Thanks!
13:28	<annevk>	the TAG concedes on ARIA but not distributed extensibility
13:29	<annevk>	I disagree with "short-term" and all but I guess it's better to leave it alone
13:29	<annevk>	see also http://lists.w3.org/Archives/Public/public-html/2008Jun/0044.html if you haven't been following along
13:30	<hsivonen>	I'd be interested in seeing scenarios where Distributed Extensibility would be used and the W3C wouldn't accuse the extender of bad unilateral action
13:31	<Dashiva>	hsivonen: Who said they wanted it to be used? :P
13:33	<annevk>	hsivonen, it seems to me they'd be fine with people using it for their own internal purposes. Or their own vocabulary they want to include in HTML to mark up details of various video game consoles or something like that...
13:34	<hsivonen>	annevk: why does the World-Wide Web Consortium put effort into catering for such private off-the-Web use cases?
13:36	<annevk>	hsivonen, money? I've no idea why certain Working Groups exist
13:37	<hsivonen>	annevk: this reminds me of http://dbaron.org/log/2006-08#e20060818a
13:52	<annevk>	hmm, in other news, today is the deadline of XHR1 Last Call comments
13:54	<Lachy>	last call deadline for comments doesn't really mean all that much. It's not as if you're going to ignore comments that come afterwards.
13:56	<annevk>	it means that if we addressed all comments and didn't change too much we can move to CR
13:56	<annevk>	which would be nice
13:57	<gsnedders>	I guess that means I should really look over it closely
13:57	<Dashiva>	And there's lots of recent precedent that you don't need to address comments at all, right ;)
13:58	<gsnedders>	Hmm. It's still odd.
14:00	<Lachy>	Dashiva, you're getting confused by the way the old HTMLWG used to work.
14:00	<Lachy>	good editors don't ignore comments, they just find clever ways to reject them ;-0
14:00	<Lachy>	;-)
14:01	<annevk>	not just the HTML WG, SVG WG too, just see the link from hsivonen
14:11	<Dashiva>	I seem to recall some noise about the css wg too, back in the day
14:15	<Dashiva>	"O! For the love of! We're going to be constrained by the broken DOM APIs?"
14:15	<Lachy>	Dashiva, where is that quote from?
14:17	<Dashiva>	TAG minutes
14:17	<Dashiva>	"I cannot even stomach even abstaining on something that makes - the namespace separator"
14:18	<hsivonen>	when you are on the server side, you get to write your own XML tree API that sucks less than the DOM.
14:18	<hsivonen>	I did
14:19	<Dashiva>	"Why have aria-, why not just pick names that don't clash?"
14:19	Dashiva	boggles
14:19	<hsivonen>	Dashiva: the answer is that HTML5 and ARIA developments met too late
14:20	<hsivonen>	Dashiva: if the ARIA folks had indicated integration interest to the WHATWG a couple of years ago, it might have been different
14:20	<Dashiva>	hsivonen: The answer is also "aria-x _is_ a name that doesn't clash"
14:20	<hsivonen>	Dashiva: that too
14:21	<Dashiva>	It goes back to the previous comment, they seem to think aria- is a namespace
14:21	<annevk>	I can't really see how ARIA would've been designed differently than what we ended up with now (other than aria-role as opposed to role)
14:21	<Dashiva>	"There may be a small chance we can get them to do something reasonable"
14:22	<Dashiva>	"I want to try and maintain what credibility we can" :)
14:22	<annevk>	Well, maybe a different prefix name and such, but that's details...
14:26	<Dashiva>	a11y-role
14:26	<zcorpan>	annevk: aria could have failed and interested parties could have pushed for html5 features instead
14:27	<annevk>	zcorpan, that wouldn't have affected the design so much (apart from it not being adopted)
14:28	<zcorpan>	annevk: it could have affected html5
14:31	<annevk>	fair enough, oh well, enough "what if" talk
14:31	annevk	goes to buy a printer and such
14:32	<annevk>	i should really get my expense reports done today
14:32	<annevk>	otherwise accounting will hunt me down
14:45	<hsivonen>	annevk: did you happen to take a closer look at the XRI stuff? what is it about?
15:06	<hsivonen>	http://meyerweb.com/eric/thoughts/2008/06/02/the-missing-link/
15:06	<annevk>	hsivonen, I don't know anything apart from that it's an URI scheme and the people who give out parts of the scheme space monotize on that somehow
15:06	<annevk>	(is it really monotize? google doesn't give clues)
15:07	<Philip`>	(Monetize)
15:07	<hsivonen>	you mean someone other than TLD registrars want to monetize URI space?
15:07	<takkaria>	(monetise)
15:07	<hsivonen>	that doesn't sound good
15:08	<annevk>	hsivonen, http://en.wikipedia.org/wiki/Extensible_Resource_Identifier
15:09	<hsivonen>	how does one make a browser dereference an XRI?
15:11	<hsivonen>	seems to be that OASIS is practising distributed extensibility of the URI system by minting a new scheme :-)
15:11	<hsivonen>	s/be/me/
15:11	<annevk>	HTML vs XML / URIs vs XRIs
15:13	<annevk>	hsivonen, http://www.pacificspirit.com/blog/2008/05/30/detailed_technical_reasons_why_im_against_xris has rationale on why TAG members are against this
15:13	<hsivonen>	on the face of it, this XRI stuff seems to break key goodness of OpenID 1.0
15:16	<annevk>	hsivonen, they're using this in OpenID 2.0 I believe
15:16	<annevk>	hsivonen, OpenID 2.0 is not backwards compatible :(
15:16	<annevk>	(there's also various ways to point to OpenID 2.0, including using this new stuff)
15:17	<hsivonen>	that seems like a bad idea when they should be making OpenID 1.0 look stable and something that people can adopt
15:17	<Dashiva>	Is OpenID 2.0 still OpenPhishing as well?
15:18	<hsivonen>	Dashiva: do you mean 1.0 is OpenPhishing?
15:18	<Dashiva>	Yes
15:20	<annevk>	i've no idea why it's so complex
15:20	<annevk>	i'm not sure i want to take the time to investigate
15:22	<Dashiva>	The only non-monetizing aspect of XRI I've seen is the reassignable-name/permanent-number connection, and I don't know enough to say if that's a valid point
15:57	<othermaciej>	zcorpan: video tests? yes
16:07	<annevk>	grmbl
16:07	<annevk>	driver support strikes again
16:08	annevk	finds http://www.stchman.com/foo2zjs.html
16:17	<zcorpan>	othermaciej: philip had a pointer
16:26	<annevk>	Hmm, getting this printer to actually work (rather than being recognized and not printing anything when asked) required running some obscure script on a third party site and rebooting my computer...
18:00	<annevk>	hsivonen, so reading http://meyerweb.com/eric/thoughts/2008/06/02/the-missing-link/ it seems to fail bringing up any other use case than <tr>...
18:01	<annevk>	I'm not quite sure whether the linking stuff is a presentational concern or a semantic one.
18:10	<takkaria>	do browser implementers really want a global href not to happen?
18:11	<annevk>	I think the "extremely" bit is an overstatement, but as can be seen with <object>, overloading is not a good idea
18:13	<annevk>	Also, people keep forgetting it's not just href; hreflang, ping, type, etc. would also be affected
18:59	<Philip`>	<base style="display:block; border:2px magenta solid" href="foo" target="_blank">
19:03	<itpastorn>	If one uses onclick to simulate a link on today's <tr>, would a screen reader know it and can it be made usable for a blind user?
19:07	<itpastorn>	My question was not rhetorical
19:07	<annevk>	Screen readers should be able to deal with that...
19:08	<itpastorn>	With or without aria?
19:08	<annevk>	Opera Mini even deals with most such situations as well and it doesn't even have access to a DOM / JS execution context
19:08	<annevk>	without
19:09	<itpastorn>	ok
19:23	<annevk>	(FWIW, I don't actually know if screen readers deal with it, I'm just saying it's feasible.)
21:21	<gsnedders>	Now, can I have a drum roll?
21:22	<gsnedders>	spec-gen is executing for the first ever time!
21:23	<gsnedders>	It seems to be taking a year, though
21:44	<gsnedders>	http://hg.gsnedders.com/spec-gen/file/9f0ed82d3f20/src/specGen/utils.py#l66 — I need something cheaper than that, badly.
21:47	<Dashiva>	A cheaper way to gEBI?
21:48	<Philip`>	gsnedders: Perform a single initial pass over the document, finding all elements with id and putting them in a dict?
21:48	<gsnedders>	Dashiva: yeah
21:48	<gsnedders>	Philip`: That would mean also updating it at times, when I add it to more elements
21:48	<Philip`>	The cheapest way to implement a method is to never call it
21:49	<Dashiva>	gsnedders: Do you do forward gEBIs, e.g. for elements you haven't seen yet?
21:49	<gsnedders>	Philip`: :)
21:50	<Dashiva>	gsnedders: You should definitely, definitely cache the ids you generate yourself
21:50	<gsnedders>	Dashiva: Well, the entire document is stored in memory at once, so I do look at @id anywhere in it
21:50	<Philip`>	gsnedders: Do you need to update it continuously, or could you do a series of stages like find_and_cache_all_ids(); do_some_processing_that_might_change_elements(); find_and_cache_all_ids(); do_more_processing() ?
21:50	<Dashiva>	gsnedders: So cache the ones you produce, as well as any you look up, that way you'll only lookup each element at most once
21:50	gsnedders	wonders when it will stop running on HTML 5
21:51	<Philip`>	(I'm assuming you're calling gEBI many more times than once per element-with-an-id, since otherwise it shouldn't be a bottleneck at all)
21:51	<Dashiva>	gsnedders: You should do some stats on how many gEBIs fail
21:51	<Dashiva>	I'd expect close to zero, otherwise there's trouble at work :)
21:51	<gsnedders>	Philip`: The only call is in the method above
21:52	<Dashiva>	gsnedders: And it's a loop to boot
21:52	<gsnedders>	Dashiva: Yeah, but it should rarely run more than once
21:52	<Dashiva>	gsnedders: Well, do some profiling. Check how many gEBIs fail, and how many times the loop runs more than once (and how many for those)
21:53	<Philip`>	I assume the XPath has to traverse the entire tree looking for an element with that id, so it's going to be pretty expensive each time you call it, so just memoising the function might not help much
21:54	gsnedders	wonders whether using an .iter would be cheaper
21:54	<Dashiva>	I'm actually a bit surprised htmllib doesn't keep an id lookup
21:54	<gsnedders>	Dashiva: The loop never actually runs at all
21:55	<gsnedders>	or at least I don't think so…
21:55	<Dashiva>	gsnedders: In that case, your bottleneck is the xpath.
21:55	<Dashiva>	I'd try what Philip` suggested. Run the whole tree once, collect a dict of all ids. Every time you make an id, add to dict. Never use xpath.
21:55	<Philip`>	If the loop never executes, you could prefix the lines with "#" which makes Python apply super optimisations that make code really really fast and don't change the behaviour if it wouldn't have had any effect anyway
21:56	<gsnedders>	Philip`: :D
21:56	<Philip`>	You could (should?) still use XPath once, to find all the elements with ids in the initial traversal bit
21:57	<Philip`>	but doing it for every id lookup doesn't seem like a good idea
21:57	<gsnedders>	gEBI costs 0.016s per call
21:57	<Dashiva>	Oh yeah, he could use xpath for the initial collection, yeah
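The approach suggested by Philip` and Dashiva (one pass to collect ids, then dict lookups only) might look roughly like this. It is a minimal sketch using the standard library's `xml.etree.ElementTree` rather than the lxml tree spec-gen uses, and the `IdIndex` class and its method names are hypothetical.

```python
import xml.etree.ElementTree as ET


class IdIndex:
    """Maps id values to elements so lookups avoid a full tree traversal."""

    def __init__(self, root):
        self._by_id = {}
        # Single initial pass over the whole document.
        for element in root.iter():
            id_value = element.get("id")
            if id_value is not None:
                # Keep the first element for a duplicated id, as gEBI would.
                self._by_id.setdefault(id_value, element)

    def get(self, id_value):
        """Dictionary lookup instead of an XPath query per call."""
        return self._by_id.get(id_value)

    def set_id(self, element, id_value):
        """Assign a newly generated id and record it in the index."""
        element.set("id", id_value)
        self._by_id[id_value] = element


root = ET.fromstring('<body><dfn id="foo">foo</dfn><dfn>bar</dfn></body>')
index = IdIndex(root)
index.set_id(root[1], "bar")
```

Routing every id assignment through `set_id` is what keeps the index current, which addresses gsnedders' concern about having to update it.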
21:58	<gsnedders>	And on (my old copy of) WF2 it is called 80 times
21:58	<Philip`>	Forty million clock cycles? That's not good :-p
21:58	<gsnedders>	HTML 5 has currently been taking around 10 minutes to process
21:59	<gsnedders>	I don't think this is the sort of speed Hixie wanted me to get :)
21:59	<Dashiva>	First you make it work, then you make it fast. One at a time :)
22:00	<gsnedders>	(The textContent function above is called far more times (almost 2000) yet costs almost nothing)
22:00	<gsnedders>	(0.000s per call)
22:00	<Philip`>	gsnedders: How long is gEBI taking in total on HTML5?
22:00	<gsnedders>	(I'm assuming that's rounded down)
22:00	<gsnedders>	Philip`: Dunno. It hasn't finished running yet.
22:00	<Philip`>	gsnedders: Oh. Then why did you say you needed something faster than it, before you knew it was slow?
22:00	<gsnedders>	Philip`: Web Forms 2.0 made it obvious it was slow.
22:01	<Dashiva>	I think he figured after 5 minutes of waiting :)
22:01	<gsnedders>	Philip`: Don't need anything of HTML 5's size to prove that :)
22:01	<gsnedders>	Or, what Dashiva said, then I got wondering why, and ran it on WF2 :P
22:01	<Philip`>	I'd expect it to take O(size^2) time, and HTML5 is only five times larger than WF2, so it should only take ~30 seconds
22:02	<Philip`>	which is still too slow and worth optimising, but isn't five minutes :-)
22:02	<gsnedders>	I think we can conclude it isn't that good :)
22:02	<gsnedders>	Oh, we're over ten minutes now :)
22:02	gsnedders	throws KeyboardInterrupt
22:02	<Philip`>	You ought to make it print its incremental progress, in case it's got stuck in an infinite loop
22:02	<gsnedders>	Guess what function it was in when I threw it!
22:02	<gsnedders>	gEBI!
22:02	<Philip`>	sleep?
22:03	<gsnedders>	No, gEBI :P
22:04	Philip`	doesn't understand how it could be that much slower on HTML5 than on WF2
22:06	<Dashiva>	O(n^2) probably has something to do with it
22:07	<Philip`>	Dashiva: But n=5 and 80*0.016s = 1.3s so it shouldn't be nearly that bad, unless I'm horribly mistaken somewhere
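Philip`'s back-of-the-envelope estimate can be written out. The figures (80 calls, 0.016s per call, a document five times larger) come from the log; the quadratic scaling is his stated assumption, on the reasoning that both the number of lookups and the cost of each tree traversal grow with document size.

```python
calls_wf2 = 80            # gEBI calls on WF2
seconds_per_call = 0.016  # measured cost per call on WF2
size_ratio = 5            # HTML5 is roughly five times larger than WF2

# Total gEBI time on WF2.
wf2_total = calls_wf2 * seconds_per_call

# Assuming O(size^2): five times the calls, each five times as slow.
html5_estimate = wf2_total * size_ratio ** 2

print(f"WF2: {wf2_total:.2f}s, HTML5 estimate: {html5_estimate:.0f}s")
```

That gives roughly 1.3s for WF2 and about half a minute for HTML5, which is why a run of more than ten minutes pointed to something other than plain quadratic growth.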
22:07	gsnedders	tries html5.src with his really naïve cache
22:08	<Philip`>	gsnedders: On line 48, did you intend to remove all space characters from anywhere in source? (.strip only strips leading/trailing characters, so that wouldn't quite work)
22:08	<Philip`>	Oh, maybe I'm missing the bit a few lines later
22:08	<gsnedders>	Philip`: :)
22:08	<gsnedders>	naïve caching gets HTML 5 down to 17.859s
22:09	<Philip`>	By the way, your loop should probably increment i
22:09	<gsnedders>	LOL
22:09	<Philip`>	else it'll loop forever
22:09	<gsnedders>	True.
22:09	<Dashiva>	oh snap!
22:09	<Philip`>	and if your cache fixes that, then your cache is broken :-p
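For reference, a clash-resolving loop of the kind being discussed could be sketched like this. The real spec-gen code is not shown in the log, so the `unique_id` name and the "-1", "-2" suffix scheme are assumptions made for illustration.

```python
def unique_id(base, used_ids):
    """Return base, or base with a numeric suffix, so it is not already used."""
    candidate = base
    i = 0
    while candidate in used_ids:
        i += 1  # without this increment a clash would loop forever
        candidate = f"{base}-{i}"
    used_ids.add(candidate)
    return candidate


used = {"foo"}
first = unique_id("foo", used)
second = unique_id("foo", used)
```

Because `used_ids` is updated on every call, this also avoids the stale-cache problem of taking the initial state and never updating it.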
22:09	<gsnedders>	That'll be the ten minutes of running, I bet :P
22:09	<Dashiva>	No wonder it was in gEBI
22:09	<gsnedders>	Philip`: Yeah, my cache is broken
22:09	<gsnedders>	Philip`: It just takes the initial state and never updates it
22:10	<Philip`>	(You quite possibly want the cache anyway, even without the infinite loop)
22:11	gsnedders	tries running HTML 5 without the cache but with a working loop
22:12	Philip`	predicts 25 seconds
22:12	<gsnedders>	Hint: It's already more than that
22:12	<Philip`>	26?
22:12	gsnedders	wonders if there's another loop
22:13	gsnedders	adds print i
22:13	Dashiva	watches scrollback disappear into infinity
22:14	<gsnedders>	That's not even getting called
22:14	<Dashiva>	Instead of printing i, you could add a "if i == 3 break". That way you won't get the previous problem :)
22:14	<gsnedders>	we get as far as Document.xpath(), then it takes an hour there
22:15	gsnedders	realises
22:15	<gsnedders>	actually running the copy you just edited helps
22:15	<Philip`>	Is your document cyclic?
22:15	<Dashiva>	Who needs bugs with coding practices like these? :P
22:16	<gsnedders>	Dashiva: Me! :P
22:16	<Philip`>	Clearly you need to make this program multithreaded to get optimum performance on modern processors
22:16	<gsnedders>	Dashiva: OK, my scrollback is now vanishing
22:16	<gsnedders>	Dashiva: Albeit slowly
22:17	<gsnedders>	The most times the loop body is getting executed is twice
22:17	<gsnedders>	1440 dfn elements in my copy of HTML 5.
22:17	<gsnedders>	No wonder it's bad :)
22:17	<Dashiva>	Running time now?
22:18	<gsnedders>	Dunno, didn't look when I started it
22:18	<gsnedders>	140.484s
22:18	<gsnedders>	6958 122.344 0.018 122.344 0.018 {method 'xpath' of 'lxml.etree._Element' objects}
22:18	<gsnedders>	Heh. That's lovely :P
22:19	<Philip`>	_Element?
22:19	<Philip`>	gEBI was calling it on a Document...
22:19	<Dashiva>	gsnedders: Need a legend for those numbers
22:19	<Philip`>	Is much (cumulative) time spent in textContent?
22:19	<gsnedders>	Dashiva: ncalls tottime percall cumtime percall filename:lineno(function)
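A report with those columns can be produced with the standard library's `cProfile` and `pstats` modules. The sketch below profiles a deliberately slow stand-in workload; `slow_lookup` and `workload` are hypothetical and only play the role of the repeated tree searches.

```python
import cProfile
import io
import pstats


def slow_lookup(items, wanted):
    """Linear search, standing in for an expensive per-call tree traversal."""
    for item in items:
        if item == wanted:
            return item
    return None


def workload():
    items = list(range(2000))
    return [slow_lookup(items, n) for n in range(0, 2000, 10)]


def profile_report(func):
    """Run func under cProfile and return the stats table as text."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats()
    return out.getvalue()


report = profile_report(workload)
print(report)
```

In the table, `tottime` excludes time spent in sub-calls while `cumtime` includes it, and each `percall` column is the preceding total divided by the call count.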
22:20	<gsnedders>	Philip`: the root element is still an _Element :P
22:20	<gsnedders>	Philip`: 0.285s
22:20	<Philip`>	Oh, right, you just have broken naming conventions and use uppercase letters for variables
22:20	<Dashiva>	gsnedders: And with caching?
22:20	<Philip`>	so I thought it was a class name instead
22:20	<gsnedders>	Philip`: Peh. I just copied that from DOM :P
22:21	<gsnedders>	Philip`: (there was a reason why I did originally use DOM)
22:21	<Philip`>	gsnedders: It'd be saner to copy the Python conventions :-)
22:22	<gsnedders>	Philip`: textContent is cheap, and I doubt I could rewrite it to be quicker. If you iterate over everything you need to check if you have an element or a comment before you take _Element.text, while always taking .tail
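The text/tail handling gsnedders describes can be sketched with the standard library's ElementTree, which uses the same `.text` and `.tail` model as lxml. This is not his implementation: `text_content` is a hypothetical recursive version, and the `insert_comments` option of `TreeBuilder` assumes Python 3.8 or later.

```python
import xml.etree.ElementTree as ET


def text_content(node):
    """Concatenate descendant text in document order, skipping comment text."""
    parts = []
    # A comment's .text is the comment body, so only elements contribute it.
    if node.tag is not ET.Comment and node.text:
        parts.append(node.text)
    for child in node:
        parts.append(text_content(child))
        # .tail is the text after the child's end tag; always keep it.
        if child.tail:
            parts.append(child.tail)
    return "".join(parts)


# Keep comments in the tree so the comment check is actually exercised.
parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
root = ET.fromstring("<p>a<!--x-->b<em>c</em>d</p>", parser=parser)
result = text_content(root)
```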
22:22	<gsnedders>	Philip`: It would, but I currently want to get something that takes less than 2 minutes to create cross-references on HTML 5 :)
22:22	<Philip`>	gsnedders: You could call textContent half as many times
22:23	<Philip`>	but if it's only 0.3s then there's no point
22:23	<Philip`>	Are you including the time taken to parse the document?
22:23	<Dashiva>	Philip`: Let's stop distracting him from implementing an improvement with talk :)
22:23	<gsnedders>	Philip`: I know I've thought of doing that before
22:23	<gsnedders>	Dashiva: I need to sleep, anyway
22:23	<Philip`>	And is this two minutes in a profiler that makes everything really slow?
22:24	<gsnedders>	Philip`: No
22:24	<gsnedders>	Philip`: This is cProfile, which has next to no overhead. It really does take that long.
22:24	<gsnedders>	http://stuff.gsnedders.com/html5.html — find broken xrefs!
22:26	<Philip`>	gsnedders: That's only true if you consider 2.5x speed decrease to be next to no overhead
22:26	<Philip`>	(At least that's what I get when running html5lib with cProfile)
22:26	<gsnedders>	Philip`: I've never found it that slow, compared with other profilers :P
22:27	<Philip`>	That's just because other profilers are even more slow
22:27	gsnedders	uses time
22:27	<gsnedders>	let us see how long this takes :P
22:28	Dashiva	wonders why gsnedders is timing an implementation he knows is inefficient :)
22:28	<gsnedders>	Dashiva: To see how slow cProfile really is here :P
22:29	<annevk>	gsnedders, on that page various <dfn> elements don't have an id= assigned
22:29	<Philip`>	gsnedders: I found 64 broken references
22:29	<Philip`>	#refsWF2, #refsXHTML2, #refsRFC2119, #refsHALTINGPROBLEM, ...
22:29	<annevk>	(i also think that HTML-elements would be better as html-elements
22:29	<annevk>	)
22:30	<Dashiva>	Philip`: Those are Hixie's missing refs, aren't they?
22:30	<gsnedders>	annevk: like what?
22:30	<Philip`>	Dashiva: Yes, but nobody said I had to find broken links introduced by gsnedders
22:30	<gsnedders>	Philip`: Smartass.
22:31	<Philip`>	gsnedders: You've done something that breaks the spec-splitter
22:31	<Philip`>	(It splits the spec into five chunks, one of which is 1.4MB)
22:31	<gsnedders>	Philip`: 131.631s without cProfile
22:32	<gsnedders>	Philip`: No, it just doesn't do something that the spec-splitter needs yet, realistically :)
22:32	<Philip`>	gsnedders: Do you know what that something is?
22:33	<annevk>	gsnedders, search for <dfn>
22:33	jgraham	likes the date of 1901
22:34	<gsnedders>	annevk: hmm, odd
22:34	<gsnedders>	jgraham: That's Hixie, not me :)
22:34	<annevk>	gsnedders, so I'd suggest fixing that and lowercasing id values
22:34	<annevk>	gsnedders, except when id is explicitly set of course
22:34	<annevk>	no need to change what the author meant
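annevk's suggestion (lowercase generated ids, leave explicit ones alone) could be sketched as below. The `id_for` name and the whitespace-to-hyphen rule are assumptions for illustration; the log does not show how spec-gen actually derives ids.

```python
import re
import xml.etree.ElementTree as ET


def id_for(element):
    """Return the author's explicit id, or a lowercased id built from the text."""
    explicit = element.get("id")
    if explicit is not None:
        # Do not change what the author meant.
        return explicit
    text = "".join(element.itertext())
    return re.sub(r"\s+", "-", text.strip()).lower()


generated = id_for(ET.fromstring("<dfn>HTML elements</dfn>"))
kept = id_for(ET.fromstring('<dfn id="HTML-Elements">HTML elements</dfn>'))
```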
22:34	<gsnedders>	annevk: yeah, sure
22:34	<Hixie>	1901?
22:35	<gsnedders>	Hixie: [DATE: 01 Jan 1901]
22:35	<Hixie>	ah
22:35	<gsnedders>	Hixie: [DATE] has the exact same effect, FWIW
22:35	<annevk>	gsnedders, looks quite nice already btw
22:35	<Hixie>	i just use it to remind myself of what the format has to be
22:35	<gsnedders>	Hixie: Ah
22:35	<Philip`>	Hmm, "Cue ranges" is missing a </dl>
22:35	<Philip`>	which breaks everything
22:36	<gsnedders>	Philip`: Well, I didn't write html5lib's serialiser
22:36	<annevk>	gsnedders, hopefully it's not slow on sanely-sized specs :)
22:36	<gsnedders>	nudge
22:36	<gsnedders>	annevk: Currently around 5s on WF2 :(
22:36	<Philip`>	gsnedders: html5lib's serialiser works perfectly well for me
22:36	<gsnedders>	Philip`: Well, I ain't doing nothin' :P
22:36	<Philip`>	(The </dl> is there in the spec-splitter's normal output)
22:36	<jgraham>	gsnedders: I recommend using the serializer with optional tags on
22:37	<jgraham>	(since I guess that's what the problem is)
22:37	<Philip`>	I use it with optional tags off, and it works fine
22:37	<Philip`>	(quote_attr_values=True, inject_meta_charset=False)
22:37	<jgraham>	Oh well, it's gsnedders' fault then :)
22:37	<gsnedders>	annevk: Once done, it should be the case that html5lib is the slowest part
22:37	<Philip`>	(at least with the latest html5lib, and latest version of the spec)
22:37	<annevk>	gsnedders, 5s seems sort of acceptable
22:38	<gsnedders>	This is a version of the spec from Feb 13th :P
22:38	<gsnedders>	and almost latest html5lib
22:38	<annevk>	gsnedders, yeah yeah, I and everyone else on the planet know about html5lib perf :p
22:39	<Philip`>	gsnedders: That's kind of odd, then
22:40	<gsnedders>	I don't touch the actual raw data, and I don't do much that's crazy
22:40	<gsnedders>	http://hg.gsnedders.com/hgwebdir.cgi/spec-gen/file/1909801197f0/specGen/processes/xref.py#l86 — that's as crazy as it gets
22:40	<Philip`>	(This is line 12218)
22:42	<gsnedders>	Oh well. I'll look at this all tomorrow.
22:42	<gsnedders>	And try and work out the bizarre bug of it not having any @id
22:42	<gsnedders>	http://hg.gsnedders.com/hgwebdir.cgi/spec-gen/file/1909801197f0/specGen/processes/xref.py#l39 — it's hardly complex
22:46	<Philip`>	By the way, it'd be nice if the output of this script was well-formed XML, so later tools (like the spec splitter) could use a nice fast standard XML parser
22:49	<Dashiva>	What about the spec author's wishes? :)
22:50	<Philip`>	What wishes are those?
22:51	<Dashiva>	Maybe he likes using optional end tags
22:51	<Philip`>	The spec can be written in HTML, and the final published version could be converted to HTML, but it's much easier and faster if the intermediate stages are all XML
22:52	<Dashiva>	But once you add that end tag, the information is lost. There's no way to remove it later :)
22:52	<Philip`>	There's plenty of ways to remove it later - that's what html5lib's serialiser does
22:53	<Philip`>	See e.g. http://canvex.lazyilluminati.com/wa1/multipage/video.html - lots of non-present end tags there, in all the sensible places
22:54	<Philip`>	(Well, lots of non-present end tags not there)
22:55	<Dashiva>	But that could remove end tags that were present in the original :)
22:55	<Philip`>	and you'll get the same information loss regardless of whether you use HTML or XML in the toolchain, since it's always going to get parsed into a tree that loses that information and then get serialised
22:56	Dashiva	ponders whether to keep going
22:58	annevk	sighs at yet another W3C mailing list
22:59	<Philip`>	Mailing lists are old fashioned
23:00	<jwalden>	yeah, use IRC!
23:00	<Philip`>	The W3C should get with the times, and move all WG business to Twitter
23:00	<Dashiva>	Mike was talking about setting up a slashcode installation for htmlwg. :)
23:00	<jgraham>	Philip`: I have been tempted to suggest that to the people who complain that the W3C uses mailing lists
23:01	<jgraham>	since they never seem to have a good reason for disliking them
23:01	<Philip`>	Dashiva: Does that mean we could mod people Troll?
23:01	<jgraham>	or at least one that I agree with :)
23:02	<Dashiva>	Philip`: That was the idea, I believe
23:02	<Dashiva>	Getting the WG to sort out noise collectively, instead of everyone doing it individually
23:03	<jgraham>	I think modding people troll would be much worse than the current situation
23:03	<jgraham>	Because people would react badly
23:04	<jgraham>	and start making a fuss about how they were being oppressed or whatever
23:04	<Philip`>	Make it so that when you mod someone Troll, they see themselves as being modded Insightful instead
23:04	<Philip`>	then they'll never notice the oppression
23:04	<Hixie>	there are plenty of ways to foster good communities without modding or permanent banning
23:04	<Dashiva>	jgraham: Could always prevent negative modding, and just set a high positive threshold for viewing instead :)
23:05	<Hixie>	the reason the w3c lists have problems is that they aren't maintained like a proper community
23:05	<Dashiva>	Yeah, if the people in charge are willing to..
23:05	<jgraham>	Philip`: Only if they only have a single account and anonymous viewing is disallowed