#whatwg on 2007-06-04

11:36	<maikmerten>	meh, Firefox hates me and won't load http://www.whatwg.org/specs/web-apps/current-work/#video
11:36	<annevk>	you can always try http://www.whatwg.org/specs/web-apps/current-work/multipage/section-video.html#video
11:40	<maikmerten>	hmmm... that works
11:41	<maikmerten>	somehow my Firefox 2 (plain vanilla what comes with Ubuntu 7.04 plus a few extensions) is picky at times. It often gets into a "no, I won't load pages for you anymore until you kill and restart me" state
11:42	<maikmerten>	could be a profile issue, this profile may date back into Firefox alpha days ;)
11:42	<maikmerten>	err.. Phoenix alpha builds of course
11:44	<maikmerten>	I'm going to propose that http://wiki.xiph.org/index.php/OggPlayJavascriptAPI perhaps should be as close to the WHATWG API as possible
11:45	<met_>	annevk: what are reasons for XML5?
11:45	<met_>	it's only academic project or something more?
11:46	<met_>	oh www.xml5.com
11:48	<zcorpan>	http://code.google.com/p/xml5/
11:48	<annevk>	it might be worth something actually
11:48	<annevk>	ah, Phoenix!
11:49	annevk	wondered about the third name for some time
11:49	<zcorpan>	was it phoenix -> firebird -> firefox?
11:49	<annevk>	maikmerten, they are already implementing support for Ogg and <video> in Firefox
11:49	<annevk>	zcorpan, yeah
11:50	<met_>	worth something? do you think webbrowsers will implement it?
11:50	<annevk>	maybe
11:50	<annevk>	dunno
11:50	<annevk>	it would certainly solve some issues
11:51	<met_>	so you are only playing with and will see the results later?
11:51	met_	just writing some article and want mention xml5
11:52	<maikmerten>	annevk, yup, Chris Double is working on that
11:52	<annevk>	I'm doing research into it and making a draft proposal as well as an experimental implementation
11:52	<annevk>	maikmerten, doesn't that make the plugin effort kind of pointless?
11:52	<annevk>	(which is what I was after)
11:52	<maikmerten>	annevk, well, I consider the plugin more or less a testbed for liboggplay
11:53	<maikmerten>	annevk, plus Chris mentioned he may look into the plugin/liboggplay to see if some things can be harvested
11:53	<annevk>	guess I should read up on liboggplay
11:54	<annevk>	hsivonen, I think UTF-32 is part of the deal
11:54	<maikmerten>	well, it's mostly a lightweight top-level abstraction library to play back Ogg streams in a sane way. It also handles A/V sync, which is nontrivial to do crossplatform
11:56	<annevk>	thans
11:56	<maikmerten>	only problem I see for use in browsers: It depends on libfishsound (abstraction layers for the multitude of Ogg audio codecs.... Speex, FLAC, Vorbis...) and on liboggz (abstraction layer for libogg, which is the very opposite of high-level)...
11:56	<annevk>	thanks*
11:57	<maikmerten>	so it may be a bit problematic to have such a deep stack of layers when it comes to security
12:02	<virtuelv>	annevk: mind setting properties on html files in subversion?
12:03	<virtuelv>	that way, you could view http://xml5.googlecode.com/svn/trunk/specification/Overview.html in a regular browser
12:03	<zcorpan>	so xml5 parsers don't need to implement doctypes... they can abort instead :)
12:04	<annevk>	yeah, quite the opposite from HTML5 :)
12:04	<zcorpan>	heh
12:05	<zcorpan>	comments work exactly the same?
12:06	<annevk>	with less parse errors
12:06	<annevk>	atm
12:07	<zcorpan>	for the <!-- ---> case?
12:07	<zcorpan>	also <!-- -- -->
12:08	<zcorpan>	perhaps they shouldn't be parse errors in html5
12:09	<zcorpan>	<!------------------> looks good ;)
12:16	<zcorpan>	annevk: <!DOCTYPE >> is a doctype token with the name ">", correct?
12:16	<annevk>	<!-- opens a comment and --> closes it
12:16	<annevk>	what's in between doesn't matter as long as it's not -->
12:17	<zcorpan>	right
12:17	<annevk>	zcorpan, that might be a bug
12:17	<zcorpan>	DOCTYPE root name before state
12:20	<annevk>	It seems that's handled correctly by the parser...
12:20	<annevk>	as in, > does not give the name
12:21	<zcorpan>	ok
12:21	<annevk>	(though the specification suggests otherwise, indeed)
12:22	<annevk>	guess it makes sense to end if > is spotted
12:22	<zcorpan>	yeah
12:22	<zcorpan>	not sure about <!DOCTYPE a ">">
12:23	<annevk>	I think the spec is correct there now...
12:23	<annevk>	as in, the > is part of the non-reported identifier
12:25	<zcorpan>	wait, what about PUBLIC and SYSTEM?
12:26	<zcorpan>	are they fed through the root name states?
12:26	<annevk>	they're safely skipped over iirc
12:27	<annevk>	that's the goal anyway
12:27	<zcorpan>	ah
12:27	<zcorpan>	right
12:28	<annevk>	during the DOCTYPE state only a few tokens are created that only affect the rest of the tokenization stage
12:28	<annevk>	not the parsing stage
12:29	<zcorpan>	DOCTYPE identifier double quoted state, EOF, shouldn't it reprocess?
12:32	<annevk>	yeah
12:32	<annevk>	there are some minor glitches like that here and there still
12:32	<annevk>	I've also yet to write out entities and attlist
12:35	<maikmerten>	annevk, oh, one thing I told Chris Double and which I tried to tell howcome (by email, but no response): I recommend not using one of the released Theora alphas for in-browser usage. It makes sense to use Theora SVN trunk, as this features a completely rewritten decoder that is felt to be much safer when it comes to malformed content. There's a compatibility layer, so you don't need to worry about yet another API.
12:38	<annevk>	thanks
12:38	<maikmerten>	however, if you adapt to a slightly revised API (theoradec.h instead of theora.h) you have access to e.g. a system of postprocessing filters, which increases the perceived picture quality
12:38	<maikmerten>	http://img454.imageshack.us/my.php?image=bildschirmfoto1fx6.png <-- no postprocessing, default, old decoder will look like this
12:39	<maikmerten>	http://img454.imageshack.us/my.php?image=bildschirmfoto2xp3.png <- new decoder, with maximum postprocessing (notice how the blocky artifacts are nicely smoothed out)
12:39	<maikmerten>	meh, and yet again I spam this channel with irrelevant stuff, sorry :(
12:40	<annevk>	it's not that irrelevant I think
12:41	<maikmerten>	well, but it's far away from actual WHATWG specification work
12:41	<zcorpan>	so? :)
12:43	<maikmerten>	my point is that I assume most people here aren't thaaaat interested in codec discussion or how a particular codec exposes what functions over what API, so I should keep that topic fairly low-profile
12:50	<hsivonen>	it is hard to get notify the tokenizer about the byte stream crossing the 512 byte mark while supporting UTF-8 sequences landing across the boundary and maintaining otherwise efficient buffering
12:54	<hsivonen>	s/get //
12:56	<annevk>	why do it through the tokenizer?
12:58	<hsivonen>	to check for the requirement that the internal encoding declaration has to occur within the first 512 bytes if it does occur
12:59	<hsivonen>	annevk: this isn't about detecting the encoding but about detecting misplaced charset metas
13:00	<zcorpan>	what about <style>/<meta charset=utf-8>/</style>? it will be detected but isn't an element in the parsed tree
13:01	<annevk>	oh right
13:01	<hsivonen>	zcorpan: as far as I can tell, having <style>/<meta charset=utf-8>/</style> after the first 512 bytes is not an error and my implementation plan doesn't cover detecting it
13:02	<annevk>	the algorithm doesn't deal with <style>, <script>, <plaintext>, etc?
13:02	<annevk>	interesting
13:03	<hsivonen>	annevk: the byte-level sniffer doesn't
13:03	<zcorpan>	hsivonen: ok
13:03	<hsivonen>	annevk: as Hixie why not :-)
13:03	<zcorpan>	because that's what current browsers are doing
13:03	<hsivonen>	zcorpan: really? I thought current browsers run the full parser twice
13:04	<zcorpan>	yeah, really
13:04	<hsivonen>	wow
13:04	<hsivonen>	very interesting
13:04	<zcorpan>	indeed :)
13:05	<annevk>	so current browsers are not reusing the tokenizer?
13:06	<annevk>	or are using the tokenizer but not using the contentmodelflag...
13:06	<annevk>	whoa
13:06	<zcorpan>	something like that
13:06	<zcorpan>	they detect meta charset in <style>
13:06	<hsivonen>	annevk: anyway, as far as I can tell, http://www.whatwg.org/specs/web-apps/current-work/#charset makes a statement about document conformance requirements about the byte placement of an attribute (where the attributeness is determined on the character stream-reading tokenizer level)
13:06	<hsivonen>	joy
13:07	<hsivonen>	I'd like to know now if I am misunderstanding requirements for conformance checkers on this point
13:09	<annevk>	when I last read that it's indeed as you say
13:11	<hsivonen>	so as far as I can tell, I have to make sure that both byte buffering and UTF-16 code unit buffering have a forced buffer boundary at that point so that a notification can happen between buffers
13:12	<annevk>	it only matters for ASCII compatible encodings though
13:13	<hsivonen>	that doesn't help much considering that UTF-8 is variable-length and various Asian encodings, too
13:18	<annevk>	can't you just sniff for <meta in the first 512 bytes and combine those results with the result from the tree construction?
13:18	<hsivonen>	annevk: what if there is no meta in the first 512 bytes but there is one afterwards?
13:19	<hsivonen>	annevk: I should emit an error, shouldn't I?
13:19	<annevk>	yeah
13:19	<annevk>	it wouldn't be detected
13:19	<annevk>	(well, for <meta charset> that is)
13:21	<hsivonen>	also, using the java.nio.charset stuff with reported errors and recovery requires a lot of work that doesn't come from the libraries by default
13:21	<annevk>	maybe you should focus on something else first?
13:22	annevk	notes that it charset detection might change...
13:22	<zcorpan>	the 512 requirement might be dropped altogether, aiui
13:22	<annevk>	as opposed to the other bits, which should be more stable
13:22	<zcorpan>	if browser vendors find that they have to examine the entire document anyway
13:23	<annevk>	zcorpan, if the entire document is several megabytes though that doesn't seem useful
13:23	<zcorpan>	right
13:25	<zcorpan>	annevk: is namespace association dealt with after the tree construction?
13:25	<hsivonen>	annevk: perhaps. but usually when abstraction layers are violated, it is good to cover the violation points first, because if you design with clean abstractions, it is harder to punch the holes later
13:26	<kfish>	maikmerten, I'd suggest another advantage of using liboggplay (apart from being cross-platform, having good sync and being simple to program against) is that it supports Ogg Skeleton, which allows streaming from time offsets, seeking by time/chapter over HTTP etc.
13:26	<maikmerten>	oooh, right, I totally forgot that
13:27	<hsivonen>	annevk: besides, reporting and recovering from malformed byte sequences is more work anyway and is likely to stay
13:28	<kfish>	how can i tell (from a web app) if the user agent supports html5?
13:29	<annevk>	you can't
13:29	<zcorpan>	kfish: for what purpose?
13:29	<annevk>	"supporting html5" is kind of broad too
13:29	<hsivonen>	kfish: you should check for the features you need
13:29	<kfish>	sure ... i want to offer an html5 page, with <video>, if the browser supports that
13:29	<zcorpan>	then just place the fallback inside the <video> element
13:30	<annevk>	var v = document.createElement("video"); if (v.play) { // prolly some support }
13:30	<annevk>	zcorpan, "create an element" should do the namespace stuff
13:30	<zcorpan>	annevk: ok
13:30	<annevk>	zcorpan, with a similar algorithm that's already in the XML spec
13:30	<annevk>	zcorpan, I've yet to implement it though
13:31	<kfish>	zcorpan, thanks
13:31	<kfish>	annevk, thanks
13:32	<annevk>	basically just going up the open elements array and going through the attributes for xmlns: and xmlns= magic
13:35	<zcorpan>	http://simon.html5.org/sandbox/html/simply-complex
13:43	<annevk>	hmm
13:44	annevk	doesn't really understand that table to begin with
13:45	<zcorpan>	Gez' table is the same as my two tables placed on top of each other
13:46	<annevk>	I don't understand his
13:46	<annevk>	I would love to see these on some real site
13:46	<annevk>	Wikipedia for instance
13:51	<zcorpan>	i have the periodic table with headers on 3 sides in print
13:52	<annevk>	so with headers on three sides, is <td> connected with all of them?
13:52	<zcorpan>	period on the left side (1..7), group on the top (1..18), and shells on the right (K..Q)
13:52	annevk	thinks the unbound algorithm should be improved
13:53	<zcorpan>	yeah. that is not how gez' table works though
13:53	<annevk>	I know
13:53	<annevk>	I know how his table works, I'm not sure I understand the use case
13:53	<hsivonen>	perhaps a table is too complex if sighted users need AT to grok it
13:53	annevk	ponders about creating some javascript that hilites headers
13:54	<annevk>	especially if you need to press ctrl+atl+5...
13:58	<zcorpan>	i also note that in my periodic system, the headers have headers
13:59	<zcorpan>	much like http://www.webelements.com/ ("Group" and "Period")
14:02	<zcorpan>	to make it read ok with the current system in html5, you'd have to write "Period 1", "Period 2" etc in each header cell
14:03	<zcorpan>	which would make the table less foreseeable for sighted users
14:05	<annevk>	does HTML5 forbid headers to have headers?
14:05	<zcorpan>	no, but the algorithm don't associate them together
14:06	<zcorpan>	only td->th
14:06	<annevk>	that should be fixable
14:07	<zcorpan>	in my printed table, one cell is split up to form two header cells
14:07	<hsivonen>	the algorithm should probably (conceptually) walk the table from the center outwards and associate outer ths as headers for inner ths
14:12	<hsivonen>	calculating line and col numbers for malformed byte sequences is also a fun case of violating abstraction layers
14:16	<zcorpan>	http://www.radiochemistry.org/periodictable/periodic_table.jpg
14:18	<zcorpan>	how do you mark up that?
14:22	<zcorpan>	would each cell contain a DL that lists all the properties of the element?
14:23	<Philip`>	Would you actually want to mark it up, rather than stepping back and working out what data you want to present to the user, and how you could better present it without relying on complex visual associations and patterns?
14:23	<zcorpan>	dunno
14:24	<zcorpan>	it might perhaps be more like a graph, so if you wanted it in that form, you'd use SVG
14:25	<Philip`>	If someone wanted to find information about a specific element, I would expect it'd be much easier if they could type in the name and get the data back
14:25	<Philip`>	(i.e. using an input box on a form, rather than a table)
14:27	<zcorpan>	yeah
14:29	<zcorpan>	or use a table, where each cell just contains the symbol (and/or full name) that is a link to a page that lists the properties of that element
14:36	<zcorpan>	annevk: it seems like whitespace outside the root element is a parse error in xml5
14:43	<annevk>	hmm
14:49	<annevk>	doing some simple table iterating isn't that hard it seems
14:50	<annevk>	just start at a cell and go in some direction :)
15:11	<annevk>	http://www.andybudd.com/archives/2007/06/rebooted/ "although I still maintain that re-introducing the font tag is a VERY BAD IDEA, and sends out all the wrong signals"
15:12	annevk	was on stage defending the font tag :)
15:12	<annevk>	never thought I'd do that a couple of years back
15:16	<gsnedders>	I still have my doubts about it
15:16	<annevk>	oh, me too
15:16	<annevk>	although I don't really see a viable alternative
15:16	<annevk>	<span style=font-family:verdana> is worse
15:16	<gsnedders>	what were the reasons for not using <span>?
15:17	<annevk>	what are the reasons for using span?
15:17	<gsnedders>	some editors (like iWeb) already use it, we get screamed at less, etc.
15:17	<annevk>	and some don't
15:18	<annevk>	using <span style> over <font> for political reasons seems bad
15:18	<gsnedders>	agreed
15:19	<annevk>	<font> also seems easier to parse than <span style=> given that <span style=> has way more use cases
15:19	<krijnh>	They both seem bad
15:19	<annevk>	well yeah, but there's no semantic editor that's actually used so far
15:20	<krijnh>	Mine is :]
17:15	<zcorpan>	an <img> that is display:none won't request the image, right?
17:17	<annevk>	I believe <img> is always fetched
17:17	<zcorpan>	really?
17:17	<annevk>	(for instance, var i = new Image(); i.src = "blah")
17:17	<zcorpan>	oh sure, but that's a different case from <img> in the markup, no?
17:18	<annevk>	it's the same <img>
17:18	<zcorpan>	hrm
17:18	<annevk>	though I suppose Opera may have not loaded display:none <img> at some point...
17:20	<zcorpan>	data:text/html,<img style=display:none src=javascript:alert('foo')>
17:20	<zcorpan>	doesn't alert in opera (but removing display:none does)
17:23	<zcorpan>	what is a simple way to test it in other browsers?
17:24	<annevk>	src=img onload=fires?
17:24	<hsivonen>	has anyone tested what happens when C1 range Unicode characters are document.written?
17:25	<hsivonen>	do they stay intact or are they turned into the Unicode characters that occupy the C1 byte range in Windows-1252?
17:26	<zcorpan>	data:text/html,<img style=display:none src=data:, onerror=alert('foo')>
17:26	<zcorpan>	hm, so firefox does load
17:27	<zcorpan>	so does ie7
17:29	<annevk>	it's funny how the table examples given use <td><b>...
17:59	<zcorpan>	i've figured something out
17:59	<zcorpan>	if there is one thing about web design that is future proof, it's quirks mode
18:11	<hsivonen>	zcorpan: XHTML!
18:11	<zcorpan>	ha
18:11	<zcorpan>	that's just a middle-step to XML+XSLT anyway, isn't it ;)
18:12	<zcorpan>	(XSLT converting it back to HTML)
18:18	<hsivonen>	annevk: FWIW, Python doesn't come with a UTF-32 codec by default, either
18:18	<hsivonen>	hard to generate test data :-)
18:23	<othermaciej>	UTF-32 is so simple to transcode to UTF-16 or UTF-8, I'm surprised any reasonably complete text system would be missing the codec
18:24	<hsivonen>	othermaciej: well, both Sun's/Apple's JDK and Python are missing it
18:25	<hsivonen>	ICU4J has an optional jar that adds support to Java 1.4 or later
18:59	<zcorpan>	forums.whatwg.org won't load for me
19:07	Philip`	wonders if it's possible in JS to evaluate a string in a clean scope, so it can't access the variables that are visible to the evaluator, except for a set of explicitly shared variables, like with Python's eval(string, globals, locals)
22:17	<Hixie>	annevk: yt?
22:18	<Hixie>	annevk: http://bugs.webkit.org/show_bug.cgi?id=8872 might be relevant for you