#whatwg on 2007-06-29

00:30	<Hixie>	so... determining if headers="" is correctly-used or not
00:38	<Hixie>	http://www.brera.unimi.it/old/CAELUM/MUSEO/Schede/rif32.html
00:38	<Hixie>	i know it says "old" in the URL, but uh, were there even _computers_ in 1939?!
00:39	<othermaciej>	do ENIGMA-cracking machines count?
00:39	<Hixie>	only if the data on those machines was written in HTML
00:39	<Hixie>	(that HTML file has a "last modified" date of 1939)
00:59	<zcorpan_>	typo of 1993?
01:00	<Hixie>	why would you type the last modified date?
01:00	<zcorpan_>	dunno
01:26	<hober>	ImproperlyConfigured: Error loading MySQLdb module: No module named MySQLdb
01:26	<hober>	whoops, sorry for the noise.
01:28	<zcorpan_>	hmm, i think firefox has funny handling of -- in doctypes
01:29	<Hixie>	not "funny"
01:29	<Hixie>	"compliant to SGML"
01:29	<Hixie>	i opted to drop that in the spec's version, you'll be glad to notice
01:30	<zcorpan_>	not quite compliant to sgml... <!doctype html --> foo -- system>
01:31	<Hixie>	heh
01:31	<Hixie>	fair enough
01:31	<Lachy>	zcorpan_: are comments allowed in DOCTYPEs like that? I don't think so
01:31	<zcorpan_>	Lachy: per sgml, yes
01:32	<Lachy>	I thought only within the internal subset, not before the sys ident like that
01:32	<Hixie>	in any sgml declaration
01:32	<Hixie>	the doctype is an sgml declaration
01:32	<zcorpan_>	what Hixie said
01:32	<Hixie>	at least that's my understanding
01:32	<Hixie>	it's mostly academic in practice
01:32	<Lachy>	oh, right. anyway, doesn't matter
01:33	<zcorpan_>	indeed
01:35	<Hixie>	ok so for my study of longdesc="", i'm looking for these things:
01:35	<Hixie>	* does the <img> not have a longdesc at all?
01:35	<Hixie>	* is its longdesc blank?
01:35	<Hixie>	* does its longdesc have a space character in it?
01:35	<Hixie>	* does its longdesc value match the href="" of an ancestor <a> element?
01:35	<Hixie>	anything else i should look for?
01:36	<zcorpan_>	* does its longdesc point to the same page or a fragment on the same page?
01:36	<Lachy>	could you also check if it matches an href anywhere in the document, if there isn't an ancestor link?
01:36	<Lachy>	or at least nearby
01:37	<zcorpan_>	given jaws implementation, same-page fragments with longdesc aren't usable at all
01:37	<Lachy>	the idea is to see if most people are willing to put [D] links, or equivalent in, despite having fairly wide AT support now
01:38	<Hixie>	ok i added a check to see if the attribute's value matches the page's url
01:38	<Hixie>	and a check to see if the value doesn't have a space in it but starts with a #
01:38	<Hixie>	do i need a check for whether the value is url+# ?
01:39	<zcorpan_>	probably not
01:39	<Lachy>	what could you conclude from url+#?
01:39	<Hixie>	i mean, the page's url + #
01:39	<Hixie>	ok
01:39	<Hixie>	now headers=""
01:39	<Hixie>	lordy
01:39	<Lachy>	oh, ok, I thought you meant just any URL. But that would tell you longdescs to the same page
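The longdesc checks listed above could be sketched roughly as follows. This is only an illustration, not the script Hixie actually ran: the function name, the label strings, and the idea of passing in the ancestor href values as a list are all assumptions, and "contains-space" assumes the 01:35 check is about a space character.

```python
from typing import Iterable, List, Optional


def classify_longdesc(
    longdesc: Optional[str],
    page_url: str,
    ancestor_hrefs: Iterable[str] = (),
) -> List[str]:
    """Return the labels that apply to one <img>'s longdesc value.

    The label names are hypothetical; they only mirror the checks
    discussed in the channel.
    """
    if longdesc is None:
        return ["missing"]          # the <img> has no longdesc at all
    if longdesc.strip() == "":
        return ["blank"]            # longdesc="" or whitespace only

    labels = []
    if " " in longdesc:
        labels.append("contains-space")
    if longdesc in set(ancestor_hrefs):
        labels.append("matches-ancestor-href")
    if longdesc == page_url:
        labels.append("matches-page-url")
    if longdesc.startswith("#") and " " not in longdesc:
        labels.append("same-page-fragment")
    return labels or ["other"]
```

A value can pick up more than one label, so a survey built on this would count labels rather than images.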
01:44	<csarven>	occording to this http://www.w3.org/TR/2006/REC-xml-20060816/#NT-EmptyElemTag the trailing space on empty/null elements is optional. i can't remember correctly but was there anything about IE not interpreting the empty elements if there were no trailing spaces?
01:45	<csarven>	s/occording/according
01:45	<Hixie>	in XHTML? or in raw XML?
01:45	<Hixie>	IE doesn't support XHTML.
01:45	<Hixie>	but it handles raw XML pretty much per spec.
01:46	<csarven>	well XHTML is useless for IE so it doesn't make a difference either way right
01:46	<csarven>	im probably mistaken about this
01:46	<zcorpan_>	the trailing space mentioned in xhtml 1.0 appendix c is for NN4
01:47	<Lachy>	does NN4 really choke on it?
01:47	<csarven>	for some reason i thought that the trailing space that developers were putting on XHTML documents (served with text/html for IE) contained the trailing spaces.. i thought it was because IE didn't treat them properly when there was no trailing space
01:47	<Lachy>	I thought it was pre-NN4 browsers
01:47	<zcorpan_>	it treats it as part of the tag name
01:47	<Lachy>	they do because of the appendix c guideline, not because of IE
01:48	<Lachy>	there are no known widely used browsers that need the space these days
01:48	<zcorpan_>	<br foo=""/> probably works fine in nn4 though (it would just treat it as an attribute)
01:50	<csarven>	interesting. http://www.w3.org/TR/2006/REC-xml-20060816/#NT-EmptyElemTag states its optional but appendix C suggests to use the trailing space
01:51	<Lachy>	There are, unfortunately, still some people in the world who use NN4. There was a question on the WSG list recently by someone with a project that had to support Windows 3.1 and NN4 :-/
01:51	<Lachy>	I think it was some intranet project
01:51	<Hixie>	csarven: i highly recommend using text/html and forgetting about XML for the purposes of XHTML :-)
01:52	<Lachy>	csarven: keep in mind that appendix c is non-normative and is based upon unknown research on ancient browsers.
01:53	<zcorpan_>	...and has conflicting guidelines
01:53	<csarven>	perhaps this is the unclear part for me.. is XHTML not entirely XML? :S
01:53	<zcorpan_>	(don't use PIs, use PIs)
01:53	<Lachy>	it is, but it just has guidelines for authors wanting to make it compatible with legacy UAs as text/html
01:53	<csarven>	Hixie oh this is just for curiosity :)
01:53	<Hixie>	ah :-)
01:54	<csarven>	good point Lachy
01:54	<csarven>	zcorpan_ :)
01:54	<Lachy>	the guidelines can be mostly ignored, though
01:54	<Lachy>	zcorpan_: the conflict is that it says to use <?xml?> decl, but don't use PIs because of the <?foo?> syntax
01:56	<zcorpan_>	Lachy: C.1 vs C.14
01:56	<Lachy>	yeah, I think so
01:56	<zcorpan_>	though C.14 is about when serving as xml, not text/html
01:57	<Lachy>	oh, I see, I got it backwards.
01:57	<Lachy>	It says don't use <?xml?>, but use <?xml-stylesheet?>
01:58	<Lachy>	then there's also the issue that <?xml-stylesheet?> doesn't define how to process stylesheets identified with a fragment ident.
01:59	<zcorpan_>	we need an Associating Style Sheets with XML documents 5
02:00	<zcorpan_>	or perhaps annevk will define that as part of xml5
02:05	<zcorpan_>	<?xml-stylesheet href="A"?>
02:05	<csarven>	zcorpan_ ASS?
02:05	<zcorpan_>	:)
02:05	<csarven>	has a nice ring to it
02:05	<zcorpan_>	ASS5
02:11	<othermaciej>	obviously you should name it Canonical ASS instead
02:13	<zcorpan_>	well there you go: opera expands NCRs in the pseudo-attributes, firefox doesn't
02:13	<zcorpan_>	note that the spec doesn't define that case though
02:19	<Lachy>	zcorpan_: which way does XML define it?
02:19	<Hixie>	sigh
02:20	<Hixie>	i'm gonna have to implement the Forming a Table algorithm aren't i
02:20	<Lachy>	Hixie, yeah
02:21	<Hixie>	so
02:22	<Hixie>	once i have these two tables constructed (one with headers and one without)
02:22	<Hixie>	how do i quantitatively compare them?
02:22	<Hixie>	just 1 for different and 0 for the same?
02:32	<othermaciej>	perhaps -1 for cases where the uses of headers is illegal and so might just confuse AT
02:32	<Hixie>	ah yes
02:33	<Hixie>	so i can count "incorrect", "redundant" and "interesting"
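One way the three-way count above might look in code, assuming each table has already been reduced to a mapping from data cell to its set of header cells. The `HeaderMap` type, the function name, and the convention that `None` stands for an illegal use of headers="" are all hypothetical; the Forming a Table algorithm itself is not shown.

```python
from typing import Dict, FrozenSet, Optional

# Maps a data cell id to the ids of the header cells associated with it.
HeaderMap = Dict[str, FrozenSet[str]]


def classify_headers_use(
    with_headers: Optional[HeaderMap],
    without_headers: HeaderMap,
) -> str:
    """Compare the table built with headers="" against the one built without.

    with_headers is None when the headers="" usage could not be resolved
    (an assumption made here to stand for the "illegal" case).
    """
    if with_headers is None:
        return "incorrect"
    if with_headers == without_headers:
        return "redundant"      # headers="" changed nothing
    return "interesting"        # headers="" produced a different association
```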
02:49	<Lachy>	Hixie, that low quality and high quality conformance idea you had gives a possible solution to the requirement of alt="". Make it required in high quality, optional, but recommended, in low quality
02:51	<Lachy>	and probably call them Strict and Loose conformance
03:29	<Hixie>	Lachy: the names were carefully chosen (and a big part of the proposal)
03:29	<Lachy>	ok
05:30	<Lachy>	I just realised that selectors api doesn't define a feature string for hasFeature(), while most (if not all) other DOM related specs do. I'm not sure if it matters though.
06:19	<Hixie>	Lachy: see the whatwg spec for commentary on how useless that feature is :-)
06:20	<Lachy>	I know it's useless in JS, that's why I'm not rushing to add it.
06:20	<Lachy>	but I notice you include it for HTML5 and also XBL
06:22	<othermaciej>	hasFeature sucks
06:22	<othermaciej>	Selectors API is best tested for by property testing, at least in JS
06:23	<othermaciej>	but maybe less dynamic languages like Java don't have that luxury
06:23	<Lachy>	othermaciej: yeah, that's what I was wondering in #webapi
07:15	<Lachy>	for those of you who don't know yet, http://lachy.id.au/log/2007/06/opera :-)
07:18	<fishkandy>	Lachy, onya
08:05	<jruderman>	Lachy: what percentage of opera's new hires for the oslo office have to move from another country?
08:11	<Lachy>	jruderman: I have no idea
08:12	<jruderman>	Lachy: just wondering since i keep hearing about people moving to norway to work for opera
08:12	<othermaciej>	well, it's not like the world's top web experts already live in norway
08:13	<Lachy>	I've been told there's people from 41 different nationalities working there
08:13	<jruderman>	haha both apple and opera love The New York Times as an example site to show on a mobile phone
08:14	<othermaciej>	you should see how most phone browsers render it :-)
08:16	<jruderman>	hehe
08:17	<jruderman>	what do opera and safari do with simple fixed-width pages? do you have to scroll left and right as you read each line of a paragraph?
08:17	<jruderman>	(on phones)
08:17	<othermaciej>	in Mobile Safari you can pinch-zoom, pan, and double-tap to zeem a block of text to fit
08:17	<othermaciej>	*zoom
08:17	<othermaciej>	dunno what Opera does
08:17	<othermaciej>	I hate fixed-width pages
08:18	<othermaciej>	even on the desktop
08:18	<Lachy>	Opera has small screen rendering, which you can simulate in the desktop browser
08:18	<jruderman>	i don't want to zoom to fit the width of the block if that means each letter is 4px..
08:19	<Lachy>	I think there's a video that demonstrates the web browser somewhere on the apple site
08:19	<jruderman>	they always demo with the new york times
08:19	<jruderman>	or google
08:20	<othermaciej>	jruderman: you'll have to try one or get someone to demo if you want to see
08:21	<othermaciej>	jruderman: could probably expense it for "competitive analysis"
08:22	<Lachy>	hmm. the apple site says the video is 175MB to download. It's actually 318MB http://www.apple.com/iphone/usingiphone/guidedtour.html
09:26	<zcorpan_>	Lachy: only the five entities are expanded in xml-stylesheet pseudo-attributes per ASS
09:27	<Lachy>	yeah, I only expected the 5 predefined entities to be expanded
09:27	<Lachy>	but is that true even if there's a doctype with an internal subset?
09:28	<Fuzzy76>	Lachy: Congratulations on your new job :)
09:28	<Lachy>	thanks
09:28	<zcorpan_>	Lachy: yes
09:28	<Lachy>	is that a browser limitation, or per spec?
09:28	<zcorpan_>	spec
09:28	<Lachy>	ok
09:29	<zcorpan_>	haven't tested thoroughly what browsers do, just did one basic test (with the NCR) and found that firefox and opera did different things
09:30	<zcorpan_>	i might try to get the spec updated too... it's not sane and leaves things undefined
09:36	<Lachy>	I have to write a 40 minute presentation on HTML5 before the 20th, and I'll be away from the 7th to the 15th.
09:36	<hsivonen>	Lachy: congrats on the job
09:36	<Lachy>	thanks
09:38	<Lachy>	... and I'll be working on the XBL primer on the 4th and 5th. I am really going to struggle to find the time :-/
09:41	<hsivonen>	Hixie: Re: survey on longdesc: you should probably dereference the URI and check the content type of what is found
09:41	<hsivonen>	Hixie: considering the bogosity of longdesc pointing to an image
09:43	<zcorpan_>	hsivonen: could perhaps be checked by checking if it's the same as src=""? (there already is a check for same as parent <a href>)
09:44	<zcorpan_>	hmm. <!doctype html public ">">
09:44	<zcorpan_>	all browsers terminate the doctype at the first >
09:49	<annevk>	that'd be easy to fix in the spec
09:49	<zcorpan_>	yeah
09:54	<annevk>	"Clarify who is in charge of dropping BOMs. Hint: it's not the air force." :)
09:56	<hsivonen>	are 512 byte boundary charset meta tests available as individual files somewhere?
09:57	<hsivonen>	extra points if there are tests where a broken UTF-8 bytes requence lies across the 512 byte boundary
09:57	<hsivonen>	byte sequence even
09:58	<zcorpan_>	hsivonen: i think Hixie has some tests on that
09:58	<Hixie>	hsivonen: i can't, i have no network in this environment
09:58	<hsivonen>	Hixie: ok. not even metadata from Google cache?
09:59	<annevk>	http://hixie.ch/tests/adhoc/html/parsing/encoding/ has some tests
10:00	<hsivonen>	annevk: ok. I'll see if some of those are what I'm looking for
10:02	<Hixie>	hsivonen: not with the way i'm doing it
10:02	<hsivonen>	Hixie: ok
10:03	<Hixie>	when you start trying to do scans of billions of documents, things like looking up information in a database becomes unscalable
10:05	<hsivonen>	I wonder if ia_archiver searches the Web depth first, breadth first or something else
10:05	<hsivonen>	for surveys you want to do breadth first, right, to diversify results given finite time?
10:06	<Hixie>	almost certainly (c), since when you run a spider of that scale you have to take into account rate of retrieval per-server
10:07	<annevk>	how about collecting the set of referenced docs and checking those later?
10:07	<Hixie>	annevk: re Jirka's "Parse errors are allowed to be corrected by parser:
10:07	<hsivonen>	it would be interesting to hook up my Java parser (once ready) to the IA crawler so that people outside google with reasonable CPU and network could do smallish surveys
10:07	<Hixie>	"
10:07	<Hixie>	annevk: HTML4 allowed it too
10:07	<zcorpan_>	<!doctypehtmlpublic"x">
10:08	<annevk>	Hixie, yeah, I don't think I'm going to bother though
10:08	<zcorpan_>	...has the name "HTML" and the FPI "x" in firefox
10:08	<Hixie>	zcorpan_: has the name "htmlpublic"a"" in the spec
10:08	<zcorpan_>	yeah
10:09	<hsivonen>	I wonder if any of the people who volunteered to survey the top sites are interested in getting the framework ready if I give an API spec for using the conformance checker development version
10:09	<Hixie>	bed time
10:09	<Hixie>	nn
10:09	<annevk>	night
10:09	<zcorpan_>	nn
10:16	<hsivonen>	ok. Hixie's tests 044, 045 and 046 are what I wanted
10:18	<zcorpan_>	ha! <!doctype html public "x>"> triggers standards mode in firefox (yet renders the "> characters in body), but <!doctype html public "x> triggers quirks mode
10:18	<annevk>	that shows the preparsing I guess
10:18	<zcorpan_>	yeah
10:22	<hsivonen>	w00t! Passed the tests on the first try after implementing something as complex as doing the buffering trickery around the 512 byte boundary
10:22	<zcorpan_>	hsivonen: nice :)
10:26	<zcorpan_>	hsivonen: why would you drop them on the floor?
10:27	<hsivonen>	zcorpan_: depends on whether you want to report null or a string that has prematurely ended
10:27	<hsivonen>	zcorpan_: on the face of it, reporting null makes sense if the string wasn't properly terminated
10:28	<hsivonen>	I now got the IBM UTF-8 decoder loaded instead of the Sun version. and now my code crashes
10:28	<hsivonen>	I wonder why
10:29	<zcorpan_>	hsivonen: well, the doctype's correctness flag is set to incorrect anyway...
10:30	<hsivonen>	Aargh. the IBM decoder does really wrong things when reporting UTF-8 errors
10:38	<zcorpan_>	oops
10:39	<zcorpan_>	ie doesn't terminate at > in FPI
10:39	<zcorpan_>	or in quotes anywhere
10:39	<annevk>	anywhere?
10:39	<annevk>	<! ">"?
10:39	<annevk>	<!-- "-->" --> ?
10:39	<zcorpan_>	those are not doctypes
10:40	<annevk>	no kidding
10:40	<zcorpan_>	but seems to apply to <! ">" >
10:40	<zcorpan_>	not <!-- "-->" -->
10:41	<zcorpan_>	applies to <? ">" > and </ ">" > too
10:42	<zcorpan_>	<^_^>
10:42	<annevk>	fancy
10:42	<annevk>	seems like IE has the same handling for all of those
10:42	<annevk>	as it doesn't really support DOCTYPEs
10:43	<zcorpan_>	indeed
10:51	<hsivonen>	I wonder if Java 6 fixes the UTF-8 decoder holes
10:51	<hsivonen>	anyway, as of Java 5, both the JDK and ICU4J are b0rked
11:48	<annevk>	I wonder if the new encoding sniffer works for <meta> 512 bytes <meta> 512 bytes <meta> ...
11:52	<hsivonen>	annevk: "the new"?
11:54	<annevk>	the one that works together with the parser
11:54	<annevk>	and has this confident flag
11:54	<hsivonen>	annevk: is there a spec change?
11:54	<hsivonen>	annevk: or is this about html5lib?
11:54	<annevk>	there was a spec change
11:55	<annevk>	r955
11:58	<hsivonen>	what? did Hixie remove the magic 512 boundary?
11:58	<hsivonen>	just when I got it working
11:59	<annevk>	I think that boundary is still there for authors
12:00	<annevk>	actually
12:02	<annevk>	i think he did
12:02	hsivonen	is rather miffed
12:07	<zcorpan_>	hsivonen: you can perhaps use the code to emit a warning, suggesting that encoding declarations should be as early as possible in the source to improve perf (and interop?)
12:08	<hsivonen>	perhaps
12:08	<hsivonen>	I'm going to stop chasing encoding and tokenization spec changes for a while
12:09	<hsivonen>	I'd love to see a realistic spec on how much data to feed to chardet
12:10	<hsivonen>	that is, should I buffer the entire stream or n first bytes
12:11	<annevk>	i think guessing 512 bytes is reasonable
12:11	<annevk>	if you then later encounter a different encoding you'd have to switch
12:12	<hsivonen>	annevk: Gecko seems to run chardet on the first buffer the html parser gets from the channel but I have no idea how big that buffer is
12:13	hsivonen	hopes someone else finds out so that I don't need to find out using a debugger
12:13	<hsivonen>	what does IE do?
12:13	<hsivonen>	will a future WebKit use the ICU detector once it is ported to C?
12:25	<zcorpan_>	so the spec allows first a preparse, then a real parse, and then a real parse again if the first real parse found conflicting encoding information?
12:27	<zcorpan_>	e.g. <style><meta charset=utf-8></style><meta charset=windows-1252>
12:28	<annevk>	it seems to allow only a single preparse (optional) and only a single reparse
12:28	<hsivonen>	zcorpan_: but is the first "real" parse running scripts?
12:31	<hsivonen>	I don't trust that the current spec is the last word on this topic
12:32	<zcorpan_>	why is the preparse specced at all, if it yields the same result as not preparsing (modulo perf)?
12:32	<hsivonen>	It would be nice if the people in charge of the relevant code in Trident, Gecko, WebKit and Presto just disclosed what exactly it is they do and what they want to do
12:34	<zcorpan_>	or wait, it doesn't yield the same result. not preparsing doesn't catch encoding declarations in cdata elements
12:34	<zcorpan_>	so if it's optional and different how can we achieve interop?
12:35	hsivonen	would be interested in learning Hixie's thinking here
12:36	<hsivonen>	where does svn keep passwords? does the working copy have any private data?
12:38	<Philip`>	If you checked out from a http://name:password⊙. then it'll store that in .svn/entries, which (I've found) becomes annoying when you don't notice
12:39	<Philip`>	If you don't do that, I think it's up to the SVN client how it asks you for passwords or remembers the previous entries, and it shouldn't store that in the working copy anywhere
12:40	<Philip`>	(No idea where it does store it, though)
12:40	<annevk>	can you make it store it?
12:40	<hsivonen>	Philip`: thanks
12:41	<hsivonen>	I guess I'll sanitize the svn-specific directories then
12:41	<Philip`>	I guess it also depends if it's http:// vs svn+ssh://, since the SVN client will log in in different ways and would differ on whether/how it saves passwords
12:42	<hsivonen>	the Java http://java.sun.com/j2se/1.4.2/docs/api/java/io/InputStream.html#mark(int) contract doesn't allow saying that the mark should never become invalid...
12:42	<hsivonen>	which totally sucks considering arbitrary rewinding
12:42	<Philip`>	'svn export' seems to be a convenient way of removing the .svn directories
12:43	<hsivonen>	OTOH, if the underlying stream does support arbitrary rewinding, implementing my own is bad for perf
12:44	<hsivonen>	do browsers act on a meta charset even if there's a <body> first?
13:58	<hsivonen>	zcorpan_: one thing you could test is putting upper or lower case Turkish i in various places where the spec requires a literal string that contains an i
13:59	<hsivonen>	zcorpan_: Opera has a history of making comparisons where the Turkish i equals an ASCII i
14:08	<zcorpan_>	hsivonen: ok. thanks
14:11	<zcorpan_>	U+0130 (İ) and U+0131 (ı)
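The hazard with the two Turkish i characters can be shown with a small sketch. It assumes the comparison is meant to be ASCII case-insensitive; the helper name is made up for illustration. Python's own `str.upper()` follows the default Unicode case mapping, which maps U+0131 to a plain ASCII `I`, so a naive comparison built on it would accept the dotless i.

```python
import string

# Translation table that lowercases only ASCII A-Z and leaves every
# other code point (including U+0130 and U+0131) untouched.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_case_insensitive_equal(a: str, b: str) -> bool:
    """Compare two strings, folding case for ASCII letters only."""
    return a.translate(_ASCII_LOWER) == b.translate(_ASCII_LOWER)


DOTTED_CAPITAL_I = "\u0130"  # İ
DOTLESS_SMALL_I = "\u0131"   # ı

# A keyword spelled with a Turkish i must not match the ASCII keyword.
print(ascii_case_insensitive_equal("PUBLIC", "public"))
print(ascii_case_insensitive_equal("publ" + DOTLESS_SMALL_I + "c", "PUBLIC"))
print(ascii_case_insensitive_equal("PUBL" + DOTTED_CAPITAL_I + "C", "public"))
```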
15:54	<zcorpan_>	why is Node.localName uppercased?
15:56	<annevk>	because there are people relying on it I guess?
15:56	<zcorpan_>	hmm... wouldn't think so
15:57	<annevk>	because it returns undefined in IE?
15:58	<zcorpan_>	yeah. and if you use .localName you probably also check .namespaceURI or so. and i wouldn't expect it to be uppercased
16:00	<annevk>	it would be nice if there was a canonical property available
16:00	<zcorpan_>	you might do something like if ((elm.tagName == "A" && !elm.namespaceURI) || (elm.localName == "a" && elm.namespaceURI == "http://www.w3.org/1999/xhtml"))
16:01	<zcorpan_>	where the former is for legacy HTML UA and the second is for HTML5 UA and for XHTML
16:11	<annevk>	hmm, document.createElementNS("A", xhtmlNS) is interesting
16:11	<annevk>	it will claim "a" yet not implement HTMLAnchorElement
16:12	<zcorpan_>	createElementNS is case-changing? in what impl?
16:12	<annevk>	the English prose of HTML5?
16:12	<annevk>	for HTML Elements (elements in the XHTML namespace) in HTML documents .tagName etc. will return lowercase
16:13	<annevk>	however, document.createElementNS will not have its first argument lowercased
16:13	<annevk>	which gives you the aforementioned edge case
16:13	<zcorpan_>	ah. ok. then createElementNS isn't case-changing (and html5 doesn't say it is)
16:15	<zcorpan_>	.tagName will return uppercase btw
16:27	<hsivonen>	zcorpan_: +1 on localName returning lower case in text/html DOM
16:50	<duryodhan>	isn't drawWindow part of the HTML5 specs?
16:50	<duryodhan>	(canvas)
16:50	<annevk>	no
16:55	<duryodhan>	so is it supported only by mozilla firefox?
16:55	<duryodhan>	or by opera/safari ?
16:56	<annevk>	I believe only by Firefox
16:56	<annevk>	and only for privileged JavaScript
17:07	<Philip`>	duryodhan: Yep, it's Firefox-only (so it really should be in a getContext('moz-2d') or something, though it isn't, which is perhaps annoying if the spec added something like drawWindow (e.g. to handle the text-drawing issue) with different semantics)
17:08	<duryodhan>	text-drawing issue?
17:09	<Philip`>	You can't (currently) draw text onto the canvas easily, and I think one proposed solution was to have something like drawWindow so you could set up an HTML element with CSS formatting and everything, and stick text into that and then draw it onto the canvas
17:11	<duryodhan>	hmm ... then might as well implement the moz drawWindow ...
17:13	<Philip`>	Hmm, I guess it would have to be more like drawElement rather than drawWindow
17:14	<Philip`>	(or just another drawImage overload)
17:15	<Philip`>	since drawing whole windows is a fairly limited functionality (unless you're very careful and cut out precisely the section surrounding the element you want), so maybe it doesn't matter that they stole the drawWindow name already
17:17	<duryodhan>	the problem with drawElement would be ... it is painful to know where the element is exactly ...
17:17	<duryodhan>	mean to say a form ...
17:17	<duryodhan>	one could start off a form anywhere and end it anywhere ...
17:18	<duryodhan>	the code for some buttons might be in between the <form > and </form>
17:18	<duryodhan>	but the button may be in some god forsaken place ...
17:20	<Philip`>	The form element is always defining a subtree in the DOM, so you can just do drawElement(getElementById('some-form'), 200, 100, 0, 0) and the browser can work out how to render that piece of the document (with all its contained elements, and affected by stylesheets) in a 200x100 container with a transparent background, then paint it onto the canvas at posiion 0,0, perhaps
17:21	<Philip`>	s/ii/iti/
17:24	<duryodhan>	so it will basically draw so that everything in between <form> </form> is rendered?
17:27	<Philip`>	Yep - kind of like extracting the whole <form>...</form> into a new clean document and rendering that, ignoring all the extra stuff from the original page (except probably keeping all the stylesheets that applied to that element (and its contents) from the original page). But maybe that's totally impossible to implement - I have no idea really :-)
17:28	<duryodhan>	yeah...
17:28	<duryodhan>	I was trying to do something like that in scripts ...
17:28	<duryodhan>	offsetLeft etc....
17:29	<duryodhan>	but I am pretty sure that the co-ords would be wrong for a weird form
17:32	<Philip`>	It does sound hard (/impossible) in general to work out what rectangle on the page corresponds to a certain element, particularly if there's stylesheets moving everything around
17:33	<Philip`>	and perhaps that would be simplified by just defining the rectangle first, and then telling the content to draw itself inside there (the same as what happens when drawing stuff into the screen rectangle, except directly onto the canvas rather than the screen)
17:36	<duryodhan>	I don't think I understand ... if the content is drawn into the canvas .. user can
17:36	<duryodhan>	't interface with it
17:39	<Philip`>	Why would you want an interactive form in the canvas, rather than just drawn as normal HTML?
17:45	<duryodhan>	I mean to ask .... why would you directly write to canvas?
17:45	<duryodhan>	instead of writing to HTML ?
17:45	<duryodhan>	I dont know what I am talking
17:45	<duryodhan>	:)
17:49	<Philip`>	Oh, as in why would you draw the element as a new separate thingy onto the canvas, rather than cutting out an existing part of the rendered HTML page?
17:49	<Philip`>	Probably the main reason is that you'd want to draw things without them being part of the HTML page
17:50	<Philip`>	like drawElement(document.createTextNode('Hello world')) or something
17:50	<Philip`>	else you'd end up with loads of rubbish stuck all over your page, when you only ever wanted to draw it into the canvas
17:58	<duryodhan>	yeah.. I am still stuck with the notion of drawing stuff already existing on the page ....i.e clicking a snap ...
17:58	duryodhan	is a little confused and talking through his hat
17:58	Philip`	doesn't really know about the actual technical details about how any of this could work
17:59	<duryodhan>	hehe
17:59	<duryodhan>	two ppl who don't know anything ... discussing stuff ...
18:00	<Hixie>	hsivonen: the current spec on encoding was what it is now before you asked me if i was going to make any more changes, and the changes were made in response to your e-mail
18:29	<Hixie>	annevk: there are people discussing XHR in the whatwg list
19:00	<met_>	http://www.0x000000.com/?i=365
19:29	<gsnedders>	how do browsers deal with LF separated HTTP headers?
19:52	<hsivonen>	Hixie: yeah, that email was from the time when I thought that a single pass over the document and changing decoders on the fly was feasible. :-(
20:09	<Hixie>	hsivonen: i think what the spec does now is pretty much what is required for web compat. note that the 512 byte thing isn't wasted, it's actually still there it's just that you get to pick the number (and it can be zero or the whole file)
20:09	<Hixie>	(or anywhere in between, including 512)
20:10	<hsivonen>	Hixie: ok. I'll cool down a bit and implement the tree builder spec
20:10	<Hixie>	heh
20:10	<hsivonen>	(obviously, I should have paid closer attention to the sniffing section. I was tracking the changes to tokenization)
20:11	<Hixie>	well my point is there weren't really any changes
20:11	<Hixie>	just additions
20:11	<Hixie>	i don't think it should have made any code you wrote obsolete
20:12	<hsivonen>	Hixie: I think the main issue is if we can get browsers to agree how many bytes to examine. if not, everyone will have to scan the entire file or risk incompatibility with someone else
20:12	<Hixie>	no the current system is that you scan as many bytes as you like, and then do the real parser, and if the real parser finds a conflicting encoding, you start over using that instead.
20:12	<Hixie>	which is what browsers do, basically
20:12	<Hixie>	it makes the prescan optional for interop, effectively
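The scheme described above (scan as many bytes as you like, run the real parser, start over on a conflict) can be sketched very loosely like this. It is not the spec's algorithm: the regular expressions are deliberately naive stand-ins, the function names are invented, and the windows-1252 fallback is just an assumed default.

```python
import re
from typing import Optional, Tuple

# Deliberately naive pattern; a real prescan has many more rules.
_META_CHARSET = re.compile(rb"<meta\s+charset=[\"']?([A-Za-z0-9_-]+)", re.I)
# Drops <style>/<script> contents to imitate a real parser, which would
# treat markup inside those elements as text rather than as tags.
_RAW_TEXT = re.compile(rb"<(style|script)\b.*?</\1\s*>", re.I | re.S)


def _first_charset(data: bytes) -> Optional[str]:
    match = _META_CHARSET.search(data)
    return match.group(1).decode("ascii").lower() if match else None


def prescan(data: bytes, limit: int) -> Optional[str]:
    """Look for a declaration in the first `limit` bytes only."""
    return _first_charset(data[:limit])


def real_parse_declared_encoding(data: bytes) -> Optional[str]:
    """Stand-in for the declaration the real parser would find."""
    return _first_charset(_RAW_TEXT.sub(b"", data))


def choose_encoding(
    data: bytes, limit: int, fallback: str = "windows-1252"
) -> Tuple[str, bool]:
    """Return (encoding, restarted)."""
    tentative = prescan(data, limit) or fallback
    declared = real_parse_declared_encoding(data)
    if declared is not None and declared != tentative:
        return declared, True   # conflict: start over with the declared one
    return tentative, False
```

With zcorpan_'s earlier example, `<style><meta charset=utf-8></style><meta charset=windows-1252>`, the sketch ends up at windows-1252 whether the prescan looks at 0 or 512 bytes; only the restart differs.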
20:13	<Hixie>	gotta go, lunch, brt
20:13	<Hixie>	bbl, rather
20:13	<hsivonen>	Hixie: another thing: since the sniffing can now proceed past the first 512 bytes, the perf penalty for not declaring the encoding is potentially serious.
20:13	<hsivonen>	Hixie: so it would make sense to encourage even the ASCII-only folk to declare
20:14	<hsivonen>	Hixie: also, always making the undeclared case non-conforming helps sanity checking CMSs even if it is a tad drastic for individual docs
20:30	<Lachy>	this is cool http://iphonetester.com/
20:52	<hsivonen>	are head and body the only nodes that get attributes appended to them after the initial insertion to the document?
21:01	<mpt>	annevk, "misnormers" is a misnomer
21:06	<zcorpan_>	hmmm... now that i test again, ie seems to skip past CDATA and RCDATA elements when looking for encoding declarations
21:07	<zcorpan_>	or it uses its real parser, with the content model flags
21:15	<zcorpan_>	the problem with the current spec is that the preparser can find things that the real parser doesn't
21:16	<zcorpan_>	also, safari, firefox and opera don't change their mind after they've found an encoding decl with the preparser
21:17	<zcorpan_>	afaict
21:20	<zcorpan_>	i.e., ie does what the spec says but preparses 0 bytes. the others do what the spec says but preparse the whole thing and set the confidence flag to certain when they have found a meta charset
21:27	<zcorpan_>	mpt: misspelled or wrongly used?
21:28	<zcorpan_>	(or both)
21:28	<mpt>	possibly both
21:28	<mpt>	The passive voice makes the statement unexplained, so it's difficult to tell
21:31	<zcorpan_>	iirc, the earlier draft said that results="" assumed a particular UI or something like that
21:36	<Hixie>	hsivonen: well, UAs are gonna have to work out what the right tradeoff is for the encoding detection (how long to prescan before just doing the heavy duty parse)
21:37	<hsivonen>	Hixie: what about interop?
21:37	<Hixie>	hsivonen: the preparse never sets a confident encoding
21:37	<Hixie>	hsivonen: you always verify the encoding with the real parser
21:37	<hsivonen>	oh
21:38	<zcorpan_>	Hixie: does the real parser undo tentative encoding information if it doesn't find any?
21:38	<zcorpan_>	Hixie: <style><meta charset=utf-8></style>
21:39	<Hixie>	ah, good point
21:39	<Hixie>	then again, it's already non-compliant to not have an encoding declaration if you're not using pure ascii
21:40	<zcorpan_>	being non-compliant doesn't make implementations interoperate ;)
21:40	<hsivonen>	Hixie: why isn't to not have an encoding declaration. Period.?
21:41	<hsivonen>	s/isn't to/isn't non-compliant to/
21:41	<met_>	Should html5 drag&drop model work across domains or only in the same domain? cannot find it in the spec
21:43	<Hixie>	hsivonen: because I don't want "<!DOCTYPE HTML><title>Hello</title><p>Hello" to be invalid.
21:44	<zcorpan_>	Hixie: if we want encoding detection to be interoperable, then i think we either should go the ie route (use the real parser for the whole file to find encoding information) or the firefox/opera/safari route (use the pre-parser for the whole file to find encoding information)
21:45	<Hixie>	i do not believe that firefox uses the pre-parser over the whole file
21:45	<Hixie>	that would imply they don't do incremental parsing, which is demonstrably false.
21:46	<zcorpan_>	it finds encoding declarations inside <style>, and it doesn't change its mind later on with encoding declarations that the real parser would find
21:46	<Hixie>	yeah, i think they should fix that
21:50	<hsivonen>	Hixie: you could make it valid by adding the UTF-8 BOM
21:50	<Hixie>	i can't easily type the UTF-8 BOM
21:51	<hsivonen>	Hixie: do I understand correctly that "reconstruct the active formatting elements" is a no-op if only character tokens have been processed since the last "reconstruct the active formatting elements"?
21:51	<Hixie>	i believe that to be the case, yes
21:51	<hsivonen>	Hixie: thanks
21:51	<zcorpan__>	ok. so the ie route then. with an optional preparse. but then the real parser needs to undo tentative encoding information when it doesn't find any. no?
21:52	<Hixie>	hsivonen: in fact you can always treat runs of non-whitespace character tokens and runs of whitespace character tokens as single tokens.
21:52	<Hixie>	(the same is not true for mixed runs of whitespace and non-whitespace)
21:52	<Hixie>	zcorpan_: well, what would it "undo" it to?
21:52	<hsivonen>	Hixie: that's what I'm asking, yes. And "in body" I can treat them both as a single token, right?
21:53	<Hixie>	i believe that's what i do in my parser, yes
21:53	<hsivonen>	ok. I'll optimize the "in body" case
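[Editorial sketch, not part of the log.] The run-coalescing idea discussed above could look roughly like this in Python. The function name `coalesce_runs` and the exact whitespace set are assumptions made for illustration; the spec's own definition of a space character should be used in a real tokenizer.

```python
from itertools import groupby

# Assumed whitespace set for this sketch; check the spec for the real one.
HTML_WHITESPACE = set("\t\n\f\r ")


def coalesce_runs(chars):
    """Yield (is_whitespace, text) pairs, one pair per run of same-kind characters."""
    for is_space, run in groupby(chars, key=lambda c: c in HTML_WHITESPACE):
        yield is_space, "".join(run)


# Whitespace runs and non-whitespace runs each become a single token,
# but a mixed run is never merged into one.
runs = list(coalesce_runs("bar  baz"))
print(runs)
```

Each pair can then be handed to the tree builder as one character token, so the "reconstruct the active formatting elements" step runs once per run rather than once per character.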
21:55	<zcorpan__>	Hixie: whatever it would have used if it didn't preparse
21:56	<Hixie>	it would have used some random guess
21:57	<zcorpan__>	yeah. fair enough
21:58	<Hixie>	i mean i see what you're saying, but i don't really see how you could check to see if a UA did "undo" it or not
21:58	<Hixie>	i don't really have a good solution to this
21:59	<Hixie>	we can't really limit how much you preparse, or require it to be a certain minimum, because that has big perf implications
21:59	<Hixie>	and browsers would just ignore us
21:59	<zcorpan__>	you can check by comparing with a test that doesn't have any encoding declaration
22:00	<zcorpan__>	however, i don't think there is content that relies on encoding declarations in (r)cdata elements applying, since ie doesn't see them
22:00	<Hixie>	well, the problem is you are allowed to pick a different default each time
22:00	<Hixie>	as you "learn"
22:01	<zcorpan__>	ah. yeah that's true
22:03	<hsivonen>	Hixie: under "in table" anything else you say "If the current node is a table, tbody, tfoot, thead, or tr element, then, whenever a node would be inserted into the current node, it must instead be inserted into the foster parent element." How can the current node be anything other than a table element?
22:04	<Hixie>	<table><i>
22:04	<Hixie>	at this point, the current element is an <i>
22:04	<Hixie>	but you're "in table"
22:05	<hsivonen>	I see
22:05	<hsivonen>	but I don't see the consequences
22:06	<Hixie>	consider <table><i>X</i>Y
22:06	<Hixie>	the <i> goes into the foster parent
22:06	<Hixie>	then the X goes into the <i>, because the <i> is the current node
22:06	<Hixie>	then you close the <i> with the </i>, so the current node is the <table> again
22:06	<Hixie>	so the Y goes into the foster parent
22:06	<hsivonen>	Hixie: but the foster parent is the table itself, right?
22:07	<Hixie>	no, the foster parent is the element the table is in
22:07	<hsivonen>	ooh
22:07	<Hixie>	assuming no dom mutations are going on
22:07	<Hixie>	so <div><br 1><table><p><br 2></table><br 3> results in <div><br 1><p><br 2><table></table><br 3></div>
22:08	<hsivonen>	ok
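[Editorial sketch, not part of the log.] A minimal Python model of the foster-parent rule as described above, replaying the `<table><i>X</i>Y` example inside a `<div>`. The `Node` class and `insert_in_table_mode` are hypothetical names, and the sketch assumes the table still has a parent (no DOM mutations) and that fostered nodes land immediately before the table.

```python
class Node:
    """Tiny stand-in for a DOM node (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    def append(self, child):
        child.parent = self
        self.children.append(child)

    def insert_before(self, child, reference):
        child.parent = self
        self.children.insert(self.children.index(reference), child)


TABLE_CONTEXT = {"table", "tbody", "tfoot", "thead", "tr"}


def insert_in_table_mode(open_elements, child):
    """Insert child at the current node, or foster-parent it if that node is table-ish."""
    current = open_elements[-1]
    if current.name in TABLE_CONTEXT:
        # The foster parent is the element the nearest open table sits in.
        table = next(n for n in reversed(open_elements) if n.name == "table")
        table.parent.insert_before(child, table)
    else:
        current.append(child)


# Replay <div><table><i>X</i>Y
div, table = Node("div"), Node("table")
div.append(table)
stack = [div, table]

i = Node("i")
insert_in_table_mode(stack, i)            # current node is <table>: fostered
stack.append(i)
insert_in_table_mode(stack, Node("X"))    # current node is <i>: goes inside it
stack.pop()                               # </i> closes the <i>
insert_in_table_mode(stack, Node("Y"))    # current node is <table> again: fostered

print([c.name for c in div.children], [c.name for c in i.children])
```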
22:09	<hsivonen>	basically, insertToFosterParent will throw in the streaming mode
22:10	<Hixie>	fwiw what i did was use a function pointer as my "append to tree" function, and most of the time it's just the straightforward add to tree, but when "in table" it points to a function that does the check
22:10	<Hixie>	that way you don't pay for the cost all the time
22:11	<Hixie>	yes, in streaming mode i don't have a solution for tables
22:11	<Hixie>	what i recommend is actually to buffer up all the content that would be appended before the table
22:11	<Hixie>	and then when you leave the table, fire a non-SAX event of everything you collected
22:12	<Hixie>	or just delay all those events til after the table
22:12	<Hixie>	so the content that normally would go before the table goes after it instead
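[Editorial sketch, not part of the log.] One way to read the "delay all those events til after the table" suggestion for a streaming consumer. `DeferringHandler`, its method names, and the tuple-shaped events are all hypothetical; nested tables are not handled here.

```python
class DeferringHandler:
    """Forward events to a sink, holding back fostered content until the table ends."""

    def __init__(self, sink):
        self.sink = sink        # callable that receives events in output order
        self.deferred = []      # events that would have gone before the table

    def emit(self, event):
        self.sink(event)

    def emit_fostered(self, event):
        # Cannot be placed before a table that has already been streamed out.
        self.deferred.append(event)

    def end_table(self):
        self.sink(("end", "table"))
        for event in self.deferred:
            self.sink(event)
        self.deferred.clear()


out = []
handler = DeferringHandler(out.append)
handler.emit(("start", "table"))
handler.emit_fostered(("text", "Y"))
handler.end_table()
print(out)
```

The trade-off is the one stated in the log: content that a tree-building parser would place before the table comes out after it.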
22:12	<hsivonen>	Hixie: I didn't follow: what saving did the function pointer give?
22:13	<Hixie>	wherever i would have called appendChild(), i dereferenced the function pointer and called that instead
22:13	<Hixie>	so it's just a pointer dereference each time
22:13	<Hixie>	instead of being a comparison to the current node each time
22:13	<Hixie>	maybe it's not cheap to do that in java
22:13	<Hixie>	in the language i was using a function pointer is exactly as cheap as a function call
22:14	<Hixie>	since all functions are actually function pointers
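[Editorial sketch, not part of the log.] The swappable "append to tree" function could be modelled in Python with a rebindable attribute. `TreeBuilder` and its methods are hypothetical names, and the `calls` list only records which path was taken; this is not Hixie's actual implementation.

```python
class TreeBuilder:
    TABLE_CONTEXT = {"table", "tbody", "tfoot", "thead", "tr"}

    def __init__(self):
        self.current_node = "body"
        self.calls = []
        # The attribute plays the role of the function pointer.
        self.append = self._append_plain

    def _append_plain(self, node):
        # Common case: no check on the current node at all.
        self.calls.append(("current node", node))

    def _append_checked(self, node):
        # Only installed while "in table": pays for the current-node test.
        if self.current_node in self.TABLE_CONTEXT:
            self.calls.append(("foster parent", node))
        else:
            self.calls.append(("current node", node))

    def set_in_table(self, in_table):
        self.append = self._append_checked if in_table else self._append_plain


builder = TreeBuilder()
builder.append("p")                 # plain path
builder.current_node = "table"
builder.set_in_table(True)
builder.append("i")                 # checked path, fostered
builder.current_node = "i"
builder.append("X")                 # checked path, normal insert
builder.set_in_table(False)
builder.current_node = "body"
builder.append("br")                # plain path again
print(builder.calls)
```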
22:14	<hsivonen>	I don't see why the pointer thing wouldn't be masked by other branches anyway
22:14	<Hixie>	how do you mean?
22:15	<hsivonen>	If I check for "in table", doesn't that mask the pointer issue
22:15	<Hixie>	where?
22:15	<hsivonen>	or did you implement each phase as a set of function pointers that you plug in?
22:15	<hsivonen>	like html5lib has a class for each phase
22:15	<Hixie>	i just have a massive set of nested switch() statements
22:16	<hsivonen>	me too
22:16	<Hixie>	my implementation is basically a literal implementation of the spec
22:16	<hsivonen>	are your tokens objects or callbacks?
22:16	<Hixie>	they're tuples, which is basically a struct (object)
22:17	<hsivonen>	my tokens are callbacks, so the code order of the tree builder goes against the grain of the spec, which sucks
22:17	<Hixie>	my tokeniser is a function that runs until it returns a token
22:17	<Hixie>	ah
22:17	<Hixie>	why did you do it that way?
22:17	<Hixie>	i think the ideal way of implementing the tokeniser is a generator function a la python's yield
22:18	<hsivonen>	Hixie: because the "emit a token" model mentally matched SAX
22:18	<hsivonen>	Hixie: does yield store the current continuation?
22:18	<Hixie>	yes
22:18	<Hixie>	i've never used it but it seems perfect for the input stream and the tokeniser
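[Editorial sketch, not part of the log.] A toy generator-based tokenizer showing how `yield` suspends the function between tokens. It only understands bare start tags, end tags, and text; it is an illustration of the control flow, not the spec's tokenizer, and the token tuple shapes are assumptions.

```python
def tokenize(text):
    """Yield ("StartTag", name), ("EndTag", name) or ("Characters", data) tuples."""
    pos = 0
    while pos < len(text):
        if text[pos] == "<":
            end = text.find(">", pos)
            if end == -1:
                # Unterminated tag: treat the remainder as character data.
                yield ("Characters", text[pos:])
                return
            body = text[pos + 1:end]
            if body.startswith("/"):
                yield ("EndTag", body[1:].lower())
            else:
                yield ("StartTag", body.lower())
            pos = end + 1
        else:
            end = text.find("<", pos)
            if end == -1:
                end = len(text)
            yield ("Characters", text[pos:end])
            pos = end


tokens = tokenize("<table><i>X</i>Y")
first = next(tokens)      # runs the tokenizer only until the first token
rest = list(tokens)       # resumes where it left off
print(first, rest)
```

The tree builder can pull tokens with `next()` or a `for` loop, which keeps its code organised by insertion mode rather than by token callback.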
22:19	<hsivonen>	no luck with storing continuations in Java without doing it by blocking a thread
22:19	<Hixie>	yeah, same in sawzall
22:19	<Hixie>	except you have no threads :-)
22:20	<hsivonen>	Hixie: anyway, doing the tokenizer the way I've seen SAX parsers written was a perfect match for the spec
22:20	<Hixie>	cool
22:20	<Hixie>	sounds less than perfect for the tree part :-(
22:20	<hsivonen>	Hixie: but now I need to group tree building by token instead of by phase/mode
22:20	<Hixie>	ah
22:20	<Hixie>	makes sense
22:20	<hsivonen>	yeah
22:20	<Hixie>	i've seen other implementations of the tree part that work that way
22:21	<Hixie>	so it's certainly possible
22:21	<hsivonen>	I haven't seen anything impossible yet. the random access to the spec just sucks
22:21	<Hixie>	random access to the spec?
22:21	<Hixie>	oh you mean the way the spec doesn't match it?
22:21	<Hixie>	yeah
22:22	<hsivonen>	Hixie: having to seek the right piece of spec when I go over my code stubs instead of making a sequential pass over the spec
22:22	<Hixie>	yeah
22:28	<hsivonen>	Hixie: Do I understand correctly that <div>foo<table>bar baz</table></div> parses to <div>foobarbaz<table> </table></div>
22:28	<Hixie>	yes but that's a bug in the spec
22:28	<Hixie>	not sure what it should do yet
22:28	<Hixie>	in particular, <div>foo<table> bar</table></div> vs <div>foo<table> <tr><td></table></div> is a tough one
22:29	<hsivonen>	hmm. perhaps I make that part horrendously inefficient then
22:29	<hsivonen>	no point in optimizing something that will be discarded
22:37	<zcorpan>	Hixie: hmm. afaict, ie, opera and safari all put the text inside the table. firefox handles that case as equivalent to <div>foobar<table> </table></div>
22:38	<Hixie>	which case?
22:38	<zcorpan>	<div>foo<table> bar</table></div>
22:40	<zcorpan>	http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C%21DOCTYPE%20html%3E%0D%0A%3Cdiv%3Efoo%3Ctable%3E%20bar%3C/table%3E%3C/div%3E
22:41	<Hixie>	anything that results in the text appearing while still inside the <table> is wrong
22:41	<Hixie>	since it isn't compatible with the css model
22:43	<Hixie>	looks like we might be able to get away with just setting a flag that saves whitespace
22:43	<Hixie>	and reset the flag when you hit a table-related element
22:43	<Hixie>	i.e. when you "clear the stack..."
22:44	<zcorpan>	yep
22:44	<zcorpan>	safari treats <div>foo<table> <tr></tr>bar</table></div> as <div>foobar<table> <tr></tr></table></div>
22:45	<zcorpan>	opera and ie seem to put the text inside the table somehow. in ie, text inside table is inside an element with the tag name ""
22:46	<Hixie>	IE creates "fake caption" elements
22:46	<Hixie>	i spoke to the ie guys about it
22:46	<zcorpan>	ah
22:46	<Hixie>	they're having all kinds of trouble implementing the css table model on top of their parsing model
22:47	<hsivonen>	Hixie: the CSS table model has to handle crazy XML and DOM modifications anyway, right?
22:47	<zcorpan>	wonder how opera deals with it, since it has text nodes as child of table in the dom afaict
22:47	<Hixie>	hsivonen: the problem is that css wraps cells around unexpected elements in the table
22:47	<Hixie>	hsivonen: whereas we want to move all the content to before the table
22:55	<hsivonen>	Hixie: Ok. I'm not well aware of the legacy requirements here
23:46	<hsivonen>	Hixie: I don't understand why space characters cause "Reconstruct the active formatting elements" in "after body".
23:58	<hsivonen>	Hixie: when you go from trailing end to main phase, what's the insertion mode gonna be? in body?