#whatwg on 2007-05-23

00:05	<Philip`>	Have people reverse-engineered/documented the level file format?
00:07	<zcorpan>	dunno
00:09	<Philip`>	The .exe has some strings (I guess from assert calls) that appear to give some information about the physics, except they say things like "gyuru::beszur-ban k >= pontszam!" and I'm not quite sure what that means
00:10	<zcorpan>	um, don't know either
00:14	<Philip`>	At least I can read "fabs( ujenergia - oldenergia ) > 0.001!"
00:17	<zcorpan>	seems like it's written in Hungary
00:19	<zcorpan>	i mean Hungarian
00:22	<Philip`>	http://www.gamedev.net/community/forums/topic.asp?topic_id=397791 has some relevant notes
00:25	<Philip`>	It'd always be possible to ask the original author exactly how the physics works, though I don't know if he'd like people reproducing his work in JavaScript :-)
00:25	<zcorpan>	perhaps it makes elastomania even more famous :)
00:26	<zcorpan>	it's a very addicting game
00:26	<Philip`>	I just keep bashing my head on the floor and dying
00:27	<zcorpan>	heh
00:28	<zcorpan>	ah, its collision detection explains some bugs indeed
00:34	<zcorpan>	good stuff in there
00:49	<zcorpan>	xmoto is open source... but it doesn't have the same physics as elastomania at all
04:51	<MikeSmith>	Hixie - you there?
04:51	<MikeSmith>	wanted to ask about meanings of flags/fields in html5 checkin message
06:00	<hsivonen>	how testable is the HTML5 tokenizer these days? are the tokenizer tests in html5lib designed to run without a tree builder setting the content model flag?
06:00	<hsivonen>	testable on its own, that is
06:24	<Hixie>	MikeSmith: see the web-apps-tracker
06:59	<MikeSmith>	Hixie - thanks ... looking now
07:02	<MikeSmith>	Hixie - I know already about the browser flags in square brackets ... was wondering about the number in parens
07:02	<Hixie>	that's the stability
07:04	<hsivonen>	Hixie: are your tokenizer tests applicable without a tree builder? that is, can I develop a tokenizer first and prove that I pass the semiofficial tests before I develop a suite of tree builder?
07:04	hsivonen	has a plan for 4 or 5 tree builder subclasses
07:05	<Hixie>	i believe there are tokeniser tests, but i didn't write them
07:05	<MikeSmith>	Hixie - OK ... stability values defined anywhere (mostly seems to be either 0 or 2 in checkin descriptions)
07:05	<Hixie>	hsivonen: i assumed that most people would not write tokenisers exactly per spec
07:05	<hsivonen>	Hixie: oh. I thought you had a tokenizer test suite
07:05	<othermaciej>	presumably only the actual tree output is normative
07:06	<othermaciej>	but some tokenizer issues must inevitably affect the DOM, so end-to-end tests could be made
07:06	<Hixie>	MikeSmith: 0 = experimental, 1 = unstable, 2 = has implementations, 3 = has stable implementations
07:07	<Hixie>	hsivonen: my tokeniser wasn't an implementation of the spec -- e.g. it did some preprocessing magic for collecting characters together and separating whitespace from non-whitespace depending on the tree constructor mode
07:07	<hsivonen>	Hixie: how do you mean not exactly per spec? I intend to use the runtime stack for implicit state instead of an explicit state variable. Mike Day suggested making a table-driven DFA, but I'm not sure that makes sense in terms of optimizing Java performance
07:07	<Hixie>	hsivonen: so any tests i had for that wouldn't match the spec
07:07	<hsivonen>	oh
07:07	<MikeSmith>	Hixie - thanks
07:07	<Hixie>	np
07:08	<hsivonen>	othermaciej: I do intend to test the whole thing with a tree builder eventually
07:09	<hsivonen>	(although I am hoping to write the tree building stuff one with pluggable tree builder-specific back ends so that there's really only one tree builder to test instead of 4 or 5)
07:09	<hsivonen>	s/stuff one/stuff once/
08:41	<annevk>	we're doing SQL now?!
08:51	<virtuelv>	annevk: ?
08:51	<virtuelv>	SQL in HTML5?
08:51	virtuelv	confused
08:56	<annevk>	http://html5.org/tools/web-apps-tracker?from=837&to=838
09:16	<annevk>	I think it would be easier if the second argument was an array
09:16	<annevk>	of executeSql()
09:17	<annevk>	Methods with an arbitrary amount of arguments are hard to construct on the fly
09:36	<mikeday>	so, in HTML5, http-equiv is only kept around for backward compatibility with refresh, yeah?
09:40	<annevk>	it's actually conforming to use it
09:41	<mikeday>	what about http-equiv="content-type" ?
09:42	<annevk>	that's replaced with <meta charset=utf-8>
09:42	<annevk>	well, replaced with <meta charset>
09:42	<mikeday>	right, so pages that still use it can not be valid HTML5? (for whatever definition of valid we are using this week)
09:43	<othermaciej>	I think <meta http-equiv="content-type"> should remain conforming for charset purposes
09:43	<othermaciej>	so your content can be conforming but still degrade gracefully
09:44	<othermaciej>	annevk: Function.call in JS lets you call any function with a variable number of arguments with an array instead
09:44	<mikeday>	that would help reduce confusion for authors following existing tutorials
09:44	<othermaciej>	annevk: but I don't believe there's a way to do the converse
09:45	<annevk>	mikeday, valid also requires <!Doctype html>
09:45	<mikeday>	true.
09:46	<annevk>	othermaciej, when would you want to do the reverse?
09:47	<annevk>	how do you set CVSROOT in Ubuntu?
09:47	<othermaciej>	annevk: I don't think you would want to do the reverse, but it makes functions w/ variable arguments more useful than functions that take an array
09:47	<othermaciej>	if you only have one
09:47	<othermaciej>	and I think I mean Function.apply, not call
09:48	<mikeday>	export CVSROOT=?
09:49	<annevk>	mikeday, cool
09:49	annevk	continues fooling around
09:50	annevk	is trying to switch to Ubuntu
09:54	<annevk>	othermaciej, the good thing about not allowing content-type is that there's only one way to do it
09:54	<annevk>	(in the markup, as you can still set it through HTTP)
09:55	<Hixie>	i can't think of a single time i've ever wanted to call a sql evaluator with a variable number of arguments -- it's not like getElementsByClassName() in that respect
09:55	<mikeday>	annevk, what are you using now?
09:55	<annevk>	ubuntu
09:55	<annevk>	I used to use Windows XP
09:55	<mikeday>	ah.
09:55	<annevk>	I also have a Windows XP installation on this computer
09:55	<annevk>	but I hope to leave it alone
09:56	<mikeday>	vmware is handy for testing IE bugs while still running Linux
09:56	<mikeday>	if you've got gcc installed, you can always try building my stub code: libhtml.sf.net
09:56	<annevk>	yeah, I got IE6 and IE7 running
09:56	<mikeday>	(you'll need svn to check it out, not cvs)
09:57	<annevk>	I need CVS for W3C stuff
09:57	<othermaciej>	annevk: I think "only one way to do it" is a weak advantage compared to graceful degradation
09:58	<othermaciej>	in fact, I'm not sure <meta charset=""> is really justifiable
09:58	<othermaciej>	it's nicer syntax but it does not degrade in older browsers at all
09:58	<annevk>	browsers have to support it to support the web
09:58	<annevk>	in that sense it degrades perfectly
09:58	<othermaciej>	<meta charset>?
09:59	<annevk>	yeah
09:59	<othermaciej>	what current browsers support it?
09:59	<othermaciej>	I guess I just did not know about it
09:59	<Hixie>	all of them
09:59	<mikeday>	hmm, in the charset sniffing, <!--> is not a complete comment, and <!----> is, but what about <!---> ?
09:59	<Hixie>	specifically they support <meta http-equiv=content-type content=text/html; charset=utf-8>
09:59	<Hixie>	(note the lack of quotes)
10:00	<othermaciej>	that's amusing
10:00	<annevk>	mikeday, also not complete
10:00	<othermaciej>	ok, no real objection then
10:01	<mikeday>	annevk, the spec could be clearer on that point :)
10:01	<mikeday>	or, I could be smarter
10:01	<mikeday>	I missed the bit that said after the existing --
10:01	<annevk>	(note that <!--> will become a comment in due course)
10:02	<annevk>	(to minimize differences between quirks and standards mode)
10:02	<mikeday>	"Advance the position pointer so that it points at the first 0x3E byte which is preceeded by two 0x2D bytes and comes after the second 0x2D byte that was found."
10:02	<mikeday>	it could be clearer that the two 0x2D bytes come after the second 0x2D byte that was found
10:02	<mikeday>	not just the first 0x3E byte coming after the second 0x2D byte that was found.
10:02	<Hixie>	send mail
10:03	mikeday	sends
10:03	<Hixie>	thanks
10:03	<mikeday>	(although if comment definition will change, perhaps charset sniffing will need to change as well)
10:04	<annevk>	yeah, likely
10:04	<annevk>	at least, it would make sense if they behaved the same :)
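The comment rule discussed above can be sketched in a few lines of Python. This is a minimal illustration of the reading mikeday and annevk settle on (both closing 0x2D bytes must come after the opening `<!--`), not the spec's own algorithm text; the function name `skip_comment` is hypothetical.

```python
def skip_comment(buf: bytes, pos: int = 0):
    """Return the index of the closing 0x3E byte of a comment at pos, or None.

    Hypothetical helper: assumes the reading in which the two 0x2D bytes
    before the 0x3E must both come after the opening "<!--".
    """
    if not buf.startswith(b"<!--", pos):
        return None
    # Start the search past the second opening dash, so the opening
    # dashes can never double as the closing ones.
    end = buf.find(b"-->", pos + 4)
    return None if end == -1 else end + 2
```

Under this reading `<!-->` and `<!--->` are both incomplete, while `<!---->` is a complete (empty) comment.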
10:08	<mikeday>	I'm trying to write a little state machine for charset sniffing,
10:08	<mikeday>	as it's a lot easier than writing a state machine for complete HTML tokenisation :)
10:30	<zcorpan>	what's the use-case for a client-side database?
11:00	<hsivonen>	mikeday|away: did you already investigate the feasibility of a table-driven DFA for the tokenizer?
11:01	<hsivonen>	SQL! whoa! how will that one interoperate without requiring every browser to embed sqlite?
11:02	<zcorpan>	who requested it?
11:02	<hsivonen>	SQL isn't exactly the best example of an interoperably implemented standard
11:04	<annevk>	got a point there...
11:04	hsivonen	is still shying away from a table-driven DFA in Java
11:04	<annevk>	zcorpan, Mozilla has it
11:04	<annevk>	not sure if they requested it
11:35	<mikeday>	hsivonen, not sure yet, the state machine in the spec is too complex as is, needs to be simplified first
11:36	<mikeday>	hsivonen, I'm just trying to get something basic working to implement <meta charset> sniffing
11:36	<mikeday>	but it won't be implemented as an array, probably just use goto
11:36	<mikeday>	and it only needs to apply to a buffer of fixed size, so blocking on input won't be an issue.
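For readers unfamiliar with the term, a table-driven DFA keeps the transitions in an array indexed by state and character class, instead of encoding them in control flow (goto, switch, or the call stack). The sketch below is only an illustration of that idea on a toy problem, finding the first `-->` in a byte buffer; the table, the names, and the three-class alphabet are assumptions for the example and are not taken from the spec or from anyone's implementation.

```python
# Character classes: the column index into each row of the table.
DASH, GT, OTHER = 0, 1, 2

# TABLE[state][char_class] -> next state; state 3 means "-->" was just read.
TABLE = [
    [1, 0, 0],  # state 0: nothing matched yet
    [2, 0, 0],  # state 1: seen "-"
    [2, 3, 0],  # state 2: seen "--" (further dashes stay in this state)
]
ACCEPT = 3


def classify(byte: int) -> int:
    """Map a byte to its character class."""
    if byte == 0x2D:
        return DASH
    if byte == 0x3E:
        return GT
    return OTHER


def find_comment_end(buf: bytes):
    """Return the index of the '>' that ends the first "-->", or None."""
    state = 0
    for i, byte in enumerate(buf):
        state = TABLE[state][classify(byte)]
        if state == ACCEPT:
            return i
    return None
```

The loop body is the same for every state, which is the appeal of the approach; the cost is that the table has to be built and kept in sync with the state definitions.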
11:42	<hsivonen>	mikeday: is your main loop of control going to be inside the parser as in Java SAX parsers or outside as in expat?
11:42	<mikeday>	outside
11:43	<mikeday>	can still support SAX that way
11:44	<mikeday>	and more convenient for integration with some input sources, eg. curl
11:44	<hsivonen>	I've been thinking if I should decouple the loop from the tokenizer on the Java side, but I'll probably go with the traditional java.io/org.xml.sax model on the inside
11:45	<mikeday>	you've got a few more options in Java, but fitting in with SAX makes perfect sense.
11:48	<hsivonen>	mikeday: just about all XML parsing in Java happens with the parser pulling stuff from java.io.InputStream instead of an app-owned loop pushing buffers to a parser
11:48	<mikeday>	yeah, most C libraries work like that as well
11:48	<mikeday>	then sometimes end up trying to hack in support for progressive parsing later
11:48	<mikeday>	I'm going to try and do it the other way around from the beginning, and see how far I get :)
11:50	<mikeday>	but starting with charset sniffing, as that's easier and doesn't involve input at all
11:50	<mikeday>	as you only apply it to the bytes that you already have in the buffer.
11:51	<hsivonen>	I'm a bit uncomfortable with the sniffing result depending on buffering
11:51	<mikeday>	tee hee, I found another bug in the spec: 0x3C 0x2D (ASCII '<!')
11:51	<hsivonen>	requiring the sniffer to read up to 512 until it finds the charset would be deterministic
11:52	<mikeday>	yes, mine will read up to 512 or EOF, whichever comes first.
11:53	<mikeday>	hmm, actually mine will sometimes read past 512 at the moment
11:53	<mikeday>	do you think it would be best to clamp it at 512, regardless of how much has been read?
11:55	<hsivonen>	I'd clamp to 512 to avoid results depending on buffering details.
11:55	<hsivonen>	(those are hard to track down when something goes wrong)
11:55	<mikeday>	right, sounds good.
11:55	<mikeday>	min(buf->size, 512) :)
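The clamping idea can be sketched as follows: whatever way the input arrives in chunks, the sniffer only ever looks at the first 512 bytes (or fewer at EOF), so the result cannot depend on buffering. The names `sniff_window` and `SNIFF_LIMIT` are hypothetical.

```python
SNIFF_LIMIT = 512  # the cap discussed above


def sniff_window(chunks, limit=SNIFF_LIMIT):
    """Collect bytes from an iterable of chunks, up to limit or end of input."""
    window = bytearray()
    for chunk in chunks:
        window += chunk
        if len(window) >= limit:
            break  # enough data; later chunks are never consulted
    # Clamp even if the last chunk overshot the limit.
    return bytes(window[:limit])
```

Reading may still overshoot 512 bytes internally, as mikeday notes his does, but the slice at the end makes the sniffed window identical for every chunking of the same stream.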
11:57	<mikeday>	hmm, with goto and macros you can make a decent state machine
11:57	<mikeday>	if you used labels-as-values gcc extension you could probably make it handle input buffering too
11:58	<mikeday>	basically defining a state machine mini-language in C
11:58	<hsivonen>	mikeday: do you compile with GCC on Windows?
11:58	<hsivonen>	or does MS support that GCC extension?
11:58	<mikeday>	I usually cross-compile Windows binaries on Linux using mingw32
11:58	<mikeday>	but I'll avoid using gcc extensions for now
11:58	<mikeday>	there's always someone out there not using gcc
12:01	<mikeday>	gcc also has case ranges: case 0x41 ... 0x5A:
12:01	<mikeday>	which would be quite handy in this case, as they would save me writing out 52 letters explicitly :/
12:02	<gsnedders>	mikeday: no, you write code to write tedious code :)
12:03	<mikeday>	s/me/me or an automation under my control/
12:03	<gsnedders>	:)
12:03	<mikeday>	actually, I'm curious to see if gcc generates any clever code for a switch statement on a byte value
12:04	<mikeday>	for example, does it generate a jump table by itself, or some other clever trick
12:09	<Philip`>	SELECT "a"||0; - SQLite says "a0", MySQL says 0, Postgres says column "a" does not exist; interoperably not great :-(
12:09	<Dashiva>	You need 'a' for postgres, sadly
12:10	<Dashiva>	I think oracle is the same
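The SQLite side of this is easy to check from Python's standard `sqlite3` module. The sketch uses the single-quoted form Dashiva mentions, since single quotes denote a string literal in standard SQL while double quotes denote an identifier. It shows SQLite's behaviour only; MySQL by default treats `||` as logical OR, which is where the 0 comes from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# 'a' is a string literal; || is the standard SQL concatenation
# operator, which SQLite implements, converting the integer to text.
(result,) = conn.execute("SELECT 'a' || 0").fetchone()
conn.close()
```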
12:10	<annevk>	SQL5 will save us
12:10	<Dashiva>	How many doctype states will -that- have ;)
12:11	<Philip`>	Dashiva: I'm guessing Postgres follows the standard, since MySQL doesn't and SQLite is slightly crazy
12:18	<Philip`>	Hixie: I think variable number of arguments is useful whenever you want to add some abstracted interface, like with
12:18	<Philip`>	function search(fields) {
12:18	<Philip`>	db.execute("SELECT * FROM stuff WHERE " + " AND ".join(name+" = ?" for name,value in fields), [ value for name,value in fields ]);
12:18	<Philip`>	}
12:18	<Philip`>	results = search([ ['name', n], ['age', 12], ['colour', c] ]);
12:18	<mikeday>	(where is this proposal to support SQL in JavaScript?)
12:18	<Philip`>	(Er, that's probably a broken mix of JS and Python, though I suppose I might be lucky and it could be valid JS1.7)
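For reference, the abstraction Philip` is describing looks roughly like this in plain Python against the standard `sqlite3` module. It is a sketch of the idea, not the proposed browser API: the `search` signature, the `stuff` table, and the sample rows are made up for illustration. The point it carries over is that the parameter list is built on the fly, which is straightforward when the API accepts an array of values.

```python
import sqlite3


def search(conn, table, fields):
    """Run SELECT * with one "name = ?" clause per (name, value) pair.

    Table and column names cannot be bound as parameters, so they must
    come from trusted code; only the values go through placeholders.
    An empty fields list is not handled in this sketch.
    """
    where = " AND ".join(f"{name} = ?" for name, _ in fields)
    params = [value for _, value in fields]
    return conn.execute(f"SELECT * FROM {table} WHERE {where}", params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stuff (name TEXT, age INTEGER, colour TEXT)")
conn.executemany(
    "INSERT INTO stuff VALUES (?, ?, ?)",
    [("ann", 12, "red"), ("bob", 12, "blue")],
)
results = search(conn, "stuff", [["name", "ann"], ["age", 12], ["colour", "red"]])
```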
12:19	<Philip`>	http://www.whatwg.org/specs/web-apps/current-work/multipage/section-sql.html
12:19	<annevk>	mikeday, http://html5.org/tools/web-apps-tracker?from=837&to=838
12:20	<mikeday>	"A future version of this specification may define the exact SQL subset required in more detail."
12:20	<mikeday>	now there's an understatement :)
12:21	<Philip`>	I think the only interoperable subset of SQL and of existing SQL implementations is likely to be the empty set
12:21	<Philip`>	Well, maybe "SELECT 1+1" would work
12:21	<Dashiva>	Philip`: Yeah, I finally found the part in SQL92 which mentions it
12:23	<mikeday>	"SELECT * FROM table" is probably more interoperable than 1+1
12:23	<Philip`>	I'd the practical effect would be that Firefox ships with support for the full SQLite syntax, so people write code assuming that, and other browsers have no choice but to use SQLite too, and then Microsoft would ship with Jet or whatever, and everyone would suffer pain, and then maybe someone would try to sort out the mess by defining a common subset but it'd be too late then
12:23	<Philip`>	*I'd guess
12:24	<annevk>	at that point we get "Web SQL 5"
12:25	<mikeday>	why bother? I mean honestly, is SQL the best method for doing this?
12:26	zcorpan	doesn't understand the use-case for client side SQL
12:26	<annevk>	"best" also includes what people are used to
12:26	<annevk>	I think it's pretty awesome that techniques people are used to on the server can now be easily reused client side
12:27	<annevk>	in theory you might even be able to share some code
12:27	<mikeday>	and in practice, you wouldn't
12:27	<annevk>	yeah, dunno about that either
12:27	<annevk>	although some of the database design can be reused
12:28	<Philip`>	Like using XForms to integrate server-side and client-side processing of data?
12:28	<annevk>	is there a better way to do relational data storage client side?
12:29	<mikeday>	hmm, once you say relational, no, not really
12:29	<Dashiva>	I'm not so worried about acid and all the fancy DB stuff, but SQL itself is a nice way to manipulate
12:29	<annevk>	which also includes support for querying etc.
12:30	<Philip`>	It seems there are lots of libraries that hide the SQL behind an object-like interface, so maybe it'd be worth looking at those
12:30	<mikeday>	it just doesn't seem very webby, to me.
12:31	<Philip`>	(Hmm, it doesn't look like there's even a common-subset way to concatenate strings...)
12:31	<mikeday>	HTML5 is complex enough without trying to inhale a common subset of SQL
12:31	<hsivonen>	I wonder what the disk footprint of a SQLite database is
12:31	<mikeday>	seems too far removed from "standardisation of what we already know"
12:31	<hsivonen>	considering that each domain needs its own
12:32	<mikeday>	geez, it's going to suck debugging random crappy websites in the future
12:32	<mikeday>	I mean keeping track of cookies was bad enough
12:32	<annevk>	the web evolves :)
12:32	<mikeday>	and writing eg. little screen scraping tools with wget or whatever
12:33	<mikeday>	evolution != progress, it can just mean random mutations, genetic drift, and pointless specialisation :)
12:33	<met_>	whow! there is a book already http://www.amazon.com/tag/sql5
12:33	<mikeday>	SQL: the peacock's tail of the web
12:34	<Philip`>	hsivonen: A SQLite database with one table and no data is 3072 bytes
12:34	<Philip`>	and 158 bytes after gzip
12:35	<mikeday>	hah, that's 158 bytes we won't see again in a hurry :)
12:36	<Philip`>	I bet you could do fun DoS attacks by JOINing a table with itself dozens of times
12:38	<mikeday>	presumably it's no more vulnerable to that sort of thing than JavaScript already is
12:38	<mikeday>	eg. running ackermann's function or whatever
12:40	<Philip`>	But the JS engines already provide timeouts so users can stop runaway scripts, and SQL engines might not
12:40	<Philip`>	(SQLite does, but it's marked as experimental)
12:41	<mikeday>	right, it's another weak point that would need to be checked
12:41	<mikeday>	is it in the spec purely because Mozilla support it?
12:43	<Philip`>	Mozilla doesn't appear to support it now (though they could easily add it (if they don't care about security, e.g. limiting to privileged content) since they're using SQLite)
12:43	<mikeday>	so who asked for it?
12:44	<annevk>	Developers, developers, developers!
12:44	annevk	doesn't know
12:44	<Dashiva>	You missed one developer
12:44	<mikeday>	hmm. Maybe I should ask for a complete POSIX implementation to be added to HTML5
12:44	<annevk>	Dashiva, he didn't care
12:44	<Dashiva>	Oh
12:44	<mikeday>	that way, people could run existing Linux binaries in the browser environment
12:44	<mikeday>	each domain could have a chroot style setup
12:44	<Philip`>	mikeday: That would solve the problem of making a web-based OS
12:45	<Dashiva>	Are we still talking about a world where Microsoft exists?
12:45	<mikeday>	with its own POSIX namespace, the usual APIs available, sockets, memory mapped files, etc.
12:45	<Philip`>	Just wait for virtualisation technology to advance a bit, and then it'll be fine to run a virtual machine per browser session
12:45	<mikeday>	or better yet, an x86 virtual machine would make more sense than a POSIX layer
12:45	<mikeday>	hah, you beat me to it
12:46	<mikeday>	then we could run arbitrary programs and old DOS games in the browser. take that, flash!
12:46	<Philip`>	Hmm, x86 is a pain - just make a new instruction set that's easy to virtualise, and port Linux to it
12:46	<Dashiva>	Why not just reuse JS?
12:46	<Dashiva>	We have JSON, next is JSOS
12:47	<Philip`>	Hmm, compile Linux into the instruction set for virtual machine written in JavaScript?
12:47	<Philip`>	+a
12:47	<Philip`>	Then just JIT it to the user's native instruction set, for optimal performance
12:48	<mikeday>	rather than having an SQL based storage system in the browser,
12:48	<mikeday>	you could have a virtual machine with a virtual IDE hard disk of configurable size
12:48	Philip`	wonders how many potential embedded-SQL uses could be handled just by adding B-tree indexes to globalStorage
12:48	<mikeday>	scripts can then "format" the virtual hard disk with their file system of choice, and address it at block-level
12:51	<Philip`>	I would be quite nervous looking at a web site that said "Formatting disk - please wait" and knowing that it was actually capable of formatting a disk, even if I was fairly certain it was isolated to the web browser
12:51	<mikeday>	how nervous would you be to see "Creating tables and building indices"
12:52	<Philip`>	I know how to protect myself from tables, so that's not a problem - I can just climb up the stairs and they can't follow because their legs won't bend enough
12:53	<mikeday>	touche :)
12:53	<hsivonen>	a whole new world of layout tables
12:54	<Philip`>	We could just adopt the proposal from http://sql4.by.ru/
13:03	<zcorpan>	we have good stuff in here for next year's april fool's joke
13:03	<Dashiva>	Next year it won't be a joke anymore :)
13:03	<mikeday>	why wait? After all, they didn't wait until April for the SQL proposal :)
13:04	<annevk>	:p
13:12	<mikeday>	ah well, that's enough spec bashing for now
13:12	mikeday	waves
13:13	<hsivonen>	Hixie: Re: IRC logs a week ago: Jukka Korpela is known to argue about the details of what it means to have a "sample" in the statistics sense.
13:18	<hsivonen>	of course, on the Web, it is impossible to have the kind of sample he means
13:18	<Lachy>	what kind of sample does he mean?
13:18	<Lachy>	he refused to explain when I asked and then insulted me for not having taken statistics 101
13:19	<Dashiva>	Well, intranet pages for one
13:21	<hsivonen>	Lachy: I think he means that first you identify a population and then you pick a sample at random so that each member of the population has an equal probability of appearing in the sample
13:22	<hsivonen>	Lachy: but on the Web, you cannot enumerate the population of pages and you cannot do uniform sampling
13:22	<hsivonen>	Lachy: one might argue, though, that a chunk of "important" pages from google is more useful than a proper sample of the same size
13:23	<Lachy>	doesn't taking a massive sample of several billion pages somewhat compensate for the problems of not being able to do that?
13:23	<hsivonen>	Lachy: but the argument wasn't about useful but about proper use of statistics terminology
13:23	<hsivonen>	Lachy: just don't call it a "sample" :-)
13:23	<Lachy>	well, what would you call it?
13:24	<hsivonen>	dunno. it has been a while since I've done statistics
13:24	<Philip`>	Why would taking a bigger sample make it any better, if the sample is still biased in some direction?
13:25	<hsivonen>	Philip`: well, if the bias is towards some notion of being more important
13:25	<hsivonen>	Philip`: then we can discuss what's important
13:25	<Philip`>	If you identified a billion pages and then chose a million at random to analyse in detail, you should get exactly the same results (except for incredibly rare events where you'd lose statistical significance)
13:26	<zcorpan>	Lachy: call it "a subset of the Web". :)
13:26	<Philip`>	(then the problem is in identifying the population of a billion pages, and the actual sampling is easy)
13:27	<hsivonen>	Philip`: the problem is that implicitly the infinite population of all Web pages is assumed
13:27	<Philip`>	(and using a bigger sample within that population wouldn't compensate for any problems in the choice of population)
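Philip`'s two points can be illustrated with a toy simulation. Everything here is made up for the example: a population of 100,000 "pages" in which only the 10,000 top-ranked ones use some feature, and a sampling frame consisting of the top-ranked half. A random sample drawn from the frame agrees closely with the whole frame, but both are off from the true population rate by a factor of two, and sampling more of the frame does not change that.

```python
import random

POPULATION = 100_000
# Hypothetical population: only the 10,000 top-ranked pages use a feature.
uses_feature = [rank < 10_000 for rank in range(POPULATION)]


def rate(pages):
    """Fraction of pages that use the feature."""
    return sum(pages) / len(pages)


true_rate = rate(uses_feature)            # 0.1 over the whole population

# Biased frame: the top-ranked half of the population.
frame = uses_feature[:50_000]
frame_rate = rate(frame)                  # 0.2 over the whole frame

# A uniform random sample from the frame tracks the frame, not the population.
rng = random.Random(2007)
frame_sample_rate = rate(rng.sample(frame, 5_000))
```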
13:28	<hsivonen>	hmm. according to wikipedia, a non-random subset is still a "sample" even if not a "random sample"
13:30	<Philip`>	Sampling an infinite space seems like it ought to be impossible (or at least not well defined) - if you chose a sample of n random positive integers, then the mean would be some value which is completely non-representative of the population because there'd be a finite number of numbers below it and an infinite number of numbers above it...
13:43	<annevk>	jgraham, I think we should be using insert(0, data) as opposed to append()
13:43	<annevk>	jgraham, for stream.queue
13:43	<annevk>	jgraham, especially if we start allowing injection
14:09	<annevk>	hehe
14:09	<annevk>	the Content-Type discussion is funny
14:10	<annevk>	the last e-mail from Jeff :)
14:29	<annevk>	What annoys me most about these HTTP guys is that they always tell you how things should be done, but they never make it actually happen
14:30	<annevk>	Browser vendors have tried and tried to fix the web, but it hasn't paid off and didn't work out very well either.
14:30	<annevk>	In fact, we're still trying to actively educate people and publish articles, etc. and what not and still lots of people get it wrong and will get it wrong.
14:32	<annevk>	(HTTP people also have this notion about asking the user. It's even in the HTTP specification. That's one of the worst possible models you can have and it has been widely ignored in practice...)
14:38	<Philip`>	Browsers still seem to ask the user if they really want to submit a form, or if they really want to look at a secure site, or if they really want to leave a secure site
14:38	<Philip`>	though the default seems to be to only ask once, and it's not like anybody reads those messages the first time anyway, so I'm not sure what the point is
14:39	<gsnedders>	Philip`: to claim you implement the spec
14:40	<annevk>	My XML tokenizer now passes the "tests" in http://www.w3.org/TR/xml/#sec-entexpand
14:41	<annevk>	Some bits are a bit hacky for my taste, but I suppose that can eventually be made cleaner
14:44	<annevk>	Philip`, yeah, I've the feeling that those messages are there to please the HTTP gods
14:44	<annevk>	At some point in the past my parents switched browser and were heavily confused with those modal dialogs wondering what they had done wrong.
14:46	gsnedders	is still attempting to get his mother to use a computer at all
14:52	<annevk>	After I hook this up with the treebuilder I need to start doing boring things... Such as making testcases :(
14:52	<annevk>	Maybe I should open source it :)
15:04	<Dashiva>	I liked that dialog box
15:04	<Dashiva>	We should have more of those. "Do you want this page to render correctly?"
15:05	<annevk>	"Do you want to render this page per spec?"
15:09	<gsnedders>	"Do you want to render this page in a way completely incompatible with the web?"
15:09	<annevk>	"Do you want to use a browser that doesn't follow the spec but does render this page?"
15:10	<gsnedders>	Does anyone have any documentation of the major incompatibilities between HTTP/1.1 and HTTP in the real world?
15:10	<annevk>	No
15:12	<gsnedders>	ergh. does anyone know if you need to do hysterics involving CR or LF where the spec says you should have CRLF?
15:27	<hsivonen>	annevk: are html5lib tokenizer tests independent of tree builder?
15:30	<annevk>	hsivonen, yes, although you have to implement some logic to run them
15:30	<annevk>	hsivonen, they may also assume a particular implementation
15:30	<hsivonen>	annevk: what kind of assumptions?
15:33	<annevk>	that end tags have their attributes not dropped yet I believe
15:33	<annevk>	some things are done during the tree builder step
15:34	<annevk>	but just go through them yourself, it should not be that hard to modify the problematic ones
15:34	<hsivonen>	eww. do the end tag attributes ever get anything but ignored?
15:34	hsivonen	doesn't remember the spec doing anything with end tag attributes
15:39	<annevk>	hsivonen, we don't check whether it's an end or start tag when appending attributes
15:54	<annevk>	Philip`, you mentioned <canvas> perf earlier on? Do you have standalone testcases for that?
15:54	<annevk>	Philip`, Having standalone testcases makes it easier to improve the situation
16:52	annevk	removes lots of HTML specific stuff out of base.py in his XML5 project
16:53	annevk	tries to align the architecture as much as possible so it remains relatively easy to do similar things
17:08	<hsivonen>	what's the deal with this ship names thing?
17:08	<hsivonen>	is there a good use case for needing to know that a string is a ship name?
17:09	<hsivonen>	or is it just a popular contrived example of something that is italicized but doesn't have a more appropriate element than <i> in HTML?
17:09	<Dashiva>	It's apparently the best example of semantic non-emphasis italic
17:09	<hsivonen>	right.
17:10	<hsivonen>	and presumably ship names are a common case?
17:10	<Dashiva>	Personally I'd guess most people wouldn't italicize them at all
17:55	<Philip`>	annevk: I don't have any intentional tests - I've just played around a bit with http://canvex.lazyilluminati.com/misc/speed/benchmark.html but I don't think any of those cases are actually representative of what makes Canvex slow (e.g. I no longer see any real-world performance difference using drawImage(img) vs drawImage(canvas) because it's probably drowned out by other problems)
17:58	<Philip`>	I should probably set up some mode in Canvex where it starts in an interesting place looking at complex stuff, so I could run it in a profiler and see what's actually slow nowadays
20:38	<Hixie>	the SQL stuff's main use case is offline Web apps -- big web apps like say an RSS reader that wants to support working while offline needs way more than the globalStorage stuff
20:39	<Hixie>	i'm very aware of the problem with sql interoperability :-(
20:39	<Hixie>	i hope we'll be able to have implementations in the coming 6-12 months; if we then have 2 implementations we can define a common subset and lock that down
20:40	<virtuelv2>	Hixie: I don't think interoperability should be a problem if you specify the exact subset of SQL supported by SQLite
20:40	<Hixie>	(in practice i would expect it to be very similar to SQLite, since safari and firefox are both already using it, as i understand it)
20:41	<Hixie>	hsivonen: regarding samples, note that he never replied to my e-mail where i explained exactly how i defined a fixed sampling frame before taking a sample
20:41	<virtuelv2>	... since SQLite is placed in the public domain, http://sqlite.org/copyright.html
20:42	<Hixie>	yeah
20:42	<virtuelv2>	and they have this interesting take on patents, http://programming.reddit.com/info/1eypf/comments
20:42	<othermaciej>	being in the public domain is actually a little more dodgy than copyright with a liberal license
20:43	<virtuelv2>	othermaciej: you're thinking of the fact that all jurisdictions don't recognize PD?
20:43	<virtuelv2>	They offer licensing for that purpose
20:45	<othermaciej>	virtuelv2: that and because anyone can copyright even a trivial derivative work of a PD work in full, thereby potentially making the licensing issues more murky
20:45	<gavin>	if it's in the public domain, who are they to issue licenses?
20:46	<gavin>	isn't copyright in this case non-existent?
20:46	<virtuelv2>	gavin: because all contributors have to sign off the code to the PD
20:46	<virtuelv2>	you can relicense PD code if you please, as othermaciej noted
20:46	<gavin>	the copyright is what I'm wondering about
20:47	<gavin>	if they say they have no copyright to the code, I don't see how they can license it
20:47	<gavin>	you can only license something if you have the copyright
20:47	<virtuelv2>	gavin: they can, because they are free to do so under the same license they're distributing it
20:47	<virtuelv2>	you can do the same
20:48	<gavin>	maybe I should go speak to a copyright lawyer
20:48	<gavin>	wonder how much that'll cost me :)
20:48	<virtuelv2>	gavin: make sure you do in a country that accepts the PD
20:48	<virtuelv2>	Some countries don't
20:48	<gavin>	I don't see how that matters
20:49	<hsivonen>	I once had a boss who doubted our ability to waive copyright in Finland. So we used the MIT license. I have talked to copyright lawyers since and none of them saw a problem with waiving copyright.
20:49	<virtuelv2>	ok, let's say I place Item A in the public domain in country Foo
20:49	<gavin>	I have no stake in this, I'm just curious
20:49	<othermaciej>	make sure you talk to a lawyer that understands the copyright law of countries that accept the public domain
20:49	<othermaciej>	that would be the right way to put it
20:49	<gavin>	sure
20:49	<gavin>	ah, ok, I see
20:49	<gavin>	(what virtuelv2 meant)
20:50	<virtuelv2>	if a user in country Bar, that doesn't accept PD, gets my Item A, the legal system in Country Bar will assert that I own copyright of Item A
20:50	gavin	didn't intend to start a discussion about copyright law
20:50	<gavin>	I should have kept my curiosity to myself :)
21:01	Philip`	still can't imagine any two SQL implementations being close to interoperable, unless they're two versions of exactly the same database engine and then it's still fairly dodgy (e.g. SQLite 2 vs SQLite 3 being very different; though from what I've heard, v3 is considered pretty much complete and there's not going to be a v4, though I could be utterly wrong in remembering that)
21:03	<Dashiva>	Well, cut down on the utility functions and fancy frills, and it gets a lot better
21:03	<Philip`>	Oh, it was http://osdir.com/ml/python.db.pysqlite.user/2006-12/msg00015.html saying "drh does not expect there to ever be a SQLite 4"
21:03	<Philip`>	What about e.g. types?
21:03	<Philip`>	like, SQLite doesn't really have them, but other databases do
21:03	<Philip`>	which seems a fairly major interoperability concern
21:04	<Hixie>	Philip`: we couldn't imagine two html implementations being close to interoperable, but now we have html5 and the goal is in sight :-)
21:05	<Dashiva>	Maybe clarify what kind of interoperability is the worry. Developer experience/lockin, or machine/application interop?
21:05	<Philip`>	The people implementing HTML want it to be interoperable, whereas the people implementing SQL have never really cared and there's not much reason to think they ever will, and the people implementing HTML probably don't want all the work of reimplementing SQL too :-)
21:05	<Philip`>	Uh, "The people implementing HTML wanted HTML to be interoperable, ..."
21:05	<met_>	Philip` there are people building apps with the same SQL working on MSSQL and Oracle too, but it is only a small subset of all functionality
21:07	<Dashiva>	And on a pragmatic note, reality survives with the current set of SQL engines. Why would that change?
21:09	<Philip`>	I don't remember having seen any applications that work on more than one database, without them having put significant effort into writing multiple sets of database backend code
21:09	<Hixie>	we'll see. it may well be that we simply can't have interoperability here, and we have to drop it.
21:09	<Dashiva>	I don't imagine there are too many web applications running SQL on the client side yet
21:10	<Philip`>	e.g. Trac at http://trac.edgewall.org/wiki/DatabaseBackend eventually got support for SQLite+MySQL+Postgres but it took a long time
21:10	<Hixie>	one thing though -- if the first two implementations use SQLite, same version, and then we get widely distributed code out there that uses it, that will greatly increase the cost of someone writing a non-compatible version.
21:12	<Dashiva>	Well, look at mozilla expanding JS, who's to say they won't do the same to their SQL?
21:13	<Dashiva>	they being browser vendors in general, not just moz
21:17	<Hixie>	they're expanding JS in conjunction with the ECMA working group
21:22	<Philip`>	If the first implementations do use SQLite and everyone else has to too, it sounds like the specification will simply (though maybe not explicitly) require all implementors to use the SQLite implementation, and it doesn't really sound very spec-like to say "you must accept the dialect of SQL spoken by SQLite" (but it wouldn't be good to not say that, if it's true)...
21:24	<Philip`>	I'm not sure if there's anything particularly wrong with everybody in the world using SQLite, if Microsoft would accept that too instead of going their own way with a different SQL implementation (assuming they ever bother with this at all), but it seems odd
21:25	<Hixie>	Philip`: no, we'd define it in detail, not like that
21:26	<Hixie>	it would just happen to be compatible with the implementations in the wild, whether that be SQLite or whatever
21:26	<Philip`>	What would happen with old content that came before it was defined, and assumed all of SQLite's behaviour?
21:27	<Hixie>	the idea would be to define it such that existing content works
21:27	<Philip`>	Just doing "CREATE TABLE t ( n )" is perfectly normal in SQLite but I don't believe it works in any other database at all (since they all require columns to have types)
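The untyped-column point can be checked against SQLite itself. A minimal sketch using Python's standard `sqlite3` module (the table name `t` and column `n` come from the message above; the sample values are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Column "n" is declared without any type, exactly as in the chat example.
conn.execute("CREATE TABLE t ( n )")

# An untyped column applies no conversion, so each value keeps its own type.
conn.executemany("INSERT INTO t VALUES (?)", [(1,), ("two",), (3.5,)])

stored = [
    (value, type(value).__name__)
    for (value,) in conn.execute("SELECT n FROM t ORDER BY rowid")
]
print(stored)
conn.close()
```

Content written against this behaviour would depend on it, which is the interoperability worry being raised.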
21:28	<Hixie>	then if SQLite is what is used, then we will probably end up defining that
21:28	<gsnedders>	ergh. I give up with people claiming <i> is deprecated.
21:29	<Philip`>	If someone isn't going to use SQLite for whatever reason, would they implement something that's compatible with it (or with the definition of it from the spec), rather than just sticking in whatever other SQL engine is handy because it'll save them years of work?
21:30	<Hixie>	if they want to be compliant and want to work with existing content, yes
21:30	<Hixie>	hence the desire to have existing content depend on something before we spec it in detail
21:30	<Hixie>	since that rather forces the issue
21:36	<Philip`>	What'd be really neat is if you could use SQLite's virtual tables, where you implement the table backend in JavaScript and could make it do synchronous XHR requests to retrieve the data to return to the client code when it performs queries...
21:38	<Philip`>	(http://www.sqlite.org/cvstrac/wiki?p=VirtualTables etc)
21:42	<Hixie>	yeah
21:53	<virtuelv2>	2
21:53	<virtuelv2>	heh, this is odd
21:53	<virtuelv2>	Ubuntu REFUSES to mount my iPod
21:54	<virtuelv2>	augh, wrong channel
21:59	Philip`	assumes you wouldn't have to use transactions if you were doing everything in a single JS function (rather than spread across multiple event handlers), since JS has to execute single-threadedly so there wouldn't be concurrent database accesses, which is nice because it prevents some obvious errors
22:04	<Philip`>	Why does ResultSet have getName and not specify what order or what set of fields should be returned, rather than just saying it returns the same fields in the same order as in the SELECT statement (and then not needing getName because the user already knows)?
22:07	<Philip`>	getName throwing exceptions if there are no results sounds a bit annoying, because it means you couldn't do "r = executeSql(...); i = r.getName('field'); for (; r.validRow; r.next()) { dostuff(r[i]) }"
22:09	<Philip`>	insertId doesn't say what happens if multiple rows were inserted
22:13	<Philip`>	I expect people writing code would like to be able to find out how much disk space they're using and how much is available, so they can present it to the user nicely and can clean up old data if it's getting full, rather than waiting until they get an exception at an unexpected time
22:14	<Hixie>	Philip`: you don't know the rows in a SELECT * statement
22:14	<Hixie>	can you insert multiple rows?
22:14	<Hixie>	yeah, i imagine quota management might be something we'll do in a v2
22:14	<Philip`>	I think "executeSql('BEGIN'); executeSql('INSERT INTO table VALUES (?)', lots_of_data_so_we_run_out_space); executeSql('COMMIT')" would break confusingly since JS exceptions won't roll back the transaction
22:14	<Hixie>	it interacts with the other storage things
22:15	<Philip`>	Oh, I forgot about SELECT *
22:15	<Hixie>	why won't they?
22:16	<Philip`>	You can do e.g. "INSERT INTO b SELECT * FROM a" to copy a whole table at once
22:16	<Hixie>	hm
22:17	<Hixie>	what should insertId return then?
22:17	<Philip`>	I don't think they can roll back the transaction without knowing that the exception is passing out of the transaction's scope without being caught
22:17	<Philip`>	and since the transaction is defined by a pair of BEGIN and COMMIT commands, it doesn't have a scope that the browser could know about
22:18	<Hixie>	oh, well sure, if you're going to do commit/rollback transactions, you should use exception handlers around what you're doing
22:18	<Hixie>	i think you misunderstood getName, btw
22:19	<Philip`>	You could write a transaction wrapper and run "doTransaction(function(){ executeSql(...) })" and it can do the BEGIN/COMMIT/catch/ROLLBACK/throw stuff, I suppose
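The wrapper idea (BEGIN, run the callback, COMMIT, or ROLLBACK and re-throw on an exception) can be sketched as follows. This is an illustration in Python's `sqlite3`, not the draft `executeSql` API; `do_transaction` and `failing_work` are hypothetical names:

```python
import sqlite3


def do_transaction(conn, work):
    """Run work(conn) between BEGIN and COMMIT; roll back and re-raise on error."""
    conn.execute("BEGIN")
    try:
        result = work(conn)
    except Exception:
        conn.execute("ROLLBACK")
        raise
    conn.execute("COMMIT")
    return result


# isolation_level=None puts the module in autocommit mode, so the explicit
# BEGIN/COMMIT/ROLLBACK statements above are the only transaction control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (n)")


def failing_work(c):
    c.execute("INSERT INTO t VALUES (1)")
    raise RuntimeError("simulated failure mid-transaction")


try:
    do_transaction(conn, failing_work)
except RuntimeError:
    pass  # the wrapper re-raised after rolling back

rows_after_failure = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]

do_transaction(conn, lambda c: c.execute("INSERT INTO t VALUES (2)"))
rows_after_success = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
conn.close()
```

Because the transaction's scope is the callback, the wrapper knows when an exception escapes it, which bare BEGIN/COMMIT strings cannot tell the browser.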
22:19	<Hixie>	your example would be r = executeSql(...); for (; r.validRow; r.next()) { dostuff(r['field']) }
22:20	<Hixie>	there is in fact no way to get an index from a string
22:20	<Philip`>	SQLite's last_insert_rowid() returns the last row ID that was inserted in the current connection - if insertId is supposed to be per-resultset instead, I'm not sure how you could implement that, but it'd probably end up returning the last inserted row ID from that query
22:20	<Hixie>	k
22:21	<Philip`>	Oops, yes, I got getName the wrong way round
22:22	<Hixie>	(well i guess you can get an index from a string if you loop over the fields looking at each name in turn)
22:22	<Hixie>	I fixed the insertId thing.
22:22	<Hixie>	you're in the acknowledgements already right?
22:23	<Philip`>	I'd guess doing string->index might be useful for efficiency, if you're getting lots of rows and don't want the hash lookup cost for each one, but maybe it's not worth caring about
22:23	<Philip`>	I am
22:23	<Philip`>	(I'm famous!)
22:24	<Hixie>	christ, between twitter being down and the mail server being twitchy the people watching twitter and commit-watchers aren't going to get much of a good look at the changes today
22:25	<Philip`>	I'm not sure how you could implement insertId so it works with "INSERT INTO t (id) VALUES (123) /* insertId = 123 */; DELETE FROM t /* insertId undefined */; INSERT INTO t (id) VALUES (123) /* insertId = 123 */", given SQLite's last_insert_rowid (which will (I think) give exactly the same value after each of those statements)
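The claim about `last_insert_rowid` can be tried directly. A small sketch with Python's `sqlite3`, assuming `id` is declared `INTEGER PRIMARY KEY` so that it aliases the rowid (the chat does not say how `t` was created); `last_id` is a helper name chosen here:

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
# Assumption: id is the rowid alias, so an inserted id of 123 is the rowid.
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")


def last_id():
    return conn.execute("SELECT last_insert_rowid()").fetchone()[0]


observed = []
conn.execute("INSERT INTO t (id) VALUES (123)")
observed.append(last_id())
conn.execute("DELETE FROM t")  # a DELETE does not reset last_insert_rowid()
observed.append(last_id())
conn.execute("INSERT INTO t (id) VALUES (123)")
observed.append(last_id())
print(observed)
conn.close()
```

The value is identical after all three statements, so it cannot by itself tell "no row inserted by this statement" apart from "row 123 inserted".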
22:27	<Hixie>	dunno, we'll see what implementors say
22:27	<Hixie>	maybe we'll just have to have insertId return the last inserted row at the point the result set was created
22:28	<met_>	In MS SQL the second insert will be 134 (dunno about SQLite)
22:28	<Hixie>	or they'll fix SQLite
22:28	<Philip`>	Could just drop it entirely and use executeSql("SELECT last_insert_rowid()")[0]
22:30	<Hixie>	when writing database code i always wish i could just do |var x = executeSql("INSERT ...").insertId;|
22:30	<Philip`>	(and have that be per browsing context, since it's per connection and each browsing context can have a separate connection)
22:32	<Philip`>	var x = executeSql("INSERT ...; SELECT last_insert_rowid()")[0] isn't much worse
22:32	<Hixie>	but it is worse :-)
22:32	Hixie	ponders whether pushState() should always require a URI and title, or if it should allow state to be pushed without a URI
22:32	<met_>	Philip` why not per connection? it should be for one connection
22:32	<Hixie>	bbiab
22:33	<Philip`>	met_: Not quite sure what you mean
22:34	<met_>	you said last_insert_rowid() should return last id from whole context not from one connection?
22:37	<Philip`>	I meant that if you did "executeSql("INSERT ..."); setTimeout(function(){ executeSql("SELECT last_insert_rowid()") }, 1000)" (and nothing else happens on that page) then it should return the rowid from that first INSERT, regardless of whether you've got the same site open in a different tab or window and are inserting new rows over there
22:38	<met_>	yes I agree, I misunderstood it
22:38	<Philip`>	which would require browsers to have a separate SQLite-connection in order to keep that separation
22:38	<Philip`>	Uh
22:38	<Philip`>	*a separate SQLite-connection per browsing context
22:38	<met_>	yes
22:39	<met_>	does SQLite have some locking mechanism for this?
22:39	<met_>	for multiple connections accessing one table
22:43	<Philip`>	You can open multiple connections (as long as the database is stored on disk, not in memory, I think), so each will have its own last_insert_rowid and stuff, and you can do some coarse-grained (per-table reader-writer) locking to stop transactions interfering
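That each connection keeps its own `last_insert_rowid` can be shown with two connections to one on-disk database. A sketch in Python's `sqlite3`; the names `context_a` and `context_b` stand in for two browsing contexts and are purely illustrative:

```python
import os
import sqlite3
import tempfile

# An on-disk database shared by two independent connections.
path = os.path.join(tempfile.mkdtemp(), "contexts.db")
context_a = sqlite3.connect(path, isolation_level=None)
context_b = sqlite3.connect(path, isolation_level=None)

context_a.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
context_a.execute("INSERT INTO t (id) VALUES (10)")
context_b.execute("INSERT INTO t (id) VALUES (20)")  # the "other tab" inserts later


def last_id(conn):
    return conn.execute("SELECT last_insert_rowid()").fetchone()[0]


# Each connection reports only its own most recent insert.
id_seen_by_a = last_id(context_a)
id_seen_by_b = last_id(context_b)

context_a.close()
context_b.close()
```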
22:44	<Philip`>	and JavaScript's singlethreadedness should avoid most of the real concurrency issues
22:44	<met_>	fine
22:44	<Philip`>	(I think people would like finer-grained locks, but it'll only be implemented if they can find a mechanism that was described long enough ago for any patents to have expired)
22:45	<met_>	how to handle the case in JavaScript when one browser window locks some table and a second window is in a que? shouldn't there be some timeout, so as not to freeze so many windows?
22:46	<met_>	*queue
22:48	<Philip`>	The browser can set SQLite's busy_timeout, so if two pages both BEGIN EXCLUSIVE then the second will time out after 5 seconds (or whatever) and presumably the browser would then raise a JavaScript exception
22:48	<Philip`>	Oh, actually, no
22:48	<met_>	busy_timeout will be useful
22:49	<Philip`>	Since JS runs single-threaded, you wouldn't want to bother with timeouts because there isn't any other thread that's going to release the lock later
22:49	<met_>	and what about 2 browser windows?
22:49	<Philip`>	Both browser windows still run all the scripts in the same thread (as far as I'm aware)
22:49	<met_>	these are not necessarily in one JS thread
22:50	<met_>	or is it some recommendation for implementors?
22:50	<Philip`>	HTML5 says "... the HTML scripting model is strictly single-threaded and not reentrant"
22:51	<met_>	and does this really concern multiple windows?
22:51	<Philip`>	So you could still have one page that does BEGIN EXCLUSIVE and then returns control to the browser, then a second page comes along and tries BEGIN EXCLUSIVE but it would immediately fail because the table's already locked (so you set the busy timeout to 0, I guess)
22:52	<Philip`>	which is kind of irritating since a single page can jam the database until you restart your browser
22:52	met_	remembers only Firefox having JS threads via XPCOM
22:52	<Philip`>	Oh, no, it only jams the database until you close the page which had the lock
22:52	<Philip`>	(since that page has its own connection, and locks belong to connections)
22:53	<met_>	or there can be some max_time_limit for one sql request
22:53	<Philip`>	I think multiple windows still have to all be in the same single thread (or at least appear exactly as if they are), because they can interact with each other in various ways
22:53	<met_>	if it takes too long db should stop it
22:55	<Philip`>	It doesn't have to be a slow request - you could just do "<script>executeSql('BEGIN EXCLUSIVE')</script>" and the request would finish quickly, but the database would still be locked
22:55	<Philip`>	(Oops, I said it was per-table but actually it's per-database)
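The locking scenario described here (one page holds BEGIN EXCLUSIVE, a second page fails immediately with a zero busy timeout, and the lock goes away when the first connection closes) can be sketched with Python's `sqlite3`. The names `page_a` and `page_b` are illustrative stand-ins for two pages, and `timeout=0` is this module's way of setting a zero busy timeout:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

# timeout=0: fail at once instead of waiting for a lock to be released.
page_a = sqlite3.connect(path, timeout=0, isolation_level=None)
page_b = sqlite3.connect(path, timeout=0, isolation_level=None)
page_a.execute("CREATE TABLE t (n)")

# The first page takes the lock and then simply stops issuing statements.
page_a.execute("BEGIN EXCLUSIVE")

try:
    page_b.execute("BEGIN EXCLUSIVE")
    second_lock = "acquired"
except sqlite3.OperationalError as exc:
    second_lock = str(exc)  # the whole database is locked, not just one table

# Closing the first connection drops its open transaction and its lock.
page_a.close()
page_b.execute("BEGIN EXCLUSIVE")
lock_after_close = "acquired"
page_b.execute("ROLLBACK")
page_b.close()
```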
22:57	<Hixie>	yeah the spec already describes the threading model in detail
22:58	<Hixie>	though i can't find it now
22:58	<Hixie>	hm
22:58	<Philip`>	http://www.whatwg.org/specs/web-apps/current-work/#threads ?
22:58	<Hixie>	oh there we go
22:58	<Hixie>	4.1.4
23:04	<met_>	if I read the browsing context definition correctly, it looks like if I have two independent browser windows with www.example.com, these are 2 browsing contexts and do not know about each other, so the single-thread rule doesn't apply to them
23:04	<Hixie>	correct
23:05	<Hixie>	bbiab
23:05	<Philip`>	Oh, in that case it's not true that the database access is single-threaded and easy
23:06	<met_>	yes
23:07	Philip`	wonders what happens if there are two unrelated browsing contexts, one in a window named "A" and one in a window named "B", and simultaneously the first does "while (1) window.open(uri, 'B');" and the second does "while (1) window.open(uri, 'A');"
23:10	<Dashiva>	One would win
23:11	<Dashiva>	Unless you're somehow managing to really run both at once using two processors and a multithread-script UA or somesuch
23:12	<Philip`>	The threading model in the spec allows them to be in separate threads on separate processors, as far as I can tell
23:13	<Philip`>	I guess a sensible implementation would just abort one of the scripts when it's forcibly navigated away from, since it's the same as if one page does "while (1) {}" and the user tries to hit the back button to get away
23:15	<MasterLexx>	heho
23:16	<Philip`>	(Or maybe a sensible implementation would do everything in a single thread anyway, since it's not like people tend to look at multiple unrelated pages which are simultaneously using the full CPU to execute JS)
23:16	<Philip`>	Good evening
23:16	<MasterLexx>	html5?
23:17	<Philip`>	Yes?
23:17	<MasterLexx>	i just read about it, is it only because of backwards compatibility and the "xml must be 100% valid" problem?
23:17	<Dashiva>	It's not a 'only', no
23:18	<MasterLexx>	i am currently using xhtml 1.0 strict for all my small websites, so tell me, what is the future? xhtml2 or xhtml5?
23:18	<Dashiva>	Well, browsers are implementing parts of xhtml5 already
23:19	<MasterLexx>	i have read a bit here and there, but can't find many examples of html5, and i am no technician, so i don't understand all that documentation with its abstract text
23:19	<Dashiva>	It's a work in progress
23:19	<MasterLexx>	is there a site where i can see a comparison of xhtml2 and 5 and 1?
23:20	<MasterLexx>	okay, and html5 will have all the elements of html 4.01 strict? or on what does it build up on?
23:21	<Dashiva>	It builds on the existing web
23:22	<MasterLexx>	.....
23:22	<Philip`>	http://wiki.whatwg.org/wiki/Changes_from_HTML4 lists some differences between HTML4 and HTML5
23:22	<MasterLexx>	thx
23:24	<hasather>	Hopefully, this will be a success: http://yodel.yahoo.com/2007/05/22/one-small-step-for-email-one-giant-leap-for-internet-safety/
23:29	<MasterLexx>	is the stupid target attribute removed in html5?
23:30	<MasterLexx>	as far as i read, frames arent supported
23:32	<Hixie>	Philip`: in that kind of UA, i would imagine that the ua would not determine "that the two browsing contexts are related enough that it is ok if they reach each other"
23:48	<Philip`>	Hixie: Ah, didn't see that bit - makes sense