#whatwg on 2008-06-08

01:21	<MikeSmith>	jgraham_, annevk: thanks for taking time to reply to Laura
01:22	<Dashiva>	I was reading about Jack Thompson, and then switched to reading www-archive, and suddenly something clicked. :)
01:24	<Hixie>	http://jarvklo.se/ has a number of html5 compliance problems
01:25	<Hixie>	use of <br>, wrong alt="" text in particular
01:25	<Hixie>	(and the use of a table is dubious, but he knows that)
01:26	<Dashiva>	Is implicit start tag + explicit end tag allowed in both HTML4 and HTML5?
01:26	<Hixie>	sure
01:27	<Philip`>	The attempt at defining "conformance" as something other than "validity" seems to have failed, because everyone still thinks conformance is about machine-checkable syntactic validity
01:27	<Dashiva>	Are there any browsers that don't automatically check /favicon.ico? Seems a bit redundant
01:27	<Philip`>	(or at least the author of that page appears to, so that's close enough to "everyone")
01:29	<Philip`>	(Hence, non-machine-checkable conformance criteria are a waste of time, because even someone explicitly intending to write conforming HTML5 ignores them)
01:31	<Dashiva>	I wonder if he's for or against mandatory alt
01:32	<Philip`>	Maybe neither
01:32	<Hixie>	sorry. cat.
01:32	<Dashiva>	No such option. If he isn't for it, he's against it. And also a terrorist. :P
01:33	<Hixie>	Philip`: i disagree. i think many people agree that using <h1> for a paragraph is wrong. (or vice versa)
01:34	<Dashiva>	Anecdote: In introductory IT class, we learned basic HTML. One of my classmates wrote <h1>Headline<h1> \n <h2>First line of text<h2> \n <h2>Second line of text<h2> ...
01:35	<Dashiva>	And it actually worked. Somehow.
01:36	<Hixie>	define "work"
01:36	<Hixie>	i mean the DOM is gonna look horrible.
01:36	<Hixie>	and you won't get a useful outline out of it
01:38	<Dashiva>	Worked as in there was no obvious sign she was doing something horribly wrong
01:39	<Philip`>	Hixie: People also agree that writing pages with spelling mistakes is wrong, but that doesn't mean they think the spec should require correct spelling - things that are 'wrong' in the sense of being a poor quality document don't have to be forbidden by the markup-language spec
01:40	<Dashiva>	So Robert said "Right now our process and procedures are ripe for gaming." -- Am I the only one seeing irony here?
01:41	<Philip`>	Dashiva: He seems to be right, given that the WHATWG process has involved multiple members playing GTA4
01:42	<Hixie>	Philip`: the english spec does require correct spelling, just like unicode requires certain things with respect to characters
01:42	<Hixie>	Philip`: but both of those are out of scope of html itself
01:43	<Hixie>	Philip`: however correct use of html seems in scope of html. :-)
01:43	<Philip`>	Hixie: I think I missed the news when someone wrote a spec for English :-)
01:43	<Hixie>	Dashiva: yeah, the signs of making a mistake for that case are hard to see
01:43	<Hixie>	Philip`: http://ian.hixie.ch/bible/english
01:45	<Philip`>	Hixie: That's a useless spec since there aren't two independent implementations of it yet
01:46	<Hixie>	:-P
01:48	<Dashiva>	en-hixie is like ooxml. It doesn't provide enough information for complete compatibility with existing implementations
01:51	<Philip`>	There's not even any definition of error hndlanig bhaivueor
02:28	<takkaria>	Wittgenstein wouldn't be happy if it was defined
02:30	<Hixie>	i am not qualified to define the error handling
02:30	<Hixie>	i understand it's actually a really complicated problem
02:34	<Philip`>	takkaria: Wittgenstein is dead, so it's not like he's going to raise a Formal Objection
02:36	<takkaria>	as we've seen, just because there are no formal objections doesn't mean the spec is good. :)
02:39	<Philip`>	We've also seen that people will be unhappy regardless of whether the spec is good or not, so Wittgenstein's unhappiness would mean nothing
02:42	<takkaria>	k, but my point was mainly that specifying spoken languages is a bad idea
02:44	<Philip`>	I'll admit that deployment is a hard problem
02:46	<takkaria>	it's less of the deployment and more that you would have an infinite regress of specifications
02:47	<takkaria>	because to specify english, you would have to specify it with some other language; and whatever that language is, you'd have to specify that too; and so on
02:51	<Philip`>	If that was true, the same argument would apply to any other specification ever written, since they are all written in a language which would itself need to be specified
02:56	<takkaria>	yes, but if you set out to specify a general language, specifying it in some other general language seems fairly pointless, because that language itself then needs to be defined in some way
02:58	<takkaria>	look at Laura Carlson's replies on www-archive, for instance; she wants to define "broad consensus" and similar terms
03:00	<takkaria>	it doesn't seem to me like you can define that satisfactorily except with other words
03:01	<takkaria>	I think my general point, having thought about it a little more, is that HTML et al. are built on top of natural languages, so using them to define it works, but defining natural languages themselves has nothing to build on top of
03:05	<Philip`>	Dictionaries seem to be a counterexample, since they define words by using words, and experience shows they are successful at letting people understand words
03:06	<takkaria>	but they don't specify exhaustively in the same way that e.g. HTML5 tries to
03:06	<takkaria>	and a language defined in the language itself is rather self-referential
03:06	<Philip`>	If you read an entire dictionary, I think you will be sufficiently exhausted
03:11	<takkaria>	:)
05:00	<takkaria>	see, it's easier to do that than you think
12:41	<gsnedders>	I probably ought to try and clean up the markup of http://stuff.gsnedders.com/tri.html
13:11	<jgraham_>	So is it expected that input like
13:11	<jgraham_>	<!DOCTYPE HTML>
13:11	<jgraham_>	<meta charset=iso8859-2">
13:11	<jgraham_>	will now prescan to a character encoding of iso8859-2?
13:11	<jgraham_>	(this is not the behaviour expected by the html5lib tests but I think it is the right behaviour per spec)
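
A minimal sketch of why the stray quote in the input above could still resolve to ISO-8859-2, assuming (as jgraham_ explains later in this log) that punctuation is stripped from the label before the encoding is looked up. The table and the function name `lookup_encoding` are hypothetical illustrations, not the spec's algorithm or html5lib's code.

```python
import string

# Hypothetical, deliberately tiny table: normalized label -> canonical name.
KNOWN_ENCODINGS = {
    "iso88592": "ISO-8859-2",
    "utf8": "UTF-8",
    "windows1252": "windows-1252",
}

# Translation table that deletes ASCII punctuation and whitespace.
_STRIP = str.maketrans("", "", string.punctuation + string.whitespace)


def lookup_encoding(label):
    """Return the canonical name for a label, or None if it is unknown.

    Assumption: matching ignores case, punctuation and whitespace, so
    'iso8859-2"' and 'ISO-8859-2' normalize to the same key.
    """
    key = label.translate(_STRIP).lower()
    return KNOWN_ENCODINGS.get(key)


# The unquoted attribute value in the example is iso8859-2" (with the quote).
print(lookup_encoding('iso8859-2"'))
```

Without the stripping step the trailing quote would stay in the key and the lookup would fail, which matches the older behaviour the html5lib tests expected.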
13:45	<jgraham_>	http://james.html5.org/temp/html5lib-0.11.zip rc3 (I think that the issue smedro encountered on the Mac may still be there. I'll see if I can reproduce it)
14:07	jgraham_	wonders if the issue is related to 2 byte v 4 byte unicode builds of python
14:56	gsnedders	gets off his lazy ass and starts to write comments on XHR
15:15	<dolphinling_>	hsivonen: When I run an image map through validator.nu, I get "Error: Attribute name not allowed on element map at this point.", looks like bug 243 wasn't fully fixed
15:36	gsnedders	sends his late LC comments
20:32	<Hixie>	jgraham_: what else would it do?
21:58	<gsnedders>	Am I the only person who dates everything he writes with ISO8601:2004 Basic Form dates?
21:59	<takkaria>	I suspect not
22:00	<gsnedders>	It's fun confusing teachers
22:03	Philip`	wonders what ISO8601:2004 Basic Form dates are, since Google finds Web Forms 2.0 and not much else
22:03	<gsnedders>	Philip`: 20080608
22:03	<takkaria>	I actually tend to use dashes to separate elements
22:04	<gsnedders>	takkaria: That's the extended form, and that confuses teachers less, and thus loses half the fun
22:04	<Philip`>	gsnedders: Ah - that does seem needlessly obscure in real life :-p
22:04	<gsnedders>	Philip`: Well, I use it for almost everything, so I'm perfectly used to it :P
22:04	<Philip`>	Why not just write dates as seconds since 1970, if you don't want normal people to understand you?
22:05	<gsnedders>	Philip`: Because I can't write dates as seconds since the UNIX epoch off the top of my head.
22:05	<gsnedders>	Philip`: ISO8601 basic form dates are perfectly simple to write
22:05	<gsnedders>	(and read)
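
For reference, the three date notations being compared can be produced with Python's standard `datetime` module. This is only an illustrative sketch; the helper names are made up for the example, and the epoch value assumes midnight UTC.

```python
from datetime import date, datetime, timezone


def basic_form(d):
    """ISO 8601 basic form: YYYYMMDD, no separators."""
    return d.strftime("%Y%m%d")


def extended_form(d):
    """ISO 8601 extended form: YYYY-MM-DD, with dashes."""
    return d.isoformat()


def unix_seconds(d):
    """Seconds since 1970-01-01T00:00:00Z at midnight UTC of the given date."""
    midnight = datetime(d.year, d.month, d.day, tzinfo=timezone.utc)
    return int(midnight.timestamp())


today = date(2008, 6, 8)
print(basic_form(today))     # 20080608
print(extended_form(today))  # 2008-06-08
print(unix_seconds(today))
```

The basic form can be read back with `datetime.strptime(text, "%Y%m%d")`, whereas the epoch count needs a conversion step in both directions.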
22:08	<jgraham_>	Hixie: It used to give a different answer because there wasn't a requirement to strip punctuation when determining the encoding. But I think I've done the right thing now
22:09	jgraham_	just doesn't like changing testcases without checking that the change is right
22:11	<jgraham_>	Philip`: Why not use seconds since January 1st 1992, to exclude both normal people and people who can convert unix dates to something more convenient easily
22:12	<jgraham_>	smedero: Thanks for testing html5lib
22:12	<smedero>	sure, I see you've got a new build from the logs...
22:12	<gsnedders>	jgraham_: If I'm going for '92, why not 1992-04-20T3:20+01?
22:13	<Philip`>	Just write "Today" as the date, and then nobody at all will be able to work out what you mean
22:13	<jgraham_>	gsnedders: Because I don't know when you were born to the nearest minute, hence preventing me using the zero point I wanted to
22:14	<gsnedders>	Philip`: That's useless for me as well as everyone else, and therefore doesn't meet the needed criteria (a) I can read and write it without any assistance apart from knowing the calendar date and year; 2) It confuses people apart from myself)
22:14	<gsnedders>	jgraham_: Well, now you do know.
22:15	<jgraham_>	gsnedders: I guess what I really meant was "Why not measure time from 0, like I do" :)
22:16	<Philip`>	gsnedders: Use a perfect one-way hash function - keep something like a diary with a list of date <-> uid mappings, and when you need to encode a date you can see if it's already in the list else generate a new uid, and to decode a date you just do a uid lookup
22:16	<jgraham_>	smedero: I'm inclined to just release html5lib with that as a known issue if there's nothing else wrong with it
22:16	<gsnedders>	jgraham_: But when does time begin in my mind? When I was conceived? When I have conscious thought? When I can remember stuff now?
22:16	<smedero>	jgraham_: With html5lib 0.11 rc3 I'm getting the same test error that I emailed you.
22:17	<gsnedders>	Philip`: That means carrying around the lookup table everywhere, which isn't overly practical
22:17	<smedero>	Unit testing aside, it works as expected doing a quick little bit of parsing with lxml
22:17	<jgraham_>	smedero: Yeah, I can reproduce it on Mac too. I guess it might depend on how the unicode support in python was compiled
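
A quick way to check how a Python interpreter's Unicode support was compiled, as a hedged sketch. It relies on `sys.maxunicode`, which is 0xFFFF on the old "narrow" (2-byte) builds and 0x10FFFF on "wide" (4-byte) builds; Python 3.3 and later always report the wide value. Whether this difference actually caused the html5lib test failure is only jgraham_'s guess here.

```python
import sys


def unicode_build():
    """Classify the running interpreter as a 'narrow' or 'wide' Unicode build."""
    return "narrow" if sys.maxunicode == 0xFFFF else "wide"


def astral_length():
    """Length of one character outside the Basic Multilingual Plane.

    On a narrow build U+1D11E is stored as a surrogate pair, so len() is 2;
    on a wide build it is a single code point, so len() is 1.
    """
    return len("\U0001D11E")


print(unicode_build(), astral_length())
```

Code that indexes or counts characters can therefore behave differently between the two kinds of build whenever the input contains characters above U+FFFF.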
22:18	<smedero>	I used to have a number of known unicode offending sites at my fingertips that I would have thrown at it... but I no longer work at the Linguistic Data Consortium. So digging up some test cases is going to require a little email mining.
22:19	<gsnedders>	smedero: uniicode offending in what way?
22:19	<smedero>	oooh
22:19	<gsnedders>	*unicode
22:19	<smedero>	there's some really fun chinese news websites
22:20	<smedero>	things like the bbc arabic news site tend to be pretty sane
22:21	<gsnedders>	smedero: http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt — wrap that in pre!
22:21	<smedero>	ooh, nice. is this what you use to test simplepie?
22:21	<gsnedders>	smedero: No
22:22	<gsnedders>	smedero: SP badly needs more tests :P
22:22	<gsnedders>	(fun of SP2: starting from scratch means I can require far stricter things to get code into the repo, like having full test case coverage)
22:24	<gsnedders>	smedero: I use it to test my Unicode class, though
22:24	<smedero>	Yeah I've made good use of that.
22:25	<gsnedders>	smedero: What? My Unicode class?
22:25	<gsnedders>	smedero: Starting from scratch?
22:27	<gsnedders>	(if the former, you mean somebody actually uses it already? wow.)
23:28	<smedero>	gsnedders: Your unicode class. I tried using 0.1 at one point and ran into some problems. Shortly before I left LDC I was using the 0.2 version in a couple of projects: generic web-based annotation tool (handled bengali, tamil, & arabic unicode data while I was there... not sure what else) and an arabic reading tool - we took the georgetown press arabic college textbooks and enhanced all of the reading passages with morphological information fo
23:28	<smedero>	in theory they'll be open-sourced someday, but I'm not working on them anymore. if that happens though, i'll drop you a line.