00:11
<annevk>
Hixie, lol
00:11
annevk
finds 'An end tag whose tag name is "sarcasm"'
00:18
<Dashiva>
annevk: How many angry outbursts for not taking the w3c seriously will that trigger, I wonder :)
00:19
<annevk>
I'm sure people will complain, because as defined it does exactly nothing
00:24
<Hixie>
it requires you to take a deep breath!
00:24
Hixie
applies the first sticker to his laptop
00:25
<Dashiva>
Hixie: Is that a normative requirement?
00:25
<Dashiva>
Because I think it will be exceedingly difficult to add breathing capabilities to most user agents
00:26
<annevk>
that's just a platform limitation
00:26
<Hixie>
Dashiva: yup
00:26
<Hixie>
annevk: indeed
00:26
Hixie
is happy with his sticker ("My other computer is a data center")
00:27
<Dashiva>
Not a beowulf cluster of toasters?
00:27
<Hixie>
no, my other computer really is a data center, that's why i like it :-)
00:28
<Philip`>
But you have to share it with a hundred million other users, whereas my computers are all mine
00:29
<Hixie>
no comment
00:29
<Dashiva>
My other computer is standing in the hallway, waiting for the electrician to show up and make a properly earthed power outlet
00:29
<Hixie>
heh
00:30
<Hixie>
hey
00:30
<Hixie>
this <li> problem is just because i'm not faking an end tag
00:34
<annevk>
so <datalist> requires a new insertion model
00:34
<Hixie>
why?
00:34
<annevk>
for <option>?
00:34
<Hixie>
i just plan to make <option> into a phrasing element with some magic for optional end tags
00:35
<annevk>
Opera treats <option> "like" <option/>
00:37
<annevk>
IE does something similar (plus the /OPTION weirdness for the end tag)
00:37
<annevk>
Firefox just drops it
00:37
<Hixie>
yeah
00:38
<Hixie>
but i can't see any way that treating it like <span> could break anything
00:38
<annevk>
option { background:lime }
00:39
<annevk>
but maybe it's feasible
00:39
<Hixie>
yeah but do any pages do that while also having one in the middle of nowhere?
00:39
<annevk>
for <datalist> Opera seems to do something similar to <select> except that <select> is allowed as child
00:39
<annevk>
that doesn't seem good
00:40
<Hixie>
indeed
01:19
<Hixie>
ok well i think this fixes the main problems
01:25
<Philip`>
jgraham_: test and install works for me on that version
01:30
<Philip`>
jgraham_: but it fails when I use something with a different version of simplejson
01:34
<Philip`>
(In particular: Works with simplejson 1.7.1 in python2.5; fails with 1.9.1 in python2.4)
01:36
<Philip`>
(It doesn't like non-ASCII in test1.test)
01:38
<Philip`>
Lots of tests fail with BeautifulSoup-3.0.6 in python2.4
01:49
<Philip`>
Hmm, how nice - BeautifulSoup has decl.string == 'html', but unicode(decl.string) == '<!html>'
01:50
<Philip`>
except in Python 2.4 unicode(decl.string) == 'html'
02:22
<Philip`>
jgraham_: I've committed some changes that are needed for some versions of Python I have; otherwise it seems to work fine in my 2.3/2.4/2.5 Pythons, except for the title=foo=bar test that you seem to have fixed in your zipped version, and except for some hex overflow warnings in Python 2.3
03:21
<MikeSmith>
"thanks for waiting"
03:22
<MikeSmith>
what the hell else choice do we have?
03:22
<MikeSmith>
wait, I guess we could offer to write the "paper" for them
03:22
<MikeSmith>
I think I'll do that
03:22
<MikeSmith>
this weekend
03:27
<Philip`>
<datatemplate> scares me :-(
03:32
<Philip`>
Mutual recursion is great
03:32
<Philip`>
s/great/evil/
03:39
<MikeSmith>
heh
03:40
<Dashiva>
Mutual recursion is recursing mutually is mutual recursion?
04:00
<Philip`>
Maybe I shouldn't send emails trying to explain nasty complicated algorithms at 4am :-/
04:01
<Dashiva>
Maybe I should be asleep at 5 am...
04:01
<Philip`>
I reason that the birds outside aren't asleep, so I don't see why I ought to be
04:03
<Dashiva>
Too bad you didn't reason like that when they were asleep :)
04:04
<Philip`>
I was too busy with <datatemplate>s to be able to reason :-)
04:05
<Philip`>
(Also playing PAA:OtRSPoD:E1, which is possibly less useful)
05:57
<takkaria>
Philip`: I do often wonder if you ever sleep at all
06:47
heycam`
curses HttpURLConnection's refusal to accept a non-standard HTTP method
08:26
<annevk>
Philip`, I found http://en.wikipedia.org/wiki/Levenberg–Marquardt_algorithm
08:27
<annevk>
but then I don't know about either that algorithm or <datagrid> to connect the dots
08:28
<annevk>
know enough*
08:35
<annevk>
http://simonwillison.net/2008/Jun/6/patent/ :o
08:49
<Hixie>
that algorithm has nothing to do with it
08:49
<Hixie>
the "levenberg" algorithm in the spec is just what josh came up with when i asked him how to solve the problem
08:50
<annevk>
ok, so what Philip` said :)
08:50
<Hixie>
yup
09:15
<hsivonen>
hmm. the 2005 DOM Core threads about aligning the spec with the real world are sad
09:21
<annevk>
we should just rebrand the DOM specs to Web DOM specs so there's no confusion
09:21
<annevk>
Web DOM -> Web, DOM specs -> well...
09:25
<annevk>
Hmm, they're standardizing UA sniffing with a really complicated server side API... http://www.w3.org/2005/MWI/DDWG/drafts/api/080602
11:13
<annevk>
view source: http://jarvklo.se/ :)
11:16
<annevk>
http://cafe.elharo.com/web/refactoring-html/why-xhtml/#comment-235640 "They attempt to destroy standards by insisting on mindless conformance to them, all practical experience to the contrary." yup...
11:17
annevk
was hoping nobody would figure it out
11:49
<jgraham_>
Does anyone have a strong opinion on whether html5lib should accept e.g. utf8 as a synonym for utf-8?
11:50
<annevk>
html5 says we should I think
11:50
<gsnedders>
yeah, they're identical per html5
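A rough illustration of label aliasing, as a sketch only: it uses Python's own codec registry (an assumption here, not the HTML5 spec's encoding table and not html5lib's actual code), and `same_encoding` is a hypothetical helper name.

```python
import codecs


def same_encoding(label_a, label_b):
    """Return True if Python's codec registry maps both labels to one codec."""
    # codecs.lookup() normalises aliases, so "utf8" and "utf-8" share a name.
    return codecs.lookup(label_a).name == codecs.lookup(label_b).name


print(same_encoding("utf8", "utf-8"))
print(same_encoding("utf-8", "utf-16"))
```

A parser could run every declared charset label through a lookup like this before comparing, so that spelling variants do not produce spurious mismatches.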
11:50
<jgraham_>
Oh, that must have changed since I last looked at this
11:50
<gsnedders>
http://www.whatwg.org/specs/web-apps/current-work/#charset
11:50
<annevk>
likely
11:51
<gsnedders>
It was a fairly recent change
11:51
<gsnedders>
actually, that's not it
11:51
<annevk>
jgraham_, when is your blogpost going online?
11:51
<jgraham_>
annevk: About @media? After Lachy puts the slides online... (hint ;) )
11:51
<gsnedders>
http://www.whatwg.org/specs/web-apps/current-work/#character0 — that's the right link
11:52
<annevk>
Lachy, ^^
11:52
<annevk>
or http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#character0
11:52
<gsnedders>
http://bugs.simplepie.org/repositories/entry/sp1/trunk/create.php — that's what I use now
11:53
<annevk>
I wonder if we can make it a fixed list of character encodings
11:53
<gsnedders>
annevk: in the spec?
11:53
<annevk>
yes
11:53
<gsnedders>
That would be nice.
11:53
<annevk>
a list of Web character encodings
11:54
gsnedders
notes in his new impl. he does support UTF-32 (but on PHP 5 everything is stored as UTF-32 binary strings)
11:55
<annevk>
omg, idiots
11:55
<gsnedders>
Simply because perf. of things like substr() would be terrible using anything that sometimes used more than one word per codepoint
11:55
<gsnedders>
annevk: ?
11:55
<annevk>
storing things in UTF-32 is a design flaw
11:56
<virtuelv>
anyone looked at eric meyer's linking proposal?
11:56
<gsnedders>
annevk: Find any other way to do stuff performantly :(
11:56
<annevk>
gsnedders, even with UTF-32 you can have a single character spanning multiple code points
11:56
<jgraham_>
gsnedders: the substring argument may be optimizing for the wrong thing
11:57
<gsnedders>
annevk: As far as I can see substr() natively on PHP 6 just does it to codepoints, not characters
11:57
<annevk>
in that case basing it on UTF-16 would be better
11:57
<gsnedders>
jgraham_: It optimises for most optimisations, though, doing that (it's also the cheapest to decode/encode to)
11:57
<hsivonen>
annevk: no, you can't, but you can have a grapheme cluster spanning code points
11:57
<annevk>
as that's at least compatible with the DOM/JavaScript etc.
11:57
<jgraham_>
gsnedders: http://weblogs.mozillazine.org/roc/archives/2008/01/string_theory.html
11:58
<gsnedders>
annevk: Why? Then I have to scan through the string trying to find surrogates
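The surrogate scan mentioned here can be sketched as follows. This is an illustration in Python rather than the PHP under discussion; `utf16_units` and `codepoint_to_unit_index` are hypothetical helper names, and the input is assumed to be well-formed UTF-16.

```python
def utf16_units(text):
    """Encode text as UTF-16 (big-endian, no BOM) and return its 16-bit code units."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def codepoint_to_unit_index(units, cp_index):
    """Find the code-unit offset at which code point number cp_index starts.

    Every code point before it must be inspected, because a high surrogate
    (0xD800-0xDBFF) means the code point occupies two units instead of one.
    """
    pos = 0
    for _ in range(cp_index):
        pos += 2 if 0xD800 <= units[pos] <= 0xDBFF else 1
    return pos


units = utf16_units("a\U0001D11Eb")  # 'a', MUSICAL SYMBOL G CLEF, 'b'
print(codepoint_to_unit_index(units, 2))
```

The loop is linear in the index, which is the cost being objected to when it has to run in interpreted code.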
11:58
<annevk>
hsivonen, is that the same as combined characters?
11:59
<hsivonen>
annevk: yeah
12:00
<annevk>
(that's what I meant)
12:00
<gsnedders>
annevk: UTF-16 will be slower. What might be quicker is having duplicate code paths, one for ASCII only UTF-8, and another for UTF-8 (when using the actual UTF-8 codepath it'll be slower than UTF-32 though)
12:01
<annevk>
UTF-32 doesn't handle combined characters
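A small sketch of the point about combining characters, using Python's standard `unicodedata` module; the sample string is just an illustration. One user-perceived character can still be several code points, so a fixed-width encoding does not make "one slot per character" true.

```python
import unicodedata

decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT
utf32 = decomposed.encode("utf-32-be")

# Two code points, hence two 4-byte slots in UTF-32, for one visible character.
print(len(decomposed), len(utf32) // 4)

# NFC normalisation composes this particular pair into a single code point,
# but not every combining sequence has a precomposed form.
composed = unicodedata.normalize("NFC", decomposed)
print(len(composed))
```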
12:01
<gsnedders>
I know.
12:01
<annevk>
and eats up way too much memory and is more or less obsolete anyway as far as anyone is concerned
12:01
<gsnedders>
PHP is just too slow :(
12:08
<gsnedders>
If I were to use UTF-8 or UTF-16 I need to manually iterate over the string for things like substr to count the number of characters I'd got past, taking into account multi-byte sequences, and surrogates. With UTF-32, I can just use PHP's built-in substr function, as each codepoint is four bytes.
12:08
<gsnedders>
iterating over a string is VERY slow in PHP when you're having to watch it as a unicode string (decoding is far too slow)
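The fixed-width argument can be sketched like this. It is a Python illustration of the idea, not the PHP implementation being described, and `utf32_substr` is a hypothetical name: when every code point occupies exactly four bytes, a code-point substring is plain byte slicing with no scan.

```python
def utf32_substr(data, start, length):
    """Slice `length` code points starting at code point `start`
    out of BOM-less UTF-32 bytes, using byte arithmetic only."""
    return data[4 * start:4 * (start + length)]


text = "na\u00efve \U0001D11E clef"    # includes a code point outside the BMP
encoded = text.encode("utf-32-be")     # big-endian, no BOM

piece = utf32_substr(encoded, 6, 1).decode("utf-32-be")
print(piece == "\U0001D11E")
```

The trade-off raised in the discussion still applies: four bytes per code point costs memory, and the slice counts code points, not grapheme clusters.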
12:12
<hsivonen>
if you aren't Mozilla or Opera, you should probably just use whatever UTF-* your language + its built-in string library use
12:14
<gsnedders>
hsivonen: In the case of PHP 5, that's binary strings, and no Unicode.
12:17
<hsivonen>
I don't know about PHP 5, but in PHP 4 it means byte strings with UTF-8 in them or int arrays with UTF-32 code unit per array slot
12:18
<Philip`>
gsnedders: Perl stores UTF-8 internally, and its substr() performance doesn't seem to be unacceptable
12:18
<gsnedders>
hsivonen: PHP 5 is the same as PHP 4
12:19
<gsnedders>
Philip`: It has the advantage of not needing to try and operate it in the interpreted userland, though, but in the compiled interpreter
12:19
<gsnedders>
(PHP 6, however, has Unicode support)
12:19
<gsnedders>
I want something that can behave identically on PHP 5 and PHP 6, while using PHP 6's native support
12:20
<Philip`>
gsnedders: Oh, by "on PHP 5 everything is stored as UTF-32 binary strings" do you mean that you chose to store everything as UTF-32 because PHP doesn't do Unicode strings at all, as opposed to meaning that PHP itself stores Unicode strings as UTF-32?
12:21
<gsnedders>
Philip`: PHP < 6 (which isn't out yet) doesn't do any Unicode
12:21
<Philip`>
Ah
12:21
<gsnedders>
Philip`: Well, there are some extensions that cope, one that is enabled by default, but it stops processing on the first invalid byte
12:21
<gsnedders>
and it has all kinds of bugs
12:22
<annevk>
good times
12:22
<annevk>
i'm glad my blog doesn't require the complex PHP bits
12:22
<annevk>
such as substr() :D
12:22
<Philip`>
It's strange how a toy language can become so popular ;-)
12:22
<gsnedders>
(and as PHP 6 is even less backwards compatible than PHP 5 was, and PHP 5 had slow uptake)
12:23
<hsivonen>
PHP is aimed at people who should be given all the high-level library support imaginable, but then what PHP does with strings is the kind of thing portable C does
12:23
<gsnedders>
hsivonen: PHP just generally sucks.
12:23
<hsivonen>
i.e. giving just a run of bytes without any Unicode libraries
12:23
<hsivonen>
yes, PHP sucks
12:23
<hsivonen>
I walked away from PHP years ago
12:23
<gsnedders>
It's not just strings. Everything sucks.
12:24
<hsivonen>
my latest encounter with PHP was patching the WHATWG blog and the experience made me unhappy
12:24
<hsivonen>
I should have known better and stayed away
12:24
<annevk>
yet for the simple stuff it's easier than anything else
12:24
<gsnedders>
Most common bug report on SP: it just stops output! (this happens to normally be PHP crashing Apache)
12:24
<hsivonen>
annevk: if "simple" means "I don't know yet that I need to care about Unicode"
12:24
<gsnedders>
annevk: I'd say stuff like Python and Ruby is just as easy
12:24
<annevk>
though maybe that's because there's so much to copy and paste from
12:25
<annevk>
hsivonen, right
12:25
<annevk>
people have written all kinds of simple software in PHP you can learn how to do things from and improve upon
12:25
<annevk>
I haven't found nearly as much for Python
12:26
<annevk>
Python documentation is also quite crappy
12:26
<gsnedders>
annevk: Yeah, but at least with Python most of the docs that are there _are_ right
12:26
<gsnedders>
annevk: With PHP, half of it is wrong
12:27
<hsivonen>
Java seems to have the best docs
12:27
gsnedders
will likely have no real exposure to Java till he's at uni
12:27
<Philip`>
Java documentation seems to always tell me how stuff works, but never tells me how to do stuff
12:28
<hsivonen>
right, Java has great per method docs. but often it is hard to learn the big picture for a new library
12:28
<Philip`>
If I don't know the class name I'm looking for, I end up having to search random web pages to find relevant links
12:32
<Dashiva>
Vector vs ArrayList, discuss
12:32
<hsivonen>
deprecade Vector. end of discussion :-)
12:32
<hsivonen>
deprecate
12:35
<annevk>
what Philip` says about Java is probably true for Python as well
12:36
<annevk>
Python should have some clear docs on dealing with HTML form submission / dealing with request URIs / dealing with databases
12:42
<Philip`>
I think it's probably true for all languages that I'm familiar with
12:43
<Dashiva>
I suppose it'd be false for languages with no standard library, though :)
12:43
<annevk>
might be, for me it was just that there was enough sample PHP code around to figure out how to make blog software / simple cms software, etc.
12:44
<Philip`>
Dashiva: Do any such languages exist? Everything seems to at least have standard string functions, and it'd be pretty useless without those :-)
12:44
<jgraham_>
The python documentation for 2.6 is getting better http://docs.python.org/dev/
12:45
<Dashiva>
Philip`: Well, you'd define functions required for non-uselessness as part of the language, not the library
12:45
<Philip`>
The 2.6 docs look less like latex2html, which is nice
12:45
<Philip`>
Dashiva: I wouldn't :-p
12:46
<Dashiva>
Besides, strings are overrated.
12:46
<Philip`>
Dashiva: e.g. C has a standard library, and without it the language would be pretty useless until you built up everything yourself from scratch, but that's still a standard library
12:46
<Philip`>
Indeed, we should use ropes instead
12:48
jgraham_
suggests that GUI design 101 should cover not putting "Copy" and "Close Window" next to each other
12:48
<hsivonen>
modern C software has to use a string library that isn't provided by the "standard" library
12:48
Philip`
suggests the OS should have an "undo" button, that can undo the closing of windows
12:50
<annevk>
jgraham_, still not quite focused on Web programming it seems
12:51
<jgraham_>
annevk: Sure. But better than the old documentation
12:52
<annevk>
take http://docs.python.org/dev/howto/index.html for instance
12:52
<annevk>
where's HTTP and python, HTML forms and python, MySQL and python ?
12:52
<Dashiva>
Do you really have to use urllib to urlencode something for use in urllib2?
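For context, a sketch of the split being asked about, written against the Python 3 module layout, where the Python 2 `urllib.urlencode` became `urllib.parse.urlencode` and `urllib2` became `urllib.request`. The URL and parameters are placeholders, and nothing is sent over the network.

```python
from urllib.parse import urlencode
from urllib.request import Request

params = {"q": "html5 parsing", "lang": "en"}

# Encoding the form fields lives in one module...
body = urlencode(params).encode("ascii")

# ...while building the request lives in another.
request = Request("http://example.invalid/search", data=body)

print(body)
print(request.get_method())
```

Supplying `data` is what turns the request into a POST; the encoding step stays separate from the request object either way.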
12:53
<jgraham_>
annevk: That section's new. I guess contributions are welcome :)
12:54
<annevk>
once i figure it out i'll let them know :p
12:54
<jgraham_>
http://wiki.python.org/moin/WebProgramming might be more informative
12:54
<annevk>
well, i have figured it out to some extent, but what i'm doing feels rather clumsy
12:58
<annevk>
hah, that wiki page endorses my naive approach
13:03
<annevk>
http://webpython.codepoint.net/introduction is quite nice
13:03
<annevk>
(links at the bottom)
13:05
<hsivonen>
hmm. If I read a text node from the DOM in IE8 mode, line breaks are normalized to single spaces
13:05
<hsivonen>
what's up with that?
13:07
<Philip`>
What if you set white-space:pre on that element?
13:10
<Philip`>
Seems the same as IE6/7 to me - whitespace is normalised except on elements where it's preserved (like <pre>, or <... style="white-space:pre">)
13:11
<hsivonen>
style="white-space:pre" helps in IE7
13:11
<roc>
freaky
13:12
<Philip`>
(innerHTML does the same normalisation)
13:14
<hsivonen>
Philip`: doesn't help in IE8 mode
13:17
Philip`
encounters a way to get IE stuck in an infinite loop
13:17
<Philip`>
The Live DOM Viewer ought to have an autosave feature :-(
13:19
<annevk>
and community features so you can rate each others code and comment on it
13:19
<Philip`>
hsivonen: http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!DOCTYPE%20html%3E%0A%3Cmeta%20http-equiv%3Dx-ua-compatible%20content%3Die%3D8%3E%0A%3Cdiv%20style%3Dwhite-space%3Apre%3Etest%0Atest%3C%2Fdiv%3E in IE8 in IE8 mode shows the non-normalised newline in the DOM view
13:22
<annevk>
http://lxyedwarddudley.wordpress.com/2008/06/05/authoring-html-5-a-supplicate-in-warp-and-woof-professionals/ :o
13:22
<gsnedders>
You do realise the p element exists for a reason?
13:23
<hsivonen>
Philip`: I'm looking at http://hsivonen.iki.fi/test/moz/document-write-cr.html
13:24
<hsivonen>
Philip`: the last script is followed by CR
13:24
<hsivonen>
Philip`: otherwise the line breaks in the source are LFs
13:24
Philip`
guesses that's based on http://www.w3.org/QA/2007/06/html5-call-to-web-professionals.html
13:26
<Dashiva>
That page makes my head hurt, annevk
13:32
Dashiva
wonders what the Functional Ashcan school is
13:39
<Philip`>
Oh, great, IE changes the DOM when you access .firstChild on certain elements
13:44
<hsivonen>
Philip`: are the assumptions in my test wrong?
13:45
<Philip`>
hsivonen: You may be assuming that IE is sane :-)
13:47
Philip`
gives up trying to understand what it's doing
13:49
<Philip`>
http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!DOCTYPE%20html%3E%0A%3Cmeta%20http-equiv%3Dx-ua-compatible%20content%3Die%3D8%3E%0A%3Cdiv%20style%3Dwhite-space%3Apre%3Etest%0Atest%3C%2Fdiv%3E%0A%3Cscript%3Ewindow.onload%3Dfunction()%7B%0Avar%20d%20%3D%20document.getElementsByTagName('div')%5B0%5D%3B%0Aw(d.childNodes.length)%0Aw(d.childNodes%5B0%5D.nodeValue.replace(%2F(.)%2Fg%2Cfunction(m)%7Breturn%20m.charCodeAt(0)%2B'%20'%7D))%3B%0AsetTime
13:49
<Philip`>
Urgh
13:49
<Philip`>
http://tinyurl.com/66d3mz
13:50
<Philip`>
In IE8, that logs "1", then "116 101 115 116 32 116 101 115 116", then "2"
13:50
<Philip`>
and the DOM view shows the text being split into two text nodes
13:50
<Philip`>
and if I remove the second w() line then it stays as a single text node
13:52
<Philip`>
But I get different behaviour in my fork of the DOM viewer - it says "13" instead of "32", and "1" instead of "2"
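The numbers in these logs are character codes of the text node's value. A small Python sketch of the same dump (`char_codes` is a hypothetical helper, not the code behind the links above) shows how a normalised space (32) differs from a preserved carriage return (13):

```python
def char_codes(value):
    """Return the code of every character in value, separated by spaces."""
    return " ".join(str(ord(ch)) for ch in value)


print(char_codes("test test"))   # line break normalised to a space
print(char_codes("test\rtest"))  # carriage return preserved
```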
14:07
jgraham_
guesses someone has been reading too much Joyce or Pynchon
14:24
<annevk>
http://www.theologyonline.com/newgod/
14:25
<Dashiva>
god5?
14:25
<Dashiva>
Oh, 6
14:25
<Dashiva>
Must be Dmitry up to his old tricks again
14:31
<annevk>
I still wonder who did Bible5
14:31
<annevk>
it seems to be copy and pasted all over the place, e.g. http://www.biroblu.info/2007/05/例如-对于后点-现在简化汉字主要通行于中国大陆/
14:33
<annevk>
in other news, that press release is supposed to go out this month, we better hurry up with XML5, SVG5, etc. :D
14:34
<Philip`>
It doesn't matter if we miss the deadline - we can just define Calendar5 at some point in the future, so dates can mean whatever we want them to mean
14:35
<Dashiva>
Or we could just use existing implementations
14:35
<Dashiva>
Valve has a good one, I hear
14:36
<annevk>
must. be. backwards. compatible
14:36
<Dashiva>
Just define everything before Anno Valvensis as "long ago"
15:18
<Dashiva>
I find it amusing that python raises an exception on 'raise', for not being a valid raise statement
15:52
<Philip`>
Dashiva: It is a valid raise statement, as long as it's in the dynamic scope of an except block
15:52
<Dashiva>
Yeah, but not outside
15:53
<Dashiva>
So I say "Give me an exception" and it goes "No, you can't have one. I'm going to make an exception because you did it wrong."
15:57
<Philip`>
It can be valid if it's outside an except block
15:57
<Philip`>
as long as there's another except block which it's still inside :-)
15:59
<Philip`>
Hmm, it appears it always re-throws the most recently thrown exception, rather than actually doing dynamic scoping, which seems kind of like a bug in Python
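The behaviour discussed above can be reproduced with a short sketch. It is written for Python 3, where a bare `raise` with no active exception gives `RuntimeError` (the Python 2 versions current at the time raised `TypeError` instead); all function names are hypothetical.

```python
def reraise():
    # Bare raise with no lexically enclosing except block.
    raise


def called_outside_handler():
    """No exception is being handled, so the bare raise itself fails."""
    try:
        reraise()
    except RuntimeError as exc:
        return type(exc).__name__


def called_inside_handler():
    """Called while a ValueError is being handled, so that error is re-raised."""
    try:
        try:
            raise ValueError("original")
        except ValueError:
            reraise()
    except ValueError as exc:
        return str(exc)


print(called_outside_handler())
print(called_inside_handler())
```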
17:00
<hsivonen>
hmm. when pronouncing WCAG, is the vowel (i vs. e) inserted between W and C freeform or dependent on en-AU, en-GB, en-US, etc.?
19:03
<Lachy>
hsivonen, it's not really dependent on anything like that. It's just the preference of whoever says it
19:07
<hsivonen>
Lachy: ok. (I've now also heard recordings where the vowel varies even away from the e-i space)
19:14
<Lachy>
I've not heard anyone use an e sound. I've heard i, y and uh
19:15
<hsivonen>
Lachy: I used e to denote the way you pronounce it :-)
19:16
<Lachy>
huh? I pronounce it like WhyCAG
19:17
<hsivonen>
hmm. somehow I got an e-like impression from the standards suck episode
19:17
<hsivonen>
well, doesn't matter
19:19
<gsnedders>
I pronounce the WC like the word wick
19:20
<Dashiva>
wickag?
19:20
<gsnedders>
yeah
19:20
<hsivonen>
gsnedders: is the 'i' as in en-US 'list'?
19:21
<gsnedders>
hsivonen: No
19:21
<Dashiva>
I mentally equate the w in wcag with x in xwhatever, so I would always pronounce it separately
19:41
<gsnedders>
Oh brilliant.
19:41
<gsnedders>
I click the "confirm your purchase" button and nothing happens. yay.
19:41
<Philip`>
I avoid the pronunciation problem by never talking about it
19:41
<gsnedders>
(odd sidenote on that matter: Air France is cheaper than easyjet)
21:58
<jgraham_>
Hmm. I wonder if Laura missed my point about the Python process being "unfair" but arguably producing good results because of it