00:03
Philip`
wonders how long it'll take to run "sort -R" on 4.5M lines
00:04
<parcelbrat>
nice
00:04
<anne-mac>
we could have text/html+lisp
00:05
<parcelbrat>
yeah, but then you'd have to start catering to everyone
00:05
<parcelbrat>
text/html+ruby
00:05
<parcelbrat>
text/html+python
00:05
<parcelbrat>
text/html+vbs...
00:05
parcelbrat
keyboard breaks
00:05
<anne-mac>
neh, only those we like
00:06
<parcelbrat>
shouldn't those be application/html+<<lang>>
00:06
<Philip`>
We could support them all at first, then remove them all from the spec, and see which ones get complained about the most, and then just put those ones back in
00:06
<anne-mac>
application/* is overrated I think
00:07
<parcelbrat>
philip`: works for media, and got me here
00:07
<parcelbrat>
anne-mac: newbie question: why?
00:08
<anne-mac>
parcelbrat, take text/xml versus application/xml, the only reason to prefer the latter is theoretical concerns over character encoding
00:08
<Philip`>
Hmm, 6 minutes of CPU time to randomise the list
00:08
<anne-mac>
but all implementations treat them as being equivalent
00:09
<parcelbrat>
how would you handle text/xml with character encoding? let http handle it?
00:09
<anne-mac>
most text/* formats have their own specific rules for determining the character encoding actually, all against the various RFCs from the stone age
00:10
<anne-mac>
text/xml defaults to US-ASCII unless it has a charset parameter that says otherwise
00:10
<anne-mac>
application/xml defaults to whatever the XML file says unless it has a charset parameter specified
00:10
<anne-mac>
in practice, text/xml is like application/xml
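The defaulting rules anne-mac describes can be sketched roughly as below. This is only an illustration of the RFC-era rules as stated in the chat, not of what browsers do; the function name `effective_encoding` and its parameters are made up for the example, and `declared` stands for whatever encoding the XML document itself announces (falling back to UTF-8 when it announces nothing).

```python
def effective_encoding(media_type, charset=None, declared=None):
    """Pick an encoding the way the old RFC rules describe (hypothetical helper).

    media_type: "text/xml" or "application/xml"
    charset:    value of the HTTP charset parameter, if any
    declared:   encoding announced by the XML document itself, if any
    """
    # An explicit charset parameter wins for both media types.
    if charset:
        return charset.lower()
    # text/xml with no charset parameter defaults to US-ASCII,
    # regardless of what the document says.
    if media_type == "text/xml":
        return "us-ascii"
    # application/xml falls back to the document's own declaration.
    return (declared or "utf-8").lower()
```

In practice, as noted above, implementations treat the two types alike, which amounts to always taking the last branch when no charset parameter is present.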
00:12
<parcelbrat>
makes sense
00:12
<parcelbrat>
and we know how well application/html+xml works ;)
00:12
<anne-mac>
xhtml+xml* ;)
00:12
parcelbrat
smacl
00:13
<parcelbrat>
s/smacl/smack
00:14
<hubick>
which makes me wonder if Firefox is ever gonna support */*+xml: https://bugzilla.mozilla.org/show_bug.cgi?id=155730
00:17
<Philip`>
Seems I can download/process 4K pages per minute
00:18
<parcelbrat>
hubick: yeah, lately, we've had a problem with Ruby on Rails error messages because <%= isn't a valid xml tag... the page isn't supposed to be coming across as xml... oh well
00:18
<Philip`>
I get loads of cookie spec violation warnings :-/
00:22
<Philip`>
Argh, and I've got ill-formed XML output too
00:23
<parcelbrat>
later y'all
00:23
<Philip`>
<header uri="http://www.ganymede.cz/" name="Server" value="Apache/2.2.6 (Unix) mod_ssl/2.2.6 &#2; DAV/2 PHP/5.2.5"/>
00:23
Philip`
wishes XML was easy
00:25
<Philip`>
Hmm, there's five sites with &#2; in their Server
00:26
<Philip`>
<header uri="http://www.doxamus.ro/" name="Server" value="Apache/2.2.6 (Unix) mod_ssl/2.2.6 �N&#31;&#9;YNED�����N&#31;&#9;SAHP�����N&#31;&#9;TATS�����N&#31;&#9; mod_bwlimited/1.4 mod_auth_passthrough/2.1 FrontPage/5.0.2.2635 PHP/5.2.4"/>
00:26
<Philip`>
That's really not going to work
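The output above is ill-formed because XML 1.0 forbids characters such as U+0002 and U+001F even when written as character references (&#9; is fine). One possible workaround, sketched here as an assumption and not as what Philip` actually did, is to replace every character outside the XML 1.0 `Char` range with U+FFFD before writing the attribute. The helper name `xml_safe_attr` is hypothetical.

```python
import re
from xml.sax.saxutils import quoteattr

# Anything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_ILLEGAL_XML = re.compile(
    "[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

def xml_safe_attr(value):
    """Return value as a quoted XML attribute with illegal characters replaced."""
    cleaned = _ILLEGAL_XML.sub("\uFFFD", value)
    # quoteattr adds the surrounding quotes and escapes &, <, > and quotes.
    return quoteattr(cleaned)
```

A header value would then be written as `'<header name="Server" value=%s/>' % xml_safe_attr(raw_value)`; the information in the junk bytes is lost, but the file parses.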
00:35
<Philip`>
anne-mac: http://www.cl.cam.ac.uk/~pjt47/misc/media.xml
00:36
<Philip`>
anne-mac: http://www.cl.cam.ac.uk/~pjt47/misc/media.txt too
00:36
<Philip`>
from 16 kilopages, minus about 500 with errors
00:37
<Philip`>
(This is only about 400MB of HTML, so I could do more fairly easily)
00:37
<Philip`>
(but probably not enough more to find really interesting things)
00:42
<Philip`>
s/about 500/940/
00:42
<Philip`>
(Also, binary kilo)
00:52
<Hixie>
"Re: [whatwg] HTML 5, OGG, competition, civil rights, and persons with disabilities"
00:52
Hixie
fears looking at that e-mail
00:52
<bradee-oh>
lol
00:53
<Dashiva>
Hixie: Now you know how we feel about your mashup replies :P
00:53
<Hixie>
:-D
00:56
<Philip`>
hsivonen: I can process 16K pages in 20 seconds (wallclock time) - I think your parser is fast enough for me for now :-)
00:57
<_Ivo>
I can sadly say that that is a sad title.
00:59
<Philip`>
441% CPU usage? I think 'top' is lying to me...
01:18
<Hixie>
http://www.bluishcoder.co.nz/2007/12/video-element-and-ogg-theora.html is a good summary
01:27
<doublec_>
hotel network connections, sigh
01:31
<roc>
hey
01:31
<roc>
yeah, that was good Chris
01:31
<doublec_>
thanks :)
01:31
<doublec_>
I had to type it in using w3m over an ssh connection using Blogger's interface.
01:32
<doublec_>
since the hotel network seems to kill any browser traffic over a certain size
01:33
<roc>
This is Avante?
01:33
<roc>
I don't remember that being a problem
01:33
<doublec_>
Yes, it's avante. And it is strange. I can receive fine. I can't even send via gmail.
01:33
<doublec_>
yet that's over ssl so I don't understand it
01:36
<roc>
complain to the management, that's pretty important right now
01:37
<doublec_>
will do
01:38
<othermaciej>
mmmm, ogg flamage
01:38
<othermaciej>
toasty
01:39
<othermaciej>
Hixie: it argued that not recommending support for the Ogg Theora video codec will be harmful to the blind
01:39
<Hixie>
i've been trying to keep the flames to a minimum by asking for politeness off-list, i hope it helps
01:39
<Hixie>
wait, what?
01:39
<Hixie>
wow
01:39
<Hixie>
can't wait to read that
01:41
<Philip`>
http://www.cl.cam.ac.uk/~pjt47/misc/attributes.html - some numbers about values of an arbitrarily-chosen set of elements/attributes
01:42
<Hixie>
off hand those numbers match what i remember seeing
01:42
<Hixie>
except for a rev=PARENT
01:42
<Hixie>
i just saw rev=made and rev=stylesheet
01:43
<Philip`>
They all come from http://www.dcs.gla.ac.uk/~simon/quantum/
01:43
<Philip`>
(since I counted number of occurrences in total, not number of pages)
01:44
<Philip`>
Maybe number of pages would be more useful...
01:44
<Hixie>
ah yeah i found that number of occurrences just never works
01:44
<Hixie>
there are too many gigantic pages that totally screw the count
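The difference between the two counts can be illustrated with a small sketch. The input shape (one list of attribute values per page) and the name `count_values` are assumptions made for the example, not a description of Philip`'s actual tooling.

```python
from collections import Counter

def count_values(pages):
    """Count attribute values both ways.

    pages: iterable of lists, each holding the values seen on one page.
    Returns (occurrences, page_counts) as two Counters.
    """
    occurrences = Counter()   # every appearance counts
    page_counts = Counter()   # each value counts at most once per page
    for values in pages:
        occurrences.update(values)
        page_counts.update(set(values))
    return occurrences, page_counts
```

A single page that repeats a value dozens of times dominates `occurrences` but adds only 1 to `page_counts`, which is the skew Hixie describes.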
01:49
<Philip`>
http://www.cl.cam.ac.uk/~pjt47/misc/attributes.html - now with number of pages too
01:50
<Philip`>
Not sure why I'm bothering to keep the number of occurrences too, but I guess it doesn't hurt
01:50
<Philip`>
Looks like cyan and magenta are the least favourite of the binary colours :-(
01:51
<othermaciej>
doublec_'s blog post is indeed a good summary
01:51
Hixie
pokes Philip` to look at his /msgs (and maybe to have him respond on w3.net, since he's not registered on freenode)
01:56
<othermaciej>
I posted on the codec thread
01:57
<othermaciej>
may knuth have mercy on my soul
01:57
<Philip`>
"The" codec thread? I thought there was about two dozen of them
01:57
<othermaciej>
one of them
02:13
<Philip`>
othermaciej: About "I've heard game vendors cited, not sure which ones": See http://wiki.xiph.org/index.php/Games_that_use_Vorbis
02:14
<othermaciej>
thanks, that doesn't list the vendors in an easy-to-find way
02:14
<Philip`>
http://www.unrealtechnology.com/features.php?ref=audio mentions Vorbis support quite prominently
02:14
<othermaciej>
main thing I wondered about was whether any Microsoft-published games use Ogg Vorbis
02:15
<othermaciej>
I think the audio issue is somewhat less important since (a) Vorbis has good quality and a somewhat more solid IP footing and (b) MP3 patents will expire in a few years, at which point MP3 is a suitable audio baseline
02:15
<Philip`>
I don't see any on the list that I recognise as Microsoft
02:17
<Philip`>
Oh
02:17
<Philip`>
Halo
02:18
<Philip`>
which was while they were owned by Microsoft
02:19
<Philip`>
s/they/Bungie/
02:19
<Hixie>
halo is certainly high profile
02:20
<Dashiva>
Fable too
02:21
<_Ivo>
and Gears of War
02:22
<Philip`>
_Ivo: Uh, I don't think that used Vorbis
02:22
<_Ivo>
as far as I know, it did
02:22
<_Ivo>
may be worth confirming
02:23
<Philip`>
Oh, looks like you're right
02:24
<Philip`>
e.g. http://utforums.epicgames.com/showthread.php?t=583980&page=10 says it has a vorbis.dll
02:26
<Hixie>
btw just so everyone is up to date, i'm thinking we should just drop the whole cross-references nonsense and replace it with some recommendations about using <a href="">
02:26
<Philip`>
Actually, maybe that's just left over from it being an Unreal Engine game - I have no idea if they really use the Ogg support
02:26
<othermaciej>
Hixie: the autolinking cross-references?
02:26
<Hixie>
yeah
02:27
<Hixie>
too much complexity for not much gain
02:27
<othermaciej>
Hixie: they did seem cute, but admittedly not that compelling over <a>
02:28
<Hixie>
good lord we got a lot of feedback on <cite>
02:32
<Hixie>
hm, i'm thinking, <cite> maybe should just be for a title of a work. any work, and even if it's not technically really cited.
02:32
<Hixie>
it seems that the typographic convention angle is more useful than the "this is a citation" angle
02:32
<Hixie>
and the line of what a citation is is a bit vague anyway
02:35
<roc>
Hixie: FWIW Apple and Nokia support software patents in general.
02:36
<othermaciej>
http://www.macobserver.com/article/2007/08/31.1.shtml
02:37
<othermaciej>
(that's not a counter-argument to supporting them in general, I don't know if Apple has an official position on that)
02:43
<roc>
http://www.macobserver.com/article/2007/08/02.12.shtml
02:43
<roc>
"However, Apple's chief patent counsel, Chip Lutton, contradicted Ms. Lee, and doesn't think the patent system is broken. In fact, "it's the best system in the world," he said. "
02:48
<othermaciej>
that certainly seems like support for the patent system in general
02:49
<roc>
it seems to indicate Apple is pretty happy with software patents in general
02:49
<roc>
because many other systems in the world don't have them
02:50
<othermaciej>
well, again, I'm not privy to Apple's official view on the matter, but I know that Apple has actively supported patent reform, and that this is likely to improve the situation at least somewhat
02:51
<othermaciej>
that probably makes a bigger difference than sound bites
02:52
<othermaciej>
I personally think software patents are broken and should either not exist or have much shorter terms than current terms, but that that is certainly not an official position
02:53
<roc>
I agree
02:54
<roc>
the "reforms" pushed by Apple and others are probably good things in themselves, but they're essentially self-interested attempts to tamp down the troll problem
02:56
roc
hopes that the US Supreme Court will just rule software patents invalid and all this will just blow away in the breeze
02:57
<othermaciej>
unfortunately that seems unlikely
02:57
<roc>
I would have thought so, but they've been most ornery about patents lately
03:00
<othermaciej>
honestly I'm not sure I get Apple's stance given which side of patent lawsuits we're usually on
03:01
<roc>
I know the feeling. I used to work for IBM
03:06
<MikeSmith>
Hixie - I won't shed any tears when the dfn cross-referencing thing gets dropped, but I think, gee wouldn't it be nice if we had a general xref mechanism?
03:07
<MikeSmith>
such that empty <xref href="#foo"> gets replaced with content of element at foo
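A rough sketch of what such an expansion step might look like, treating the markup as XML for simplicity. `<xref>` is MikeSmith's hypothetical element, and the function name `expand_xrefs` and the choice to copy only the target's text content are likewise assumptions for illustration.

```python
import xml.etree.ElementTree as ET

def expand_xrefs(root):
    """Fill each empty <xref href="#id"> with the text of the element with that id."""
    # Index every element that carries an id attribute.
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    for xref in root.iter("xref"):
        target = by_id.get(xref.get("href", "").lstrip("#"))
        is_empty = not (xref.text or len(xref))
        if target is not None and is_empty:
            # Copy the target's text content; child markup is flattened.
            xref.text = "".join(target.itertext())
    return root
```

For example, given `<dfn id="foo">widget</dfn>` elsewhere in the document, an empty `<xref href="#foo"/>` ends up containing the text `widget`.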
03:34
<G0k>
uh
03:34
<G0k>
who the hell is this rudd-o clown?
03:38
<othermaciej>
G0k: he's clearly passionate about his beliefs
03:38
<G0k>
at this point i'm convinced he's an agent for MPEG LA
03:38
<G0k>
because he's doing more to discredit the Ogg crowd than anyone else I've seen
03:43
<aphid>
it's bulldada, by contrast he makes the rest of us look civilized and reasonable.
03:43
<aphid>
:D
03:44
<G0k>
uhg
04:12
<MikeSmith>
kfish - hei
04:25
<kfish>
yo MikeSmith
04:25
<kfish>
cold in tokyo?
04:44
<MikeSmith>
kfish - yeah, too cold for me already
04:44
<MikeSmith>
both in Tokyo and out at Keio/SFC
04:45
<MikeSmith>
warmed up last night by eating shabu-shabu and drinking fugu hire-zake
04:46
<kfish>
nice :-)
05:06
<_Ivo>
This won't be nice of me asking, but is it possible to block Manuel Amador (Rudd-O) from the lists at least temporarily til he cools off?
05:07
<bradee-oh>
I see he promises an upcoming barrage of emails because of his handiwork at Digg.
05:07
<bradee-oh>
oh joy.
05:08
<bradee-oh>
sure has made it difficult to have actual discussions about actual standards issues today *sigh*
05:12
<parcelbrat>
I'm one of the people who came to get clarification (earlier) due to his /. post. I'd actually like to stay involved for the real reason.
05:12
<parcelbrat>
That aside, I'm wondering if there has been a barrage here from that too
05:21
<bradee-oh>
The quantity today has been *much* higher than usual, and much less productive. yay!
05:22
inimino
wonders how many productive man-hours were lost
05:24
<Teratogen>
BRING BACK OGG!
05:28
<jruderman>
Teratogen: http://yro.slashdot.org/comments.pl?sid=385689&cid=21655557
05:28
<jruderman>
(sorry, i guess that was Oog)
05:30
<parcelbrat>
does Teratogen == rudd-o?
05:34
<Teratogen>
BRING BACK OGG NOW!
05:36
<_Ivo>
what the hell is a Oog?
05:40
<jruderman>
Oog, the open-source caveman, a legendary Slashdot troll
05:44
<aphid>
ogg is also the name of the stalinesque leader from some Nat'l Petroleum Institute produced cartoon that's on archive.org
05:45
<aphid>
http://www.archive.org/details/Destinat1956
05:52
<parcelbrat>
lol
06:01
<Hixie>
if anyone is on site5, feel free to point out on this thread that we didn't trade ogg for something proprietary: http://forums.site5.com/showthread.php?t=19941
06:08
<parcelbrat>
i noticed someone on site5 on #ror, but he already left
06:43
<Hixie>
I've added documentation to the annotation system
07:37
<hsivonen>
lots and lots of ogg email :-(
07:43
<doublec>
yes, Ogg is the favourite topic of the day
07:44
<Hixie>
more mail?
07:45
<doublec>
only a couple :)
07:47
<Hixie>
we've hit 855 members
07:47
<Hixie>
that's 50 more than this morning
07:49
<othermaciej>
Hixie: you need to make more controversial changes so we pass 1000
07:51
<Hixie>
seriously
07:51
<doublec>
and find a way to get money from each new member
07:51
<doublec>
since you aren't getting bribed to remove ogg :)
07:52
<doublec>
I think your reddit replies should get some sort of award
07:53
<othermaciej>
for "most reddit replies on a single thread"?
07:53
<roc_>
when Hixie does start taking bribes we're all going to look stupid
08:48
<Hixie>
hsivonen: i don't really see anything in http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2006-December/008849.html that i can respond to (other than the <q> issue), except maybe the suggestion in the parenthetical in point 1
08:48
<Hixie>
but i don't understand what that's suggesting
08:48
<Hixie>
can you advise?
08:49
<anne-mac>
that e-mail is not from hsivonen...
08:49
<hsivonen>
Hixie: looking at it now
08:49
<Hixie>
oh you're right, it's not
08:49
<Hixie>
oops
08:50
<Hixie>
well i don't know what to do with it then
08:51
<anne-mac>
i think it's more of a rant than a comment
08:56
<hsivonen>
Hixie: The main point I'd make is that non-heuristic machine consumption (i.e. naïvely trusting marked-up semantics and only using marked-up semantics) for dialogs, names of vessels, quotations, etc. does not have a plausible market-demand story. However, the elevator pitch for hCard and hCalendar is at least plausible: UI for adding event info or contact info to iCal, Address Book or similar app
08:57
<Hixie>
agreed
08:57
<Hixie>
doesn't really affect the spec though
08:57
<Hixie>
so...
08:57
<hsivonen>
the volume of the ogg thing has caused me to get out of sync with IRC and list email. :-(
08:58
<Hixie>
it wasn't that bad
08:58
<anne-mac>
oh, Hixie replies to an annevk⊙oc e-mail!
08:58
<Hixie>
:-)
08:59
<anne-mac>
reached the two year old e-mail mark? :)
09:01
<Hixie>
nah, just dealing with mail from buckets that have newer mail
09:01
<Hixie>
thought that particular mail was actually from the time my spam filter hated you
09:01
<Hixie>
though, even
09:01
<anne-mac>
could be
09:02
<Hixie>
no it definitely was
09:02
<Hixie>
i had to go fish it out of my gmail pile to reply to it
09:02
<anne-mac>
ok :)
09:03
<virtuelv>
Hixie: you're implying you never delete even spam?
09:03
<Hixie>
no, i saved a bunch of mail from anne back when my filter hated him
09:03
<Hixie>
but e-mails saved from gmail's spam folder don't get forwarded to my main imap server
09:04
<Hixie>
they just sit in my gmail inbox
09:04
Hixie
gets to an e-mail from Tina Holmboe
09:04
Hixie
decides to deal with that one later
09:05
<Hixie>
btw, what's the status with the role="" stuff? and what's the status with the forms stuff?
09:05
<anne-mac>
forms: nobody replied to my request for input
09:06
<anne-mac>
role: zcorpan knows it better than me at this point, being on the group and all
09:06
<anne-mac>
I think they like to call it role=, want to use namespaces in some places? and the rest is aria-xxx
09:07
<Hixie>
so i have a request here for an <attn> element, whose primary use case would be to do what waiaria does with update regions
09:08
<Hixie>
will i eventually be mentioning waiaria somewhere in html5?
09:08
<anne-mac>
i think so, yes
09:08
<zcorpan>
i think the idea is to define both role= and aria-xxx=, and the old namespaced attributes, even with the knowledge that browsers don't want to support the namespaced attributes
09:08
<zcorpan>
i really don't know why
09:09
<anne-mac>
Hixie, I guess HTML5 would defer to the aria draft for role= and attributes prefixed with aria-, would make it clear there are no DOM interfaces, and that's it
09:10
anne-mac
should probably go to work at some point
09:10
<Hixie>
no DOM interfaces? huh, that sucks. since it's primarily intended to be for scripts...
09:10
<Hixie>
zcorpan: can we fight for more sanity?
09:11
<Hixie>
zcorpan: i'm willing to help if needed. in particular, can we decouple from xhtml2's role stuff?
09:11
<zcorpan>
Hixie: not sure
09:12
<zcorpan>
but i think so
09:14
<zcorpan>
Hixie: the idea is that the attributes should be correct even in legacy browsers with no knowledge of aria, so that screen readers can pick it up from the dom
09:14
<zcorpan>
although i guess they could read js properties as well
09:15
<Hixie>
ah interesting
09:18
<zcorpan>
perhaps we can introduce a dom interface for aria 2, when we know what aria 2 will require
09:26
zcorpan
considers marking all ogg emails as read
09:26
krijnh
did
09:27
<krijnh>
Apart from the big Hixie replies
09:31
<hsivonen>
Philip`: re: 20 seconds: nice
09:33
<zcorpan>
hsivonen: btw, does your schema support role="fancy-checkbox checkbox"?
09:34
zcorpan
can see a disadvantage of allowing arbitrary roles; conformance checkers won't catch typos
09:35
<zcorpan>
role="chekcbox"
09:35
jgraham_
seems to be getting new ogg emails at roughly his time average rate of reading WHATWG emails so the unread number is almost constant
09:35
<hsivonen>
Hixie: I think <cite> should be defined to capture the Chicago Manual of Style title-of-work concept while at the same time saying that an author is not an evil person for using <i> instead and saying that you don't need to change legacy content that uses <cite> for names of people
09:38
<hsivonen>
hubick: I'm near email bankruptcy but I have received your patch. Sorry about the delay.
09:45
<hsivonen>
zcorpan: no, the proof-of-concept schema approach does not address extensibility
09:45
<zcorpan>
hsivonen: ok
09:45
<hsivonen>
zcorpan: extensibility is not on the agenda for 1.0
09:46
<hsivonen>
zcorpan: moreover, extensibility and white-list-like validation are conflicting things
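The tension in the role discussion above can be shown with a toy checker. The set of known roles here is a small made-up subset, and both the function name `role_is_valid` and the "lenient" rule (accept unknown tokens as long as at least one known role follows as a fallback) are assumptions for illustration, not anything taken from the ARIA draft or from hsivonen's schema.

```python
# Illustrative subset only; not the real list of roles.
KNOWN_ROLES = {"button", "checkbox", "menu", "menuitem", "slider"}

def role_is_valid(value, strict):
    """Check a whitespace-separated role attribute value.

    strict=True:  white-list validation, every token must be known.
    strict=False: extensible, unknown tokens pass if a known one is present.
    """
    tokens = value.split()
    known = [t for t in tokens if t in KNOWN_ROLES]
    if strict:
        return bool(tokens) and len(known) == len(tokens)
    return bool(known)
```

With `strict=True` the typo `chekcbox` is caught but `fancy-checkbox checkbox` is rejected too; with `strict=False` the extension is accepted, and so is `chekcbox checkbox`, so the typo slips through whenever a valid fallback is present.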
09:46
<zcorpan>
perhaps i should update the authoring conformance reqs accordingly
09:56
<Philip`>
Does the Ogg discussion indicate that HTML5 is actually relevant and people care about it, or would they complain as much about any other specification that did the same thing?
09:56
<Hixie>
hsivonen: well you never need to change legacy content
09:56
<Hixie>
hsivonen: it just might not be compliant to html5 :-)
09:56
<Hixie>
(i don't see any point in explicitly grandfathering in legacy content in that way)
09:57
<hsivonen>
Hixie: it appears that *some* authors don't think that way
09:57
<hsivonen>
Hixie: I'm inclined to think they are misguided, but still
09:57
<Hixie>
well if they want to be compliant, so much the better
09:57
<Hixie>
could you send a mail elaborating on your idea for <cite>?
09:57
<Hixie>
i really want to address this inline vs block issue
09:58
<Hixie>
i don't really know how to do so
09:58
<Hixie>
maybe i really should do this matrix idea i mentioned
09:58
<hsivonen>
http://diveintomark.org/archives/2003/01/13/semantic_obsolescence
09:59
<hsivonen>
Hixie: OK. I'll send email on both issue
09:59
<hsivonen>
s
09:59
<Hixie>
well the inline thing i'm actually thinking of resolving right now
09:59
<Hixie>
so feel free to discuss that here
09:59
<othermaciej>
Hixie: I think the matrix would be a useful exercise to clarify what's important to capture in the content model requirements
09:59
<othermaciej>
(there might be a bunch of "don't care" boxes)
09:59
<othermaciej>
Philip`: way to see a silver lining :-)
10:00
<zcorpan>
Hixie: did you see my thoughts about inline yesterday?
10:01
<zcorpan>
http://krijnhoetmer.nl/irc-logs/whatwg/20071211#l-361
10:01
<Hixie>
hsivonen: (and regarding mark's post, note that i work indirectly with him and he is up to date with the spec :-) )
10:01
<Hixie>
zcorpan: looking
10:01
<hsivonen>
Hixie: the following are random thoughts with no conclusion that I'd be comfortable with yet:
10:02
<hsivonen>
* so far it seems that authors don't like bimorphic
10:02
<Hixie>
zcorpan: i think there are pretty solid reasons for wanting to allow <p><ol/></p>, but i agree that it's conceptually a pain, especially given the html serialisation issue
10:02
<zcorpan>
indeed
10:02
<hsivonen>
* RELAX NG can validate bimorphic but the user experience sucks by default without schema-specific UI-level papering over
10:02
<Hixie>
hsivonen: (by which you mean they like mixing inline and block content?)
10:03
<hsivonen>
Hixie: yes
10:03
<hsivonen>
Hixie: cf. Sean Fraser, Sam Ruby and Dan Connolly
10:03
<Hixie>
yeah i agree that people want that
10:03
<Hixie>
i think we should allow that
10:03
<Hixie>
let's do this matrix thing
10:03
Hixie
finds a tool that can do easy editing of grids
10:03
<hsivonen>
* having general content models like "block" or "inline" makes schemas easier to write
10:04
<hsivonen>
* also makes conformance easier to teach
10:04
<zcorpan>
Hixie: (what's the matrix thing?)
10:04
<othermaciej>
google spreadsheets?
10:04
<othermaciej>
makes a grid, easy to share on the web
10:04
<hsivonen>
* having content model differences in XHTML5 is inconvenient
10:04
<hsivonen>
- Makes schema ugly
10:04
<Hixie>
othermaciej: way ahead of you :-)
10:04
<othermaciej>
zcorpan: which block/inline elements should be allowed in what others, assuming the rules were being designed from scratch
10:05
<hsivonen>
- Requires me to maintain a separate "HTML5-compatible subset of XHTML5" schema/mode
10:05
<hsivonen>
- Does not make sense with the "people should just use text/html" party line
10:06
<hsivonen>
- Does not make sense with the "apps should use XHTML internally but serialize to HTML5 for IE" party line
10:06
<Hixie>
yeah i agree that we should strive for no differences
10:06
<othermaciej>
Hixie: would you consider also multi-level nesting? (I guess that's most likely to affect <p>)
10:06
<Hixie>
which basically means the html serialisation wins
10:06
<Hixie>
or rather, can veto
10:06
<Hixie>
othermaciej: can you give an example where three-or-more-way nesting would have a different answer than two-way?
10:07
<hsivonen>
* I'm not convinced that achieving semantic purity with intra-paragraph lists is worth all the trouble
10:07
<Hixie>
othermaciej: i was just gonna do it on the basis of indirect nesting
10:07
<Hixie>
do we have a list of elements anywhere yet? i could use docs' magic list making feature but that seems unlikely to work perfectly here :-)
10:08
<othermaciej>
Hixie: I don't think there is such a case, but expressing the rules in terms of indirect nesting may be hard to understand
10:08
<othermaciej>
there are lists of elements, yes
10:08
<othermaciej>
I think zcorpan has one
10:08
<othermaciej>
http://simon.html5.org/html5-elements
10:08
<Hixie>
cool thanks
10:08
<Hixie>
othermaciej: well we'll worry about exactly how to express the rules later
10:08
<Hixie>
i just want an idea of what the ideal would be first
10:09
<hsivonen>
* OpenOffice.org Writer/Web makes an interesting case study of an editor with very strict block/inline boundaries
10:09
<othermaciej>
Hixie: if paragraphs could contain tables, then it might make sense to let a table in a paragraph contain a paragraph in a cell
10:10
<othermaciej>
I guess that is one plausible exception I can think of
10:10
<hsivonen>
* I don't know how to reconcile hand-authoring flexibility and importability in an app like OO.o Writer/Web except defining that editing apps MAY introduce a lot of layout-wrecking <p>s
10:10
<zcorpan>
some elements only allow inlines. others allow both block and inline. inline elements never allow blocks. something in that direction seems to be what authors believe the rules are, i think
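The rule of thumb zcorpan attributes to authors could be written down as a tiny checker, sketched here over XML-parsed markup. The element sets are small illustrative samples chosen for the example, not the real HTML categories, and `misplaced_blocks` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Illustrative samples only.
INLINE = {"a", "abbr", "em", "i", "span"}
BLOCK = {"div", "p", "section", "ul"}
# Inline elements never allow blocks; some block-level containers
# (paragraphs, headings, captions) only allow inlines as well.
INLINE_ONLY = INLINE | {"p", "h1", "caption"}

def misplaced_blocks(root):
    """Return (parent, child) pairs where a block sits directly in an inline-only element."""
    errors = []
    for parent in root.iter():
        if parent.tag in INLINE_ONLY:
            for child in parent:
                if child.tag in BLOCK:
                    errors.append((parent.tag, child.tag))
    return errors
```

Under these assumed sets, `<span><div/></span>` is reported while `<div><p>x</p></div>` is not.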
10:11
<Hixie>
othermaciej: i'm not really convinced it would, but interesting
10:11
<hsivonen>
* Having different content models for <i> and <em> sucks big time
10:12
<hsivonen>
* In general, this whole strict inline thing probably sucks
10:12
<Hixie>
if you have google accounts post your e-mail addresses here (or msg me) so i can add you to this thing
10:12
<othermaciej>
maciej⊙gc
10:12
<zcorpan>
zcorpan⊙gc
10:12
<hsivonen>
hsivonen@
10:13
<krijnh>
krijnhoetmer⊙xn
10:13
<krijnh>
Is this about the mixing of inline/block content as well?
10:14
<hsivonen>
* I agree that the <div> content model in HTML 4 sucks semantically
10:15
<Hixie>
http://spreadsheets.google.com/ccc?key=pkNVM1HEQs-wsHB7s1M5Lbw
10:15
<hsivonen>
* But it seems the horse is out and the barn has burned
10:15
<Hixie>
feel free to fill in the cells you think look obvious
10:15
<Hixie>
if someone fills in something you disagree with, put a question mark after it
10:16
<hsivonen>
* We should probably allow Philip`'s Firefox <span> workaround
10:17
<zcorpan>
hmm, i'm a bit uncomfortable about that one. it breaks the rule "inlines never allow blocks"
10:17
Philip`
isn't sure that's worthwhile since it doesn't work in IE and is therefore a bit useless for most authors
10:17
<hsivonen>
* I don't like the direction of these points, since the direction is that <p> should allow inline and almost every other block container should allow %Flow
10:18
<zcorpan>
Philip`: indeed
10:19
<hsivonen>
zcorpan: isn't block-in-span an IEism that others have to implement for compat with content out there?
10:19
<Philip`>
<section><div class=section> lets you style .section in all browsers, and HTML5 UAs will get the sectioning correct
10:20
<zcorpan>
hsivonen: also h1-h6 and address
10:20
<Hixie>
feel free to fill in this spreadsheet too, btw, i don't plan on doing all 10000 cells myself :-P
10:20
<krijnh>
Ow, you're not? ;)
10:20
<zcorpan>
hsivonen: yes, but authors think it's disallowed
10:20
<Philip`>
Is there a Google Spreadsheet API to fill these things in automatically? :-)
10:21
<hsivonen>
as a schema writer and a closet markup purist, I like bimorphic and stuff
10:21
<hsivonen>
but as a validator front end developer and realist I don't
10:21
<hsivonen>
I'm torn
10:22
<Hixie>
Philip`: it's not the api that's the hard part :-)
10:22
<Hixie>
i think it's clear that forcing a separation of inline and block isn't working in practice
10:22
jgraham
wishes he could stay for this discussion
10:22
<Hixie>
i think we can just say that paragraphs are implied
10:23
Philip`
wishes he could stay because it looks freezing outside
10:23
<Hixie>
done 50 of about 10000 so far
10:23
<jgraham>
Philip`: That too :)
10:23
<zcorpan>
(perhaps we should change address to allow %flow as well)
10:23
<hsivonen>
Hixie: yeah, but is there a credible story that'd allow the likes of OpenOffice.org Writer/Web to make all those implicit paragraphs explicit when it reserializes from its non-DOM internal datastructure?
10:23
<Hixie>
hsivonen: i don't think semantically we should disallow it
10:24
<Hixie>
hsivonen: of course as you say, it causes havoc with styling
10:24
<Hixie>
not sure what to do about that
10:24
<jgraham>
Hixie: Can you add my jgraham.html@gmail account
10:24
<Hixie>
done
10:24
<jgraham>
thx
10:24
<Hixie>
anyone can feel free to add other people btw
10:25
<Hixie>
in case y'all want to continue this when i go to bed :-)
10:25
<colione_>
olle.lundberg@gmail
10:26
<Hixie>
added
10:26
<colione_>
thnx
10:28
<Hixie>
woot, finished <abbr>
10:29
<Hixie>
hmm, should <caption> ... <address> </address> </caption> be legal
10:31
<zcorpan>
Hixie: no, caption should only allow inline (like h1-h6)
10:31
<Hixie>
makes sense
10:32
<Hixie>
hmm, should we allow a <dialog> to contain a <section> or <address> or <footer>...
10:32
<Hixie>
indirectly even
10:33
<krijnh>
What do you mean by indirectly contain?
10:33
<Hixie>
<dialog> <dt> krijnh <dd> <section> <p> What do you mean by indirectly contain? ...
10:33
<Hixie>
in fact, should we allow <ol>/<ul>/<dl> to contain any sectioning elements
10:33
<Hixie>
i'm thinking not really
10:34
<Hixie>
how about tables? should they be allowed to contain sectioning elements?
10:34
<hsivonen>
Indirectly does not sound good for my purposes...
10:34
<Hixie>
i'm sure we'll find a better way of phrasing this in due course
10:34
<krijnh>
Hixie: probably for table based layouts
10:34
<hsivonen>
Hixie: a <td> should have the some content model as <body> to allow real-world authoring patterns
10:35
<Hixie>
krijnh: well those aren't legal anyway
10:35
<hsivonen>
s/some/same/
10:35
<Hixie>
hsivonen: really? i'd have thought discouraging table-based layouts would be a plus here.
10:35
<Hixie>
they're already invalid
10:35
<hsivonen>
or more to the point, <td> should allow at least everything <body> allows
10:35
<Hixie>
we just can't catch them
10:36
<Hixie>
maybe <address> should only be allowed as a direct child of a sectioning element?
10:36
<Hixie>
hmm
10:36
<hsivonen>
Hixie: making table-based layout invalid is pointless until the CSS WG delivers a viable alternative *and* the top four browsers implement it
10:36
<Hixie>
hsivonen: table based layout has never been valid
10:37
<hsivonen>
Hixie: then perhaps we should make them valid
10:38
<hsivonen>
Hixie: at least with appropriate role='' pixie dust
10:38
<annevk>
annevankesteren⊙gc
10:38
<Hixie>
hsivonen: ew
10:38
<Hixie>
to the first part
10:38
<Hixie>
not so much the second
10:38
<Hixie>
annevk: done
10:40
<hsivonen>
Hixie: is the table intentionally missing <table> and table-internal stuff?
10:40
<Hixie>
yeah
10:40
<Hixie>
<td> and <th> cover those
10:40
<Hixie>
and <caption>
10:40
<Hixie>
they're the only "exit points" for tables
10:40
<Hixie>
i think i might drop <ol> <ul> <dl> <dialog> too for the same reason
10:41
<annevk>
Hixie, wasn't <p><strong>blah</strong></p> for <lede> or <lead>?
10:41
<Hixie>
annevk: it's not really marking importance, is it?
10:41
<Hixie>
actually <figure> should probably be taken out too
10:42
<Hixie>
it has a whole other set of issues
10:42
<krijnh>
Can an article contain an article?
10:42
krijnh
is very efficient - 2 no's already
10:43
<annevk>
yeah, for comments on an article
10:43
hsivonen
hopes the table will eventually generalize into a handful of content models to teach and to type into a schema
10:43
<Hixie>
krijnh: yes, blog comments are articles in articles
10:43
<Hixie>
hsivonen: i hope so too
10:44
<annevk>
Hixie, yeah, fair enough, just thought that was the idea earlier on
10:45
<othermaciej>
Hixie: does it make sense to have <td> on the horizontal axis?
10:45
<othermaciej>
lots of things can indirectly contain it but not directly
10:45
<othermaciej>
(similarly for other table structure)
10:45
<Hixie>
othermaciej: you mean the vertical axis? the question is "can elements in the left hand column contain elements on the top row"
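The grid being filled in can be thought of as a lookup from container (left-hand column) to contained element (top row), applied to indirect nesting. A minimal sketch, assuming the markup is parsed as XML; the two entries shown reflect answers floated earlier in this discussion (articles may nest, captions should not contain `<address>`), while the data layout and the name `violations` are invented for the example.

```python
import xml.etree.ElementTree as ET

# MATRIX[container][contained] -> True (allowed) / False (disallowed).
# Missing cells mean "not decided yet" and are ignored by the check.
MATRIX = {
    "article": {"article": True},
    "caption": {"address": False},
}

def violations(element, ancestors=()):
    """Return (ancestor, descendant) pairs that the matrix marks as disallowed."""
    found = []
    for anc in ancestors:
        if MATRIX.get(anc, {}).get(element.tag) is False:
            found.append((anc, element.tag))
    for child in element:
        # Pass the whole ancestor chain down so indirect nesting is checked.
        found.extend(violations(child, ancestors + (element.tag,)))
    return found
```

Because the whole ancestor chain is consulted, `<caption><b><address/></b></caption>` is flagged even though `<address>` is not a direct child of `<caption>`.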
10:45
<Hixie>
you mean on the top row?
10:45
<othermaciej>
yes
10:46
<Hixie>
i agree that we should remove all the inner table elements from the top row
10:46
<Hixie>
feel free to do so
10:46
<othermaciej>
ok
10:48
<othermaciej>
I will remove <html> <head> and <body> from the top row for similar reasons
10:48
<annevk>
bah, the CSS WG discussions should really be public
10:48
<Hixie>
i'm nuking option and optgroup from the first column since they're special
10:49
<Hixie>
select too, same reason
10:49
<krijnh>
Hixie: can't <details> contain additional contact information?
10:49
<othermaciej>
whoah, what's <nest>?
10:49
<Hixie>
krijnh: like <details> <legend> Contact information for this page </legend> <address> ... </address> </details> ?
10:49
<krijnh>
Yeah
10:49
<Hixie>
othermaciej: part of the data template feature. nuke it, it's seriously special.
10:49
<Hixie>
othermaciej: same with <datatemplate> and <rule>
10:50
<Hixie>
othermaciej: i'm probably gonna nuke that whole section anyway, in favour of some apis to make it easier to make your own templating language and graft it onto html
10:50
<Hixie>
othermaciej: there are too many different ways of doing templates, each with their own pros and cons, to really anoint any one model
10:51
<Hixie>
krijnh: sounds plausible...
10:51
<othermaciej>
removed
10:51
<krijnh>
So then an article can be put in a details as well
10:51
<othermaciej>
looking for other things too special to be worth gridding
10:51
<krijnh>
<details><legend>More articles on this subject</legend><article>foo</article><article>bar</article></details>
10:52
<Hixie>
seems reasonable
10:52
<krijnh>
Or would that be wrong use?
10:52
<Hixie>
it seems unexpected use, but i don't see that it would be wrong... dunno
10:53
<othermaciej>
base, link, meta, noscript, optgroup, option, param, source, style, script, title
10:53
<othermaciej>
from the first row
10:53
<othermaciej>
any objections?
10:53
<krijnh>
noscript ?
10:54
<othermaciej>
isn't noscript allowed anywhere and therefore not worth mentioning?
10:54
<othermaciej>
it does not seem affected by block/inline considerations
10:54
<othermaciej>
but I'll not remove it for now
10:55
<Hixie>
leave noscript for now, it's a weird case
10:56
<Hixie>
but the others can go for sure
10:56
<Hixie>
actually <style> might be worth leaving
10:57
<annevk>
<area> can be nuked from the top row probably
10:58
<Hixie>
why?
10:58
<Hixie>
<area>'s a tough one
10:58
<Hixie>
when it's allowed is unclear to me
10:58
<annevk>
only as descendant of <map>
10:58
<annevk>
or maybe only as child of <map>
10:59
<Hixie>
we're allowing things like <map><area/><area/></map> as well as things like <map><p>...<a/><area/>...</map>
10:59
<Hixie>
so far
10:59
<othermaciej>
I nuked <style> already, I can put it back if you really want it
11:00
<othermaciej>
(where <style scoped> is allowed is kind of interesting, but not really related to the core block/inline type issue)
11:00
<annevk>
Hixie, I'm not sure what the use of the latter is
11:01
<Hixie>
othermaciej: nuking is fine
11:01
<Hixie>
annevk: makes it easier to make sure you've got all your links and areas done together
11:01
<annevk>
isn't <area> a link?!
11:01
<annevk>
hmm
11:02
<annevk>
then again, I thought <map> was display:none, it isn't
11:02
<Hixie>
annevk: <area> is a link with a shape
11:02
<krijnh>
Hmm, could video contain section elements?
11:02
<annevk>
Hixie, per HTML4 so is <a>
11:03
<Hixie>
yeah but we dropped that long ago
11:03
<annevk>
and I'm not sure how that's relevant
11:03
<Hixie>
krijnh: i'd say yes, if the fallback is very detailed :-)
11:03
<Hixie>
annevk: ?
11:03
<krijnh>
Hixie: yeah, or has contact information?
11:03
<Hixie>
krijnh: right
11:04
<Hixie>
i could see one doing <video> <address> For a transcript, contact ...</address> </video>
11:04
<Hixie>
i guess
11:04
<krijnh>
Cause you put a ? at video->address :)
11:07
<Hixie>
fixed :-)
11:07
<annevk>
Hixie, I'm not sure how <area> being a shaped link is relevant
11:07
<Hixie>
annevk: when you're doing an image map, you want to provide both the shape link <area> and the fallback link <a>
11:07
<Hixie>
for each link
11:07
<Hixie>
easiest to do if you have them all together
11:08
<Hixie>
rather than as two blocks
11:08
<annevk>
isn't the algorithm for fallback to use <area>?
11:08
<annevk>
or is this in the case <map> is not supported?
11:09
<Hixie>
if <a>s aren't provided the UA uses <area>, but that kinda sucks compared to providing a custom fallback
11:09
<annevk>
that's not at all how the image map algorithm for fallback works
11:10
<Teratogen>
bring ogg back!
11:18
<hsivonen>
coming up with suggested definition of <cite> is hard
11:18
<hsivonen>
I'm bad at writing weasel words
11:18
<Dashiva>
hehe
11:21
<Hixie>
annevk: really?
11:22
<Hixie>
that was my intention...
11:22
<Hixie>
Teratogen: :-)
11:24
<Hixie>
hm, should <map> allow <section> in it then? or <article>?
11:24
<Hixie>
or <aside> or <address>? hmm
11:26
<annevk>
just <area> imo
11:27
<hsivonen>
is the diveintomark.org favicon one of the allegedly indecent IceWeasel icon suggestions?
11:29
<Hixie>
krijnh: ogg :-P
11:29
<krijnh>
:p
11:30
<krijnh>
Bring it back! ;)
11:32
<hsivonen>
Hixie: so I didn't send email about block/inline, because I dumped my points here instead
11:33
<Hixie>
k
11:33
<Hixie>
i think i mostly agreed with your points anyway
11:33
<Hixie>
it's turning out that there are some edge cases that i hadn't really thought of
11:33
<Hixie>
like should <nav> be able to contain <article>
11:34
<annevk>
no
11:35
hdh
imagines people use narration to guide users around the site
11:36
<krijnh>
<body><nav><section><article>An interesting article with lots and lots of interesting links</article></section></nav></body>
11:37
<krijnh>
Hixie: is that conditional formatting?
11:37
<krijnh>
Yes, cool
11:38
<Hixie>
yeah
11:39
<Hixie>
ok i'm gonna go sleep, i can't concentrate anymore
11:39
<Hixie>
feel free to continue editing :-)
11:39
<Hixie>
thanks for the help btw
11:39
<Hixie>
really helpful
11:39
<Hixie>
i'll try to continue this tomorrow
11:40
<hsivonen>
it is "interesting" how <article> and friends that would be "easy" to implement are less supported by browsers than "hard" stuff like <canvas> and <video>
11:41
<annevk>
it's not entirely clear what it would mean for rendering to support <section> and other sectioning elements
11:42
<krijnh>
It's also a dull feature to sell :)
11:42
<Camaban>
hsivonen: video and canvas are 'cool' and 'new', while article doesn't 'do' anything? :)
11:43
<annevk>
<article> together with <h1>-<h6> can in theory affect rendering, but it's unclear how
11:43
<hsivonen>
krijnh: yeah. it makes me wonder if using <article> instead of <div class='article'> will ever be compelling for authors
11:44
<annevk>
I think if CSS gave you something like :heading(2) to style all level two headers that might work
11:46
<krijnh>
hsivonen: do you think authors would use the new elements already, if IE/Fx didn't close unknown block level elements immediately?
11:46
<krijnh>
That's probably easy to fix behavior, but I don't think it would change anything
11:47
<othermaciej>
:heading(n) might be handy for implementing the default rendering of <h[1-6]> as well
11:48
<othermaciej>
hsivonen: we could easily support default rendering as a block for all the new semantic block elements
11:48
<othermaciej>
hsivonen: supporting headings styled in accordance with the outline algorithm would be hard and the spec doesn't say how to do that yet
11:49
<othermaciej>
(or whether)
11:49
<othermaciej>
I will tell you that we're interested in supporting any new elements and attributes that seem like low hanging fruit in WebKit in the fairly near future
11:52
<othermaciej>
(that would basically be irrelevant="", sectioning elements, dialog, m if someone decides it should have some special default style
11:52
<othermaciej>
)
11:52
<othermaciej>
figure would also be low-hanging fruit if not for the <legend> issue
11:54
<hsivonen>
krijnh: with the IE/Firefox situation, using the new elements is not worthwhile ATM from the author POV
11:55
<hsivonen>
othermaciej: Opera Mobile has a nice "scroll to content" feature; it would be cool to have that in WebKit, too, with both taking <article> into account
11:56
<hsivonen>
actually, that's the only UA-side semantic treatment of <article> that I can come up with at the moment
11:56
<hsivonen>
skipping to content whether on mobile or in an aural browsing setup
11:58
<othermaciej>
yeah, supporting the new block-level elements would not have much value besides patriotism just yet
12:12
<annevk>
maybe [#heading=n]
12:12
<annevk>
at some point there was this idea of separating intrinsic attributes of pseudo-classes, but maybe that point is moot
12:16
<othermaciej>
"intrinsic attributes"?
12:17
<annevk>
td[#col=2] was another one
12:18
<hsivonen>
anyway, it seems that the whole <section> thing hinges upon a selector with reasonable perf and implementation characteristics
12:19
<krijnh>
Something for CSS5?
12:21
<othermaciej>
I'm not sure why that's better than pseudo-classes
12:22
<annevk>
me neither, I guess the idea might have been dropped already
12:23
<krijnh>
Why was it an idea in the first place then?
12:24
<hsivonen>
Is it reasonable to expect the CSS WG to have cycles to look into an outline-dependent selector any time soon?
12:25
<hsivonen>
they seem to have a lot on their plate even without new HTML5 needs
12:27
<othermaciej>
this is the WG that's bringing us ascii art layout
12:27
<othermaciej>
there doesn't have to be a reason
12:32
<annevk>
it should be pretty easy to draft a specific selector proposal for heading:
12:32
<hsivonen>
hmm. the TAG is going the way of SGML: the more common concepts have the longer names: resource vs. resource representation
12:32
<annevk>
especially as CSS only needs to define the syntax and say that it's up to languages to define when it actually matches
12:33
<hsivonen>
annevk: you still need a kind of rare person who groks CSS formatter internals well enough to assess the computational feasibility with dynamic DOM changes
12:34
<krijnh>
no
12:34
<krijnh>
no
12:34
<krijnh>
Oops :)
12:35
<annevk>
I can ask the guy who implemented selectors in Opera
12:35
<othermaciej>
yeah, I'm not sure the current html5 outline algorithm is computationally feasible for incremental rendering and dynamic DOM updates
12:35
<annevk>
I suppose other implementors have ways to find out themselves
12:37
<othermaciej>
it's written in terms of generating a hypothetical tree and walking it
12:37
<othermaciej>
but for selector matching you need something that's evaluated from the element point of view
12:39
<othermaciej>
so it's hard to tell if it can be efficient without doing a conversion to that type of algorithm first (and making sure it is actually equivalent)
12:40
<othermaciej>
it's not clear to me if changes outside an <hn> element can change its heading level
12:41
<othermaciej>
actually it's not that clear to me how the section tree affects heading levels
12:48
<annevk>
<section><section><h1> would be level 3 I think
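A minimal sketch of the "evaluated from the element point of view" form that othermaciej asks for, under a loud assumption: the level is simply 1 plus the number of sectioning ancestors. This is not the HTML5 outline algorithm; it ignores the rank of <h2>-<h6>, implied sections, and sibling headings. The `Node` class, `SECTIONING` set, and `heading_level` function are hypothetical names for illustration only. Under that assumption it reproduces annevk's reading of <section><section><h1>.

```python
# Hypothetical sketch, NOT the spec's outline algorithm:
# heading level = 1 + number of sectioning ancestors.
SECTIONING = {"section", "article", "nav", "aside"}


class Node:
    """Bare-bones element with a tag name and a parent pointer."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent


def heading_level(node):
    """Walk up from the heading and count sectioning ancestors."""
    level = 1
    ancestor = node.parent
    while ancestor is not None:
        if ancestor.tag in SECTIONING:
            level += 1
        ancestor = ancestor.parent
    return level


body = Node("body")
outer = Node("section", body)
inner = Node("section", outer)
h1 = Node("h1", inner)
print(heading_level(h1))  # 3
```

Even in this simplified model, a change outside the heading (reparenting an ancestor section) changes the result, which is the kind of dependency a selector implementation would have to track.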
13:03
<hasather>
The Ogg debate is like a hydra. You read one thread, and two others pop up during the time
13:11
<Dashiva>
And over half the mails are by the same two people
13:13
<Dashiva>
I'm strongly tempted to send a mail saying "So-called non-commercial entities have the option to pay for licences, they just choose not to."
13:16
<Philip`>
Open source entities don't have that option, since they couldn't distribute their code in a way that other people could modify and use
13:18
<Dashiva>
But that's just an arbitrary restriction they place on themselves. We're here to get interoperability, not to run errands for other organizations
13:18
<Philip`>
(unless they get a licence which allows everybody royalty-free usage of the patents for any purpose)
13:18
<Philip`>
(which is what Theora got, so it's not totally impossible)
13:28
<hsivonen>
I wonder what MPEG-LA estimates as the expected value of the overall H.264 licensing income over the lifetime of the patents discounted to present value
13:33
Philip`
wonders if it's worth trying some basic comparisons of non-state-of-the-art video codecs
13:36
<othermaciej>
if you know how to do such a thing then sure
13:36
<Philip`>
I don't know how to do it especially well, hence the "basic" :-)
13:37
<othermaciej>
might be interesting to try this shootout with H.261, MPEG-2, H.263, etc: http://osnews.com/story.php/19019/Theora-vs-h.264/
13:41
<hsivonen>
something that hasn't been explored: it is important to have RF decoding and *an* RF encoder, but it doesn't follow that there could not exist non-RF state-of-the-art encoders or hardware decoders
13:41
<hsivonen>
back when *compressed* GIF encoding was encumbered, there were RF decoders
13:42
<Philip`>
http://www.doom9.org/index.html?/codecs-quali-105-1.htm has Theora, but it's a few years old now and I don't know how much has changed
13:42
<hsivonen>
and RF encoders that sucked badly (i.e. produced a stream that decoded as lzw but was not compressed)
13:43
<Philip`>
(Also, DVDs are very different quality to what you'd publish on the web)
13:44
<othermaciej>
MPEG-LA could probably limit decoder revenue to hardware implementations, or mobile devices, or both, and not lose significant revenue
13:45
<othermaciej>
s/revenue/royalties/
13:45
<hsivonen>
mobile devices is still a field-of-use restriction that would not go well with Open Source
13:46
<othermaciej>
realistically nearly all handsets on which you could run interesting software have paid the license fee
13:54
<alp>
othermaciej: mobile distributors of webkit/gtk+ and webkit/qt at least would probably be happy with a royalty-free codec from what i gather but asking for licensing fees is pushing it. maybe the situation is different for other free browsers though
13:54
<othermaciej>
I don't think you need to pay royalties for the browser component if the phone already has a hardware or software decoder
13:55
<othermaciej>
which many do
13:55
<othermaciej>
could be wrong though, it's hard to understand the MPEG-LA's documents
14:00
<alp>
for various reasons it may be necessary to ship the codec with the browser (say if the hardware only allows video overlay and you need to support more complex rendering features, or if the hardware just doesn't support the codec at all in the first place)
14:01
<Philip`>
It's fun how FFmpeg is happy to be told to output an Ogg file, but actually it silently doesn't support it and outputs something totally different
14:02
<othermaciej>
fair enough, I don't really know how this works
14:06
<Philip`>
Hmm, H.261 doesn't support 320x240 :-(
14:06
<othermaciej>
is that too small or too big for it?
14:06
<othermaciej>
(I guess that is a problem in itself)
14:08
<Philip`>
"Valid sizes are 176x144, 352x288"
14:08
<Philip`>
Great flexibility!
14:09
<Philip`>
Also, it looks horrible quality
14:10
<Lachy_>
wow, that's terrible
14:10
<Lachy_>
what about motion jpeg or any of the other older alternatives?
14:11
<Lachy_>
motion jpeg just sounds like it would have huge file sizes
14:13
<annevk>
motion jpeg is not a serious option
14:13
<Philip`>
MJPEG doesn't look that awful compared to other codecs
14:13
<maikmerten>
H.261 didn't even expire
14:14
<maikmerten>
it's a 1990 standard
14:15
<Philip`>
(With my current utterly rubbish test setup, only looking at 5 seconds of video, it's quite better quality than H.261 but twice the filesize and I guess I need to fiddle with the encoder settings to get a fair comparison)
14:15
<maikmerten>
MJPEG doesn't stand a chance against even H.261
14:15
<maikmerten>
no motion compensation, no inter-frame coding...
14:15
<Philip`>
It can do more than two different frame sizes, though :-)
14:15
<maikmerten>
I'll give it that ;)
14:16
<maikmerten>
plus there is no MJPEG standard IIRC
14:16
<maikmerten>
there are a lot of codecs claiming "MJPEG"
14:16
<Philip`>
and it can do variable bitrates, which I assume H.261 can't since it was designed for streaming over ISDN
14:17
<maikmerten>
H.261 should be able to do VBR
14:17
<maikmerten>
thanks to the very nature of video compression, codecs are VBR
14:18
<maikmerten>
and it's actually a lot of work to get them to do CBR
14:18
<maikmerten>
(bitrate reservoirs etc. etc.)
14:18
MikeSmith
finally gets to reading Hixie responses to messages he sent about <term> and <xref> stuff
14:19
MikeSmith
goes to re-read spec for <i>
14:19
<maikmerten>
(well, okay, granted, it's very possible to develop codecs with a fixed bitrate and CBR)
14:19
<Philip`>
maikmerten: If you want to do e.g. videoconferencing over an ISDN channel, that'd have to be CBR since you can't do buffering, and I thought that was roughly what H.261 was designed for
14:20
<maikmerten>
(but it's difficult to do e.g. with DCT based codecs)
14:20
<maikmerten>
why wouldn't you be allowed to send less data than the line allows you to send?
14:20
<Philip`>
(but you know more about this than I do so I'm probably wrong :-) )
14:20
<maikmerten>
in worst case just pad with zeros ;)
14:20
<Philip`>
You could send less but that'd be a waste of resources
14:21
<Philip`>
since you're not going to be able to reuse the spare bandwidth for anything else
14:21
<maikmerten>
(which, actually, is one way to get CBR if you manage to ceil the bitrate: Just fill up "unused" bits with crap)
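The padding trick described here can be sketched roughly as follows, assuming each encoded frame is a `bytes` object and the encoder has already been capped so no frame exceeds the per-frame budget. `pad_to_cbr` and `frame_budget` are hypothetical names; real bitstreams use their own stuffing mechanisms rather than raw zero bytes.

```python
def pad_to_cbr(frames, frame_budget):
    """Pad every encoded frame with zero bytes up to frame_budget bytes."""
    padded = []
    for data in frames:
        if len(data) > frame_budget:
            # The ceiling has to be enforced by the encoder beforehand.
            raise ValueError("frame exceeds the bitrate ceiling")
        padded.append(data + b"\x00" * (frame_budget - len(data)))
    return padded


# Three frames of varying size, all padded to 4 bytes each.
frames = [b"\x01\x02\x03", b"\x04", b""]
out = pad_to_cbr(frames, 4)
print([len(f) for f in out])  # [4, 4, 4]
```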
14:21
<Philip`>
so you might as well send higher quality images instead of lowering the bitrate
14:21
<maikmerten>
well, albeit it's good to use up all the bandwidth you may not be able to actually deliver
14:22
<maikmerten>
think compressing a perfect black frame
14:22
<maikmerten>
DCT will give you a nice zero run
14:22
<maikmerten>
near-perfect compression
14:22
<maikmerten>
(only protocol overhead)
14:22
<Philip`>
You could send 64Kb/s of a really really precise shade of black
14:22
<maikmerten>
so it's actually *hard* to guarantee you're using up all bandwidth ;)
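The black-frame point can be checked with a small, unoptimised 2D DCT-II over one 8x8 block. This is only a sketch: `dct_2d` is a hypothetical helper, and it assumes unshifted sample values of 0 for black. A codec that level-shifts its samples first would get a non-zero DC term, but the AC terms of a flat block would still be zero.

```python
import math


def dct_2d(block):
    """Orthonormal 2D DCT-II of a square block (list of lists)."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


black = [[0] * 8 for _ in range(8)]
coeffs = dct_2d(black)
nonzero = sum(1 for row in coeffs for c in row if c != 0.0)
print(nonzero)  # 0
```

With every coefficient zero, an entropy coder has nothing but a run of zeros to encode, so the frame costs little more than its headers.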
14:22
<Philip`>
Fair enough :-)
14:22
Philip`
has to go for some minutes
14:23
<maikmerten>
well, you mostly can't be any more precise than (0,0,0) with black ;)
14:23
<Philip`>
You can do (0.00000000, 0.00000000, 0.00000000) :-)
14:23
<maikmerten>
basically all video codecs are integer based
14:23
Philip`
is gone
14:24
<maikmerten>
plus 0.0000000 would just be another case of "fill up with crap" ;)
14:38
<Philip`>
maikmerten: Those extra significant figures are important in physics - something that I measure as 0.0kg is probably much heavier than something I measure as 0.000000kg :-)
14:39
<maikmerten>
well, only if you try to give an idea with what sort of precision you were working along with the value
14:39
<maikmerten>
(which is often done)
14:40
<maikmerten>
(but for christ's (or any other religious figure's) sake - zero shall be zero ;-) )
14:40
<maikmerten>
actually my physics teacher always got semi-angry if we didn't compact numerical values ;)
14:41
<maikmerten>
(0.0340000 would be 0.034)
14:42
<Philip`>
Eww - I've always been taught that significant figures are significant, and you can't just add them or drop them off whenever you fancy
14:43
<maikmerten>
well, you *have* to specify what precision is used
14:43
<Philip`>
and the worst thing you can possibly do is write 3½ instead of 3.5, because "½" implies some kind of mathematical precision that physics never has
14:43
Philip`
prefers Computer Science where everything is integers ;-)
14:43
<maikmerten>
aye
14:52
Philip`
wonders what typical internet video bitrate is
14:53
<maikmerten>
youtube used to serve 256 kbit/s h.263 with 64 kbit/s MP3
14:53
<maikmerten>
(the latter 22.05 kHz, mono - MP3 just is pretty far behind)
14:54
<Philip`>
Based on an extensive sample of two Youtube FLVs in my /tmp, they're 320kbit/s, so that sounds right
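The arithmetic behind "that sounds right", as a short sketch. It assumes 1 kbit = 1000 bits and ignores container overhead; `size_bytes` is a hypothetical helper.

```python
video_kbit = 256   # H.263 video stream, kbit/s
audio_kbit = 64    # MP3 audio stream, kbit/s
total_kbit = video_kbit + audio_kbit
print(total_kbit)  # 320


def size_bytes(kbit_per_s, seconds):
    """Approximate payload size in bytes for a given bitrate and duration."""
    return kbit_per_s * 1000 * seconds // 8


# One minute at the combined rate.
print(size_bytes(total_kbit, 60))  # 2400000
```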
14:55
<maikmerten>
yup, two is basically a perfect statistical base ;)
14:55
<maikmerten>
but they're batch encoded with same settings anyway
14:57
<hsivonen>
what press release is David Gerard referring to on whatwg@?
14:58
<annevk>
the one where Chris Double is quoted
14:58
<annevk>
about Opera and Mozilla pushing <video>
14:58
hsivonen
notes that dgerard talks about wikipedia and "we" in a way that assumes that everyone knows his wikipedia affiliation
14:58
<hsivonen>
annevk: ah the PC World article?
14:59
<annevk>
PC World just copied it
14:59
<annevk>
just like Washington Post and several others
14:59
<hsivonen>
hmm. something has gotten past my HTML5 radar
14:59
<Philip`>
Urgh, H.263 says "Valid sizes are 128x96, 176x144, 352x288, 704x576, and 1408x1152. Try H.263+."
14:59
Philip`
tries H.263+, which works
15:01
<hsivonen>
annevk: whose press release was it?
15:01
<annevk>
not sure what the original was
15:01
<maikmerten>
H.263+ is 1998
15:01
<maikmerten>
it's not close to expiring
15:02
<maikmerten>
H.263 itself is 1995/1996
15:03
<maikmerten>
it's more or less a direct predecessor to MPEG4 Part 2, IIRC
15:03
<Philip`>
What happened to H.262? :-)
15:04
<hsivonen>
annevk: I don't see a <video> press release from any of WHATWG, W3C, Mozilla or Opera
15:04
<maikmerten>
when MPEG ran out of puppies they began consuming future standards
15:04
<maikmerten>
look MPEG-3 ;)
15:04
<annevk>
hsivonen, there was no press release
15:04
<hsivonen>
annevk: ok
15:04
<annevk>
someone made an article that was reused all over the place (even localized)
15:04
<hsivonen>
ok
15:05
<maikmerten>
press releases are boring anyway.... "CEO of ..... says.... '...glad to be here and drive innovation..... customer satisfaction.... revenue... world domination'" - not sure I ever read a really interesting press release
15:06
<hdh>
the opera's dork release?
15:06
<hdh>
bork, maybe, the spelling escaped me
15:11
<hsivonen>
the silly thing about press releases is that all the substance has to be cast into quotations so that journalists can print them as quotations and avoid stating anything controversial in text that isn't quoted
15:11
<hsivonen>
so to write a press release, one has to first come up with the points, then massage them into soundbites and then figure out who agrees to be attributed with which soundbite
15:12
<Camaban>
so journos can misinterpret them, mis-quote them, and be selective about what they quote to create a story ;)
15:12
<Philip`>
(Hmph, I tell these things to do 256Kb/s but they end up doing 430Kb/s instead :-( )
15:13
<Philip`>
(Maybe they're just not designed to scale so low?)
15:16
<maikmerten>
many codecs have limits on how low bitrate can be
15:16
<hsivonen>
comparing codecs properly is very hard, because the fixed parameter is what code you ship to the client and the tricks the encoder does plays such a big role
15:16
<maikmerten>
tried with lower resolution?
15:16
<maikmerten>
aye
15:16
<hsivonen>
so it is quite possible that you end up testing encoders instead of decoding specs
15:17
<hsivonen>
Philip`: are you testing ffmpeg encoders against each other?
15:17
<Philip`>
Testing just decoding specs isn't very useful in practice
15:17
<maikmerten>
well, at least for extremely old codecs "what we have now is as good as it'll ever get"
15:17
<Philip`>
since people will have to encode things, using what's available
15:17
<hsivonen>
Philip`: it isn't but testing a bad encoder or a good encoder with bad params isn't, either
15:18
<hsivonen>
I'd love to have a cheat sheet of QuickTime/H.264, x264, XiphQT and ffmpeg2theora tried and true magic params
15:19
<Philip`>
hsivonen: Yep, I'm just looking at FFmpeg for now, which is far from ideal and I won't claim this is an especially good comparison :-)
15:19
<hsivonen>
since the stuff other people encode tends to look better than what I get with naïve dabbling
15:20
<maikmerten>
most frontends e.g. don't expose all coding options
15:20
<maikmerten>
like in case of Theora I usually end up altering keyframe interval, the noise threshold or even the complete set of quantization tables
15:20
<hsivonen>
maikmerten: or worse, they do expose a zillion options that let you shoot yourself in the foot by overstepping your AVC profile bounds
15:21
<maikmerten>
well, that is a genuine opportunity for formats with profiles ;)
15:21
hsivonen
thinks AVC profiles and levels are an awfully bad idea from the interop POV
15:21
<maikmerten>
well, the argument was that at least it could be made sure restricted devices could at least reliably support *something*
15:22
<maikmerten>
but I feel this has gone out of hand a bit
15:23
<maikmerten>
it's sometimes not quite easy to e.g. come up with a file that happens to play fine on both Playstation Portable and the iPod and some mobile phone etc. etc.
15:24
<hsivonen>
Google Video has the iPod/PSP magic figured out but afaik they aren't sharing it
15:24
<maikmerten>
especially for extremely sophisticated codecs it can make sense to have profiles that drop coding schemes that (wild example) increase CPU usage by 50% but only give 5% coding gain
15:25
Philip`
can't get below 1300Kb/s with MJPEG
15:27
<MikeSmith>
hsivonen, annevk - I believe the source for the "Mozilla, Opera Want to Make Video on the Web Easier" article was Jeremy Kirk of IDG
15:27
<MikeSmith>
pcworld article has the correct byline at least
15:28
<MikeSmith>
it was not a press release from Opera or Mozilla or whoever
15:28
<annevk>
yeah, I recall reading those names
15:31
<hsivonen>
MikeSmith: ok
15:53
<Philip`>
hsivonen: Would there be any chance of your HTML Parser including a brief summary of the changes between releases?
15:56
<hsivonen>
Philip`: perhaps next time.
15:56
<hsivonen>
Philip`: this time the main difference was Mavenization
15:58
<Philip`>
hsivonen: Okay - it's just useful to know if e.g. the main difference is something like Mavenization that I don't care about, or if it's important bug fixes and I should bother updating
15:58
<Philip`>
but since "updating" involves copying one file over another, it's not a significant issue at all :-)
16:01
<hsivonen>
Philip`: I can't remember if there was something else as well
16:01
hsivonen
looks at logs
16:03
<hsivonen>
Philip`: there was also a bug fix in case you run SAX Tree without a Locator
16:05
<Philip`>
hsivonen: Okay, thanks
16:06
<hsivonen>
Philip`: also, I eliminated a bogus import that referenced a Sun-internal class and caused badness
16:06
<hsivonen>
Philip`: that's about it
16:07
<annevk>
hsivonen, see www-style for media queries
16:07
annevk
tries to fix stuff
16:08
Philip`
wonders what "pseudo-legal" really means
16:10
<hsivonen>
Philip`: my guess is that it means doing stuff that is not legal for a commercial entity to do in the United States but that is legal in e.g. Sweden or Hungary
16:13
<hsivonen>
annevk: aargh. I didn't realize there were comments, too
16:15
<annevk>
escapes, comments, error handling of syntax errors
17:17
gsnedders
probably shouldn't actually even look at the emails
17:18
<gsnedders>
woah.
17:18
<gsnedders>
102 in whatwg alone
17:19
<annevk>
November 2007: 110 e-mails
17:19
<annevk>
December 2007: 208 e-mails so far
17:30
gsnedders
starts writing a reply
17:30
gsnedders
apologises and closes it
17:31
<Philip`>
gsnedders: If it's like your last two messages, it'll be sucked into my spam folder, so I won't even notice :-)
17:31
<gsnedders>
I just think there's no point.
17:31
<gsnedders>
There are formats with no valid patents that cover them (guaranteed, as they are too old) — the same can't be said for Theora
17:32
<gsnedders>
My preference is currently H.261/Vorbis (in some container, dunno what)
17:36
<annevk>
it's better to wait a week like I said, because then reports from the video workshop will be in to better inform us what's going on
17:37
<doublec>
anyone here at the workshop right now (apart from me?)
17:38
<gsnedders>
annevk: see what I wrote to you last night?
17:42
<Philip`>
gsnedders: H.261 is limited to 176x144 and 352x288, which seems a bit rubbish
17:42
<annevk>
gsnedders, I think so, looked like a start
17:42
gsnedders
doesn't remember it being limited to specific resolutions
17:42
<gsnedders>
oh well.
17:42
<gsnedders>
MJPEG anyone? :P
17:43
<Philip`>
gsnedders: FFmpeg doesn't support other sizes
17:43
<gsnedders>
Philip`: wikipedia agrees with you
17:43
<Philip`>
H.263 seemingly supports the five sizes on http://en.wikipedia.org/wiki/Common_Intermediate_Format
17:43
<maikmerten>
MJPEG is even worse than GIF ;)
17:43
<Philip`>
(which all have a stupid naming convention)
17:43
<maikmerten>
GIF can at least say "keep this part unchanged from the last frame" ;)
17:44
<maikmerten>
MJPEG just codes everything, every frame
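The difference maikmerten describes (coding only what changed versus coding every frame in full) can be illustrated with a toy frame-delta sketch. This is not the GIF format's actual encoding and not any real codec; frames are assumed to be equal-sized lists of rows, and `changed_pixels` and `apply_delta` are hypothetical names.

```python
def changed_pixels(prev, curr):
    """Return {(row, col): value} for pixels that differ from the previous frame."""
    return {(r, c): v
            for r, row in enumerate(curr)
            for c, v in enumerate(row)
            if prev[r][c] != v}


def apply_delta(prev, delta):
    """Rebuild the current frame from the previous frame plus the delta."""
    frame = [list(row) for row in prev]
    for (r, c), v in delta.items():
        frame[r][c] = v
    return frame


prev = [[0, 0], [0, 0]]
curr = [[0, 0], [0, 7]]
delta = changed_pixels(prev, curr)
print(delta)  # {(1, 1): 7}
```

An intra-only scheme would store all four pixels of `curr`; the delta stores one.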
17:44
<Philip`>
maikmerten: But keyframes in GIF are gigantic :-)
17:44
<maikmerten>
H.263 is a mid-nineties standard
17:44
<maikmerten>
won't expire any time soon
17:44
<maikmerten>
Philip`, sure, GIF is horrid
17:44
<maikmerten>
I wasn't completely serious about GIF
17:44
<doublec>
maikmerten, so you suggest animated gif's as the baseline ;)
17:45
<maikmerten>
but GIF at least is able to exploit temporal redundancy ;)
17:45
<gsnedders>
I wasn't completely serious about MJPEG :P
17:45
<maikmerten>
doublec, well, would also save some implementation effort, right? ;)
17:45
<doublec>
absolutely :)
17:45
<maikmerten>
ah, good
17:46
<maikmerten>
having like 500 outdated and underperforming-till-it's-no-fun codecs as baseline for sure isn't desirable ;)
17:46
<Philip`>
Couldn't you extend JPEG to proper 3D (2D+time, making use of redundancy in all directions) by just using a 3D DCT or something? :-)
17:46
<maikmerten>
Philip`, doing anything clever would again make you target to submarine patents
17:47
<maikmerten>
because you'd effectively develop a new codec
17:49
<maikmerten>
the only old old old codec I know that is really expired would be H.120 - 2 MBit/s video conferencing
17:49
<maikmerten>
oh joy
17:49
<maikmerten>
(it's from 1982)
17:49
<maikmerten>
(and no, I don't know a nowadays implementation)
17:49
<gsnedders>
H.261 at least is commonly shipped
17:50
<maikmerten>
1990
17:50
<maikmerten>
currently not old enough
17:50
<gsnedders>
H.261 is 1982
17:50
<gsnedders>
revised 1988
17:50
<maikmerten>
not to my knowledge
17:50
gsnedders
looks up
17:50
<Philip`>
I wonder about Bink
17:51
<gsnedders>
H.260 is that
17:51
<maikmerten>
H.261 is a 1990 ITU-T video coding standard originally designed for transmission over ISDN lines on which data rates are multiples of 64 kbit/s. It
17:51
<gsnedders>
H.261 is 1990
17:51
<maikmerten>
While H.261 was preceded in 1982 by H.120 [1][2] (which also underwent a revision in 1988 of some historic importance) as a digital video coding standard, H.261 was the first truly practical digital video coding standard (in terms of product support in significant quantities).
17:51
<maikmerten>
Wikipedia
17:51
<maikmerten>
^^ yeah, I know this is not the ultimate source
17:51
<maikmerten>
but H.261 in 1990 just makes sense from the history-point-of-view
17:52
<maikmerten>
it led to a direct line of successors to H.264
17:52
gsnedders
can't remember it
17:52
<gsnedders>
but there again, I wasn't yet alive in 1990 :P
17:58
<Teratogen>
bring back ogg!
17:59
<gsnedders>
Teratogen: _CAN YOU PLEASE MAKE A CONSTRUCTIVE COMMENT!?_
17:59
<Teratogen>
yes
17:59
<Teratogen>
BRING BACK OGG!
17:59
<gsnedders>
why?
17:59
<Teratogen>
because it's free!
18:00
<gsnedders>
So? What advantages does it have over, say, H.260 or Dirac?
18:00
<Teratogen>
freedom!
18:00
<maikmerten>
H.260 is oooooold beyond usefulness
18:00
<gsnedders>
All three are free (in terms of cost to license patents).
18:00
<Teratogen>
ogg totally rocks
18:00
<maikmerten>
Dirac is not finished yet and big players would still be scared about submarines
18:01
<Philip`>
If we just built web browsers on land instead of in the sea, submarines wouldn't be a problem at all
18:01
gsnedders
hugs Philip`
18:01
<maikmerten>
I second that
18:02
<gsnedders>
Ogg is no help if it does not achieve interoperability between all browsers.
18:02
<maikmerten>
their choice.
18:03
<maikmerten>
there *is* no less-than-20-years old codec they'd accept
18:03
<gsnedders>
I don't particularly care whose choice it is. I want a video format I can use in every browser.
18:03
<Philip`>
gsnedders: You can use FLV
18:04
<gsnedders>
silly Philip`. that works in Flash, not any browser.
18:04
<gsnedders>
:P
18:06
<Philip`>
Flash works in any browser, and it works now, and in a few years it'll still work in more installed browsers than <video> even if IE8/FF3/Opera9.5/Safari4 add support
18:06
<gsnedders>
I know, that's true.
18:06
<Philip`>
and it'll support VP6, which is better than H.263
18:06
<gsnedders>
Philip`: it doesn't run on browsers on IA-64!
18:08
<doublec>
or any new hardware or devices that come along
18:08
<doublec>
they'd have to rely on the flash vendor to port their software to it
18:08
<Philip`>
It doesn't run on Lynx either, and there's probably Lynx users than IA-64 users :-p
18:08
<Philip`>
s/probably/probably more/
18:08
<gsnedders>
Philip`: I dunno. probably more IA-64 users, but most won't run browsers on it :P
18:09
annevk
wonders when the Ogg discussion stops
18:10
<gsnedders>
annevk: Christmas, because everyone is away :)
18:10
<Philip`>
I wonder how much it cost to get Flash on Opera Wii
18:11
gsnedders
is glad we don't require 100% consensus on everything for REC
18:12
<gsnedders>
http://xkcd.com/356/ — you know the worst part? I actually am now stuck thinking about that.
18:13
<Philip`>
gsnedders: Just do a numerical simulation :-p
18:13
<gsnedders>
I've moved on.
18:13
<gsnedders>
Better things to waste my time with.
18:13
<gsnedders>
(where <video> is worse)
18:30
<gsnedders>
hmm… vital maths test tomorrow. do I revise (i.e., learn stuff I missed when I was ill which will lead me to fail :P) or work on HTTP parsing, or write on my blog?
19:10
<Philip`>
gsnedders: By the way, would you be interested in information about HTTP response headers in the wild? There's some data I've got already, and some other stuff would be easy to collect, but I have no idea if it'd be useful for anything at all
19:14
<maikmerten>
gargh, somehow my messages to whatwg go to the moderation queue first because I subscribed as <blabla>@gmail.com but apparently Google is "correcting" the sender address to <blabla>@googlemail.com
19:14
<maikmerten>
is there a way to get the list to accept that gmail.com "==" googlemail.com ?
19:15
<maikmerten>
(I really have to thank that brilliant german guy registering "GMail" as trademark and suing Google so all german users are "googlemail")
19:16
<gsnedders>
Philip`: yeah, sure. could you possibly drop it in an email to me?
19:16
<Philip`>
maikmerten: In Settings / Accounts / 'Send mail as', is that where it claims it's @gmail.com when actually it's not?
19:17
<maikmerten>
Philip`, I'll check
19:17
<maikmerten>
Philip`, wasn't aware there was a way to specify these things
19:17
<maikmerten>
anyway, thanks for the tip
19:17
<Philip`>
(The WHATWG moderation queue never gets moderated, so anything sent there will be lost)
19:17
<gsnedders>
wow. the email on whatwg won't stop.
19:18
<maikmerten>
Philip`, same policy as here at xiph.org ;)
19:18
<zcorpan>
oook. i've now caught up with whatwg email (mostly by marking large chunks as read)
19:18
<zcorpan>
not much that was interesting, actually
19:19
<maikmerten>
"You cannot send e-mail from maikmerten⊙gc"
19:19
<maikmerten>
I'll just unsubscribe and resubscribe with googlemail.com
19:19
<Philip`>
That sounds irritating
19:22
<Philip`>
I (in the UK) get a "Google Mail" logo rather than "Gmail", but it seems to be happy with my account staying as @gmail.com
19:22
<Philip`>
so I'm not quite sure how all this stuff works
19:23
<deltab>
which MLM is it?
19:23
<gsnedders>
My older email is @gmail.com
19:23
<gsnedders>
my newer is @googlemail.com
19:24
<maikmerten>
Philip`, may indeed be the "GMail" is a registered trademark in germany thingie
19:24
<gsnedders>
maikmerten: there was a dispute in the UK too
19:24
<maikmerten>
oh, wasn't aware of that
19:24
<deltab>
I think mailman supports alternate sending addresses
19:25
<maikmerten>
too bad I registered "AMail", "BMail" etc. but stopped at "FMail" ;)
19:25
<Philip`>
gsnedders: http://www.cl.cam.ac.uk/~pjt47/misc/headers.xml.bz2 is the headers from ~15K pages, as parsed by HttpClient
19:26
<Philip`>
(which is the only data I've got at the moment)
19:27
<gsnedders>
Philip`: it's served as application/xml!
19:27
<Philip`>
Uh, I think I don't care that it's served as ap...
19:27
<Philip`>
Yeah, that
19:27
<Philip`>
Not my web server :-p
19:28
<deltab>
maikmerten: instead of un/resubscribing you should be able to use this: http://list.org/mailman-member/node22.html
19:28
gsnedders
remembers that SEE doesn't like large files
19:28
<maikmerten>
deltab, too late, I'm afraid.... :(
19:28
<maikmerten>
(and yeah, shame on me for not finding that myself)
19:29
<gsnedders>
Philip`: how's that done? just saving what it gives as headers?
19:31
<Philip`>
gsnedders: Yes, specifically via http://jakarta.apache.org/httpcomponents/httpclient-3.x/apidocs/org/apache/commons/httpclient/HttpMethod.html#getResponseHeaders()
19:31
<Philip`>
(i.e. using whatever kind of parsing code is provided by that)
19:32
<Philip`>
excluding anything that doesn't respond with 200
19:33
<Philip`>
and replacing control characters (except 9/A/D) with spaces
19:34
<Philip`>
(The output isn't necessarily grouped by uri, since it's processed multithreadedly)
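The clean-up step described above (replace control characters other than 9/A/D with spaces) can be sketched in Python as follows. This is only an illustration, not the code actually used to build the file: the function name `scrub` is made up here, and limiting the range to U+0000 through U+001F is an assumption. The likely motivation is that tab, LF and CR are the only characters below U+0020 that XML 1.0 permits.

```python
import re

# Control characters U+0000-U+001F, except tab (0x09), LF (0x0A) and CR (0x0D).
# Whether the original tool also touched 0x7F or the C1 range is not stated,
# so this sketch leaves those alone.
_CONTROL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def scrub(value: str) -> str:
    """Replace each disallowed control character with a single space."""
    return _CONTROL.sub(" ", value)


cleaned = scrub("a\x00b\tc\x0bd\r\n")
```

Tab, LF and CR pass through unchanged, so header values keep their original line structure.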
19:38
<gsnedders>
Philip`: you got any issues with me publishing anything based on it?
19:38
gsnedders
assumes not, seeing as he's just linked to the data in a publicly logged IRC channel
19:38
<Philip`>
gsnedders: No - it's all from public web sites anyway, and I didn't ask them for permission ;-)
19:43
gsnedders
goes back to fumbling around with Python
19:44
Philip`
can't see an obvious way to get unparsed headers from HttpClient
20:10
<gsnedders>
Philip`: xml.parsers.expat.ExpatError: not well-formed (invalid token): line 14534, column 56 :(
20:11
<gsnedders>
Philip`: ISO-8859-1 file, with encoding undefined it seems
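A small reproduction of this kind of failure, as a sketch: the sample documents and the helper name `parse_text` are invented for illustration. Without an encoding declaration expat assumes UTF-8, so a lone ISO-8859-1 byte such as 0xE9 is rejected as an invalid token, while the same bytes parse once the encoding is declared.

```python
import xml.parsers.expat


def parse_text(data: bytes) -> str:
    """Parse a complete document and return its concatenated character data."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = chunks.append
    parser.Parse(data, True)  # True marks this as the final chunk
    return "".join(chunks)


# Hypothetical samples: the same ISO-8859-1 bytes, without and with a declaration.
undeclared = b"<h>caf\xe9</h>"
declared = b'<?xml version="1.0" encoding="ISO-8859-1"?><h>caf\xe9</h>'

try:
    parse_text(undeclared)
    failed = False
except xml.parsers.expat.ExpatError:
    failed = True  # 0xE9 followed by "<" is not valid UTF-8

text = parse_text(declared)
```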
20:11
<Philip`>
gsnedders: Argh
20:12
<Philip`>
It was correct in my initial XML file, but it looks like xml_grep messes it up
20:13
<Philip`>
which is odd since it claims to be doing UTF-8
20:13
<hsivonen>
any recommendable JavaScript plug-in for Eclipse?
20:14
<roc>
The Aptana stuff is supposed to be good
20:14
<Philip`>
and if I set "--encoding utf-8" then it removes all linebreaks
20:15
gsnedders
is failing at Python
20:17
<gsnedders>
I'm getting a KeyError with self.headers[name] = [{"name": name, "uri": uri, "value": value}]
20:19
<Philip`>
gsnedders: Updated http://www.cl.cam.ac.uk/~pjt47/misc/headers.xml.bz2 so it should hopefully be utf-8 now
20:19
<Philip`>
Also I think I removed the <file> element
20:20
gsnedders
is just using getElementsByTagName() anyway
20:20
<Philip`>
gsnedders: I thought it only gave KeyError when trying to read a non-present key
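For reference, a sketch of the behaviour being described, using a plain `dict` and made-up data (the real `self.headers` object is not shown in the log, so this does not explain the error reported above): assigning to a key never raises `KeyError`, whereas reading or appending to a key that was never set does.

```python
from collections import defaultdict

headers = {}
name = "Server"
record = {"name": name, "uri": "http://example.org/", "value": "Apache"}

# Assignment creates the key; a plain dict does not raise KeyError here.
headers[name] = [record]

# Looking up a key that was never set is what raises KeyError.
try:
    headers["Date"].append(record)
    raised = False
except KeyError:
    raised = True

# defaultdict(list) creates the list on first access, so append always works.
grouped = defaultdict(list)
grouped["Date"].append(record)
```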
20:21
<gsnedders>
Philip`: so did I.
20:21
<gsnedders>
"Raised when a mapping (dictionary) key is not found in the set of existing keys." — TFM
20:21
<Philip`>
About gEBTN: Ah, okay - I've been avoiding the DOM since my XML files are 200MB :-)
20:22
gsnedders
doesn't particularly care how long the scripts take to run
20:22
<gsnedders>
I'd be doing it in C if I did :P
20:22
Philip`
does care, because he's lazy and doesn't like waiting
20:24
<inimino>
gsnedders: self.headers['name'] perhaps?
20:24
<gsnedders>
inimino: no, name is a variable
20:25
<inimino>
oh, ok
20:25
<gsnedders>
(and set)
20:25
<gsnedders>
odd.
20:25
<gsnedders>
now with the new headers.xml it works
20:25
<Philip`>
That sounds impossible
20:26
<gsnedders>
totally real, though.
20:26
<gsnedders>
I love software :P
20:27
<Philip`>
I love only software that doesn't involve character encodings
20:27
<gsnedders>
Philip`: like what? :P
20:27
<Philip`>
Like anything that just uses integers :-)
20:27
<inimino>
Philip`: but not as character codes?
20:28
<gsnedders>
Philip`: but output? :P
20:28
<gsnedders>
Philip`: how do you encode those integers for display?
20:28
<Philip`>
inimino: Using integers for character codes is okay, as long as you never have to interpret those integers as characters :-)
20:29
<Philip`>
gsnedders: If you're only outputting and never inputting, then you don't have to care about encoding errors, because you just shove stuff through printf() and if the user gets garbage then it's their problem for not using ASCII
20:29
<gsnedders>
:D
20:29
<inimino>
heh
20:32
Philip`
's current work's only user interface is via Telnet outputting to a serial console in a virtual machine connected to the host through UDP and then passed through Perl and Python scripts
20:32
<Philip`>
so there's pretty much no chance of me getting character encodings straight, so I'm just sticking with ASCII there to save myself the pain
20:33
<Philip`>
(Actually, it's a Telnet server and a netcat client, so all the magic Telnet commands get printed out to the screen)
20:35
inimino
guesses there is a story behind doing it that way
20:37
<hsivonen>
roc: thanks. installing now
20:37
<Philip`>
inimino: I'm testing some networking stuff, so it has to be done in VMs, and then that's the best way I've found to collect the output from them
20:59
<hsivonen>
annevk: where does HTML5 allow target='' on <form>?
21:12
<gsnedders>
Philip`: ValueError: too many values to unpack — your file is too big for what I want :P
21:13
<Philip`>
gsnedders: Uh, that sounds like an odd reason to get ValueError
21:13
<gsnedders>
Philip`: well, there are over ~15k items in the dictionary I'm trying to iterate over
21:14
<Philip`>
I don't see why that would be a problem
21:15
<gsnedders>
Philip`: and your saying of ~15k is wrong. ~115k would've been closer :) (116945 FYI)
21:16
<Philip`>
It's ~15K unique documents, mostly with >1 header each
21:16
<gsnedders>
ah. ~15k documents.
21:16
<gsnedders>
that is what you said, actually.
21:19
<gsnedders>
This is starting to get really annoying.
21:19
<gsnedders>
http://mail.python.org/pipermail/python-list/2006-June/387414.html
21:19
<gsnedders>
hmm
21:20
<gsnedders>
the value of the dictionary is a list
21:20
<Philip`>
What do you mean by "the value of"?
21:21
<gsnedders>
the value of every key is a list.
21:22
<Philip`>
The bit where you said [{"name": ...}] is a list because of the []
21:22
<Philip`>
or do you mean something else?
21:23
<gsnedders>
{"foo": "bar"} where "foo" is the key and "bar" the value
21:24
Philip`
doesn't understand the problem
21:25
gsnedders
now understanding the problem writes a tiny exemplar
21:26
<gsnedders>
Philip`: http://pastebin.ca/813896 — make that work.
21:27
<hsivonen>
Lachy__: http://html5.lachy.id.au/ could be better if the form was seeded with an HTML5 skeleton document
21:27
<Philip`>
gsnedders: foo.items()
21:29
<Philip`>
because "for k in foo" iterates over keys, whereas "for (k, v) in foo.items()" iterates over (key, value) pairs
21:29
<gsnedders>
and for k, v in foo?
21:29
<Philip`>
(http://docs.python.org/lib/typesmapping.html shows most of the useful functions)
21:30
<hsivonen>
Lachy__: the validate html5 button works only once for me on http://html5.lachy.id.au/ in Firefox 2
21:30
<hsivonen>
Lachy__: I have to reload to make the button work again
21:30
<Philip`>
gsnedders: That will iterate over keys, and try to unpack each key into a (k,v) tuple, which will raise ValueError because your keys are strings and can't be unpacked
21:30
<gsnedders>
ah
21:31
<Philip`>
(It's like "for x in foo: k, v = x")
21:31
<Philip`>
(in terms of iterating over keys rather than keys+values)
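The difference explained above can be seen in a few lines. This is a minimal sketch with an invented dictionary (`foo` stands in for the one in the pastebin, whose contents are not reproduced in the log):

```python
# Hypothetical dictionary: header name -> list of records.
foo = {
    "Content-Type": [{"uri": "http://example.org/", "value": "text/html"}],
    "Server": [{"uri": "http://example.org/", "value": "Apache"}],
}

# Iterating the dict directly yields keys only.
keys = [k for k in foo]

# .items() yields (key, value) pairs, so tuple unpacking works.
pairs = [(k, len(v)) for k, v in foo.items()]

# Unpacking each key (a string longer than two characters) into two names fails.
try:
    for k, v in foo:
        pass
    error = None
except ValueError as exc:
    error = str(exc)
```

Note that a key of exactly two characters would unpack without error into its two letters, which makes this mistake easy to miss on small test data.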
21:31
<gsnedders>
ah
21:31
gsnedders
doesn't pretend to be anything but a python n00b
21:33
<Philip`>
It mostly makes sense when you can see which rules apply - it's not like Perl where magical context-sensitive things happen and you'll never understand unless you read that particular detail in the documentation and newsgroups :-)
21:36
<Philip`>
Hmm, I want to watch some video on the web tomorrow, but it's streaming Windows Media and I'm not sure how to handle that
21:36
<Lachy__>
hsivonen, the validate button works for me all the time, without reloading
21:38
<Lachy__>
hsivonen, send me an email about adding a template document to it, and I'll see what I can do when I get back from Linkoping on Sunday
21:39
<hsivonen>
Lachy__: ok
21:40
gsnedders
has kinda given up hope at actually passing the maths test tomorrow, having missed a couple of weeks, and barely knowing one section
21:41
<Lachy>
gsnedders, maths isn't too hard
21:41
<Lachy>
which particular maths topics are you covering this year?
21:43
<gsnedders>
Lachy: http://www.sqa.org.uk/files/nq/C10012.pdf (which I think gives enough detail :P)
21:43
<Philip`>
Aha, mplayer works
21:44
Philip`
wonders if anyone happens to know how to record and watch a stream simultaneously
21:44
<gsnedders>
can anyone explain the reasoning behind there being both plus/minus and minus/plus signs?
21:45
<gsnedders>
Lachy: it's the stuff on page 17 of the PDF that I've missed
21:45
<Philip`>
gsnedders: They're useful for e.g. "+/- x = - (-/+ x)"
21:46
<Philip`>
i.e. representing two versions of the equation, where one has a mixture of + and -, and the other has the +s and -s flipped
21:46
gsnedders
doesn't see how that helps
21:46
<Lachy>
gsnedders, on which page can I find one of these minus-plus signs?
21:46
<Lachy>
I've never seen one before
21:47
<Philip`>
You can't say "+/- x = - (+/- x)" because that would be interpreted as "+x = -(+x) and -x = -(-x)" which is untrue
21:47
<gsnedders>
Lachy: the same page (p17 of the PDF, labelled within itself as p16)
21:47
<gsnedders>
Philip`: ah. so then you have to take both from the same side!
21:47
<Philip`>
In that cos example, it means "cos(A+B) = blah - blah; and cos(A-B) = blah + blah"
21:48
<gsnedders>
yeah, that makes sense now
21:48
<Philip`>
gsnedders: Yep
21:48
<gsnedders>
(I've just about done the basics of the top two formulae)
21:48
<gsnedders>
and obviously the third is just rearranged
21:48
<Philip`>
(though sometimes +/- and -/+ are used in a context-sensitive way and don't actually work like that :-) )
21:48
<gsnedders>
(or rather, a rearranged copy of the above)
21:49
<Philip`>
The third/fourth are just taking A=B
21:49
<gsnedders>
Philip`: 15332 pages of HTTP headers, BTW
21:50
<gsnedders>
Philip`: I understand the first equality on the fourth, but not the second/third equalities
21:51
<gsnedders>
Philip`: are all the headers from accessing the page once, or not?
21:52
<Philip`>
cos^2 x + sin^2 x = 1
21:52
<Philip`>
...is the relevant fact that should be known
21:52
<gsnedders>
that more or less makes sense from a graph, yeah.
21:52
<Philip`>
so that gives the cos^2 A - sin^2 A = 2cos^2 A - 1 and suchlike
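Written out, the steps discussed above look roughly like this. The PDF's formula sheet is not reproduced in the log, so the layout is an assumption; the identities themselves are standard. The paired signs are read top-with-top and bottom-with-bottom.

```latex
\begin{align*}
\cos(A \pm B) &= \cos A \cos B \mp \sin A \sin B \\
\intertext{Taking $B = A$ in the $+$ case:}
\cos 2A &= \cos^2 A - \sin^2 A \\
\intertext{Substituting from $\cos^2 A + \sin^2 A = 1$:}
\cos 2A &= \cos^2 A - (1 - \cos^2 A) = 2\cos^2 A - 1 \\
\cos 2A &= (1 - \sin^2 A) - \sin^2 A = 1 - 2\sin^2 A
\end{align*}
```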
21:54
<Lachy>
there was a time about 10 years ago where I would have been able to do those trig equations. Now I just stare at them blankly
21:54
<jgraham>
IRC clearly needs better support for maths notation
21:54
<gsnedders>
Philip`: LaTeX!
21:54
<roc>
IRC should be XML
21:54
<roc>
then we could post our HTML examples directly
21:54
<roc>
and use MathML
21:55
<gsnedders>
peh. start with the basics. get a universally accepted character encoding on IRC :)
21:55
<Philip`>
gsnedders: Each page was GETed once, but the <header uri> in the output is the result of redirections, so it's possible that some pages redirected to the same location
21:55
<gsnedders>
kk
21:55
<roc>
and play XSS pranks on each other
21:55
<Hixie>
lordy what a lot of mail
21:55
<Lachy>
UTF-8 seems to be fairly widely accepted for IRC these days
21:55
<gsnedders>
Philip`: there's a page that claims to have 94 headers
21:55
<Philip`>
gsnedders: Maybe it'd be more helpful if I gave the original unique requested URI instead of the redirected result?
21:55
<Hixie>
half of this video mail has neither the word "video" nor the word "ogg" in it
21:55
<Hixie>
sheesh
21:55
<roc>
fear the wrath of Ogg!
21:56
<gsnedders>
Philip`: give the original URI for each request (i.e., a redirect has a different URI)
21:56
<jgraham>
Hixie: That's to make it hard to automatically redirect to /dev/null ;)
21:56
<Lachy>
I can't believe the whole ogg discussion is still going on, on far too many different lists
21:57
<Hixie>
what i'm amused by is that for every person sending 10 flames to one of the lists, i get a person e-mailing me privately telling me that they have my support and that they believe we're doing the right thing
21:57
<gsnedders>
I'm not totally sure whether it was the right thing to do.
21:57
<gsnedders>
Though I'm sure plenty of the people on the mailing list think I agree with you :)
21:58
hsivonen
finally replies to an ogg email
21:58
<gsnedders>
Philip`: least headers is 3
21:59
<hsivonen>
evidently, rudd-o hadn't read the spec before he started his slashdot campaign
21:59
<gsnedders>
Philip`: median is 7, mean is 8
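Figures like these can be computed along the following lines. This is a sketch, not the script actually used: the `<header uri=...>` element comes from the discussion, but the root element, the `name` and `value` attributes, and the sample data are assumptions. Counting by URI with a `Counter` means the result does not depend on the elements being grouped by URI.

```python
import statistics
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample in the assumed shape of the data file.
SAMPLE = """<headers>
  <header uri="http://a.example/" name="Server" value="Apache"/>
  <header uri="http://b.example/" name="Server" value="IIS"/>
  <header uri="http://a.example/" name="Content-Type" value="text/html"/>
  <header uri="http://a.example/" name="Date" value="x"/>
  <header uri="http://b.example/" name="Date" value="x"/>
  <header uri="http://c.example/" name="Date" value="x"/>
</headers>"""


def headers_per_page(xml_text):
    """Count <header> elements per uri, regardless of document order."""
    root = ET.fromstring(xml_text)
    return Counter(el.get("uri") for el in root.iter("header"))


counts = headers_per_page(SAMPLE)
per_page = sorted(counts.values())
summary = {
    "pages": len(counts),
    "min": per_page[0],
    "median": statistics.median(per_page),
    "mean": statistics.mean(per_page),
}
```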
21:59
<Hixie>
hah, the first comment on http://digg.com/tech_news/Nokia_and_Apple_seem_to_have_succeeded_in_suppressing_ogg is a complete non-sequitur
21:59
<Hixie>
hsivonen: shocking, that
21:59
<gsnedders>
hsivonen: why bother? it's only another damned technical document!
21:59
<gsnedders>
(on a totally unrelated note, I updated the to-do list on the tolerant http parsing spec)
22:00
<Lachy>
oh wow, I never expected accessibility to come up in the discussion: [whatwg] HTML 5, OGG, competition, civil rights, and persons with disabilities
22:01
<hsivonen>
Lachy: I must have skipped that message.
22:01
<gsnedders>
(that's <http://stuff.gsnedders.com/draft-sneddon-http-parsing-00.html>; or .txt)
22:01
<Philip`>
gsnedders: If I give the original URI, it's still going to return the final after-redirection request's headers, so if several URIs redirect to the same place then it'll repeat the redirection target's headers
22:01
<Philip`>
which isn't necessarily bad, but it's something to be aware of
22:01
<gsnedders>
Philip`: ergh.
22:05
Philip`
wonders how to select elements of type A or type B using the subset of XPath supported by xml_grep
22:06
<Hixie>
i think i have found an easy way to achieve my goals of replying to hundreds of e-mails by month's end
22:06
<Philip`>
Oh, looks like I can't do that
22:10
<Philip`>
gsnedders: http://www.cl.cam.ac.uk/~pjt47/misc/headers2.xml.bz2 has the original request URI, and some <redirect>s to point out what got redirected
22:10
gsnedders
just committed into hg the entire XML file
22:11
<Philip`>
The new XML file is indented differently, just to make fun diffs
22:12
<gsnedders>
what is the <redirect> element? just noting movement?
22:12
<Philip`>
Yes - it's added whenever the request URI and response URI differ
22:12
<gsnedders>
pay any attention to how many redirects it has?
22:12
<Philip`>
(i.e. when HttpClient did whatever magical redirection-handling it does)
22:12
<Philip`>
It has less than 100 redirects, but that's all I know
22:13
<Philip`>
(because otherwise it throws an exception and aborts)
22:13
<hsivonen>
Philip`: did you write your own spider based on HttpClient and the Validator.nu parser?
22:13
<gsnedders>
Philip`: different root element, too
22:14
<Philip`>
gsnedders: Yes, but you said you were using getElementsByWhatever so I assumed that wouldn't matter, and I used grep/echo/cat instead of xml_grep to extract the bits from my original XML file
22:14
<gsnedders>
Philip`: yeah, just an observation :)
22:14
<Philip`>
which is totally not the right way to do it :-)
22:14
<gsnedders>
http://hg.gsnedders.com/cgi-bin/hgwebdir.cgi/http-parsing/file/96df15d57efb/Philip%20Taylor%27s%20Header%20Data/README.txt — that all right?
22:15
<hsivonen>
Philip`: is the code that you are using to drive HttpClient in SVN somewhere?
22:15
<Philip`>
hsivonen: I don't think it's a spider since it doesn't follow links at all, but I did write my own thing to download/analyse a list of HTML files using HttpClient and the Validator.nu parser
22:15
<hsivonen>
Philip`: ok
22:15
<Philip`>
hsivonen: It isn't at the moment
22:16
<hsivonen>
Philip`: surely at that point you could make links from the parse tree feed back into the download list
22:16
<hsivonen>
although it probably isn't that simple
22:17
<Philip`>
hsivonen: I'm not sure exactly what you mean
22:17
<Philip`>
gsnedders: The last paragraph doesn't really make any sense :-)
22:17
<hsivonen>
If you analyse docs, presumably they contain links and those could be put on the list to download/analyse
22:18
<hsivonen>
but then there's robots.txt
22:18
<gsnedders>
Philip`: that's true, but I only did it quickly
22:18
<Philip`>
hsivonen: Ah, yes
22:18
<hsivonen>
and crawling in a reasonable breadth-first order etc, etc
22:18
<Philip`>
hsivonen: I'd prefer to use someone else's code rather than do all that work
22:19
<Philip`>
(I'm not even looking at robots.txt now, since that would double the number of requests I make)
22:19
<gsnedders>
Philip`: "It may not be grouped by URI fully, as it is not processed by a single thread"?
22:19
<Philip`>
gsnedders: Is that sentence needed at all?
22:20
<gsnedders>
Philip`: not really, but I may as well put it there in case anyone ever cares.
22:20
<Philip`>
gsnedders: It just means it might have <header uri=a/><header uri=b/><header uri=a/>, which isn't an extremely interesting observation
22:20
<hsivonen>
Philip`: the Internet Archive spider has the kitchen sink in it but seems to be picky about its execution environment according to docs
22:21
<hsivonen>
Philip`: also, the code base isn't particularly approachable due to the kitchen sink nature
22:21
<gsnedders>
Philip`: it means you can't do anything that assumes it's in order, which you might sometimes want to do
22:23
<Philip`>
hsivonen: Hmm, it does sound not entirely trivial
22:23
<Philip`>
I'm not sure how worthwhile it would be to do actual spidering, rather than sticking with the dmoz.org list
22:24
<hsivonen>
Philip`: isn't dmoz biased towards front pages?
22:24
<Philip`>
(particularly since I can't do especially extensive spidering - I'd prefer not to be making a hundred thousand requests, because it's kind of expensive in bandwidth)
22:25
<Philip`>
hsivonen: Yes, and to CNN
22:25
<hsivonen>
also, how alive is dmoz these days? does it represent current authoring?
22:25
Philip`
has no idea
22:26
<Philip`>
I can imagine getting much worse results from a spider that gets sucked into a single giant site, so I'm not sure how to make things definitely better
22:27
<Philip`>
(I'm not even sure what "better" means)
22:29
<hsivonen>
Philip`: knowledge about web site structures would probably be needed to make reasonable guesses
22:30
<hsivonen>
Philip`: without data I might guess that a sensible strategy would be taking a list of site roots, analyzing the front page, picking two site-internal links at random, analyzing those pages too and following one site-internal link from each of those
22:30
<hsivonen>
that would give front page plus 4 non-front pages for each site
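The sampling strategy suggested above can be sketched as follows. Everything here is hypothetical: `sample_site` is an invented name, and the `links` dictionary stands in for actually fetching each page and extracting its site-internal links (robots.txt handling and politeness delays are left out entirely).

```python
import random


def sample_site(front, links, rng):
    """Return the front page plus up to four non-front pages.

    links maps a page to its list of site-internal links (a stand-in for
    downloading and parsing the page).
    """
    visited = [front]

    def pick(page, n):
        # Choose up to n not-yet-visited internal links from this page.
        candidates = [l for l in links.get(page, []) if l not in visited]
        chosen = rng.sample(candidates, min(n, len(candidates)))
        visited.extend(chosen)
        return chosen

    # Two random links from the front page, then one link from each of those.
    for page in pick(front, 2):
        pick(page, 1)
    return visited


# Hypothetical site structure.
site = {
    "/": ["/a", "/b", "/c"],
    "/a": ["/", "/a1", "/a2"],
    "/b": ["/", "/b1", "/b2"],
    "/c": ["/", "/c1"],
}
pages = sample_site("/", site, random.Random(0))
```

Passing in the random generator keeps a run reproducible, which would matter if different strategies were ever compared on the same list of sites.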
22:43
<Philip`>
hsivonen: It would be good to have a way of evaluating the strategies, to see which ones actually work sensibly in practice, but I've got no idea how to do that either :-/
22:51
<tndH>
ooh, acronym/initialism debate again
22:51
<tndH>
feels nice to read that after all the ogg stuff
22:53
tndH
pronounces it "huttamerl 5", for what it's worth
22:55
<anne-mac>
evening
22:57
<roc>
you know what would be an interesting way out of the codec mess?
22:57
<anne-mac>
ooh, codecs
22:57
anne-mac
hides
22:57
<roc>
a GPLed H.264 implementation with a patent license grant for just that implementation (and derivatives)
22:59
<Philip`>
<video src="http://transcode.google.com/?fmt=h264;src=http://mycheaplyhostedsite.com/somevideo">
22:59
<anne-mac>
i don't want to get involved, but since this sounds interesting, would that work cross-platform?
23:00
<doublec>
x264 is a GPLed H.264 implementation
23:00
<Philip`>
roc: What if somebody took that implementation and created a derivative by deleting all the code and then pasting their own unrelated code into it, to get the patent licence?
23:00
<doublec>
but obviously would need the patent license grant
23:00
<Philip`>
s/to get/to get around/
23:01
<roc>
Philip`: that doesn't really matter. The main goal is to ensure that the patent license grant only applies to GPLed software
23:01
<roc>
so if they claim it's a derivative, then they have to keep the GPL too. They can't have it both ways
23:01
<Philip`>
roc: Ah, okay
23:01
<Philip`>
roc: I imagine Opera wouldn't like that so much :-)
23:01
<roc>
yeah well
23:01
<roc>
Mozilla wouldn't like it very much either
23:01
<roc>
but it might be acceptable
23:01
<roc>
maybe
23:02
<roc>
the real main goal, of course, would be to exploit GPL virality to ensure that the patent grant would only apply to free software
23:03
<roc>
similar to the way Qt is GPLed so people writing closed source apps with it have to pay money
23:03
<anne-mac>
hmm, that doesn't sound very useful to us indeed
23:04
<anne-mac>
or content providers
23:04
<roc>
anne-mac: yeah, sorry, it's far from ideal
23:05
<Philip`>
What would the MPEG-LA get out of this?
23:05
<Philip`>
(I assume they don't want to just be nice to people)
23:06
<roc>
more players for H.264 vs VC-1
23:06
<Philip`>
(Possible answer: They would get money from content providers who are producing content in that format)
23:06
<roc>
which would appeal to some of the MPEG-LA members, I think
23:07
<Philip`>
(but that means content providers would have to be paying)
23:07
<roc>
yeah, I don't know about that situation
23:07
<Philip`>
((though small-scale content providers (e.g. bloggers) could just use MEncoder and not care about patents, like they do already with XviD and everything))
23:07
<roc>
but content providers may be a separate battle
23:07
<roc>
the patents might not even be the same
23:09
<roc>
doublec: aren't you at the workshop right now?
23:22
<doublec>
roc, yes
23:22
<roc>
ok
23:22
<doublec>
oddly someone representing license holders for a codec just spoke to me about something similar-ish
23:22
<roc>
interesting
23:23
<roc>
you should be doing that sort of thing instead of being on IRC :-)
23:23
<doublec>
so at least some people are interested in discussing options
23:23
<doublec>
:)
23:23
Hixie
wishes doublec the best of luck, really, and truly hopes doublec finds a solution
23:23
<roc>
give us a full report when you get back
23:23
<Hixie>
get together with dave singer
23:23
<doublec>
will do
23:23
<Hixie>
he's very interested in helping out too
23:24
<doublec>
i'll know more later, and will pass details on to Dave as well
23:27
<Hixie>
cool
23:27
<Hixie>
roc: re your mail, you are of course correct, i just meant given the current landscape
23:27
<Hixie>
roc: one way of forcing the issue would be to make there be a lot of theora content.
23:27
<roc>
yeah
23:29
<roc>
unfortunately my best idea there is "free pornography community"
23:31
<Hixie>
not necessarily a bad idea actually
23:33
<anne-mac>
hmm google blog search considers rudd-o's blog post to be the most relevant for "html5"
23:34
<anne-mac>
what is interesting is the amount of non-English content appearing on google blog search
23:35
<anne-mac>
it would be nice if we could somehow get in touch with all those people (or maybe we already are)
23:46
<Lachy>
I like this. Accusing Microsoft of using HTML5 as a distraction from their other problems with IE. http://inspireaction.mindandmedia.com/index.php/2007/12/11/why-is-microsoft-trying-to-distract-us-with-html-5/
23:52
<jruderman>
"force Microsoft to play by the rules"
23:52
<jruderman>
if he wants to attempt that, he's more than welcome to
23:53
<jruderman>
the rest of us are going to keep trying to improve the web
23:55
anne-mac
thinks it would make sense for <th>/<caption>/<legend>/<h1>-<h6> to share the content model
23:59
<anne-mac>
seems I can't edit though... neither Opera nor Safari is supported by Google Docs :(