#whatwg on 2011-11-22

00:00	<mkanat>	erlehmann: We were referring to EXI. :-)
00:00	<gavinc>	erlehmann: Right, which is of course a violation of the XML spec
00:00	<erlehmann>	mkanat, making it even harder to parse?
00:00	<mkanat>	erlehmann: Yeeeeah.
00:01	<erlehmann>	gavinc, i see what you did there. fun fact: once upon a time you could remote-crash XMPP clients (those that used a real XML parser) by sending namespace-malformed XML
00:01	<erlehmann>	at least knocked them offline.
00:01	<finnala>	Specs schmecks. Isn't that how the web works? ;)
00:01	<erlehmann>	nowadays you'll only knock yourself offline.
00:01	<erlehmann>	finnala, the word is „schmocks“
00:03	<Yuhong>	Tag soup can be even worse of course. I already mentioned https://bugzilla.mozilla.org/show_bug.cgi?id=607222
00:04	<TabAtkins>	That has nothing to do with tag soup. It's DOM scripting.
00:05	<Yuhong>	Yep, the reason I say it is related to tag soup is that it involves document.write which writes tag soup and an appendChild of the base element.
00:07	<Yuhong>	Also: http://ln.hixie.ch/?start=1155195074&count=1
00:08	<TabAtkins>	document.write's problems go far beyond tag soup. It's a basic layering violation.
00:09	<MikeSmith>	Yuhong, TabAtkins - about the validator behavior for shape, Philip` is right. I added that error reporting because without it there, the message "Error: The shape attribute on the a element is obsolete. Use area instead of a for image maps." gets reported for every <a> element, and the user is, like, "Huh?", because they don't have any shape attributes on those elements in their source and they don't know that XML requires the parser
00:09	<MikeSmith>	to add them
00:10	<MikeSmith>	if people find that annoying they shouldn't use XML
00:10	<TabAtkins>	Or use a custom DTD, I guess.
00:10	<MikeSmith>	yeah. and/or they should select the "don't load external entities" option
00:10	<MikeSmith>	or maybe we should make the "don't load external entities" option the default (if it's not otherwise)
00:11	<MikeSmith>	hmm, no, can't do that
00:11	<MikeSmith>	because then it will report all the named character references as errors
00:11	<MikeSmith>	yippee for SGML legacy misfeatures
00:13	<gavinc>	This stuff? Why RDF XMLLiteral is still a -really- bad idea.
00:13	<TabAtkins>	HTMLLiteral!
00:16	<gavinc>	Mm, can HTML be easily compared for equivalence? (also, canonical lexical form?)
00:17	<TabAtkins>	Canonicalization is the devil.
00:17	<TabAtkins>	In other words, no.
00:18	<gordo>	when i use link rel="icon" type="image/png" sizes="48x48"
00:19	<gordo>	and various other sizes
00:19	<gordo>	do browsers make a request for every size?
00:19	<gordo>	or just one
00:19	<zewt>	why not test it? heh
00:19	<gordo>	favicons don't seem to appear in the net tab of browser developer tools
00:20	<gavinc>	TabAtkins: Canonicalization isn't really that much of an issue. But lexical to value mapping is. Seems like that would end up with needing an HTML parser in order to parse RDF. Same as today with XMLLiteral needing an XML parser
00:21	<TabAtkins>	I'll defer to other people who've told me that canonicalization is an issue. But otherwise, yes, you're right - comparison requires an HTML parser to parse RDF.
00:22	<Yuhong>	I think canonicalization of HTML is even worse due to it being tag soup, whose error handling was only recently standardized.
00:23	<gavinc>	But I think an optional defined datatype for HTMLLiteral would be decent, and preferable to XMLLiteral in real usage. Yes, comparing it by value would require an HTML parser but some implementations will have one lying around anyway being in a browser already
00:23	gavinc	thinking out loud
00:24	<TabAtkins>	What are the implementations of RDF in browsers?
00:25	<gavinc>	Okay, valid point
00:25	<gavinc>	Well, in theory an implementation of the RDF API
00:26	<gavinc>	and RDFa API
00:40	<Hixie>	nessy: ping
00:44	<Philip`>	Maybe XMLLiteral would be saner if its value space was DOMs, not strings-which-are-canonicalisations-of-DOMs
00:44	<Hixie>	has to be serialised somehow; the idea is to store the information in a database
00:48	<gavinc>	One idea was to have the value space be the XML infoset, but even creating that is complicated
00:48	<gavinc>	and an HTMLLiteral wouldn't have an infoset
00:55	<gavinc>	Oh, I don't think I've asked anyone here yet to kick the tires on Turtle in HTML http://dvcs.w3.org/hg/rdf/raw-file/default/rdf-turtle/index.html#in-html
01:20	<AryehGregor>	Ouch.
01:20	<AryehGregor>	I never thought about the fact that properties we add to Document will be in global scope for inline event handlers.
01:20	<AryehGregor>	That's bad.
01:23	<Hixie>	yeah that's screwed me several times already
01:29	<zewt>	no way to segregate new properties somehow, so they're not visible from there?
01:29	<zewt>	scoping nightmares go
01:35	<MikeSmith>	are the "final rendered dimensions of cells within a table" not exposed in the DOM?
01:36	<MikeSmith>	height and width of a particular cell?
01:41	<mkanat>	MikeSmith: That's just computed style, isn't it?
01:42	<MikeSmith>	mkanat: hmm, yeah, I would think so
01:59	<AryehGregor>	Awesome, it's even worse than I thought -- the element itself is also in the scope chain.
01:59	<AryehGregor>	zewt, that's exactly what I think we should do -- make new properties (at least those with short names that might cause conflict) not get hit from on* as bare names.
02:45	<MikeSmith>	Hixie, annevk - regarding the "WHATWG on Google+" discussion on the whatwg list, the parts about people needing to read through diffs
02:46	<MikeSmith>	I notice that diff has this "--show-function-line=regexp" option
02:47	<MikeSmith>	which could be used to find the nearest preceding <h1>-<h6> heading
02:47	<MikeSmith>	and/or the nearest preceding element with an id value
02:48	<MikeSmith>	and that heading text and id value would then be included in the diff
02:49	<MikeSmith>	and then http://html5.org/tools/web-apps-tracker could show the heading titles for the sections of the spec affected by that change
02:50	<MikeSmith>	and provide links back to the spec
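A rough sketch of the idea MikeSmith describes above: given the spec source as a list of lines and the index of a changed line, walk backwards to the nearest preceding heading. The function name and the sample input are hypothetical; this only imitates, in simplified form, what diff's show-function-line option does, and is not how diff itself is implemented.

```python
import re

# Hypothetical heading pattern: any line containing an opening <h1>..<h6> tag.
HEADING = re.compile(r"<h[1-6]")


def nearest_heading(lines, changed_index, pattern=HEADING):
    """Return the last line before changed_index that matches pattern, or None."""
    for i in range(changed_index - 1, -1, -1):
        if pattern.search(lines[i]):
            return lines[i].strip()
    return None


spec = [
    '<h2 id="a">Alpha</h2>',
    "<p>x</p>",
    '<h3 id="b">Beta</h3>',
    "<p>y</p>",
    "<p>z</p>",
]
print(nearest_heading(spec, 4))
```

The same loop could match on id attributes instead of headings by passing a different compiled pattern.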
03:13	<MikeSmith>	hmm, I forgot that Hixie doesn't use many IDs in his source..
03:13	<MikeSmith>	oh
03:13	<MikeSmith>	but the index file does have them
05:29	<erlehmann>	ha, i thought i might be on hacker news for my work on libglitch. but now this. http://news.ycombinator.com/item?id=3264074
05:29	<erlehmann>	i hope many people chime in on this discussion. it is important to have mindshare for the open web.
06:23	<erlehmann>	>For quite a few years people used the Internet Explorer icon on pamphlets and posters as an icon representing The Internet. It seems people are now starting to use Facebook's icon for that.
06:23	<erlehmann>	facepalm m(
08:13	<yaffle>	hello!
08:17	<yaffle>	http://www.whatwg.org/specs/web-apps/current-work/multipage/comms.html#messageevent
08:18	<yaffle>	why do we need the "origin" attribute for MessageEvent from server-sent events?
08:25	<yaffle>	or for MessageEvent from websockets ?
08:25	<Hixie>	we don't
08:25	<yaffle>	when can it be useful?
08:25	<Hixie>	it's only useful for messages from window.postMessage(), iirc
08:25	<yaffle>	so, why is this presented in the spec? it seems some internet bloggers write their articles and propagate "origin" attribute checks every time...
08:25	<Hixie>	you should check the 'origin' attribute all the time when receiving messages from window.postMessage()
08:27	<yaffle>	yes, i know, but checking this for messages from websockets and server-sent events is redundant
08:27	<Hixie>	does it even have a value for those?
08:28	<yaffle>	http://www.html5rocks.com/en/tutorials/eventsource/basics/#toc-security
08:31	<Hixie>	yeah that's pretty much pointless. can you file a bug so that i can clarify that that warning is especially relevant for window.postMessage() messages and not so much EventSource and WebSocket ?
08:32	<yaffle>	@Hixie, ok
08:34	<Hixie>	thanks
08:38	<erlehmann>	tantek, do you like refbacks? if not, why not?
08:43	<tantek>	erlehmann, assuming you're talking about http://en.wikipedia.org/wiki/Refback then no
08:44	<tantek>	because they're too easily open to abuse
08:44	<tantek>	HTTP header forgery can be used to trick a refback enabled site into accessing some other random site out there
08:45	<erlehmann>	tantek, and then there is no link to the site and everything is well.
08:45	<hsivonen>	tantek: why is it bad to do a GET to a random site out there?
08:45	<erlehmann>	or not?
08:46	<tantek>	it's an attack surface on a server
08:46	<tantek>	for example, it's a trivial way to cause a DDOS
08:47	<tantek>	if you can trick 1000s of servers out there to issue a get request on one random site out there
08:47	<erlehmann>	tantek, what do you recommend instead?
08:48	<tantek>	there is no "Instead". the protocol is flawed all by itself.
08:48	<tantek>	go back to the drawing board. ;)
08:49	<tantek>	actually, given that I was able to explain a trivial DDOS scenario - and I'm not even a security expert, why isn't this documented on the wikipedia article?
08:49	<erlehmann>	tantek, no. i mean: what do you recommend instead of using linkbacks?
08:52	<tantek>	oh I know - erlehmann, here's your homework assignment in return for that answer, go update the wikipedia article to add a criticism section noting the trivial DDOS abuse of Refback enabled servers, with a <ref> cite to my statements above http://krijnhoetmer.nl/irc-logs/whatwg/20111122#l-387 :)
08:52	<Hixie>	it's already pretty trivial to cause a server to get a lot of GETs, that's not a particularly interesting security issue imho
08:52	<tantek>	or just add it to the existing Security issues section
08:53	<erlehmann>	what Hixie said.
08:53	<tantek>	nah, I call theoretical on that
08:53	<Hixie>	you can call what you like :-)
08:53	<tantek>	worthy of documenting as a vulnerability introduced by implementing the protocol
08:53	<erlehmann>	tantek, you are a clever and mean person.
08:54	<Hixie>	if you want to do a DDOS of the nature you describe it's far less work to just post a link on some popular site to some porn or a kitten, and include on that page a link to the victim site
08:56	<tantek>	Hixie, if that were true, people would be doing it all the time on sites like http://cuteoverload.com/ but that doesn't appear to be happening.
08:56	<Hixie>	the reason they don't do it is that a bunch of GETs isn't a particularly interesting attack
08:56	<Hixie>	same reason they don't do it with pingback, trackback, spam e-mail, etc
08:56	<Hixie>	all of which are existing ways to do what you describe
08:58	<foolip>	Hixie, jgraham, I was at one point writing a change tracking tool that used the outline algorithm to determine sections
08:58	<foolip>	but I got bored, as usual
08:59	<tantek>	Hixie, perhaps more likely, there just aren't sufficient numbers of Refback enabled servers yet to perform an interesting attack.
08:59	<tantek>	perhaps because the vulnerability is so obvious.
09:00	<Hixie>	there's plenty enough servers that get tons of traffic per link that can be used to cause the attack you describe
09:00	<Hixie>	reddit, for instance
09:00	<Hixie>	slashdot
09:00	<Hixie>	indeed the effect is even named after slashdot
09:00	<tantek>	sure, and that does happen
09:00	<Hixie>	it used to be a problem
09:00	<tantek>	articles get slashdotted
09:00	<Hixie>	and it isn't anymore
09:00	<tantek>	it used to be
09:01	<tantek>	until slashdot perhaps dropped in popularity
09:01	<Hixie>	slashdot has far more traffic now than it used to
09:01	<tantek>	by people that actually read and click? twitter-level attention spans have reduced that too
09:02	<tantek>	now you're more likely to get the effect if @KevinRose tweets a link to your site
09:02	<Hixie>	the reason it's no longer a problem is that it's so trivial to perform the attack you describe, that any serious web server software has long been hardened against that kind of attack
09:02	<Hixie>	anyway, bed time
09:02	<Hixie>	nn
09:02	<tantek>	nn
09:11	<erlehmann>	nn?
09:11	<erlehmann>	nighty-nighty?
09:11	<erlehmann>	tantek, i agree with Hixie on this. if 1000 (single) GETs are a problem, maybe one should stop hosting pages on embedded hardware.
09:12	<erlehmann>	or use better server software, written in C, with libowfat and tinyldap.
09:13	<erlehmann>	tantek, i think i will implement refback and see how it goes. big problem: hashbang sites. :(
09:39	<annevk>	MikeSmith: that diff stuff seems interesting
09:39	<annevk>	MikeSmith: in most cases you can generate the ID from the header value
09:39	<annevk>	MikeSmith: using the same algorithm Anolis uses
09:39	<MikeSmith>	ah yeah
09:39	<MikeSmith>	right right
09:40	<MikeSmith>	hadn't thought about that
09:40	<hsivonen>	MikeSmith: I think I might have solved http://www.w3.org/Bugs/Public/show_bug.cgi?id=10174
09:40	<annevk>	MikeSmith: that will be a problem with duplicate "Introduction" sections and such, but they are not that frequent
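A minimal sketch of deriving an ID from heading text, including the duplicate "Introduction" case annevk mentions. This is a simplified stand-in and not Anolis's actual code: the function name, the "section" fallback, and the numeric-suffix scheme are assumptions made for illustration.

```python
import re


def heading_id(text, used):
    """Derive an ID from heading text; 'used' is the set of IDs already taken."""
    # Lowercase, trim, and collapse runs of whitespace into single hyphens.
    base = re.sub(r"\s+", "-", text.strip().lower()) or "section"
    candidate = base
    n = 0
    # On a clash (e.g. a second "Introduction"), append a counter.
    while candidate in used:
        candidate = f"{base}-{n}"
        n += 1
    used.add(candidate)
    return candidate


seen = set()
first = heading_id("Introduction", seen)
second = heading_id("Introduction", seen)
print(first, second)
```

Because the suffix depends on document order, a tracker relying on this would need to process headings in the same order as whatever generated the published IDs.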
09:40	<hsivonen>	it's a shocking bug
09:40	<hsivonen>	in the code that converts bytes to UTF-16 code units
09:40	<MikeSmith>	hsivonen: yeah?
09:41	<hsivonen>	the code returns an EOF sign where doing so can't be right
09:41	<MikeSmith>	annevk: and for code in intro sections it's non-normative anyway, so nobody should care too much
09:41	hsivonen	tests some more
09:41	<MikeSmith>	s/code/spec text/
09:41	<annevk>	MikeSmith: and I think I can run custom diff on my server
09:42	<MikeSmith>	annevk: cool. If I can help let me know
09:42	<annevk>	if you can provide the commandline thingie
09:42	<hsivonen>	Unicode conversion loops are hard
09:42	<MikeSmith>	I'm also happy to help with some of the other ideas that were discussed in that thread, if anybody else wants to put time into it
09:42	<annevk>	we could start with that and see what it does on web-apps-tracker
09:43	<MikeSmith>	annevk: the command line thing is just that same exact thing I put into the message
09:44	<MikeSmith>	it's really only that one line of code that invokes the diff command with the arguments needed
09:44	<MikeSmith>	so you should be able to just cut and paste it out of there
09:45	<MikeSmith>	and make the subversion config change for whatever user you have web-apps-tracker running under
09:47	<MikeSmith>	the fact that it breaks right exactly at byte 56K seemed like it must be something more than just coincidental
09:47	<MikeSmith>	0xE000
09:47	<MikeSmith>	that number even looks scary
09:47	<hsivonen>	it falls on a multiple of a buffer size
09:47	<MikeSmith>	yeah
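The log does not show the actual Validator.nu bug, so the following is only a general illustration of why conversion loops are delicate at buffer boundaries: a multi-byte sequence can straddle two buffers, and the loop has to carry state across reads rather than treat the end of a buffer as the end of input. It uses Python's incremental UTF-8 decoder and decodes to Python strings, not UTF-16 code units; the function name and chunk sizes are hypothetical.

```python
import codecs


def decode_in_chunks(data, chunk_size):
    """Decode UTF-8 bytes fed in fixed-size chunks, as if read buffer by buffer."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    out = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        # The decoder keeps any incomplete trailing sequence for the next call.
        out.append(decoder.decode(chunk))
    # Only here, at the true end of input, is an incomplete sequence an error.
    out.append(decoder.decode(b"", final=True))
    return "".join(out)


text = "a\u20acb" * 10  # the euro sign is three bytes in UTF-8
data = text.encode("utf-8")
print(decode_in_chunks(data, 4) == text)
```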
09:51	<erlehmann>	how does the yellow highlighting work in the logs?
09:55	<MikeSmith>	hsivonen: thanks for the fix
09:55	MikeSmith	goes to push the change to w3c validator.nu backend
10:07	<annevk>	TabAtkins / ojan / etc. I think we should do it with a Web IDL annotation instead
10:07	<annevk>	I guess I should say on the list
10:13	<annevk>	so MikeSmith, I currently have "svn diff -r %s%s %s"
10:14	<MikeSmith>	OK
10:14	<MikeSmith>	you don't need to change that
10:15	<annevk>	svn diff does support --diff-cmd
10:16	<MikeSmith>	yeah
10:16	<MikeSmith>	true
10:16	<MikeSmith>	so yeah, you can do it that way
10:16	<MikeSmith>	instead of in the config file
10:17	<MikeSmith>	"svn diff --diff-cmd diffwrap -r %s%s %s"
10:17	<annevk>	I don't have full control over the server
10:17	<MikeSmith>	ah
10:17	<annevk>	the diff utility I have does support -F
10:17	<annevk>	but obviously it does not have -r
10:17	<MikeSmith>	yeah
10:18	<MikeSmith>	it's not going to pass that -r to the external diff util
10:18	<annevk>	okay
10:18	<MikeSmith>	or maybe it does, but the wrapper doesn't read that arg
10:19	<MikeSmith>	so you just need to have a "diffwrap" file somewhere
10:19	<MikeSmith>	and put into it:
10:19	<MikeSmith>	#!/bin/sh
10:19	<MikeSmith>	diff -u -F '<h[1-6]' ${6} ${7}
10:19	<MikeSmith>	maybe you don't need the -u in there
10:20	<MikeSmith>	dunno
10:20	<MikeSmith>	hmm, or maybe you do, come to think of it
10:20	<annevk>	I'm going to try with
10:20	<annevk>	command = "svn diff -r %s%s %s --diff-cmd diff -u -F '<h[1-6]'"
10:20	<MikeSmith>	yeah
10:20	<MikeSmith>	tried that already
10:20	<MikeSmith>	won't work
10:21	<MikeSmith>	even if you properly quote "diff -u -F '<h[1-6]'"
10:21	<MikeSmith>	it needs it to be in script
10:22	<MikeSmith>	because subversion can never do things the way everybody does them
10:22	<MikeSmith>	the intuitive way
10:22	<MikeSmith>	it likes to do them the asstarded way
10:22	<annevk>	it does not seem to do anything
10:23	<MikeSmith>	read the intro to http://svnbook.red-bean.com/en/1.2/svn.advanced.externaldifftools.html if you care to know the ugly details
10:24	<annevk>	oh
10:24	<annevk>	so this won't do anything
10:24	<MikeSmith>	um
10:25	<MikeSmith>	I think you may need to put the --diff-cmd before the -r part
10:25	<MikeSmith>	or at least before the final %s
10:25	<MikeSmith>	which is a filename
10:25	<MikeSmith>	and which needs to be the last arg, I think
10:26	<annevk>	ok
10:26	<annevk>	MikeSmith: so I need to make that diffwrap script?
10:27	<MikeSmith>	yup
10:28	<annevk>	sigh
10:28	<annevk>	nothing is working
10:29	<MikeSmith>	dunno what error you're getting
10:31	<MikeSmith>	but of course you have to either give --diff-cmd the absolute path to the diffwrap script, or you need to put the diffwrap script into whatever PATH the web-apps-tracker user has set
10:31	<annevk>	did it work for you with that script?
10:31	<MikeSmith>	yup
10:32	<annevk>	I can just copy and paste the template?
10:32	<annevk>	and then replace $DIFF with "diff" I suppose and remove the variable declaration
10:32	<MikeSmith>	you gotta make that script chmod 755 too
10:33	<MikeSmith>	yeah, sure
10:33	<MikeSmith>	and add the -F '<h[1-6]' part
10:37	<annevk>	ooh in there?
10:37	<MikeSmith>	heh
10:37	<MikeSmith>	yeah man
10:37	<annevk>	also -u ?
10:38	<MikeSmith>	yup
10:38	<annevk>	this means the Python stuff ends up looking like this:
10:38	<annevk>	command = "svn diff --diff-cmd diffwrap -r %s%s %s"
10:38	<MikeSmith>	yeah
10:38	<annevk>	still yielding nothing
10:38	<annevk>	maybe I should add the .sh?
10:39	<MikeSmith>	yeah, if it has a .sh extension, yeah
10:39	<MikeSmith>	and also you probably need to put the absolute path
10:39	<MikeSmith>	not just --diff-cmd diffwrap
10:40	<MikeSmith>	but --diff-cmd /home/annevk/bin/diffwrap.sh
10:40	<MikeSmith>	or whatever
10:45	<annevk>	MikeSmith: doesn't work :(
10:46	<MikeSmith>	you not getting any error message?
10:46	<annevk>	maybe, I'm just logged in via ssh and trying it live
10:46	<MikeSmith>	you sure the diffwrap script is executable?
10:47	<MikeSmith>	-rwxr-xr-x
10:47	<annevk>	hmm
10:47	<annevk>	when I execute that it says diff: unrecognized option `--right'
10:48	<MikeSmith>	ah
10:48	<MikeSmith>	yeah
10:48	<MikeSmith>	remove that part from the line in the script
10:48	<annevk>	including LEFT / RIGHT ?
10:48	<MikeSmith>	yeah, don't do $DIFF --left $LEFT --right $RIGHT at all
10:49	<MikeSmith>	I don't even understand what the hell that stuff is
10:49	<hsivonen>	'Would it help if the TAG were to "Recommend" to W3C to not be a "bad netizen"?'
10:49	<MikeSmith>	instead make it diff -u -F '<h[1-6]' ${6} ${7}
10:49	<MikeSmith>	and "Somehow getting a regular "have obligations related to registration been met"
10:49	<MikeSmith>	check into the W3C document publication/advancement procedure shouldn't be too
10:49	<MikeSmith>	difficult."
10:50	<MikeSmith>	gotta love that way of approaching things
10:50	<MikeSmith>	annevk: or diff -u -F '<h[1-6]' $LEFT $RIGHT should work too
10:50	<MikeSmith>	if you have those variables in your script
10:51	<MikeSmith>	but I don't see any point in having them because you can just do diff -u -F '<h[1-6]' ${6} ${7} directly
10:51	<annevk>	this works
10:51	<MikeSmith>	sweet
10:51	<annevk>	but it seems something else in the script breaks because of this
10:51	<MikeSmith>	oh
10:52	<annevk>	ah yeah
10:52	<annevk>	--- /tmp/tmp.27 2011-11-22 10:53:17.000000000 +0000
10:52	<annevk>	+++ /tmp/tmp.28 2011-11-22 10:53:17.000000000 +0000
10:52	<annevk>	is no longer accurate
10:52	<annevk>	and therefore I can not extract the information from it I am trying to extract
10:53	<MikeSmith>	ah yeah
10:53	<MikeSmith>	but is working int he oupt
10:53	<MikeSmith>	*in the output
10:53	<MikeSmith>	if you wait for the page to load
10:53	<MikeSmith>	shows the diff at the bottom
10:54	<MikeSmith>	hmm
10:54	<MikeSmith>	but ends up with some cases like <h6><dfn title="attr-input-type-file">
10:54	<MikeSmith>	because it truncates that part at 40 chars
10:54	<MikeSmith>	I think
10:55	<MikeSmith>	oh well
10:55	<MikeSmith>	can refine it later
10:55	<MikeSmith>	and/or hack the diff source and build a binary that doesn't truncate at 40 chars
10:55	<MikeSmith>	because I think that's a hard-coded limit
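A sketch of how a tracker script might pull the section text back out of the diff output: in unified diffs produced with a function-line regexp, the matched line is appended after the second "@@" of each hunk header. The function name and sample diff are hypothetical, and the extracted text may be cut short by the truncation mentioned above.

```python
import re

# Unified-diff hunk header: "@@ -start[,count] +start[,count] @@ optional text"
HUNK = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@ ?(.*)$")


def hunk_sections(diff_text):
    """Return (new_start_line, heading_text) for every hunk header in diff_text."""
    sections = []
    for line in diff_text.splitlines():
        m = HUNK.match(line)
        if m:
            sections.append((int(m.group(3)), m.group(5)))
    return sections


sample = (
    "--- a\n"
    "+++ b\n"
    '@@ -10,7 +10,8 @@ <h4 id="foo">Foo</h4>\n'
    " context\n"
    "+added\n"
    "@@ -90 +91,2 @@\n"
    "+x\n"
)
print(hunk_sections(sample))
```

An empty heading string means diff found no matching line before that hunk.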
10:58	<hsivonen>	manu-db: regarding your W3C Conference talk: why is giving your bank account number to someone a bad idea?
10:58	<hsivonen>	manu-db: is the U.S. system so broken that people can take your money if they know your account number instead of just sending you money?
11:00	<annevk>	MikeSmith: is there any way we can change back those log lines?
11:00	<jgraham>	At least in the UK giving out your bank number can, in some cases, be used to set up direct debits I think
11:00	<MikeSmith>	annevk: maybe
11:01	<MikeSmith>	but I don't understand why it's changing them to begin with
11:01	<jgraham>	hsivonen: http://news.bbc.co.uk/2/hi/7174760.stm
11:01	<MikeSmith>	annevk: what should those log lines actually say?
11:01	<annevk>	usually they give back what svn diff returns
11:01	<annevk>	the actual svn to and from numbers
11:02	<MikeSmith>	annevk: well, you already have those, don't you?
11:02	<annevk>	not if your revTo is 0
11:03	<MikeSmith>	ah
11:03	<MikeSmith>	really?
11:03	<MikeSmith>	I mean, if you are doing svn diff --diff-cmd diffwrap -r %s%s %s
11:03	<MikeSmith>	then it's just those first pair of %s%s, right?
11:03	<hsivonen>	jgraham: oh, right. Direct debit.
11:04	<hsivonen>	jgraham: the protocol for setting up direct debit is bogus here, too.
11:05	<annevk>	MikeSmith: one can be omitted
11:06	<annevk>	MikeSmith: and then we still need to know the other one for the UI
11:06	<annevk>	well we don't need to
11:06	<annevk>	but we could before
11:07	<hsivonen>	jgraham: I wonder if banks had usability people who advocated for the bogus direct debit setup protocol or if they were just full of FAIL without usability people winning over security people
11:11	<annevk>	MikeSmith: see e.g. http://html5.org/tools/web-apps-tracker?from=2011&to=1999 now
11:11	<annevk>	MikeSmith: for how it looks
11:11	<hsivonen>	aargh. # in data: URL for the lose
11:11	<annevk>	MikeSmith: for the other feature, see http://html5.org/tools/web-apps-tracker?from=6830
11:12	<annevk>	MikeSmith: note how the UI knows it's against 6831
11:12	MikeSmith	looks
11:12	<annevk>	MikeSmith: http://html5.org/tools/web-apps-tracker?from=6829 here again, the UI knows it's against 6831; and it figured that out from the log
11:12	<annevk>	I commented out diffwrap
11:13	<MikeSmith>	ah
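One possible way around the lost "---"/"+++" lines, sketched under an assumption: that Subversion invokes the external diff command with seven arguments of the form -u -L left-label -L right-label left-file right-file, so the labels sit in the third and fifth positions while the wrapper above forwards only the sixth and seventh. Passing the labels on to diff with -L should keep the revision information in the header lines. The function name is hypothetical and this has not been tried against web-apps-tracker.

```python
def build_diff_argv(svn_args, heading_regexp="<h[1-6]"):
    """Build a diff command line from the arguments svn hands to --diff-cmd.

    Assumes svn_args is: -u, -L, left label, -L, right label, left file, right file.
    """
    if len(svn_args) != 7:
        raise ValueError("expected the 7 arguments svn passes to --diff-cmd")
    _, _, left_label, _, right_label, left_file, right_file = svn_args
    return [
        "diff", "-u",
        "-F", heading_regexp,   # show the nearest preceding heading per hunk
        "-L", left_label,       # keep svn's labels in the --- / +++ lines
        "-L", right_label,
        left_file, right_file,
    ]


example = build_diff_argv([
    "-u", "-L", "index\t(revision 6830)", "-L", "index\t(revision 6831)",
    "/tmp/tmp.27", "/tmp/tmp.28",
])
print(example)
```

A wrapper script would pass its own command-line arguments to this function and run the resulting list with subprocess, returning diff's exit status unchanged.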
11:15	<hsivonen>	I hadn't realized the HTML spec had outgrown Validator.nu's size limit again
11:15	<MikeSmith>	hsivonen: yeah
11:16	<hsivonen>	time to set it to 7 MB, I guess
11:16	<MikeSmith>	annevk: so maybe we can figure out later some way to get around that so your python script can still get the data it needs
11:16	<hsivonen>	at least the limit is use case-driven
11:16	<annevk>	yeah, or we kill that feature
11:16	<annevk>	always require both fields
11:16	<annevk>	oh there might be another issue
11:17	<annevk>	if you fill in a revision larger than actually exists
11:17	<annevk>	you could poison the cache
11:17	<annevk>	so you always want to know the revision numbers the diff ran against
11:17	<annevk>	there's some other ways to get around too, but this was rather trivial
11:18	<MikeSmith>	OK. well I'm happy to help more later if I can
11:18	<MikeSmith>	right now I gotta go get some food
11:18	<annevk>	I should get some lunch too
11:18	<annevk>	:)
11:55	<hsivonen>	Hixie: I adjusted the size limit on Validator.nu to accommodate the HTML spec again
11:59	<jgraham>	foolip: Oh, if I understand what you were going for, that seems simple and clever.
11:59	<foolip>	jgraham, you mean diff-sections?
11:59	<jgraham>	Yeah
12:00	<jgraham>	I didn't really read the code much, so I might not have understood
12:00	<foolip>	it splits the spec into directories and files
12:00	<foolip>	then git log -- sections/video/bla will only show commits in that subsection
12:00	<jgraham>	Right
12:00	<jgraham>	So you can subscribe to particular sections or files
12:01	<foolip>	yeah, that would be the idea
12:01	<jgraham>	s/sections/directories/
12:01	<jgraham>	Like I said, that seems clever and simple
12:01	<foolip>	but the tooling to actually do that is missing so far, feel free to step in :)
12:02	<jgraham>	Well… maybe :) I fear trying to commit to getting more done :) But this does seem like I would particularly benefit from it
12:13	<annevk>	foolip: MikeSmith's diff command thingie does give you section titles for changes
12:13	<annevk>	foolip: and it works
12:14	<annevk>	foolip: it doesn't seem like it would be a big burden to go from there to some kind of push notification if you find a particular section
12:14	<annevk>	foolip: could have a twitter account per section :)
12:17	<jgraham>	The moaning of the G+ haters will be nothing to my wrath if you start publishing data exclusively on twitter
12:17	<hsivonen>	foo&noti;bar is my least favorite part of named character reference tokenization
12:18	<annevk>	jgraham: define data
12:21	<zcorpan>	hsivonen: what's your most favorite?
12:22	<hsivonen>	zcorpan: hard to say. named character reference tokenization doesn't have particularly nice parts
12:31	<eightfold>	can someone have a look at this:
12:31	<eightfold>	http://jsfiddle.net/abmTH/
12:35	<eightfold>	i want to hide .PreviewSizes based on the content of pxField
12:37	<zcorpan>	i'll say my favorite is <a href="&copy=">
12:37	<zcorpan>	but then i don't have to implement it :)
12:42	<eightfold>	bah, that was supposed to go in #jquery. sorry.
12:55	<foolip>	annevk, yeah, I tried that as well, but with that approach you can either only follow <h1> sections or need to follow each and every sub-section, I think
12:56	<jgraham>	annevk: Data is like pornography; I know it when I see it :p
13:11	<annevk>	foolip: that's true, but the subsections are in a database
13:11	<annevk>	foolip: because of the section annotation system
13:11	<foolip>	annevk, oh, ok
13:11	<annevk>	basically, there's a couple of dots, but how to connect them...
13:11	<foolip>	well, whoever sets up a working solution first wins!
13:11	<foolip>	I hope it isn't me
13:53	<annevk>	MikeSmith: it seems we should give people at least a week before marking things as NEEDSINFO after you already requested some information
14:05	<zcorpan>	what's the difference between needsinfo keyword and RESOLVED NEEDSINFO?
14:05	<annevk>	needsinfo can be added by anyone
14:05	<annevk>	resolving can only be done by editors
14:29	<zcorpan>	i see
16:24	<jgraham>	TabAtkins: You will excuse me while I don't hold my breath for the "batch processors" selectors spec
16:58	<grendzy>	Hi! Drupal community is looking for a more sophisticated parser to replace PHP DOM (a.k.a SimpleXML, I think based on libxml2). Is http://code.google.com/p/html5lib/ abandoned? Last commit was almost 2 years ago. Thanks!
17:01	<jgraham>	I am not aware that anyone is actively working on the PHP port
17:01	<jgraham>	If you would like to take over that would be easy to arrange
17:01	<jgraham>	But you should maybe check the performance before you decide what you want to do :)
17:03	<smaug____>	wasn't there some plan to support hsivonen's parser with libxml2
17:05	<smaug____>	grendzy: take hsivonen's parser, and generate php code from java files :)
17:07	<hsivonen>	smaug____: there's a plan. now that View Source is out of the way, it might actually become real
17:08	<erlehmann>	grendzy, as far as i can say, html5lib was usable 1 year ago.
17:09	<erlehmann>	i used the PHP portion for a wordpress plugin.
17:09	<erlehmann>	and am now using python.
17:09	<erlehmann>	PHP is pig disgusting.
17:14	<grendzy>	thanks folks… anyone mind if I quote this chat on a drupal.org discussion?
17:15	<AryehGregor>	Go ahead.
17:15	<AryehGregor>	It's publicly logged.
17:15	<grendzy>	cool, thanks again for the feedback
17:23	<jarek>	grendzy: this channel is already logged on http://krijnhoetmer.nl/irc-logs/
17:57	<Ms2ger>	"Funnily enough, I've just been talking to the DOM5 and DOM6 API designers..."
17:57	<Ms2ger>	Anybody know those?
18:09	<smaug____>	Ms2ger: where is that coming from?
18:10	timeless	saw that
18:10	timeless	can't remember
18:14	<miketaylr>	public-webapps?
18:19	<timeless>	ah yes, in a Selectors API 2 thread
19:19	<rillian_>	foolip: what do you think about video.advance(optional unsigned long frames) ?
19:20	<rillian_>	the idea would be to have something you could call to single-step when paused
19:20	<zewt>	might be expensive to implement for some codecs
19:20	<rillian_>	yeah, I was just thinking skipping many frames could be very expensive
19:20	<rillian_>	in a variable frame rate stream
19:21	<rillian_>	video.advance() wouldn't be bad though
19:22	<zewt>	don't know if there are use cases for small values of frames but greater than 1
19:22	<Hixie>	hsivonen: thanks. i think it's only a temporary issue though, i'll be removing a lot of text soon which should solve the problem anyway.
19:22	<zewt>	perhaps it would be cheap enough to just call advance() multiple times--if the decoding itself is done lazily, it would still allow frame skipping optimizations
19:23	<rillian_>	zewt: yeah. I think the idea is just to scan faster
19:24	<rillian_>	but if it's unlimited, someone might try to use it to seek, not realizing it's an expensive operation on some formats
19:24	<zewt>	right
19:24	<hsivonen>	Hixie: the old limit was around 5 MB. the spec was around 6 MB. the new limit is 7 MB
19:24	<zewt>	(some people might still call advance() a ton to try to seek, but you can only babysit so much)
19:24	<rillian_>	is it expensive on vfr mp4?
19:24	<zewt>	not sure
19:24	<rillian_>	I guess it's pretty bad on webm
19:25	<rillian_>	you can go a chunk at a time, but still
19:29	<Hixie>	hsivonen: oh, wow.
19:29	<Hixie>	hsivonen: i wonder what i added to make the difference so high
19:30	<danbeam>	anybody know if it's intentional that it's pretty much impossible to find out if setting a style from the JS/DCOM will actually trigger a CSS transition / [webkitT]ransitionEnd event? I'm having issues where I'd like to fire a callback on webkitTransitionEnd but if there's no style that ends up changing (i.e. you simply set the same style) you'll never reach this event handler as you never triggered a transition...
19:30	<Hixie>	hsivonen: it still seems to catch errors all the way to near the bottom of the spec
19:30	<Hixie>	hsivonen: so i assumed it was just on the edge
19:32	<rillian_>	zewt: I think reason for the argument was so you could call video.advance(-1)
19:32	<zewt>	doesn't have to be an integer to allow that (though also, scanning backwards can be pretty expensive)
19:33	<rillian_>	which isn't as expensive as large n, but is still a lot of new code
19:33	<rillian_>	zewt: right
19:33	<zewt>	not necessarily much code, but it's often an expensive operation
19:34	<rillian_>	well, you might have to remember the last keyframe, if you're not already?
19:35	<zewt>	only if you want to optimize it further, but that's very low-level...
19:36	<zewt>	(depending heavily on the codec, of course--many formats you'll need to keep the keyframe around anyway)
19:36	<zewt>	(or multiple keyframes)
19:38	<danbeam>	s/DCOM/DOM/ **
19:38	<rillian_>	well, the reason this never works is you really want to just buffer a bunch of decoded data so you can step around
19:38	<rillian_>	which is what editing applications do
19:38	<rillian_>	but that adds a lot of footprint for a feature which mostly isn't used
19:38	<zewt>	generally when it works, seeking backwards is just painfully slow, decoding everything over and over
19:39	<zewt>	editing applications tend to just reencode the video in something designed for it (stuff that doesn't keyframe once a year)
19:39	<rillian_>	that too
19:40	<rillian_>	anyway, I think it has to map to a low-level call inside the playback engine
19:40	<rillian_>	because for variable frame rate formats, you can't be sure you're moving to a particular frame number without codec- and container- specific knowledge
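A small Python sketch of rillian_'s point, under stated assumptions: with a fixed frame rate, a frame number maps to a time by arithmetic alone, while with a variable frame rate you need the per-frame timestamps that only the container or codec can supply. The timestamp list and both function names are made up for illustration.

```python
from bisect import bisect_right
from fractions import Fraction


def frame_time_fixed(n, fps):
    """Fixed frame rate: frame n starts at n / fps seconds."""
    return Fraction(n) / fps


def next_frame_time_vfr(timestamps, current):
    """Variable frame rate: look up the first frame starting after `current`.

    `timestamps` is a sorted list of frame start times in seconds,
    assumed to come from container-specific metadata.
    Returns None when `current` is already in the last frame.
    """
    i = bisect_right(timestamps, current)
    return timestamps[i] if i < len(timestamps) else None


# Hypothetical variable-rate stream: frame durations differ.
vfr_timestamps = [Fraction(0), Fraction(1, 30), Fraction(1, 10), Fraction(1, 2)]

fixed = frame_time_fixed(3, 30)
stepped = next_frame_time_vfr(vfr_timestamps, Fraction(1, 30))
```

The fixed-rate function needs no stream data at all; the variable-rate one is only as good as the timestamp table handed to it, which is why the step operation has to live inside the playback engine.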
19:42	<rillian_>	which is why fixed frame rate is the ONE TRUE WAY! ahem
20:15	<Ms2ger>	Hmm, nice
20:16	<Ms2ger>	Apparently all of Gecko/Webkit/Presto let you do handle = setInterval(); clearTimeout(handle);
20:22	<annevk>	Ms2ger: isn't that how that feature works?
20:23	<Ms2ger>	Not per spec afaict
20:23	<Ms2ger>	Note interval <-> timeout
21:10	<jgraham>	Ms2ger: Oh, that is interesting
21:11	<jgraham>	Presumably the reverse is also true so clearInterval and clearTimeout are synonyms?
21:12	<Ms2ger>	That's true in Gecko, haven't tested
21:14	<jgraham>	Any reason not to make the spec say that?
21:14	<Ms2ger>	Probably not, I filed a bug
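The behaviour Ms2ger observed is what you would get if both kinds of timer shared one table of handles. The following is a toy Python model of that idea only; it is not taken from the spec or from any engine, and the `TimerTable` class and its method names are invented for illustration.

```python
import itertools


class TimerTable:
    """Toy model: timeouts and intervals share a single handle space."""

    def __init__(self):
        self._next_handle = itertools.count(1)
        self._active = {}  # handle -> (kind, callback)

    def set_timeout(self, callback):
        handle = next(self._next_handle)
        self._active[handle] = ("timeout", callback)
        return handle

    def set_interval(self, callback):
        handle = next(self._next_handle)
        self._active[handle] = ("interval", callback)
        return handle

    def clear(self, handle):
        # Unknown handles are ignored, whichever kind of timer made them.
        self._active.pop(handle, None)

    # With one shared table, the two clear functions are synonyms.
    clear_timeout = clear
    clear_interval = clear

    def is_active(self, handle):
        return handle in self._active


timers = TimerTable()
handle = timers.set_interval(lambda: None)
timers.clear_timeout(handle)  # clears the interval in this model
```

In this model the reverse case jgraham asks about falls out for free: `clear_interval` on a handle from `set_timeout` removes it too.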
21:16	<TabAtkins>	jgraham: I wouldn't hold your breath, no; it'll certainly take more than a minute or two to do it.
21:16	<TabAtkins>	But I've talked with fantasai about it, and we both think it's a reasonable idea.
21:19	<jgraham>	TabAtkins: Well ignoring the fact that selectors is turning into something that closely resembles line noise already (or Perl), waiting years for the inevitable wrangling about who owns the spec and where it is allowed to discuss it and then more time as people debate syntax seems relatively unappealing compared to slapping the already-implemented-in-Opera API onto XPath and covering all the same use cases. Even if we still so the other thing.
21:19	<jgraham>	*do
21:23	<TabAtkins>	Do you already understand XPath?
21:23	<erlehmann>	i once made a content management system using XSLT
21:23	<jgraham>	Me? I understand it enough to use it when I use lxml
21:23	<erlehmann>	madness
21:23	<jgraham>	XSLT != XPath
21:23	<jgraham>	XSLT is indeed nuts
21:24	<erlehmann>	jgraham, i had to use Xpath in between.
21:24	<jgraham>	Sure, XSLT depends on XPath
21:24	<TabAtkins>	jgraham: Then you are an extremely tiny minority of authors. Almost all authors are unaware that there even is such a thing as XPath, and would react badly if we tried to tell them to use a completely different selection syntax if they want a new feature, that doesn't work with any of the old features.
21:24	<erlehmann>	jgraham, i think the difference between perl line noise and CSS is that CSS is single-pass tokenizing. you can't parse perl. (at all)
21:25	<jgraham>	TabAtkins: The selectors way might be a good long term thing for that reason
21:25	<TabAtkins>	And, heh, if you think CSS is line noise, I don't see how you don't think even worse of XPath. ^_^
21:25	<erlehmann>	what TabAtkins says, it sounds reasonable
21:25	<jgraham>	Even though selectors scales badly due to the syntax
21:25	<erlehmann>	XPath is just lots of JS comments to the trained eye (starting with //)
21:26	<jgraham>	XPath mostly has a consistent syntax afaict
21:26	<erlehmann>	scales badly?
21:26	<jgraham>	Selectors just picks a new character for each new feature
21:26	<jgraham>	By 2050 I will probably need to have emoji input to make complex selectors
21:26	<TabAtkins>	Only for syntax-level features. Most features can be exposed through pseudoclasses and similar.
21:27	<erlehmann>	what became of :outside?
21:27	<Ms2ger>	::outside*
21:27	<erlehmann>	::outside i mean
21:27	<TabAtkins>	Doesn't exist yet. The draft speccing it is currently abandoned.
21:27	<erlehmann>	I WANT OUTSIDE
21:27	<erlehmann>	breaking out of div hell is great.
21:27	<jgraham>	erlehmann: Thank you for demonstrating my point :)
21:28	<TabAtkins>	erlehmann: I know, I want functionality like that too.
21:28	<erlehmann>	jgraham, i like CSS. it can do, err, stuff.
21:29	<erlehmann>	here, take some blog.fefe.de/?css=http://daten.dieweltistgarnichtso.net/src/fefe-anaglyph-css/anaglyph.css
21:29	<erlehmann>	(caveat: red-cyan glasses needed.)
21:30	miketaylr	goes blind
21:30	jgraham	isn't saying anything about CSS in general
21:31	<erlehmann>	miketaylr, that blog allows external CSS. i also made an imageboard style and a facebook-like one once.
21:31	<erlehmann>	:)
21:31	<miketaylr>	ooo cool
21:31	<erlehmann>	it is a fun demo ground for neat tricks
21:33	<erlehmann>	http://blog.fefe.de/?css=http://daten.dieweltistgarnichtso.net/src/fefesbook-css/fefesbook.css
21:33	<erlehmann>	see what i did there?
21:33	<miketaylr>	heh
21:33	<miketaylr>	aw bummer, http://blog.fefe.de/?css=data:text/css,h1{color:green}
21:33	<miketaylr>	:P
21:34	<erlehmann>	i use html::before
21:34	<erlehmann>	BOW BEFORE ME
21:34	<erlehmann>	:D
21:35	<erlehmann>	maybe i should do an article on abusing CSS
21:35	<erlehmann>	hehe
21:35	<erlehmann>	selectors fun is fun!
21:48	<finnala>	With great power comes great responsibility
22:32	<TabAtkins>	Yup, Bjoern is now in my killfilter. Nearly every interaction I have with him is him trolling.
22:37	<TabAtkins>	Hmm, looks like a recency illusion, actually. He's only recently been trolling, and only in CSS stuff. I'll remove the filter and let him ride a while longer.
22:44	<Hixie>	bjoern h?
22:44	<TabAtkins>	Yeah.
22:47	<Hixie>	i haven't found him trolling, though i have for the past few years found his priorities are more theoretical than i am comfortable with
22:47	<Hixie>	he used to be eminently practical in his feedback
22:47	<TabAtkins>	Yeah, when I looked back through my archives, his feedback on html or js stuff seems fine.
22:47	<Hixie>	now he tends to talk about process and theoretical spec correctness issues
22:48	<Hixie>	(much like julian)
22:50	<jgraham>	TabAtkins: FWIW I think accusing Robin of "arguing badly" was unjustified
22:51	<jgraham>	AFAICT there is no actual disagreement about facts only about priorities
22:52	<jgraham>	Everyone agrees that selectors don't address all the use cases today. The only disagreement is about whether the remaining use cases are important enough to address now rather than in a hypothetical future selectors spec
22:53	<jgraham>	At the cost of having nice API for two selection methods in the short term as opposed to nice API for one method and hideous API for one method
22:53	<TabAtkins>	Possibly. He was being needlessly sarcastic, and then compounded it with impugning my motives and hyperbolizing.
22:54	<TabAtkins>	I claimed he was arguing badly at the sarcasm point, though, where it was much more weakly justified.
22:55	<TabAtkins>	At this point I think it's quite accurate, though.
22:56	<jgraham>	Anyway I will go to sleep now. No doubt there will be a deluge of mail about this to look forward to :)
23:10	<Hixie>	anyone know if in mysql there's a way to check if a column's value is equal to one of a set of strings? short of manually constructing a sequence of ORed expressions?
23:11	<Hixie>	hmm, looks like IN (?, ?) will do it
23:11	<Hixie>	i wonder if there's a way to make DBI automatically fill the right number of ?s
23:13	<Philip`>	I just use '... IN ('.(join ', ', map '?', @data).')' to get the right number of them
23:14	Philip`	has never encountered a more elegant method
23:14	<TabAtkins>	Hixie appears to be above the crass practice of direct SQL string creation.
23:14	<Hixie>	Philip`: yeah that's what i'm doing, but i was hoping for something neater
23:14	<Philip`>	It's less crass if you're dynamically generating placeholders, and not putting user-supplied data into the query string
23:15	<Hixie>	yeah, it's not terrible
23:15	<Hixie>	it's still not pretty though :-)
23:16	<Philip`>	You could always create a temporary table, insert each item into that table, then do the query with "foo IN (SELECT value FROM that_temporary_table)"
23:16	<Hixie>	i would assume that that has worse perf characteristics
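For comparison, here is a rough Python sketch of the temporary-table approach Philip` suggests. It uses the standard-library `sqlite3` module with an in-memory database rather than MySQL and DBI, so it says nothing about MySQL's performance; the table and column names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (name) VALUES (?)",
    [("a",), ("b",), ("c",), ("d",)],
)

wanted = ["a", "c", "z"]

# Load the candidate strings into a temporary table, one row each.
conn.execute("CREATE TEMP TABLE wanted_names (value TEXT)")
conn.executemany(
    "INSERT INTO wanted_names (value) VALUES (?)",
    [(w,) for w in wanted],
)

# The query text no longer depends on how many strings there are.
rows = conn.execute(
    "SELECT name FROM items "
    "WHERE name IN (SELECT value FROM wanted_names) "
    "ORDER BY name"
).fetchall()
```

The query stays fixed regardless of the list length, at the price of extra statements to create and fill the table.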
23:16	<Hixie>	what i'd like is just to be able to put a single ? in the query, and pass DBI an arrayref and have it expand it appropriately
23:46	<annevk>	oops
23:46	<annevk>	bit over aggressive there on the wiki
23:47	<Hixie>	no worries
23:51	<annevk>	ojan: annotating everything with [Scope] on Document/Element and their derived interfaces seems like it would add a lot of noise
23:51	<zewt>	i've wanted sql apis to be able to do things like select("select * from table where id in (?)", [1,2,3]), and know (or have a simple way of saying) that the array should be expanded appropriately
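The API that Hixie and zewt are wishing for can be approximated with a small wrapper. This is a minimal Python sketch, assuming `sqlite3` in place of DBI; `expand_sequences` is a hypothetical helper, not part of any library. It splits naively on `?`, so it would misfire on a literal question mark inside a quoted SQL string.

```python
import sqlite3


def expand_sequences(sql, params):
    """Turn each `?` bound to a list or tuple into `?, ?, ...`.

    Returns the rewritten SQL and a flat parameter list. Values are still
    passed as bound parameters; only placeholders are generated.
    """
    parts = sql.split("?")
    if len(parts) - 1 != len(params):
        raise ValueError("placeholder count does not match parameter count")
    out_sql = [parts[0]]
    out_params = []
    for param, tail in zip(params, parts[1:]):
        if isinstance(param, (list, tuple)):
            if not param:
                raise ValueError("cannot expand an empty sequence")
            out_sql.append(", ".join("?" * len(param)))
            out_params.extend(param)
        else:
            out_sql.append("?")
            out_params.append(param)
        out_sql.append(tail)
    return "".join(out_sql), out_params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, kind TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "x"), (2, "y"), (3, "x"), (4, "x")],
)

sql, params = expand_sequences(
    "SELECT id FROM t WHERE id IN (?) AND kind = ? ORDER BY id",
    [[1, 2, 3], "x"],
)
rows = conn.execute(sql, params).fetchall()
```

This is the same trick as joining a list of `?` by hand, just hidden behind one call, so no user-supplied data ever ends up in the query string.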
23:52	<annevk>	ojan: I can see how that makes sense if the new set becomes larger... but at this point in time?
23:52	<AryehGregor>	Hixie, the ? stuff probably maps to prepared statements on the DB side, no? Not something in the library?
23:52	<zewt>	(the same thing as what hixie said, actually)
23:52	<zewt>	no, ? is usually "expand an (escaped) parameter here"
23:53	<AryehGregor>	Yes, but is that done by the client library or the server?
23:53	<zewt>	library
23:53	<AryehGregor>	I guess it's probably done by the client library in this case.
23:53	<zewt>	the communication to the server is just the resulting SQL
23:53	<AryehGregor>	Although MySQL supports a feature with the same syntax, it would take three statements to use it, I guess: http://dev.mysql.com/doc/refman/5.6/en/sql-syntax-prepared-statements.html
23:54	<annevk>	from twitter "I use the first thing that autocompletes in the devtools :)" so that's how people end up using document.width
23:54	<AryehGregor>	More, actually.
23:54	<Hixie>	AryehGregor: either way
23:54	<AryehGregor>	What's document.width anyway?
23:54	<Hixie>	AryehGregor: i don't mind how it's implemented ;-)
23:54	<AryehGregor>	I saw people talking about it.
23:54	<annevk>	it's nothing now
23:54	<zewt>	iirc i've wanted the same syntax to be able to fill in array literals (with postgres)
23:55	<annevk>	I guess it's time to find some more features to remove
23:55	<annevk>	well, not right now, now it's sleepy time :)
23:56	<Philip`>	zewt: Common advice is to use the ? placeholders so the server can cache the optimised query and reuse it with any arguments, which seems incompatible with the idea that the expansion is done by the client library
23:56	<ojan>	annevk: yeah...i just wouldn't want new features to fall through the cracks
23:56	<ojan>	but i guess the consequences aren't that bad
23:56	<ojan>	my hope is more that we could reduce the list over time
23:57	<ojan>	but maybe it's not worth the effort
23:57	<zewt>	the main reason for ? placeholders is to abstract away string escapes
23:57	<zewt>	anything else might be useful but a distant second...
23:57	<AryehGregor>	Philip`, it might be two different features using the same ? syntax.
23:57	<zewt>	that's definitely what prepared statements are for; i don't know if non-prepared statements do that too
23:58	<zewt>	though i guess i wouldn't be surprised, at least with postgres which (last i knew) is a lot more aggressive about optimizing than mysql
23:58	<TabAtkins>	Philip`: It's compatible with the idea that you don't have to worry about escaping, because the client library handles it all for you.
23:59	<zewt>	though optimization is a bit trickier than that, since you don't know which parts of the query are static and which are variable
23:59	<zewt>	(which you do know with prepared statements)
23:59	<AryehGregor>	MySQL is great at optimizing, providing you tell it exactly what optimizations to do and are fine with completely rewriting your queries in harebrained ways to trick it into being not completely retarded.
23:59	<Hixie>	nessy: ping