2010-02-01 [16:00:00.0000] annevk: it isn't relevant [16:00:01.0000] if I use xhtml that all just works [16:00:02.0000] in that example using was simpler [16:00:03.0000] well sure [16:00:04.0000] but XHTML does not work in IE [16:01:00.0000] so that would be a problem [16:01:01.0000] if we want to test IE [16:01:02.0000] using divs would have required id attributes or something [16:01:03.0000] that wasn't a testcase [16:01:04.0000] but just an example [16:01:05.0000] even trivial one [16:01:06.0000] anyways, it seems like someone needs to define hit testing [16:01:07.0000] maybe [16:02:00.0000] though, if IE really behaves like that with mouseenter/leave [16:02:01.0000] I think those should be just dropped [16:02:02.0000] those aren't that useful [16:02:03.0000] are you sure? [16:02:04.0000] well, you can emulate them using scripts [16:02:05.0000] lots of developers want those events [16:02:06.0000] and mouseover/out [16:03:00.0000] I would wager that a majority of all uses of mouseover/out really wants mouseenter/leave [16:04:00.0000] e.g. a simple search gives http://blog.stchur.com/2007/03/15/mouseenter-and-mouseleave-events-for-firefox-and-other-non-ie-browsers/ [16:24:00.0000] http://foolip.org/microdatajs/live/ neat [16:24:01.0000] /me -> past bedtime [16:24:02.0000] nn [17:16:00.0000] foolip, your microdata tool is awesome [17:16:01.0000] i've added it to my list of consoles on http://damowmow.com/portal/ (top right) [17:16:02.0000] Hixie, new version of Requiem is out. It aparently fixes the 4GiB file size limit. [17:16:03.0000] nice [17:16:04.0000] (there was a 4GB file size limit?) [17:17:00.0000] it's taking me a while to download it from Tor - it's a really slow connection today - and I couldn't find it on any torrent sites yet. 
[17:18:00.0000] ah [17:19:00.0000] there was a limit, but since only a few of the HD movies available from iTunes are over 4GB, it probably hasn't affected you [17:20:00.0000] it hasn't affected me either, since being outside the US, there is no HD content available [17:21:00.0000] heh [17:22:00.0000] how weird [17:22:01.0000] and the Australian iTunes store (which my account is for) is unlikely to bother with HD content while our internet continues to be over priced and capped with extremely low usage caps [17:22:02.0000] don't they have an itunes datacenter? [17:24:00.0000] yes, they do. But our ISPs count your data usage, and downloading 2GB HD TV shows frequently will send most users over their 10GB to 20GB monthly limit quite quickly. (Plans with more than that are way over priced) [17:25:00.0000] and so unless Apple does some deals with Australian ISPs to make iTunes content be unmetered, the market for HD content will be very small [17:26:00.0000] so basically unless you lose your net neutrality, you're screwed? [17:26:01.0000] what is it with english-speaking countries and being stupid [17:28:00.0000] yeah, basically. Although, to some extent, some Aussie ISPs have gone against net neutrality, and have in the past, included some major download sites (like tucows and cnet downloads) as part of the unmetered freezone. [17:29:00.0000] Telstra includes its own BigPond network in their free zone too [17:30:00.0000] Hixie, maybe non-English-speaking countries are also stupid, but you don't notice it as much because it's in a foreign language? [17:30:01.0000] I've heard claims that the reason for the over priced, capped internet is because the cost of sending data across the Pacific ocean is expensive. [17:30:02.0000] Bandwidth caps make more sense than "unlimited" plans where you get threatening calls from your ISP if you use too much bandwidth . . . 
[17:30:03.0000] i speak french and lived in switzerland, and also lived in norway, and i haven't seen the same kinds of stupidity as in the UK, US, and Australia. [17:31:00.0000] but i'm sure each country has its own stupidity, certainly :-) [17:31:01.0000] Hixie, there are plenty of non-English speaking countries that are stupid. Like China, Iran (I think), and some other middle eastern countries. [17:31:02.0000] Unlimited plans don't make much market sense, they encourage tragedies of the commons like BitTorrent. [17:32:00.0000] AryehGregor, what do you mean by tragedies of the commons? [17:32:01.0000] Lachy, I mean that if it's unlimited, you won't care how much you use, so you'll be much more likely to use an unreasonably large amount, which in aggregate degrades service for everyone. [17:32:02.0000] But every individual bears too little of the cost to care. [17:32:03.0000] http://en.wikipedia.org/wiki/Tragedy_of_the_commons [17:34:00.0000] I strongly disagree. Bandwidth caps place an unfair burden on end users, and place wholly artificial limits on what people can and can't do. [17:35:00.0000] How is it wholly artificial or unfair to pay for the resources you use? Most resources you use are metered, like electricity. [17:35:01.0000] But yeah, it's a pain, so people don't like it and ISPs try to avoid it. [17:36:00.0000] Which creates tragedies of the commons, but oh well. [17:38:00.0000] AryehGregor: different argument. Your electricity is metered, but it's probably unlimited [17:38:01.0000] Ah, someone upgraded the wiki to 1.15.1. [17:39:00.0000] (unlimited within capacity provision) [17:39:01.0000] GarethAdams|Home, well, yes. Caps are just a coarser form of metering, though, where you have to pay in advance to upgrade to the next plan. [17:39:02.0000] Lachy, should I upgrade the wiki to the latest alphas from SVN so that it's outputting HTML5 instead of XHTML1 Transitional? 
:) [17:40:00.0000] Personally I think pre-pay vs credit is quite a fundamental difference [17:40:01.0000] AryehGregor, bandwidth is not a limited resource in the same sense as physical resources, and bandwidth caps attempt to solve the issue the wrong way. [17:40:02.0000] Lachy, it's limited every bit as much as things like electricity or water. How would you solve it? [17:41:00.0000] Well, it's too late for me to get into an argument, but I could set up MediaWiki 1.16alpha tonight, or not. [17:41:01.0000] You gave me shell access, but I don't want to do it without at least one other person thinking it's a reasonable idea. :) [17:41:02.0000] Bandwidth is, for the sake of simplicity, effectively a measure of how much data can be transmitted at the same time. So offering plans based on the bandwidth, or rather speed, provided to the user makes sense. [17:42:00.0000] Whee, still using MySQL 4.1. At least one user is benefiting for our support for ancient MySQL versions! [17:44:00.0000] But imposing monthly usage limits tries to treat data as a finite resource, and implies that it will somehow run out if too much is used in a month. [17:45:00.0000] Well, what they really care about is how much you're using at peak. If there's excess capacity, it makes no difference to them, no. So it's a crude metric, that's true. [17:45:01.0000] That's not quite true, actually. [17:45:02.0000] They do pay for every bit that goes over a backbone, AFAIK. [17:45:03.0000] but that's not the case at all. If a backbone can support 1Gbps, and 50 users are connected to that, each going flat out on their ADSL2+, 20Mbps connection, they could do that all indefinitely. [17:45:04.0000] But anyway, are you going to answer me about upgrading the WHATWG wiki? [17:46:00.0000] Yes, but they oversell, because people don't use all their bandwidth at once. [17:46:01.0000] Although, with bandwidth caps, they would hit their artificial limit within a day [17:46:02.0000] right. 
But bandwidth caps do nothing to address that problem of over selling [17:47:00.0000] Overselling isn't a problem, it's the only thing that makes sense. It would be crazy to let massive bandwidth capacity lie idle because you have to reserve a fixed amount for each customer. [17:48:00.0000] /me wonders why Lachy keeps on ignoring the questions about the wiki. [17:48:01.0000] oh, I missed the wiki question. [17:48:02.0000] I only asked it about eight times. [17:50:00.0000] sure, do whatever you like with the wiki (as long as you don't break it). [17:50:01.0000] or if you do break it, fix it [17:50:02.0000] There are no hacks, are there? [17:50:03.0000] So I can overwrite the files and there will be no problems? [17:50:04.0000] no [17:50:05.0000] k. [17:50:06.0000] there are plugins, I believe [17:51:00.0000] can't remember which ones we have installed though [17:51:01.0000] ConfirmEdit, no problem. [17:57:00.0000] AryehGregor, have you ever lived with an internet connection that was capped? [17:58:00.0000] No. Is that relevant? [17:59:00.0000] I was just wondering, since if you had, you might actually understand the detrimental effects that caps can have on many things, including work. I can tell you for a fact that it's no fun trying to do web development when your connection has been shaped to 64kbps for going over your monthly bandwidth limit [18:00:00.0000] Did I ever say caps were fun? I'm pretty sure I explicitly said customers don't like them. [18:00:01.0000] (although, I was talking more about metering than caps) [18:00:02.0000] metering implies caps [18:01:00.0000] No it doesn't. My server's connection is metered but uncapped. 
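The figures quoted in this exchange (a 1Gbps backbone, 50 users at 20Mbps, monthly limits of 10GB to 20GB) can be checked with a little arithmetic. The sketch below is only a sanity check, assuming decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bits per second) and a user downloading flat out; the function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60
BITS_PER_BYTE = 8
MEGABIT = 10**6    # assumption: decimal megabits
GIGABYTE = 10**9   # assumption: decimal gigabytes


def aggregate_demand_mbps(users: int, per_user_mbps: float) -> float:
    """Total demand if every user saturates their line at the same time."""
    return users * per_user_mbps


def gb_per_day(mbps: float) -> float:
    """Data moved in one day by a line running flat out at `mbps`."""
    bytes_per_second = mbps * MEGABIT / BITS_PER_BYTE
    return bytes_per_second * SECONDS_PER_DAY / GIGABYTE


def hours_to_reach_cap(mbps: float, cap_gb: float) -> float:
    """Hours of flat-out use before a monthly limit of `cap_gb` is reached."""
    seconds = cap_gb * GIGABYTE * BITS_PER_BYTE / (mbps * MEGABIT)
    return seconds / 3600


if __name__ == "__main__":
    # 50 users at 20Mbps add up to 1000Mbps, i.e. exactly the 1Gbps backbone.
    print(aggregate_demand_mbps(50, 20))
    # One day flat out at 20Mbps.
    print(gb_per_day(20))
    # Time to exhaust the low and high ends of the quoted monthly limits.
    print(hours_to_reach_cap(20, 10), hours_to_reach_cap(20, 20))
```

Under these assumptions a saturated 20Mbps line moves 216GB in a day, so a 10GB to 20GB monthly limit is reached after roughly one to two hours, which is consistent with "within a day".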
[18:01:01.0000] unless you're talking about plans that charge per MB, and can get quite expensive really quickly [18:01:02.0000] I've legitimately done about 1GiB/day [18:01:03.0000] granted, scp'ing 100MiB files to a server is going to do that rather quickly [18:02:00.0000] Sure, and it would be fair if you were charged more than someone who just checks their e-mail. [18:03:00.0000] if someone just wants to check their e-mail, give them a low bandwidth, but uncapped, plan. I don't mind paying extra for the extra bandwidth. But I expect to be able to make use of it, which caps don't let me do [18:04:00.0000] Updating wiki, will be down for a few minutes. [18:04:01.0000] Back up. [18:05:00.0000] Try viewing source. :) [18:06:00.0000] Anyone who finds a problem should report it to me and I'll fix it. Meanwhile, I'm going to sleep. [18:06:01.0000] [18:06:02.0000] [18:06:03.0000] wtf? [18:06:04.0000] Oh, right. [18:06:05.0000] lose the version crap [18:06:06.0000] Let me disable the RDFa. :P [18:06:07.0000] how did RDFa shit get into mediawiki? [18:06:08.0000] Try now. [18:07:00.0000] Someone committed it. [18:07:01.0000] Then I said we should also allow microdata, and then I said we should allow only microdata, and a lengthy debate ensued on the mailing list. [18:07:02.0000] (wikitech-l, I mean) [18:07:03.0000] I think the likely outcome is that both get disabled by default. [18:07:04.0000] you should get the RDFa removed. It's non-conforming HTML5 [18:08:00.0000] It's conforming HTML5+RDFa! [18:08:01.0000] Anyway, conformance matters less than usefulness. [18:08:02.0000] The HTML5 mode means a whole truckload of non-conforming content, since we don't blacklist things like cellpadding and what have you. [18:08:03.0000] that's true. But HTML+RDFa is a completely useless spec [18:09:00.0000] But it also means more features, so to heck with validators. [18:09:01.0000] Well, you don't have to convince me of that. 
[18:09:02.0000] /me points out type=search in the search bar, some type=number in preferences, autofocus in lots of places, . . . [18:10:00.0000] All of which I wrote. :P [18:11:00.0000] And of course, speaking of bandwidth, look at all those saved bytes! [18:11:01.0000] Okay, now's about time to go to bed. [18:34:00.0000] /me wonders if Lachy would be happier with an unmetered uncapped 256Kbps connection that he could use all day every day, or with a 20Mbps one that's capped at 100GB/month for the same cost [18:39:00.0000] /me looks at Jingle and SIP to see if either would be usable with and comes out disturbed at how complicated people make their protocols [22:25:00.0000] ping: Hixie [22:27:00.0000] hey [22:32:00.0000] hey Hixie - do you know, or know someone who knows, or know somewhere I can find out information about youtube.com/html5 ? [22:33:00.0000] what information? [22:33:01.0000] HTTP requisites [22:33:02.0000] not sure what you mean [22:33:03.0000] I'm getting 403 on my experimental build of webkit [22:33:04.0000] odd [22:33:05.0000] so presumably there are requisite headers [22:33:06.0000] 403 for the video file? [22:33:07.0000] yes [22:34:00.0000] and it works for regular webkit? [22:34:01.0000] how weird [22:34:02.0000] have you dumped the two network traffic sessions and compared them? [22:34:03.0000] i don't see why there'd be anything like that [22:34:04.0000] this is a different multimedia backend [22:35:00.0000] i'd recommend cracking out tcpdump and comparing them [22:35:01.0000] hmm, ok [22:36:00.0000] my thought is it's probably either a cookie it wants, or a missing refferer header [22:36:01.0000] my backend passes on neither at the moment [22:37:00.0000] anyway, I just thought you might have seen some documents floating around I could consult :) [22:38:00.0000] oh you almost certainly need the opt-in cookie [22:42:00.0000] yeah, unfortunately the webkit interface for media is somewhat broken in this respect. 
The backend is only passed a URL to the media file. [22:44:00.0000] seriously, i'd recommend using tcpdump to see what the working UA is sending [22:44:01.0000] it could be something obvious [22:44:02.0000] NickYoung: we do want to eventually fix the networking to go through WebKit's http stick - just tricky to beat both WebKit and relevant media engines into shape [22:44:03.0000] NickYoung: I think Referer or Cookie is a likely candidate [22:45:00.0000] NickYoung: if you find the differences, one easy way to see which is making the difference could be to use curl or wget with custom headers [22:47:00.0000] unfortunately I'm on linux atm, and HTML5 media support is in chrome (but not chromium) afaik [22:47:01.0000] which leaves me with nothing to compare against [22:48:00.0000] but a massive wget hack could work :P [22:59:00.0000] there's no official Chrome for Linux yet? [22:59:01.0000] wait, WebKit/Gtk has video support - does that not work? [22:59:02.0000] (or is that what you are working on now?) [23:00:00.0000] I'm working on WebKit/Qt [23:00:01.0000] you could check if WebKit/Gtk can handle YouTube, that might be the easiest point of reference (though of course you'd need a GStreamer module that does H.264...) [23:01:00.0000] yeah.. I have that [23:01:01.0000] I'll investigate :) [23:03:00.0000] also, it looks like there is an official chrome for linux now [23:20:00.0000] dear lord, the GTK version explicitly spoofs its user agent for movies.apple.com [00:09:00.0000] NickYoung: if you find that UA spoofing is indeed required for movies.apple.com, could you please send me the info so I can file a bug on them? mjs⊙ac [00:47:00.0000] othermaciej: does aria-hidden actually work in Firefox for the use case you mentioned on the list? [00:47:01.0000] hsivonen: does aria-hidden in Firefox cause content to be hidden from visual rendering? 
[00:48:00.0000] othermaciej: no, AFAIK [00:48:01.0000] hsivonen: I assume not, because that would violate the ARIA spec [00:48:02.0000] so I assume it does work in Firefox [00:48:03.0000] othermaciej: but at least at some point, it didn't really prune anything from the accessible tree, either [00:48:04.0000] hsivonen: that would presumably be a bug in their ARIA support... [00:48:05.0000] othermaciej: it depended on the author specifying *[aria-hidden="true"] { display: none; } [00:49:00.0000] hsivonen: I'm certainly not telling authors to include that, since it subverts the intent of the ARIA spec (as far as I can tell) [00:49:01.0000] othermaciej: I guess "the intent of the ARIA spec" is tricky thing [00:50:00.0000] my reading is that role="presentation" is supposed to skip an element (but still include its children) in the accessibility tree, and aria-hidden is supposed to hide the element *and* its children in the accessibility tree [00:50:01.0000] what I've heard from the folks behind ARIA, *[aria-hidden="true"] { display: none; } in an author style sheet is very much something that is expected [00:50:02.0000] othermaciej: that's my understanding of the spec, too. [00:51:00.0000] having that in a UA stylesheet would definitely violate the ARIA spec [00:51:01.0000] having it in an author stylesheet does not seem like a problem per se if you definitely want to visually hide all content that's also hidden from accessibility [00:51:02.0000] othermaciej: I said "author style sheet" intentionally. Putting it in the UA style sheet would be clearly wrong. 
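The reading of the spec given above (role="presentation" skips an element but keeps its children, aria-hidden drops the element and its children) can be sketched as a small tree walk. This is a toy model of that reading only, not how any browser builds its accessibility tree; the `Element` class, the `expose` function and the sample document are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Element:
    """Hypothetical stand-in for a DOM element."""
    tag: str
    attrs: Dict[str, str] = field(default_factory=dict)
    children: List[Union["Element", str]] = field(default_factory=list)


def expose(node):
    """Return the accessibility-tree nodes that `node` contributes.

    Text is kept as a plain string; an exposed element becomes a
    (tag, children) tuple.
    """
    if isinstance(node, str):
        return [node]
    if node.attrs.get("aria-hidden") == "true":
        # The element *and* its whole subtree are pruned.
        return []
    kids = [item for child in node.children for item in expose(child)]
    if node.attrs.get("role") == "presentation":
        # The element itself is skipped, its children are still exposed.
        return kids
    return [(node.tag, kids)]


if __name__ == "__main__":
    # Illustrative document, not the test case pasted in the channel.
    doc = Element("div", {"role": "presentation"}, [
        Element("p", {}, ["First paragraph."]),
        Element("p", {"aria-hidden": "true"}, ["Second paragraph."]),
        Element("p", {}, ["Third paragraph."]),
    ])
    print(expose(doc))
```

In this model the wrapper div disappears but its visible paragraphs survive, while the aria-hidden paragraph contributes nothing, independent of any `display: none` rule in an author style sheet.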
[00:51:03.0000] however, the author should not *have* to specify that, to hide content in the accessibility tree [00:52:00.0000] if the UA doesn't prune it from the accessibility tree unless the author includes "*[aria-hidden="true"] { display: none; }", that too would violate the ARIA spec (as I read it) [00:52:01.0000] or at least, if the UA+AT combination still presents the content in that case, then there is clearly a bug in at least one of them [00:52:02.0000] I agree. Dunno what Firefox does now and what's considered a bug and what's a feature here. [00:53:00.0000] but like I said, I've been advising people to use it like a recursive version of role="presentation" [00:55:00.0000] (the case this has particularly come up, the content does not actually need to work in other browsers, but that doesn't seem relevant to the actual issue of use cases) [00:55:01.0000] othermaciej: has this been for iTunes-embedded WebKit? [00:56:00.0000] can't really talk about the specific details, but feel free to speculate [00:56:01.0000] I see [00:56:02.0000] and I am sure we will need something similar in the future in content that works cross-browser [00:57:00.0000] at which point I guess we will have to test in all the target browsers we care about and possibly report bugs [01:06:00.0000] othermaciej: aaronlev said back in 2007 that aria-hidden did nothing in firefox but display:none hid from accessibility tree [01:06:01.0000] othermaciej: dunno if it has changed [01:07:00.0000] zcorpan: in Safari on Mac with VoiceOver, it definitely hides the whole subtree from the accessibility tree [01:07:01.0000] I could test in Firefox on Mac [01:07:02.0000] I have no easy way to test accessibility behavior on Windows though :-/ [01:08:00.0000] there's an aapi debug tool for windows [01:09:00.0000] inspect32.exe at http://www.microsoft.com/downloads/details.aspx?familyid=3755582a-a707-460a-bf21-1373316e13f0 [01:15:00.0000] my favorite thing about voiceover is that it shows what it's 
speaking, so you can even test with it muted... [01:16:00.0000] hmm my copy of Firefox does not seem to support VoiceOver on Mac [01:16:01.0000] I'm running 3.5.2 [01:16:02.0000] same thing with a Minefield alpha [01:17:00.0000] othermaciej: IIRC, Mac accessibility of Firefox got frozen, because the developers didn't get enough information out of Apple a couple of years ago [01:17:01.0000] hsivonen: do you know if Firefox supports accessibility at all on Mac? [01:17:02.0000] I can't get VoiceOver to read anything but the title controls [01:17:03.0000] othermaciej: there at least used to be code but IIRC it has never been turned on by default [01:17:04.0000] guess I can't test this after all [01:20:00.0000] and in Chrome only the toolbar supports VoiceOver, not the content area [01:21:00.0000] if anyone feels like testing on Windows, here is my test case: [01:21:01.0000]
First paragraph.
[01:21:02.0000] [01:21:03.0000]
Third paragraph.
[01:21:04.0000] (or on Linux) [01:21:05.0000] Safari definitely reads the first and third paragraph, but not the second, and shows all three [01:27:00.0000] othermaciej: from what i can tell with inspect32.exe, firefox does not hide the paragraph from msaa [01:31:00.0000] zcorpan: how about IE? [01:32:00.0000] surkov: is it intentional that Firefox doesn't hide stuff from the accessible tree if aria-hidden=true? [01:32:01.0000] smaug__, David Bolter tried to contact with Apple developers to get some help but no real success iirc [01:32:02.0000] hsivonen: I think so, iirc this ARIA requirement [01:32:03.0000] jeez. i only got this laptop last week and i'm already nearly killing it again. [01:32:04.0000] we set proper state on the accessible [01:32:05.0000] /me moves his laptop further away from the tea cup [01:32:06.0000] othermaciej: same as firefox [01:33:00.0000] maybe on Windows it's the screen reader's job to hide aria-hidden stuff [01:34:00.0000] (I don't actually know which of Safari or VoiceOver hides it on Mac, I tested end-to-end) [01:34:01.0000] surkov: where is the requirement in ARIA? [01:35:00.0000] hsivonen: it sounds the things were changed, now ARIA impl guide sais "# has one of the WAI-ARIA global states and properties but does not have the aria-hidden property set to "true". Hidden elements are not exposed to assistive technology." [01:35:01.0000] http://www.w3.org/WAI/PF/aria-implementation/ [01:36:00.0000] "This is not used in mapping to platform accessibility APIs. Instead, use information from the layout system to determine if the element is hidden or not. Advisory: it is incorrect use of ARIA if an element with aria-hidden="true" is visible. The aria-hidden property is exposed only so that DOM-based assistive technologies can be informed of visibility changes. However, the layout will be able to provide the most complete set of all truly hidden [01:36:01.0000] nodes." 
[01:36:02.0000] (same url) [01:36:03.0000] uh i mean https://developer.mozilla.org/En/ARIA_to_API_mapping [01:36:04.0000] hmm, that seems to contradict what WAI-ARIA itself says [01:37:00.0000] (but the other url seems to say the same thing) [01:37:01.0000] zcorpan: English translation: aria-hidden is only for IE7+JAWS [01:37:02.0000] "This allows http://www.w3.org/TR/wai-aria/terms#def_at or http://www.w3.org/TR/wai-aria/terms#def_useragent to properly skip http://www.w3.org/TR/wai-aria/terms#def_hidden elements in the document." [01:37:03.0000] yay for UA-specific features [01:37:04.0000] (boy that copied oddly) [01:38:00.0000] so I wonder what is the correct way to do the deep equivalent of role="presentation" [01:38:01.0000] should you just put role="presentation" on every element in the subtree? (But that still won't affect the text nodes) [01:39:00.0000] hsivonen: actually it sounds aria-hidden doesn't make sense at all in the current aria impl guide edition [01:39:01.0000] othermaciej: maybe send feedback to public-pfwg-comments⊙wo today? today is the DL for ARIA comments. 
[01:39:02.0000] I wonder also why the implementor's guide apprently contradicts the spec on this [01:39:03.0000] hsivonen: will attempt to [01:40:00.0000] hsivonen: I'll ping David Bolter about this issue since he is a member of ARIA group [01:40:01.0000] surkov: thanks [01:41:00.0000] othermaciej: the correct way is "don't have presentational elements" [01:41:01.0000] I'm posting to public-pfwg-comments but I don't necessarily expect to have a prompt reply [01:42:00.0000] Hixie: we have some markup that would give bad results in a screen reader if you didn't hide part of it, but where the extra bits can't be tacked on via CSS [01:42:01.0000] othermaciej: sounds like a bug for css [01:42:02.0000] Hixie: so maybe your answer is "redesign the UI", but that does not seem like helpful advice [01:43:00.0000] is it a goal for CSS to allow generated content with arbitrarily complex internal structure and styling? [01:43:01.0000] I mean, you can use XBL to do that [01:43:02.0000] but in that case you'd still need ARIA to control how your XBL binding is presented to a screen reader [01:43:03.0000] maybe the answer is xbl [01:43:04.0000] it's hard to know without examining the actual use case [01:43:05.0000] which regrettably I am not at liberty to paste here [01:44:00.0000] i think we'll probably need to add something to xbl to indicate whether the accessibility apis are expected to crawl it or not [01:44:01.0000] really this should be solved by not using the same media type for screens as screen readers [01:44:02.0000] but that's another story [01:45:00.0000] I think you may need more control than binary "yes" or "no" over the accessibility behavior [01:45:01.0000] though I suppose reading either the surface markup or the contents of the XBL binding which may internally use ARIA would cut it [01:45:02.0000] but you'd still need a proper mechanism to hide parts of the XBL shadow tree from screen readers [01:46:00.0000] using a different media type would be cleaner CSS wise, 
but would require generating two render trees to have both visual and audio presentation at the same time [01:46:01.0000] (which is the only mode VoiceOver has, so you can work with or help a blind person by looking over their shoulder) [01:55:00.0000] I agree XBL markup should have a stuffs to control its accessible tree, in firefox we some basic stuffs to do this. For example, accessible for bound XBL element can provide accessible name or XBL can allow or deny accessible children in its subtree [01:56:00.0000] ideally I think XBL should special markup to control all this stuffs [02:12:00.0000] hsivonen, zcorpan, surkov: http://lists.w3.org/Archives/Public/public-pfwg-comments/2010JanMar/0038.html [02:12:01.0000] http://www.floodgap.com/software/classilla/ wow [02:12:02.0000] surkov: I think the markup to control accessible presentation of XBL should be (a possibly extended version of) ARIA, and it should also be usable outside XBL, so that it's viable to use DIV soup as fallback to XBL during the transition [02:13:00.0000] I would not want to see a whole second form of accessibility markup [02:13:01.0000] hsivonen: neat parlor trick, I guess [02:15:00.0000] othermaciej, if origin is a unique identifier the result is Origin: null [02:16:00.0000] and my idea was that it would support preflights as well [02:16:01.0000] because that was the plan for Level 2 of their protocol anyway [02:18:00.0000] ok well i've dealt with all the websocket feedback that is not on the hybi list and all the feedback on the hybi list up to the 28th of this month [02:18:01.0000] all that's left is GIANT threads of pain [02:18:02.0000] i guess i go through and delete the process ones first [02:18:03.0000] annevk: should there be a "force no preflight" flag anyway to define XDR sanely? 
[02:18:04.0000] annevk: I guess XDR just doesn't allow sending any requests that would cause preflight, so nevermind [02:19:00.0000] Hixie: I tried to move discussion into the technical threads [02:19:01.0000] yeah that helps a lot, indeed [02:20:00.0000] i like the subthread with the subject line "Process, was: Technical feedback. was: Process" [02:20:01.0000] Hixie: I have some potentially bad news, which is that I came up with a way to do the handshake that is both much more secure and likely much easier to use in existing server software [02:20:02.0000] oh? [02:20:03.0000] Hixie: is it too late to consider changes to the handshake? [02:20:04.0000] depends how much of an improvement it is [02:20:05.0000] what is the improvement? [02:21:00.0000] ok, so as I understand it, the security goal of the "exact binary match" requirement on the first part of the header is to reduce the risk of abusing unmodified vanilla HTTP servers, or non-HTTP resources [02:21:01.0000] othermaciej: thanks [02:21:02.0000] othermaciej: more or less, yeah [02:21:03.0000] so there's two problems with the current setup: [02:21:04.0000] othermaciej, you can prevent sending a preflight by preventing making requests that result in one [02:22:00.0000] (you can force a preflight if you want) [02:22:01.0000] 1) Hardcoding not just the status line, but also a few of the initial headers, is apparently a pain in the ass to work into existing HTTP stacks in servers (I'm willing to take their word for it). 
[02:22:02.0000] i don't buy that at all, but i have heard it, yes [02:22:03.0000] 2) The handshake response consists exclusively of fixed contents, and literal character-for-character echoes of parts of the handshake request [02:23:00.0000] right [02:23:01.0000] this means that if you can trick any server into echoing back exact text of choice in response to a WebSocket request, you are hosed [02:23:02.0000] a better method would be requiring something in the response that proves you saw the request, but is not predictable in advance [02:23:03.0000] yes, though i'm not aware of any case where it is possible, and i've looked [02:23:04.0000] so here is my proposal: [02:24:00.0000] woah woah [02:24:01.0000] no [02:24:02.0000] we don't want to require that the server parse the request [02:24:03.0000] in fact in the common case the server won't parse the request at all [02:24:04.0000] - Include a nonce in the request [02:24:05.0000] - Server includes a hash of the nonce plus the request origin in the response status line [02:25:00.0000] - Only the status line is strictly constrained, not response headers [02:25:01.0000] This is clearly more robust against cross-protocol attacks [02:25:02.0000] unless you're aware of a web service that can be tricked into sending back the handshake, that seems unnecessarily complicated [02:26:00.0000] right now in the most common case a websocket server will send the handshake before the client does [02:26:01.0000] it's actually pretty trivial [02:26:02.0000] cross-protocol attacks do happen and it's hard to predict them [02:26:03.0000] it's not as trivial as not doing anything :-) [02:26:04.0000] witness recent firefox-based attack against IRC [02:27:00.0000] i agree that they happen, but as far as i've been able to tell, what we have currently is sufficient [02:27:01.0000] what I described is far more robust against cross-protocol attacks, and likely easier to do in the context of integrating with an existing Web server [02:27:02.0000] 
actually i don't see why it's any more secure [02:27:03.0000] again, anything where the necessary handshake response is unpredictable is much more robust [02:27:04.0000] why can't you just cause the server to echo back the right hash? [02:27:05.0000] because you can't predict the correct handshake response [02:27:06.0000] why not? [02:28:00.0000] because you (the JS-level attacker) do not know the nonce [02:28:01.0000] oh the UA makes up the nonce, ok [02:28:02.0000] it is generated by the UA [02:28:03.0000] that seems like a lot more complexity than what we have now [02:28:04.0000] i agree it's easy for experienced programmers [02:28:05.0000] it doesn't even have to be a cryptographically secure hash, in fact, it might be an improvement even if you don't hash the nonce at all [02:28:06.0000] and even more non-experienced ones [02:28:07.0000] s/more/most/ [02:29:00.0000] (or at least many) [02:29:01.0000] but it is still more work than nothing [02:29:02.0000] i'm not aware of any protocol where you can cause the server to echo back the handshake or even get close [02:30:00.0000] especially given how constrained the client's part of the handshake is [02:30:01.0000] and putting the unpredictable part of the required handshake response in the status line makes it slightly more robust against header injection attacks on normal HTTP services [02:30:02.0000] right, the client can only control the URL part [02:31:00.0000] however, what I described is secure at a stronger level than "I'm not aware of any hackable services currently" [02:31:01.0000] agreed, but it's not enough of an improvement to throw away the half-dozen or so implementations [02:31:02.0000] (if you had suggested this six months ago, it'd be in without question) [02:32:00.0000] why don't we check if server and client implementors think it's enough of an improvement? 
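The proposal discussed above (the UA includes a nonce in the request, the server puts a hash of the nonce plus the request origin in the response status line, and the UA checks it) might look roughly like the sketch below. Everything concrete here is an assumption made for illustration: the status line layout, the use of SHA-256, the way nonce and origin are joined, and all function names. None of it is taken from a WebSocket draft.

```python
import hashlib
import hmac
import secrets


def make_nonce() -> str:
    """UA side: a fresh random value that page script never gets to see."""
    return secrets.token_hex(16)


def expected_token(nonce: str, origin: str) -> str:
    """Hash of the nonce plus the request origin (hash choice is an assumption)."""
    return hashlib.sha256(f"{nonce}\n{origin}".encode("utf-8")).hexdigest()


def server_status_line(nonce: str, origin: str) -> str:
    """Server side: hypothetical status line carrying the token."""
    return f"HTTP/1.1 101 {expected_token(nonce, origin)}"


def client_accepts(status_line: str, nonce: str, origin: str) -> bool:
    """UA side: accept only a status line that proves the server saw the nonce."""
    wanted = server_status_line(nonce, origin).encode("utf-8")
    return hmac.compare_digest(status_line.encode("utf-8"), wanted)


if __name__ == "__main__":
    origin = "http://example.com"  # illustrative origin
    nonce = make_nonce()
    line = server_status_line(nonce, origin)
    print(client_accepts(line, nonce, origin))
    # A response prepared in advance, without knowledge of this nonce, fails.
    stale = server_status_line(make_nonce(), origin)
    print(client_accepts(stale, nonce, origin))
```

The point of the sketch is the property argued for in the discussion: a script-level attacker who cannot learn the nonce cannot arrange for some other service to echo back an acceptable response, at the cost of the server having to read part of the client's request.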
[02:32:01.0000] go for it [02:32:02.0000] afaik the only client implementation that has actually shipped is Chrome, and they specifically said they weren't looking to prematurely lock in the protocol [02:32:03.0000] i'm certainly happy to improve matters if people are willing to rewrite their code [02:32:04.0000] though i really don't like requiring that the server have to do work in the handshake [02:32:05.0000] isn't there some way we can get around that? [02:33:00.0000] I posted a vague form of my suggestion on the hybi list [02:33:01.0000] maybe I should pull it out of the thread and/or post it on whatwg [02:33:02.0000] man, all this hybi crap in my inbox [02:33:03.0000] maybe also get abarth's opinion [02:33:04.0000] and now you guys are spamming this channel with it as well :p [02:34:00.0000] actually the more i think of this the less i like it... i don't like making the server have to read the client's data [02:34:01.0000] currently you can make a really simple server that completely ignores the server if you want to do something simple (similar to eventsource) [02:34:02.0000] er, ignores the client [02:34:03.0000] Hixie: if a server accepts requests from multiple origins, doesn't it have to read the client's data anyway? [02:34:04.0000] yes, but that's likely to be much rarer [02:36:00.0000] but the fact that it's needed in that case makes me think it is not such a huge burden [02:36:01.0000] i didn't say it was a huge burden, i said it was a burden [02:36:02.0000] right now you can literally never read a byte from the client [02:36:03.0000] the part that the server has to read, check and echo-back isn't even in the fixed-order-and-capitalization part of the request, it's in the freeform part [02:37:00.0000] is it not required for servers to check correctness of the client part of the handshake? 
[02:37:01.0000] no [02:37:02.0000] they can totally ignore the client [02:37:03.0000] so a server could respond with WebSocket data even if you don't include Upgrade: WebSocket Connection: Upgrade? [02:37:04.0000] yes [02:38:00.0000] wouldn't such a service be at risk of being exploited via XHR? [02:38:01.0000] comnnechow? [02:38:02.0000] er [02:38:03.0000] how? [02:38:04.0000] try connecting to damowmow.com:11111 [02:38:05.0000] how can XHR exploit that? [02:39:00.0000] If it does not check the handshake request, I bet I can send it a GET with some preformatted messages as the body [02:39:01.0000] even if I can't get the results back, that's still an integrity violation [02:39:02.0000] ? [02:40:00.0000] if the service accepts commands of some kind via WebSocket that have side effects, rather than just reporting data, you could use XHR to abuse it if it does not check the handshake request [02:40:01.0000] if the server ignores the client, there's really not much to abuse [02:40:02.0000] othermaciej: thanks for the link, I looked at the Firefox code and we do nothing with aria-hidden. And it sounds it goes with the spec :). However I'm agree it's not clear why ARIA hidden is at all. 
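A minimal sketch of the check being argued for here, assuming a server that wants to reject anything that is not a WebSocket handshake request before it acts on frames or cookies. The function name is hypothetical, and the case-insensitive header comparison is a simplification made for the sketch rather than what the draft specifies.

```python
def looks_like_websocket_handshake(raw: bytes) -> bool:
    """Return True only for a GET request carrying the WebSocket upgrade headers."""
    head, _, _ = raw.partition(b"\r\n\r\n")
    lines = head.decode("latin-1").split("\r\n")

    # Request line: method, target, version. Anything but GET is rejected.
    parts = lines[0].split(" ")
    if len(parts) != 3 or parts[0] != "GET":
        return False

    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if not sep:
            return False  # malformed header line
        headers[name.strip().lower()] = value.strip()

    return (
        headers.get("upgrade", "").lower() == "websocket"
        and headers.get("connection", "").lower() == "upgrade"
    )
```

A cross-site form POST or XHR with preformatted frames in the body fails this check because it cannot set the method and upgrade headers together.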
[02:40:03.0000] at least in a browser that supports CORS [02:41:00.0000] surkov: it doesn't sound to me like that is correct per the actual spec [02:41:01.0000] if the server ignores the client handshake but does listen to frames, then yes, you could send frames, just like you could if you just used zombies to send data there [02:41:02.0000] that doesn't seem like a particular problem [02:41:03.0000] XHR cannot do GET with a body [02:41:04.0000] unless the UA is broken [02:42:00.0000] it's a problem if the server uses cookies from the request but does not otherwise check correctness [02:42:01.0000] othermaciej: the spec says aria-hidden="true" should be on hidden elements, aria-hidden="false" on visible elements so aria-hidden affects nothing [02:42:02.0000] annevk: good point - do any cross-site POSTs count as "simple requests"? [02:42:03.0000] yes [02:42:04.0000] if the Content-Type header matches
allowed values [02:43:00.0000] othermaciej: yes, if you read cookies you should make sure you're getting a websocket handshake. but then we're far past "ignoring the client", and checking the handshake is not a burden any more, since you're already parsing it and everything. [02:43:01.0000] so if you're not checking correctness of the handshake request you presumably don't check the method either [02:43:02.0000] Hixie: it seems to me that if you are completely ignoring the client, you don't really need a full duplex connection [02:44:00.0000] othermaciej: *shrug* [02:44:01.0000] if you listen to the messages, then you'd better check the handshake [02:44:02.0000] i doubt most servers will [02:44:03.0000] the spec should at least mention this in security considerations, even if it does not mandate rejecting a bad handshake [02:45:00.0000] if most servers will not, then we effectively have an insecure protocol [02:45:01.0000] not much we can do about that as far as i can tell [02:45:02.0000] it's always possible to make a buggy server, and indeed we can't prevent that [02:46:00.0000] but: [02:46:01.0000] a) the spec could require the server to check the handshake for correctness, even if we know some won't listen [02:46:02.0000] sure, i can add that, at least for the case where you read cookies [02:47:00.0000] b) even if it does not require it, the spec could recommend checking the handshake for correctness in the case where you look at any credentials and/or perform any side effects in response to messages [02:47:01.0000] c) my nonce proposal would actually force servers to read the handshake and process it (though it can't force them to check it is really correct, I suppose) [02:48:00.0000] if a server truly ignores all input, then it could work just as well over EventSource with less trouble, so I am not sure WebSocket has to make that use case extra easy [02:49:00.0000] i think requiring the non-sensitive cases to be harder just to secure the sensitive cases is 
making the wrong tradeoff, personally [02:49:01.0000] what i'd really like to do is throw out the HTTP compatibility altogether [02:49:02.0000] and go back to ports 81 and 815 [02:50:00.0000] I think the sensitive use cases are some of the most important [02:50:01.0000] and then just have people use port 443 as they will anyway [02:50:02.0000] chat is clearly a case where you care about integrity, not just confidentiality [02:50:03.0000] (i.e. you don't want random third parties to be able to forge chat messages as you) [02:51:00.0000] I think security is one area where I would be very hesitant to cut corners, even if it makes things easier for the use cases where you don't need any security [02:52:00.0000] well if we really want security to that level, we should design the handshake with that in mind and stop using HTTP [02:52:01.0000] trying to retrofit it into HTTP isn't ideal [02:53:00.0000] and will always leave us vulnerable to the fake-websocket-via-form-post attack [02:53:01.0000] retrofitting it into HTTP makes it easier to share a host:port with a web server [02:53:02.0000] i thought so too, but apparently not [02:53:03.0000] because it makes people think they should use their HTTP output code [02:54:00.0000] well it's really the input side of the handshake that matters [02:54:01.0000] i.e. 
the request from the client to the server [02:55:00.0000] the server response could be anything, the fact that it takes the form of an HTTP 101 request is just following things to their logical conclusion [02:55:01.0000] that being said, I think making the handshake response unpredictable would do more to improve security than making the handshake non-HTTP [02:56:00.0000] making the handshake non-HTTP could solve the entire attack scenario you described [02:56:01.0000] which is a real attack scenario [02:56:02.0000] unlike echoing the handshake, which is theoretical at this point [02:56:03.0000] so i'd say it's the other way around [02:56:04.0000] actually, if a non-http handshake still allowed the server to ignore the client handshake request, it would do nothing to prevent my attack scenario [02:58:00.0000] nah, it'd be easy to design it in a way that it could... e.g. terminate the handshake with \n, so that the first frame would be corrupt [02:59:00.0000] then you would have to hard fail on a bad frame instead of ignoring it and moving on - does the protocol require that currently? [02:59:01.0000] if you're writing a server lazily, that's by far the easiest thing to do [03:00:00.0000] the spec does explain how you can go the extra mile and ignore unknown frames, but it's extra code, like checking the handshake [03:00:01.0000] if the frame boundary is a fixed sentinel, then it's easiest to scan for the sentinel [03:00:02.0000] we could arrange for the failure to be with a length-delimited frame and for the length to be gigantic [03:00:03.0000] by flipping the meanings of the high bits in the frame typesa and the length marker [03:02:00.0000] if we do your nonce idea, we could put the nonce in a specific header that would only be included with websockets [03:02:01.0000] so that if you can't find it, you can't send it [03:02:02.0000] request-wise you mean? 
[03:02:03.0000] yeah [03:02:04.0000] that might make it more likely that the server implementors would fail if they can't find it [03:02:05.0000] I agree that it should go in a request header that is only sent by WebSocet [03:03:00.0000] er *WebSocket [03:03:01.0000] I think I will ask abarth for his opinion on these issues [03:04:00.0000] both whether the attacks I described are worth worrying about, and whether a nonce mechanism would be an effective defense [03:13:00.0000] one of my concerns with requiring that authors read the handshake is that one of the advantages of them NOT reading the handshake is they won't just echo back the origin [03:13:01.0000] i'm worried that authors will just do that instead of echoing back only their own origin [03:16:00.0000] servers that accept connections from multiple origins actually *do* have to do that, after checking the origin, but you are right that it may affect the likelihood of making a mistake in the single-origin case [03:18:00.0000] though of course you could write the server-side algorithms to say otherwise (i.e. 
have different ones for single-origin and multi-origin case, where the former only sends the fixed allowed origin after separately comparing it to the handshake origin) [03:19:00.0000] that's more or less what the spec does [03:19:01.0000] the parsing is in a separate (later) section [03:19:02.0000] people using stock tools for this (like mod_pywebsocket) might be less likely to make that mistake, if the framework asks you to declare allowed origins to it up front [03:20:00.0000] then the framework could make sure to both check against your list and echo back [03:20:01.0000] I guess the main risk there would be if the framework has an "allow any origin" mode, authors might switch that on inadvertantly [03:20:02.0000] or indeed if they default to that :-) [03:21:00.0000] given how easy it is to write a websocket server, i don't know how much point there is to a websocket framework really [03:21:01.0000] (not having to parse the header is a big part of that) [03:22:00.0000] well, if you want something that operates on a large scale, you probably need some sort of framework, even if not for parsing the header [03:22:01.0000] anyway - frameworks should probably either default to failing if you don't specify allowed origins, or default to allowing the same origin as the WEb sever they are running inside, if their purpose is to share a host:port with an existing web server [03:26:00.0000] what's blocking bytes in JS? [03:26:01.0000] TC39 [03:26:02.0000] doh [03:32:00.0000] othermaciej: btw, re having acks in the protocol, it's relatively easy to add them at the application level, but it turns out to be much harder to add them at the websocket level. Basically all you need is to number your messages, and tell the server what the last message was you got when you reconnect. 
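The application-level ack scheme just described (number the messages, and on reconnect tell the server the last one received) could look roughly like this on the server side. `ReplayBuffer` and its method names are hypothetical, invented for this sketch.

```python
from typing import List, Tuple


class ReplayBuffer:
    """Keeps numbered messages until the client reports having received them."""

    def __init__(self) -> None:
        self._next_id = 1
        self._unacked: List[Tuple[int, str]] = []

    def send(self, payload: str) -> Tuple[int, str]:
        """Number a message and remember it; the caller writes it to the socket."""
        message = (self._next_id, payload)
        self._next_id += 1
        self._unacked.append(message)
        return message

    def resume(self, last_seen_id: int) -> List[Tuple[int, str]]:
        """On reconnect: drop what the client already has, return the rest to resend."""
        self._unacked = [m for m in self._unacked if m[0] > last_seen_id]
        return list(self._unacked)
```

The ack here comes from the script that actually processed the message, which is the end-to-end guarantee a transport-level ack cannot give.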
[03:33:00.0000] othermaciej: but if you do that at the websocket level, you've no way to know if the script _actually_ received the message [03:33:01.0000] Hixie: right - I'm wondering what the practical benefit (if any) would be of doing it at the websocket level [03:33:02.0000] obviously, at the websocket level you could only generate transport-level acks, not guarantee actual end-to-end delivery to the app at either end [03:33:03.0000] yeah [03:34:00.0000] one thing that would help is having a flag in the onclose event to say whether it was an orderly close or not [03:34:01.0000] I'm wondering if a clean shutdown handshake is something that benefits from being at the protocol level [03:34:02.0000] TCP already does that for us, no? [03:34:03.0000] it sounds like if you don't use one at the app level, you have to do the same kind of "lingering close" dance as HTTP servers, or you may lose mssages [03:35:00.0000] some of the content on the thread implies that the TCP close handshake is broken, and http servers have to work around it in crazy ways [03:35:01.0000] I do not have enough expertise to check the accuracy of those claims [03:36:00.0000] obviously you can do a close handshake at the subprotocol level if you want, but if every subprotocol has to do it to be remotely correct, then maybe it should be in the base protocol [03:36:01.0000] i guess i'd have to understand the use case better [03:37:00.0000] to have an opinion [03:37:01.0000] apparently, if you just do a normal close of your socket when you are done sending, in some cases it can make the client drop data that it has already ACKd at the TCP level [03:38:00.0000] in other words, you might lose the last few messages even if there is no actual disruption of service [03:38:01.0000] Web servers do something crazy that I don't understand to work around this [03:38:02.0000] it would be nice if protocol reliability could just rely on TCP-level acks, but besides the close issue, there's the problem that OS 
TCP stacks apparently don't expose how much has been ACK'd [03:39:00.0000] which seems like a waste... [03:39:01.0000] TCP is supposed to provide a reliable stream, but the next level up has to reinvent reliability [03:39:02.0000] wait, back up [03:39:03.0000] who's closing? [03:39:04.0000] client or server? [03:39:05.0000] the server [03:39:06.0000] and why? [03:39:07.0000] ah [03:39:08.0000] the server has sent the final message it intends to [03:39:09.0000] closes the socket [03:39:10.0000] that can make the client lose the last few messages [03:39:11.0000] so the server thinks it's done but it wants to make sure it gets all the client's messages? [03:40:00.0000] (apparently) [03:40:01.0000] well there's no way to do that except at the application layer, since you don't know what the app wants to send but hasn't sent yet [03:40:02.0000] not messages from the client [03:40:03.0000] what people claim is this: [03:40:04.0000] - the server sends some messages to the client [03:40:05.0000] - the server gets TCP-level ACKs [03:40:06.0000] - the server closes the socket [03:41:00.0000] in this case, TCP requirements can cause the client to forcibly drop the last few messages [03:41:01.0000] oh i see [03:41:02.0000] and you have to do some "lingering close" trick to avoid it [03:41:03.0000] wow, really? [03:41:04.0000] that's, ah, stupid [03:41:05.0000] how about we fix TCP [03:41:06.0000] it sounds like a huge bug in TCP to me [03:41:07.0000] but I think "fix TCP" is out of the range of the practical [03:41:08.0000] someone asked about TCP5 from us [03:42:00.0000] Hixie: cool, glad you enjoyed it [03:42:01.0000] jokingly maybe [03:42:02.0000] whereas "shutdown handshake" is viable [03:42:03.0000] Hixie: did you happen to find out and document what exactly WebKit does with initial about:blank? 
[03:42:04.0000] /me forgot who it was [03:42:05.0000] Hixie: it turned out that the new RDF extraction algorithm is a bit b0rked, will send mail about that tonight hopefully [03:42:06.0000] hsivonen: not in any more detail than what the spec says now [03:42:07.0000] foolip: cool [03:42:08.0000] I wonder if SCTP fixes this problem [03:42:09.0000] Hixie: ok. [03:43:00.0000] clearly, what the spec says now isn't what WebKit does [03:43:01.0000] http://hsivonen.iki.fi/test/moz/about-blank-load.html [03:44:00.0000] this whole about:blank has been so sad. I reached my timeout of fixing it the right way on the critical path of HTML5 parser enablement [03:44:01.0000] so I'm going to paper over it for now [03:44:02.0000] but I'd still like to fix it the right way once the HTML5 parser is on by default on trunk [03:44:03.0000] Hixie: the "Reliable message delivery" subthread had the explanations of the socket close issue [03:44:04.0000] http://www.ietf.org/mail-archive/web/hybi/current/msg01030.html [03:45:00.0000] http://www.ietf.org/mail-archive/web/hybi/current/msg01044.html has more details [03:45:01.0000] (that is, even what the spec says now and what the spec says plus load event are too much on the critical path) [03:46:00.0000] fwiw, the above demo alerts "Different body" in Gecko, WebKit and Trident [03:46:01.0000] and "Same body" in Opera [03:46:02.0000] hsivonen: does the spec require too much relative to what WebKit does, or too little? 
[03:46:03.0000] othermaciej: breaks enough test cases that poking at about:blank is going to be a test case whack-a-mole project on its own right [03:47:00.0000] I am surprised that this echoed "Different body" [03:47:01.0000] getting the HTML5 parser on by default is big enough a test case whack-a-mole project already [03:48:00.0000] hsivonen: I figured out why it says "Different body" in WebKit [03:48:01.0000] hsivonen: it's not because the about:blank load is async, it's because the load event listener runs before the second script [03:49:00.0000] othermaciej: does the event loop spin or do you fire a sync load event? [03:50:00.0000] hsivonen: I don't know code-wise why it happens (either explanation is plausible), but I verified that the end() function runs before the second script by adding more alerts [03:50:01.0000] http://dev.w3.org/rdfa/specs/rdfa-dom-api.html [03:50:02.0000] othermaciej: ok. thanks [03:51:00.0000] hsivonen: I believe we do fire a load event from the middle of parsing without spinning the event loop [03:52:00.0000] foolip: I am amused that the "Triple" interface has 6 things in it... [03:52:01.0000] othermaciej: eww. I really don't want to go down that road. [03:52:02.0000] hsivonen: I just think that is how our code happens to work - no idea if this is required [03:52:03.0000] hsivonen: I think it is actually as a result of DOM insertion, not parsing, that the load event fires [03:53:00.0000] othermaciej: ok. I still want to avoid sync events while the parser is on the call stack [03:53:01.0000] and I can see the wisdom of that [03:53:02.0000] I honestly have no idea if this behavior is needed for Web compat [03:53:03.0000] I'm just trying to describe what we do [03:54:00.0000] sure. thanks [03:54:01.0000] So the IETF doesn't like cool URIs? 
Interesting [03:54:02.0000] that's nothing new [03:54:03.0000] though tools.ietf.org generally does [03:54:04.0000] so I use that [03:55:00.0000] othermaciej: well, (object, datatype, language) is really all part of the object, and children is just a bit odd. I assume this won't be the final draft. [03:55:01.0000] othermaciej: about the H:TML draft, I will get an announcement out to public-html shortly [03:55:02.0000] sorry for the delay, I made quite a lot of changes over the weekend [03:56:00.0000] whoa. the load event is sync in IE8, too [03:58:00.0000] othermaciej: btw, the two framing types is a result of targetting novice authors. SPDY is intended to be implemented by experts only. [03:59:00.0000] othermaciej: also, it's really easy to do multiplexing if your subprotocol supports it, just use a shared worker [03:59:01.0000] Hixie: at this point I'm a little frightened of the thought of novice authors implementing the protocol by hand - I really hope people make good frameworks for Perl/Python/Ruby/Java/etc [04:00:00.0000] why? 
[04:00:01.0000] the protocol is trivial [04:00:02.0000] it seems like there are many opportunities for subtle mistakes, when you dig into the details [04:01:00.0000] i've tried to minimise those [04:01:01.0000] obviously it's impossible to make a protocol idiot-proof, but it's a lot harder to screw up than most [04:02:00.0000] it's easier to make an API (more) idiot-proof than a protocol, so I hope people do so [04:03:00.0000] for multiplexing, yes, you could have clients coordinate via a SharedWorker or a shared iframe [04:04:00.0000] it also seems to me that transparent multiplexing could be added down the line (between the UA and the server, invisible to the JS client code) [04:04:01.0000] since the handshakes allow custom headers, and since you could introduce new frame types [04:04:02.0000] yes, indeed [04:05:00.0000] voluntary multiplexing also doesn't work if your clients might be running from different origins [04:05:01.0000] but sharing a connection in such a case is scarier security-wise [04:05:02.0000] bugs in doing framing become much more severe issues [04:05:03.0000] well, shared workers will be cross-origin capable in due course [04:05:04.0000] which would solve that issue [04:06:00.0000] I think transparent multiplexing would be cool, but it seems to me that it could be a 2.0 feature if deployment experience with WebSocket shows it is useful [04:06:01.0000] indeed [04:06:02.0000] likewise for transparent compression [04:06:03.0000] or whatever [04:06:04.0000] yup [04:08:00.0000] what's the idea with the frame types though? lots of people seem to have in mind non-browser use cases, will they fork some of the frame types? 
[04:09:00.0000] no idea [04:09:01.0000] non-browser use cases seems tupid to me [04:09:02.0000] s/seems tupid/seem stupid/ [04:09:03.0000] i mean, just use TCP [04:09:04.0000] websockets is a terrible protocol if you're not trying to use this with JS [04:09:05.0000] Hixie: I think the idea is to reuse existing services made for browsers in websocket [04:10:00.0000] in that case, there won't be forked frame types [04:10:01.0000] well, stupid or not, the people on the hybi list wanna do it [04:11:00.0000] othermaciej: oh, the reason the fixed part of the handshake is more than one line long is to ensure that you have to cause the server to echo something with a newline in it (which you can't send in the request) [04:11:01.0000] i'm not sure it's very productive to label them stupid [04:11:02.0000] that's prolly also what part of the clash is coming from [04:12:00.0000] they have completely different use cases in mind [04:12:01.0000] i have yet to see a good reason to use websocket rather than TCP for something where there's no browser as possible client [04:12:02.0000] Hixie: I see - in that case, I think a nonce would definitely remove the benefit of having a newline in the fixed part [04:12:03.0000] (indeed i've yet to see a reason at all) [04:12:04.0000] othermaciej: yes [04:12:05.0000] othermaciej: probably [04:12:06.0000] I could imagine using WebSocket if you have browser clients *and* other clients [04:13:00.0000] i think the reason for these server developers is that they can provide a easy-to-use library to their customers [04:13:01.0000] sure [04:13:02.0000] but then you wouldn't invent new frame types [04:13:03.0000] annevk: can't they do that with TCP too? [04:13:04.0000] they want to interoperate [04:13:05.0000] with other servers and new clients [04:14:00.0000] from that perspective it makes sense to have a standard protocol [04:14:01.0000] TCP is a standard protocol? 
[04:14:02.0000] yes, but they want an abstraction [04:14:03.0000] just like HTTP is an abstraction [04:14:04.0000] i don't see how this is very hard to get... [04:15:00.0000] websocket isn't an abstraction [04:15:01.0000] it's a minimal browser origin model security layer [04:15:02.0000] it will be an abstraction of some kind once we have send(Stream) [04:15:03.0000] well, and UTF-8 support is already an abstraction, imo [04:18:00.0000] i don't think "abstraction" means what you think it means [04:19:00.0000] higher-level? [04:19:01.0000] seriously, you can do everything websockets does in like 10 lines of documentation if all you want is a convention of using UTF-8 or something like htat. [04:19:02.0000] /me doesn't really care what term we use [04:20:00.0000] i don't understand why you would use websocket [04:20:01.0000] unless you specifically cared about browsers [04:20:02.0000] i understand using it for something where browsers is a use case and you also want to handle non-browser clients [04:21:00.0000] but for something where you don't have browser clients? [04:21:01.0000] it's silly. [04:21:02.0000] Hixie: even for cutting through firewalls? [04:22:00.0000] Hixie, it seems like you don't even try to understand their concerns [04:22:01.0000] hsivonen: websocket doesn't do anything special to cut through firewalls. 
[04:23:00.0000] I think a use case with both browser and non-browser clients is legitimate [04:23:01.0000] annevk: please enlighten me [04:23:02.0000] you may well not want to offer two forms of your service [04:23:03.0000] othermaciej: absolutely [04:23:04.0000] I could also imagine that once you have that infrastructure, you may want to reuse the client and server code for some cases that don [04:23:05.0000] t also involve a browser client [04:23:06.0000] I am not sure those should get consideration on the level of a primary use case [04:24:00.0000] that's like saying "ok, i've deployed IMAP, now I want to be able to run my MUD using IMAP" [04:24:01.0000] I could also imagine that you may want to talk an extended version of your protocol to a non-browser client rather than having a whole different protocol [04:24:02.0000] I would bet someone has actually run a MUD over IMAP :-) [04:24:03.0000] and they're silly :-) [04:25:00.0000] people also run back end business-to-business data messaging over HTTP [04:25:01.0000] HTTP over WS-* over Web Socket [04:26:00.0000] people do a lot of silly things [04:26:01.0000] that doesn't stop them being silly [04:27:00.0000] well, it seems to me they want something akin to HTTP for bidirectional communication [04:28:00.0000] they also think that deploying a new protocol is costly [04:28:01.0000] yes, i have several people say those things a lot [04:29:00.0000] have seen, rather [04:29:01.0000] annevk: if they want bidirectional HTTP for B2B back end integration, what does it have to do with Web Socket? 
[04:29:02.0000] it seems they don't believe that deploying two protocols will work so they want WebSocket to also accommodate use cases / requirements they have [04:53:00.0000] othermaciej: seems like the orderly close could be done easily just by having an echo feature of some kind [04:53:01.0000] othermaciej: either at the application level or in WS itself if we wanted to require all implementations to do it [04:53:02.0000] Hixie: echo feature? [04:54:00.0000] yeah where the side that wants to close the connection sends a packet saying "this is close attempt X" for some value of X, and the other side replies "close X", and if the first side receives a "close X" with the value of X that it sent, and it still thinks it wants to close, then it closes. [04:54:01.0000] without lingering. [04:56:00.0000] but the side that sends close X does not know when it can close? [04:56:01.0000] because wasn't the concern that if it closes to early "close X" would not arrive [04:58:00.0000] the side that sends close X will know it can close when the other side closes. [05:00:00.0000] in that case one side could just do the closing attempt right? [05:00:01.0000] no need for confirmation [05:00:02.0000] Then there might be data lost in transmission [05:02:00.0000] A says close, B closes on arrival of that message, A knows B closed [05:02:01.0000] so A can close [05:05:00.0000] Hixie's example said "and it still thinks it wants to close", though. 
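A sketch of the "close attempt X" echo described above, written as a small state machine with no real sockets involved. The class name, the message tuples, and the `cancel_close` method are assumptions made for illustration; the attempt number is what lets a side ignore a stale reply after changing its mind.

```python
from typing import Optional, Tuple

Message = Tuple[str, int]


class Closer:
    """One endpoint of the echo-based orderly close."""

    def __init__(self) -> None:
        self._attempt = 0
        self._pending: Optional[int] = None
        self.closed = False

    def request_close(self) -> Message:
        """Start a new close attempt and return the message to send."""
        self._attempt += 1
        self._pending = self._attempt
        return ("close-attempt", self._pending)

    def cancel_close(self) -> None:
        """Change of mind: any reply to the earlier attempt is now ignored."""
        self._pending = None

    def on_message(self, kind: str, x: int) -> Optional[Message]:
        """Handle an incoming close message; return a reply if one is due."""
        if kind == "close-attempt":
            return ("close", x)  # echo the attempt number back
        if kind == "close" and x == self._pending:
            self.closed = True  # our own attempt was confirmed, safe to close
        return None
```

The replying side then learns it can close when it sees the initiator close the connection.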
[05:05:01.0000] I guess that might not be a necessary step [05:05:02.0000] yeah i guess you could just do it that way [05:06:00.0000] actually no [05:06:01.0000] you need the two-way handshake [05:06:02.0000] hmm looks like tests that i wrote 4 years ago have the same style as tests i write today [05:06:03.0000] Hixie: I believe that would be a fine orderly close mechanism, but I don't understand the TCP issue super well [05:06:04.0000] otherwise if A says to "close" to B, and B sends data to A then closes, B wouldn't see what A said [05:06:05.0000] er [05:06:06.0000] A wouldn't see what B siad [05:06:07.0000] except i probably try to automate tests more now [05:07:00.0000] Hixie, good point [05:07:01.0000] you win :) [05:07:02.0000] Hixie: the reason I think it should be considered for the base protocol is that WebSocket is broken for any app-level protocol where you don't do this [05:07:03.0000] othermaciej: depends on the protocol [05:07:04.0000] Hixie: not just "no acks" broken but "might gratuitously lose data even though everything seemed fine" broken [05:08:00.0000] zcorpan, heh [05:08:01.0000] othermaciej: if the server is just sending a stream of updates "stock AAPL up $1" "stock MSFT up $5" etc, then you don't care about lost packets [05:08:02.0000] othermaciej: it very much depends on the subprotocol [05:08:03.0000] Hixie: if the server initiated the close, then it seems wasteful for the server to order the client to discard the last few packets [05:08:04.0000] Hixie: even if the client being a few messages out of date might not be a critical problem [05:09:00.0000] 6 out of 8 have been fixed from http://zcorpan.1go.dk/test/opera-bugs/ [05:09:01.0000] agreed [05:09:02.0000] in that case the client would close [05:09:03.0000] not the server [05:09:04.0000] Hixie: it seems like that is what's happening with this TCP issue - not just that you don't know for sure what got delivered, but that you are actually telling the client to discard data that it already 
received [05:09:05.0000] you could put it in the base protocol and allow the server to close regardless if it doesn't care [05:09:06.0000] but the client wouldn't want an orderly close, especially if e.g. the websocket object is already GCed [05:10:00.0000] if we can come up with something simple enough that can be ignored if the server doesn't need it, it makes sense to add it to the base protocol [05:11:00.0000] I actually don't know the bare minimum required for a proper orderly close handshake over TCP [05:11:01.0000] it seems you could just use the highest byte as a special closing message [05:11:02.0000] that needs to be returned [05:12:00.0000] I wish someone on the thread had explained how you would solve the problem with a proper close handshake instead of a lingering close - like at least one example of something that works [05:12:01.0000] so 0xFF [05:12:02.0000] i wonder if we need the number X from my example [05:12:03.0000] as a standalone closing frame [05:12:04.0000] maybe we can just do 0xFF 0x00 [05:12:05.0000] and if you receive an 0xFF frame, you know the other side wants to close [05:12:06.0000] Hixie: If you want to support changing your mind about closing, you need it, don't you? [05:12:07.0000] you don't even need 0x00 I think [05:12:08.0000] and you can send an 0xFF to say ok [05:12:09.0000] annevk: i don't want to add a third kind of frame [05:13:00.0000] Dashiva: yeah, but it's not clear we need that [05:13:01.0000] boring :) [05:18:00.0000] foolip: maybe I am dumb, but if RDFa is fundamentally a graph structure, that seems like a poor API for it [05:19:00.0000] how are HTML notifications in http://dev.chromium.org/developers/design-documents/desktop-notifications/api-specification supposed to work with D-Bus or Growl notifications? [05:19:01.0000] the spec talks about the browser rendering them as a browsing context [05:20:00.0000] but who really want that except in a Chrome OS-like environment where there's nothing but the browser?
[05:20:01.0000] *wants [05:20:02.0000] they're not, as i understand them [05:20:03.0000] /me has similar concerns [05:20:04.0000] othermaciej: I agree, it's not terribly useful [05:20:05.0000] hsivonen: also a concern for iphone or android environments where the OS has a built-in notification system that is pure-text or close to it [05:21:00.0000] ok i should go sleep [05:21:01.0000] nn [05:21:02.0000] in particular, you'd expect the children of an RDFa graph node / triple to have the same subject as the triple's *object*, not the same subject as its subject [05:21:03.0000] /me thinks it would be good to start out with a simple icon + plain text API that maps to system notification mechanisms [05:21:04.0000] it also seems like you would really want to be able to query by subject, object or predicate, not just by type... [05:21:05.0000] what mailing list am I supposed to whine on about the notification API? [05:22:00.0000] aside: it's rather amazing that Growl still isn't part of Mac OS X itself [05:22:01.0000] amazing from a practical POV that is [05:22:02.0000] not amazing from a NIH attitude POV [05:22:03.0000] hsivonen: you could use webkit-dev to complain about what's implemented, since the code is in WebKit, but fwiw I think Chrome is the only browser shipping with it enabled [05:22:04.0000] for proposing a standard version of this and proposing changes to that, I dunno [05:22:05.0000] HTML is not really an obstacle for Growl [05:23:00.0000] IIRC it uses WebKit to render notifications [05:23:01.0000] othermaciej: does chrome send stuff to growl or d-bus on Mac/Linux?
[05:23:02.0000] hsivonen: I don't know [05:23:03.0000] ok [05:24:00.0000] non-Chrome WebKit vendors/devs were not fully sold on the suitability of this API or the value of shipping it without sufficient cross-vendor discussion [05:24:01.0000] but I expect they implement it all internally and do not use any system or third-party notification services [05:25:00.0000] I see [05:25:01.0000] does Windows have a system-wide notification service that doesn't involve registering a tray icon first? [05:31:00.0000] onvolumechange didn't work at least [05:32:00.0000] but addEventListener worked [05:34:00.0000] it's a bug I guess [05:35:00.0000] per HTML5 it should clearly be on HTMLElement and HTMLDocument [05:36:00.0000] yes [05:36:01.0000] and window [05:37:00.0000] I need to catch up on this fullscreen thread at some point [05:37:01.0000] feature testing says it's on window but not document or element [05:37:02.0000] in webkit [05:38:00.0000] which is not so useful since volumechange doesn't bubble :) [06:41:00.0000] so to get an.ne I need to have a registered company in Nigeria named "an" [06:41:01.0000] that seems like too much trouble [06:42:00.0000] man, uganda doesn't have any silly restrictions like that [06:45:00.0000] annevk, surely you could just bribe the right official? [07:00:00.0000] "Markup conformance requirements that need to be tested and give them a stable identifier that will persist across drafts of the specification." - http://www.w3.org/TR/2010/NOTE-test-methodology-20100128/ [07:02:00.0000] /me wonders how many thousands of distinct conformance requirements there are in HTML5. [07:03:00.0000] That's easy, just use the text of the conformance requirement as the identifier [07:04:00.0000] I guess if the text changes, you've changed the conformance requirement anyway . . .
[07:05:00.0000] AryehGregor: I counted 223 requirement statements for a while ago [07:05:01.0000] so there's going to be quite a lot across the whole spec [07:05:02.0000] Is that counting something like "When X occurs, user agents must execute the following algorithm:" as one requirement or lots? [07:06:00.0000] Lots [07:07:00.0000] (I suppose it's more about testable points than about conformance requirements) [07:08:00.0000] There are what, like 8,000 tests for CSS2.1? And HTML5 is how much longer, twenty times as long? [07:08:01.0000] :/ [07:09:00.0000] /me is ballparking [07:09:01.0000] last estimate was 50000 tests needed [07:09:02.0000] maybe that was optimistic [07:10:00.0000] HTML5 has lots of features, many different facets to test for each, very detailed requirements for most of them, and some nontrivial interactions among different features [07:11:00.0000] so probably the level of test coverage needed is even higher than comparison of length to CSS would imply [07:11:01.0000] though to be fair, CSS is also pretty damn complicated [07:11:02.0000] CSS is prolly more complicated [07:11:03.0000] most of HTML5 is written in a pretty straightforward way [07:11:04.0000] CSS has very complex interactions [07:11:05.0000] You couldn't test a lot of HTML5 requirements in a cross-browser way, could you? You'd need something more like Mozilla's mochitests, that can synthesize input and whatnot. [07:11:06.0000] CSS has some very complex interactions by design [07:12:00.0000] HTML5 has a fair number of complex interactions by accident of history [07:12:01.0000] there are many HTML requirements that would be hard to test if the browser itself is your only test tool [07:12:02.0000] that doesn't mean you can't do it in a cross-browser way [07:13:00.0000] How would you do it, then? Except manually, of course. 
[07:13:01.0000] /me doesn't want to see anyone try running 100,000 tests manually :P [07:13:02.0000] one possible way is to run an external program to synthesize input events, and possibly capture output [07:14:00.0000] people have made such tools that work with any given browser, for example for testing Web sites or for general QA of any native application with a GUI [07:14:01.0000] i generated about 15000 tests for a css feature [07:14:02.0000] "Generated"? [07:15:00.0000] of course for any conformance requirement that can be tested by scripting against the DOM, that's the best way to do it [07:15:01.0000] yeah, i wrote some python scripts [07:16:00.0000] although i wanted to generate more for interaction with svg [07:17:00.0000] So how many substantively different things did these 15,000 tests actually test? [07:19:00.0000] i tested different combinations of percentages, lengths and keywords, visibility:hidden/visible, and the different values for the feature, for a number of different elements [07:19:01.0000] including invalid cases [07:19:02.0000] oh and with different images [08:01:00.0000] gah, I'm going crazy with all the different html parsing libraries... I want three things: parse broken html -> remove all tags except a list I specify -> get all text blocks that remain [08:02:00.0000] I guess this is why people give up and use regexps [08:02:01.0000] the only way I see of doing this, is to parse with html5lib, serialize to html, parse with lxml.html.clean and clean up, then parse again with html5lib, and select out all the text from there with xpath [08:02:02.0000] ... and that's a mess :) [08:02:03.0000] yeah :) [08:03:00.0000] Why can't you parse with html5lib with lxml treebuilder, then do whatever manipulations you need to the tree structure, then serialise as text? [08:04:00.0000] because the tree I get from the lxml treebuilder is not the format lxml.html.clean expects... 
I get 'lxml.etree._ElementTree' object has no attribute 'rewrite_links' when trying to clean that document [08:04:01.0000] so it seems there are more methods on html trees somehow [08:05:00.0000] Oh, right [08:05:01.0000] Do you have to use lxml.html.clean, rather than implementing the functionality you need yourself using the normal lxml API? [08:05:02.0000] no, I'll switch as long as it works :) [08:06:00.0000] I'm screenscraping pages, and have found that a good way is to strip all tags except div, and then look for the longest text string that's left [08:06:01.0000] that's (most often) the article text [08:06:02.0000] should be pretty easy to go through a tree and dump elements you do not like and append their content to the parent? [08:07:00.0000] or some such [08:07:01.0000] maybe with some different branches depending on the element [08:07:02.0000] It'd be easier with a SAX-like API since you could trivially filter out unwanted tags [08:07:03.0000] The separation between lxml.etree and lxml.html is a real pain [08:08:00.0000] good ideas, all of them [08:08:01.0000] jgraham: yeah, I was hoping for a lxml.html.frometree(doc) [08:08:02.0000] Huvet: Have you tried looking at the implementation of the lxml.html thing and seeing if you can port it to vanilla lxml.etree? [08:09:00.0000] jgraham: I think that's a bit over my head, I'm not that used to lxml [08:09:01.0000] http://codespeak.net/lxml/api/lxml.html.clean-pysrc.html [08:10:00.0000] 700 lines [08:10:01.0000] Huvet: You could rather inefficiently walk the tree and rebuild it in lxml.html Elements [08:11:00.0000] Or you could patch html5lib and see what breaks :) [08:12:00.0000] I guess the best way is what annevk said... except the "pretty easy" part ;) [08:13:00.0000] get the lxml tree and walk it and drop nodes as I go along [08:18:00.0000] Huvet: You have to be a bit careful because of the .tail [08:19:00.0000] See lxml.html.drop_tree [08:23:00.0000] *reading* [08:53:00.0000] Heyo. 
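For reference, a rough sketch of the "SAX-like API" idea from the scraping discussion above: strip every tag except a keep-list (just div here) and take the longest text block that is left. This is not the html5lib/lxml pipeline anyone in the channel used; it swaps in Python's standard-library `html.parser`, which is lenient with broken markup but is not an HTML5-conformant parser. The names `TextBlockExtractor` and `longest_text_block` are made up for illustration, and skipping script/style content is an added assumption, not something from the discussion.

```python
from html.parser import HTMLParser


class TextBlockExtractor(HTMLParser):
    """Event-based filter: only tags in keep_tags split the text into blocks.

    Hypothetical helper for illustration; all other tags are dropped and
    their text is merged into the surrounding block.
    """

    # Assumption (not from the chat): script/style text is never article text.
    SKIP_CONTENT = {"script", "style"}

    def __init__(self, keep_tags=("div",)):
        super().__init__(convert_charrefs=True)
        self.keep_tags = set(keep_tags)
        self.blocks = []       # finished text blocks, in document order
        self._buffer = []      # text pieces of the block being collected
        self._skip_depth = 0   # > 0 while inside <script> or <style>

    def _flush(self):
        # Join the pieces and collapse runs of whitespace.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.blocks.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_CONTENT:
            self._skip_depth += 1
        elif tag in self.keep_tags:
            self._flush()      # a kept tag starts a new block

    def handle_endtag(self, tag):
        if tag in self.SKIP_CONTENT:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in self.keep_tags:
            self._flush()      # a kept tag ends the current block

    def handle_data(self, data):
        if not self._skip_depth:
            self._buffer.append(data)

    def close(self):
        super().close()
        self._flush()          # text after the last kept tag


def longest_text_block(html, keep_tags=("div",)):
    """Return the longest text block, or '' if the input has no text."""
    parser = TextBlockExtractor(keep_tags)
    parser.feed(html)
    parser.close()
    return max(parser.blocks, key=len, default="")


if __name__ == "__main__":
    sample = (
        "<div><p>Short</p></div>"
        "<div><p>This is the <b>longest</b> block of text</div>"
        "<span>tail"
    )
    print(longest_text_block(sample))
```

Because the filter works on parser events rather than a tree, the `.tail` bookkeeping mentioned above never comes up. The trade-off is that text from adjacent dropped elements is concatenated without a separator, so a tree-based approach may still be the better fit when element boundaries matter.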
[09:05:00.0000] has html5 got something special for centering a div ? [09:06:00.0000] we're not in the business of styling ;) [09:06:01.0000] you want CSS [09:06:02.0000] something like margin:0 auto; plus an explicit width [09:06:03.0000] or some fiddling with table layout [09:06:04.0000] ok, so which channel is suitable for css ? [09:10:00.0000] #css on irc.w3.org, though that's mainly for the CSS WG [09:11:00.0000] There's also #css here. [09:11:01.0000] That's unofficial. [10:07:00.0000] hmm, chrome doesn't support wave in