2012-11-01 [17:18:10.0000] Version: Living ??? [17:21:52.0000] Hixie: Well, it was merely half a sentence that I could add. (It's already there, commented out.) [17:22:40.0000] hsivonen: Could you explain what you mean by "prior art" in this context? The two MIME types you mentioned don't match what Hixie and I were discussing. [17:23:19.0000] zcorpan: my recommendation is to just do the FSAs every now and then [17:23:22.0000] GPHemsley: hm? [17:23:53.0000] Hixie: right [17:26:01.0000] hmm. reading licenses is not what i wanted to do tonight. :-( [17:26:49.0000] so do something else :-) [17:27:02.0000] Hixie: My addition of a 'font' type was somewhat back-handed, as in "*if* you have one, then it goes here" [17:27:16.0000] never saying that you might be expected to or anything [17:27:58.0000] the something else will be sleeping. gn [17:32:14.0000] zcorpan: nn [17:32:39.0000] GPHemsley: not really sure how it would affect the algorithm. The algorithm is pretty much set in stone by legacy practice. [17:32:50.0000] gotta go. bbl. [17:34:24.0000] Hixie: It can't be too "legacy" if it includes WebP/WebM [17:34:35.0000] http://simon.html5.org/dump/compressive/test.html now has a zoom thingie. the containing dir also has more image versions (i think 4:2:0 is interesting). ok now going for real [19:24:49.0000] GPHemsley: not clear it should include WebP and WebM, but yeah [19:24:56.0000] GPHemsley: most of it is legacy, certainly [20:39:34.0000] well i'm amazed at how many people have CORS headers completely wrong. [20:40:29.0000] anyone know when the access-control-allow-origin header went from allowing multiple origins to only allowing a single one in it's header value? [00:21:58.0000] g'day [01:37:37.0000] why is the CR discussion expected to be long? [01:37:53.0000] Because it's the HTML WG? 
[01:38:29.0000] possibly [01:46:07.0000] still no new memes :-( [01:46:20.0000] Clearly everyone is gathered around TimBL [01:47:17.0000] hsivonen: patches welcome :) [01:49:46.0000] not a living standard: a living bug list [01:52:07.0000] heh [01:52:28.0000] night of the living bug list [01:52:55.0000] hsivonen: Maybe the "features at risk" discussion will be long? [01:53:08.0000] At least I plan to object to some of them [01:53:14.0000] jgraham: it probably should be, given how random the current list is [01:53:23.0000] Just mark everything at risk [01:53:44.0000] what's the id thing about? [01:57:03.0000] I think MIME Sniffing -> MIME makes sense to do at some point [01:57:12.0000] MIME RFCs are way out of date [01:57:32.0000] You know you don't need to maximize the IETF's anger [01:59:59.0000] this meeting could go much faster [02:00:12.0000] /me pushes some WebSockets tests that need to be updated [02:00:16.0000] If anyone is bored [02:00:30.0000] Updated == converted to testharness [02:00:38.0000] (and changing the port) [02:00:56.0000] like: "two impls missing. not gonna make it through CR. problem deferred for now" [02:01:33.0000] isn't it a no-brainer to mark scoped being at risk? [02:01:44.0000] No [02:01:51.0000] why do we need to talk about it? [02:03:38.0000] To make the irrelevant people in the HTMLWG feel they have the control they wanted [02:03:39.0000] hsivonen: yes, it's a total no-brainer to mark ?&#x2009;&#x20dd; [10:38:36.0000] so cool [10:39:33.0000] (the thin space is arguably a hack) [10:44:17.0000] hmm actually, ? needs to be green too [11:19:23.0000] annevk: And you gotta get the font right. [11:55:46.0000] I filled up the wiki sidebar with a bunch of linky goodness. Let me know if there's any links you want me to add. [13:28:13.0000] Can we move the spec stylesheets to resources.whatwg.org? 
[13:29:36.0000] feel free to make a copy, but i'm keeping mine :-P [13:30:05.0000] (i just have a spec-specific one for the HTML spec) [13:33:47.0000] /me imagines Hixie's house without any spoons or knives. [13:40:31.0000] annevk: please file a bug: http://code.google.com/p/google-url/issues/entry [14:51:27.0000] Reviews welcome on my most recent mimesniff commit. [15:34:52.0000] jgraham: what was the name of the reviewing tool you talked about [15:37:26.0000] smaug____: critic [15:37:47.0000] Well technically "opera critic" (good pun, eh), but everyone calls it critic [15:38:17.0000] https://github.com/jensl/critic [15:38:54.0000] no no, obviously the page for it is http://www.theoperacritic.com/ [15:40:02.0000] heh [15:40:08.0000] /me -> sleep [16:25:28.0000] jgraham: It's called "opera critic"? I thought it was just "critic"? 2012-11-04 [19:12:26.0000] /me grumbles something about fonts that annevk already grumbled a year and a half ago. [19:13:47.0000] Does anyone know of any images that do not use an "image/" MIME type? [19:14:35.0000] (Perhaps because they are part of an "application/" container, à la "application/ogg"?) [19:19:06.0000] application/xml ? [19:19:26.0000] Hixie: What images are in that format? [19:19:29.0000] SVG? [19:19:47.0000] I'm under the impression that uses image/svg+xml [19:19:54.0000] it uses many things [19:20:02.0000] But it doesn't really matter, because XML MIME types don't get sniffed. [19:20:10.0000] application/pdf? [19:20:16.0000] depends what you mean by "image", i guess [19:20:26.0000] yeah, I suppose [19:20:27.0000] ascii art images can be sent as text/plain [19:20:30.0000] heh [19:20:33.0000] probably not that [19:21:16.0000] Hixie: Have a look at mimesniff when you get a chance. I've cleaned most of it up, I think. 
[19:21:28.0000] remind me monday [19:21:52.0000] alright [19:22:01.0000] hopefully by then "most" will be closer to "all" [19:22:06.0000] cool :-) [19:22:21.0000] i wish html was easy like that :-P [19:22:34.0000] (I still have to deal with sections 7.2 and 7.7.) [19:22:42.0000] Yeah, I don't know how you do it [19:23:11.0000] i get paid [19:23:15.0000] ah, there's that [19:23:18.0000] /me doesn't [19:23:19.0000] :-) [19:23:24.0000] (by anybody) [19:23:29.0000] (for anything) [19:23:49.0000] abarth: http://code.google.com/p/google-url/issues/detail?id=32 not the best report, but there you go [19:24:05.0000] GPHemsley: you at school/university, or just unemployed? [19:24:15.0000] just unemployed at the moment [19:25:09.0000] I left school at the end of spring semester, and how long unemployment lasts dictates whether I go back to school to do something else. [19:25:26.0000] So, you know, if you've got money to throw at me... ;) [19:25:42.0000] i wish [19:27:01.0000] (highly recommend higher education if you do want a job at a company like google, btw) [19:27:42.0000] I've got a Bachelor's and a year of grad school. How much does that buy me? [19:27:49.0000] heh, when we were hiring, people with more than a CS only made me think "did they stay in school longer because they couldn't find a job?" [19:27:53.0000] GPHemsley: more than i have :-) [19:28:52.0000] Well, the offer for an offer is on the table. I'm enjoying this work, but it'd sure be nice to get paid while I do it. [19:29:17.0000] Feel free to spread that around. ;) [19:29:57.0000] In the meantime, I'm gonna go watch TV. 
[20:10:21.0000] hmm [20:10:33.0000] so Internet Explorer does not include the leading slash in pathname [20:11:51.0000] and between Internet Explorer 6 and 7 they swapped \ for / in file: URLs [20:12:14.0000] and the Windows drive thing is [a-z][:| [20:12:15.0000] ] [20:12:25.0000] with | being converted to : [20:12:34.0000] oh, also [A-Z] and it's case-sensitive [20:12:39.0000] not too bad I guess [20:14:02.0000] http://netrenderer.com/ ++ [21:13:26.0000] oh wow [21:13:37.0000] didn't know about http://netrenderer.com/ [00:22:50.0000] gsnedders: Without checking, I think the internal release email called it "the Opera Critic", and it says "Opera Critic" at the top of the page [00:23:01.0000] But I dunno. jl never calls it that [00:24:31.0000] Cute name that way [01:54:34.0000] was there some point within the last couple years when Hixie added more named character references? [01:54:47.0000] e.g., $ and − ? [01:59:35.0000] I now find http://html5.org/r/5557 but that's not what I'm looking for [01:59:38.0000] Hixie imports the references from whatever the MathML comes up with [01:59:46.0000] Math WG* [01:59:57.0000] so if they make changes, those would affect HTML [01:00:05.0000] ok yeah I remember that now [01:00:19.0000] so I wonder if they made some changes upstream [01:01:00.0000] I'm asking because the validator source is missing some [01:01:37.0000] I guess it might make sense for the part of the validator source that handles this to be imported and generated from the Math WG source [01:02:38.0000] I am pretty sure Henri already is generating the source [01:02:56.0000] but whatever script he's using to do that is not in the repo [01:04:45.0000] weird that the related bug is marked NEEDSINFO while it was in fact fixed (for the multi-code point entities) [01:05:42.0000] MikeSmith: I guess I'd ask hsivonen then [01:05:56.0000] yeah [01:07:53.0000] /me finds http://www.whatwg.org/specs/web-apps/current-work/multipage/entities.json [01:10:40.0000] ah yeah 
[01:11:21.0000] which reminds me Hixie must have some other list that he uses along with the upstream Math source [01:11:45.0000] but the Math source doesn't know anything about semicolon-less character references [01:12:05.0000] but which that entities.json is aware of [01:14:22.0000] So masinter mentioned that Dave Thaler from Microsoft registered a whole bunch of URI schemes as provisional [01:14:25.0000] looking at the cvs log for http://www.w3.org/2003/entities/2007/htmlmathml-f.ent I see the only substantive change made to it was in February 2010 when David added NotGreaterFullEqual to it [01:14:31.0000] http://www.iana.org/assignments/uri-schemes.html is a whole lot more up to date now [01:14:35.0000] annevk: yeah he told me that too [01:14:54.0000] in fact that was very recent I think [01:15:02.0000] I vaguely remember seeing mail about it [01:17:54.0000] morning folks [01:18:58.0000] Morning [01:20:23.0000] Ms2ger: http://docs.sqlalchemy.org/en/rel_0_7/core/engines.html#database-urls - the problem here is that browsers only treat certain schemes as having that syntax [01:20:48.0000] Ms2ger: I'm not sure if we can reconcile that somehow [01:20:59.0000] Dunno if we need to [01:21:04.0000] Ms2ger: I guess it would be nice if we could so these things "would just work" in JavaScript [01:21:42.0000] Apart from this another uncertain bit seems to be whether needs to be parsed separately or not for non-relative schemes [01:21:57.0000] Did the definition of "talisman" to mean essentially a nop start with the HTML specs or was it already in tech use before? I never saw it before until I encountered it in relation to xmlns. [01:22:02.0000] Apparently some UAs treat about:blank?test as about:blank and others do not [01:23:38.0000] Chrome/Safari don't; Opera/Gecko do [01:23:59.0000] asmodai: Is that used in the current HTML spec? [01:24:26.0000] I encountered it in various places, e.g. 
http://www.w3.org/html/wg/wiki/ChangeProposals/html:xmlns [01:24:34.0000] /me checks standards [01:24:43.0000] asmodai: "tech use" dunno, as far as web specs I read goes, I believe Hixie coined it [01:24:56.0000] annevk: Ah, that might account for it. [01:25:17.0000] and it's still there in the global attributes section [01:25:17.0000] MikeSmith: Doesn't seem to be in any specs as far as I can quickly see, mostly talks about standards [01:25:28.0000] annevk: I like the term though. [01:25:53.0000] annevk: it's just used about xmlns? [01:26:03.0000] /me reads the global attributes section [01:26:03.0000] MikeSmith: yeah [01:26:06.0000] ok [01:26:33.0000] annevk: Ah yes, see it now [01:26:46.0000] asmodai: it's just a word :) there's other stuff more worthy of hate [01:27:15.0000] MikeSmith: Haha, I am not hating, I like it. :) [01:27:30.0000] ah [01:27:32.0000] MikeSmith: Just hadn't encountered it before and wondered if I missed some tech terminology :) [01:27:44.0000] somehow I misread your "like" as "hate" [01:27:48.0000] Ms2ger: maybe if it keeps coming up we could make the "relative scheme" list dynamic from a JS perspective [01:28:00.0000] Ms2ger: not entirely sure if that's desirable though [01:28:00.0000] MikeSmith: *grin* daijoubu yo! :D [01:28:36.0000] I encountered it because I am currently messing with SVG inline in HTML 5 [01:28:39.0000] asmodai: I think it was used before in that context, but can't really recall [01:28:59.0000] we used to have a funny thing for the / in too, but that got removed [01:29:07.0000] MikeSmith: For a parsing/tokenizing PoV it's clearer than calling it a nop. [01:29:15.0000] yeah [01:29:59.0000] "Then, if the element is one of the void elements, then there may be a single U+002F SOLIDUS character. This character has no effect except to appease the markup gods. As this character is therefore just a symbol of faith, atheists should omit it." 
[01:30:24.0000] rofl [01:30:28.0000] annevk: so is it preferable to have about:blank?test treated as about:blank or is it a case where it really doesn't matter as long as UAs just treat it the same? [01:30:53.0000] hah yeah the "atheists should omit it" bit [01:30:54.0000] annevk: I can see why that got removed, but dang, that's plain brilliant. [01:31:38.0000] MikeSmith: Chrome WONTFIXED their bug on it (no compat concerns), Gecko has an about:* page that uses the query string functionality, and I cannot test Internet Explorer [01:31:53.0000] ok [01:32:05.0000] MikeSmith: It might be nice to support it so about:unicorn can support a color parameter [01:32:53.0000] heh [01:33:27.0000] but it does not make sense for javascript: and data: [01:33:37.0000] annevk: I'd propose Valve's balloonicorn: http://store.valvesoftware.com/product_images/main_images/balloonicorn_main_01.png [01:33:38.0000] :) [01:33:39.0000] but they could be special cased in either the API or elsewhere [01:34:02.0000] asmodai: haha; I need to find a public domain unicorn or create one... 
[01:34:06.0000] asmodai: preferably SVG [01:34:27.0000] http://en.wikipedia.org/wiki/File:Invisible_Pink_Unicorn.svg [01:34:45.0000] heh, just found the same :) [01:34:47.0000] ACID4 should be unicorn-based [01:34:51.0000] annevk: :D [01:35:22.0000] btw I found Henri's named-character-references generator code [01:35:23.0000] http://hg.mozilla.org/projects/htmlparser/file/default/translator-src/nu/validator/htmlparser/generator/GenerateNamedCharacters.java#l40 [01:36:14.0000] uses a regexp to parse the table in the spec [01:37:03.0000] not sure it's worth switching it to do elsewise so I guess I'll just re-run it and see what I get [01:38:22.0000] so yeah, with some cleanup I think we could use that SVG file [01:38:58.0000] it's about time the platform has native unicorn support [01:40:26.0000] lol [01:52:33.0000] annevk: I predict bug reports demanding the unicorn be replaced by sharks with lasers [02:13:40.0000] http://html5.org/temp/unicorn.svg [02:13:52.0000] I wonder if there's some tool that clean up those coordinates [02:13:56.0000] can* [02:24:26.0000] Hahaha [02:24:31.0000] Awesome Anne [02:25:20.0000] annevk: https://github.com/RazrFalcon/SVGCleaner ? [02:29:38.0000] asmodai: can you maybe run that? I don't have the right OS [02:29:44.0000] http://tools.ietf.org/html/rfc6694 is for about it seems [02:30:56.0000] but it does not deal with e.g. about:BL%41NK not working [02:31:03.0000] so I guess it's not useful [02:31:16.0000] annevk: Neither do I at the moment >_< [02:31:33.0000] need to reinstall this entire box, only have console access on FreeBSD atm [02:31:41.0000] and no VM around either [02:31:44.0000] "The about:blank URI references a blank page." 
oh my what an idiots [02:31:56.0000] they removed everything that was useful about that spec [02:32:06.0000] asmodai: no worries, this can wait [02:32:23.0000] should probably do host parsing first anyway before adding easter eggs [02:32:30.0000] *grin* [02:32:34.0000] (and defining schemes in general) [02:33:06.0000] annevk: When they say blank page, do they mean a blank HTML page? [02:33:41.0000] well the first definition was a resource whose MIME type is text/html and encoding is utf-8 and consists of the empty string [02:33:45.0000] that's the one you want [02:33:51.0000] now all that is undefined [02:33:55.0000] Yea [02:34:10.0000] because blank could also mean simply an initialized panel with no content whatsoever [02:34:25.0000] panel as in UI element [02:38:08.0000] You'd think people would've learnt from the past and just be explicit about stuff like this in specs. [02:41:21.0000] it was explicit and then it was removed [02:41:55.0000] http://tools.ietf.org/html/draft-holsten-about-uri-scheme-06#section-5.1.1 [02:43:13.0000] annevk: I wonder what people have against explicitly defininf it. [02:43:15.0000] defining [02:43:38.0000] Then again, the IETF also isn't what it used to be since politics entered it [02:46:46.0000] dunno either, I'm not really interested in following their mailing lists as I usually end up point-by-point sniping rather than getting somewhere [02:48:09.0000] Lachy's older draft had the same problem btw in theoretically working with percent-escaped bytes [03:03:42.0000] annevk: having done compiler work, I just know how annoying a spec is that leaves stuff hanging and dangling. 
It's like people don't like clarity at times [03:05:59.0000] http://link.springer.com/chapter/10.1007%2F978-3-642-32759-9_26 "Reachability Analysis of the HTML5 Parser Specification and Its Application to Compatibility Testing" [03:18:53.0000] hot dang [03:18:55.0000] http://www.audiotool.com/app [03:19:02.0000] that's some impressive use of HTML5 [03:49:46.0000] MikeSmith, so did they file bugs? :) [03:50:48.0000] AryehGregor: one of the authors posted a question to the whatwg list about the adoption-agency algorithm [04:26:31.0000] http://www.ietf.org/mail-archive/web/apps-discuss/current/msg05435.html "in reality no sane software developer should rely on the semantics of any particular “about” token" [04:26:36.0000] Tim flipping bozo bits again [04:26:47.0000] I'm not sure why I [04:27:14.0000] 'm going through the archives, but it seemed potentially interesting to find out why they fucked up so badly [04:34:16.0000] It seems it was removed somewhere around http://www.ietf.org/mail-archive/web/apps-discuss/current/msg03120.html when it was made a WG item but there's no mention of why [04:40:44.0000] http://www.ietf.org/mail-archive/web/apps-discuss/current/msg02863.html does not mention it either [04:41:59.0000] but does seem to indicate why Lachy gave up on them [05:04:06.0000] jgraham: Interesting, given nowhere else on github does it call it that! [05:06:34.0000] My monitor is… clicking. [05:11:53.0000] MikeSmith: About semicolon-less character references: are you looking for a file like http://svn.whatwg.org/webapps/entities-legacy.inc ? [05:14:03.0000] Hey Philip`! [05:52:36.0000] <[tm]> Philip`_: yeah, that (about character references) [05:52:41.0000] <[tm]> thanks [05:54:09.0000] <[tm]> though I've since figured out there's no bug in Henri's code [05:54:22.0000] <[tm]> I was just misunderstanding it [06:38:03.0000] /me wonders whether such a long document was required just to say that "about:blank" is blank. 
[06:38:51.0000] s/whether/why/, if you like [06:40:22.0000] "IETF" [06:47:08.0000] <[tm]> ha I see the author of the about blank rfc is that "sm" dude [06:47:37.0000] <[tm]> no wonder it's DOA [07:25:30.0000] annevk42: i think the use of "talisman" in the context of /> is actually hsivonen's, i would have called it something far less eloquent [07:46:30.0000] <[tm]> +1 to hsivonen way with words [07:52:41.0000] Ah, so hsivonen then? [07:53:04.0000] Still awesome. [07:53:27.0000] And I still have a tab open with that unicorn of anne's, heh [08:27:38.0000] So anyone at a university want to steal us a copy of the HTML5 testing paper? [08:28:03.0000] jgraham: will any university do? [08:28:11.0000] charl: Probably [08:28:22.0000] jgraham: html5 testing paper? what/where's that? [08:28:27.0000] Any one with a subscription to springer publications at least [08:28:38.0000] ah, ok [08:28:50.0000] http://link.springer.com/chapter/10.1007%2F978-3-642-32759-9_26 [08:28:53.0000] i work at a university but i don't know about a subscription to that [08:29:06.0000] No access here either [08:29:32.0000] It's almost liek they don't want people reading their tuff [08:29:36.0000] *stuff [08:29:57.0000] seems to me that if you're testing stuff and then not letting anyone see the results, motives come into question, heh [08:30:22.0000] the authors appear to be japanese? [08:30:39.0000] since "improving the quality of things" plainly isn't the top priority, and if that's not the priority of testing ... 
[08:31:21.0000] looks more like somebody did research for their own benefit, not for the benefit of the web community [08:35:18.0000] Well sure, to a certain extent the point of doing research if you are an academic is that other academics will notice and give you a job where you can keep doing research [08:35:54.0000] There is more to it than that of course, but the current - much reviled - publication system is only really good for that purpose [08:36:16.0000] that is true but if the research doesn't have improvement in some way as a consequence, what's the point? [08:36:43.0000] Funding [08:36:56.0000] lol yeah [08:36:59.0000] pretty-much [10:00:01.0000] jgraham: Apparently I'm still able to steal papers from there [10:14:31.0000] annevk: "All but two single-byte encodings have a unique index." What does that mean? [11:09:47.0000] SimonSapin: iso-8859-8 and iso-8859-8-i (iirc) have an identical index [11:12:45.0000] 05:13 < MikeSmith> didn't know about http://netrenderer.com/ < saved my bun for many years :P [11:13:17.0000] Although in the beginning I never remebered the name, so I always had to re-find it again. [11:36:49.0000] annevk42: Still have that unicorn open in a tab, remains awesome :) [11:38:05.0000] heh [13:14:20.0000] The mimesniff rewrite is almost unofficially complete. [13:15:10.0000] Reviews welcome as I tie up loose ends. [13:15:23.0000] (Then we can get to the real bug fixinging.) [13:15:27.0000] ing [13:27:51.0000] hi, i'm using html5lib and lxml to tidy some html content. For some reason, when I parse the content into a tree, it escapes the colons from xmlns attributes [13:28:25.0000] eg. xmlnsU0003Adct="http://purl.org/dc/terms/" [13:28:59.0000] why does it do it and how can I avoid this? [13:31:49.0000] prolly because you parse HTML and use an XML-aware tree [13:32:18.0000] annevk, thanks. what do you advise instead? 
[13:33:12.0000] I guess you should use some tree structure that can deal with xmlns:dct [13:33:28.0000] (with xmlns:dct not being in the XMLNS namespace, as is the case for HTML content) [13:36:27.0000] annevk, what are the other tree structures possible? [13:37:13.0000] maybe gsnedders can help you out there, it's been ages... [13:37:23.0000] ok. Just foudn that http://stackoverflow.com/questions/12253791/html5lib-with-lxml-treebuilder-doesnt-parse-namespaces-correctly [13:37:26.0000] reading... [13:39:57.0000] the main problem though is using xmlns attributes in text/html content, that's just bogus [13:41:12.0000] I'm gonna work with real webpages so that might happen unfortunatly [13:45:33.0000] annevk, I basically want to write a rdfa spider. What would you advise regarding malformed html + rdfa pages? [14:39:55.0000] Does HTML have a stance on the Accept-Charset header? [15:50:41.0000] aleray: Basically all the Python DOM impls don't support HTML DOMs, so you're basically stuck, even with alternate models like ElementTree. [15:51:01.0000] aleray: If you're willing to take the perf hit, you could probably fork ElementTree quite easily to remove the checks 2012-11-05 [16:45:53.0000] /me imagines what a world with an Internet Media Type like 'archive/tar+gz' would look like. [16:46:14.0000] 'archive/tar+gzip' perhaps [16:47:35.0000] archive/tar+bzip2 [16:49:11.0000] archive/xpi+zip [16:49:25.0000] hmm... needs some work [16:52:29.0000] image/svg+xml+bzip2 [16:53:10.0000] maybe image/svg+xml.bzip2 [16:54:44.0000] image/svg+xml$bzip2 [16:55:10.0000] image/svg+xml!bzip2 [16:55:14.0000] /me shrugs. [17:11:30.0000] /me also thinks that there should be an alias system, similar to how Encoding is doing it. [18:52:15.0000] anyone here work with HSTS? [19:03:33.0000] aleray: you could try using validator.nu HTML parser instead [19:03:45.0000] http://about.validator.nu/htmlparser/ [19:04:06.0000] hey mike, are you familiar with HSTS at all? 
[19:04:22.0000] wirepair_: nope sorry [19:04:26.0000] dunno even what it is [19:04:34.0000] strict-transport-security [19:04:39.0000] ah [19:04:40.0000] for forcing https [19:04:44.0000] yeah [19:04:53.0000] so yeah heard of it but don't know the details [19:04:59.0000] never have worked with it [19:05:08.0000] ok thanks anyways :) [19:29:48.0000] wirepair_: I guess abarth would be a good person to ask [19:29:51.0000] when he's around [19:30:08.0000] yeah [19:30:12.0000] will do [19:35:02.0000] GPHemsley: "archive" is spelt "multipart" in mime's world [20:25:04.0000] Hixie: AFAICT, "multipart" is mostly used for things like e-mail, where multiple files are included in a single file body, with a particular format. That's not the same as an "archive" as I would define it. [20:25:17.0000] "The body must then contain [20:25:18.0000] one or more body parts, each preceded by a boundary delimiter line, [20:25:18.0000] and the last one followed by a closing boundary delimiter line." [20:26:03.0000] I'd say it's rather e-mail specific. [20:27:55.0000] or "Internet message", if you'd like [20:28:10.0000] (multipart/voice-message suggests it's not just e-mail) [20:28:51.0000] but still, the list is small, and I don't think "multipart" and "archive" are necessarily the same thing [20:29:44.0000] The worst part, as annevk knows from his dealings with URLs, is that it's hard to track down the set of documents that describe all of what a MIME/Internet media type is and how the registry works, etc. 
[20:30:11.0000] It's strewn over a bunch of different RFCs, some of which are only partially obsoleted [20:31:00.0000] oh MIME definitely used to be for e-mail only [20:31:13.0000] i'm just saying the type that MIME uses for multiple files (an archive) is "multipart" [20:31:24.0000] but yes, multipart does imply a particular format for historical reasons [20:31:51.0000] I don't think you'd ever see something like "multipart/zip", even if that format restriction was lifted [20:32:14.0000] but I just discovered that there is currently a new draft document in the works [20:32:50.0000] http://tools.ietf.org/html/draft-ietf-appsawg-media-type-regs-14 [20:34:01.0000] I don't have the mind to read it fully right now, though [20:34:23.0000] yeah, i was just coming to that conclusion myself :-) [20:34:54.0000] what was "application/*" ever even supposed to mean [20:34:56.0000] last updated in June... not sure what that means for its progress [20:35:08.0000] "a format read by applications" doesn't seem like it's much of a categorization, heh [20:35:28.0000] zewt: One of the RFCs describes that somewhere, but I'm too tired to look it up [20:35:39.0000] probably not a great definition, whatever it is, though [20:36:19.0000] guessing it's something meaningless, because it sure seems like a meaningless group [20:36:30.0000] Does the RFC Editor's Queue mean that a draft is almost done? 
[20:36:45.0000] /me is not up on IETF terminology [20:37:13.0000] actually it just seems like a weird name for "we don't have a category for this" [20:38:03.0000] probably [20:38:11.0000] or at least that's what it turned into [20:38:28.0000] I was wondering whether also having "document" would make things better or worse [20:39:49.0000] coming up with a bunch of new categories probably wouldn't actually help make things less confusing, particularly since it's not a single axis categorization so there's going to be overlap [20:41:12.0000] The only two I'm really proposing are "font" and "archive"; I think those are distinct enough to warrant their own categories. [20:41:36.0000] I think we can probably deprecate a few categories, too. [20:41:50.0000] But IDK. I haven't spent too much time thinking about it or looking into it yet. [20:42:04.0000] but there are already mime types for the major file formats in use, so unless you think there'll be a big influx of new archive formats... [20:42:05.0000] /me mumbles something about just dropping this whole thing as much as possible [20:42:07.0000] (I did see a suggestion that "font" is being used in the wild somewhere.) [20:42:28.0000] zewt: Well, I was also proposing an alias mechanism, too. [20:42:46.0000] Hixie: The problem is, magic numbers aren't always magic. [20:43:05.0000] Nor should they always be. [20:43:09.0000] sounds like something that would cause breakage and busywork [20:43:50.0000] Mozilla uses various ZIP-derived formats, for example. They're really just rebranded ZIPs, actually. Magic numbers wouldn't allow that. [20:44:34.0000] but introducing an "archive/zip" alias for application/zip would only be making things more complicated [20:45:02.0000] zewt: It depends on how it was implemented/specced. Without any specifics, it's probably not worth speculating. 
[20:45:46.0000] if it results in people serving ZIPs from HTTP servers with "Content-Type: archive/zip", then i don't think it matters how it's specced [20:46:04.0000] it's something that people have to handle that they don't have to today [20:48:20.0000] It's a lot easier to handle "archive/*" than arbitrary types [20:48:42.0000] you have to handle arbitrary types; application/zip isn't going away [20:49:30.0000] alright; like I said, I'm tired; not worth getting into a discussion right now [20:49:34.0000] anyone who wants to say "match all mime types that are archive-like formats" will always need a list of formats [20:50:26.0000] zewt: We can discuss this more tomorrow, if you want. (Feel free to review mimesniff in the meantime.) [20:50:30.0000] /me heads off to bed. [20:51:03.0000] Oh, P.S.: http://tools.ietf.org/html/draft-ietf-appsawg-media-type-suffix-regs-07 [20:56:50.0000] jgraham: I have a PDF copy of that HTML5 parser paper [20:57:59.0000] the references section lists html5lib [20:58:09.0000] and the validator.nu parser [21:00:49.0000] there is a section where they identify some markup cases where implementations have "incompatibilities" with the spec [21:01:45.0000] they fine one such incompatibility in Safari, 3 in html5lib, and 6 in the vnu parser and Firefox [21:01:51.0000] *find [21:02:01.0000] and none in Opera or IE [21:02:59.0000] one case they say the vnu parser and Firefox get wrong is this: [21:03:01.0000]
[21:03:43.0000] which parses as
in Safari, Opera, html5lib, and IE [21:04:05.0000] but as
in the vnu parser and FF [21:05:15.0000] is another case that the vnu parser and FF and also Safari get wrong [21:05:37.0000] they parse it as [21:05:56.0000] but it should stay as [21:07:33.0000] ah the 3 cases they list as html5lib getting wrong are basically the same case [21:07:46.0000] so there're only one thing html5lib gets wrong [21:07:46.0000] w [21:07:59.0000] which is
  • [21:08:20.0000] it should parse as
  • [21:08:31.0000] but html5lib gives
  • [21:09:16.0000] and the 6 cases that FF and the vnu parser get wrong basically all come down to variations of the two cases listed above [21:55:08.0000] http://www.score.cs.tsukuba.ac.jp/~minamide/html5spec/model.html5 [23:12:10.0000] GPHemsley: seems like you're confusing MIME type and HTTP's Content-Encoding header [23:15:02.0000] MikeSmith: pretty awesome that they found those bugs [23:19:13.0000] annevk: yeah and they found them by only testing with a subset of 24 elements [23:19:34.0000] if they tested with more they might find some other things that were missed [23:20:25.0000] btw the ruby bug they describe seems to have already been fixed in Gecko and WebKit [23:20:42.0000] actually I think it wasn't really even an oversight bug anyway [23:21:33.0000] it's just that the spec changed and the versions of Firefox and Safari they tested with at the time were before the parsers were brought up to date with the spec [23:21:54.0000] the versions they tested with were from around February I think [23:22:26.0000] hmm okay [00:03:15.0000] question of the day: [00:03:28.0000] "Does IETF have an XML vocabulary for expressing ABNF (RFC 5234?) grammars?" [00:03:45.0000] solid gold [00:04:47.0000] http://lists.w3.org/Archives/Public/uri/2012Nov/0005.html [00:23:53.0000] "[whatwg] Question on Limits in Adaption Agency Algorithm" - does the provided case actually hit the loop limit? i'll admit that i don't know how aaa works but it's not obvious to me that it invokes the limit [00:27:01.0000] though dropping the makes the xyz go as the last child of body, so i guess it does invoke the limit [00:40:50.0000] http://code.google.com/p/google-url/issues/detail?id=32 is an interesting discussion [00:40:57.0000] it's about that weird behavior you found zcorpan [01:02:11.0000] annevk: http://simon.html5.org/tools/js/svg-optimizer/ (use quality 10 or so) [01:05:21.0000] zcorpan_: higher is better? [01:05:45.0000] yes [01:06:06.0000] (try e.g. 
0.1) [01:06:11.0000] if I increase the quality the file savings get better... [01:06:20.0000] oh wait, nm [01:07:49.0000] I don't really see the difference between 10 and 100 [01:08:18.0000] zcorpan_: but this is very cool; how do I now remove the translate() ? [01:08:43.0000] change the viewBox values [01:08:47.0000] zcorpan_: add it to each coordinate pair? [01:09:06.0000] something like that [01:09:28.0000] would be nice if the tool did that [01:11:48.0000] this tool is such a gross hack that doing that is not implementable without reimplementing the whole thing in a more proper way [01:11:56.0000] like, e.g., operating on the parsed tree instead of the source [01:15:26.0000] oh lol [01:15:28.0000] okay then [01:16:02.0000] did anyone write a tool to translate an SVG path and just get the normalized result? [01:16:14.0000] annevk: OH: "The behavior is very well-defined." [01:16:40.0000] zcorpan_: I was quite surprised by that one [01:18:41.0000] Inkscape reportedly has the ability to do this [02:12:26.0000] zcorpan: I just got some SVG from ed and using that in your tool with a high quality makes the size bigger [02:13:11.0000] annevk: not surprising if the original svg already uses integers for the coordinates, e.g. [02:13:46.0000] k [02:27:55.0000] I made a demo for calculating the dimension of a replaced element for whoever is interested: http://lists.w3.org/Archives/Public/www-archive/2012Nov/att-0010/replaced-element-dimension-calulation [02:28:15.0000] It's a bit shocking that there's still non-interoperable case for this sort of thing... [02:29:14.0000] (Note that most of the circles are draggable) [02:31:50.0000] shocking and non-interoperable do not go together in one sentence [02:32:07.0000] ;) [02:33:21.0000] kennyluck: How does it work? What is each line? [02:33:40.0000] huh [02:33:42.0000] MikeSmith: public-iri has a restricted list? [02:33:52.0000] shouldn't [02:33:54.0000] SimonSapin, the rectangle is the min/max constraint. 
[02:33:59.0000] MikeSmith: you have to subscribe in order to post to it apparently [02:34:07.0000] hmm [02:34:09.0000] lemme check [02:34:17.0000] I can change that [02:34:22.0000] SimonSapin, orange circle is the intrinsic dimension and the green circle is the specified dimension [02:34:50.0000] my message did get archived at http://lists.w3.org/Archives/Public/uri/2012Nov/0007.html so I suppose it's not a big problem [02:34:56.0000] but it's kind of a nuisance if someone cc's public-iri and you reply to that [02:35:30.0000] yeah [02:35:49.0000] SimonSapin, oh, the red circle is the result dimension. [02:36:05.0000] kennyluck: I see, thanks [02:36:56.0000] So I think to explain the min/max table for a replaced element in terms of this graph, it's something like this: [02:39:03.0000] For a replaced element with intrinsic side and both dimension being 'auto' (not specified), the result dimension is the closest point from the intrinsic line to the constraint rectangle. Whenever there are multiple closest results (2 or infinity), the result dimension is the one that's closest to the intrinsic point. [02:40:14.0000] s/2 or infinity/infinite ones/ [02:55:20.0000] MikeSmith: did the authors of the paper send it to you as a bug report? or was it up to you to discover that someone had written a paper about bugs without actually filing the bugs? [02:58:08.0000] hsivonen: Did you see the paper? [02:58:43.0000] jgraham: no [02:58:54.0000] jgraham: just a bug MikeSmith filed based on the paper [03:06:13.0000] GPHemsley: my advice is trying to avoid fighting the IANA to make application/* make sense. [03:06:41.0000] GPHemsley: better just treat the "application/" part as meaningless legacy boilerplate and move on with life. [03:07:37.0000] fwiw, if you don't want jreschke to call you on that, s/IANA/IETF/ [03:07:56.0000] IANA supposedly does as they're told (except for when they broke all the registry URLs, that was them) [03:23:47.0000] What paper? 
[03:25:57.0000] kennyluck: http://krijnhoetmer.nl/irc-logs/whatwg/20121104#l-276 [03:26:32.0000] annevk, thanks~ [04:02:59.0000] http://html5.org/temp/unicorn.svg is now much smaller [04:03:11.0000] further improvements under CC0 welcome [04:05:17.0000] haha [04:05:25.0000] that URL from that google-url bug report [04:05:27.0000] http://%ef%bc%85%ef%bc%94%ef%bc%91.com/ [04:05:35.0000] gives different results in almost every browser [04:06:11.0000] Safari's host name is the best %41.com (yes using fullwidth %, 4, and 1 afaict) [04:07:50.0000] "The behavior is very well-defined." [04:23:20.0000] matjas: you around? saw that 1 turns into some Punycode string per http://mothereff.in/punycode but no browser does that, not even Opera [04:23:48.0000] matjas: so if that's IDNA2008... well... [04:24:45.0000] annevk: you’re really finding all the edge cases aren’t you :) good catch [04:26:41.0000] filed https://github.com/bestiejs/punycode.js/issues/12 [04:29:20.0000] "Polyglot markup is a super subset…" (from public-html) - WTF is a "super subset"? How can it be both a superset and a subset at the same time?! [04:29:42.0000] Perhaps it's just a particularly awesome type of subset. [04:37:09.0000] <[tm]> heh [04:37:28.0000] /me a day in the leif [04:38:55.0000] Wouldn't that basically be the definition of set equality? (being both a subset and superset of another set) [04:41:36.0000] I think what's meant is "extended subset" [04:44:40.0000] oh, that's not what's meant [04:46:46.0000] besides, “extended subset” is a joke [04:47:36.0000] But also a term that people are using in ernest, I think? 
[04:48:43.0000] sadly, that may be true [04:49:22.0000] I actually can't tell what Leif means [04:51:21.0000] This is not unusual [04:54:53.0000] matjas: I'm really just doing some adhoc testing, I actually should do a thing where I just pour all code points in and see what comes out, but I'm lazy [04:57:31.0000] matjas: so afiact what you're doing is correct per IDNA2008 [04:58:33.0000] matjas: mapping fullwidth to ASCII is something that's allowed in the UI layer (not required), but not the protocol layer, I wouldn't count as part of the UI layer [04:59:14.0000] so I’m bad at this sysadmin stuff. [04:59:35.0000] how should I debug when I have the same /etc/cron.d/foo file on two Ubuntu boxes [04:59:38.0000] hsivonen: we made parse different from legacy IE on the basis that legacy IE was not what people expect and there were few enough pages relying on this that we could change it [04:59:51.0000] hsivonen: i had to argue the case to convince Hixie to change it [04:59:57.0000] both have the same file permissions [05:00:17.0000] the cron job works on only one of the Ubuntu boxes [05:00:19.0000] how to debug? [05:00:21.0000] zcorpan: ok [05:01:14.0000] zcorpan: I think we shouldn’t change how it parses, but I think it’s bad that it isn’t a parse error [05:01:46.0000] hsivonen: http://serverfault.com/ ? [05:02:12.0000] hsivonen: not sure that's a maintained site, mind you [05:08:02.0000] init: cron main process (314) killed by TERM signal [05:08:10.0000] there’s my problem [05:09:05.0000] annevk: http://mathias.html5.org/data/unicode/format?version=6.1.0&property=Any&type=symbols may be useful for your tests [05:09:33.0000] annevk: see http://mathias.html5.org/data/unicode/ for README [05:14:05.0000] hsivonen: authors of that paper did not send a bug report to me [05:14:36.0000] and at the point when I first read the paper I didn't know it might be describing any bug cases [05:15:05.0000] MikeSmith: ok. not cool. 
[05:15:59.0000] I just read it because I saw that one of them had posted an interesting question to the whatwg list recently. I didn't know about the paper at that point but I found it when looking at his about page as Tsukuba university [05:16:11.0000] hsivonen: yeah they should have taken the time to report the bug [05:16:27.0000] the paper is interesting [05:16:35.0000] the parts of it that I can understand at least [05:18:16.0000] the paper describes a method for generating test cases [05:18:47.0000] starting by using a language they developed to formalize the parser algorithm [05:19:16.0000] looks like I had a midway interrupted update of cron itself on the system [05:19:22.0000] dpkg --configure -a [05:19:28.0000] did something to cron [05:19:36.0000] the method could be used to generate a lot more test cases if it were expanded to cover more than just the 24 elements they limited it to [05:20:10.0000] "we exclude formatting elements from our formalized specification because of difficulties with the destructive manipulation of the stack" [05:21:14.0000] "We are planning to address this limitation by checking the reachability to the first point where a destructive operation on the stack is required." [05:21:15.0000] Defining html parsing with a formal grammar: still hard? [05:21:36.0000] Ms2ger: still hard I guess [05:22:30.0000] I wonder if someone offers bugzilla hosting priced by amount of traffic/bugs and allows custom hostnames [05:22:53.0000] I’d like not having to run bugzilla.validator.nu myself [05:23:17.0000] specifically, exposing perl and a bunch of CGI scares me from the security POV [05:24:05.0000] this sysadmin stuff is really not my cup of tea [05:24:09.0000] perl scares me from any POV [05:30:03.0000] hsivonen: why not ask for a Product on w3.org? [05:30:16.0000] hsivonen: and just redirect there? 
[05:30:42.0000] annevk: not sure what the Freedom to Leave situation at w3.org is [05:31:18.0000] WHATWG is happy there [05:31:34.0000] also, I’d like to keep the old bug numbers [05:32:22.0000] /me wonders if Bugzilla works with a vanilla Dreamhost shared host where Dreamhost takes care of updating Perl [05:38:41.0000] hsivonen: have a DreamHost account? [05:38:56.0000] hsivonen: I'm happy to give you one for trying things out [05:43:32.0000] annevk: Where am I confusing MIME type and Content-Encoding? [05:45:29.0000] GPHemsley: http://krijnhoetmer.nl/irc-logs/whatwg/20121105#l-24 [05:48:06.0000] Hixie: I intend to avoid fighting with the IETF on anything. I was just pondering what would be necessary to improve MIME/Internet media types. [05:52:05.0000] Oh, that was supposed to be hsivonen: ^^ [05:53:48.0000] MikeSmith: "Anyway, one problem currently is that a lot of people don't seem to know that the validator.nu HTML parser exists." — http://www.w3.org/mid/20121105074634.GG29943@sideshowbarker [05:53:48.0000] Are there wrappers (or equivalent built-ins) in scripting languages such as python, ruby, php? [05:54:15.0000] no [05:54:31.0000] annevk: Ah, that. I was confusing the two. I was imagining a world where a file could be described by only a media type. As it stands, would you really be required to have a Content-Encoding header for a .tar.gz file? [05:54:42.0000] /me wishes that weren't the case. 
[05:55:23.0000] afaik, yes [05:56:22.0000] matjas: I think IDNA2008 does require NFC at least, so input like è (e, followed by U+0300) gives the wrong output in your tool compared to browsers [05:56:35.0000] GPHemsley: application/zip in ancient, so improving it would probably do more harm than good [05:57:15.0000] .tar.gz is indeed annoying from the type perspective [06:00:27.0000] GPHemsley: however, AFAICT, .tar.gz is a solved problem [06:00:40.0000] you say Content-Type: application/x-tar [06:00:41.0000] Content-Encoding: gzip [06:00:46.0000] matjas: this is mostly about processing before Punycode happens though, so how you want to call that is another matter [06:01:16.0000] and browsers will still save the gzipped file instead of ungzipping on the HTTP layer [06:01:46.0000] that's prolly documented nowhere :/ [06:02:18.0000] hsivonen: Well, alright. I wasn't necessarily saying it was a problem. I was actually just imagining what it would mean to extend +-suffixes. [06:02:38.0000] GPHemsley: I’d much rather see documentation for application/x-tar than an invention of archive/tar [06:02:40.0000] hsivonen: I found out later that someone had already imagined that; it's written up in an IETF draft. [06:03:05.0000] hsivonen: What kind of documentation? [06:03:59.0000] GPHemsley: saying it exists for starters. Maybe saying that if you are downloading application/x-tar to disk, don’t handle Content-Encoding: gzip on the HTTP layer [06:04:47.0000] hsivonen: Ah, so that would be what the +gzip would be useful for; but I see your point. [06:06:39.0000] hmm. I don’t actually find any explicit code for making Necko not gzip that stuff [06:06:47.0000] but I just tested and it doesn’t [06:06:49.0000] hmm. [06:08:02.0000] GPHemsley: anyway, this stuff needs testing and more than it needs new types [06:08:10.0000] Now, the Just Solve the File Format Problem project is documenting as many file formats as they can... 
but I wonder if it would be useful to have a document somewhere that said "Handle this format according to this spec." [06:08:35.0000] Something like "So you want to write a web browser" or something :P [06:09:06.0000] I think +zip for new types like application/epub+zip is OK, but I think it would be disruptive to try to force it on existing types [06:09:17.0000] like all the ODF/OOXML stuff [06:09:36.0000] (there’s a crazy number of MIME types for ODF/OOXML) [06:09:53.0000] a "So you want to write a web browser" document would probably just say "You must be new here." [06:10:21.0000] the ODF/EPUB way of putting the MIME type in the file itself at a well-known byte position is kinda cool [06:10:34.0000] I wonder if any server uses that to generate the Content-Type header [06:10:38.0000] I suspect not [06:11:26.0000] hsivonen: like html4 ? [06:13:02.0000] zcorpan: well, that one wasn’t at a known byte pattern [06:13:12.0000] true [06:16:21.0000] what’s wrong with writting a web browser? [06:18:06.0000] http://weasyprint.org/docs/tutorial/#weasyprint-navigator [06:21:11.0000] annevk: I see that you've spread out your definition of terms across multiple sections in URL, whereas I tend to stick them all into the "Terminology" section in mimesniff. Is there a preference for one way over another? [06:25:02.0000] SimonSapin: I really like that project :-) BTW, I'm _still_ in Lyon :S Waiting for flight now. Will leave in 2h30m. [06:27:45.0000] odinho_: eh. trouble with a previous flight? [06:30:49.0000] GPHemsley: I put them closest to where they are used [06:31:18.0000] GPHemsley: and if they're pretty general I put them in Terminology [06:31:24.0000] I see. [06:31:37.0000] annevk: How do you define "pretty general"? 
:) [06:33:08.0000] either things that are used in most major sections or things that could move into some "Platform Terminology" document at some point [06:33:17.0000] judgment call [06:37:16.0000] http://vimeo.com/52740599 seems marcos misunderstands how !important works in css (around 9:00-10:00) [06:38:44.0000] seems right to me [06:39:05.0000] roughly, anyway [06:40:00.0000] Of what I said, it all sounded very much better inside my head :P I need moar training speaking. [06:40:06.0000] an author rule without !important still overrides a user rule without !important. so adding !important to the author rule doesn't mean it overrides the user rule more. [06:56:21.0000] learning way more about IDNA2008 than I ever wanted :/ [07:00:06.0000] GPHemsley: btw, for now you only clean up the draft right? no new concepts? [07:00:21.0000] annevk: 99%, yeah. [07:01:07.0000] okay, as new implementation requirements I'd like to see discussed somewhere and definitely stuff like minting new MIME types [07:01:44.0000] incidentally... does work? [07:01:48.0000] yes [07:01:51.0000] ok [07:03:36.0000] annevk: Anything in particular you want me to make note of? 
[07:06:24.0000] no, just making sure we have the same understanding about what's going on at a high-level :) [07:06:44.0000] though now I'm reading things, I'd prefer if you used "MIME type" as HTML does as media type is something CSS uses [07:07:07.0000] GPHemsley: and the thing from XHR I was wondering about was "XML MIME type" which XHR defines [07:08:07.0000] annevk: HTML also defines an XML MIME type [07:08:34.0000] yeah, as long as they're all the same we're good and at some point we should maybe have a common terminology doc [07:08:59.0000] yeah, they're all roughly the same; it's just the precise language that's different [07:09:14.0000] as for MIME type: arghhhh [07:11:10.0000] I'm pretty sure "Internet media type" is the "official" terminology now [07:11:52.0000] it's certainly the name of the Wikipedia article [07:12:14.0000] and the IANA calls them media types, too [07:12:35.0000] well, parts of it call them MIME media types, so that's not helpful [07:22:08.0000] "media type" is not much used anymore in CSS. We talk more of media queries [07:23:15.0000] GPHemsley: hmm, I guess I'll defer to Hixie [07:23:36.0000] GPHemsley: but if HTML is not going to use "media type" for that, I don't think this document should either [07:23:57.0000] annevk: I'll take a page out of Hixie's playbook and say that it's easier for me to do nothing at this point. 
;) [07:24:29.0000] but we'll see what he says [07:24:41.0000] no one has mentioned it before [07:24:44.0000] and he has looked at it [07:37:21.0000] sure man, I'll keep bringing it up until one of you convinced the other or you both agree :) [07:56:46.0000] oh lol [07:56:54.0000] http://tools.ietf.org/html/rfc5894#section-4.4 is the rationale document for changes in IDNA2008 [07:57:03.0000] "IDNA2008 permits, at the risk of some incompatibility" [07:57:13.0000] hahaha [07:58:48.0000] (that's about changing the mapping of ß to Punycode form rather than ss, ™ to Punycode form rather than tm, and similar such changes) [07:59:28.0000] J. Klensin [08:01:55.0000] For an organisation that is concerned with current implementations that sure is a striking statement [08:02:39.0000] oh oh oh, we cannot change URI because that would mean implementations are non-conforming, but sure we can piss all over domain names? [08:03:53.0000] I kind of wonder whether it might all have been intentional [08:04:13.0000] some people really didn't ever like the idea of IDNs [08:04:30.0000] sabotage [08:06:20.0000] this rationale document also keeps talking about user input [08:06:24.0000] user input, really? [08:06:27.0000] users use Google [08:06:57.0000] the input comes from strings, legacy strings spread all around the web [08:07:03.0000] changing their meaning is insane [08:07:33.0000] even in the name of "more sensible" and "less surprising" results (I kid you not, that's the justification thus far) [08:08:48.0000] Hi, IETF [08:08:51.0000] whoever is on the IAB should be ashamed to have let this through [08:10:57.0000] whoever it was is probably graduated from the IAB already [08:17:01.0000] http://unicode.org/reports/tr46/#Table_IDNA_Comparisons has a nice summary btw [08:18:20.0000] /me wonders if sniffing should have some sort of requirement that there be binary bytes, to avoid accidentally sniffing plaintext documents. 
[08:18:20.0000] /me also wonders whether there should be a requirement that future magic numbers contain at least one binary byte. [08:18:49.0000] no and no [08:19:50.0000] /me gets confused from Hixie making multiple changes in a single commit [08:20:09.0000] WebVTT and cache manifests don't such a byte and would not benefit from it [08:21:07.0000] /me wonders who annevk is talking to. [08:21:24.0000] GPHemsley: last two lines were for you [08:21:42.0000] ah [08:21:43.0000] GPHemsley: the rest is just blogging on IRC [08:22:02.0000] right [08:22:19.0000] your responses were generic enough that they could have been towards MikeSmith or someone [08:22:49.0000] annevk: But I'm wondering about your second no. [08:23:39.0000] I mean, what if I have a text file that begins "GIF89a"? [08:24:00.0000] Or worse, "BM" [08:24:09.0000] Like, "BMW Motors" [08:24:17.0000] that'll get sniffed as a bitmap [08:24:32.0000] depends on the context [08:25:04.0000] What's the context? It's a file without a Content-Type header. [08:25:10.0000] There is no context. [08:25:17.0000] loading context [08:25:45.0000] It's a file served by HTTP without a Content-Type header. [08:25:51.0000] It's the standard always-sniff context. [08:26:49.0000] for non-text formats it might make sense to require a zero byte or some such [08:27:29.0000] but typically those requirements are not read anyway so I'm not sure it makes sense [08:29:03.0000] what do you mean by "not read"? [08:29:34.0000] well there's a requirement you register new MIME types for instance, that almost never happens [08:30:00.0000] oh, you're still talking about the magic number registration, OK [09:19:13.0000] hmm... CSS... 
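(Editor's sketch of the "BMW Motors" concern raised above. This is not the mimesniff algorithm: the signature table is deliberately tiny, the `text/plain` fallback is an assumption, and `guarded_sniff`, `BINARY_BYTES`, and the 512-byte window are hypothetical names and values chosen only to illustrate the "require a binary byte" idea that was floated.)

```python
# Illustrative only: a simplified prefix table, not the real sniffing tables.
SIGNATURES = [
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"BM", "image/bmp"),
]

# Assumption: control bytes treated as "binary" here.
# Tab, LF, FF, CR and ESC are deliberately left out.
BINARY_BYTES = frozenset(
    list(range(0x00, 0x09)) + [0x0B] + list(range(0x0E, 0x1B)) + list(range(0x1C, 0x20))
)


def naive_sniff(data: bytes) -> str:
    """Return the type of the first matching prefix, else text/plain (assumed fallback)."""
    for prefix, mime in SIGNATURES:
        if data.startswith(prefix):
            return mime
    return "text/plain"


def guarded_sniff(data: bytes, window: int = 512) -> str:
    """Hypothetical variant: trust a signature only if a binary byte occurs in the first `window` bytes."""
    if not any(byte in BINARY_BYTES for byte in data[:window]):
        return "text/plain"
    return naive_sniff(data)


if __name__ == "__main__":
    print(naive_sniff(b"BMW Motors"))    # prefix match alone treats this text as a bitmap
    print(guarded_sniff(b"BMW Motors"))  # no binary byte present, so the signature is ignored
```

As noted in the discussion, a blanket guard like this would not suit formats whose signatures are plain text (WebVTT and cache manifests were the examples given), so at most it could apply to non-text formats.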
[09:28:58.0000] so fwiw the IRI WG is meeting f2f at IETF tomorrow [09:29:00.0000] https://datatracker.ietf.org/meeting/85/agenda/iri/ [09:29:35.0000] Time to set up ietfmemes [09:29:42.0000] hahaha [09:29:43.0000] yeah [09:30:13.0000] anyway the agenda has only 15 minutes set aside for "URI/IRI/URL thread among IETF/W3C/WHATWG (Larry Masinter)" [09:30:20.0000] from 18:00 to 18:15 [09:30:20.0000] IRI is not that interesting (imo) as you just get percent-encoded stuff out of the parser anyway [09:30:34.0000] it's just a syntax [09:31:12.0000] annevk, TabAtkins_ : What was the discussion recently about charset determination for CSS? [09:31:28.0000] In particular, where does the outcome of that discussion now reside? [09:31:39.0000] GPHemsley: css3-syntax [09:32:16.0000] Thanks. [09:35:35.0000] GPHemsley: Yeah, I updated Syntax to the latest sometime last week. [09:39:57.0000] thanks [09:58:04.0000] GPHemsley: Just to make sure, you're looking at dev.w3.org/csswg/css3-syntax, right? [09:58:11.0000] yup [09:58:14.0000] GPHemsley: Also, I forget, did I end up meeting you sometime last week? [09:58:18.0000] heh, no [09:58:33.0000] /me has gone anywhere. [09:58:35.0000] n't [09:58:47.0000] You've gone anywhere? :) [10:01:47.0000] brain <--> fingers [10:09:50.0000] aah, I was hoping zewt showed up in http://lists.w3.org/Archives/Public/www-international/ [10:10:25.0000] More of a list for crazy, I guess [10:10:49.0000] that's why I joined [10:11:39.0000] Fair enough [10:12:14.0000] /me grumbles something about mailing lists. [10:13:38.0000] They're support forums [10:13:58.0000] I suggest the IETF drop their version of mimesniff, they say they may assign new editors at the IETF meeting this week. [10:14:05.0000] -_- [10:14:52.0000] /me has no interest in participating in territory disputes. [10:15:11.0000] Then you came to the wrong place, sir [10:15:18.0000] WELL IT'S OBVIOUSLY AN IETF SPEC SO WHY ARE YOU STEALING IT FROM US!? 
[10:16:01.0000] My statement that prompted such a response: "That's correct; I do not intend to work on this document through the IETF. I think the potentially fluid nature of the material would be better served as a living WHATWG standard than a frozen RFC. I would recommend that websec drop it as a deliverable." [10:16:03.0000] GPHemsley: well, I'd welcome the competition [10:16:21.0000] GPHemsley: Link? [10:16:26.0000] gsnedders: Private e-mail. [10:16:58.0000] GPHemsley: history shows that unless a browser vendor is actively involved the IETF outcome is, well, not super [10:17:02.0000] D'awww. [10:17:06.0000] Where's the fun in that? [10:18:21.0000] gsnedders: Actually, the particular e-mail I'm quoted was also forwarded to a public mailing list without my permission, so you may be able to find it. [10:18:30.0000] s/quoted/quoting/ [10:18:32.0000] s/unless a browser vendor is actively involved// [10:18:37.0000] gsnedders: But not he response. [10:18:40.0000] +t [10:18:44.0000] /me sighs at the keyboard. [10:19:35.0000] I don't really understand all these territorial disputes; in my mind, the WHATWG, W3C, and IETF all have different functions. [10:21:22.0000] The WHATWG writes specs, the W3C publishes them for patent protection, and the IETF whines about them in 1970's-style text files? [10:22:08.0000] lol [10:24:31.0000] In the W3C WebApps Charles asked who wanted to become famous by copying WHATWG drafts, putting their name on it, and publishing them at the W3C [10:24:38.0000] meeting /\ [10:25:37.0000] What followed that? [10:26:00.0000] gsnedders: sorry? [10:26:08.0000] A call for editor for the URL spec? [10:26:25.0000] Oh, that was about all drafts I used to edit at WebApps [10:26:32.0000] I was in the room too [10:27:07.0000] It's such a weird dynamic [10:27:13.0000] annevk: What was the response from the group? 
[10:27:35.0000] Well they have some volunteers for XHR; Lachy volunteered for DOM [10:27:52.0000] dunno about Fullscreen / URL [10:28:25.0000] So basically they decided who wanted to become famous by copying WHATWG drafts and putting their name on it. [10:28:31.0000] Well done, W3C. [10:29:55.0000] Oh, is that what the IETF is gonna do to me? [10:30:01.0000] I'm mostly here to solve problems and to raise problems with capital p Process [10:30:43.0000] GPHemsley: well last time IETF tried (about:blank) they rendered their variant way worse so we have to take it back again [10:31:12.0000] GPHemsley: I would not expect them to do a good job of capturing requirements of browsers [10:31:32.0000] Who does the IETF think they represent, if not the browsers? [10:31:39.0000] the Internet [10:31:48.0000] which is... who, exactly? [10:32:06.0000] The Internet [10:32:18.0000] or do they do ephemeral work for an ephemeral entity? [10:32:36.0000] http://www.ietf.org/about/ [10:33:05.0000] GPHemsley: Everyone who uses the internet. [10:33:33.0000] karlcow: According to that, their area should be restricted to networking architecture. [10:34:02.0000] As long as the Web can be built on that architecture, they have no jurisdiction over the Web. [10:34:13.0000] In theory. [10:34:37.0000] /me sniggers [10:34:41.0000] GPHemsley: history, social dynamics, communities. People != robots. Or at least they try sometimes. [10:35:16.0000] karlcow: Pfft. [10:35:39.0000] QED. :) [10:36:45.0000] □ [10:37:21.0000] :) [10:37:39.0000] Well, I guess we solved the Internet. We can all go home now. [10:38:05.0000] Was there something to solve? :) [10:38:17.0000] There is a story, but nothing to solve. [10:38:23.0000] It's more like an epic poem. [10:38:50.0000] Emphasis on the "epic" [10:38:56.0000] yup [10:39:24.0000] I meant https://en.wikipedia.org/wiki/Epic_poetry [10:41:02.0000] talking about it, I should go for writing a bit about webdriver hopes for testing. :) [10:47:47.0000] ah karl left? 
[10:48:21.0000] I was gonna say, I hope our epic poem is somewhat easier to read than e.g. that of John Milton [10:48:33.0000] but then I suspect for many people it isn't :/ [10:52:55.0000] why does Fullscreen link to both HTML and HTML5? [10:52:56.0000] did I do that? [10:53:19.0000] ah https://github.com/whatwg/fullscreen/commit/1991f306a4e4e37c450542e29e78075de06305d2 [10:53:30.0000] hmm [11:01:34.0000] In case anyone here is interested, I filed a Mozilla bug on mimesniff implementation earlier today: https://bugzilla.mozilla.org/show_bug.cgi?id=808593 [11:17:25.0000] GPHemsley: I guess the stuff Zack mentions there could be done in a "css sniffing context" (or whatever that's called now) [11:17:50.0000] GPHemsley: currently HTML defines those rules, and I'm not sure if they apply outside of HTML... [11:18:28.0000] right... I didn't finish investigating how far the HTML go [11:18:35.0000] +rules [11:18:52.0000] nor is it clear to me when CSS would be parsed outside of HTML [11:19:11.0000] (and thus not be covered by HTML rules) [11:19:35.0000] SVG [11:19:39.0000] CSS referencing CSS [11:19:55.0000] HTTP Link header in some implementations [11:20:15.0000] (but the last one should prolly be removed) [11:20:34.0000] [11:20:40.0000] CSS doesn't give rules about how to parse other potentially CSS files? [11:20:43.0000] although CSSOM might describe it for that at the moment [11:21:20.0000] GPHemsley: CSS doesn't define the edges well, generally speaking [11:21:37.0000] I wouldn't even know how to go about sniffing CSS [11:22:15.0000] GPHemsley: e.g. @import url(test\ test); // CSS does not define how to parse that URL well and does not define how it's fetched (what Referer is etc.) and does not define what to do with the result if it e.g. 
lacks a content-type header; in a way that matches implementations [11:22:25.0000] GPHemsley: it's not exactly sniffing [11:22:44.0000] GPHemsley: it's more like, lacks content-type, assume it's text/css if these other conditions are true [11:22:57.0000] annevk: Are there flaws in that method? [11:23:07.0000] annevk: Or is it just not specced? [11:23:36.0000] as I said, it's specced in HTML, but whether it should apply elsewhere too is unclear [11:23:48.0000] right [11:23:59.0000] and you think it's within the scope of mimesniff to document that/ [11:24:01.0000] ? [11:24:13.0000] kinda [11:24:25.0000] lacks Content-Type, so what to do? [11:25:05.0000] well, without special handling, running it through the unknown sniffing algorithm would return text/plain, I think [11:25:37.0000] model brainstorming: URL -> HTML fetch -> resource [11:25:51.0000] sounds reasonable [11:25:52.0000] resource + context -> determine type [11:25:59.0000] determine type is MIME sniffing [11:26:10.0000] then process resource [11:26:15.0000] based on type [11:26:19.0000] I see [11:26:22.0000] hmm [11:26:40.0000] who defines context? [11:26:52.0000] the step before URL [11:27:28.0000] API.resource = Fetch(API.url) [11:28:17.0000] API.resource.type = MIME Sniffing(API.resource, API.context) [11:28:29.0000] API.context can be something like "image context" [11:29:30.0000] so, "sniffing * specifically" = context? [11:30:50.0000] if I rename context to constraints, does that help? [11:31:42.0000] neither is a term I use in mimesniff, so not really :P [11:32:26.0000] e.g. 
for you want to use the image sniffing rules for the resource you fetched [11:32:29.0000] I'm just trying to determine exactly which parts fall into my jurisdiction [11:32:35.0000] would be the API in the above example [11:33:11.0000] OK; current invokes the 'rules for sniffing images specifically' [11:33:23.0000] yeah that works [11:33:28.0000] +ly [11:33:31.0000] geez, keyboard [11:34:44.0000] I guess my thinking was to have "get the MIME type for resource /resource/, constrained with /images/" or some such [11:36:14.0000] not sure btw CSS really fits well into this model though, as I believe a number of things depend on whether the fetching resource is in standards mode and whether it's cross-origin or same-origin [11:37:46.0000] right now, I have the general sniffing algorithm that calls the various pattern matching algorithms based on UA preferences [11:38:11.0000] and then I have the separate algorithms that also call the pattern matching algorithms which can be used for hooks like HTML needs [11:39:10.0000] HTML calls the 'rules for distinguishing if a resource is text or binary' directly to work around the same Apache bug the main sniffing algorithm works around. [11:39:55.0000] and HTML calls the 'rules for sniffing images specifically' when dealing with images (, ,