00:14 | <Hixie> | OK! |
00:14 | Hixie | flexes fingers |
00:14 | <Hixie> | character encoding spec, where are you so i can (a) implement you and then (b) fix you good and well. |
00:52 | <annevk> | I was planning on sleeping, but then I came across this: http://steinbaugh.com/asides/ems-layout/ "but I’m not planning anything new until HTML 5 or XHTML 2 gets finalized" someone should tell him |
00:52 | <Hixie> | xhtml2 might be finalised relatively soon |
00:53 | <Hixie> | so step 3 of the "algorithm for extracting an encoding from a Content-Type" says "If the next six characters are not 'charset', return nothing" |
00:53 | <Hixie> | anyone know if i meant that to be a case-insensitive check? |
00:54 | <Philip`> | Hmm, it still says "six"? |
00:55 | <Hixie> | uh yeah i should fix that while i'm at it |
00:55 | <Hixie> | but that wasn't the point :-) |
01:03 | <Philip`> | Lots of people use capital CHARSET so it would be most useful if it was case insensitive |
01:03 | <Hixie> | good to know |
01:04 | <Philip`> | Oh, is this for HTTP Content-Type rather than <meta>? |
01:04 | <Hixie> | it's for "text/html;charset=foo" |
01:04 | <Hixie> | whether in Content-Type or in <meta content=""> |
01:04 | <Hixie> | (not <meta charset="">) |
01:05 | <Philip`> | Ah, okay |
01:05 | <Hixie> | still case-insensitive? |
01:06 | <Philip`> | In the HTTP header, I see 3 CHARSET, 223 Charset, 25451 charset, but there are far more CHARSETs in <meta> |
01:06 | <Hixie> | k |
01:06 | <Hixie> | case-insensitive it is |
01:06 | <Philip`> | (I don't currently have a nice way of counting contents of <meta> though) |
01:06 | <Hixie> | i'm setting up a script to parse the docs and find that |
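[Editor's sketch] The case-insensitive check agreed on above can be illustrated roughly as follows. This is not the spec's step-by-step algorithm: the function name `extract_charset` and the regular expression are assumptions made for illustration, and the handling of quoted values is a simplification.

```python
import re

# Hypothetical pattern: "charset" in any letter case, optional spaces around "=",
# then a double-quoted, single-quoted, or bare value.
_CHARSET_RE = re.compile(
    r"""charset\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s;]+))""",
    re.IGNORECASE,
)


def extract_charset(content):
    """Return the charset value from a Content-Type style string, or None."""
    match = _CHARSET_RE.search(content)
    if match is None:
        return None
    # Exactly one of the three capture groups took part in the match.
    return next(group for group in match.groups() if group is not None)


if __name__ == "__main__":
    for sample in ("text/html;charset=foo", "text/html; CHARSET=UTF-8", "text/html"):
        print(sample, "->", extract_charset(sample))
```

The same function would apply to a string taken from an HTTP Content-Type header or from `<meta content="">`, since both carry the `text/html;charset=foo` form discussed above.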
14:00 | <annevk> | hmm |
14:01 | <annevk> | is it worth responding to the global href thread? |
14:05 | <Philip`> | If you have good arguments then it's probably useful to respond and to update the FAQ to be clearer about those points |
14:05 | <annevk> | dunno |
14:06 | <annevk> | guess i'll leave it for now |
14:07 | <annevk> | i think <a> should be allowed to wrap around "block level elements" and i think that adding href everywhere complicates processing models too much for no real benefit |
14:09 | <annevk> | using script for older user agents is a valid point though |
14:10 | <Philip`> | Why does it make the processing model any more complicated than it already has to be to handle global onclick and :hover? |
14:13 | <annevk> | because every element gains an additional default click event handler |
14:14 | <annevk> | (that might conflict with existing click event handlers (in case of form controls) so you need to make choices how to handle those, etc.) |
14:14 | <annevk> | more ways to do something always makes things more complicated, and increases QA cost, etc. |
14:18 | <Philip`> | Ah, sounds like the issue is that e.g. a <button> currently has one default click handler and n DOM click event handlers, and <button href> would change that to be >1 default click handler rather than changing it to n+1 DOM click event handlers |
14:19 | <Philip`> | (and so then things like preventDefault would become confusing, because there's no longer just one default handler) |
14:20 | <annevk> | it's also not clear if submitting the form and following a link at the same time makes any sense |
14:22 | <Philip`> | Another issue is you'd sometimes want the other link-related attributes like rel and target and ping, and you'd have media and hreflang and type too for consistency, which would get messy |
15:24 | <Philip`> | http://www.google.com/m/search?q=%ef%bf%bf |
15:33 | <zcorpan> | Philip`: it amuses you doesn't it? :) |
15:36 | <Philip`> | zcorpan: It would be more fun if it wasn't so trivial to find exactly the same bugs in every XHTML site :-) |
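[Editor's sketch] The log does not spell out the bug, so the following rests on an assumption: that the pages echo the query value back into XHTML without checking it. The query value `%ef%bf%bf` is the UTF-8 encoding of U+FFFF, which the XML 1.0 `Char` production excludes. The helper names below are hypothetical.

```python
from urllib.parse import unquote_to_bytes


def decode_query_value(raw):
    """Percent-decode a query value and interpret the bytes as UTF-8."""
    return unquote_to_bytes(raw).decode("utf-8")


def is_xml10_char(ch):
    """True if ch is allowed by the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


if __name__ == "__main__":
    text = decode_query_value("%ef%bf%bf")
    for ch in text:
        print(f"U+{ord(ch):04X} allowed in XML 1.0: {is_xml10_char(ch)}")
```

A site that serves XHTML as XML would need to reject or replace such characters before writing them into the page, since an XML parser treats them as a fatal error.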
15:43 | <Lachy> | the problem is that so few developers understand, or even bother to consider, character encoding issues |
15:44 | <Lachy> | the bigger problem is that character encodings are too complex for most people to understand |
15:45 | <annevk> | hmm, why do I get "invalid character" in Opera for http://www.google.com/m/search?q=%ef%bf%bf but the same Acid3 test still fails? |
15:49 | <Camaban> | I agree with that Lachy, I've never found anything talking about character encoding that I could understand terribly well. I can look up and find various bits of info about doctypes and stuff, but good, understandable character encoding info seems a lot harder to find |
15:55 | <annevk> | ALA prefers IE-propaganda :o |
16:03 | <Lachy> | Camaban, search for "Guide to Unicode" in google, then read the first result |
16:03 | <Lachy> | http://lachy.id.au/log/2004/12/guide-to-unicode-part-1 |
16:06 | <Camaban> | will have a look :) |
16:23 | <Camaban> | Lachy: sorry, but that comes under the heading of 'stuff I struggle to understand', and I've only read the first couple of paragraphs |
16:23 | <Lachy> | Camaban, really? I was told by so many people that it was really easy to understand |
16:23 | <Camaban> | Since version 1.1, the Unicode standard has remained fully compatible with ISO/IEC 10646: Universal Multiple-Octet Coded Character Set. The ISO/IEC 10646 standard defines a character repertoire and character code points (or code positions), as well as two character encodings, UCS-2 and UCS-4, allowing for up to 2^32 code points. |
16:24 | <Camaban> | that means absolutely nothing to me |
16:24 | <Lachy> | it gets easier |
16:24 | <Camaban> | I'll have to have another go at getting further into it when I have a bit more time then |
16:26 | <annevk> | Camaban, seen http://www.joelonsoftware.com/articles/Unicode.html already? |
16:27 | <Camaban> | annevk: no, looks like perhaps I didn't know what to search for :) |
16:27 | <Lachy> | if you read that one from Joel, then you have to read this one first http://ln.hixie.ch/?start=1066145333&count=1 |
16:28 | <annevk> | Camaban, I think that one is really good, and pretty accessible too |
16:29 | <Camaban> | from a quick scan, it looks so |
16:30 | <Camaban> | as a guy who generally codes up HTML/CSS, I tended to search for stuff about character encoding to check what I should be putting in the HTML to make it work properly, the idea of actually needing to search for, and find out about unicode hadn't occurred to me |
16:31 | <annevk> | it explains encodings further down |
16:31 | <Camaban> | yeah, I see that, I guess it needs a few more links to come up in google better :) |
17:41 | <met_> | annevk http://www.w3.org/TR/2008/WD-XMLHttpRequest2-20080225/ HTML 5 (work in pgoress) s/pgoress/progress |
18:36 | <annevk> | met_, thanks, fixed |
18:38 | <met_> | annevk, is it possible to make such changes in a released TR document? |
18:38 | <met_> | or there is some w3c policy? |
18:40 | <annevk> | TR docs are snapshots |
18:40 | <annevk> | it will be fixed in the next snapshot |
18:41 | <annevk> | sometimes TR docs are "edited in place", but going through that trouble for a Working Draft is not worth it |
18:41 | <met_> | ok |
19:26 | <zcorpan> | <!doctype html public "-//w3c//dtd html 4.0 transitional//en" " > appeared 23 times in Philip`'s data |
19:27 | <zcorpan> | though don't seem too broken in ie because they get terminated at <html lang="..."> or so |
19:29 | <zcorpan> | looking at the pages it is getting increasingly clear that ignoring the last two characters in the fpi results in better web compat |
19:29 | <Philip`> | The "//en" characters? |
19:29 | <zcorpan> | yeah |
19:31 | <zcorpan> | pages with <!doctype html public "-//w3c//dtd html 4.0 transitional//de"> need quirks mode |
19:43 | <zcorpan> | aha! |
19:43 | <zcorpan> | http://www.quintomiglio.com/ is the first that renders better in standards mode in opera and firefox |
19:44 | <zcorpan> | (gets quirks in opera, standards in firefox) |
19:44 | <zcorpan> | i had found 14 before that one where the opposite is true |
19:45 | <zcorpan> | (and a bunch that would render pretty much the same in quirks and no-quirks) |
19:51 | <annevk> | man, what did I do to this Garett Smith |
19:51 | <annevk> | Garrett* |
20:34 | <Hixie> | annevk: ? |
20:36 | <annevk> | yo |
20:36 | <annevk> | ignore my www-archive e-mail please |
20:36 | <Hixie> | the acid3 one? |
20:36 | <Hixie> | i saw your reply, was already ignoring both :-) |
20:36 | <Hixie> | i was wondering what made you wonder what you did to garrett |
20:37 | <annevk> | he seems so hostile |
20:37 | <Hixie> | right, but which thread? |
20:38 | <annevk> | http://lists.w3.org/Archives/Public/www-style/2008Feb/thread.html#msg274 |
20:39 | <Hixie> | oh, www-style |
20:39 | <Hixie> | i don't read that anymore |
20:40 | <annevk> | i do |
20:40 | <annevk> | I'm still interested in CSS and nobody else is doing it |
20:41 | <Hixie> | yeah |
20:41 | <Hixie> | if i wasn't doing html5 i probably would be |
20:41 | <Hixie> | though i'd be sorely tempted to start a competing organisation to standardise css properly, with a real community, etc, like the whatwg |
20:42 | <Hixie> | lord, you're right, wtf did you do to this guy |
20:44 | <jwalden> | existed |
20:45 | <othermaciej> | he seems kind of angry |
20:47 | Hixie | mumbles something about being glad this poem discussion is going on on public-html and not whatwg |
20:47 | <Hixie> | i like Philip Taylor's comment: |
20:47 | <Hixie> | "I'm reluctant to enter into this debate, yet |
20:47 | <Hixie> | feel strangely compelled so to do." |
20:48 | <annevk> | i tried reading that wiki page, but it was very unclear and very big |
20:48 | <annevk> | so i gave up and did something else |
20:49 | <Hixie> | i fear that when it comes time to go through the wiki pages, a lot of them will be getting a response of "the problem is not described well enough for me to address this issue" |
20:50 | <annevk> | surprisingly many begin with describing "the solution" |
20:50 | <Lachy> | that's because describing a solution is much easier than describing what the problem is |
20:51 | <Hixie> | yup |
20:51 | <Hixie> | but as editor i can't care about solutions without understanding the problem |
20:51 | <Hixie> | since i can't evaluate a solution without knowing the problem |
20:51 | <Lachy> | I gave up on the HTMLWG wiki a long time ago, when I realised, despite several attempts to nudge them in the right direction, their content remained mostly useless |
20:53 | <annevk> | as with usability, it seems better to learn what people need than what they want |
20:54 | <Philip`> | http://tug.ctan.org/cgi-bin/filenameSearch.py?filename=%00 - http://www.hipocampo.org/buscar.asp?search=%01 - http://virtueventures.com/services.php?page=6%ef%bf%bf - this really is too easy |
20:54 | <Lachy> | I really don't understand what I did to provoke this hostile response. http://www.w3.org/mid/47C5A056.4080108⊙mn - I thought all I did was try to clarify what the spec was saying to someone who misunderstood it |
20:55 | <Hixie> | he'd already told me off for not understanding him |
20:55 | <Hixie> | anyway as far as i'm concerned that issue is resolved |
20:55 | <Hixie> | since the term "prose" is no longer in the spec |
20:56 | <Lachy> | yeah, him telling you off is understandable. Everyone seems to do that :-) |
20:57 | <annevk> | you're part of Hixie's posse Lachy, deal with it :p |
20:57 | <Hixie> | :-) |
21:02 | gsnedders | hides from the Cabel |
21:02 | <Lachy> | gsnedders, do you have a fear of cables or did you mean cabal? |
21:02 | <gsnedders> | Lachy: cabal |
21:08 | <jwalden> | !summon zcorpan |
21:11 | Hixie | disillusions people on the semantic web in html5 http://realtech.burningbird.net/semweb/semantic-web-dull-as-dishwater-edition/#comment-372 |
21:13 | <Lachy> | Hixie, no comment from you on that page |
21:14 | <Lachy> | is it awaiting approval? |
21:15 | <Hixie> | yes |
21:21 | Dashiva | wonders how they go from "People make links because they're useful, search engines add extra value" to "People add metadata because... there's no benefit whatsoever" |
21:22 | <Hixie> | feel free to comment also |
21:22 | <Hixie> | my comment is there now btw |
21:23 | <Hixie> | http://www.zingermans.com/ is great |
21:23 | <Hixie> | <meta http-equiv="Content-Type" content="text/html; charset=UTF-16" /> |
21:24 | <Dashiva> | I'll leave commenting to the official opera dudes |
21:31 | <Lachy> | Dashiva, non-opera people can comment too. It saves us the work :-) |
21:32 | <Dashiva> | Yeah, but I might say something to make chaals come after me |
21:34 | <Lachy> | why would chaals come after you? you don't work for Opera, do you? |
21:34 | <Dashiva> | I have worked for opera three summers in a row now, and I'm several kinds of volunteer rest of the year :) |
22:03 | <Dashiva> | You got a reply, Hixie :) |
22:03 | <Hixie> | woah, big reply |
22:05 | <Dashiva> | I wonder if links going 404 counts as data rot or not as far as bananas go |
22:20 | <annevk> | i don't think chaals has a bias towards Opera employees |
22:20 | <annevk> | btw |
22:24 | <annevk> | lol, TAG gets involved in ARIA |
22:29 | Hixie | searches for the mail anne refers to in his "Unlikely to be useful" folder |
22:31 | <annevk> | Hixie, it now changes UTF-16 to UTF-8 in two separate places, that's the idea? |
22:31 | <Hixie> | yeah. it was almost three, but i reduced it to two |
22:32 | <Hixie> | not sure how to merge the last two, but if it gets more complex i'll abstract it out into a separate "set of steps" |
22:33 | <annevk> | oh also, if you're testing charset and stuff, apparently two out of four browsers require http-equiv=content-type |
22:33 | <Hixie> | which two? |
22:33 | <Hixie> | http://www.hixie.ch/tests/adhoc/html/parsing/encoding/ is my test suite btw |
22:33 | <annevk> | IE and Opera iirc |
22:33 | <Hixie> | k |
22:33 | <Hixie> | seems safe not to require it then |
22:34 | <annevk> | i suppose |
22:34 | <Hixie> | it's sad that tbl is no longer really up to date with html on the web |
22:35 | <Hixie> | hey does anyone have IE? my mac is trying to update itself and i can't run the VM while that's happening |
22:35 | <annevk> | let's keep track of where we are in a decade |
22:36 | <annevk> | i have IE7 running, though UI-wise it's limited |
22:36 | <Hixie> | i need to know the result of these tests: |
22:36 | <Hixie> | http://www.hixie.ch/tests/adhoc/html/parsing/encoding/069.html |
22:36 | <Hixie> | http://www.hixie.ch/tests/adhoc/html/parsing/encoding/070.html |
22:36 | <Hixie> | http://www.hixie.ch/tests/adhoc/html/parsing/encoding/071.html |
22:36 | <Hixie> | http://www.hixie.ch/tests/adhoc/html/parsing/encoding/072.html |
22:36 | <annevk> | 69: FAIL |
22:37 | <annevk> | windows-1254 used |
22:37 | <annevk> | 70 fail, windows-1252 used |
22:37 | <annevk> | 71, fail, windows-1254 used |
22:37 | <annevk> | 72 fail, windows-1252 used |
22:38 | <Hixie> | well crap |
22:38 | <Hixie> | that's three browsers, three different sets of results |
22:38 | <Hixie> | firefox is Windows-1252 for all four |
22:38 | <SadEagle> | heh. for more fun, not that you care, konq3 uses iso-8859-9 --- supposedly, and... well, let's not talk about what 4 does |
22:38 | <Hixie> | safari is ISO-8859-9 (equiv of 1254 for the sake of this test) for all four |
22:39 | <Hixie> | how about opera? |
22:39 | <annevk> | maybe IE doesn't do content="...;charset='...'" |
22:39 | <annevk> | Opera passes all four |
22:40 | <Hixie> | ISO-8859-9? |
22:40 | <Hixie> | hm |
22:40 | <Hixie> | well crap |
22:40 | <annevk> | Opera ftw |
22:40 | <annevk> | :) |
22:40 | <Hixie> | well, "pass" is actually not what the spec says right now |
22:40 | <Hixie> | windows-1252 for all four is what the spec currently requires |
22:41 | <Hixie> | looks like spaces in encoding names is rare |
22:41 | <Dashiva> | Oh, Hixie. You and your wacky testcase hijinx |
22:41 | <Hixie> | so i guess, no trimming. |
22:44 | <Philip`> | annevk: In IE7, I see "PASS" ("Windows-1252") in 070 and 072 |
22:44 | <Hixie> | i just changed the tests |
22:44 | <Philip`> | Ah |
22:44 | <Hixie> | anne's earlier report is now out of date |
22:49 | <Dashiva> | "But professor, these are the same questions as last year's exam!" "I know. I changed the answers." |
22:53 | Philip` | is reminded of a quiz show that asked how many moons the Earth has, and the correct answer was two; and the next year they asked the same question, and the answer "two" resulted in -10 points because the correct answer was now five |
22:53 | <Lachy> | Hixie, Philip`, do either of you have any stats on how much VoteLinks are used? rel=vote-for, vote-against, and vote-abstain? |
22:54 | <Hixie> | haven't seen it, but i haven't done studies for rel="" in a while |
22:55 | <blooberry> | lachy: what context are they used in? (element) |
22:56 | <Lachy> | blooberry, http://microformats.org/wiki/vote-links |
22:56 | <blooberry> | ah...A element. gotcha |
22:57 | <blooberry> | quick check: 7 times in DMoz URL set for vote-for. didn't find any for the others. Would have to do a deeper check to see if I'm looking for the right thing. |
22:59 | <Lachy> | technorati has removed their vote-links tracking page. I guess that means it failed. http://www.technorati.com/live/votes.html |
23:00 | <Hixie> | live and learn |
23:02 | <blooberry> | ah, I see...it is used more in REV attribute than REL. I found cases for all values...not many though: for: 29 cases; against: 14 cases; abstain: 12 |
23:09 | <Hixie> | out of? |
23:10 | <blooberry> | 3.5 million |
23:11 | <Hixie> | wow |
23:11 | <Philip`> | I see http://www.ehlinelaw.com and http://www.wiredprairie.us out of some tens of thousands that it's still processing |
23:11 | <Hixie> | that's low enough that they might be typos! |
23:11 | <blooberry> | indeed. 8-} |
23:11 | <blooberry> | I'll get a URL list for those now...should take a couple minutes. There could be some URL overlap there too. |
23:12 | <Lachy> | 0.0008% seems quite insignificant |
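[Editor's sketch] The percentage can be checked from the figures quoted above (29, 14 and 12 cases out of roughly 3.5 million). The counts are cases rather than unique URLs, and the total is an approximate figure from the log, so the result is only indicative. The names `TOTAL_PAGES` and `percent_of_pages` are hypothetical.

```python
# Figures quoted in the log; the total is approximate.
TOTAL_PAGES = 3_500_000
CASES = {"vote-for": 29, "vote-against": 14, "vote-abstain": 12}


def percent_of_pages(count, total=TOTAL_PAGES):
    """Express count as a percentage of total."""
    return 100.0 * count / total


if __name__ == "__main__":
    for name, count in CASES.items():
        print(f"{name}: {percent_of_pages(count):.5f}%")
    print(f"all values: {percent_of_pages(sum(CASES.values())):.5f}%")
```

On these figures, 0.0008% corresponds to the vote-for cases alone; all three values together come to roughly twice that, which is still negligible.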
23:12 | <Lachy> | I don't need it |
23:13 | <Philip`> | Hmph, lots of web pages use attributes |
23:13 | <Lachy> | Just needed to know whether or not it had any real world usage, in order to say whether or not it should be included in an HTML reference I'm reviewing |
23:13 | Philip` | 's interesting-attribute extraction thing has got to 500MB of output from 50K pages |
23:14 | <Philip`> | blooberry: By the way, I don't know if you noticed but it looks like almost all of cnn.com has been removed from dmoz.org |
23:15 | <Hixie> | man i wish people wouldn't say things like "in 8.2.2.1 step 3" |
23:15 | <Hixie> | who knows what that will refer to by the time i reply to the e-mail |
23:15 | <Lachy> | LOL |
23:16 | <Dashiva> | "In line 8137..." |
23:16 | <blooberry> | philip`: hadn't noticed *cheers* |
23:17 | <Lachy> | Hixie, maybe you should add a big note to the top of the spec, just above the TOC that says section numbers are subject to change, please refer to sections by title (or something equally descriptive) |
23:18 | <Hixie> | the people who would notice that are clever enough to work it out for themselves :-) |
23:18 | <blooberry> | philip`: I have had some similarly large reports for some web page factors |
23:20 | <Philip`> | I don't want to use up all my disk space because then I'll have to try to figure out the LVM commands to add an extra partition and I'll probably break everything horribly |
23:20 | <Lachy> | the TOC is likely to be one of the first things people look at when they load the spec, before going to somewhere more specific. I'm sure they would see it if it were big, red and blinking. |
23:21 | <Dashiva> | It may blink, but parts of it must not flash for more than 3 seconds. |
23:28 | <gsnedders> | Hixie: am I guilty (of using numbers)? |
23:28 | <gsnedders> | I know at times I use both number and section title |
23:30 | <blooberry> | lachy: ah...just noticed the results list came back for rev=vote-*. only 37 URLs total, and 8 of those were on mp3.com URLs. |
23:30 | <blooberry> | philip`: Interesting. It didn't find either of those URLs that you mentioned. 8-{ *looks to see if they are actually in my database* |
23:30 | <blooberry> | (my crawl is from ~3 months ago though) |
23:34 | <Philip`> | http://www.ehlinelaw.com http://www.wiredprairie.us http://maxicine.com/cine/criticas-cine-moriras-en-tres-dias-31303 is all, out of 125K from a few days ago |
23:34 | <Hixie> | the rev=vote-against on http://www.wiredprairie.us is bogus |
23:34 | <Hixie> | (it's to a javascript: uri!) |
23:35 | <Hixie> | it's talking about what the user is voting against |
23:35 | <Hixie> | not what the page is voting against |
23:36 | <Hixie> | and http://www.ehlinelaw.com seems to have vote-for on almost all internal links |
23:36 | <Hixie> | looks like a misguided SEO attempt |
23:39 | <blooberry> | hmm. my script isn't picking up the REV from ehlinelaw...interesting and weird. |
23:41 | <Philip`> | blooberry: Is that something like case sensitivity, or more significant misparsing of the page? |
23:42 | <blooberry> | it could be. This is worth tracking down. 8-} |
23:48 | <blooberry> | ah...heheh. not a bug. In my output I was just putting it in a different section that I had hidden. It does find it, so it must be a more recent addition. |
23:57 | <annevk> | nice, getSVGDocument() == contentDocument now |
23:57 | <annevk> | would've been better if the former was never introduced, but still |