01:19 | Philip` | wonders if it'd be interesting to see what strings <acronym> and <abbr> are currently used for, to see if they're just interchangeable in practice, or if it's best to stay far away from the whole topic |
01:22 | <Ketsuban> | The main problem with asking people to distinguish between <acronym> and <abbr> is that they don't know what the difference is (that is to say, an acronym is read out as the letters, like RSPCA, whereas an abbreviation is said, like NATO). |
01:22 | <Ketsuban> | I myself advocate keeping <acronym> around and saying people SHOULD use <acronym> for acronyms, but MAY use <abbr> if they don't know the difference. |
01:23 | <Ketsuban> | This is the friendliest solution for Web developers, but for developers of e.g. screen reading software it's pretty nightmarish. |
01:24 | <Philip`> | The main problem is that people who say they do know the difference disagree on what the difference is :-) |
01:24 | <Philip`> | Wikipedia says "The word acronym was coined during the mid-20th century for abbreviations pronounced as words, such as NATO and AIDS." |
01:24 | <Ketsuban> | Unfortunately that's not modern usage. =P |
01:24 | <Ketsuban> | I was taught acronyms are read out as letters. |
01:26 | <Philip`> | That's what I mean about people disagreeing |
01:27 | <Dashiva> | Is there anyone who disagrees that an acronym is also an abbreviation, regardless of what an acronym is defined to be? |
01:27 | <Ketsuban> | I don't disagree there. |
01:28 | <Ketsuban> | So I suppose that's the strongest argument you can give for dropping <acronym> altogether. |
01:31 | <Ketsuban> | But I think keeping <acronym> around but allowing unconditional use of <abbr> is marginally friendlier to the makers of screen readers etc. |
01:34 | <Philip`> | Screen readers are only helped if <acronym> is used mostly correctly, and if it's better to trust the markup than to guess, and it doesn't seem clear that that's the case |
01:41 | <annevk> | Philip`, that be useful, yes |
01:41 | <annevk> | would be, even |
01:42 | Philip` | should probably expand his collection of pages so there's a more useful amount of data |
01:48 | <mpt> | Do any existing screenreaders treat <abbr> differently from <acronym>? |
01:48 | <Philip`> | Do any treat them differently to <span>? |
01:49 | <mpt> | That's part of what I'm wondering :-) |
02:08 | <Philip`> | There's http://philip.html5.org/data/abbr-acronym.txt of quite limited usefulness or quality |
02:11 | <Hixie> | dunno what the two most common ones are from, but they seem highly pointless |
02:11 | <Hixie> | ... title="CD">CD<... |
02:12 | <Philip`> | They're almost all http://www.imusic.dk/ |
02:12 | <Hixie> | ah |
02:13 | <Hixie> | well this is pretty convincing data as far as the elements being pointless goes |
02:13 | <Hixie> | as in, having both |
02:13 | <Hixie> | vs having one |
02:14 | <Philip`> | It looks like it wouldn't be good for a screen reader to try pronouncing <acronym>s as single words |
02:14 | <Philip`> | not to read out the individual letters |
02:14 | <Philip`> | which leaves them with zero good options |
02:15 | <Philip`> | s/not/nor/ |
02:24 | Philip` | happens to see http://friendlybit.com/html/encyclopedia-of-html-elements/ saying "ACRONYM: No need to use this one, abbr is enough. Do we really need to differ between acronyms and abbreviations? What about initialisms and the other types of words?" |
02:25 | <Hixie> | screen readers are going to read it the same way they read text/plain |
02:25 | <Hixie> | which is to say, using their dictionary |
02:25 | <Hixie> | and heuristics |
02:26 | <jruderman> | <portmanteau> |
02:27 | <jruderman> | it could be a solution to the lame debate in http://en.wikipedia.org/wiki/Talk:Portmanteau#The_ubiquity_of_portmanteau : users could configure their browsers to display <portmanteau> differently |
02:29 | <Hixie> | "lame debate" is a redundant descriptor when linking to a URI with wiki/Talk: in it |
02:33 | <jruderman> | hehe |
11:26 | <takkaria> | I'm talking on a channel full of geeks who read the comic xkcd |
11:26 | <takkaria> | and it's amazing just how badly they misunderstand xhtml/html/rendering/css/the lot |
11:26 | <takkaria> | 11:31 <+kremlin> You're suggesting that Firefox should parse XML files as if they were XHTML files, xipietotec? |
11:26 | <takkaria> | 11:31 <+kremlin> The file extensions are different for a reason, you know. |
11:27 | <takkaria> | makes me wonder what hope the rest of the world has, really |
11:27 | <annevk> | rest of the world uses HTML :p |
11:28 | <takkaria> | apparently firefox 2 doesn't support XHTML so it just renders it as HTML |
11:28 | <Ketsuban> | Part of me thinks Firefox should render XHTML without any default themes at all beyond setting display: for the appropriate elements and styling form elements appropriately. |
11:30 | <jwalden> | eurgh |
11:31 | <Ketsuban> | But then I have really insane ideas sometimes. =P |
11:33 | <jwalden> | meh, we're all mad here |
11:36 | <takkaria> | now someone's saying that sometimes web servers don't serve all files ending in .html as text/html because sometimes it does content-sniffing |
11:36 | <takkaria> | sadly, I'm muted on that channel, so I can't join in the debate anymore |
11:37 | <jwalden> | *web servers* doing content sniffing? sheesh |
11:38 | <annevk> | in theory it was the idea that web servers would do that |
11:38 | <annevk> | for <meta http-equiv> for instance |
11:40 | <jwalden> | no kidding |
11:40 | <jwalden> | learn something new every day! |
12:18 | <webben> | takkaria: Firefox2 does support XHTML. |
12:19 | <webben> | It will render it as HTML only if you serve it as text/html |
12:19 | <takkaria> | webben: I know that, and now the person who told me that does too |
12:20 | <webben> | ok |
12:20 | <webben> | misunderstood what you meant by 'apparently' |
12:21 | <webben> | takkaria: there's a channel for xkcd readers? |
12:23 | <takkaria> | webben: kk. irc.xkcd.net |
12:24 | <takkaria> | or .com, I forget |
12:24 | <takkaria> | particularly #xkcd-signal; you get muted if you say something someone else has said before |
12:24 | <takkaria> | every time you do, your mute time gets doubled |
12:24 | <takkaria> | and every six hours it halves again |
12:24 | <webben> | hmm interesting |
12:24 | <webben> | ta |
15:21 | Lachy | attempted to go skiing today. |
15:22 | <Lachy> | unbelievably, the rental places didn't have skis with bindings large enough for my boots :-( |
15:25 | <jgraham_> | Oh so by "attempted" you actually mean "failed" |
15:26 | <Lachy> | yeah. |
15:26 | <jgraham_> | (I assumed you just meant you had been and were not very good) |
15:27 | <Lachy> | I do own my own crappy old straight skis that I bought second hand, but assumed I would be able to rent better skis |
15:27 | <Lachy> | so I didn't bother taking them |
15:27 | <Lachy> | I'm going to go buy some new skis tomorrow morning |
15:28 | jgraham_ | has only been skiing on crappy dry slopes and even then not for many years |
15:34 | <gsnedders> | annevk: do browsers return the last header if you request Content-Type (or something else relevant to the protocol) and there are multiple headers of the type? What if there are occurrences of the header in the trailer of a chunked response? |
15:35 | gsnedders | wishes he could go skiing more often than once every few years :( |
15:39 | Philip` | wonders if 'sliding uncontrollably down a dry ski slope and sometimes not falling over' counts as skiing |
15:40 | <didymos> | Philip`, is there any other way? :) |
15:40 | gsnedders | thinks not |
15:40 | <Philip`> | Not in my personal experience :-) |
15:40 | gsnedders | has never fallen over on a dry ski slope |
15:41 | <gsnedders> | I've never been on one either, but hey. |
15:41 | gsnedders | can just go up to Glenshee for the day |
15:43 | <hsivonen> | Philip`: the newes copy of your dmoz URL that I have downloaded is from July. Do you have a newer URL set available for download? |
15:44 | <hsivonen> | newest even |
15:45 | <Philip`> | hsivonen: I don't have a newer one - I've just been using one from before 2007-07-15 |
15:45 | <Philip`> | (and it probably has the broken & bits in it) |
15:46 | <hsivonen> | Philip`: ok |
15:47 | <hsivonen> | I have dmoz-unique-pages.txt.gz and dmoz-unique-pages-shuffle.txt.gz that are significantly different in size |
15:48 | <Philip`> | (If I remember correctly, it just came from http://rdf.dmoz.org/rdf/content.rdf.u8.gz and Perl regexps to extract the links, then sort and uniq) |
15:48 | <Philip`> | hsivonen: The uncompressed sizes should be identical |
15:48 | <hsivonen> | ok. |
15:48 | <Philip`> | but the shuffling hurts the compression a lot |
15:48 | <hsivonen> | how did you do the shuffling? |
15:48 | <Philip`> | Ideally I would have done it with 'sort -R' |
15:50 | <Philip`> | except that didn't actually shuffle things at all when I first tried it, so I just wrote a line of Perl to build an array of [rand(), $uri] and then sorted by the random field and then printed it out again, which took a couple of gigabytes of memory and is not necessarily the best method |
15:50 | <Philip`> | ('sort -R' works on one computer I use, but not on another, which is weird) |
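The shuffle described above (pair each URI with a random number, sort on that number, print the URIs back out) can be sketched in Python rather than Perl. This is only an illustration: the function name `shuffle_lines`, the `seed` parameter, and the example URIs are assumptions, not anything from the original one-liner.

```python
import random


def shuffle_lines(lines, seed=None):
    """Return the lines in random order using decorate-sort-undecorate."""
    rng = random.Random(seed)
    # Pair every line with a random key, like the [rand(), $uri] array.
    decorated = [(rng.random(), line) for line in lines]
    # Sort on the random key only, then throw the keys away.
    decorated.sort(key=lambda pair: pair[0])
    return [line for _, line in decorated]


# Hypothetical input; the real list came from the dmoz dump.
uris = ["http://a.example/", "http://b.example/", "http://c.example/"]
shuffled = shuffle_lines(uris, seed=1)
```

Like the Perl version, this holds the whole list plus one key per entry in memory. Python's `random.shuffle` permutes a list in place without the extra keys, which is probably the lighter choice when the list already fits in memory.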
16:00 | <Philip`> | hsivonen: If you're doing stuff with pages, http://canvex.lazyilluminati.com/misc/Test2.java might have some salvageably useful bits though it's full of bad ideas and copied-and-pasted chunks of code |
16:03 | <Philip`> | (I haven't even changed the original filename which it began evolving from long ago...) |
16:07 | <hsivonen> | Philip`: thanks |
16:18 | <hsivonen> | let's see what happens if I feed the first 10000 URLs from shuffle to Validator.nu |
16:19 | Philip` | wonders how long that will take |
16:20 | <Philip`> | (Parallelism definitely helps here, and is relatively trivial, which is nice) |
16:20 | <hsivonen> | this script is so simple that it doesn't have parallelism |
16:20 | <hsivonen> | I just run a simple python script on my own computer that feeds URIs sequentially to the Validator.nu Web service API |
16:21 | <Philip`> | How does it handle things like timeouts? |
16:21 | <hsivonen> | Philip`: it doesn't |
16:21 | <Philip`> | If it pauses for 30 seconds a few hundred times, it's going to be a bit painful |
16:21 | <hsivonen> | it's very likely that the setup is too simple |
16:22 | <hsivonen> | Philip`: Validator.nu itself has timeouts on its outgoing requests |
16:22 | <Philip`> | At least that's better than being too complex, so it sounds like a good place to start :-) |
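The two concerns raised above, timeouts and parallelism, can be sketched as follows. This is a hedged illustration and not hsivonen's script: `fetch_status` and `check_all` are hypothetical names, and `fetch_status` only fetches a URL directly. A real run would swap in whatever function calls the Validator.nu Web service, whose request format is not shown in the log.

```python
from concurrent.futures import ThreadPoolExecutor
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return the HTTP status for url, or an error string on failure."""
    try:
        # The timeout keeps one slow server from stalling the run for long.
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except Exception as exc:  # broad on purpose: a crawl should keep going
        return "error: %s" % exc


def check_all(urls, check=fetch_status, workers=8):
    """Run check on every URL in parallel; results keep the input order."""
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(zip(urls, pool.map(check, urls)))


# Usage (performs network requests, so it is left commented out):
# for url, result in check_all(["http://example.com/"]):
#     print(url, result)
```

With a handful of worker threads, a few hundred 30-second stalls overlap with useful work instead of adding up one after another.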
16:32 | <SadEagle> | hmmm, lots of canvas changes |
16:33 | <Philip`> | and a lack of updated tests for those changes |
16:33 | <SadEagle> | so I am gonna be lazy for a bit :-) |
16:34 | <SadEagle> | thank goodness I have a centralized place to change just about all of the +/- inf and NaN handling |
16:37 | <Philip`> | Unfortunately I don't have a centralised place for that |
16:39 | Philip` | wonders how to make http://canvex.lazyilluminati.com/tests/tests/* redirect to http://philip.html5.org/tests/canvas/suite/tests/* |
16:41 | <Philip`> | Oh, with "Redirect" - that was easy |
17:17 | <hsivonen> | hmm. curiously, the 10000 url script started getting 503 from Validator.nu at some point without Validator.nu crashing |
17:18 | <hsivonen> | I wonder if mod_jk has some kind of DoS prevention that kicked in |
17:18 | <hsivonen> | or Apache itself |
17:56 | <annevk> | wow, that image maps are still so widely deployed |
18:01 | <hsivonen> | hmm. perhaps there are even more than one application/xhtml+xml site in the Alexa globar 500 |
18:01 | <hsivonen> | global |
18:01 | <annevk> | wow |
18:07 | <Philip`> | What is the " 1 www.icio.us"? |
18:08 | <hsivonen> | Philip`: probably a regexping error |
18:08 | <Philip`> | It would be nice to show the number of pages that contain each error, rather than the total count |
18:08 | <Philip`> | s/rather than/as well as/ |
18:11 | <Philip`> | http://my.opera.com/community/forums/topic.dml?id=163885&t=1202062644&page=1#comment2212326 - yay, XML |
18:20 | <blooberry> | hsivonen: did you find more than one application/xhtml+xml site in the alexa global 500 then? The "perhaps" makes me curious. 8-} |
18:21 | <blooberry> | I only found one... |
18:26 | <hsivonen> | blooberry: I rechecked. there's only one. |
18:26 | <webben> | hsivonen: How are you requesting XHTML? |
18:26 | <blooberry> | iwiw.hu? |
18:26 | <hsivonen> | blooberry: yes |
18:28 | <hsivonen> | webben: Accept: application/xhtml+xml, application/xml; q=0.5, text/html; q=0.9 |
18:38 | <hsivonen> | Philip`: each page counted at most once per error: http://hsivonen.iki.fi/test/moz/alexa500-page-collapsed-counts.txt |
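The difference between the total count of an error and the number of pages containing it can be sketched as below. The input shape (a dict from URL to a list of messages) and the name `count_errors` are assumptions made for illustration; the log does not say how the published lists were actually produced.

```python
from collections import Counter


def count_errors(pages):
    """pages maps a URL to the list of error messages reported for it."""
    total = Counter()     # every occurrence of a message
    per_page = Counter()  # pages on which the message occurs at least once
    for messages in pages.values():
        total.update(messages)
        # A set collapses repeats, so each page counts at most once per error.
        per_page.update(set(messages))
    return total, per_page


# Hypothetical data: one page repeats the same error twice.
pages = {
    "http://a.example/": ["stray tag", "stray tag", "bad value"],
    "http://b.example/": ["stray tag"],
}
total, per_page = count_errors(pages)
```

Here `total["stray tag"]` is 3 while `per_page["stray tag"]` is 2, which is the distinction Philip` asked to see side by side.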
21:24 | <gavin_> | hsivonen: www.iwiw.hu is sending text/html as far as I can tell... |
21:24 | <gavin_> | did it just change or something? |
21:25 | <gavin_> | I've tried a few different UAs, too |
21:28 | <hsivonen> | gavin_: Page Info says application/xhtml+xml in Minefield nightly |
21:30 | <gavin_> | ah, interesting |
21:30 | <Dashiva> | Same in Opera |
21:31 | <gavin_> | that is what I get for http://iwiw.hu/pages/user/login.jsp |
21:32 | <hsivonen> | www.iwiw.hu redirects me to http://www.iwiw.hu/pages/user/login.jsp which is application/xhtml+xml to Minefield |
21:33 | <gavin_> | yeah, I see that too |
21:33 | <gavin_> | but if I load the login.jsp URL in IE7, it works |
21:33 | <gavin_> | I thought IE7 barfed on application/xhtml+xml ? |
21:33 | <hsivonen> | most likely it varies the Content-Type on Accept |
21:34 | <gavin_> | ah, right |
21:34 | <webben> | gavin: curl -H 'accept: application/xhtml+xml,text/html;q=0' -v http://www.iwiw.hu/pages/user/login.jsp returns Content-Type: application/xhtml+xml;charset=UTF-8 |
21:34 | <gavin_> | web-sniffer doesn't let me change Accept |
21:35 | <webben> | as does accept: application/xhtml+xml;q=0,text/html ... fail! |
21:35 | <webben> | curl -H 'accept: text/html' -v http://www.iwiw.hu/pages/user/login.jsp returns text/html |
21:36 | <webben> | someone tell them their content negotiation is borked ;) |
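What correct handling of those Accept headers would look like can be sketched with a small q-value parser. This is a simplified illustration under stated assumptions: the function names are hypothetical, only exact media types and `*/*` are matched (no `type/*` ranges), and it is not a description of what the iwiw.hu server does.

```python
def parse_accept(header):
    """Parse an Accept header into a {media_type: q} dict (q defaults to 1.0)."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        prefs[media_type] = q
    return prefs


def choose_type(header, offered=("application/xhtml+xml", "text/html")):
    """Pick the offered type with the highest q; q=0 means 'not acceptable'."""
    prefs = parse_accept(header)
    best, best_q = None, 0.0
    for media_type in offered:
        q = prefs.get(media_type, prefs.get("*/*", 0.0))
        if q > best_q:
            best, best_q = media_type, q
    return best
```

Under this reading, `application/xhtml+xml;q=0,text/html` should select `text/html`, which is why webben calls the server's XHTML response to that header a failure.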
21:50 | <Philip`> | hsivonen: About error counts: Thanks |
21:51 | <Philip`> | Comparing to http://canvex.lazyilluminati.com/survey/2007-07-17/analyse.cgi/index#parse-errors there's a significant difference in the number with unencoded ampersands |
21:51 | <Philip`> | which I'd assume is due to top-500 sites being more likely to have dynamic pages with query strings needing ampersands, so that sounds quite plausible |
22:16 | <Hixie> | Lachy: if you have a chance, any way we can set up blog.whatwg.org/faq to be a redirect to the wiki faq? |
22:22 | <Lachy> | oh, damn. That had been done before, but with the last upgrade, I accidentally deleted the .htaccess. |
22:22 | <Lachy> | I'll check if I have a backup |
22:23 | <Lachy> | good, I do. Uploading it now |
22:25 | <Hixie> | thanks |
22:25 | <Hixie> | i fixed the link on the front page too |
22:25 | <Hixie> | (someone complained it was 404) |
22:25 | <Hixie> | so that shouldn't be a big issue any more |
22:30 | <Lachy> | all fixed |
22:31 | <Lachy> | ah, it looks like you removed the link entirely |
22:32 | <Hixie> | no i mean on www.whatwg.org |
22:32 | <Hixie> | made it point to the wiki |
22:33 | <Lachy> | oh, I thought you meant the blog's front page. The link to the FAQ seems to be missing from there |
22:33 | <Hixie> | odd |
22:33 | <Hixie> | didn't touch the blog |
22:33 | <Hixie> | update fallout? |
22:34 | <Lachy> | oh, maybe I never added the links again, after I moved the faq from the blog to the wiki |
22:53 | <zcorpan_> | hsivonen: in order to make the grouping feature more useful, perhaps the attribute value should be stripped from the message for errors about attribute values |
22:56 | <zcorpan_> | (or the grouping feature should just be smarter and do the same as you did with the Alexa result list) |
22:59 | <zcorpan_> | looking at the bottom of that list shows that unmatched quotes is relatively common |
23:00 | <zcorpan_> | the very last error also seems harmless |
23:01 | <zcorpan_> | banning ; in attribute names might also be effective |
23:11 | <zcorpan_> | hsivonen: though, the grouping feature is intended to help with search-replace fixing rather than fix-the-spec-studying, so what to group might be different |
23:15 | <zcorpan_> | 0022 / 491 Bad value (consolidated) for attribute “width” on element “img”: Zero is not a positive integer. |
23:15 | <zcorpan_> | i think that's <noscript><img src=tracker.cgi with=0 height=0> |
23:15 | <zcorpan_> | s/with/width/ |
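The grouping idea, stripping the attribute value so that otherwise identical messages fall into one bucket, can be sketched with a regular expression. The unconsolidated message shape used here (`Bad value “0” for attribute ...`) is an assumption inferred from the consolidated form quoted above, and `consolidate` is a hypothetical helper, not part of Validator.nu.

```python
import re

# Assumed shape of the raw message: Bad value “0” for attribute “width” ...
_VALUE = re.compile(r"^Bad value “[^”]*” for attribute")


def consolidate(message):
    """Replace the quoted attribute value so equal errors group together."""
    return _VALUE.sub("Bad value (consolidated) for attribute", message)


raw = "Bad value “0” for attribute “width” on element “img”: Zero is not a positive integer."
grouped = consolidate(raw)
```

Messages that do not start with the assumed prefix pass through unchanged, so the helper is safe to apply to every line before counting.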
23:20 | <zcorpan_> | we might also want to allow width/height on <input type=image> |
23:24 | <zcorpan_> | 0013 / 491 Stray “script” start tag. |
23:24 | <zcorpan_> | when can a script start tag be stray? |
23:26 | zcorpan_ | ponders about <noscript><style scoped> |