MikeSmith: the thing is, that omitting is a legitimate way to end the
[05:12:01.0000]
hmm, yeah, I realize that now
[05:13:00.0000]
god, all this stuff must make building a conformance checker a major PITA
[05:13:01.0000]
:)
[05:15:00.0000]
right now, the PITA is that Jigsaw doesn't print informative diagnostics when stuff fails :-)
[05:16:00.0000]
I saw you had mentioned Jigsaw but I'm clueless so far about what you need it for
[05:16:01.0000]
what problem does it potentially solve for you?
[05:16:02.0000]
MikeSmith: getting the W3C to run an instance of Validator.nu under their preferred container
[05:17:00.0000]
ah
[05:17:01.0000]
that would definitely be really nice to have
[05:20:00.0000]
hsivonen: getting back to the HTML5 schema, am I confused, or is it the case that if you expand the content-model references out, common.inner.phrase just amounts to text & notAllowed
[05:20:01.0000]
hmm. interesting. when the servlet-relative path is "/", Jigsaw gives it as null
[05:21:00.0000]
MikeSmith: each phrase-level element definition augments that stub definition
[05:21:01.0000]
OK
[05:23:00.0000]
hsivonen: I see now... I just need to quit being lazy and to actually read the schema
[05:34:00.0000]
do all browsers default to submitting the form to base uri if the action attribute on the form is missing?
[05:34:01.0000]
hsivonen: do you know of any tools that are able to generate a flattened version of an rng/rnc schema with the combine=choice definitions for a pattern actually combined into a single definition?
[05:35:00.0000]
MikeSmith: I'm not aware of such a tool, but here's a guess
[05:35:01.0000]
you might get that result if
[05:35:02.0000]
you run Trang to convert the schema to RELAX NG XML syntax
[05:36:00.0000]
and then run Kohsuke Kawaguchi's schema converter to convert the schema from RELAX NG to RELAX NG
[05:37:00.0000]
but that's just a guess
[05:37:01.0000]
then you could run Trang again to convert to compact syntax to make the result human-readable :-)
[05:37:02.0000]
MikeSmith: Trang preserves the structure of the schema
[05:38:00.0000]
yeah, tried trang .. doesn't do it, unfortunately -- or fortunately, depending on how you look at it. trang faithfully preserves the RNC structure in RNG output in such a way that it seems like it's actually round-trippable
[05:38:01.0000]
MikeSmith: Kohsuke Kawaguchi's converter builds an abstract model and reserializes it without preserving structure
[05:38:02.0000]
but IIRC, his tool doesn't read compact syntax
[05:39:00.0000]
I tried Dave Tolpin's incelim and it doesn't combine them either
[05:39:01.0000]
hence, the need to use Trang, too
[05:39:02.0000]
hsivonen: OK
[05:39:03.0000]
will try Kohsuke's tool
[05:40:00.0000]
/me apologizes again for not actually reading carefully what hsivonen wrote above
[05:40:01.0000]
I'll shut up now :)
[05:40:02.0000]
for a while at least
[05:46:00.0000]
w00t. I got Validator.nu to run inside Jigsaw. (without file upload support, without gzip support and without non-ASCII input support)
[05:49:00.0000]
hsivonen: congrats
[05:49:01.0000]
MikeSmith: thanks. now I need to document what I did. :-)
[09:18:00.0000]
hsivonen: any clues on getting Kohsuke's rngconv working with the HTML datatype library?
[09:19:00.0000]
hsivonen: I'm trying to run a conversion, but I'm getting "http://whattf.org/datatype-draft" is not a recognized data type vocabulary"
[09:29:00.0000]
hsivonen, othermaciej just saw your dialog re: namespaces earlier (last night) http://krijnhoetmer.nl/irc-logs/whatwg/20080801#l-154
[09:30:00.0000]
feel free to add a new section (or sections), like "implementation experience" and/or "fundamental software engineering error" to http://microformats.org/wiki/namespaces-considered-harmful
[09:34:00.0000]
MikeSmith: no clue. do you have the library in classpath?
[09:35:00.0000]
tantek: I recently started a wiki page, too: http://wiki.whatwg.org/wiki/Namespace_confusion
[09:35:01.0000]
not much there yet
[09:36:00.0000]
still, a good collection
[09:36:01.0000]
feel free to link to your page also from http://microformats.org/wiki/namespaces-considered-harmful
[09:37:00.0000]
tantek: ok. I will. (gotta run now, though)
[09:44:00.0000]
hsivonen: yeah, I got the dist/html5-datatypes.jar subdir of my http://svn.versiondude.net/whattf/syntax/trunk/relaxng/datatype/java working directory
[09:44:01.0000]
it's just that one jar file, right?
[12:08:00.0000]
/me wonders if getting involved with this thread was a good idea after all
[12:35:00.0000]
it wasn't
[12:36:00.0000]
neither is the way trackback/pingback have been brought up at all
[12:42:00.0000]
maybe I should set a cron job to email http://xkcd.com/386/ to me every morning...
[12:45:00.0000]
the debate on extensibility is fundamentally a religious one, I don't see how either side will ever buckle
[12:46:00.0000]
I have this suspicion that HTML5 will never become a W3C recommendation as a result of this and other permadiscussions
[14:29:00.0000]
libxml2's APIs suck a little bit
[15:13:00.0000]
"And authors want to add metadata. Instead of forcing it into containers that haven't been designed for it (@title, @data-*), let them do it properly." -- http://lists.w3.org/Archives/Public/public-html/2008Aug/0023.html
[15:13:01.0000]
I don't get what other way would be considered the proper way to embed metadata, beyond the mechanisms designed for adding metadata?!
[15:15:00.0000]
if, as Julian claims, title and data-* weren't designed for adding some type of metadata, then I must be missing something.
2008-08-02
[18:13:00.0000]
Hixie, why does it matter if an image is stretched or not, for the purpose of conformance?
[18:13:01.0000]
why does it matter if a tag is closed or not, for the purpose of conformance?
[18:14:00.0000]
in almost all cases, stretching an image is a mistake. in the remaining cases where it is intentional, it's dubious practice. this argues for the author being notified of the problem.
[18:15:00.0000]
same as with anything else that is a conformance error
[20:39:00.0000]
Could anyone point to an SVG to Canvas converter?
[01:48:00.0000]
MikeSmith: yeah, it's just that one jar file
[01:48:01.0000]
hsivonen: yeah, I got it figured out finally
[01:49:00.0000]
I was using "java -jar" to run it
[01:49:01.0000]
and that caused it to ignore the datatype library despite the fact I had it in my classpath
[01:50:00.0000]
MikeSmith: the -jar switch sucks
[01:50:01.0000]
yeah
[01:50:02.0000]
I should have known better than to try it
[01:52:00.0000]
anyway, I did manage to be able to run it, but not without fatal errors
[01:52:01.0000]
running against the HTML5 schema, I get java.lang.StackOverflowError
[01:53:00.0000]
at com.sun.msv.grammar.ExpressionCloner.onChoice(ExpressionCloner.java:37)
[01:54:00.0000]
anyway, I'm giving up on it
[01:55:00.0000]
and instead just writing a stylesheet to pre-process the schema and combine all the combine=choice stuff
[02:06:00.0000]
MikeSmith: msv is no good with the default stack size of hotspot
[02:07:00.0000]
-XX:ThreadStackSize=2048
[02:10:00.0000]
hsivonen: OK, trying now
[02:12:00.0000]
getting same error even with that switch
[02:13:00.0000]
but anyway, the idea is kind of a no-go regardless, due to the fact that it's going to restructure the whole schema
[02:14:00.0000]
the only thing I really want is to just consolidate the @combine=choice stuff, which I think I can manage to do with XSLT
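The consolidation described here can be sketched outside XSLT as well. The following is a minimal Python sketch, not the stylesheet that was actually written: it assumes the schema is already in RELAX NG XML syntax, that all the definitions sit directly under one `<grammar>` element (no includes or nested `div`s), and the pattern names in the sample are made up for illustration.

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"
DEFINE = f"{{{RNG_NS}}}define"

# Hypothetical sample grammar; the pattern names are made up for illustration.
SAMPLE = """\
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <define name="inner.phrase"><notAllowed/></define>
  <define name="inner.phrase" combine="choice"><ref name="em.elem"/></define>
  <define name="inner.phrase" combine="choice"><ref name="a.elem"/><text/></define>
  <define name="other"><text/></define>
</grammar>"""


def merge_choice_defines(grammar):
    """Fold same-named top-level <define>s that combine by choice into one."""
    by_name = {}
    for define in grammar.findall(DEFINE):
        by_name.setdefault(define.get("name"), []).append(define)

    for defines in by_name.values():
        combine_values = {d.get("combine") for d in defines} - {None}
        if len(defines) < 2 or combine_values != {"choice"}:
            continue  # nothing to merge, or combined by interleave
        choice = ET.Element(f"{{{RNG_NS}}}choice")
        for define in defines:
            patterns = list(define)
            if len(patterns) == 1:
                choice.append(patterns[0])
            else:
                # Several child patterns in one <define> form an implicit group.
                ET.SubElement(choice, f"{{{RNG_NS}}}group").extend(patterns)
        first, *rest = defines
        first[:] = [choice]               # replace the first definition's content
        first.attrib.pop("combine", None)
        for define in rest:
            grammar.remove(define)        # drop the now-redundant definitions
    return grammar


merged = merge_choice_defines(ET.fromstring(SAMPLE))
```

Unlike a converter that rebuilds the schema from an abstract model, this leaves every other part of the grammar where it was, which is the property being asked for.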
[03:18:00.0000]
oh, I wish what I wrote in IRC didn't get dragged into the thread on the mailing list :-(. I chose not to respond on the list for a reason.
[03:22:00.0000]
i hope mike is happy: http://www.whatwg.org/specs/web-apps/current-work/#fetching :-)
[03:25:00.0000]
/me smiles
[03:25:01.0000]
Hixie: cool
[03:27:00.0000]
"At a time convenient to the user and the user agent" is an interesting phrase
[03:30:00.0000]
heh. It makes it sound like my browser should book an appointment with me to do the download :-)
[05:21:00.0000]
nn
[15:26:00.0000]
i think i might make alt="" required and say that when you don't know what the image is, you have to say what kind of image it is (e.g. "uploaded image", "photo", "thumbnail", or whatever) and put that in braces in the alt="" attribute, as in alt="{photo}"
[15:26:01.0000]
and that that is never allowed inside a link
[15:27:00.0000]
(in a link it should just give the text that is appropriate for the link, e.g. alt="View image" if it's a link to the image)
[15:29:00.0000]
this seems to handle all the cases that allowing alt="" to be omitted does, while very mildly improving the accessibility of those images, and being slightly more compatible with legacy UAs
[15:29:01.0000]
(it affects some legacy pages, though not many, and certainly no more than the missing-alt-altogether case would)
[15:40:00.0000]
seems like a wise decision
[15:47:00.0000]
Hixie, why use curly braces instead of square brackets?
[15:48:00.0000]
Lachy: it's more exotic
[15:48:01.0000]
and probably less likely to conflict
[15:49:00.0000]
ok, makes sense
[16:02:00.0000]
Lachy: [] are used a lot already
[16:02:01.0000]
Lachy: {} not so much
[16:02:02.0000]
Lachy: iirc <> is used even less, but that causes problems in xml or something
[16:02:03.0000]
/me logs on to his vpn to get the numbers
[16:03:00.0000]
< >, if anyone remembers
[16:05:00.0000]
ok here are the stats as percentages of total pages scanned
[16:05:01.0000]
pages that had an img that wasn't in a link and had a value that followed the pattern [...]: 0.45%
[16:05:02.0000]
pages that had an img that wasn't in a link and had a value that followed the pattern (...): 0.13%
[16:06:00.0000]
pages that had an img that wasn't in a link and had a value that followed the pattern {...}: 0.035%
[16:06:01.0000]
pages that had an img that wasn't in a link and had a value that followed the pattern <...>: 0.033%
[16:07:00.0000]
most common alt={...} value was alt={alpha}, most of which it seems came from pages created by one particular conversion tool
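The four patterns in these stats can be told apart with a very small classifier. This is a sketch only: the actual scan criteria are not given in the log, so it assumes a value counts when the whole alt text is wrapped in one matching pair of delimiters, and the helper names `bracket_style` and `tally` are hypothetical.

```python
from collections import Counter

# Opening delimiter -> closing delimiter for the four patterns in the stats.
DELIMITERS = {"[": "]", "(": ")", "{": "}", "<": ">"}


def bracket_style(alt):
    """Return e.g. "{...}" if the whole alt value is wrapped, else None."""
    if len(alt) >= 2 and DELIMITERS.get(alt[0]) == alt[-1]:
        return f"{alt[0]}...{alt[-1]}"
    return None


def tally(alt_values):
    """Count how many alt values fall under each wrapped pattern."""
    return Counter(style for style in map(bracket_style, alt_values) if style)
```

Note that the figures quoted above are percentages of pages, not of alt values, so a tally like this would still need to be grouped per page to be comparable.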
[16:08:00.0000]
I'm surprised (...) is that low
[16:09:00.0000]
number of pages with <img>: 94%
[16:09:01.0000]
number of pages with <img alt="">: 82%
[16:09:02.0000]
number of pages with <img> with non-empty alt: 77%
[16:10:00.0000]
number of pages with <img> that had at least one <img> without an alt="": 67%
[16:11:00.0000]
number of pages with <img> and that had all their <img> with alt="": 27%
[16:11:01.0000]
number of pages that had at least one <img> element but no <img> elements with alt="": 11%
[16:12:00.0000]
which comes out to somewhere between 29%-71% of images that don't have alt
[16:12:01.0000]
ok, well, it nicely solves the problem of distinguishing between legitimate and guessed alt text, so it seems like a reasonable solution
[16:13:00.0000]
most of the alt=""s with the pattern <...> seemed to be mistakes, e.g. 0.0015% of pages had alt="span style='background-color: #CCFFFF'>Visa"
[16:13:01.0000]
so what would authoring tools like Dreamweaver be required to insert by default? Would alt="{image}" be ok?
[16:14:00.0000]
if the author knows what the image is, then alt="{...}" is never "ok"
[16:14:01.0000]
I think apache's autoindex uses "[ TXT ]" as alt..
[16:14:02.0000]
but yeah, i guess that would be a reasonable default for the case where today they just omit alt="" or have alt="" empty without probable cause
[16:15:00.0000]
yeah, I know that. But by default when a user just drags and drops an image into the WYSIWYG editor, and doesn't enter anything for alt text into the prompt
[16:15:01.0000]
actually, just "[TXT]"
[16:15:02.0000]
jcranmer: second most common [...] alt value was [DIR] 0.085% (first was [new] 0.095%, third was [NEW] 0.066%)
[16:16:00.0000]
[DIR] is used in apache's autoindex
[16:16:01.0000]
jcranmer: [TXT] was 0.024%
[16:16:02.0000]
less common than [cpp] and [flash]
[16:16:03.0000]
and [*]
[16:17:00.0000]
most common (...) values were (+), (-) and (?)
[16:17:01.0000]
what would [cpp] be used for?
[16:17:02.0000]
\[[A-Z]+\] is quite possibly autoindexing
[16:17:03.0000]
I don't know what IIS uses, if anything
[16:17:04.0000]
(0.038%, 0.037%, and 0.0095% respectively)
[16:18:00.0000]
yeah apache's autoindexing was well represented in these results
[16:18:01.0000]
[spoiler] was common too
[16:18:02.0000]
what does [] look like if you exclude apache, which is probably a valid use case?
[16:19:00.0000]
although it would have to be ~30% of all stuff to change the rankings
[16:19:01.0000]
valid how?
[16:19:02.0000]
not sure what you mean
[16:20:00.0000]
Hixie: Hmm what turned out to be wrong with using an attribute to signal the poverty of alt text, rather than resorting to odd syntax inside the alt attribute?
[16:20:01.0000]
especially given the use-case is automatic insertion rather than hand-authoring
[16:21:00.0000]
jcranmer: the [...] values were (roughly in order): [new], [DIR], [NEW], [ ], [*], [b], [img], [i], [url], [u], [email], [quote], [flash], [fixed], [spoiler], [cpp], [strike], [TXT], [IMG], [ICO], [M], ...
[16:22:00.0000]
five of which are definitely apache
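The `\[[A-Z]+\]` heuristic suggested a little earlier can be tried against the values just listed. This is a rough sketch: the list is copied from the message above, the name `looks_like_autoindex` is hypothetical, and the pattern is only a guess at autoindex output, so it says nothing definite about which values really come from Apache.

```python
import re

# The "quite possibly autoindexing" pattern, applied to the whole alt value.
AUTOINDEX_LIKE = re.compile(r"\[[A-Z]+\]")

# The [...] values quoted in the log, in the order given there.
OBSERVED = [
    "[new]", "[DIR]", "[NEW]", "[ ]", "[*]", "[b]", "[img]", "[i]", "[url]",
    "[u]", "[email]", "[quote]", "[flash]", "[fixed]", "[spoiler]", "[cpp]",
    "[strike]", "[TXT]", "[IMG]", "[ICO]", "[M]",
]


def looks_like_autoindex(alt):
    """True if the alt value is uppercase letters wrapped in square brackets."""
    return AUTOINDEX_LIKE.fullmatch(alt) is not None


candidates = [value for value in OBSERVED if looks_like_autoindex(value)]
print(candidates)
```

The pattern misses the `[ ]` value entirely, which shows how approximate an uppercase-only filter is.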
[16:22:01.0000]
webben: it seems more likely to be misused
[16:22:02.0000]
webben: e.g. through ignorant copy-paste
[16:22:03.0000]
webben: also, coming up with a name was difficult
[16:22:04.0000]
another seven of which seem to be BB-code
[16:23:00.0000]
I'd have thought exactly the opposite was the case.
[16:23:01.0000]
[img alt="[b]SEE THIS[/b]"] ?
[16:23:02.0000]
that a weird syntax inside alt is utterly opaque
[16:23:03.0000]
although the non-existence of [/b] does seem to invalidate that theory...
[16:24:00.0000]
Hixie: Is there a list of proposed names anywhere?
[16:24:01.0000]
webben: not a convenient list, no
[16:25:00.0000]
webben: noalt, important
[16:25:01.0000]
I can't remember the others
[16:25:02.0000]
yeah, well, I agree those aren't good names ;)
[16:25:03.0000]
webben: what kind of proposal did you have in mind? ?
[16:25:04.0000]
webben: (with a better name obviously!)
[16:26:00.0000]
well, you wouldn't need the ="true" (presumably?) but yeah.
[16:27:00.0000]
that was basically my importantimage="" proposal, but nobody could come up with a good attribute name.
[16:28:00.0000]
missing-text-equivalent
[16:28:01.0000]
please-sir-can-i-have-some-more-validation
[16:29:00.0000]
webben: vs seems like a tossup as to which is better
[16:29:01.0000]
that name's still probably not ideal, but I think the former is easily better.
[16:30:00.0000]
/me would suggest testing it with some newbie authors and asking them what they think these syntaxes mean
[16:30:01.0000]
actually that's not a fair test
[16:30:02.0000]
since that clues them in that {} is a syntax, which is counter-intuitive
[16:31:00.0000]
heh
[16:32:00.0000]
i wish i could remember why i had decided to look at alt={...} rather than importantimage=""
[16:32:01.0000]
there was some more serious problem than the name, iirc, but i don't recall what
[16:33:00.0000]
(You could use self-documenting decisions, that way you won't have to document anything)
[16:33:01.0000]
not sure what that would mean here :-)
[16:34:00.0000]
the decision wasn't written down anywhere, i just did it
[16:35:00.0000]
if we documented every decision, that would take the fun out of rehashing old arguments!
[16:39:00.0000]
Hixie: I just felt like taking a jab at self-documenting code.
[16:39:01.0000]
heh
[16:42:00.0000]
webben: missing-text-equivalent doesn't really convey the right message, i'm not sure it actually helps vs {...}
[16:42:01.0000]
Hixie: is " alt-is-not-actually-a-description-but-a-category" the right message?
[16:42:02.0000]
Hixie, are you speccing the {...} feature now?
[16:43:00.0000]
the message is alt-is-not-actually-a-description-but-a-category-because-we-do-not-know-what-the-image-actually-is
[16:43:01.0000]
Lachy: i'm looking at it. it's one of the folders with the most messages
[16:44:00.0000]
Hixie: alt-is-category-only ?
[16:44:01.0000]
webben: i guess the reason i prefer {...} is that if we're going to come up with some mostly opaque syntax, i'd rather pick the most compact
[16:45:00.0000]
webben: that's pretty long, and still doesn't really help, i mean, what's a category? does that mean i can just do this on all my images? etc.
[16:46:00.0000]
I think the what's a category question is answered by the alt content itself.
[16:47:00.0000]
I like the {} syntax better because it looks ugly enough to discourage authors from using it on all their images, yet simple enough to be used where appropriate
[16:47:01.0000]
would alt="{}" be considered non-conforming?
[16:48:00.0000]
While {} might discourage authors using {}, it doesn't make it clear the alt is suboptimal. Consequently newbies could easily take away the message that alt="Photo" is a good alt.
[16:49:00.0000]
missing-text-equivalent at least hints that something's wrong.
[16:49:01.0000]
and which of these regexes best matches the proposed syntax: /^\{.+\}$/ or /^\{[^\}]+\}$/
[16:51:00.0000]
webben: the stats indicate that authors already think omitting alt altogether is fine
[16:51:01.0000]
webben: so i don't think this will make it any worse
[16:51:02.0000]
Lachy: i was just thinking /^\{.*\}$/
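The three candidate regexes differ only on edge cases, which a quick comparison makes visible. This is a sketch in Python's `re` syntax rather than whatever regex flavour a spec or checker would use, and the sample alt values and the `compare` helper are made up for illustration.

```python
import re

# The three patterns from the discussion, in the order they were proposed.
CANDIDATES = {
    "non-empty, anything inside": re.compile(r"^\{.+\}$"),
    "non-empty, no inner }": re.compile(r"^\{[^\}]+\}$"),
    "possibly empty": re.compile(r"^\{.*\}$"),
}


def compare(alt):
    """Report which candidate patterns accept the given alt value."""
    return {name: bool(rx.match(alt)) for name, rx in CANDIDATES.items()}


for sample in ["{photo}", "{}", "{a}{b}", "photo"]:
    print(repr(sample), compare(sample))
```

Only the third pattern accepts `alt="{}"`, and only the second rejects a value such as `{a}{b}` that contains a closing brace before the end.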
[16:51:03.0000]
Hixie: I don't see the logical connection between those stats and the effect of any given example.
[16:52:00.0000]
webben: i'm saying that nothing could make the current authoring practices worse
[16:52:01.0000]
Hixie, so then alt="{}" would be conforming, even though it's almost completely useless to UAs since it says nothing about what type of image it is?
[16:52:02.0000]
I don't think not making it any worse is the bar we should be setting. ;)
[16:53:00.0000]
Lachy: alt="{image}" doesn't say anything about what it is either
[16:53:01.0000]
I guess those two could be considered equivalent then
[16:53:02.0000]
fwiw it could easily be worse.
[16:53:03.0000]
webben: i don't think having a few people stick mysteriously named attributes on images is going to improve things either
[16:54:00.0000]
/me doesn't really see why not.
[16:55:00.0000]
if the problem is mysteriousness, then go for a big huge long name.
[16:55:01.0000]
then nobody will use it
[16:55:02.0000]
why would nobody use it?
[16:55:03.0000]
nobody would use it /accidentally/ ... which is precisely what you're trying to avoid
[16:56:00.0000]
it's a psychology thing -- people just don't seem to use long keywords, they shy away from them
[16:56:01.0000]
i don't know why
[16:56:02.0000]
We're not talking about hand-authoring here. We're talking about sites like Flickr processing thousands of images; and software like DreamWeaver written by professionals.
[16:56:03.0000]
i guess i'm not sold on the idea that the advantages of an attribute over just a compact syntax outweigh the disadvantages... the pros and cons on both sides seem pretty minimal
[16:57:00.0000]
that's the target audience of this attribute as I understand it.
[16:57:01.0000]
flickr output is hand-authored templates.
[16:57:02.0000]
it's still hand-authored.
[16:57:03.0000]
the templates are hand-authored; the output isn't.
[16:58:00.0000]
likewise someone writes the code for DreamWeaver
[16:58:01.0000]
s/hand-authoring/small-time authoring/ if you like
[16:58:02.0000]
my point is it's not like we can just ignore authoring because there's a computer involved -- it's still hand written at some point
[16:59:00.0000]
webben, even if the attribute were theoretically a better approach (which I'm not convinced it is), then there's still the big problem of finding an appropriate name that accurately represents its meaning for all the various use cases
2008-08-03
[17:00:00.0000]
yeah i haven't yet seen a name that i'd be proud to have in a spec with my name at the top
[17:00:01.0000]
Lachy: If the meaning is ambiguous, then that's true of {} too. If the meaning can be expressed, then it can be expressed in an attribute name.
[17:00:02.0000]
if {} is ambiguous, that may hint at a problem with the idea
[17:00:03.0000]
are there two concepts here that need separating?
[17:01:00.0000]
{}'s advantage is great compactness, its disadvantage is it's meaning is not intuitive. for an attribute to outweigh its corresponding verbosity disadvantage, it has to have a name that is intuitive
[17:02:00.0000]
s/it's/its/
[17:02:01.0000]
I think you're underestimating that {} doesn't even look like code, and therefore is likely to be misused.
[17:02:02.0000]
one can't really dispute an attribute looks like code, even if it's not obvious what it does.
[17:02:03.0000]
why would it be misused more than now?
[17:03:00.0000]
it doesn't exist now.
[17:03:01.0000]