03:50
<Hixie>
heh, this e-mail talking about discovery was itself subject to discovery, that's funny: http://edge-op.org/iowa/www.iowaconsumercase.org/011607/0000/PX00622.pdf
04:25
<jwalden>
hah
06:50
<Hixie>
67% complete in categorising the table feedback, yay
06:56
<Hixie>
85%
07:12
<Hixie>
woot, sorted all the feedback for tables
07:12
<Hixie>
now i guess i have to read it!
07:13
<Hixie>
well this is weird
07:15
<Hixie>
for all the talk about headers/id, there's very little work on it other than what ben and james have done
07:18
<Hixie>
so i guess what i'm going to have to do is implement this table cell/header association algorithm
07:18
<Hixie>
skipping step 1
07:18
<Hixie>
and see if it gives better or worse results than just obeying header/id
07:19
<Hixie>
to determine if header/id is used correctly or not
07:19
<Hixie>
if header/id is used correctly and the algorithm gets the same result, then we might as well use the algorithm
07:20
<Hixie>
if header/id is used correctly but the algorithm gets the wrong results, then we probably need header/id
07:20
<Hixie>
if header/id is used incorrectly the vast majority of the time, then it doesn't much matter, we should probably drop it anyway
11:59
<hsivonen>
aargh. lynx from MacPorts and lynx from fink break lines differently
11:59
hsivonen
is diffing the lynx text dump of the spec
12:34
<annevk>
hsivonen, web-apps-tracker is no good?
12:35
<annevk>
http://blog.kilobox.mobi/2008/03/02/whatwg-are-you-creating-html/
12:41
<zcorpan_>
annevk: i guess it's harder to read with all the markup
12:48
<hsivonen>
annevk: not enough context
12:49
hsivonen
diffs with -U 200
12:50
<hsivonen>
annevk: also, there have been too many interdependent changes in the tokenizer, so I find I can't do spec rev by spec rev, bug by bug patching
12:50
<hsivonen>
instead, I'm sweeping over the diff of the entire chapter in one go
12:50
<annevk>
I suppose we could make context an option
12:53
<hsivonen>
the main problem is that the tracker doesn't say what the heading before a given change was
12:53
<hsivonen>
so it isn't obvious which tokenizer state or insertion mode is affected
12:55
<annevk>
philip asked about that too
12:55
<annevk>
that would be a bit tricky though
12:55
<annevk>
i suppose you could make a diff with a lot of context
12:55
<annevk>
then traverse back from the changes until you hit <hx>, and drop everything before that or so
12:56
<hsivonen>
I think a large context alone is good enough
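The heading-trimming idea annevk describes above can be sketched as follows. This is a minimal illustration, not the tracker's actual code: the function name `trim_hunk` and the sample spec lines are hypothetical, and the diff comes from Python's standard `difflib` rather than the diff command the tracker uses.

```python
import difflib
import re

# Matches an opening <h1>..<h6> tag (the "<hx>" annevk mentions).
HEADING = re.compile(r"<h[1-6][\s>]", re.IGNORECASE)

def trim_hunk(hunk):
    """Drop leading context that precedes the nearest heading before the first change.

    `hunk` is the body of one unified-diff hunk (lines starting with ' ', '+' or '-'),
    without the '---', '+++' and '@@' header lines.
    """
    first_change = next(
        (i for i, line in enumerate(hunk) if line.startswith(("+", "-"))), None
    )
    if first_change is None:
        return hunk
    # Walk backwards from the first changed line until a heading is found.
    for i in range(first_change, -1, -1):
        if HEADING.search(hunk[i]):
            return hunk[i:]
    return hunk  # no heading in the context: keep everything

# Hypothetical spec fragment, before and after a change.
old = [
    "<h3>Data state</h3>",
    "<p>Consume a character.</p>",
    "<h3>Tag open state</h3>",
    "<p>Old rule.</p>",
    "<p>Trailing text.</p>",
]
new = old[:3] + ["<p>New rule.</p>"] + old[4:]

diff = list(difflib.unified_diff(old, new, lineterm="", n=200))
trimmed = trim_hunk(diff[3:])  # skip the '---', '+++' and '@@' lines
for line in trimmed:
    print(line)
```

With a large context such as `n=200`, the trimmed hunk starts at the heading of the affected section, which is the information hsivonen says the tracker output lacks.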
13:13
<annevk>
hsivonen, ok, there's now a context feature
13:14
<annevk>
http://html5.org/tools/web-apps-tracker?from=1356&to=1357&context=70
13:21
<hsivonen>
annevk: cool. thank you
13:46
<Dashiva>
annevk: Is that blog viewable without a mobile device?
13:48
<Philip`>
Dashiva: It seems to be perfectly viewable on my non-mobile computer
13:49
<Dashiva>
I guess it's just my connection that sucks then
13:49
<Dashiva>
I need a RART
13:49
<Philip`>
A regional asset recovery team?
13:50
<Dashiva>
A router attitude readjustment tool
13:52
hsivonen
notes that requiring spaces between a quoted attribute value and the next attribute name will make valid HTML4 invalid.
13:54
<zcorpan_>
that's true
13:54
Philip`
would estimate that to be about zero HTML4 documents in practice
13:56
<zcorpan_>
Philip`: how about any HTML documents (not just counting valid HTML4)?
13:56
<Philip`>
I still estimate about zero
13:56
<Philip`>
based on no data, except I don't remember ever seeing people intentionally omit spaces between attributes
13:57
<Philip`>
(so my view doesn't really count)
13:57
<zcorpan_>
i don't either, but i have seen it in some text/html-XHTML documents (so it was not intentional)
13:57
<hsivonen>
Philip`: actually, when the W3C Validator fixed this for XHTML, there were (angry) people who found their previously 'valid' pages were in fact invalid
13:58
<zcorpan_>
would be interesting to see if pages that lack spaces between attributes do so because of mismatching quotes or not
13:59
<zcorpan_>
(since that's what the new requirement is supposed to help with)
13:59
<Philip`>
Oh, maybe it's more common than I expected
14:00
<Philip`>
I see /<[a-z]+ [a-z]+="([^"]+)"[a-z]/ on about ...
14:00
<zcorpan_>
also, perhaps the other new requirements are sufficient for catching mismatched quotes
14:00
<hsivonen>
Hixie: any particular reason why < didn't become invalid in unquoted attribute values along =, ' and ?
14:00
Philip`
waits for it to continue finding matches
14:01
<Philip`>
... lots of pages
14:02
<Philip`>
About 5K (out of 130K), plus lots of margin of error
14:03
<Philip`>
though it looks like a lot of them are from the same site
14:03
<Philip`>
(which says <d
14:03
<Philip`>
iv id="gb_city/cottonwood-az_5"class="act_gbar" style="left: 25px; height: 18px" ></div>)
14:03
<Philip`>
(...without the line wrapping)
14:05
<zcorpan_>
i guess the whitespace between attributes requirement has a high noise to signal ratio
14:05
<Philip`>
<font face="arial color="silver">X - Y - </font>
14:06
<zcorpan_>
quote in attribute name would catch that one
14:06
<Philip`>
is the only error I've found after seeing about a dozen that are simply skipping the space
14:07
<Philip`>
<img src="http://www.w3.org/Icons/valid-xhtml10"alt="Valid XHTML 1.0!" height="31" width="88" /> - hmm
14:07
<zcorpan_>
heh
14:08
<hsivonen>
lol
14:09
Philip`
goes away for a while
14:10
<zcorpan_>
i can't come up with a case where requiring whitespace between attributes helps to catch mistakes that quotes in attribute names don't
14:12
<zcorpan_>
unless *two* quotes were omitted, e.g. <input value="foo disabled="disabled>
14:12
<zcorpan_>
but that's very unlikely
14:12
<hsivonen>
I am not opposed to making it an error, fwiw. (at least not yet)
14:13
<zcorpan_>
if it doesn't catch any harmful mistakes, isn't it just adding noise to the list of validation messages?
14:14
<hsivonen>
that's a possibility
14:15
<hsivonen>
though this way the tokenizer spec comes closer to being hackable into an XML5 tokenizer that reports XML 1.0 4th ed. errors non-fatally
14:17
<hsivonen>
every time I patch Ælfred2 I come closer to biting the bullet and turning the Validator.nu HTML parser into a non-Draconian XML parser as well.
14:17
<hsivonen>
if it weren't for DTDs (internal subset in particular), I'd be hacking on that right now
14:18
<zcorpan_>
there are more html authors that validate their documents than there are non-Draconian xml tokenizer implementors
14:18
<hsivonen>
true
14:27
<gsnedders>
zcorpan_: even among feed readers?
14:39
<hsivonen>
the recipe at http://www.w3.org/QA/2008/03/html-charset.html contradicts the HTML 5 spec
14:48
<hasather_>
annevk: the tracker uses -p for the diff command which makes it output "function receiver(e) {" and other garbage on each "@@"-line
14:53
<zcorpan_>
gsnedders: i would think so, yes
14:59
<hsivonen>
for some reason, the "HTML" page on wikipedia draws a lot of vandalism
14:59
<hsivonen>
much more than e.g. XHTML
15:00
<gsnedders>
hsivonen: but it says it isn't obsolete! it must be wrong!
15:01
<hsivonen>
aside: http://en.wikipedia.org/wiki/MathML#Example
15:02
<gsnedders>
brilliant example of why I hate MathML. It takes _so_ much code to represent anything.
15:02
<hsivonen>
Hixie: I think the MathML example is so bad that we should concede to converters being necessary and shouldn't try to make stuff implied in the text/html parser
15:35
<Philip`>
If I look for /<[a-z]+ [a-z]+="([^"]+)"[a-z][^= ]*/ in the subset of the 130K pages that have a URI whose MD5 starts with '8', then I find things like
15:35
<Philip`>
<table width="100% align="center">
15:35
<Philip`>
<a href="<div align="center">
15:35
<Philip`>
<a href="pages/leaders.html"Leadership"">
15:35
<Philip`>
<img src="http://cgi.knc.ne.jp/cgi/counter/counter.pl?USER=cci&URL=index.html"alt"=counter align="bottom" width="35" height="16" border="0">
15:35
<Philip`>
<a title="Ver Video class="VERVIDEOLINK" 2005" onfocus="this.blur()" onclick="NewWindow(this.href,'vervideo','625','420','no','center');return false" href="http://raremusicvideos.org/videos/L7-the-word.html">
15:36
<Philip`>
<a href="http://www.myrope.com/"index.asp" target="_top">
15:36
<Philip`>
<noscript><A HREF="http://www.world1000.com"> <img src="http://w108.hitbox.com/Hitbox?cd=1&bt;&hb;</A> </noscript><script language="javascript1.2">
15:36
<Philip`>
and several more that are simply forgetting a closing '"'
15:38
<Philip`>
Oops, I meant /<[a-z]+ [a-z]+="([^"]+)"[a-z][^= ]*"/
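For readers who want to try the two patterns Philip` quotes, here is a small sketch that runs them over markup pasted earlier in this log. The patterns are copied from his messages; the names `MISSING_SPACE`, `QUOTE_IN_NAME` and `classify`, and the well-formed control sample, are made up for illustration. It says nothing about how his actual survey of 130K pages was run.

```python
import re

# First pattern from the log: a quoted attribute value followed directly by a letter.
MISSING_SPACE = re.compile(r'<[a-z]+ [a-z]+="([^"]+)"[a-z]')
# Corrected pattern from the log: additionally requires a '"' before any '=' or space,
# i.e. a quote that would end up inside an attribute name.
QUOTE_IN_NAME = re.compile(r'<[a-z]+ [a-z]+="([^"]+)"[a-z][^= ]*"')

def classify(tag):
    """Return (matches first pattern, matches corrected pattern)."""
    return (bool(MISSING_SPACE.search(tag)), bool(QUOTE_IN_NAME.search(tag)))

# Samples taken from the log, plus one hypothetical well-formed control.
DIV = ('<div id="gb_city/cottonwood-az_5"class="act_gbar" '
       'style="left: 25px; height: 18px" ></div>')
FONT = '<font face="arial color="silver">X - Y - </font>'
TABLE = '<table width="100% align="center">'
CONTROL = '<a href="a" title="b">'  # assumption: not from the log

for sample in (DIV, FONT, TABLE, CONTROL):
    print(classify(sample), sample)
```

The `<div>` sample only skips a space, so it matches the first pattern but not the corrected one, while the two samples with a forgotten closing quote match both.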
15:39
<zcorpan_>
all of those are covered by banning " in attribute names, afaict
15:40
<Philip`>
It would probably be much more useful if I looked for all the cases that hit the relevant parse error in the tokeniser, but that would take more effort than I want to bother with now
15:41
<zcorpan_>
i think this is useful enough data for sending to the list
15:42
<Philip`>
Given what I was searching for, I don't think it'd be possible to find anything that wouldn't be interpreted as a quote in an attribute name
15:44
<zcorpan_>
good point :)
16:02
<zcorpan_>
hmm, getAttribute() returns the empty string instead of null in ie8 if the attribute is absent
16:09
<Philip`>
Is that consistent with other browsers?
16:10
Philip`
wonders why his spec-processing script creates output saying "<a�href="a">a<table><a�href="b">b</table>x" with something funny in place of the spaces, only in that particular instance and nowhere else in the document
16:15
<SadEagle>
Philip`: that's consistent with DOM, but not with the web.
16:15
<SadEagle>
Or is it the other way around?
16:16
<SadEagle>
no, I got it right the first time around.
16:17
<SadEagle>
Philip`: the classic yahoo mail at least used to depend on getAttribute("missing") returning null.
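The "consistent with DOM" behaviour SadEagle refers to can be observed outside a browser. As a rough illustration only, Python's `xml.dom.minidom` follows the DOM Core convention of returning an empty string for a missing attribute, so `getAttribute()` alone cannot tell an absent attribute from an empty one; `hasAttribute()` can. The sample markup is hypothetical, and this demonstrates nothing about IE8 itself.

```python
from xml.dom.minidom import parseString

# Hypothetical element with an empty title attribute and no "missing" attribute.
doc = parseString('<a title="">example</a>')
el = doc.documentElement

present_but_empty = el.getAttribute("title")  # attribute exists, value is ""
absent = el.getAttribute("missing")           # attribute does not exist

# Under DOM Core semantics both calls return the empty string...
print(repr(present_but_empty), repr(absent))
# ...so presence has to be checked separately.
print(el.hasAttribute("title"), el.hasAttribute("missing"))
```

A script that tests `getAttribute("missing") === null`, as the classic Yahoo mail reportedly did, would behave differently under this convention.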
16:25
<zcorpan_>
hmm, i think <input type=number value=""> should be valid html5
16:26
<zcorpan_>
(haven't checked the spec but V.nu says it's invalid)
16:29
<annevk>
they implemented getAttribute() per spec? classic...
16:30
<zcorpan_>
i guess they went through the dom 2 core testsuite
16:31
<SadEagle>
did they start throwing WRONG_DOCUMENT_ERR too?
16:32
<zcorpan_>
dunno
16:32
<zcorpan_>
not very unexpectedly, they made changes to quirks mode and ie7 mode too
16:33
<annevk>
which makes their claims about "not breaking the Web" a lot less convincing...
16:34
<annevk>
it does further the cause of making Trident near impossible to imitate
16:34
<hsivonen>
rubys: hi
16:35
<rubys>
hsivonen: what do you mean by "above-dom"?
16:35
<hsivonen>
rubys: I mean code that reads the DOM
16:35
<Philip`>
It makes people's claims that IE would contain three independent frozen-in-time rendering engines incorrect
16:35
<hsivonen>
rubys: the SVG and MathML renderers
16:35
<hsivonen>
(and scripts that access the DOM, too)
16:35
<annevk>
Philip`, yes, that was me :)
16:36
<rubys>
cool.
16:36
<annevk>
guess I was ill informed
16:38
<hsivonen>
have I understood correctly that the IE8 CSS renderer shares the DOM impl. with the IE7 and IE5.5 modes?
16:38
<hsivonen>
but the DOM impl changes some of its behavior conditionally?
16:40
<annevk>
i think that's the case, yes
16:40
<annevk>
versioned APIs or something
16:40
<Philip`>
Would <style>a[title]{color:green}</style><a title>example</a> be a way of testing that, given what zcorpan_ said about IE8 dropping attributes?
16:41
<annevk>
probably or hasAttribute("title") maybe
16:43
<Philip`>
http://msdn2.microsoft.com/en-us/library/cc288472(VS.85).aspx - "The alt attribute is no longer displayed as the image tooltip when the browser is running in IE8 mode. Instead, the target of the longDesc attribute is used as the tooltip if present; otherwise, the title is displayed."
16:43
Philip`
hadn't seen that before
16:44
<billmason>
Given how much people mung up the content of the longdesc attribute, I have to wonder about the virtue of exposing it that way.
16:45
<hsivonen>
is there a screenshot of the longdesc thing happening?
16:45
<Philip`>
Given how little people use longdesc, it doesn't matter too much what you do with it
16:45
<billmason>
True.
16:46
Philip`
wonders what it'll do on Wikipedia images
16:46
<annevk>
hsivonen, is MathML just verbose or is it not possible to get Math compact in XML/HTML markup?
16:48
<gsnedders>
annevk: the latter
16:48
<hsivonen>
annevk: as far as I can tell, math requires lots of layout boxes, although CSS can have anonymous boxes, tweakable boxes practically have to be elements. that translates into very verbose markup
16:48
<Philip`>
Does <tex>e^{i\pi}=2</tex> count as "XML/HTML markup"?
16:49
<hsivonen>
Philip`: at least it would take anonymous box generation to a whole new level...
16:49
<hsivonen>
(I'm assuming here that we want a solution that integrates at least semi-sanely with a CSS formatter)
16:50
<Philip`>
At some level of abstraction <span>test test-test test&shy;test test</span> does complex anonymous-box-like stuff to separate and wrap words
16:50
<rubys>
http://www.w3.org/TR/mathml-for-css/
16:50
<Philip`>
(and I don't know what level of abstraction is the right one to look from)
16:51
<hsivonen>
rubys: not sufficient for Musings
16:51
<rubys>
understood, but he assumes that the spec is unchanging
16:51
<rubys>
if, however, you assume that the spec can be fixed, then it no longer is an HTML5 problem.
16:52
<annevk>
hsivonen, tweakable boxes could be ::foo
16:53
<hsivonen>
I'm assuming that in the near term fixing SVG and MathML as they are processed above the DOM should be out of scope for the WHATWG
16:53
<hsivonen>
I do think that the MathML 3 spec can still change, though
16:54
<hsivonen>
which reminds me that I should follow up on some of my comments on MathML 3
16:55
<rubys>
opera's direction (last I heard) was to do mathml via css. Whether they do so, or don't do so, and whether their doing so is sufficient for Musings is not a problem that the HTML5 working group needs to solve.
16:55
<rubys>
Preventing MathML from being styled via CSS would be bad.
16:56
<hsivonen>
rubys: I agree on both counts
16:56
<rubys>
cool. I see the HTML5 WG's job as getting the markup into the DOM. And that's as far as the job goes.
16:56
<hsivonen>
but as it stands, presentational MathML has useful bits that can't be emulated with a generic XML tree and CSS3
16:56
<rubys>
then opera may have a problem with their approach.
16:56
<hsivonen>
so I think Gecko's MathML code and MathPlayer will still be relevant
16:58
<rubys>
if HTML5's HTML 5 deserialization produces the same result as HTML5's XHTML5 deserialization of equivalent documents (modulo syntax differences only), then the HTML5 working group's job is done.
16:58
<hsivonen>
rubys: I agree
16:59
<hsivonen>
which is why I think part of the solution is already constrained more than Hixie's email suggested
17:02
<billmason>
Philip`: I don't see IE8 doing anything with displaying longdesc in a tooltip, so I guess their documentation is, as they say, 'preliminary'....
17:03
<Philip`>
billmason: Okay, thanks
17:03
<Philip`>
billmason: Do you know if they still display alt in the tooltip?
17:04
<billmason>
They do not, from what I'm seeing in a quick test.
17:06
Philip`
sees that IE8 has added CSS printing support for widows and orphans, and thinks it is good that they are supporting broken families
17:07
<hsivonen>
didn't someone say that the IE7 engine was still used for printing?
17:07
<billmason>
I think their release notes say it, but I'm not looking at them right this moment.
17:08
billmason
finds the notes, and yes it's in the notes
17:09
<hsivonen>
hmm. not a total freeze of the IE7 CSS formatter then?
17:11
<Philip`>
It's still quite possible that they've only added to the IE8 CSS print formatter, but the current beta uses IE7 for printing
20:15
<annevk>
http://www.xs4all.nl/~egbg/counterscript.html
21:18
<Dashiva>
So what's the "right" way to get an arbitrary character sequence as a horizontal list delimiter. I'm stumped.
21:20
<Philip`>
Argh
21:20
<Philip`>
Remind me not to type "<meta http-equiv=x-ua-compatible" into the Live DOM Viewer, because it crashes IE before I get to the end
21:21
<Philip`>
Hmm, it just doesn't like not having a content attribute
21:22
<takkaria>
heh
21:23
<gsnedders>
hmm, I can't find anything in the XML PER that disallows characters not in Char
21:25
<Philip`>
http://philip.html5.org/tests/ie8/x-ua-compatible-crash.html
21:26
<Philip`>
It's great how the "Internet Explorer has stopped working" dialog box pops up in the background where I can't see it and won't notice it unless I happen to look at the taskbar
21:26
<gsnedders>
ah. here we go.
21:26
<gsnedders>
"Taken as a whole, it matches the production labeled document."
21:29
<gsnedders>
only the surrogate blocks, U+FFFE, and U+FFFF are disallowed. odd.
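For reference, the `Char` production in XML 1.0 can be written as a small predicate. This is a sketch from the production as published in XML 1.0 (`#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]`); the function name `is_xml_char` is made up here. Note that, besides the surrogates, U+FFFE and U+FFFF, the production also excludes the C0 control characters other than tab, line feed and carriage return.

```python
def is_xml_char(cp: int) -> bool:
    """Return True if code point `cp` matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF        # up to just before the surrogate blocks
        or 0xE000 <= cp <= 0xFFFD      # after the surrogates, excluding U+FFFE and U+FFFF
        or 0x10000 <= cp <= 0x10FFFF   # supplementary planes
    )

for cp in (0x0, 0x9, 0x41, 0xD800, 0xFFFD, 0xFFFE, 0xFFFF, 0x10FFFF):
    print(f"U+{cp:04X}", is_xml_char(cp))
```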
21:29
<Lachy>
annevk, othermaciej_ for selectors api, what are your thoughts about changing the interface to a single NodeSelector interface that is implemented by Documents, Elements and DocumentFragments?
21:34
<annevk>
wfm
21:35
<Philip`>
Hmm, I can't type a # (ctrl+alt+3) into IE8
21:40
<othermaciej_>
Lachy: no problem with it
21:57
<Lachy>
ok, I'll make the change next time I edit the spec
22:08
<othermaciej>
wow, mathml is way uglier than I thought
22:14
<roc>
is MathML in Acid3?
22:15
<SadEagle>
don't think so.
22:15
<othermaciej>
no
22:15
<roc>
bummer, that's an oversight on our part :-)
22:15
<Lfe>
roc: you can read about what tests acid3 consists of here: http://www.webstandards.org/action/acid3/
22:16
<othermaciej>
but I was just looking at the example that hsivonen sent a link to
22:16
<othermaciej>
where the quadratic formula takes 30 lines of XML to express
22:17
<SadEagle>
it's not meant to be human-readable...
22:17
<Philip`>
It only takes one line
22:17
<SadEagle>
seems like almost everyone converts to it from TeX or TeX-like syntax
22:17
<Philip`>
once you remove the whitespace
22:17
<gsnedders>
Philip`: :P
22:17
<Philip`>
since it's not like the whitespace makes it any more readable
22:17
<othermaciej>
and then the "semantic" 40-line version
22:17
<othermaciej>
it just makes me wonder what the point is
22:18
<roc>
I think the big lesson is that XML is rubbish for token-oriented languages like math and source code
22:18
<othermaciej>
humans can't read or write it, or probably even understand it enough to usefully apply CSS styling or scripting based on the MathML markup
22:18
<Philip`>
Most humans can't read or write HTML, but it's still useful for them
22:19
<roc>
othermaciej: you could say the same about SVG
22:19
<SadEagle>
othermaciej: I guess one could say that it should inherit fonts, etc., but then math fonts are special, anyway
22:19
<othermaciej>
the idea of CSS styling SVG is a little silly, but scripting it isn't totally insane
22:19
<roc>
and I think you'd be partially right, in both cases, but also partially wrong
22:20
<roc>
given decent inspector tools, it's not that hard to figure out how to script and style MathML or SVG
22:21
<othermaciej>
this is the link I'm talking about fwiw: http://en.wikipedia.org/wiki/MathML#Example
22:22
SadEagle
wonders about the quality of MathML rendering..
22:22
gsnedders
can't render MathML well
22:23
<roc>
try ours
22:23
<othermaciej>
I guess what I'm saying is, I'm not sure that MathML is better than a <math> element which just contains magically parsed/rendered expressions would have been
22:23
<roc>
well
22:23
<SadEagle>
othermaciej: well, things like selection, etc, matter.
22:23
<othermaciej>
whereas for SVG there's no obvious better way to do it
22:23
<roc>
do you construct a math DOM or not?
22:23
<SadEagle>
of course, in most cases people just put up a PDF
22:24
<othermaciej>
gotta go for now
22:24
<roc>
if you don't, then you basically make scripting and styling impossible
22:24
<annevk>
I wonder if for text/html math could be done in such a way that you don't need to write most of the elements
22:24
<jgraham>
roc: MathML isn't in ACID3 because it doesn't have a widely implemented DOM API to test
22:24
<annevk>
they would be generated during parsing
22:24
<roc>
if you do, then using non-XML syntax could be confusing
22:24
<SadEagle>
roc: I am not sure how much of that is useful. Styling in particular.
22:24
<SadEagle>
colors, maybe.
22:24
<Philip`>
annevk: Generated elements make styling and scripting harder
22:25
<jgraham>
gsnedders: Rendering MathML isn't a requirement for being a conforming MathML UA :)
22:25
<roc>
jgraham: what do you mean? you can use regular DOM APIs in our implementation
22:25
<gsnedders>
jgraham: I don't claim conformance though :)
22:26
<jgraham>
roc: What would you test that's MathML specific though?
22:26
<roc>
having an HTML <math> element that parses to MathML DOM could be very useful, but you'd need a way to put classes and IDs in there
22:26
<roc>
jgraham: stuff like MathML scriptlevel automatically changing font sizes based on formula structure
22:27
<jgraham>
roc: IIRC, that's not a normative requirement in the MathML spec
22:27
<roc>
sure is
22:27
<annevk>
Philip`, true, though probably not the common case
22:28
<SadEagle>
one would hope high-quality rendering would be an implementor's goal, though
22:28
Philip`
attempts to post bug reports to the IE8 newsgroup, which does not appear to be a place with an especially high signal-to-noise ratio
22:29
<roc>
the spec goes into some detail about how scriptlevel must interact with CSS font sizing
22:29
<roc>
I spent a few weeks implementing it last year
22:29
<roc>
I don't want to be told now that all that was non-normative :-)
22:33
<jgraham>
roc: http://www.w3.org/TR/MathML2/chapter7.html#interf.genproc (7.2.1) suggests that all the rendering rules are non-normative, but maybe I'm misunderstanding
22:34
jgraham
finds the MathML spec hard to understand in general
22:36
<roc>
I don't think the rules in 3.3.4.2.1 and 3.3.4.2.2 can be construed as non-normative
22:36
<roc>
http://www.w3.org/TR/MathML2/chapter3.html#id.3.3.4.2
22:56
<jgraham>
roc: Practically I agree but pedantically "Since MathML renderers may be unable to make use of arbitrary font sizes with good results, they may wish to modify the mapping from scriptlevel to fontsize to produce better renderings in their judgment." implies that you need not change fontsize when scriptlevel changes. Although I guess there might be other testable assertions there
22:57
<jgraham>
Er, the next sentence from that quote would be useful "In particular, if fontsizes have to be rounded to available values, or limited to values within a range, the details of how this is done are up to the renderer."
22:57
<roc>
pedantically, CSS renderers are allowed to have "font-size:10px ! important" in the UA style sheet
22:58
<jgraham>
Fair enough :)
22:59
<jgraham>
Am I mistaken in thinking that MathML doesn't use RFC 2119 keywords?
23:13
<annevk>
i don't think so
23:13
<annevk>
it also doesn't appear to have any normative references?