09:17
<hsivonen>
hmm. does C# allow switch() {} with any string values or just interned strings?
09:17
<hsivonen>
just curious after glancing at the Twintsam source link for [imps]
09:17
<hsivonen>
from [imps]
09:21
<Lachy>
hsivonen, check the language spec http://msdn2.microsoft.com/en-us/library/Aa645596(VS.71).aspx
09:25
<hsivonen>
hmm. looks like any string is allowed
09:50
<zcorpan>
another case where serialization needs to throw: an element in the "http://www.w3.org/2000/xmlns/" namespace
09:53
<zcorpan>
and an unclear case: two attributes that have the same namespace and local name
12:12
zcorpan
looks at http://www.w3.org/TR/xml/#NT-AttValue
12:13
<zcorpan>
where does it say that attributes have to follow the Char production?
12:52
<hsivonen>
what's the mediawiki way to say <code>?
12:53
<hsivonen>
bah. <code> according to http://diberri.dyndns.org/wikipedia/html2wiki/index.html
13:19
<hsivonen>
Lachy: is there a way to permanently convince the whatwg wiki that I am not a bot and to suppress the math test?
13:21
<Lachy>
hsivonen, maybe.
13:26
<Lachy>
hsivonen, it should only ask you that when you attempt to add a URI to the page
13:28
<hsivonen>
Lachy: ah.
13:28
<hsivonen>
Lachy: I have namespace URIs
13:28
<Lachy>
ok
13:29
<Lachy>
I may be able to turn it off for some additional user groups, I'm just trying to find out what group I can put you in to do it
13:33
<Lachy>
hsivonen, I made you a Sysop and Bureaucrat so you have more rights now and won't see the captcha
13:37
<hsivonen>
Lachy: thanks
13:37
<Lachy>
does anyone know if it's possible to delete users? I want to get rid of known spammers completely
13:55
<Lachy>
I'm upgrading wordpress on the blog, so it may break a little bit during the process
13:57
<met_>
hi
13:57
<met_>
http://blog.whatwg.org/ gives some php errors
13:59
<Philip`>
met_: http://krijnhoetmer.nl/irc-logs/whatwg/20070911#l-104
14:00
<met_>
oh, haven't read yesterday log, ok
14:00
<met_>
no today, i am blind
14:00
<met_>
completely 8-)
14:00
<Philip`>
It was a minute before you joined :-)
14:14
<Lachy>
blog is working again
14:23
<Lachy>
wordpress update is complete, now with a comment preview function
14:30
<hsivonen>
http://html4all.org/wiki/index.php/List_Rules
14:30
<hsivonen>
public-html might benefit from adherence to those rules
14:35
<Philip`>
Do they have a new secret hideout where things like "Greg's draft" of the list rules were circulated?
14:37
<Lachy>
I finally upgraded my own blog too :-)
14:37
<Lachy>
I think I was running an obsolete and insecure version of WP for a while now
14:56
<hsivonen>
http://wiki.whatwg.org/wiki/Validator.nu_XML_Output
14:56
<hsivonen>
comments welcome before I implement
15:01
zcorpan
would suggest to rename the uri attribute to url, though doesn't feel strongly about it
15:02
<hsivonen>
zcorpan: thanks. done.
15:07
<zcorpan>
hsivonen: is it not allowed to have <code><a> or <a><code> in <message>?
15:07
<zcorpan>
i.e. nest them
15:07
<hsivonen>
zcorpan: <a><code> should be allowed by the prose.
15:07
<zcorpan>
ah, right
15:08
<zcorpan>
why not the other way around?
15:08
<hsivonen>
zcorpan: simplicity
15:08
<zcorpan>
ok
15:09
<Lachy>
why put the url attribute on each info, error and non-document-error element instead of just once on their parent messages element?
15:09
<hsivonen>
Lachy: because messages may pertain to schema or DTD files
15:10
<hsivonen>
Lachy: defining inheritance could save bytes, though
15:10
<Lachy>
ok
15:10
<zcorpan>
the m element is in the v.validator.nu... namespace?
15:10
<Lachy>
yeah, I'd do <messages url=".."> for the default url and then allow url="" on other elements if it's different for some
15:11
<hsivonen>
zcorpan: in the n.validator.nu...
15:11
<hsivonen>
Lachy: makes sense
15:12
<Lachy>
so if someone submits an IRI, will the output include the punycode version?
15:12
<hsivonen>
Lachy: yes
15:12
<Lachy>
ok. what's the reason for not outputting IRIs?
15:13
<Lachy>
for better compatibility with legacy software?
15:13
<hsivonen>
Lachy: 1) I don't have the code to do that. 2) It would be less compatible with IRIless clients.
15:17
<Lachy>
does this "implementation may count column numbers in terms of UTF-16 code units instead of characters." mean that characters above the basic multilingual plane, which use more than 16 bits, would be counted as 2 characters instead of 1?
15:18
<zcorpan>
looks good to me
15:19
<hsivonen>
Lachy: yes
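The difference confirmed above can be checked directly in JavaScript, whose strings are sequences of UTF-16 code units. This is only a sketch: the sample line is made up, and it uses U+1047E (the astral character suggested later in this log) as the test character.

```javascript
// U+1047E lies outside the Basic Multilingual Plane, so UTF-16 stores it
// as a surrogate pair (two 16-bit code units).
const line = "a\u{1047E}b"; // made-up sample line

// String.prototype.length counts UTF-16 code units.
const columnsInCodeUnits = line.length;

// Iterating a string yields whole code points, so spreading counts characters.
const columnsInCodePoints = [...line].length;

console.log(columnsInCodeUnits, columnsInCodePoints); // 4 3
```

A column number reported in code units would therefore be one higher than a character-based column for anything following the astral character on that line.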
15:20
<hsivonen>
zcorpan: thanks
15:20
<Lachy>
hsivonen, would a Content-Type issue be emitted as a <non-document-error>?
15:21
<hsivonen>
Lachy: hmm. IIRC, not
15:21
<hsivonen>
Lachy: IIRC, that's an IOException internally
15:22
<Lachy>
so wouldn't that qualify for <non-document-error type="io">?
15:25
<hsivonen>
Lachy: doh. yes
15:26
<hsivonen>
I should have said yes, above. sorry
15:26
<Lachy>
hsivonen, did you look at the XML output available from validator.w3.org and base any of this off that?
15:26
<hsivonen>
Lachy: I did
15:26
<hsivonen>
Lachy: it is non-streamable
15:27
<hsivonen>
or they are, rather
15:27
<hsivonen>
"I think there are two problems with the SOAP and Unicorn formats: they are unnecessarily complex and they don’t support streaming output."
15:28
<hsivonen>
though it is possible that I overvalue streaming
15:34
<hsivonen>
Lachy: also, the W3C validator format uses the entity-escaped HTML anti-pattern
15:34
<Lachy>
what does that mean?
15:35
<hsivonen>
Lachy: it's like RSS
15:35
<Lachy>
like the <![CDATA[ stuff with escaped HTML?
15:35
<hsivonen>
Lachy: yes
15:37
<Lachy>
oh I see, like in the <m:explanation> and <m:source> elements where it should just use XHTML.
15:37
<Lachy>
http://validator.w3.org/docs/api.html
15:38
<zcorpan>
hsivonen: are clients expected to reject responses where the root element is "messages" in some other namespace?
15:38
<hsivonen>
zcorpan: yes
15:40
<hsivonen>
and now it is time to spec a JSON format...
15:40
<Lachy>
I don't understand the <parse-tree> section
15:41
<hsivonen>
Lachy: if I didn't salt the namespaces and someone loaded it in a browser, it would be a script injection bug to the validator.nu domain
15:42
<zcorpan>
how can you serialize the parse tree to xml if you parsed html that is not serializable as xml?
15:42
<Lachy>
no, I understood the reason for the namespace change, but not how you serialise a parse tree
15:42
<Lachy>
wouldn't that just be the entire document?
15:42
<hsivonen>
zcorpan: I intend to cheat
15:43
<hsivonen>
zcorpan: in the case of white space
15:43
<zcorpan>
hsivonen: what about e.g. colons in local names?
15:43
<hsivonen>
with non-NCName names, I guess I'm going to drop the parse tree
15:43
<zcorpan>
ok
15:43
<hsivonen>
Lachy: yes
15:44
<zcorpan>
perhaps the parse tree could be serialized as http://simon.html5.org/specs/sdf instead? :)
15:44
<hsivonen>
I expect the parse tree part to be the least useful and the most controversial
15:44
<Lachy>
how do you intend to represent the parse tree of HTML documents, especially for the cases where HTML and XML differ
15:44
<zcorpan>
though i need to support character escaping for that format
15:45
<hsivonen>
Lachy: I intend not to
15:45
<hsivonen>
zcorpan: yeah, perhaps I should use SDF instead
15:46
<Lachy>
hsivonen, yeah, sdf would make sense
15:47
<gsnedders>
http://simplepie.org/support/viewtopic.php?id=1167 —that's so typical of PHP bugs.
15:47
<zcorpan>
"\u000C" for a form feed?
15:48
<zcorpan>
perhaps i should look at how css escapes characters and do the same
15:48
<hsivonen>
zcorpan: if you want to represent arbitrary DOMs, you should probably represent UTF-16 code point sequences
15:48
<hsivonen>
not character sequences
15:49
<hsivonen>
\uXXXX
15:49
<hsivonen>
is fine
15:49
<zcorpan>
ok
15:49
<hsivonen>
with surrogates as two escapes
15:49
<zcorpan>
makes sense
15:50
<Lachy>
yeah, \uXXXX would be compatible with JavaScript escape sequences too
15:50
<Lachy>
which might be useful
15:55
<zcorpan>
perhaps the whole spec should talk about utf-16 code points instead of characters
15:55
<hsivonen>
yeah
15:56
<hsivonen>
code units, actually
15:56
<Lachy>
zcorpan, in SDF, do all attribute nodes have to occur before any other child nodes of an element?
15:56
<zcorpan>
Lachy: no
15:56
<zcorpan>
U+0022 QUOTATION MARK -> 0x0022 UTF-16 code unit ?
15:57
<hsivonen>
yeah
15:57
<hsivonen>
but saying QUOTATION MARK is still useful
15:57
<Lachy>
so this would be conforming:
15:57
<Lachy>
e "div"
15:57
<Lachy>
t "foo"
15:57
<Lachy>
a "class" "bar"
15:57
<zcorpan>
yes
15:58
<hsivonen>
zcorpan: any particular reason for not having exactly one way of doing it? (so that results would be byte-wise comparable for equality)
15:58
<zcorpan>
i haven't thought much about serializing to sdf yet
15:59
<Lachy>
I would have expected that to be non-conforming and for each node to be represented in document order
16:00
<zcorpan>
yeah, although attributes are unordered :) and if you implement it using dom methods it doesn't really matter if an attribute appears after other nodes
16:02
<zcorpan>
if you write sdf by hand you might not want to care about what order you place the attributes (though it makes sense to put them before child nodes of the element)
16:02
<zcorpan>
but when serializing and comparing you'd probably sort the attributes
16:07
<Lachy>
hmm. sdf is a lot nicer than the format onsgmls outputs, which emits attributes before the element
16:09
<zcorpan>
is there a good character that would be encoded as a surrogate pair in utf-16 that i can use in the examples?
16:10
<Lachy>
zcorpan, U+1047E
16:11
<zcorpan>
thanks
16:11
<Lachy>
that's the one that appears in Hixie's email sig
16:11
<zcorpan>
noticed (by googling for it) :)
16:31
<zcorpan>
ok, specced escaping
16:34
zcorpan
hopes he got the example right
16:36
<Lachy>
zcorpan, can CR and LF occur anywhere within a string?
16:37
<Lachy>
so if a text node spans multiple lines, it's just represented like this:..
16:37
<zcorpan>
they can
16:37
<Lachy>
t "this string contains
16:37
<Lachy>
a new line"
16:38
<zcorpan>
yes
16:39
<Lachy>
you should probably change "It is not intended for data exchange, but rather intended to be used for test suites.", especially if Henri ends up using it for the validator's parse tree output
16:40
<Lachy>
maybe say that it's primarily intended for test suites
16:43
<hsivonen>
communicating DOMs on XML without using XML doesn't quite feel right
16:44
hsivonen
likes the images of the JSON spec
16:45
<zcorpan>
Lachy: changed
17:00
<zcorpan>
how is "0x000A" generally pronounced? oh-ecks-... or zero-ecks-... ?
17:01
<Lachy>
zcorpan, I pronounce it zero-ecks...
17:01
<zcorpan>
ok
17:01
<Lachy>
why?
17:01
<zcorpan>
"A 0x..." vs "An 0x..."
17:02
<gsnedders>
zcorpan: I pronounce it oh-ecks-…
17:02
<zcorpan>
oh well :)
17:03
<gsnedders>
zcorpan: though normally only in my head. I have little occasion to actually talk about such things :)
17:04
<zcorpan>
yeah, but it might be annoying to read text that assumes the other pronunciation (by having "a" or "an" in front of it)
17:05
<zcorpan>
"a FAQ" vs "an FAQ" is the same
17:05
<Lachy>
whenever I read something like that, I just change it mentally to read it how I like it
17:06
<zcorpan>
yeah, that's what is annoying (not very but still)
17:06
<Lachy>
"an FAQ" is correct because I pronounce it F-A-Q rather than "Fack"
17:07
zcorpan
pronounces it as "fack"
17:09
<gsnedders>
F-A-Q
17:11
<Lachy>
zcorpan, SDF should probably say something about character encodings
17:12
<Lachy>
should a consumer assume SDF files are encoded as UTF-8 or is that left explicitly undefined?
17:13
<zcorpan>
dunno
17:15
<zcorpan>
haven't thought much about it, but so far i've just used bomless utf-8
17:15
<Lachy>
maybe say encoded as UTF-8 or UTF-16 (with an appropriate BOM) or otherwise specified by a higher level protocol or container format
17:19
<hsivonen>
Lachy: why bother with UTF-16 if you have a chance of mandating UTF-8?
17:19
<Lachy>
dunno
17:20
<zcorpan>
thinking about it, perhaps the spec should talk about characters instead of code units for the things that are integral to the syntax itself, and that it's just the \uXXXX escaping that should talk about utf-16 code units
17:21
<hsivonen>
zcorpan: the language itself should probably be defined in terms of characters but the string values it encodes should be strings of UTF-16 code units
17:21
<zcorpan>
yeah
17:25
<hsivonen>
are there any guides on JSON design patterns?
17:25
<hsivonen>
I tried to google, but I only found references to the use of JSON itself being an Ajax design pattern
17:27
<Lachy>
you should be able to just do a 1:1 mapping between elements in the XML format and objects in JSON, with attributes represented as properties.
17:28
<hsivonen>
Lachy: wouldn't such a JSON format suck compared to one designed as JSON?
17:28
<Lachy>
maybe
17:28
<hsivonen>
Lachy: how do you do mixed content in JSON?
17:29
<Lachy>
what kind of mixed content?
17:29
<hsivonen>
the kind that mixes text and inline markup
17:29
<hsivonen>
as far as I can tell, JSON doesn't do mixed content
17:30
<hsivonen>
which means no document-style XML
17:30
<hsivonen>
only "data"-style XML
17:30
zcorpan
isn't really sure how to spec the escaping part
17:30
<hober>
"foo ", {"bar": [{}. "baz"]} " quux" is something like "foo <bar>baz</bar> quux"
17:30
<Lachy>
yeah, you'd have to represent xml as strings
17:31
<hsivonen>
zcorpan: I suggest escaping everything except code units that correspond to printable ASCII
17:31
<hober>
(assuming you fix the missing commas, typos, etc. in my example)
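One possible reading of hober's example, with the commas fixed and the sequence wrapped in an array, is sketched below. The shape is an assumption rather than anything agreed in the channel: each element is taken to be an object with a single key (the element name) whose value is an array holding an attributes object followed by the children. The `toMarkup` helper is hypothetical and does no escaping.

```javascript
// hober's example with the punctuation fixed, wrapped in an array.
// Assumed shape: an element is { name: [attributes, ...children] }.
const mixed = ["foo ", { bar: [{}, "baz"] }, " quux"];

// Hypothetical helper: turn the structure back into markup (no escaping).
function toMarkup(nodes) {
  return nodes
    .map((node) => {
      if (typeof node === "string") return node; // text node
      const [name, [attrs, ...children]] = Object.entries(node)[0];
      const attrText = Object.entries(attrs)
        .map(([key, value]) => ` ${key}="${value}"`)
        .join("");
      return `<${name}${attrText}>${toMarkup(children)}</${name}>`;
    })
    .join("");
}

console.log(toMarkup(mixed)); // foo <bar>baz</bar> quux
```

Mixed content is expressible this way, but only by convention layered on top of JSON arrays, which is the point being made in the discussion.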
17:32
<hsivonen>
zcorpan: assuming you want something that is hard to break and unambiguous for byte comparisons
17:32
<zcorpan>
ok
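The suggestion to escape everything outside printable ASCII could look roughly like the sketch below. The function name is hypothetical, and escaping the quotation mark and backslash as well is an added assumption so that the output stays unambiguous inside a quoted string; neither detail comes from the SDF draft.

```javascript
// Hypothetical helper: escape every UTF-16 code unit outside printable ASCII
// (0x20..0x7E) as \uXXXX. The quote (0x22) and backslash (0x5C) are escaped
// too; that part is an assumption, not something stated in the discussion.
function escapeCodeUnits(value) {
  let out = "";
  for (let i = 0; i < value.length; i++) {
    const unit = value.charCodeAt(i); // one UTF-16 code unit, 0..0xFFFF
    const printable = unit >= 0x20 && unit <= 0x7e;
    if (printable && unit !== 0x22 && unit !== 0x5c) {
      out += value[i];
    } else {
      out += "\\u" + unit.toString(16).toUpperCase().padStart(4, "0");
    }
  }
  return out;
}

console.log(escapeCodeUnits("caf\u00E9 \u{1047E}\n"));
// caf\u00E9 \uD801\uDC7E\u000A
```

Because the loop works per code unit, an astral character comes out as two escapes, and the output is pure ASCII regardless of the file's own encoding.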
17:34
<hsivonen>
zcorpan: though that would make the format suck big time as a human-readable tree dump for real pages
17:35
<hsivonen>
as opposed to ASCII-dominated test cases
17:35
<zcorpan>
indeed
17:37
<zcorpan>
though mixing utf-16 escapes and utf-8 characters seems error prone
17:38
<zcorpan>
also, if all non-ascii is escaped it doesn't matter much what encoding is used for the format itself
17:39
<Lachy>
hsivonen, something like this could work http://tinyurl.com/ytldcz
17:39
<hsivonen>
Lachy: yes, except I'm going to leave elaboration out
17:40
<Lachy>
ok
17:42
<hsivonen>
http://wiki.whatwg.org/wiki/Validator.nu_JSON_Output
17:43
<Lachy>
what's extract-offset?
17:44
<hsivonen>
a pointer to the character of interest inside extract
17:44
<hsivonen>
since <m> and mixed content is not JSON-like
17:46
<Lachy>
ok, so its value would have to give start and end points or start and length
17:49
<hsivonen>
no, only offset to the character of interest
17:50
<hsivonen>
<m> conceptually only designates one character, but it will have to expand not to cut a sequence of a base character plus combining characters
17:50
<hsivonen>
an offset in JSON does not have that problem
17:54
<Lachy>
ok, I thought <m> would mark the whole section. e.g. for an invalid attribute, it would mark the whole thing: &lt;p <m>align="right"</m>&gt; ...
17:55
<hsivonen>
Lachy: no, it is just for indicating which part of the extract corresponds to line & col
17:55
<Lachy>
ah
17:55
<hsivonen>
attribute errors will point to the last > of the start tag
17:57
<Lachy>
oh, that's unfortunate
17:57
<hsivonen>
I need to figure out how to do this without letting a huge string of combining diacritics be used as a DoS attack
17:57
<Lachy>
that means something using the API that wants to highlight the attribute itself would need to search the string itself
17:58
<hsivonen>
Lachy: SAX only allows source locations on a per-event basis
17:58
<Lachy>
oh right
17:58
<hsivonen>
Lachy: and the whole start tag is the event
17:59
<Lachy>
I suppose it wouldn't be too hard to use a regex to find the attribute in most cases
18:00
<hsivonen>
Lachy: well, in the case of elements, the element may be implied so you wouldn't find anything matching in the source
18:00
<hsivonen>
at least SAX does some source location
18:00
<hsivonen>
the DOM does none
18:01
<hsivonen>
which is one of the reasons why I wrote my own tree model
18:02
<Lachy>
I updated my example based on the current description in the wiki http://html5.lachy.id.au/clipboard
18:03
<hsivonen>
Lachy: is it OK to copy that into the wiki under the MIT license?
18:05
<Lachy>
as long as you comply with my copyright licence <http://lachy.id.au/about/copyright>, then sure
18:05
<zcorpan>
ok, rewrote the escaping part
18:08
<zcorpan>
hmm
18:08
<hsivonen>
Lachy: unfortunately, I'm not at liberty to release my changes to the Public Domain
18:08
<Lachy>
no worries
18:09
<Lachy>
everything after the first paragraph is basically a non-binding suggestion
18:09
<Lachy>
maybe I should update it to say "or a free licence"
18:10
<hsivonen>
Lachy: I realized it was non-binding
18:11
<hsivonen>
speccing this stuff is so much more tedious than just implementing it and telling people to view source and guess
18:14
<zcorpan>
there, i think i got it right now
18:15
<Lachy>
I updated it
18:19
<Lachy>
hsivonen, I think you should change "extract-offset" to "offset"
18:19
<Lachy>
and consider shortening "non-document-error"
18:22
Philip`
updates http://canvex.lazyilluminati.com/misc/ref/ref.html so all the examples get automatically conformance-checkered when the HTML is generated
18:24
<hsivonen>
Lachy: any suggestions for "non-document-error"? it won't waste too many bytes as there are unlikely to be more than two per result
18:24
<hsivonen>
Philip`: with what software?
18:27
<Philip`>
hsivonen: It's using the validator.nu API (with text output)
18:27
<Philip`>
(via httplib)
18:29
<hsivonen>
Philip`: ok
18:29
<Lachy>
my example had syntax errors, I fixed them and updated the wiki
18:29
<hsivonen>
Lachy: thanks
18:32
<Lachy>
hsivonen, probably best to avoid "-" characters in property names because it allows scripts to use this notation: result.messages[0].line, which can't be done with result.result.parse-tree
18:32
<Lachy>
authors would have to use result.result["parse-tree"] instead
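The point about hyphens can be illustrated with a small sketch. The object below is invented for the example and is not the actual validator.nu output; only the property-access behaviour is the point.

```javascript
// Hypothetical result object; the property names and values are illustrative.
const result = JSON.parse(
  '{"parse-tree": "(tree)", "parseTree": "(tree)", "messages": [{"line": 3}]}'
);

console.log(result.messages[0].line); // 3
console.log(result["parse-tree"]);    // hyphenated name needs bracket notation
console.log(result.parseTree);        // camelCase name works with dot notation

// Writing result.parse-tree would be parsed as (result.parse) - tree,
// a subtraction, rather than a property lookup.
```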
18:33
<hsivonen>
Lachy: ok. is any of the names a reserved word by any chance? e.g. "class" is a usual suspect
18:34
<hsivonen>
Lachy: should I use camelCase instead?
18:34
<Lachy>
yeah, that would work
18:36
<hsivonen>
class will be reserved in the future
18:36
<hsivonen>
http://developer.mozilla.org/en/docs/Core_JavaScript_1.5_Reference:Reserved_Words
18:37
<hsivonen>
should I have type and subtype instead of class and type?
18:38
<Lachy>
yeah
18:41
<virtuelv>
(mildly OT: What is the status of selector support in Firefox 3? does http://virtuelvis.com/download/2007/09/backgammon/ render correctly?)
18:48
<Philip`>
virtuelv: Doesn't work in a build from a few days ago
18:49
<Philip`>
http://www.css3.info/selectors-test/test.html indicates that everything from nth-child to nth-last-of-type is unsupported
18:52
<hsivonen>
What technical change to HTML5 is called for in the message about braille printing charts?
19:13
<Lachy>
wow, IE only has 63.86% market share now according to http://www.itproductivity.org/browser.htm
19:27
<Philip`>
Lachy: And apparently Netscape has 10% - that sounds a bit odd
19:28
<Lachy>
yeah, it's nowhere near the values wikipedia lists from several other sources
19:28
<zcorpan>
krijnh: would be nice to pick up on /me lines also :)
19:30
<Lachy>
krijnh, it would be nice to not use grey text on a green background also. it's so hard to read
19:30
<zcorpan>
/^.{8}\* ([^ ]+)/
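A quick sketch of how that pattern would behave. The raw log line format is an assumption here: an 8-character timestamp prefix such as "[12:12] ", followed by "* nick action" for /me lines. The sample lines are made up.

```javascript
// The pattern from the message above: skip 8 characters, then "* ", then
// capture the nick up to the next space.
const actionLine = /^.{8}\* ([^ ]+)/;

const samples = [
  "[12:12] * zcorpan looks at the spec",       // assumed raw /me format
  "[12:13] <zcorpan> where does it say that?", // ordinary message
];

const nicks = samples.map((line) => {
  const match = actionLine.exec(line);
  return match ? match[1] : null;
});

console.log(nicks); // [ 'zcorpan', null ]
```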
19:30
<Philip`>
"The survey is accurate to +/- 1.0%" is peculiar - that number of decimal places doesn't really seem justified
19:31
<kingryan>
any idea of the survey's methodology?
19:31
<Philip`>
"a majority of the NS users are on version 4.x"!
19:33
<Lachy>
see what slashdot readers have to say about it http://slashdot.org/article.pl?sid=07/09/11/1222256
19:34
<zcorpan>
the change is also percent units, not percents
19:34
<Philip`>
http://www.e-janco.com/Samples/BrowserSample.pdf lists the sites they got data from
19:35
<Philip`>
(I like how their you-must-register-to-download-it form is circumvented by disabling JavaScript)
19:36
<Lachy>
hmm. 9 web sites isn't really a reasonable sample size
19:36
<Philip`>
By the way, how can they say 63.86% +/- 1.0%?
19:36
<zcorpan>
e.g. opera had 0.78% in 2006 and 1.87% in 2007 according to that table, that's a change of +240%
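The distinction between percentage points and relative change can be worked through with the two Opera figures quoted above. This is just arithmetic on those numbers; the variable names are illustrative.

```javascript
const share2006 = 0.78; // percent of market, as quoted
const share2007 = 1.87;

const pointChange = share2007 - share2006; // change in percentage points
const ratio = share2007 / share2006;       // 2007 share relative to 2006
const relativeChange = (ratio - 1) * 100;  // relative increase in percent

console.log(pointChange.toFixed(2));    // 1.09
console.log(ratio.toFixed(2));          // 2.40
console.log(relativeChange.toFixed(0)); // 140
```

On these figures the 2007 share is about 240% of the 2006 share, which corresponds to a relative increase of roughly 140%, against a change of only about 1.09 percentage points.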
19:38
<zcorpan>
Philip`: i would guess they rounded the numbers arbitrarily and picked 1.0% out of the blue :)
19:40
<markp>
re: http://krijnhoetmer.nl/irc-logs/whatwg/20070906#l-554
19:40
<markp>
html5lib does NOT have a validator
19:40
<markp>
it has a prototype of an idea of a concept of a half-baked scheme of a validator
19:40
<markp>
i hope that clarifies things
19:40
<Philip`_>
How come Netscape usage increased from 6.3% to 10.2% in the past two years, when a majority of Netscape users are (apparently) Netscape 4.x? Does that mean everybody was using NS4 two years ago, or more people have started using NS4 now than two years ago?
19:41
<markp>
and a disturbing amount of it was written on a red-eye flight from CA to NC
19:41
<Hixie>
netscape usage isn't anywhere near 6% or 10%
19:41
<kingryan>
markp: I didn't mean to overrepresent it as something useful :)
19:42
<markp>
:)
19:42
<zcorpan>
some bots might identify themselves as nn4
19:42
<zcorpan>
though i wouldn't expect that to change the number much
19:43
<Philip`_>
The numbers don't seem entirely convincing
19:44
<Philip`_>
and I'd prefer to not be convinced that people still actually use Netscape 4 :-)
20:19
zcorpan
is unsure how to implement the escaping with javascript
20:20
<Dashiva>
What kind of escaping?
20:21
<zcorpan>
in sdf
20:21
<zcorpan>
http://simon.html5.org/specs/sdf
20:21
<zcorpan>
see the last example at the bottom
20:25
<Dashiva>
Newlines are serialized as actual newlines?
20:26
<zcorpan>
LFs, yes
20:27
<Dashiva>
Isn't that likely to cause trouble outside browsers?
20:27
<zcorpan>
why?
20:28
<Dashiva>
Could be misinterpreted as a line separator?
20:29
<Lachy>
zcorpan, see http://lachy.id.au/dev/mozilla/sidebar/Unicode/character-tools.html
20:30
<Lachy>
that has the code to deal with parsing \uXXXX
20:30
<zcorpan>
Dashiva: you mean you would parse line for line instead of character by character?
20:30
<zcorpan>
Lachy: thanks
20:31
<Dashiva>
I just know I've used text editors that do weird stuff when faced with both CR and LF mixed
20:31
<Lachy>
String.fromCharCode(codepoint) where code point is XXXX
20:32
<zcorpan>
Lachy: but will that work correctly with surrogate pairs?
20:32
<Lachy>
you have to do each one separately, JS can't deal with astral characters
20:32
<zcorpan>
Dashiva: but this format will only have LFs
20:33
<Dashiva>
Oh! I missed that
20:33
<zcorpan>
Lachy: oh. well then that makes it simpler
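Decoding each escape separately, as described above, might be sketched like this. The helper name is hypothetical, and the sketch only handles \uXXXX; it deliberately ignores other cases such as an escaped backslash preceding a "u".

```javascript
// Hypothetical helper: replace each \uXXXX escape with the UTF-16 code unit
// it names. Surrogates need no special handling: two adjacent escapes become
// two adjacent code units, which together form one astral character.
function unescapeCodeUnits(text) {
  return text.replace(/\\u([0-9A-Fa-f]{4})/g, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

const decoded = unescapeCodeUnits("\\uD801\\uDC7E");
console.log(decoded === "\u{1047E}");             // true
console.log(decoded.codePointAt(0).toString(16)); // 1047e
```

The pair D801 DC7E is the UTF-16 encoding of U+1047E, the character suggested earlier for the examples.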
20:33
<Lachy>
zcorpan, you should probably make that clearer about LFs, even though that's actually specified in HTML5
20:33
<Dashiva>
What's the reason for not escaping them inside strings, though?
20:34
<Dashiva>
It makes the code more complex, after all
20:34
<zcorpan>
Lachy: html5?
20:34
<Lachy>
doesn't HTML5 specify how to serialise a DOM
20:35
<zcorpan>
umm. to html and xml using innerHTML
20:35
<Lachy>
well, actually, it defines that CR and CRLF will be replaced with LF in the input stream, so CRs won't appear in the output
20:35
<Lachy>
unless they're added later
20:35
<zcorpan>
Lachy: right, but the dom can have CRs
20:36
<zcorpan>
Lachy: CRs have to be escaped in sdf
20:36
<Lachy>
why?
20:36
<zcorpan>
to avoid ending up with a mix of CRs and LFs
20:37
<zcorpan>
some text editors will be smart and make sure the file has consistent line endings
20:38
<zcorpan>
Dashiva: i'm not sure what is more complex by having LFs in strings
20:39
<Dashiva>
You need an additional check if ( c == 0x000A ) { emit LF }
20:39
<Lachy>
zcorpan, this works http://html5.lachy.id.au/output?data=%3Cscript%3E%0D%0Avar+c1+%3D+parseInt%28%22D801%22%2C+16%29%3B%0D%0Avar+c2+%3D+parseInt%28%22DC7E%22%2C+16%29%3B%0D%0Avar+chars+%3D+String.fromCharCode%28c1%29+%2B+String.fromCharCode%28c2%29%3B%0D%0A%0D%0Adocument.write%28chars%29%3B%0D%0A%3C%2Fscript%3E&type=text%2Fhtml%3B+charset%3DUTF-8
20:39
<Philip`>
LFs in strings would mean you couldn't use a text editor to indent a chunk of SDF
20:40
<zcorpan>
Philip`: why not?
20:40
<Dashiva>
And from the syntax, it seems like SDF strings are intended to be JS-parseable, which the LFs break
20:40
<Lachy>
zcorpan, because you'd introduce extra spaces into the string
20:40
<Philip`>
Because the editor would add indentation at the start of each line, and if the start of a line is the inside of a string, then you'd be modifying the inside of the string unintentionally
20:41
<zcorpan>
good points
20:41
zcorpan
takes out the LF special case
20:41
<Lachy>
zcorpan, maybe allow \n and \r to occur within strings
20:42
<zcorpan>
maybe. if i do then i could also allow the other shorthands
20:42
<Lachy>
like \t?
20:43
<zcorpan>
yeah, and \f
20:43
<Lachy>
oh right
20:43
<Lachy>
yeah, those are compatible with various libraries and programming languages, so that shouldn't add too much complexity
20:44
<Philip`>
Can you just define it to be a JSON string?
20:44
<Dashiva>
That's pretty much what it is
20:44
<zcorpan>
well then :)
20:44
<Dashiva>
http://pastebot.nd.edu/3535
20:45
<zcorpan>
i guess i could simplify my parser after these changes
20:47
<Dashiva>
One concern about the current form is that it uses a lot of bytes for non-printables inside the ascii range, but that's hardly a big deal
20:50
zcorpan
puts sdf in version control
20:52
<hsivonen>
hmm. I wonder if a reply from me is expected to John Foliot's public-html follow-up to my message
20:58
<zcorpan>
ok, changed string to JSON string
21:01
<Hixie>
any italians able to translate this for us? :-) http://scaccoalweb.vnunet.it/2007/09/html-5-ritorno-.html
21:01
<zcorpan>
google translate seems to work :)
21:01
<Lachy>
Hixie http://translate.google.com/translate?u=http%3A%2F%2Fscaccoalweb.vnunet.it%2F2007%2F09%2Fhtml-5-ritorno-.html
21:02
<Hixie>
wow, a translation service that works? *skeptical*
21:03
zcorpan
points to http://hsivonen.iki.fi/test/translate.html
21:06
<zcorpan>
i use that bookmarklet a lot :)
21:06
<Hixie>
:-)
21:06
<Hixie>
it doesn't work on my portal
21:06
<Hixie>
i guess encodeURIComponent() doesn't work on IDN uris
21:43
<Hixie>
www.satogo.com has a pretty cool AT tool
21:43
<Hixie>
seems on par with JAWS in many ways
21:44
<Hixie>
(there's a free trial if you want to try it)
23:02
<Hixie>
i'm amused by the arguments that say that x% of pages omit alt attributes, since they seem to forget the x% of pages that do include alt attributes but do so in fundamentally useless or bad ways
23:15
<Philip`>
It matters that the x% omitting alt because of lazy/stupid/careless/etc authors is non-zero and non-negligible, because then the 0.00y% that omits alt because their authors conscientiously followed the HTML5 spec (and that has indistinguishable syntax from the first case) will have to be processed with the "lazy/stupid/careless author forgot about alt" semantics, not the "critical part of content" semantics, in order to work in the majority of cases
23:39
zcorpan
finds it somewhat ironic that http://blog.whatwg.org/result-format seems to have comments disabled
23:40
<zcorpan>
hsivonen: was that intentional?
23:50
<hsivonen>
zcorpan: no. the admin interface tells me they are enabled
23:50
<hsivonen>
Lachy: see above. fallout from the upgrade?
23:50
<hsivonen>
nn
23:51
<zcorpan>
nn hsivonen