05:14
<Hixie>
deep in the middle of a page's markup, suddenly, a wild namespace appears!:
05:14
<Hixie>
<?xml version="1.0" encoding="utf-16"?><table xmlns="http://www.w3.org/TR/REC-html40" xmlns:msxsl="urn:schemas-microsoft-com:xslt" xmlns:dt="urn:schemas-microsoft-com:datatypes" xmlns:user="urn:my-scripts"><tr>...
05:28
<tantek>
xmlns - worst attribute ever?
09:31
<matjas>
I just realized that XHTML handles character references differently
09:31
<matjas>
i.e. `&#x85;` → U+0085 in XHTML, while in HTML it’s U+2026
09:31
<matjas>
but AFAICT this isn’t mentioned here: http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#tokenizing-character-references what am I missing?
09:32
<matjas>
I assume the parsing section only applies to non-XHTML HTML, as XML is defined elsewhere, correct?
09:36
<Ms2ger>
That's in the HTML syntax section, no?
09:37
<matjas>
http://www.w3.org/TR/xml/#d0e3895
09:38
<matjas>
Ms2ger: yeah, I guess it all makes sense
09:38
<matjas>
just never realized this difference before — kinda mind-blowing
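The difference matjas describes can be reproduced outside a browser. The sketch below uses Python's standard library as a stand-in, which is an assumption (nobody in the channel used Python for this): `html.unescape` applies the HTML rules for numeric character references, while `xml.etree.ElementTree` parses the same reference as XML. The byte-level comparison at the end is added only for illustration.

```python
import html
import xml.etree.ElementTree as ET

# HTML rules: numeric references in the 0x80-0x9F range are remapped
# as if they were windows-1252 bytes, so &#x85; becomes an ellipsis.
html_result = html.unescape("&#x85;")

# XML rules: the reference means exactly the code point it names.
xml_result = ET.fromstring("<r>&#x85;</r>").text

# For comparison: the single byte 0x85 under the two legacy encodings.
latin1_result = bytes([0x85]).decode("latin-1")
cp1252_result = bytes([0x85]).decode("cp1252")

for label, value in [
    ("HTML", html_result),
    ("XML", xml_result),
    ("latin-1", latin1_result),
    ("cp1252", cp1252_result),
]:
    print(f"{label:8} U+{ord(value):04X}")
```

The HTML result lines up with the windows-1252 byte and the XML result with the ISO-8859-1 byte.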
09:39
<Ms2ger>
Yay for stupid stuff
10:10
<zcorpan>
matjas: check out http://software.hixie.ch/utilities/js/live-dom-viewer/saved/2696
10:24
<hsivonen>
zcorpan: I can't remember what Gecko does with non-JS <script src>
10:46
<zcorpan>
hsivonen: in my testing it doesn't fetch anything
10:49
<hsivonen>
zcorpan: than it doesn't :-)
10:49
<hsivonen>
*then
10:52
<matjas>
zcorpan: oh wow
10:56
<hsivonen>
matjas: It's kinda annoying that Unicode starts with ISO-8859-1 instead of starting with windows-1252
11:06
<annevk-cloud>
So much
11:24
<darobin_>
Domenic_: I read in the back log that you felt bad about being flippant and unpolitic about streams — don't. It can get a lot worse without being a problem.
11:24
<darobin_>
If you feel you should somehow apologise do so, but in any case just move on
11:25
<darobin_>
and give us a fucking streams API today :)
11:26
<hsivonen>
annevk: did you check the source code for the GBK and GB18030 decoders in Gecko?
11:26
<hsivonen>
the latter is a subclass of the former
11:27
<annevk>
hsivonen: but they are not using the same table I think
11:27
<annevk>
hsivonen: I find Gecko's code for encodings hard to follow
11:27
<annevk>
hsivonen: it has all kinds of abstractions that could be removed
11:30
<hsivonen>
annevk: not the same table, right
11:30
<hsivonen>
annevk: the GBK table has one entry! for euro
11:30
<annevk>
hsivonen: whoa
11:30
<hsivonen>
I can't figure out from the GB18030 tables if 0x80 maps to euro in the larger table
11:32
<hsivonen>
euro sign is kinda like time zones: the politicians who come up with this stuff should be required to implement this stuff. *Correctly.*
11:33
<annevk>
hsivonen: in gb18030 0x80 maps to 0x20AC
11:33
<annevk>
hsivonen: so if Gecko does that for gbk...
11:33
<annevk>
and http://mxr.mozilla.org/mozilla-central/source/intl/uconv/ucvcn/gbkuniq2b.ut seems to indicate it does
11:35
<annevk>
hsivonen: actually, in Chrome gb18030 0x80 does not map to 0x20AC
11:35
<hsivonen>
I regret putting gbk in the new charset menu
11:36
<hsivonen>
since now it's more of a hassle to change the UI strings
11:39
<annevk>
hsivonen: my bad, I should have sorted this out long ago
11:39
<annevk>
hsivonen: seems that in IE byte 80 is also not mapped to the Euro sign, but it's not mapped to anything sensible either so...
11:43
<annevk>
hsivonen: compare http://dump.testsuite.org/encoding/gbk/byte-80-gbk.html and http://dump.testsuite.org/encoding/gbk/byte-80-gb18030.html btw
11:43
<annevk>
hsivonen: however, it seems that treating 0x80 as 0x20AC is harmless
11:45
<annevk>
hsivonen: http://en.wikipedia.org/wiki/GB_18030#GB18030_as_a_code_page has all the details it seems in the third paragraph with respect to the incompatibility
11:45
<hsivonen>
annevk: the latter shows a question mark for 0x80 in IE11
11:45
<hsivonen>
why not U+FFFD?
11:45
<annevk>
hsivonen: same in IE10
11:46
<annevk>
hsivonen: I suspect they just have weird error handling
11:47
<annevk>
I could test what they do for FF
11:47
<annevk>
or 8100
11:47
annevk
makes a test
11:50
<annevk>
hsivonen: see also http://dump.testsuite.org/encoding/gbk/byte-FF-gb18030.html and http://dump.testsuite.org/encoding/gbk/byte-8100-gb18030.html
11:50
<annevk>
hsivonen: IE10 is a mess
11:51
<annevk>
hsivonen: not sure what Chrome is doing either :/
11:51
<annevk>
Or I suppose, what ICU is doing, that seems broken
11:52
<hsivonen>
annevk: does IE just drop 0xFF?
11:52
<annevk>
hsivonen: no it renders it as 
11:53
<annevk>
hsivonen: whoa, that's a round dot in Windows, but on Mac it renders totally different...
11:54
<annevk>
hsivonen: oh, it's PUA
11:55
<annevk>
hsivonen: U+F8F5
12:13
<zcorpan>
wonder if i should test url query encoding in the prescanner
12:18
<annevk>
hsivonen: the IE gbk table is identical to the Chrome gbk table
12:18
<annevk>
hsivonen: I find one difference in the gb18030 table weirdly enough
12:18
<annevk>
hsivonen: index 6555 maps to 3000 in Gecko and E5E5 in IE
12:24
<annevk>
hsivonen: seems like we should just alias gbk and gb18030 and not worry about the 81/82 PUAs
12:30
<hsivonen>
annevk: aliasing will change how the euro sign gets submitted in forms, right?
12:31
hsivonen
hopes the euro sign doesn't get submitted to Chinese sites often
12:31
<annevk>
hsivonen: yeah, it would use the gb18030 two byte sequence, unless we special case 20AC in the encoder
12:32
<annevk>
hsivonen: special casing it in the encoder however would be incompatible with non-gbk compatible gb18030 implementations
12:32
<hsivonen>
special-casing the encoder might break sites that really are already using de jure GB18030
12:32
<annevk>
right
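The encoder trade-off discussed above can be sketched with Python's `gb18030` codec standing in for a browser encoder (an assumption, since browser tables may differ in detail). `encode_gbk_compat` is a hypothetical helper, not anything from Gecko or the Encoding spec: it special-cases U+20AC as the single byte 0x80 and otherwise defers to gb18030.

```python
EURO = "\u20ac"


def encode_gb18030(text: str) -> bytes:
    """Plain gb18030: the euro sign gets its regular two-byte sequence."""
    return text.encode("gb18030")


def encode_gbk_compat(text: str) -> bytes:
    """Hypothetical variant: emit U+20AC as the single byte 0x80."""
    return b"".join(
        b"\x80" if ch == EURO else ch.encode("gb18030") for ch in text
    )


sample = "5" + EURO
print(encode_gb18030(sample).hex())     # two bytes for the euro sign
print(encode_gbk_compat(sample).hex())  # one byte (0x80) for the euro sign
```

A decoder that only knows de jure GB18030 has no mapping for a lone 0x80, which is the incompatibility both annevk and hsivonen point at.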
12:32
<hsivonen>
currency signs are such a bad idea
12:33
<annevk>
Surprising how little code the x-user-def is
12:33
<hsivonen>
both this whole euro thing and the sheqel sign difference between 8859-8 and windows-1255
12:34
<hsivonen>
annevk: yeah
12:34
<annevk>
Was it Unicode 6.2 or 6.3 that did the Turkish thing?
12:34
<hsivonen>
oh and *that*
12:35
<hsivonen>
at least bitcoin uses an existing sign
12:38
<annevk>
hsivonen: well, "B⃦ has been the standard currency sign for BTC for a long time. Some existing Unicode symbols have been proposed but also serious work is being done on creating a custom Bitcoin sign with its own official Unicode that is recognized by the Unicode Consortium."
12:38
<annevk>
hsivonen: from https://en.bitcoin.it/wiki/Bitcoin_symbol
12:42
<hsivonen>
annevk: sadness
12:47
<MikeSmith>
ah cool x-webkit-speech
12:47
<MikeSmith>
https://www.w3.org/Bugs/Public/show_bug.cgi?id=24126
14:49
<Ms2ger>
odinho++
15:07
<annevk>
hsivonen: just to be clear, current gbk sites with <form> might run into issues with more than U+20AC
15:07
<annevk>
hsivonen: as every code point will map to a byte sequence
15:11
<SimonSapin>
jgraham: what’s the fix when Critic gets upset by an amended commit?
15:11
<jgraham>
SimonSapin: There isn't one
15:12
<jgraham>
SimonSapin: I have instructions from jl on how to go about fixing that though
15:12
<jgraham>
So, happy Christmas to me, I guess
15:13
<SimonSapin>
I knew that present would make you happy :)
15:13
<jgraham>
SimonSapin: Unless you want to make a patch ;)
15:13
<SimonSapin>
I could submit a new pull request
15:13
<SimonSapin>
for the same changes
15:14
<SimonSapin>
https://github.com/html5lib/html5lib-python/pull/123
15:15
<hsivonen>
annevk: are *all* of those sequences now inconsistent between different browsers' gbk encoders?
15:16
<annevk>
hsivonen: gbk is a subset
15:16
<annevk>
hsivonen: so if you go outside the subset, you'd get &#....; for the code point or ?, depending on the error handling
15:16
<hsivonen>
annevk: so is it inconsistent between "&#....;" and "?"?
15:17
<annevk>
hsivonen: gb18030 would however never trigger error handling
15:17
<annevk>
hsivonen: it's inconsistent between generating a gb18030 byte sequence for the code point in the merger scenario and &#...; in the gbk <form> submission scenario
15:18
<annevk>
(for code points outside the two byte range)
15:18
<annevk>
(and 0x80)
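A rough illustration of the two outcomes annevk contrasts, again using Python's codecs as a stand-in for browser form submission (an assumption). The `xmlcharrefreplace` and `replace` error modes play the roles of the "&#....;" and "?" behaviours; the sample code point is an arbitrary choice from outside the two-byte range.

```python
text = "\U0001F600"  # a code point outside the two-byte GBK repertoire

# gbk: the encoder hits an error, and the result depends on the error mode.
as_charref = text.encode("gbk", errors="xmlcharrefreplace")
as_question = text.encode("gbk", errors="replace")

# gb18030: the code point has its own byte sequence, so no error handling runs.
as_gb18030 = text.encode("gb18030")

print(as_charref)        # decimal character reference as ASCII bytes
print(as_question)       # a single question mark
print(as_gb18030.hex())  # a four-byte gb18030 sequence
```

If gbk became an alias of gb18030, a site expecting the character reference would receive the four-byte sequence instead.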
15:18
<odinho>
Ms2ger: I'm incremented! :D (Not too long ago also, I went from 26 to 27 on the 6th)
15:18
<Ms2ger>
Happy incrementation :)
15:20
<jgraham>
odinho: Wait, what, you are tracking it?
15:21
<Ms2ger>
I guess some people would avoid tracking it, at their level :)
15:21
<jgraham>
Ms2ger--
15:21
<jgraham>
:p
15:22
<Ms2ger>
:D
15:24
<odinho>
deferred incrementation :) I save up the function and only flush once per year. And then only run one cycle.
15:25
<odinho>
I wonder what happens if I start getting more decrements than all increments I've had in total. What will happen then :-O
15:26
<jgraham>
odinho: We can test if you want :
15:26
<jgraham>
p
15:26
<odinho>
^_^
15:39
<hsivonen>
annevk: so Gecko's GB18030 impl already matches the spec and we only need to change the label stuff, right?
15:41
<annevk>
hsivonen: afaict, yes
15:41
<annevk>
hsivonen: Gecko has the 0x80 mapping which is the only "weird" thing
15:46
<jgraham>
SimonSapin: urllib[2].urlopen doesn't return a HTTPResponse object. What am I missing?
15:46
<SimonSapin>
it does on Python 3
15:46
<SimonSapin>
well, urllib.request.urlopen()
15:46
<jgraham>
And the bug doesn't happen on python2?
15:47
<hsivonen>
annevk: ok. bug filed: https://bugzilla.mozilla.org/show_bug.cgi?id=951691
15:48
<SimonSapin>
jgraham: apparently urllib[2].urlopen does not use httplib on Python 2, but httplib is still affected
15:48
<GPHemsley>
Gonna be updating the wiki software today. Nobody panic. :)
15:48
<jgraham>
SimonSapin: I see
15:48
<annevk>
hsivonen: should we have a separate bug on removing GBK from the menu?
15:49
<annevk>
hsivonen: thinking about it, that has the same risk (in theory) so grouping them seems fine
15:49
<jgraham>
SimonSapin: Thanks
15:49
<hsivonen>
annevk: I think we can do that in the same bug unless the label bug gets stalled somehow
15:49
<hsivonen>
annevk: I think the menu is lower risk
15:50
<hsivonen>
annevk: since users probably already can't make an informed choice and can end up submitting whatever
15:50
<annevk>
hsivonen: have you thought about "Character Encoding" -> "Text Encoding" btw?
15:50
<hsivonen>
annevk: I haven't
15:50
<hsivonen>
annevk: not my bikeshed
15:50
<annevk>
hsivonen: ait
15:50
<annevk>
Chrome uses Encoding
15:52
<annevk>
IE names it Encoding as well
15:52
<annevk>
hsivonen: would that be a bug against Firefox or some other component?
15:53
<gsnedders>
jgraham: Are you fixing the bug, then?
15:53
<gsnedders>
jgraham: Or dealing with it at least?
15:53
<hsivonen>
annevk: Firefox
15:53
<annevk>
ta
15:54
<jgraham>
gsnedders: What bug?
15:54
<gsnedders>
jgraham: the html5lib one
15:54
<gsnedders>
jgraham: that you were discussing with SimonSapin above
15:56
<jgraham>
gsnedders: He fixed it
15:56
<annevk>
https://bugzilla.mozilla.org/show_bug.cgi?id=951695
16:03
<rwaldron>
annevk ping for quick Q
16:03
<annevk>
rwaldron: go ahead
16:03
<rwaldron>
In Boston, you mentioned an "Elements" API
16:03
<rwaldron>
IIRC, this was an Array subclass
16:03
<rwaldron>
"from the future" ;)
16:03
<annevk>
rwaldron: see links in red box http://dom.spec.whatwg.org/#elements
16:04
<rwaldron>
<3
16:04
<rwaldron>
you read my mind
16:04
<rwaldron>
thanks!
16:04
<rwaldron>
(ie. the next question was "i can has link?")
16:04
<rwaldron>
thanks again
16:04
<annevk>
heh
17:01
<GPHemsley>
Wiki upgrade in progress. Nobody panic.
17:04
<Hixie>
woot
17:04
Ms2ger
grasps his towel
17:18
<GPHemsley>
OK, I think we're good. Let me know if you see any issues.
17:57
<SimonSapin>
jgraham: what happens after approval in Critic?
17:58
<SimonSapin>
also, should I resubmit #123 to get Critic unstuck?
18:03
<jgraham>
SimonSapin: Someone has to merge
18:03
<jgraham>
or rebase
18:04
<jgraham>
Although I think that html5lib has a non-linear history in any case so merging might not be so bad
18:04
<jgraham>
And yes, resubmitting is probably a good idea
18:22
<gsnedders>
jgraham: I aim for history from git onwards to be linear, FWIW
18:23
<jgraham>
gsnedders: Then you have a rebase to do :)
18:24
<gsnedders>
jgraham: If you can find the one number I need for this HMRC form.
18:26
<Hixie>
when you do frames[0].location.reload(), what's the source browsing context? the parent, or the subframe?
18:27
<jgraham>
gsnedders: Isn't the answer in the question? The number you need is 1.
18:27
<gsnedders>
jgraham: I suspect I paid more tax than £1. :)
18:31
<MikeSmith>
Hixie: the subframe...? isn't that the case of frames[0].location.assign and .replace?
18:32
<Hixie>
Navigation for the assign() and replace() methods must be done with the responsible browsing context specified by the incumbent settings object as the source browsing context.
18:33
<MikeSmith>
HAL^WHixie: you forgot to add ,Dave at the end of that
18:34
<jgraham>
gsnedders: Oh, just tell them the answer is in GTU (Gsnedders Tax Units)
18:34
<gsnedders>
jgraham: The form doesn't allow that. :(
18:34
<MikeSmith>
Hixie: so given that, why shouldn't the same be true for .reload()?
18:35
<gsnedders>
(I mean, HMRC should /already have this number/. Do I really need to tell them it!?)
18:35
<Hixie>
MikeSmith: yeah, that's my conclusion too. i've just checked that in. previously it was just undefined.
18:35
<Hixie>
MikeSmith: it's mostly academic, really. i think it only affects if you can do parent.location.reload() when you're sandboxed allow-same-origin allow-scripts, which is a very dumb situation to be in anyway.
18:36
<MikeSmith>
oh
18:36
<MikeSmith>
well I don't plan on doing that at least
18:36
<Hixie>
heh
18:37
<Hixie>
oh actually location.reload() explicitly uses a different source browsing context in the situation of a non-overridden reload
18:37
<Hixie>
wonder why
18:37
gsnedders
kinda wants to prove the event loop is free from deadlock/livelock
18:39
<Hixie>
wow, http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=141 actually shows that's true, too
18:39
<Hixie>
at least in chrome
18:55
<Hixie>
how the heck can i retroactively figure out the source browsing context of a document.open()'ed document
18:57
<MikeSmith>
Hixie: printf
18:59
<Hixie>
printf of what, though?
19:03
<MikeSmith>
some whatever object in the browser source code and you just dump it out
19:03
<MikeSmith>
after you hack the source and recompile it
19:04
jgraham
looks for a blunt instrument to apply to mq
19:09
<TabAtkins>
Sigh. I hate it when people in non-American timezones try to ping me repeatedly over multiple days during times when America is asleep, and sign off when they don't hear anything from me.
19:09
<TabAtkins>
I can't respond! I'm asleep! You're a dumbass!
19:12
<Hixie>
yeah, it seems like if you're not going to stick around, then send e-mail
19:16
<jgraham>
TabAtkins: You should schedule irc breaks into your sleep pattern
19:16
<tantek>
TabAtkins just ask them to summarize their question and see if they bother checking the logs. If they don't, it can't have been that important right?
19:16
<TabAtkins>
That seems like a legitimate solution to the problem, yes.
19:17
<TabAtkins>
tantek: I'm getting pinged privately, so there's no logs for them to check.
19:17
<tantek>
you're getting pinged privately about *standards* questions?
19:17
<Hixie>
/away i'm asleep, please leave a detailed message after the beep. BEEEP.
19:18
<tantek>
TabAtkins perhaps an irc/pm auto-responder? "If you have a question about the web platform, please ask it in #whatwg and feel free to cc my alias"
19:19
<jgraham>
or, apparently, /away If you ask me something now and sign off before I respond I will call you a dumbass in public
19:19
<tantek>
LOL
19:19
<TabAtkins>
tantek: Yup. I'd like to tell them to just ping me in #whatwg, but I can't since they don't stick around in public places.
19:20
<Hixie>
oh this is in IM, not IRC?
19:21
<tantek>
TabAtkins - people are pinging you about *standards* exclusively in *private* places? Ignore them until they figure it out. Or send them this for some background reading: http://tantek.com/2011/168/b1/practices-good-open-web-standards-development
19:21
<TabAtkins>
tantek: Still can't send them anything. You keep missing the fundamental difficulty I'm running into here. ^_^
19:22
<Hixie>
(fwiw, i get the same thing.)
19:23
<jgraham>
TabAtkins: /away allows you to do that, doesn't it?
19:23
<tantek>
You two are too patient. I've long since told people to go to the (appropriate) IRC channel and ask there.
19:23
<Hixie>
tantek: how?
19:23
<TabAtkins>
jgraham: That requires me to actually remember to set /away.
19:23
<tantek>
TabAtkins - how are you getting the pings?
19:23
<TabAtkins>
tantek: HOW DO I TELL ANYONE TO PING ME ELSEWHERE WHEN THEY LEAVE BEFORE I CAN TELL THEM ANYTHING?!?
19:23
<TabAtkins>
^_^
19:23
<Hixie>
yeah, what tab said
19:23
<jgraham>
TabAtkins: Well, sure, that's a problem
19:24
<jgraham>
You could possibly set up your client to do it automatically after an inactivity timeout
19:24
<tantek>
if they're able to ping you then you are able to set up an /away auto-response
19:24
<tantek>
I think some clients auto-set /away after an inactivity timeout
19:24
<jgraham>
Although assuming you use IRCCloud that might not be super-easy
19:24
<TabAtkins>
tantek: Not a specialized one, and I don't want to spam people who are legit PMing me during the day with a message telling them to buzz off.
19:24
<TabAtkins>
jgraham: Yes, that's what I use, and to the best of my knowledge I can't set such a thing.
19:25
<jgraham>
Well you can
19:25
<jgraham>
Just need to write an extension or bookmarklet, or something
19:25
<tantek>
FWIW Colloquy has a "Sleep Message" in the settings
19:26
<TabAtkins>
tantek: Not helpful for me on a Linux box, also IRCCloud is awesome and now free through work.
19:27
tantek
is trying the Sleep Message setting
19:28
<TabAtkins>
So IRCCloud has an auto-away, but it triggers only when you leave the page in all clients.
19:29
<Hixie>
most. ridiculous. test. ever. http://damowmow.com/playground/demos/sandbox/001.html
19:29
<Hixie>
short of bidi tests.
19:30
<tantek>
oh NM - the Colloquy sleep message is triggered like a quit message when you put your computer to sleep (as opposed to actually quitting the IRC client)
19:30
<tantek>
TabAtkins - perhaps file a feature request with the IRCCloud folks to have an "after hours" setting that auto-sets you /away with a message.
19:31
<TabAtkins>
tantek: Yeah, gonna do that.
22:24
<zcorpan>
TabAtkins: just change your nick to TabAtkins_i'm_asleep_dumbass
22:24
<TabAtkins>
zcorpan: That requires the same level of forethought as setting /away.
22:26
<zcorpan>
not if you have that nick when you're awake, too :-P
22:27
<Ms2ger_i_m_aslee>
Good idea