00:09
<Huvet>
seems to work for my use case (for anyone reading the log later on):
00:09
<Huvet>
import unicodedata
def remove_control_characters(html):
    html = re.sub(ur"&#(\d+);", ur"\x\1", html)
    return "".join(ch for ch in html if unicodedata.category(ch)[0] != "C")
00:10
<gsnedders>
wait, that works?
00:10
<gsnedders>
Oh, wait, that's the raw string \x\1
00:11
<Huvet>
well, I could also be making a fool of myself, and messing up all the entities
00:11
<gsnedders>
I'm trying to remember what this does off-hand :)
00:11
<Huvet>
let me actually make sure this works... ;)
00:17
<Huvet>
import unicodedata
def remove_control_characters(html):
    html = re.sub(ur"&#(\d+);", lambda c: unichr(int(c.group(1))), html)
    return "".join(ch for ch in html if unicodedata.category(ch)[0] != "C")
00:17
<Huvet>
that actually works, but only for decimal-coded entities
00:18
<gsnedders>
That doesn't cause errors?
00:18
<gsnedders>
Huh
00:19
<Huvet>
it seems to work with realworld html at least
00:22
<Philip`>
As long as nobody uses &#x3;
00:22
<gsnedders>
Philip`: you live!
00:24
<Huvet>
Philip`: thx, added: html = re.sub(ur"&#x(\d+);", lambda c: unichr(int(c.group(1), 16)), html)
00:25
<Philip`>
Huvet: Might want &#x([0-9a-fA-F]+);
00:25
<Philip`>
or &#[xX]([0-9a-fA-F]+);
00:26
<Huvet>
what, you mean hex character can go to 16!? ;)
00:26
<Philip`>
Depending on how accurate you want to be, you might have to write an entire HTML parser in a regex
00:26
<Huvet>
yeah, I've read on stackoverflow that that is a great idea
00:26
<gsnedders>
You need to handle <script>, <textarea>, and a couple of others too.
00:26
<gsnedders>
:)
00:27
<Huvet>
this is just an ugly hack, until you fix it the non-hacky way
00:28
<Huvet>
135 domains, so quite controlled environment
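For anyone reading the log later: the pieces discussed above (decimal references, Philip`'s `&#[xX]([0-9a-fA-F]+);` pattern, and the category filter) could be combined roughly as follows. This is only a sketch in Python 3 (so `chr` instead of `unichr`, and no `ur""` prefix), not the code Huvet ended up running. The `_NUMERIC_REF` and `_decode_ref` names are made up for illustration, and leaving out-of-range references untouched is an assumption rather than anything agreed in the channel.

```python
import re
import unicodedata

# Matches decimal (&#3;) and hexadecimal (&#x3; or &#X3;) numeric references.
# Group 1 holds hex digits, group 2 holds decimal digits.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")


def _decode_ref(match):
    """Turn one numeric character reference into the character it names."""
    hex_digits, dec_digits = match.groups()
    code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
    try:
        return chr(code_point)
    except (ValueError, OverflowError):
        # Assumption: references beyond U+10FFFF are left as written.
        return match.group(0)


def remove_control_characters(html):
    """Decode numeric references, then drop every character whose Unicode
    general category starts with "C" (control, format, surrogate,
    private use, unassigned). Note that this also drops tabs and newlines."""
    html = _NUMERIC_REF.sub(_decode_ref, html)
    return "".join(ch for ch in html if unicodedata.category(ch)[0] != "C")
```

As in the original hack, every numeric reference gets decoded, so something like `&#60;` turns into a literal `<`. That is probably acceptable only for a controlled set of inputs, and it still ignores `<script>`, `<textarea>`, and friends.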
10:32
<annevk_>
JonathanNeal: <object> is like <iframe> for SVG
10:33
<annevk_>
JonathanNeal: you get a browsing context and such
16:07
<annevk>
JakeA: I expanded the list of request contexts somewhat: http://fetch.spec.whatwg.org/#concept-request-context
16:07
<annevk>
JakeA: at some point I should probably start maintaining the IDL enum
16:15
<JakeA>
annevk: looks good. Where is the context set, through the various specs when they call fetch?
16:15
<annevk>
JakeA: that's the idea
16:15
<annevk>
JakeA: specs migrating to Fetch is going to happen at some point
16:36
<annevk>
mounir: when you implemented navigator.languages, did you ensure it returns a JavaScript Array object?
16:36
<annevk>
mounir: the specification wants to return an IDL array, which seems broken
16:39
<annevk>
mathiasbynens: https://gist.github.com/annevk/6bfa782752dde6acb379
16:41
<mathiasbynens>
annevk: agreed completely, but how to convince tc39?
16:41
<mathiasbynens>
i couldn’t even get es-discuss excited
16:42
<annevk>
mathiasbynens: I think to some extent it depends on putting it on the agenda and preparing a detailed proposal
16:42
<annevk>
mathiasbynens: and then unfortunately getting someone to put it in their face when they have a meeting
16:42
<mathiasbynens>
annevk: you stopped attending?
16:42
<annevk>
mathiasbynens: hopefully that last step evolves to a more open process
16:43
<annevk>
mathiasbynens: I attended twice to get to know everyone and tell them the platform guys mean no harm and we want better IDL and what not just as much as the next guy
16:44
<Ms2ger>
So why would you ever want to put something into ES?
16:44
<annevk>
mathiasbynens: I don't think there's much value in me attending more for now, maybe once we come to another impasse
16:44
<mathiasbynens>
Ms2ger: cause then it’s not just available in browsers
16:44
<zewt>
javascript is pretty irreelvant outside of browsers
16:45
<zewt>
irrelevant, even
16:45
<Ms2ger>
I don't see why that'd necessarily require having TC39 mess with it either
16:45
<annevk>
Ms2ger: a) gets considered as part of the design of new objects (important for structured cloning) and maybe even get syntax support (might be interesting for workers some day) and b) someone else gets to maintain it and fix the bugs
16:46
<annevk>
a) seems somewhat more important, but b) is very nice too
16:46
<mathiasbynens>
zewt: you’re funny :)
16:46
<Ms2ger>
mathiasbynens, you're unhelpful
16:47
<mathiasbynens>
oh, zewt wasn’t kidding?
16:47
<Ms2ger>
So it's really just because TC39 ignores the rest of the platform, unless you put it in their spec?
16:49
<mathiasbynens>
not just that, it would also mean that non-browser ES engines can stop making up their own stdlibs that do similar things to what’s already in browsers
16:50
<Ms2ger>
They already can
16:50
<mathiasbynens>
ok then i guess the point is that not just tc39 ignores the rest of the platform
16:51
<annevk>
Ms2ger: I don't think they ignore the rest of the platform; they didn't invent new byte types
16:51
<annevk>
Ms2ger: they did refactor them and make them work nicely with arrays and now build new things on top (value objects)