| 00:09 | <Huvet> | seems to work for my use case (for anyone reading the log later on): |
| 00:09 | <Huvet> | import unicodedata def remove_control_characters(html): html = re.sub(ur"&#(\d+);", ur"\x\1", html) return "".join(ch for ch in html if unicodedata.category(ch)[0] != "C") |
| 00:10 | <gsnedders> | wait, that works? |
| 00:10 | <gsnedders> | Oh, wait, that's the raw string \x\1 |
| 00:11 | <Huvet> | well, I could also be making a fool of myself, and messing up all the entities |
| 00:11 | <gsnedders> | I'm trying to remember what this does off-hand :) |
| 00:11 | <Huvet> | let me actually make sure this works... ;) |
| 00:17 | <Huvet> | import unicodedata def remove_control_characters(html): html = re.sub(ur"&#(\d+);", lambda c: unichr(int(c.group(1))), html) return "".join(ch for ch in html if unicodedata.category(ch)[0] != "C") |
| 00:17 | <Huvet> | that actually works, only for decimal coded entities |
| 00:18 | <gsnedders> | That doesn't cause errors? |
| 00:18 | <gsnedders> | Huh |
| 00:19 | <Huvet> | it seems to work with realworld html at least |
| 00:22 | <Philip`> | As long as nobody uses  |
| 00:22 | <gsnedders> | Philip`: you live! |
| 00:24 | <Huvet> | Philip`: thx, added: html = re.sub(ur"&#x(\d+);", lambda c: unichr(int(c.group(1), 16)), html) |
| 00:25 | <Philip`> | Huvet: Might want &#x([0-9a-fA-F]+); |
| 00:25 | <Philip`> | or &#[xX]([0-9a-fA-F]+); |
| 00:26 | <Huvet> | what, you mean hex character can go to 16!? ;) |
| 00:26 | <Philip`> | Depending on how accurate you want to be, you might have to write an entire HTML parser in a regex |
| 00:26 | <Huvet> | yeah, I've read on stackoverflow that that is a great idea |
| 00:26 | <gsnedders> | You need to handle <script>, <textarea>, and a couple of others too. |
| 00:26 | <gsnedders> | :) |
| 00:27 | <Huvet> | this is just an ugly hack, until you fix it the non-hacky way |
| 00:28 | <Huvet> | 135 domains, so quite controlled environment |
| 10:32 | <annevk_> | JonathanNeal: <object> is like <iframe> for SVG |
| 10:33 | <annevk_> | JonathanNeal: you get a browsing context and such |
| 16:07 | <annevk> | JakeA: I expanded the list of request contexts somewhat: http://fetch.spec.whatwg.org/#concept-request-context |
| 16:07 | <annevk> | JakeA: at some point I should probably start maintaining the IDL enum |
| 16:15 | <JakeA> | annevk: looks good. Where is the context set, through the various specs when they call fetch? |
| 16:15 | <annevk> | JakeA: that's the idea |
| 16:15 | <annevk> | JakeA: specs migrating to Fetch is going to happen at some point |
| 16:36 | <annevk> | mounir: when you implemented navigator.languages, did you ensure it returns a JavaScript Array object? |
| 16:36 | <annevk> | mounir: the specification wants to return an IDL array, which seems broken |
| 16:39 | <annevk> | mathiasbynens: https://gist.github.com/annevk/6bfa782752dde6acb379 |
| 16:41 | <mathiasbynens> | annevk: agreed completely, but how to convince tc39? |
| 16:41 | <mathiasbynens> | i couldn’t even get es-discuss excited |
| 16:42 | <annevk> | mathiasbynens: I think to some extent it depends on putting it on the agenda and preparing a detailed proposal |
| 16:42 | <annevk> | mathiasbynens: and then unfortunately getting someone to put it in their face when they have a meeting |
| 16:42 | <mathiasbynens> | annevk: you stopped attending? |
| 16:42 | <annevk> | mathiasbynens: hopefully that last step evolves to a more open process |
| 16:43 | <annevk> | mathiasbynens: I attended twice to get to know everyone and tell them the platform guys mean no harm and we want better IDL and what not just as much as the next guy |
| 16:44 | <Ms2ger> | So why would you ever want to put something into ES? |
| 16:44 | <annevk> | mathiasbynens: I don't think there's much value in me attending more for now, maybe once we come to another impasse |
| 16:44 | <mathiasbynens> | Ms2ger: cause then it’s not just available in browsers |
| 16:44 | <zewt> | javascript is pretty irreelvant outside of browsers |
| 16:45 | <zewt> | irrelevant, even |
| 16:45 | <Ms2ger> | I don't see why that'd necessarily require having TC39 mess with it either |
| 16:45 | <annevk> | Ms2ger: a) gets considered as part of the design of new objects (important for structured cloning) and maybe even get syntax support (might be interesting for workers some day) and b) someone else gets to maintain it and fix the bugs |
| 16:46 | <annevk> | a) seems somewhat more important, but b) is very nice too |
| 16:46 | <mathiasbynens> | zewt: you’re funny :) |
| 16:46 | <Ms2ger> | mathiasbynens, you're unhelpful |
| 16:47 | <mathiasbynens> | oh, zewt wasn’t kidding? |
| 16:47 | <Ms2ger> | So it's really just because TC39 ignores the rest of the platform, unless you put it in their spec? |
| 16:49 | <mathiasbynens> | not just that, it would also mean that non-browser ES engines can stop making up their own stdlibs that do similar things to what’s already in browsers |
| 16:50 | <Ms2ger> | They already can |
| 16:50 | <mathiasbynens> | ok then i guess the point is that not just tc39 ignores the rest of the platform |
| 16:51 | <annevk> | Ms2ger: I don't think they ignore the rest of the platform; they didn't invent new byte types |
| 16:51 | <annevk> | Ms2ger: they did refactor them and make them work nicely with arrays and now build new things on top (value objects) |
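
Editor's sketch of the control-character hack discussed between 00:09 and 00:28, combining Huvet's decimal version with the hex pattern Philip` suggested. This is a Python 3 port (the log uses Python 2's `ur""` and `unichr`), not the code Huvet actually ran. The helper name `_decode_ref`, the combined regex, and the choice to leave out-of-range references untouched are assumptions made for illustration.

```python
import re
import unicodedata

# Decimal (&#65;) and hexadecimal (&#x41; or &#X41;) numeric character references.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")


def _decode_ref(match):
    """Turn one numeric character reference into the character it names."""
    hex_digits, dec_digits = match.groups()
    code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
    try:
        return chr(code_point)
    except (ValueError, OverflowError):
        # Assumption: a reference beyond U+10FFFF is left as written
        # instead of raising, so one bad entity cannot abort the run.
        return match.group(0)


def remove_control_characters(html):
    # Decode references first so that an encoded control character
    # (for example &#1;) is caught by the category filter below.
    html = _NUMERIC_REF.sub(_decode_ref, html)
    # Drop every character whose Unicode general category starts with "C"
    # (Cc, Cf, Cs, Co, Cn). Note that this also removes tabs and newlines.
    return "".join(ch for ch in html if unicodedata.category(ch)[0] != "C")


print(remove_control_characters("a&#1;b &#x41;&#66;"))
```

As in the log, this is a regex hack and not a parser: it also decodes references such as `&#60;` into literal markup characters and does not treat `<script>` or `<textarea>` content specially, so it is only reasonable for a controlled set of inputs.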