02:05 | <Mathieu Hofman> | So has anyone else ever needed a String.codePointCompare function (a la Intl.Collator.prototype.compare) to use with sort, for comparing strings by Unicode code points instead of by code units (the default when the comparator is missing)? It seems that there is no Intl locale / collation that will do a dumb code point compare. |
02:06 | <Mathieu Hofman> | Bonus is that implementing this natively would allow engines using an internal utf8 representation for strings to just compare them by bytes! |
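
(Editor's sketch, not part of the conversation.) A userland version of the comparator asked about above might look roughly like this. `codePointCompare` borrows the hypothetical name from the message and is not an existing built-in. `utf8ByteCompare` is an illustrative helper showing the byte-wise idea; it assumes well-formed strings, because `TextEncoder` replaces lone surrogates with U+FFFD.

```javascript
// Hypothetical helper: compares two strings by Unicode code point.
// Returns a negative number, zero, or a positive number, like any sort comparator.
function codePointCompare(a, b) {
  let i = 0;
  while (i < a.length && i < b.length) {
    const x = a.codePointAt(i);
    const y = b.codePointAt(i);
    if (x !== y) return x < y ? -1 : 1;
    // Equal code points occupy the same number of code units in both strings.
    i += x > 0xffff ? 2 : 1;
  }
  // One string is a prefix of the other (or they are equal).
  return a.length - b.length;
}

// Illustrative helper: compares the UTF-8 encodings byte by byte.
// Assumes well-formed input; lone surrogates are encoded as U+FFFD.
const utf8Encoder = new TextEncoder();
function utf8ByteCompare(a, b) {
  const x = utf8Encoder.encode(a);
  const y = utf8Encoder.encode(b);
  const n = Math.min(x.length, y.length);
  for (let i = 0; i < n; i++) {
    if (x[i] !== y[i]) return x[i] - y[i];
  }
  return x.length - y.length;
}

// U+1F600 is stored as the surrogate pair D83D DE00, so the default
// code unit sort places it before U+FF5E; code point order places it after.
const sample = ["\u{1F600}", "\uFF5E", "a"];
const defaultOrder = [...sample].sort();
const codePointOrder = [...sample].sort(codePointCompare);
const utf8Order = [...sample].sort(utf8ByteCompare);
```

For well-formed strings the last two orders agree, which is the property that would let a UTF-8-backed engine compare raw bytes.
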
02:14 | <bakkot> | I have never needed to sort strings by code point, no |
02:15 | <bakkot> | I don't think any major engines use internal utf8 representations but I could be mistaken |
02:16 | <bakkot> | how did you find yourself needing this? |
02:18 | <bakkot> | speaking of sorting, though, I do want to have an Array<T>.sortBy(fn) method where the function is a map from T to Comparable: string | number | bigint | Array<Comparable>, and which sorts the inputs by comparing their outputs from fn (throwing if the outputs are of unlike types, and comparing arrays lexicographically) |
02:18 | <bakkot> | and given such a thing you could do array.sortBy(s => [...s]) |
02:19 | <bakkot> | of course we are extremely unlikely to get any new array prototype methods with reasonable names, so I guess it would have to be a static Array.sortBy(arr, fn), which... ugh. but I'd still take it. |
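
(Editor's sketch, not part of the conversation.) The `sortBy` idea described above could be approximated in userland along these lines. Everything here is an assumption for illustration: the standalone function shape `sortBy(arr, fn)`, the choice to return a new array rather than sort in place, and the use of the `<` operator to compare primitive keys. Because `<` compares strings by code unit, the usage example maps each string to an array of numeric code points to get code point order.

```javascript
// Hypothetical: compares two keys of type
// Comparable = string | number | bigint | Array<Comparable>.
function compareKeys(x, y) {
  const xIsArray = Array.isArray(x);
  const yIsArray = Array.isArray(y);
  if (xIsArray !== yIsArray || (!xIsArray && typeof x !== typeof y)) {
    throw new TypeError("cannot compare keys of unlike types");
  }
  if (xIsArray) {
    // Lexicographic comparison: first differing element decides,
    // otherwise the shorter array sorts first.
    const n = Math.min(x.length, y.length);
    for (let i = 0; i < n; i++) {
      const c = compareKeys(x[i], y[i]);
      if (c !== 0) return c;
    }
    return x.length - y.length;
  }
  if (!["string", "number", "bigint"].includes(typeof x)) {
    throw new TypeError("unsupported key type: " + typeof x);
  }
  return x < y ? -1 : x > y ? 1 : 0;
}

// Hypothetical: returns a new array sorted by the keys fn produces.
// fn is called once per element.
function sortBy(arr, fn) {
  const keyed = arr.map((value) => ({ key: fn(value), value }));
  keyed.sort((p, q) => compareKeys(p.key, q.key));
  return keyed.map((p) => p.value);
}

// Usage: sort strings by code point via numeric keys.
const toCodePoints = (s) => Array.from(s, (c) => c.codePointAt(0));
const sortedByCodePoint = sortBy(["\u{1F600}", "\uFF5E", "a"], toCodePoints);
// sortedByCodePoint is ["a", "\uFF5E", "\u{1F600}"]
```
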
02:24 | <Mathieu Hofman> | We need a portable way of sorting strings for Ocapn, and settled on unicode codepoint comparison. This is basically an interop question. |
04:14 | <Aapo Alasuutari> | Side quest: Are there actually ~any engines that use UTF-8 as their string representation? Mine does, but I'm wondering if there are others and if they simply accept string methods being non-standard, or if they take measures to hide the backing representation. |
05:04 | <Mathieu Hofman> | Moddable's XS can be built to use either utf-8 or cesu-8 |
05:05 | <Mathieu Hofman> | I thought that v8 supported utf-8 strings, especially when interacting with the DOM |
05:46 | <Domenic> | DOM uses WTF-16, sometimes (but rarely) censoring lone surrogates on the boundaries |
05:49 | <Justin Ridgewell> | Moddable's XS can be built to use either utf-8 or cesu-8 |
05:50 | <Mathieu Hofman> | compactness of strings while keeping compatibility with utf-16 |
05:51 | <Mathieu Hofman> | it makes some operations a little costly however (like random access to string index) |
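
(Editor's sketch, not part of the conversation.) To illustrate why indexed access gets costly on a variable-width backing store: answering "which UTF-16 code unit is at index i" over UTF-8 bytes means walking the buffer from the start. The function below is a hypothetical illustration, not how XS or any other engine implements it, and it assumes the input is valid UTF-8.

```javascript
// Hypothetical: returns the UTF-16 code unit at `index` for a string
// stored as valid UTF-8 bytes, or NaN when out of range (like charCodeAt).
// Cost is linear in the number of bytes before the target.
function utf16CodeUnitAt(bytes, index) {
  if (index < 0) return NaN;
  let unit = 0; // UTF-16 index of the sequence that starts at byte i
  let i = 0;
  while (i < bytes.length) {
    const lead = bytes[i];
    const len = lead < 0x80 ? 1 : lead < 0xe0 ? 2 : lead < 0xf0 ? 3 : 4;
    // A 4-byte sequence is a supplementary code point: two UTF-16 units.
    const units = len === 4 ? 2 : 1;
    if (index < unit + units) {
      let cp;
      if (len === 1) {
        cp = lead;
      } else if (len === 2) {
        cp = ((lead & 0x1f) << 6) | (bytes[i + 1] & 0x3f);
      } else if (len === 3) {
        cp = ((lead & 0x0f) << 12) | ((bytes[i + 1] & 0x3f) << 6) | (bytes[i + 2] & 0x3f);
      } else {
        cp = ((lead & 0x07) << 18) | ((bytes[i + 1] & 0x3f) << 12) |
             ((bytes[i + 2] & 0x3f) << 6) | (bytes[i + 3] & 0x3f);
      }
      if (len < 4) return cp;
      // Split the supplementary code point into its surrogate pair.
      const v = cp - 0x10000;
      return index === unit ? 0xd800 + (v >> 10) : 0xdc00 + (v & 0x3ff);
    }
    unit += units;
    i += len;
  }
  return NaN;
}
```

A real engine would likely cache offsets or use other shortcuts; the sketch only shows the baseline scan that a flat UTF-16 array avoids.
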