#whatwg on 2015-08-22

00:06	<gsnedders>	nox: BTW, isn't it still the case that SSE2 is way quicker with aligned reads? Do you not want to check input is aligned first?
00:07	<nox>	gsnedders: I guess I could.
00:07	<nox>	jamesr___: Yeah, have to check.
00:07	<nox>	gsnedders: I'm not sure the simd crate handles that though.
00:08	<jamesr___>	sse2 instructions generally require 16 byte alignment or they fault
00:09	<jamesr___>	or some do, I think
00:11	<nox>	Alignment matters only for load and store.
00:12	<nox>	gsnedders: In general when handling multibyte encodings, you can't stay aligned during the whole reading anyway.
00:13	<nox>	gsnedders: Maybe through very fancy shuffling to handle continuation bytes across chunks, but I'm not sure it's worth it.
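
The alignment question above can be illustrated with a minimal sketch. It does not use SSE2 intrinsics or the simd crate; it only uses the standard library's `align_to` to split a byte slice into an unaligned head, a 16-byte-aligned body, and a tail, which is the shape an aligned-load loop would take. `Chunk16` and `is_ascii_chunked` are hypothetical names, not part of rust-encoding.

```rust
/// Hypothetical 16-byte-aligned chunk, standing in for an SSE2-sized vector.
#[repr(C, align(16))]
struct Chunk16([u8; 16]);

/// Returns true if no byte in `input` has its high bit set.
/// The aligned body is checked one 16-byte chunk at a time.
fn is_ascii_chunked(input: &[u8]) -> bool {
    // SAFETY: Chunk16 wraps a plain [u8; 16], so every bit pattern is valid.
    let (head, body, tail) = unsafe { input.align_to::<Chunk16>() };

    // Head and tail are shorter than one chunk; check them byte by byte.
    let scalar_ok = |bytes: &[u8]| bytes.iter().all(|&b| b < 0x80);

    // For each aligned chunk, OR all bytes together and test the high bit once.
    let chunk_ok = |c: &Chunk16| (c.0.iter().fold(0u8, |acc, &b| acc | b) & 0x80) == 0;

    scalar_ok(head) && body.iter().all(chunk_ok) && scalar_ok(tail)
}
```

The result is the same wherever the slice starts in memory; only the split between head, body, and tail changes. Whether this beats unaligned loads on a given CPU would need measuring.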
00:15	<Domenic>	hmm how did nobody else catch that ASCII to UTF8 is a memcpy...
00:16	<jsbell>	I was wondering about that; isn't the code actually UTF-8 to ASCII, which requires range validation?
00:17	<jsbell>	(I glanced at the code only enough to realize I didn't care that much...)
00:17	<Domenic>	yeah same...
00:17	<Domenic>	"ASCIIEncoder" implies you are right
00:34	<jamesr___>	"maybe ASCII" -> utf8 is not a memcpy, if you want to map bytes with the high bit set to an error value in some way
06:08	<annevk>	TabAtkins: same line
06:08	<annevk>	TabAtkins: also, I prefer <li><p>Text to be on one line if <li> only contains a single <p>
06:09	<annevk>	jamesr___: yeah, seems to be about checking invalid bytes
07:54	<nox>	Domenic: UTF-8 is compatible with US-ASCII.
07:54	<nox>	Domenic: Not all bytes are US-ASCII code points.
07:54	<nox>	So no, decoding ASCII into UTF-8 isn't memcpy.
08:10	<Ms2ger>	What
08:10	<Ms2ger>	If it's actually ASCII, there's no bytes with the high bit set, so it is a memcpy
08:11	<nox>	Ms2ger: In the context of rust-encoding, you don't know if input is actually in said encoding.
08:11	<nox>	Ms2ger: That's why the UTF-8 decoder isn't a noop either.
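
The disagreement above comes down to whether the input is trusted. A minimal sketch of decoding "maybe ASCII" bytes into UTF-8, assuming two hypothetical helpers (`decode_ascii_strict` and `decode_ascii_lossy`, neither taken from rust-encoding): every byte has to be inspected, so the operation is a validating copy rather than a plain memcpy.

```rust
/// Hypothetical strict decoder: returns the decoded text, or the index of the
/// first byte with the high bit set (which is not a US-ASCII code point).
fn decode_ascii_strict(input: &[u8]) -> Result<String, usize> {
    match input.iter().position(|&b| b >= 0x80) {
        Some(index) => Err(index),
        // All bytes are below 0x80, so each maps to the same Unicode scalar.
        None => Ok(input.iter().map(|&b| b as char).collect()),
    }
}

/// Hypothetical lossy decoder: maps each invalid byte to U+FFFD.
fn decode_ascii_lossy(input: &[u8]) -> String {
    input
        .iter()
        .map(|&b| if b < 0x80 { b as char } else { '\u{FFFD}' })
        .collect()
}
```

If the input is already known to be ASCII, the check is redundant and the bytes can be copied as-is, which is the case Ms2ger describes.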