01:19
<bakkot>
sideshowbarker: https://bakkot.github.io/matrix-logs/
02:25
<sideshowbarker>
sideshowbarker: https://bakkot.github.io/matrix-logs/
Beautiful
08:29
<Rob Palmer>
bakkot:
08:30
<Rob Palmer>
your logs look good - for some reason Chrome offers to translate from Albanian
09:39
<sideshowbarker>

adding a lang attribute to the html element would probably make Chrome not do that

<html lang=en>
14:08
<ryzokuken>
👋
14:08
<Jory Burson>
Hi ryzokuken !!!
14:08
<Jory Burson>
Rob Palmer Aki trying to close out that question about the video I sent y'all in email/dm
14:09
<Jory Burson>
ping me when you log on?
14:16
<yulia>
hey hey jory
14:33
<Jory Burson>
Hi yulia !!!!
15:24
<bakkot>

adding a lang attribute to the html element would probably make Chrome not do that

<html lang=en>
I'm hesitant to do this because the bot does not know what languages the channels it's scraping are in. it happens to be the case that the current set is english, but that's not a guarantee
15:26
<sideshowbarker>
I'm hesitant to do this because the bot does not know what languages the channels it's scraping are in. it happens to be the case that the current set is english, but that's not a guarantee
Ah yeah that makes sense
15:26
<ryzokuken>
I suppose Google's auto-detection would work better once there's more content...
15:30
<sideshowbarker>
yeah in general language guessers aren’t reliable when there’s only a small amount of content
15:32
<sideshowbarker>
in the W3C HTML checker I have a language-guessing library integrated in, to try to catch cases where a document has the wrong lang value
15:33
<sideshowbarker>
and the error message that guesser emits includes a link to the bug tracker, for people to report cases where it’s misidentified the language of a document
15:33
<sideshowbarker>
https://github.com/validator/validator/search?q=language+misidentified+is%3Aissue&type=issues
15:34
<sideshowbarker>
most of the reports are for documents with very little text content — things like online product catalogues
15:35
<sideshowbarker>
anyway, the reason that guesser was added to the HTML checker is that we know there are a significant number of documents with lang=en attributes that aren’t in English
15:37
<sideshowbarker>
…because of the cargo-cult copy/pasting that goes on: some developers don’t know what lang=en is, but they see it in another document, so they just copy it as-is into their document, even if it’s not in English
15:38
<ryzokuken>
perhaps also has something to do with poorly made templates and such?
15:38
<sideshowbarker>
the biggest problem that causes in practice is for non-English screen-reader users
15:38
<sideshowbarker>
perhaps also has something to do with poorly made templates and such?
yeah stuff like that too
15:38
<ryzokuken>
"my free wordpress theme comes with lang=en set"
15:38
<sideshowbarker>
bingo
15:39
<sideshowbarker>
so anyway, a document having no lang at all is better for screen-reader users than a document having the wrong lang value
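A minimal sketch of the point above, assuming a log generator that may or may not know a channel's language: emit the lang attribute only when the language is actually known, and otherwise leave it off rather than guess. The function name html_open_tag and its lang parameter are hypothetical and are not taken from the matrix-logs bot.

```python
from html import escape
from typing import Optional


def html_open_tag(lang: Optional[str] = None) -> str:
    """Build the opening <html> tag for a generated log page.

    Hypothetical helper: `lang` is a language tag such as "en",
    or None/"" when the channel's language is unknown.
    """
    if lang:
        # Language is known: declare it so browsers and screen readers can use it.
        return f'<html lang="{escape(lang, quote=True)}">'
    # Language unknown: omit the attribute instead of hard-coding a guess.
    return "<html>"


print(html_open_tag("en"))
print(html_open_tag())
```

With this shape, a channel whose language has been configured gets a lang value, and every other channel gets a bare html element.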