01:19 | <bakkot> | sideshowbarker: https://bakkot.github.io/matrix-logs/ |
08:29 | <Rob Palmer> | bakkot: |
08:30 | <Rob Palmer> | your logs look good - for some reason Chrome offers to translate from Albanian |
09:39 | <sideshowbarker> | adding a
|
14:08 | <ryzokuken> | 👋 |
14:08 | <Jory Burson> | Hi ryzokuken !!! |
14:08 | <Jory Burson> | Rob Palmer, Aki: trying to close out that question about the video I sent y'all in email/DM |
14:09 | <Jory Burson> | ping me when you log on? |
14:16 | <yulia> | hey hey jory |
14:33 | <Jory Burson> | Hi yulia !!!! |
15:24 | <bakkot> |
|
15:26 | <sideshowbarker> | I'm hesitant to do this because the bot does not know what languages the channels it's scraping are in. it happens to be the case that the current set is english, but that's not a guarantee |
15:26 | <ryzokuken> | I suppose Google's auto-detection would work better once there's more content... |
15:30 | <sideshowbarker> | yeah in general language guessers aren’t reliable when there’s only a small amount of content |
15:32 | <sideshowbarker> | in the W3C HTML checker I have a language-guessing library integrated in, to try to catch cases where a document has the wrong lang value |
15:33 | <sideshowbarker> | and the error message that guesser emits includes a link to the bug tracker, for people to report cases where it’s misidentified the language of a document |
15:33 | <sideshowbarker> | https://github.com/validator/validator/search?q=language+misidentified+is%3Aissue&type=issues |
15:34 | <sideshowbarker> | most of the reports are for documents with very little text content — things like online product catalogues |
15:35 | <sideshowbarker> | anyway, the reason that guesser was added to the HTML checker is that we know there are a significant number of documents with lang=en attributes that aren’t in English |
15:37 | <sideshowbarker> | …because of the cargo-cult copy/pasting that goes on: some developers don’t know what lang=en is, but they see it in another document, so they just copy it as-is into their document, even if it’s not in English |
15:38 | <ryzokuken> | perhaps also has something to do with poorly made templates and such? |
15:38 | <sideshowbarker> | the biggest problem that causes in practice is for non-English screen-reader users |
15:38 | <ryzokuken> | "my free wordpress theme comes with lang=en set" |
15:38 | <sideshowbarker> | bingo |
15:39 | <sideshowbarker> | so anyway, a document having no lang at all is better for screen-reader users than a document having the wrong lang value |