03:54
<MikeSmith>
annevk: hsivonen: https://github.com/validator/validator/issues/877 seems to be another issue about the parser in the HTML checker not conforming to the current spec requirements for <meta charset> parsing — I guess because it’s following the old requirements, right?
03:56
<MikeSmith>
the specific case cited in the issue is <meta http-equiv="Content-Type" content="text/html; xxxxxcharset=iso-8859-2">, which it parses as iso-8859-2 despite the junk xxxxx characters
03:56
<MikeSmith>
the other cases are:
03:56
<MikeSmith>
<meta http-equiv="Content-Type" content="text/html; charset charset=iso-8859-2">
03:56
<MikeSmith>
<meta http-equiv="Content-Type" content="text/html; charsetxxxxxcharset=iso-8859-2">
03:56
<MikeSmith>
<meta http-equiv="Content-Type" content="text/html; charsetcharset=iso-8859-2">
03:57
<MikeSmith>
... for all of which it reports a "The legacy encoding declaration did not contain charset= after the semicolon."..
03:57
<MikeSmith>
..error
03:57
<MikeSmith>
I’m not sure what the OP is saying the expected behavior should be
04:48
<annevk>
MikeSmith: I don’t really recall the details of that algorithm, but bugs seem likely
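
[Editor's note] For readers who, like annevk, do not recall the details: the sketch below is one reading of the HTML Standard's "algorithm for extracting a character encoding from a meta element", applied to the content values quoted above. It is not the HTML checker's code. The function name extract_charset_label is hypothetical, and the sketch returns the raw label instead of mapping it to an encoding as the spec's final "get an encoding" step would.

```python
import re

# ASCII whitespace as the HTML Standard defines it: TAB, LF, FF, CR, SPACE.
ASCII_WHITESPACE = "\t\n\f\r "

# ASCII case-insensitive match for "charset" (re.ASCII keeps non-ASCII
# lookalikes such as U+017F from matching "s").
CHARSET_RE = re.compile(r"charset", re.IGNORECASE | re.ASCII)


def extract_charset_label(content):
    """Return the raw charset label in a meta content value, or None.

    Hypothetical helper: a sketch of the spec algorithm as the editor
    understands it, without the final label-to-encoding lookup.
    """
    position = 0
    while True:
        # Find the next "charset" anywhere at or after position.
        match = CHARSET_RE.search(content, position)
        if match is None:
            return None
        i = match.end()
        # Skip whitespace immediately following "charset".
        while i < len(content) and content[i] in ASCII_WHITESPACE:
            i += 1
        if i >= len(content):
            return None
        if content[i] == "=":
            i += 1
            break
        # Not "=": resume the search just before this character.
        position = i

    # Skip whitespace after "=".
    while i < len(content) and content[i] in ASCII_WHITESPACE:
        i += 1
    if i >= len(content):
        return None

    if content[i] in "\"'":
        # Quoted value: needs a matching closing quote, else nothing.
        end = content.find(content[i], i + 1)
        return None if end == -1 else content[i + 1:end]

    # Unquoted value: runs up to whitespace, a semicolon, or the end.
    j = i
    while j < len(content) and content[j] not in ASCII_WHITESPACE and content[j] != ";":
        j += 1
    return content[i:j]


for value in (
    "text/html; xxxxxcharset=iso-8859-2",
    "text/html; charset charset=iso-8859-2",
    "text/html; charsetxxxxxcharset=iso-8859-2",
    "text/html; charsetcharset=iso-8859-2",
):
    print(repr(value), "->", extract_charset_label(value))
```

Under this reading, all four content values from the issue yield the label iso-8859-2, because the search for "charset" is not anchored to the semicolon and restarts after a "charset" that is not followed by "=". Whether that matches what the OP expects is a separate question that the sketch does not settle.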