14:33
<nicolo-ribaudo>
Why is the keyboard shortcut for "next" J and for "prev" K in the HTML PR preview diff? The "next" button is to the right of the "prev" button, but J is to the left of K.
14:47
<evilpie>
Why is the keyboard shortcut for "next" J and for "prev" K in the HTML PR preview diff? The "next" button is to the right of the "prev" button, but J is to the left of K.
Probably because of vi/vim
16:47
<Domenic>
annevk: think we can merge https://github.com/whatwg/html/pull/10188 ?
16:53
<annevk>
Domenic: sounds good to me.
16:53
annevk
wonders if Domenic is traveling
17:06
<krosylight>
re: sensitive=true, I think the conversation went entirely in the direction of a server-side browser AI feature, which makes sense, but the original GH issue is, I think, more focused on AI crawlers, which users have no control over. Maybe add an enumerated attribute that selectively signals those crawlers, for initial simplicity, like donotcollectby=crawler or something.
17:07
<krosylight>
Context: today's whatnot meeting on https://github.com/whatwg/html/issues/10519
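A minimal sketch of the markup shape krosylight is describing; the attribute name and value are hypothetical, taken from the message above rather than from any spec:

    <!-- Hypothetical enumerated attribute asking AI crawlers (only) not to collect this content. -->
    <section donotcollectby="crawler">
      <p>Support-ticket notes containing a customer's email address.</p>
    </section>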
17:57
<annevk>
Why would a crawler see end user sensitive data? That would be a website bug, no?
19:28
<Timo Tijhof>
Why would a crawler see end user sensitive data? That would be a website bug, no?

I suppose it's possible for cookieless authenticated URLs to be leaked or indirectly discovered by a crawler. Akin to "invite/share by link" private URLs, or internal mechanisms where emails point to APIs, static file servers, etc. that are private and where the URL itself is the secret.

However, this is an area where I'd expect a robots noindex directive to be used on each page, and/or for these pages to live in a subtree excluded wholesale by robots.txt.
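For reference, a sketch of the two mechanisms Timo mentions; the /shared/ path is a hypothetical example of such a private subtree:

    <!-- Per-page opt-out via the robots meta tag: -->
    <meta name="robots" content="noindex">

    # robots.txt, excluding the whole subtree:
    User-agent: *
    Disallow: /shared/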

19:31
<Timo Tijhof>
Why would a crawler see end user sensitive data? That would be a website bug, no?
I'd expect such a page, if it serves public info at the same URL, to have a public version without the private data, with that public version being canonical or otherwise already indexed instead. The same applies to search engines already, where one needs to be careful not to allow sneaky extraction of accidentally crawled private pages via "site:" and "inurl:" operators.