WHATWG on 2021-06-17

02:35	<guest>	Hi. I'm trying to understand the paragraph definition in here https://html.spec.whatwg.org/multipage/dom.html#paragraphs , but I can't understand the wording. English is not my mother tongue, so can anyone help me out? Much appreciated.
02:37	<guest>	"Paragraphs in flow content are defined relative to what the document looks like without the a, ins, del, and map elements complicating matters, since those elements, with their hybrid content models, can straddle paragraph boundaries, as shown in the first two examples below." Is it just these elements or any transparent element?
02:38	<guest>	"Let view be a view of the DOM that replaces all a, ins, del, and map elements in the document with their contents. Then, in view, for each run of sibling phrasing content nodes uninterrupted by other types of content, in an element that accepts content other than phrasing content as well as phrasing content, let first be the first node of the run, and let last be the last node of the run. For each such run that consists of at least one node that is neither embedded content nor inter-element whitespace, a paragraph exists in the original DOM from immediately before first to immediately after last. (Paragraphs can thus span across a, ins, del, and map elements.)" I can't really say that I understand this part very well either.
02:38	<guest>	I tried to look in older documents, but they had the same wording, probably even the same content, so no luck there either.
02:45	<GPHemsley>	oof, that is confusing
02:47	<GPHemsley>	TIL "paragraph" means more than just `p`
02:50	<GPHemsley>	looking at https://html.spec.whatwg.org/multipage/indices.html#elements-3 I would guess that the transparent elements other than those explicitly mentioned do not typically contain paragraphs
02:51	<GPHemsley>	most of the ones not mentioned were introduced in the HTML5 era, which likely meant they were defined more strictly
02:52	<GPHemsley>	I'm thinking the concept of a paragraph beyond `p` exists in order to handle legacy web
02:54	<GPHemsley>	guest: what are you trying to do?
02:55	<guest>	GPHemsley: Just reading through the documents and trying to understand things better.
02:56	<guest>	Can't really understand that part though. I've read it many times, but I just can't.
02:56	<GPHemsley>	yeah, it definitely relies heavily on nested definitions
02:56	<guest>	The two parts I mentioned are the ones that I can't really get.
02:57	<guest>	For example, what does a run of phrasing content look like exactly?
03:01	<GPHemsley>	I think in this case all uses of "run" mean "sequence"
03:03	<GPHemsley>	"run" makes it seem like an operation is occurring, but I don't think that is the case
03:05	<GPHemsley>	so, to paraphrase, it's saying a paragraph is a sequence of nodes that meet certain criteria
03:08	<guest>	"so, to paraphrase, it's saying a paragraph is a sequence of nodes that meet certain criteria" Yeah, but what criteria are they? :D
03:13	<GPHemsley>	filed https://github.com/whatwg/html/issues/6782
03:15	<guest>	Much appreciated.
03:16	<guest>	Now, I need to understand the algorithm. :D
03:16	<GPHemsley>	yeah
03:16	<GPHemsley>	let's see if we can break it down
03:17	<guest>	"Let view be a view of the DOM that replaces all a, ins, del, and map elements in the document with their contents." This part is pretty clear. We also agreed that those are the only elements that will be replaced.
03:18	<GPHemsley>	so we've established that we're currently in that view
03:18	<guest>	"Then, in view, for each run of sibling phrasing content nodes uninterrupted by other types of content, in an element that accepts content other than phrasing content as well as phrasing content," This part is somewhat long and confusing.
03:18	<GPHemsley>	which is basically "pretend these confusing things don't exist"
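
The "view" step discussed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Node`, `TRANSPARENT`, and `view_children` are hypothetical names invented here, and the toy tree is not the spec's example. Only the list of four elements comes from the quoted spec text.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical toy node; real DOM nodes carry far more state."""
    name: str                      # e.g. "a", "em", "#text"
    children: list = field(default_factory=list)


# The four elements the quoted spec text says the view replaces.
TRANSPARENT = {"a", "ins", "del", "map"}


def view_children(node):
    """Children of `node` as seen in the view: every a/ins/del/map child
    is replaced by its own (recursively flattened) contents."""
    out = []
    for child in node.children:
        if child.name in TRANSPARENT:
            out.extend(view_children(child))
        else:
            out.append(child)
    return out


section = Node("section", [
    Node("#text"),
    Node("ins", [Node("#text"), Node("a", [Node("em")])]),
    Node("p"),
])
names = [n.name for n in view_children(section)]
print(names)  # ['#text', '#text', 'em', 'p']
```

In the flattened list the `ins` and `a` wrappers are gone, so the text and `em` nodes they contained sit next to the first text node as siblings.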
03:18	<guest>	Yeah, exactly.
03:18	<GPHemsley>	so this next sentence is actually saying two things
03:19	<GPHemsley>	not only are we in that hand-wavy view, we're also "in an element that accepts content other than phrasing content as well as phrasing content"
03:19	<GPHemsley>	I think
03:19	<guest>	"phrasing content as well as phrasing content" Does this mean just flow content?
03:20	<guest>	I mean, I remember I read something like that. Gimme a moment, please.
03:20	<GPHemsley>	well the full phrase is "content other than phrasing content as well as phrasing content"
03:20	<GPHemsley>	as in, "phrasing content and non-phrasing content"
03:20	<GPHemsley>	i.e. both
03:21	<guest>	How can it be both at the same time? :D
03:21	<GPHemsley>	no, not that any given content is both
03:21	<GPHemsley>	it's that both types of content are allowed
03:22	<guest>	Ah, I see.
03:22	<GPHemsley>	oh, this is a neat little visual: https://html.spec.whatwg.org/multipage/dom.html#kinds-of-content
03:23	<guest>	LOL -- I didn't know that chart(?) was interactive.
03:24	<GPHemsley>	so, getting back to paragraph, I think it's basically trying to break up an element that can contain mixed content
03:24	<GPHemsley>	since an element that contains only phrasing content has more obvious boundaries
03:25	<GPHemsley>	I guess, namely, the boundaries of the element itself
03:25	<GPHemsley>	"for each run of sibling phrasing content nodes uninterrupted by other types of content, in an element that accepts content other than phrasing content as well as phrasing content,"
03:26	<guest>	"for each run of sibling phrasing content nodes uninterrupted by other types of content" and "in an element that accepts content other than phrasing content as well as phrasing content": are these the two sentences you were talking about?
03:26	<GPHemsley>	so we've eliminated the elements that cross node boundaries, and then we're specifically looking at elements that can contain both phrasing and non-phrasing content
03:27	<GPHemsley>	so, then, it is within that context that we are running the algorithm
03:27	<GPHemsley>	and the first thing we do is identify "each run of sibling phrasing content nodes uninterrupted by other types of content"
03:28	<guest>	Okay. So, what does that mean?
03:28	<GPHemsley>	phrasing content nodes right next to each other
03:29	<guest>	Ah.
03:29	<GPHemsley>	[ P P ] NP NP NP [ P ] NP [ P P ] NP
03:29	<GPHemsley>	where the brackets indicate a run
03:30	<guest>	"p" means "phrasing content" and "np" means "non-phrasing content" there?
03:30	<GPHemsley>	yeah
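
The bracket diagram above can be reproduced with a small Python sketch. It assumes each sibling has already been labelled "P" (phrasing) or "NP" (non-phrasing); the function name `phrasing_runs` is made up for this illustration and is not from the spec.

```python
from itertools import groupby


def phrasing_runs(kinds):
    """Group adjacent "P" siblings; any "NP" sibling ends the current run."""
    return [
        list(group)
        for is_phrasing, group in groupby(kinds, key=lambda k: k == "P")
        if is_phrasing
    ]


siblings = "P P NP NP NP P NP P P NP".split()
runs = phrasing_runs(siblings)
print(runs)  # [['P', 'P'], ['P'], ['P', 'P']]
```

The three lists correspond to the three bracketed groups in the diagram.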
03:31	<GPHemsley>	sorry these letters are so overloaded :)
03:31	<guest>	Okay, I'm getting this.
03:31	<guest>	"sorry these letters are so overloaded :)" lol
03:31	<guest>	Hmm.
03:31	<guest>	Let me think a little bit.
03:32	<GPHemsley>	"For each such run that consists of at least one node that is neither embedded content nor inter-element whitespace, a paragraph exists in the original DOM from immediately before first to immediately after last."
03:32	<guest>	`some text node <em>some em node</em>` Does this mean [ P P ]?
03:32	<GPHemsley>	that's basically just modifying where the brackets are going
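
The condition in the spec sentence quoted above (a run needs at least one node that is neither embedded content nor inter-element whitespace) can be sketched as a filter. The labels "text", "embedded", and "whitespace" and the name `forms_paragraph` are assumptions made up for this illustration; a real implementation would have to classify DOM nodes first.

```python
def forms_paragraph(run):
    """True if some node in the run is neither embedded content
    nor inter-element whitespace."""
    return any(kind not in {"embedded", "whitespace"} for kind in run)


runs = [
    ["text", "text"],
    ["whitespace"],
    ["embedded", "whitespace"],
    ["whitespace", "text"],
]
kept = [run for run in runs if forms_paragraph(run)]
print(kept)  # [['text', 'text'], ['whitespace', 'text']]
```

Runs made up only of whitespace and embedded content are dropped, so they get no brackets.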
03:33	<guest>	And does `some text node another text node` mean [ P ]?
03:33	<GPHemsley>	I think they're both [ P P ] ?
03:33	<GPHemsley>	since P = node
03:33	<guest>	Then what is [ P ] ?
03:34	<GPHemsley>	some text node
03:34	<guest>	So, they are separated by a newline character?
03:35	<GPHemsley>	well
03:35	<GPHemsley>	once you're talking about nodes, you've already abstracted up a level
03:35	<GPHemsley>	how you got there is a separate question
03:36	<guest>	I think I'm not clear on what a sequence of phrasing content looks like. Maybe the author(s) meant anything that is basically not non-phrasing content?
03:36	<guest>	This sounds stupid, let me be clearer.
03:37	<guest>	Like, `some text` is a phrasing content, and `some text <em>some other text</em>` is two phrasing contents, and `some text <em>some other text</em> some different text` is three phrasing contents?
03:37	<GPHemsley>	if you're talking about the raw HTML string, yes, I think so
03:38	<GPHemsley>	but we're talking about nodes, which means we've already passed that point
03:40	<GPHemsley>	https://dom.spec.whatwg.org/#nodes
03:40	<guest>	"run of sibling phrasing content nodes uninterrupted by other types of content" Okay, I think I understand this part.
03:43	<guest>	Sorry I'm interrupting you, but I think I'm figuring this out, but I need to quickly ask you this: is `<section>` an element that accepts content other than phrasing content as well as phrasing content?
03:43	<GPHemsley>	https://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!DOCTYPE%20html%3E%0Asome%20text%20%3Cem%3Esome%20other%20text%3C%2Fem%3E%20some%20more%20text
03:45	<GPHemsley>	I think so, yeah
03:45	<guest>	Hmm.
03:45	<GPHemsley>	per the graph, flow content is by definition phrasing and non-phrasing content
03:45	<GPHemsley>	I think
03:45	<guest>	I think the author(s) just meant that too.
03:46	<GPHemsley>	well there's some metadata content that's not flow content... that may be included too
03:46	<guest>	But why not just say flow content?
03:47	<guest>	"well there's some metadata content that's not flow content... that may be included too" AH, I see what you mean.
03:47	<GPHemsley>	maybe for those times when text just shows up randomly where it doesn't belong
03:47	<GPHemsley>	idk
03:47	<guest>	So much confusion, I'm gonna be honest with you. :D
03:48	<GPHemsley>	yeah, this is definitely more complex than the number of sentences would make it seem
03:48	<guest>	Okay, I understand everything up to that point now.
03:48	<guest>	"let first be the first node of the run, and let last be the last node of the run" Can we select those in the first example?
03:49	<GPHemsley>	[ P(first) P(last) ] NP NP NP [ P(first,last) ] NP [ P(first) P(last) ] NP
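
The first/last labelling in the diagram above can be computed with a short Python sketch. It assumes siblings are already labelled "P" or "NP" and reports positions as zero-based indices; `run_bounds` is a hypothetical helper name, and the sketch leaves out the spec's extra condition about embedded content and inter-element whitespace.

```python
def run_bounds(kinds):
    """Return (first_index, last_index) for each run of adjacent "P" siblings."""
    bounds = []
    start = None
    for i, kind in enumerate(kinds):
        if kind == "P":
            if start is None:
                start = i              # this node is "first" for a new run
        elif start is not None:
            bounds.append((start, i - 1))  # the previous node was "last"
            start = None
    if start is not None:              # a run that reaches the end of the list
        bounds.append((start, len(kinds) - 1))
    return bounds


siblings = "P P NP NP NP P NP P P NP".split()
bounds = run_bounds(siblings)
print(bounds)  # [(0, 1), (5, 5), (7, 8)]
```

The middle pair `(5, 5)` is the single-node run where first and last are the same node.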
03:49	<guest>	Is first `This is the <em>first</em> paragraph in this example.` and last `<p>This is the second.</p>`?
03:50	<guest>	Oh no, it can't be.
03:50	<guest>	first and last are `This is the <em>first</em> paragraph in this example.`?
03:51	<GPHemsley>	`first node <em>some other text</em> last node`
03:51	<GPHemsley>	`first and last node`
03:52	<guest>	I think you are looking at another example.
03:52	<GPHemsley>	oh
03:52	<guest>	I'm talking about this one: `<section> <h1>Example of paragraphs</h1> This is the <em>first</em> paragraph in this example. <p>This is the second.</p> <!-- This is not a paragraph. --> </section>`
03:52	<GPHemsley>	you meant an example in the spec
03:52	<guest>	The example in here: https://html.spec.whatwg.org/multipage/dom.html#paragraphs
03:52	<GPHemsley>	right, ok
03:52	<GPHemsley>	I was following on from our discussion examples
03:53	<guest>	AH, I see.
03:53	<guest>	So, I am thinking the variables first and last both are `This is the <em>first</em> paragraph in this example.`
03:54	<GPHemsley>	H [ P(first) P P(last) ] [ P ]
03:54	<GPHemsley>	actually
03:54	<guest>	Ah.
03:55	<GPHemsley>	https://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!DOCTYPE%20html%3E%0A%3Csection%3E%0A%20%20%3Ch1%3EExample%20of%20paragraphs%3C%2Fh1%3E%0A%20%20This%20is%20the%20%3Cem%3Efirst%3C%2Fem%3E%20paragraph%20in%20this%20example.%0A%20%20%3Cp%3EThis%20is%20the%20second.%3C%2Fp%3E%0A%20%20%3C!--%20This%20is%20not%20a%20paragraph.%20--%3E%0A%3C%2Fsection%3E
03:56	<GPHemsley>	P H [ P(first) P P(last) ] [ P(first,last) ] P C P
03:56	<guest>	"a paragraph exists in the original DOM from immediately before first to immediately after last" What about that?
03:57	<guest>	[ P P P ] is the "run" we are talking about. The first P is first and the last P is last, right?
03:58	<GPHemsley>	yeah
03:58	<GPHemsley>	so the square brackets are demonstrating the paragraph boundaries
03:58	<guest>	Oh, I understand.
04:00	<GPHemsley>	it's unclear to me when the comment gets ignored
04:01	<guest>	Doesn't the comment get ignored everywhere?
04:01	<guest>	I was doing that the whole time.
04:01	<GPHemsley>	I think it gets its own node, but there doesn't seem to be a content model associated with it
04:02	<GPHemsley>	it's confusing because inter-element whitespace is called out specifically
04:02	<guest>	What you think is clear.
04:02	<GPHemsley>	despite being grouped in with comments here: "Inter-element whitespace, comment nodes, and processing instruction nodes must be ignored when establishing whether an element's contents match the element's content model or not, and must be ignored when following algorithms that define document and element semantics."
04:03	<guest>	I should really re-read that whole page when I'm not sleepy. :D
04:03	<GPHemsley>	same
04:03	<GPHemsley>	but in any case, I don't think it affects understanding
04:04	<GPHemsley>	the bottom line is that it gets ignored
04:05	<guest>	Everything is much clearer now -- thanks a lot, GPHemsley!
04:05	<GPHemsley>	happy to help (and learn myself) :)
04:05	<guest>	haha
04:09	<GPHemsley>	an easter egg in the examples: https://www.youtube.com/watch?v=LYds5xY4INU
04:09	<guest>	Yeah, I know that. :D
04:10	<guest>	Cats are everywhere. :P
04:11	<GPHemsley>	it also goes to show how old this section is :)
04:11	<GPHemsley>	that video is from 2008
04:13	<guest>	I didn't know how to read and write at that time. :D
04:14	<GPHemsley>	well then
04:14	<GPHemsley>	welcome to the web :)
04:14	<guest>	hehe
04:14	<GPHemsley>	I guess I'll wander back to the old folks home now
04:15	GPHemsley	gazes wistfully into the middle distance
04:21	<guest>	lol
07:30	<Ms2ger>	"[ P P ] NP NP NP [ P ] NP [ P P ] NP" Is this the P=NP problem I heard so much about?
18:11	<annevk>	Jake Archibald: just came across: https://twitter.com/jaffathecake/status/1405437361643790337
18:11	<annevk>	Jake Archibald: seems worth filing some spec issues on? Or is it not acceptable to remove observers if the node is removed from the document for some reason?
18:19	<Jake Archibald>	annevk it's a GC issue, and I don't think we put that in specs right? Observers can't be removed if the element can be reconnected, so it's down to the engine being smart enough to know that that can't happen
19:46	<wanderview>	can the spec say that the observer should hold its reference weakly to the elements, though?
19:47	<wanderview>	(also, I am not prepared for Jake to change his avatar after so long...)
19:47	<foolip>	Normally specs are written assuming infinite memory and no observable differences from that. Garbage collection is an optimization in this view of things.
19:48	<foolip>	That doesn’t really hold any longer with WeakRef and such, but that was the party line for the longest time.
19:50	<wanderview>	since this kind of thing changes how developers write code it seems like a good thing to spec
19:51	<foolip>	Could tests be written for it?
19:51	<jgraham>	Presumably if we had a way to trigger gc from tests ;)
19:51	<foolip>	Hmm, I wonder if anyone’s working on that?
19:51	<cketti>	how do you test that nothing holds a reference to something you're not allowed to hold a reference to?
19:52	<wanderview>	test that the WeakRef is empty
19:52	<wanderview>	this can definitely be tested in browser-specific tests with tools to force GC... would be nice to be able to force GC in WPT
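
The idea of "test that the WeakRef is empty" after forcing GC can be illustrated with a Python analogy. This is not the browser API: it uses Python's `weakref` and `gc` modules in place of JavaScript's `WeakRef` and a test-only GC hook, and `Element` and `Observer` are hypothetical stand-ins invented for the sketch.

```python
import gc
import weakref


class Element:
    """Hypothetical stand-in for a DOM element."""


class Observer:
    """Hypothetical observer that tracks targets without keeping them alive."""

    def __init__(self):
        self._targets = weakref.WeakSet()  # weak references only

    def observe(self, target):
        self._targets.add(target)

    def target_count(self):
        return len(self._targets)


observer = Observer()
element = Element()
observer.observe(element)
ref = weakref.ref(element)

del element    # drop the only strong reference
gc.collect()   # ask for a collection, as a test harness would

print(ref() is None)            # True in CPython
print(observer.target_count())  # 0 in CPython
```

If `Observer` stored its targets in an ordinary `set`, the element would stay alive and `ref()` would still return it, which is the kind of leak such a test is meant to catch.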
19:53	<foolip>	https://jgraham.github.io/browser-test/#the-testutils-object is coming
19:54	<foolip>	With the caveat that it can’t be 100% guaranteed that GC will happen, but in practice I assume it will be used as if it were guaranteed
19:54	<jgraham>	Yeah, just need to get that implemented in browsers and we can share this kind of test. I think internal test APIs mostly already have it so it should hopefully just be wiring up existing functionality to the right IDL interface and ensuring it's only exposed behind a flag/pref/etc.
19:55	<Andreu Botella (he/they)>	"can the spec say that the observer should hold its reference weakly to the elements, though?" Here's the DOM spec specifying that for mutation observers: https://dom.spec.whatwg.org/#garbage-collection
19:57	<wanderview>	and hopefully the gc() method means the same thing... I seem to recall some gc() calls that were only incremental and you had to call it multiple times to get a "full" GC
19:58	<foolip>	This one will be async since a sync gc() couldn’t work reliably with Oilpan
19:59	<foolip>	So a single call should do as much garbage collection as can be done.
20:24	<wanderview>	is there a way to get matrix to show "unread message" dot in the tab favicon? I don't want to enable full up push notifications for matrix, but would like some indication there is something to read in the tab
20:27	<Domenic>	If you use Firefox it will light up blue