01:46 | <bakkot> | has there ever been discussion of making XPathResult iterable? |
01:46 | <bakkot> | every time I write while (next = result.iterateNext()) I feel slightly unclean |
01:55 | <sideshowbarker> | bakkot: I think we don’t even have proper spec for XPathResult |
01:59 | <bakkot> | oof |
04:23 | <sujaldev> | what is a foreign element in the html tree construction stage? |
05:40 | <sideshowbarker> | sujaldev: an element in the SVG namespace or an element in the MathML namespace |
06:42 | <sujaldev> | ok but how do you know, where did you find this? |
06:43 | <sideshowbarker> | It's in the spec, in the parsing algorithm |
06:44 | <sujaldev> | oki Thank you! |
06:49 | <sideshowbarker> | You want to look at the call sites that call the parts that have to do with handling "foreign element", and look at call parameters, or see what the conditions are that lead to those calls getting made |
06:50 | <sideshowbarker> | I'd look it up but I'm on my phone and also a bit preoccupied |
06:51 | <sujaldev> | yeah no problem, I was getting an idea from the name that any non html tag would be considered foreign just wanted to make sure. |
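The definition given above (an element in the SVG namespace or in the MathML namespace) can be sketched as a namespace check. This is only an illustration: the `Element` class and the `is_foreign` helper are hypothetical names, not part of any spec or library. The three namespace URIs are the standard ones.

```python
from dataclasses import dataclass

HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"


@dataclass
class Element:
    """Hypothetical minimal element: a local name plus a namespace URI."""
    local_name: str
    namespace: str = HTML_NS


def is_foreign(element: Element) -> bool:
    # Foreign means SVG or MathML namespace; an unknown tag name in the
    # HTML namespace is still an HTML element, not a foreign one.
    return element.namespace in (SVG_NS, MATHML_NS)
```

Under this check, a made-up tag such as `Element("blink2")` is not foreign, which is the distinction from "any non-HTML tag".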
06:51 | <sujaldev> | by the way, if I don't want to parse foreign elements |
06:52 | <sujaldev> | are there any parts in the specs that become irrelevant for me? |
06:53 | <sujaldev> | I just want to implement a simple html parser, no handling for wrongly nested content, no namespaces, just valid html. |
06:58 | <sideshowbarker> | I don't think anybody would recommend you do that |
06:59 | <freddy> | you can implement an html parser, but it will be hard to implement a simple html parser :| |
06:59 | <sideshowbarker> | You would end up with something that doesn't parse HTML the same way that browsers do |
06:59 | <sideshowbarker> | And that would make the parser not very useful for anything in practice |
07:00 | <freddy> | Admittedly, I don't know your use case, but there are security risks. |
07:01 | <sujaldev> | my use case being a school project, I thought building it would be fun (it is fun, just not when under pressure), easy and quick. Now I have missed several deadlines. |
07:01 | <sujaldev> | > And that would make the parser not very useful for anything in practice |
07:01 | <sujaldev> | the no support for misnested tags? |
07:02 | <sujaldev> | or the no support for namespaces? |
07:02 | <sideshowbarker> | But if you really wanted to do that I guess you'd make certain steps a no-op rather than doing what the spec says |
07:03 | <freddy> | For a school project, I'd re-use an existing parser. There are many useful parsers out there. When on a web page, just use the JS DOMParser API. When in Python use html5lib, Rust has html5ever |
07:03 | <freddy> | there's also a java parser |
07:03 | <sideshowbarker> | Or else make them fatal parsing failures |
07:04 | <sujaldev> | > But if you really wanted to do that I guess you'd make certain steps a no-op rather than doing what the spec says |
07:04 | <sujaldev> | I just raise a NotImplementedError wherever I feel the feature isn't quite necessary |
07:04 | <sideshowbarker> | You can decide on your own what you want to no-op or fail on |
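The two options mentioned here (make an unsupported step a no-op, or make it a fatal parsing failure) could be kept behind one switch so the choice is made per feature. This is a rough sketch under assumptions: `FeaturePolicy`, `UnsupportedFeature`, and the feature labels are hypothetical names chosen for illustration, not anything defined by the HTML spec.

```python
class UnsupportedFeature(NotImplementedError):
    """Raised when the input needs a part of the spec this parser skips."""


class FeaturePolicy:
    """Decides, per feature label, whether hitting it is fatal or a no-op."""

    def __init__(self, fatal):
        self.fatal = set(fatal)   # labels that abort parsing
        self.skipped = []         # labels that were silently ignored

    def unsupported(self, feature: str) -> None:
        if feature in self.fatal:
            raise UnsupportedFeature(feature)
        # No-op path: remember what was skipped so it can be reported later.
        self.skipped.append(feature)


# Hypothetical usage: fail on foreign content, ignore misnesting recovery.
policy = FeaturePolicy(fatal={"foreign-content"})
policy.unsupported("misnested-formatting")
```

Recording the skipped labels makes it easier to see afterwards how far the result may differ from what a browser would build.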
07:05 | <sujaldev> | > For a school project, I'd re-use an existing parser. There are many useful parsers out there. When on a web page, just use the JS DOMParser API. When in Python use html5lib, Rust has html5ever |
07:06 | <sujaldev> | one Thing i am really confused about is |
07:06 | <sujaldev> | what's next step after creating the dom |
07:06 | <sujaldev> | and the cssom |
07:06 | <sujaldev> | I know the render tree is to be created |
07:06 | <sujaldev> | but where do I find the specs? |
07:08 | <freddy> | https://html.spec.whatwg.org/multipage/parsing.html |
07:08 | <sujaldev> | yeah this page I know about, but where are the steps in this to combine the cssom tree and the dom tree? |
07:09 | <freddy> | maybe this non-normative reference is helpful? https://developer.mozilla.org/en-US/docs/Web/Performance/How_browsers_work#building_the_dom_tree |
07:10 | <freddy> | at this point, you'll end up in all sorts of other specs, including the DOM and the CSS specs |
07:12 | <sujaldev> | articles like these are helpful but do not provide the standard way to do it, like parsing is defined |
07:14 | <freddy> | I don't want to stop you from learning. Trying hard stuff yourself is a great way to learn, but it seems to me that you're doing something that I wouldn't even take on myself :-) just so you don't dig yourself a hole too deep to get out of. |
07:16 | <sujaldev> | at this point, I can't really switch projects. Looks like I am already stuck in the deep hole... |
07:17 | <sujaldev> | Maybe I'll just use something like pyqt's web view as a last resort. |
07:20 | <sujaldev> | Thank you all for your help! |
07:20 | <freddy> | talk to your advisor when things get messy. At best, they will appreciate the attempt. Going back to your CSS question, I honestly don't know. parsing seems to be here https://www.w3.org/TR/css-syntax-3/ but there are various other specs in https://www.w3.org/TR/css-syntax-3/ |
07:22 | <sujaldev> | Yes I implemented the css tokenizer and parser from this specification but after that I don't know which specification to implement next. |
07:23 | <freddy> | wow. |
07:24 | <sujaldev> | why? |
07:24 | <freddy> | at this point, I'd make a plan with a list of things to do, finding out what's optional and what's required and maybe text rendering / layout will be tricky enough without css? :-) |
07:25 | <sujaldev> | exactly but in the html specification many of the element's style is defined as css, that's why I think css might be required. |
07:31 | <sujaldev> | Here's what I think, but I could be very very wrong: (1) parse html; (2) parse css; (3) then follow the css selector specification to parse css selectors; (4) once we can parse css selectors we can find elements in the dom?; (5) now that we can do that, we have to implement the specificity specification in the css; (6) once we know which styles are more specific to one element and also we can parse css selectors, now we can locate elements in the dom based on css; (7) and now we traverse through each element applying styles, eliminating elements that have display: none; (8) we have the render tree now?; (9) now that we have the render tree, we convert it into a layout tree using the box model specification?; (10) now that we have the layout, we can just use your ui library to create elements for each object in the layout tree |
07:31 | <sujaldev> | (my browser doesn't support scripting) |
07:33 | <sujaldev> | again I could be entirely wrong here... |
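The middle of the plan above (match selectors, order by specificity, apply styles, drop `display: none`) might look roughly like the following toy sketch. It makes strong assumptions: only single `tag`, `.class`, and `#id` selectors, no inheritance, no default or user-agent styles, and no layout. `Node`, `Rule`, and every function name are hypothetical, and "render tree" is the informal term from the conversation, not something a spec defines.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


@dataclass
class Rule:
    selector: str        # only "tag", ".class" or "#id" in this sketch
    declarations: dict


def matches(node, selector):
    if selector.startswith("#"):
        return node.attrs.get("id") == selector[1:]
    if selector.startswith("."):
        return selector[1:] in node.attrs.get("class", "").split()
    return node.tag == selector


def specificity(selector):
    # (id count, class count, type count), compared as a tuple.
    if selector.startswith("#"):
        return (1, 0, 0)
    if selector.startswith("."):
        return (0, 1, 0)
    return (0, 0, 1)


def computed_style(node, rules):
    matching = [(specificity(r.selector), i, r)
                for i, r in enumerate(rules) if matches(node, r.selector)]
    style = {}
    # Apply low specificity first; ties are broken by source order,
    # so later and more specific declarations overwrite earlier ones.
    for _, _, rule in sorted(matching, key=lambda m: m[:2]):
        style.update(rule.declarations)
    return style


def build_render_tree(node, rules):
    style = computed_style(node, rules)
    if style.get("display") == "none":
        return None  # the element and its whole subtree are left out
    kids = (build_render_tree(child, rules) for child in node.children)
    return {"tag": node.tag, "style": style,
            "children": [k for k in kids if k is not None]}
```

A real engine would handle compound selectors, combinators, inheritance, and much more; the sketch is only meant for scoping the steps.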
07:38 | <freddy> | I think you could render things without CSS. but I have honestly no idea |
07:39 | <freddy> | maybe https://limpet.net/mbrubeck/2014/08/08/toy-layout-engine-1.html is interesting. not as a reference for how things ought to be done but for planning & scoping |
07:39 | <freddy> | he's also doing CSS first. maybe I'm wrong |
07:39 | <sujaldev> | I read his article, even emailed him similar questions (no reply though) |
07:39 | <freddy> | ah :) |
07:40 | <sujaldev> | Thanks for your help anyways! |
08:23 | <zcorpan> | sujaldev: you might be interested in these in-progress books: https://browser.engineering/ and https://htmlparser.info/ |
13:42 | <sujaldev> | woah! this is super helpful! how did you find this? |
13:48 | <Ms2ger> | Well, he wrote one of them :) |
13:48 | <Ms2ger> | //writes |
13:51 | <sujaldev> | 🤯🤯🤯🤯 |
13:51 | <Ms2ger> | Looks like Matt Brubeck's no longer working on the web, sadly |
13:53 | <sujaldev> | Yeah I saw his resume, he's working in a company called FullStory |
13:55 | <sujaldev> | Btw, is getting into mozilla just as competitive as getting into companies like google or microsoft? |
13:56 | <sujaldev> | i assume many of the people in this group work at mozilla? |
15:29 | <miketayl_r> | hiring is complicated, and very different at each of these companies |
15:29 | <miketayl_r> | so not sure there's a useful answer for you sujaldev |
15:29 | <miketayl_r> | ^__^ |
15:37 | <miketayl_r> | i was just reminded of when i interviewed at opera, i was living in NYC at the time and was just hit riding my bike under the BQE. but they bolted my arm together with some stainless steel so i was in the clear to fly to Oslo. at some point after interview 2 i was feeling some pain so i took some pain meds (hydrocodone?). later on andreas bovens told me that HR commented that my answers were very reserved and deliberate |
15:37 | <miketayl_r> | but i think i was just stoned? |
15:40 | <Ms2ger> | Sure they didn't say "very reserved and deliberate... for a stoner"? :) |
15:40 | <miketayl_r> | lol |