00:00
<jwalden>
http://mxr.mozilla.org/mozilla/source/dom/tests/mochitest/dom-level0/idn_child.html?force=1
00:00
<jwalden>
we allow IDN
00:00
<Hixie>
ok will change that too
00:01
<jwalden>
cool
00:01
<Hixie>
it just returns punycode though
00:01
<Hixie>
which was really my question
00:01
<jwalden>
I think I'd prefer returning Unicode, but I don't know what we do now
00:02
<jwalden>
I think what we return depends on whether the TLD in question is whitelisted :-(
00:02
<Hixie>
well i'll let y'all work it out while i actually get my ass to work
00:02
<Hixie>
since i just noticed it's past 4pm
00:02
<Philip`>
Hixie: Why is it not currently working?
00:02
<jwalden>
and I very much prefer Unicode to an unpredictable value
00:03
<jwalden>
and Unicode seems the most consistent choice here
00:03
<annevk>
just need a tool so that authors can convert between the two easily
00:04
<jwalden>
and at runtime as well, then; requiring authors in the long term to have to think about punycode doesn't seem very nice
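A minimal sketch of the kind of runtime conversion being asked for here. It assumes a Node.js runtime and its built-in `url` module, neither of which comes from the conversation; the wrapper names `toPunycode` and `toUnicode` are made up for illustration.

```javascript
// Assumption: running under Node.js, whose "url" module exposes
// domainToASCII() and domainToUnicode() for whole hostnames.
const { domainToASCII, domainToUnicode } = require("url");

// Hypothetical helper: Unicode hostname -> Punycode (ASCII) form.
function toPunycode(hostname) {
  return domainToASCII(hostname);
}

// Hypothetical helper: Punycode hostname -> Unicode form.
function toUnicode(hostname) {
  return domainToUnicode(hostname);
}

const ascii = toPunycode("español.com");
const unicode = toUnicode(ascii);

console.log(ascii);   // xn--espaol-zwa.com
console.log(unicode); // español.com
```

With helpers like these an author can work in Unicode and only drop to Punycode where an API insists on it.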
00:05
<annevk>
right
00:33
<Hixie>
unicode it shall be
08:32
<zokka>
hello
08:34
<Hixie>
is either of adam barth or collin jackson here?
08:35
<Hixie>
i have some thoughts on the frame navigation policy
08:35
<Hixie>
in particular, i just noticed that while window.open() in html5 currently implements the child-restriction policy, the window.frames[] API is completely origin-blocked, so you can't navigate frames anyway
08:36
<zokka>
hello hixie i am new
08:36
<Hixie>
hello
08:36
<Hixie>
welcome :-)
08:36
<zokka>
Don't fully understand wot you are saying though
08:37
<zokka>
came here to make some suggestions though doesnt seem to be anyone here
08:37
<zokka>
apart from you that is :D
08:38
<hsivonen>
zokka: if you want to suggest something, it's good that Hixie is here
08:38
<zokka>
does anyone bother to use <ins> and <del>
08:38
<Hixie>
a few
08:38
<Hixie>
not many
08:38
<Hixie>
<ins> slightly more than <del> iirc
08:38
<zokka>
<del> is nasty
08:38
<hsivonen>
and even then mostly inline
08:38
<zokka>
i was thinking
08:39
<zokka>
why not have a new attribute for things like <p>
08:39
<zokka>
called datetime
08:39
<Hixie>
yeah, we considered that
08:39
<Hixie>
but since people use <ins> and <del> so rarely it seemed pointless to work on making them mildly better :-)
08:40
<zokka>
i thought they may use it more if it was on <p> most ppl use that and you can use browser features to highlight changes
08:40
<zokka>
it seems nasty to add changes using <ins>
08:40
<zokka>
maybe for a few words
08:41
<zokka>
well i am just an amateur :D
08:41
<Hixie>
i think if people wanted the feature, they'd use <ins> and complain
08:41
<Hixie>
instead of just not using <ins>
08:41
<Hixie>
that seems to be the way web authors work :-)
08:42
<zokka>
yep
08:42
<zokka>
some ppl just use <div> most the time :D
08:43
<zokka>
i hate the <object> tag i hear they are replacing it with <video>, <audio> etc
08:43
<othermaciej>
Hixie: they are more likely to be around at more normal-person times in PST
08:44
<Hixie>
i'll just bring it up when i have lunch with collin on wednesday i guess
08:44
<Hixie>
zokka: they is us :-)
08:45
<zokka>
like i say i am an amateur but i've had real trouble using object to insert video it's just so complicated plus firefox crashes a lot
08:46
<zokka>
for instance there are like 2-3 ways to link to the same video aren't there
08:48
<zokka>
stop me if i am wrong :D
08:50
<othermaciej>
Hixie: can't window.open() bypass the origin-blockedness of window.frames[] by targeting a deep subframe by name?
08:50
<othermaciej>
or is that just an issue for a generic frame name lookup algorithm which avoids this?
08:50
<othermaciej>
(not up to speed on relevant parts of the spec)
08:51
<othermaciej>
Hixie: btw I'm not sure it's good for document.domain changes not to propagate to about:blank children (I am guessing that is the only upshot of copying instead of referencing the parent/opener's origin)
08:52
<othermaciej>
or at least, I don't think propagating it is a security risk, and it avoids the chance that you will accidentally make your own about:blank subframes inaccessible
08:52
<othermaciej>
(I guess technically you can change their document.domain first, then your own)
08:52
<hsivonen>
zokka: marking up the kind of edits that ins/del don't cover gets complex fast for little gain
08:52
<Hixie>
othermaciej: yes, window.open() right now is subject to the child-navigation policy (i'm changing it to ancestor as we speak)
08:52
<Hixie>
othermaciej: but .frames[] right now (which is what they use in all their examples) is completely blocked in the spec (which is an oversight(
08:53
<Hixie>
))
08:53
<zokka>
yes hsivonen i agree with that i was wondering why do you use <object data="video.avi"> instead of src?
08:53
<Hixie>
i guess .frames[]-type access needs to be limited to the descendant-navigation policy too
08:53
<Hixie>
which is subtly different from same-origin
08:53
<Hixie>
and will be quite exciting
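A rough sketch of why a descendant-style navigation check is "subtly different from same-origin". The frame model and the three predicates are simplified readings of the policy names used in the conversation, not the spec's algorithm; `makeFrame`, `childPolicy` and `descendantPolicy` are hypothetical names, and the origins are placeholders.

```javascript
// Hypothetical frame model: an origin string, a parent link, and children.
function makeFrame(origin, parent = null) {
  const frame = { origin, parent, children: [] };
  if (parent) parent.children.push(frame);
  return frame;
}

// Plain same-origin comparison.
function sameOrigin(source, target) {
  return source.origin === target.origin;
}

// Simplified "child" policy: only direct children may be navigated.
function childPolicy(source, target) {
  return target.parent === source;
}

// Simplified "descendant" policy: source must be an ancestor of target.
function descendantPolicy(source, target) {
  for (let f = target.parent; f !== null; f = f.parent) {
    if (f === source) return true;
  }
  return false;
}

const top = makeFrame("https://a.example");
const ad = makeFrame("https://ads.example", top);   // cross-origin child
const nested = makeFrame("https://a.example", ad);  // same-origin grandchild
const sibling = makeFrame("https://b.example", top);

// [sameOrigin, childPolicy, descendantPolicy] for a few pairs.
const results = {
  topToAd: [sameOrigin(top, ad), childPolicy(top, ad), descendantPolicy(top, ad)],
  topToNested: [sameOrigin(top, nested), childPolicy(top, nested), descendantPolicy(top, nested)],
  adToSibling: [sameOrigin(ad, sibling), childPolicy(ad, sibling), descendantPolicy(ad, sibling)],
};
console.log(results);
```

In this model `top` may navigate the cross-origin `ad` frame under both frame policies even though a same-origin check would refuse, and it may navigate `nested` only under the descendant policy.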
08:54
<othermaciej>
I hate subtle security policies
08:55
<Hixie>
yeah
08:55
<Hixie>
i guess the other option is to make .frames[] always accessible and limit .location
08:55
<zokka>
em could you guys link to wot you are saying so i can go over it at a later date please?
08:56
<Hixie>
http://www.whatwg.org/specs/web-apps/current-work/
08:57
<zokka>
kk thxs:|
08:59
<hsivonen>
zokka: data on object is a weird thing from a decade ago. my guess is that whoever wrote the proposal for <object> back then wasn't too concerned about consistency with the rest of HTML
09:02
<othermaciej>
Hixie: why can't frames just be subject to the same-origin check?
09:04
<Hixie>
i thought that broke the web
09:04
<Hixie>
am i wrong?
09:05
<Hixie>
i'd love to leave it if that's an option
09:07
<othermaciej>
oh, I don't actually know, I thought you knew a specific reason it broke the web
09:07
<othermaciej>
lemme see what we do for it
09:08
<othermaciej>
we just make frames a self-reference, not checked for security
09:08
<othermaciej>
not sure if we check access to the contents for same-origin
09:08
<Hixie>
yeah when i say frames[] i mean the window [[Get]]-by-index behaviour
09:11
<othermaciej>
yeah index access is done without security check, not sure why, but I assume there is a reason
09:11
<othermaciej>
(if only copying other browsers)
09:11
<othermaciej>
that sucks
09:12
<Hixie>
no check at all?
09:12
<Hixie>
how about .location.href ?
09:13
<othermaciej>
read access to that is checked (probably not write access since you can assign to location without same-origin privileges)
09:13
<Hixie>
sounds like you're vulnerable to the 1999 Georgi Guninski citibank attack then
09:14
<othermaciej>
well I am just reading the code here, not testing
09:14
<Hixie>
whereby a site can just walk the frame hierarchy and set a subframe to point to another site and spoof a password form
09:14
<othermaciej>
but I will check that out
09:14
<othermaciej>
Hixie: ah, I am wrong
09:15
<othermaciej>
we do our usual cross-frame navigation check (different from same-origin check) on assigning location.href
09:16
<Hixie>
ah interesting
09:16
<Hixie>
so you can walk the hierarchy
09:16
<Hixie>
but not assign
09:16
<Hixie>
so you could postMessage() to any frame
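The behaviour described here (walking the hierarchy is unchecked, assigning a location is checked, messaging is open) can be mocked without a browser. This is a toy model only: `MockWindow`, `walk` and the ancestor-based navigation check are stand-ins invented for illustration, and the real WebKit check was described above as untested.

```javascript
// Toy window: indexed child access is unchecked, navigation is checked.
class MockWindow {
  constructor(origin, href, parent = null) {
    this.origin = origin;
    this.href = href;
    this.parent = parent;
    this.frames = [];   // unchecked, like window[i]
    this.inbox = [];    // messages received via postMessage
    if (parent) parent.frames.push(this);
  }

  // Stand-in navigation check: the caller must be an ancestor of this window.
  navigate(caller, url) {
    for (let w = this.parent; w !== null; w = w.parent) {
      if (w === caller) {
        this.href = url;
        return;
      }
    }
    throw new Error("navigation blocked");
  }

  // Messaging is allowed from anyone; the receiver sees the sender's origin.
  postMessage(message, caller) {
    this.inbox.push({ from: caller.origin, message });
  }
}

// Collect a window and all of its descendants.
function walk(root) {
  const found = [];
  const stack = [root];
  while (stack.length > 0) {
    const w = stack.pop();
    found.push(w);
    stack.push(...w.frames);
  }
  return found;
}

const bank = new MockWindow("https://bank.example", "https://bank.example/");
const login = new MockWindow("https://bank.example", "https://bank.example/login", bank);
const attacker = new MockWindow("https://evil.example", "https://evil.example/");

let blocked = false;
try {
  // The attacker can reach the subframe but cannot point it elsewhere.
  walk(bank)
    .filter((w) => w !== bank)
    .forEach((w) => w.navigate(attacker, "https://evil.example/fake-login"));
} catch (e) {
  blocked = true;
}
login.postMessage("hello", attacker);

console.log(blocked, login.href, login.inbox.length);
```

The spoofing step of the attack fails at `navigate`, while the message still arrives with the sender's origin attached.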
09:20
<othermaciej>
that appears to be the case (again not tested)
09:21
<Hixie>
interesting
09:22
<Hixie>
that's easier to specify i think
09:22
<Hixie>
so good
09:22
<zokka>
i wont bother you guys too much but how come on http://www.w3schools.com/tags/html5_embed.asp it says the embed tag is new?
09:22
<zokka>
i thought it was an old netscape tag
09:23
<Hixie>
they probably mean new to the standards
09:23
<Hixie>
html4 didn't have it
09:23
<Hixie>
even though browsers supported it
09:23
<zokka>
i thought they were getting rid of it
09:24
<othermaciej>
browsers getting rid of it? no
09:25
<zokka>
just that there are all these ppl trying to use complicated workarounds for <object>?
09:25
<zokka>
so they didn't have to use <embed> because it wasn't standards compliant
09:27
<Hixie>
zokka: yeah
09:27
<Hixie>
zokka: we figured we should just make it standards compliant :-)
09:27
<zokka>
cool :D
09:27
<zokka>
because i hate it when the workarounds are so complicated or use a load of javascript
09:28
<zokka>
anyway the <video> tag should be nice i reckon this time around the browser makers will get it right
09:29
<zokka>
i reckon ie will have a perfect implementation next time round :D
09:44
<Hixie>
othermaciej: what else do you allow cross-origin access to? .top? .parent?
09:45
<othermaciej>
Hixie: this gives a first-order approximation: http://trac.webkit.org/browser/trunk/WebCore/page/DOMWindow.idl?format=txt
09:45
<Hixie>
thx
09:45
<othermaciej>
(methods and attributes marked DoNotCheckDomainSecurity)
09:46
<othermaciej>
but magical lookup rules (indexed or named lookup) are handled in custom code
09:46
<othermaciej>
and some objects that are accessible restrict all of their own methods and properties
09:46
<othermaciej>
and some things have custom implementations that do their own security checks
09:46
<Hixie>
what's [CheckNodeSecurity]?
09:46
<Hixie>
wow, you allow a lot more than html5 does
09:47
<othermaciej>
I am not sure those are all essential to allow
09:47
<othermaciej>
CheckNodeSecurity does a security check based on the containing frame element's window's origin instead of this window's origin
09:47
<Hixie>
lordy
09:48
<othermaciej>
(I think there might be other similar uses, like contentDocument on frames)
10:18
<hsivonen>
annevk: I'm going to make an XTech slide about Validator.nu server side being ready for AC
10:18
<hsivonen>
annevk: are there any experimental browser builds I should mention?
11:54
<Hixie>
i'm having issues with the spec gen script not handling the size of the spec
11:54
<Hixie>
sigh
11:57
<zcorpan_>
Hixie: drop the mathml entities :P
11:57
<Hixie>
heh
11:57
<Hixie>
i'm not sure they're in the spec as the spec gen sees it
11:57
<Lachy>
Hixie, are the problems just performance issues, or is it breaking in other ways?
11:58
<Hixie>
it's sometimes dying half way
11:59
<Philip`>
Sounds like a compelling reason to split HTML5 into multiple specifications
12:00
<zcorpan_>
sounds like a compelling reason to roll your own spec gen
12:00
<Lachy>
Hixie, just pull out the sections you're planning to drop, like data templates
12:01
<Philip`>
Lachy: They're not in the HTML5 spec
12:01
<Lachy>
were they already removed?
12:01
<Philip`>
Oh wait
12:01
<Philip`>
I'm confused and wrong
12:02
<Philip`>
and thinking of repetition templates instead
12:02
<Philip`>
so please choose to ignore me
12:02
<Lachy>
I already ignore you :-P
12:02
<Hixie>
yeah i should drop the data templates stuff
12:02
<Hixie>
was gonna wait until i dropped repetition, though
12:04
<Philip`>
Hixie: "The owner is URI that redirected to the javascript: URI." - s//the /
12:05
<Hixie>
fixed
12:06
<Hixie>
woot
12:06
<Hixie>
i have dealt with all the feedback i know of on origin
12:07
<Lachy>
awesome. what's next?
12:08
<Hixie>
dunno. any requests?
12:08
<Hixie>
i was thinking of adding text to canvas
12:08
<Philip`>
Fix <font> and style? :-)
12:09
<Hixie>
Philip`: how good are canvas implementations at the moment? good enough to handle more new features, or are they still quite buggy?
12:09
<Lachy>
oh yeah, drop the font element already
12:09
<Hixie>
yeah i could do font and style=""
12:09
<Hixie>
was probably gonna wait til i got to the rendering section though
12:09
<Hixie>
which is like the last priority other than references
12:09
<Philip`>
<font> seems to cause a number of people to think badly of HTML5
12:10
<Hixie>
they'll live
12:10
<Philip`>
so it seems kind of sensible to try fixing it now, rather than leaving people with an incorrect impression of how HTML5 is likely to end up
12:10
<Hixie>
i guess
12:13
<Hixie>
the problem with the style="" attribute is that i don't have a solution
12:14
<Hixie>
other than the concept of a less good conformance level, which as hsivonen points out, is a bad idea for oh so many reasons
12:15
<Philip`>
Hixie: Now that everyone's doing getImageData/putImageData, I think they've all got pretty much all the features implemented, and I'm not currently aware of any particular major bugs (except that ImageData probably varies quite a bit, and I don't know how large the compatible intersection is, since I haven't really looked at any of this since last year)
12:15
<Hixie>
oh, really
12:15
<Hixie>
interesting
12:16
<Hixie>
so do you think we could add text without compromising existing bug fixing efforts?
12:16
<zcorpan_>
Hixie: just allow style='' everywhere but say that the document must still be usable when all style=''s are ignored
12:16
<Hixie>
zcorpan_: i guess we could do that
12:16
<Philip`>
(There's loads of tiny bugs, like handling Infinity/NaN inconsistently, but I'm guessing they don't affect interoperability urgently)
12:16
<Lachy>
Hixie, what are the reasons for not making style="" conforming?
12:17
<Hixie>
Lachy: same reasons for not making <font> conforming
12:17
<Lachy>
is it just because it's non-semantic?
12:17
<Lachy>
and presentational
12:17
<Hixie>
it being non-semantic isn't the actual reason, but it's the proxy for the reason that people can rally behind, yes
12:18
<Hixie>
it's the same reason <font size=3 color=blue> is bad
12:18
<Hixie>
it can't be repurposed, it's not maintainable, it is media-specific, etc
12:18
<Lachy>
right
12:20
<zcorpan_>
i think <font color> isn't technically bad as <b>/<i>-like annotations, although people hate <font> so i guess it makes sense to not allow it
12:20
<Hixie>
as far as i can tell, i have no <font> element feedback
12:20
<Philip`>
Hixie: That probably depends on how much time implementors will have before their next releases - it seems like most will have around a year or so, which sounds like it wouldn't be rushing much, so I'd expect it wouldn't impact much on bug-fixing of old features
12:20
<Hixie>
and only 4 e-mails on style="", one of which i just wrote (containing zcorpan_'s suggestion above)
12:21
<Philip`>
But I know very little about browser development cycles, so I could be totally wrong :-)
12:21
<Hixie>
Philip`: k
12:22
<Hixie>
i guess i should sleep now anyway
12:23
Philip`
thinks style="" makes things easier to maintain rather than harder, because he can look at the element he wants to change and then change it and then it works, instead of having to follow invisible back-references into stylesheets and then hoping it won't have side-effects on anything else in the site
12:24
<Philip`>
(...assuming the alternative to style="" is using selectors in stylesheets to associate styles and elements)
12:28
<Philip`>
Hixie: http://xhtml.com/en/future/x-html-5-versus-xhtml-2/#x5-uncool-font has <font> element feedback - it's just not cool
12:28
<Hixie>
heh
12:28
<Hixie>
the maintenance thing is fine if that's the only element with that style
12:29
<Hixie>
but that's only going to be the case on small pages that aren't part of big sites, etc
12:33
<zcorpan_>
news-like announcements often have use-once style
12:33
<zcorpan_>
that's where i've used style='' before
12:35
<Philip`>
<input size> is used when people want one-off presentational attributes
12:36
<Philip`>
(though HTML5 says they can't write that, so they'd have to do it pointlessly verbosely like <input style="width:10em">)
12:36
<zcorpan_>
(css also doesn't have a proper alternative to size='')
12:37
<Philip`>
and it doesn't make sense to extract that into an external stylesheet since it's different for every form you make (depending on how much text you expect people to enter into the form)
12:40
<Philip`>
Big sites aren't all homogeneous, and they're still going to have lots of one-off bits of code
12:40
<Philip`>
so it's nice to have some way to handle that, and just put up with people who abuse it and cause maintenace pain to themselves :-)
12:41
<Philip`>
s//n/
12:42
<zcorpan_>
Philip`: yeah
12:47
<Lachy>
I need to write an example script using .querySelector(). Does anyone have any suggestions for a trivial, yet somewhat practical example I could write?
12:50
<Philip`>
Lachy: document.querySelectorAll('img[src$=".png"]').forEach(function (img) { img.style.filter = 'progid:DXImageTransform.Microsoft.AlphaImageLoader(src='+img.src+')' });
12:50
<Philip`>
Maybe not quite trivial enough...
12:51
<Philip`>
Do you want something that's not just getElementById or getElementsByClassName?
12:51
<Lachy>
yeah, and preferably something that would be relevant to Opera's implementation
12:52
<Lachy>
an IE hack unfortunately isn't
12:52
<zcorpan_>
Philip`: that would be pointless since only ie6 needs the filter and ie6 doesn't support querySelector
12:52
<Philip`>
zcorpan_: Yeah, I guess that's a minor problem with that example
12:54
<Philip`>
I suppose document.querySelector('input[required][value=""]') isn't that useful in Opera either :-(
12:54
<Lachy>
it would help if the editor of the selectors api spec hadn't filled it with such contrived examples!
12:55
<Philip`>
The spec must have been written for certain use cases, so why not just use those? :-)
12:55
<Hixie>
i try to base the examples in html5 on what people mention as their use cases
12:55
<Lachy>
I have one using input elements for checkboxes: "input:checked"
12:55
<zcorpan_>
perhaps querySelector can be used for aria stuff
12:57
<zcorpan_>
although most authors don't know about aria so it would likely just be a distraction
12:57
<Lachy>
yeah, I guess I could take a look at previous mailing list discussions about it to see what came up.
12:57
<Philip`>
var tds = document.getElementById('graphdata').querySelectorAll('td:last-of-type'); tds.forEach(function (td) { var value = td.textContent; td.innerHTML = '<img src="bar.png" style="width:'+value+'px" alt="'+value+'">' });
12:58
<Philip`>
(for converting a numerical table into a graphicalised version)
12:58
<Lachy>
or look at what people are using the equivalent JQuery APIs for, but it's kind of hard to ask google "Show me all pages that use the JQuery API script"
12:58
<Philip`>
(Maybe Opera is too rubbish to have forEach, though)
12:59
<Philip`>
Lachy: It would be convenient if you had a collection of loads of web pages that you could grep
13:01
<Lachy>
Philip`, that wouldn't work anyway, since the .forEach method isn't implemented on NodeLists. It would need to be converted to an array first.
13:02
<Philip`>
Lachy: http://philip.html5.org/misc/jquery-pages.txt has some that match /<script[^>]*jquery/
13:02
<Philip`>
Lachy: Oh, maybe Array.forEach(tds, function ...) then
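A small sketch of the conversion Lachy describes, written so it runs without a browser. The `cells` object is a hand-made stand-in for a NodeList (indexed properties plus `length`, no `forEach`), and `toArray` and `barMarkup` are hypothetical helpers. It uses `Array.prototype.slice.call` rather than the `Array.forEach(list, fn)` generic, which as far as I know was a Gecko-only extension.

```javascript
// Stand-in for the NodeList a selector query would return.
const cells = {
  0: { textContent: "40" },
  1: { textContent: "75" },
  length: 2,
};

// Copy any array-like object into a real array so array methods apply.
function toArray(arrayLike) {
  return Array.prototype.slice.call(arrayLike);
}

// Build the bar image markup from a numeric cell value.
function barMarkup(text) {
  const width = Number(text);
  if (!Number.isFinite(width)) {
    throw new TypeError("cell does not contain a number: " + text);
  }
  return '<img src="bar.png" style="width:' + width + 'px" alt="' + width + '">';
}

toArray(cells).forEach(function (td) {
  td.innerHTML = barMarkup(td.textContent);
});

console.log(cells[0].innerHTML);
// <img src="bar.png" style="width:40px" alt="40">
```

Converting the value with `Number` before interpolating it also keeps non-numeric cell text out of the generated markup.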
13:06
<Lachy>
Philip`, thanks. That's a useful list.
13:07
<Philip`>
I have another four hundred if that list is too short :-)
13:07
krijnh
also has some jQuery examples, if needed :)
13:08
<Philip`>
jQuery: More popular than <cite>!
13:09
<Philip`>
That'd be a great marketing slogan
13:09
<Philip`>
Woah, it's even more popular than <csobj>
13:09
Philip`
waits for webgrep to finish making the list grow
13:10
<Lachy>
krijnh, I'm sure they'll be useful for me in the future
13:10
<Lachy>
what's <csobj>?
13:10
<Philip`>
Some GoLive component thing
13:17
<Hixie>
<font> is gone
13:17
<Hixie>
style="" is global
13:27
<hsivonen>
Hixie: there should be <fond> feedback from me asking for <font color> to become conforming
13:28
<Hixie>
ah
13:28
<Hixie>
might have gotten filed elsewhere
13:31
<Philip`>
Fond feedback?
13:32
<hsivonen>
font
13:34
<Philip`>
Is the empty string a valid CSS declaration?
13:34
<Hixie>
yes
13:34
<Philip`>
Ah, good
13:35
<Philip`>
Lachy: http://philip.html5.org/misc/jquery-pages.txt has a load more now
13:36
<Lachy>
Philip`, thanks. But I've got a good example now
13:38
<Hixie>
nn
13:38
<Philip`>
jQuery: More popular than <caption>!
13:40
<zcorpan_>
Philip`: makes sense, <caption> is pretty boring
13:41
<Philip`>
More popular than <blink> too, and that isn't pretty boring
13:42
<zcorpan_>
perhaps we should make <blink> animate so that it looks just as cool as jquery animations, to get some competition
13:44
<Philip`>
It'll have to go a long way to catch up with <marquee>
13:50
<zcorpan_>
<marquee>: More popular than jQuery!
14:32
<zcorpan_>
html5-elements r28: -<font>, +style, +onstorage, +data-*
14:32
<Philip`>
HTML5: Infinitely more attributes than HTML4!
15:01
<hsivonen>
Philip`: that was already true with <embed>
15:01
<hsivonen>
twice infinitely more attributes than HTML4!
15:04
<Philip`>
hsivonen: But <embed> wasn't in HTML4
15:05
<zcorpan_>
Philip`: that's why it was already true with <embed>...
15:05
<Philip`>
Huh?
15:05
<Philip`>
Oh
15:05
<Philip`>
I thought you meant <embed> in HTML4
15:06
<Philip`>
hsivonen: To avoid causing me confusion and suffering, I think you should have said "twice infinitely more attributes than the previous revision of HTML5!"
15:07
<Philip`>
Oh wait
15:07
<Philip`>
Now I'm just totally wrong
15:07
<Philip`>
hsivonen: You were right all along
15:08
Philip`
misread "twice infinitely more than" as "twice as many as the infinity that were in"
15:11
<zcorpan_>
Hixie: the spec says "The declarations specified must be parsed", not "The attribute's value must be parsed", which might imply that there is some preprocessing (like, splitting on ;) to get the declarations
15:12
<zcorpan_>
s/;)/';')/
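One way to see why the wording matters: if an implementation read "the declarations" as licence to pre-split the attribute value on ';', it could disagree with parsing the whole value as CSS. The sketch below is only an illustration of that difference; both functions are hypothetical, and the quote-aware one handles string quotes and backslash escapes but is nowhere near a real CSS parser.

```javascript
// Naive preprocessing: split the attribute value on every semicolon.
function naiveSplit(styleValue) {
  return styleValue
    .split(";")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Simplified alternative: semicolons inside quoted strings do not split.
function quoteAwareSplit(styleValue) {
  const parts = [];
  let current = "";
  let quote = null; // the quote character we are inside, if any
  for (let i = 0; i < styleValue.length; i++) {
    const ch = styleValue[i];
    if (quote !== null) {
      current += ch;
      if (ch === "\\" && i + 1 < styleValue.length) {
        current += styleValue[++i]; // keep the escaped character verbatim
      } else if (ch === quote) {
        quote = null;
      }
    } else if (ch === '"' || ch === "'") {
      quote = ch;
      current += ch;
    } else if (ch === ";") {
      parts.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  parts.push(current);
  return parts.map((s) => s.trim()).filter((s) => s.length > 0);
}

const value = 'content: "a;b"; color: red';
console.log(naiveSplit(value));      // [ 'content: "a', 'b"', 'color: red' ]
console.log(quoteAwareSplit(value)); // [ 'content: "a;b"', 'color: red' ]
```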
15:18
<htmlfivedotnet>
ooh. only one failure in the html5lib now. is that what you guys are all getting? I can't wait to play around with it
15:22
<zcorpan_>
Hixie: how about the icons used at http://html5.org/tools/web-apps-tracker ?
15:23
<Philip`>
zcorpan_: What licensing do they have?
15:23
<zcorpan_>
Philip`: dunno
15:38
<takkaria>
Hixie: how does the PDF version of the spec get generated?
15:40
<Philip`>
takkaria: Probably a bit like http://hsivonen.iki.fi/printing-wa10/
16:30
<hsivonen>
hmm. interesting. although methods beat switch on the PPC client VM, switch does indeed beat methods on the x86 client VM!
16:40
<htmlfivedotnet>
hsivonen: wider pipelines to process more at once for the ppc (all methods), whereas the x86 barrels through the narrow switches at full megahertz speed
16:41
<htmlfivedotnet>
i bet that compiled in 64 bit, the x86 would be right in line on methods
16:54
<Philip`>
hsivonen: I have Java on x86_64 in case it'd help to test that too
16:55
<Philip`>
(It seems to say I only have the server VM and not client, though...)
16:56
<hsivonen>
Philip`: http://about.validator.nu/htmlparser/perf2.zip
16:56
<hsivonen>
Philip`: the client VM is not available for x86_64
16:58
<Philip`>
hsivonen: Ah, that would explain it
16:58
<hsivonen>
Hixie: I think making PCDATA-data and non-PCDATA-data states different states is likely to be a big perf win
16:59
<Philip`>
Hmm, my results might be a little suspect since three quarters of my CPU is running theorem provers
16:59
<hsivonen>
I think I'm going to break the data state into four states: PCDATA, RCDATA, CDATA and escape flag
17:00
<hsivonen>
I also think I should break the escape flag state further to avoid a look-back buffer
17:00
<hsivonen>
Hixie: you are cheating whenever you spec lookahead or lookback :-)
17:01
<Philip`>
hsivonen: He's optimising for readability, and trusting implementors to optimise for performance :-)
17:02
<hsivonen>
Philip`: does your OCaml parser generator generate 6 states for the data state?
17:16
<Philip`>
hsivonen: No
17:16
<Philip`>
hsivonen: I do have code that approximately splits each state up into lots of (state, content-model-flag) stats, like in http://canvex.lazyilluminati.com/misc/states2.png
17:17
<Philip`>
but haven't put that into the code generation part, because it gets a little messier when the tree constructor can change content-model-flag without the tokeniser's knowledge
17:18
<Philip`>
hsivonen: Running for 1 minute on Wikipedia's Main_Page from a few days ago, on Java HotSpot(TM) 64-Bit Server VM (build 1.6.0-b105, mixed mode), for XML/methods/switch I get 53280/27395/27294
17:19
<Philip`>
so there's no real different between methods and switch
17:19
<Philip`>
*difference
17:21
<Philip`>
s/stats/states/ five minutes ago
17:22
<Philip`>
Oh, also I haven't put it into the code generation part because I'd need to write a better optimiser for the conditions (e.g. "contentModelFlag == PCDATA" is trivially false in the DataState-RCDATA state)
17:22
<Philip`>
(but some things are less trivial)
17:48
<Philip`>
hsivonen: Also, 5 minute runs on the same machine/page give pretty much the same results (XML=269592, methods=137637, switch=132836)
19:14
<hsivonen>
Philip`: I think the content model flag should become a return value from the tree builder
19:15
<hsivonen>
Philip`: thanks for the x86_64 results
19:15
<hsivonen>
x86 server: http://pastebin.ca/1002005
19:42
<virtuelv>
MikeSmith: are there still lightning talk spots open?
19:44
<MikeSmith>
virtuelv, yeah
19:47
<virtuelv>
I certainly hope the talks don't have to be on topic and serious?
19:48
<MikeSmith>
the more fun, the better
19:49
<virtuelv>
Yeah, I'm struggling between "The political importance of lolcats" and "My name's relevance to «A boy named Sue»"
19:56
<hsivonen>
virtuelv: but lolcats are seriously important for political dissenters
19:57
<virtuelv>
hsivonen: yes, and so are their internet siblings, the motivational posters
19:58
<virtuelv>
MikeSmith: key point is, I'm interested, but I have no topic yet
20:16
<hsivonen>
Hixie: are JS engines now required to track the origin of string objects for data URIs?
20:16
<hsivonen>
do they do that already or is this something new?
20:17
<othermaciej>
hsivonen: right now Gecko and the latest WebKit give frames loaded from data: URIs no security authority
20:18
<othermaciej>
not sure what IE does
20:18
<othermaciej>
did Hixie invent something more complicated?
20:20
<hsivonen>
othermaciej: if I understood correctly, Hixie made data URIs inherit origin from the document/script where the string was minted
20:21
<hsivonen>
it's possible that I didn't understand correctly
20:21
<othermaciej>
oh that's not gonna fly
20:21
<othermaciej>
inheriting origin from the document/script that initiated the location change might do
20:22
<othermaciej>
but is somewhat complicated and unlike the handling for javascript: URIs for instance
20:22
<othermaciej>
(sadly I don't think data: can safely use either the javascript: or the about:blank security policy)
20:25
<hsivonen>
Philip`: it appears that your state analysis doesn't know that content model is always PCDATA when endTag returns
20:27
<Philip`>
hsivonen: That's the part that's impossible to determine when you've only got the tokeniser algorithm
20:28
<Philip`>
since it's the tree constructor that sets it back to PCDATA, if I remember correctly
20:28
<Philip`>
I should just hard-code that fact into the state-expander algorithm, and then it'd probably work alright
20:29
<hsivonen>
I think the various flavors of the data state should be individual states and startTag should be allowed to return the next state it wants
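A sketch of the shape being proposed: the flavours of the data state are separate states, and the tree builder's `startTag` hands back the state the tokeniser should continue in. This is a driver loop over ready-made tokens, not a tokeniser; the `State` names, the `treeBuilder` object and the element-to-state mapping are illustrative assumptions.

```javascript
// The flavours of the data state, modelled as distinct states.
const State = Object.freeze({
  PCDATA: "PCDATA",
  RCDATA: "RCDATA",
  CDATA: "CDATA",
});

// Hypothetical tree builder: each callback returns the next tokeniser state.
const treeBuilder = {
  startTag(name) {
    if (name === "title" || name === "textarea") return State.RCDATA;
    if (name === "script" || name === "style") return State.CDATA;
    return State.PCDATA;
  },
  // As noted earlier in the log, the content model is PCDATA after endTag.
  endTag(name) {
    return State.PCDATA;
  },
};

// Feed tokens to the builder and record the state after each one.
function run(tokens, builder) {
  let state = State.PCDATA;
  const trace = [];
  for (const token of tokens) {
    if (token.type === "start") {
      state = builder.startTag(token.name);
    } else if (token.type === "end") {
      state = builder.endTag(token.name);
    }
    trace.push(state);
  }
  return trace;
}

const trace = run(
  [
    { type: "start", name: "title" },
    { type: "end", name: "title" },
    { type: "start", name: "script" },
    { type: "end", name: "script" },
    { type: "start", name: "p" },
  ],
  treeBuilder
);
console.log(trace); // [ 'RCDATA', 'PCDATA', 'CDATA', 'PCDATA', 'PCDATA' ]
```

Because the state comes back as a return value, the tokeniser never has a flag changed behind its back.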
20:30
<Philip`>
That sounds like a sensible way to model it
20:32
<Philip`>
(I've kind of ignored the whole tokeniser / tree constructor interaction for now, and treated them as isolated algorithms)
20:32
<Philip`>
(I could just convert them into a single state machine)
20:34
<Philip`>
(37 tokeniser states * 19 tree constructor states isn't really that many)
20:35
<Philip`>
(* ~4 for the content model flag)
20:36
<Philip`>
(* 2 for the escape flag)
20:36
<Philip`>
(Okay, maybe I don't actually want to end up with five thousand functions in my parser)
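The arithmetic behind "five thousand functions", using the counts quoted in the messages above (the content model flag count is the approximate "~4" given there):

```javascript
// Counts as quoted in the conversation.
const tokeniserStates = 37;
const treeConstructorStates = 19;
const contentModelFlagValues = 4; // approximate, per the "~4" above
const escapeFlagValues = 2;

const combinedStates =
  tokeniserStates * treeConstructorStates * contentModelFlagValues * escapeFlagValues;

console.log(combinedStates); // 5624
```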
20:46
<hsivonen>
Philip`: at first I thought that it would be smart to inline tree constructor into tokenizer, but it really isn't
20:48
<Philip`>
hsivonen: Why isn't it?
20:48
<hsivonen>
Philip`: the tree constructor methods are large and have more than one call site
20:48
<Philip`>
(I'm guessing it'd just be a huge amount of code duplication)
20:51
<hsivonen>
Hmm. I already have a concept of return state for entity consumption
20:51
<hsivonen>
I could use the same mechanism for remembering CDATA vs. RCDATA when in escape
21:18
<jgraham__>
hsivonen: How do you deal with gathering all the tokens till the next non character token when parsing (R)CData? Do you have a special loop just for doing that or does it all go through the main loop somehow?
21:19
jgraham__
has to fix html5lib's liberal xml parser for cases like <script /><head> but really doesn't want to
21:19
<jgraham__>
(I want to not have a liberal xml parser based on html5lib. Unfortunately it is actually used in the wild)
21:20
<jgraham__>
s/?$/? or something else?/
21:26
<hsivonen>
jgraham__: I have a variable that holds the position of the first character in a run of text and when a text run ends, I call a flushing method
21:27
<hsivonen>
jgraham__: when I skip over data that may turn out to be text, I either store it in a buffer just in case or have static buffers with certain magic strings
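A minimal sketch of the run-tracking idea described here: keep the index where the current text run started and flush the slice when the run ends. It is written in JavaScript rather than the parser's own language, treats anything between '<' and '>' as markup, and uses made-up names (`scan`, `onCharacters`, `onMarkup`), so it shows only the bookkeeping, not HTML tokenisation.

```javascript
// Scan input, reporting text runs and markup chunks through callbacks.
function scan(input, onCharacters, onMarkup) {
  let runStart = 0; // position of the first character of the current text run

  // Emit the pending text run, if it is non-empty.
  function flush(end) {
    if (end > runStart) onCharacters(input.slice(runStart, end));
  }

  let i = 0;
  while (i < input.length) {
    if (input[i] === "<") {
      const close = input.indexOf(">", i);
      if (close === -1) break; // unterminated: the rest stays in the text run
      flush(i);                // the text run ends where the markup begins
      onMarkup(input.slice(i, close + 1));
      i = close + 1;
      runStart = i;            // the next run starts after the markup
    } else {
      i++;
    }
  }
  flush(input.length);
}

const events = [];
scan(
  "ab<b>cd</b>e",
  (text) => events.push(["chars", text]),
  (markup) => events.push(["markup", markup])
);
console.log(events);
```

Characters are never copied one at a time; each run is sliced out once, when it ends.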
21:28
<hsivonen>
nn
21:29
<jgraham__>
goodnight
21:51
<Hixie>
takkaria: by a script that often gets killed by the kernel, running every day around 6am
21:52
<Hixie>
hsivonen: no, there shouldn't be any tracking of origin for data: URIs beyond the point at which it is used... did I make it more complicated by accident?
22:28
<Hixie>
if a WimLeers guy comes by when i'm not around, someone help him out :-)