00:27
<ajnewbold>
gsnedders: ahoy
06:29
<jwalden>
"We think that developers will have an easier time building interoperable sites on top of IE8’s strong platform work (like [...] cross-domain requests (XDR) [...])."
06:29
<jwalden>
the sheer audacity
06:47
<BenMillard>
jwalden, you're talking about the IE blog here? http://blogs.msdn.com/ie/archive/2008/12/03/compatibility-view-improvements-to-come-in-ie8.aspx
06:48
<jwalden>
yeah
06:48
<jwalden>
I added a comment to that effect, too, except less exasperatedly-worded
06:48
jwalden
neologizes a bit, and a bit more
06:51
<BenMillard>
"When users install Windows 7 Beta or the next IE8 update, they get a choice about opting-in to a list of sites that should be displayed in Compatibility View."
06:53
<BenMillard>
"Users who choose to get the list receive it via Windows Update packages, just like IE security updates."
06:54
<BenMillard>
"[...] the presence of a <META> tag / HTTP header “wins” [...]"
06:55
<BenMillard>
"In some cases, for a new browser, developers have to spend time to add a tag or header to make their sites compatible."
06:56
<BenMillard>
it's good that Microsoft are discussing this in detail in advance
06:58
<BenMillard>
oh, it names some sites: "[...] high-volume sites like facebook.com, myspace.com, bbc.co.uk, and cnn.com with pages that weren’t working for end-users with IE’s new standards compliant default."
07:00
<BenMillard>
"The most important thing we can do now is deliver better interoperability for a better web [...]"
07:15
<jwalden>
it's mostly just this-is-how-far-we'll-bend-over-backwards-for-sites-that-relied-on-our-bugs plus a little see-we're-trying-to-do-things-that-look-monopolistic
07:15
<jwalden>
which is all well and good, as far as that goes
07:15
<jwalden>
er
07:16
<jwalden>
look-not-monopolistic
07:19
hsivonen
wonders what the sites using XDR will be interoperable with
07:29
<hsivonen>
hmm. the opt-in list will be fun for QA
07:30
<hsivonen>
(both at Microsoft and at the sites affected)
07:33
<zcorpan>
i guess it means authors will have to make sure it works in both IE8 standards mode and IE8 compat mode even if they only want to target standards mode
07:35
<BenMillard>
zcorpan, perhaps such sites are expected to provide "a <META> tag / HTTP header"?
07:38
<zcorpan>
BenMillard: does it override the user's opt-in list?
07:42
<BenMillard>
zcorpan, apparently "[...] the presence of a <META> tag / HTTP header “wins” [...]"
07:43
<BenMillard>
zcorpan, you can read the whole thing: http://blogs.msdn.com/ie/archive/2008/12/03/compatibility-view-improvements-to-come-in-ie8.aspx
07:55
<BenMillard>
krijn, could simple mentions of domain names like example.com and subdomain.example.com get linkified in the logs?
08:54
<MikeSmith>
hsivonen: very cool to see your HTML5 Parsing in Gecko news
09:20
<gsnedders>
Philip`: I will be, provided I'm on a NXEC train (and not some temp. hired one without wifi)
09:48
<krijn>
BenMillard: probably, yes :)
09:48
<krijn>
Got a nice regex for that?
09:49
<BenMillard>
krijn, I don't
09:49
<krijn>
Me neither
09:50
<BenMillard>
krijn, TLDs are standardised, and they are preceded by a dot when used in a simple domain
09:50
<BenMillard>
so if you have a list of valid TLDs, maybe you can use that as the start of a RegEx?
09:51
<BenMillard>
supporting ccTLDs could be a real chore, but .com and .org and simple ones like that would still help
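A minimal sketch of the TLD-list idea, assuming a tiny hand-picked list (`TLDS` and `find_domains` are hypothetical names, not anything from the log viewer). It only finds bare domain names; it makes no attempt to skip names that are already part of a full URL, which is one source of the mislinking risk.

```python
import re

# Hypothetical short list; a real one would be far longer (ccTLDs etc.).
TLDS = ("com", "org", "net")

# One or more dot-terminated labels followed by a known TLD.
DOMAIN_RE = re.compile(
    r"\b(?:[a-z0-9-]+\.)+(?:" + "|".join(map(re.escape, TLDS)) + r")\b",
    re.IGNORECASE,
)

def find_domains(text):
    """Return every bare domain-like token in text, in order of appearance."""
    return DOMAIN_RE.findall(text)

sample = "see example.com and subdomain.example.com, not html5parser.py"
found = find_domains(sample)
```

Here `found` holds the two domain names and skips `html5parser.py`, because `py` is not in the assumed list.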
09:51
<krijn>
Imho it's easier to select the foo.com, right click, Go to web address
09:51
<krijn>
But that's just me :)
09:51
<krijn>
Auto linking example.com doesn't make sense
09:52
<krijn>
And that's probably the most used one
09:53
<BenMillard>
krijn, it's your choice, of course
09:53
<krijn>
Trying to convince you a bit, since I'm lazy
09:53
<zcorpan>
i think it'd result in a fair amount of mislinking
09:53
<krijn>
Yeah
09:53
<jwalden>
you could make a regex from http://mxr.mozilla.org/mozilla-central/source/netwerk/dns/src/effective_tld_names.dat?raw=1
09:54
<krijn>
I don't think it's worth it
09:54
<jwalden>
which would be as accurate as it gets these days
09:54
<jwalden>
I agree
09:54
<jwalden>
:-)
09:54
<BenMillard>
fair enough, was just an idea :)
09:54
<krijn>
Great idea btw! ;)
09:54
<BenMillard>
krijn, did you see I made demos for some other IRC log pages?: http://projectcerbera.com/!dev/irc-logs/
09:55
<krijn>
Yeah, very colorish :)
09:55
<BenMillard>
krijn, as before, take as much/little as you want from them
09:55
<BenMillard>
having a consistent header and footer across all IRC log pages, and better navigation between pages, seemed like an improvement
09:56
<krijn>
I agree
09:56
<krijn>
This would certainly double the readers :)
09:58
<BenMillard>
lol, wow that'd be 8!! :P
09:59
<krijn>
http://krijnhoetmer.nl/stats/usage_200812.html#TOPURLS
09:59
<krijn>
It's one of my top urls though
10:00
<krijn>
(Could mean the rest is even more shitty, but I'll try not to think about that case)
10:01
<BenMillard>
krijn, wow it looks like individual pages of logs are more popular than any other page on your site!
10:01
<BenMillard>
krijn, so the /irc-logs/ area must dominate your traffic?
10:01
<krijn>
Yeah, it does
10:02
<krijn>
But I think that has to do with forwarding /irc-logs/htmlwg/ to the most recent day
10:02
<krijn>
And that /irc-logs/htmlwg/ is linked from w3.org
10:02
<krijn>
html-wg even
10:02
<krijn>
From http://www.w3.org/html/wg/ I mean
10:02
<BenMillard>
nice! I hadn't realised that linked to your logs.
10:03
<krijn>
Yay for my PR :p
10:03
<BenMillard>
I added navigation between pages after seeing Dan Connolly say what I'd been thinking for some time: http://krijnhoetmer.nl/irc-logs/microformats/20081125#l-122
10:03
<BenMillard>
s/say/said/
10:03
<krijn>
Yeah, prev/next day links would be handy
10:04
<annevk3>
should be quite easy to make too...
10:04
<krijn>
Not for me ;)
10:07
<krijn>
Could do some loops, currentday - 1, see if that logfile exists.. If yes, make a link, if not, check currentday - 2, etc
10:07
<krijn>
Same for nextday
10:07
<krijn>
With a max of 10 loops or something
10:11
<BenMillard>
krijn, sounds like that would handle changes in the month, too
10:11
<krijn>
Yeah, that's what it is for, mostly
10:12
<BenMillard>
neato
10:12
<krijn>
Or I could list all logfiles, sort them, pick the current one - 1, and the current one + 1
10:12
<krijn>
(Which is what I do on the homepage)
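Both approaches can be sketched roughly as follows. This is an illustration only: a set of dates stands in for "does that logfile exist", and the function names are made up for the example.

```python
from datetime import date, timedelta

def neighbours_by_probe(current, existing, max_steps=10):
    """Step one day at a time, up to max_steps, until a log day is found."""
    def probe(direction):
        for n in range(1, max_steps + 1):
            candidate = current + timedelta(days=direction * n)
            if candidate in existing:  # stand-in for a file-exists check
                return candidate
        return None
    return probe(-1), probe(+1)

def neighbours_by_sort(current, existing):
    """Sort all log days and pick the entries around the current one."""
    days = sorted(existing)
    i = days.index(current)
    previous = days[i - 1] if i > 0 else None
    following = days[i + 1] if i + 1 < len(days) else None
    return previous, following

log_days = {date(2008, 11, 28), date(2008, 12, 1), date(2008, 12, 4)}
```

Date arithmetic handles the month boundary in the probing version; the sorted version has no step limit but needs the full listing.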
10:18
<yecril71>
I would rather recommend a DOM transformation to convert a set of check boxes to a drop-down list.
10:18
<yecril71>
That, of course, relies on innerHTML to work.
10:19
<yecril71>
Structured approaches work better with structured languages than regular expressions.
10:20
<BenMillard>
yecril71, where is that being done?
10:20
<yecril71>
Regular expressions are for simple validation, and for languages that have no public DOM.
10:21
<yecril71>
Date: Tue, 2 Dec 2008 07:48:15 +0000 (UTC)
10:21
<yecril71>
Subject: Re: [whatwg] [WF2] action="mailto:" - encoding spaces
10:22
<BenMillard>
oh, an e-mail on a list, not what krijn and I were talking about. ok :)
10:22
<krijn>
Not really, no :)
10:22
<krijn>
A mailing list is so outdated!
10:23
<krijn>
(And hard to keep up with, as well)
10:23
<BenMillard>
krijn, I'm off for dinner now, then I'll resume neatening up my idea for the "day" log. Kudos for being so open to suggestions.
10:23
<krijn>
Np, sorry I'm not using everything immediately :)
10:24
<yecril71>
Absolutely, it is an antique way of getting things done, and extremely inefficient.
10:26
<yecril71>
I would still keep replying on the list but my remarks have been criticised as unintelligible and unconstructive.
10:26
<yecril71>
So Hixie told me I should rather interrupt your discussions on IRC.
10:34
<jgraham>
IRC: where being unintelligible and unconstructive is considered normal
10:37
jgraham
notes that listing "The body element" under "obsolete elements" is a bit confusing
10:39
annevk3
notes that it has been noted already
10:41
jgraham
notes that he expected it had already been noted but thought it was worth noting again just in case
10:47
krijn
scratches his notes
10:49
annevk3
notes he would have noted that he expected jgraham would have noted it already, but wanted to spare people on the amount of notes
10:49
<krijn>
Duly noted
10:50
<hsivonen>
thanks for the kind comments on the Gecko build
10:52
<jgraham>
Hmm am I misreading the spec... it looks as if </br> should create a <br> start tag
10:52
<zcorpan>
jgraham: that's correct
10:53
<roc>
hsivonen: context?
10:53
<hsivonen>
roc: http://hsivonen.iki.fi/html5-gecko-build/
10:53
<jgraham>
zcorpan: Did that change recently?
10:53
<zcorpan>
jgraham: no
10:53
jgraham
wishes for a way to find out when a section of the spec last changed
10:54
<roc>
cool
10:54
<roc>
ooh, you're supporting Hixie's SVG and MathML inclusion?
10:54
<roc>
that'll put the cat among the pigeons
10:55
<hsivonen>
roc: thanks. yes, SVG and MathML are supported
10:55
<doublec>
amazon S3 came up as an example during the 'video element and cross domain' debate as a service that doesn't support access controls
10:55
<doublec>
it turns out it does provide the functionality to add custom headers
10:55
<doublec>
so as long as the headers don't need dynamic data, you could use it to host cross domain files
10:56
<roc>
intriguing
10:56
<annevk3>
Access-Control might need dynamic data if the video is personalized based on a cookie
10:56
<doublec>
yes, true
10:56
<annevk3>
but if it's simple storage Access-Control-Allow-Origin:* would suffice
10:57
annevk3
doesn't really know what S3 allows for
10:57
<doublec>
only setting a header and string value when uploading the file, which gets served as an http header when a client downloads it
10:58
<annevk3>
so you can't really have personalized video hosting there anyway
10:58
<doublec>
correct
11:03
zcorpan
tests hsivonen's gecko build
11:03
<zcorpan>
seems to work :)
11:04
<zcorpan>
hsivonen: http://simon.html5.org/articles/mobile-results is rendered in quirks mode
11:06
<zcorpan>
or is document.compatMode lying?
11:07
<hsivonen>
zcorpan: oops. it always renders in the quirks mode.
11:07
<zcorpan>
heh
11:08
<hsivonen>
zcorpan: blog post fixed. thanks
11:17
<yecril71>
MSDN says: The value of a form control is encoded so that spaces and special characters are converted to their ASCII equivalents.
11:17
<zcorpan>
pages seem to render surprisingly well in quirks mode
11:18
<yecril71>
Guess what the ASCII equivalent for a space should be, given that a space is a valid ASCII character :-)
11:19
<zcorpan>
hsivonen: some pages suffer from the FOUC
11:19
<zcorpan>
e.g. csszengarden
11:20
<yecril71>
The example is oPhrase=My%20favorite%20color%20is%20plaid so that must be %20.
11:20
<yecril71>
In particular, converting a space to + is not mentioned at all.
11:22
<Lachy>
hsivonen, in the HTML5 gecko build, there's a strange bug with the handling of numeric character references
11:23
<Lachy>
hsivonen, see the To: field on this post for an example: http://lists.w3.org/Archives/Public/public-html/2008Dec/0074.html
11:24
<Lachy>
hsivonen, note how it says: "HTML WG <&#0119;&#0051;&#0046;&#0111;&#0114;..." whereas the other email addresses, which are also in the source as char refs, display properly
11:27
<zcorpan>
Lachy: if i paste the source into the live dom viewer then it doesn't happen
11:42
<yecril71>
IE7, however, converts a space to + and encodes + as %2B as required.
11:47
<Lachy>
hsivonen, here's a minimified demo showing the bug http://tinyurl.com/5z6phg
11:49
<Lachy>
hsivonen, note that the behaviour seems to change depending on the length of the source code before it. i.e. if you remove one line of source code, it's likely to display a different set of character references wrongly
11:54
<zcorpan>
Hixie: at least 1 web developer thinks <iframe seamless> is a good thing :) http://friendlybit.com/html/html-includes/
12:17
<yecril71>
I wonder if it is appropriate for the user agent to display a HTML document upon navigating to a mailto: URI.
12:18
<yecril71>
I think Lynx does that.
12:20
<roc>
Firefox can do that
12:20
<yecril71>
I also think inappropriate appearance directives for FORM controls should be ignored.
12:20
<roc>
a Web app can register a page as the mailto: protocol handler
12:21
<MikeSmith>
yecril71: my Lynx launches some curses-based interactive app for actually sending mail
12:21
<MikeSmith>
or maybe it's built into Lynx
12:26
<annevk3>
http://www.bluishcoder.co.nz/2008/12/srt-subtitles-with-html5-video.html
12:26
<annevk3>
fail
12:27
<annevk3>
(not the post)
12:28
<MikeSmith>
annevk3: the jquery.srt doesn't work as expected?
12:28
<annevk3>
MikeSmith, the approach to subtitles was supposed to be subtitles embedded in the video stream, rather than some ECMAScript hack
12:29
<MikeSmith>
I see
12:29
<annevk3>
hopefully it will work long term, but I'm sceptical
12:33
<krijn>
Subtitles/captioning embedded in videos doesn't work, people want them to be indexable right now as well..
12:34
<hsivonen>
annevk3: I had a glass half-empty feeling, too :-/
12:35
<krijn>
http://www.minvws.nl/en/video/ also uses SRT afaik
12:36
<hsivonen>
for the record, I think SRT is the way to go--but as a built-in feature without JS
12:37
<hsivonen>
when it comes to captioning, SRT vs. anything fancy is even more drastic than 80% of use cases with 20% the effort
12:37
<krijn>
Something like <video subs=foo.srt> you mean?
12:38
<hsivonen>
krijn: or SRT-equivant data in the media stream
12:38
<krijn>
Or subs=foo (for mookid)
12:39
<hsivonen>
SRT vs. fancy is probably closer to 99% of use cases for 1% of effort
12:39
<hsivonen>
s/equivant/equivalent/
12:41
<Lachy>
SRT may be ok for some simple captioning/subtitling, but it does have a lot of limitations
12:42
<hsivonen>
Lachy: right, I didn't say 100% of use cases :-)
12:43
<Lachy>
Joe Clark was advocating open captions on ALA recently, based on the limitations (especially fonts) of closed caption formats
12:43
<Lachy>
another alternative is the bitmap subtitle formats, such as those used on DVD and Blu-ray
12:44
<hsivonen>
Joe Clark may have unusually strong sensitivities to getting particular fonts displayed
12:44
<Philip`>
Would plain transcripts (not temporally linked to the video at all) satisfy many of the use cases, with much reduced effort?
12:44
<Lachy>
he has a point about using fonts that are optimised for readability on screen
12:45
<hsivonen>
Lachy: and OSs don't ship with any screen-readable fonts?
12:45
<Lachy>
hsivonen, I don't know.
12:46
<jmb>
Philip`: they may well do. although, there's loads of non-verbal stuff in video which should probably be conveyed somehow
12:46
<hsivonen>
Philip`: well, the TV and cinema people have assumed you need at least the SRT level of features
12:47
<Philip`>
Lachy: Does he only care about accessibility of professionally-produced videos, and not care about other people producing videos when they don't know anything about fonts and don't have any installed and decide to go with Comic Sans because it looks nice and friendly?
12:47
<Lachy>
anyway, I think the ultimate solution would be to use text-based subtitle format with the ability to embed custom fonts within the video container
12:47
<Lachy>
Philip`, for amateur videos, I'm sure he's aware that there isn't much chance of getting people to use anything much more complex than SRT, if they'll even use that
12:48
<Philip`>
hsivonen: The TV and cinema people have a vested interest in not producing a solution that removes the video component :-)
12:48
<hsivonen>
for now, I'd settle for SRT rendered in white with black outline near the bottom of the frame rendered in my browser-default sans-serif font
12:50
<Lachy>
hsivonen, sure, but it would also need to support basic positioning, so that it can be moved to the top to avoid overlapping other on-screen text
12:50
hsivonen
notes that Joe Clark relies on the user's computer having fonts for his blog instead of sending huge Microsoft Publisher-created bitmaps
12:51
<Lachy>
that's because using bitmaps like that would be less accessible than using even poor quality fonts
13:16
<MikeSmith>
I notice that the Moz and Webkit default stylesheets use -moz-padding-start: and -webkit-padding-start: instead of just padding: ... wondering why
13:18
<MikeSmith>
also -moz-margin-start and -webkit-margin-start, and -moz-border-start
13:19
<mstange>
MikeSmith: https://developer.mozilla.org/En/CSS/-moz-padding-start
13:19
<MikeSmith>
mstange: thanks
13:20
<annevk3>
yeah, those properties should just be standardized
13:21
<annevk3>
otoh, if all those properties end up in browsers and tools and such, authors might get confused quite a bit
13:22
<yecril71>
The encoding algorithm for mailto links, as specified, consists of two steps.
13:22
<yecril71>
The first step is encoding the control values normally, and the second step is to transform all occurrences of + to %20.
13:23
<yecril71>
This is equivalent to making an exception in the first algorithm.
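The two steps, and the "exception in the first algorithm" reading, can be sketched in Python as below. This uses the standard library's form encoding as a stand-in for the spec's algorithm, which is an assumption; `encode_for_mailto` is a hypothetical name.

```python
from urllib.parse import quote, urlencode

def encode_for_mailto(fields):
    """Two-step version: form-encode normally, then turn '+' into %20."""
    encoded = urlencode(fields)          # step 1: space -> '+', literal '+' -> %2B
    return encoded.replace("+", "%20")   # step 2: any remaining '+' was a space

def encode_for_mailto_one_step(fields):
    """Same result in one pass, by percent-encoding spaces directly."""
    return urlencode(fields, quote_via=quote)

example = encode_for_mailto({"oPhrase": "My favorite color is plaid"})
```

Step 2 is safe because a literal `+` in a value has already become `%2B` in step 1, so every `+` left in the string stands for a space.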
13:25
<BenMillard>
MikeSmith, your document could convert "-start" to "-left" and "-end" to "-right" in such properties, then remove the vendor prefix, and maybe say somewhere it assumes dir="ltr"?
13:27
<MikeSmith>
BenMillard: the default browser stylesheets don't actually use -end
13:27
<MikeSmith>
would it be wrong to just convert it to plain "padding" ?
13:28
<BenMillard>
MikeSmith, moz-padding-start is equivalent to padding-left when dir="ltr", afaict.
13:28
<BenMillard>
s/moz-padding-start/-moz-padding-start/
13:28
<MikeSmith>
I see
13:30
<yecril71>
It seems "jump to step" would be more readable if the steps were both numbered (for easy location) and labeled (for remembering what they do).
13:32
<MikeSmith>
BenMillard: so I guess I'll have my script convert those to padding-left and maybe put a note somewhere explaining
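A rough sketch of such a conversion, assuming only the `-moz-` and `-webkit-` prefixes and the padding/margin/border properties mentioned above (`to_physical` is a hypothetical name, not MikeSmith's actual script). The writing direction is a parameter, so the left-to-right assumption stays explicit.

```python
import re

# Vendor-prefixed start/end properties as seen in the default stylesheets.
_LOGICAL = re.compile(r"-(?:moz|webkit)-(padding|margin|border)-(start|end)\b")

def to_physical(css, direction="ltr"):
    """Rewrite -vendor-*-start/-end to *-left/*-right for one direction."""
    if direction == "ltr":
        sides = {"start": "left", "end": "right"}
    else:
        sides = {"start": "right", "end": "left"}
    return _LOGICAL.sub(lambda m: f"{m.group(1)}-{sides[m.group(2)]}", css)

converted = to_physical("ul { -moz-padding-start: 40px }")
```

Passing `direction="rtl"` maps the same rule to `padding-right`, which is the case a fixed `-left` rewrite would get wrong.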
13:33
<annevk3>
rtl people could claim you're biased
13:35
<MikeSmith>
annevk3: hmm, so maybe I'll just keep it -vendor-padding-start , with "vendor" in ital
13:36
<MikeSmith>
hmm, yeah, that seems better
13:36
<BenMillard>
annevk3 & MikeSmith, the document is written in English. :)
13:37
<annevk3>
BenMillard, what it describes should still be sort of i18n compatible
13:37
<annevk3>
BenMillard, just like we care about accessibility and such :p
13:37
<annevk3>
;)
13:37
<MikeSmith>
annevk3: btw, what about "-appearance" ? any move to standardize that?
13:37
<annevk3>
that's in a standard, but the standard is not that clear
13:37
<annevk3>
css3-ui
13:40
<MikeSmith>
OK
13:40
<BenMillard>
annevk3, versions which are translated into languages with other writing directions could translate the CSS samples accordingly.
13:40
<annevk3>
the CSS samples are UA default style sheet rules...
13:40
<hsivonen>
Lachy: thanks for the minimized test case
13:41
<annevk3>
so not really samples
13:42
<hsivonen>
hmm. that's a cross-language bug
13:42
<hsivonen>
fortunately
13:43
<BenMillard>
annevk3, I'm not sure that distinction is helpful when a reader sees something which looks like CSS sample CSS but which uses properties they don't know about...but it's Mike's call. :)
13:43
<BenMillard>
s/CSS sample CSS/a CSS sample/
13:45
<annevk3>
it's not a sample
13:56
<annevk3>
"Look ma! No namespace declarations!" :p
13:56
<annevk3>
-- http://hsivonen.iki.fi/test/svg-and-mathml-in-html.html
14:02
<MikeSmith>
hmm, looking for a default Opera stylesheet but not finding one
14:02
<annevk3>
there's not really such a thing
14:02
<MikeSmith>
ah, OK
14:02
<MikeSmith>
I see other stuff in /usr/share/opera/styles
15:23
<annevk3>
"The net result of the 3.0 generalizations is that Python 3.0 runs the pystone benchmark around 10% slower than Python 2.5. Most likely the biggest cause is the removal of special-casing for small integers. There’s room for improvement, but it will happen after 3.0 is released!"
15:23
<annevk3>
guess we better not port html5lib just yet
15:23
<annevk3>
-- http://docs.python.org/dev/3.0/whatsnew/3.0.html
15:24
<rubys>
unicode is much better in python 3
15:25
<annevk3>
sure
15:25
<rubys>
in any case, I'd like to see hsivonen's work be used to bootstrap a C/C++ implementation with a Python binding
15:25
annevk3
doesn't like the new print either
15:25
<annevk3>
I'd love that too
15:25
<rubys>
yea, the new print sucks, but isn't a big deal
15:25
<rubys>
I mostly use print for debugging
15:26
jgraham
is very happy about better unicode handling
15:27
<jgraham>
although html5lib has some bugs that only occur on UCS2 or UCS4 builds
15:27
<tthorsen>
I think the new print looks ok, but I haven't tried it. What's the problem with it?
15:27
<rubys>
parens are required
15:27
tthorsen
likes parens
15:28
<jgraham>
I have a hard time complaining about two extra characters
15:28
<jgraham>
I guess it may be a small usability loss
15:28
<annevk3>
I don't :) but rubys is right, it's not crucial
15:29
<rubys>
In ruby, p is a function that can be used without parens, and equates roughly to python's print(repr(...)). Very handy for debugging.
15:29
<jgraham>
rubys: Can't most ruby functions be used without parens?
15:29
<rubys>
yes
15:30
jgraham
generally likes parens around functions
15:31
<jgraham>
Or at least I found it weird when I tried Haskell. Although that could be other things too
15:31
<rubys>
the one case where it is handy is functions that return None (or nil, or void or whatever) can be used as statements.
15:41
<rubys>
hsivonen: take a look at http://rails.intertwingly.net/blog/index.html using your html5 parser enhanced minefield
15:42
<rubys>
The whatwg logo is borked
16:00
<hsivonen>
rubys: That bug has been fixed on trunk. It was a layout layer problem.
16:00
<rubys>
cool
16:01
<hsivonen>
rubys: https://bugzilla.mozilla.org/show_bug.cgi?id=459817
16:03
<rubys>
oh, cool, I'm the testcase! :-)
16:07
<Philip`>
Python 2.x lets you say "print x," to prevent it printing a newline at the end; can you do something similar in Python 3.0 if print is using standard function call syntax instead?
16:08
<tthorsen>
print(x, end=""), I think
16:09
<Philip`>
That's quite horrid
16:09
Philip`
prefers it when programming languages are updated to make it easier to write things, not harder
16:09
<rubys>
http://docs.python.org/dev/3.0/whatsnew/3.0.html#print-is-a-function
16:12
gsnedders
should probably care about supporting Python 3 now
16:12
<gsnedders>
does html5lib work at all?
16:13
<jgraham>
gsnedders: No
16:13
<tthorsen>
Philip`: I did not know about the "print x," in 2.X. That's pretty weird syntax - even if it is easy to write. I think the new syntax is easier to understand for people who are not Python wizards
16:13
<jgraham>
That is pretty low on my priority list though, after "stop hundreds of testcases failing"
16:16
jgraham
made a start on the ferry but has some way to go
16:16
<jgraham>
Like before it was over a thousand testcases
16:17
<Philip`>
tthorsen: It isn't any easier for those people to write - they'll do "print('hello')" and not want a newline, then they'll see that it adds a newline, and then they'll either give up or they'll have to look in the documentation, which will then tell them "write 'print "hello",' to suppress the newline" or "write 'print("hello", end="")' to suppress the newline"
16:17
<Philip`>
so it's equally a pain both ways
16:17
<jgraham>
sys.stdout.write("hello") works I guess
16:18
<rubys>
only if you import sys :-)
16:18
<Philip`>
but with the new way, when you're feeling very frustrated at the lack of a decent debugger and are trying to sprinkle a load of print commands throughout your code to work out what's going on, you'll have to type more characters on every line and you will get even more frustrated :-)
16:19
<tthorsen>
yeah. I was thinking about reading code that others had written. If I saw "print x," somewhere I would probably not even notice the ',' and I would certainly not guess that it was for suppressing the newline
16:20
<jgraham>
import sys; p = sys.stdout.write
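For comparison, the Python 3 options mentioned so far look roughly like this. The `emit` helper and the in-memory buffer are illustrative assumptions, used only so the output can be inspected.

```python
import io
import sys

def emit(text, out=None):
    """Print without the automatic trailing newline (Python 3 syntax)."""
    print(text, end="", file=out)  # file=None falls back to sys.stdout

# Short alias for debugging output, as in the one-liner above.
p = sys.stdout.write

buffer = io.StringIO()
emit("hello", buffer)
emit(" world", buffer)
buffer.write("!")  # the file-object equivalent of sys.stdout.write
```

After these calls `buffer.getvalue()` is a single line with no newline anywhere, which is what the Python 2 `print x,` form was used to avoid.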
16:20
<tthorsen>
If I was in charge, print would not output newlines at all. I don't mind putting a \n at the end of my strings.
16:20
<tthorsen>
s/newlines/automatic newlines/
16:21
<Philip`>
The rationale at http://www.python.org/dev/peps/pep-3105/ seems to be concerned entirely with relatively uncommon advanced requirements, and totally ignores the basic "I just want to print some text" use case
16:22
<Philip`>
tthorsen: Recent versions of Perl attempt to solve that problem by having 'print' which does no newlines, and 'say' which adds one automatically
16:22
<rubys>
That's Python for you: one size fits all. If you want a language which cottons to use cases (at the expense of consistency), consider Perl or Ruby.
16:23
<annevk3>
I very much like significant whitespace though
16:25
<Philip`>
I think the main reason I use Python nowadays is ctypes, because I can't find anything that works that well in Perl
16:25
<Philip`>
and if I'm writing a little bit of code that requires ctypes, I might as well write the rest of the program in Python because that's easier than mixing languages
16:25
<tthorsen>
annevk3: I'd actually prefer c-style { and }. The whitespace blocks are okay until you want to refactor a 3000 line program. Moving and reindenting big blocks of code feels dangerous when the indentation actually matters.
16:25
<annevk3>
I don't write 3000 line programs
16:26
<Philip`>
$ wc -l html5parser.py
16:26
<Philip`>
2169 html5parser.py
16:26
<Philip`>
It's close :-p
16:27
hsivonen
wonders if PyDev supports all the eclipse refactoring goodness for Python
18:21
<Philip`>
When html5lib's sanitizer says things like re.match("^(\s*[-\w]+\s*:\s*[^:;]*(;|$))*$", style) isn't that, like, totally wrong, because it needs escaped backslashes or needs to be r"..."?
18:22
<Philip`>
Oh, maybe not - it seems Python treats "\w\s" identically to "\\w\\s"
18:23
<Philip`>
(Seems fragile to rely on that, though)
18:25
<Dashiva>
I never saw a reason not to use raw strings for regexps
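A small check of the point about raw strings, using the sanitizer pattern quoted above. The comparison is between a raw string and the fully escaped form; newer Python versions emit a warning for unrecognised escapes such as `\w` in ordinary strings, which is one more reason to prefer the raw form.

```python
import re

raw = r"^(\s*[-\w]+\s*:\s*[^:;]*(;|$))*$"
escaped = "^(\\s*[-\\w]+\\s*:\\s*[^:;]*(;|$))*$"

# Both spellings produce the same pattern text.
same_pattern = raw == escaped

# Where relying on pass-through escapes goes wrong: \b is a real escape.
backspace = "\b"       # one character, U+0008
word_boundary = r"\b"  # two characters, backslash + 'b'

matches = re.match(raw, "color: red;") is not None
```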
18:32
<Philip`>
For today's exciting "you have two problems" quiz: How do you make re.match('^(\s*[-\w]+\s*:\s*[^:;]*(;|$))*$', 'x: y; ' * 20) not take a probably-exponential amount of time to run?
18:34
<Philip`>
Bonus points if the solution is still adequately fast when the input is ('x: y; ' * 20)+'x'
18:34
<Dashiva>
Could you put \s inside the [^:;]?
18:35
<Dashiva>
Not sure what you're matching exactly
18:36
<Philip`>
Dashiva: It's trying to sanitize style attribute values
18:37
<Philip`>
so it's matching "x: y; x: y" and "x: y; x: y;" (with various whitespace, and limitations on 'x' and 'y')
18:37
<Philip`>
and if it fails to match when it reaches the end, it apparently does a huge amount of backtracking
18:38
<Dashiva>
This part here seems like the problem as far as I can tell: \s*[^:;]*
18:38
<Dashiva>
Otherwise there isn't much to backtrack on
18:39
<Dashiva>
Although that wouldn't stop it from trying, would it...
18:40
<Philip`>
Hmm, that seems a good point - that's where it's going to get the exponential number of matches from
18:41
<Dashiva>
Could you just remove the \s part? Since it'd be part of the class anyway
18:42
<Philip`>
Yep, that's what I tried and it seems to make it go fast
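The change discussed here, sketched in Python. Dropping the `\s*` in front of `[^:;]*` does not change what is accepted, since whitespace is already inside that character class, but it removes the two-way split that gets multiplied on every repetition. The names `ORIGINAL` and `FIXED` are just labels for this example.

```python
import re

# Pattern as quoted above: whitespace after ':' can be consumed by either
# \s* or [^:;]*, so a failing input is retried in exponentially many ways.
ORIGINAL = re.compile(r"^(\s*[-\w]+\s*:\s*[^:;]*(;|$))*$")

# Same language, without the overlapping \s*.
FIXED = re.compile(r"^(\s*[-\w]+\s*:[^:;]*(;|$))*$")

ok_without_semicolon = FIXED.match("x: y; x: y") is not None
ok_with_semicolon = FIXED.match("x: y; x: y;") is not None

# The inputs from the quiz: neither is a valid declaration list, and the
# fixed pattern rejects both without the long backtracking search.
rejects_quiz_input = FIXED.match("x: y; " * 20) is None
rejects_bonus_input = FIXED.match(("x: y; " * 20) + "x") is None
```

The original pattern is defined only for comparison on short inputs; running it on the 20-repetition strings is what triggers the slow case.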
18:43
<hsivonen>
http://delicious.com/url/d01372fa73da4044019a8ce733974a5e
18:43
<hsivonen>
tagged lol, omg, wtf
18:45
<MikeSmith>
"it is becoming increasingly difficult to distinguish W3C specs from Onion articles"
18:47
<jcranmer>
Philip`: regular expressions are not NP-hard
18:47
<jcranmer>
well, PCRE are
18:50
<Philip`>
jcranmer: That doesn't stop implementations sometimes taking exponential time to execute them
18:51
<jcranmer>
it depends on whether or not you want to refer to subgroups or just match
18:52
<Philip`>
jcranmer: Just matching, but in an implementation that does backtracking rather than compiling to a DFA or whatever silly thing computer scientists say you should do
18:53
<jcranmer>
nondeterministic finite automaton implicitly constructing the DFA
18:59
<jcranmer>
Philip`: simple way to do it is this
19:00
<jcranmer>
get rid of the outermost * and just match multiple times :-)
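One way to read that suggestion, as a sketch: match a single declaration at a time and loop in Python instead of in the regex. `DECLARATION` and `is_simple_style` are hypothetical names, and the single-declaration pattern here leaves out the redundant `\s*` after the colon.

```python
import re

# One "name: value" declaration, ended by ';' or by the end of the string.
DECLARATION = re.compile(r"\s*[-\w]+\s*:[^:;]*(?:;|$)")

def is_simple_style(style):
    """True if style is a sequence of simple declarations and nothing else."""
    pos = 0
    while pos < len(style):
        m = DECLARATION.match(style, pos)
        if m is None:
            return False  # stop at the first bad declaration, no backtracking
        pos = m.end()     # always advances: a match has at least one name char
    return True

accepted = is_simple_style("x: y; x: y;")
rejected = is_simple_style(("x: y; " * 20) + "x")
```

Because each declaration is matched once and never revisited, a failure near the end of a long value costs one failed match rather than a search over all earlier repetitions.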
20:17
<Philip`>
Hmm, Python's regexp engine is dumber than Perl's
20:18
<Philip`>
re.match('^(a|a)*$', ('a' * 21) + 'b') takes forever, whereas (("a" x 1000)."b") =~ /^(a|aa)*$/ is instantaneous
20:18
<Philip`>
Uh
20:18
<Philip`>
I meant (("a" x 1000)."b") =~ /^(a|a)*$/
20:20
<Dashiva>
That's because perl cheats
20:21
<Philip`>
"Cheat" is just a term that means your opponent is being cleverer than you
20:48
<gsnedders>
Dashiva: Then all optimizations are cheats
20:48
<Dashiva>
Not all of them, just the ones that are cheats
20:49
<Hixie>
shepazu: specifically number 2 in http://lists.w3.org/Archives/Public/public-html/2008Oct/0127.html
20:50
<Hixie>
shepazu: (that e-mail is sorted in the order of what would most help html5, from most helpful to least helpful)
20:50
<Philip`>
Hixie: Did you mean those last two lines in #html-wg?
20:50
<Hixie>
er yes
20:50
<Hixie>
thanks
20:52
<annevk3>
rubys, if Decimal is serialized to a string, does it include the m at the end?
20:52
annevk3
wonders if there's some way to detect the difference between a decimal and float
20:56
<annevk3>
some example in http://intertwingly.net/stories/2008/09/20/estest.html suggests not
21:04
<rubys>
annevk3: the current thinking is to *not* serialize it with an m
21:05
<rubys>
annevk3: you there?
21:05
<annevk3>
yeah
21:05
<annevk3>
had some issues
21:06
<rubys>
just wondering why you want to tell the difference between decimal and number?
21:06
<annevk3>
seems it would be annoying if you ever wanted to extend e.g. JSON
21:06
<rubys>
JSON is very specific: 3.1m would be a syntax error
21:06
<annevk3>
hence "extend"
21:06
<rubys>
extending JSON would be very hard given that there are a number of implementations in a number of languages.
21:07
<annevk3>
but maybe it's ok, I haven't really thought long about it
21:07
<rubys>
the thinking is that perhaps someday there would be a "use decimal" and the JSON parser would produce decimal quantities when parsing JSON
21:07
<annevk3>
you'd think that'd be true for XML too, and yet they shipped XML 1.0 5ed a few days ago
21:07
<Philip`>
annevk3: Just write it as a JSON string, and the consumer can decide that they want to parse it as a (decimal) number
21:08
<rubys>
Philip`: +1
21:08
<annevk3>
true
21:08
<rubys>
The question is: if somebody puts 1.1 in JSON, do they mean 1.100000000000000088817841970012523 (which is what you get with number) or 1.1 (which is what you get with decimal)?
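The difference can be seen with Python's `json` and `decimal` modules, used here only as a convenient stand-in for the ECMAScript proposal under discussion. The document and key names are made up for the example.

```python
import json
from decimal import Decimal

doc = '{"as_number": 1.1, "as_string": "1.1"}'

# Default parsing: 1.1 becomes a binary double.
parsed = json.loads(doc)
binary_value = Decimal(parsed["as_number"])  # exact value of that double

# String approach: the consumer decides to read it as a decimal.
string_value = Decimal(parsed["as_string"])

# Parser-level switch, loosely analogous to the "use decimal" idea.
parsed_decimal = json.loads(doc, parse_float=Decimal)
decimal_value = parsed_decimal["as_number"]
```

`binary_value` begins with the long digit string quoted above, while `string_value` and `decimal_value` are both exactly 1.1.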
21:09
<Hixie>
annevk3: JSON has no defined error handling, so extending it would be a huge amount of work
21:10
<annevk3>
and yet people want it in all kinds of APIs
21:28
<annevk3>
Hixie, http://www.whatwg.org/specs/web-apps/current-work/#textFieldSelection never happened "When we define HTMLTextAreaElement and HTMLInputElement we will have to add the IDL given below to both of their IDLs."
21:36
<Hixie>
there's an XXX in both of those sections about it
21:37
<annevk3>
k
23:51
<olliej>
annevk3: congrats
23:57
<jwalden>
I ask <http://blogs.msdn.com/ie/archive/2008/12/03/compatibility-view-improvements-to-come-in-ie8.aspx#9173873>; the response is http://blogs.msdn.com/ie/archive/2008/12/03/compatibility-view-improvements-to-come-in-ie8.aspx#9175737
23:57
<jwalden>
absolutely ridiculous
23:58
<jwalden>
although it's not clear that's ms responding
23:58
<Hixie>
it's somewhat true, and a better situation than nothing being interoperable at all...
23:59
<Philip`>
jwalden: It seems clear that's not MS responding, because it doesn't say "[MS]" in their name