02:15
<MikeSmith>
zewt: fwiw, about the problem you mentioned the other day about the validator reporting errors about URLs with non-default ports -- the underlying problem will be going away soon, because I'm switching the URL checker in the validator away from the old Jena IRI checker we were using
02:16
<MikeSmith>
switching to using smola_'s Galimatias instead https://github.com/smola/galimatias
02:16
<MikeSmith>
which implements the whatwg URL spec
02:17
<MikeSmith>
so if you have objections to any of the error messages after that, you can blame either smola_ for his code or AnneVK for his spec :-)
07:20
<MikeSmith>
is anybody other than anne familiar with https://raw.github.com/w3c/web-platform-tests/master/url/urltestdata.txt?
07:20
<MikeSmith>
oh that came from webkit
08:23
<MikeSmith>
hmm is the fragment part of a URL allowed to contain spaces?
08:27
<MikeSmith>
not allowed but it appears that the parsing algorithm results in the fragment of the parsed URL retaining the space as-is
08:28
<MikeSmith>
or maybe not, and smola_ parser has a bug
08:45
<MikeSmith>
ok, http://url.spec.whatwg.org/#fragment-state : "3. utf-8 percent encode c using the simple encode set, and append the result to url's fragment.", where "The simple encode set are all code points less than U+0020 (i.e. excluding U+0020) and all code points greater than U+007E.", which doesn't include U+0032 SPACE so that behavior is expected
08:48
<Ms2ger>
Nono
08:48
<Ms2ger>
Space is 32 decimal, so U+0020
08:49
Ms2ger
poofs
08:49
<MikeSmith>
oh
08:49
<MikeSmith>
weird then
08:49
MikeSmith
looks back at smola_ code
09:20
<MikeSmith>
nm
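(For the record, a sketch of a literal reading of the rule MikeSmith quoted, with Ms2ger's code-point correction applied: U+0020 SPACE satisfies neither bound of the simple encode set, so the fragment state copies it through unescaped, which matches the observed behaviour.)

```js
// A rough reading of the quoted "simple encode set" rule; space (U+0020)
// satisfies neither condition, so a space in a fragment is not percent-encoded.
const inSimpleEncodeSet = (codePoint) => codePoint < 0x0020 || codePoint > 0x007e;
console.log(inSimpleEncodeSet(0x0020)); // false — the space survives as-is
```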
10:16
<jgraham>
MikeSmith: I think anne subsumed urltestdata.txt into his tests
16:54
<sicking>
abarth: ping
16:54
<abarth>
hi
16:55
<sicking>
abarth: saw your comment about deprecating "bigger" features in the showModalDialog thread
16:55
<sicking>
abarth: something we're unsurprisingly interested in too :)
16:55
<abarth>
that's mostly a reference to XSLT
16:55
<sicking>
abarth: by "bigger", do you mean "used more often"?
16:55
<sicking>
abarth: hah
16:55
<abarth>
the current approach we're pursuing for XSLT is to make a JS polyfill using asm.js
16:55
<sicking>
XSLT was my baby
16:56
<abarth>
oh, sorry :(
16:56
<sicking>
it's ok, she had a good run
16:56
<abarth>
in principle, the polyfill should work in Firefox too
16:56
<sicking>
"it seemed like a good idea at the time"
16:56
<abarth>
yeah, it made sense in the past when JavaScript was slow
16:57
<sicking>
yeah
16:57
<sicking>
why polyfill with asm.js? Would you compile all of libxml?
16:57
<abarth>
are there specific things you're interested in deprecating? these sorts of thing are more likely to stick if we coordinate
16:57
<abarth>
yes
16:57
<abarth>
libxml + libxslt
16:57
<sicking>
wow!
16:58
<abarth>
then an extension to expose the API to web pages
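(For context, the in-page surface such an extension would need to expose is roughly the existing XSLTProcessor interface; `xsltDoc` and `xmlDoc` below are assumed to be already-parsed DOM documents.)

```js
// Sketch of the DOM XSLT API a polyfill/extension would have to cover;
// xsltDoc and xmlDoc are assumed pre-parsed Documents.
const processor = new XSLTProcessor();
processor.importStylesheet(xsltDoc);                          // compile the stylesheet
const frag = processor.transformToFragment(xmlDoc, document); // DOM in, DOM out
document.body.appendChild(frag);
```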
16:58
<sicking>
why not just use the DOM?
16:58
<sicking>
and do a DOM->DOM transformation
16:58
<abarth>
I don't want to re-implement XSLT in JavaScript
16:58
<sicking>
ok
16:58
<abarth>
there are several people who've tried that
16:58
<abarth>
and their things sort of, kind of work
16:58
<abarth>
I also tried a Java one
16:58
<abarth>
compiled it to JS using GWT
16:58
<sicking>
disable-output-escaping is the big thing you'd lose
16:59
<abarth>
and that worked for about half the sites
16:59
<abarth>
but it had some bugs
16:59
<abarth>
i fixed some of the bugs, but there were more bugs
16:59
<abarth>
so I got sad and decided we needed to use the libxslt implementation
16:59
<sicking>
heh
16:59
<sicking>
makes sense
17:00
<sicking>
abarth: showModalDialog is definitely the big one that would be exciting to get rid of. XSLT is an interesting one too, though if we polyfill it's not really "getting rid of"
17:00
<abarth>
the idea is that we would put the polyfill in the extension gallery
17:00
<abarth>
and not ship it with the browser
17:00
<sicking>
(also, my heart does cry a little getting rid of XSLT)
17:01
<abarth>
if people wanted to use it, they could install the extension
17:01
<abarth>
also, people who were passionate about XSLT could fork our version and improve it
17:01
<jgraham>
Hmm
17:01
<sicking>
i see
17:01
<sicking>
interesting
17:01
<jgraham>
I thought there was enough of the public internet using XSLT that that wasn't viable
17:02
<abarth>
i'm hopeful we can solve that problem by making the extension discoverable
17:02
<sicking>
jgraham: the numbers that chrome is collecting look really low
17:02
<sicking>
jgraham: much lower than showmodaldialog
17:02
<jgraham>
A few years ago it was enough that Opera felt the need to implement it at least
17:02
<abarth>
e.g., an infobar that says "this page is using an old API, click here to install a compatibility shim"
17:02
<jgraham>
Maybe usage has declined
17:02
<sicking>
don't know
17:02
<sicking>
i'd definitely want to get Gecko specific stats
17:03
<sicking>
anyhow
17:03
<sicking>
abarth: there's a lot of interesting APIs on the Chrome usage stats page
17:03
<abarth>
http://www.chromestatus.com/metrics/feature/timeline/popularity/79
17:03
<abarth>
http://www.chromestatus.com/metrics/feature/timeline/popularity/78
17:03
<abarth>
the XSLT thing is a bit of a dream still. we're still working on the technical part
17:03
<abarth>
sicking: those are just whatever people happened to add metrics for
17:04
<abarth>
sicking: I wouldn't read into it too much
17:04
<sicking>
abarth: <isindex> for example. And document.all()
17:04
<abarth>
we've removed <isindex>
17:04
<sicking>
abarth: oh? Also <input name=isindex>?
17:05
<abarth>
data:text/html,<input name=isindex>
17:05
<abarth>
yep
17:06
<sicking>
abarth: neat
17:06
<sicking>
abarth: when you say "don't read too much into it", does that mean you don't think the numbers are accurate?
17:07
<sicking>
abarth: or does it mean that there's many more things that might have low usage but you don't gather stats on it
17:07
<sicking>
I expected the latter, not the former
17:08
<abarth>
the latter
17:08
<sicking>
ok
17:08
<sicking>
cool
17:08
<sicking>
abarth: other things that I'd love to see stats on (we're building the same thing for gecko, so i can get it myself soon) are the document.domain setter and namespaced attributes (modulo the SVG ones)
17:10
<abarth>
yeah, those would be interesting
17:11
<sicking>
in part, getting rid of document.domain setting could allow a more narrow process infrastructure
17:11
<sicking>
i.e. process-per-origin rather than process-per-eTLD+1
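(A minimal sketch of what the setter does, with made-up hostnames, and why it blocks per-origin process isolation:)

```js
// Run on both https://a.example.com/ and https://b.example.com/ (illustrative hosts):
document.domain = 'example.com';
// Once both pages execute this line, each may reach into the other's DOM via a
// window reference, so the browser can no longer safely give a.example.com and
// b.example.com separate per-origin processes.
```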
17:12
<sicking>
process separation in particular is something we have to figure out to make webapps happen I think
17:12
<jgraham>
Pretty sure that document.domain setting is used every-f—ing-where
17:13
<abarth>
jgraham: it used to be used by facebook. not sure if they still use it
17:13
<sicking>
yeah
17:13
<jgraham>
Also Yahoo
17:17
<sicking>
if it's just a couple of sites that use it a lot, then that gives some hope of getting rid of it
17:17
<sicking>
by evangelizing said sites
17:18
<jgraham>
That is a very optimistic view
17:18
<sicking>
or even by whitelisting them and removing it elsewhere, then evangelizing
17:18
<jgraham>
:)
17:18
<sicking>
i try to be :)
17:19
<sicking>
wow, SVG is used on over 9 percent of the web
17:19
<sicking>
that's amazing
17:20
<sicking>
hmm.. though "SVGSVGElementInDocument" says only 0.1%. I'm not sure what that means
17:20
<Ms2ger>
Maybe inline vs <img>?
17:21
<sicking>
ooh
17:22
<sicking>
both inline and <img> count as "in document". But Modernizr (presumably feature detection) doesn't
17:22
<abarth>
sicking: yes, we've tried a couple times to quantify SVG usage
17:22
<abarth>
i think modernizr was tricking us a few times
17:22
<sicking>
so i guess pages use libraries that check if svg is supported, but then never actually use svg in any form?
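(Roughly the kind of check such libraries perform; this is in the Modernizr style rather than its exact code. It creates an SVG element, which can trip a creation-based usage counter, without ever putting SVG in the document.)

```js
// Illustrative feature-detection: the created element is thrown away,
// so no SVG ever appears in the document.
const svgSupported = !!document.createElementNS &&
  !!document.createElementNS('http://www.w3.org/2000/svg', 'svg').createSVGRect;
```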
17:23
<abarth>
yes
17:23
<sicking>
makes sense
17:23
<abarth>
we can check with pdr to be sure
17:23
<abarth>
we had the same trouble with webkitNotifications
17:23
<abarth>
where people would touch the property but not actually use it
17:25
<sicking>
abarth: i'm surprised that some of the properties see such big changes in use over such a short period of time. Makes me worried about trusting the data
17:25
<sicking>
for example http://www.chromestatus.com/metrics/feature/timeline/popularity/211
17:25
<abarth>
which properties?
17:26
<sicking>
http://www.chromestatus.com/metrics/feature/timeline/popularity/211
17:26
<abarth>
that's probably the metric rolling out into the stable channel
17:26
<sicking>
aah
17:26
<abarth>
the graphs aren't well normalized
17:26
<abarth>
we have more detailed graphs internally that slice and dice by version and platform
17:27
<abarth>
if you have specific questions, I can dig into those for you
17:28
<sicking>
do those internal graphs allow you to get stats per website?
17:28
<sicking>
i.e. could you see if document.domain is only used by facebook/yahoo for example?
17:28
<sicking>
(i'm not really hoping that we can get rid of document.domain anytime soon)
17:29
<abarth>
no, we don't have per-site data
17:29
<abarth>
it's aggregated by page views
17:30
<sicking>
ok
17:31
<sicking>
another thing that would be lovely to get rid of is GlobalScopePolluter. But I think we'd only have a chance to do so on non-quirks pages. Or in ES6 strict mode or some such
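(What the GlobalScopePolluter does, in short: elements with an id or a name become resolvable as global variables on the window.)

```js
// Given <img id="logo" src="logo.png"> somewhere in the markup:
console.log(window.logo);                               // the <img> element
console.log(logo === document.getElementById('logo'));  // true — bare globals resolve to it
```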
17:32
<Ms2ger>
Remember how we only supported it in quirks and Chrome's demos pushed us to enable it everywhere?
17:35
<Domenic_>
that was IE's demos right?
17:35
<sicking>
Ms2ger: Right, that was MS's demos. And it wasn't intentional I bet
17:35
<Domenic_>
it seemed intentional... trying to create something that failed in other browsers...
17:36
<sicking>
Domenic_: i don't think it was. They just didn't care about testing in other browsers
17:36
<sicking>
which arguably is equivalent to making it not work in other browsers
17:36
<sicking>
but I don't think they were intentional about breaking in other browsers
17:36
<sicking>
anyhow, that's just guessing
17:37
<Domenic_>
it was a sad time, i am still somewhat bitter
17:37
<sicking>
Ms2ger: and just because we couldn't get consensus to do something good about GlobalScopePolluter back then, doesn't mean that we can't get it now
17:38
<Ms2ger>
Oh, sure
17:38
<sicking>
Domenic_: actually, since I have you here...
17:38
<sicking>
Domenic_: I had one more thought on binary Streams
17:39
<Domenic_>
sicking: yes?
17:39
<sicking>
Domenic_: have you thought about performance around Streams that shuffle lots of data? In particular about how many times an implementation will have to copy data between buffers?
17:39
<Domenic_>
sicking: that is our primary concern :P
17:40
<Domenic_>
there should be no copying in any in-process use cases
17:40
<sicking>
Domenic_: I don't think that is possible in the current API
17:40
<Domenic_>
sicking: it is possible, as shown by the implementation that does so
17:40
<sicking>
hmm...
17:41
<Domenic_>
the buffers can be implemented as queues (i.e. with pointers)
17:42
<Domenic_>
so the location of the data chunks (e.g. ArrayBuffer backing stores) can stay the same all the time
17:42
<sicking>
Domenic_: So say that I have a Stream representing reading from a file
17:43
<sicking>
Domenic_: well.. if you just keep pointers to ArrayBuffers, then that means that you'll end up with an array of ArrayBuffers when the data is asked for. Not a single ArrayBuffer
17:43
<Domenic_>
yes
17:44
<Domenic_>
you ask for one chunk at a time
17:44
<Domenic_>
read() returns a single chunk, in whatever format---object, arraybuffer, string, etc.
17:44
<Domenic_>
(whatever the stream wants to give you)
17:45
<Domenic_>
if you want to concatenate them for some reason (you should never need to do this really...) then you'd do so yourself, and pay the cost.
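(If a consumer really does want one contiguous result, the "do it yourself, pay the cost" step looks roughly like this; the chunks are assumed to be ArrayBuffers.)

```js
// Drain the chunks you've read and copy them into one contiguous ArrayBuffer.
function concatChunks(chunks) {
  const total = chunks.reduce((n, c) => n + c.byteLength, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const c of chunks) {
    out.set(new Uint8Array(c), offset); // this copy is the cost being paid
    offset += c.byteLength;
  }
  return out.buffer;
}
```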
17:45
<sicking>
Domenic_: So read() doesn't return all data read so far? It just returns the first buffer of all data read so far (API-wise that's the same, but implementation and performance-wise they are different)
17:45
<Domenic_>
while the stream is readable, read() returns the oldest unread chunk
17:45
<Domenic_>
you call read() repeatedly until the stream is no longer readable
17:46
<sicking>
Domenic_: so that's a "yes"?
17:46
<Domenic_>
sicking: I don't know what "first buffer" means, but "probably"?
17:47
<Domenic_>
sicking: https://github.com/whatwg/streams/blob/master/Examples.md#usage
17:48
<sicking>
Domenic_: well.. if I have a Stream that represents reading from a file, the way I'd probably implement that is by having some background thread allocate a buffer, issue a read() call into that buffer, and then send the buffer to the JS thread. And then do that in a loop until I've read the whole file.
17:48
<sicking>
Domenic_: how fast I'd be sending buffers is a function of the OS IO performance at the time
17:48
<sicking>
Domenic_: i.e. i might be creating buffers faster than JS is consuming them
17:48
<Domenic_>
sicking: ok, by buffers here you mean "ArrayBuffers" (or their C++ backing stores), not "stream buffers". I guess that's confusing.
17:49
<sicking>
Domenic_: I mean their C++ backing stores
17:49
<sicking>
Domenic_: which is essentially the same as an ArrayBuffer yes
17:49
<Domenic_>
streams each have a single buffer containing the chunks available for reading, so that's the confusion
17:49
<Domenic_>
but ok makes sense
17:50
<sicking>
Domenic_: so when the call comes to read(), I might be sitting on a long list of ArrayBuffers
17:50
<Domenic_>
you shouldn't be, if you are respecting the backpressure
17:51
<Domenic_>
the stream asks the underlying source (i.e. your C++) for a certain number of bytes
17:51
<Domenic_>
up to a high water mark
17:51
<Domenic_>
e.g. 16 KB
17:51
<Domenic_>
so that is supposed to be the maximum stored in memory at any given time
17:51
<sicking>
Domenic_: We want to enable paralell IO and processing, no?
17:51
<Domenic_>
then it waits for the consumer to start draining that before asking you to fill back up to the HWM
17:52
<Domenic_>
sure, but the point of streams is to limit the memory used
17:52
<Domenic_>
so say 16 KB at a time
17:52
<sicking>
Domenic_: sure, I'm not saying that we'd consume unlimited amounts of space.
17:52
<Domenic_>
ok
17:52
<Hixie>
abarth: didn't realise you'd gotten rid of name=isindex, on the thread it was only the parser thing that people were talking about
17:52
<sicking>
Domenic_: but reading 16KB into a single buffer, and then stopping IO until that buffer has been requested by the page, doesn't seem good performance-wise
17:53
<abarth>
Hixie: maybe I don't understand what name=isindex does
17:53
<abarth>
Hixie: i think we only changed the parser
17:53
<Domenic_>
sicking: i was told the most natural (backing-)buffer size for most OSes was 1KB
17:53
<sicking>
abarth: no, looks like you got rid of more than the parser
17:53
<Hixie>
abarth: does look like name=isindex is gone too
17:53
<abarth>
ok, i didn't review the actual code change
17:53
<Domenic_>
sicking: well, what if you are uploading that to a server over a slow connection?
17:53
<sicking>
Domenic_: that might be true
17:54
<Domenic_>
sicking: maybe in browsers a higher default HWM makes sense. In node by default each stream should take only 16 KB max of memory
17:54
<Domenic_>
sicking: but maybe in browsers we anticipate fewer streams open at a given time so it should be higher
17:54
<sicking>
Domenic_: i agree that backpressure is an important topic. But it also seems important to support reading from a file at maximum speed, no?
17:54
<Domenic_>
sicking: you should read from the file exactly as fast as the consumer is able to process data
17:55
<sicking>
Domenic_: that's impossible
17:55
<Domenic_>
sicking: if the consumer processes data synchronously, then there will be no difference between a HWM of 16KB and a HWM of 0 KB
17:55
<Domenic_>
because in both cases the data never accumulates
17:55
<Domenic_>
so the limit is never hit
17:55
<Hixie>
abarth: hey while you're here, quick TLS question unrelated to HTML. If I have two servers who talk to each other over TLS, can the "client" authenticate with a server certificate to prove its host name to the "server"?
17:55
<sicking>
Domenic_: i'm still not understanding how you envision an implementation should work
17:55
<Domenic_>
sicking: exactly like the one we already have does work :)
17:56
<abarth>
Hixie: I think the client and server certs are different, but I'm not sure
17:56
<sicking>
Domenic_: then I don't understand how the one you already have does work
17:56
<Hixie>
abarth: ah, bummer
17:56
<Domenic_>
sicking: OK, I will find the source code for you
17:56
<abarth>
Hixie: you should check with someone who is sure though
17:57
<Hixie>
abarth: who would know better than you?
17:57
<abarth>
I'd ask agl
17:57
<sicking>
Domenic_: do you know how it works?
17:57
<Hixie>
abarth: k, thanks
17:57
<Domenic_>
sicking: yes, but it's easier to point to code
17:57
<Domenic_>
sicking: https://github.com/joyent/node/blob/master/lib/fs.js#L1518
17:57
<Domenic_>
when the consumer asks for data, you read it and give it back to them
17:58
<Domenic_>
_read is called by the stream implementation when the stream's buffer is below the HWM.
17:59
<sicking>
Domenic_: so you don't issue a filesystem read until someone calls read()?
17:59
<Domenic_>
so this ensures the stream's buffer is always full up to the HWM
17:59
<Domenic_>
sicking: you preemptively fill the buffer up to the HWM, but once the HWM is reached you stop filling until they call read() to make space in the buffer.
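(A sketch of that fill-to-the-high-water-mark behaviour in the style of node's stream.Readable, which the fs.js link above points at; the 1 KB read size and the error handling are illustrative choices, not what node's fs.ReadStream literally does.)

```js
const { Readable } = require('stream');
const fs = require('fs');

class FileSource extends Readable {
  constructor(path) {
    super({ highWaterMark: 16 * 1024 }); // stop buffering once ~16 KB is queued
    this.fd = fs.openSync(path, 'r');
  }
  // The stream machinery calls _read whenever its internal buffer has room
  // below the high-water mark, and stops calling it while the buffer is full;
  // it resumes once a consumer read()s some data off.
  _read(size) {
    const buf = Buffer.alloc(Math.min(size, 1024)); // read ~1 KB at a time
    fs.read(this.fd, buf, 0, buf.length, null, (err, bytesRead) => {
      if (err) return this.emit('error', err);
      this.push(bytesRead > 0 ? buf.slice(0, bytesRead) : null); // null = end of file
    });
  }
}
```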
18:00
<sicking>
Domenic_: so when you issue a filesystem read, you issue it for the full HWM size?
18:01
<Domenic_>
sicking: you should issue it for the amount of space in the buffer. Initially that will be full HWM size; after a single read() it will be [full HWM size - size of returned chunk]
18:01
<Domenic_>
after five reads it will be [full HWM size - 5 * size of returned chunk]
18:01
<sicking>
Domenic_: why wouldn't you return the full chunk?
18:02
<sicking>
Domenic_: wait
18:02
<Domenic_>
let's say the buffer has room for 16 chunks before hitting HWM
18:02
<sicking>
Domenic_: before we go further, lets be more explicit about what buffers we are talking about
18:02
<Domenic_>
yes, i am talking about the stream's buffer
18:02
<Domenic_>
which has room for e.g. 16 KB of data
18:02
<sicking>
is a "buffer" a continuous piece of memory?
18:02
<Domenic_>
i use "chunk" for a 1 KB chunk of data from the file
18:03
<sicking>
is a "chunk" a continuous piece of memory?
18:03
<Domenic_>
yes
18:03
<sicking>
so chunk is contiguous but buffer is not?
18:03
<Domenic_>
the stream's buffer is just a queue
18:03
<sicking>
a buffer is an array of up to 16 chunks?
18:03
<Domenic_>
yeah
18:03
<Domenic_>
conceptually
18:03
<sicking>
ok, cool
18:04
<Domenic_>
in node it looks like they use a contiguous chunk as a "pool" for the stream's buffer
18:04
<sicking>
ok, so the first thing you do is that you issue a 1KB read to the OS to read into the first chunk
18:05
<sicking>
and then you do that in a loop until you have 16 chunks
18:05
<Domenic_>
if i am understanding my sources correctly, yes :)
18:05
<Domenic_>
these sources who tell me reading 1 KB at a time is best
18:05
<sicking>
meanwhile, if read() is called, you return the first chunk and the first chunk only
18:06
<Domenic_>
yup
18:06
<sicking>
?
18:06
<sicking>
ok, so that's what I was asking for earlier. No matter how many chunks (sorry, i used "buffer" earlier) have been read, you just return the first one
18:07
<Domenic_>
ok, heh, sorry it took us so long to get there
18:07
<Domenic_>
yeah the buffer thing is confusing...
18:07
<Domenic_>
it's about "buffering" data, so i'm hesitant to rename it to "queue," but that might be less confusing...
18:08
<sicking>
often times "memory buffer" is used to describe a continuous block of memory that has been allocated
18:08
<sicking>
which is what got me confused
18:08
<sicking>
obviously there are many other types of buffers though
18:09
<Domenic_>
i will open an issue, if other people agree that it is confusing i am happy to rename
18:09
<sicking>
i don't really care, just explaining the terminology i'm used to
18:10
<sicking>
if these terms appear in the spec, then it'd be good to explain them
18:11
<Domenic_>
For sure
18:11
<sicking>
ok, i'll have to chat up some other people that know IO performance better than me to know if this is a good strategy
18:11
<Domenic_>
sounds good! feel free to put them in touch or open issues or whatnot.
18:12
<sicking>
one thing that is not possible in the current API, but that I think might be too non-JSy to worry about, is being able to reuse ArrayBuffer objects
18:12
<sicking>
so you end up with a bunch of churn
18:12
<sicking>
allocator churn
18:12
<sicking>
but i suspect that's fine
18:13
<sicking>
s/ArrayBuffer objects/ArrayBuffer backing store objects/
18:13
<Domenic_>
hmm yeah i had requests for that from a node person actually
18:13
<Domenic_>
.readInto(ab)
18:14
<sicking>
the tricky part is that you don't want to allow the page to have a reference to an ArrayBuffer that you are writing into on a background thread
18:14
<sicking>
so you have to mess around with transferring ArrayBuffers back and forth
18:15
<sicking>
which is messy
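(A sketch of the "transfer back and forth" dance being described: the main thread hands the buffer's backing store to an I/O worker, so the page can't touch it while the worker writes into it, and gets it back afterwards. `ioWorker`, the message shape, and `consume()` are made up.)

```js
const buf = new ArrayBuffer(16 * 1024);
ioWorker.postMessage({ cmd: 'fill', buffer: buf }, [buf]); // transferred: buf is now detached here
ioWorker.onmessage = (e) => consume(e.data.buffer);        // transferred back once the worker is done
```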
18:15
<Domenic_>
ah yeah makes sense
18:15
<Domenic_>
cf. web audio api?
18:15
<sicking>
But the API might be as simple as Stream.releaseBuffer(ab) after you're done with it
18:16
<sicking>
i haven't looked at webaudio
18:16
<sicking>
not sure what they do
18:17
<Domenic_>
well there was a whole issue with data races
18:17
<Domenic_>
which iirc was exactly that "writing into it on a background thread" thing
18:17
<sicking>
ah, right
18:18
<sicking>
ok, gotta head into office
18:18
<sicking>
i'll try to get some perf guys to look at this. Not sure if i'll be able to, but i'll try
18:19
<Domenic_>
awesome, thank you
18:19
<sicking>
Domenic_: oooh, now i see what I was looking for. Is it expected that readBytes() will cause memory copying?
18:20
<sicking>
if so, that makes sense
18:22
<Domenic_>
sicking: er, i think you are looking at the wrong spec?
18:22
<Domenic_>
sicking: https://github.com/whatwg/streams
18:43
<sicking>
Domenic_: oooh! This looks so much better!
18:43
<Domenic_>
sicking: ahaha yay! :D
18:44
<sicking>
Domenic_: does read() return a promise?
18:44
<Domenic_>
sicking: nope, it's synchronous
18:44
<sicking>
Domenic_: how do you know if you can read from it?
18:44
<sicking>
.state?
18:44
<Domenic_>
yeah exactly, state === "readable"
18:45
<sicking>
do you get a callback when state changes?
18:45
<sicking>
oh, wait()?
18:45
<Domenic_>
yeah, exactly
18:45
<sicking>
so you do while(state === readable) process(read()) ?
18:45
<Domenic_>
i bet with this in hand the examples at https://github.com/whatwg/streams/blob/master/Examples.md#readable-streams make more sense
18:46
<sicking>
man, i'm so happy all the encoding stuff is dropped
18:46
<Domenic_>
hahaha me too man
18:46
<sicking>
i wasn't looking forward to having that argument :)
18:52
<sicking>
Domenic_: so it feels like you need a lot of boilerplate to implement streamToConsole. I.e. something that essentially pipes the stream into a process() function
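(The kind of boilerplate being described, written against the state/read()/wait() shape discussed above; it assumes the states are "readable"/"waiting"/"closed"/"errored", that wait() returns a promise that settles when the state changes, and that process() is the consumer's own function.)

```js
async function streamToConsole(stream, process = console.log) {
  while (stream.state !== 'closed' && stream.state !== 'errored') {
    while (stream.state === 'readable') {
      process(stream.read()); // read() is synchronous and returns one chunk
    }
    if (stream.state === 'waiting') {
      await stream.wait();    // assumed: resolves once more data arrives
    }
  }
}
```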
18:54
<Domenic_>
sicking: somewhat agreed. the idea is it's a lower-level primitive and there will be lots of user-land stream-utils packages, or writable streams
18:54
<Domenic_>
interestingly in node it's not popular to simply stream something to a process function
18:55
<Domenic_>
normally it would be a writable or transform stream
18:55
<sicking>
Domenic_: interesting
18:56
<sicking>
Domenic_: a TransformStream which pipes all data through a mapping function might address the use cases
18:57
<Domenic_>
yeah, I think it's part of the non-minimal subset to have easy ways of creating properly-behaved transform streams.
18:57
<sicking>
Domenic_: though honestly, the cases I've needed the most are simply piping the data to disk or to a network connection
18:57
<sicking>
so no processing needed
18:57
<Domenic_>
right, in that case, writable streams :)
18:57
<sicking>
right
18:58
<sicking>
or simply allowing XHR.send() to take a ReadableStream
19:00
<sicking>
Domenic_: any thoughts on putBack(ab)? It's something that we've somewhat needed internally in Gecko
19:00
<Domenic_>
sicking: I am almost sure it is needed actually.
19:00
<Domenic_>
i was trying to get away with not needing it but it seems likely.
19:02
<Domenic_>
https://github.com/whatwg/streams/issues/3 the minimalists are arguing "do it yourself" but I think giving you the ability to push onto the stream's buffer, instead of maintaining your own, is much better.
19:02
<Domenic_>
More unclear is https://github.com/whatwg/streams/issues/74
19:02
<sicking>
the case we had was wanting to peek at the beginning of a stream and decide what to do with it (display or save-to-disk). Once we had determined that we wanted to do either, it was really annoying to have to deal with sending a <data we've pulled out of stream to peek, remaining stream> tuple to the chosen consumer, rather than just a simple stream
19:02
<Domenic_>
(which is related)
19:02
<sicking>
yeah, "do it yourself" has performance implications
19:02
<Domenic_>
i wonder about putBack vs. peek
19:03
<sicking>
that i don't know though
19:03
<sicking>
either might work
19:03
<sicking>
the tricky thing with peek is how to deal with a consumer that doesn't know how many bytes it needs to peek
19:04
<sicking>
you don't want it to have to do peek(10), then if that wasn't enough data peek(20) etc
19:04
<sicking>
but how do you create an API that allows you to "peek a little more, on top of what i previously peeked"
19:04
<Domenic_>
hmm yeah
19:04
<sicking>
potentially you could use a tee
19:04
<Domenic_>
is putBack better in that regard?
19:05
<sicking>
putback would let you do read() until you've got enough data, then do a putBack(array-of-all-data-I-read)
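(A sketch of that read-then-putBack pattern for peeking: read chunks until the sniffer has seen enough, push everything back, then hand the untouched stream to whichever consumer was chosen. putBack(), sniff, display() and saveToDisk() are all hypothetical.)

```js
function peekThenDispatch(stream, sniff, display, saveToDisk) {
  const peeked = [];
  while (stream.state === 'readable' && !sniff.hasSeenEnough(peeked)) {
    peeked.push(stream.read());
  }
  stream.putBack(peeked); // restore the stream as if nothing had been read
  return sniff.wantsDisplay(peeked) ? display(stream) : saveToDisk(stream);
}
```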
19:05
<Domenic_>
ah ok i see
19:05
<Domenic_>
that's a fairly solid argument
19:06
<sicking>
but potentially you could also do a tee, and then read from the tee
19:06
<sicking>
that leaves the original stream unchanged
19:06
<sicking>
i.e. you'd implement peek by tee-ing the stream
19:06
<sicking>
a tee might have performance overhead though. Depending on how it handles chunks and buffers
19:07
<sicking>
arguably, read-then-putback is also going to affect buffers to some extent
19:08
<Domenic_>
i am tempted to avoid tees if possible; they seem annoying and complex
19:08
<Domenic_>
(but of course necessary in some cases)
19:08
<sicking>
yeah
19:09
<sicking>
btw, if you go the putBack route, then it'd be great if you could transfer buffers into the stream when putting them back. Otherwise the stream has to copy them
19:09
<Domenic_>
why would it have to copy them?
19:10
<Domenic_>
it wouldn't need to modify them after they're put back
19:10
<Domenic_>
it would just keep it around for the next read() call
19:10
<sicking>
since ArrayBuffers are mutable. And you don't want whoever putBack the data to be able to mutate the data once it's semantically "in the stream"
19:10
<Domenic_>
mmm :-/
19:11
<Domenic_>
well but streams are not just ArrayBuffers. This is a general issue with any mutable objects you put back in the stream
19:11
<sicking>
yup
19:11
<sicking>
this is also an issue for peek
19:11
<sicking>
you don't want to enable someone to peek the start of the stream and then mutate the contents of the un-read() data
19:12
<sicking>
it's arguably (though yehuda might not agree) somewhat different with ArrayBuffers
19:12
<sicking>
I don't think of a Stream as an object stream of ArrayBuffer objects
19:12
<sicking>
I think of it as a stream of bytes
19:12
<sicking>
it seems surprising that you could mutate those bytes inside the stream
19:13
<sicking>
with a stream of objects you can't mutate which objects the stream is containing
19:13
<sicking>
but you can mutate the objects
19:13
<Domenic_>
it seems pretty awkward to think of it as a stream of bytes
19:13
<sicking>
really?
19:13
<Domenic_>
read() would then return a single byte
19:13
<sicking>
well, we do chunks for performance
19:14
<sicking>
but really what we're semantically representing is a stream of bytes
19:14
<Domenic_>
but that's not what the API is communicating :-S
19:14
<Domenic_>
besides, why bytes and not words or bits or disk sectors? :P
19:16
<sicking>
well.. there's the whole platform endianness debacle
19:16
<sicking>
but if we agree that it's unfortunate that ArrayBufferViews expose CPU endianness
19:16
<sicking>
then an ArrayBuffer contains bytes and not words
19:16
<sicking>
but yes, you can think of it as a stream of bits too
19:16
<sicking>
a stream of bits and a stream of bytes seem equivalent
19:17
<Domenic_>
i guess i am more concerned about how consumers interact with the API than what is semantically being represented
19:18
<sicking>
sure, but it'll affect API behavior
19:19
<sicking>
if you create a WritableStream/ReadableStream pair (can you?), would you expect that object identity of ArrayBuffer objects would remain?
19:20
<Domenic_>
what do you mean by a pair in this case?
19:21
<sicking>
can you create a WritableStream/ReadableStream pair such that anything that's written into the WritableStream appears in the ReadableStream?
19:21
<Domenic_>
sure, that's an identity transform stream
19:22
<sicking>
do you have to write JS code which pumps data between the two? Or can you create a pair of platform objects where that happens automatically?
19:24
<Domenic_>
you could create platform objects implemented in JS... but the JS code to pump data between the two would just be something like `new TransformStream(function (chunk) { return this.addToOutput(chunk); })` (non-final API)
19:26
<sicking>
is a TransformStream both a readablestream and a writablestream in one object?
19:26
<Domenic_>
it is an { input: WritableStream, output: WritableStream } object literal
19:26
<Domenic_>
not nominally type-checked, of course
19:27
<sicking>
do you mean { input: WritableStream, output: ReadableStream }?
19:27
<Domenic_>
yes, sorry
19:28
<sicking>
wait, i still don't get it. What does the 'this' map to in your code example?
19:28
<sicking>
not that object literal obviously
19:29
<Domenic_>
nah, it was a perhaps overly-simplified example. the idea is that the TransformStream helper produces such objects. Perhaps a better constructor would be `new SyncTransformStream(function (chunk) { return chunk; })` or `new AsyncTransformStream(function (chunk, push, done) { push(chunk); push(chunk); done(); })` or something
19:29
<Domenic_>
this area is largely under-developed
19:29
<Domenic_>
as long as SyncTransformStream and AsyncTransformStream objects have { input, output } properties everything will work.
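(A usage sketch of those explicitly non-final helpers, assuming string chunks and existing `source`/`destination` streams; pipeTo is the verb used later in this discussion.)

```js
const upperCaser = new SyncTransformStream(chunk => chunk.toUpperCase());
source.pipeTo(upperCaser.input);       // write side of the pair
upperCaser.output.pipeTo(destination); // read side of the pair
```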
19:30
<sicking>
ok, let me ask the question this way then. If I have a ReadableStream that I got from, say, XHR, can I pass that to a Worker?
19:31
<Domenic_>
that goes outside my area of expertise... doesn't that have to be transferable then?
19:31
<sicking>
transferable or structured-clonable. For streams, transferable is indeed what you likely want
19:32
<Domenic_>
i assume that would have to apply to all objects it holds a reference to
19:32
<sicking>
otherwise you have to tee the stream
19:32
<Domenic_>
which would include arbitrary objects in general
19:32
<Domenic_>
since you can create a stream of arbitrary objects
19:32
<sicking>
this is a stream i got from an XHR
19:32
<sicking>
...if that makes a difference
19:33
<Domenic_>
ok
19:33
<sicking>
so can you transfer that to a worker?
19:33
<Domenic_>
so i think i see what you're getting at
19:33
<Domenic_>
you would like certain platform-created streams to only hold references to transferable objects
19:33
<Domenic_>
so that they could be tranfserred to workers
19:33
<Domenic_>
but putBack would defeat this
19:33
<sicking>
not necessarily
19:34
<sicking>
and i don't think i just want it for "certain platform created streams"
19:34
<Domenic_>
well user-created streams can be streams of functions, which IIRC are not transferable
19:34
<Domenic_>
or streams where most of the time it's a string and then every 1000th element is a function
19:34
<sicking>
right
19:35
<sicking>
so my question remains. Do you think you should be able to take a stream that comes from an XHR and pass that to a worker?
19:36
<sicking>
*I* want that to be possible, but I realize that might not be something that everyone thinks is a priority
19:36
<Domenic_>
i don't know enough about use cases to answer decisively. my gut is that the worker itself should have input and output streams you can pipe through
19:36
<Domenic_>
it would be nice if piping arraybuffers or other transferables into a worker did a transfer instead of a copy
19:37
<sicking>
similarly, should you be able to take a ReadableStream and write it to disk using the filesystem API?
19:37
<Domenic_>
i feel like these are all use cases streams were meant to obsolete
19:37
<Domenic_>
you pipe a readable stream to the disk
19:37
<Domenic_>
you don't store the stream itself on the disk
19:38
<Domenic_>
you pipe a stream to a web worker
19:38
<sicking>
sure, s/write/pipe/
19:38
<Domenic_>
you don't transfer the stream itself to the web worker
19:38
<Domenic_>
ok
19:38
<Domenic_>
but in that case the disk is a writable stream
19:38
<Domenic_>
and you're just piping the readable stream to it
19:38
<Domenic_>
which operates through the well-known public API
19:38
<sicking>
well
19:39
<sicking>
if I get a readable stream from an XHR, and then pipe that to a writablestream that goes to disk, I don't think we want to hit the main thread for each chunk of data, right?
19:39
<Domenic_>
fair, but that's an optimization
19:39
<Domenic_>
you can e.g. only make that optimization if nobody's done `diskStream.write = function (chunk) { console.log(chunk); oldDiskStreamWrite.apply(this, arguments); }`
19:39
<sicking>
is that an optimization that's possible if we always go through the public API?
19:40
<Domenic_>
the optimization would be skipping the public API where possible
19:40
<sicking>
i'm not sure if that's a good idea. I've never thought about it that way
19:41
<sicking>
seems like it potentially is a big performance hit
19:41
<Domenic_>
hmm. from what i understand many JS engine optimizations work that way
19:41
<sicking>
what happens if no-one had touched diskStream.write, but after 2 minutes they suddenly set diskStream.write?
19:41
<Domenic_>
like if you do Object.defineProperty(Array.prototype, "0", { set: function () { console.log("haha!"); } }) you skip the fast-path on setting array elements in memory
19:41
<Domenic_>
hmm
19:41
<sicking>
should we detect that and reroute the traffic at that point?
19:42
<Domenic_>
we could spec pipe to cache the value of the write function at the time the pipe initiates
19:42
<sicking>
it's not impossible. But it likely will mean we won't do it for a long time
19:43
<sicking>
i'm not sure what the right fix is. But I think it needs to be relatively easy for an implementation to not have to go through any thread that the data has been piped through at some point
19:43
<Domenic_>
i agree that should be a high high priority
19:43
<Domenic_>
probably the highest
19:44
<sicking>
cool
19:45
<sicking>
the way we do this in the DOM is that we don't operate through the public API at all times. Rather we usually operate through internal operations. Which I realize might break use cases that you have in mind
19:45
<sicking>
so not saying this as a recommendation, but rather as a "that's how we solve it elsewhere"
19:46
<sicking>
caching write (and any function it calls, if any?) might actually work too
19:48
<sicking>
and I do think that we need to be able to hand off consuming a stream from one thread to another. If that means "pass to worker" or "pipe to worker" I'm not sure. But I don't know what "pipe to worker" would look like.
19:48
<Domenic_>
yeah, it is generally better to have fewer internal ops if possible, to allow users the same level of power and access as the platform. So if we can do that without hurting perf that's ideal. But obviously a large part of the point of streams is perf, so we have to be extra careful not to hurt ourselves in that way.
19:48
<Domenic_>
I am hoping pipe to worker looks like `myStream.pipeTo(myWorker.input)`
19:48
<sicking>
what's myworker.input?
19:48
<Domenic_>
a WritableStream
19:49
<sicking>
where does the data go?
19:49
<Domenic_>
which probably manifests as self....something... inside the worker
19:49
<Domenic_>
where ...something... is a readable stream
19:49
<sicking>
so you can only pipe one stream to a worker ever?
19:49
<Domenic_>
node has this in its child process API, but granted that's processes, not heavyweight threads
19:49
<Domenic_>
nah, you can always multiplex
19:50
<Domenic_>
or we could do messageport-style ports
19:50
<Domenic_>
MessageChannel I mean. I suppose the fact that MessageChannel exists in the face of postMessage implies people want easy multiplexing
19:51
<sicking>
yeah, Gecko's lack of MessageChannel has been a thorn in yehuda's side
19:54
<sicking>
so would each messageport have a .input (ReadableStream) and a .output (WritableStream) property?
19:56
<Domenic_>
other way around, you write to inputs and read from outputs, but yes
19:56
<hober>
"thorn in yehuda's side" was the name of *whose* high school band?
19:56
<Ms2ger>
Mine
19:56
<sicking>
hober: haha
19:58
<sicking>
Domenic_: seems like a somewhat clunky way to get a stream transferred to a worker. You first have to create a message channel, then transfer one of the ports to the worker, then pipe your readablestream into the other port's input
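(That flow as a sketch; the .input/.output properties on a MessagePort are the hypothetical stream-carrying surface from this discussion, not a real API, and `xhrStream` is assumed to be the ReadableStream from the XHR.)

```js
const channel = new MessageChannel();
worker.postMessage({ newWorkItem: true }, [channel.port2]); // transfer one port to the worker
xhrStream.pipeTo(channel.port1.input);                      // pipe into the other end
// Inside the worker, onmessage would pick up event.ports[0] and read the
// work item from its (equally hypothetical) .output stream.
```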
20:00
<Domenic_>
sicking: is that how message channels work now? wow that is clunky
20:00
<Domenic_>
better APIs welcome certainly
20:01
<sicking>
Domenic_: only when you want to establish a new channel. If you already have a channel open then you can just postMessage
20:01
<sicking>
Domenic_: but if you have a channel open and you want to tell it "here, process this stream", then you need a way to do that. For other things we just transfer them
20:02
<sicking>
Domenic_: but if you can't transfer streams then a simple postMessage doesn't work
20:02
<Domenic_>
ok, but if you already have the channel open, then it's just stream.pipeTo(channel.input). Seems similar effort to postMessage
20:03
<sicking>
Domenic_: but you can only do that once. If you later want to say "here, process this stream too" then you'd have to multiplex over the same channel
20:03
<sicking>
err.. over the same stream
20:03
<Domenic_>
ah, i see
20:04
<Domenic_>
i guess i am assuming you would want multiple streams just as often as you would want multiple channels
20:04
<Domenic_>
what do people open multiple channels for anyway?
20:07
<sicking>
Domenic_: i'm not entirely sure. Ask Yehuda.
20:07
<Domenic_>
sicking: will do!
20:07
<sicking>
Domenic_: but here it's more "multiple work items" rather than "multiple channels"
20:07
<sicking>
i.e. a stream can represent a work item
20:08
<sicking>
anyhow, gotta join a call, sorry
20:09
<Domenic_>
np, i should probably do my day job
20:16
<Rahul21>
hello everyone
20:17
<Rahul21>
is it possible to customize the <audio> tag, to set cue points in the seekbar?
20:18
<TabAtkins>
Nope.
20:18
<Rahul21>
TabAtkins: http://imgur.com/onhyHYR
20:19
<Rahul21>
TabAtkins: something like that, any alternatives to do it?
20:19
<TabAtkins>
Not with the <audio> tag, no.
20:35
<SimonSapin>
Rahul21: what you can do is have an invisible <audio> element (without controls) and implement your own controls based on the JS API
20:36
<SimonSapin>
… maybe
20:36
<SimonSapin>
(I haven’t actually tried anything like this)
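(A sketch of that suggestion: a hidden <audio> element, i.e. no `controls` attribute, plus custom controls driven by the media element's JS API. The file name, the #seekbar range input, the cue times, and the marker CSS are all illustrative assumptions.)

```js
const audio = document.createElement('audio');
audio.src = 'track.mp3';
document.body.appendChild(audio);        // nothing is rendered without `controls`

const seekbar = document.querySelector('#seekbar'); // an <input type=range>
const cuePoints = [12.5, 47.0, 90.25];              // seconds

audio.addEventListener('loadedmetadata', () => {
  seekbar.max = audio.duration;
  for (const t of cuePoints) {                      // drop one marker per cue point
    const marker = document.createElement('span');
    marker.className = 'cue';                       // positioned along the bar via CSS (omitted)
    marker.style.left = (t / audio.duration * 100) + '%';
    seekbar.parentNode.appendChild(marker);
  }
});
audio.addEventListener('timeupdate', () => { seekbar.value = audio.currentTime; });
seekbar.addEventListener('input', () => { audio.currentTime = Number(seekbar.value); });
```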
20:38
<Rahul21>
SimonSapin: any demos or links would be very helpful
20:38
<Rahul21>
resources anything
20:39
<SimonSapin>
I don’t have any right now, sorry
23:00
<Hixie>
anyone know the status of cross-origin font loading in the various browsers?
23:00
<TabAtkins>
Firefox blocks, Webkit doesn't, Blink doesn't but is considering switching to blocking, IE I dunno.
23:01
<Hixie>
thanks