Does Perl's Capture::Tiny::capture() avoid the disk I/O required when using system()?

When calling an external program from a Perl script, does Capture::Tiny avoid the disk I/O required when using system()? I get essentially the same performance with either. A colleague is using my code and told me that it was hammering his disks. I (perhaps) don't have this problem when running on my local machine and writing to local disks.

I was previously doing this:

open($fhStdin, ">stdin.txt");
print $fhStdin "some text\n";
close($fhStdin);
system("cmd < stdin.txt 1> stdout.txt 2> stderr.txt"); 
# open and read stdout.txt
# open and read stderr.txt

And changed to this:

($stdout, $stderr, $exit) = capture {
    open($fhStdin, '| cmd');
    print $fhStdin "some text\n";
    close($fhStdin);
};

But NYTProf tells me that they take essentially the same amount of time to run (though NYTProf removes disk I/O overheads from subroutine times). So I wondered whether capture() is writing to temporary files under the hood. (I tried reading the Tiny.pm source code but am ashamed to say I couldn't tell from that.)

Thanks for any tips.

The documentation for Capture::Tiny::capture states that files are indeed used:

Captures are normally done to an anonymous temporary filehandle.

This can be seen in the source of the _capture_tee sub, which serves as the generic routine behind all methods. About halfway through this sub there is a call to File::Temp->new, unless named files are to be used (see below). The rest of the processing can be traced with some care.

The docs go on to offer a way to monitor all this via named files instead:

To capture via a named file (eg to externally monitor a long-running capture), provide custom filehandles as a trailing list of option pairs:

 my $out_fh = IO::File->new("out.txt", "w+");
 my $err_fh = IO::File->new("out.txt", "w+");
 capture { ... } stdout => $out_fh, stderr => $err_fh;

The filehandles must be read/write and seekable. Modifying the files or filehandles during a capture operation will give unpredictable results. Existing IO layers on them may be changed by the capture.

(If this is done then the call to File::Temp is not made, as mentioned above. See source.)

If this disk activity is a problem you can use a piped open to read cmd's output (writing its input to a file first), or use qx (backticks). But then you'd have to merge or redirect STDERR and jump through more hoops to check for and handle errors.
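A minimal sketch of the piped-open and qx routes, under stated assumptions: `echo hello` stands in for the real cmd, `run_merged` is a hypothetical helper name, and STDERR is folded into STDOUT with the shell's `2>&1`, so the two streams can no longer be told apart.

```perl
use strict;
use warnings;

# Run a shell command and read its STDOUT and STDERR (merged) through a pipe,
# so no temporary file is involved on the Perl side.
# Returns the combined output and the command's exit code.
sub run_merged {
    my ($shell_cmd) = @_;
    open(my $fh, '-|', "$shell_cmd 2>&1")
        or die "Cannot start '$shell_cmd': $!";
    my $output = do { local $/; <$fh> };   # slurp everything the command prints
    $output = '' unless defined $output;
    close($fh);                            # waits for the child and sets $?
    return ($output, $? >> 8);
}

# 'echo hello' is only a placeholder for the real command.
my ($output, $exit) = run_merged('echo hello');
print "exit=$exit output=$output";

# The backticks/qx equivalent, again with STDERR folded into STDOUT.
my $qx_output = qx{echo hello 2>&1};
my $qx_exit   = $? >> 8;
```

If the command needs input, it still has to come from somewhere the shell can redirect from (for example `cmd < stdin.txt`), which is the extra file mentioned above.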

Another option is to use IPC::Run3. While it also uses files, it offers far more options, which may be leveraged to lessen the disk I/O or perhaps avoid the disk altogether. (The idea of invoking it with a filehandle opened to an in-memory scalar doesn't work, since that isn't a real filehandle.)

The "nuclear" option is the more complex IPC::Run which can take output without using disk.
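As a rough illustration of the IPC::Run route, the sketch below feeds input from a scalar and collects STDOUT and STDERR into scalars. It assumes IPC::Run is installed from CPAN, and the child command (a Perl one-liner run via `$^X`) is only a stand-in for the real cmd.

```perl
use strict;
use warnings;
use IPC::Run qw(run);

# Placeholder child process: upper-cases its STDIN and writes a note to STDERR.
my @cmd = ($^X, '-e', 'print uc($_) while <STDIN>; print STDERR "done\n"');

my $in = "some text\n";        # fed to the child's STDIN
my ($out, $err) = ('', '');    # filled from the child's STDOUT and STDERR

# run() returns true when the command exits with status 0.
run(\@cmd, \$in, \$out, \$err)
    or die "command failed with exit code " . ($? >> 8);

print "stdout: $out";
print "stderr: $err";
```

The three streams stay separate here, which is the main thing the piped-open and qx approaches give up.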


A crude sketch of the Capture::Tiny source

The "dispatch" of all methods to _capture_tee is done at the beginning, where a set of flags is unshifted onto @_ before goto &func takes over, to distinguish the methods. For capture this is 1,1,0,0, which sets up the variables $do_stdout and $do_stderr in _capture_tee. These are then used to set up the %do hash, whose keys are iterated over to set up $stash.

If extra arguments were passed to capture (for named files) then $stash->{capture} is set, otherwise a File::Temp object is assigned. The $stash is later passed to _open_std, where the redirection happens.

There is a lot more, but mostly related to manipulation of localized globs and layers.
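The dispatch idiom described above can be illustrated with a toy example. This is not Capture::Tiny's code: `_worker`, `both`, `stdout_only`, and the placeholder flag names `$flag3` and `$flag4` are all hypothetical, and only the unshift-then-goto pattern is taken from the description.

```perl
use strict;
use warnings;

# Generic routine: the first four arguments are flags prepended by the
# public wrappers; anything after them is the caller's own argument list.
sub _worker {
    my ($do_stdout, $do_stderr, $flag3, $flag4, @args) = @_;
    my %do = (
        ($do_stdout ? (stdout => 1) : ()),
        ($do_stderr ? (stderr => 1) : ()),
    );
    return join ',', sort keys %do;    # report which streams were selected
}

# Each wrapper prepends its own flags, then hands @_ over with goto &sub,
# so _worker runs as if it had been called directly by the original caller.
sub both        { unshift @_, 1, 1, 0, 0; goto &_worker }
sub stdout_only { unshift @_, 1, 0, 0, 0; goto &_worker }

print both(), "\n";           # stderr,stdout
print stdout_only(), "\n";    # stdout
```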


The most usual invocation of run3 writes to scalar(s)

run3 \@cmd, \my $in, \my $out, \my $err;

but this uses files, as explained in the docs under "How it works".

An attempt to trick it into not using files, by writing to a filehandle which is opened to a scalar:

my @cmd = qw(ls -l .);
open my $fh, '>', \my $cmd_out;  # not a real filehandle ...
run3 \@cmd, \undef, $fh;         # ... so this won't work

aborts with

run3(): Invalid argument redirecting STDOUT at ...

This is because an open to a scalar doesn't set up a real filehandle. See this post.
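One way to see the difference is to compare fileno for the two kinds of handle. The sketch below uses only core modules; the temporary file is just a convenient real file for the comparison. An in-memory handle has no descriptor at the OS level, which is presumably why a child process cannot be redirected to it.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# In-memory handle: exists only inside the Perl process.
open(my $mem_fh, '>', \my $buffer) or die "in-memory open failed: $!";

# File-backed handle: has a file descriptor at the OS level.
my ($file_fh, $path) = tempfile(UNLINK => 1);

my $mem_fd  = fileno($mem_fh);     # -1: no OS-level descriptor
my $file_fd = fileno($file_fh);    # a non-negative descriptor number

printf "in-memory handle fileno: %d\n", $mem_fd;
printf "file handle fileno:      %d\n", $file_fd;
```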

If the filehandle is opened to a file instead, this works as intended, writing to that file. This may well result in more efficient disk I/O than what Capture::Tiny does.
