
How do I get the size of a file in megabytes using Perl?

I want to get the size of a file on disk in megabytes. Using the -s operator gives me the size in bytes, but I'm going to assume that then dividing this by a magic number is a bad idea:

my $size_in_mb = (-s $fh) / (1024 * 1024);

Should I just use a read-only variable to define 1024, or is there a programmatic way to obtain the number of bytes in a kilobyte?

EDIT: Updated the incorrect calculation.

If you'd like to avoid magic numbers, try the CPAN module Number::Bytes::Human.

use Number::Bytes::Human qw(format_bytes);
my $size = format_bytes(-s $file); # 4.5M

You could of course create a function for calculating this. That is a better solution than creating constants in this instance.

sub size_in_mb {
    my $size_in_bytes = shift;
    return $size_in_bytes / (1024 * 1024);
}

No need for constants. Changing the 1024 to some kind of variable/constant won't make this code more readable.
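As a rough end-to-end sketch of that function in use: the snippet below writes a throwaway file with the core File::Temp module (the temporary file and its 1.5 MB size are assumptions made purely for illustration) and checks the result of -s before dividing, since -s yields undef when the file cannot be stat'ed.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

sub size_in_mb {
    my $size_in_bytes = shift;
    return $size_in_bytes / (1024 * 1024);
}

# Create a throwaway 1.5 MB file so the example is self-contained.
my ($fh, $filename) = tempfile(UNLINK => 1);
binmode $fh;
print {$fh} "x" x (1536 * 1024);
close $fh or die "close failed: $!";

# -s returns undef if the file cannot be stat'ed, so check before dividing.
my $bytes = -s $filename;
die "Cannot stat $filename: $!" unless defined $bytes;

printf "%.2f MB\n", size_in_mb($bytes);   # 1.50 MB
```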

This is an old question and has already been answered correctly, but in case your program is restricted to the core modules and you cannot use Number::Bytes::Human, here are several other options I have collected over time. I have kept them because each one uses a different Perl approach and is a nice example of TIMTOWTDI:

  • example 1: uses state so the variable is not reinitialized on every call (state needs Perl 5.10 or later and must be enabled with use feature 'state', use v5.10 or higher, or perl -E)

http://kba49.wordpress.com/2013/02/17/format-file-sizes-human-readable-in-perl/

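    use feature 'state';

    sub formatSize {
        my $size = shift;
        my $exp = 0;

        # Initialized once, on the first call.
        state $units = [qw(B KB MB GB TB PB)];

        # Stop at the last unit so $exp never runs past the array.
        while ($size >= 1024 && $exp < $#$units) {
            $size /= 1024;
            $exp++;
        }

        return wantarray ? ($size, $units->[$exp]) : sprintf("%.2f %s", $size, $units->[$exp]);
    }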
  • example 2: using sort map


sub scaledbytes {

    # http://www.perlmonks.org/?node_id=378580
    # Format the size in every unit, then keep the shortest string.
    (sort { length $a <=> length $b }
     map  { sprintf '%.3g%s', $_[0] / 1024**$_->[1], $_->[0] }
          [" bytes" => 0],
          [KB => 1],
          [MB => 2],
          [GB => 3],
          [TB => 4],
          [PB => 5],
          [EB => 6]
    )[0]
}
  • example 3: Take advantage of the fact that 1 GB = 1024 MB, 1 MB = 1024 KB and 1024 = 2 ** 10, so shifting right by 10 bits divides by 1024:


# http://www.perlmonks.org/?node_id=378544
my $kb = 1024 * 1024; # 1 GB, expressed in KB

my $mb = $kb >> 10;
my $gb = $mb >> 10;

print "$kb kb = $mb mb = $gb gb\n";
__END__
1048576 kb = 1024 mb = 1 gb
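The same shift can be applied directly to a byte count such as the one -s returns: shifting right by 20 bits divides by 1024 * 1024 and drops the fraction. A minimal sketch, where whole_megabytes is a hypothetical helper name and not part of the PerlMonks post:

```perl
use strict;
use warnings;

# Hypothetical helper: number of whole megabytes in a byte count.
# >> 20 is integer division by 2**20 (1024 * 1024); the remainder is discarded.
sub whole_megabytes {
    my ($bytes) = @_;
    return $bytes >> 20;
}

print whole_megabytes(5 * 1024 * 1024 + 123), "\n";   # 5
print whole_megabytes(1024 * 1024 - 1), "\n";         # 0
```

Because the shift truncates, this only suits cases where a whole number of megabytes is wanted; use division when the fractional part matters.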
  • example 4: use of ++$n and ... until .. to obtain an index for the array


#! perl -slw
# http://www.perlmonks.org/?node_id=378542
use strict;

sub scaleIt {
    my( $size, $n ) =( shift, 0 );
    ++$n and $size /= 1024 until $size < 1024;
    return sprintf "%.2f %s",
           $size, ( qw[ bytes KB MB GB ] )[ $n ];
}

my $size = -s $ARGV[ 0 ];

print "$ARGV[ 0 ]: ", scaleIt $size;  

Even if you cannot use Number::Bytes::Human, take a look at its source code to see all the things that you need to be aware of.

Well, there's not 1024 bytes in a meg, there's 1024 bytes in a K, and 1024 K in a meg...

That said, 1024 is a safe "magic" number that will never change in any system you can expect your program to work in.

I would read this into a variable rather than use a magic number. Even if magic numbers are not going to change, like the number of bytes in a megabyte, using a well named constant is a good practice because it makes your code more readable. It makes it immediately apparent to everybody else what your intention is.
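One way this could look, as a sketch using the core constant pragma; the names BYTES_PER_KB, BYTES_PER_MB and bytes_to_mb are made up for illustration:

```perl
use strict;
use warnings;

# Named constants document intent; the values themselves never change.
use constant BYTES_PER_KB => 1024;
use constant BYTES_PER_MB => 1024 * BYTES_PER_KB;

# Hypothetical helper that converts a byte count to megabytes.
sub bytes_to_mb {
    my ($bytes) = @_;
    return $bytes / BYTES_PER_MB;
}

print bytes_to_mb(3 * BYTES_PER_MB), "\n";   # 3
```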

1) You don't want 1024. That gives you kilobytes. You want 1024*1024, or 1048576.

2) Why would dividing by a magic number be a bad idea? It's not like the number of bytes in a megabyte will ever change. Don't overthink things too much.

Don't get me wrong, but: I think that declaring 1024 as a Magic Variable goes a bit too far, that's a bit like "$ONE = 1; $TWO = 2;" etc.

A kilobyte has been falsely declared as 1024 bytes for more than 20 years, and I seriously doubt that the operating system manufacturers will ever correct that bug and change it to 1000.

What could make sense though is to declare non-obvious stuff, like "$megabyte = 1024 * 1024" since that is more readable than 1048576.

Since the -s operator returns the file size in bytes you should probably be doing something like

my $size_in_mb = (-s $fh) / (1024 * 1024);

and use int() if you need a round figure. It's not like the size of a KB or MB is going to change anytime in the near future :)
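Note that int() truncates toward zero rather than rounding to the nearest integer. A small sketch of the difference, using a made-up byte count in place of a real -s result:

```perl
use strict;
use warnings;

my $bytes = 2_883_584;                  # 2.75 * 1024 * 1024, a made-up size
my $mb    = $bytes / (1024 * 1024);

print int($mb), "\n";                   # 2    (fraction discarded)
print sprintf("%.0f", $mb), "\n";       # 3    (rounded to nearest)
print sprintf("%.2f", $mb), "\n";       # 2.75 (two decimal places)
```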

