
How do I get the size of a file in megabytes using Perl?

I want to get the size of a file on disk in megabytes. Using the -s operator gives me the size in bytes, but I'm going to assume that then dividing this by a magic number is a bad idea:

my $size_in_mb = (-s $fh) / (1024 * 1024);

Should I just use a read-only variable to define 1024, or is there a programmatic way to obtain the number of bytes in a kilobyte?

EDIT: Updated the incorrect calculation.

If you'd like to avoid magic numbers, try the CPAN module Number::Bytes::Human.

use Number::Bytes::Human qw(format_bytes);
my $size = format_bytes(-s $file); # 4.5M

You could of course create a function for calculating this. That is a better solution than creating constants in this instance.

sub size_in_mb {
    my $size_in_bytes = shift;
    return $size_in_bytes / (1024 * 1024);
}

No need for constants. Changing the 1024 to some kind of variable/constant won't make this code more readable.

This is an old question and has already been answered correctly, but in case your program is restricted to the core modules and you cannot use Number::Bytes::Human, here are several other options I have collected over time. I have also kept them because each one uses a different Perl approach and is a nice example of TIMTOWTDI:

  • example 1: uses state to avoid reinitializing the variable on each call (state is available from Perl 5.10 and has to be enabled, for example with use feature 'state' or perl -E)

http://kba49.wordpress.com/2013/02/17/format-file-sizes-human-readable-in-perl/

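    use feature 'state';    # "state" needs Perl 5.10+ and must be enabled

    sub formatSize {
        my $size = shift;
        my $exp = 0;

        state $units = [qw(B KB MB GB TB PB)];

        # Stop at the last unit so $exp never runs past the end of the list.
        while ($size >= 1024 && $exp < $#{$units}) {
            $size /= 1024;
            $exp++;
        }

        return wantarray ? ($size, $units->[$exp]) : sprintf("%.2f %s", $size, $units->[$exp]);
    }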
  • example 2: using sort and map

sub scaledbytes {
    # http://www.perlmonks.org/?node_id=378580
    # Format the size in every unit, then keep the shortest string.
    (sort { length $a <=> length $b }
        map { sprintf '%.3g%s', $_[0] / 1024**$_->[1], $_->[0] }
            [ " bytes" => 0 ], [ KB => 1 ], [ MB => 2 ], [ GB => 3 ],
            [ TB => 4 ], [ PB => 5 ], [ EB => 6 ]
    )[0];
}
  • example 3: take advantage of the fact that 1 GB = 1024 MB, 1 MB = 1024 KB and 1024 = 2 ** 10 (so a right shift by 10 bits divides by 1024):

# http://www.perlmonks.org/?node_id=378544
my $kb = 1024 * 1024; # set to 1 Gb

my $mb = $kb >> 10;
my $gb = $mb >> 10;

print "$kb kb = $mb mb = $gb gb\n";
__END__
1048576 kb = 1024 mb = 1 gb
  • example 4: uses ++$n and ... until ... to obtain an index into the array of units

#! perl -slw
# http://www.perlmonks.org/?node_id=378542
use strict;

sub scaleIt {
    my( $size, $n ) =( shift, 0 );
    ++$n and $size /= 1024 until $size < 1024;
    return sprintf "%.2f %s",
           $size, ( qw[ bytes KB MB GB ] )[ $n ];
}

my $size = -s $ARGV[ 0 ];

print "$ARGV[ 0 ]: ", scaleIt $size;  

Even if you cannot use Number::Bytes::Human, take a look at its source code to see all the things that you need to be aware of.

Well, there's not 1024 bytes in a meg, there's 1024 bytes in a K, and 1024 K in a meg...

That said, 1024 is a safe "magic" number that will never change in any system you can expect your program to work in.

I would put this into a variable rather than use a magic number. Even if magic numbers are not going to change, like the number of bytes in a megabyte, using a well-named constant is good practice because it makes your code more readable. It makes your intention immediately apparent to everybody else.
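
A minimal sketch of that approach, using the core constant pragma. The names BYTES_PER_KB, BYTES_PER_MB and bytes_to_mb are made up for illustration and do not come from the question:

```perl
use strict;
use warnings;

# Hypothetical names, chosen only for this sketch.
use constant BYTES_PER_KB => 1024;
use constant BYTES_PER_MB => 1024 * BYTES_PER_KB;

# Convert a byte count to megabytes (1024-based).
sub bytes_to_mb {
    my ($bytes) = @_;
    return $bytes / BYTES_PER_MB;
}

# 1_572_864 bytes is 1.5 * 1024 * 1024.
printf "%.2f MB\n", bytes_to_mb(1_572_864);
```

Whether this reads better than a bare 1024 * 1024 is exactly what the answers in this thread disagree about.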

1) You don't want 1024. That gives you kilobytes. You want 1024*1024, or 1048576.

2) Why would dividing by a magic number be a bad idea? It's not like the number of bytes in a megabyte will ever change. Don't overthink things too much.

Don't get me wrong, but I think that declaring 1024 as a magic variable goes a bit too far; that's a bit like "$ONE = 1; $TWO = 2;" etc.

A kilobyte has been falsely declared as 1024 bytes for more than 20 years, and I seriously doubt that the operating system manufacturers will ever correct that bug and change it to 1000.

What could make sense, though, is to declare non-obvious stuff, like "$megabyte = 1024 * 1024", since that is more readable than 1048576.
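
To make the 1000 versus 1024 difference concrete, here is a small sketch that divides the same byte count both ways. The variable names and the sample size of 5,000,000 bytes are assumptions picked for the example:

```perl
use strict;
use warnings;

my $bytes = 5_000_000;    # arbitrary sample size for this sketch

# Hypothetical variable names.
my $mebibyte = 1024 * 1024;    # the 1024-based unit used throughout this thread
my $megabyte = 1000 * 1000;    # the decimal (SI) unit

my $size_binary  = $bytes / $mebibyte;
my $size_decimal = $bytes / $megabyte;

printf "%d bytes = %.2f MiB = %.2f MB\n", $bytes, $size_binary, $size_decimal;
```

The two results differ by almost 5 percent at this scale, so it is worth saying in a comment or in the variable name which definition the code uses.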

Since the -s operator returns the file size in bytes, you should probably be doing something like

my $size_in_mb = (-s $fh) / (1024 * 1024);

and use int() if you need a round figure. It's not like the size of a KB or MB is going to change anytime in the near future :)
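
A rough sketch of the difference between int() and sprintf when producing a round figure. It creates a scratch file of a known size with the core File::Temp module, so the file and its size are assumptions of the example rather than anything from the question:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file of exactly 1.75 * 1024 * 1024 bytes for the demo.
my ($fh, $filename) = tempfile(UNLINK => 1);
binmode $fh;
print {$fh} "\0" x 1_835_008;
close $fh or die "Cannot close $filename: $!";

# -s returns undef when the file cannot be found, so check before dividing.
my $bytes = -s $filename;
die "Cannot determine size of $filename" unless defined $bytes;

my $size_in_mb = $bytes / (1024 * 1024);
my $truncated  = int($size_in_mb);               # drops the fractional part
my $rounded    = sprintf "%.0f", $size_in_mb;    # rounds to the nearest integer
my $one_place  = sprintf "%.1f", $size_in_mb;    # keeps one decimal place

print "exact: $size_in_mb, int: $truncated, rounded: $rounded, one place: $one_place\n";
```

int() always truncates toward zero, so a file just under 2 MB is reported as 1; sprintf is usually the better choice when the number is meant for display.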
