Which tool should I use for finding out my memory allocation in Perl?

I've slurped in a big file using File::Slurp, but given the size of the file I can see that I must have it in memory twice, or perhaps it's being inflated by conversion to 16-bit Unicode. How can I best diagnose that sort of problem in Perl?

The file I pulled in is 800 MB in size, and the Perl process that's analysing that data has roughly 1.6 GB allocated at runtime.

I realise that I may be wrong about the cause of the problem, but I'm not sure of the most efficient way to prove or disprove my theory.

Update:

I have eliminated dodgy character encoding from the list of suspects. It looks like I'm copying the variable at some point; I just can't figure out where.

Update 2:

I have now done some more investigation and discovered that it's actually just getting the data from File::Slurp that's causing the problem. I had a look through the documentation and discovered that I can get it to return a scalar_ref, i.e.

my $data = read_file($file, binmode => ':raw', scalar_ref => 1);

Then I don't get the inflation of my memory. That makes some sense, and it is the most logical way to get the data in my situation.

The information about looking at what variables exist etc. has been generally helpful though, thanks.
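To illustrate why asking for a reference can help, here is a minimal sketch that uses only core Perl. The `slurp_ref` helper is hypothetical and is not how File::Slurp is implemented; it only shows the general idea of handing back a reference to the buffer so the caller never receives a second copy of the data.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of File::Slurp): slurp a file in raw mode
# and return a reference to the buffer rather than the buffer itself.
sub slurp_ref {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    local $/;            # undefined record separator: read the whole file at once
    my $buf = <$fh>;
    close $fh;
    return \$buf;        # hand back a reference, not the string
}

# Small demonstration on a temporary file in the current directory.
my $path = "slurp_demo_$$.tmp";
open my $out, '>:raw', $path or die "Cannot create $path: $!";
print {$out} 'x' x 1_000_000;
close $out or die "Cannot close $path: $!";

my $data = slurp_ref($path);
printf "read %d bytes via a %s reference\n", length($$data), ref $data;
unlink $path;
```

The data is then accessed as `$$data`, the same way as with the `scalar_ref => 1` result above.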

Maybe Devel::DumpSizes and/or Devel::Size can help out? I think the former would be more useful in your case.

Devel::DumpSizes - Dump the name and size in bytes (in increasing order) of variables that are available at a given point in a script.

Devel::Size - Perl extension for finding the memory usage of Perl variables
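As a rough sketch of how Devel::Size might be used here: its `size` function reports the memory used by a single variable, and `total_size` follows references into nested structures. The variable names and data below are made up for illustration, and because Devel::Size is a CPAN module rather than part of core Perl, the sketch loads it defensively.

```perl
use strict;
use warnings;

# Devel::Size comes from CPAN, so check whether it is installed.
my $have_devel_size = eval { require Devel::Size; 1 };

# Illustrative data only: one large string and many small ones.
my $text  = 'x' x 1_000_000;
my @lines = ("some line\n") x 10_000;

if ($have_devel_size) {
    # size() reports the memory used by the scalar itself.
    printf "scalar: %d bytes for %d characters\n",
        Devel::Size::size($text), length $text;
    # total_size() also counts everything the array contains.
    printf "array:  %d bytes in total\n",
        Devel::Size::total_size(\@lines);
} else {
    print "Devel::Size is not installed\n";
}
```

Comparing the reported size of the slurped variable with the file size on disk should show whether the data itself is inflated or whether a second copy lives somewhere else.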

Here are some generic resources on memory issues in Perl:

As for your own theory, the simplest way to disprove it would be to write a simple Perl program that:

  1. Creates a big (100M) file of plain text, probably by just outputting the same string in a loop into a file, or, for binary files, by running the dd command via a system() call

  2. Reads the file in using standard Perl open()/@a=<>;

  3. Measures memory consumption.

Then repeat #2-#3 for your 800M file.

That will tell you whether the issue is File::Slurp, some weird logic in your program, or some specific content in the file (e.g. non-ASCII, although I'd be surprised if that ends up being the reason).
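The three steps above could be sketched roughly as follows. This assumes Linux, where `/proc/self/status` exposes the resident set size as `VmRSS`; the `rss_kb` helper and the 80-byte line are illustrative choices, not something the original test prescribes.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my $size_mb = 100;    # size of the generated test file

# Assumption: Linux. Returns the resident set size in kB, or undef elsewhere.
sub rss_kb {
    open my $fh, '<', '/proc/self/status' or return;
    while (my $line = <$fh>) {
        return $1 if $line =~ /^VmRSS:\s+(\d+)\s+kB/;
    }
    return;
}

# Step 1: create a big plain-text file by writing the same line in a loop.
my ($out, $path) = tempfile(UNLINK => 1);
my $line  = ('x' x 79) . "\n";    # 80 bytes per line
my $count = int($size_mb * 1024 * 1024 / length $line);
print {$out} $line for 1 .. $count;
close $out or die "Cannot close $path: $!";

my $before = rss_kb();

# Step 2: read the file with a plain open() and a list-context readline.
open my $in, '<', $path or die "Cannot open $path: $!";
my @a = <$in>;
close $in;

# Step 3: measure memory consumption.
my $after = rss_kb();
printf "file size: %d bytes, lines read: %d\n", (-s $path), scalar @a;
if (defined $before && defined $after) {
    printf "resident memory grew by about %.1f MB\n", ($after - $before) / 1024;
} else {
    print "VmRSS is not available on this platform\n";
}
```

To repeat steps 2 and 3 for the real data, skip the file creation and point `$path` at the 800M file. Note that reading into an array of lines adds per-line overhead, so some growth beyond the raw file size is expected even in the baseline.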
