Perl DBI alternative to LongReadLen

Question

I'd like to know the most memory-efficient way to pull arbitrarily large data fields from an Oracle database with Perl DBI. The method I know is to set the 'LongReadLen' attribute on the database handle to something sufficiently large. However, my application needs to pull several thousand records, so picking an arbitrarily large value is extremely memory-inefficient.

The doc suggests running a query up front to find the largest potential value, and setting LongReadLen to that:

$dbh->{LongReadLen} = $dbh->selectrow_array(qq{
    SELECT MAX(OCTET_LENGTH(long_column_name))
    FROM table WHERE ...
});
$sth = $dbh->prepare(qq{
    SELECT long_column_name, ... FROM table WHERE ...
});

However, this is still inefficient, since the outlying data is not representative of every record. The largest values are in excess of 1 MB, but the average record is less than 1 KB. I want to be able to pull all of the information (i.e., no truncation) while wasting as little memory on unused buffers as possible.

A method I've considered is to pull the data in chunks, say 50 records at a time, and set LongReadLen to the maximum record length within that chunk. Another workaround, which could build on the chunk idea but doesn't have to, would be to fork a child process, retrieve the data, and then kill the child (taking the wasted memory with it). The ideal would be the ability to force-free the DBI buffers, but I don't think that's possible.
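The chunk idea could be sketched roughly as follows. This is only an illustration under several assumptions: `fetch_in_chunks` is a hypothetical helper, `my_table`, `id` and `myclob` are placeholder names, and `DBMS_LOB.GETLENGTH` is used as the length function (it reports characters for CLOBs, so the value may need scaling for multi-byte character sets). LongReadLen is set before each prepare because changing it after a statement is prepared typically has no effect.

```perl
use strict;
use warnings;

# Fetch rows in batches, sizing LongReadLen to the largest value in each
# batch instead of the largest value in the whole table.
# Hypothetical helper; table and column names are placeholders.
sub fetch_in_chunks {
    my ( $dbh, $ids, $chunk_size, $on_row ) = @_;
    my @pending = @$ids;

    while ( my @batch = splice @pending, 0, $chunk_size ) {
        my $in = join ',', ('?') x @batch;

        # Longest value within this batch only.
        my ($max_len) = $dbh->selectrow_array(
            "SELECT MAX(DBMS_LOB.GETLENGTH(myclob)) FROM my_table WHERE id IN ($in)",
            undef, @batch
        );

        # Must be set before prepare() for it to apply to the new statement.
        $dbh->{LongReadLen} = $max_len || 0;

        my $sth = $dbh->prepare(
            "SELECT id, myclob FROM my_table WHERE id IN ($in)");
        $sth->execute(@batch);
        while ( my @row = $sth->fetchrow_array ) {
            $on_row->(@row);
        }
    }
    return;
}
```

The cost of this approach is one extra length query and one extra prepare per batch, so the batch size trades round trips against buffer size.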

Has anyone addressed a similar problem with any success? Thanks for the help!

EDIT

Perl v5.8.8, DBI v1.52

To clarify: the memory inefficiency comes from using 'LongReadLen' together with {ora_pers_lob => 1} in the prepare. Using this code:

use Carp;
use DBI;

# $db, $user and $pass are set elsewhere
my $sql = "select myclob from my_table where id = 68683";
my $dbh = DBI->connect( "dbi:Oracle:$db", $user, $pass ) or croak $DBI::errstr;

print "before\n";
readline( *STDIN );    # pause so resident memory can be inspected

$dbh->{'LongReadLen'} = 2 * 1024 * 1024;
my $sth = $dbh->prepare( $sql, {'ora_pers_lob' => 1} ) or croak $dbh->errstr;
$sth->execute() or croak( "Can't execute_query " . $dbh->errstr . ' sql: ' . $sql );
my $row = $sth->fetchrow_hashref;

print "after\n";
readline( *STDIN );

Resident memory usage "before" is at 18MB and usage "after" is at 30MB. This is unacceptable over a large number of queries.

Answer 1

Are your columns with large data LOBs (CLOBs or BLOBs)? If so, you don't need to use LongReadLen at all; DBD::Oracle provides a LOB streaming interface.

What you want to do is bind the param as type ORA_CLOB or ORA_BLOB, which will get you a "LOB locator" returned from the query instead of text. Then you use ora_lob_read together with the LOB locator to get the data. Here's an example of code that's worked for me:

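sub read_lob {
  my ( $dbh, $clob ) = @_;

  my $BLOCK_SIZE = 16384;

  my $out    = '';
  my $offset = 1;    # LOB offsets are 1-based

  # Read one block at a time until an empty chunk signals the end of the LOB.
  # Testing length() rather than truth keeps a chunk of "0" from ending the loop.
  while (1) {
    my $data = $dbh->ora_lob_read( $clob, $offset, $BLOCK_SIZE );
    last unless defined $data && length $data;
    $out .= $data;
    $offset += $BLOCK_SIZE;
  }
  return $out;
}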
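For a plain SELECT, DBD::Oracle also hands back locators when the statement is prepared with the `ora_auto_lob => 0` attribute. The sketch below assumes that route and passes each block to a callback instead of concatenating, so only one block needs to be in memory at a time. `stream_lob` and its callback argument are hypothetical names, not part of DBD::Oracle.

```perl
use strict;
use warnings;

# Stream a LOB in fixed-size blocks, handing each block to a callback so
# the whole value never has to be held in memory at once.
# $locator is a LOB locator fetched from a statement prepared with
# { ora_auto_lob => 0 }. stream_lob() is a hypothetical helper.
sub stream_lob {
    my ( $dbh, $locator, $on_chunk, $block_size ) = @_;
    $block_size ||= 16384;

    my $offset = 1;    # Oracle LOB offsets start at 1
    my $total  = 0;
    while (1) {
        my $data = $dbh->ora_lob_read( $locator, $offset, $block_size );
        last unless defined $data && length $data;
        $on_chunk->($data);
        $total  += length $data;
        $offset += $block_size;
    }
    return $total;     # amount of data passed to the callback
}

# Usage sketch (needs a live Oracle connection, so it is left commented out;
# $out_fh is an already opened output filehandle):
# my $sth = $dbh->prepare( 'select myclob from my_table where id = ?',
#                          { ora_auto_lob => 0 } );
# $sth->execute(68683);
# my ($locator) = $sth->fetchrow_array;
# stream_lob( $dbh, $locator, sub { print {$out_fh} $_[0] } );
```

Writing each block straight to a file or socket keeps peak memory near the block size, whereas returning the concatenated string still costs the full LOB size for the largest records.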

Answer 2

I would approach it this way:

use strict;
use warnings;
use Parallel::ForkManager;

# Max 50 processes for parallel data retrieving
my $pm = Parallel::ForkManager->new(50);

# while loop goes here
while ( my @row = $sth->fetchrow_array ) {

    # do the fork; the parent moves on to the next row
    $pm->start and next;

    #
    # Data retrieving goes here (the child should open its own
    # database connection rather than reuse the parent's handle)
    #

    # do the exit in the child process
    $pm->finish;
}
$pm->wait_all_children;

Check Parallel::ForkManager on CPAN to learn more.
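The same idea can be sketched with only core `fork` and a pipe, which also shows one way to get the retrieved data back to the parent. This is a minimal sketch, not a drop-in solution: `run_in_child` is a hypothetical helper, and in real use the code reference would open its own DBI connection inside the child.

```perl
use strict;
use warnings;
use POSIX ();

# Run $work in a forked child and return the string it produces.
# Any oversized buffers allocated by $work are released to the OS when
# the child exits. run_in_child() is a hypothetical helper.
sub run_in_child {
    my ($work) = @_;

    pipe( my $reader, my $writer ) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ( $pid == 0 ) {    # child: do the work, send the result, exit
        close $reader;
        print {$writer} $work->();
        close $writer;    # flushes the pipe before exiting
        POSIX::_exit(0);  # skip END blocks and destructors inherited from the parent
    }

    close $writer;        # parent: read until the child closes its end
    my $result = do { local $/; <$reader> };
    close $reader;
    waitpid( $pid, 0 );
    return $result;
}
```

The parent still holds the returned data, so this only helps when the waste is in the fetch buffers rather than in the values themselves.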

Answer 1: 5 votes, accepted, 2011-12-08 04:49:19

Answer 2: 0 votes, 2011-12-08 03:20:19
