
OleDbDataReader using up all the RAM (in PowerShell)

From everything I've read, the OLE DB DataReader does not store records in memory, but this code is maxing out the RAM. It's meant to pull data from an Oracle DB (about 10M records) and write it to a GZIP file. I have tried everything (including commenting out the GZIP write) and it still ramps up the RAM until it falls over. Is there a way to just execute the reader without it holding on to memory? What am I doing wrong?

$tableName='ACCOUNTS'
$fileNum=1
$gzFilename="c:\temp\gzip\$tableName.$fileNum.txt.gz"
$con=Open-Con ORA -tns $tns -userName $userName -fetchSize $fetchSize
$cmd = New-Object system.Data.OleDb.OleDbCommand($sql,$con);           
$cmd.CommandTimeout = '0';        
$output = New-Object System.IO.FileStream $gzFilename, ([IO.FileMode]::Create), ([IO.FileAccess]::Write), ([IO.FileShare]::None)
[System.IO.Compression.GzipStream]$gzipStream = New-Object System.IO.Compression.GzipStream $output, ([IO.Compression.CompressionMode]::Compress)
$encoding = [System.Text.Encoding]::UTF8
$reader=$cmd.ExecuteReader()
[int]$j=0
While ($reader.Read())            
{            
        $j++
        $str=$reader[0..$($reader.Fieldcount-1)] -join '|'                        
        $out=$encoding.GetBytes($("$str`n").ToString() )
        $gzipStream.Write($out,0, $out.length)
        if($j % 10000 -eq 0){write-host $j}
        if($j % 1000000 -eq 0){
            write-host 'creating new gz file'
            $gzipStream.Close();
            $gzipStream.Dispose()
            $fileNum+=1
            $gzFilename="c:\temp\gzip\$tableName.$fileNum.txt.gz"
            $output = New-Object System.IO.FileStream $gzFilename, ([IO.FileMode]::Create), ([IO.FileAccess]::Write), ([IO.FileShare]::None)
            [System.IO.Compression.GzipStream]$gzipStream = New-Object System.IO.Compression.GzipStream $output, ([IO.Compression.CompressionMode]::Compress)
            }
}

Edit: as suggested in the comments, [system.gc]::Collect() had no effect. Stripping the code down to its simplest form and reading only a single field also had no effect. This code ramps up to 16 GB of memory (viewed in Task Manager) and then quits with an OutOfMemoryException:

$con=Open-Con ORA -tns $tns -userName $userName -fetchSize $fetchSize
$cmd = New-Object system.Data.OleDb.OleDbCommand($sql,$con);           
$cmd.CommandTimeout = '0';        
            
$reader=$cmd.ExecuteReader()
[int]$j=0
While ($reader.Read())            
{            
    $str=$reader[0]                       
}    

Possibly it's using up virtual address space rather than physical RAM. That's a common problem with the underlying .NET garbage collector used with (at least) the ADO.NET and string objects created here, especially if any of the records have fields with lots of text.

Building on that, it looks like you're already doing most of the right things to avoid this issue (using a DataReader, writing directly to a stream, etc.). What you could do to improve it is write to the stream one field at a time, rather than using -join to push all the fields into a single string before writing, and make sure to re-use the same $out byte array (though I'm not sure exactly what that last part looks like in PowerShell, or with Encoding.GetBytes()).

This may help, but it can still cause issues with how it concatenates the fieldDelimiter and line terminator. If you find this version runs for longer but still eventually fails, you'll probably need to do the tedious work of issuing separate write operations to the gzip stream for each of those values.

$tableName='ACCOUNTS'
$fileNum=1
$gzFilename="c:\temp\gzip\$tableName.$fileNum.txt.gz"
$con=Open-Con ORA -tns $tns -userName $userName -fetchSize $fetchSize
$cmd = New-Object system.Data.OleDb.OleDbCommand($sql,$con);           
$cmd.CommandTimeout = '0';        
$output = New-Object System.IO.FileStream $gzFilename, ([IO.FileMode]::Create), ([IO.FileAccess]::Write), ([IO.FileShare]::None)
[System.IO.Compression.GzipStream]$gzipStream = New-Object System.IO.Compression.GzipStream $output, ([IO.Compression.CompressionMode]::Compress)
$encoding = [System.Text.Encoding]::UTF8
$reader=$cmd.ExecuteReader()
[int]$j=0
While ($reader.Read())            
{            
        $j++
        $fieldDelimiter = ""
        $terminator = ""
        for ($k = 0; $k -lt $reader.FieldCount; $k++) {
            # the last field gets the line terminator appended instead of a delimiter
            if ($k -eq $reader.FieldCount - 1) { $terminator = "`n" }

            $out = $encoding.GetBytes("$fieldDelimiter$($reader[$k])$terminator")
            $gzipStream.Write($out, 0, $out.Length)

            $fieldDelimiter = "|"
        }

        if($j % 10000 -eq 0){write-host $j}
        if($j % 1000000 -eq 0){
            write-host 'creating new gz file'
            $gzipStream.Close();
            $gzipStream.Dispose()
            $fileNum+=1
            $gzFilename="c:\temp\gzip\$tableName.$fileNum.txt.gz"
            $output = New-Object System.IO.FileStream $gzFilename, ([IO.FileMode]::Create), ([IO.FileAccess]::Write), ([IO.FileShare]::None)
            [System.IO.Compression.GzipStream]$gzipStream = New-Object System.IO.Compression.GzipStream $output, ([IO.Compression.CompressionMode]::Compress)
            }
}
$gzipStream.Dispose()   # flush the final compressed block and close the underlying FileStream
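For the buffer re-use mentioned above, here's a rough sketch of what that could look like in PowerShell. The delimiter and terminator are encoded once and written separately, and the per-field bytes go through one pre-allocated array via the Encoding.GetBytes overload that fills an existing buffer. The 64 KB buffer size is an assumption about the largest encoded field; GetBytes will throw if a field exceeds it, so size it to your data.

```powershell
# Sketch only: pre-encode the delimiter and line terminator once, and re-use a
# single byte buffer so no new byte[] or concatenated string is allocated per field.
$encoding   = [System.Text.Encoding]::UTF8
$delimBytes = $encoding.GetBytes('|')
$nlBytes    = $encoding.GetBytes("`n")
$buffer     = New-Object byte[] 65536   # assumed upper bound on one encoded field

while ($reader.Read()) {
    for ($k = 0; $k -lt $reader.FieldCount; $k++) {
        if ($k -gt 0) { $gzipStream.Write($delimBytes, 0, $delimBytes.Length) }
        $value = [string]$reader[$k]
        # GetBytes(string, charIndex, charCount, byte[], byteIndex) writes into
        # the existing array and returns the number of bytes produced
        $byteCount = $encoding.GetBytes($value, 0, $value.Length, $buffer, 0)
        $gzipStream.Write($buffer, 0, $byteCount)
    }
    $gzipStream.Write($nlBytes, 0, $nlBytes.Length)
}
```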
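One further experiment, if fields with lots of text really are the trigger: ADO.NET's CommandBehavior.SequentialAccess asks the reader to stream column data on demand instead of buffering each whole row. This isn't something tried in the code above, just a sketch of the idea; note that with SequentialAccess the columns must be read strictly in increasing ordinal order.

```powershell
# Sketch, assuming the provider is buffering entire rows for large text columns.
$reader = $cmd.ExecuteReader([System.Data.CommandBehavior]::SequentialAccess)
while ($reader.Read()) {
    for ($k = 0; $k -lt $reader.FieldCount; $k++) {
        $value = $reader.GetValue($k)   # in-order access only
        # ... encode and write $value to the gzip stream as above ...
    }
}
$reader.Dispose()
```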
