Memory leak when inserting into MySQL with PowerShell v4

I'm using PowerShell v4 on W2K12 R2 (fully patched) to insert a large number (100+ million) of records into a MySQL database. I've run into a problem where memory usage keeps growing despite aggressively removing variables and garbage collecting. Note that the memory usage is growing on the box where I'm running the script, not on the DB server.

The insertion speed is good and the job runs fine. However, I have a memory leak and have been beating my head against a wall for a week trying to figure out why. I know from testing that the memory accumulates when calling the MySQL portion of the script and not anywhere else.

I've noticed that after every insertion the memory grows by anywhere between 1MB and 15MB.
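One way to put numbers on that growth is to snapshot memory before and after a batch. The following is a minimal sketch, not part of the original script; `Get-MemorySnapshot` is a hypothetical helper name, and it only uses standard .NET calls (`Process.WorkingSet64` and `GC.GetTotalMemory`).

```powershell
# Hypothetical helper (not in the original script): returns the current
# process working set and the managed heap size, both in MB.
function Get-MemorySnapshot {
    $proc = [System.Diagnostics.Process]::GetCurrentProcess()
    New-Object PSObject -Property @{
        WorkingSetMB = [math]::Round($proc.WorkingSet64 / 1MB, 1)
        ManagedMB    = [math]::Round([GC]::GetTotalMemory($false) / 1MB, 1)
    }
}

$before = Get-MemorySnapshot
# ... run one batch insert here ...
$after = Get-MemorySnapshot

$workingSetDelta = $after.WorkingSetMB - $before.WorkingSetMB
$managedDelta    = $after.ManagedMB - $before.ManagedMB
"Working set changed by {0} MB, managed heap by {1} MB" -f $workingSetDelta, $managedDelta
```

Logging both figures per batch makes it easier to see which call the growth follows.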

Here is the basic flow of the process (code at the bottom).
- records are added to an array until there are 1,000 records in the array
- once there are a thousand records, they are inserted, as a batch, into the DB
- the array is then emptied using the .clear() method (I've verified that 0 records remain in the array)
- I've tried aggressively garbage collecting after every insert (no luck there)
- I've also tried removing variables and then garbage collecting. Still no luck.

The code below is simplified for the sake of brevity, but it shows how I'm iterating over the records and doing the insert:

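$reader = [IO.File]::OpenText($filetoread)
$lineCount = 1
while ($reader.Peek() -ge 0) {
    $line = $reader.ReadLine()    # consume a line so Peek() advances (elided in the simplified version)
    # Flush after 1,000 lines, or when the end of the file is reached
    if ($lineCount -ge 1000 -or $reader.Peek() -lt 0) {

        insert_into_db

        $lineCount = 0
    }
    $lineCount++
}
$reader.Close()
$reader.Dispose()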

One call to establish the connection:

[void][system.reflection.Assembly]::LoadFrom("C:\Program Files (x86)\MySQL\MySQL Connector Net 6.8.3\Assemblies\v4.5\MySql.Data.dll")
$connection = New-Object MySql.Data.MySqlClient.MySqlConnection($connectionString)

And here is the call to MySQL that does the actual insert for each batch of 1,000 records:

function insert_into_db {
    $command = $connection.CreateCommand()                  # Create command object
    $command.CommandText = $query                           # Load query into object
    $script:RowsInserted = $command.ExecuteNonQuery()       # Execute command
    $command.Dispose()                                      # Dispose of command object
    $command = $null
    $query = $null
}

If anyone has any ideas or suggestions I'm all ears!

Thanks, Jeremy

My initial conclusion about the problem being related to the PowerShell -join operator appears to be wrong.

Here is what I was doing. Note that I'm adding each line to an array, which I un-roll later when I form my SQL. (On a side note, adding items to an array tends to be more performant than concatenating strings.)

$dataForInsertion = New-Object System.Collections.Generic.List[String]
$reader = [IO.File]::OpenText($filetoread)
$lineCount = 1
while ($reader.Peek() -ge 0) {
    $line = $reader.ReadLine()
    $dataForInsertion.Add($line)
    # Flush after 1,000 lines, or when the end of the file is reached
    if ($lineCount -ge 1000 -or $reader.Peek() -lt 0) {

        insert_into_db -insertthis $dataForInsertion

        $dataForInsertion.Clear()    # empty the batch, as described in the question
        $lineCount = 0
    }
    $lineCount++
}
$reader.Close()
$reader.Dispose()

Calling the insert function:

   sql_query -query "SET autocommit=0;INSERT INTO ``$table`` ($columns) VALUES $($dataForInsertion -join ',');COMMIT;"
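To see what that string expands to, here is a small sketch with made-up values. The table name, column list, and rows are assumptions for illustration only, and it assumes each stored line is already a parenthesized value tuple, which the original does not show.

```powershell
# Hypothetical sample values; the real script fills these from the input file.
$table   = 'mytable'
$columns = 'id,name'
$dataForInsertion = New-Object System.Collections.Generic.List[String]
$dataForInsertion.Add("(1,'alpha')")
$dataForInsertion.Add("(2,'beta')")

# A doubled backtick inside a double-quoted string yields one literal backtick,
# so the table name ends up quoted MySQL-style.
$query = "SET autocommit=0;INSERT INTO ``$table`` ($columns) VALUES $($dataForInsertion -join ',');COMMIT;"
$query
```

The -join call turns the whole batch into one multi-row VALUES list, so each batch of 1,000 lines goes to the server as a single statement.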

The improved insert function now looks like this:

function insert_into_db {
    $command.CommandText = $query                           # Load query into object
    $script:RowsInserted = $command.ExecuteNonQuery()       # Execute command
    $command.Dispose()                                      # Dispose of command object
    $query = $null
}

So, it turns out my initial conclusion about the source of the problem was wrong. The PowerShell -join operator had nothing to do with the issue.

In my SQL insert function I was calling $connection.CreateCommand() on every insert. Once I moved that into the function that handles setting up the connection (which is only called once, or when needed), the memory leak disappeared.
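A rough sketch of that arrangement might look like the following. The function names (`Initialize-Db`, `Invoke-BatchInsert`, `Close-Db`) are hypothetical and not taken from the original script, and the sketch assumes MySql.Data.dll is already loaded and `$connectionString` is defined.

```powershell
# Sketch only: assumes MySql.Data.dll is loaded and $connectionString is set.
# All three function names are hypothetical.
function Initialize-Db {
    $script:connection = New-Object MySql.Data.MySqlClient.MySqlConnection($connectionString)
    $script:connection.Open()
    # Create the command object once; every batch reuses it.
    $script:command = $script:connection.CreateCommand()
}

function Invoke-BatchInsert {
    param([string]$Query)
    $script:command.CommandText = $Query                          # Load query into the shared command
    $script:RowsInserted = $script:command.ExecuteNonQuery()      # Execute it
}

function Close-Db {
    # Release the shared command and the connection when the job is done.
    $script:command.Dispose()
    $script:connection.Close()
}
```

This sketch disposes of the command only at the end of the job, which is a design choice of the sketch; the function shown earlier still calls Dispose() after each batch.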
