
Troubles using LOAD DATA LOCAL INFILE to store CSV data into MySQL

UPDATE:

As per the answer suggesting the use of Table::insert, I updated the query.

// At this point the data is stored in an array
// I turn it into a collection to make use of the chunk function
$records = collect($records); 
// I split the array into chunks
// Inserting everything at once gave memory issues
$chunks = $records->chunk(500); 
// Then I iterate the chunks and insert them into the database
foreach ($chunks as $chunk) {
    \DB::table('table')->insert($chunk->toArray()); 
}

The issue with this is that, while faster than the previous method, it still takes several minutes to insert everything into my database, so it's not an actual solution.

I'm not sure whether this is the wrong approach, or whether I'm missing something or doing something wrong here.


Description:

I have an Angular form where the user can upload a .txt file containing CSV data. This is how it works:

HTML:

  <input type="file" class="form-control w-auto" (change)="getFile($event)" accept=".txt" />

TypeScript:

  getFile(e) {
    this.file = e.target.files[0];
    this.file_description = e.target.files[0].name;
  }

  submit() {
    let reader = new FileReader();

    reader.onload = () => {
      let result = reader.result;
      this.ImportService.import(result).subscribe(
        (response) => {
          // Do success stuff
        },
        (_error) => {
          // Do error stuff
        }
      );
    };

    if (this.file) {
      reader.readAsText(this.file, "UTF-16LE");
    }
  }

This sends the data to my Laravel instance, where I want to store it in MySQL. I've been able to do this by parsing the received data into an array and inserting the rows one by one, like this:

PHP:

$array = array();
$csv = str_getcsv($request->file, "\n");
foreach ($csv as &$row) {
    $row = str_getcsv($row, ";");
    $array[] = $row;
}
array_splice($array, 0, 1);

foreach ($array as &$row) {
    $query = Table::firstOrNew(['col2' => $row[1], 'col3' => $row[2]]);
    $query->col1 = $row[0];
    $query->col2 = $row[1];
    $query->col3 = $row[2];
    $query->col4 = $row[3];
    $query->col5 = $row[4];
    // [...]
    $query->col72 = $row[71];
    $query->col73 = $row[72];
    $query->save();
}

The Problem:

The data being sent here contains around 100,000 records. This method is way too slow, often leading to timeouts or waits of 5-10+ minutes.


Attempted Solution:

I've been trying to use LOAD DATA LOCAL INFILE, but I can't get it to work.

Here's the code:

$results = DB::connection()->getpdo()->exec(
    "LOAD DATA LOCAL INFILE '" . $request->file . "' IGNORE INTO TABLE `table`
    FIELDS TERMINATED BY ';'
    LINES TERMINATED BY '\n'
    IGNORE 1 LINES (
        `col1`,
        `col2`,
        `col3`,
        // etc
    )"
);

In short, there are three issues with this method:

  • If the CSV data contains any apostrophes, I get a syntax error.
  • The local_infile property on my MySQL instance is disabled by default, and reverts to disabled on restart.
  • With the above fixed, I get the following error: General error: 7890 Can't find file 'col1;col2;col3;etc'.

I've tried using LOAD DATA INFILE, but I'm getting several 'access denied' errors when the query tries to fetch the file contents.
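
A note on the errors above: "Can't find file 'col1;col2;col3;etc'" shows the CSV header where a file name is expected, which suggests that $request->file holds the file's contents rather than a path. LOAD DATA LOCAL INFILE expects a path readable by the client (here, PHP). Concatenating the contents into the SQL string would also explain the syntax error on apostrophes. Below is a minimal sketch, assuming the payload is the raw CSV text: it writes the text to a temporary file and builds the statement from a path that has already been quoted. The helper names writeCsvToTempFile and buildLoadDataSql are hypothetical, and the table and column names are the placeholders from the question.

```php
<?php
// Sketch only: helper names are hypothetical, not part of Laravel or PDO.

/** Write the uploaded CSV text to a temporary file and return its path. */
function writeCsvToTempFile(string $csvText): string
{
    $path = tempnam(sys_get_temp_dir(), 'csv_');
    if ($path === false || file_put_contents($path, $csvText) === false) {
        throw new RuntimeException('Could not write temporary CSV file');
    }
    return $path;
}

/**
 * Build the LOAD DATA statement.
 * $quotedPath must already be quoted (for example with PDO::quote()),
 * so the CSV contents never become part of the SQL text.
 */
function buildLoadDataSql(string $quotedPath, string $table, array $columns): string
{
    // Wrap each column name in backticks and join them with commas.
    $columnList = implode(', ', array_map(fn ($c) => "`$c`", $columns));

    return "LOAD DATA LOCAL INFILE $quotedPath IGNORE INTO TABLE `$table` "
        . "FIELDS TERMINATED BY ';' "
        . "LINES TERMINATED BY '\\n' "
        . "IGNORE 1 LINES ($columnList)";
}

// Possible usage inside the controller (assumption, not executed here):
// $path = writeCsvToTempFile($request->file);
// $pdo  = DB::connection()->getPdo();
// $pdo->exec(buildLoadDataSql($pdo->quote($path), 'table', ['col1', 'col2', 'col3']));
// unlink($path);
```

For this to run, the PDO connection also has to allow local infile (the PDO::MYSQL_ATTR_LOCAL_INFILE option, which Laravel lets you set in the connection's options array), and the server needs local_infile enabled. Setting local_infile=1 under [mysqld] in the MySQL configuration file should keep it enabled across restarts.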


If you need any further information, please let me know.

Tasks like this are probably better handled in a Queue/Job, so that the user isn't waiting for them to complete in real time.

Either way, I would recommend refactoring the code. firstOrNew() runs queries for every single record you process, so it puts a major strain on the database.

The methods insert() and update() support processing many records at a time, so use PHP to preprocess the records so that you can use those two methods.

Use one database query to get all the existing records, and then build two arrays to pass to insert() and update().

I think you can create a where query like this if you build pairs of the values that you want to search on:

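$existing = Table::where(function ($query) use ($pairs) {
    foreach ($pairs as $pair) {
        // Match any one pair: (col2 = ? AND col3 = ?) OR (col2 = ? AND col3 = ?) ...
        $query->orWhere(function ($q) use ($pair) {
            $q->where('col2', $pair['col2'])->where('col3', $pair['col3']);
        });
    }
})->get();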

That will give you the existing records, so you can then loop through all the records in PHP to build your two arrays and run the insert and update queries only once:

Table::insert($new_records);
Table::update($existing_records);
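
The step of building the two arrays could look roughly like the sketch below. It is plain PHP with no Laravel calls; pairKey and partitionRows are hypothetical helper names, and the rows are assumed to be keyed by column name. One caveat: as far as I know, Eloquent's update() applies a single set of values to every matched row, so the rows in the "existing" array would still need to be applied per record or in batches. If your Laravel version has upsert(), that may be a simpler alternative.

```php
<?php
// Sketch only: pairKey() and partitionRows() are hypothetical helpers.

/** Build a lookup key from the two columns that identify a record. */
function pairKey($col2, $col3): string
{
    return json_encode([(string) $col2, (string) $col3]);
}

/**
 * Split parsed CSV rows into rows to insert and rows to update.
 *
 * @param array $rows         Rows keyed by column name (col1, col2, ...).
 * @param array $existingKeys Map of pairKey() => true for rows already stored.
 * @return array              [rows to insert, rows to update]
 */
function partitionRows(array $rows, array $existingKeys): array
{
    $new = [];
    $existing = [];
    foreach ($rows as $row) {
        $key = pairKey($row['col2'], $row['col3']);
        if (isset($existingKeys[$key])) {
            $existing[] = $row;
        } else {
            $new[] = $row;
            // A repeat of this pair later in the file is treated as an update.
            $existingKeys[$key] = true;
        }
    }
    return [$new, $existing];
}

// Possible usage (assumption): $models is the result of the single lookup query.
// $existingKeys = [];
// foreach ($models as $m) { $existingKeys[pairKey($m->col2, $m->col3)] = true; }
// [$new_records, $existing_records] = partitionRows($rows, $existingKeys);
```

Looking keys up in an array keeps the partitioning linear in the number of rows, instead of scanning the existing records once per CSV row.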
