
Import CSV file in chunks with CakePHP

I'm trying to import a large CSV file, with about 23,000 rows, into my MySQL database. I can't import all rows at once; that just doesn't work. So I was wondering whether it's possible to read the file in chunks, even while I'm using CakePHP transactions. This is my code so far:

// Get data source for transactions
$dataSource = $this->FeedImport->Product->getDataSource();

try{
    //Start Transactions
    $dataSource->begin();

    // Create an empty array for the CSV data
    $data = array();
    $i = 0;

    // read each data row in the file
    while (($row = fgetcsv($handle)) !== false) {
        // for each header field
        foreach ($header as $k=>$head) {
            // Remove any special characters from $head
            $head = preg_replace('/[^A-Za-z0-9\-]/', '', $head);
            if(array_key_exists($head, $this->fields)){
                //Check the row contains an image, if so, download
                if(preg_match('/\.(?:jpe?g|png|gif)$/i', $row[$k])){
                    foreach($this->fields[$head] as $table => $field){
                        $imageFileName = uniqid($supplier.'_');
                        // pathinfo() avoids the E_STRICT warning from end(explode(...))
                        $data[$i][$table][][$field] = $imageFileName.'.'.pathinfo($row[$k], PATHINFO_EXTENSION);
                        $this->__importImg($row[$k]);
                    }
                }else{
                    foreach($this->fields[$head] as $table => $field){
                        if($table == 'Term'){
                            if(isset($row[$k]) && !$this->FeedImport->Product->Term->find('first', array('conditions' => array('Term.name' => $row[$k])))){
                                if(!$this->FeedImport->Product->Term->save(
                                    array(
                                        'name' => $row[$k]
                                    )
                                )){
                                    // the stray semicolon made this an empty statement before
                                    throw new Exception('Could not save term: '.$row[$k]);
                                }
                            }
                            if(isset($row[$k])) $term = $this->FeedImport->Product->Term->find('first', array('conditions' => array('Term.name' => $row[$k])));
                            $data[$i][$table][$table][$field] = (isset($term['Term']['term_id'])) ? $term['Term']['term_id'] : '';
                        }else{
                            $data[$i][$table][$field] = (isset($row[$k])) ? $row[$k] : '';
                        }
                    }
                }
            }
        }

        $data[$i]['Product']['product_id_supplier'] = $data[$i]['Product']['slug'];
        $data[$i]['Product']['supplier_id'] = $supplier;
        $data[$i]['Product']['feedimport_id'] = 1;

        $i++;
    }

    // save all rows in a single call
    if (!$this->FeedImport->Product->saveAll($data)) {
        throw new Exception();
    }

    // commit only if every row saved successfully
    $dataSource->commit();
} catch(Exception $e) {
    $dataSource->rollback();
}

I've put the code above in a separate function, so I can give a start line and end line for the while loop. But that's where I got stuck: I don't know how to set a start and end row using fgetcsv. Can someone help me out here?

I've tried using fseek and such, but I just can't get it done... Can someone help me out here?
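One way that idea can be sketched (a rough sketch only; `readChunk` and its signature are my own invention, not CakePHP API): instead of counting lines, record the byte offset with `ftell()` after each chunk and pass it back to `fseek()` on the next call, so every chunk resumes exactly where the previous one stopped:

```php
<?php
// Sketch of chunked CSV reading by byte offset (assumption: a plain
// helper function, not part of CakePHP). Returns the rows read plus
// the offset where the next chunk should resume; callers stop when
// an empty row set comes back.
function readChunk($path, $offset, $limit) {
    $handle = fopen($path, 'r');
    fseek($handle, $offset);          // resume where the last chunk stopped
    $rows = array();
    while (count($rows) < $limit && ($row = fgetcsv($handle)) !== false) {
        $rows[] = $row;
    }
    $next = ftell($handle);           // byte offset for the next call
    fclose($handle);
    return array($rows, $next);
}
```

Because each call opens the file fresh, this also works across separate requests, e.g. when each chunk is processed by its own background job.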

I considered using LOAD DATA INFILE to import those big product feeds, but I don't think that's going to work nicely, because I'm using multiple joining tables and some exceptions for importing data into several tables... so that's too bad.

Using PHP 5.5 you can leverage the new generator feature. See http://mark-story.com/posts/view/php-generators-a-useful-example
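The generator idea can be sketched like this (`csvChunks` is my own name, not taken from the linked post): yield the file in fixed-size chunks, so each chunk can go through `saveAll()` in its own transaction while memory never holds more than one chunk:

```php
<?php
// Sketch (assumes PHP 5.5+ for generators): lazily yield the CSV
// in fixed-size chunks of rows instead of loading all 23,000 rows
// into one array.
function csvChunks($handle, $chunkSize = 5000) {
    $chunk = array();
    while (($row = fgetcsv($handle)) !== false) {
        $chunk[] = $row;
        if (count($chunk) === $chunkSize) {
            yield $chunk;
            $chunk = array();
        }
    }
    if ($chunk) {
        yield $chunk;   // don't drop the final partial chunk
    }
}
```

The caller then becomes a simple `foreach (csvChunks($handle, 5000) as $rows)` with begin/saveAll/commit per iteration.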

A possible workaround would be the following:

while (($row = fgetcsv($handle)) !== false) {
    if ($i === 5000){
        try {
            if (!$this->FeedImport->Product->saveAll($data)) {
                throw new Exception();
            }
            // commit this chunk and open a transaction for the next one
            $dataSource->commit();
            $dataSource->begin();
        } catch (Exception $e){
            $dataSource->rollback();
        }

        $i = 0;
        $data = array();
    }

    // build $data[$i] for this row here, then increment $i
}

Every 5000 records it commits the data, resets the counter and the data array, and continues. Note that any rows still left in $data when the loop ends need one final saveAll() after the loop.
