
MySQL query performs very slowly

I have developed a user bulk upload module. There are two situations: when I do a bulk upload of 20,000 records into a database with zero records, it takes about 5 hours. But when the database already has about 30,000 records, the upload is very, very slow; it takes about 11 hours to upload 20,000 records. I am just reading a CSV file via the fgetcsv method.

if (($handle = fopen($filePath, "r")) !== FALSE) {
    while (($peopleData = fgetcsv($handle, 10240, ",")) !== FALSE) {
        if (count($peopleData) == $fieldsCount) {
            // Inside, I check whether the user already exists (firstName & lastName & DOB).
            // If not, I check whether the email exists; if it does, I update the record.
            // Otherwise, I insert a new record.
        }
    }
    fclose($handle);
}

Below are the queries that run (I am using the Yii framework):

SELECT * 
FROM `AdvanceBulkInsert` `t` 
WHERE renameSource='24851_bulk_people_2016-02-25_LE CARVALHO 1.zip.csv' 
LIMIT 1

SELECT cf.*, ctyp.typeName, cfv.id as customId, cfv.customFieldId, 
       cfv.relatedId, cfv.fieldValue, cfv.createdAt 
FROM `CustomField` `cf` 
    INNER JOIN CustomType ctyp on ctyp.id = cf.customTypeId 
    LEFT OUTER JOIN CustomValue cfv on cf.id = cfv.customFieldId 
                and relatedId = 0 
    LEFT JOIN CustomFieldSubArea cfsa on cfsa.customFieldId = cf.id 
WHERE ((relatedTable = 'people' and enabled = '1') 
  AND (onCreate = '1')) 
  AND (cfsa.subarea='peoplebulkinsert') 
ORDER BY cf.sortOrder, cf.label

SELECT * 
FROM `User` `t` 
WHERE `t`.`firstName`='Franck' 
  AND `t`.`lastName`='ALLEGAERT ' 
  AND `t`.`dateOfBirth`='1971-07-29' 
  AND (userType NOT IN ("1")) 
LIMIT 1

If the user exists, update the record:

UPDATE `User` SET `id`='51394', `address1`='49 GRANDE RUE', 
                  `mobile`='', `name`=NULL, `firstName`='Franck', 
                  `lastName`='ALLEGAERT ', `username`=NULL, 
                  `password`=NULL, `email`=NULL, `gender`=0, 
                  `zip`='60310', `countryCode`='DZ', 
                  `joinedDate`='2016-02-23 10:44:18', 
                  `signUpDate`='0000-00-00 00:00:00', 
                  `supporterDate`='2016-02-25 13:26:37', `userType`=3, 
                  `signup`=0, `isSysUser`=0, `dateOfBirth`='1971-07-29', 
                  `reqruiteCount`=0, `keywords`='70,71,72,73,74,75', 
                  `delStatus`=0, `city`='AMY', `isUnsubEmail`=0, 
                  `isManual`=1, `isSignupConfirmed`=0, `profImage`=NULL, 
                  `totalDonations`=NULL, `isMcContact`=NULL, 
                  `emailStatus`=NULL, `notes`=NULL, 
                  `addressInvalidatedAt`=NULL, 
                  `createdAt`='2016-02-23 10:44:18', 
                  `updatedAt`='2016-02-25 13:26:37', `longLat`=NULL 
WHERE `User`.`id`='51394'

If the user doesn't exist, insert a new record.

The table engine type is MyISAM. Only the email column has an index.

How can I optimize this to reduce the processing time?

Query 2 took 0.4701 seconds, which means that for 30,000 records it will take 14,103 seconds, which is about 235 minutes, approx. 4 hours.

Update

CREATE TABLE IF NOT EXISTS `User` (
  `id` bigint(20) NOT NULL,
  `address1` text COLLATE utf8_unicode_ci,
  `mobile` varchar(15) COLLATE utf8_unicode_ci DEFAULT NULL,
  `name` varchar(45) COLLATE utf8_unicode_ci DEFAULT NULL,
  `firstName` varchar(64) COLLATE utf8_unicode_ci DEFAULT NULL,
  `lastName` varchar(64) COLLATE utf8_unicode_ci DEFAULT NULL,
  `username` varchar(20) COLLATE utf8_unicode_ci DEFAULT NULL,
  `password` varchar(45) COLLATE utf8_unicode_ci DEFAULT NULL,
  `email` varchar(45) COLLATE utf8_unicode_ci DEFAULT NULL,
  `gender` tinyint(2) NOT NULL DEFAULT '0' COMMENT '1 - female, 2-male, 0 - unknown',
  `zip` varchar(15) COLLATE utf8_unicode_ci DEFAULT NULL,
  `countryCode` varchar(3) COLLATE utf8_unicode_ci DEFAULT NULL,
  `joinedDate` datetime DEFAULT NULL,
  `signUpDate` datetime NOT NULL COMMENT 'User signed up date',
  `supporterDate` datetime NOT NULL COMMENT 'Date which user get supporter',
  `userType` tinyint(2) NOT NULL,
  `signup` tinyint(2) NOT NULL DEFAULT '0' COMMENT 'whether user followed signup process 1 - signup, 0 - not signup',
  `isSysUser` tinyint(1) NOT NULL DEFAULT '0' COMMENT '1 - system user, 0 - not a system user',
  `dateOfBirth` date DEFAULT NULL COMMENT 'User date of birth',
  `reqruiteCount` int(11) DEFAULT '0' COMMENT 'User count that he has reqruited',
  `keywords` text COLLATE utf8_unicode_ci COMMENT 'Kewords',
  `delStatus` tinyint(2) NOT NULL DEFAULT '0' COMMENT '0 - active, 1 - deleted',
  `city` varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
  `isUnsubEmail` tinyint(1) NOT NULL DEFAULT '0' COMMENT '0 - ok, 1 - Unsubscribed form email',
  `isManual` tinyint(1) NOT NULL DEFAULT '0' COMMENT '0 - ok, 1 - Manualy add',
  `longLat` varchar(45) COLLATE utf8_unicode_ci DEFAULT NULL COMMENT 'Longitude and Latitude',
  `isSignupConfirmed` tinyint(4) NOT NULL DEFAULT '0' COMMENT 'Whether user has confirmed signup ',
  `profImage` tinytext COLLATE utf8_unicode_ci COMMENT 'Profile image name or URL',
  `totalDonations` float DEFAULT NULL COMMENT 'Total donations made by the user',
  `isMcContact` tinyint(1) DEFAULT NULL COMMENT '1 - Mailchimp contact',
  `emailStatus` tinyint(2) DEFAULT NULL COMMENT '1-bounced, 2-blocked',
  `notes` text COLLATE utf8_unicode_ci,
  `addressInvalidatedAt` datetime DEFAULT NULL,
  `createdAt` datetime NOT NULL,
  `updatedAt` datetime DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

CREATE TABLE IF NOT EXISTS `AdvanceBulkInsert` (
  `id` int(11) NOT NULL,
  `source` varchar(256) NOT NULL,
  `renameSource` varchar(256) DEFAULT NULL,
  `countryCode` varchar(3) NOT NULL,
  `userType` tinyint(2) NOT NULL,
  `size` varchar(128) NOT NULL,
  `errors` varchar(512) NOT NULL,
  `status` char(1) NOT NULL COMMENT '1:Queued, 2:In Progress, 3:Error, 4:Finished, 5:Cancel',
  `createdAt` datetime NOT NULL,
  `createdBy` int(11) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

CREATE TABLE IF NOT EXISTS `CustomField` (
  `id` int(11) NOT NULL,
  `customTypeId` int(11) NOT NULL,
  `fieldName` varchar(64) COLLATE utf8_unicode_ci DEFAULT NULL,
  `relatedTable` varchar(64) COLLATE utf8_unicode_ci DEFAULT NULL,
  `defaultValue` text COLLATE utf8_unicode_ci,
  `sortOrder` int(11) NOT NULL DEFAULT '0',
  `enabled` char(1) COLLATE utf8_unicode_ci DEFAULT '1',
  `listItemTag` char(1) COLLATE utf8_unicode_ci DEFAULT NULL,
  `required` char(1) COLLATE utf8_unicode_ci DEFAULT '0',
  `onCreate` char(1) COLLATE utf8_unicode_ci DEFAULT '1',
  `onEdit` char(1) COLLATE utf8_unicode_ci DEFAULT '1',
  `onView` char(1) COLLATE utf8_unicode_ci DEFAULT '1',
  `listValues` text COLLATE utf8_unicode_ci,
  `label` varchar(64) COLLATE utf8_unicode_ci DEFAULT NULL,
  `htmlOptions` text COLLATE utf8_unicode_ci
) ENGINE=MyISAM AUTO_INCREMENT=12 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

CREATE TABLE IF NOT EXISTS `CustomFieldSubArea` (
  `id` int(11) NOT NULL,
  `customFieldId` int(11) NOT NULL,
  `subarea` varchar(256) COLLATE utf8_unicode_ci NOT NULL
) ENGINE=MyISAM AUTO_INCREMENT=43 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

CREATE TABLE IF NOT EXISTS `CustomValue` (
  `id` int(11) NOT NULL,
  `customFieldId` int(11) NOT NULL,
  `relatedId` int(11) NOT NULL,
  `fieldValue` text COLLATE utf8_unicode_ci,
  `createdAt` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=MyISAM AUTO_INCREMENT=86866 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

The entire PHP code is here: http://pastie.org/10737962

Update 2

EXPLAIN output of the query:

[image: screenshot of the EXPLAIN output]

Indexes are your friend.

UPDATE User ... WHERE id = ... desperately needs an index on id, probably PRIMARY KEY.

Similarly for renameSource.

SELECT * 
FROM `User` `t` 
WHERE `t`.`firstName`='Franck' 
  AND `t`.`lastName`='ALLEGAERT ' 
  AND `t`.`dateOfBirth`='1971-07-29' 
  AND (userType NOT IN ("1")) 
LIMIT 1;

Needs INDEX(firstName, lastName, dateOfBirth); the fields can be in any order (in this case).

Look at each query to see what it needs, then add that INDEX to the table. Read my Cookbook on building indexes.
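
A sketch of what that could look like for the two tables above (illustrative only; the index names are arbitrary, so verify with EXPLAIN that they actually get picked up):

ALTER TABLE `User`
  ADD PRIMARY KEY (`id`),
  ADD INDEX `idx_name_dob` (`firstName`, `lastName`, `dateOfBirth`);

ALTER TABLE `AdvanceBulkInsert`
  ADD PRIMARY KEY (`id`),
  ADD INDEX `idx_renameSource` (`renameSource`);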

Try these things to increase your query performance:

  • Define indexes in your database structure, and fetch only the columns you need.
  • Do not use * in SELECT queries.
  • Do not put ids in quotes like User.id='51394'; write User.id=51394 instead.
  • If you pass ids in quotes, the index may not be used efficiently. Dropping the quotes can make such queries noticeably faster, around 20% in my experience.
  • If you are using ENGINE=MyISAM you cannot define foreign-key relationships between your tables; change the engine to ENGINE=InnoDB and then create foreign keys and (on MySQL 5.6+) full-text indexes. A sketch of the engine change follows this list.
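
A minimal sketch of the engine change (one statement per table; note that converting a large table rewrites its data file and can take some time):

ALTER TABLE `User` ENGINE=InnoDB;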

If I understand correctly, for every result of SELECT * FROM AdvanceBulkInsert ... you run a SELECT cf.* request, and for every SELECT cf.* you run the SELECT * FROM User.

I think the issue is that you send far too many requests to the database.

I think you should merge all your select requests into one big request.

For that:
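
A rough sketch of the idea (hypothetical; it batches the per-row User lookups into a single query using a row-constructor IN list, with illustrative values):

SELECT id, firstName, lastName, dateOfBirth
FROM `User`
WHERE (firstName, lastName, dateOfBirth) IN (
    ('Franck', 'ALLEGAERT ', '1971-07-29'),
    ('Jane', 'DOE', '1980-01-01')
)
AND userType NOT IN (1);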

Then you call the update on all the results of the merged select.

You should also time your requests one by one to find which of them takes the most time, and you should use EXPLAIN/ANALYZE to find which part of the request takes the time.

Edit:

Now I have seen your code:

Some leads:

  • Do you have indexes for cf.customTypeId, cfv.customFieldId, cfsa.customFieldId, user.dateOfBirth, user.firstName and user.lastName?

  • You don't need a LEFT JOIN CustomFieldSubArea if you have a WHERE clause that uses CustomFieldSubArea; a simple JOIN CustomFieldSubArea is enough.

  • You will run query 2 many times with relatedId = 0; maybe you can save the result in a variable?

  • If you don't need sorted data, remove the "ORDER BY cf.sortOrder, cf.label". Otherwise, add an index on cf.sortOrder, cf.label.

When you need to find out why a query takes long, you need to inspect its individual parts. As you showed in the question, the EXPLAIN statement can help you very much. Usually the most important columns are:

  • select_type - this should always be a simple query/subquery. Correlated subqueries cause a lot of trouble. Luckily you don't use any.
  • possible_keys - which keys this select is going to search by.
  • rows - how many candidate rows are determined by the keys/cache and other techniques. A smaller number is better.
  • Extra - "Using ..." tells you how exactly the rows are found; this is the most useful information.

Query analysis

I would have posted the analysis for the 1st and 3rd queries, but they are both quite simple. Here is the breakdown of the query that gives you trouble:

EXPLAIN SELECT cf.*, ctyp.typeName, cfv.id as customId, cfv.customFieldId, 
   cfv.relatedId, cfv.fieldValue, cfv.createdAt 
FROM `CustomField` `cf` 
    INNER JOIN CustomType ctyp on ctyp.id = cf.customTypeId 
    LEFT OUTER JOIN CustomValue cfv on cf.id = cfv.customFieldId 
                and relatedId = 0 
    LEFT JOIN CustomFieldSubArea cfsa on cfsa.customFieldId = cf.id 
WHERE ((relatedTable = 'people' and enabled = '1') 
  AND (onCreate = '1')) 
  AND (cfsa.subarea='peoplebulkinsert') 
ORDER BY cf.sortOrder, cf.label
  • INNER JOIN CustomType ctyp on ctyp.id = cf.customTypeId
  • LEFT OUTER JOIN CustomValue cfv on cf.id = cfv.customFieldId and relatedId = 0
  • LEFT JOIN CustomFieldSubArea cfsa on cfsa.customFieldId = cf.id
  • WHERE ((relatedTable = 'people' and enabled = '1') AND (onCreate = '1')) AND (cfsa.subarea = 'peoplebulkinsert')
  • ORDER BY cf.sortOrder, cf.label

Solution

Let me explain the list above. The columns used in the JOIN conditions absolutely must have an index. Joining tables is an expensive operation that otherwise has to go through all rows of both tables. If you create indexes on the joined columns, the DB engine can find a much faster and better way to do it. This should be common practice for any database.

The columns used in the WHERE and ORDER BY clauses are not strictly required to have an index, but if you have a large number of rows (20,000 is a large amount) you should also have indexes on the columns you use for searching. It might not have as big an impact on processing speed, but it is worth the extra bit of time.

So you need to add indexes to these columns (a DDL sketch follows the list):

  • CustomType - id
  • CustomField - customTypeId, id, relatedTable, enabled, onCreate, sortOrder, label
  • CustomValue - customFieldId
  • CustomFieldSubArea - customFieldId, subarea
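
A sketch of the corresponding DDL (assuming the schemas above; CustomType's definition was not posted, so its PRIMARY KEY line is an assumption, and the index names and prefix length are arbitrary):

ALTER TABLE `CustomType` ADD PRIMARY KEY (`id`);

ALTER TABLE `CustomField`
  ADD PRIMARY KEY (`id`),
  ADD INDEX `idx_customTypeId` (`customTypeId`),
  ADD INDEX `idx_filter` (`relatedTable`, `enabled`, `onCreate`),
  ADD INDEX `idx_sort` (`sortOrder`, `label`);

ALTER TABLE `CustomValue` ADD INDEX `idx_customFieldId` (`customFieldId`);

ALTER TABLE `CustomFieldSubArea` ADD INDEX `idx_cf_subarea` (`customFieldId`, `subarea`(64));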

To verify the results, try running the EXPLAIN statement again after adding the indexes (and possibly a few other select/insert/update queries). The Extra column should say something like "Using index", and the possible_keys column should list the used keys (possibly two or more per join query).

Side note: you have some typos in your code; you should fix them in case someone else needs to work on your code too: "reqruiteCount" as a table column and "fileUplaod" as an array index in the code you referenced.

For my work, I have to load one CSV with 524 columns and 10k records daily. When I tried to parse it and add the records with PHP, it was horrible.

So, I propose that you look at the documentation for LOAD DATA LOCAL INFILE.

I copy/paste my own code as an example; adapt it to your needs.

$dataload = 'LOAD DATA LOCAL INFILE "'.$filename.'"
                REPLACE
                INTO TABLE '.$this->csvTable.' CHARACTER SET "utf8"
                FIELDS TERMINATED BY "\t"
                IGNORE 1 LINES
            ';

$result = (bool)$this->db->query($dataload);

Where $filename is a local path to your CSV (you can use dirname(__FILE__) to get it).

This SQL command runs very quickly (just 1 or 2 seconds to add/update the entire CSV).

EDIT: read the doc, but of course you need a unique index on your user table for REPLACE to work. That way, you don't need to check whether the user exists or not, and you don't need to parse the CSV file with PHP.
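
For instance, a hypothetical unique key matching the duplicate check from the question (choose whichever columns actually identify a user in your data):

ALTER TABLE `User`
  ADD UNIQUE KEY `uniq_person` (`firstName`, `lastName`, `dateOfBirth`);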

You appear to have the possibility (probability?) of 3 queries for every single record. Those 3 queries require 3 round trips to the database (and if you are using Yii to store the records in Yii objects, that might slow things down even more).

Can you add a unique key on first name / last name / DOB, and one on email address?

If so, you can just do INSERT ... ON DUPLICATE KEY UPDATE. This reduces it to a single query for each record, greatly speeding things up.

But the big advantage of this syntax is that you can insert/update many records at once (I normally stick to about 250), which means even fewer trips to the database.
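
A minimal sketch of such a batched statement (it assumes the unique key suggested above and an AUTO_INCREMENT id; the columns and values are illustrative):

INSERT INTO `User` (`firstName`, `lastName`, `dateOfBirth`, `city`, `createdAt`)
VALUES
  ('Franck', 'ALLEGAERT ', '1971-07-29', 'AMY', NOW()),
  ('Jane', 'DOE', '1980-01-01', 'PARIS', NOW())
ON DUPLICATE KEY UPDATE
  `city` = VALUES(`city`),
  `updatedAt` = NOW();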

You can knock up a class that you just pass records to, and which does the insert when the record count hits your chosen batch size. Also add a call in the destructor to insert any final records.

Another option is to read everything into a temp table and then use that as a source to join to your user table for the updates/inserts. This requires a bit of effort with the indexes, but a bulk load into a temp table is quick, and updates from it with useful indexes will be fast. Using it as a source for the inserts should also be fast (if you exclude the records already updated).
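
A sketch of the temp-table approach (the staging table name, file path and column choices are hypothetical):

CREATE TEMPORARY TABLE `UserStage` LIKE `User`;

LOAD DATA LOCAL INFILE '/path/to/people.csv'
INTO TABLE `UserStage`
FIELDS TERMINATED BY ','
IGNORE 1 LINES;

UPDATE `User` u
JOIN `UserStage` s
  ON  u.firstName   = s.firstName
  AND u.lastName    = s.lastName
  AND u.dateOfBirth = s.dateOfBirth
SET u.address1 = s.address1,
    u.city     = s.city;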

The other issue appears to be the following query, though I'm not sure where you execute it. It appears to only need to be executed once, in which case it might not matter too much. You haven't given the structure of the CustomType table, but it is joined to CustomField and the field customTypeId has no index, hence that join will be slow. Similarly for the CustomValue and CustomFieldSubArea joins, which join on customFieldId; neither has an index on this field (hopefully a unique index, because if those fields are not unique you will get a LOT of records returned - one row for every possible combination).

SELECT cf.*, ctyp.typeName, cfv.id as customId, cfv.customFieldId, 
       cfv.relatedId, cfv.fieldValue, cfv.createdAt 
FROM `CustomField` `cf` 
    INNER JOIN CustomType ctyp on ctyp.id = cf.customTypeId 
    LEFT OUTER JOIN CustomValue cfv on cf.id = cfv.customFieldId 
                and relatedId = 0 
    LEFT JOIN CustomFieldSubArea cfsa on cfsa.customFieldId = cf.id 
WHERE ((relatedTable = 'people' and enabled = '1') 
  AND (onCreate = '1')) 
  AND (cfsa.subarea='peoplebulkinsert') 
ORDER BY cf.sortOrder, cf.label

Looking at it, you could try to reduce the number of queries, check the timings with an online SQL compiler, and then include the result in your project.

Always do bulk importing within a transaction

        $transaction = Yii::app()->db->beginTransaction();
        $curRow = 0;
        try
        {
            while (($peopleData = fgetcsv($handle, 10240, ",")) !== FALSE) {
                $curRow++;
                //process $peopleData
                //insert row
                //best to use INSERT ... ON DUPLICATE KEY UPDATE
                // a = 1
                // b = 2;
                if ($curRow % 5000 == 0) {
                    //commit the batch and open a fresh transaction
                    $transaction->commit();
                    $transaction = Yii::app()->db->beginTransaction();
                }
            }
            //don't forget the remainder
            $transaction->commit();
        }
        catch (Exception $ex)
        {
            $transaction->rollBack();
            $result = $ex->getMessage();
        }

I have seen import routines sped up 500% by simply using this technique. I have also seen an import process that ran 600 queries (a mixture of select, insert, update and show table structure) for each row. This technique sped that process up by 30%.

粤ICP备18138465号  © 2020-2024 STACKOOM.COM