How to do long-running batch processes in PHP?

I have a batch process where I need to update my DB table, around 100,000-500,000 rows, from an uploaded CSV file. Normally it takes 20-30 minutes, sometimes longer.

What is the best way to do this? Are there any good practices for it? Any suggestions would be appreciated.

Thanks.

It takes 30 minutes to import 500,000 rows from a CSV?

Have you considered letting MySQL do the hard work? There is LOAD DATA INFILE, which supports dealing with CSV files:

LOAD DATA INFILE 'data.txt' INTO TABLE tbl_name
  FIELDS TERMINATED BY ',' ENCLOSED BY '"'
  LINES TERMINATED BY '\n';
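
For reference, the same statement can be issued straight from PHP. Here is a minimal sketch with mysqli, assuming the CSV sits on the web server (so LOCAL is used) and the server permits it; the connection details, file name, and table name are placeholders:

<?php
// Allow the client-side file read that LOAD DATA LOCAL needs.
$db = mysqli_init();
mysqli_options($db, MYSQLI_OPT_LOCAL_INFILE, true);
mysqli_real_connect($db, 'localhost', 'user', 'pass', 'mydb');

// Hand the whole import to MySQL in a single statement.
$sql = "LOAD DATA LOCAL INFILE 'data.txt' INTO TABLE tbl_name
        FIELDS TERMINATED BY ',' ENCLOSED BY '\"'
        LINES TERMINATED BY '\\n'";

if (!mysqli_query($db, $sql)) {
    die('Import failed: ' . mysqli_error($db));
}
echo mysqli_affected_rows($db) . " rows loaded\n";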

If the file is not quite in the right shape to be imported right into the target table, you can either use PHP to transform it beforehand, or LOAD it into a "staging" table and let MySQL handle the necessary transformation, whichever is faster and more convenient.

As an additional option, there seems to be a possibility to run MySQL queries asynchronously through the MySQL Native Driver for PHP (mysqlnd). Maybe you can explore that option as well. It would enable you to retain snappy UI performance.
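
For example, mysqli on top of mysqlnd can fire a query with the MYSQLI_ASYNC flag and poll for the result later. A rough sketch; the connection details and the statement itself are placeholders:

<?php
$db = new mysqli('localhost', 'user', 'pass', 'mydb');

// Fire the long-running statement without blocking the script.
$db->query('INSERT INTO target_table SELECT * FROM staging_table', MYSQLI_ASYNC);

// Do other work here, then poll until MySQL is done.
do {
    $links = $errors = $rejects = [$db];
    $ready = mysqli_poll($links, $errors, $rejects, 1); // wait up to 1 second
} while ($ready === 0);

if ($db->reap_async_query()) {
    echo 'done, ' . $db->affected_rows . " rows affected\n";
}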

If you're doing a lot of inserts, are you doing bulk inserts? I.e. like this:

INSERT INTO table (col1, col2) VALUES (val1a, val2a), (val1b, val2b), ...

That will dramatically speed up inserts.
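
A sketch of how such statements could be assembled from the uploaded CSV in PHP; the batch size of 500 and the table/column names are illustrative:

<?php
$db = new mysqli('localhost', 'user', 'pass', 'mydb');
$fh = fopen('upload.csv', 'r');
$batch = [];

while (($row = fgetcsv($fh)) !== false) {
    // Escape each value and collect the row as one VALUES tuple.
    $vals = array_map(function ($v) use ($db) {
        return "'" . $db->real_escape_string($v) . "'";
    }, $row);
    $batch[] = '(' . implode(',', $vals) . ')';

    if (count($batch) === 500) { // flush every 500 rows
        $db->query('INSERT INTO tbl_name (col1, col2) VALUES ' . implode(',', $batch));
        $batch = [];
    }
}
if ($batch) { // flush the remainder
    $db->query('INSERT INTO tbl_name (col1, col2) VALUES ' . implode(',', $batch));
}
fclose($fh);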

Another thing you can do is disable indexing while you make the changes, then let it rebuild the indexes in one go when you're finished.
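
In MySQL that is typically ALTER TABLE ... DISABLE KEYS before the import and ENABLE KEYS after. Note this only defers non-unique index maintenance and is effective for MyISAM tables; InnoDB ignores it. A sketch, reusing a $db connection like the ones above:

<?php
// Defer non-unique index maintenance during the import.
$db->query('ALTER TABLE tbl_name DISABLE KEYS');

// ... run the bulk inserts here ...

// Rebuild the indexes in one pass when finished.
$db->query('ALTER TABLE tbl_name ENABLE KEYS');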

If you give a bit more detail about what you're doing, you might get more ideas.

PEAR has a package called Benchmark which has a Benchmark_Profiler class that can help you find the slowest sections of your code so that you can optimize.
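
Usage looks roughly like this, based on the PEAR documentation (treat the exact API as something to verify against the package docs):

<?php
require_once 'Benchmark/Profiler.php';

$profiler = new Benchmark_Profiler(true); // true = start automatically

$profiler->enterSection('csv_parsing');
// ... parse the CSV ...
$profiler->leaveSection('csv_parsing');

$profiler->enterSection('db_inserts');
// ... run the inserts ...
$profiler->leaveSection('db_inserts');

$profiler->display(); // prints the timing per section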

We had a feature like that in a big application. We had the issue of inserting millions of rows from a CSV into a table with 9 indexes. After lots of refactoring, we found the ideal way to insert the data was to load it into a [temporary] table with the MySQL LOAD DATA INFILE command, do the transformations there, and copy the result with multiple insert queries into the actual table (INSERT INTO ... SELECT FROM), processing only 50k lines or so with each query (which performed better than issuing a single insert, but YMMV).
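
A sketch of that chunked copy step, assuming the staging table has an auto-increment id column to slice on (all table and column names here are illustrative):

<?php
$chunk = 50000;
$row = $db->query('SELECT MAX(id) FROM staging_table')->fetch_row();
$max = (int) $row[0];

// Copy the transformed rows over in 50k-row slices.
for ($start = 0; $start < $max; $start += $chunk) {
    $db->query(
        'INSERT INTO real_table (col1, col2)
         SELECT col1, col2 FROM staging_table
         WHERE id > ' . $start . ' AND id <= ' . ($start + $chunk)
    );
}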

I can't do it with cron, because this is under user control. A user clicks the process button and can later check the logs to see the process status.

When the user presses said button, set a flag in a table in the database. Then have your cron job check for this flag. If it's there, start processing; otherwise don't. If applicable, you could use the same table to post some kind of status update (e.g. xx% done), so the user has some feedback about the progress.
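
A minimal sketch of that pattern; the import_jobs table and its columns are assumptions. The cron-invoked script claims a pending job, processes it, and writes progress back for the UI to read:

<?php
$db = new mysqli('localhost', 'user', 'pass', 'mydb');

// Claim the oldest pending job, if any (the flag set by the button click).
$res = $db->query("SELECT id, filename FROM import_jobs
                   WHERE status = 'pending' ORDER BY id LIMIT 1");
if (!($job = $res->fetch_assoc())) {
    exit; // nothing to do on this cron run
}
$db->query("UPDATE import_jobs SET status = 'running' WHERE id = {$job['id']}");

// ... process the CSV in chunks, updating progress as each chunk finishes ...
// e.g. $db->query("UPDATE import_jobs SET progress = $pct WHERE id = {$job['id']}");

$db->query("UPDATE import_jobs SET status = 'done' WHERE id = {$job['id']}");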
