Load data infile without file in MySQL with PHP

I receive files in a streamed manner once every 30 seconds. The files may have up to 40 columns and 50,000 rows. They are txt files, tab separated. Right now, I'm saving each file temporarily, loading its contents into a temporary table in the database with LOAD DATA INFILE, and deleting the file afterwards.
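For reference, a minimal sketch of that save-and-delete flow, assuming a PDO connection $pdo opened with PDO::MYSQL_ATTR_LOCAL_INFILE enabled and a pre-created temporary table inventory_tmp ($tsvData and all other names here are hypothetical):

    // Write the payload to a temp file, bulk-load it, then remove the file.
    $tmpFile = tempnam(sys_get_temp_dir(), 'inv');
    file_put_contents($tmpFile, $tsvData);

    $pdo->exec(sprintf(
        "LOAD DATA LOCAL INFILE %s INTO TABLE inventory_tmp
         FIELDS TERMINATED BY '\\t' LINES TERMINATED BY '\\n'",
        $pdo->quote($tmpFile)
    ));
    unlink($tmpFile); // delete the file afterwards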

I would like to avoid the save-and-delete process and instead write the data directly to the database. The stream is the $output here:

// Assumed import: OutputInterface looks like Symfony Console's interface
use Symfony\Component\Console\Output\OutputInterface;

protected function run(OutputInterface $output)
{
    $this->readInventoryReport($this->interaction($output));
}

I've been googling around all the time trying to find a "performance is a big issue"-proof answer to this, but I can't find a good way of doing it without saving the data to a file and using LOAD DATA INFILE. I need to have the contents available quickly and to work with them after they are saved to a temporary table (updating other tables with the contents...).

Is there a good way of handling this, or will the file save and delete method together with LOAD DATA INFILE be better than other solutions?

The server I'm running this on has SSDs and 32GB of RAM.

LOAD DATA INFILE is your fastest way to do low-latency ingestion of tonnage of data into MySQL.
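If the intermediate file is the only objection, mysqli can feed LOAD DATA LOCAL INFILE from a PHP callback rather than from the named file, so the payload itself never has to be written out. A sketch under assumptions: the connection details, table name, and $tsvData are hypothetical, a zero-byte placeholder file is still named in the query, and set_local_infile_handler() must be available in your PHP/mysqli build:

    // Supply LOAD DATA LOCAL INFILE's bytes from memory via mysqli's
    // local-infile callback. The placeholder file stays empty; the data
    // comes from $tsvData and never touches disk.
    $db = mysqli_init();
    $db->options(MYSQLI_OPT_LOCAL_INFILE, true);
    $db->real_connect('localhost', 'user', 'pass', 'mydb'); // hypothetical credentials

    $placeholder = tempnam(sys_get_temp_dir(), 'inv'); // never written to

    $offset = 0;
    $db->set_local_infile_handler(function ($fp, &$buffer, $buflen, &$errmsg) use (&$offset, $tsvData) {
        // Hand MySQL up to $buflen bytes per call; returning 0 signals end of data.
        $buffer = (string) substr($tsvData, $offset, $buflen);
        $offset += strlen($buffer);
        return strlen($buffer);
    });

    $db->query(sprintf(
        "LOAD DATA LOCAL INFILE '%s' INTO TABLE inventory_tmp
         FIELDS TERMINATED BY '\\t' LINES TERMINATED BY '\\n'",
        $db->real_escape_string($placeholder)
    ));
    $db->set_local_infile_default(); // restore the built-in file reader
    unlink($placeholder);

A simpler middle ground, given the 32GB of RAM: if /dev/shm (or another tmpfs) is mounted, keep the exact save-and-delete flow but point tempnam() at that RAM-backed path, so the temporary file never reaches the SSDs.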

You can write yourself a PHP program that will, using prepared statements and the like, do a pretty good job of inserting rows into your database. If you arrange to do a COMMIT every couple of hundred rows, use prepared statements, and write your code carefully, it will be fairly fast, but not as fast as LOAD DATA INFILE. Why? Individual row operations have to be serialized onto the network wire, then deserialized, and processed one (or two or ten) at a time. LOAD DATA just slurps up your data locally.
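A minimal sketch of that approach with PDO, batching a COMMIT every 500 rows; the two-column inventory_tmp table and the $rows array (parsed lines of the TSV payload) are hypothetical stand-ins:

    // Prepared-statement inserts, committing every few hundred rows.
    $pdo->beginTransaction();
    $stmt = $pdo->prepare('INSERT INTO inventory_tmp (sku, qty) VALUES (?, ?)');

    $count = 0;
    foreach ($rows as $row) {
        $stmt->execute([$row[0], $row[1]]);
        if (++$count % 500 === 0) {    // COMMIT every couple of hundred rows
            $pdo->commit();
            $pdo->beginTransaction();
        }
    }
    $pdo->commit();                    // flush the final partial batch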

It sounds like you have a nice MySQL server machine. But the serialization is still a bottleneck.

50K records every 30 seconds, eh? That's a lot! Is any of that data redundant? That is, do any of the rows in a later batch of data overwrite rows in an earlier batch? If so, you might be able to write a program that would skip rows that have become obsolete.
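If later batches do supersede earlier rows, MySQL can also do the skipping for you during the load itself: given a primary or unique key, LOAD DATA's REPLACE clause overwrites the stale row instead of accumulating duplicates. A hedged example, assuming sku is a PRIMARY or UNIQUE key on the hypothetical inventory_tmp table:

    // Let MySQL replace superseded rows while bulk-loading the new batch.
    $pdo->exec(sprintf(
        "LOAD DATA LOCAL INFILE %s REPLACE INTO TABLE inventory_tmp
         FIELDS TERMINATED BY '\\t' LINES TERMINATED BY '\\n'",
        $pdo->quote($tmpFile)
    ));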
