
Python: Read/write large number of small files

I have a large number (about 60K) of small files (<10 KB) that I need to process: create a processed version of each file, then upload the processed files to an FTP server, all in under 1 minute.
I'm using Python 3 with the standard read/write functions, and the performance is really bad.
Do you know whether Python has any library that supports this case? Or what can I do to get better performance?
Thank you so much!

Remember that I/O operations are expensive: every file read or write carries overhead.

Writing 1024 new files of 1 KB each takes much longer than writing a single new 1 MB file. For every file you create, the system needs to find a place for it on the file system and write its metadata.
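The per-file overhead is easy to measure directly. The sketch below (standard library only; the file and directory names are illustrative) writes the same total amount of data once as 1024 small files and once as a single file, and times both:

```python
import shutil
import time
from pathlib import Path

DATA = b"x" * 1024  # 1 KB payload

def write_many(root: Path, n: int = 1024) -> float:
    """Write n separate 1 KB files; every file pays its own metadata cost."""
    root.mkdir(exist_ok=True)
    start = time.perf_counter()
    for i in range(n):
        (root / f"part_{i}.bin").write_bytes(DATA)
    return time.perf_counter() - start

def write_one(path: Path, n: int = 1024) -> float:
    """Write the same n KB into one file: one open(), one metadata entry."""
    start = time.perf_counter()
    with path.open("wb") as f:
        for _ in range(n):
            f.write(DATA)
    return time.perf_counter() - start

t_many = write_many(Path("many_files"))
t_one = write_one(Path("one_file.bin"))
print(f"1024 x 1 KB files: {t_many:.3f}s   single 1 MB file: {t_one:.3f}s")

# Clean up the temporary files.
shutil.rmtree("many_files")
Path("one_file.bin").unlink()
```

On most file systems the many-files version is noticeably slower, since each file needs its own directory entry and metadata write; the exact ratio depends on the OS, file system, and disk cache.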

So no matter what you plan, your performance probably won't be great, but I do suggest making sure you perform the minimum number of operations (e.g. upload all the files over a single connection to the FTP server, instead of opening a new connection for each file and uploading it by itself).
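The single-connection idea can be sketched with the standard library's ftplib; the host, credentials, and source directory below are placeholders, not values from the question:

```python
from ftplib import FTP
from pathlib import Path

def upload_batch(host: str, user: str, password: str, src_dir: Path) -> int:
    """Upload every file in src_dir over ONE FTP connection.

    Opening the connection once and reusing it for all transfers avoids
    paying the connect/login cost per file. Returns the number of files sent.
    """
    sent = 0
    with FTP(host) as ftp:  # single connection for the whole batch
        ftp.login(user, password)
        for path in sorted(src_dir.iterdir()):
            if path.is_file():
                with path.open("rb") as f:
                    ftp.storbinary(f"STOR {path.name}", f)
                sent += 1
    return sent

# Example call (placeholder credentials):
# upload_batch("ftp.example.com", "user", "secret", Path("processed"))
```

With ~60K files the login handshake alone, repeated per file, would dominate the minute budget, so reusing one session is the first thing to check.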
I suggest you post a snippet of your code so people can help you make sure it performs as few operations as possible.

Disclaimer: the technical posts on this site are licensed under CC BY-SA 4.0. If you need to republish, please credit this site or link to the original source. For any questions contact: yoyou2525@163.com.

 