
.NET File.Copy very slow when copying many small files (not over network)

I'm making a simple folder sync backup tool for myself and ran into quite a roadblock using File.Copy. In tests copying a folder of ~44,000 small files (Windows Mail folders) to another drive in my system, File.Copy was over 3x slower than running xcopy from the command line on the same files and folders. My C# version takes over 16 minutes to copy the files, whereas xcopy takes only 5 minutes. I've tried searching for help on this topic, but all I find is people complaining about slow copying of large files over a network. This is neither a large-file problem nor a network-copy problem.

I found an interesting article about a better File.Copy replacement, but the code as posted has some errors that cause problems with the stack, and I am nowhere near knowledgeable enough to fix the problems in the author's code.

Are there any common or easy ways to replace File.Copy with something speedier?

One thing to consider is whether your copy has a user interface that updates during the copy. If so, make sure the copy runs on a separate thread; otherwise the UI will freeze during the copy, and the copy itself will be slowed down by the blocking calls that update the UI.

I have written a similar program, and in my experience my code ran faster than a Windows Explorer copy (I'm not sure about xcopy from the command prompt).

Also, if you have a UI, don't update it on every file; instead update every X megabytes or every Y files (whichever comes first). This keeps the amount of updating down to something the UI can actually handle. I used every 0.5 MB or 10 files; those values may not be optimal, but they noticeably increased my copy speed and UI responsiveness.
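The thresholds described above could be wrapped in a small helper like the one below. This is only a sketch: the class name ThrottledProgress and its members are made up for illustration, and it assumes all calls come from the single thread doing the copy (it is not thread-safe).

```csharp
using System;

// Hypothetical helper: forwards progress to the UI only after a byte or
// file-count threshold has been crossed since the last report.
public sealed class ThrottledProgress
{
    private readonly long _byteStep;
    private readonly int _fileStep;
    private readonly Action<long, int> _report;   // (total bytes, total files)
    private long _totalBytes, _pendingBytes;
    private int _totalFiles, _pendingFiles;

    public ThrottledProgress(long byteStep, int fileStep, Action<long, int> report)
    {
        _byteStep = byteStep;
        _fileStep = fileStep;
        _report = report;
    }

    // Call once after each file has been copied.
    public void FileCopied(long length)
    {
        _totalBytes += length; _pendingBytes += length;
        _totalFiles++; _pendingFiles++;
        if (_pendingBytes >= _byteStep || _pendingFiles >= _fileStep)
            Flush();
    }

    // Call once at the end so the final totals are always shown.
    public void Flush()
    {
        if (_pendingFiles == 0) return;   // nothing new since the last report
        _report(_totalBytes, _totalFiles);
        _pendingBytes = 0;
        _pendingFiles = 0;
    }
}
```

With the values mentioned above it would be constructed as new ThrottledProgress(512 * 1024, 10, callback), where callback is whatever marshals the update onto your UI thread.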

Another way to speed things up is to use the Enumerate functions instead of the Get functions (e.g. EnumerateFiles instead of GetFiles). These functions start returning results as soon as possible instead of waiting until the whole list has been built. They return an IEnumerable, so you can just call foreach on the result: foreach (string file in System.IO.Directory.EnumerateFiles(path)). For my program this also made a noticeable difference in speed, and it would be even more helpful in cases like yours, where you are dealing with directories containing many files.
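As a rough illustration of the enumeration approach, here is a minimal sketch that walks a source tree lazily and copies each file as it is discovered. The TreeCopy class and CopyTree method are hypothetical names, and the sketch skips empty directories and does no error handling.

```csharp
using System;
using System.IO;

public static class TreeCopy
{
    // Copies every file under sourceRoot to the same relative path under destRoot.
    // Returns the number of files copied.
    public static int CopyTree(string sourceRoot, string destRoot)
    {
        int copied = 0;
        string root = Path.GetFullPath(sourceRoot);

        // EnumerateFiles yields paths lazily, so copying starts before the
        // whole tree has been scanned.
        foreach (string file in Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories))
        {
            string relative = file.Substring(root.Length)
                .TrimStart(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
            string target = Path.Combine(destRoot, relative);

            Directory.CreateDirectory(Path.GetDirectoryName(target));
            File.Copy(file, target, true);   // true = overwrite an existing file
            copied++;
        }
        return copied;
    }
}
```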

One of the things that slows down IO operations the most on rotational disks is moving the disk head.

It's reasonable to assume, and probably quite accurate, that your many small files (which are all related to each other) are closer together on the disk than they are to the destination of the copy (assuming you're copying from one part of a disk to another part of the same disk). If you read for a bit and then write for a bit, you open a window of opportunity for other processes to move the disk head on the source or target disk.

One thing that XCopy does much better than Copy (meaning the commands in both cases) is that XCopy reads in a bunch of files before starting to write them out to the destination.

If you are copying files on the same disk, try allocating a large buffer to read in many files at once, then write out those files once the buffer is full.
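A minimal sketch of that read-a-batch-then-write-a-batch idea follows. BatchedCopy and maxBatchBytes are illustrative names, not an existing API. The sketch assumes the files are small enough to hold in memory and that they all go into one flat destination directory with unique file names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class BatchedCopy
{
    // Reads files into memory until roughly maxBatchBytes is buffered,
    // then writes the whole batch out before reading any more.
    public static void Copy(IEnumerable<string> files, string destDir, long maxBatchBytes)
    {
        var batch = new List<KeyValuePair<string, byte[]>>();
        long buffered = 0;
        Directory.CreateDirectory(destDir);

        foreach (string file in files)
        {
            byte[] data = File.ReadAllBytes(file);   // acceptable for small files only
            batch.Add(new KeyValuePair<string, byte[]>(Path.GetFileName(file), data));
            buffered += data.Length;

            if (buffered >= maxBatchBytes)
            {
                WriteBatch(batch, destDir);
                buffered = 0;
            }
        }
        WriteBatch(batch, destDir);   // write whatever is left over
    }

    private static void WriteBatch(List<KeyValuePair<string, byte[]>> batch, string destDir)
    {
        foreach (KeyValuePair<string, byte[]> item in batch)
            File.WriteAllBytes(Path.Combine(destDir, item.Key), item.Value);
        batch.Clear();
    }
}
```

The batch size is a tuning knob; whether this actually beats a plain File.Copy loop on a given drive is something to measure rather than assume.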

If you are reading from one disk and writing to another disk, try starting one thread to read from the source disk and a separate thread to write to the other disk.
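One way to sketch the reader-thread/writer-thread arrangement is with a bounded BlockingCollection as the shared buffer. The PipelinedCopy class and the maxQueuedFiles parameter are assumptions for illustration. As in a simple batch copy, it assumes small files and a flat destination directory with unique file names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class PipelinedCopy
{
    public static void Copy(IEnumerable<string> files, string destDir, int maxQueuedFiles)
    {
        Directory.CreateDirectory(destDir);

        using (var cts = new CancellationTokenSource())
        // Bounded queue, so the reader cannot run arbitrarily far ahead of the writer.
        using (var queue = new BlockingCollection<KeyValuePair<string, byte[]>>(maxQueuedFiles))
        {
            Task reader = Task.Run(() =>
            {
                try
                {
                    foreach (string file in files)
                    {
                        var item = new KeyValuePair<string, byte[]>(
                            Path.GetFileName(file), File.ReadAllBytes(file));
                        queue.Add(item, cts.Token);   // blocks while the queue is full
                    }
                }
                finally
                {
                    queue.CompleteAdding();   // always release the writer, even on error
                }
            });

            Task writer = Task.Run(() =>
            {
                try
                {
                    foreach (KeyValuePair<string, byte[]> item in queue.GetConsumingEnumerable())
                        File.WriteAllBytes(Path.Combine(destDir, item.Key), item.Value);
                }
                catch
                {
                    cts.Cancel();   // unblock the reader if writing fails
                    throw;
                }
            });

            Task.WaitAll(reader, writer);   // rethrows failures as AggregateException
        }
    }
}
```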

There are two algorithms for faster file copy:

If source and destination are on different disks, then:

  • One thread reads files continuously and stores them in a buffer.
  • Another thread writes files continuously from that buffer.

If source and destination are on the same disk, then:

  • Read a fixed chunk of bytes, say 8K at a time, no matter how many files that is.
  • Write that fixed chunk to the destination, whether it belongs to one file or to multiple files.

This way you should see a significant performance improvement.

An alternative is to just invoke xcopy from your .NET code. Why bother doing it with File.Copy? You can capture the xcopy output using Process.StandardOutput and show it on screen to let the user know what's going on.
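Launching xcopy and reading its output might look roughly like the sketch below. XcopyRunner and its methods are hypothetical names. The /E, /I and /Y switches are one plausible choice (copy subdirectories including empty ones, treat the destination as a directory, suppress overwrite prompts). The sketch assumes Windows and paths without a trailing backslash, since a backslash before the closing quote would escape it.

```csharp
using System;
using System.Diagnostics;

public static class XcopyRunner
{
    // Builds the xcopy argument string with both paths quoted.
    public static string BuildArguments(string source, string dest)
    {
        return "\"" + source + "\" \"" + dest + "\" /E /I /Y";
    }

    // Runs xcopy (Windows only) and forwards each output line to onLine.
    // Returns the process exit code; 0 means the files were copied without error.
    public static int Run(string source, string dest, Action<string> onLine)
    {
        var psi = new ProcessStartInfo("xcopy", BuildArguments(source, dest))
        {
            UseShellExecute = false,         // required for stream redirection
            RedirectStandardOutput = true,
            // Assumption: xcopy is commonly reported to need a valid stdin
            // handle when its output is redirected, so redirect that as well.
            RedirectStandardInput = true,
            CreateNoWindow = true
        };

        using (Process process = Process.Start(psi))
        {
            string line;
            while ((line = process.StandardOutput.ReadLine()) != null)
                onLine(line);                // e.g. append to a log or status label

            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

If onLine touches UI controls, it still has to marshal onto the UI thread, and Run itself should be called from a background thread.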

I think you could at least parallelize it so that you process two files at the same time. While one thread is writing, another can already be reading the next file. If you have a list of the files, you can do it like this. Using many threads will not help, because that makes the drive seek around a lot more instead of letting it write sequentially.

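 // Requires: using System.Collections.Generic; using System.Threading.Tasks;
 var files = new List<string>();
 // TODO: fill the list, e.g. with Directory.EnumerateFiles(...)
 var po = new ParallelOptions { MaxDegreeOfParallelism = 2 };   // at most two files at once
 Parallel.ForEach(files, po, CopyAFile);

 // Routine to copy a single file (body intentionally left empty here)
 private void CopyAFile(string file) { }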

I have no good experience at this level. Why don't you try running a batch file containing your xcopy command? Check this post: Executing Batch File in C#
