File Access and Parallel.For

I have a set of nested methods that copy a file for backup and then modify that file using the "RecordConsolidator" class. Somewhere in the chain of events I get random exceptions saying the file is in use, which makes no sense to me unless every single line of code in a Parallel.For call is executed asynchronously. Here is an example of my current issues (see the commented lines):

// this is from a method called CompactFormIDs
try
{
    // will this ever be executed twice on the same object?  Will it be 
    // released before the next line of code after the catch statement?
    File.Copy(dest, source); 
}
catch (Exception e)
{
    weirdExceptions.Add(source); // I keep getting a message that the file already exists
    // even though there is only one copy statement which copies the file above.
}

// creates an undo step in a batch file
Globals.AddCommit(CommitType.RestoreBackup | CommitType.UndoDeleteBackup | CommitType.CommitDeleteBackup, source, dest); 

HashSet<FormID> npcList = new HashSet<FormID>();
uint mask = (uint)masters.Count << 24;
report.BeginAppendProcess();

// the method below also causes an exception.
// It acts as though the file copied is still in use.  No other process accesses the file other than the 
// copy process before this statement.  When not doing Parallel.For this works just fine.
using (RecordConsolidator consolidator = new RecordConsolidator(source, dest, mask, npcList))
{...
}

The end goal is to:

  1. Make a copy of the file that will be modified, so that it can be restored if the new version of the file doesn't work properly.

  2. Add the restoration of the original file to a batch script.

  3. Make the changes to the file.

How can I do this in parallel using Parallel.For without running into all these "file in use" and "file already exists" errors? The fact that there is an issue at all makes no sense to me: a single Copy statement is causing multiple problems that shouldn't happen unless the copy isn't complete before the rest of the code executes, or Parallel.For is somehow executing twice for each item.

UPDATE 1: This is the method that contains the Parallel.For loop:

private void OnFormShown(object sender, EventArgs e)
{
    Mod mod;
    RichTextboxBuilder builder;
    List<Mod> batch = task as List<Mod>;
    Refresh();
    if (batch != null)
    {
        RichTextboxBuilder.BeginConcurrentAppendProcess(this, batch.Count);
        ReportCaption = "Conversion Progress";
        progressBar.Visible = true;
        progressBar.Maximum = batch.Count;
        Parallel.For(0, batch.Count, i =>
        {
            mod = batch[i];
            builder = RichTextboxBuilder.BeginConcurrentAppend(i);
            //builder.TextUpdated += Builder_TextUpdated;
            taskTarget.ConvertToESL(mod, builder, false);
            RichTextboxBuilder.EndConcurrentAppend(i);
        });

        Finalize(false);
    }
    else
    {
        mod = task as Mod;
        ReportTextBuilder = new RichTextboxBuilder(this);
        Finalize(taskTarget.ConvertToESL(mod, ReportTextBuilder));
    }
}
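
A side observation on the handler above, offered as a sketch rather than a diagnosis: mod and builder are declared outside the lambda, so all parallel iterations share the same two variables. One thread can assign mod = batch[i] and have it overwritten by another thread before ConvertToESL runs, which would process one item twice and skip another. Whether that contributes to the errors described here is an assumption. The class and method names below are hypothetical, and the int array stands in for the list of mods.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class CapturedLocalDemo
{
    // Hypothetical stand-in for the per-item work (ConvertToESL in the post).
    // Returns the items that were actually processed, in arbitrary order.
    public static int[] ProcessAll(int[] batch)
    {
        var seen = new ConcurrentBag<int>();

        Parallel.For(0, batch.Length, i =>
        {
            // Declared inside the lambda: each iteration gets its own variable,
            // so no other thread can overwrite it between these two lines.
            int item = batch[i];
            seen.Add(item);
        });

        return seen.ToArray();
    }
}
```

With the local declared inside the lambda, every item in the batch is handled exactly once, regardless of how the iterations are scheduled.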

As mentioned in the comment, I use a HashSet that I make thread-safe:

var fileQueue = new HashSet<string>(StringComparer.Ordinal);

HashSet is not thread-safe on its own; guard every access with a lock statement, or manage it through a ReaderWriterLockSlim.
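
A minimal sketch of the lock-based variant, assuming the set is used to track which files are currently being worked on. FileClaimSet, TryClaim and Release are hypothetical names made up for this illustration; they are not part of any library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: a HashSet<string> whose every access goes through one lock.
public sealed class FileClaimSet
{
    private readonly HashSet<string> _files = new HashSet<string>(StringComparer.Ordinal);
    private readonly object _gate = new object();

    // Returns true only for the first caller that claims the given path.
    public bool TryClaim(string path)
    {
        lock (_gate)
        {
            return _files.Add(path);
        }
    }

    // Releases a path so that it can be claimed again later.
    public bool Release(string path)
    {
        lock (_gate)
        {
            return _files.Remove(path);
        }
    }
}
```

A worker would call TryClaim before touching a file and skip the file when it returns false. StringComparer.Ordinal is case-sensitive; on a case-insensitive file system StringComparer.OrdinalIgnoreCase may be the better fit.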

Another issue I faced is that I'm not alone on the server: other processes are doing their own work. Shocking, I know, but sometimes I don't get the whole server to myself ;-)

Have a look at the NuGet package Walter. It has an extension method:

TryDiscoverWhoisBlocking(this FileInfo file, out IReadOnlyList<Process> processes)

I could paste the code here, but it relies on quite a few native methods and the post would become too long.

Use the method to see who is blocking your access to the file. It might be a virus scanner; if so, and you get the error, loop for a while to let the virus scanner do its thing until all handles on the file are released, and then continue.
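
The "loop for a while" step could look roughly like the sketch below. It uses only standard System.IO calls and does not depend on the Walter package; FileRetry and TryOpenExclusive are hypothetical names, and the attempt count and delay are arbitrary assumptions.

```csharp
using System;
using System.IO;
using System.Threading;

public static class FileRetry
{
    // Hypothetical helper: tries to open the file exclusively and retries while
    // another handle holds it. Returns the open stream, or null if the file was
    // still locked after the last attempt. A missing file is not retried.
    public static FileStream TryOpenExclusive(string path, int maxAttempts, TimeSpan delay)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                return new FileStream(path, FileMode.Open, FileAccess.ReadWrite, FileShare.None);
            }
            catch (IOException ex) when (!(ex is FileNotFoundException) && !(ex is DirectoryNotFoundException))
            {
                if (attempt == maxAttempts)
                {
                    return null; // give up; the caller decides what to do next
                }
                Thread.Sleep(delay); // give the other holder time to release its handle
            }
        }
        return null;
    }
}
```

The caller owns the returned stream and should dispose it, for example with a using block.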

I found out that most of the time I'm the problem: I'm the one blocking access. So I look at my code and ask why I'm using these one-liners when there is a way to do things exactly the way I intend.

Below, I state explicitly what kind of access and sharing I think is needed, and take only the level of blocking I need. In your case, you could open the file with shared read and write access, with or without blocking; the choice is yours.

using (var fs = new FileStream(path: file.FullName, access: FileAccess.Write, mode: FileMode.Append, share: FileShare.ReadWrite))
using (var sw = new StreamWriter(fs, encoding: UTF8Encoding.UTF8))
{
    sw.Write(text);
    sw.Flush();
}

I'm writing this on the assumption that your code works and the locking is caused by another thread or another process; the code above should tell you what's going on.

Walter's code is definitely something that may help other people with this problem. However, my problem (and the inconsistency I described along with it) turned out to be the result of duplicate Windows messages being sent to the message loop. If the OnFormShown message occurred twice, the message handler was called twice. This modification solved this particular issue:

I added a "processed" class variable and changed the handler from my original post to the following. (Task.Run was added to solve a different problem.)

private void OnFormShown(object sender, EventArgs e)
{
    if (!processed)
    {
        processed = true;
        Mod mod;
        RichTextboxBuilder builder;
        List<Mod> batch = task as List<Mod>;
        Refresh();
        if (batch != null)
        {
            RichTextboxBuilder.BeginConcurrentAppendProcess(this, batch.Count);
            ReportCaption = "Conversion Progress";
            progressBar.Visible = true;
            progressBar.Maximum = batch.Count;

            Task.Run(() =>
            {
                Parallel.For(0, batch.Count, i =>
                {
                    mod = batch[i];
                    builder = RichTextboxBuilder.BeginConcurrentAppend(i);
                    //builder.TextUpdated += Builder_TextUpdated;
                    taskTarget.ConvertToESL(mod, builder, false);
                    RichTextboxBuilder.EndConcurrentAppend(i);
                });

                Finalize(false);
            });
        }
        else
        {
            mod = task as Mod;
            ReportTextBuilder = new RichTextboxBuilder(this);
            Finalize(taskTarget.ConvertToESL(mod, ReportTextBuilder));
        }
    }
}
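
The processed flag above is a plain check-then-set, which is fine as long as the handler only ever runs on the UI thread. If the same guard ever had to hold across threads, an atomic variant could be sketched as follows. RunOnceGate and TryEnter are hypothetical names; only Interlocked.CompareExchange is a real framework call.

```csharp
using System.Threading;

// Hypothetical helper: lets exactly one caller through, even under concurrent calls.
public sealed class RunOnceGate
{
    private int _entered; // 0 = nobody has entered yet, 1 = already claimed

    // Atomically flips the flag from 0 to 1 and reports whether this caller did the flip.
    public bool TryEnter()
    {
        return Interlocked.CompareExchange(ref _entered, 1, 0) == 0;
    }
}
```

In the handler this would replace the if (!processed) check with something like if (!gate.TryEnter()) return;, where gate is a field of the form.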
