
Unable to force the removal of a directory

I'm using the Info-ZIP utilities in a Ruby script on Windows 10 to unzip an archive, edit the contents, and rezip it. The script is meant to iterate over a batch of archives and delete the temporary folder that is created when extracting the contents. The folder is not being deleted, though. For example:

archives.each { |archive|
    system("unzip.exe -o #{archive} -d temp")
    [...]
    system("zip.exe -X0q #{archive} .")
    FileUtils.rm_rf "temp"
}

This has always worked just fine on a Mac (using the same script, in conjunction with the zip/unzip commands), but on Windows I cannot get the temporary folder to be deleted. The unzipping and zipping process works fine, but the "temp" folder is not deleted. This results in the unzip utility throwing the same error for every file that exists in the folder: error: cannot delete old temp/[file]

I've tried using system("del /Q temp"), which throws a Could Not Find: C:\[...]\temp error, even though the directory does exist. I tried system("rmdir /s /q temp"), which throws another error: The process cannot access the file because it is being used by another process. The only "process" using this file is the script itself, though.

Once the script is done running, if I run FileUtils.rm_rf "temp" afterwards, it works and successfully deletes the directory. However, I need this to be done after each iteration and within the same original script, so that the directory is correctly overwritten and deleted at the end of the execution, without any error or warning in Command Prompt.

Is there any other way to forcibly delete this folder?

Update: After doing a lot more testing of different parts of the script, I was able to locate the exact source of the problem. All of the archives contain XHTML files. In some cases the script requires that an archive be duplicated, and the duplicated archive has its contents modified. Whether or not a duplicate needs to be made depends on the existence of certain markup within an XHTML file. The script uses Nokogiri to parse the content. It seems that the method of parsing through Nokogiri is what is triggering the issue. To simplify the code:

FileUtils.cp(original_archive, new_archive)
unzip_archive(new_archive) # a function to contain the unzipping steps
Dir.glob("temp/**/*.{html,xhtml}").each { |page|
  contents = Nokogiri::XML(open(page))
}
zip_archive(new_archive)

In this example, nothing is actually happening, but just the presence of Nokogiri::XML(open(page)) is enough to trigger the errors. This happens for every page that is opened through Nokogiri. So if I change it to only one page:

contents = Nokogiri::XML(open(Dir.glob("temp/**/one_page.xhtml").first))

then FileUtils.rm_rf 'temp' successfully deletes the files in the temp folder except for one_page.xhtml, which throws the "cannot delete" error.

Is there a way to bypass this issue, such that I can still use Nokogiri in my Ruby script, but not have the script think the Nokogiri "process" is still running? This is specific to Windows, since no such problems were encountered on Macs.

Looking at the code:

Dir.glob("temp/**/*.{html,xhtml}").each { |page|
  contents = Nokogiri::XML(open(page))
}

the problem really looks like you're consuming all the available file handles. This isn't a Nokogiri problem at all; Nokogiri just happened to be in town when the problem occurred.

OSes have a pool of file handles available; they're not an infinite resource. If a huge number of files are being found and you iterate over them and leave them open, then you're consuming them all, which is poor programming. open(page) without a block returns a File that is never explicitly closed, and on Windows a file that is still open typically cannot be deleted, which fits the "being used by another process" error.
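To make the leak concrete, here is a minimal sketch using only the standard library. The scratch directory and the one_page.xhtml file are stand-ins created for the illustration, not the asker's real data. It shows that reading from a handle does not close it, and that the handle is only released once close is called:

```ruby
require 'tmpdir'

# Hypothetical scratch file standing in for one extracted page.
dir  = Dir.mktmpdir('temp')
path = File.join(dir, 'one_page.xhtml')
File.write(path, '<html/>')

handle   = File.open(path)          # no block given: the handle stays open
contents = handle.read
open_after_read = !handle.closed?   # reading to the end does not close it

handle.close                        # release the handle explicitly
closed_after_close = handle.closed?

# With the handle released, the file and folder can be removed.
File.delete(path)
Dir.rmdir(dir)
```

If the close call were skipped, the handle would stay open until the File object is garbage-collected or the script exits, which would be consistent with the folder only becoming deletable after the script has finished.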

Using the block form of File.open will work around the problem, but File.read without the block is cleaner, shorter and, in my opinion, a much better way to go.

Dir.glob("temp/**/*.{html,xhtml}").each { |page|
  contents = Nokogiri::XML(File.read(page))
  # do something with contents
}
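For comparison, the block form of File.open mentioned above might look like the sketch below. The helper name read_pages_with_block and the scratch directory are made up for the example, and the Nokogiri parsing step is left out so the snippet runs with the standard library alone. The handles are collected only to check afterwards that each one was closed when its block returned:

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: read every HTML/XHTML page under +dir+ using the
# block form of File.open, which closes each file when the block returns.
def read_pages_with_block(dir)
  handles = []
  contents = Dir.glob(File.join(dir, '**', '*.{html,xhtml}')).map do |page|
    File.open(page) do |file|
      handles << file   # kept only so the caller can inspect closed?
      file.read         # a real script would pass this string to the parser
    end
  end
  [contents, handles]
end

work_dir = Dir.mktmpdir('temp')
File.write(File.join(work_dir, 'one_page.xhtml'), '<html/>')

contents, handles = read_pages_with_block(work_dir)
all_closed = handles.all?(&:closed?)

FileUtils.rm_rf(work_dir)
removed = !Dir.exist?(work_dir)
```

Because no handle outlives its block, the rm_rf call at the end has nothing left to trip over.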

But using Dir.glob is also contributing to this problem, and to another one. You're asking the system to search the disk to find all matching files, then return them as an array in memory, which is then iterated over. Instead, I highly recommend using Find, which is in Ruby's Standard Library. It behaves much better in that sort of situation.

The Find module supports the top-down traversal of a set of file paths.

For example, to total the size of all files under your home directory, ignoring anything in a "dot" directory (e.g. $HOME/.ssh):

require 'find'

total_size = 0

Find.find(ENV["HOME"]) do |path|
  if FileTest.directory?(path)
    if File.basename(path).start_with?('.')
      Find.prune       # Don't look any further into this directory.
    else
      next
    end
  else
    total_size += FileTest.size(path)
  end
end

Using Find you can run the code against a huge drive containing millions of matches and it'll perform better than Dir.glob.

Tweaking their example, this untested code should get you started:

require 'find'
require 'nokogiri'

Find.find('temp') do |path|
  if FileTest.file?(path) && path[/\.x?html$/i]
    contents = Nokogiri::XML(File.read(path))
    # do something with contents
  end
end

A second problem you'll often see when using Dir.glob to do a top-down search ( ** ) is that it'll immediately ask the OS to find all the matching files, then wait for the OS to gather them. If, instead, you use Find, your code will pause for each search for the next match in the hierarchy, but it'll be a much shorter pause, resulting in a more responsive application that doesn't eat as much memory or beat up the disk gathering files. On a remotely mounted drive or a file server you could end up irritating your sysadmin when they notice huge network and disk IO spikes instead of a minor increase in activity.
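The incremental behaviour described above can be sketched with the enumerator that Find.find returns when it is called without a block. The helper name each_page and the scratch directory layout are assumptions made for the illustration. Chaining lazy means paths are filtered one at a time as the traversal produces them, so asking for the first match stops the walk early instead of collecting every path first:

```ruby
require 'find'
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: lazily produce the HTML/XHTML paths under +root+.
def each_page(root)
  Find.find(root).lazy.select do |path|
    File.file?(path) && path.match?(/\.x?html\z/i)
  end
end

# Scratch layout standing in for an extracted archive.
root = Dir.mktmpdir('temp')
FileUtils.mkdir_p(File.join(root, 'pages'))
File.write(File.join(root, 'pages', 'a.xhtml'), '<html/>')
File.write(File.join(root, 'pages', 'b.html'), '<html/>')
File.write(File.join(root, 'pages', 'c.css'), '')

first_page = each_page(root).first   # traversal stops at the first match
all_pages  = each_page(root).map { |page| File.basename(page) }.to_a.sort

FileUtils.rm_rf(root)
```

Whether this is noticeably faster than Dir.glob depends on the size of the tree and the storage behind it; the sketch only shows the one-match-at-a-time shape of the traversal.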
