Comparing folders and content with PowerShell

I have two different folders with XML files. One folder (folder2) contains updated and new XML files compared to the other (folder1). I need to know which files in folder2 are new or updated compared to folder1 and copy them to a third folder (folder3). What's the best way to accomplish this in PowerShell?

OK, I'm not going to code the whole thing for you (what's the fun in that?) but I'll get you started.

First, there are two ways to do the content comparison: the lazy, mostly right way, which compares the length of the files; and the accurate but more involved way, which compares a hash of the contents of each file.

For simplicity's sake, let's do it the easy way and compare file size.

Basically, you want two objects that represent the source and target folders:

$Folder1 = Get-ChildItem "C:\Folder1"
$Folder2 = Get-ChildItem "C:\Folder2"

Then you can use Compare-Object to see which items are different...

Compare-Object $Folder1 $Folder2 -Property Name, Length

which will list everything that differs, comparing only the name and length of the file objects in each collection.

You can pipe that to a Where-Object filter to pick the items that are new or different on the right side (Folder2), which Compare-Object marks with "=>"...

Compare-Object $Folder1 $Folder2 -Property Name, Length | Where-Object {$_.SideIndicator -eq "=>"}

And then pipe that to a ForEach-Object to copy where you want:

Compare-Object $Folder1 $Folder2 -Property Name, Length | Where-Object {$_.SideIndicator -eq "=>"} | ForEach-Object {
        Copy-Item "C:\Folder2\$($_.Name)" -Destination "C:\Folder3" -Force
        }
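The hash-based comparison mentioned at the start of this answer could look roughly like the sketch below. It is only an illustration under a few assumptions: the demo folders are created under the temp directory (they are hypothetical, not the asker's real paths), files are matched by name in a flat folder, SHA256 is an arbitrary choice of algorithm, and Get-FileHash requires PowerShell 4.0 or later.

```powershell
# Hypothetical demo folders under the temp directory; substitute your real paths.
$root    = Join-Path ([System.IO.Path]::GetTempPath()) ("hashdemo_" + [guid]::NewGuid().ToString("N"))
$folder1 = Join-Path $root "Folder1"
$folder2 = Join-Path $root "Folder2"
$folder3 = Join-Path $root "Folder3"
$null = New-Item -ItemType Directory -Path $folder1, $folder2, $folder3

# Same name and same length, but different content: a size check would miss this.
Set-Content -Path (Join-Path $folder1 "a.xml") -Value "<a>1</a>"
Set-Content -Path (Join-Path $folder2 "a.xml") -Value "<a>2</a>"
# Identical in both folders.
Set-Content -Path (Join-Path $folder1 "b.xml") -Value "<b/>"
Set-Content -Path (Join-Path $folder2 "b.xml") -Value "<b/>"
# Exists only in Folder2.
Set-Content -Path (Join-Path $folder2 "c.xml") -Value "<c/>"

# Build a lookup of file name -> content hash for Folder1.
$hashes1 = @{}
Get-ChildItem -Path $folder1 -File | ForEach-Object {
    $hashes1[$_.Name] = (Get-FileHash -Path $_.FullName -Algorithm SHA256).Hash
}

# Keep the files from Folder2 that are new or whose hash differs from Folder1.
$changed = @(Get-ChildItem -Path $folder2 -File | Where-Object {
    $hash = (Get-FileHash -Path $_.FullName -Algorithm SHA256).Hash
    (-not $hashes1.ContainsKey($_.Name)) -or ($hashes1[$_.Name] -ne $hash)
})

# Copy the new/updated files to Folder3.
$changed | Copy-Item -Destination $folder3 -Force
$changed | Select-Object -ExpandProperty Name
```

The size-based version above would treat a.xml as unchanged here, because both copies have the same length; the hash comparison reports it as updated.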

Recursive Directory Diff Using MD5 Hashing (Compares Content)

Here is a pure PowerShell v3+ recursive file diff (no dependencies) that calculates an MD5 hash of the contents of each file in both directories (left/right). It can optionally export CSVs along with a summary text file. By default it outputs results to stdout. You can either drop the rdiff.ps1 file into your path or copy the contents into your script.

USAGE: rdiff path/to/left,path/to/right [-s path/to/summary/dir]

Here is the gist. It is recommended to use the version from the gist, as it may gain additional features over time. Feel free to send pull requests.

#########################################################################
### USAGE: rdiff path/to/left,path/to/right [-s path/to/summary/dir]  ###
### ADD LOCATION OF THIS SCRIPT TO PATH                               ###
#########################################################################
[CmdletBinding()]
param (
  [parameter(HelpMessage="Stores the execution working directory.")]
  [string]$ExecutionDirectory=$PWD,

  [parameter(Position=0,HelpMessage="Compare two directories recursively for differences.")]
  [alias("c")]
  [string[]]$Compare,

  [parameter(HelpMessage="Export a summary to path.")]
  [alias("s")]
  [string]$ExportSummary
)

### FUNCTION DEFINITIONS ###

# SETS WORKING DIRECTORY FOR .NET #
function SetWorkDir($PathName, $TestPath) {
  $AbsPath = NormalizePath $PathName $TestPath
  Set-Location $AbsPath
  [System.IO.Directory]::SetCurrentDirectory($AbsPath)
}

# RESTORES THE EXECUTION WORKING DIRECTORY AND EXITS #
function SafeExit() {
  SetWorkDir /path/to/execution/directory $ExecutionDirectory
  Exit
}

function Print {
  [CmdletBinding()]
  param (
    [parameter(Mandatory=$TRUE,Position=0,HelpMessage="Message to print.")]
    [string]$Message,

    [parameter(HelpMessage="Specifies a success.")]
    [alias("s")]
    [switch]$SuccessFlag,

    [parameter(HelpMessage="Specifies a warning.")]
    [alias("w")]
    [switch]$WarningFlag,

    [parameter(HelpMessage="Specifies an error.")]
    [alias("e")]
    [switch]$ErrorFlag,

    [parameter(HelpMessage="Specifies a fatal error.")]
    [alias("f")]
    [switch]$FatalFlag,

    [parameter(HelpMessage="Specifies an info message.")]
    [alias("i")]
    [switch]$InfoFlag = !$SuccessFlag -and !$WarningFlag -and !$ErrorFlag -and !$FatalFlag,

    [parameter(HelpMessage="Specifies blank lines to print before.")]
    [alias("b")]
    [int]$LinesBefore=0,

    [parameter(HelpMessage="Specifies blank lines to print after.")]
    [alias("a")]
    [int]$LinesAfter=0,

    [parameter(HelpMessage="Specifies if program should exit.")]
    [alias("x")]
    [switch]$ExitAfter
  )
  PROCESS {
    if($LinesBefore -ne 0) {
      # 1..N prints exactly N blank lines (0..N would print one too many)
      foreach($i in 1..$LinesBefore) { Write-Host "" }
    }
    if($InfoFlag) { Write-Host "$Message" }
    if($SuccessFlag) { Write-Host "$Message" -ForegroundColor "Green" }
    # "Orange" is not a valid console color; "Yellow" is the closest available one
    if($WarningFlag) { Write-Host "$Message" -ForegroundColor "Yellow" }
    if($ErrorFlag) { Write-Host "$Message" -ForegroundColor "Red" }
    if($FatalFlag) { Write-Host "$Message" -ForegroundColor "Red" -BackgroundColor "Black" }
    if($LinesAfter -ne 0) {
      foreach($i in 1..$LinesAfter) { Write-Host "" }
    }
    if($ExitAfter) { SafeExit }
  }
}

# VALIDATES STRING MIGHT BE A PATH #
function ValidatePath($PathName, $TestPath) {
  If([string]::IsNullOrWhiteSpace($TestPath)) {
    Print -x -f "$PathName is not a path"
  }
}

# NORMALIZES RELATIVE OR ABSOLUTE PATH TO ABSOLUTE PATH #
function NormalizePath($PathName, $TestPath) {
  ValidatePath "$PathName" "$TestPath"
  $TestPath = [System.IO.Path]::Combine((pwd).Path, $TestPath)
  $NormalizedPath = [System.IO.Path]::GetFullPath($TestPath)
  return $NormalizedPath
}


# VALIDATES STRING MIGHT BE A PATH AND RETURNS ABSOLUTE PATH #
function ResolvePath($PathName, $TestPath) {
  ValidatePath "$PathName" "$TestPath"
  $ResolvedPath = NormalizePath $PathName $TestPath
  return $ResolvedPath
}

# VALIDATES STRING RESOLVES TO A PATH AND RETURNS ABSOLUTE PATH #
function RequirePath($PathName, $TestPath, $PathType) {
  ValidatePath $PathName $TestPath
  If(!(Test-Path $TestPath -PathType $PathType)) {
    Print -x -f "$PathName ($TestPath) does not exist as a $PathType"
  }
  $ResolvedPath = Resolve-Path $TestPath
  return $ResolvedPath
}

# Like mkdir -p -> creates a directory recursively if it doesn't exist #
function MakeDirP {
  [CmdletBinding()]
  param (
    [parameter(Mandatory=$TRUE,Position=0,HelpMessage="Path to create.")]
    [string]$Path
  )
  PROCESS {
    New-Item -path $Path -itemtype Directory -force | Out-Null
  }
}

# GETS ALL FILES IN A PATH RECURSIVELY #
function GetFiles {
  [CmdletBinding()]
  param (
    [parameter(Mandatory=$TRUE,Position=0,HelpMessage="Path to get files for.")]
    [string]$Path
  )
  PROCESS {
    ls $Path -r | where { !$_.PSIsContainer }
  }
}

# GETS ALL FILES WITH CALCULATED HASH PROPERTY RELATIVE TO A ROOT DIRECTORY RECURSIVELY #
# RETURNS LIST OF @{RelativePath, Hash, FullName}
function GetFilesWithHash {
  [CmdletBinding()]
  param (
    [parameter(Mandatory=$TRUE,Position=0,HelpMessage="Path to get directories for.")]
    [string]$Path,

    [parameter(HelpMessage="The hash algorithm to use.")]
    [string]$Algorithm="MD5"
  )
  PROCESS {
    $OriginalPath = $PWD
    SetWorkDir path/to/diff $Path
    GetFiles $Path | select @{N="RelativePath";E={$_.FullName | Resolve-Path -Relative}},
                            @{N="Hash";E={(Get-FileHash $_.FullName -Algorithm $Algorithm | select Hash).Hash}},
                            FullName
    SetWorkDir path/to/original $OriginalPath
  }
}

# COMPARE TWO DIRECTORIES RECURSIVELY #
# RETURNS LIST OF @{RelativePath, Hash, FullName}
function DiffDirectories {
  [CmdletBinding()]
  param (
    [parameter(Mandatory=$TRUE,Position=0,HelpMessage="Directory to compare left.")]
    [alias("l")]
    [string]$LeftPath,

    [parameter(Mandatory=$TRUE,Position=1,HelpMessage="Directory to compare right.")]
    [alias("r")]
    [string]$RightPath
  )
  PROCESS {
    $LeftHash = GetFilesWithHash $LeftPath
    $RightHash = GetFilesWithHash $RightPath
    diff -ReferenceObject $LeftHash -DifferenceObject $RightHash -Property RelativePath,Hash
  }
}

### END FUNCTION DEFINITIONS ###

### PROGRAM LOGIC ###

if($Compare.length -ne 2) {
  Print -x "Compare requires passing exactly 2 path parameters separated by comma, you passed $($Compare.length)." -f
}
Print "Comparing $($Compare[0]) to $($Compare[1])..." -a 1
$LeftPath   = RequirePath path/to/left $Compare[0] container
$RightPath  = RequirePath path/to/right $Compare[1] container
$Diff       = DiffDirectories $LeftPath $RightPath
$LeftDiff   = $Diff | where {$_.SideIndicator -eq "<="} | select RelativePath,Hash
$RightDiff   = $Diff | where {$_.SideIndicator -eq "=>"} | select RelativePath,Hash
if($ExportSummary) {
  $ExportSummary = ResolvePath path/to/summary/dir $ExportSummary
  MakeDirP $ExportSummary
  $SummaryPath = Join-Path $ExportSummary summary.txt
  $LeftCsvPath = Join-Path $ExportSummary left.csv
  $RightCsvPath = Join-Path $ExportSummary right.csv

  $LeftMeasure = $LeftDiff | measure
  $RightMeasure = $RightDiff | measure

  "== DIFF SUMMARY ==" > $SummaryPath
  "" >> $SummaryPath
  "-- DIRECTORIES --" >> $SummaryPath
  "`tLEFT -> $LeftPath" >> $SummaryPath
  "`tRIGHT -> $RightPath" >> $SummaryPath
  "" >> $SummaryPath
  "-- DIFF COUNT --" >> $SummaryPath
  "`tLEFT -> $($LeftMeasure.Count)" >> $SummaryPath
  "`tRIGHT -> $($RightMeasure.Count)" >> $SummaryPath
  "" >> $SummaryPath
  $Diff | Format-Table >> $SummaryPath

  $LeftDiff | Export-Csv $LeftCsvPath -f
  $RightDiff | Export-Csv $RightCsvPath -f
}
$Diff
SafeExit

Further to @JNK's answer, you might want to ensure that you are always working with files rather than the less intuitive output from Compare-Object. You just need to use the -PassThru switch...

$Folder1Path = "C:\Folder1"
$Folder2Path = "C:\Folder2"
$Folder3Path = "C:\Folder3"

$Folder1 = Get-ChildItem $Folder1Path
$Folder2 = Get-ChildItem $Folder2Path

# Get all differences, i.e. from both "sides"
$AllDiffs = Compare-Object $Folder1 $Folder2 -Property Name,Length -PassThru

# Filter for new/updated files from Folder2 (compare against the path, not the file list)
$Changes = $AllDiffs | Where-Object {$_.Directory.FullName -eq $Folder2Path}

# Copy to Folder3
$Changes | Copy-Item -Destination $Folder3Path

This at least means you don't have to worry about which way the SideIndicator arrow points!

Also, bear in mind that you might want to compare on LastWriteTime as well.
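As a rough illustration of the LastWriteTime suggestion, the sketch below compares two files that have the same name and length but different timestamps. The demo folders, file names and dates are made up for the example; only the Compare-Object call reflects the technique.

```powershell
# Hypothetical demo folders under the temp directory; substitute your real paths.
$root    = Join-Path ([System.IO.Path]::GetTempPath()) ("timedemo_" + [guid]::NewGuid().ToString("N"))
$folder1 = Join-Path $root "Folder1"
$folder2 = Join-Path $root "Folder2"
$null = New-Item -ItemType Directory -Path $folder1, $folder2

# a.xml: same name and length in both folders, but Folder2's copy is a day newer.
Set-Content -Path (Join-Path $folder1 "a.xml") -Value "<a>1</a>"
Set-Content -Path (Join-Path $folder2 "a.xml") -Value "<a>2</a>"
(Get-Item -LiteralPath (Join-Path $folder1 "a.xml")).LastWriteTime = [datetime]"2020-01-01"
(Get-Item -LiteralPath (Join-Path $folder2 "a.xml")).LastWriteTime = [datetime]"2020-01-02"

# b.xml: identical content and identical timestamp in both folders.
Set-Content -Path (Join-Path $folder1 "b.xml") -Value "<b/>"
Set-Content -Path (Join-Path $folder2 "b.xml") -Value "<b/>"
(Get-Item -LiteralPath (Join-Path $folder1 "b.xml")).LastWriteTime = [datetime]"2020-01-01"
(Get-Item -LiteralPath (Join-Path $folder2 "b.xml")).LastWriteTime = [datetime]"2020-01-01"

$files1 = Get-ChildItem -Path $folder1 -File
$files2 = Get-ChildItem -Path $folder2 -File

# Name and Length alone see no difference between the two folders.
$bySize = @(Compare-Object $files1 $files2 -Property Name, Length)

# Adding LastWriteTime exposes the newer a.xml on the Folder2 side.
$byTime = @(Compare-Object $files1 $files2 -Property Name, Length, LastWriteTime -PassThru |
    Where-Object { $_.SideIndicator -eq "=>" })

$byTime | Select-Object Name, LastWriteTime
```

Keep in mind that a timestamp difference only says a file was written at a different time, not that its content changed.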

Sub-folders

Looping through the sub-folders recursively is a little more complicated, as you will probably need to strip the respective root folder paths from the FullName field before comparing the lists.

You could do this by adding a new ScriptProperty to your Folder1 and Folder2 lists:

$Folder1 | Add-Member -MemberType ScriptProperty -Name "RelativePath" `
  -Value {$this.FullName -replace [Regex]::Escape("C:\Folder1"),""}

$Folder2 | Add-Member -MemberType ScriptProperty -Name "RelativePath" `
  -Value {$this.FullName -replace [Regex]::Escape("C:\Folder2"),""}

You should then be able to use RelativePath as a property when comparing the two objects, and also join it on to "C:\Folder3" when copying to keep the folder structure in place.
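Putting the sub-folder idea together, a minimal sketch might look like the following. It uses a NoteProperty rather than the ScriptProperty shown above, and both the helper name Get-FileWithRelativePath and the demo folder layout are hypothetical, introduced only for this example.

```powershell
# Hypothetical demo tree under the temp directory; substitute your real paths.
$root    = Join-Path ([System.IO.Path]::GetTempPath()) ("reldemo_" + [guid]::NewGuid().ToString("N"))
$folder1 = Join-Path $root "Folder1"
$folder2 = Join-Path $root "Folder2"
$folder3 = Join-Path $root "Folder3"
$null = New-Item -ItemType Directory -Path (Join-Path $folder1 "sub"), (Join-Path $folder2 "sub"), $folder3
$null = New-Item -ItemType Directory -Path (Join-Path (Join-Path $folder2 "sub") "new")

Set-Content -Path (Join-Path (Join-Path $folder1 "sub") "a.xml") -Value "<a>1</a>"
Set-Content -Path (Join-Path (Join-Path $folder2 "sub") "a.xml") -Value "<a>22</a>"   # updated (different length)
Set-Content -Path (Join-Path $folder1 "b.xml") -Value "<b/>"
Set-Content -Path (Join-Path $folder2 "b.xml") -Value "<b/>"                          # unchanged
Set-Content -Path (Join-Path (Join-Path (Join-Path $folder2 "sub") "new") "c.xml") -Value "<c/>"  # new

# Hypothetical helper: list files recursively and attach the path relative to the root.
function Get-FileWithRelativePath([string]$Root) {
    $rootPath = (Get-Item -LiteralPath $Root).FullName
    Get-ChildItem -LiteralPath $rootPath -Recurse -File | ForEach-Object {
        $relative = $_.FullName.Substring($rootPath.Length).TrimStart([char[]]'\/')
        $_ | Add-Member -NotePropertyName RelativePath -NotePropertyValue $relative -PassThru
    }
}

$files1 = Get-FileWithRelativePath $folder1
$files2 = Get-FileWithRelativePath $folder2

# New or updated files on the Folder2 side, compared by relative path and size.
$changes = @(Compare-Object $files1 $files2 -Property RelativePath, Length -PassThru |
    Where-Object { $_.SideIndicator -eq "=>" })

# Copy each file to Folder3, recreating its sub-folder first.
foreach ($file in $changes) {
    $target = Join-Path $folder3 $file.RelativePath
    $null = New-Item -ItemType Directory -Path (Split-Path $target -Parent) -Force
    Copy-Item -LiteralPath $file.FullName -Destination $target -Force
}
```

Comparing on Length keeps the sketch short; the same pattern works with a hash or LastWriteTime property if size alone is not reliable enough.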

Here's an approach that finds files which are missing or differ in content.

First, a quick-and-dirty one-liner (see caveat below).

dir -r | rvpa -Relative |%{ if (Test-Path $right\$_) { if (Test-Path -Type Leaf $_) { if ( diff (cat $_) (cat $right\$_ ) ) { $_ } } } else { $_ } }

Run the above in one of the directories, with $right set to (or replaced with) the path to the other directory. Things missing from $right, or which differ in content, will be reported. No output means no differences were found. CAVEAT: Things that exist in $right but are missing from the left will not be found or reported.

This doesn't bother calculating hashes; it just compares the file contents directly. Hashing makes sense when you want to reference something in another context (at a later date, on another machine, etc.), but when we're comparing things directly, it adds nothing but overhead. (It's also theoretically possible for two files to have the same hash, although that's basically impossible to happen by accident. Deliberate attack, on the other hand...)

Here's a more proper script, which handles more corner cases and errors.

[CmdletBinding()]
Param(
    [Parameter(Mandatory=$true,Position=0)][string]$Left,
    [Parameter(Mandatory=$True,Position=1)][string]$Right
    )

# throw errors on undefined variables
Set-StrictMode -Version 1

# stop immediately on error
$ErrorActionPreference = [System.Management.Automation.ActionPreference]::Stop

# init counters
$Items = $MissingRight = $MissingLeft = $Contentdiff = 0

# make sure the given parameters are valid paths
$left  = Resolve-Path $left
$right = Resolve-Path $right

# make sure the given parameters are directories
if (-Not (Test-Path -Type Container $left))  { throw "not a container: $left"  }
if (-Not (Test-Path -Type Container $right)) { throw "not a container: $right" }

# Starting from $left as relative root, walk the tree and compare to $right.
Push-Location $left

try {
    Get-ChildItem -Recurse | Resolve-Path -Relative | ForEach-Object {
        $rel = $_
        
        $Items++
        
        # make sure counterpart exists on the other side
        if (-not (Test-Path $right\$rel)) {
            Write-Output "missing from right: $rel"
            $MissingRight++
            return
            }
    
        # compare contents for files (directories just have to exist)
        if (Test-Path -Type Leaf $rel) {
            if ( Compare-Object (Get-Content $left\$rel) (Get-Content $right\$rel) ) {
                Write-Output "content differs   : $rel"
                $ContentDiff++
                }
            }
        }
    }
finally {
    Pop-Location
    }

# Check items in $right for counterparts in $left.
# Something missing from $left of course won't be found when walking $left.
# Don't need to check content again here.

Push-Location $right

try {
    Get-ChildItem -Recurse | Resolve-Path -Relative | ForEach-Object {
        $rel = $_
        
        if (-not (Test-Path $left\$rel)) {
            Write-Output "missing from left : $rel"
            $MissingLeft++
            return
            }
        }
    }
finally {
    Pop-Location
    }

Write-Verbose "$Items items, $ContentDiff differed, $MissingLeft missing from left, $MissingRight from right"

Handy version using script parameters

Simple file-level comparison

Call it like PS > .\DirDiff.ps1 -a .\Old\ -b .\New\

Param(
  [string]$a,
  [string]$b
)

$fsa = Get-ChildItem -Recurse -path $a
$fsb = Get-ChildItem -Recurse -path $b
Compare-Object -Referenceobject $fsa -DifferenceObject $fsb

Possible output:

InputObject                  SideIndicator
-----------                  -------------
appsettings.Development.json <=
appsettings.Testing.json     <=
Server.pdb                   =>
ServerClientLibrary.pdb      =>

Do this:

compare (Get-ChildItem D:\MyFolder\NewFolder) (Get-ChildItem \\RemoteServer\MyFolder\NewFolder)

And even recursively:

compare (Get-ChildItem -r D:\MyFolder\NewFolder) (Get-ChildItem -r \\RemoteServer\MyFolder\NewFolder)

and it is even hard to forget :)

gci -path 'C:\Folder' -recurse | where {$_.PSIsContainer}

-recurse will explore all subtrees below the given root path, and the .PSIsContainer property is the one you want to test to grab folders only. You can use where {!$_.PSIsContainer} for just files.
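On PowerShell 3.0 and later, the same filtering can also be expressed with the -Directory and -File switches of Get-ChildItem instead of testing PSIsContainer. A small sketch, using a made-up demo tree under the temp directory:

```powershell
# Hypothetical demo tree under the temp directory; substitute your real path.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("gcidemo_" + [guid]::NewGuid().ToString("N"))
$null = New-Item -ItemType Directory -Path (Join-Path $root "sub")
Set-Content -Path (Join-Path $root "top.xml") -Value "<top/>"
Set-Content -Path (Join-Path (Join-Path $root "sub") "inner.xml") -Value "<inner/>"

# Folders only (equivalent to: where {$_.PSIsContainer})
$dirs = @(Get-ChildItem -Path $root -Recurse -Directory)

# Files only (equivalent to: where {!$_.PSIsContainer})
$files = @(Get-ChildItem -Path $root -Recurse -File)

$dirs  | Select-Object -ExpandProperty Name
$files | Select-Object -ExpandProperty Name
```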
