PowerShell copy and rename multiple .csv files from 10+ subfolders

I'm searching for a way to copy multiple .csv files, all named exactly the same and located in different folders (all of them under the same directory), and merge them into one .csv file. I would like to skip the first line, which is the header, in every file except the first. There is no rule for how many lines each .csv file contains, so the script should detect the written lines in order to know how many and which ones to merge, and to avoid blank lines.

This is what I tried so far:

$src = "C:\Users\E\Desktop\Merge\Input\Files*.csv"
$dst = "C:\Users\E\Desktop\Merge\Output"

Get-ChildItem -Path $src -Recurse -File | Copy-Item -Destination $dst

And this one:

Get-ChildItem -Path $src -Recurse -File | Copy-Item -Destination $dst | 
ForEach-Object {
$NewName = $_.Name
$Destination = Join-Path -Path $_.Directory.FullName -ChildPath $NewName
Move-Item -Path $_.FullName -Destination $Destination -Force
}

Any help, please? :)

Since you are looking to merge the files, you may as well read them all into PowerShell and then output the whole thing at once. You could do something like:

$Data = Get-ChildItem -Path $src -Recurse -File | Import-Csv
$Data | Export-Csv $dst\Output.csv -NoTypeInformation

That may not be feasible if your CSV files are extremely large, but it is a simple way to merge CSV files if the header row is the same in all files.
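The same idea can also be written as one pipeline, so rows flow from Import-Csv to Export-Csv without first being collected in `$Data`. The following is only a sketch under the assumption that every file shares the same header; the temporary folder, the subfolder names `A` and `B`, and the sample rows are made up for the demonstration and are not from the question.

```powershell
# Build a throwaway folder tree with two same-named CSV files (hypothetical sample data).
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("MergeDemo_" + [guid]::NewGuid())
$out  = Join-Path $root 'Output'
foreach ($name in 'A', 'B') {
    $dir = Join-Path (Join-Path $root 'Input') $name
    New-Item -ItemType Directory -Path $dir -Force | Out-Null
    Set-Content -Path (Join-Path $dir 'Files.csv') -Value 'Id,Name', "1,$name", "2,$name"
}
New-Item -ItemType Directory -Path $out -Force | Out-Null

# Stream every matching file through Import-Csv straight into Export-Csv.
# The output folder sits outside the scanned Input folder, so it is never re-read.
$merged = Join-Path $out 'Output.csv'
Get-ChildItem -Path (Join-Path $root 'Input') -Filter 'Files*.csv' -Recurse -File |
    Import-Csv |
    Export-Csv -Path $merged -NoTypeInformation

# Show the merged file: one header row followed by the rows of both inputs.
Get-Content $merged
```

Because Import-Csv parses each file separately, the header row is written only once and the headers of the later files are dropped automatically.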

Another method would be to just treat it as text, which is much less memory intensive. For that you would want to get a list of files, copy the first one intact, and then copy the rest of them, skipping the header row.

$Files = Get-ChildItem $src -Recurse
$TargetFile = Join-Path $dst $Files[0].Name
$Files[0] | Copy-Item -Dest $TargetFile
#Skip the first file, and loop through the rest
$Files | Select -Skip 1 | ForEach-Object{
    #Get the contents of the file, and skip the header row, then append the rest to the target
    Get-Content $_ | Select -Skip 1 | Add-Content $TargetFile
}

Edit: OK, I wanted to replicate the process so that I could figure out what was giving you errors. To do that I created 3 folders and copied a .csv file with 4 entries into each folder, with all of the files named 'Files 06202018.csv'. I ran my code above, and it did what it should, but there was some file corruption: the second file was appended directly to the end of the first file without a new line being created for it. So instead of just copying the first file, I changed the code to read it and create a new file in the destination. The code below worked flawlessly for me:

$src = "C:\Temp\Test\Files*.csv"
$dst = "C:\Temp\Test\Output"
$Files = Get-ChildItem $src -Recurse
$TargetFile = Join-Path $dst $Files[0].Name
#Read the first file (header included) and write it out as a new file
Get-Content $Files[0] | Set-Content $TargetFile
#Skip the first file, and loop through the rest
$Files | Select -Skip 1 | ForEach-Object{
    #Get the contents of the file, and skip the header row, then append the rest to the target
    Get-Content $_ | Select -Skip 1 | Add-Content $TargetFile
}

That took the files:

C:\Temp\Test\Lapis\Files 06202018.csv
C:\Temp\Test\Malachite\Files 06202018.csv
C:\Temp\Test\Opal\Files 06202018.csv

And it combined those three files into a correctly merged file at:

C:\Temp\Test\Output\Files 06202018.csv

The only time I had any issues was when I forgot to delete the target file before running this. Depending on how large these files are, and how much memory you have available, you could probably speed this up by changing the last two lines to this:

    Get-Content $_ | Select -Skip 1
} | Add-Content $TargetFile

That would read all of the files (other than the first one) and write to the destination only once, instead of having to get a file lock, open the file for writing, write, and close the destination for each file.
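Two loose ends from above, the leftover target file and the blank lines mentioned in the question, could be handled in one helper. This is a rough sketch rather than a tested drop-in: the function name `Merge-CsvText`, its parameters, and the sample files are hypothetical, and it assumes the destination is not one of the source files.

```powershell
# Hypothetical helper: merge same-header CSV files as plain text.
function Merge-CsvText {
    param(
        [Parameter(Mandatory)] [string[]] $Path,        # full paths of the source files
        [Parameter(Mandatory)] [string]   $Destination  # full path of the merged file
    )
    # Start clean so output from a previous run is never appended to.
    if (Test-Path -LiteralPath $Destination) { Remove-Item -LiteralPath $Destination }

    $skip = 0   # keep the header of the first file only
    $Path | ForEach-Object {
        # Skip the header where needed and drop empty or whitespace-only lines.
        Get-Content -LiteralPath $_ |
            Select-Object -Skip $skip |
            Where-Object { $_ -match '\S' }
        $skip = 1
    } | Set-Content -LiteralPath $Destination   # the destination is written once
}

# Usage with made-up sample files in a temporary folder.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("MergeText_" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $root | Out-Null
$a = Join-Path $root 'a.csv'
$b = Join-Path $root 'b.csv'
Set-Content -Path $a -Value 'Id,Name', '1,Lapis', ''   # note the trailing blank line
Set-Content -Path $b -Value 'Id,Name', '2,Opal'

$target = Join-Path $root 'merged.csv'
Merge-CsvText -Path $a, $b -Destination $target
Get-Content $target   # Id,Name / 1,Lapis / 2,Opal
```

With the real folders, the list of paths could come from `(Get-ChildItem $src -Recurse -File).FullName`, provided the output folder is not matched by `$src`.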

