PowerShell copy and rename multiple .csv files from 10+ subfolders

I'm looking for a way to take multiple .csv files that all have exactly the same name, sit in different subfolders of the same directory, and merge them into one .csv file. The header (the first line) should be kept only from the first file and skipped for all the others. The number of lines in each .csv file varies, so the script needs to work out how many lines each file actually contains and merge only those, to avoid blank lines in the result.

This is what I tried so far:

$src = "C:\Users\E\Desktop\Merge\Input\Files*.csv"
$dst = "C:\Users\E\Desktop\Merge\Output"

Get-ChildItem -Path $src -Recurse -File | Copy-Item -Destination $dst

and this one:

Get-ChildItem -Path $src -Recurse -File | Copy-Item -Destination $dst | 
ForEach-Object {
$NewName = $_.Name
$Destination = Join-Path -Path $_.Directory.FullName -ChildPath $NewName
Move-Item -Path $_.FullName -Destination $Destination -Force
}

Any help, please? :)

Since you are looking to merge the files you may as well read them all into PowerShell, and then output the whole thing at once. You could do something like:

$Data = Get-ChildItem -Path $src -Recurse -File | Import-Csv
$Data | Export-Csv $dst\Output.csv -NoTypeInformation

That may not be feasible if your CSV files are extremely large, but it is a simple way to merge CSV files if the header row is the same in all files.
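If memory is the concern, one variation is to skip the intermediate variable and let the rows stream through the pipeline. The sketch below assumes every file has the same header row; `Merge-CsvByObject` and its parameter names are made up for illustration and are not part of the original answer.

```powershell
# Hypothetical helper: streams rows from every matching CSV into one output file.
function Merge-CsvByObject {
    param(
        [Parameter(Mandatory)] [string] $Source,      # wildcard path, e.g. ...\Input\Files*.csv
        [Parameter(Mandatory)] [string] $OutputFile   # full path of the merged file
    )
    # Rows pass from Import-Csv to Export-Csv one at a time, so the merged
    # data is never collected in a variable. Export-Csv writes the header
    # once, based on the properties of the first row it receives.
    Get-ChildItem -Path $Source -Recurse -File |
        Import-Csv |
        Export-Csv -Path $OutputFile -NoTypeInformation
}
```

With the paths from the question, a call might look like `Merge-CsvByObject -Source $src -OutputFile (Join-Path $dst 'Output.csv')`.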

Another method would be to just treat it as text, which is much less memory intensive. For that you would want to get a list of files, copy the first one intact, and then copy the rest of them skipping the header row.

$Files = Get-ChildItem $src -Recurse
$TargetFile = Join-Path $dst $Files[0].Name
$Files[0] | Copy-Item -Dest $TargetFile
#Skip the first file, and loop through the rest
$Files | Select -Skip 1 | ForEach-Object{
    #Get the contents of the file, and skip the header row, then append the rest to the target
    Get-Content $_ | Select -Skip 1 | Add-Content $TargetFile
}

Edit: Ok, I wanted to replicate the process so that I could figure out what was giving you errors. To do that I created 3 folders and copied a .csv file with 4 entries into each folder, with all of the files named 'Files 06202018.csv'. I ran my code above, and it did what it should, except for some file corruption: the second file was appended directly to the end of the first file without a new line being created for it (presumably because the copied first file did not end with a line break). So instead of just copying the first file, I changed the code to read it and write its lines to a new file in the destination. The below code worked flawlessly for me:

$src = "C:\Temp\Test\Files*.csv" 
$dst = "C:\Temp\Test\Output"
$Files = Get-ChildItem $src -Recurse 
$TargetFile = Join-Path $dst $Files[0].Name
Get-Content $Files[0] | Set-Content $TargetFile
#Skip the first file, and loop through the rest 
$Files | Select -Skip 1 | ForEach-Object{ 
    #Get the contents of the file, and skip the header row, then append the rest to the target 
    Get-Content $_ | Select -Skip 1 | Add-Content $TargetFile 
}

That took the files:

C:\Temp\Test\Lapis\Files 06202018.csv
C:\Temp\Test\Malachite\Files 06202018.csv
C:\Temp\Test\Opal\Files 06202018.csv

And it combined those three files into a correctly merged file at:

C:\Temp\Test\Output\Files 06202018.csv
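The question also asked about avoiding blank lines. A minimal sketch of one way to handle that in the text-based approach is to filter out empty or whitespace-only lines after skipping the header. `Get-CsvDataLine` is a hypothetical helper name, and the sketch assumes no quoted CSV field spans several lines (such a field could legitimately contain an empty line).

```powershell
# Hypothetical helper: emits a file's lines after the header row,
# dropping lines that are empty or contain only whitespace.
function Get-CsvDataLine {
    param([Parameter(Mandatory)] [string] $Path)
    Get-Content -LiteralPath $Path |
        Select-Object -Skip 1 |
        Where-Object { $_.Trim() -ne '' }
}
```

Inside the loop it could stand in for the `Get-Content $_ | Select -Skip 1` step, for example `Get-CsvDataLine $_.FullName | Add-Content $TargetFile`.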

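The only time I had any issues was when I forgot to delete the target file before running this. In my test layout the Output folder sits under the folder being searched, so a leftover target also matches Files*.csv and would likely be picked up as an input. Depending on how large these files are, and how much memory you have available, you could probably speed this up by changing the last two lines to this: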

    Get-Content $_ | Select -Skip 1
} | Add-Content $TargetFile

That would read all of the files in (other than the first one) and only write to the destination once, instead of having to get a file lock, open the file for writing, write, and close the destination for each file.
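Putting those pieces together, the whole merge can be sketched as one function that clears a stale target first and writes everything in a single call. `Merge-CsvAsText` and its parameters are hypothetical names, and the sketch assumes the target folder already exists and that sorting the inputs by full path is an acceptable merge order.

```powershell
# Hypothetical helper: merges matching CSV files as plain text.
function Merge-CsvAsText {
    param(
        [Parameter(Mandatory)] [string] $Source,      # wildcard path, e.g. ...\Test\Files*.csv
        [Parameter(Mandatory)] [string] $TargetFile   # full path of the merged file
    )
    # Remove a leftover target so it is neither appended to nor read as input.
    if (Test-Path -LiteralPath $TargetFile) { Remove-Item -LiteralPath $TargetFile }

    # Collect the inputs up front, in a predictable order.
    $files = @(Get-ChildItem -Path $Source -Recurse -File | Sort-Object FullName)
    if ($files.Count -eq 0) { return }

    # The first file keeps its header; every other file skips its own.
    # All lines from the script block go to the target in one Set-Content call.
    & {
        Get-Content -LiteralPath $files[0].FullName
        $files | Select-Object -Skip 1 | ForEach-Object {
            Get-Content -LiteralPath $_.FullName | Select-Object -Skip 1
        }
    } | Set-Content -LiteralPath $TargetFile
}
```

With the variables from the code above, a call might look like `Merge-CsvAsText -Source $src -TargetFile (Join-Path $dst 'Files 06202018.csv')`.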
