Concatenate CSV files in PowerShell, without the first line (except for the first file)

I have multiple *.csv files. I want to concatenate them into a single CSV file in a PowerShell script. All CSV files have the same header (the first line), so when I concatenate them I want to keep the first line only from the first file.

How can I do that?

Note: The solution in this answer intentionally uses plain-text processing to handle the files, for two reasons:

  • Use of Import-Csv and Export-Csv incurs significant processing overhead (though that may not matter in a given situation); plain-text processing is significantly faster.

  • In Windows PowerShell and PowerShell [Core] 6.x, the output of Export-Csv will invariably have double-quoted column values, even if the input values weren't quoted (though that should normally not matter).

    • In PowerShell [Core] 7.0+, Export-Csv and ConvertTo-Csv have a -UseQuotes parameter that allows you to control quoting in the output.

That said, Import-Csv and Export-Csv are certainly the better choice whenever you need to read and interpret the data (as opposed to just copying it elsewhere); see Sid's helpful answer.


# The single output file.
# Note: Best to save this in a different folder than the input
#       folder, in case you need to run multiple times.
$outFile = 'outdir/out.csv'

# Get all input CSV files as an array of file-info objects,
# from the current dir. in this example
$inFiles = @(Get-ChildItem -Filter *.csv)

# Extract the header line (column names) from the first input file
# and write it to the output file.
Get-Content $inFiles[0].FullName -First 1 | Set-Content -Encoding Utf8 $outFile

# Process all input files and append their *data* rows to the
# output file (that is, skip the header row).
# NOTE: If you only wanted to extract a given count $count of data rows
#       from each file, add -First ($count+1) to the Get-Content call.
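foreach ($file in $inFiles) {
  # Use the loop variable $file ($_ is not defined in a foreach statement),
  # and Add-Content to append (Set-Content has no -Append parameter).
  Get-Content $file.FullName | Select-Object -Skip 1 |
    Add-Content -Encoding Utf8 $outFile
}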

Note that -Encoding Utf8 is used as an example; adjust as needed. By default, Set-Content uses "ANSI" encoding in Windows PowerShell, and BOM-less UTF-8 in PowerShell Core.

Caveat: By doing line-by-line plain-text processing, you're relying on each text line representing a single CSV data row; this is typically true, but doesn't have to be (for example, a double-quoted field may contain an embedded line break, so that one data row spans several text lines).
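As a minimal sketch of that caveat, the following creates a made-up sample file (the file name and its contents are assumptions for illustration only) whose single data row contains a quoted field with an embedded line break, and then compares the plain-text view with the CSV-aware view:

```powershell
# Hypothetical sample file: one header row and ONE data row whose
# quoted second field contains an embedded line break.
$demoFile = Join-Path ([IO.Path]::GetTempPath()) 'multiline-demo.csv'
Set-Content -LiteralPath $demoFile -Value @(
  'id,comment'
  '1,"first line'
  'second line"'
)

# Plain-text view: three text lines (header + two physical lines).
$textLineCount = @(Get-Content -LiteralPath $demoFile).Count

# CSV-aware view: a single data row.
$csvRowCount = @(Import-Csv -LiteralPath $demoFile).Count

"Text lines: $textLineCount; CSV data rows: $csvRowCount"
```

Skipping only the first text line still removes the header in a file like this, but anything that counts data rows by counting text lines (such as limiting the number of rows taken per file) would be off.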

Conversely, if performance is paramount, the plain-text approach above could be made significantly faster with direct use of .NET methods such as [IO.File]::ReadLines() or, if the files are small enough, even [IO.File]::ReadAllLines().
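One way the .NET-based variant might look is sketched below. It is not taken from the original answer: the temp-folder locations, the two sample files, and the choice of a StreamWriter with BOM-less UTF-8 are all assumptions made to keep the example self-contained.

```powershell
# Assumed locations for illustration only; adjust to your setup.
$inDir   = Join-Path ([IO.Path]::GetTempPath()) 'csv-merge-in'
$outFile = Join-Path ([IO.Path]::GetTempPath()) 'csv-merge-out.csv'

# Create two small sample input files so the sketch is self-contained.
$null = New-Item -ItemType Directory -Force -Path $inDir
Set-Content -LiteralPath (Join-Path $inDir 'a.csv') -Value 'id,name', '1,one'
Set-Content -LiteralPath (Join-Path $inDir 'b.csv') -Value 'id,name', '2,two'

$inFiles = @(Get-ChildItem -LiteralPath $inDir -Filter *.csv | Sort-Object Name)

# .NET does not track PowerShell's current location, so pass full paths.
$utf8NoBom = [System.Text.UTF8Encoding]::new($false)
$writer = [System.IO.StreamWriter]::new($outFile, $false, $utf8NoBom)
try {
  $isFirstFile = $true
  foreach ($file in $inFiles) {
    $isFirstLine = $true
    foreach ($line in [System.IO.File]::ReadLines($file.FullName)) {
      # Keep the header line only from the first file.
      if ($isFirstLine -and -not $isFirstFile) { $isFirstLine = $false; continue }
      $isFirstLine = $false
      $writer.WriteLine($line)
    }
    $isFirstFile = $false
  }
}
finally {
  # Flush and close the output file even if reading fails.
  $writer.Dispose()
}
```

ReadLines() enumerates lazily, so large files are not loaded into memory at once; ReadAllLines() would return the whole file as an array instead.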

You could do it like this:

(Get-ChildItem -Path $path -Filter *.csv).FullName | Import-Csv | Export-Csv $path\concatenated.csv -NoTypeInformation

Here, $path is the folder that contains the CSV files. The final CSV file will be written to the same folder.
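Because the output lands in the input folder, a second run would also pick up concatenated.csv as an input. A possible variant that writes to a separate folder is sketched below; the temp-folder locations, the $outDir and $outCsv names, and the two sample files are assumptions for illustration, not part of the original answer.

```powershell
# Assumed folders for illustration only.
$path   = Join-Path ([IO.Path]::GetTempPath()) 'csv-concat-in'
$outDir = Join-Path ([IO.Path]::GetTempPath()) 'csv-concat-out'
$null = New-Item -ItemType Directory -Force -Path $path, $outDir

# Two small sample files with the same header.
Set-Content -LiteralPath (Join-Path $path 'a.csv') -Value 'id,name', '1,one'
Set-Content -LiteralPath (Join-Path $path 'b.csv') -Value 'id,name', '2,two'

# Import every input file as objects and export them as one CSV file
# in a different folder, so that reruns do not read the previous output.
$outCsv = Join-Path $outDir 'concatenated.csv'
Get-ChildItem -LiteralPath $path -Filter *.csv |
  Sort-Object Name |
  ForEach-Object { Import-Csv -LiteralPath $_.FullName } |
  Export-Csv -LiteralPath $outCsv -NoTypeInformation

# Read the result back to inspect it.
$result = @(Import-Csv -LiteralPath $outCsv)
```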
