PowerShell script to replace commas within double quotes with nothing
I have a comma-separated CSV file in which I want to remove the commas that appear inside double quotes, and also remove the double quotes themselves:
Editor's note: The original form of this question asked to "change [the] delimiter to pipe" (|), which is no longer a requirement; gms0ulman's answer was written when it still was.
$inform = Get-Content C:\test.csv
$inform | % {
$info = $_.ToString().Replace(",","")
$var = $info
$var | Out-file C:\test1.csv -Append
}
Any help would be much appreciated.
In:
1,2,"Test,ABC"
Out:
1,2,TestABC
Import the CSV. Convert it to a CSV with a different delimiter. Replace the commas. Convert the delimiter back. Replace the double quotes. Write out the resulting file.
Import-Csv -Path C:\MyFile.csv |
ConvertTo-Csv -Delimiter '|' |
ForEach-Object { $_ -replace ',',[String]::Empty } |
ConvertFrom-Csv -Delimiter '|' |
ConvertTo-Csv -NoTypeInformation |
ForEach-Object { $_ -replace '"',[String]::Empty } |
Set-Content -Path C:\MyFile_fixed.csv
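The delimiter-swap steps above can be tried without touching any files. The following is a minimal sketch that assumes the data has a header row (h1,h2,h3 is made up for illustration, since Import-Csv and ConvertFrom-Csv treat the first line as column names); $lines and $result are hypothetical variable names.

```powershell
# Sample data; the header row is an assumption, not part of the question's input.
$lines = 'h1,h2,h3', '1,2,"Test,ABC"'

$result = $lines |
    ConvertFrom-Csv |                                   # parse into objects (quotes stripped)
    ConvertTo-Csv -Delimiter '|' -NoTypeInformation |   # re-emit with | as the separator
    ForEach-Object { $_ -replace ',', '' } |            # any remaining comma is inside a value
    ConvertFrom-Csv -Delimiter '|' |                    # parse the pipe-delimited text again
    ConvertTo-Csv -NoTypeInformation |                  # back to comma-delimited text
    ForEach-Object { $_ -replace '"', '' }              # drop the quotes that were re-added

$result
# h1,h2,h3
# 1,2,TestABC
```

Because the separator is temporarily a pipe, this approach would presumably misbehave if a value itself contained a | character.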
I would break this down into two steps. Another StackOverflow user may be able to give you a one-liner.
Import-Csv C:\test.csv | Export-Csv tempfile.csv -Delimiter "|" -NoTypeInformation
(Get-Content tempfile.csv).Replace(",","").Replace('"',"") | Out-File test1.csv
The following should do what you want (tested in PSv5.1):
Import-Csv C:\test.csv | ForEach-Object -Begin { $writeHeader = $True } {
if ($writeHeader) { $writeHeader = $False; $_.psobject.properties.Name -join ',' }
$_.psobject.properties.Value -replace ',', '' -join ','
} | Set-Content -Encoding UTF8 test1.csv
Import-Csv reads your CSV file into custom objects ([pscustomobject] instances) whose properties contain the column values with double quotes removed.
Any , instances inside a value can therefore be replaced blindly, without worrying about the column-separating , instances.
The problem is that you cannot use Export-Csv after modifying the objects, because in Windows PowerShell it invariably adds double quotes (back) around all output values.
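To see both effects in isolation, here is a small sketch that uses ConvertFrom-Csv and ConvertTo-Csv, the in-memory counterparts of Import-Csv and Export-Csv. The header row and the variable names $row and $csvLines are assumptions made for illustration.

```powershell
# Parse a header line plus one data line; no file is needed.
$row = 'h1,h2,h3', '1,2,"Test,ABC"' | ConvertFrom-Csv

# The enclosing quotes are gone, but the embedded comma is still part of the value.
$row.h3
# Test,ABC

# Converting the object back to CSV text wraps the values in double quotes again.
$csvLines = $row | ConvertTo-Csv -NoTypeInformation
$csvLines
# "h1","h2","h3"
# "1","2","Test,ABC"
```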
Therefore, a custom mini-script must be executed for each custom object, using ForEach-Object:
-Begin { $writeHeader = $True } is executed once at the beginning to signal the need to output a header row before the first data row.
$_.psobject.properties is the collection of all properties defined on the input object, named for the header columns, and containing a given data row's values.
$_.psobject.properties.Name -join ',' outputs the header row, simply by joining the property names (which are the column headers) with , to yield a single output string.
$_.psobject.properties.Value -replace ',', '' removes any value-internal , instances (replaces them with the empty string), and -join ',' again joins the resulting values as-is with , to output a data row.
Set-Content, which is preferable to Out-File here because the output objects are already strings, is used to write to the output file.
Note the -Encoding parameter, which controls the output character encoding; adjust as needed.
In Windows PowerShell (versions up to v5.1), omitting -Encoding would make Set-Content default to your system's "ANSI" code page (even though the help topic claims ASCII), whereas Out-File would default to UTF-16LE ("Unicode").
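The pieces described above can be checked against in-memory data before pointing them at a real file. This sketch swaps Import-Csv for ConvertFrom-Csv and collects the output lines in a variable instead of writing them with Set-Content; the sample rows and the names $sample and $outLines are illustrative assumptions.

```powershell
# Sample input with a header row and two data rows (made up for illustration).
$sample = 'h1,h2,h3', '1,2,"Test,ABC"', '3,4,"Test,DEF"'

$outLines = $sample | ConvertFrom-Csv |
    ForEach-Object -Begin { $writeHeader = $true } -Process {
        # Emit the header once, built from the property (column) names.
        if ($writeHeader) { $writeHeader = $false; $_.psobject.Properties.Name -join ',' }
        # Strip commas inside each value, then join the values with commas.
        $_.psobject.Properties.Value -replace ',', '' -join ','
    }

$outLines
# h1,h2,h3
# 1,2,TestABC
# 3,4,TestDEF
```

Piping $outLines to Set-Content -Encoding UTF8 with a path of your choice would then write the same lines to disk.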
Does your CSV have headers? Are the values to be changed in the same column?
If it looks something like this:
h1,h2,h3
1,2,"Test,ABC"
3,4,"Test,DEF"
This should work:
$Csv = Import-Csv -path C:\MyFile.csv
$Csv.H3 | foreach {$_.Replace('"',"").Replace(",","")}
Edit: Made it work. But it is basically the same as mklement0's solution:
$Csv = Import-Csv -path C:\MyFile.csv
$Csv | Foreach {$_.H3 = $_.H3.Replace(",","")}
$CsvObject = $Csv | Convertto-Csv -NoTypeInformation
$CsvObject.replace('"','') |
Set-Content C:\OutFile.Csv
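If the commas may appear in any column rather than only in H3, the same idea might be generalized by looping over every property of each row. This is only a sketch under that assumption: it works on in-memory sample data via ConvertFrom-Csv, and $sample, $rows, and $fixed are hypothetical names.

```powershell
# Sample data matching the layout shown above.
$sample = 'h1,h2,h3', '1,2,"Test,ABC"', '3,4,"Test,DEF"'
$rows = $sample | ConvertFrom-Csv

foreach ($row in $rows) {
    foreach ($prop in $row.psobject.Properties) {
        # Remove commas from every column value, not just H3.
        $prop.Value = $prop.Value -replace ',', ''
    }
}

# Convert back to CSV text, then strip the double quotes ConvertTo-Csv adds.
$fixed = ($rows | ConvertTo-Csv -NoTypeInformation) -replace '"', ''
$fixed
# h1,h2,h3
# 1,2,TestABC
# 3,4,TestDEF
```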