
How can I replace every comma with a space in a text file before a pattern using PowerShell

I have a text file with lines in this format:

FirstName,LastName,SSN,$x.xx,$x.xx,$x.xx
FirstName,MiddleInitial,LastName,SSN,$x.xx,$x.xx,$x.xx

The lines could be in either format. For example:

Joe,Smith,123-45-6789,$150.00,$150.00,$0.00
Jane,F,Doe,987-65-4321,$250.00,$500.00,$0.00

I basically want to turn everything before the SSN into a single name field, like this:

Joe Smith,123-45-6789,$150.00,$150.00,$0.00
Jane F Doe,987-65-4321,$250.00,$500.00,$0.00

How can I do this using PowerShell? I think I need to use ForEach-Object and at some point replace "," with " ", but I don't know how to specify the pattern. I also don't know how to use ForEach-Object with $_.Where so that I can specify the "SkipUntil" mode.

Thanks very much!

Mathias is correct; you want to use the -replace operator, which uses regular expressions. I think this will do what you want:

$string -replace ',(?=.*,\d{3}-\d{2}-\d{4})',' '

The regular expression uses a lookahead, (?=...), to match only those commas that are followed by any number of any characters (. is any character, * means zero or more of them), then a comma immediately followed by an SSN (\d{3}-\d{2}-\d{4}). A "zero-width assertion" such as this lookahead is used to decide whether there is a match, but the text it examines is not returned as part of the match.

That's how we're able to match only the commas in the names themselves, and then replace them with a space.
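To see the lookahead in action, here is a small sketch that runs the pattern against the two sample lines from the question. The variable names ($lines, $pattern, $result) are only for illustration; single-quoted strings are used so that PowerShell does not try to expand the dollar amounts as variables.

```powershell
# Sample lines from the question (single quotes keep $150.00 etc. literal).
$lines = @(
    'Joe,Smith,123-45-6789,$150.00,$150.00,$0.00'
    'Jane,F,Doe,987-65-4321,$250.00,$500.00,$0.00'
)

# A comma matches only if a ",<SSN>" still appears somewhere after it.
$pattern = ',(?=.*,\d{3}-\d{2}-\d{4})'

# -replace applied to an array processes each element in turn.
$result = $lines -replace $pattern, ' '
$result
```

For a file, the same expression could be used inside a pipeline, for example Get-Content data.csv | ForEach-Object { $_ -replace $pattern, ' ' } | Set-Content data2.csv, where the file names are placeholders.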

I know it's answered, and neatly so, but I tried to come up with an alternative to using a regex: count the number of commas in a line, then replace either the first one or the first two commas in the line.

But strings can't count how many times a character appears in them without using the regex engine(*), and replacements can't be done a specific number of times without using the regex engine(**), so it's not very neat:

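$comma = [regex]","
Get-Content data.csv | ForEach-Object {

    # Four commas always remain (before the SSN and before each of the
    # three amounts); any others are inside the name.
    $numOfCommasToReplace = $comma.Matches($_).Count - 4
    # Replace only the first N commas with a space.
    $comma.Replace($_, ' ', $numOfCommasToReplace)

} | Out-File data2.csv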

Avoiding the regex engine entirely, just for fun, gets me things like this:

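Get-Content .\data.csv | ForEach-Object {

    # String.Split() is used because the -split operator is regex-based.
    $1,$2,$3,$4,$5,$6,$7 = $_.Split(',')
    # A seventh field means the line has a middle initial.
    if ($7) {"$1 $2 $3,$4,$5,$6,$7"} else {"$1 $2,$3,$4,$5,$6"}

} | Out-File data2.csv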
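The fixed seven-variable assignment can also be written so that it does not depend on how many name parts there are. The following is only a sketch, assuming (as in the question) that every line ends with exactly four non-name fields: the SSN and three amounts. The function name Join-NameFields is made up for illustration.

```powershell
# Hypothetical helper, not part of the original answer.
# Assumes the last 4 fields (SSN and three amounts) are always present.
function Join-NameFields {
    param([string]$Line)

    $fields = $Line.Split(',')      # String.Split, so no regex is involved
    $tail   = 4                     # number of trailing non-name fields
    if ($fields.Count -le $tail) { return $Line }   # no name commas to merge

    $split = $fields.Count - $tail
    $name  = $fields[0..($split - 1)] -join ' '                 # name parts
    $rest  = $fields[$split..($fields.Count - 1)] -join ','     # SSN + amounts
    "$name,$rest"
}

Join-NameFields 'Joe,Smith,123-45-6789,$150.00,$150.00,$0.00'
Join-NameFields 'Jane,F,Doe,987-65-4321,$250.00,$500.00,$0.00'
```

Slicing from the end keeps the logic tied to the part of the line whose shape is known, so a name with three or more parts would be handled the same way.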

(*) ($line -as [char[]] -eq ',').Count

(**) while ( #counting ) { # split/mangle/join }
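Both footnotes can be made concrete with plain String and char operations. This is a sketch under the same assumption as the answer above (four commas always remain after the name); the function name Convert-LeadingComma and the variable names are hypothetical.

```powershell
# Hypothetical helper: turn the first $Count commas of $Line into spaces
# using only String/char operations (no regex engine).
function Convert-LeadingComma {
    param([string]$Line, [int]$Count)

    $chars = $Line.ToCharArray()
    $start = 0
    for ($i = 0; $i -lt $Count; $i++) {
        $pos = $Line.IndexOf([char]',', $start)
        if ($pos -lt 0) { break }       # fewer commas than requested
        $chars[$pos] = [char]' '
        $start = $pos + 1
    }
    -join $chars
}

$line = 'Jane,F,Doe,987-65-4321,$250.00,$500.00,$0.00'

# (*) count the commas by filtering the character array
$commaCount = ($line.ToCharArray() -eq [char]',').Count

# (**) replace every comma except the last four
$fixed = Convert-LeadingComma -Line $line -Count ($commaCount - 4)
$fixed
```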
