PowerShell - How to search (using wildcard) and replace values in a CSV file?
I have a CSV file (one column/field only) with thousands of records in it.
I need a way in PowerShell to search for a value using a few characters followed by a wildcard and, where found, replace that value with a ".
I have searched around on how to do this, but everything I have found so far either doesn't cover CSV files or doesn't explain how I might do the search using a wildcard.
Example of values in CSV file:
<#
RanDom.Texto 1.yellow [ Table - wood ] "gibberishcode1.moreRandomText11.xyz123+456"
R@ndomEq.Textolo 2.blue [Chair - steel ] "gibberishcode2.moreRandomText222.xyz19283+4567+89
randomi.Textpel 3.green [ counter - granite] "gibberishcode3.moreRandomText3333.xyz17243+3210+987+654"
#>
You will note above that the only value in common across the records is the .xyz in each record.
I want to replace the .xyz (and everything that follows) with a " value.
E.g. desired result as follows:
<#
RanDom.Texto 1.yellow [ Table - wood ] "gibberishcode1.moreRandomText11"
R@ndomEq.Textolo 2.blue [Chair - steel ] "gibberishcode2.moreRandomText222"
randomi.Textpel 3.green [ counter - granite] "gibberishcode3.moreRandomText3333"
#>
Here is some code I tried, but it doesn't work: it didn't replace the values (although it does successfully export to a new CSV file).
# Create function that gets the current file path (of where this script is located)
function Get-ScriptDirectory {Split-Path -parent $PSCommandPath}
# Create function that gets the current date and time in format of 1990-07-01_19h15m59
function Get-TimeStamp {return "{0:yyyy-MM-dd}_{0:HH}h{0:mm}m{0:ss}" -f (Get-Date)}
# Set current file path. Also used in both FOR loops below as primary source directory.
${sourceDirPath} = Get-ScriptDirectory
# Import CSV look-up file
${csvFile} = (Import-Csv -Path ${sourceDirPath}\SourceCSVFile.csv)
# for each row, replace the values of .xyz and all that follows with "
foreach(${row} in ${csvFile})
{
${row} = ${row} -replace '.xyz*','"'
}
# Set modified CSV's name and path
${newCSVFile} = ${sourceDirPath} + '\' + $(Get-TimeStamp) + '_SourceCSVFile_Modified.csv'
# export the modified CSV
${csvFile} | Export-Csv ${newCSVFile} -NoTypeInformation
I also tried this as an alternative, but no luck either (I think the code below may only work for .txt files?):
((Get-Content -path C:\TEMP\TEST\SourceCSVFile.csv -Raw) -replace '.xyz'*,'"') | Export-Csv -Path C:\TEMP\TEST\ReplacementFile.csv
I'm new to PowerShell and don't have a proper understanding of regex yet, so please be gentle.
UPDATE and SOLUTION:
For those that are interested in my final solution: I used the code provided by Thomas (thank you!!), however my .csv file was left with some records that had a triple quote (""") value at the end of the string.
As such, I modified the code to use variables and execute a second cleaning pass that replaces all triple quotation marks (e.g. """) with a single quotation mark (e.g. ") and then pipes the result to a file.
# Create function that gets the current file path (of where this script is located and running from)
function Get-ScriptDirectory {Split-Path -parent $PSCommandPath}
# Set current file path
${sourceDirPath} = Get-ScriptDirectory
# Assign source .csv file name to variable
$origNameSource = 'AllNames.csv'
# Assign desired .csv file name post cleaning
$origNameCLEAN = 'AllNames_CLEAN.csv'
# First pass clean to replace .xyz* with " and assign result to tempCsvText variable
${tempCsvText} = ((Get-Content -Path ${sourceDirPath}\$origNameSource) | % {$_ -replace '\.xyz.*$', '"'})
# Second pass clean to replace """ with " and write result to a new .csv file
${tempCsvText} -replace '"""', '"' | Set-Content -Path ${sourceDirPath}\$origNameCLEAN
# Import records from new .csv file and remove duplicates by using Sort-Object * -Unique
${csvFile} = (Import-Csv -Path ${sourceDirPath}\$origNameCLEAN) | Sort-Object * -Unique
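As a minimal sketch only, the two cleaning passes above could also be chained in a single expression, since -replace operators are applied left to right. The sample lines and the variable names $sampleLines and $cleaned are hypothetical; the second line is constructed so that the first pass leaves a triple quote behind, and it works on in-memory strings rather than on the real file.

```powershell
# Hypothetical sample lines; the second is built so the first pass leaves a triple quote
$sampleLines = @(
    'RanDom.Texto 1.yellow [ Table - wood ] "gibberishcode1.moreRandomText11.xyz123+456"',
    'R@ndomEq.Textolo 2.blue [Chair - steel ] "gibberishcode2.moreRandomText222"".xyz19283+4567+89'
)

# First -replace trims everything from .xyz to the end of the line and appends a quote;
# second -replace collapses any resulting triple quote into a single quote
$cleaned = $sampleLines | ForEach-Object {
    $_ -replace '\.xyz.*$', '"' -replace '"""', '"'
}

# Show the cleaned lines
$cleaned
```

In a real script the same expression could sit between Get-Content and Set-Content, which would avoid holding the intermediate result in a separate variable.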
First, a .csv file is nothing more than a regular text file that follows some rules on how content is embedded (one line per row, columns delimited by a defined ASCII character, optional header). Your last line is close. You have to use a regular expression that reaches to the end of each line. This will do it:
Get-Content -Path C:\TEMP\TEST\SourceCSVFile.csv | % {$_ -replace '\.xyz.*$', '"'} | Set-Content -Path C:\TEMP\TEST\ReplacementFile.csv
Differences:
- I removed the -Raw parameter to get each line as one string.
- I adjusted your regular expression to match from .xyz until the end of each line.
- I piped the result to Set-Content, as I only did text replacement and did not read any objects that would then have to be translated back to CSV text by Export-Csv.
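To illustrate the regular-expression difference, here is a small sketch on in-memory strings. The -replace operator treats its pattern as a regular expression, not as a wildcard, so '.xyz*' means "any one character, then xy, then zero or more z" rather than ".xyz followed by anything". The sample lines are taken from the question; the variable names $lines, $wildcardStyle and $regexStyle are made up for this example.

```powershell
# Sample lines modelled on the question's data
$lines = @(
    'RanDom.Texto 1.yellow [ Table - wood ] "gibberishcode1.moreRandomText11.xyz123+456"',
    'randomi.Textpel 3.green [ counter - granite] "gibberishcode3.moreRandomText3333.xyz17243+3210+987+654"'
)

# '.xyz*' read as a regex: any one character, "xy", then zero or more "z".
# Only ".xyz" itself is replaced, so the digits after it are left behind.
$wildcardStyle = $lines | ForEach-Object { $_ -replace '.xyz*', '"' }

# '\.xyz.*$' matches a literal dot, "xyz", and everything up to the end of the line
$regexStyle = $lines | ForEach-Object { $_ -replace '\.xyz.*$', '"' }

# Compare the two results
$wildcardStyle
$regexStyle
```

The first result still carries the trailing digits (for example ...moreRandomText11"123+456"), while the second ends cleanly at the inserted quote.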