How do I remove a double quote that appears inside a pair of double quotes?
"Test T"est"
should give the output "Test Test"
"Test T"est", "Test1 "Test1"
should give the output "Test Test", "Test1 Test1"
You can try with awk (this needs GNU awk, because gensub is a gawk extension):
$ awk -F", *" '{                                  # -F sets the field separator: a comma plus optional spaces
    for (i = 1; i <= NF; i++) {                   # Loop through all fields
        $i = "\"" gensub("\"", "", "g", $i) "\""  # Strip every quote, then re-wrap the field in quotes
    }
}1' OFS="," file                                  # The trailing 1 prints the rebuilt line
"Test Test","Test1 Test1"
If this is a corrupted CSV and you can be sure there are no commas inside the fields, then PowerShell's CSV handling will still read the fields but leave the trailing quote in the value. Remove it, then re-export to a new CSV so that every value is wrapped in double quotes again.
Import-Csv .\test.csv -Header 'column1', 'column2' |
    ForEach-Object {
        # Strip every remaining double quote from each column value
        foreach ($column in $_.psobject.properties.Name)
        {
            $_.$column = $_.$column.Replace('"', '')
        }
        $_   # Pass the cleaned row down the pipeline
    } | Export-Csv .\test2.csv -NoTypeInformation
If the file already has a header row, remove the -Header 'column1', 'column2' part.
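The same assumption (no commas inside the fields) is what makes a per-field rewrite safe outside PowerShell as well. As a minimal sketch, assuming a POSIX awk such as mawk rather than gawk, the fields can be split on commas, stripped of every quote with gsub, and re-wrapped. The function name fix_quotes is hypothetical and used only for illustration.

```shell
# fix_quotes: hypothetical helper name, not from the original answers.
# Assumes no field contains a comma, so splitting on "," is safe.
fix_quotes() {
  awk -F', *' -v OFS=',' '{
    for (i = 1; i <= NF; i++) {
      gsub(/"/, "", $i)      # drop every double quote in the field
      $i = "\"" $i "\""      # re-wrap the field in one pair of quotes
    }
    print                    # the line was rebuilt with OFS between fields
  }' "$@"
}
```

For example, printf '%s\n' '"Test T"est", "Test1 "Test1"' | fix_quotes prints "Test Test","Test1 Test1". Like the PowerShell route, this normalises the separator and quotes every field, including fields that had no quotes to begin with.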
So if this is for a corrupted CSV, you could state the problem as: remove any double quote that is not at the start or end of a line and is not next to a comma (allowing for optional whitespace). This is easy to do with a PowerShell regex, since the .NET regex engine accepts variable-length lookbehind:
$t = '"Test T"est", "Test1 "Test1"'
$t -replace '(?<!^|\s*,\s*)"(?!\s*,\s*|$)', ''
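Lookbehind is not available in sed, so the rule needs restating for a shell pipeline. The sketch below is one possible approximation, not a port of the regex above: it drops a quote only when the nearest non-blank character on each side exists and is not a comma, and it loops so that several stray quotes in one field are all removed. The function name strip_inner_quotes is hypothetical, and -E assumes a sed with extended regex support (GNU and BSD sed both have it).

```shell
# strip_inner_quotes: hypothetical helper name, not from the original answers.
# A quote is removed only if a non-comma, non-blank character sits on both
# sides of it (optionally separated from it by whitespace). Quotes at the
# start or end of the line, or next to a comma, never match.
# The :again / t again loop re-runs the substitution until nothing changes.
strip_inner_quotes() {
  sed -E \
    -e ':again' \
    -e 's/([^,[:space:]][[:space:]]*)"([[:space:]]*[^,[:space:]])/\1\2/' \
    -e 't again' "$@"
}
```

For example, printf '%s\n' '"Test T"est", "Test1 "Test1"' | strip_inner_quotes prints "Test Test", "Test1 Test1", leaving the original separator untouched. One difference from the PowerShell pattern: a quote followed only by trailing whitespace is kept here.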
An alternative with sed (the \+ syntax is a GNU sed extension, and this removes only one stray quote per quoted field):
sed 's/\("[^"]\+\)"\([^"]\+"\)/\1\2/g' inputFile
input:
"Test T"est"
"Test T"est", "Test1 "Test1"
output:
"Test Test"
"Test Test", "Test1 Test1"