How to remove double quote between double quotes

How can I remove a double quote that sits between a pair of double quotes?

"Test T"est" should become "Test Test"

"Test T"est", "Test1 "Test1" should become "Test Test", "Test1 Test1"

You can try with awk (GNU awk is needed, since gensub is a gawk extension):

$ awk -F", *" '{                           # Set the field separator
  for(i=1;i<=NF;i++){                      # Loop through all fields
     $i="\""gensub("\"", "", "g", $i)"\""  # Rebuild the field with only surrounding quotes
  }
}1' OFS="," file                           # Print the line 
"Test Test","Test1 Test1"
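
For comparison, the same idea can be sketched in Python: split on a comma followed by optional spaces, drop every quote inside each field, then wrap the field in quotes again. The function name `requote_fields` is made up for this illustration, and like the awk version it assumes fields never contain commas.

```python
import re


def requote_fields(line, out_sep=","):
    """Rebuild each comma-separated field so only the surrounding quotes remain."""
    # Same separator as the awk -F", *": a comma followed by optional spaces
    fields = re.split(r", *", line)
    # Remove every double quote inside the field, then re-add the outer pair
    return out_sep.join('"' + field.replace('"', "") + '"' for field in fields)


print(requote_fields('"Test T"est", "Test1 "Test1"'))
```

As with the awk one-liner, the output separator is a bare comma, so the space after the comma in the input is not preserved.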

If this is a corrupted CSV and you can assume there are no commas inside the fields, PowerShell's CSV handling will read the fields but leave the trailing quote in each value. Strip the remaining quotes, then re-export to a new CSV so that every value is wrapped in double quotes.

Import-Csv .\test.csv -Header 'column1', 'column2' |
    ForEach-Object {
        # Strip every remaining double quote from each column value
        foreach ($column in $_.PSObject.Properties.Name)
        {
            $_.$column = $_.$column.Replace('"', '')
        }

        # Pass the cleaned row down the pipeline
        $_
    } | Export-Csv .\test2.csv -NoTypeInformation

If the file already has a header row, remove the -Header 'column1', 'column2' part.
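
A rough Python counterpart of this read, clean, re-export pipeline, using the standard `csv` module, might look like the sketch below. The function name `repair_csv` is hypothetical. It relies on the reader's default non-strict mode tolerating the stray quotes, and it carries the same assumption that no field contains a comma.

```python
import csv
import io


def repair_csv(text):
    """Parse leniently, strip leftover quotes, and re-export with every field quoted."""
    # skipinitialspace=True ignores the blank that follows each comma
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    out = io.StringIO()
    # QUOTE_ALL wraps every written value in double quotes
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for row in reader:
        # Whatever quotes the lenient parser left behind are removed here
        writer.writerow([value.replace('"', "").strip() for value in row])
    return out.getvalue()


print(repair_csv('"Test T"est", "Test1 "Test1"\n'), end="")
```

To work on files instead of strings, the two `io.StringIO` objects could be replaced with files opened with `newline=""`.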

If this is for a corrupted CSV, you could state the problem as: remove any double quote that is not at the start or end of a line and is not adjacent to a comma (with optional whitespace). That can be done with a PowerShell regex like so:

$t = '"Test T"est", "Test1 "Test1"'
$t -replace '(?<!^|\s*,\s*)"(?!\s*,\s*|$)', ''
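
The same rule can be approximated in Python, with one caveat: the standard `re` module only accepts fixed-width lookbehinds, so the .NET pattern cannot be copied as-is. The sketch below swaps in a different technique. It matches the quotes that should be kept (at the line boundaries or next to a comma) in a capture group, matches any other quote bare, and uses a callback to keep the former and drop the latter. The name `strip_inner_quotes` is made up for this illustration.

```python
import re

# Group 1: quotes to keep (line start, next to a comma, line end).
# The final bare '"' alternative catches every other quote.
_QUOTES = re.compile(r'(^"|,\s*"|"\s*,\s*"|"\s*,|"$)|"')


def strip_inner_quotes(line):
    """Drop double quotes that are neither at a line boundary nor beside a comma."""
    # group(1) is None for a stray quote, so it is replaced with an empty string
    return _QUOTES.sub(lambda m: m.group(1) or "", line)


print(strip_inner_quotes('"Test T"est", "Test1 "Test1"'))
```

Unlike the split-and-rebuild approaches, this keeps the original spacing around the commas.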

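An alternative with sed (GNU sed, since \+ is a GNU extension in basic regular expressions); it handles a single stray quote per quoted field: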

sed 's/\("[^"]\+\)"\([^"]\+"\)/\1\2/g' inputFile

input:

"Test T"est"
"Test T"est", "Test1 "Test1"

output:

"Test Test"
"Test Test", "Test1 Test1"
