简体   繁体   中英

Replace commas that are not within double inverted commas(“”) from csv files using powershell?

I have a huge csv file(around 100GB). My problem is that I need to replace commas(,) in the file with semi-colon(;) except for the ones within double-inverted commas("").

I tried several methods but none seem to be working. Also this modification needs to be on Windows, hence sed and awk are out of option.

Example:
Input : "A,B,C",D,E,"FG","H,J",K
Output : "A,B,C";D;R;"FG";H,J;K

Once this is done, I need to remove the ".

I am able to remove the " from the file, but semi-colon replacement is failing everytime.

Please let me know if this is achievable through Powershell.

This should take care of both the delimiter replacement and removing the double quotes:

 Get-Content ./File.csv -ReadCount 1000 |
 foreach { $_ -replace ',(?=(?:[^"]|"[^"]*")*$)',';' -replace '"' } |
 Add-Content ./NewFile.csv 

and handle a large file efficiently without needing third party utilities.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM