
Replace commas that are not within double quotes ("") in CSV files using PowerShell?

I have a huge CSV file (around 100 GB). I need to replace the commas (,) in the file with semicolons (;), except for the commas that sit inside double quotes ("").

I tried several methods, but none of them worked. The modification also has to be done on Windows, so sed and awk are not an option.

Example:
Input : "A,B,C",D,E,"FG","H,J",K
Output : "A,B,C";D;E;"FG";"H,J";K

Once this is done, I need to remove the double quotes (").

I am able to remove the " from the file, but the semicolon replacement fails every time.

Please let me know if this is achievable with PowerShell.

This should take care of both the delimiter replacement and removing the double quotes:

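 # Read in batches of 1000 lines so the whole file is never held in memory.
 # The first -replace turns commas outside double quotes into semicolons; the second strips the quotes.
 # Add-Content appends, so NewFile.csv should not already exist.
 Get-Content ./File.csv -ReadCount 1000 |
 foreach { $_ -replace ',(?=(?:[^"]|"[^"]*")*$)',';' -replace '"' } |
 Add-Content ./NewFile.csv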

and handle a large file efficiently without needing third-party utilities.
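To check the regex before running it over a 100 GB file, it may help to try it on the sample line from the question. The following is only a minimal sketch: the function name `Convert-CsvLine` is a hypothetical helper introduced for illustration, and it assumes that quoted fields never span several lines and contain no escaped quotes.

```powershell
# Hypothetical helper (not part of the original answer); the regex is the same one used above.
function Convert-CsvLine {
    param([string]$Line)

    # A comma counts as a delimiter only when everything after it, up to the
    # end of the line, consists of unquoted characters or complete "..." pairs.
    $delimiter = ',(?=(?:[^"]|"[^"]*")*$)'

    # Step 1: swap delimiter commas for semicolons.
    # Step 2: strip every remaining double quote.
    ($Line -replace $delimiter, ';') -replace '"', ''
}

$sample = '"A,B,C",D,E,"FG","H,J",K'
Convert-CsvLine -Line $sample
```

For the sample line this should print `A,B,C;D;E;FG;H,J;K`: the commas inside the quoted fields survive, and only the field separators become semicolons.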
