
Parsing CSV with a RegEx in Java - escape double quote within cell

I am looking for a Java regex that will escape a double quote within an Excel cell.

I have followed this example, but I need one more change to the regular expression so that it also handles a double quote inside one of the cells.

Parsing CSV input with a RegEx in Java

private final Pattern pattern = Pattern.compile("\"([^\"]*)\"|(?<=,|^)([^,]*)(?=,|$)");

Example data:

"A,B" , "2" size" , "text1,text2, text3"

The regex above fails at 2" .
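To see the failure concretely, here is a minimal sketch that runs the pattern from above over the example record. The class name OriginalPatternDemo and the helper parse are made up for illustration, and the record is assumed to be written without spaces around the separating commas. Under that assumption the second cell comes back as just 2, and the rest of the line is split into the wrong number of cells.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only the pattern itself comes from the question.
class OriginalPatternDemo {
    private static final Pattern PATTERN =
            Pattern.compile("\"([^\"]*)\"|(?<=,|^)([^,]*)(?=,|$)");

    // Collects group 1 (quoted cell) or group 2 (unquoted cell) of every match.
    static List<String> parse(String line) {
        List<String> cells = new ArrayList<>();
        Matcher m = PATTERN.matcher(line);
        while (m.find()) {
            cells.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return cells;
    }

    public static void main(String[] args) {
        // Assumption: no spaces around the commas that separate the cells.
        String line = "\"A,B\",\"2\" size\",\"text1,text2, text3\"";
        for (String cell : parse(line)) {
            System.out.println("[" + cell + "]");
        }
    }
}
```

The quoted alternative stops at the first double quote it sees, so it cannot tell a quote inside the cell from the closing quote.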

I want the output to be as below. It doesn't matter whether the outer double quotes are there or not.

"A,B"
"2" size"
"text1,text2, text3"

While I agree that using a regex to parse CSV is not really the best way, a slightly better pattern is:

Pattern pattern = Pattern.compile("^\"([^\"]*)\",|,\"([^\"]*)\",|,\"([^\"]*)\"$|(?<=,|^)([^,]*)(?=,|$)");

This will terminate a cell value only after a quote and a comma, or start it only after a comma and a quote.
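The same idea can also be sketched with a lazy match plus lookarounds. This is not the pattern from this answer but a variant of it, and the names LenientCsvCells, CELL and parse are hypothetical. It assumes there are no spaces around the separating commas, and that a quote followed by a comma never occurs inside a cell (otherwise the input is ambiguous and no pattern can decide).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, shown only to illustrate the idea.
class LenientCsvCells {
    // Group 1: quoted cell. It starts at line start or after a comma, and its
    //          closing quote must be followed by a comma or the end of the line.
    // Group 2: unquoted cell, delimited by commas or the line boundaries.
    private static final Pattern CELL = Pattern.compile(
            "(?<=^|,)\"(.*?)\"(?=,|$)|(?<=^|,)([^,]*)(?=,|$)");

    static List<String> parse(String line) {
        List<String> cells = new ArrayList<>();
        Matcher m = CELL.matcher(line);
        while (m.find()) {
            cells.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return cells;
    }

    public static void main(String[] args) {
        // Assumption: no spaces around the commas that separate the cells.
        String line = "\"A,B\",\"2\" size\",\"text1,text2, text3\"";
        for (String cell : parse(line)) {
            System.out.println(cell);
        }
    }
}
```

Because the lazy .*? keeps extending until it finds a quote that is directly followed by a comma or the end of the line, the quote in 2" size stays part of the cell. The outer quotes are dropped, which the question says is acceptable.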

Well, as FJ commented, the input data is ambiguous. But for your example input, you could try:

  • call string.split("\",\"") to get a String[]. After this, you have an array with 3 elements:
 [ "A,B, 2" size, text1,text2, text3" ]
  • remove the first character (the double quote) of the first element of the array
  • remove the last character (the double quote) of the last element of the array
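The steps above can be sketched roughly as follows. The class name SplitCells and the method splitCells are made up for illustration, and the sketch assumes every cell is quoted and that the cells are separated by exactly "," (quote, comma, quote) with no spaces in between.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, shown only to illustrate the steps.
class SplitCells {
    static List<String> splitCells(String line) {
        // Step 1: split on the literal three-character separator ","
        String[] parts = line.split("\",\"");

        // Step 2: drop the leading quote of the first element, if present.
        if (parts[0].startsWith("\"")) {
            parts[0] = parts[0].substring(1);
        }

        // Step 3: drop the trailing quote of the last element, if present.
        int last = parts.length - 1;
        if (parts[last].endsWith("\"")) {
            parts[last] = parts[last].substring(0, parts[last].length() - 1);
        }
        return Arrays.asList(parts);
    }

    public static void main(String[] args) {
        // Assumption: no spaces around the commas that separate the cells.
        String line = "\"A,B\",\"2\" size\",\"text1,text2, text3\"";
        for (String cell : splitCells(line)) {
            System.out.println(cell);
        }
    }
}
```

This only works when every cell is quoted. A record that mixes quoted and unquoted cells would need a different approach.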
