Parsing a string that contains double quotes
I have a relatively simple Java question. I have a string that looks like this:
"Anderson,T",CWS,SS
I need to parse it so that I end up with
Anderson,T
CWS
SS
all as separate strings.
Thanks!
Here's a solution that will capture quoted strings, trim the spaces around each item, and match empty items:
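import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static void main(String[] args) {
    // A quoted field; the inner group holds the text between the quotes.
    String quoted = "\"(.*?(?<!\\\\)(?:\\\\\\\\)*)\"";
    // One field: optional spaces, a quoted or unquoted value, optional spaces, then a comma or end of line.
    Pattern regex = Pattern.compile(
        "(?:^|(?<=,))\\s*(" + quoted + "|[^,]*?)\\s*(?:$|,)");
    String line = "\"Anderson,T\",CWS,\"single quote\\\"\", SS ,,hello,,";
    Matcher m = regex.matcher(line);
    int count = 0;
    while (m.find()) {
        // Group 2 is set only for quoted fields; otherwise fall back to group 1.
        String s = m.group(2) == null ? m.group(1) : m.group(2);
        System.out.println(s);
        count++;
    }
    System.out.printf("(%d matches found)%n", count);
}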
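The loop above only prints each field. To reuse the same pattern from other code, it could be wrapped in a method that returns a list. This is a minimal sketch rather than part of the original answer; the class name CsvRegexSplitter and the method name parseLine are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CsvRegexSplitter {
    // Same two patterns as in the answer, compiled once.
    private static final String QUOTED = "\"(.*?(?<!\\\\)(?:\\\\\\\\)*)\"";
    private static final Pattern FIELD = Pattern.compile(
            "(?:^|(?<=,))\\s*(" + QUOTED + "|[^,]*?)\\s*(?:$|,)");

    /** Returns the fields of one line, with surrounding quotes removed. */
    public static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        Matcher m = FIELD.matcher(line);
        while (m.find()) {
            // Group 2 is non-null only when the field was quoted.
            fields.add(m.group(2) != null ? m.group(2) : m.group(1));
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parseLine("\"Anderson,T\",CWS,SS")); // prints [Anderson,T, CWS, SS]
    }
}
```

Escaped quotes inside a quoted field are kept as written (backslash included); unescaping them would be a separate step.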
I split out the quoted part of the pattern to make it a bit easier to follow. Capturing group 1 is the whole item (including the surrounding quotes when the item is quoted); group 2 is the text inside the quotes and is null for unquoted items.
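To see what the quoted part does by itself, it can be compiled and run in isolation. The sketch below is an illustration only; QuotedPatternDemo and firstQuoted are hypothetical names. Because the sub-pattern is compiled alone here, the text between the quotes is group 1 rather than group 2.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuotedPatternDemo {
    // The quoted sub-pattern from the answer, compiled without the outer group.
    private static final Pattern QUOTED =
            Pattern.compile("\"(.*?(?<!\\\\)(?:\\\\\\\\)*)\"");

    /** Returns the text inside the first quoted run, or null if there is none. */
    public static String firstQuoted(String s) {
        Matcher m = QUOTED.matcher(s);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Plain quoted field: the comma inside the quotes is kept.
        System.out.println(firstQuoted("\"Anderson,T\",CWS,SS"));
        // Escaped quotes (\") do not end the field; the backslashes stay in the result.
        System.out.println(firstQuoted("\"say \\\"hi\\\"\",x"));
    }
}
```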
To break down the overall pattern:
(?:^|(?<=,))  (don't capture)
\\s*
(" + quoted + "|[^,]*?)  (The non-comma match is non-greedy so it doesn't grab any following spaces)
\\s*
(?:$|,)  (don't capture)
To break down the quote pattern:
\"
(
.*?
(?<!\\\\)(?:\\\\\\\\)*  (to avoid matching escaped quotes with or without preceding escaped backslashes)
)
\"
Assuming your string looks like this:
String input = "\"Anderson,T\",CWS,SS";
You can use this solution, which was written for a similar scenario.
import java.util.ArrayList;
import java.util.List;

String input = "\"Anderson,T\",CWS,SS";
List<String> result = new ArrayList<>();
int start = 0; // index where the current word starts
boolean inQuotes = false;
for (int current = 0; current < input.length(); current++) { // iterate through the characters
    if (input.charAt(current) == '\"') { // found a quote
        inQuotes = !inQuotes; // toggle state
    }
    if (current == (input.length() - 1)) { // last character
        result.add(input.substring(start)); // add the last word
    } else if (input.charAt(current) == ',' && !inQuotes) { // comma that is not inside quotes
        result.add(input.substring(start, current)); // add everything between the start index and the current character
        start = current + 1; // update the start index
    }
}
System.out.println(result);
I have modified it a bit to improve readability. This code stores the strings you want in the list result. Note that a quoted item keeps its surrounding double quotes, so the first element still includes them and they have to be stripped separately.
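If the quotes should not appear in the output at all, the same toggle idea can build each item character by character instead of using substring. The following is a sketch under the assumption that quotes are never escaped inside an item (no \" or ""); QuoteAwareSplitter and split are hypothetical names, not part of the answer above.

```java
import java.util.ArrayList;
import java.util.List;

public class QuoteAwareSplitter {
    /** Splits on commas that are outside double quotes and drops the quote characters. */
    public static List<String> split(String input) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;         // toggle state; the quote itself is not copied
            } else if (c == ',' && !inQuotes) {
                fields.add(field.toString()); // comma outside quotes ends the item
                field.setLength(0);
            } else {
                field.append(c);              // ordinary character, or a comma inside quotes
            }
        }
        fields.add(field.toString());         // last item (empty if the line ends with a comma)
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(split("\"Anderson,T\",CWS,SS")); // prints [Anderson,T, CWS, SS]
    }
}
```

Adding the last item after the loop also means a trailing comma produces an empty final item instead of being glued onto the previous one.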