I know how to tokenize a String, but the problem is that I want to tokenize it as shown below.
String st = "'test1, test2','test3, test4'";
What I've tried is as below:
st.split(",");
This is giving me output as:
'test1
test2'
'test3
test4'
But I want output as:
'test1, test2'
'test3, test4'
How do I do this?
Since the single quotes are not mandatory, split will not work, because Java's regex engine does not allow lookbehind expressions of unbounded length. Here is a simple solution that uses regex to match the content, not the delimiters:
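import java.util.regex.Matcher;
import java.util.regex.Pattern;

String st = "'test1, test2','test3, test4',test5,'test6, test7',test8";
// Group 1 is either a quoted field or a run of non-comma characters; a trailing comma is consumed if present.
Pattern p = Pattern.compile("('[^']*'|[^,]*)(?:,?)");
Matcher m = p.matcher(st);
boolean expectMore = true; // stops the loop before the empty match the pattern would find at end of input
while (expectMore && m.find()) {
    System.out.println(m.group(1));
    expectMore = m.end() > m.end(1); // true only if a comma followed this field
}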
You can add syntax for escaping single quotes by altering the "content" portion of the quoted substring (currently, it's [^']*, meaning "anything except a single quote, repeated zero or more times").
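As a rough illustration of that suggestion, here is a sketch that assumes a doubled single quote ('') is the escape for a literal quote inside a quoted field. That convention, the class name QuotedSplit and the method name split are assumptions made for this example, not part of the original answer. It also assumes well-formed input and keeps the surrounding quotes in each field, as the question asked.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedSplit {
    // Group 1: a quoted field in which '' counts as content, or a run of non-commas.
    // Group 2: the comma that follows the field, if there is one.
    private static final Pattern FIELD =
            Pattern.compile("('(?:[^']|'')*'|[^,]*)(,?)");

    static List<String> split(String input) {
        List<String> fields = new ArrayList<>();
        Matcher m = FIELD.matcher(input);
        boolean expectMore = true; // true at the start and after each consumed comma
        while (expectMore && m.find()) {
            fields.add(m.group(1));
            expectMore = !m.group(2).isEmpty();
        }
        return fields;
    }

    public static void main(String[] args) {
        for (String field : split("'it''s, ok','test3, test4',test5")) {
            System.out.println(field);
        }
    }
}
```

The only change to the pattern is the content portion: [^']* becomes (?:[^']|'')*, so a pair of quotes is consumed as content instead of ending the field.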
The easiest and most reliable solution would be to use a CSV parser, such as opencsv (used in the sample below) or Apache Commons CSV.
It handles escaping according to CSV rules, so even '' can be used within a value without breaking it.
Sample code would look like this:
import java.io.StringReader;
import au.com.bytecode.opencsv.CSVReader; // package name used by opencsv 2.x

// Separator is a comma; the quote character is a single quote.
CSVReader reader = new CSVReader(new StringReader("'test1, test2','test3, test4'"), ',', '\'');
String[] read = reader.readNext();
System.out.println("0: " + read[0]);
System.out.println("1: " + read[1]);
reader.close();
This would print:
0: test1, test2
1: test3, test4
If you use Maven, you can just add the dependency:
<dependency>
<groupId>net.sf.opencsv</groupId>
<artifactId>opencsv</artifactId>
<version>2.0</version>
</dependency>
And start using it.
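If adding a dependency is not an option, the quoting rule described above can be illustrated with a small hand-written scanner. This is only a sketch under stated assumptions: it is not how opencsv is implemented, the names SimpleQuotedParser and parse are made up for this example, and it handles just commas, single quotes and the doubled-quote escape.

```java
import java.util.ArrayList;
import java.util.List;

class SimpleQuotedParser {
    // Splits on commas outside single quotes; inside quotes, '' yields one literal quote.
    static List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '\'') {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '\'') {
                    current.append('\''); // doubled quote inside a quoted field
                    i++;                  // skip the second quote of the pair
                } else {
                    inQuotes = !inQuotes; // opening or closing quote, not part of the value
                }
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString()); // field separator outside quotes
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString()); // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        List<String> fields = parse("'test1, test2','test3, test4'");
        for (int i = 0; i < fields.size(); i++) {
            System.out.println(i + ": " + fields.get(i));
        }
    }
}
```

For real data a maintained CSV library remains the safer choice, since it also covers cases this sketch ignores, such as line breaks inside quoted fields.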