
How to split a string without losing any word?

I am using Eclipse for Java and I want to split an input line without losing any characters.

For example, the input line is:

IPOD6 1 USD6IPHONE6 16G,64G,128G USD9,USD99,USD999MACAIR 2013-2014 USD123MACPRO 2013-2014,2014-2015 USD899,USD999

and the desired output is:

IPOD6 1 USD6
IPHONE6 16G,64G,128G USD9,USD99,USD999
MACAIR 2013-2014 USD123
MACPRO 2013-2014,2014-2015 USD899,USD999

I was using split("(?<=\\bUSD\\d{1,99}+)") but it doesn't work.

Without making it too complicated, use this pattern:

(?=IPOD|IPHONE|MAC)

and replace each match with a newline. After that it is easy to capture the lines or split them into an array.
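A minimal Java sketch of this idea, splitting directly on the lookahead instead of replacing first. The class name LookaheadSplit and the method splitRecords are made up for illustration, and the added (?!^) guard is an assumption of mine: it skips the zero-width match at index 0 so that older Java versions do not return an empty first element. The approach only works when every record starts with one of the listed product prefixes.

```java
public class LookaheadSplit {
    // Splits before each known product prefix. The lookahead is zero-width,
    // so no characters are consumed; (?!^) ignores the match at index 0.
    static String[] splitRecords(String line) {
        return line.split("(?!^)(?=IPOD|IPHONE|MAC)");
    }

    public static void main(String[] args) {
        String line = "IPOD6 1 USD6IPHONE6 16G,64G,128G USD9,USD99,USD999"
                + "MACAIR 2013-2014 USD123MACPRO 2013-2014,2014-2015 USD899,USD999";
        for (String record : splitRecords(line)) {
            System.out.println(record);
        }
    }
}
```

The trade-off is that the prefix list has to be extended whenever a new product family appears in the input.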


Or maybe this pattern:

((USD\d+,?)+)

and replace with $1\n
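The replace-then-split variant could look like the following in Java. This is a sketch only; PriceListSplit and splitRecords are hypothetical names. It relies on each record ending with a comma-separated run of USD prices, as in the sample input.

```java
public class PriceListSplit {
    // Appends a newline after each run of comma-separated USD prices,
    // then splits on that newline. String.split drops the trailing
    // empty string left by the final newline.
    static String[] splitRecords(String line) {
        String withBreaks = line.replaceAll("((USD\\d+,?)+)", "$1\n");
        return withBreaks.split("\n");
    }

    public static void main(String[] args) {
        String line = "IPOD6 1 USD6IPHONE6 16G,64G,128G USD9,USD99,USD999"
                + "MACAIR 2013-2014 USD123MACPRO 2013-2014,2014-2015 USD899,USD999";
        for (String record : splitRecords(line)) {
            System.out.println(record);
        }
    }
}
```

Unlike the prefix-based pattern, this one does not need to know the product names, but it would break if the input itself already contained newlines.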

You just need to add a non-word boundary \B inside the positive lookbehind. \B matches between two non-word characters or between two word characters. It won't split at the boundary between USD9 and the comma in the substring USD9, because a word boundary exists there: 9 is a word character and , is a non-word character. It does split at the boundary between USD6 and IPHONE6 because a non-word boundary \B exists there: 6 is a word character and I is also a word character.

String s = "IPOD6 1 USD6IPHONE6 16G,64G,128G USD9,USD99,USD999MACAIR 2013-2014 USD123MACPRO 2013-2014,2014-2015 USD899,USD999";
// Split after a USD price only where it is followed by another word character (\B).
String[] parts = s.split("(?<=\\bUSD\\d{1,99}+\\B)");
for (String part : parts) {
    System.out.println(part);
}

Output:

IPOD6 1 USD6
IPHONE6 16G,64G,128G USD9,USD99,USD999
MACAIR 2013-2014 USD123
MACPRO 2013-2014,2014-2015 USD899,USD999
