
How to find the delimiter encountered in a string in Java

I have written a simple program in Java that manipulates a given string.

The input string contains delimiters, which are non-alphabetic characters. I used StringTokenizer to read and manipulate the individual words in the string.

Now I need to reconstruct the manipulated string with the same delimiters in the same positions. I would appreciate it if anyone could suggest how to identify the delimiter that was encountered.

In other words, this is what the input looks like:

Text1 Delimiter1 Text2 Delimiter2 Text3 Delimiter3 Text4 Delimiter4

This is what my code does:

NewText1 NewText2 NewText3 NewText4

I made use of string tokenizer to identify the next token in this manner:

StringTokenizer st = new StringTokenizer(str, ", 0123456789(*&^%$#@!-_)");

But now I would like to identify the delimiter that was encountered so that I can build my new string.

This is what I actually want:

NewText1 Delimiter1 NewText2 Delimiter2 NewText3 Delimiter3 NewText4 Delimiter4

You can proceed according to this:

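String dels = "-, 0123456789(*&^%$#@!_)";
String specs = "[" + dels + "]+";   // one or more delimiter characters
String letts = "[^" + dels + "]+";  // one or more non-delimiter characters
String text = "one, two - three! four";
// Splitting on delimiter runs leaves the words: ["one", "two", "three", "four"]
String[] words = text.split( specs );
// Splitting on word runs leaves the delimiter runs: ["", ", ", " - ", "! "]
String[] delim = text.split( letts );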

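Note that in dels the hyphen must come first, so that the character class treats it as a literal character rather than as a range operator. If you ever add [, ], ^ or a backslash, more care is needed: check the Javadoc for java.util.regex.Pattern.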

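Rebuilding the string is then straightforward: interleave the (modified) elements of words with the elements of delim. Because this text starts with a word, delim[0] is an empty string; if it started with a delimiter, words[0] would be empty instead.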
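A minimal sketch of that interleaving step, in case it helps. The class name, the rebuild method and the transform method are hypothetical; transform merely upper-cases each word as a stand-in for whatever manipulation your program performs. The code relies on String.split including an empty leading element when the text begins with a match, and dropping trailing empty elements.

```java
import java.util.Locale;

class DelimiterInterleave {
    static final String DELS = "-, 0123456789(*&^%$#@!_)";

    // Hypothetical placeholder for the real word manipulation.
    static String transform(String word) {
        return word.toUpperCase(Locale.ROOT);
    }

    static String rebuild(String text) {
        String[] words = text.split("[" + DELS + "]+");
        String[] delim = text.split("[^" + DELS + "]+");

        // If the text starts with a delimiter, words[0] is "" and the
        // pieces alternate word, delimiter, word, ...; otherwise
        // delim[0] is "" (or delim is empty) and they alternate
        // delimiter, word, delimiter, ...
        boolean leadingDelimiter = words.length > 0 && words[0].isEmpty();

        StringBuilder out = new StringBuilder();
        int n = Math.max(words.length, delim.length);
        for (int i = 0; i < n; i++) {
            String w = i < words.length ? transform(words[i]) : "";
            String d = i < delim.length ? delim[i] : "";
            out.append(leadingDelimiter ? w + d : d + w);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(rebuild("one, two - three! four")); // ONE, TWO - THREE! FOUR
        System.out.println(rebuild("--a, b!"));                // --A, B!
    }
}
```

The two arrays can differ in length by one (for example when the text ends with a delimiter), which is why the loop runs to the longer of the two and substitutes an empty string for the missing piece.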

The disadvantage of StringTokenizer with its third argument (returnDelims) set to true is that it returns each delimiter character as a separate token of length 1, so a run such as " - " arrives as three tokens.
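If you would rather stay with StringTokenizer, that behaviour is still enough to rebuild the string, since every delimiter character comes back in order. The following is only a sketch: the class name, rebuild and transform are hypothetical, the delimiter list is the one from the question, and transform again just upper-cases the word.

```java
import java.util.Locale;
import java.util.StringTokenizer;

class TokenizerRebuild {
    static final String DELS = ", 0123456789(*&^%$#@!-_)";

    // Hypothetical placeholder for the real word manipulation.
    static String transform(String word) {
        return word.toUpperCase(Locale.ROOT);
    }

    static String rebuild(String text) {
        // Third argument true: delimiters are returned as tokens too,
        // one character per token.
        StringTokenizer st = new StringTokenizer(text, DELS, true);
        StringBuilder out = new StringBuilder();
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            // Word tokens never contain a delimiter character, so a
            // one-character token found in DELS must be a delimiter.
            boolean isDelimiter = token.length() == 1
                    && DELS.indexOf(token.charAt(0)) >= 0;
            out.append(isDelimiter ? token : transform(token));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(rebuild("one, two - three! four")); // ONE, TWO - THREE! FOUR
    }
}
```

Here the delimiters are copied through unchanged and only the word tokens are transformed, so no second array is needed.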
