Regex pattern matching doesn't work for a specific string in Java

Question

I was using a regex pattern in Java (given below): [working pattern]

for the string: [working string]

It works fine. But when I tried using the pattern below: [non-working pattern]

for the string: str = [non-working string]

it does not work. Sorry about the image upload. It looks like the character '[]' in a00[] is encoded differently in the browser. Is there any way to read that character differently? The same character has a different representation in Notepad++. I'm using RXTX and inputStream.read(readBuffer) to read the data. Is there any way I can change my encoding handling in Java to overcome this? http://i.imgur.com/sdUjS.jpg

PS: Sorry about the image description. If I type it out, I can't represent that character; when I copy and paste it, it becomes an empty space.

Answer 1

The strange symbol (└) looks like how ASCII 3 is represented in some fonts.
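One way to check that guess is to print the code points of the received string instead of relying on how a font draws them. The following is only a sketch: the class name, the helper `dump`, and the sample string are made up for illustration, since the original string is only available as an image.

```java
class CodePointDump {
    // Renders each character of s as its U+XXXX code point, space separated.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample: control character 3 followed by "9a".
        String sample = "\u00039a";
        System.out.println(dump(sample)); // prints "U+0003 U+0039 U+0061"
    }
}
```

If the odd symbol shows up as U+0003 in such a dump, it is the control character and not a printable box-drawing glyph.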

In regex, \\b (as written in a Java string literal) matches a word boundary, that is, a position between a word character (a letter, digit, or underscore) and a non-word character. It works in the first case because there is a digit ("9") before the matched substring, and an exclamation mark ("!") right after it (which is a non-alphanumeric character).

In the second case you changed the exclamation mark to a letter, so there is no longer a transition from alphanumeric to non-alphanumeric.
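The effect can be reproduced with a small test. This is a sketch with invented inputs and an invented pattern (`abc\\b`), not the pattern from the question, which is only available as an image.

```java
import java.util.regex.Pattern;

class WordBoundaryDemo {
    // True if "abc" occurs in the input immediately followed by a word boundary.
    static boolean endsAtBoundary(String input) {
        return Pattern.compile("abc\\b").matcher(input).find();
    }

    public static void main(String[] args) {
        // 'c' is followed by '!', a non-word character: boundary present.
        System.out.println(endsAtBoundary("9abc!")); // prints "true"
        // 'c' is followed by 'd', another word character: no boundary.
        System.out.println(endsAtBoundary("9abcd")); // prints "false"
    }
}
```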

The solution is to extend the Regex so it also matches the symbol and digit:

Pattern.compile("(\\x03\\d)(a)\\w*(?=\\x03\\d)");

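I used \\x03\\d to match the codes (the control character followed by a digit). The last part, (?= ), is a look-ahead: it checks that what follows matches, but does not consume it. That way the code that ends one match is still available to begin the next, so you can find multiple matches in a row.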
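To see why the look-ahead matters, the sketch below runs the pattern above next to a variant that consumes the trailing code. The sample string, the class name, and the `CONSUMING` variant are assumptions made for illustration; the real input is not shown in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadDemo {
    // Hypothetical input: fields separated by control character 3 plus a digit.
    static final String SAMPLE = "\u00031abc\u00032adef\u00033xyz\u00034";

    // The pattern from the answer: the trailing code is only peeked at.
    static final String WITH_LOOKAHEAD = "(\\x03\\d)(a)\\w*(?=\\x03\\d)";
    // Variant for comparison: the trailing code is consumed by the match.
    static final String CONSUMING = "(\\x03\\d)(a)\\w*(\\x03\\d)";

    // Collects every match of regex in input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Both fields starting with 'a' are found.
        System.out.println(findAll(WITH_LOOKAHEAD, SAMPLE).size()); // prints "2"
        // The first match swallows the code that the second field needs.
        System.out.println(findAll(CONSUMING, SAMPLE).size()); // prints "1"
    }
}
```

Note that each match still begins with the leading code; `m.group(1)` holds that code and `m.group(2)` holds the letter "a".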

A simpler alternative would be to just split the string on "└" and examine the pieces.

s.split("\u0003")
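A slightly fuller sketch of the split approach follows. The sample input and the filter (keep pieces that are a digit followed by "a") are assumptions chosen to mirror the regex above, and unlike the regex this version does not require a code after the piece.

```java
import java.util.ArrayList;
import java.util.List;

class SplitDemo {
    // Splits on control character 3 and keeps pieces shaped like digit + 'a' + rest.
    static List<String> piecesStartingWithA(String s) {
        List<String> out = new ArrayList<>();
        for (String piece : s.split("\u0003")) {
            // Skip the empty leading piece and anything too short to inspect.
            if (piece.length() >= 2
                    && Character.isDigit(piece.charAt(0))
                    && piece.charAt(1) == 'a') {
                out.add(piece);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical input with four fields; two of them start with 'a'.
        String sample = "\u00031abc\u00032adef\u00033xyz\u00034";
        System.out.println(piecesStartingWithA(sample)); // prints "[1abc, 2adef]"
    }
}
```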

solution1
2 ACCEPTED 2011-12-28 12:14:08
