
How to remove the comma after a word pattern in Java

Please help me find a regex that removes the comma after a word pattern in Java. Assume I want to delete the comma after each pattern, where the input is <Word$TAG>, <Word$TAG>, <Word$TAG>, <Word$TAG>, <Word$TAG> and I want my output to be <Word$TAG> <Word$TAG> <Word$TAG> <Word$TAG>. If I use .replaceAll() on the comma, it replaces every comma, but the Word part of <Word$TAG> may itself contain a comma (,).

For example, Input.txt is as follows:

mms§NNP_ACRON, site§N_NN, pe§PSP, ,,,,,§RD_PUNC, link§N_NN, ....§RD_PUNC, CID§NNP_ACRON, team§N_NN, :)§E

and Output.txt should be:

mms§NNP_ACRON site§N_NN pe§PSP ,,,,,§RD_PUNC link§N_NN ....§RD_PUNC CID§NNP_ACRON team§N_NN :)§E

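You could search for ", " (comma followed by a space) and replace it with " " (a single space). Strings are immutable in Java, so assign the result:

one = one.replace(", ", " ");

If the input may contain something like "myString, ,,," or several whitespace characters after a comma, you could use replaceAll with a regex such as

one = one.replaceAll(",\\s+", " ");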

(?<=[^,\s]),

Try this. Replace the matches with an empty string. See the demo.

http://regex101.com/r/lZ5mN8/5
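
In Java the lookbehind pattern above could be applied roughly as follows. This is only a sketch: the class and method names are made up for illustration, and the token 1,000§NUM is a hypothetical input that does not appear in the question.

```java
class LookbehindCommaDemo {
    // Java string form of (?<=[^,\s]), : the backslash is doubled inside the literal.
    // Removes every comma whose preceding character is neither a comma nor whitespace.
    static String stripWithLookbehind(String s) {
        return s.replaceAll("(?<=[^,\\s]),", "");
    }

    public static void main(String[] args) {
        String s = "mms§NNP_ACRON, site§N_NN, pe§PSP, ,,,,,§RD_PUNC, link§N_NN, "
                + "....§RD_PUNC, CID§NNP_ACRON, team§N_NN, :)§E";
        System.out.println(stripWithLookbehind(s));

        // Hypothetical token with a comma inside the word: the inner comma is removed as well.
        System.out.println(stripWithLookbehind("1,000§NUM, x§N")); // prints 1000§NUM x§N
    }
}
```

The sample line from the question comes out as expected, because the commas in ,,,,,§RD_PUNC are preceded by a space or by another comma. A comma in the middle of a word would be removed too, so this pattern only fits if such words do not occur.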

Match the data you want, not the data you don't want.

You probably want to match ([^ ]+), and keep only the captured group, leaving the whitespace between tokens in place.

You might even want to narrow it down to ([^ ]+§[^ ]+), so that only Word§TAG tokens match. Usually, stricter is better.
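
A minimal Java sketch of this capture-group idea, assuming the stricter pattern and a replacement of $1 (a reference to the first group). The class and method names are illustrative only.

```java
class CaptureGroupDemo {
    // ([^ ]+§[^ ]+) captures one Word§TAG token; the comma after it is matched
    // but not captured, so replacing the match with $1 drops just that comma.
    static String dropTrailingCommas(String s) {
        return s.replaceAll("([^ ]+§[^ ]+),", "$1");
    }

    public static void main(String[] args) {
        String s = "mms§NNP_ACRON, site§N_NN, pe§PSP, ,,,,,§RD_PUNC, link§N_NN, "
                + "....§RD_PUNC, CID§NNP_ACRON, team§N_NN, :)§E";
        System.out.println(dropTrailingCommas(s));
    }
}
```

The spaces are never part of a match, so the tokens stay separated by the original whitespace. Because [^ ]+ backtracks by one character to leave the final comma for the literal , in the pattern, a token such as ,,,,,§RD_PUNC keeps its leading commas.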

You could use a positive lookahead assertion to match only the commas that are followed by whitespace or by the end of the line.

String s = "mms§NNP_ACRON, site§N_NN, pe§PSP, ,,,,,§RD_PUNC, link§N_NN, ....§RD_PUNC, CID§NNP_ACRON, team§N_NN, :)§E";
System.out.println(s.replaceAll(",(?=\\s|$)",""));

Output:

mms§NNP_ACRON site§N_NN pe§PSP ,,,,,§RD_PUNC link§N_NN ....§RD_PUNC CID§NNP_ACRON team§N_NN :)§E
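
Since the question mentions Input.txt and Output.txt, the same lookahead could be applied line by line to a file. The following is a sketch under assumptions: both files are taken to be UTF-8 and located in the working directory, and the class and method names are invented for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

class TagFileCleaner {
    // Removes only commas that are followed by whitespace or by the end of the line.
    static String stripSeparators(String line) {
        return line.replaceAll(",(?=\\s|$)", "");
    }

    public static void main(String[] args) throws IOException {
        Path in = Paths.get("Input.txt");   // assumed location and encoding
        Path out = Paths.get("Output.txt");
        if (!Files.exists(in)) {
            System.err.println("Input.txt not found in the working directory");
            return;
        }
        // Clean each line independently and write the result back out.
        List<String> cleaned = Files.readAllLines(in, StandardCharsets.UTF_8).stream()
                .map(TagFileCleaner::stripSeparators)
                .collect(Collectors.toList());
        Files.write(out, cleaned, StandardCharsets.UTF_8);
    }
}
```

Processing one line at a time means $ always refers to the end of that line, so no MULTILINE flag is needed.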
