JS & Regex: how to replace punctuation pattern properly?

Given an input text in which spaces have been replaced by n underscores (_):

Hello_world_?. Hello_other_sentenc3___. World___________.

I want to keep the _ between words, but attach each punctuation mark directly to the last word of its sentence, with no space or _ between the word and the punctuation. I want to use the punctuation as the pivot of my regex.

I wrote the following JS-Regex:

str = str.replace(/(_| )*([:punct:])*( |_)/g, "$2$3"); 

This fails, since it returns:

Hello_world_?. Hello_other_sentenc3_. World_._ 

Why doesn't it work? How can I delete all "_" between the last word and the punctuation? http://jsfiddle.net/9c4z5/

Try the following regex, which makes use of a positive lookahead:

str = str.replace(/_+(?=\.)/g, "");

It replaces every run of underscores that is immediately followed by a period with the empty string, thus removing them. Because the period sits inside the lookahead, it is checked but not consumed, so it stays in the output.

If you want to match punctuation characters other than just the period, replace the \. part with an appropriate character class.
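As a minimal sketch of that suggestion, the lookahead can be given a character class instead of a single period. The function name stripBeforePunct and the set [.,;:!?] are assumptions chosen for illustration; list whatever punctuation you actually care about. The class [_ ] also removes stray spaces before the punctuation, which goes slightly beyond the regex above.

```javascript
// Remove every run of "_" or spaces that sits directly before a punctuation mark.
// The punctuation set [.,;:!?] is an assumption; adjust it to your input.
function stripBeforePunct(str) {
  // (?=...) is a positive lookahead: the punctuation is checked but not consumed,
  // so it is kept in the result.
  return str.replace(/[_ ]+(?=[.,;:!?])/g, "");
}

const input = "Hello_world_?. Hello_other_sentenc3___. World___________.";
console.log(stripBeforePunct(input));
// "Hello_world?. Hello_other_sentenc3. World."
```

The underscores between words are untouched because they are followed by a letter, not by a character from the class.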

JavaScript doesn't have :punct: in its regex implementation. I believe you'd have to list out the punctuation characters you care about, perhaps something like this:

str = str.replace(/(_| )+([.,?])/g, "$2");

That is, replace any group of _ or space characters that is immediately followed by punctuation with just the punctuation.

Demo: http://jsfiddle.net/9c4z5/2/
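To illustrate why the original pattern misbehaves, here is a small sketch. In a JavaScript regex, [:punct:] is not a POSIX class; it is an ordinary character class containing the literal characters ":", "p", "u", "n", "c" and "t". The variable names posixLike and fixed are made up for this example.

```javascript
// In JavaScript, [:punct:] is just a set of the literal characters
// ':', 'p', 'u', 'n', 'c' and 't'. It does not mean "any punctuation".
const posixLike = /[:punct:]/;
console.log(posixLike.test("."));  // false
console.log(posixLike.test("p"));  // true

// Listing the punctuation explicitly, as in the answer above.
const fixed = (str) => str.replace(/(_| )+([.,?])/g, "$2");

console.log(fixed("Hello_world_?. Hello_other_sentenc3___. World___________."));
// "Hello_world?. Hello_other_sentenc3. World."
```

The space after each period survives because it is followed by a letter rather than by one of the listed punctuation characters.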
