
Match string unless starts and ends with character

I'm trying to use a regex in JavaScript to decide if a message gets deleted. I want to delete the message if it contains "string" anywhere, unless it's surrounded by colons.

  • string - gets deleted
  • blah string - gets deleted
  • :string blah - gets deleted
  • :string: string - gets deleted
  • thing :string: - doesn't get deleted

I'm using JavaScript, and so far I'm using message.match(/string/i) to see if the message gets deleted. I've tried a negative lookahead, but I probably used it wrong.

EDIT: Sorry for not including this earlier, but :blahstring: and :stringblah: and :blahstringblah: should not be deleted either.

If lookbehind is supported you may use

/(?<!:(?=string:))string/i

See the regex demo

Details

  • (?<!:(?=string:)) - a negative lookbehind that fails the match if, immediately to the left of the current location, there is a : that is immediately followed by string:
  • string - the literal text string

 var strs = [
   'string - gets deleted',
   'blah string - gets deleted',
   ':string blah - gets deleted',
   ':string: string - gets deleted',
   'thing :string: - doesnt get deleted'
 ];
 var rx = /(?<!:(?=string:))string/i;
 for (var s of strs) {
   console.log(s, "=>", rx.test(s));
 }

Output:

string - gets deleted => true
blah string - gets deleted => true
:string blah - gets deleted => true
:string: string - gets deleted => true
thing :string: - doesnt get deleted => false
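
Since lookbehind is not available in every JavaScript engine, one option is to probe for it at runtime and fall back otherwise. This is only a minimal sketch: the names supportsLookbehind and shouldDelete are hypothetical, and the fallback (stripping each :string that is followed by a colon before testing) is an assumption about how you might want to degrade.

```javascript
// Hypothetical helper: detect lookbehind support at runtime.
// The pattern is built from a string so older engines do not hit a parse-time SyntaxError.
function supportsLookbehind() {
  try {
    new RegExp("(?<!a)b"); // throws SyntaxError where lookbehind is unsupported
    return true;
  } catch (e) {
    return false;
  }
}

// Only construct the lookbehind regex when the engine can parse it.
const lookbehindRx = supportsLookbehind()
  ? new RegExp("(?<!:(?=string:))string", "i")
  : null;

// Hypothetical wrapper: true means the message should be deleted.
function shouldDelete(msg) {
  if (lookbehindRx) {
    return lookbehindRx.test(msg);
  }
  // Fallback (assumption): collapse every ":string" that is followed by ":",
  // then look for any remaining "string".
  return /string/i.test(msg.replace(/:string(?=:)/gi, ":"));
}

console.log(shouldDelete("blah string"));    // true
console.log(shouldDelete("thing :string:")); // false
```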

A version without lookbehind

It is based on a regex that matches string either on its own or, when string has a colon on both sides, together with its leading colon. If at least one of the matches does not start with a colon, the entry must be deleted.

 var strs = [
   'string - gets deleted',
   'blah string - gets deleted',
   ':string blah - gets deleted',
   ':string: string - gets deleted',
   'thing :string: - doesnt get deleted'
 ];
 var rx = /(?::(?=string:))?string/gi;
 for (var s of strs) {
   // match() returns null when nothing matches, so fall back to an empty array
   var matches = s.match(rx) || [];
   // Delete if any match lacks the leading colon
   console.log(s, "=>", matches.some(function (x) { return !/^:/.test(x); }));
 }

There are some boundary cases where the colon appears only at one side of "string". Therefore I believe it is easier to remove all occurrences of ":string:" and only then look for a match of "string":

 function deleteIt(msg) {
   // Collapse each ":...string..." that is followed by a colon into a single colon,
   // then check whether any "string" is left.
   return /string/i.test(msg.replace(/:\w*string\w*(?=:)/ig, ":"));
 }
 console.log(deleteIt("this is :string ")); // true
 console.log(deleteIt("this is string: ")); // true
 console.log(deleteIt("string:string: ")); // true
 console.log(deleteIt("this is :string: ")); // false
 console.log(deleteIt("this is :blastring:stringbla:string: ")); // false

The last test in the above snippet is a special case. The colon is "shared" by a preceding and following "string". Depending on whether you want such "string" occurrences to be ignored or not, you may need to replace the look-ahead with a normal capture of the second colon.
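
As a rough sketch of that alternative, the closing colon can be consumed by the match instead of being left in a look-ahead, so that it cannot also serve as the opening colon of the next occurrence. The name deleteItStrict is hypothetical, and replacing with a space (to avoid gluing the neighbouring text together) is an assumption rather than part of the original answer.

```javascript
// Hypothetical variant: a colon may belong to only one :...: pair.
function deleteItStrict(msg) {
  // The closing colon is part of the match, so the next match cannot reuse it.
  // Replacing with a space avoids accidentally joining "str" + "ing".
  return /string/i.test(msg.replace(/:\w*string\w*:/gi, " "));
}

console.log(deleteItStrict("this is :string: "));                     // false
console.log(deleteItStrict("this is :blastring:stringbla:string: ")); // true
```

With this variant, the "stringbla" in the last message is no longer treated as enclosed, because the colon before it was already used to close ":blastring:".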

Addendum

In your edit to the question, you say that ":blastring:" or ":stringbla:" should also not trigger a deletion.

So I added \w* twice in the regex above to align with that extra requirement.

If punctuation or other non-alphabetical characters could also be allowed between the colon and "string", like ":,-°string^0&:", just not white-space, then use \S* instead of \w* .
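
A minimal sketch of that \S* variant follows; the name deleteItLoose is hypothetical. One thing to keep in mind is that \S also matches colons, so a single match may span several colon-separated pieces as long as no white-space intervenes.

```javascript
// Hypothetical variant: allow any non-white-space characters around "string".
function deleteItLoose(msg) {
  return /string/i.test(msg.replace(/:\S*string\S*(?=:)/gi, ":"));
}

console.log(deleteItLoose("see :,-°string^0&: here"));  // false
console.log(deleteItLoose("see :,-°string ^0&: here")); // true (white-space breaks the pair)
```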

You can use a combination of positive lookbehind and negative lookahead:

(?<=^|[^:]|(:))string(?!\1)

Demo: https://regex101.com/r/Ca1TTW/1
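
One caveat if this pattern is run in JavaScript: there, a backreference to a group that did not participate in the match matches the empty string, so (?!\1) fails whenever "string" is not preceded by a colon. A possible rewrite without the backreference is sketched below. The name rx is illustrative, and the pattern encodes the assumption that "string" should match unless it has a colon directly on both sides.

```javascript
// Matches "string" unless it has ":" directly before AND directly after it.
const rx = /(?<!:)string|string(?!:)/i;

const samples = [
  "string",
  "blah string",
  ":string blah",
  ":string: string",
  "thing :string:",
];
for (const s of samples) {
  console.log(s, "=>", rx.test(s));
}

// For comparison, the backreference version under JavaScript semantics:
const original = /(?<=^|[^:]|(:))string(?!\1)/i;
console.log(original.test("blah string")); // false, because \1 is unset and matches ""
```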

This is what worked for me in my tests: ^.*(?<!\:)string(?!\:).*$

  • ^ Match the start of the string
  • .* Match any character any number of times
  • (?<!\:) Match only if the : prefix is missing
  • string Match the word string
  • (?!\:) Match only if the : suffix is missing
  • .* Match any character any number of times
  • $ Match the end of the string
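
To see how this pattern behaves on the question's examples, here is a small check. As written, the pattern requires both colons to be absent around the same occurrence, so a case such as ":string blah", which the question wants deleted, is not matched. The values in the comments are what this particular pattern yields, not necessarily the desired behavior, and the variable names are illustrative.

```javascript
const rx = /^.*(?<!\:)string(?!\:).*$/i;

const cases = [
  "string",          // true
  "blah string",     // true
  ":string blah",    // false (colon before "string" blocks the match)
  ":string: string", // true  (the second "string" has no colons)
  "thing :string:",  // false
];

const results = cases.map(function (c) { return rx.test(c); });
cases.forEach(function (c, i) {
  console.log(c, "=>", results[i]);
});
```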

Try

 let s = [
   "string",
   "blah string",
   ":string blah",
   ":string: string",
   "thing :string:",
   ":blahstring:",
   ":stringblah:",
   ":blahstringblah:",
 ];
 let d = s.filter(x =>
   !x.match(/:.*string.*:/i) ||
   x.match(/:.*string.*:.*string.*/i) ||
   x.match(/.*string.*:.*string.*:/i)
 );
 console.log('Delete :', d);
 console.log('Save :', s.filter(x => !d.includes(x)));

We add to the "delete list" d every element for which

  • !x.match(/:.*string.*:/i) - it does not contain string between two colons, or
  • x.match(/:.*string.*:.*string.*/i) - it contains string between colons, followed by another string after the closing colon, or
  • x.match(/.*string.*:.*string.*:/i) - same as above, but with the extra string before the opening colon


 