
Ruby regex: keep the first instance of a pattern and substitute/remove the rest

I have a string that contains a certain word repeatedly. How can I keep the first occurrence and then, for every later occurrence, either substitute it or remove it depending on its case?

rocket = "Meowth, that's right!!! Prepare for trouble meowth, and make it double. MEOWTH ftw!!!"

I want to keep the first instance of "meowth" (case-insensitive). For the remaining instances: if one is spelled in all caps, it should be replaced with the string "team rocket"; otherwise, it should be removed.

rocket.gsub(/meowth/i, 'team rocket') 

The code above replaces every "meowth" instance (case-insensitively). How can I keep the first instance and substitute or remove the rest?

Desired output:

rocket = "Meowth, that's right!!! Prepare for trouble, and make it double. team rocket ftw!!!"

If the first occurrence is at the start of the string, you may use a negative lookahead (?!\A) or lookbehind (?<!\A) at the start of the pattern to exclude a match at the start of the string:

rocket = "Meowth, that's right!!! Prepare for trouble meowth, and make it double. MEOWTH ftw!!!"
rocket.gsub(/(?!\A)\s*(meowth)/i) { $1.upcase == $1 ? ' team rocket' : '' }
# => Meowth, that's right!!! Prepare for trouble, and make it double. team rocket ftw!!!

See the Ruby demo
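
To see why this pattern only protects an occurrence that opens the string, here is a small sketch. The helper name tidy and both sample strings are made up for illustration; the regex is the one shown above.

```ruby
# Hypothetical helper wrapping the pattern from the answer.
def tidy(text)
  text.gsub(/(?!\A)\s*(meowth)/i) do
    word = Regexp.last_match(1)
    # All caps: substitute; any other spelling: remove.
    word == word.upcase ? ' team rocket' : ''
  end
end

at_start   = tidy("Meowth says hi to meowth and MEOWTH")
mid_string = tidy("Oh, Meowth says hi")

puts at_start    # the leading "Meowth" is kept, later ones are handled
puts mid_string  # the only occurrence is removed, since it is not at position 0
```

The second call shows the limitation: the lookahead guards position 0 only, so a first occurrence found later in the string is treated like any other match.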

If the first instance of the word can be anywhere inside the string, not just at the start, use:

rocket.gsub(/(?:\G(?!\A)|\A.*?meowth).*?\K\s*(meowth)/mi) {
  $1.upcase == $1 ? ' team rocket' : ''
}

See another Ruby demo.

NOTE: to match meowth as a whole word, enclose it with word boundaries: /(?!\A)\s*\b(meowth)\b/i.
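
As a rough illustration of what the word boundaries change, the sketch below runs both variants on an invented sample string that contains the longer word "MEOWTHS".

```ruby
# Sample text invented for this comparison.
text = "Meowth likes MEOWTHS and MEOWTH"

# Without word boundaries, the "MEOWTH" inside "MEOWTHS" also matches.
loose  = text.gsub(/(?!\A)\s*(meowth)/i)     { $1 == $1.upcase ? ' team rocket' : '' }
# With \b on both sides, only the standalone word matches.
strict = text.gsub(/(?!\A)\s*\b(meowth)\b/i) { $1 == $1.upcase ? ' team rocket' : '' }

puts loose
puts strict
```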

Details:

  • (?!\A) - the current position must not be the start of the string
  • \s* - 0+ whitespaces
  • (meowth) - Group 1 capturing meowth (case-insensitively, due to the /i modifier)

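Or, for the second pattern:

  • (?:\G(?!\A)|\A.*?meowth) - matches either the position right after the previous successful match ( \G(?!\A) ) or a substring from the start of the string ( \A ) through the first occurrence of meowth ( .*? matches any 0+ chars, as few as possible)
  • .*? - any 0+ chars, as few as possible, up to the next occurrence of the subpatterns that follow
  • \K - discards the text matched so far from the match value, so only what follows is replaced
  • \s* - 0+ whitespaces
  • (meowth) - meowth (Group 1).

Inside the block, the captured value is compared with its .upcase form to check whether it is ALL CAPS. If it is, the match is replaced with team rocket (with a leading space, because the whitespace before the word is part of the match); otherwise, it is removed.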
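
The \G variant can be tried on a string whose first occurrence is not at the start. This is only a sketch: the helper name keep_first and the sample strings are hypothetical, while the regex is the second pattern from this answer.

```ruby
# Hypothetical helper wrapping the \G-based pattern.
def keep_first(text)
  text.gsub(/(?:\G(?!\A)|\A.*?meowth).*?\K\s*(meowth)/mi) do
    word = Regexp.last_match(1)
    word == word.upcase ? ' team rocket' : ''
  end
end

result = keep_first("Oh no, Meowth! Not meowth again, MEOWTH")
single = keep_first("Say meowth")

puts result  # the mid-string "Meowth" survives, later ones are handled
puts single  # a lone occurrence is left untouched
```

On the first pass the \A branch consumes everything through the first meowth, and \K drops that text from the match, so it is never replaced. On later passes \G resumes exactly where the previous match ended.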

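You don't really need a complex regex for this. Chain with_index onto gsub and pass a block:

rocket.gsub(/\s*meowth/i).with_index do |match, i|
  if i.zero?
    match                  # first occurrence: keep as is
  elsif match.strip == 'MEOWTH'
    ' team rocket'         # later all-caps occurrence: substitute
  else
    ''                     # any other later occurrence: remove
  end
end

The \s* in the pattern also consumes the whitespace before each later occurrence, so removing a word does not leave a stray space before the comma.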
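
This works because String#gsub called with only a pattern (no replacement and no block) returns an Enumerator, and the value returned by the block given to with_index becomes the replacement for each match. A minimal sketch on a toy string, chosen only for illustration:

```ruby
# gsub with a pattern only returns an Enumerator over the matches.
enum = "a-b-a".gsub(/a/)

# The block receives each match plus its zero-based index;
# the block's return value replaces that match.
numbered = "a-b-a".gsub(/a/).with_index { |match, i| "#{match}#{i}" }

puts enum.class
puts numbered
```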
