How do I replace content outside of regex matches in a Ruby string?

Question

Given the following example input:

s = "an example with 'one' word and 'two and three' words inside quotes"

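I am trying to iterate over the parts outside the quotes to make some replacements. For example, I want to convert and to &, but only outside the quotes, to get: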

an example with 'one' word & 'two and three' words inside quotes

If I wanted to make the change inside the quotes, I could simply do the following:

s.gsub(/'.*?'/){ |q| q.gsub(/and/, '&') }

to get:

an example with 'one' word and 'two & three' words inside quotes

I have mainly tried two things to adapt this strategy to the text outside the quotes.

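First, I tried to negate the regex in the first gsub (i.e. /'.*?'/). I imagined that if there were a suffix modifier like /v, I could simply do s.gsub(/'.*?'/v){... }, but unfortunately I could not find anything like that. There is negative lookahead (i.e. (?!pat)), but I don't think it is what I need.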

Second, I tried combining split with gsub! like this:

puts s.split(/'.*?'/){ |r| r.gsub!(/and/, '&') }

With split I can iterate over the parts outside the quotes:

s.split(/'.*?'/){ |r| puts r }

to get:

an example with 
 word and 
 words inside quotes

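However, I cannot change those parts with gsub or gsub!. I guess I need a mutating version of split, in the same way that gsub is a mutating version of scan, but there doesn't seem to be anything like that.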
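The behaviour described above can be checked with a small sketch. It assumes Ruby 2.6 or later, where split accepts a block; the variable names pieces and returned are only illustrative. The block receives fresh copies of each piece, and split returns the receiver, so nothing done inside the block reaches s.

```ruby
s = "an example with 'one' word and 'two and three' words inside quotes"

pieces = []
# Each yielded piece is a new string, so gsub! changes the copy only.
# gsub! returns nil when nothing was replaced, hence the separate push.
returned = s.split(/'.*?'/) do |r|
  r.gsub!(/and/, '&')
  pieces << r
end

puts returned   # split with a block returns the original string
p pieces        # the modified copies, without the quoted parts
```

The modified pieces exist, but the quoted delimiters are discarded by split, so the original string cannot be reassembled from them.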

Is there an easy way to make either of these approaches work?

Answer 1

You can match and capture what you need to keep, and only match (without capturing) what you need to replace.

Use

s.gsub(/('[^']*')|and/) { $1 || '&' }
s.gsub(/('[^']*')|and/) { |m| m == $~[1] ? $~[1] : '&' }

If you need to match and as a whole word, use \band\b in the pattern instead of and.
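As a quick check of the difference the word boundaries make, here is a small sketch. The second input string t is made up for illustration and is not from the question.

```ruby
s = "an example with 'one' word and 'two and three' words inside quotes"

# Quoted substrings land in group 1 and are put back unchanged;
# a bare "and" leaves group 1 nil, so the block returns '&'.
plain = s.gsub(/('[^']*')|and/) { $1 || '&' }

# Illustrative input where "and" also occurs inside longer words.
t = "sandy and 'rock and roll' band"
unanchored = t.gsub(/('[^']*')|and/) { $1 || '&' }
whole_word = t.gsub(/('[^']*')|\band\b/) { $1 || '&' }

puts plain       # an example with 'one' word & 'two and three' words inside quotes
puts unanchored  # s&y & 'rock and roll' b&
puts whole_word  # sandy & 'rock and roll' band
```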

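This approach is very convenient because you can add as many specific patterns to skip as you like. For example, if you also want to avoid matching the whole word and between double quotes: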

s.gsub(/('[^']*'|"[^"]*")|\band\b/) { $1 || '&' }
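A short run of this pattern, on an input invented for illustration that mixes both quote styles:

```ruby
# %q(...) keeps both kinds of quotes literal without escaping.
s = %q(salt and "bread and butter" and 'fish and chips' on demand)

# Either kind of quoted substring is captured and restored;
# only a standalone "and" outside quotes becomes '&'.
out = s.gsub(/('[^']*'|"[^"]*")|\band\b/) { $1 || '&' }

puts out  # salt & "bread and butter" & 'fish and chips' on demand
```

Note that "demand" is left alone because \b requires a word boundary before the a.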

Or, if you want to make sure it also skips quoted strings that contain escaped quotes:

s.gsub(/('[^'\\]*(?:\\.[^'\\]*)*'|"[^"\\]*(?:\\.[^"\\]*)*")|\band\b/m) { $1 || '&' }
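To see why the longer pattern matters, the following sketch compares it with the simpler quote pattern on an input containing backslash-escaped quotes. The input is made up for illustration; a single-quoted heredoc is used so that the backslashes stay literal.

```ruby
s = <<~'TEXT'.chomp
  'it\'s now and then' and "say \"and\" twice" and done
TEXT

# Escape-aware: a backslash followed by any char is consumed as a pair,
# so an escaped quote does not end the quoted substring.
escape_aware = /('[^'\\]*(?:\\.[^'\\]*)*'|"[^"\\]*(?:\\.[^"\\]*)*")|\band\b/m
# Simple version: treats every quote as a delimiter, escaped or not.
simple = /('[^']*'|"[^"]*")|\band\b/

out   = s.gsub(escape_aware) { $1 || '&' }
naive = s.gsub(simple) { $1 || '&' }

puts out
puts naive  # differs: replacements leak into the quoted text
```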

Or, if it should only be replaced when it appears outside round, square, angle, and curly brackets:

s.gsub(/(<[^<>]*>|\{[^{}]*\}|\([^()]*\)|\[[^\]\[]*\])|\band\b/m) { $1 || '&' }
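A small run of the bracket pattern on an illustrative input. Each alternative excludes its own bracket characters from the inner class, so this sketch only covers non-nested brackets.

```ruby
s = 'a and (b and c) and [d and e] and {f and g} and <h and i>'

# Any non-nested bracketed span is captured and restored as is.
out = s.gsub(/(<[^<>]*>|\{[^{}]*\}|\([^()]*\)|\[[^\]\[]*\])|\band\b/m) { $1 || '&' }

puts out  # a & (b and c) & [d and e] & {f and g} & <h and i>
```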

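The pattern matches and captures substrings between single quotes, and only matches (without capturing) what you need to change. If Group 1 matched, it is put back via $1; otherwise the match is replaced with &. The replacement block in the second line simply checks whether the Group 1 value of the last match is the same as the current match; if it is, the match is put back, otherwise it is replaced with &.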
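The same logic can be wrapped in a small method. The helper below is hypothetical (the name replace_outside and its keyword arguments are not part of this answer); it uses Regexp.last_match(1), which is the same value as $1 and $~[1].

```ruby
# Hypothetical helper: replace `target` with `replacement` everywhere
# except inside text matched by `skip`. Both are Regexp objects.
def replace_outside(str, skip:, target:, replacement:)
  # The outer parentheses are always group 1, even if `skip`
  # contains capture groups of its own.
  str.gsub(/(#{skip})|#{target}/) do
    Regexp.last_match(1) || replacement
  end
end

s = "an example with 'one' word and 'two and three' words inside quotes"
result = replace_outside(s, skip: /'[^']*'/, target: /\band\b/, replacement: '&')

puts result  # an example with 'one' word & 'two and three' words inside quotes
```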

See the Ruby demo.

Regex details

('[^']*') - Capturing group #1: a ', then zero or more characters other than ', and then a ' character
| - or
and - the substring and.
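The breakdown above can also be kept next to the pattern itself by using Ruby's extended (/x) mode, where whitespace is ignored and # starts a comment. This is only a formatting sketch of the same regex; the variable name pattern is illustrative.

```ruby
pattern = /
  ( ' [^']* ' )   # group 1: a single-quoted substring, kept as is
  |               # or
  and             # the substring to be replaced
/x

s = "an example with 'one' word and 'two and three' words inside quotes"
puts s.gsub(pattern) { $1 || '&' }
```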

Answer 2

You can perform the desired replacements using the following regular expression.

r = /\G[^'\n]*?(?:'[^'\n]*'[^'\n]*?)*?\K\band\b/

Start your engine!

The required Ruby code is as follows.

str = "an and with 'one' word and 'two and three' words and end"

str.gsub(r, '&')
  #=> "an & with 'one' word & 'two and three' words & end"

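Ruby code tester

Ruby's regex engine performs the following operations. In essence, the regex asserts that "and" follows an even number of single quotes since the previous match or, if it is the first match, an even number of single quotes since the start of the string.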

\G          : asserts position at the end of the previous match
              or the start of the string for the first match
[^'\n]*?    : match 0+ chars other than ' and \n, lazily
(?:         : begin non-capture group
  '[^'\n]*' : match ' then 0+ chars other than ' and \n then '
  [^'\n]*?  : match 0+ chars other than ' and \n, lazily
)           : end non-capture group
*?          : execute non-capture group 0+ times, lazily 
\K          : forget everything matched so far and reset start of match
\band\b     : match the whole word 'and'
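As a cross-check, the sketch below runs this regex and the capture-and-skip pattern from Answer 1 on the same input. On this string, which has balanced quotes, both should give the same result; the variable names are illustrative.

```ruby
r = /\G[^'\n]*?(?:'[^'\n]*'[^'\n]*?)*?\K\band\b/
str = "an and with 'one' word and 'two and three' words and end"

# Answer 2: a plain string replacement is enough, because \K drops
# everything before "and" from the reported match.
with_anchor = str.gsub(r, '&')

# Answer 1: quoted parts are captured and put back by the block.
with_capture = str.gsub(/('[^']*')|\band\b/) { $1 || '&' }

puts with_anchor
puts with_anchor == with_capture
```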
