Why does Ruby gsub not replace a second occurrence of this pattern?

I have a bit of code for escaping double quotes in a string which may include pre-escaped quotes, e.g.:

This is a \"string"

Using the following code with Ruby 1.8.7p374:

string.gsub!(/([^\\])"/, '\1\"')

However, I hit a funny edge case when trying it on the following string: ab""c => ab\""c. I would expect it to have escaped both quotes.

It's definitely not a big issue, but it got me curious.
Is this a mistake in my expression? A gsub bug/feature?

(In newer Ruby versions, this could probably be solved easily by using a negative lookbehind, but lookbehinds do not seem to be supported in this version.)

Requiring a match on a non-backslash character means the regex needs to consume that character as well as the quote. gsub matches also cannot overlap, so once b" has been consumed, the second " has no preceding character left to match [^\\].
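To make the consumption visible, here is a minimal sketch that records what the question's pattern actually matches in ab""c. The variable names are illustrative only and are not part of the original question.

```ruby
# Illustrative input: two adjacent unescaped quotes.
input = 'ab""c'
pattern = /([^\\])"/

# Record each full match and its start index. The match includes the
# character before the quote, because [^\\] has to consume something.
matches = []
input.scan(pattern) { matches << [$~[0], $~.begin(0)] }
p matches

# Only the first quote is escaped, because the only match is b".
result = input.gsub(pattern, '\1\"')
puts result
```

The scan finds a single match, b" at index 1. The next search starts at the second quote, which could itself stand in for [^\\], but it is followed by c rather than another quote.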

You are right that a look-behind assertion would fix this. But without that available, you have a couple of choices in Ruby 1.8.7.

  1. Repeat until no substitutions are made (gsub! returns nil if there were no matches):

    loop { break unless string.gsub!(/([^\\])"/, '\1\"') }

  2. Ruby 1.8.7 has no look-behind assertions, but it does have look-ahead. So you can reverse the string, use a look-ahead assertion to make your changes, then reverse it back:

    string = string.reverse.gsub(/"(?!\\)/, '"\\\\').reverse
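Both workarounds can be sketched as small helpers to compare their results. The method names escape_by_repeating and escape_by_reversing are hypothetical and chosen only for this illustration.

```ruby
# Hypothetical helper names; neither appears in the original answer.

# Workaround 1: rerun the substitution until gsub! returns nil (no match).
def escape_by_repeating(text)
  result = text.dup
  loop { break unless result.gsub!(/([^\\])"/, '\1\"') }
  result
end

# Workaround 2: reverse the string, escape every quote that is not
# followed by a backslash (i.e. not preceded by one in the original),
# then reverse back.
def escape_by_reversing(text)
  text.reverse.gsub(/"(?!\\)/, '"\\\\').reverse
end

puts escape_by_repeating('ab""c')
puts escape_by_reversing('ab""c')
puts escape_by_reversing('This is a \"string"')
```

Note that the repeating variant still requires a character before each quote, so a quote at the very start of the string stays unescaped. The reversing variant does not have that limitation.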

Your regex also won't work if there is a quote at the start of the string, e.g. "ab""c will be transformed to "ab\""c. The reason for this is similar to your case with adjacent quotes.

After gsub has matched b" and replaced it, it continues from the end of that match, looking at the next ", but it doesn't look back at the previously consumed characters.
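A rough way to watch this happen is the block form of gsub, which exposes each match as it is found. This is only a sketch; the variable names are made up for the example.

```ruby
# Illustrative input with a leading quote and two adjacent quotes.
input = '"ab""c'
consumed = []

output = input.gsub(/([^\\])"/) do
  m = Regexp.last_match
  # Remember where the match started and what text it consumed.
  consumed << [m.begin(0), m[0]]
  # Rebuild the replacement by hand: captured character, backslash, quote.
  "#{m[1]}\\\""
end

p consumed
puts output
```

Only one match is recorded, b" starting at index 2. The leading quote has no character before it, and the quote at index 4 follows a character that was already consumed, so neither one is escaped.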

In newer Ruby versions a lookbehind could fix the adjacent-quote issue, but one that requires a preceding non-backslash character, (?<=[^\\]), still fails at the start of the string (a negative lookbehind, (?<!\\), would not). In Ruby 1.8.7, the way to fix both problems is the \G anchor, which matches where the previous match ended or at the start of the string. So you are looking for a " that either comes immediately after a non-backslash character or sits at the start of the current match attempt (meaning a " has just been matched, or this is the start of the string). Something like this:

string.gsub!(/([^\\]|\G)"/, '\1\"')

This will convert the string "ab""c to \"ab\"\"c.
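As a quick check of the \G pattern against the inputs discussed in this thread, the following sketch wraps it in a helper. The method name escape_quotes is hypothetical.

```ruby
# Hypothetical helper wrapping the \G-based pattern shown above.
def escape_quotes(text)
  # Group 1 holds the preceding non-backslash character, or is empty
  # when the quote sits exactly where the search started (\G).
  text.gsub(/([^\\]|\G)"/, '\1\"')
end

['ab""c', '"ab""c', 'This is a \"string"'].each do |sample|
  puts "#{sample} => #{escape_quotes(sample)}"
end
```

The pre-escaped quote in the third sample is left alone: the backslash before it cannot match [^\\], and \G does not apply at that position because no match ended there.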
