Replace with multiple patterns mutually exclusively
I have the following text:
a phrase whith length one, which is "uno"
Using the following dictionary,
1) phrase --- frase
2) a phrase --- una frase
3) one --- uno
4) uno --- one
I'm trying to replace the occurrences of the dictionary items in the text. The desired output is:
[a phrase|una frase] whith length [one|uno], which is "[uno|one]"
I've done this:
text = %(a phrase whith length one, which is "uno")
dictionary.each do |original, translation|
  text.gsub! original, "[#{original}|#{translation}]"
end
This snippet outputs the following for each dictionary word:
1) a [phrase|frase] whith length one, which is "uno"
2) a [phrase|frase] whith length one, which is "uno"
3) a [phrase|frase] whith length [one|uno], which is "uno"
4) a [phrase|frase] whith length [one|[uno|one]], which is "[uno|one]"
I see two problems here:
1) phrase is being replaced instead of a phrase. I think that this can be fixed by sorting the dictionary by length, giving priority to longer terms.
2) Words that have already been replaced are being replaced again, such as uno in [one|uno].
I thought of using some sort of regular expression list (with Regexp.union), but I don't know how efficient and clean it'll be. Any ideas?
To solve your second problem, you have to replace in a single pass.
Convert the dictionary into a hash with the key-value pairs in the order you mention (sorted by length, perhaps).
dictionary = {
  "a phrase" => "[a phrase|una frase]",
  "phrase" => "[phrase|frase]",
  "one" => "[one|uno]",
  "uno" => "[uno|one]",
}
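If the dictionary starts out as plain original/translation pairs, a hash like the one above could be built programmatically. This is only a sketch: the name pairs is hypothetical, and the block form of to_h assumes Ruby 2.6 or later.

```ruby
# Hypothetical input: the original/translation pairs from the question.
pairs = {
  "phrase"   => "frase",
  "a phrase" => "una frase",
  "one"      => "uno",
  "uno"      => "one",
}

# Put longer terms first, then wrap each one as "[original|translation]".
dictionary = pairs
  .sort_by { |original, _| -original.length }
  .to_h { |original, translation| [original, "[#{original}|#{translation}]"] }
```

Hashes in Ruby keep insertion order, so the sorted order carries over to dictionary.keys.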
Then replace all in a single pass.
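# Each key becomes its own regex: in a double-quoted string "\b" is a backspace, and Regexp.union escapes plain strings.
text.gsub(Regexp.union(dictionary.keys.map { |w| /\b#{Regexp.escape(w)}\b/ }), dictionary)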
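Putting the pieces together, here is a minimal end-to-end sketch using the text and dictionary from the question. The variable names pattern and result are only illustrative. The word boundaries are written as regex literals, because "\b" inside a double-quoted Ruby string is a backspace character rather than a word boundary.

```ruby
text = %(a phrase whith length one, which is "uno")

dictionary = {
  "a phrase" => "[a phrase|una frase]",
  "phrase"   => "[phrase|frase]",
  "one"      => "[one|uno]",
  "uno"      => "[uno|one]",
}

# One alternation over all keys, each anchored at word boundaries.
pattern = Regexp.union(dictionary.keys.map { |w| /\b#{Regexp.escape(w)}\b/ })

# With a hash as the second argument, gsub looks up each matched string as a key.
result = text.gsub(pattern, dictionary)
puts result
# => [a phrase|una frase] whith length [one|uno], which is "[uno|one]"
```

Because gsub scans the original string once and never rescans the inserted replacements, uno inside [one|uno] is not replaced again.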