
Replace with multiple patterns mutually exclusively

I have the following text:

a phrase whith length one, which is "uno"

Using the following dictionary,

1) phrase --- frase
2) a phrase --- una frase
3) one --- uno
4) uno --- one

I'm trying to replace the occurrences of the dictionary items in the text. The desired output is:

[a phrase|una frase] whith length [one|uno], which is "[uno|one]"

I've done this:

text = %(a phrase whith length one, which is "uno")
dictionary.each do |original, translation|
  text.gsub! original, "[#{original}|#{translation}]"
end

This snippet produces the following text after processing each dictionary entry:

1) a [phrase|frase] whith length one, which is "uno"
2) a [phrase|frase] whith length one, which is "uno"
3) a [phrase|frase] whith length [one|uno], which is "uno"
4) a [phrase|frase] whith length [one|[uno|one]], which is "[uno|one]"

I see two problems here:

  • The word "phrase" is being replaced instead of "a phrase". I think this can be fixed by sorting the dictionary by length, giving priority to longer terms.
  • Words that have already been replaced are being replaced again, like "uno" in [one|uno]. I thought of using some sort of regular expression list (with Regexp.union), but I don't know how efficient and clean it would be.

Any ideas?

To solve your second problem, you have to do all the replacements in a single pass, so that text that has already been substituted is never scanned again.

Convert the dictionary into a hash whose key-value pairs are in the order you mention (sorted by length, perhaps).

dictionary = {
  "a phrase" => "[a phrase|una frase]",
  "phrase" => "[phrase|frase]",
  "one" => "[one|uno]",
  "uno" => "[uno|one]",
}
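If the dictionary starts out as a plain list of word pairs, the hash above can be generated rather than typed by hand. This is only a sketch: the pairs array and the sorted variable are hypothetical names, and Array#to_h with a block assumes Ruby 2.6 or later.

```ruby
# Hypothetical raw word list, in the order given in the question.
pairs = [
  ["phrase", "frase"],
  ["a phrase", "una frase"],
  ["one", "uno"],
  ["uno", "one"],
]

# Longest originals first, so that "a phrase" comes before "phrase".
# Entries of equal length may end up in either order, which is harmless here.
sorted = pairs.sort_by { |original, _translation| -original.length }

# Map each original to its "[original|translation]" replacement string.
dictionary = sorted.to_h do |original, translation|
  [original, "[#{original}|#{translation}]"]
end
```

Ruby hashes keep insertion order, so dictionary.keys comes back longest first.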

Then replace everything in a single pass.

text.gsub(Regexp.union(*dictionary.keys.map { |w| /\b#{Regexp.escape(w)}\b/ }), dictionary)
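For reference, a self-contained sketch of the single-pass approach, assuming whole-word matching is what is wanted. The word boundaries are built as Regexp objects rather than strings, because "\b" inside a double-quoted Ruby string is a backspace character and Regexp.union escapes string arguments literally. The names pattern and result are illustrative only.

```ruby
# Replacement table from the answer; longer keys come first so that
# "a phrase" wins over "phrase" at the same position.
dictionary = {
  "a phrase" => "[a phrase|una frase]",
  "phrase"   => "[phrase|frase]",
  "one"      => "[one|uno]",
  "uno"      => "[uno|one]",
}

# One alternation of whole-word patterns. Regexp.escape guards against
# keys that contain regex metacharacters.
pattern = Regexp.union(dictionary.keys.map { |w| /\b#{Regexp.escape(w)}\b/ })

text = %(a phrase whith length one, which is "uno")

# With a hash as the second argument, gsub looks up each matched
# substring as a key; substituted text is not scanned again.
result = text.gsub(pattern, dictionary)
puts result
# prints: [a phrase|una frase] whith length [one|uno], which is "[uno|one]"
```

Because gsub walks the string once, "uno" inside the already inserted [one|uno] is never matched a second time.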
