How to have gsub handle multiple patterns and replacements

A while ago I created a function in PHP to "twitterize" the text of tweets pulled via Twitter's API.

Here's what it looked like:

function twitterize($tweet){
$patterns = array ( "/((([A-Za-z]{3,9}:(?:\/\/)?)(?:[-;:&=\+\$,\w]+@)?[A-Za-z0-9.-]+|(?:www.|[-;:&=\+\$,\w]+@)[A-Za-z0-9.-]+)((?:\/[\+~%\/.\w-_]*)?\??(?:[-\+=&;%@.\w_]*)#?(?:[\w]*))?)/", 
                    "/(?<=^|(?<=[^a-zA-Z0-9-\.]))@([A-Za-z_]+[A-Za-z0-9_]+)/",
                    "/(?<=^|(?<=[^a-zA-Z0-9-\.]))#([A-Za-z_]+[A-Za-z0-9_]+)/");
$replacements = array ("<a href='\\0' target='_blank'>\\0</a>", "<a href='http://twitter.com/\\1' target='_blank'>\\0</a>", "<a href='http://twitter.com/search?q=\\1&src=hash' target='_blank'>\\0</a>");

return preg_replace($patterns, $replacements, $tweet);

}

Now I'm a little stuck with Ruby's gsub. I tried:

def twitterize(text)
patterns = ["/((([A-Za-z]{3,9}:(?:\/\/)?)(?:[-;:&=\+\$,\w]+@)?[A-Za-z0-9.-]+|(?:www.|[-;:&=\+\$,\w]+@)[A-Za-z0-9.-]+)((?:\/[\+~%\/.\w-_]*)?\??(?:[-\+=&;%@.\w_]*)#?(?:[\w]*))?)/", "/(?<=^|(?<=[^a-zA-Z0-9-\.]))@([A-Za-z_]+[A-Za-z0-9_]+)/", "/(?<=^|(?<=[^a-zA-Z0-9-\.]))#([A-Za-z_]+[A-Za-z0-9_]+)/"]
replacements =  ["<a href='\\0' target='_blank'>\\0</a>",
                "<a href='http://twitter.com/\\1' target='_blank'>\\0</a>",
                "<a href='http://twitter.com/search?q=\\1&src=hash' target='_blank'>\\0</a>"]

return text.gsub(patterns, replacements)
end

This obviously didn't work and returned an error:

No implicit conversion of Array into String

After looking at the Ruby documentation for gsub and exploring a few of the examples it provides, I still couldn't find a solution to my problem: how can I have gsub handle multiple patterns and multiple replacements at once?

Well, as you can read from the docs, gsub does not handle multiple patterns and replacements at once. That's what's causing your error, which is otherwise quite explicit (you can read it as "give me a String, not an Array!!1").

You can write that like this:

def twitterize(text)
  patterns = [/((([A-Za-z]{3,9}:(?:\/\/)?)(?:[-;:&=\+\$,\w]+@)?[A-Za-z0-9.-]+|(?:www.|[-;:&=\+\$,\w]+@)[A-Za-z0-9.-]+)((?:\/[\+~%\/.\w-_]*)?\??(?:[-\+=&;%@.\w_]*)#?(?:[\w]*))?)/, /(?<=^|(?<=[^a-zA-Z0-9-\.]))@([A-Za-z_]+[A-Za-z0-9_]+)/, /(?<=^|(?<=[^a-zA-Z0-9-\.]))#([A-Za-z_]+[A-Za-z0-9_]+)/]
  replacements =  ["<a href='\\0' target='_blank'>\\0</a>",
            "<a href='http://twitter.com/\\1' target='_blank'>\\0</a>",
            "<a href='http://twitter.com/search?q=\\1&src=hash' target='_blank'>\\0</a>"]

  patterns.each_with_index do |pattern, i|
    text.gsub!(pattern, replacements[i])
  end

  text
end

This can be refactored into more elegant, Ruby-ish code, but I think it'll do the job.
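One possible refactoring is sketched here under a few assumptions: MENTION and HASHTAG are simplified stand-ins rather than the regexes from the question, and RULES is a hypothetical name. It pairs each pattern with its replacement and folds over the pairs with the non-mutating gsub, so the caller's string is left untouched:

```ruby
# Simplified, hypothetical patterns for illustration only;
# substitute the real regexes from the question.
MENTION = /@(\w+)/
HASHTAG = /#(\w+)/

# Each rule is a [pattern, replacement] pair, applied in order.
RULES = [
  [MENTION, "<a href='http://twitter.com/\\1'>\\0</a>"],
  [HASHTAG, "<a href='http://twitter.com/search?q=\\1'>\\0</a>"]
]

def twitterize(text)
  # Start from the original text and feed each result into the next gsub.
  RULES.reduce(text) do |result, (pattern, replacement)|
    result.gsub(pattern, replacement)
  end
end

puts twitterize("hi @bob #ruby")
```

Because each gsub runs on the output of the previous one, the order of the pairs matters: a later pattern can match text that an earlier replacement inserted.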

The error occurred because you passed an array of replacements to gsub where it expects a string. Its syntax is:

text.gsub(matching_pattern,replacement_text)

You need to do something like this:

replaced_text = text.gsub(pattern1, replacement1)
replaced_text = replaced_text.gsub(pattern2, replacement2)

and so on, where pattern1 is one of your matching patterns and replacement1 is the replacement text you want for it.
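If the patterns are fixed strings rather than regexes with capture groups, gsub can also take a Hash as its second argument and look up each match in it, which covers several replacements in one pass. A minimal sketch, assuming a hypothetical escape_html helper and SUBSTITUTIONS table that are not part of the original question:

```ruby
# Hypothetical literal substitutions, chosen only to illustrate the hash form.
SUBSTITUTIONS = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;" }

def escape_html(text)
  # Regexp.union builds one regex matching any key; gsub then replaces
  # each match with the value stored under the matched text.
  text.gsub(Regexp.union(SUBSTITUTIONS.keys), SUBSTITUTIONS)
end

puts escape_html("a < b & c")
```

Since the string is scanned only once, text produced by one substitution is never re-matched by another. The hash form cannot use backreferences such as \1, so it would not replace the chained calls for the tweet patterns as written.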
