
Remove certain regex from a string in Rails

I am building a tweet-like system that includes @mentions and #hashtags. Right now, I need to take a tweet that arrives at the server like this:

hi [@Bob D](member:Bob D) whats the deal with [#red](tag:red)

and save it in the database as:

hi @Bob D whats the deal with #red

I have the flow of the code in my mind but can't get it to work. Basically, I need to do the following:

  1. Scan the string for any [@...] (an array-like structure that begins with an @).
  2. Delete the parentheses after the array-like structure (so for [@Bob D](member:Bob D), remove everything in parentheses).
  3. Remove the brackets surrounding a substring that begins with @ (meaning, delete the [] from [@...]).

I will also need to do the same for #. I'm almost certain this can be done with regular expressions and the slice! method, but I'm really having trouble coming up with the regular expressions needed and the control flow. I think it would be something like this:

a = "hi [@Bob D](member:Bob D) whats the deal with [#red](tag:red)"
substring = a.scan <regular expression here>
substring.each do |matching_substring|  # the loop should get rid of the parentheses but not the brackets
    a.slice! matching_substring
end
#Something here should get rid of brackets

The problem with the code above is that I can't figure out the regex, and it doesn't get rid of the brackets.

This regex should work for this: /(\[(@.*?)\]\((.*?)\))/

You can use Rubular to test it.

The ? after the * makes it non-greedy, so it should capture each match separately.
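To see why the non-greedy quantifier matters, here is a small sketch comparing the two forms. The sample string with two mentions is made up for illustration and is not from the question.

```ruby
# Illustrative input with two mentions (an assumption, not the asker's data).
text = "hi [@Bob D](member:Bob D) whats the deal with [@Al](member:Al)"

# Greedy: .* runs to the LAST closing bracket, swallowing both mentions.
greedy = text.scan(/\[(@.*)\]/).flatten

# Non-greedy: .*? stops at the FIRST closing bracket, one match per mention.
lazy = text.scan(/\[(@.*?)\]/).flatten

p greedy # => ["@Bob D](member:Bob D) whats the deal with [@Al"]
p lazy   # => ["@Bob D", "@Al"]
```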

The code would look something like:

a = "hi [@Bob D](member:Bob D) whats the deal with [#red](tag:red)"
substring = a.scan(/(\[(@.*?)\]\((.*?)\))/)
substring.each do |matching_substring|
  # matching_substring[0] is the whole match, e.g. "[@Bob D](member:Bob D)"
  # matching_substring[1] is the part in the brackets sans brackets
  # matching_substring[2] is the part in the parentheses sans parentheses
  a = a.gsub(matching_substring[0], matching_substring[1]) # replaces [@Bob D](member:Bob D) with @Bob D
end
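The scan-then-replace loop above can probably be collapsed into a single gsub call that uses a backreference, and widening `@` to `[@#]` would cover hashtags too. The following is a minimal sketch under those assumptions; the names `MENTION_OR_TAG_RE` and `strip_markup` are hypothetical and do not come from the answer.

```ruby
# Hypothetical constant: matches [@name](...) or [#tag](...) and captures
# the bracket contents (including the leading @ or #) as group 1.
MENTION_OR_TAG_RE = /\[([@#][^\]]+)\]\([^)]*\)/

# Hypothetical helper: replaces each whole match with just group 1.
def strip_markup(text)
  text.gsub(MENTION_OR_TAG_RE, '\1')
end

result = strip_markup("hi [@Bob D](member:Bob D) whats the deal with [#red](tag:red)")
puts result # => hi @Bob D whats the deal with #red
```

Because gsub returns a new string, the original input is left unchanged; gsub! would modify it in place if that is preferred.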

Consider this:

str = "hi [@Bob D](member:Bob D) whats the deal with [#red](tag:red)"

BRACKET_RE_STR = '\[
              (
                [@#]
                [^\]]+
              )
              \]'
PARAGRAPH_RE_STR = '\(
              [^)]+
              \)'


BRACKET_RE = /#{BRACKET_RE_STR}/x
PARAGRAPH_RE = /#{PARAGRAPH_RE_STR}/x
BRACKET_AND_PARAGRAPH_RE = /#{BRACKET_RE_STR}#{PARAGRAPH_RE_STR}/x

str.gsub(BRACKET_AND_PARAGRAPH_RE) { |s| s.sub(PARAGRAPH_RE, '').sub(BRACKET_RE, '\1') }
# => "hi @Bob D whats the deal with #red"

The longer or more complex a pattern is, the harder it is to maintain or update, so keep patterns as small as possible. Build complex patterns from simple ones so they are easier to debug and extend.
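One way to act on that advice is to check each small pattern on its own before combining them. The sketch below assumes Regexp objects are interpolated into larger ones instead of strings; the constant names `MARKER`, `LABEL`, `TARGET`, and `LINK` are illustrative only.

```ruby
# Small building blocks (hypothetical names, for illustration).
MARKER = /[@#]/                       # leading @ or #
LABEL  = /\[(#{MARKER}[^\]]+)\]/      # [@Bob D] or [#red], contents in group 1
TARGET = /\([^)]*\)/                  # (member:Bob D) or (tag:red)
LINK   = /#{LABEL}#{TARGET}/          # label immediately followed by target

# Each piece can be checked in isolation before trusting the combination.
label_ok  = LABEL.match("[@Bob D]")[1] == "@Bob D"
target_ok = TARGET.match?("(member:Bob D)")

str = "hi [@Bob D](member:Bob D) whats the deal with [#red](tag:red)"
cleaned = str.gsub(LINK, '\1')
puts cleaned # => hi @Bob D whats the deal with #red
```

If the combined pattern ever misbehaves, the per-piece checks narrow down which component is at fault.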

