
Remove certain regex from a string in Rails

I am building a tweet-like system that includes @mentions and #hashtags. Right now, I need to take a tweet that will come to the server like this:

hi [@Bob D](member:Bob D) whats the deal with [#red](tag:red)

and save it in the database as:

hi @Bob D whats the deal with #red

I have the flow of what the code looks like in my mind but can't get it to work. Basically, I need to do the following:

  1. Scan the string for any [@...] (an array-like structure that begins with an @).
  2. Delete the parentheses after the array-like structure (so for [@Bob D](member:Bob D), remove everything in parentheses, including the parentheses themselves).
  3. Remove the brackets surrounding a substring that begins with @ (meaning, delete the [] from [@...]).

I will also need to do the same for #. I'm almost certain this can be done using regular expressions and the slice! method, but I'm really having trouble coming up with the regular expressions needed and the control flow. I think it would be something like this:

a = "hi [@Bob D](member:Bob D) whats the deal with [#red](tag:red)"
substring = a.scan <regular expression here>
substring.each do |matching_substring|  #the loop should get rid of the parentheses but not the brackets
    a.slice! matching_substring
end
#Something here should get rid of brackets

The problem with the code above is that I can't figure out the regex and it doesn't get rid of the brackets.

This regex should work for the mentions: /(\[(@.*?)\]\((.*?)\))/ (use [@#] in place of @ to cover the hashtags as well).

You can use Rubular to test it.

The ? after the * makes it non-greedy, so each [..](..) pair is captured as its own match instead of one match stretching from the first [ to the last ).
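To see the difference the ? makes, here is a small sketch that runs a greedy and a non-greedy version of the pattern over the sample tweet. The variable names are only illustrative, and the capture groups are left out to keep the scan results flat:

```ruby
text = "hi [@Bob D](member:Bob D) whats the deal with [#red](tag:red)"

# Greedy: .* grabs as much as it can, so the match runs from the
# first "[" all the way to the last ")".
greedy = text.scan(/\[.*\]\(.*\)/)

# Non-greedy: .*? stops at the first "]" and ")" it can, so each
# link is matched on its own.
lazy = text.scan(/\[.*?\]\(.*?\)/)

puts greedy.length # 1
puts lazy.length   # 2
```

The greedy version returns a single match that swallows the text between the two links, which is why the non-greedy form is the one to use here.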

The code would look something like this:

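a = "hi [@Bob D](member:Bob D) whats the deal with [#red](tag:red)"
# [@#] matches both mentions and hashtags; scan returns one array of captures per match
substring = a.scan(/(\[([@#].*?)\]\((.*?)\))/)
substring.each do |matching_substring|
  # matching_substring[0] is the whole link, e.g. "[@Bob D](member:Bob D)"
  # matching_substring[1] is the part in the brackets sans brackets
  # matching_substring[2] is the part in the parentheses sans parentheses
  a.gsub!(matching_substring[0], matching_substring[1]) # replaces [@Bob D](member:Bob D) with @Bob D in place
end
# a is now "hi @Bob D whats the deal with #red"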
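To make the indices used in the loop concrete, this sketch prints what scan returns when the pattern contains capture groups. It assumes the pattern is written as a regex literal with [@#] so that hashtags are included; the variable names are illustrative:

```ruby
text = "hi [@Bob D](member:Bob D) whats the deal with [#red](tag:red)"

# With capture groups, scan returns one inner array per match,
# holding the groups in order: whole link, bracket part, parenthesis part.
matches = text.scan(/(\[([@#].*?)\]\((.*?)\))/)

matches.each do |whole, label, target|
  puts "#{whole} -> label=#{label.inspect}, target=#{target.inspect}"
end
```

The third element (for example "member:Bob D" or "tag:red") is handy if the mention or tag targets also need to be stored separately before the markup is stripped.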

Consider this:

str = "hi [@Bob D](member:Bob D) whats the deal with [#red](tag:red)"

BRACKET_RE_STR = '\[
              (
                [@#]
                [^\]]+
              )
              \]'
PAREN_RE_STR = '\(
              [^)]+
              \)'


BRACKET_RE = /#{BRACKET_RE_STR}/x
PAREN_RE = /#{PAREN_RE_STR}/x
BRACKET_AND_PAREN_RE = /#{BRACKET_RE_STR}#{PAREN_RE_STR}/x

str.gsub(BRACKET_AND_PAREN_RE) { |s| s.sub(PAREN_RE, '').sub(BRACKET_RE, '\1') }
# => "hi @Bob D whats the deal with #red"

The longer or more complex a pattern is, the harder it is to maintain or update, so keep patterns as small as possible. Build complex patterns from simple ones so they are easier to debug and extend.
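As a minimal sketch of that idea, two small Regexp objects can be interpolated into a larger one, and a single gsub with a backreference then does the whole replacement. The constant names BRACKETED, PARENTHESIZED and LINK are made up for this illustration:

```ruby
# Small building blocks, each easy to test on its own.
BRACKETED     = /\[([@#][^\]]+)\]/ # captures "@Bob D" or "#red"
PARENTHESIZED = /\([^)]*\)/        # "(member:Bob D)" or "(tag:red)"

# Interpolating a Regexp embeds its source, so the capture group
# from BRACKETED is still group 1 in the combined pattern.
LINK = /#{BRACKETED}#{PARENTHESIZED}/

str = "hi [@Bob D](member:Bob D) whats the deal with [#red](tag:red)"

# Replace each whole link with just the captured bracket contents.
cleaned = str.gsub(LINK, '\1')
puts cleaned
```

Because the replacement uses the capture group directly, no inner sub calls are needed, and each piece can still be changed without touching the other.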
