Regex - Split message into groups
I want to split this message into groups:
[Rule] 'Server - update repository' [Source] 10.10.10.10 [User] _Server [Content] HTTP GET http://example.com
Expected result:
Group1: [Rule] 'Server - update repository'
Group2: [Source] 10.10.10.10
Group3: [User] _Server
Group4: [Content] HTTP GET http://example.com
It does not have to be 4 groups; sometimes there can be fewer or more. The pattern I tried to build:
(\(^\[\w+\].*\)){0,}
I would do it like this:
string = "[Rule] 'Server - update repository' [Source] 10.10.10.10 [User] _Server [Content] HTTP GET http://example.com"
regexp = /\[?[^\[]+/
string.scan(regexp)
#=> ["[Rule] 'Server - update repository' ", "[Source] 10.10.10.10 ", "[User] _Server ", "[Content] HTTP GET http://example.com"]
Or, if you prefer a hash to be returned:
regexp = /\[(\w+)\]\s+([^\[]+)/
string.scan(regexp).to_h
#=> { "Rule" => "'Server - update repository' ", "Source" => "10.10.10.10 ", "User" => "_Server ", "Content" => "HTTP GET http://example.com" }
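Note that the values above keep the space that preceded the next bracket. A minimal sketch of one way to trim them, assuming Ruby 2.4 or later for Hash#transform_values (the variable name pairs is only illustrative):

```ruby
string = "[Rule] 'Server - update repository' [Source] 10.10.10.10 [User] _Server [Content] HTTP GET http://example.com"
regexp = /\[(\w+)\]\s+([^\[]+)/

# scan returns [key, value] pairs, to_h builds the hash,
# and transform_values(&:strip) removes surrounding whitespace from each value
pairs = string.scan(regexp).to_h.transform_values(&:strip)
```

On older Rubies, mapping over the pairs and calling strip on each value before to_h should give the same result.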
If the group text never contains a [ character, this might work:
str = "[Rule] 'Server - update repository' [Source] 10.10.10.10 [User] _Server [Content] HTTP GET http://example.com"
str.split("[").each_with_index {|c, i| puts "Group #{i}: [#{c}" if i > 0}
Group 1: [Rule] 'Server - update repository'
Group 2: [Source] 10.10.10.10
Group 3: [User] _Server
Group 4: [Content] HTTP GET http://example.com
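To illustrate the caveat, here is a small sketch with a made-up message (not from the question) whose content contains a bracket. Splitting on "[" then produces one piece too many:

```ruby
# Hypothetical input: the [Content] text itself contains "[1]"
str = "[User] _Server [Content] see [1] for details"

# Drop the empty string before the first "[", restore the bracket on each piece,
# and strip the trailing space left over from the split
groups = str.split("[").drop(1).map { |c| "[#{c}".strip }
```

Here groups has three elements rather than the intended two, because "[1] for details" is treated as a group of its own.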
You can also use String#split.
str = "[Rule] 'Server - update repository' [Source] 10.10.10.10 [User] _Server [Content] HTTP GET http://example.com"
str.split(/ +(?=\[)/)
#=> ["[Rule] 'Server - update repository'",
# "[Source] 10.10.10.10",
# "[User] _Server",
# "[Content] HTTP GET http://example.com"]
The string is split on one or more spaces followed by a left bracket. (?=\[) is a positive lookahead, so the bracket is only checked for, not consumed, and it stays at the start of the next piece.
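To see what the lookahead contributes, the following sketch compares splitting with and without it on a shortened version of the message (the variable names are illustrative):

```ruby
str = "[Rule] 'Server - update repository' [Source] 10.10.10.10"

# With the lookahead, "[" is only asserted, so it remains in the next piece
with_lookahead = str.split(/ +(?=\[)/)

# Without the lookahead, "[" is part of the delimiter and is discarded
without_lookahead = str.split(/ +\[/)
```

The second result loses the opening bracket of every group after the first, which is why the lookahead form is used.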
If you wish to create a hash with keys :Group1, :Group2, and so on, you could write:
arr = str.split(/ +(?=\[)/)
arr.each_index.with_object({}) do |i,h|
h.update("Group#{i+1}".to_sym => arr[i])
end
#=> {:Group1=>"[Rule] 'Server - update repository'",
# :Group2=>"[Source] 10.10.10.10",
# :Group3=>"[User] _Server",
# :Group4=>"[Content] HTTP GET http://example.com"}
Depending on requirements, here is another option.
RGX = /\[([A-Z][a-z]+)\] +([^\[\]]+[^ \[\]])/
str.gsub(RGX).with_object({}) { |_,h| h[$1] = $2 }
#=> {"Rule"=>"'Server - update repository'",
# "Source"=>"10.10.10.10",
# "User"=>"_Server",
# "Content"=>"HTTP GET http://example.com"}
This uses the form of String#gsub that takes a single argument and has no block, returning an enumerator. This form is useful but odd, as it has nothing to do with string replacement.
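As a small sketch of that behaviour on a shortened version of the message (the lowercase names rgx, enum and matches are illustrative), the enumerator simply yields each matched substring and leaves the string untouched:

```ruby
str = "[Rule] 'Server - update repository' [Source] 10.10.10.10"
rgx = /\[([A-Z][a-z]+)\] +([^\[\]]+[^ \[\]])/

enum = str.gsub(rgx)   # no replacement and no block: an Enumerator is returned
matches = enum.to_a    # each element is the full text of one match
```

Inside a block chained onto the enumerator, $1 and $2 refer to the capture groups of the current match, which is what the with_object call relies on.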
We can write the regular expression in free-spacing mode to make it self-documenting.
/
\[ # match '['
( # begin capture group 1
[A-Z] # match an uppercase letter
[a-z]+ # match one or more lowercase letters
) # end capture group 1
\]\ + # match ']' followed by one or more spaces
( # begin capture group 2
[^\[\]]+ # match one or more chars other than '[' and ']'
[^ \[\]] # match one char other than ' ', '[' and ']'
) # end capture group 2
/x # invoke free-spacing regex definition mode
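As a quick check that the two spellings behave the same, this sketch runs a compact and a free-spacing version of the pattern over a shortened message. The names compact and spaced are illustrative. The space before + is escaped because free-spacing mode ignores unescaped whitespace outside character classes.

```ruby
str = "[Rule] 'Server - update repository' [Source] 10.10.10.10"

compact = /\[([A-Z][a-z]+)\] +([^\[\]]+[^ \[\]])/
spaced = /
  \[ ([A-Z][a-z]+) \]    # bracketed key, captured without the brackets
  \ +                    # one or more spaces, written as an escaped space
  ( [^\[\]]+ [^ \[\]] )  # value text that ends in a non-space character
/x

compact_pairs = str.scan(compact)
spaced_pairs = str.scan(spaced)
```

Both calls return the same list of key and value pairs.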