Working with Named Regex Groups in Ruby

I'm trying to match regex groups over a series of lines and getting stumped. The data file has lines that look like this:

2014-03-01 08:19,47.799107662994,-75.876391553881,some comment,James,#tag

Here is my Ruby code:

regex = /(?<day>.*)\s(?<hour>\d*:\d*),(?<lat>.*),(?<long>.*),(?<entry>.*),(?<people>.*),#(?<tag>.*)/

f = File.open("/Users/USERNAME/path/to/file.txt", encoding: 'UTF-8')
lines = f.read
f.close
lines.each_line do |line|
  if line =~ /&/
    line.gsub!(/[&]/, 'and')
  end

  if regex =~ line
    puts line
  end
end

That works, but if I change the third-to-last line to, for example, puts day, I get an error saying that day is an undefined local variable. My understanding was that =~ automatically defines those variables.

Any idea what I'm doing wrong?

When the regex is stored in a variable, you can access the named captures only through a MatchData object:

regex = /(?<day>.*)\s(?<hour>\d*:\d*),(?<lat>.*),(?<long>.*),(?<entry>.*),(?<people>.*),#(?<tag>.*)/
line = "2014-03-01 08:19,47.799107662994,-75.876391553881,some comment,James,#tag"

matchdata = regex.match(line)

matchdata["day"] # => "2014-03-01"

So I would do this instead:

if (matchdata = regex.match(line))
  puts matchdata["day"]
end
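
As a rough sketch of how this could fit the loop from the question, the helper below collects the captures of every matching line. The method name parse_lines and the inline sample text are assumptions made for illustration; MatchData#named_captures (Ruby 2.4 and later) returns the groups as a hash keyed by group name.

```ruby
REGEX = /(?<day>.*)\s(?<hour>\d*:\d*),(?<lat>.*),(?<long>.*),(?<entry>.*),(?<people>.*),#(?<tag>.*)/

# Hypothetical helper: returns one hash of named captures per matching line.
def parse_lines(text)
  parsed = []
  text.each_line do |line|
    matchdata = REGEX.match(line.chomp)
    next unless matchdata            # skip lines that do not match
    parsed << matchdata.named_captures
  end
  parsed
end

# Sample input: the line from the question plus one line that should not match.
sample = "2014-03-01 08:19,47.799107662994,-75.876391553881,some comment,James,#tag\nnot a data line\n"

parse_lines(sample).each do |captures|
  puts captures["day"]
end
```

Saving the MatchData in a local variable keeps the result tied to one specific match, which is easier to follow than relying on implicit state.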

From the Ruby Regexp docs:

When named capture groups are used with a literal regexp on the left-hand side of an expression and the =~ operator, the captured text is also assigned to local variables with corresponding names.

So a literal regex must be used in order to create the local variables.

In your case you are using a variable to reference the regex, not a literal.

For example:

regex = /(?<day>.*)/
regex =~ 'whatever'
puts day

produces NameError: undefined local variable or method `day' for main:Object, but this

/(?<day>.*)/ =~ 'whatever'
puts day

prints whatever.

Try:
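
Applied to the regex from the question, this could look like the sketch below. The method name day_and_people is hypothetical; the point is only that the regex literal is written inline on the left of =~ rather than read from a variable.

```ruby
# Hypothetical helper; the name is for illustration only.
def day_and_people(line)
  # The literal sits directly on the left of =~, so Ruby assigns each
  # named capture (day, hour, lat, ...) to a local variable of that name.
  if /(?<day>.*)\s(?<hour>\d*:\d*),(?<lat>.*),(?<long>.*),(?<entry>.*),(?<people>.*),#(?<tag>.*)/ =~ line
    [day, people]
  end
end

p day_and_people("2014-03-01 08:19,47.799107662994,-75.876391553881,some comment,James,#tag")
# => ["2014-03-01", "James"]
```

The trade-off is that the pattern can no longer be stored in a variable and reused, which is one reason to prefer an explicit MatchData when the regex is shared.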

puts $~['day'] if regex =~ line

The (somewhat cryptic) $~ global variable is a MatchData instance storing the results of the last regex match, and you can access your named captures through it.

But the answer by @bjhaid is a better choice, since it saves the MatchData explicitly.
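
If $~ feels too cryptic, Regexp.last_match returns the same MatchData under a more readable name. A minimal sketch, assuming a hypothetical helper named day_via_last_match and a shortened version of the pattern:

```ruby
# Hypothetical helper: the regex arrives in a variable, so no local
# variables are created; the captures are read from the last match instead.
def day_via_last_match(regex, line)
  return nil unless regex =~ line
  Regexp.last_match[:day]   # same object as $~
end

short_regex = /(?<day>.*)\s(?<hour>\d*:\d*),/
puts day_via_last_match(short_regex, "2014-03-01 08:19,47.799107662994")
```

Both $~ and Regexp.last_match reflect only the most recent match in the current method and thread, so read them immediately after the =~ test.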
