Working with Named Regex Groups in Ruby
I'm trying to match regex groups over a series of lines and I'm stumped. The data file has lines that look like this:
2014-03-01 08:19,47.799107662994,-75.876391553881,some comment,James,#tag
Here is my Ruby code:
regex = /(?<day>.*)\s(?<hour>\d*:\d*),(?<lat>.*),(?<long>.*),(?<entry>.*),(?<people>.*),#(?<tag>.*)/
f = File.open("/Users/USERNAME/path/to/file.txt", encoding: 'UTF-8')
lines = f.read
f.close
lines.each_line do |line|
  if line =~ /&/
    line.gsub!(/[&]/, 'and')
  end
  if regex =~ line
    puts line
  end
end
That works, but if I change the third-to-last line to, for example, puts day, then I get an error saying that day is an undefined local variable. My understanding was that =~ automatically defined those variables.
Any idea what I'm doing wrong?
With the regex stored in a variable, you can only access the values of the named groups through a MatchData object:
regex = /(?<day>.*)\s(?<hour>\d*:\d*),(?<lat>.*),(?<long>.*),(?<entry>.*),(?<people>.*),#(?<tag>.*)/
line = "2014-03-01 08:19,47.799107662994,-75.876391553881,some comment,James,#tag"
matchdata = regex.match(line)
matchdata["day"] # => "2014-03-01"
So I would do the following instead:
if (matchdata = regex.match(line))
  puts matchdata["day"]
end
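Applied to a whole line from the data file, the same approach could look like the sketch below. The pattern is the one from the question; the constant LINE_PATTERN, the helper parse_line, and the inline sample string are hypothetical names introduced only for illustration. MatchData#named_captures assumes Ruby 2.4 or later.

```ruby
# Pattern copied from the question; the constant name is an assumption.
LINE_PATTERN = /(?<day>.*)\s(?<hour>\d*:\d*),(?<lat>.*),(?<long>.*),(?<entry>.*),(?<people>.*),#(?<tag>.*)/

# Hypothetical helper: returns a Hash of the named captures (String keys),
# or nil when the line does not match.
def parse_line(line)
  matchdata = LINE_PATTERN.match(line.gsub('&', 'and'))
  matchdata && matchdata.named_captures
end

sample = "2014-03-01 08:19,47.799107662994,-75.876391553881,some comment,James,#tag"
fields = parse_line(sample)
if fields
  puts fields["day"]   # prints 2014-03-01
  puts fields["hour"]  # prints 08:19
end
```

Keeping the MatchData (or the hash built from it) in an ordinary variable avoids depending on any implicit state set by the match.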
From the Ruby Regexp docs:
When named capture groups are used with a literal regexp on the left-hand side of an expression and the =~ operator, the captured text is also assigned to local variables with corresponding names.
So the local variables are only created when a literal regex is used.
In your case you are using a variable to reference the regex, not a literal.
For example:
regex = /(?<day>.*)/
regex =~ 'whatever'
puts day
produces NameError: undefined local variable or method `day' for main:Object, but this
/(?<day>.*)/ =~ 'whatever'
puts day
prints whatever.
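The rule quoted from the docs has two parts: the regexp must be a literal, and it must be on the left-hand side of =~. A minimal sketch of both cases follows; the helper names split_timestamp and year_via_rhs and the simplified patterns are assumptions made for illustration, not code from the question.

```ruby
# Literal regexp on the left-hand side of =~:
# day and hour are assigned as local variables.
def split_timestamp(text)
  if /(?<day>\S+)\s(?<hour>\d+:\d+)/ =~ text
    [day, hour]
  end
end

# Literal regexp on the right-hand side: String#=~ is called instead,
# so no local named year is created. The capture is only available
# through the MatchData in $~.
def year_via_rhs(text)
  if text =~ /(?<year>\d{4})/
    $~[:year]
  end
end

p split_timestamp("2014-03-01 08:19")  # prints ["2014-03-01", "08:19"]
p year_via_rhs("2014-03-01 08:19")     # prints "2014"
```

For the loop in the question, this means writing the pattern directly in the if condition, to the left of =~, rather than going through the regex variable.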
Try:
puts $~['day'] if regex =~ line
The (somewhat cryptic) $~ global variable is a MatchData instance storing the results of the last regex match, and you can access your named captures in there.
But the answer by @bjhaid is a better choice, since it saves the MatchData explicitly.
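If $~ feels too cryptic, Regexp.last_match returns the same MatchData under a more readable name. A small sketch, where the constant TIMESTAMP, the helper day_of, and the simplified pattern are hypothetical and only serve the example:

```ruby
# Hypothetical, simplified pattern held in a constant (so not a literal
# at the point of use, and no local variables are assigned by =~).
TIMESTAMP = /(?<day>\S+)\s(?<hour>\d+:\d+)/

def day_of(line)
  return nil unless TIMESTAMP =~ line
  # Regexp.last_match is the same object that $~ refers to here.
  Regexp.last_match[:day]
end

puts day_of("2014-03-01 08:19")  # prints 2014-03-01
```

Both forms read implicit state left behind by the most recent match in the current method, which is why saving the result of regex.match in a variable tends to be easier to follow.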