In my Ruby on Rails app, I am trying to build a parser to extract some metadata from a string.
Let's say the sample string is:
The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20).
I want to extract the substring inside the last occurrence of ( ).
So I want to get "ralph, 20" no matter how many ( ) pairs are in the string.
Is there a best way to do this string extraction in Ruby ... a regexp?
Thanks,
John
It looks like you want a sexeger. It works by reversing the string, running a reversed regex against it, and then reversing the result. Here is an example (pardon the code, I don't really know Ruby):
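#!/usr/bin/ruby
s = "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
# Reverse the string so the last "(...)" group becomes the first one.
reversed_s = s.reverse
# In the reversed string the group reads ")...(", so the regex matches ")" first.
reversed_s =~ /^.*?\)(.*?)\(/
# The capture is still backwards, so reverse it again.
result = $1.reverse
puts result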
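For what it's worth, the same idea can be wrapped in a small method that also copes with strings that contain no parentheses. This is only a sketch: the method name last_group_via_reverse is made up for illustration, and it assumes the parentheses are not nested.

```ruby
# Hypothetical helper (the name is an assumption, not from any library).
# Returns the text inside the last "(...)" pair, or nil if there is none.
def last_group_via_reverse(text)
  # String#[] with a regex and a group number returns that capture, or nil.
  captured = text.reverse[/^.*?\)(.*?)\(/, 1]
  # Reverse the capture back into reading order; pass nil through untouched.
  captured && captured.reverse
end

puts last_group_via_reverse("a (b) c (d, 1).")   # prints: d, 1
p last_group_via_reverse("no parentheses here")  # prints: nil
```

Returning nil instead of letting $1.reverse raise on a non-match is just one possible choice; raising an error may suit a parser better.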
The fact that this is getting no upvotes tells me nobody clicked through to read why you would want to use a sexeger, so here are the results of a benchmark:
do they all return the same thing?
ralph, 20
ralph, 20
ralph, 20
ralph, 20
                         user     system      total        real
scan greedy          0.760000   0.000000   0.760000 (  0.772793)
scan non greedy      0.750000   0.010000   0.760000 (  0.760855)
right index          0.760000   0.000000   0.760000 (  0.770573)
sexeger non greedy   0.400000   0.000000   0.400000 (  0.408110)
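And here is the benchmark. Note that in the run above the "right index" report called scan_greedy rather than right_index, so that row is really a second measurement of scan_greedy and says nothing about the rindex approach: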
#!/usr/bin/ruby
require 'benchmark'
# Collect every "(...)" group, keep the last one, then strip the parentheses.
def scan_greedy(s)
  result = s.scan(/\([^)]*\)/x)[-1]
  result[1 .. result.length - 2]
end
# Same idea, using a lazy quantifier instead of a negated character class.
def scan_non_greedy(s)
  result = s.scan(/\(.*?\)/)[-1]
  result[1 .. result.length - 2]
end
# No regex: slice between the last "(" and the last ")".
def right_index(s)
  s[s.rindex('(') + 1 .. s.rindex(')') - 1]
end
# Sexeger: match against the reversed string, then reverse the capture.
def sexeger_non_greedy(s)
  s.reverse =~ /^.*?\)(.*?)\(/
  $1.reverse
end
s = "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
puts "do they all return the same thing?",
  scan_greedy(s), scan_non_greedy(s), right_index(s), sexeger_non_greedy(s)
n = 100_000
Benchmark.bm(18) do |x|
  x.report("scan greedy") { n.times do; scan_greedy(s); end }
  x.report("scan non greedy") { n.times do; scan_non_greedy(s); end }
  x.report("right index") { n.times do; right_index(s); end }
  x.report("sexeger non greedy") { n.times do; sexeger_non_greedy(s); end }
end
I would try this (my regex assumes the first value is made of word characters and the second is one or more digits; adjust accordingly). scan returns all occurrences as an array, and the [-1] grabs just the last one, which seems to be what you're asking for:
>> foo = "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
=> "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
>> foo.scan(/\(\w+, ?\d+\)/)[-1]
=> "(ralph, 20)"
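If you want the text without the surrounding parentheses, one option is to put a capture group in the pattern, because scan then returns the captures instead of the whole match. The sketch below assumes the parentheses are not nested and uses a looser pattern than the one above (anything except parentheses inside the pair); last_inner_group is just an illustrative name.

```ruby
# Hypothetical helper (the name is an assumption).
# With a capture group, String#scan returns an array of capture arrays,
# e.g. [["frank,10"], ["ralph, 20"]].
def last_inner_group(text)
  groups = text.scan(/\(([^()]*)\)/)
  # &. (Ruby 2.3+) returns nil when there was no match at all.
  groups.last&.first
end

foo = "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
puts last_inner_group(foo)  # prints: ralph, 20
```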
A simple non-regex solution:
string = "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
string[string.rindex('(')..string.rindex(')')]
Example:
irb(main):001:0> string = "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
=> "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
irb(main):002:0> string[string.rindex('(')..string.rindex(')')]
=> "(ralph, 20)"
And without the parentheses:
irb(main):007:0> string[string.rindex('(')+1..string.rindex(')')-1]
=> "ralph, 20"
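One thing to watch: rindex returns nil when the character is missing, so the one-liner raises NoMethodError on a string without parentheses. A guarded version might look like the sketch below. The method name last_parenthesized is an assumption, and so is the choice to return nil when no pair is found.

```ruby
# Hypothetical helper (the name is an assumption).
# Returns the text between the last ")" and the nearest "(" before it,
# or nil when no such pair exists.
def last_parenthesized(text)
  close_idx = text.rindex(')')
  return nil if close_idx.nil?
  # The second argument makes rindex search backwards from close_idx,
  # so a "(" that appears after the last ")" is ignored.
  open_idx = text.rindex('(', close_idx)
  return nil if open_idx.nil?
  # Exclusive range: everything strictly between the two parentheses.
  text[(open_idx + 1)...close_idx]
end

string = "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
puts last_parenthesized(string)      # prints: ralph, 20
p last_parenthesized("no parens")    # prints: nil
```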