I am trying to extract measurements from file names, and they are very inconsistent; for example:
FSTCAR.5_13UNC_1.00
FSTCAR.5_13UNC_1.00GR5P
FSTCAR.5_13UNC_1.00SS316
I have to be able to match all numbers (with or without decimals, and with or without leading zeros). I think I have that working with this:
/\d*\.?\d+/i
However, I also want to be able to exclude numbers preceded by SS or GR. Something like this seems to partially work:
/(?<!GR|SS)\d*\.?\d+/i
That will exclude the 5 from FSTCAR.5_13UNC_1.00GR5P above, but only the first digit after SS or GR is protected: in FSTCAR.5_13UNC_1.00SS316 the 3 is skipped, yet the 16 from the 316 still matches. I am doing this in Ruby.
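The behaviour described above can be reproduced with a short Ruby sketch. The constant name `NAIVE` is made up for illustration, and the `i` flag is left out on the assumption that the file names are always uppercase, as in the samples.

```ruby
# The lookbehind-only pattern from the question, minus the i flag
# (assumption: file names are uppercase, as in the samples).
NAIVE = /(?<!GR|SS)\d*\.?\d+/

['FSTCAR.5_13UNC_1.00GR5P', 'FSTCAR.5_13UNC_1.00SS316'].each do |name|
  # String#scan returns every non-overlapping match as an array of strings.
  puts "#{name}: #{name.scan(NAIVE).inspect}"
end
# FSTCAR.5_13UNC_1.00GR5P: [".5", "13", "1.00"]
# FSTCAR.5_13UNC_1.00SS316: [".5", "13", "1.00", "16"]
```

The lookbehind only rejects a match that starts directly after SS or GR, so the engine simply retries one character later and picks up the rest of the number.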
To fix the SS and GR exclusion, try this:
/(?<!GR|SS)[\d\.]+/i
I'm not sure exactly what your layout is, but this may be faster for your negative lookbehind (note that it also rejects any other two-letter combination of G, R and S, such as RS or GS):
(?<![GRS]{2})
Edit: the + alone still isn't enough, because the engine can simply start the match one digit later.
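You might need to use two regexes: one to remove the GR/SS numbers, and one to match what is left (note: I'm not very familiar with Ruby):
val = val.gsub(/[GRS]{2}[\d.]+/, '')  # a regex literal, not a quoted string; gsub returns a new string
val =~ /[\d.]+/                       # index of the first remaining number, or nil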
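As a rough sketch of that two-pass idea which collects every remaining number instead of only the first, something like the following might work. The method name `measurements_two_pass` is hypothetical, the stricter `(?:GR|SS)` prefix is swapped in for `[GRS]{2}`, and uppercase file names are assumed.

```ruby
# Two-pass sketch: strip the GR/SS grades first, then scan what is left.
# measurements_two_pass is a hypothetical helper name, not from the thread.
def measurements_two_pass(name)
  # Pass 1: drop "GR" or "SS" together with the digits and dots that follow.
  stripped = name.gsub(/(?:GR|SS)[\d.]+/, '')
  # Pass 2: collect every number, with or without a leading digit.
  stripped.scan(/\d*\.?\d+/)
end

p measurements_two_pass('FSTCAR.5_13UNC_1.00SS316')
# [".5", "13", "1.00"]
```

The trade-off is a second pass over the string in exchange for not needing any lookbehind at all.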
Any time you have to pick apart floating-point number strings, it's not a trivial feat.
This just takes your last regex and adds some extra conditions to the lookbehind.
They ensure that the engine can't skip the first part of a number just to make the regex match.
# (?<!GR)(?<!SS)(?<![.\d])\d*\.?\d+
# (?<! GR | SS | [.\d] )
(?<! GR )
(?<! SS )
(?<! [.\d] )
\d* \.? \d+
Perl test case
@ary = (
'FSTCAR.5_13UNC_1.00 ',
'FSTCAR.5_13UNC_1.00GR5P',
'FSTCAR.5_13UNC_1.00SS316'
);
foreach $fname (@ary)
{
print "filename: $fname\n";
while ( $fname =~ /(?<!GR)(?<!SS)(?<![.\d])\d*\.?\d+/ig ) {
print " found $&\n";
}
}
Output >>
filename: FSTCAR.5_13UNC_1.00
found .5
found 13
found 1.00
filename: FSTCAR.5_13UNC_1.00GR5P
found .5
found 13
found 1.00
filename: FSTCAR.5_13UNC_1.00SS316
found .5
found 13
found 1.00
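Since the question is about Ruby, here is a hedged port of the Perl test above. The names `NUMBER` and `measurements` are invented for this sketch, and the `i` flag is omitted on the assumption that the file names are uppercase.

```ruby
# Same pattern as the Perl test, written in extended (x) mode.
# NUMBER and measurements are illustrative names, not from the thread.
NUMBER = /
  (?<!GR)      # not directly after GR
  (?<!SS)      # not directly after SS
  (?<![.\d])   # not in the middle of a number
  \d*\.?\d+    # optional integer part, optional dot, required digits
/x

def measurements(name)
  name.scan(NUMBER)
end

names = [
  'FSTCAR.5_13UNC_1.00',
  'FSTCAR.5_13UNC_1.00GR5P',
  'FSTCAR.5_13UNC_1.00SS316'
]

names.each do |name|
  puts "filename: #{name}"
  measurements(name).each { |m| puts " found #{m}" }
end
```

Each of the three names should yield .5, 13 and 1.00, matching the Perl output.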