
Matching string between two markers that are filepaths and contain special characters

I'm trying to write a Ruby script that returns the text between two other strings. The issue is that the two marker strings contain special characters, and escaping those characters has not solved the problem.

I've tried escaping special characters, different matching patterns, and providing variables with the matching strings without much luck.

I've also tested a simplified match by using only ODS and NAME as delimiters. That seemed to work.

####Example contents of logfile 
#### 'aaaaaaaaa ODS | Filename = /tmp/bbbbbb | NAME = ccccc'

log_to_scan = 'logfile'
marker1 = 'ODS | FILENAME = /tmp/'
marker2 = ' | NAME'

contents = File.read(log_to_scan)

print contents.match(/ODS \| FILENAME = \/tmp\/(.*) \| NAME/m)[1].strip

print contents.match(/marker1(.*)marker2/m)[1].strip

Given the sample contents above, I am expecting the output to be bbbbbb. However, I am getting either nothing or a NoMethodError. I'm not sure what else to try or what mistake I'm making.

str = 'aaaaaaaaa ODS | Filename = /tmp/bbbbbb | NAME = ccccc'
marker1 = 'ODS | FILENAME = /tmp/'
marker2 = ' | NAME'

r = /(?<=#{Regexp.escape(marker1)}).*(?=#{Regexp.escape(marker2)})/i
  #=> /(?<=ODS\ \|\ FILENAME\ =\ \/tmp\/).*(?=\ \|\ NAME)/i 
str[r]
  #=> "bbbbbb" 

or

r = /#{Regexp.escape(marker1)}(.*)#{Regexp.escape(marker2)}/i
str[r,1]
  #=> "bbbbbb" 

or, if the string to be matched is known to be lower-case, or it is permissible to return that string downcased:

s = str.downcase
  #=> "aaaaaaaaa ods | filename = /tmp/bbbbbb | name = ccccc" 
m1 = marker1.downcase
  #=> "ods | filename = /tmp/" 
m2 = marker2.downcase
  #=> " | name" 
id1 = s.index(m1) + m1.size
  #=> 32
id2 = s.index(m2, id1+1) - 1
  #=> 37
str[id1..id2]
  #=> "bbbbbb"

See Regexp::escape. In #1,

(?<=#{Regexp.escape(marker1)})

is a positive lookbehind, requiring marker1 to appear immediately before the match.

(?=#{Regexp.escape(marker2)})

is a positive lookahead, requiring marker2 to immediately follow the match.
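
To see what each ingredient contributes, here is a small sketch that runs the question's two attempts next to an unescaped and an escaped interpolation. It uses the sample line from the question; the variable names (literal, case_sensitive, unescaped, escaped) are only illustrative.

```ruby
str     = 'aaaaaaaaa ODS | Filename = /tmp/bbbbbb | NAME = ccccc'
marker1 = 'ODS | FILENAME = /tmp/'
marker2 = ' | NAME'

# (a) No interpolation: the pattern looks for the literal text "marker1",
#     so match returns nil and calling [1] on it raises NoMethodError.
literal = str.match(/marker1(.*)marker2/m)

# (b) Hand-escaped but case-sensitive: "FILENAME" does not match "Filename".
case_sensitive = str.match(/ODS \| FILENAME = \/tmp\/(.*) \| NAME/m)

# (c) Interpolated but not escaped: each "|" acts as an alternation, so the
#     pattern matches just "ODS " and the capture group stays unset.
unescaped = str.match(/#{marker1}(.*)#{marker2}/i)

# (d) Interpolated, escaped and case-insensitive, as in #2 above.
escaped = str.match(/#{Regexp.escape(marker1)}(.*)#{Regexp.escape(marker2)}/i)

p literal          # nil
p case_sensitive   # nil
p unescaped[0]     # "ODS "
p unescaped[1]     # nil
p escaped[1]       # "bbbbbb"
```

Cases (a) and (b) return nil from match, which is where the NoMethodError in the question most likely comes from.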

In #3, I used the form of String#index that takes a second argument ("offset").
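
If either marker may be absent from the log, the index-based approach can be wrapped so that it returns nil instead of raising. This is only a sketch: text_between is a hypothetical helper name, and it assumes ASCII text, where downcasing does not change character positions.

```ruby
# Hypothetical helper: returns the text between marker1 and marker2,
# compared case-insensitively, or nil if either marker is missing.
def text_between(str, marker1, marker2)
  s  = str.downcase
  m1 = marker1.downcase
  m2 = marker2.downcase

  start = s.index(m1)
  return nil if start.nil?
  start += m1.size                # first character after marker1

  stop = s.index(m2, start)       # search for marker2 from that offset on
  return nil if stop.nil?

  str[start...stop]               # exclusive range: stops just before marker2
end

str     = 'aaaaaaaaa ODS | Filename = /tmp/bbbbbb | NAME = ccccc'
marker1 = 'ODS | FILENAME = /tmp/'
marker2 = ' | NAME'

p text_between(str, marker1, marker2)        # "bbbbbb"
p text_between('no markers here', marker1, marker2)  # nil
```

Slicing the original str rather than the downcased copy keeps the original casing of the extracted text.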

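Your original expression is close. The version below additionally tolerates optional whitespace around the separators. Because the log line has Filename while the pattern has FILENAME, it also needs the i (case-insensitive) flag, as used in the test further down: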

^.+?ODS(\s+)?\|(\s+)?FILENAME(\s+)?=(\s+)?\/tmp\/(.+?)(\s+)?\|(\s+)?NAME(\s+)?=(\s+)?(.+?)$

The desired outputs are in the two capturing groups of this form (groups 5 and 10; the other groups only capture optional whitespace):

(.+?)

Test

re = /^.+?ODS(\s+)?\|(\s+)?FILENAME(\s+)?=(\s+)?\/tmp\/(.+?)(\s+)?\|(\s+)?NAME(\s+)?=(\s+)?(.+?)$/mi
str = 'aaaaaaaaa ODS | Filename = /tmp/bbbbbb | NAME = ccccc'

# Print the match result
str.scan(re) do |match|
    puts match.to_s
end
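
The scan above prints all ten groups, including the whitespace ones. As a possible variant, the whitespace can be matched with \s* outside any group and the two values captured by name. The group names file and name are my own labels, not something from the question.

```ruby
# Same idea as above, but only the two values of interest are captured.
re  = /ODS\s*\|\s*FILENAME\s*=\s*\/tmp\/(?<file>.+?)\s*\|\s*NAME\s*=\s*(?<name>.+?)$/i
str = 'aaaaaaaaa ODS | Filename = /tmp/bbbbbb | NAME = ccccc'

m = str.match(re)
if m
  puts m[:file]   # bbbbbb
  puts m[:name]   # ccccc
else
  puts 'no match'
end
```

Checking m before indexing avoids the NoMethodError that appears when match returns nil.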


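How about String#scanf? (From Ruby 2.7 on, scanf is no longer part of the standard library and has to be installed as the scanf gem.)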

> require 'scanf'
> str = 'ODS | FILENAME = /tmp/ | NAME'
> str.scanf('ODS | FILENAME = %s | NAME')
=> ["/tmp/"]
