Multiline file grep

Question

I have a file that contains sections like this:

flags...id, description, used, color
AB, "Abandoned", 0, 13168840
DM, "Demolished", 0, 15780518
OP, "Operational", 0, 15780518...

where ... represents a series of control characters, e.g. ETX and STX. I am trying to grab multiple lines from the file.

I am using the following code:

f = File.open(somePath)
r = f.grep(/flags.+id, description, used, color(?<data>(?:.|\s)*?)[\x00-\x08]/)

This code does not work, and I do not understand why. The documentation for grep seems to imply that the file is processed line by line. I suspect this may be why the regular expression isn't returning any results.

Am I correct that grep uses line-by-line parsing? Is this why my regex isn't working as intended?
Would it be better to use file.each_line to capture the data?
Are there better/cleaner alternatives to all of the above?

Answer 1

String#scan comes to the rescue:

File.read('/path/to/file').scan(
  /flags.+id, description, used, color(?<data>(?:.|\s)*?)[\x00-\x08]/m
)
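As a minimal sketch of how this behaves, the snippet below runs the same pattern against an in-memory string instead of a file. The sample string is an assumption: "\x02" (STX) and "\x03" (ETX) stand in for the control characters elided as ... in the question, and the variable names are made up for illustration.

```ruby
# Hypothetical sample mimicking the file from the question; the exact
# control characters around the section are an assumption.
sample = "flags\x02\x03id, description, used, color\n" \
         "AB, \"Abandoned\", 0, 13168840\n" \
         "DM, \"Demolished\", 0, 15780518\n" \
         "OP, \"Operational\", 0, 15780518\x03\x02"

pattern = /flags.+id, description, used, color(?<data>(?:.|\s)*?)[\x00-\x08]/m

# With a single capture group, scan returns an array of one-element arrays,
# so flatten gives one string per matched section.
sections = sample.scan(pattern).flatten

# Split the captured section into its individual rows.
rows = sections.first.strip.lines.map(&:chomp)
puts rows
```

Note that with the m modifier the greedy .+ after flags can run across newlines. If the file contains several such sections, a lazy .+? is probably what is wanted there.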

Answer 2

You need to enable multiline mode. The . metacharacter doesn't match newlines by default.

From the documentation (https://ruby-doc.org/core-2.1.1/Regexp.html):

/./ - Any character except a newline.
/./m - Any character (the m modifier enables multiline mode)
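A small illustration of the difference, using a made-up two-line string. Ruby's m modifier corresponds to what many other regex flavours call "dotall" mode.

```ruby
text = "first\nsecond"

# Without m, . refuses to cross the newline, so there is no match.
without_m = text[/first.second/]

# With m, . also matches the newline.
with_m = text[/first.second/m]

p without_m # => nil
p with_m    # => "first\nsecond"
```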

Answer 3

Am I correct that grep uses line-by-line parsing?

Yes. Try this on your file:

r = File.open(somePath) do |f|
  f.grep(/[A-Z]{2},/)
end

puts r
# => AB, "Abandoned", 0, 13168840
#    DM, "Demolished", 0, 15780518
#    OP, "Operational", 0, 15780518

puts r.inspect
# => ["AB, \"Abandoned\", 0, 13168840\n", "DM, \"Demolished\", 0, 15780518\n", "OP, \"Operational\", 0, 15780518\n"]
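The same behaviour can be reproduced without a file. This is a sketch using StringIO, which yields lines the same way an open File does. The content is invented for illustration.

```ruby
require "stringio"

io = StringIO.new("header\nAB, 1\nDM, 2\n")

# grep comes from Enumerable and tests each yielded line on its own.
per_line = io.grep(/[A-Z]{2},/)
p per_line # => ["AB, 1\n", "DM, 2\n"]

io.rewind
# A pattern that has to span two lines never sees both lines at once.
across_lines = io.grep(/AB, 1\nDM, 2/)
p across_lines # => []

io.rewind
# Matching against the whole content as a single string does work.
whole = io.read.scan(/AB, 1\nDM, 2/)
p whole # => ["AB, 1\nDM, 2"]
```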

Is this why my regex isn't working as intended?

Not only that. What are you searching for with [\x00-\x08]? An ASCII character or a hexadecimal one?

Would it be better to use file.each_line to capture the data?

File#grep sounds good.
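For comparison, here is a rough sketch of what an each_line approach could look like. It is only one possible reading of the question. It assumes the header line contains the text "id, description, used, color" and that the section ends at the first control character in the \x00-\x08 range. The sample data and variable names are hypothetical.

```ruby
require "stringio"

# Hypothetical stand-in for the file; the control characters are assumed.
io = StringIO.new(
  "flags\x02\x03id, description, used, color\n" \
  "AB, \"Abandoned\", 0, 13168840\n" \
  "DM, \"Demolished\", 0, 15780518\n" \
  "OP, \"Operational\", 0, 15780518\x03\x02\n" \
  "unrelated trailing line\n"
)

rows = []
in_section = false

io.each_line do |line|
  unless in_section
    # Start collecting on the line after the header.
    in_section = line.include?("id, description, used, color")
    next
  end

  # Keep the text before the first control character, if there is one.
  text, terminator = line.chomp.split(/[\x00-\x08]/, 2)
  rows << text unless text.nil? || text.empty?

  # A control character on this line marks the end of the section.
  in_section = false if terminator
end

puts rows
```

This needs more bookkeeping than a single scan over the whole content, but it avoids reading the entire file into memory.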

solution1
1 ACCEPTED 2017-10-25 06:04:55

solution2
0 2017-10-25 00:06:26

solution3
0
