Trouble creating multiline regex

Question

I am doing a data-cleaning job in which I have to convert a PDF file to text using iText. I need to extract some data tables from the parsed text. (The tables can appear in any order, so I cannot parse the text line by line.) I started looking into a regex solution, which I thought would be easier, but apparently not for me.

The data looks like this:

Dummy Value Data
VAL1 VAL2 Mean Calc  Calc2
(mf) (m) (rad) (rad) (rad/100m)
0.0 0.0 0.0 0.0 0.000
9224.0 9224.0 0.0 0.0 0.000
9928.0 9925.9 2.3 322.5 0.490
9885.0 9889.8 0.9 285.9 -0.953
5432.0 5432.5 3.3 95.4 -0.509
<newline>
<newline>

This is exactly the pattern I want to capture. The last two newlines mark the end of the pattern. I tried a few things, but nothing worked. I can share my regexes too, but they don't work.

Answer 1

You can use the find method of Matcher, which returns one match per table row.

Your regex would be

(?<VAL1>[-+]?\d+([.]\d+)?)\s+(?<VAL2>[-+]?\d+([.]\d+)?)\s+(?<Mean>[-+]?\d+([.]\d+)?)\s+(?<Calc>[-+]?\d+([.]\d+)?)\s+(?<Calc2>[-+]?\d+([.]\d+)?)

Your code

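// Needs: import java.util.regex.Matcher; import java.util.regex.Pattern;
// input is assumed to hold the text extracted from the PDF
Matcher m = Pattern.compile(aboveRegex).matcher(input);
while (m.find())
{
    // each match is one table row; the named groups hold its five columns
    String val1 = m.group("VAL1");
    String val2 = m.group("VAL2");
    String mean = m.group("Mean");
    String calc = m.group("Calc");
    String calc2 = m.group("Calc2");
}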

EDIT

To match a whole table at a time (all consecutive rows up to the blank line), and so multiple such tables:

([+-]?\d+([.]\d+)?( [+-]?\d+([.]\d+)?){4}(\r?\n))+(?=(\r?\n))
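A minimal sketch of how the table-level regex above might be used from Java. The class name TableBlocks and the method extract are made up for illustration, and the sketch assumes the cells in a row are separated by single spaces, as the regex itself requires. Each match is one block of rows, which is then split into lines and numbers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TableBlocks {
    // The table-level regex from the answer: one or more five-number rows,
    // followed (via lookahead) by a blank line.
    static final Pattern TABLE = Pattern.compile(
        "([+-]?\\d+([.]\\d+)?( [+-]?\\d+([.]\\d+)?){4}(\\r?\\n))+(?=(\\r?\\n))");

    // Hypothetical helper: returns one list of rows per table found in text;
    // each row is an array of five numbers.
    static List<List<double[]>> extract(String text) {
        List<List<double[]>> tables = new ArrayList<>();
        Matcher m = TABLE.matcher(text);
        while (m.find()) {
            List<double[]> rows = new ArrayList<>();
            // m.group() is the whole block of rows; split it back into lines
            for (String line : m.group().split("\\r?\\n")) {
                String[] cells = line.split(" ");
                double[] row = new double[cells.length];
                for (int i = 0; i < cells.length; i++) {
                    row[i] = Double.parseDouble(cells[i]);
                }
                rows.add(row);
            }
            tables.add(rows);
        }
        return tables;
    }

    public static void main(String[] args) {
        String sample = "(mf) (m) (rad) (rad) (rad/100m)\n"
            + "0.0 0.0 0.0 0.0 0.000\n"
            + "9224.0 9224.0 0.0 0.0 0.000\n"
            + "\n\n";
        for (List<double[]> table : extract(sample)) {
            System.out.println("table with " + table.size() + " rows");
        }
    }
}
```

Because the header lines contain no run of five space-separated numbers, they are skipped and only the numeric rows end up in each block.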

Answer 2

Try the following regex:

(\w+( +\w+)*)\r?\n(\w+( +\w+)*)\r?\n(\([\w/]+\)( \([\w/]+\))*)\r?\n((-?\d+\.\d+( -?\d+\.\d+)* *)\r?\n)*(?=(\r?\n){2})

The <newline> is \r?\n in the regex, which is written as \\r?\\n inside a Java string literal.
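As a rough sketch, the regex from this answer could be applied as follows to pull out the title, column names, units, and rows of each table. The names HeaderedTables, Table, and parse are hypothetical, and the sketch assumes the last data row is followed by two blank lines, as the lookahead in the regex expects. Since a repeated capturing group only keeps its last repetition, the rows are recovered by splitting the whole match into lines rather than by reading a group.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HeaderedTables {
    // Hypothetical holder for one parsed table (not part of the original answer).
    static class Table {
        String title;
        String[] columns;
        String[] units;
        List<String[]> rows = new ArrayList<>();
    }

    // The regex from the answer, with backslashes doubled for a Java string literal.
    static final Pattern TABLE = Pattern.compile(
        "(\\w+( +\\w+)*)\\r?\\n(\\w+( +\\w+)*)\\r?\\n(\\([\\w/]+\\)( \\([\\w/]+\\))*)\\r?\\n"
        + "((-?\\d+\\.\\d+( -?\\d+\\.\\d+)* *)\\r?\\n)*(?=(\\r?\\n){2})");

    static List<Table> parse(String text) {
        List<Table> tables = new ArrayList<>();
        Matcher m = TABLE.matcher(text);
        while (m.find()) {
            Table t = new Table();
            t.title = m.group(1);               // e.g. "Dummy Value Data"
            t.columns = m.group(3).split(" +"); // column names, one or more spaces apart
            t.units = m.group(5).split(" ");    // units such as "(rad/100m)"
            // Lines 0..2 of the match are title, column names and units; the rest are rows.
            String[] lines = m.group().split("\\r?\\n");
            for (int i = 3; i < lines.length; i++) {
                t.rows.add(lines[i].trim().split(" "));
            }
            tables.add(t);
        }
        return tables;
    }

    public static void main(String[] args) {
        String sample = "Dummy Value Data\nVAL1 VAL2 Mean Calc  Calc2\n"
            + "(mf) (m) (rad) (rad) (rad/100m)\n"
            + "0.0 0.0 0.0 0.0 0.000\n9224.0 9224.0 0.0 0.0 0.000\n\n\n";
        for (Table t : parse(sample)) {
            System.out.println(t.title + ": " + t.rows.size() + " rows");
        }
    }
}
```

The cells are kept as strings here; converting them with Double.parseDouble is a separate step if numeric values are needed.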

solution1
0 2013-07-03 20:01:10

solution2
0 2013-07-03 20:13:42
