matching and splitting

Question

I have a file1 with contents:

abc_1 (qst_0) bndk
cgn32 (mn_r_1) mncp
 dmj_2 (yst) pr1f

I want to match and split the file line by line, for which I use the following code:

path = sys.argv[1]
 with open(path) as f:
  data = f.read()
 unit = re.split(r"(.+\(.*\).+)", data)
 print(*unit)

It is able to split the first 2 lines, but on the 3rd line it gives an error: IndentationError: unexpected indent at line 3 of file1. Could someone help me out?

Answer 1

You can try this:

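import sys

path = sys.argv[1]
with open(path) as f:
    data = f.read()
# splitlines() avoids a trailing empty entry; split() with no argument also ignores leading spaces
unit = [line.split() for line in data.splitlines()]
print(unit)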

Output (wrapped here for readability):

[['abc_1', '(qst_0)', 'bndk'],
 ['cgn32', '(mn_r_1)', 'mncp'],
 ['dmj_2', '(yst)', 'pr1f']]

Answer 2

What is an indentation error?

An indentation error in Python means that the leading whitespace of a line does not fit the block structure the parser expects.

In your script, the indentation around data = f.read() is inconsistent. Python reports this while parsing the script, before running anything, so your code never read a single line from your input file.
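A small sketch of this point, assuming a snippet indented like the one in the question and held in a string (the name `snippet` is made up for illustration). Parsing it with the built-in compile() already fails, even though nothing is executed and no file is opened:

```python
# Hypothetical snippet: the second line is indented although no block was opened.
snippet = "path = 'file1'\n with open(path) as f:\n  data = f.read()\n"

try:
    # compile() only parses the source; it does not run it.
    compile(snippet, "<snippet>", "exec")
    error = None
except IndentationError as exc:
    error = exc

# Report the kind of error and the parser's message.
print(type(error).__name__, "-", error.msg)
```

The error refers to a line of the script itself, which is why the contents of the input file are not involved.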

Make sure every line inside a block is indented consistently; 4 spaces per level is the usual convention in Python. The following should work.

import re
import sys

path = sys.argv[1]

with open(path) as fp:
    for line in fp:
        print(re.split(r"(.+\(.*\).+)", line))

(or)

import re
import sys

path = sys.argv[1]

with open(path) as fp:
    split_lines = [re.split(r"(.+\(.*\).+)", line) for line in fp]

print(split_lines)

NOTE:

You haven't mentioned on what basis you want to split the lines: by spaces, "_", or ")"?
Your current regex matches the whole line, so it does not break the line into parts.
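To make the note concrete, here is a minimal sketch. It first shows what re.split returns with the original pattern, then one possible alternative, assuming the goal is the three fields of each line with the parentheses dropped (the question does not say this). `FIELDS` and `parse_line` are names invented for this example.

```python
import re

line = "abc_1 (qst_0) bndk\n"

# The original pattern captures the entire line, so nothing is really split.
whole = re.split(r"(.+\(.*\).+)", line)
print(whole)  # ['', 'abc_1 (qst_0) bndk', '\n']

# Assumed goal: one group per field, with the middle field in parentheses.
FIELDS = re.compile(r"\s*(\S+)\s+\((.*?)\)\s+(\S+)")

def parse_line(text):
    """Return the three fields as a tuple, or None if the line does not match."""
    m = FIELDS.match(text)
    return m.groups() if m else None

print(parse_line(line))                 # ('abc_1', 'qst_0', 'bndk')
print(parse_line(" dmj_2 (yst) pr1f"))  # ('dmj_2', 'yst', 'pr1f')
```

If plain whitespace splitting is all that is needed, line.split() is simpler; the regex only pays off when the parentheses should be validated or removed.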
