I have a text file and the content is,
Submitted By,Assigned,Closed
Name1,10,5
Name2,20,10
Name3,30,15
I have written a regex pattern to extract the value between the first and second commas:
^\w+,(\w+),.*$
My Python code is
import re
f=r'sample.txt'
rePat = re.compile(r'^\w+,(\w+),.*$', re.MULTILINE)
text = open(f, 'r').read()
output = re.findall(rePat, text)
print (f)
print (output)
Expected Output:
Assigned
10
20
30
But I am getting
10
20
30
Why is it missing the first line?
The problem is that \w+ matches one or more word chars (basically, letters, digits, underscores and also some diacritics), but not spaces. The first field of your header line, Submitted By, contains a space, so \w+ cannot reach the first comma on that line. I suggest matching any chars between commas with [^,\n]+ instead (the \n here is to make sure we stay within the same line).
You can use
rePat = re.compile(r'^[^,\n]+,([^,\n]+),.*$', re.MULTILINE)
Or, a bit simplified if you do not need to extract anything else:
rePat = re.compile(r'^[^,\n]+,([^,\n]+)', re.MULTILINE)
See this regex demo. Details:
^ - start of a line
[^,\n]+ - one or more chars other than , and LF
, - a comma
([^,\n]+) - Group 1: one or more chars other than , and LF.
See a Python demo:
import re
text = r"""Submitted By,Assigned,Closed
Name1,10,5
Name2,20,10
Name3,30,15"""
rePat = re.compile('^[^,\n]+,([^,\n]+),.*$', re.MULTILINE)
output = re.findall(rePat, text)
print (output)
# => ['Assigned', '10', '20', '30']
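The reasoning above can be checked with a small sketch. The header line comes from the question; the two-line string sample is a made-up input, used only to show what the \n inside the negated class protects against.

```python
import re

header = "Submitted By,Assigned,Closed"

# \w+ stops at the space in "Submitted By", so the original pattern
# never reaches the first comma and the header line is skipped.
original_hits = re.findall(r'^\w+,(\w+),.*$', header, re.MULTILINE)
print(original_hits)  # []

# The negated character class accepts the space and captures the second field.
fixed_hits = re.findall(r'^[^,\n]+,([^,\n]+)', header, re.MULTILINE)
print(fixed_hits)  # ['Assigned']

# Hypothetical input: the first line has only two fields.
sample = "a,b\nc,d,e"

# Without \n in the class, the capture runs across the line break.
without_lf = re.findall(r'^[^,]+,([^,]+)', sample, re.MULTILINE)
print(without_lf)  # ['b\nc']

# With \n excluded, each line is matched on its own.
with_lf = re.findall(r'^[^,\n]+,([^,\n]+)', sample, re.MULTILINE)
print(with_lf)  # ['b', 'd']
```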
You could also keep \w+ and add optional groups of a space followed by word characters after it, so that the first field is matched up to the first comma.
^\w+(?: \w+)*,(\w+),.*$
^ - Start of a line (because of re.MULTILINE)
\w+ - Match 1+ word chars
(?: \w+)* - Optionally repeat matching a space and 1+ word chars
,(\w+), - Match a comma, capture 1+ word chars in group 1, then match another comma
.*$ - Match the rest of the line (you could omit this part)
import re
f = r'sample.txt'
rePat = re.compile(r'^\w+(?: \w+)*,(\w+),.*$', re.MULTILINE)
text = open(f, 'r').read()
output = re.findall(rePat, text)
print(output)
Output
['Assigned', '10', '20', '30']
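As a rough comparison of the two suggested patterns, here is a sketch run on a hypothetical header that is not from the question: the first field contains a hyphen and the second contains a space. The word-based pattern only allows word characters separated by single spaces in the first field and plain word characters in the second, so it skips that line, while the negated character class accepts anything except a comma or line feed.

```python
import re

# Hypothetical input for illustration only.
text = "Submitted-By,Assigned To,Closed\nName1,10,5"

word_pat = re.compile(r'^\w+(?: \w+)*,(\w+),.*$', re.MULTILINE)
class_pat = re.compile(r'^[^,\n]+,([^,\n]+)', re.MULTILINE)

# The hyphen in the first field stops \w+, so the header line does not match.
word_hits = word_pat.findall(text)
print(word_hits)   # ['10']

# The negated class matches both lines.
class_hits = class_pat.findall(text)
print(class_hits)  # ['Assigned To', '10']
```

Which pattern fits depends on how strictly the fields should be validated: the word-based one rejects unexpected characters, the negated class takes whatever sits between the commas.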