I'm trying to match strings in the lines of a file and write each match without its first and last characters:
import os, re
infile=open("~/infile", "r")
out=open("~/out", "w")
pattern=re.compile("=[A-Z0-9]*>")
for line in infile:
    out.write(pattern.search(line)[1:-1] + '\n')
The problem is that it says Match is not subscriptable. When I try to add .group(), it says that NoneType has no attribute group, and with .groups() it complains that it needs a tuple, etc.
Any idea how to get .search to return a string?
The re.search function returns a Match object. If the match fails, re.search returns None instead. To extract the matching text, use the Match.group method.
>>> import re
>>> match = re.search("a.", "abc")
>>> if match is not None:
...     print(match.group(0))
...
ab
>>> print(re.search("a.", "a"))
None
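Applied to the pattern from the question, the None check might look like the following minimal sketch. The helper name inner_text and the sample lines are made up for illustration; the important part is that the slice is applied to the string returned by group(0), not to the Match object itself.

```python
import re

# Pattern from the question, written as a raw string
pattern = re.compile(r"=[A-Z0-9]*>")

def inner_text(line):
    """Return the first match without its leading '=' and trailing '>', or None."""
    match = pattern.search(line)
    if match is None:            # search() found nothing on this line
        return None
    return match.group(0)[1:-1]  # slice the matched string, not the Match object

# Hypothetical sample lines
lines = ["x=ABC123> y", "nothing here"]
print([inner_text(line) for line in lines])  # ['ABC123', None]
```

Lines without a match yield None here, so the caller has to decide whether to skip them or write an empty line.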
That said, it's probably a better idea to use groups to find the required section of the match:
>>> match = re.search("=([A-Z0-9]*)>", "=ABC>")  # Note the parentheses
>>> match.group(0)
'=ABC>'
>>> match.group(1)
'ABC'
This regex can then be used with findall as @WiktorStribiżew suggests.
You seem to need only the part of the string between = and >. In this case, it is much easier to put a capturing group around the alphanumeric pattern and use it with re.findall, which never returns None: it returns an empty list when there is no match, or a list of the captured texts otherwise. Also, I doubt you need empty matches, so use + instead of *:
pattern=re.compile(r"=([A-Z0-9]+)>")
                      ^         ^
and then
"\n".join(pattern.findall(line))
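Putting the pieces together, the whole loop could be sketched as below. This is one possible arrangement, not the only one: the function name extract_matches and the sample data are assumptions for illustration. It also calls os.path.expanduser, because open() does not expand "~" by itself, and it writes each captured text on its own line rather than joining them per input line.

```python
import os
import re
import tempfile

pattern = re.compile(r"=([A-Z0-9]+)>")

def extract_matches(in_path, out_path):
    """Write every captured text found in in_path to out_path, one per line."""
    # open() does not expand "~", so expand it explicitly
    in_path = os.path.expanduser(in_path)
    out_path = os.path.expanduser(out_path)
    with open(in_path) as infile, open(out_path, "w") as out:
        for line in infile:
            # findall() returns an empty list when the line has no match
            for text in pattern.findall(line):
                out.write(text + "\n")

# Demonstration with temporary files and made-up sample data
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "infile")
    dst = os.path.join(tmp, "out")
    with open(src, "w") as f:
        f.write("a=AB1> b=CD2>\nno match here\n=EF3>\n")
    extract_matches(src, dst)
    with open(dst) as f:
        print(f.read())
```

With the paths from the question, the call would be extract_matches("~/infile", "~/out").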