I am trying to extract strings from a list of files produced by the ls command. I have these two cases:
"filename"
"link File" -> "filename"
In Python, I wrote this code:
print(re.findall( r'"(.*?)"', linha))
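The REs I tried, with the result for each of the two cases:
"(.*?)"
  first case  -: match ['filename'] CORRECT
  second case -: match ['link File" -> "filename'] WRONG
"(.*?)" -> "(.*?)"
  first case  -: match [''] WRONG
  second case -: match ['link File', 'filename'] CORRECT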
What single RE gives this result for both cases?
  first case  -: match ['filename', ''] CORRECT
  second case -: match ['link File', 'filename'] CORRECT
You have an optional section, so use a ? to match it if it is there. Next, you want to exclude " from your matches, since your targets are surrounded by quotes. This makes it easier for the regex engine to match your string:
"([^"]*)"(?: -> "([^"]*)")?
The (?:...) grouping is non-capturing; the ? after it makes it optional.
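To see why the non-capturing form matters, here is a small comparison sketch. The variable names (`line`, `non_capturing`, `capturing`) are only illustrative and the sample string is the second case from the question:

```python
import re

line = '"link File" -> "filename"'

# Non-capturing (?:...): only the two quoted names end up in the result.
non_capturing = re.findall(r'"([^"]*)"(?: -> "([^"]*)")?', line)
print(non_capturing)  # [('link File', 'filename')]

# Plain capturing (...): the whole optional section becomes an extra group.
capturing = re.findall(r'"([^"]*)"( -> "([^"]*)")?', line)
print(capturing)  # [('link File', ' -> "filename"', 'filename')]
```

With a plain capturing group you would have to skip the middle element of every tuple, which is why (?:...) is the more convenient choice here.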
When you use this with re.findall(), you'll always get tuples with two groups, the second one being empty for those inputs where -> "..." is missing:
>>> import re
>>> re.findall(r'"([^"]*)"(?: -> "([^"]*)")?', '"filename"')
[('filename', '')]
>>> re.findall(r'"([^"]*)"(?: -> "([^"]*)")?', '"link File" -> "filename"')
[('link File', 'filename')]
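If you process the listing one line at a time, something along these lines may be handy. It is only a sketch: the `parse_entry` helper, the `ENTRY_RE` name and the sample lines are hypothetical, and it assumes each line holds at most one entry. Note that with re.search() an unmatched optional group is None rather than '', so `groups(default='')` is used to get the same shape as the findall() output:

```python
import re

# Same pattern as above: a quoted name, optionally followed by -> "target".
ENTRY_RE = re.compile(r'"([^"]*)"(?: -> "([^"]*)")?')

def parse_entry(line):
    """Return (name, target) for one line, or None if nothing is quoted.

    target is '' when the line has no -> "..." part.
    """
    match = ENTRY_RE.search(line)
    if match is None:
        return None
    # default='' replaces the None of an unmatched optional group.
    return match.groups(default='')

# Hypothetical sample input, taken from the two cases in the question.
sample_lines = ['"filename"', '"link File" -> "filename"']
for line in sample_lines:
    print(parse_entry(line))
# ('filename', '')
# ('link File', 'filename')
```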
I've created an online demonstration with Regex101 (which, for some reason, requires us to explicitly escape double quotes, not something that Python actually would require). It contains a breakdown of the pattern on the right-hand side under the 'Explanation' banner.