I have a list of file name strings similar to this (but very long):
list = ['AB8372943.txt', 'test.pdf', '123485940.docx', 'CW2839502.txt', 'AB1234567.txt', '283AB.txt']
I am looking to make another list out of this one by taking only the strings that match 4 conditions:
Therefore in this case the desired result would be this list:
list2 = ['AB8372943.txt', 'AB1234567.txt']
So far I know that to check for a 7 digit number I can use:
list2 = [i for i in list if re.findall(r"\d{7}", i)]
I also know how to look for substrings within the strings. But it isn't enough for the strings to just contain the substrings: they need to start and end with specific ones, with a 7-digit number in the middle and nothing else. Is there a way to do this?
Thank you so much in advance!
To also make sure it starts with AB and ends with .txt:
import re
my_list = ['AB8372943.txt', 'test.pdf', '123485940.docx', 'CW2839502.txt', 'AB1234567.txt', '283AB.txt']
# The dot is escaped so that it matches only a literal "."
my_list2 = [i for i in my_list if re.findall(r"^AB\d{7}\.txt$", i)]
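A small check of how the dot behaves in a pattern like this may help. The two file names below are made up for illustration. An unescaped . matches any character, so writing it as \. is what restricts the match to a literal dot:

```python
import re

# Hypothetical names: the second one has no real ".txt" extension.
names = ['AB1234567.txt', 'AB1234567xtxt']

# "." matches any character, so both names pass.
loose = [n for n in names if re.search(r"^AB\d{7}.txt$", n)]
# "\." matches only a literal dot, so only the first name passes.
strict = [n for n in names if re.search(r"^AB\d{7}\.txt$", n)]

print(loose)   # ['AB1234567.txt', 'AB1234567xtxt']
print(strict)  # ['AB1234567.txt']
```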
You should avoid using a built-in name like list as a variable name. Also, since you want to match the whole string rather than search for a substring inside it, you can use re.match, which starts matching at the beginning of the string.
AB\d{7}\.txt\Z
The pattern matches:
AB\d{7} matches AB followed by 7 digits
\.txt matches .txt (note that the dot is escaped)
\Z matches the end of the string
For example:
import re
lst = ['AB8372943.txt', 'test.pdf', '123485940.docx', 'CW2839502.txt', 'AB1234567.txt', '283AB.txt']
lst2 = [s for s in lst if re.match(r"AB\d{7}\.txt\Z", s)]
print(lst2)
Output
['AB8372943.txt', 'AB1234567.txt']
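As an alternative sketch, re.fullmatch (available since Python 3.4) requires the entire string to match, so no end anchor is needed. The snippet also uses a made-up name ending in a newline to show one way $ and \Z differ: without re.MULTILINE, $ also matches just before a trailing newline, while \Z matches only at the very end of the string.

```python
import re

lst = ['AB8372943.txt', 'test.pdf', '123485940.docx', 'CW2839502.txt', 'AB1234567.txt', '283AB.txt']

# Compile once; fullmatch succeeds only if the whole string fits the pattern.
pattern = re.compile(r"AB\d{7}\.txt")
lst2 = [s for s in lst if pattern.fullmatch(s)]
print(lst2)  # ['AB8372943.txt', 'AB1234567.txt']

# Hypothetical input with a trailing newline, e.g. a line read from a file.
tricky = 'AB1234567.txt\n'
dollar_ok = bool(re.match(r"AB\d{7}\.txt$", tricky))     # "$" tolerates the trailing newline
strict_ok = bool(re.match(r"AB\d{7}\.txt\Z", tricky))    # "\Z" does not
full_ok = bool(pattern.fullmatch(tricky))                # fullmatch does not either
print(dollar_ok, strict_ok, full_ok)  # True False False
```

If the names come from reading a file line by line, stripping the newline first (for example with s.rstrip('\n')) avoids the difference altogether.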