I have an image dataset in which each image shows a particular activity. Each image is named <activity>_<num>, for example educating_13.jpg, practicing_147.jpg, etc.
Now I want to select the images that show the same activity, say "cooking", and I decided to do this with the re module in Python. The script I wrote looks like this:
import os
import re

pattern = r"^(\w+)_(\d+)$"
for filename in os.listdir("."):
    root, _ = os.path.splitext(filename)
    activity = re.match(pattern, root).group(1)
    if activity == "cooking":
        pass  # do something
However, although many images are processed successfully, the script eventually aborts with an AttributeError. It seems that some of the file names do not match the specified pattern.
Did I make a mistake somewhere? Any input is appreciated.
EDIT:
By catching the exception, I found that among almost 150 thousand images there is a text file called temp.txt, and this is the one that violates the pattern.
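You can do this without a regex, using str.split (or str.rsplit). For example:

import os

for filename in os.listdir("."):
    root, _ = os.path.splitext(filename)
    if "_" in root:
        # rsplit with maxsplit=1 avoids a ValueError when the activity itself contains "_"
        activity, num = root.rsplit("_", 1)
        if activity == "cooking":
            pass  # do something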

re.match(pattern, root) returns None when the string does not match. To find the offending file, test whether re.match(pattern, root) is None, and use https://regex101.com/ to check your regexp against the image names.

If re.match(pattern, root) is None, then calling .group(1) on it raises the AttributeError. So at least one entry in your directory does not match the pattern.
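
A minimal sketch of guarding against a failed match, so that a name like temp does not abort the loop. The helper name activity_of is hypothetical and only used for illustration; the pattern is the one from the question.

```python
import re

# Same pattern as in the question; the raw string keeps the backslashes literal.
PATTERN = re.compile(r"^(\w+)_(\d+)$")

def activity_of(root):
    """Return the activity part of a name such as 'cooking_12', or None if it does not fit."""
    match = PATTERN.match(root)
    if match is None:
        # No match: there is no .group() to call, so report it to the caller instead.
        return None
    return match.group(1)
```

Inside the loop you would then compare activity_of(root) with "cooking"; entries that do not fit the pattern simply return None and are skipped.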
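It's hard to know which files are causing the problem. Note that \w matches only [a-zA-Z0-9_] for bytes patterns or when re.ASCII is set (with a str pattern in Python 3 it also accepts Unicode letters and digits), so a name containing a hyphen, a space, or other punctuation will not match.
If you post the directory listing, maybe we can spot the file.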
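
Instead of reading through the whole listing by hand, you could let Python report the offending names. This is only a sketch: unmatched_names is a hypothetical helper, and it assumes the <activity>_<num> naming from the question.

```python
import os
import re

# Pattern from the question: word characters, an underscore, then digits.
PATTERN = re.compile(r"^(\w+)_(\d+)$")

def unmatched_names(filenames):
    """Return the file names whose stem does not fit <activity>_<num>."""
    bad = []
    for filename in filenames:
        root, _ = os.path.splitext(filename)  # drop the extension, e.g. ".jpg"
        if PATTERN.match(root) is None:
            bad.append(filename)
    return bad
```

Calling unmatched_names(os.listdir(".")) in the image directory should return every entry that would have triggered the AttributeError, such as temp.txt.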