I have a dictionary (table) defined like this:
table = {{"id": [1, 2, 3]}, {"file": ['good1.txt', 'bad2.txt', 'good3.txt']}}
and I have a list of bad candidates that should be removed:
to_exclude = ['bad0.txt', 'bad1.txt', 'bad2.txt']
I want to filter the table so that every row whose file appears in to_exclude is removed. The expected result is:
filtered = {{"id": [1, 3]}, {"file": ['good1.txt', 'good3.txt']}}
I guess I could use a for loop to check the entries one by one, but I was wondering what the most efficient, Pythonic way to solve this problem is.
Could someone provide some guidance on this? Thanks.
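The most efficient thing to do is to convert to_exclude into a set, so that each membership test takes constant time on average, and then do a straightforward filter. The test has to be made on the "file" column only, and the same rows then dropped from every column so the columns stay aligned:
# assumes table is one dict of column lists: {"id": [...], "file": [...]}
# build a set so membership tests are fast
to_exclude_set = set(to_exclude)
# decide once per row, from the "file" column, whether the row is kept
keep = [file not in to_exclude_set for file in table["file"]]
# apply the same row mask to every column
table = {key: [value for value, kept in zip(values, keep) if kept]
         for key, values in table.items()}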
I'm assuming you miswrote your data structure. As written, it is a set of two dictionaries, which is impossible because dictionaries are not hashable. I'm hoping your actual data is:
data = {"id": [1, 2, 3], "file": [.......]}
a dictionary with two keys.
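To make the point about hashability concrete, here is a small sketch that tries to build the literal from the question and then builds the assumed two-key dictionary instead. The file names are copied from the question; the variable names error and data are only illustrative.

```python
try:
    # Same literal as in the question: a set whose elements are dicts
    table = {{"id": [1, 2, 3]}, {"file": ["good1.txt", "bad2.txt", "good3.txt"]}}
    error = None
except TypeError as exc:
    # Set elements must be hashable, and dicts are not
    error = exc

print(type(error).__name__, error)

# The assumed intended structure: one dict mapping column names to lists
data = {"id": [1, 2, 3], "file": ["good1.txt", "bad2.txt", "good3.txt"]}
print(len(data["id"]) == len(data["file"]))
```

The first print shows that Python raises a TypeError for the original literal, while the second confirms that the two columns of the corrected structure have the same length.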
So for me, the simplest would be:
# Create a set for faster testing
to_exclude_set = set(to_exclude)
# Create (id, file) pairs for the pairs we want to keep
pairs = [(row_id, file) for row_id, file in zip(data["id"], data["file"])
         if file not in to_exclude_set]
# Recreate the data structure
result = {'id': [row_id for row_id, _ in pairs],
          'file': [file for _, file in pairs]}
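If the table can have more than two columns, the same pairing idea can be generalized by zipping all columns into row tuples. The following is only a sketch under that assumption; filter_rows and its parameters are hypothetical names, not something from the question.

```python
def filter_rows(table, key, excluded):
    """Return a new table without the rows whose `key` value is in `excluded`.

    `table` is assumed to be a dict of equally long column lists.
    """
    excluded_set = set(excluded)          # fast membership tests
    columns = list(table)                 # column names, in insertion order
    key_index = columns.index(key)        # position of the tested column
    # Transpose the columns into row tuples and keep the rows we want
    rows = [row for row in zip(*table.values())
            if row[key_index] not in excluded_set]
    # Transpose back into a dict of column lists
    return {column: [row[i] for row in rows]
            for i, column in enumerate(columns)}


data = {"id": [1, 2, 3], "file": ["good1.txt", "bad2.txt", "good3.txt"]}
to_exclude = ["bad0.txt", "bad1.txt", "bad2.txt"]

filtered = filter_rows(data, "file", to_exclude)
print(filtered)  # {'id': [1, 3], 'file': ['good1.txt', 'good3.txt']}
```

The function builds a new dictionary and leaves data untouched, so the original table is still available if the filter has to be rerun with a different exclusion list.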