Filter list of dictionaries based on the value of a key
If two dictionaries in a list have the same value for a key, I would like the list filtered so that only one of those dictionaries is kept. I do not care about the second (or third) dictionary that matches.
crcs = [
{'compress_name': 'file1.bin', 'crc': '55A0669C', 'name': 'R:\\filepath\\system\\compress1.zip'},
{'compress_name': 'file3.bin', 'crc': '55A0669C', 'name': 'R:\\filepath\\system\\compress2.zip'},
{'compress_name': 'file2.bin', 'crc': '66B07710', 'name': 'R:\\filepath\\system\\compress2.zip'},
{'compress_name': 'file5.bin', 'crc': '66B07710', 'name': 'R:\\filepath\\system\\compress3.zip'}
]
The expected result is a list of two dictionaries with differing "crc" values.
[
{'compress_name': 'file1.bin', 'crc': '55A0669C', 'name': 'R:\\filepath\\system\\compress1.zip'},
{'compress_name': 'file2.bin', 'crc': '66B07710', 'name': 'R:\\filepath\\system\\compress2.zip'},
]
or any other combination of dictionaries whose CRC values are 55A0669C and 66B07710. The list of dictionaries could be 400 or more items long.
I'm using Python 3.7.
If only crc needs to be unique, then you can use:
crcs = [
    {'compress_name': 'file1.bin', 'crc': '55A0669C', 'name': 'R:\\filepath\\system\\compress1.zip'},
    {'compress_name': 'file3.bin', 'crc': '55A0669C', 'name': 'R:\\filepath\\system\\compress2.zip'},
    {'compress_name': 'file2.bin', 'crc': '66B07710', 'name': 'R:\\filepath\\system\\compress2.zip'},
    {'compress_name': 'file5.bin', 'crc': '66B07710', 'name': 'R:\\filepath\\system\\compress3.zip'}
]
crcs_all = []   # crc values seen so far
crcs_uniq = []  # first dictionary seen for each crc value
for item in crcs:
    crc = item['crc']
    if crc not in crcs_all:
        crcs_all.append(crc)
        crcs_uniq.append(item)
print(crcs_uniq)
That will give you:
[{'compress_name': 'file1.bin', 'crc': '55A0669C', 'name': 'R:\\filepath\\system\\compress1.zip'},
{'compress_name': 'file2.bin', 'crc': '66B07710', 'name': 'R:\\filepath\\system\\compress2.zip'}]
Note that the backslashes in the paths must be escaped (or the strings written as raw strings); otherwise '\f' is read as a form feed and the name prints as 'R:\x0cilepath...'.
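The same first-occurrence filter can be sketched with a set for the membership test, which avoids rescanning a list for every item. The helper name unique_by_key is hypothetical, chosen only for this illustration, and the sketch assumes every dictionary contains the key and that its values are hashable.

```python
def unique_by_key(items, key):
    """Return the first dictionary seen for each distinct value of `key`."""
    seen = set()   # values of `key` already encountered
    result = []
    for item in items:
        value = item[key]
        if value not in seen:
            seen.add(value)
            result.append(item)
    return result


crcs = [
    {'compress_name': 'file1.bin', 'crc': '55A0669C', 'name': 'R:\\filepath\\system\\compress1.zip'},
    {'compress_name': 'file3.bin', 'crc': '55A0669C', 'name': 'R:\\filepath\\system\\compress2.zip'},
    {'compress_name': 'file2.bin', 'crc': '66B07710', 'name': 'R:\\filepath\\system\\compress2.zip'},
    {'compress_name': 'file5.bin', 'crc': '66B07710', 'name': 'R:\\filepath\\system\\compress3.zip'},
]

crcs_uniq = unique_by_key(crcs, 'crc')
print([d['compress_name'] for d in crcs_uniq])  # ['file1.bin', 'file2.bin']
```

For a list of a few hundred items either version is fast enough; the set mainly matters if the list grows much larger.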
You could cast the list of dictionaries into a dataframe and then select the unique crc values. Then you could get the first occurrence of each duplicated crc value by using list.index(crc) and store those positions in a list unique_idx. We use this unique_idx to select the relevant rows from the dataframe df and then extract that data as a dict.
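import pandas as pd
df = pd.DataFrame(crcs)
unique_crcs = df.crc.unique().tolist()
all_crcs = df.crc.to_list()  # every crc value, in row order
unique_idx = []
for crc in unique_crcs:
    # list.index returns the position of the first occurrence
    unique_idx.append(all_crcs.index(crc))
dfu = df.iloc[unique_idx]
dfu.T.to_dict()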
Output:
{0: {'compress_name': 'file1.bin',
'crc': '55A0669C',
'name': 'R:\\filepath\\system\\compress1.zip'},
2: {'compress_name': 'file2.bin',
'crc': '66B07710',
'name': 'R:\\filepath\\system\\compress2.zip'}}
import pandas as pd
crcs = [{'compress_name': 'file1.bin', 'crc': '55A0669C', 'name': r'R:\filepath\system\compress1.zip'},
{'compress_name': 'file3.bin', 'crc': '55A0669C', 'name': r'R:\filepath\system\compress2.zip'},
{'compress_name': 'file2.bin', 'crc': '66B07710', 'name': r'R:\filepath\system\compress2.zip'},
{'compress_name': 'file5.bin', 'crc': '66B07710', 'name': r'R:\filepath\system\compress3.zip'} ]
df = pd.DataFrame(crcs)
print(df)
Output:
   compress_name       crc                              name
0      file1.bin  55A0669C  R:\filepath\system\compress1.zip
1      file3.bin  55A0669C  R:\filepath\system\compress2.zip
2      file2.bin  66B07710  R:\filepath\system\compress2.zip
3      file5.bin  66B07710  R:\filepath\system\compress3.zip
unique_crcs = df.crc.unique().tolist()
all_crcs = df.crc.to_list()
unique_idx = []
uniques = dict()  # maps each crc to the row position of its first occurrence
for crc in unique_crcs:
    idx = all_crcs.index(crc)
    uniques[crc] = idx
    unique_idx.append(idx)
print(uniques)
print(all_crcs)
Output:
{'55A0669C': 0, '66B07710': 2}
['55A0669C', '55A0669C', '66B07710', '66B07710']
dfu = df.iloc[unique_idx]
dfu.T.to_dict()
Output:
{0: {'compress_name': 'file1.bin',
'crc': '55A0669C',
'name': 'R:\\filepath\\system\\compress1.zip'},
2: {'compress_name': 'file2.bin',
'crc': '66B07710',
'name': 'R:\\filepath\\system\\compress2.zip'}}
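As a shorter variant of the steps above (not part of the original answer), pandas can drop the repeated rows directly with DataFrame.drop_duplicates. This is a minimal sketch assuming pandas is installed and that keeping the first row for each crc is acceptable; the variable name records is chosen only for this illustration.

```python
import pandas as pd

crcs = [
    {'compress_name': 'file1.bin', 'crc': '55A0669C', 'name': r'R:\filepath\system\compress1.zip'},
    {'compress_name': 'file3.bin', 'crc': '55A0669C', 'name': r'R:\filepath\system\compress2.zip'},
    {'compress_name': 'file2.bin', 'crc': '66B07710', 'name': r'R:\filepath\system\compress2.zip'},
    {'compress_name': 'file5.bin', 'crc': '66B07710', 'name': r'R:\filepath\system\compress3.zip'},
]

df = pd.DataFrame(crcs)

# keep='first' retains the first row for each distinct crc value
dfu = df.drop_duplicates(subset='crc', keep='first')

# orient='records' gives back a list of dictionaries, one per remaining row
records = dfu.to_dict(orient='records')
print([r['compress_name'] for r in records])  # ['file1.bin', 'file2.bin']
```

Unlike dfu.T.to_dict(), which returns a dict keyed by the original row index, orient='records' returns a plain list, which matches the shape asked for in the question.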