
How to extract only specific fields from a dictionary/json?

I am trying to create a new dictionary that contains only specific fields: "process_hash", "process_name", and "process_effective_reputation".

The code below sort of works, but it only extracts the first item. I would like to extract all the items, but only the "process_hash", "process_name", and "process_effective_reputation" fields.

JSON:

{'results': [{'device_name': 'faaadc2',
          'device_timestamp': '2020-10-27T00:50:46.176Z',
          'event_id': '9b1bvfaa11eb81b',
          'process_effective_reputation': 'LIST5',
          'process_hash': ['bfc7dcf5935f3avda9df8e9b6425c37a',
                           'ca9f3a2450asd518fc939a33c100b2d557f96e040f712f6dd4641ad1734e2f19'],
          'process_name': 'c:\\program files '
                          '(x86)\\to122soft\\thcaadf3\\tohossce.exe',
          'process_username': ['JOHN\\user1']},
         {'device_name': 'fk6saadc2',
          'device_timestamp': '2020-10-27T00:50:46.176Z',
          'event_id': '9b151f6e17ee11eb81b',
          'process_effective_reputation': 'LIST1',
          'process_hash': ['bfc7dcf5935f3a9df8e9baaa425c37a',
                           'ca9f3aaa506cc518fc939a33c100b2d557f96e040f712f6dd4641ad1734e2f19'],
          'process_name': 'c:\\program files '
                          '(x86)\\oaaft\\tf3\\toaaotsice.exe',
          'process_username': ['JOHN\\user2']},
         {'device_name': 'sdddsdc2',
          'device_timestamp': '2020-10-27T00:50:46.176Z',
          'event_id': '9b151f698e11eb81b',
          'process_effective_reputation': 'LIST',
          'process_hash': ['9df8ebfc7dcf5935830f3a9b6asdcd7a',
                           'ca9f3a24506cc518fdfrcv39a33c100b2d557f96e040f7124641ad1734e2f19'],
          'process_name': 'c:\\program files '
                          '(x86)\\toht\\thaa3\\toasce.exe',
          'process_username': ['JOHN\\user3']}]}

Code:

response = json.loads(r.text)
r = response['results']

selected_fields = []
for d in r:
    selected_fields.append({k: d[k] for k in ("process_hash", "process_name", "process_effective_reputation")})

new_data = []
for data in selected_fields:
    fieldnames = 'md5 sha256 process_name process_effective_reputation'.split()
    row = {'md5': data['process_hash'][0], 'sha256': data['process_hash'][1]}
    # Copy process_name and process_effective_reputation fields.
    row.update({fieldname: data[fieldname] for fieldname in fieldnames[-2:]})
    new_data.append(row)
return new_data

UPDATE:

Thank you, Lauren Boland, for the code, which worked, and Nattelar for the explanation.

I have attached the new code. I am trying to split the process_hash field into two fields, so that the result has "md5", "sha256", "process_name", and "process_effective_reputation". I've tried the code above, but I get:

row = {'md5': data['process_hash'][0], 'sha256': data['process_hash'][1]}
IndexError: list index out of range

Thank you

You were overwriting the selected_fields dictionary in every iteration of your for loop.

Try making it a list instead. It will return a list of dictionaries.

selected_fields = []
for d in r:
    selected_fields.append({k: d[k] for k in ("process_hash", "process_name", "process_effective_reputation")})
return selected_fields
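
For reference, a self-contained version of the loop above might look like the sketch below. The function name select_fields, the WANTED constant, and the shortened sample values are assumptions made for illustration; they are not from the original post.

```python
# Keys to keep from each result (taken from the question).
WANTED = ("process_hash", "process_name", "process_effective_reputation")


def select_fields(results, wanted=WANTED):
    """Return one trimmed dict per item in results, keeping only the wanted keys."""
    selected = []
    for item in results:
        # Build a new dict per item and append it, so earlier items are kept.
        selected.append({key: item[key] for key in wanted})
    return selected


# Shortened, made-up sample in the same shape as the question's data.
response = {
    "results": [
        {"device_name": "a", "process_effective_reputation": "LIST5",
         "process_hash": ["md5-a", "sha256-a"], "process_name": "a.exe"},
        {"device_name": "b", "process_effective_reputation": "LIST1",
         "process_hash": ["md5-b", "sha256-b"], "process_name": "b.exe"},
    ]
}

selected_fields = select_fields(response["results"])
print(len(selected_fields))  # 2
```

Note that item[key] raises KeyError if a result lacks one of the wanted keys; item.get(key) would return None instead, if that suits the data better.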

Even though there's already an answer, I want to point out what's happening here.

When you assign a new value to a variable, you overwrite the value that was there before, which is why your code is not working. Even if you tried selected_fields.update(), it wouldn't work, because the key names are the same for every item, so the values of those keys would be overwritten each time.
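
The difference between the three approaches can be seen in a small sketch. The records list here is made-up sample data, and the variable names are only for illustration:

```python
records = [
    {"process_name": "a.exe", "process_effective_reputation": "LIST5"},
    {"process_name": "b.exe", "process_effective_reputation": "LIST1"},
]

# 1. Rebinding the variable: each pass replaces the previous dict.
by_assignment = {}
for record in records:
    by_assignment = {"process_name": record["process_name"]}

# 2. dict.update(): the key is the same each time, so its value is replaced.
by_update = {}
for record in records:
    by_update.update({"process_name": record["process_name"]})

# 3. list.append(): every record gets its own dict, so nothing is lost.
by_append = []
for record in records:
    by_append.append({"process_name": record["process_name"]})

print(by_assignment)  # {'process_name': 'b.exe'}
print(by_update)      # {'process_name': 'b.exe'}
print(by_append)      # [{'process_name': 'a.exe'}, {'process_name': 'b.exe'}]
```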

When doing this kind of thing, you generally have to keep the original type of the structure. Here the value under 'results' in your response is a list, so selected_fields should be a list as well.
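
On the IndexError from the question's update: data['process_hash'][1] fails whenever a result's process_hash list has fewer than two entries. The sample in the question always has two, so this is only a guess about what the real response contains. A minimal sketch that tolerates short or missing lists is below; the helper name split_hashes and the None placeholder are hypothetical choices, and it keeps the question's assumption that the first entry is the MD5 and the second is the SHA-256.

```python
def split_hashes(record):
    """Build a flat row with separate md5 and sha256 fields from one record."""
    # Assumes process_hash, when present, is a list (as in the question's data).
    hashes = record.get("process_hash") or []
    # Pad with None so positions 0 and 1 always exist.
    md5, sha256 = (list(hashes) + [None, None])[:2]
    return {
        "md5": md5,
        "sha256": sha256,
        "process_name": record.get("process_name"),
        "process_effective_reputation": record.get("process_effective_reputation"),
    }


# Made-up records: a full one, one with a single hash, one with no hash at all.
records = [
    {"process_hash": ["md5-a", "sha256-a"], "process_name": "a.exe",
     "process_effective_reputation": "LIST5"},
    {"process_hash": ["md5-b"], "process_name": "b.exe",
     "process_effective_reputation": "LIST1"},
    {"process_name": "c.exe", "process_effective_reputation": "LIST"},
]

new_data = [split_hashes(record) for record in records]
for row in new_data:
    print(row)
```

Printing len(data['process_hash']) for each item before indexing is a quick way to confirm whether short lists are really the cause.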
