
Quickly mapping/modifying values in a large list of Python dicts?

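I have some code that I'm trying to speed up. Maybe what I have is already right, but whenever I ask on StackOverflow, somebody usually knows a clever little trick ("use map!", "try this lambda", or "import itertools"), and I'm hoping somebody can help here. This is the section of code I'm concerned with: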

#slowest part from here....
for row_dict in json_data:
    row_dict_clean = {}
    for key, value in row_dict.items():
        value_clean = get_cleantext(value)
        row_dict_clean[key] = value_clean
    json_data_clean.append(row_dict_clean)
    total += 1
#to here...

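The concept is pretty simple. I have a list, several million items long, that contains dictionaries, and I need to run every value through a cleaner. I then end up with a list of cleaned dictionaries. Is there any clever iteration tool I'm not aware of that I should be using? Here is a more complete MVE to help: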

def get_json_data_clean(json_data):
    json_data_clean = []
    total = 0
    #slowest part from here....
    for row_dict in json_data:
        row_dict_clean = {}
        for key, value in row_dict.items():
            value_clean = get_cleantext(value)
            row_dict_clean[key] = value_clean
        json_data_clean.append(row_dict_clean)
        total += 1
    #to here...
    return json_data_clean

def get_cleantext(value):
    #do complex cleaning stuffs on the string, I can't change what this does
    value = value.replace("bad", "good")
    return value

json_data = [
    {"key1":"some bad",
     "key2":"bad things",
     "key3":"extra bad"},
    {"key1":"more bad stuff",
     "key2":"wow, so much bad",
     "key3":"who dis?"},
    # a few million more dictionaries
    {"key1":"so much bad stuff",
     "key2":"the bad",
     "key3":"the more bad"},
]

json_data_clean = get_json_data_clean(json_data)
print(json_data_clean)

Whenever I nest for loops, a little bell rings in my head telling me there is probably a better way. Any help is appreciated!

Definitely ask the smart folks at https://codereview.stackexchange.com/, but as a quick fix you can map() your transformation function over the list of dicts, like so:

def clean_text(value: str) -> str:
    # ...
    return value.replace("bad", "good")

def clean_dict(d: dict) -> dict:
    return {k: clean_text(v) for k, v in d.items()}


json_data = [
    {"key1":"some bad",
     "key2":"bad things",
     "key3":"extra bad"},
    {"key1":"more bad stuff",
     "key2":"wow, so much bad",
     "key3":"who dis?"},
    # a few million more dictionaries
    {"key1":"so much bad stuff",
     "key2":"the bad",
     "key3":"the more bad"},
]

x = list(map(clean_dict, json_data))

One thing left out is your total counter, but it never seems to leave get_json_data_clean() anyway.
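If the count is still needed, it can be recovered from the result, because mapping yields exactly one cleaned dict per input dict. The following is only a minimal sketch: it reuses clean_text and clean_dict from the answer, the two-row sample list is made up for illustration, and the nested comprehension is shown as an equivalent spelling of list(map(...)), not as a claim that either form is faster.

```python
def clean_text(value: str) -> str:
    # Stand-in for the real (unchangeable) cleaning function.
    return value.replace("bad", "good")

def clean_dict(d: dict) -> dict:
    # Build a new dict with every value cleaned; keys are untouched.
    return {k: clean_text(v) for k, v in d.items()}

# Shortened, illustrative sample data.
json_data = [
    {"key1": "some bad", "key2": "bad things"},
    {"key1": "more bad stuff", "key2": "who dis?"},
]

# Two equivalent ways to produce the cleaned list.
via_map = list(map(clean_dict, json_data))
via_comprehension = [
    {k: clean_text(v) for k, v in row.items()} for row in json_data
]

# The old "total" counter is simply the length of the result.
total = len(via_map)
print(via_map == via_comprehension, total)
```

Both forms still call the cleaning function once per value, so any real speedup would have to be measured on the actual data.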

Not sure why @Daniel Gale suggested filter(), since you aren't eliminating any values, just transforming them.
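To illustrate the difference: map() produces one output item per input item, whereas filter() keeps only the items for which a predicate is true and returns them unchanged. A small sketch, where the sample rows and the has_bad predicate are hypothetical and exist only for this comparison:

```python
rows = [{"key1": "some bad"}, {"key1": "who dis?"}]

def clean_dict(d: dict) -> dict:
    # Transform every value; the number of rows is preserved.
    return {k: v.replace("bad", "good") for k, v in d.items()}

def has_bad(d: dict) -> bool:
    # Hypothetical predicate: True if any value contains "bad".
    return any("bad" in v for v in d.values())

mapped = list(map(clean_dict, rows))    # same length as rows, values changed
filtered = list(filter(has_bad, rows))  # subset of rows, values unchanged

print(len(mapped), len(filtered))
```

Since the question needs every row cleaned and none dropped, map() (or a comprehension) matches the task and filter() does not.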

