Pandas: Fill a dictionary with dataframes depending on a switch

Background: I have a few dataframes that may be turned on or off with switches. I want to fill a dictionary with each of the turned-on dataframes. Then I want to be able to loop over the dataframes in that dictionary.

Issue: I don't know how to dynamically build my dictionary so that it only includes the dataframes whose switches are turned on.

What I've Tried:

import pandas as pd

sw_a = True
sw_b = False
sw_c = True

a = pd.DataFrame({'IDs':[1234,5346,1234,8793,8793],
                   'Cost':[1.1,1.2,1.3,1.4,1.5],
                    'Names':['APPLE','Orange','STRAWBERRY','Grape','Blue']}) if sw_a == True else []
b = pd.DataFrame({'IDs':[1,2],
                   'Cost':[1.1,1.2],
                    'Names':['APPLE1','Blue1']}) if sw_b == True else []
c = pd.DataFrame({'IDs':[12],
                  'Cost':[1.5],
                    'Names':['APPLE2']}) if sw_c == True else []
total = {"first":a,"second":b,"third":c}

for df in total:
    temp_cost = sum(total[df]['Cost'])
    print(f'The number of fruits for {df} is {len(total[df])} and the cost is {temp_cost}')

The above does not work because the dictionary always contains all three keys: if a switch is off, the value is an empty list rather than the entry being excluded entirely, so the loop fails when it reaches that entry.
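To make the failure concrete, here is a minimal sketch that needs no pandas. It assumes only what the code above shows: a switched-off entry holds an empty list, and the loop then indexes it with the string 'Cost'. The variable names are illustrative.

```python
# A switched-off entry in the question's dictionary is just an empty list.
switched_off = []

try:
    # The same lookup the loop performs on every dictionary value.
    switched_off['Cost']
except TypeError as exc:
    error_message = str(exc)
    print(f'TypeError: {error_message}')
```

Lists only accept integer or slice indices, so the lookup raises before any cost can be summed.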

Consider something like this.

sw_a = True
sw_b = False
sw_c = True

a = pd.DataFrame({'IDs':[1234,5346,1234,8793,8793],
                   'Cost':[1.1,1.2,1.3,1.4,1.5],
                    'Names':['APPLE','Orange','STRAWBERRY','Grape','Blue']})
b = pd.DataFrame({'IDs':[1,2],
                   'Cost':[1.1,1.2],
                    'Names':['APPLE1','Blue1']})
c = pd.DataFrame({'IDs':[12],
                  'Cost':[1.5],
                    'Names':['APPLE2']})

total = {}
# add each dataframe only when its switch is on
if sw_a:
    total['sw_a'] = a
if sw_b:
    total['sw_b'] = b
if sw_c:
    total['sw_c'] = c
print(total)

for df in total:
    temp_cost = sum(total[df]['Cost'])
    print(f'The number of fruits for {df} is {len(total[df])} and the cost is {temp_cost}')

The number of fruits for sw_a is 5 and the cost is 6.5
The number of fruits for sw_c is 1 and the cost is 1.5
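The repeated `if` blocks can also be collapsed into a single dict comprehension. The sketch below is one possible variant, not part of the answer above: it assumes each name is paired with its switch and dataframe in a hypothetical `candidates` dictionary, and it uses smaller stand-in dataframes.

```python
import pandas as pd

sw_a, sw_b, sw_c = True, False, True

# Small stand-in dataframes (not the data from the question).
a = pd.DataFrame({'IDs': [1234, 5346], 'Cost': [1.0, 2.0], 'Names': ['APPLE', 'Orange']})
b = pd.DataFrame({'IDs': [1, 2], 'Cost': [1.0, 1.5], 'Names': ['APPLE1', 'Blue1']})
c = pd.DataFrame({'IDs': [12], 'Cost': [1.5], 'Names': ['APPLE2']})

# Hypothetical layout: name -> (switch, dataframe).
candidates = {'sw_a': (sw_a, a), 'sw_b': (sw_b, b), 'sw_c': (sw_c, c)}

# Keep only the dataframes whose switch is on.
total = {name: df for name, (switch, df) in candidates.items() if switch}

for name, df in total.items():
    print(f"The number of fruits for {name} is {len(df)} and the cost is {df['Cost'].sum()}")
```

Keeping the switch next to its dataframe means a new dataframe needs one new dictionary entry rather than a new `if` block.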

My set-up is similar to yours, but I don't bother with the switches on each dataframe assignment:

import pandas as pd

sw_a = True
sw_b = False
sw_c = True

a = pd.DataFrame({'IDs':[1234,5346,1234,8793,8793],
                   'Cost':[1.1,1.2,1.3,1.4,1.5],
                    'Names':['APPLE','Orange','STRAWBERRY','Grape','Blue']})
b = pd.DataFrame({'IDs':[1,2],
                   'Cost':[1.1,1.2],
                    'Names':['APPLE1','Blue1']})
c = pd.DataFrame({'IDs':[12],
                  'Cost':[1.5],
                    'Names':['APPLE2']})

total = {"first":a,"second":b,"third":c} # don't worry about the switches yet.

Only now do we filter:

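list_switches = [sw_a, sw_b, sw_c] # the switches! finally!
# iterating over total yields its keys in insertion order, so each switch lines up with its key
total_filtered = {name: total[name] for switch, name in zip(list_switches, total) if switch}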

And carry on as you have done.

for df in total_filtered:
    temp_cost = sum(total_filtered[df]['Cost'])
    print(f'The number of fruits for {df} is {len(total_filtered[df])} and the cost is {temp_cost}')
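The filter above pairs switches with dictionary keys purely by position, which relies on dict insertion order (guaranteed since Python 3.7) and on the switch list being written in the same order as the keys. As a sketch of an alternative, the switches could be keyed by the same names as the dataframes. The `switches` dictionary and the small stand-in dataframes below are assumptions for illustration.

```python
import pandas as pd

# Small stand-in dataframes (not the data from the question).
a = pd.DataFrame({'IDs': [1234, 5346], 'Cost': [1.0, 2.0]})
b = pd.DataFrame({'IDs': [1, 2], 'Cost': [1.0, 1.5]})
c = pd.DataFrame({'IDs': [12], 'Cost': [1.5]})

total = {'first': a, 'second': b, 'third': c}

# Hypothetical: switches keyed by the same names as the dataframes.
switches = {'first': True, 'second': False, 'third': True}

# A name with no switch entry is treated as switched off.
total_filtered = {name: df for name, df in total.items() if switches.get(name, False)}

print(list(total_filtered))  # ['first', 'third']
```

With this layout, reordering either dictionary cannot attach a switch to the wrong dataframe.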

Output: [screenshot not preserved in the source]

Edit: You can be slightly fancier with the zip functionality. For example, if you're constructing the lists of dataframes, dataframe names, and switches dynamically and can be sure that they will always be the same length, you could do something like:

# pretend these three lists are coming from somewhere else and can have variable length, rather than being hard-coded.
list_dfs = [a,b,c]
list_switches = [sw_a, sw_b, sw_c]
list_names = ["first", "second", "third"]

# use a zip object over the three lists and unpack each tuple.
zipped = zip(list_dfs, list_switches, list_names)
total = {name: df for df, switch, name in zipped if switch}

for df in total:
    temp_cost = sum(total[df]['Cost'])
    print(f'The number of fruits for {df} is {len(total[df])} and the cost is {temp_cost}')
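The same-length condition matters because `zip` stops at the shortest input without complaining, so a mismatch would silently drop entries. The sketch below wraps the comprehension in a hypothetical helper, `build_enabled`, that checks the lengths first; the helper name and the stand-in dataframes are assumptions. On Python 3.10 or later, `zip(..., strict=True)` is a built-in alternative that raises `ValueError` on a mismatch.

```python
import pandas as pd


def build_enabled(dfs, switches, names):
    """Return {name: df} for every position whose switch is truthy."""
    if not (len(dfs) == len(switches) == len(names)):
        raise ValueError('dfs, switches and names must have the same length')
    return {name: df for df, switch, name in zip(dfs, switches, names) if switch}


# Small stand-in dataframes (not the data from the question).
a = pd.DataFrame({'IDs': [1234, 5346], 'Cost': [1.0, 2.0]})
b = pd.DataFrame({'IDs': [1, 2], 'Cost': [1.0, 1.5]})
c = pd.DataFrame({'IDs': [12], 'Cost': [1.5]})

total = build_enabled([a, b, c], [True, False, True], ['first', 'second', 'third'])
print(list(total))  # ['first', 'third']

# A length mismatch is reported instead of being silently truncated.
try:
    build_enabled([a, b], [True], ['first', 'second'])
except ValueError as exc:
    mismatch_error = str(exc)
    print(f'ValueError: {mismatch_error}')
```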
