How to create multiple DataFrames using a for loop in Python

Question

I'm trying to make multiple dataframes that are subsets of existing dataframes.

I have df_list, which is a list of datasets:

df_list = [df1B, df2B, df3B, df4B, df5B, df6B, df7B, df8B, df9B, df10B, df11B, df12B, df13B, df14B, df15B, df16B, df17B, df18B, df19B, df20B, df21B, df22B, df23B, df24B, df25B, df26B, df27B, df28B, df30B, df31B, df32B, df33B, df34B, df35B]

If I want to make a subset of a single dataset, I do this and it works:

df2B = df2B.groupby(['Location']).get_group(36)

This takes all the rows whose Location is 36, but when I try to do the same for all the datasets in a for loop, it doesn't work:

for df in df_list:
    df = df.groupby(['Location']).get_group(36)

But this doesn't apply the subset to any of the datasets. It doesn't show an error message, but it doesn't do anything else either :(

Should I just write the same line 35 times? I hope there is a better option.

Answer 1

If I understand correctly, you can use a list comprehension for this:

subset_df_list = [df.groupby('Location').get_group(36) for df in df_list]

As an aside, your for loop doesn't work because you just keep assigning back to df, the loop variable, so the result is never stored anywhere. You probably want this, which is equivalent to the comprehension above:

subset_df_list = []

for df in df_list:
    subset_df = df.groupby('Location').get_group(36)
    subset_df_list.append(subset_df)
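
To see why rebinding df inside the loop leaves df_list untouched, here is a minimal sketch. The frames df_a and df_b and their values are made up for illustration and are not the asker's data. It also shows one possible way to write each subset back by index, in case the original list itself should be updated rather than a new one built.

```python
import pandas as pd

# Hypothetical stand-ins for the asker's DataFrames.
df_a = pd.DataFrame({'Location': [36, 36, 12], 'value': [1, 2, 3]})
df_b = pd.DataFrame({'Location': [12, 36, 36], 'value': [4, 5, 6]})
df_list = [df_a, df_b]

# Rebinding the loop variable only changes what the name `df` points to;
# the list still holds the original, unfiltered objects.
for df in df_list:
    df = df.groupby('Location').get_group(36)
rows_after_rebinding = [len(d) for d in df_list]

# Assigning by index stores each subset back into the list.
for i, df in enumerate(df_list):
    df_list[i] = df.groupby('Location').get_group(36)
rows_after_index_assignment = [len(d) for d in df_list]
```

Note that even the index-assignment version does not change names such as df2B; those variables still refer to the unfiltered frames, so the subsets should be read from the list.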

Answer 2

import numpy as np
import pandas as pd
df = [pd.DataFrame({'Location': np.random.randint(0, 5, size=100)}) for _ in range(10)]
df = list(map(lambda x: x.groupby('Location').get_group(1), df))
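
As an alternative to get_group, a boolean mask selects the same rows. This is a sketch under assumptions: the frames below are made up, and only the column name Location comes from the question. One practical difference is that get_group raises KeyError when a frame has no rows for the requested key, while a mask simply yields an empty DataFrame.

```python
import pandas as pd

# Hypothetical example frames; the column name 'Location' follows the question.
frames = [
    pd.DataFrame({'Location': [36, 12, 36], 'value': [1, 2, 3]}),
    pd.DataFrame({'Location': [12, 12, 7], 'value': [4, 5, 6]}),  # no 36 here
]

# A boolean mask keeps the rows where Location equals 36.
# A frame with no matching rows produces an empty DataFrame
# instead of raising KeyError.
subsets = [frame[frame['Location'] == 36] for frame in frames]
row_counts = [len(s) for s in subsets]
```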

Answer 3

You're assigning to your loop variable, which is then thrown away on the next iteration. DataFrame.append does not operate in place and has no inplace parameter. Instead:

df1 = pd.DataFrame({'gr': [1,1,2,2], 'v': [1,2,3,2]})
df2 = pd.DataFrame({'gr': [1,1,2,2], 'v': [6,5,4,3]})
df_combined = pd.DataFrame({'gr': [], 'v':[]})
df_combined
# Empty DataFrame
# Columns: [gr, v]
# Index: []
for df in [df1, df2]:
    df_combined = df_combined.append(df.groupby('gr').get_group(1))
df_combined
#     gr    v
# 0  1.0  1.0
# 1  1.0  2.0
# 0  1.0  6.0
# 1  1.0  5.0

Unless you want a list of DataFrames, which it suddenly seems like you do. (I was thrown by df.append(). For a list, append adds to the end in place. For a DataFrame, it does not.) In the list case, you want:

# setup as before
combined_dfs = []
for df in [df1, df2]:
    combined_dfs.append(df.groupby('gr').get_group(1))

It's a funny way to use DataFrames, but there ya go! :D
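
One caveat for readers on current pandas: DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the df_combined.append loop in this answer fails there. A minimal sketch of the same combined result using pd.concat, reusing df1 and df2 from the answer (the name pieces is introduced here for illustration):

```python
import pandas as pd

df1 = pd.DataFrame({'gr': [1, 1, 2, 2], 'v': [1, 2, 3, 2]})
df2 = pd.DataFrame({'gr': [1, 1, 2, 2], 'v': [6, 5, 4, 3]})

# Collect the group-1 rows from each frame, then concatenate once.
pieces = [df.groupby('gr').get_group(1) for df in [df1, df2]]
df_combined = pd.concat(pieces)
```

Concatenating once after the loop also avoids copying the growing DataFrame on every iteration. The original row labels are kept, so passing ignore_index=True to pd.concat may be preferable if a fresh 0..n-1 index is wanted.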

3 answers

Answer 1
1, accepted, 2019-05-29 02:47:28

Answer 2
0, 2019-05-29 02:48:18

Answer 3
0, 2019-05-29 03:03:43
