How do I create multiple data frames using a for loop in Python?
I'm trying to make multiple dataframes that are subsets of existing dataframes.
I have df_list, which is actually a list of datasets:
df_list = [df1B, df2B, df3B, df4B, df5B, df6B, df7B, df8B, df9B, df10B, df11B, df12B, df13B, df14B, df15B, df16B, df17B, df18B, df19B, df20B, df21B, df22B, df23B, df24B, df25B, df26B, df27B, df28B, df30B, df31B, df32B, df33B, df34B, df35B]
If I want to make a subset of a single dataset, I do this and it works:
df2B = df2B.groupby(['Location']).get_group(36)
It takes all rows whose Location is 36, but when I try to do the same for all the datasets in a for loop, it doesn't work:
for df in df_list:
    df = df.groupby(['Location']).get_group(36)
But this doesn't produce a subset for each dataset. It doesn't show any error message, but it doesn't do anything else either :(
Should I just write the same line 35 times??? I hope there is a better option.
If I understand correctly, you can use a list comprehension for this:
subset_df_list = [df.groupby('Location').get_group(36) for df in df_list]
As an aside, your for loop doesn't work because you just keep assigning back to df, which only rebinds the loop variable and never changes the list. You probably want this, which is also the equivalent of the above comprehension:
subset_df_list = []
for df in df_list:
    subset_df = df.groupby('Location').get_group(36)
    subset_df_list.append(subset_df)
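To see why the original loop appears to do nothing: `df = ...` inside the loop only rebinds the name `df`, while the list still points at the unfiltered frames. Below is a minimal sketch with made-up data. The frames `df_a` and `df_b` and the `frames` dict are illustrative assumptions, not from the question. It also shows two ways to keep the results: writing back by index, or storing the subsets under names in a dict.

```python
import pandas as pd

# Two small illustrative frames standing in for df1B, df2B, ...
df_a = pd.DataFrame({'Location': [36, 12, 36], 'value': [1, 2, 3]})
df_b = pd.DataFrame({'Location': [36, 36, 7], 'value': [4, 5, 6]})
df_list = [df_a, df_b]

# Rebinding the loop variable leaves the list untouched.
for df in df_list:
    df = df.groupby('Location').get_group(36)
rows_after_rebinding = [len(d) for d in df_list]  # still the full frames

# Assigning by index replaces each list element with its subset.
for i, df in enumerate(df_list):
    df_list[i] = df.groupby('Location').get_group(36)
rows_after_index_assignment = [len(d) for d in df_list]

# A dict keeps a readable name for each subset instead of 35 variables.
frames = {'df_a': df_a, 'df_b': df_b}
subsets = {name: d.groupby('Location').get_group(36) for name, d in frames.items()}
```

Note that the names `df_a` and `df_b` still refer to the full frames afterwards; only the list elements and the dict values hold the subsets.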
import numpy as np
import pandas as pd
df = [pd.DataFrame({'Location': np.random.randint(0,5,size=(100))}) for i in range(10)]
df = list(map(lambda x: x.groupby('Location').get_group(1), df))
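One caveat that applies to the `get_group` versions above: `get_group` raises a `KeyError` when a frame has no rows for the requested key. A hedged alternative, assuming an empty DataFrame is an acceptable result in that case, is a boolean mask. The helper name `subset_by_location` is hypothetical.

```python
import pandas as pd

def subset_by_location(frames, location):
    """Return one filtered frame per input; frames without the location yield empty frames."""
    return [f[f['Location'] == location] for f in frames]

frames = [
    pd.DataFrame({'Location': [36, 12, 36]}),
    pd.DataFrame({'Location': [7, 8]}),  # no rows for location 36
]
subsets = subset_by_location(frames, 36)
sizes = [len(s) for s in subsets]
```

Whether an empty frame or an error is the better outcome depends on what the later steps expect, so this is a design choice rather than a drop-in replacement.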
You're assigning to your loop variable, which is then thrown away on the next go around. DataFrame.append isn't in-place and doesn't have an inplace parameter. Instead:
df1 = pd.DataFrame({'gr': [1,1,2,2], 'v': [1,2,3,2]})
df2 = pd.DataFrame({'gr': [1,1,2,2], 'v': [6,5,4,3]})
df_combined = pd.DataFrame({'gr': [], 'v':[]})
df_combined
# Empty DataFrame
# Columns: [gr, v]
# Index: []
for df in [df1, df2]:
    df_combined = df_combined.append(df.groupby('gr').get_group(1))
df_combined
# gr v
# 0 1.0 1.0
# 1 1.0 2.0
# 0 1.0 6.0
# 1 1.0 5.0
Unless you want a list of DataFrames, which it suddenly seems like you do. (I was thrown by df.append(). For a list, append adds to the end in place. For a DataFrame, it does not.) In the list case, you want:
# setup as before
combined_dfs = []
for df in [df1, df2]:
    # list.append works in place, so call it on the list without reassigning
    combined_dfs.append(df.groupby('gr').get_group(1))
It's a funny way to use DataFrames, but there ya go! :D
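Note that `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so the `df_combined.append(...)` version above only runs on older releases. A minimal sketch of the same result with `pd.concat`, collecting the groups in a plain list first:

```python
import pandas as pd

df1 = pd.DataFrame({'gr': [1, 1, 2, 2], 'v': [1, 2, 3, 2]})
df2 = pd.DataFrame({'gr': [1, 1, 2, 2], 'v': [6, 5, 4, 3]})

# list.append mutates the list in place, so no reassignment is needed.
groups = []
for df in [df1, df2]:
    groups.append(df.groupby('gr').get_group(1))

# Concatenate once at the end; the original row labels (0, 1, 0, 1) are kept.
df_combined = pd.concat(groups)
```

Passing `ignore_index=True` to `pd.concat` renumbers the rows from 0 if the repeated labels are unwanted.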