
Concatenating multiple pandas DataFrames

I have a large number of DataFrames that share the prefix df_ and look like:

df_1
df_x
df_ab
.
.
.
df_1a
df_2b

Of course, I can do final_df = pd.concat([df_1, df_x, df_ab, ... df_1a, df_2b], axis = 1)

The issue is that although the prefix df_ will always be there, the rest of each DataFrame's name keeps changing and follows no pattern. So I have to constantly update the list of DataFrames in pd.concat to create final_df, which is cumbersome.

Question: is there any way to tell Python to concatenate all DataFrames defined in the namespace whose names start with df_ (and only those) and create final_df, or at least return a list of all such DataFrames that I can then manually feed into pd.concat?

You could do something like this, using the built-in function globals():

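import pandas as pd

def concat_all(prefix='df_'):
    # Collect every module-level DataFrame whose variable name starts with prefix.
    dfs = [df for name, df in globals().items()
           if name.startswith(prefix) and isinstance(df, pd.DataFrame)]
    # Place the frames side by side (column-wise).
    return pd.concat(dfs, axis=1)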

Logic:

  1. Filter your global namespace down to the DataFrames whose names start with prefix
  2. Put these in a list (concat accepts any iterable of pandas objects, but a list is easy to inspect and reuse)
  3. Call concat() with axis=1, which places the frames side by side as columns
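The question also asks for a way to just get the matching DataFrames back. A minimal sketch of the filtering step on its own, assuming a hypothetical helper named find_frames that takes the namespace as an argument (the name and signature are illustrative, not part of pandas):

```python
import pandas as pd


def find_frames(namespace, prefix="df_"):
    """Return {name: DataFrame} for names in namespace that start with prefix.

    find_frames is a hypothetical helper for illustration only.
    """
    return {
        name: obj
        for name, obj in namespace.items()
        if name.startswith(prefix) and isinstance(obj, pd.DataFrame)
    }


df_1 = pd.DataFrame([[0, 1], [2, 3]])
df_x = pd.DataFrame([[4, 5], [6, 7]])
other_df = df_1 * 2          # wrong prefix, skipped
df_note = "not a DataFrame"  # right prefix, wrong type, skipped

frames = find_frames(globals())
print(sorted(frames))  # names of the frames that matched

# The matching frames can then be passed to pd.concat by hand.
final_df = pd.concat(list(frames.values()), axis=1)
```

Passing the namespace in explicitly keeps the helper usable with any dictionary of names, not only the module globals.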

Example:

import pandas as pd

df_1 = pd.DataFrame([[0, 1], [2, 3]])
df_2 = pd.DataFrame([[4, 5], [6, 7]])
other_df = df_1.copy() * 2  # ignore this
s_1 = pd.Series([1, 2, 3, 4])  # and this

final_df = concat_all()
final_df

   0  1  0  1
0  0  1  4  5
1  2  3  6  7
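The result above has repeated column labels (0, 1, 0, 1), so it is hard to tell which columns came from which frame. One possible way around this is the keys argument of pd.concat, which adds an outer column level. A minimal sketch, with the frame names supplied by hand as an assumption:

```python
import pandas as pd

df_1 = pd.DataFrame([[0, 1], [2, 3]])
df_2 = pd.DataFrame([[4, 5], [6, 7]])

# Names are written out by hand here; they could equally come from
# whatever lookup produced the list of frames.
names = ["df_1", "df_2"]

# keys= labels each input frame with an outer level on the column index.
labeled = pd.concat([df_1, df_2], axis=1, keys=names)
print(labeled)

# Columns that came from one source frame can be selected by its label.
print(labeled["df_2"])
```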

Always use globals() with caution: it returns a dictionary of the entire module namespace, so the filter can also pick up names you did not intend to include.

You need globals() rather than locals() because the lookup happens inside a function. There, locals() would contain only the function's own variables (such as prefix), not the module-level DataFrames.
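A small sketch of that difference, using a hypothetical function names_seen that reports what each dictionary contains when called inside a function:

```python
import pandas as pd

df_1 = pd.DataFrame([[0, 1], [2, 3]])  # module-level name


def names_seen(prefix="df_"):
    # At this point locals() holds only this function's own variables.
    local_names = sorted(locals())
    # globals() holds the module-level names, including df_1.
    global_hits = sorted(n for n in globals() if n.startswith(prefix))
    return local_names, global_hits


local_names, global_hits = names_seen()
print(local_names)  # ['prefix']
print(global_hits)  # includes 'df_1' when run as a standalone script
```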
