
Intersection of multiple pandas dataframes

I have a number of dataframes (100) in a list:

frameList = [df1,df2,..,df100]

Each dataframe has two columns: DateTime and Temperature.

I want to intersect all the dataframes on the common DateTime column and get all their Temperature columns combined/merged into one big dataframe: Temperature from df1, Temperature from df2, Temperature from df3, .., Temperature from df100.

(pandas merge doesn't work, as I'd have to compute multiple (99) pairwise intersections.)

Use pd.concat, which works on a list of DataFrames or Series. It aligns on the index rather than on a column, so set DateTime as the index first:

pd.concat([df.set_index('DateTime') for df in frameList], axis=1, join='inner')

This is better than using pd.merge, because pd.merge copies the data pairwise every time it is executed, whereas pd.concat copies only once. However, pd.concat only combines along an axis (aligning on the index), whereas pd.merge can also merge on one or more columns.
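As a minimal sketch of the pd.concat approach, assuming DateTime values are unique within each frame: the example below builds three small made-up frames (the `make_frame` helper, the sample data, and the `df1`, `df2`, ... column labels are all hypothetical), moves DateTime into the index so concat can align on it, and passes `keys` so each Temperature column is labelled by its source frame.

```python
import pandas as pd


def make_frame(times, temps):
    # Hypothetical helper: build a frame with the two columns from the question.
    return pd.DataFrame({"DateTime": pd.to_datetime(times), "Temperature": temps})


# Made-up sample data standing in for the 100 real frames.
frameList = [
    make_frame(["2020-01-01", "2020-01-02", "2020-01-03"], [1.0, 2.0, 3.0]),
    make_frame(["2020-01-02", "2020-01-03", "2020-01-04"], [20.0, 30.0, 40.0]),
    make_frame(["2020-01-03", "2020-01-02", "2020-01-05"], [300.0, 200.0, 500.0]),
]

# concat aligns on the index, so move DateTime there and keep the Temperature Series.
series = [df.set_index("DateTime")["Temperature"] for df in frameList]

# Assumed labels (df1, df2, ...) so the resulting columns can be told apart.
keys = [f"df{i}" for i in range(1, len(series) + 1)]

# join="inner" keeps only the DateTime values present in every frame.
combined = pd.concat(series, axis=1, join="inner", keys=keys)
print(combined)
```

Without `keys`, every column in the result would be called Temperature, which makes it hard to select a single one afterwards.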

You can try using reduce in Python, something like this:

from functools import reduce  # reduce is not a builtin in Python 3
dfs = [df0, df1, df2, dfN]
df_final = reduce(lambda left, right: pd.merge(left, right, on='DateTime'), dfs)
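One thing to watch with the reduce approach: every frame has a column named Temperature, so repeated merges collide on that name. pandas appends _x/_y suffixes, and recent versions may raise an error once the suffixed names start repeating. A minimal sketch that sidesteps this by renaming each Temperature column before merging; the `make_frame` helper, the sample data, and the Temperature_1, Temperature_2, ... names are assumptions for illustration.

```python
from functools import reduce

import pandas as pd


def make_frame(times, temps):
    # Hypothetical helper: build a frame with the two columns from the question.
    return pd.DataFrame({"DateTime": pd.to_datetime(times), "Temperature": temps})


# Made-up sample data; four frames so the name collision would actually occur.
dfs = [
    make_frame(["2020-01-01", "2020-01-02", "2020-01-03"], [1.0, 2.0, 3.0]),
    make_frame(["2020-01-02", "2020-01-03", "2020-01-04"], [20.0, 30.0, 40.0]),
    make_frame(["2020-01-03", "2020-01-02", "2020-01-05"], [300.0, 200.0, 500.0]),
    make_frame(["2020-01-02", "2020-01-03", "2020-01-06"], [2000.0, 3000.0, 6000.0]),
]

# Give each Temperature column a unique (assumed) name before merging.
renamed = [
    df.rename(columns={"Temperature": f"Temperature_{i}"})
    for i, df in enumerate(dfs, start=1)
]

# Fold the list into one frame; each step inner-joins on DateTime.
df_final = reduce(
    lambda left, right: pd.merge(left, right, on="DateTime", how="inner"),
    renamed,
)
print(df_final)
```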

You could iterate over your list like this:

df_merge = frameList[0]
for df in frameList[1:]:
    df_merge = pd.merge(df_merge, df, on='DateTime', how='inner')


 