pandas.concat of multiple data frames using only common columns
I have multiple pandas data frame objects cost1, cost2, cost3, ....
How can I append the rows from all of these data frames into one single data frame while retaining only the columns whose names are common to all of them?
As of now I have
frames=[cost1,cost2,cost3]
new_combined = pd.concat(frames, ignore_index=True)
This obviously also keeps columns that are not common to all data frames, with NaN filled in wherever a frame lacks a column.
For future readers: pandas can do this by itself. If you pass the join='inner' argument to pd.concat, it concatenates the data frames while keeping only the common columns, e.g.
pd.concat(frames, join='inner', ignore_index=True)
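To make the effect of join='inner' concrete, here is a minimal sketch. The three small frames and their column names (item, price, region, tax, note) are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample frames; only "item" and "price" appear in all three.
cost1 = pd.DataFrame({"item": ["a", "b"], "price": [1.0, 2.0], "region": ["N", "S"]})
cost2 = pd.DataFrame({"item": ["c"], "price": [3.0], "tax": [0.3]})
cost3 = pd.DataFrame({"price": [4.0], "item": ["d"], "note": ["x"]})

frames = [cost1, cost2, cost3]

# join='inner' keeps only the columns present in every frame;
# ignore_index=True renumbers the rows 0..n-1.
combined = pd.concat(frames, join="inner", ignore_index=True)
print(combined)
```

The result should have four rows and only the item and price columns; region, tax and note are dropped because they are not shared by every frame.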
You can find the common columns with Python's set.intersection:
common_cols = list(set.intersection(*(set(df.columns) for df in frames)))
To concatenate using only the common columns, you can use
pd.concat([df[common_cols] for df in frames], ignore_index=True)
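One caveat with the set-based approach: a Python set has no guaranteed iteration order, so list(set.intersection(...)) may return the common columns in a different order than they appear in the frames. If the column order matters, a possible variant is to filter the first frame's columns against the intersection. The sample frames below are hypothetical and only serve to make the sketch runnable.

```python
import pandas as pd

# Hypothetical sample frames; only "item" and "price" appear in all three.
cost1 = pd.DataFrame({"item": ["a", "b"], "price": [1.0, 2.0], "region": ["N", "S"]})
cost2 = pd.DataFrame({"item": ["c"], "price": [3.0], "tax": [0.3]})
cost3 = pd.DataFrame({"price": [4.0], "item": ["d"], "note": ["x"]})

frames = [cost1, cost2, cost3]

# Column names shared by every frame (unordered).
common = set.intersection(*(set(df.columns) for df in frames))

# Keep the shared names in the order they appear in the first frame.
common_cols = [col for col in frames[0].columns if col in common]

combined = pd.concat([df[common_cols] for df in frames], ignore_index=True)
print(combined)
```

Here the combined frame keeps the columns in the order item, price, following cost1, even though cost3 lists them the other way round.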