
How to merge a list of dataframes that all have the same index and the same column names?

I have a list of dataframes like this, one for each of 90 heating devices:

data_list = [df0,df2, ... ,df89]

All of the dataframes in data_list have the same features (= column names):

("timestamp", "outside_temperature", "heating_generation", "power_consumption", "hot_water_storage", etc..)

All dataframes use timestamp as their index, covering the same period of time.

I now want to create new dataframes, each holding a single feature for all 90 heating devices,

e.g. for outside_temperature:

timestamp           device_0  device_2  device_3  ...  device_89
01.05.2022 00:10       15.03     14.39     15.69  ...      15.30
01.05.2022 00:15       14.94     14.20     15.30  ...      15.29
01.05.2022 00:20       14.94     14.05     15.29  ...      15.20
...
etc.

and the same for all my other features.

Any idea what the best way to do this is? I was thinking about merging but couldn't find good advice; the alternative would be doing it with for loops.

If I followed your question correctly, you could do this in two steps: first concat the selected column from every dataframe using a list comprehension, then set the column names. This does assume that the timestamps are the same in all dataframes stored in data_list.

For example, the following would concatenate all outside_temperature columns, leaving every column still named outside_temperature:

feature_df = pd.concat([x['outside_temperature'] for x in data_list], axis=1)

and then you could rename the columns with something like the following:

feature_df.columns = [f'device_{i}' for i in range(len(data_list))]
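Since the question asks for this for every feature, the two steps can be wrapped in a dict comprehension. What follows is only a minimal sketch on made-up data: the three-device data_list, its values, and the names device_names and feature_frames are assumptions for illustration, and it assumes every dataframe shares the same timestamp index. Passing keys= to pd.concat names the columns during the concat, so the separate renaming step is not needed.

```python
import pandas as pd

# Toy stand-in for data_list: three small frames sharing one timestamp index.
# The values and the three-device size are made up for illustration.
index = pd.date_range("2022-05-01 00:10", periods=3, freq="5min", name="timestamp")
data_list = [
    pd.DataFrame(
        {
            "outside_temperature": [15.0 + i, 14.9 + i, 14.8 + i],
            "power_consumption": [100.0 * (i + 1)] * 3,
        },
        index=index,
    )
    for i in range(3)
]

# Hypothetical device labels, numbered by position in data_list.
device_names = [f"device_{i}" for i in range(len(data_list))]

# One wide frame per feature: rows are timestamps, columns are devices.
feature_frames = {
    feature: pd.concat([df[feature] for df in data_list], axis=1, keys=device_names)
    for feature in data_list[0].columns
}

print(feature_frames["outside_temperature"])
```

Each feature is then available by name, for example feature_frames["power_consumption"].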

You can concat your dataframes and then reshape the result from long to wide format using the pd.pivot function, as in this small example:

import pandas as pd

df1 = pd.DataFrame({
    'timestamp': [pd.Timestamp(2022, 1, 1), pd.Timestamp(2022, 1, 2)],
    'value': [1, 2],
})
df2 = pd.DataFrame({
    'timestamp': [pd.Timestamp(2022, 1, 1), pd.Timestamp(2022, 1, 4)],
    'value': [3, 4],
})

dfs = [df1, df2]
# Tag each frame with the device it came from
for idx, df in enumerate(dfs):
    df['device'] = f'device_{idx}'

# Stack the frames into one long frame, then pivot to one column per device
final_df = pd.concat(dfs)
wide_df = pd.pivot(final_df, index='timestamp', columns='device', values='value')
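The pivot approach also extends to several features at once. As a rough sketch on made-up data (the frames, values, and names such as device_frames and wide are assumptions for illustration): when values= is omitted, pivot keeps every remaining column and returns MultiIndex columns of the form (feature, device), so selecting a feature name yields the per-feature frame the question asks for.

```python
import pandas as pd

# Two made-up device frames with the same timestamps and two features each.
timestamps = pd.to_datetime(["2022-05-01 00:10", "2022-05-01 00:15"])
device_frames = [
    pd.DataFrame({
        "timestamp": timestamps,
        "outside_temperature": [15.0, 14.9],
        "power_consumption": [100.0, 110.0],
    }),
    pd.DataFrame({
        "timestamp": timestamps,
        "outside_temperature": [14.4, 14.2],
        "power_consumption": [200.0, 210.0],
    }),
]

# assign() returns tagged copies, so the input frames are left unmodified.
tagged = [df.assign(device=f"device_{idx}") for idx, df in enumerate(device_frames)]
long_df = pd.concat(tagged, ignore_index=True)

# Without values=, the result has MultiIndex columns: (feature, device).
wide = long_df.pivot(index="timestamp", columns="device")

# Selecting the top level gives one frame per feature, one column per device.
outside_temperature = wide["outside_temperature"]
print(outside_temperature)
```

If the devices do not share all timestamps, the missing combinations appear as NaN in the wide frame.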
