How to merge a list of dataframes, all with the same index and the same column names?
I have a list of dataframes like this, for 90 heat devices:
data_list = [df0,df2, ... ,df89]
All of these dataframes in data_list have the same features (= column names):
("timestamp", "outside_temperature", "heating_generation", "power_consumption", "hot_water_storage", etc..)
All dataframes have timestamp as their index, covering the same period of time.
I now want to create new dataframes, each holding one feature for all 90 heat devices,
e.g. for outside_temperature:
timestamp         device_0  device_2  device_3  ...  device_89
01.05.2022 00:10  15.03     14.39     15.69     ...  15.30
01.05.2022 00:15  14.94     14.20     15.30     ...  15.29
01.05.2022 00:20  14.94     14.05     15.29     ...  15.20
.
.
.
etc.
and the same for all my features.
Any idea what the best way to do this is? I was thinking about merging, but couldn't find good advice, or about doing it with for loops.
If I followed your question correctly, you could do this in two steps: concat the selected column from each dataframe using a list comprehension, then set the column names. This does assume that the timestamps are the same in all dataframes stored in data_list.
For example, the following would concatenate all outside_temperature columns, leaving every column with the same name (outside_temperature):
feature_df = pd.concat([x['outside_temperature'] for x in data_list], axis=1)
and then you could rename the columns with something like the following:
feature_df.columns = [f'device_{i}' for i in range(len(data_list))]
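To repeat this for every feature, the two steps above could be wrapped in a dict comprehension. This is only a sketch: the small data_list built here is a made-up stand-in for the real one, the names device_names and feature_frames are illustrative, and passing keys= to pd.concat is used in place of the separate rename step.

```python
import pandas as pd

# Hypothetical stand-in for data_list: three small frames sharing a timestamp index.
idx = pd.date_range("2022-05-01 00:10", periods=3, freq="5min", name="timestamp")
data_list = [
    pd.DataFrame(
        {
            "outside_temperature": [15.0 + i, 14.9 + i, 14.8 + i],
            "power_consumption": [1.0 * i, 2.0 * i, 3.0 * i],
        },
        index=idx,
    )
    for i in range(3)
]

# Column labels for the wide frames, one per device (position in data_list).
device_names = [f"device_{i}" for i in range(len(data_list))]

# One wide frame per feature; keys= names the columns during the concat.
feature_frames = {
    feature: pd.concat([df[feature] for df in data_list], axis=1, keys=device_names)
    for feature in data_list[0].columns
}

print(feature_frames["outside_temperature"])
```

Each value in feature_frames is then a dataframe indexed by timestamp with one column per device, e.g. feature_frames["power_consumption"].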
You can concat your dataframes and then reshape them from long to wide format using the pd.pivot function, as in this small example:
import pandas as pd
df1 = pd.DataFrame({
'timestamp': [pd.Timestamp(2022, 1,1), pd.Timestamp(2022, 1,2)],
'value':[1,2],
})
df2 = pd.DataFrame({
'timestamp': [pd.Timestamp(2022, 1,1), pd.Timestamp(2022, 1,4)],
'value':[3,4]
})
dfs = [df1, df2]
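# tag every row with the device it came from (modifies df1 and df2 in place)
for idx, df in enumerate(dfs):
    df['device'] = f'device_{idx}'
# stack the tagged frames into one long frame
final_df = pd.concat(dfs)
# one row per timestamp, one column per device
pd.pivot(index='timestamp', columns='device', values='value', data=final_df)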
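In the question the timestamp is the index rather than a column, so the same idea might look roughly like the sketch below. The sample data_list and the names long_df and wide are assumptions made for illustration; reset_index is used to turn the index into a column before pivoting.

```python
import pandas as pd

# Hypothetical stand-in for data_list, with timestamp as the index.
idx = pd.date_range("2022-05-01 00:10", periods=2, freq="5min", name="timestamp")
data_list = [
    pd.DataFrame(
        {
            "outside_temperature": [15.0 + i, 14.9 + i],
            "power_consumption": [1.0 + i, 2.0 + i],
        },
        index=idx,
    )
    for i in range(2)
]

# Move the timestamp index into a column and tag each frame with its device.
long_df = pd.concat(
    [df.reset_index().assign(device=f"device_{i}") for i, df in enumerate(data_list)],
    ignore_index=True,
)

# Omitting values= pivots every remaining column at once; the result has
# two-level columns of the form (feature, device).
wide = long_df.pivot(index="timestamp", columns="device")

# Selecting the top level gives one wide frame for that feature.
outside_temperature = wide["outside_temperature"]
print(outside_temperature)
```

Unlike the axis=1 concat, the pivot does not require identical timestamps across devices: a timestamp that is missing for a device simply shows up as NaN in that device's column.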