Pandas: concatenate a list of dataframes into one dataframe

Question

I'm trying to combine a list of dataframes into one dataframe. The data looks like:

    Date        station_id  Hour    Temp
0   2004-01-01  1           1       46.0
1   2004-01-01  1           2       46.0
2   2004-01-01  1           3       45.0
3   2004-01-01  1           4       41.0
...
433730  2008-06-30  11      3       64.0
433731  2008-06-30  11      4       64.0
433732  2008-06-30  11      5       64.0
433733  2008-06-30  11      6       64.0

This gives me a list of dataframes, one per station:

stations = [x for _,x in df.groupby('station_id')]

When I reset the indices of the frames in "stations" and concat them, I get a dataframe, but it doesn't look the way I'd like:

for i in range(0,11):
     stations[i].reset_index(drop=True,inplace=True)    

pd.concat(stations,axis=1)

    Date        station_id  Hour    Temp    Date        station_id  Hour    Temp
0   2004-01-01  1           1       46.0    2004-01-01  2           1       38.0
1   2004-01-01  1           2       46.0    2004-01-01  2           2       36.0
2   2004-01-01  1           3       45.0    2004-01-01  2           3       35.0
3   2004-01-01  1           4       41.0    2004-01-01  2           4       30.0

I'd much rather end up with a df like this:

    Date        Hour    Stn1    Stn2
0   2004-01-01  1       46.0    38.0
1   2004-01-01  2       46.0    36.0
2   2004-01-01  3       45.0    35.0
3   2004-01-01  4       41.0    30.0

How do I do this?

Answer 1

Based on your expected output, you are looking for a pivot table with index=['Date', 'Hour'], columns='station_id', values='Temp'. Demo:

# A bunch of example data
df
    Date        station_id  Hour    Temp
0   2004-01-01  1           1       10.0
1   2004-01-01  1           2       20.0
2   2004-01-01  1           3       30.0
3   2004-01-01  1           4       40.0
4   2004-01-01  2           1       50.0
5   2004-01-01  2           2       60.0
6   2004-01-01  2           3       70.0
7   2004-01-01  2           4       80.0
8   2004-01-01  3           1       90.0
9   2004-01-01  3           2       100.0
10  2004-01-02  3           1       110.0
11  2004-01-02  3           2       120.0
12  2004-01-01  4           4       130.0
13  2004-01-02  4           5       140.0

# Create pivot table, with ['Date', 'Hour'] in a MultiIndex
res = df.pivot_table(columns='station_id', index=['Date', 'Hour'], values='Temp')

# Add 'Stn' prefix to each column name
res = res.add_prefix('Stn')

# Remove the name of the columns' index, which is 'station_id'
# (assigning None is more portable across pandas versions than `del res.columns.name`)
res.columns.name = None

# Reset MultiIndex into columns
res.reset_index(inplace=True)

res
        Date  Hour  Stn1  Stn2   Stn3   Stn4
0 2004-01-01     1  10.0  50.0   90.0    NaN
1 2004-01-01     2  20.0  60.0  100.0    NaN
2 2004-01-01     3  30.0  70.0    NaN    NaN
3 2004-01-01     4  40.0  80.0    NaN  130.0
4 2004-01-02     1   NaN   NaN  110.0    NaN
5 2004-01-02     2   NaN   NaN  120.0    NaN
6 2004-01-02     5   NaN   NaN    NaN  140.0
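
For anyone who wants to run the demo end to end, here is a minimal self-contained sketch of the same steps. The sample values mirror the table above; converting Date with pd.to_datetime and chaining the calls with rename_axis(None, axis=1) are my own choices for illustration, not part of the original answer. Keep in mind that pivot_table aggregates with the mean by default, so if a station ever had two readings for the same Date and Hour they would be averaged silently.

```python
import pandas as pd

# Sample data mirroring the demo table above (long format: one row per reading)
df = pd.DataFrame({
    "Date": ["2004-01-01"] * 10
            + ["2004-01-02", "2004-01-02", "2004-01-01", "2004-01-02"],
    "station_id": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4],
    "Hour": [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 1, 2, 4, 5],
    "Temp": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0,
             80.0, 90.0, 100.0, 110.0, 120.0, 130.0, 140.0],
})
df["Date"] = pd.to_datetime(df["Date"])

res = (
    df.pivot_table(index=["Date", "Hour"], columns="station_id", values="Temp")
      .add_prefix("Stn")            # 1 -> 'Stn1', 2 -> 'Stn2', ...
      .rename_axis(None, axis=1)    # drop the 'station_id' label on the columns
      .reset_index()                # turn the (Date, Hour) MultiIndex into columns
)
print(res)
```

Combinations of Date and Hour that a station never reported come out as NaN, as in the output above.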

Answer 2

For what it's worth, this gets me where I want to go.

# One dataframe per station
stations = [x for _, x in df.groupby('station_id')]
for i in range(len(stations)):
    stations[i].reset_index(drop=True, inplace=True)
    stations[i].rename(columns={'Temp': 'Stn' + str(i + 1)}, inplace=True)
    stations[i].drop(columns='station_id', inplace=True)
    if i > 0:
        # Keep Date and Hour only from the first station's frame
        stations[i].drop(columns=['Date', 'Hour'], inplace=True)
# Place the per-station frames side by side (rows are matched by position)
stations = pd.concat(stations, axis=1)

Feels a bit brute force to me, though. Additional pythonic suggestions welcome.
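
One possibly more idiomatic variant of the loop above, offered as a sketch rather than a definitive answer: index each station's readings by (Date, Hour) and let pd.concat align the rows on that index instead of on row position. That way a station with a missing hour gets NaN rather than shifting every later row. The helper name stations_to_wide and the tiny sample frame are hypothetical, and the sketch assumes each station has at most one reading per Date and Hour.

```python
import pandas as pd

def stations_to_wide(df):
    """Hypothetical helper: one Temp column per station, aligned on (Date, Hour)."""
    # Map 'Stn<id>' -> that station's Temp series indexed by (Date, Hour)
    pieces = {
        f"Stn{station_id}": group.set_index(["Date", "Hour"])["Temp"]
        for station_id, group in df.groupby("station_id")
    }
    # Given a dict, concat uses the keys as column names; axis=1 aligns on the index
    return pd.concat(pieces, axis=1).reset_index()

# Tiny made-up sample; station 2 has no reading for Hour 2 on purpose
df = pd.DataFrame({
    "Date": ["2004-01-01"] * 3,
    "station_id": [1, 1, 2],
    "Hour": [1, 2, 1],
    "Temp": [46.0, 46.0, 38.0],
})
wide = stations_to_wide(df)
print(wide)
```

Unlike the positional version, the column names here come from the actual station_id values rather than from the loop counter, which matters if the ids are not consecutive.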

Answer 1
0 2018-09-23 19:02:26

Answer 2
0 2018-09-23 19:11:59
