Combining data from different rows based on the cell content and creating new columns based on the cell values with pandas and python
Creating new rows and new columns based on row data in pandas
I have a dataframe with one row for each time a user joins my site and each time a user makes a purchase.
+---+-----+--------------------+---------+--------+-----+
| | uid | msg | _time | gender | age |
+---+-----+--------------------+---------+--------+-----+
| 0 | 1 | confirmed_settings | 1/29/15 | M | 37 |
| 1 | 1 | sale | 4/13/15 | M | 37 |
| 2 | 3 | confirmed_settings | 4/19/15 | M | 35 |
| 3 | 4 | confirmed_settings | 2/21/15 | M | 21 |
| 4 | 5 | confirmed_settings | 3/28/15 | M | 18 |
| 5 | 4 | sale | 3/15/15 | M | 21 |
+---+-----+--------------------+---------+--------+-----+
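I want to reshape the dataframe so that each row is a unique uid, with columns named sale and confirmed_settings that hold the timestamp of the corresponding action. Note that not every user has a sale, but every user has a confirmed_settings. Like this: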
+---+-----+--------------------+---------+---------+--------+-----+
| | uid | confirmed_settings | sale | _time | gender | age |
+---+-----+--------------------+---------+---------+--------+-----+
| 0 | 1 | 1/29/15 | 4/13/15 | 1/29/15 | M | 37 |
| 1 | 3 | 4/19/15 | null | 4/19/15 | M | 35 |
| 2 | 4 | 2/21/15 | 3/15/15 | 2/21/15 | M | 21 |
| 3 | 5 | 3/28/15 | null | 3/28/15 | M | 18 |
+---+-----+--------------------+---------+---------+--------+-----+
What is the best pandas idiom/function to accomplish this?
I don't know whether this is the best solution, but it should work:
In [1]: df
Out[1]:
   uid                 msg    _time gender  age
0    1  confirmed_settings  1/29/15      M   37
1    1                sale  4/13/15      M   37
2    3  confirmed_settings  4/19/15      M   35
3    4  confirmed_settings  2/21/15      M   21
4    5  confirmed_settings  3/28/15      M   18
5    4                sale  3/15/15      M   21
In [2]: df1 = df.pivot(index='uid', columns='msg', values='_time').reset_index()
In [3]: df1 = df1.merge(df[['uid', 'gender', 'age']].drop_duplicates(), on='uid')
In [4]: df1
Out[4]:
   uid confirmed_settings     sale gender  age
0    1            1/29/15  4/13/15      M   37
1    3            4/19/15      NaN      M   35
2    4            2/21/15  3/15/15      M   21
3    5            3/28/15      NaN      M   18
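For anyone who wants to run this outside an interactive session, here is a minimal self-contained sketch of the same pivot-then-merge approach. The sample data is copied from the question. The names wide, users and result are only illustrative, and filling _time with the confirmed_settings timestamp is an assumption based on the desired output shown in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "uid": [1, 1, 3, 4, 5, 4],
    "msg": ["confirmed_settings", "sale", "confirmed_settings",
            "confirmed_settings", "confirmed_settings", "sale"],
    "_time": ["1/29/15", "4/13/15", "4/19/15", "2/21/15", "3/28/15", "3/15/15"],
    "gender": ["M"] * 6,
    "age": [37, 37, 35, 21, 18, 21],
})

# One row per uid, one column per distinct msg value, holding the timestamp.
wide = df.pivot(index="uid", columns="msg", values="_time").reset_index()
wide.columns.name = None  # drop the leftover "msg" label on the column axis

# Per-user attributes, reduced to one row per uid.
users = df[["uid", "gender", "age"]].drop_duplicates()

# Attach the per-user attributes to the reshaped timestamps.
result = wide.merge(users, on="uid")

# Assumption: the question's _time column mirrors the confirmed_settings time.
result["_time"] = result["confirmed_settings"]
result = result[["uid", "confirmed_settings", "sale", "_time", "gender", "age"]]

print(result)
```

Users without a sale row end up with NaN in the sale column. Note that pivot raises a ValueError if the same uid has the same msg more than once; in that case something like pivot_table with aggfunc="first" may be a better fit, depending on which timestamp you want to keep.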