Add rows to a data frame automatically
Could someone tell me how to add rows to this dataframe automatically? I have a data frame df:
                              frequency
enrollment_id event      days
1             access     2            3
                         7            8
                         9            4
                         10           3
                         12           2
                         15          21
                         18           4
                         19           8
                         20          20
                         22          16
                         23           2
                         28           2
                         29          14
              navigate   2            1
                         7            4
                         9            1
                         10           3
                         11           1
                         12           1
                         15           5
                         18           1
                         19           1
                         22           3
                         23           1
                         28           1
                         29           2
              page_close 2            1
                         7            6
                         9            2
                         10           3
...                                 ...
200881        navigate   28           1
200882        discussion 28           4
              navigate   28           4
200883        access     28           2
              navigate   28           2
              page_close 28           1
200885        navigate   21           1
200887        access     21           3
              navigate   21           2
              page_close 21           1
              video      21           1
200888        access     21           2
              discussion 21           1
              navigate   21           5
              page_close 21           1
              video      21           1
              wiki       21           1
200889        navigate   21           1
200893        navigate   21           2
200895        navigate   21           1
200896        navigate   21           1
200897        navigate   21           1
200898        navigate   21           1
200900        navigate   21           1
200901        access     21           3
              navigate   21           2
              page_close 21           2
              video      21           1
200904        navigate   21           1
200905        navigate   21           1
This df has three index levels (event, days and enrollment_id) and only one column, frequency.
event has 7 different values, such as access, remove, etc.
days has 30 different values, 0 - 29, but not every event has all of 0 - 29; some events only have, for example, 0, 1 and 4.
enrollment_id has a lot of different values (maybe 100000). Likewise, not every day has every enrollment_id.
My question is: how can I add all the missing rows? For example, if I have this:
                              frequency
enrollment_id event      days
1             access     2            3
                         7            8
I need to add rows for:
                              frequency
enrollment_id event      days
1             access     0            0
                         1            0
                         3            0
                         4            0
                         5            0
                         6            0
...                                 ...
                         29           0
I also need to add rows for day 0 with every other enrollment_id and frequency 0, and all rows for access with days 0 - 29 and enrollment_id from 1 to the max.
I really want to get this answer. I really appreciate your help!!
EDIT:
If you need to add missing days only to the last level, days, use reindex with unstack + stack:
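# unstack moves days into columns; fill_value=0 also fills gaps in days that already exist
df = (df['frequency'].unstack(fill_value=0)
        .reindex(columns=list(range(30)), fill_value=0)
        .stack()
        .to_frame('frequency'))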
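As a rough illustration of the unstack / reindex / stack idea, here is a minimal sketch on a tiny made-up frame. The data values are hypothetical and only mimic the shape of the question's df. It passes fill_value=0 to unstack as well, on the assumption that days present for one (enrollment_id, event) pair but absent for another should also become 0 rather than be dropped.

```python
import pandas as pd

# Hypothetical toy data shaped like the question's df (values are illustrative).
idx = pd.MultiIndex.from_tuples(
    [(1, "access", 2), (1, "access", 7), (1, "navigate", 2), (2, "navigate", 28)],
    names=["enrollment_id", "event", "days"],
)
df = pd.DataFrame({"frequency": [3, 8, 1, 4]}, index=idx)

filled = (
    df["frequency"]
    .unstack(fill_value=0)                     # days become columns, gaps become 0
    .reindex(columns=range(30), fill_value=0)  # add the days that never occur
    .stack()                                   # back to one row per day
    .to_frame("frequency")
)

print(filled.loc[(1, "access")].head(8))
```

Only the (enrollment_id, event) pairs already present in df are kept here; each of them simply gets all 30 days.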
If you need to add all combinations of all levels, reindex by a new MultiIndex created by from_product:
#get all unique values of all levels
a = df.index.get_level_values('enrollment_id').unique()
b = df.index.get_level_values('event').unique()
c = df.index.get_level_values('days').unique()
Or you can supply your own values, for example:
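import pandas as pd

# levels must be listed in the same order as df.index.names (enrollment_id, event, days)
a = range(1, df.index.get_level_values('enrollment_id').max() + 1)
b = ['access', 'remove']  # list every event name you need
c = range(30)
mux = pd.MultiIndex.from_product([a,b,c], names=df.index.names)
#for missing values add 0
df = df.reindex(mux, fill_value=0)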
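The from_product approach can be tried end to end on a small frame. The sketch below uses made-up data, and the variable names ids, events and days are illustrative; it assumes the index levels are ordered enrollment_id, event, days, as in the printed df.

```python
import pandas as pd

# Hypothetical toy data; enrollment_id 2 is deliberately missing.
idx = pd.MultiIndex.from_tuples(
    [(1, "access", 2), (1, "access", 7), (3, "navigate", 2)],
    names=["enrollment_id", "event", "days"],
)
df = pd.DataFrame({"frequency": [3, 8, 1]}, index=idx)

# One iterable per level, in the same order as df.index.names.
ids = range(1, df.index.get_level_values("enrollment_id").max() + 1)
events = df.index.get_level_values("event").unique()
days = range(30)

mux = pd.MultiIndex.from_product([ids, events, days], names=df.index.names)
full = df.reindex(mux, fill_value=0)  # combinations not in df get frequency 0

print(len(full))
```

Keep in mind that the full product grows quickly: with roughly 100000 enrollment_id values, 7 events and 30 days it would hold about 21 million rows.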