Add rows to a data frame automatically

Could someone tell me how to add rows to this data frame automatically? I have a data frame df:

                                   frequency
enrollment_id event      days           
1             access     2             3
                         7             8
                         9             4
                         10            3
                         12            2
                         15           21
                         18            4
                         19            8
                         20           20
                         22           16
                         23            2
                         28            2
                         29           14
              navigate   2             1
                         7             4
                         9             1
                         10            3
                         11            1
                         12            1
                         15            5
                         18            1
                         19            1
                         22            3
                         23            1
                         28            1
                         29            2
              page_close 2             1
                         7             6
                         9             2
                         10            3
...                                  ...
200881        navigate   28            1
200882        discussion 28            4
              navigate   28            4
200883        access     28            2
              navigate   28            2
              page_close 28            1
200885        navigate   21            1
200887        access     21            3
              navigate   21            2
              page_close 21            1
              video      21            1
200888        access     21            2
              discussion 21            1
              navigate   21            5
              page_close 21            1
              video      21            1
              wiki       21            1
200889        navigate   21            1
200893        navigate   21            2
200895        navigate   21            1
200896        navigate   21            1
200897        navigate   21            1
200898        navigate   21            1
200900        navigate   21            1
200901        access     21            3
              navigate   21            2
              page_close 21            2
              video      21            1
200904        navigate   21            1
200905        navigate   21            1

This df has three index levels (event, days and enrollment_id) and only one column, frequency:

  1. event has 7 different values, such as access, remove, etc.

  2. days has 30 different values, 0 - 29 (not every event has all of 0 - 29; some events only have, for example, 0, 1, 4).

  3. enrollment_id has a lot of different values (maybe 100000). Likewise, not every day has every enrollment_id.

My question is: how can I add all the missing rows?

For example, if I have this:

                                     frequency
enrollment_id event      days           
1             access     2             3
                         7             8

I need to add rows for:

                               frequency
enrollment_id event      days           
1             access     0             0
                         1             0
                         3             0
                         4             0
                         5             0
                         6             0
                         ...           ...
                         29            0

I also need to add rows for day 0 with frequency 0 for all other enrollment_id values, and all rows for access with days 0 - 29 and enrollment_id from 1 to the maximum.

I really want to get this answer. I really appreciate your help!!

EDIT:

If you need to add the missing days only to the last level, days, use reindex with unstack + stack:

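#move days into columns, add every day 0-29 as a column, then move days back
#fill_value=0 in unstack covers days that exist for only some groups
df = (df['frequency']
        .unstack(fill_value=0)
        .reindex(columns=list(range(30)), fill_value=0)
        .stack()
        .to_frame('frequency'))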
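
As a rough illustration of the unstack + reindex + stack approach, here is a minimal sketch on a three-row miniature of the frame. The sample values and the names small, all_days and filled are made up for this example. It also assumes that passing fill_value=0 to unstack is wanted, so that a day present for one group but not another ends up as 0 instead of a missing value.

```python
import pandas as pd

# Hypothetical miniature of the frame in the question (values are made up).
idx = pd.MultiIndex.from_tuples(
    [(1, "access", 2), (1, "access", 7), (1, "navigate", 2)],
    names=["enrollment_id", "event", "days"],
)
small = pd.DataFrame({"frequency": [3, 8, 1]}, index=idx)

# Every day we want to see for each (enrollment_id, event) pair.
all_days = pd.Index(range(30), name="days")

filled = (
    small["frequency"]
    .unstack(fill_value=0)                    # days become columns
    .reindex(columns=all_days, fill_value=0)  # add the missing day columns
    .stack()                                  # days go back into the index
    .to_frame("frequency")
)
```

Only (enrollment_id, event) pairs that already exist in the data are expanded here; pairs that never occur are not created by this approach.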

If you need to add all combinations of all levels, reindex by a new MultiIndex created with from_product:

#get all unique values of all levels
a = df.index.get_level_values('enrollment_id').unique()
b = df.index.get_level_values('event').unique()
c = df.index.get_level_values('days').unique()

Or you can supply your own values for each level, in the same order as the index levels (enrollment_id, event, days):

a = range(1, df.index.get_level_values('enrollment_id').max() + 1)
b = ['access', 'remove']
c = range(30)

import pandas as pd

#build every (enrollment_id, event, days) combination
mux = pd.MultiIndex.from_product([a,b,c], names=df.index.names)

#for missing values add 0
df = df.reindex(mux, fill_value=0)
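
To make the from_product approach concrete, here is a small sketch on made-up data. The sample rows and the names small, ids, events, days, mux and full are assumptions for illustration only; the real frame would use its own level values.

```python
import pandas as pd

# Hypothetical miniature of the frame in the question (values are made up).
idx = pd.MultiIndex.from_tuples(
    [(1, "access", 2), (1, "access", 7), (3, "navigate", 2)],
    names=["enrollment_id", "event", "days"],
)
small = pd.DataFrame({"frequency": [3, 8, 1]}, index=idx)

# One iterable per index level, in the same order as the levels.
max_id = int(small.index.get_level_values("enrollment_id").max())
ids = range(1, max_id + 1)       # 1 .. max enrollment_id, including absent ids
events = ["access", "navigate"]  # the events to expand
days = range(30)                 # days 0 .. 29

# Cartesian product of all level values, then fill the gaps with 0.
mux = pd.MultiIndex.from_product([ids, events, days], names=small.index.names)
full = small.reindex(mux, fill_value=0)
```

The result has one row per combination, so its size is the product of the level sizes. With the figures mentioned in the question (about 100000 enrollment IDs, 7 events, 30 days) that would be roughly 21 million rows, which may be worth checking against available memory first.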
