How to generate an ID number using Python?

I have a dataframe. I want to create a unique ID number for each person and a week-number column based on the date (the data is weekly).

import pandas as pd
df = pd.DataFrame({ 'name':['one','one','two','two','two','three','four'],
                     'date':['2019-05-01','2019-05-08','2019-05-01','2019-05-08','2019-05-15','2019-05-01','2019-05-15'],
                    "a":range(7)})
df['date'] = pd.to_datetime(df['date'],yearfirst=True)
df = df.sort_values(['name','date'])
print(df)

This is the data:

    name       date  a
6   four 2019-05-15  6
0    one 2019-05-01  0
1    one 2019-05-08  1
5  three 2019-05-01  5
2    two 2019-05-01  2
3    two 2019-05-08  3
4    two 2019-05-15  4

The expected result is:

    name       date  a    id    week
6   four 2019-05-15  6     1    3
0    one 2019-05-01  0     2    1
1    one 2019-05-08  1     2    2
5  three 2019-05-01  5     3    1 
2    two 2019-05-01  2     4    1
3    two 2019-05-08  3     4    2
4    two 2019-05-15  4     4    3

How can I get the "id" and "week" columns? Thank you!

As @cs95 commented, use GroupBy.ngroup for the ID. For the week, divide the day of the month by 7 and round up with numpy.ceil:

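import numpy as np
# Number the name groups in sorted order, starting at 1
df["Id"] = df.groupby("name").ngroup() + 1
# Week of the month: day of the month divided by 7, rounded up
df['week'] = np.ceil(df.date.dt.day / 7).astype(int)
print(df)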

    name       date  a  Id  week
6   four 2019-05-15  6   1     3
0    one 2019-05-01  0   2     1
1    one 2019-05-08  1   2     2
5  three 2019-05-01  5   3     1
2    two 2019-05-01  2   4     1
3    two 2019-05-08  3   4     2
4    two 2019-05-15  4   4     3
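
Note that np.ceil(day / 7) gives the week of the month, so the numbering restarts when the dates cross into a new month. If the data may span several months, one possible alternative is to count whole weeks elapsed since the earliest date. The following is only a sketch with made-up dates; the column names week_of_month and elapsed_week are illustrative and not part of the question:

```python
import numpy as np
import pandas as pd

# Hypothetical dates, one week apart, that cross a month boundary
df = pd.DataFrame({"date": pd.to_datetime(["2019-05-29", "2019-06-05"])})

# Week of the month: restarts at 1 in June
df["week_of_month"] = np.ceil(df["date"].dt.day / 7).astype(int)

# Whole weeks elapsed since the earliest date, 1-based: keeps counting
df["elapsed_week"] = (df["date"] - df["date"].min()).dt.days // 7 + 1

print(df)
```

For the sample data in the question, which stays within May 2019, both columns would give the same values.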

Or:

df["Id"] = df.groupby("name").ngroup() + 1
df['week'] =  df.groupby("date").ngroup() + 1
print (df)

    name       date  a  Id  week
6   four 2019-05-15  6   1     3
0    one 2019-05-01  0   2     1
1    one 2019-05-08  1   2     2
5  three 2019-05-01  5   3     1
2    two 2019-05-01  2   4     1
3    two 2019-05-08  3   4     2
4    two 2019-05-15  4   4     3
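
Keep in mind that groupby("date").ngroup() numbers the distinct dates in sorted order; it does not measure calendar weeks. If a week has no rows at all, the numbering simply skips over the gap. A small sketch with made-up dates, where 2019-05-08 is missing:

```python
import pandas as pd

# Hypothetical data: no rows for the week of 2019-05-08
df = pd.DataFrame({"date": pd.to_datetime(["2019-05-01", "2019-05-15", "2019-05-15"])})

# Each distinct date gets the next consecutive number
df["week"] = df.groupby("date").ngroup() + 1

print(df)
```

Here 2019-05-15 is numbered 2, although it is two weeks after the first date. That is fine when every week is present, as in the question's data.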

I use cumsum to get df['id'] and groupby on df.date to get df['week']. The cumsum step starts a new ID every time the name changes from one row to the next, so it relies on df being sorted by name, as it is above:

df['id'] = df.name.ne(df.name.shift()).cumsum()
df['week'] = df.date.groupby(df.date).ngroup() + 1


Out[408]:
    name       date  a  id  week
6   four 2019-05-15  6   1     3
0    one 2019-05-01  0   2     1
1    one 2019-05-08  1   2     2
5  three 2019-05-01  5   3     1
2    two 2019-05-01  2   4     1
3    two 2019-05-08  3   4     2
4    two 2019-05-15  4   4     3
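
To illustrate why the cumsum approach needs sorted data, here is a sketch on a made-up, unsorted frame (the column names id_cumsum and id_ngroup are only for illustration). The same name appearing in two separate runs gets two different IDs from cumsum, while ngroup is unaffected by row order:

```python
import pandas as pd

# Hypothetical unsorted data: "one" appears in two separate runs
df = pd.DataFrame({"name": ["one", "two", "one"]})

# Increments whenever the name differs from the previous row
df["id_cumsum"] = df["name"].ne(df["name"].shift()).cumsum()

# Numbers the distinct names in sorted order, regardless of row order
df["id_ngroup"] = df.groupby("name").ngroup() + 1

print(df)
```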
