Create unique identifier in dataframe based on combination of columns
I have the following dataframe:
id Lat Lon Year Area State
50319 -36.0629 -62.3423 2019 90 Iowa
18873 -36.0629 -62.3423 2017 90 Iowa
18876 -36.0754 -62.327 2017 124 Illinois
18878 -36.0688 -62.3353 2017 138 Kansas
I want to create a new column which assigns a unique identifier based on whether the columns Lat, Lon and Area have the same values. E.g. in this case rows 1 and 2 have the same values in those columns and will be given the same unique identifier 0_Iowa, where Iowa comes from the State column. I tried using a for loop, but is there a more pythonic way to do it?
Expected output:
id Lat Lon Year Area State unique_id
50319 -36.0629 -62.3423 2019 90 Iowa 0_Iowa
18873 -36.0629 -62.3423 2017 90 Iowa 0_Iowa
18876 -36.0754 -62.327 2017 124 Illinois 1_Illinois
18878 -36.0688 -62.3353 2017 138 Kansas 2_Kansas
I'd go with groupby.ngroup, setting sort=False for the grouping, and str.cat to concatenate with State, setting a separator:
df['unique_id'] = (df.groupby(['Lat', 'Lon', 'Area'], sort=False)
                     .ngroup()
                     .astype(str)
                     .str.cat(df['State'], sep='_'))
print(df)
id Lat Lon Year Area State unique_id
0 50319 -36.0629 -62.3423 2019 90 Iowa 0_Iowa
1 18873 -36.0629 -62.3423 2017 90 Iowa 0_Iowa
2 18876 -36.0754 -62.3270 2017 124 Illinois 1_Illinois
3 18878 -36.0688 -62.3353 2017 138 Kansas 2_Kansas
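For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the sample rows in the question, and the `key_cols` name is only an illustrative helper, not something from the original post.

```python
import pandas as pd

# Sample data retyped from the question.
df = pd.DataFrame({
    "id": [50319, 18873, 18876, 18878],
    "Lat": [-36.0629, -36.0629, -36.0754, -36.0688],
    "Lon": [-62.3423, -62.3423, -62.3270, -62.3353],
    "Year": [2019, 2017, 2017, 2017],
    "Area": [90, 90, 124, 138],
    "State": ["Iowa", "Iowa", "Illinois", "Kansas"],
})

# Hypothetical helper name: the columns whose combination defines a group.
key_cols = ["Lat", "Lon", "Area"]

# ngroup() numbers each distinct key combination; sort=False keeps the
# numbering in order of first appearance. str.cat then appends State.
df["unique_id"] = (
    df.groupby(key_cols, sort=False)
      .ngroup()
      .astype(str)
      .str.cat(df["State"], sep="_")
)

print(df["unique_id"].tolist())
# ['0_Iowa', '0_Iowa', '1_Illinois', '2_Kansas']
```

Note that the grouping compares Lat and Lon as floats for exact equality, which is fine for this sample but may be worth rounding first if the coordinates come from a noisy source.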
You can do groupby.ngroup and add the column State:
df['unique_id'] = (df.groupby(['Lat', 'Lon', 'Area'], sort=False).ngroup().astype(str)
                   + '_' + df['State'])
print(df)
id Lat Lon Year Area State unique_id
0 50319 -36.0629 -62.3423 2019 90 Iowa 0_Iowa
1 18873 -36.0629 -62.3423 2017 90 Iowa 0_Iowa
2 18876 -36.0754 -62.3270 2017 124 Illinois 1_Illinois
3 18878 -36.0688 -62.3353 2017 138 Kansas 2_Kansas
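Both answers pass sort=False, and it may help to see why. The sketch below, using only the three key columns retyped from the question (the variable names are illustrative), compares the group numbers with and without it. With the default sort=True, groups are numbered by sorted key order, so the Iowa rows would not get 0.

```python
import pandas as pd

# Only the grouping columns, retyped from the question's sample data.
df = pd.DataFrame({
    "Lat": [-36.0629, -36.0629, -36.0754, -36.0688],
    "Lon": [-62.3423, -62.3423, -62.3270, -62.3353],
    "Area": [90, 90, 124, 138],
})
keys = ["Lat", "Lon", "Area"]

# sort=False: groups are numbered in order of first appearance.
by_appearance = df.groupby(keys, sort=False).ngroup().tolist()

# Default (sort=True): groups are numbered by sorted key values,
# so the smallest Lat (-36.0754) becomes group 0.
by_sorted_keys = df.groupby(keys).ngroup().tolist()

print(by_appearance)   # [0, 0, 1, 2]
print(by_sorted_keys)  # [2, 2, 0, 1]
```

Either numbering gives a valid identifier; sort=False is only needed to reproduce the exact labels shown in the question.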