How to generate unique ids when mapping pairs of values from two sets in Python
I have two columns: the first is Cluster, the second is VehicleID.
Cluster VehicleID
435 1 2
264 1 1
444 1 1
302 1 1
191 1 1
383 1 1
81 1 1
142 2 1
6 2 1
420 2 1
153 2 1
42 2 2
168 2 1
292 2 2
138 2 2
65 2 2
316 2 1
219 2 1
329 2 1
371 3 1
Basically, this says that cluster 1 has two vehicles, [1, 2], and cluster 2 has one vehicle. The above table is a small sample. So, if I have cluster 1: [1, 2] and cluster 2: [1], what I want is for cluster 1's vehicle 1 to be mapped to 1 and its vehicle 2 to 2, but cluster 2's vehicle 1 should be mapped to 3.
In short, the new IDs should be sequential across the whole table, irrespective of the "Cluster" column.
I can't figure out where I am going wrong. Kindly help.
You may begin by sorting, so that you can use diff to find where the IDs change, and then use cumsum to turn those change points into cumulative IDs.
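# df is the DataFrame shown in the question
initial = df.index  # remember the original row order
df = df.sort_values(['Cluster', 'VehicleID'])
# a new ID starts wherever VehicleID or Cluster differs from the previous row;
# ne(0) also catches Cluster values that jump by more than 1
df['new-ID'] = (df.VehicleID.diff().ne(0) | df.Cluster.diff().ne(0)).cumsum()
df.loc[initial]  # back to initial ordering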
Cluster VehicleID new-ID
435 1 2 2
264 1 1 1
444 1 1 1
302 1 1 1
191 1 1 1
383 1 1 1
81 1 1 1
142 2 1 3
6 2 1 3
420 2 1 3
153 2 1 3
42 2 2 4
168 2 1 3
292 2 2 4
138 2 2 4
65 2 2 4
316 2 1 3
219 2 1 3
329 2 1 3
371 3 1 5
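For anyone who wants to try this without the full data, here is a minimal self-contained sketch. It assumes a hand-picked subset of the rows above (seven rows, with their original index labels), and it also shows groupby(...).ngroup() as an alternative that is not part of the answer above but should give the same numbering, since groups are numbered in sorted key order by default.

```python
import pandas as pd

# Assumed subset of the question's sample rows, index labels kept as-is.
df = pd.DataFrame(
    {
        "Cluster": [1, 1, 1, 2, 2, 2, 3],
        "VehicleID": [2, 1, 1, 1, 2, 2, 1],
    },
    index=[435, 264, 444, 142, 42, 292, 371],
)

# Sort, flag rows where either key differs from the previous row, then cumsum.
ordered = df.sort_values(["Cluster", "VehicleID"])
changed = ordered["VehicleID"].diff().ne(0) | ordered["Cluster"].diff().ne(0)
# Assignment aligns on the index, so df keeps its original row order.
df["new-ID"] = changed.cumsum()

# Alternative (not from the answer): number each (Cluster, VehicleID) group.
# ngroup() is 0-based, hence the + 1.
df["ngroup-ID"] = df.groupby(["Cluster", "VehicleID"]).ngroup() + 1

print(df)
```

Both columns come out identical on this subset. The ngroup version avoids the explicit sort and the reordering step, at the cost of hiding how the IDs are derived.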