Grouping data using unique combinations
In my dataset below, I need to find the unique sequences and assign each one a sequence number.
Dataset:
user age maritalstatus product
A Young married 111
B young married 222
C young Single 111
D old single 222
E old married 111
F teen married 222
G teen married 555
H adult single 444
I adult single 333
Unique sequences:
young married 0
young single 1
old single 2
old married 3
teen married 4
adult single 5
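The numbering above can be reproduced with a small sketch in plain Python. It assumes that a "sequence" is the (age, maritalstatus) pair, that comparison is case-insensitive (so "Young" and "young" are the same), and that numbers are handed out in order of first appearance. The names rows, normalize and number_sequences are hypothetical, chosen only for illustration.

```python
# Rows copied from the dataset above: (user, age, maritalstatus, product).
rows = [
    ("A", "Young", "married", 111),
    ("B", "young", "married", 222),
    ("C", "young", "Single", 111),
    ("D", "old", "single", 222),
    ("E", "old", "married", 111),
    ("F", "teen", "married", 222),
    ("G", "teen", "married", 555),
    ("H", "adult", "single", 444),
    ("I", "adult", "single", 333),
]


def normalize(age, status):
    """Lower-case both fields (assumption: matching ignores case)."""
    return (age.strip().lower(), status.strip().lower())


def number_sequences(rows):
    """Map each unique (age, maritalstatus) pair to a number, in first-seen order."""
    seq_ids = {}
    for _user, age, status, _product in rows:
        key = normalize(age, status)
        if key not in seq_ids:
            seq_ids[key] = len(seq_ids)  # next unused number
    return seq_ids


seq_ids = number_sequences(rows)
for (age, status), number in seq_ids.items():
    print(age, status, number)
```

With a pandas DataFrame the same idea would probably be expressed through a group-by on the two columns, but the dictionary version keeps the logic visible.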
After finding the unique values as shown above, if I pass in a dataframe like the one below (newdataframe):
user age maritalstatus
A Young married
X young Single
D old single
Z old married
it should return the products to me as a list.
A: [222] - user A has already purchased 111, and the matching sequence also contains 222, so 222 is returned.
X: [111, 222]
D: [] - there is only one row with this sequence, and D has already purchased its product 222, so nothing is returned.
Z: [111] - matches the sequence of user E, so 111 is returned.
If there is no matching sequence, as below:
user age maritalstatus
Y adult married
it should give me an empty list:
Y : []
You can use sets - the module provides classes for constructing and manipulating unordered collections of unique elements.
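One way to read this suggestion is sketched below. It uses the built-in set type (in Python 3 the old sets module no longer exists) and assumes, as the question implies, that age and marital status are matched case-insensitively. The names rows, queries, normalize, build_index and recommend are hypothetical helpers, not part of any library.

```python
from collections import defaultdict

# Rows copied from the question: (user, age, maritalstatus, product).
rows = [
    ("A", "Young", "married", 111),
    ("B", "young", "married", 222),
    ("C", "young", "Single", 111),
    ("D", "old", "single", 222),
    ("E", "old", "married", 111),
    ("F", "teen", "married", 222),
    ("G", "teen", "married", 555),
    ("H", "adult", "single", 444),
    ("I", "adult", "single", 333),
]

# Lookup rows from the question: (user, age, maritalstatus).
queries = [
    ("A", "Young", "married"),
    ("X", "young", "Single"),
    ("D", "old", "single"),
    ("Z", "old", "married"),
    ("Y", "adult", "married"),
]


def normalize(age, status):
    """Lower-case both fields (assumption: matching ignores case)."""
    return (age.strip().lower(), status.strip().lower())


def build_index(rows):
    """Collect the products seen per sequence and the products bought per user."""
    products_by_seq = defaultdict(set)
    purchased_by_user = defaultdict(set)
    for user, age, status, product in rows:
        products_by_seq[normalize(age, status)].add(product)
        purchased_by_user[user].add(product)
    return products_by_seq, purchased_by_user


def recommend(queries, products_by_seq, purchased_by_user):
    """For each query, return the sequence's products minus the user's own purchases."""
    result = {}
    for user, age, status in queries:
        candidates = products_by_seq.get(normalize(age, status), set())
        already_bought = purchased_by_user.get(user, set())
        result[user] = sorted(candidates - already_bought)  # set difference
    return result


products_by_seq, purchased_by_user = build_index(rows)
recs = recommend(queries, products_by_seq, purchased_by_user)
for user, products in recs.items():
    print(user, products)
```

Under these assumptions the sketch gives [222] for A, [] for D, [111] for Z and [] for Y, as the question expects. For X it gives [111] rather than [111, 222], because the only young/single row in the dataset (user C) has product 111; the question may intend a different matching rule for that case.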