
Grouping data using unique combinations

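In the dataset below, I need to find the unique sequences (the distinct combinations of age and maritalstatus) and assign a sequence number to each one.

Dataset: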

user  age    maritalstatus  product
A     Young  married        111
B     young  married        222
C     young  Single         111
D     old    single         222
E     old    married        111
F     teen   married        222
G     teen   married        555
H     adult  single         444
I     adult  single         333

Unique sequences:

young   married     0
young   single      1
old     single      2
old     married     3
teen    married     4
adult   single      5
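
The numbering above can be reproduced with a plain dictionary. This is only a minimal sketch: the names rows, normalize and assign_sequence_ids are made up for illustration, and it assumes that labels should be compared case-insensitively (so "Young" and "young" count as the same value) and that ids are handed out in order of first appearance.

```python
# Rows copied from the question: (user, age, maritalstatus, product).
rows = [
    ("A", "Young", "married", 111),
    ("B", "young", "married", 222),
    ("C", "young", "Single", 111),
    ("D", "old", "single", 222),
    ("E", "old", "married", 111),
    ("F", "teen", "married", 222),
    ("G", "teen", "married", 555),
    ("H", "adult", "single", 444),
    ("I", "adult", "single", 333),
]


def normalize(value):
    """Strip and lower-case a label (assumption: case differences are noise)."""
    return value.strip().lower()


def assign_sequence_ids(rows):
    """Map each distinct (age, maritalstatus) pair to an id, in first-seen order."""
    ids = {}
    for _user, age, status, _product in rows:
        key = (normalize(age), normalize(status))
        if key not in ids:
            ids[key] = len(ids)  # next unused number
    return ids


sequence_ids = assign_sequence_ids(rows)
for (age, status), seq in sequence_ids.items():
    print(age, status, seq)
```

Because dictionaries keep insertion order (Python 3.7 and later), the loop prints the pairs in the same order as the list above.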

After finding the unique values as shown above, if I pass in a new dataframe (newdataframe) like the one below,

user    age maritalstatus  
A      Young   married 
X      young   Single  
D      old     single  
Z      old     married

it should return the products to me as a list.

A: [222] - user A has already purchased 111, and the matching sequence also contains 222, so 222 is returned.
X: [111, 222]
D: [] - only one row has this sequence, and D has already purchased its product 222, so nothing is returned.
Z: [111] - matches the sequence of user E, so 111 is returned.

If there is no matching sequence, as in the row below,

user  age    maritalstatus
Y     adult  married

it should give me an empty list:

Y: []

You can use sets: the sets module provides classes for constructing and manipulating unordered collections of unique elements.

See: https://docs.python.org/2/library/sets.html
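
As a rough illustration of the set-based idea, the sketch below uses the built-in set type rather than the Python 2 sets module linked above, which was deprecated in Python 2.6 and does not exist in Python 3. The names purchases, key_of, build_index and suggest are hypothetical, plain tuples stand in for dataframe rows, and the rule is an assumption read from the question: return every product bought by anyone with the same (age, maritalstatus) pair, minus what the user already owns, comparing labels case-insensitively.

```python
from collections import defaultdict

# Rows copied from the question: (user, age, maritalstatus, product).
purchases = [
    ("A", "Young", "married", 111),
    ("B", "young", "married", 222),
    ("C", "young", "Single", 111),
    ("D", "old", "single", 222),
    ("E", "old", "married", 111),
    ("F", "teen", "married", 222),
    ("G", "teen", "married", 555),
    ("H", "adult", "single", 444),
    ("I", "adult", "single", 333),
]


def key_of(age, status):
    """Case-insensitive grouping key (assumption: 'Young' equals 'young')."""
    return (age.strip().lower(), status.strip().lower())


def build_index(rows):
    """Collect the set of products per group and per user."""
    products_by_group = defaultdict(set)
    products_by_user = defaultdict(set)
    for user, age, status, product in rows:
        products_by_group[key_of(age, status)].add(product)
        products_by_user[user].add(product)
    return products_by_group, products_by_user


def suggest(new_rows, products_by_group, products_by_user):
    """Return {user: sorted products of the matching group not yet owned}."""
    result = {}
    for user, age, status in new_rows:
        group = products_by_group.get(key_of(age, status), set())
        owned = products_by_user.get(user, set())
        result[user] = sorted(group - owned)  # set difference
    return result


new_users = [
    ("A", "Young", "married"),
    ("X", "young", "Single"),
    ("D", "old", "single"),
    ("Z", "old", "married"),
    ("Y", "adult", "married"),
]

by_group, by_user = build_index(purchases)
suggestions = suggest(new_users, by_group, by_user)
for user, products in suggestions.items():
    print(user, products)
```

Under this reading A gets [222], D and Y get [], and Z gets [111], as in the question. X comes out as [111], because C (product 111) is the only young, single buyer in the sample data; the question lists [111, 222] for X, so the intended rule may differ from the one assumed here.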
