Assign unique ID from two columns where values can be in inverse order on different rows

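I have a df, and I would like to create a unique ID for each combination of IDs that appears in two columns, where the same pair can show up on different rows in inverse order. How could one do this? Thanks for your help.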

DF:

>     CALL_NO    DUP_OF_CALL_NO
>     171573729  171573731
>     171573729  171573731
>     171630085  171630084
>     171630085  171630084
>     171573731  171573729
>     171573731  171573729
>     171630084  171630085  
>     171630084  171630085  
>     171573731  171573729
>     171573731  171573729

Desired output:

>     ID  CALL_NO    DUP_OF_CALL_NO
>     1   171573729  171573731
>     1   171573729  171573731
>     2   171630085  171630084
>     2   171630085  171630084
>     1   171573731  171573729
>     1   171573731  171573729
>     2   171630084  171630085  
>     2   171630084  171630085  
>     1   171573731  171573729
>     1   171573731  171573729

df$ID <- as.numeric(factor(apply(df, 1, function(x) toString(sort(x)))))
> df
     CALL_NO DUP_OF_CALL_NO ID
1  171573729      171573731  1
2  171573729      171573731  1
3  171630085      171630084  2
4  171630085      171630084  2
5  171573731      171573729  1
6  171573731      171573729  1
7  171630084      171630085  2
8  171630084      171630085  2
9  171573731      171573729  1
10 171573731      171573729  1

Or you can also use do.call:

as.numeric(factor(do.call(function(x,y)paste(pmin(x,y),pmax(x,y)),unname(df))))
[1] 1 1 2 2 1 1 2 2 1 1
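
To try the two base R approaches above on the question's data, the frame can be rebuilt as in the sketch below. The values are copied from the question; the names `id_sorted` and `id_minmax` are illustrative only, and the second key is written with `pmin`/`pmax` directly rather than through `do.call`.

```r
# Rebuild the question's data (values copied from the post).
df <- data.frame(
  CALL_NO        = c(171573729, 171573729, 171630085, 171630085, 171573731,
                     171573731, 171630084, 171630084, 171573731, 171573731),
  DUP_OF_CALL_NO = c(171573731, 171573731, 171630084, 171630084, 171573729,
                     171573729, 171630085, 171630085, 171573729, 171573729)
)

# Approach 1: sort each row and use the sorted pair as a key.
id_sorted <- as.numeric(factor(apply(df, 1, function(x) toString(sort(x)))))

# Approach 2: vectorised key built from the row-wise minimum and maximum.
id_minmax <- as.numeric(factor(paste(pmin(df$CALL_NO, df$DUP_OF_CALL_NO),
                                     pmax(df$CALL_NO, df$DUP_OF_CALL_NO))))

# Both keys ignore column order, so the two ID vectors should agree.
stopifnot(identical(id_sorted, id_minmax))
```

The `pmin`/`pmax` form avoids a per-row function call, which tends to matter on larger frames.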

With data.table:

Create an example dataset:

set.seed(1234)
df <- data.frame(
    call1 = sample(1:10, 100, replace = TRUE),
    call2 = sample(1:10, 100, replace = TRUE),
    stringsAsFactors = FALSE
)

# load data.table package and convert df to a data.table
library(data.table)
setDT(df)

Add the group indicator:

# order df so group numbers are increasing by (call1, call2)
df <- df[order(call1, call2)]

# add group id
df[, id := .GRP, by=.(call1, call2)]
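
Note that grouping on `(call1, call2)` as given treats `(1, 2)` and `(2, 1)` as different groups. A minimal sketch of an order-insensitive variant, assuming the goal is the one stated in the question, groups on the row-wise minimum and maximum instead. The table `dt` and the key names `lo` and `hi` are hypothetical, chosen only for illustration.

```r
library(data.table)

# Small hypothetical example: rows 1 and 2 hold the same pair in opposite order.
dt <- data.table(call1 = c(1, 2, 3, 1), call2 = c(2, 1, 4, 2))

# Group on the row-wise minimum and maximum so (1, 2) and (2, 1) share a key.
# .GRP numbers the groups in order of first appearance.
dt[, id := .GRP, by = .(lo = pmin(call1, call2), hi = pmax(call1, call2))]
```

Here the first, second, and fourth rows end up in one group and the third row in another.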

# load the packages used below
library(dplyr)
library(data.table)

# first standardize the order of CALL_NO and DUP_OF_CALL_NO to create groupings
df <- df %>%
    mutate(call1 = pmin(CALL_NO, DUP_OF_CALL_NO),
           call2 = pmax(CALL_NO, DUP_OF_CALL_NO))

### Create unique ID across related CALL_NO and DUP_OF_CALL_NO
# convert to data.table
df <- data.table(df)

# apply ID to associate group of calls
df[, ID := .GRP, by = .(call1, call2)]

# convert back to data.frame
df <- as.data.frame(df)
