
New column based on row values - A better way?

Surely there is a better way to create a column that matches the "target" column?

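I searched Stack for an answer, but it seems nobody else has needed to know how to do this. Maybe I am coming at it from a completely silly angle (my head may be stuck in Stata mode, because that is how my boss thinks, and he asked me to create this new "output" variable).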

A       <-c("bears",  "bears",     "na",   "pandas",     "pandas",    "bears",   "pandas")
B       <-c("bears",  "pandas",     "na",   "bears",     "na",          "bears",   "pandas")
target  <-c("bears", "the_zoo",   "na",   "the_zoo",  "pandas",   "bears",   "pandas")
df_test <-data.frame(A,B,target,  stringsAsFactors =FALSE)

class(df_test$B)
for (i in 1:nrow(df_test)) {
    # Case 1: both are equal
    df_test$output[i] <- ifelse(df_test$A[i] == df_test$B[i],
                                yes = as.character(df_test$A[i]),
                                # Case 2: A is "na"
                                no = ifelse(df_test$A[i] == "na",
                                            yes = as.character(df_test$B[i]),
                                            # Case 2.2: B is "na"
                                            no = ifelse(df_test$B[i] == "na",
                                                        yes = as.character(df_test$A[i]),
                                                        # Case 3: all other possibilities are "the_zoo"
                                                        no = "the_zoo")))
}
df_test



> df_test
       A      B  target  output
1  bears  bears   bears   bears
2  bears pandas the_zoo the_zoo
3     na     na      na      na
4 pandas  bears the_zoo the_zoo
5 pandas     na  pandas  pandas
6  bears  bears   bears   bears
7 pandas pandas  pandas  pandas

What's wrong with:

A       <-c("bears",  "bears",     "na",   "pandas",     "pandas",    "bears",   "pandas")
B       <-c("bears",  "pandas",     "na",   "bears",     "na",          "bears",   "pandas")
target  <-c("bears", "the_zoo",   "na",   "the_zoo",  "pandas",   "bears",   "pandas")
df_test <-data.frame(A,B,target,  stringsAsFactors =FALSE)

df_test$test <- with(df_test, ifelse(A == B, A, 
                       ifelse(A == "na",B, 
                              ifelse(B == "na", A, "the_zoo"))))


print(df_test)

yields:

       A      B  target    test
1  bears  bears   bears   bears
2  bears pandas the_zoo the_zoo
3     na     na      na      na
4 pandas  bears the_zoo the_zoo
5 pandas     na  pandas  pandas
6  bears  bears   bears   bears
7 pandas pandas  pandas  pandas

You don't need the for loop, because ifelse is already vectorized.
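To make "vectorized" concrete, here is a minimal sketch. The vector `x` and the label "missing" are made up for illustration and are not part of the question's data. A single call to ifelse tests every element and returns a vector of the same length, which is why no loop index is needed.

```r
# Hypothetical example vector, not from the original question
x <- c("bears", "na", "pandas")

# One call handles every element: where the test is TRUE the result
# takes "missing", elsewhere it takes the matching element of x
res <- ifelse(x == "na", "missing", x)
print(res)
```

The same element-wise behavior is what lets the nested ifelse calls above run on whole columns at once.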

One way to clean up the code is to use case_when from the dplyr package:

library(dplyr)

df_test$output <-
case_when(
    df_test$A == df_test$B ~ as.character(df_test$A),
    df_test$A == "na" ~ as.character(df_test$B),
    df_test$B =="na" ~ as.character(df_test$A),
    TRUE ~ "the_zoo"
)

Note that if your A and B columns are already of character type, as your code seems to presume, then you can remove the unnecessary calls to as.character above.
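As a rough sketch of that simplification, assuming A and B are character columns (as they are when the data frame is built with stringsAsFactors = FALSE), the conditions can be written inside mutate so the columns are referenced by bare name. Using mutate and the pipe here is a stylistic choice for this sketch, not something the answer itself requires.

```r
library(dplyr)

# Same A and B values as in the question; both columns are character
df_test <- data.frame(
  A = c("bears", "bears", "na", "pandas", "pandas", "bears", "pandas"),
  B = c("bears", "pandas", "na", "bears", "na", "bears", "pandas"),
  stringsAsFactors = FALSE
)

# Conditions are checked top to bottom; the first TRUE one wins
df_test <- df_test %>%
  mutate(output = case_when(
    A == B    ~ A,          # both equal: keep the shared value
    A == "na" ~ B,          # A is the "na" string: take B
    B == "na" ~ A,          # B is the "na" string: take A
    TRUE      ~ "the_zoo"   # everything else
  ))

print(df_test)
```

Because case_when stops at the first matching condition for each row, the order of the lines matters: the equality check has to come first so that rows where both columns are "na" keep "na".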

