How to create a factor column based on another column?

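I want to create a column named "Region" based on the values of "Community area"; for example, community area 1 = North, community area 2 = South. I would like it to look like this: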

Community area   Region
25               West
67               Southwest
39               South
40               South
25               West

I tried the following code, but it did not work:

region<-function(x){if(x==c(8,32,33)){crime$Region<-"Central"} 
else if(x==c(5,6,7,21,22)){crime$Region<-"North"}
else if(x==c(1:4,9:14,76,77)){crime$Region<-"Far North Side"}
else if(x==c(15:20)){crime$Region<-"Northwest Side"}
else if(x==c(23:31)){crime$Region<-"West"}
else if(x==c(34:43,60,69)){crime$Region<-"South"}
else if(x==c(56:59,61:68)){crime$Region<-"Southwest Side"}
else if(x==c(44:55)){crime$Region<-"Far Southeast Side"}
else if(x==c(70:75)){crime$Region<-"Far Southwest Side"}
else {crime$Region<-"Other"}
}
region(crime$Community.Area)

For long if / else if chains, try case_when from the dplyr package:

> set.seed(1234)
> 
> df <- data.frame(x1 = round(runif(n = 20, min = 1, max = 4), 0), stringsAsFactors = F)
> 
> df
   x1
1   1
2   3
3   3
4   3
5   4
6   3
7   1
8   2
...
20  2
> 
> df$Region <- dplyr::case_when(df$x1 == 1 ~ "North", 
+                  df$x1 == 2 ~ "South", 
+                  df$x1 == 3 ~ "East",
+                  TRUE ~ "West")
> df
   x1 Region
1   1  North
2   3   East
3   3   East
4   3   East
5   4   West
6   3   East
7   1  North
...
20  2  South
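
Applied to the question's own mapping, the same idea might look like the sketch below. It assumes dplyr is installed, reuses the `crime` and `Community.Area` names from the question, and uses a small made-up sample (the value 99 is hypothetical, included only to show the fallback branch). The `%in%` operator tests membership in a set of values, which is what the original `==` comparisons were trying to do.

```r
library(dplyr)

# Illustrative sample; 99 is a hypothetical value with no region assigned
crime <- data.frame(Community.Area = c(25, 67, 39, 40, 25, 99))

crime <- crime %>%
  mutate(Region = case_when(
    Community.Area %in% c(8, 32, 33)          ~ "Central",
    Community.Area %in% c(5, 6, 7, 21, 22)    ~ "North",
    Community.Area %in% c(1:4, 9:14, 76, 77)  ~ "Far North Side",
    Community.Area %in% 15:20                 ~ "Northwest Side",
    Community.Area %in% 23:31                 ~ "West",
    Community.Area %in% c(34:43, 60, 69)      ~ "South",
    Community.Area %in% c(56:59, 61:68)       ~ "Southwest Side",
    Community.Area %in% 44:55                 ~ "Far Southeast Side",
    Community.Area %in% 70:75                 ~ "Far Southwest Side",
    TRUE                                      ~ "Other"  # fallback for unmatched values
  ))

crime
```

Because case_when is vectorized, the whole column is classified in one call and no explicit loop is needed.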

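A solution in the spirit of the OP's approach is also possible by modifying the region function so that it handles one value at a time and returns the region, using %in% instead of ==.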

  # Take one value at a time and return Region
  region<-function(x){if(x %in% c(8,32,33)){"Central"} 
    else if(x %in% c(5,6,7,21,22)){"North"}
    else if(x %in% c(1:4,9:14,76,77)){"Far North Side"}
    else if(x %in% c(15:20)){"Northwest Side"}
    else if(x %in% c(23:31)){"West"}
    else if(x %in% c(34:43,60,69)){"South"}
    else if(x %in% c(56:59,61:68)){"Southwest Side"}
    else if(x %in% c(44:55)){"Far Southeast Side"}
    else if(x %in% c(70:75)){"Far Southwest Side"}
    else {"Other"}
  }

# Use mapply to pass each value of `Community_Area` to region() one at a time
df$Region <- mapply(region, df$Community_Area)

df
#  Community_Area         Region
#1             25           West
#2             67 Southwest Side
#3             39          South
#4             40          South
#5             25           West

Data

df <- data.frame(Community_Area = c(25, 67, 39, 40, 25))
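
Since the question asks for a factor column, here is a possible base R alternative, offered as a sketch rather than as part of either answer above. It stores the mapping from the region() function in a named list (the names `region_map`, `areas`, and `labels` are made up for this example), looks values up with match(), and converts the result with factor().

```r
# Mapping copied from the region() function above; the list name is hypothetical
region_map <- list(
  "Central"            = c(8, 32, 33),
  "North"              = c(5, 6, 7, 21, 22),
  "Far North Side"     = c(1:4, 9:14, 76, 77),
  "Northwest Side"     = 15:20,
  "West"               = 23:31,
  "South"              = c(34:43, 60, 69),
  "Southwest Side"     = c(56:59, 61:68),
  "Far Southeast Side" = 44:55,
  "Far Southwest Side" = 70:75
)

# Flatten the list into two parallel vectors: area numbers and their labels
areas  <- unlist(region_map, use.names = FALSE)
labels <- rep(names(region_map), lengths(region_map))

df <- data.frame(Community_Area = c(25, 67, 39, 40, 25))

# Vectorized lookup; areas not found in the mapping become "Other"
region <- labels[match(df$Community_Area, areas)]
region[is.na(region)] <- "Other"

# Store as a factor with a fixed set of levels
df$Region <- factor(region, levels = c(names(region_map), "Other"))
df
```

Keeping the mapping in one data structure makes it easier to edit than a chain of conditions, and the explicit levels mean every region appears in summaries such as table(df$Region) even when it has no rows.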
