
Assign value to a column from another column based on condition

Say that I have a character vector like this:

> desired <- c("10001", "10004")

And a sample data frame like this:

> desired_sample_df <- data.frame(geo = rep("other", 30), zip = c(rep(10001:10010, 2), 10011:10020), cbsa = c(rep("NY", 20), rep("CA", 10)))
> desired_sample_df
     geo   zip cbsa
1  other 10001   NY
2  other 10002   NY
3  other 10003   NY
4  other 10004   NY
5  other 10005   NY
6  other 10006   NY
7  other 10007   NY
8  other 10008   NY
9  other 10009   NY
10 other 10010   NY
11 other 10001   NY
12 other 10002   NY
13 other 10003   NY
14 other 10004   NY
15 other 10005   NY
16 other 10006   NY
17 other 10007   NY
18 other 10008   NY
19 other 10009   NY
20 other 10010   NY
21 other 10011   CA
22 other 10012   CA
23 other 10013   CA
24 other 10014   CA
25 other 10015   CA
26 other 10016   CA
27 other 10017   CA
28 other 10018   CA
29 other 10019   CA
30 other 10020   CA

I would like to overwrite the geo column with the value from zip, but only in rows where zip is one of the values in desired.


Here is what I've tried:

> desired_sample_df$geo[desired_sample_df$zip %in% desired] <- desired_sample_df$zip[which(desired_sample_df$zip %in% desired)]
Warning message:
In `[<-.factor`(`*tmp*`, desired_sample_df$zip %in% desired, value = c(NA,  :
  invalid factor level, NA generated


> desired_sample_df$geo[desired_sample_df$zip %in% desired] <- desired_sample_df$zip
Warning messages:
1: In `[<-.factor`(`*tmp*`, desired_sample_df$zip %in% desired, value = c(NA,  :
  invalid factor level, NA generated
2: In `[<-.factor`(`*tmp*`, desired_sample_df$zip %in% desired, value = c(NA,  :
  number of items to replace is not a multiple of replacement length

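One of the problems is that strings in data frames automatically become factors (data.frame() defaulted to stringsAsFactors = TRUE before R 4.0.0), so assigning a value that is not already a level of geo produces NA. Try this: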

desired <- c("10001", "10004")
df <- data.frame(geo = rep("other", 30), zip = c(rep(10001:10010, 2), 10011:10020), cbsa = c(rep("NY", 20), rep("CA", 10)), stringsAsFactors=FALSE)

idx <- df$zip %in% desired

Now you can alter just the elements you want with:

df$geo[idx] <- df$zip[idx]
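
As a quick check, here is a minimal end-to-end sketch using the same df, desired, and idx as above. The as.character() call is my own addition to make the integer-to-character conversion explicit; plain assignment would coerce the values anyway.

```r
desired <- c("10001", "10004")
df <- data.frame(geo = rep("other", 30),
                 zip = c(rep(10001:10010, 2), 10011:10020),
                 cbsa = c(rep("NY", 20), rep("CA", 10)),
                 stringsAsFactors = FALSE)

# %in% coerces the integer zips and the character vector to a common type,
# so the integer 10001 matches the string "10001"
idx <- df$zip %in% desired

# Overwrite geo only in the rows where zip is in desired
df$geo[idx] <- as.character(df$zip[idx])

which(idx)     # row numbers that were changed
table(df$geo)  # counts per geo value after the update
```

Indexing both sides with the same idx keeps the replacement the same length as the target, which avoids the "not a multiple of replacement length" warning from the second attempt in the question.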

Like this?

df$geo <- ifelse(df$zip %in% desired, df$zip, df$geo)

Here df is your desired_sample_df.
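
One caveat: if geo is still a factor (as in the original desired_sample_df on R versions before 4.0.0), ifelse() returns the factor's integer codes rather than its labels. Below is a minimal sketch that sidesteps this by converting both branches to character first. The explicit stringsAsFactors = TRUE is only there to reproduce the factor case.

```r
desired <- c("10001", "10004")
df <- data.frame(geo = rep("other", 30),
                 zip = c(rep(10001:10010, 2), 10011:10020),
                 cbsa = c(rep("NY", 20), rep("CA", 10)),
                 stringsAsFactors = TRUE)  # geo is a factor here

# Convert both branches to character so ifelse() yields labels, not codes
df$geo <- ifelse(df$zip %in% desired,
                 as.character(df$zip),
                 as.character(df$geo))

unique(df$geo)  # distinct geo values after the update
```

The result is a character column, so it behaves the same whether or not the data frame was built with stringsAsFactors = FALSE.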


 