I would appreciate your help with uniting two columns into a single column while keeping the new values unique. I tried to find a solution to this issue, but since I'm terrible at writing loops in R, maybe it's better if someone shows the right way to do this.
Let's say I have a dataset like this:
place year
A 2018
A 2018
B 2018
C 2018
C 2018
C 2019
C 2019
I would like to create a new column (variable) that combines both columns (place and year) but adds a numeric suffix in the case of repetitions. For example, C has two rows for 2018 and two rows for 2019. I would like the values of the new variable to be "C_2018.1" and "C_2018.2", if that makes sense. I know how to combine variables into strings, but adding the count for non-unique values is what I'm not sure about. Maybe I need loops?
data$new_v <- paste(data$place, data$year, sep = "_")
I hope this makes sufficient sense and it should be quite easy I guess.
Loops might be easier but...
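# Assumes the rows are already sorted by new_v, because table() orders its counts by name
data$ctr <- unlist(lapply(table(data$new_v), seq_len), use.names = FALSE)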
And then you could do
data$new_v <- paste(data$new_v, data$ctr, sep = ".")
This would leave you with the singletons (like B) still having a ".1" suffix.
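If the ".1" on singletons is unwanted, or the rows are not sorted, one possible base R variant is sketched below. It is not part of the answer above: the names key, ctr and n are made up for illustration, and it uses ave() to compute both the running counter and the group size in the original row order.

```r
# Sample data from the question
data <- data.frame(
  place = c("A", "A", "B", "C", "C", "C", "C"),
  year  = c(2018, 2018, 2018, 2018, 2018, 2019, 2019)
)

# Combined key, e.g. "C_2018"
key <- paste(data$place, data$year, sep = "_")

# Running counter within each key, kept in the original row order
ctr <- ave(seq_along(key), key, FUN = seq_along)

# Size of the group each row belongs to
n <- ave(seq_along(key), key, FUN = length)

# Append ".k" only where the key occurs more than once
data$new_v <- ifelse(n > 1, paste(key, ctr, sep = "."), key)
```

With the sample data, B keeps the plain value "B_2018" while the repeated keys get ".1" and ".2".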
You can solve this with dplyr:
library(dplyr)

data %>%
  group_by(place, year) %>%
  mutate(new_v = paste0(place, "_", year, ".", row_number()))
The group_by clause causes row_number() to count within the groups, starting from 1.
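To see the effect of the grouping, here is a small sketch on the question's sample data that compares row_number() before and after group_by. The column names overall and within are only illustrative.

```r
library(dplyr)

# Sample data from the question
data <- data.frame(
  place = c("A", "A", "B", "C", "C", "C", "C"),
  year  = c(2018, 2018, 2018, 2018, 2018, 2019, 2019)
)

counts <- data %>%
  mutate(overall = row_number()) %>%   # ungrouped: counts across the whole table
  group_by(place, year) %>%
  mutate(within = row_number()) %>%    # grouped: restarts at 1 for each (place, year)
  ungroup()
```

Here overall runs from 1 to 7, while within is 1, 2, 1, 1, 2, 1, 2, which is the suffix the question asks for.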
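library(data.table)
df <- data.frame(place = c("A", "A", "B", "C", "C", "C", "C"), year = c(2018, 2018, 2018, 2018, 2018, 2019, 2019))
df <- data.table(df)
# Counter that restarts at 1 within each (place, year) group
df[, counter := seq_len(.N), by = c("place", "year")]
# "_" between place and year, "." before the counter, e.g. "C_2018.1"
df[, new_var := paste0(place, "_", year, ".", counter)]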