[R] I am trying to modify my data frame (df) in R so that each column's name is appended to every observation in that column. For example, I have:
Soccer_Brand | Basketball_Brand |
---|---|
Adidas | Nike |
Nike | Under Armour |
and I want it to look like this:
Soccer_Brand | Basketball_Brand |
---|---|
Adidas_Soccer_Brand | Nike_Basketball_Brand |
Nike_Soccer_Brand | Under_Armour_Basketball_Brand |
I'm attempting a market basket analysis and will eventually need to remove the column names. However, unless the column names are appended to the observations themselves, I will lose the information about which sport each brand belongs to. Essentially, I won't be able to tell whether a 'Nike' entry belongs to soccer or basketball.
I've used Excel formulas to hack together a solution so far, but I want my R script to be self-contained. I haven't found any solutions for this in R.
You can paste a column's name onto its contents. Just iterate through all the columns. Doing so with lapply allows the one-liner:
df[] <- lapply(seq_along(df), \(i) paste(df[[i]], names(df)[i], sep = "_"))
resulting in
df
#> Soccer_Brand Basketball_Brand
#> 1 Adidas_Soccer_Brand Nike_Basketball_Brand
#> 2 Nike_Soccer_Brand Under Armour_Basketball_Brand
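Note that the output above keeps the space in "Under Armour", whereas the desired output in the question shows Under_Armour_Basketball_Brand. If that matters, one possible adjustment (a sketch, not part of the original answer) is to replace spaces with underscores using gsub before pasting. It assumes every space in a value should become an underscore:

```r
# Data from the question
df <- data.frame(Soccer_Brand = c("Adidas", "Nike"),
                 Basketball_Brand = c("Nike", "Under Armour"))

# Assumption: every space inside a value should become "_".
# For each column, clean the values, then append the column name.
df[] <- lapply(seq_along(df), function(i) {
  cleaned <- gsub(" ", "_", df[[i]], fixed = TRUE)
  paste(cleaned, names(df)[i], sep = "_")
})

df$Basketball_Brand
#> [1] "Nike_Basketball_Brand"         "Under_Armour_Basketball_Brand"
```

Using function(i) instead of the \(i) shorthand keeps the sketch runnable on R versions older than 4.1.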
Data from question in reproducible format:
df <- data.frame(Soccer_Brand = c("Adidas", "Nike"),
                 Basketball_Brand = c("Nike", "Under Armour"))
Or, using a tidyverse option:
library(dplyr)
library(stringr)
df <- df %>%
  mutate(across(everything(), ~ str_c(.x, cur_column(), sep = "_")))
Output:
df
Soccer_Brand Basketball_Brand
1 Adidas_Soccer_Brand Nike_Basketball_Brand
2 Nike_Soccer_Brand Under Armour_Basketball_Brand
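For the later step mentioned in the question, dropping the column names, here is a minimal base R sketch of how the tagged values stay distinguishable once the names are gone. The names m and baskets are illustrative only, and the exact input format a particular market basket package expects is not covered here:

```r
# Data from the question, tagged with the column names as in the answers above
df <- data.frame(Soccer_Brand = c("Adidas", "Nike"),
                 Basketball_Brand = c("Nike", "Under Armour"))
df[] <- lapply(seq_along(df), function(i) paste(df[[i]], names(df)[i], sep = "_"))

# Character matrix with row and column names stripped (hypothetical name: m)
m <- unname(as.matrix(df))

# One character vector of items per row (hypothetical name: baskets)
baskets <- split(m, row(m))

# Each item still carries its sport, even without column names
baskets
```

Because each value carries its own suffix, a soccer "Nike" and a basketball "Nike" remain separate items after the column names are removed.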