
How to merge duplicated rows

I have a data frame that looks like this:

Nicknames                Names
Fonse, Fons              Alfons
Fonse, Fonsi             Alfons
Gustel, Gustl, Guste,    August
Baldi                    Balthasar
Hausl, Baldi             Balthasar
Flore, Flori             Florian

I would like to merge the rows that share the same name, so that the result is:

Nicknames                     Names
Fonse, Fons, Fonse, Fonsi     Alfons
Gustel, Gustl, Guste,         August
Baldi, Hausl, Baldi           Balthasar
Flore, Flori                  Florian

I was able to create a subset of the duplicates, but I don't know how to combine them:

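# any(duplicated(...)) is a single TRUE and would keep every row;
# this keeps only the rows whose Names value occurs more than once
nick2 <- subset(nick, duplicated(Names) | duplicated(Names, fromLast = TRUE))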

Here is the data as a CSV file: https://github.com/Garybertrand/nick

This should solve your problem. It groups the rows by Names and pastes the nicknames of each group into one string:

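library(data.table)
library(dplyr)  # provides %>% and select()

# setDT() converts df to a data.table by reference;
# each group of Names is then collapsed into a single row
setDT(df)[, list(Nicknames = paste(Nicknames, collapse = ', ')),
          by = c('Names')] %>%
  select(Nicknames, Names)  # put Nicknames back in front of Names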

You can also use base R. Here unique(df) drops rows that are exact duplicates before the nicknames are pasted together:

aggregate(Nicknames ~ Names, unique(df), paste, collapse = ", ")
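For a self-contained check, here is a minimal sketch that rebuilds the sample rows from the question by hand and applies the same aggregate() call. The data frame below is typed in from the question (without the trailing comma after "Guste"), so it is an assumption about what the real CSV contains. Note that aggregate() returns the grouping column first and sorts the rows by it, which is why the columns are reordered at the end.

```r
# Sample rows typed in from the question; the real CSV may differ.
df <- data.frame(
  Nicknames = c("Fonse, Fons", "Fonse, Fonsi", "Gustel, Gustl, Guste",
                "Baldi", "Hausl, Baldi", "Flore, Flori"),
  Names = c("Alfons", "Alfons", "August",
            "Balthasar", "Balthasar", "Florian"),
  stringsAsFactors = FALSE
)

# One row per name; the nicknames of each group are joined with ", ".
merged <- aggregate(Nicknames ~ Names, data = unique(df),
                    FUN = paste, collapse = ", ")

# aggregate() puts the grouping column first; restore the original order.
merged <- merged[, c("Nicknames", "Names")]
print(merged)
```

With this input the Alfons row ends up as "Fonse, Fons, Fonse, Fonsi", which matches the output requested in the question.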

A short tidyverse solution looks like this:

library(tidyverse)

df %>% 
  group_by(Names) %>% 
  summarize(Nicknames = paste0(Nicknames, collapse = ", "))
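All of the approaches above keep repeated nicknames, so "Fonse" appears twice for Alfons, as in the requested output. If each nickname should appear only once instead, one possible sketch is to split the strings on commas before joining them again. The helper name merge_unique is hypothetical, and the small data frame is typed in from the question rather than read from the CSV.

```r
# Hypothetical helper: split each entry on commas, trim whitespace,
# drop empty pieces and repeats, then join what is left.
merge_unique <- function(x) {
  parts <- trimws(unlist(strsplit(as.character(x), ",", fixed = TRUE)))
  paste(unique(parts[nzchar(parts)]), collapse = ", ")
}

# Two of the names from the question, typed in by hand.
df <- data.frame(
  Nicknames = c("Fonse, Fons", "Fonse, Fonsi", "Baldi", "Hausl, Baldi"),
  Names = c("Alfons", "Alfons", "Balthasar", "Balthasar"),
  stringsAsFactors = FALSE
)

# Same grouping as before, but with the de-duplicating helper.
deduped <- aggregate(Nicknames ~ Names, data = df, FUN = merge_unique)
print(deduped)
```

The same helper should also work in the dplyr pipeline in place of the paste0() call, for example summarize(Nicknames = merge_unique(Nicknames)).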
