
How to rename observations in different columns in different data frames in R?

I want to add "Name" to the start and "plc" to the end of the values in several columns across different data frames, except where a value already has "Name" or "plc" in the right place. The following is a simple reprex.

Original data frames:

names1a <- c("Name Alperton plc", "Bury", "Central", "Durham")
names1b <- c("Egham plc", "Fulton", "Great", "Heywood plc")
year1 <- c(1999, 2000, 2001, 2001)
df1 <- data.frame(names1a, names1b, year1)

names2 <- c("Charleton plc", "Birmingham", "Name Tees", "Salford")
year2 <- c(2000, 1955, 2001, 2001)
df2 <- data.frame(names2, year2)

Desired result:

df1


            names1a          names1b year1
1 Name Alperton plc   Name Egham plc  1999
2     Name Bury plc  Name Fulton plc  2000
3  Name Central plc   Name Great plc  2001
4   Name Durham plc Name Heywood plc  2001

df2

               names2 year2
1  Name Charleton plc  2000
2 Name Birmingham plc  1955
3       Name Tees plc  2001
4    Name Salford plc  2001

My approach: I get the result I want, but I have a large dataset with many columns, so my approach is far too repetitive. I struggle with writing functions, and I think one would be useful here:

df1$names1a <- sub("$", " plc", df1$names1a)
df1$names1b <- sub("$", " plc", df1$names1b)
df2$names2 <- sub("$", " plc", df2$names2)
df1$names1a <- sub("plc plc", "plc", df1$names1a)
df1$names1b <- sub("plc plc", "plc", df1$names1b)
df2$names2 <- sub("plc plc", "plc", df2$names2)

df1$names1a <- sub("^", "Name ", df1$names1a)
df1$names1b <- sub("^", "Name ", df1$names1b)
df2$names2 <- sub("^", "Name ", df2$names2)
df1$names1a <- sub("Name Name", "Name", df1$names1a)
df1$names1b <- sub("Name Name", "Name", df1$names1b)
df2$names2 <- sub("Name Name", "Name", df2$names2)

The easiest approach is probably to strip any existing leading "Name" and trailing "plc", and then add both to every value, like so:

f <- function(x) paste("Name", trimws(gsub("^Name|plc$", "", x)), "plc")

cols <- c("names1a", "names1b")
df1[cols] <- lapply(df1[cols], f)
df1
#             names1a          names1b year1
# 1 Name Alperton plc   Name Egham plc  1999
# 2     Name Bury plc  Name Fulton plc  2000
# 3  Name Central plc   Name Great plc  2001
# 4   Name Durham plc Name Heywood plc  2001

df2$names2 <- f(df2$names2)
df2
#                names2 year2
# 1  Name Charleton plc  2000
# 2 Name Birmingham plc  1955
# 3       Name Tees plc  2001
# 4    Name Salford plc  2001
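Since the question involves several data frames, the same helper can also be applied over a list of them. The following is a minimal base R sketch, assuming every target column name starts with "names" (as in the reprex); `fix_names` and its `prefix` argument are hypothetical names introduced only for illustration.

```r
# Data from the question
df1 <- data.frame(
  names1a = c("Name Alperton plc", "Bury", "Central", "Durham"),
  names1b = c("Egham plc", "Fulton", "Great", "Heywood plc"),
  year1   = c(1999, 2000, 2001, 2001)
)
df2 <- data.frame(
  names2 = c("Charleton plc", "Birmingham", "Name Tees", "Salford"),
  year2  = c(2000, 1955, 2001, 2001)
)

# Strip a leading "Name" and a trailing "plc", then add both to every value
f <- function(x) paste("Name", trimws(gsub("^Name|plc$", "", x)), "plc")

# Hypothetical helper: apply f to every column whose name starts with `prefix`
fix_names <- function(df, prefix = "names") {
  cols <- startsWith(names(df), prefix)
  df[cols] <- lapply(df[cols], f)
  df
}

# Process all data frames in one call; the result is a named list
dfs <- lapply(list(df1 = df1, df2 = df2), fix_names)
dfs$df2
```

Keeping the data frames in a named list avoids one assignment per column; individual results remain available as `dfs$df1` and `dfs$df2`.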

You could try this function, which lets you edit each data frame separately and can cope with any number of columns whose names start with "names". It is based on tidyverse packages:


library(dplyr)
library(stringr)

fun <- function(df) {
  df %>%
    # First strip an existing leading "Name " and trailing " plc" ...
    mutate(across(starts_with("names"), ~ str_replace_all(., "^Name | plc$", "")),
           # ... then add both back to every value
           across(starts_with("names"), ~ paste("Name", ., "plc")))
}
  
fun(df = df1)
#>             names1a          names1b year1
#> 1 Name Alperton plc   Name Egham plc  1999
#> 2     Name Bury plc  Name Fulton plc  2000
#> 3  Name Central plc   Name Great plc  2001
#> 4   Name Durham plc Name Heywood plc  2001
  
fun(df2)
#>                names2 year2
#> 1  Name Charleton plc  2000
#> 2 Name Birmingham plc  1955
#> 3       Name Tees plc  2001
#> 4    Name Salford plc  2001

You could combine the two across() calls into one:

mutate(across(starts_with("names"), ~ paste("Name", str_replace_all(., "^Name | plc$", ""), "plc")))

Created on 2021-01-17 by the reprex package (v0.3.0)
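For completeness, here is a sketch of the single-call version wrapped in a function and run on df1 from the question. The name `fun2` is hypothetical. Because existing affixes are stripped before both are added, running the function twice gives the same result as running it once.

```r
library(dplyr)
library(stringr)

df1 <- data.frame(
  names1a = c("Name Alperton plc", "Bury", "Central", "Durham"),
  names1b = c("Egham plc", "Fulton", "Great", "Heywood plc"),
  year1   = c(1999, 2000, 2001, 2001)
)

# Hypothetical wrapper: strip and re-add the affixes in one across() call
fun2 <- function(df) {
  df %>%
    mutate(across(
      starts_with("names"),
      ~ paste("Name", str_replace_all(.x, "^Name | plc$", ""), "plc")
    ))
}

res <- fun2(df1)
res

# Applying the function again leaves the values unchanged
identical(fun2(res), res)
```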

You can just remove the confounding strings from the names first and then add "Name" and "plc" back however you like:

library(tidyverse)

names1a <- c("Name Alperton plc", "Bury", "Central", "Durham")

names1a <- str_replace_all(names1a, "Name ", "")
names1a <- str_replace_all(names1a, " plc", "")
names1a <- paste0("Name ", names1a, " plc")

names1a
#> [1] "Name Alperton plc" "Name Bury plc"     "Name Central plc" 
#> [4] "Name Durham plc"

Created on 2021-01-17 by the reprex package (v0.3.0)
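One caveat with unanchored patterns such as "Name " and " plc" is that they match anywhere in the string, not only at the start or end. The base R sketch below contrasts unanchored and anchored removal. The last two input values are made up for illustration and do not come from the question.

```r
# The last two values are hypothetical, chosen to contain the affixes mid-string
x <- c("Name Alperton plc", "Bury", "First Name Bank", "Durham plc Holdings")

# Unanchored: removes "Name " and " plc" wherever they occur
loose <- gsub(" plc", "", gsub("Name ", "", x, fixed = TRUE), fixed = TRUE)

# Anchored: removes only a leading "Name " and a trailing " plc"
strict <- sub(" plc$", "", sub("^Name ", "", x))

paste0("Name ", loose, " plc")
paste0("Name ", strict, " plc")
```

With the unanchored version, "First Name Bank" loses its inner "Name " and becomes "Name First Bank plc"; the anchored version keeps it intact as "Name First Name Bank plc".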


 