
Rename data frame column when column name ends with a specific number

I found several questions about changing column names in a data frame, but none of them answer the issue I have.

I am reading in a number of Excel files as data frames in a loop, doing some analysis on each data frame, and then moving on to the next one. In each data frame I need to rename a column whose name ends with, for example, 00010. The position of the columns changes from one file/data frame to the next.

E.g. the imported column names are: agency_id, site_no, datetime, tz_cd, 11_00010, 11_00010_cd, 12_00030, 12_00030_md, ...

For my analysis, I need the following columns: site_no, datetime, 11_00010, 12_00030. I need to rename column 11_00010 to temperature and column 12_00030 to salinity. If the columns were always in the same order, I could easily rename them using rename from plyr, or colnames, or names. However, the order of the columns is not the same in each data frame, and the columns containing 00010 and 00030 may begin with different numbers, so the position of 00010 and 00030 within the column names is not always fixed. If it were, renaming would be easier. Also, I do not need the columns whose names contain 00010 or 00030 but end with cd, md, etc.

I would really appreciate any way around this.

Why not just use gsub?

Assuming "x" to be your column names (normally accessed via names(your-data-frame)):

x <- c("agency_id", "site_no", "datetime", "tz_cd", "11_00010", 
       "11_00010_cd", "12_00030", "12_00030_md")

x <- gsub("11_00010", "temperature", x)
x <- gsub("12_00030", "salinity", x)
x
# [1] "agency_id"      "site_no"        "datetime"       "tz_cd"         
# [5] "temperature"    "temperature_cd" "salinity"       "salinity_md"

Judging by your comments, perhaps you're looking for something more like:

x <- c("agency_id", "site_no", "datetime", "tz_cd", "11_00010", 
       "11_00010_cd", "12_00030", "12_00030_md")

x[grep("_00010$", x)] <- "temperature"
x[grep("_00030$", x)] <- "salinity"
x
# [1] "agency_id"   "site_no"     "datetime"    "tz_cd"      
# [5] "temperature" "11_00010_cd" "salinity"    "12_00030_md"
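
To connect this back to the data frames in the question, here is a minimal sketch of the same `grep` idea applied to `names()` of a data frame, followed by keeping only the needed columns. The helper `rename_by_suffix` and the example values in `df` are made up for illustration and are not part of the original answer; inside the loop, `df` would be whatever data frame was just read from the Excel file.

```r
# Hypothetical helper (not from the original answer): rename the single
# column whose name ends with "_<suffix>". The "$" anchor means columns
# such as "11_00010_cd" do not match.
rename_by_suffix <- function(df, suffix, new_name) {
  hits <- grep(paste0("_", suffix, "$"), names(df))
  if (length(hits) == 1) {
    names(df)[hits] <- new_name
  } else {
    # Leave the data frame unchanged if the match is missing or ambiguous
    warning(sprintf("expected one column ending in _%s, found %d",
                    suffix, length(hits)))
  }
  df
}

# Made-up one-row data frame using the column names from the question;
# check.names = FALSE keeps names that start with a digit as they are.
df <- data.frame(
  agency_id = "USGS",
  site_no = "01",
  datetime = "2013-01-01 00:00",
  tz_cd = "EST",
  `11_00010` = 20.5,
  `11_00010_cd` = "A",
  `12_00030` = 31.2,
  `12_00030_md` = "P",
  check.names = FALSE,
  stringsAsFactors = FALSE
)

df <- rename_by_suffix(df, "00010", "temperature")
df <- rename_by_suffix(df, "00030", "salinity")

# Keep only the columns needed for the analysis
df <- df[, c("site_no", "datetime", "temperature", "salinity")]
names(df)
# [1] "site_no"     "datetime"    "temperature" "salinity"
```

Because the match is anchored at the end of the name, it should also work if the import step changes the start of the name, for example when the numeric prefix differs between files or a reader prepends a letter to names that begin with a digit.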
