Removing Special Characters and Numbers from a column in a data frame
I have a data frame:
dput(Data1)
structure(list(Emp.ID = c(182038L, 191854L), Project.Acquired.Skill = structure(c(2L,
1L), .Label = c("Architecting (10),Cognos TM1 (4),Support Function (3)",
"SAS (76),SAS Analytics (76),SAS BI (76),SAS data modeling tool (63),ClearCase (18),SQL (18),SQL Server (18),SQL SERVER 2000 (18),SQL SERVER 2005 (18),Excel (16),Oracle (16),AS400 (10)"
), class = "factor")), .Names = c("Emp.ID", "Project.Acquired.Skill"
), class = "data.frame", row.names = c(NA, -2L))
str(Data1)
'data.frame': 2 obs. of 2 variables:
$ Emp.ID : int 182038 191854
$ Project.Acquired.Skill: Factor w/ 2 levels "Architecting (10),Cognos TM1 (4),Support Function (3)",..: 2 1
I have a column that is a factor with values like this:
Architecting (10),Cognos TM1 (4),Support Function (3)
I need to strip the parenthesized numbers (0-9), the whitespace before them, and the brackets () to get:
Architecting,Cognos TM1,Support Function
I am facing issues because the column is coded as a factor.
My output should look like this:
Emp ID Project Acquired Skill
182038 SAS,SAS Analytics,SAS BI,SAS data modeling tool,ClearCase,SQL,SQL Server,SQL SERVER 2000,SQL SERVER 2005,Excel,Oracle,AS400
191854 Architecting,Cognos TM1,Support Function
Use a character class regexp in gsub:
transform(Data1, Project.Acquired.Skill=gsub("\\s[0-9()]+","",Project.Acquired.Skill))
Emp.ID
1 182038
2 191854
Project.Acquired.Skill
1 SAS,SAS Analytics,SAS BI,SAS data modeling tool,ClearCase,SQL,SQL Server,SQL SERVER,SQL SERVER,Excel,Oracle,AS400
2 Architecting,Cognos TM1,Support Function
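Note that in the output above, SQL SERVER 2000 and SQL SERVER 2005 lose their version numbers, because the character class [0-9()]+ also matches a bare run of digits after a space. The sketch below, using a hypothetical string x built from values in the question, compares that pattern with one that requires literal parentheses around the digits:

```r
# Hypothetical test string; the values are copied from the question's data.
x <- "SQL Server (18),SQL SERVER 2000 (18),AS400 (10)"

# Character class: whitespace followed by any run of digits or parentheses.
# " 2000" matches as well as " (18)", so the version number is removed too.
loose <- gsub("\\s[0-9()]+", "", x)

# Literal parentheses around the digits: only " (18)"-style suffixes match.
strict <- gsub("\\s\\(\\d+\\)", "", x)

print(loose)   # "SQL Server,SQL SERVER,AS400"
print(strict)  # "SQL Server,SQL SERVER 2000,AS400"
```

Which pattern is appropriate depends on whether bare numbers such as 2000 are part of the skill name; the desired output in the question keeps them.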
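Alternatively, match only a space followed by a number in parentheses, which keeps version numbers such as 2000 (R is case-sensitive, so the data frame is Data1, not data1):
(Data1[,2] <- gsub("\\s\\(\\d+\\)", "", Data1[,2]))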
# [1] "SAS,SAS Analytics,SAS BI,SAS data modeling tool,ClearCase,SQL,SQL Server,SQL SERVER 2000,SQL SERVER 2005,Excel,Oracle,AS400"
# [2] "Architecting,Cognos TM1,Support Function"
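On the factor issue: gsub coerces its input with as.character before matching, so a factor column needs no special handling, but the result is a character vector rather than a factor. A minimal sketch, assuming a hypothetical shortened copy of the data called df (not from the original post):

```r
# Hypothetical, shortened copy of the question's data frame.
df <- data.frame(
  Emp.ID = c(182038L, 191854L),
  Project.Acquired.Skill = factor(c(
    "SQL SERVER 2000 (18),Excel (16),AS400 (10)",
    "Architecting (10),Cognos TM1 (4),Support Function (3)"
  ))
)

# gsub() converts the factor to character before matching,
# so the column is stored as character after the assignment.
df$Project.Acquired.Skill <- gsub("\\s\\(\\d+\\)", "", df$Project.Acquired.Skill)

str(df)

# Convert back only if a factor is really needed downstream.
as_factor_again <- factor(df$Project.Acquired.Skill)
```

Leaving the column as character is usually the simpler choice for free-text fields like this one.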
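With the qdap package, strip removes digits and punctuation while keeping the characters listed in char.keep:
library(qdap)
gsub(" ,", ",", strip(Data1[, 2], char.keep=",", lower.case=FALSE))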
## [1] "SAS,SAS Analytics,SAS BI,SAS data modeling tool,ClearCase,SQL,SQL Server,SQL SERVER ,SQL SERVER ,Excel,Oracle,AS"
## [2] "Architecting,Cognos TM,Support Function"
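The qdap output above differs from the requested result: every digit is removed, so Cognos TM1 becomes Cognos TM and AS400 becomes AS, and a space remains after SQL SERVER. A base R check along these lines can confirm whether a candidate result matches the desired output. This is a sketch only; the names input, expected and candidate are introduced here for illustration:

```r
# Input strings copied from the question, in row order.
input <- c(
  "SAS (76),SAS Analytics (76),SAS BI (76),SAS data modeling tool (63),ClearCase (18),SQL (18),SQL Server (18),SQL SERVER 2000 (18),SQL SERVER 2005 (18),Excel (16),Oracle (16),AS400 (10)",
  "Architecting (10),Cognos TM1 (4),Support Function (3)"
)

# Desired output, as listed in the question.
expected <- c(
  "SAS,SAS Analytics,SAS BI,SAS data modeling tool,ClearCase,SQL,SQL Server,SQL SERVER 2000,SQL SERVER 2005,Excel,Oracle,AS400",
  "Architecting,Cognos TM1,Support Function"
)

# Candidate: remove only a space followed by a number in parentheses.
candidate <- gsub("\\s\\(\\d+\\)", "", input)

# TRUE when the candidate reproduces the desired output exactly.
print(identical(candidate, expected))
```

The same comparison can be applied to the result of any other approach by assigning it to candidate.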