
R Changing Elements in a Dataframe

As the title says, I'm trying to change the elements of my dataframe from one character format to another. The dataframe is as follows:

g1=c("CC","DD","GG")
g2=c("AA","BB","EE")
g3=c("HH","II","JJ")

df=data.frame(g1,g2,g3)

I wish to convert the elements from letterletter format to letter/letter format (e.g. CC to C/C or AA to A/A).

I know that strsplit would work on a list. I also know that I would need to somehow incorporate collapse="/".

How can I apply the strsplit function to the entire dataframe?

I was thinking of something along the lines of:

split=function(x)
{
  unlist(paste(strsplit(x,""),collapse="/"))
}

j=as.data.frame(apply(df,1,split))

but it doesn't give the desired results.

Update: apparently, the following script works:

split=function(x)
{
  paste(unlist(strsplit(x,"")),collapse="/")
}

p=apply(df,c(1,2),split)

If there is a more efficient or convenient way, please feel free to share.

I can think of two ways to approach this. One is to use strsplit like you did. You were only missing the part where you loop over each element of the list returned by strsplit:

Split <- function(x) {
  #unlist(lapply(strsplit(x, ""), paste, collapse="/"))
  sapply(strsplit(x, ""), paste, collapse="/")
}
as.data.frame(lapply(df, Split))
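One caveat worth sketching: in R versions before 4.0.0, data.frame() turns character vectors into factors by default, and strsplit refuses non-character input. The variant below is a minimal sketch under that assumption; the name insert_slash and the explicit stringsAsFactors = TRUE are illustrative and not part of the original answer. It coerces each column with as.character first and assigns into df[] so the result stays a data frame.

```r
g1 <- c("CC", "DD", "GG")
g2 <- c("AA", "BB", "EE")
g3 <- c("HH", "II", "JJ")

# Force factor columns to mimic the pre-4.0 default (assumption for illustration)
df <- data.frame(g1, g2, g3, stringsAsFactors = TRUE)

# Hypothetical helper: coerce to character, split into single characters,
# then re-join each element's pieces with "/"
insert_slash <- function(x) {
  vapply(strsplit(as.character(x), ""), paste, character(1), collapse = "/")
}

# df[] <- keeps the data.frame structure and column names
df[] <- lapply(df, insert_slash)
```

Using vapply instead of sapply is a design choice here: it guarantees one string per element instead of silently returning a list or matrix on unexpected input.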

Another approach would be to use gsub and the \\B symbol, which matches the empty string that isn't at the beginning or end of a "word".

as.data.frame(lapply(df, gsub, pattern="\\B", replacement="/"))

What constitutes a "word" depends on locale and implementation, so here's another solution using gsub and back-references.

as.data.frame(lapply(df, gsub, pattern="(.)(.)", replacement="\\1/\\2"))
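Note that the pattern "(.)(.)" consumes characters in pairs, so it fits the two-letter elements in the question but would turn "ABC" into "A/BC". If longer strings might occur, one possible variation (a sketch, not part of the original answer) is a Perl-style lookahead, which matches every character that is followed by another one without consuming the follower. The helper name add_slashes is hypothetical.

```r
g1 <- c("CC", "DD", "GG")
g2 <- c("AA", "BB", "EE")
g3 <- c("HH", "II", "JJ")
df <- data.frame(g1, g2, g3, stringsAsFactors = FALSE)

# Hypothetical helper: append "/" to every character that has a successor.
# (?=.) is a lookahead, so it needs perl = TRUE.
add_slashes <- function(x) {
  gsub("(.)(?=.)", "\\1/", x, perl = TRUE)
}

out <- as.data.frame(lapply(df, add_slashes), stringsAsFactors = FALSE)
longer <- add_slashes("ABCD")
```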

Start with a function definition like this:

insertslash <- function(x) sapply(strsplit(x, ""), function(x) paste(x, collapse="/"))

Convince yourself that it does what it should by running insertslash(g1).

To apply it to all columns of the dataframe, do this:

as.data.frame(apply(df, 2, insertslash))

Obviously, you can roll this into one nasty one-liner:

as.data.frame(apply(df, 2, function(x) sapply(strsplit(x, ""), function(x) paste(x, collapse="/"))))

Here's a bit of a hack using gsub. Someone who knows more about regex ought to be able to improve on this:

mySplit <- function(x)
{
  substr(gsub("","/",x),2,4)
}

as.data.frame(apply(df,2,mySplit))

Your original solution wasn't working because you were calling unlist at the wrong spot. If you unlist later and use lapply, things work as you might expect:

mySplit1 <- function(x)
{
  unlist(lapply(strsplit(x,""),paste,collapse="/"))
}

as.data.frame(apply(df,2,mySplit1))
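A small sketch of the difference may help here (the variable names are illustrative). strsplit returns a list with one character vector per input string. Calling paste directly on that list coerces each list element to a single deparsed string, whereas pasting inside lapply joins the characters of each element separately.

```r
x <- c("CC", "AA")
pieces <- strsplit(x, "")   # list of two vectors: c("C", "C") and c("A", "A")

# paste() on the list itself: each element is coerced with as.character(),
# and the coerced strings are collapsed into one value
wrong <- paste(pieces, collapse = "/")

# paste() applied per element: one "letter/letter" string per input
right <- unlist(lapply(pieces, paste, collapse = "/"))

print(wrong)
print(right)
```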

Another hack using paste(). It's definitely not as elegant, but it gets the job done.

for (col in 1:ncol(df)){
  # first character, a slash, then the second character
  df[,col] = paste(substr(df[,col],1,1),"/",substr(df[,col],2,2), sep="")
}
