R Changing Elements in a Dataframe
As the title says, I'm trying to change the elements of my dataframe from one character string to another. The dataframe is as follows:
g1=c("CC","DD","GG")
g2=c("AA","BB","EE")
g3=c("HH","II","JJ")
df=data.frame(g1,g2,g3)
I wish to convert the elements from letterletter format to letter/letter format (e.g. CC to C/C or AA to A/A).
I know using "strsplit" would work on a list. I also know that I would need to somehow incorporate collapse="/".
How can I apply the strsplit function to the entire dataframe?
I was thinking of something along the lines of:
split=function(x)
{
unlist(paste(strsplit(x,""),collapse="/"))
}
j=as.data.frame(apply(df,1,split))
but it doesn't give the desired results.
Update: Apparently, the following script works:
split=function(x)
{
paste(unlist(strsplit(x,"")),collapse="/")
}
p=apply(df,c(1,2),split)
If there is a more efficient or convenient way, please feel free to share.
I can think of two ways to approach this. One is using strsplit like you did. You were only missing the part where you loop over each element of the list returned by strsplit:
Split <- function(x) {
#unlist(lapply(strsplit(x, ""), paste, collapse="/"))
sapply(strsplit(x, ""), paste, collapse="/")
}
as.data.frame(lapply(df, Split))
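As a quick check of the strsplit approach, here is a self-contained sketch using the data from the question. The stringsAsFactors = FALSE argument is an addition of mine, not part of the answer above: on R versions before 4.0 the columns would otherwise be factors, and strsplit only accepts character input.

```r
g1 <- c("CC", "DD", "GG")
g2 <- c("AA", "BB", "EE")
g3 <- c("HH", "II", "JJ")

# Assumption added for this sketch: keep the columns as character vectors
df <- data.frame(g1, g2, g3, stringsAsFactors = FALSE)

Split <- function(x) {
  # strsplit returns one character vector per input string;
  # paste(..., collapse = "/") joins each of those vectors back together
  sapply(strsplit(x, ""), paste, collapse = "/")
}

# lapply works column by column, so the result is still one column per input column
res <- as.data.frame(lapply(df, Split), stringsAsFactors = FALSE)
print(res)
```

Because lapply treats the dataframe as a list of columns, Split is called three times here, once per column, rather than once per cell.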
Another approach would be to use gsub and the \\B symbol, which matches the empty string provided it isn't at the beginning or end of a "word".
as.data.frame(lapply(df, gsub, pattern="\\B", replacement="/"))
What constitutes a "word" depends on locale and implementation, so here's another solution using gsub and back-references.
as.data.frame(lapply(df, gsub, pattern="(.)(.)", replacement="\\1/\\2"))
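One thing to keep in mind is that the back-reference pattern consumes characters in pairs, so it fits the two-letter elements in the question but not longer strings. If longer elements are possible, one option is a lookahead with perl = TRUE. This is only a sketch and not part of the answer above; the vector x is made up for illustration.

```r
# Hypothetical input with elements of different lengths
x <- c("CC", "ABC", "ABCD")

# Pairs of characters: a slash goes inside each pair, not between pairs
paired <- gsub("(.)(.)", "\\1/\\2", x)

# Every character that is followed by another character gets a slash after it
every <- gsub("(.)(?=.)", "\\1/", x, perl = TRUE)

print(paired)
print(every)
```

The lookahead (?=.) checks that another character follows without consuming it, so the next match can start at that character.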
Start with a function definition like this:
insertslash <- function(x) sapply(strsplit(x, ""), function(x) paste(x, collapse="/"))
Convince yourself that it does what it should by running insertslash(g1).
To apply it to all columns of the dataframe, do this:
as.data.frame(apply(df, 2, insertslash))
Obviously, you can roll this into one nasty one-liner:
as.data.frame(apply(df, 2, function(x) sapply(strsplit(x, ""), function(x) paste(x, collapse="/"))))
Here's a bit of a hack using gsub. Someone who knows more about regex ought to be able to improve on this:
mySplit <- function(x)
{
substr(gsub("","/",x),2,4)
}
as.data.frame(apply(df,2,mySplit))
The reason your original solution wasn't working is that you were calling unlist at the wrong spot. If you unlist later and use lapply, things work as you might expect:
mySplit1 <- function(x)
{
unlist(lapply(strsplit(x,""),paste,collapse="/"))
}
as.data.frame(apply(df,2,mySplit1))
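To see why the position of unlist matters: strsplit returns a list with one character vector per input string, and calling paste directly on that list converts each list element to text as a whole instead of joining its letters. A small sketch, with a made-up two-element vector:

```r
x <- c("CC", "DD")

# A list of two character vectors: c("C", "C") and c("D", "D")
parts <- strsplit(x, "")

# paste on the list itself collapses everything into one string,
# and that string is not the letter/letter form we want
wrong <- paste(parts, collapse = "/")

# paste applied to each list element joins the letters of one string at a time
right <- unlist(lapply(parts, paste, collapse = "/"))

print(wrong)
print(right)
```

In the first call there is nothing left for unlist to fix, because paste has already returned a single string.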
Another hack using paste(), definitely not as elegant, but it gets the job done.
for (col in 1:ncol(df)){
  # first character, a slash, then the second character
  df[,col] = paste(substr(df[,col],1,1),"/",substr(df[,col],2,2), sep="")
}
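The explicit loop isn't strictly needed, since substr and paste0 are both vectorized. Here is a sketch of the same idea with lapply, assuming every element has exactly two characters; the stringsAsFactors = FALSE argument is my addition so the columns stay character on older R versions.

```r
df <- data.frame(
  g1 = c("CC", "DD", "GG"),
  g2 = c("AA", "BB", "EE"),
  g3 = c("HH", "II", "JJ"),
  stringsAsFactors = FALSE  # assumption added for this sketch
)

# df[] <- ... replaces the columns in place and keeps df a data frame
df[] <- lapply(df, function(col) {
  paste0(substr(col, 1, 1), "/", substr(col, 2, 2))
})

print(df)
```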