R sapply loop to replace for loop

I have successfully switched for loops to sapply loops before, and I know from timing them with system.time() that they are faster.

But my mind still works in a for-loop way...

Please help me convert this for loop:

names.list <- c("Anna", "Ana", "Albert", "Albort", "Rob", "Robb", "Tommy", "Tommie")
misspell.list <- c("Anna", "Albort", "Robb", "Tommie")
fix.list <- c("Ana", "Albert", "Rob", "Tommy")

for(i in 1:length(fix.list)) {
        names.list[which(names.list == misspell.list[i])] <- fix.list[i]

}

names.list

to an sapply() call.

So far, I have:

sapply(seq_along(fix.list), function(x)
        names.list[which(names.list == misspell.list[x])]  <- fix.list[x]
)

But it only returns the original vector.

Thanks!

EDIT 1:

misspell.list and fix.list were created automatically with adist(), as shown below, and the original names.list has 665 elements. My for() solution leaves length(unique(names.list)) = 653 elements.

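# will do another sapply() substitution here soon
# pre-allocate with a large distance so the unused last slot never matches
distancias <- rep(10, length(names.list))
for(i in 1:(length(names.list)-1)) {
        distancias[i] <- adist(names.list[i], names.list[i+1])
}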

# fix list
misspell.list <- names.list[which(distancias < 2)]
fix.list <- names.list[which(distancias < 2) +1]

EDIT 2: Thanks to you, I'm now an sapply overlord, and I'm here just to show the other for-to-sapply substitution I used with adist():

nomes <- sort(unique(names.list))
distancias <- rep(10, length(nomes))

#adist() for finding misspelling
sapply(seq_along(nomes), 
       function(x) {
                if(x<length(nomes)) {
                        distancias[x] <<- adist(nomes[x], nomes[x+1])
                        }
        }
       )
# fix list (index nomes, the sorted vector the distances were computed on)
misspell.list <- nomes[which(distancias < 2)]
fix.list <- nomes[which(distancias < 2) + 1]

The other part you already know. Thanks again!
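The adjacent-pair distance step in EDIT 2 can also be written without `<<-`. The following is only a sketch, assuming the eight sample names from the top of the question stand in for the real 665-element vector; `near_idx` and `candidates` are illustrative names, not from the original post. It indexes `nomes`, the sorted vector the distances were computed on.

```r
names.list <- c("Anna", "Ana", "Albert", "Albort", "Rob", "Robb", "Tommy", "Tommie")
nomes <- sort(unique(names.list))

# Edit distance between each name and its successor in the sorted vector;
# vapply returns the values directly, so no global assignment is needed
distancias <- vapply(
  seq_len(length(nomes) - 1),
  function(i) as.numeric(utils::adist(nomes[i], nomes[i + 1])),
  numeric(1)
)

# Candidate pairs: adjacent names that are fewer than two edits apart
near_idx <- which(distancias < 2)
candidates <- data.frame(first = nomes[near_idx], second = nomes[near_idx + 1])
candidates
```

With the default costs of adist(), "Tommie" and "Tommy" are two edits apart (one substitution and one deletion), so a threshold of `< 2` would not pair them; whether that threshold suits the real data is something to check.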

If there is a one-to-one correspondence between misspell.list and fix.list, you can do away with loops by using the match function:

names.list[match(misspell.list,names.list)] <- fix.list

names.list
#[1] "Ana"    "Ana"    "Albert" "Albert" "Rob"    "Rob"    "Tommy"  "Tommy"
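One caveat worth checking: match() returns only the first position of each misspelling in names.list, so if a misspelling occurs more than once (plausible in a 665-element vector), the later copies would be left untouched. A minimal sketch of the reverse lookup, which instead looks up every name in misspell.list; the extra trailing "Robb" and the helper names `idx` and `hit` are illustrative additions, not from the original post:

```r
names.list    <- c("Anna", "Ana", "Albert", "Albort", "Rob", "Robb", "Tommy", "Tommie", "Robb")
misspell.list <- c("Anna", "Albort", "Robb", "Tommie")
fix.list      <- c("Ana", "Albert", "Rob", "Tommy")

# For every name, find its position in misspell.list (NA if it is not a misspelling)
idx <- match(names.list, misspell.list)
hit <- !is.na(idx)

# Replace only the misspelled entries, including repeated ones
names.list[hit] <- fix.list[idx[hit]]
names.list
```

Both occurrences of "Robb" are replaced here, because each element of names.list is looked up individually.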

The solution using match is much better, but in terms of what you were trying to do, this will work. First, you don't need which; logical indexing is enough. You also need the <<- operator so that the anonymous function passed to sapply assigns to names.list in the enclosing (here, global) environment rather than in its own local one. Otherwise it changes only a local copy, not names.list itself.

sapply(seq_along(fix.list), function(x)
  names.list[names.list == misspell.list[x]]  <<- fix.list[x]
)

names.list
[1] "Ana"    "Ana"    "Albert" "Albert" "Rob"    "Rob"    "Tommy"  "Tommy" 
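The scoping difference can be seen in isolation with a small sketch. The vector `x` and the functions `local_fix` and `global_fix` are hypothetical names used only for this illustration:

```r
x <- c("a", "b")

# `<-` inside a function modifies a local copy; the outer x is untouched
local_fix <- function() {
  x[1] <- "z"
  invisible(NULL)
}
local_fix()
after_local <- x

# `<<-` assigns to x in the enclosing environment, which here is where x was defined
global_fix <- function() {
  x[1] <<- "z"
  invisible(NULL)
}
global_fix()
after_global <- x
```

After running this, `after_local` is still `c("a", "b")`, while `after_global` is `c("z", "b")`. Relying on `<<-` makes the function depend on side effects, which is one reason the vectorised match approach is usually easier to reason about.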

I would propose a small change to your whole setup. When you use indexes as you do, you need to be sure that the order is always the same. If you add or remove a name, the whole thing falls apart.

Using a named list with lapply or sapply, your code stays dynamic, and you can potentially map multiple misspellings to one name.

misspell.list  <-  list(
  'Anna' = 'Ana',
  'Albort' = 'Albert',
  'Robb' = 'Rob',
  'Tommie' = 'Tommy'
)

names.list <- c("Anna", "Ana", "Albert", "Albort", "Rob", "Robb", "Tommy", "Tommie")


> sapply(names.list,function(x) ifelse(x %in% names(misspell.list),misspell.list[[x]],x))
    Anna      Ana   Albert   Albort      Rob     Robb    Tommy   Tommie 
   "Ana"    "Ana" "Albert" "Albert"    "Rob"    "Rob"  "Tommy"  "Tommy" 

To illustrate what I mean, I'm using sample to shuffle your names.list vector and extend it to 20 names. This shows that order and length have no influence.

sapply(names.list[sample(1:length(names.list),20,replace = T)],function(x) ifelse(x %in% names(misspell.list),misspell.list[[x]],x))
  Albert   Tommie      Rob   Tommie      Rob    Tommy      Ana     Robb   Tommie      Ana   Tommie   Albort      Ana   Albert   Albert   Albort 
"Albert"  "Tommy"    "Rob"  "Tommy"    "Rob"  "Tommy"    "Ana"    "Rob"  "Tommy"    "Ana"  "Tommy" "Albert"    "Ana" "Albert" "Albert" "Albert" 
   Tommy    Tommy    Tommy      Ana 
 "Tommy"  "Tommy"  "Tommy"    "Ana" 

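The same name-keyed lookup can be done without calling a function per element. This is a sketch that assumes a named character vector is an acceptable substitute for the named list; `lookup`, `known`, and `fixed` are illustrative names, not from the answer above:

```r
# Misspelling -> correction, stored as a named character vector
lookup <- c(Anna = "Ana", Albort = "Albert", Robb = "Rob", Tommie = "Tommy")

names.list <- c("Anna", "Ana", "Albert", "Albort", "Rob", "Robb", "Tommy", "Tommie")

# Which entries are known misspellings?
known <- names.list %in% names(lookup)

# Copy the input and overwrite only the known misspellings, indexing lookup by name
fixed <- names.list
fixed[known] <- unname(lookup[names.list[known]])
fixed
```

Because the lookup is keyed by name rather than by position, adding or removing an entry in `lookup` does not disturb the others, and several misspellings can point to the same correction.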