
Recursively apply lapply in R

Out of curiosity, I was testing whether lapply gives me the same result as applying a function to each column manually. I found that lapply seems to behave inconsistently. Here's what I did:

Example 1:

m<-c(2,3,4)
n<-c(5,6,3)
o<-c(1,1,1.5)
dc<-data.frame(m,n,o)

Now, let's analyze the fun part:

lapply(dc,mode)

gives:

lapply(dc,mode)
$m
[1] "numeric"

$n
[1] "numeric"

$o
[1] "numeric"

Let's compare the result above with running mode individually on, say, "m":

  mode(dc$m)

I got:

[1] "numeric"

Ditto for the others. This is all good because the columns are atomic vectors.

Now, let's analyze another example:

Example 2:

a<-c(2,3,4,5,5,3)
b<-c(0,1,1,0,1,0)
b<-factor(b,levels = c(0,1),labels = c("F","M"))
c<-c("Hello","Hi")
datacheck<-data.frame(a,b,c)

Now, I apply the str function to a, b and c individually:

str(datacheck$b)
 Factor w/ 2 levels "F","M": 1 2 2 1 2 1
str(datacheck$c)
 Factor w/ 2 levels "Hello","Hi": 1 2 1 2 1 2
str(datacheck$a)
 num [1:6] 2 3 4 5 5 3

This is all good and expected because b and c are factors (c became a factor because data.frame() converted character vectors to factors by default before R 4.0.0). "a" is just a numeric vector.

Now, when I run lapply, I get:

 lapply(datacheck,str)
 num [1:6] 2 3 4 5 5 3
 Factor w/ 2 levels "F","M": 1 2 2 1 2 1
 Factor w/ 2 levels "Hello","Hi": 1 2 1 2 1 2
$a
NULL

$b
NULL

$c
NULL

My question is: why are $a, $b and $c NULL, rather than the "num" and "Factor" descriptions we got when we ran str() on each column independently? I looked around on SO and also read ?lapply, but I couldn't find an answer.

I'd appreciate your thoughts.

We need to use class, which returns a value, instead of str:

lapply(datacheck, class)

This returns a list, but if we need a vector:

sapply(datacheck, class)
#       a         b         c 
#"numeric"  "factor"  "factor" 
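
As a variation on the sapply call above, here is a minimal sketch using vapply, which checks that every result has the declared type and length. The name col_classes and the choice to keep only the first class of each column are assumptions made for illustration. stringsAsFactors = TRUE is set explicitly so that column c is still a factor on R 4.0.0 and later, where the default changed to FALSE.

```r
# Rebuild the example data from the question.
# stringsAsFactors = TRUE reproduces the factor column `c` on R >= 4.0.0.
a <- c(2, 3, 4, 5, 5, 3)
b <- c(0, 1, 1, 0, 1, 0)
b <- factor(b, levels = c(0, 1), labels = c("F", "M"))
c <- c("Hello", "Hi")
datacheck <- data.frame(a, b, c, stringsAsFactors = TRUE)

# vapply() requires each result to be exactly one character string;
# class(x)[1] keeps only the first class in case a column has several.
col_classes <- vapply(datacheck, function(x) class(x)[1], character(1))
col_classes
```

Unlike sapply, vapply raises an error instead of silently changing the shape of the result when a column does not return what was declared.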

If we need the str output as a character string, we can wrap the call in capture.output, since str only prints its output:

lapply(datacheck, function(x) trimws(capture.output(str(x))))
#$a
#[1] "num [1:6] 2 3 4 5 5 3"

#$b
#[1] "Factor w/ 2 levels \"F\",\"M\": 1 2 2 1 2 1"

#$c
#[1] "Factor w/ 2 levels \"Hello\",\"Hi\": 1 2 1 2 1 2"

By checking the class of what str returns:

class(str(datacheck$a))
num [1:6] 2 3 4 5 5 3
#[1] "NULL"

we get "NULL", which is why every element is NULL in the result of

lapply(datacheck, str)
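
A small sketch of how this can be checked and worked around, assuming the datacheck data frame from the question (rebuilt here so the snippet runs on its own; res is a name introduced only for illustration):

```r
a <- c(2, 3, 4, 5, 5, 3)
b <- factor(c(0, 1, 1, 0, 1, 0), levels = c(0, 1), labels = c("F", "M"))
c <- c("Hello", "Hi")
datacheck <- data.frame(a, b, c, stringsAsFactors = TRUE)

# Store the result instead of letting the console auto-print it.
# str() still prints one line per column as a side effect.
res <- lapply(datacheck, str)

# Every element of the stored list is NULL.
all(vapply(res, is.null, logical(1)))

# If only the printed structure is wanted, invisible() suppresses
# the auto-printed list of NULLs.
invisible(lapply(datacheck, str))

# Calling str() on the data frame itself describes all columns in one call.
str(datacheck)
```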

This can be confirmed by checking the source code of str:

 methods(str)
 #[1] str.data.frame* str.Date*       str.default*    str.dendrogram* str.logLik*     str.POSIXt*    

getAnywhere(str.default)
...
...

 cat(ss, sep = "\n") #just prints the output
 return(invisible())
 ...
 ...

The reason why lapply(datacheck,str) returns a list of NULL is explained in help(str):

Value

str does not return anything, for efficiency reasons. The obvious side effect is output to the terminal.

So, the difference is between what you see printed in the console window and what the function actually returns. Using lapply makes this difference visible, because lapply collects the return values into a list and that list is then printed.
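
To illustrate the distinction without relying on str itself, here is a sketch with two hypothetical helpers, print_only and print_and_return. Neither exists in R; they are defined here only to contrast a function that prints and returns NULL with one that prints and also returns a value.

```r
# Hypothetical helper: prints a message and returns NULL invisibly,
# similar in spirit to how str() behaves.
print_only <- function(x) {
  cat("length:", length(x), "\n")
  invisible(NULL)
}

# Hypothetical helper: prints the same message and returns the length.
print_and_return <- function(x) {
  cat("length:", length(x), "\n")
  length(x)
}

values <- list(p = 1:3, q = letters[1:2])

# Both calls print two lines, but only the second list holds useful values.
r1 <- lapply(values, print_only)
r2 <- lapply(values, print_and_return)

r1  # a list of NULLs
r2  # a list holding the two lengths
```

The printed lines look identical in both cases; only inspecting r1 and r2 shows what each function actually returned.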
