What does the "by" argument in ffbase::as.character do?

In the post "aggregation using ffdfdply function in R", there is a line like this:

splitby <- as.character(data$Date, by = 250000)

Just out of curiosity, I wonder what the by argument means. It seems to be related to ff data frames, but I'm not sure. A Google search and the R documentation for as.character and as.vector provided no useful information.

I tried some examples, but the calls below all give the same result.

d <- seq.Date(Sys.Date(), Sys.Date()+10000, by = "day")
as.character(d, by=1)
as.character(d, by=10)
as.character(d, by=100)

If anybody could tell me what it is, I'd appreciate it. Thank you in advance.

Since as.character.ff works using the default as.character internally, and since ff vectors can be larger than RAM, the data needs to be processed in chunks. The partitioning into chunks is handled by the chunk function. In this case, the relevant method is chunk.ff_vector. By default, this calculates the chunk size by dividing getOption("ffbatchbytes") by the record size. However, this behaviour can be overridden by supplying the chunk size via by.
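To make the chunk-size arithmetic concrete, here is a rough base-R sketch. The byte values are made-up stand-ins (on a real system you would read getOption("ffbatchbytes")), and chunk_ranges is a hypothetical helper, not the ff implementation; the real chunk function may balance chunk sizes differently.

```r
# Hypothetical numbers, for illustration only.
batch_bytes  <- 16 * 1024^2   # stand-in for getOption("ffbatchbytes")
record_bytes <- 8             # assumed size of one element, e.g. a double
n            <- 10000000      # assumed length of the vector

# Default chunk size: batch bytes divided by record size.
default_chunk <- batch_bytes %/% record_bytes

# chunk_ranges() is a hypothetical helper that lists the index range of each chunk.
chunk_ranges <- function(n, by) {
  from <- seq(1, n, by = by)
  to   <- pmin(from + by - 1, n)
  data.frame(from = from, to = to)
}

nrow(chunk_ranges(n, default_chunk))  # number of chunks with the default size
nrow(chunk_ranges(n, 250000))         # number of chunks with by = 250000
```

With these assumed numbers, a smaller by simply means more, smaller chunks covering the same index range.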

In the example you give, the ff vector will be converted to character 250000 elements at a time.

The end result will be the same for any value of by, or with no by at all. Larger values lead to greater temporary use of RAM but potentially quicker operation.
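The point that by changes only the batching, not the result, can be illustrated without ff at all. The function as_character_chunked below is a hypothetical base-R stand-in for a chunked conversion loop; it is not how ffbase is actually written, just a sketch of the idea.

```r
# Hypothetical stand-in: convert x to character `by` elements at a time.
as_character_chunked <- function(x, by = length(x)) {
  if (length(x) == 0) return(character(0))
  out <- character(length(x))
  for (from in seq(1, length(x), by = by)) {
    idx <- from:min(from + by - 1, length(x))
    out[idx] <- as.character(x[idx])   # only this slice is converted per step
  }
  out
}

d <- seq.Date(as.Date("2020-01-01"), by = "day", length.out = 1000)

# Different chunk sizes, same result as a single as.character() call.
identical(as_character_chunked(d, by = 1),   as.character(d))
identical(as_character_chunked(d, by = 250), as.character(d))
identical(as_character_chunked(d),           as.character(d))
```

Only the size of the slice held and converted at each step differs, which is where the trade-off between memory use and speed comes from.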

First, that function is ffbase::as.character, not plain old base::as.character.

See http://www.inside-r.org/packages/cran/ffbase/docs/as.character.ff, which says:

as.character(x, ...)

Arguments:
x: a ff vector
...: other parameters passed on to chunk

So the by argument is being passed through to some chunk function. You then need to figure out which package's chunk function is being used. Type ?chunk, tell us which one, then go read its documentation to see what its by argument does.
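Besides ?chunk, base R's find() and methods() can help locate where a function comes from. The snippet below is only a sketch: it attaches ff solely if it happens to be installed, and which package reports chunk may depend on the ff version, so no output is shown for that part.

```r
# find() lists the attached packages that define an object of that name.
base_hit <- find("as.character")
base_hit

# Only if ff is installed: attach it, then see where chunk lives
# and which methods are registered for it.
if (requireNamespace("ff", quietly = TRUE)) {
  suppressPackageStartupMessages(library(ff))
  print(find("chunk"))
  print(methods("chunk"))
}
```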
