What does the "by" argument in ffbase::as.character do?
In the post below,
aggregation using ffdfdply function in R
there is a line like this:
splitby <- as.character(data$Date, by = 250000)
Just out of curiosity, I wonder what the by argument means. It seems to be related to ff data frames, but I'm not sure. A Google search and the R documentation for as.character and as.vector provided no useful information.
I tried some examples, but the calls below all give the same result.
d <- seq.Date(Sys.Date(), Sys.Date()+10000, by = "day")
as.character(d, by=1)
as.character(d, by=10)
as.character(d, by=100)
If anybody could tell me what it is, I'd appreciate it. Thank you in advance.
Since as.character.ff works by calling the default as.character internally, and since ff vectors can be larger than RAM, the data needs to be processed in chunks. The partitioning into chunks is handled by the chunk function. In this case, the relevant method is chunk.ff_vector. By default, this calculates the chunk size by dividing getOption("ffbatchbytes") by the record size. However, this behaviour can be overridden by supplying the chunk size via by.
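The chunk-size arithmetic described above can be sketched in base R. The batch size and record size below are assumed values chosen only for illustration (the real ones depend on your ff options and the vector's storage mode), and the variable names are hypothetical; ff's own chunking may also rebalance chunk sizes slightly.

```r
# Assumed values for illustration only; not read from ff itself.
batch_bytes  <- 16 * 1024^2  # stand-in for getOption("ffbatchbytes")
record_bytes <- 8            # stand-in for the bytes needed per element

# Default chunk size: batch bytes divided by record size.
default_chunk_size <- batch_bytes %/% record_bytes

n <- 1e6  # hypothetical vector length

# Number of chunks with the default size versus by = 250000.
chunks_default <- ceiling(n / default_chunk_size)
chunks_by      <- ceiling(n / 250000)

print(default_chunk_size)
print(c(default = chunks_default, by_250000 = chunks_by))
```

Under these assumptions a vector of one million elements fits in a single default-sized chunk, whereas by = 250000 forces it to be processed in four passes.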
In the example you give, the ff vector will be converted to character 250000 members at a time.
The end result will be the same for any value of by, or with no by at all. Larger values will lead to greater temporary use of RAM but potentially quicker operation.
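To see why the chunk size does not change the result, here is a minimal base R sketch of chunked conversion. The helper chunked_as_character is hypothetical and only imitates the idea; it is not the ffbase implementation and works on ordinary in-memory vectors.

```r
# Hypothetical helper: convert x to character, at most `by` elements at a time.
chunked_as_character <- function(x, by) {
  n <- length(x)
  res <- character(n)
  # Walk over the start position of each chunk.
  for (from in seq(1L, n, by = by)) {
    idx <- from:min(from + by - 1L, n)  # indices covered by this chunk
    res[idx] <- as.character(x[idx])    # only this slice is converted now
  }
  res
}

d <- seq.Date(as.Date("2020-01-01"), by = "day", length.out = 1000)

r1 <- chunked_as_character(d, by = 1)    # many tiny chunks
r2 <- chunked_as_character(d, by = 250)  # a few larger chunks

identical(r1, r2)               # same output regardless of chunk size
identical(r2, as.character(d))  # and the same as converting all at once
```

Only the amount of data converted per pass differs between the two calls, which is why by affects memory use and speed but not the returned values.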
First, that function is ffbase::as.character, not plain old base::as.character.
See http://www.inside-r.org/packages/cran/ffbase/docs/as.character.ff which says:
as.character(x, ...)
Arguments:
x: a ff vector
...: other parameters passed on to chunk
So the by argument is being passed through to some chunk function. Then you need to figure out which package's chunk function is being used. Type ?chunk, tell us which one, then go read its doc to see what its by argument does.