Understanding how .Internal C functions are handled in R
I wonder if anyone can illustrate how R executes a C call from an R command typed at the console prompt. I am particularly confused by R's treatment of a) function arguments and b) the function call itself.
Let's take an example, in this case set.seed(). Wondering how it works, I type the name at the prompt, get the source (look here for more on that), and see there is eventually a .Internal(set.seed(seed, i.knd, normal.kind)). So I dutifully look up the relevant function name in the .Internals section of src/main/names.c, find it is called do_setseed and is in RNG.c, which leads me to...
SEXP attribute_hidden do_setseed (SEXP call, SEXP op, SEXP args, SEXP env)
{
    SEXP skind, nkind;
    int seed;

    checkArity(op, args);
    if(!isNull(CAR(args))) {
        seed = asInteger(CAR(args));
        if (seed == NA_INTEGER)
            error(_("supplied seed is not a valid integer"));
    } else seed = TimeToSeed();
    skind = CADR(args);
    nkind = CADDR(args);
    //...
    //DO RNG here
    //...
    return R_NilValue;
}
What are CAR, CADR, CADDR? My research leads me to believe they are a Lisp-influenced construct concerning lists, but beyond that I do not understand what these functions do or why they are needed.
What does checkArity() do?
SEXP args seems self-explanatory, but is this a list of the arguments passed in the function call?
What does SEXP op represent? I take this to mean operator (as in binary functions such as +), but then what is SEXP call for?
Is anyone able to walk through what happens when I type set.seed(1) at the R console prompt, up to the point at which skind and nkind are defined? I find I cannot follow the source code at this level, or the path from the interpreter to the C function.
CAR and CDR are how you access pairlist objects, as explained in section 2.1.11 of R Language Definition. CAR contains the first element, and CDR contains the remaining elements. An example is given in section 5.10.2 of Writing R Extensions:
#include <R.h>
#include <Rinternals.h>

SEXP convolveE(SEXP args)
{
    int i, j, na, nb, nab;
    double *xa, *xb, *xab;
    SEXP a, b, ab;

    a = PROTECT(coerceVector(CADR(args), REALSXP));
    b = PROTECT(coerceVector(CADDR(args), REALSXP));
    ...
}
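/* The macros: */
first = CADR(args);
second = CADDR(args);
third = CADDDR(args);
fourth = CAD4R(args);
/* provide convenient ways to access the first four arguments.
 * (In a .External call the head of args holds the function name, so the
 * first actual argument is CADR(args); in a .Internal function such as
 * do_setseed it is CAR(args).)
 * More generally we can use the CDR and CAR macros as in: */
args = CDR(args); a = CAR(args);
args = CDR(args); b = CAR(args);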
There's also a TAG macro to access the names given to the actual arguments.
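To make the CAR / CDR / TAG idea concrete, here is a minimal sketch in plain C that does not use R's headers at all. The Node struct and the TOY_CAR, TOY_CDR and TOY_TAG macros are hypothetical stand-ins invented for this illustration; R's real pairlist nodes are SEXPs with a different layout, so treat this only as a model of the access pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a pairlist node; NOT R's real SEXPREC layout. */
typedef struct Node {
    const char  *tag;    /* argument name, or NULL if unnamed */
    int          value;  /* payload (a real node holds another SEXP) */
    struct Node *next;   /* rest of the list, NULL at the end */
} Node;

/* Hypothetical accessors mirroring the roles of CAR, CDR and TAG. */
#define TOY_CAR(n) ((n)->value)
#define TOY_CDR(n) ((n)->next)
#define TOY_TAG(n) ((n)->tag)

/* Walk the list one node at a time, as the
 * "args = CDR(args); a = CAR(args);" idiom does, summing the values
 * and counting how many arguments carry a name. */
int toy_sum_args(const Node *args, int *n_named)
{
    int sum = 0;
    *n_named = 0;
    for (; args != NULL; args = TOY_CDR(args)) {
        sum += TOY_CAR(args);
        if (TOY_TAG(args) != NULL)
            (*n_named)++;
    }
    return sum;
}
```

Building three nodes by hand (two with a tag, one without) and passing the head to toy_sum_args visits each argument once, in order, which is all that positional extraction needs.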
checkArity ensures that the number of arguments passed to the function is correct. args are the actual arguments passed to the function. op is an offset pointer "used for C functions that deal with more than one R function" (quoted from src/main/names.c, which also contains the table showing the offset and arity for each function).
For example, do_colsum handles col/rowSums and col/rowMeans.
/* Table of .Internal(.) and .Primitive(.) R functions
* ===== ========= ==========
* Each entry is a line with
*
* printname c-entry offset eval arity pp-kind precedence rightassoc
* --------- ------- ------ ---- ----- ------- ---------- ----------
{"colSums", do_colsum, 0, 11, 4, {PP_FUNCALL, PREC_FN, 0}},
{"colMeans", do_colsum, 1, 11, 4, {PP_FUNCALL, PREC_FN, 0}},
{"rowSums", do_colsum, 2, 11, 4, {PP_FUNCALL, PREC_FN, 0}},
{"rowMeans", do_colsum, 3, 11, 4, {PP_FUNCALL, PREC_FN, 0}},
Note that arity in the above table is 4 because (even though rowSums et al. only have 3 arguments) do_colsum has 4, which you can see from the .Internal call in rowSums:
> rowSums
function (x, na.rm = FALSE, dims = 1L)
{
    if (is.data.frame(x))
        x <- as.matrix(x)
    if (!is.array(x) || length(dn <- dim(x)) < 2L)
        stop("'x' must be an array of at least two dimensions")
    if (dims < 1L || dims > length(dn) - 1L)
        stop("invalid 'dims'")
    p <- prod(dn[-(1L:dims)])
    dn <- dn[1L:dims]
    z <- if (is.complex(x))
        .Internal(rowSums(Re(x), prod(dn), p, na.rm)) + (0+1i) *
            .Internal(rowSums(Im(x), prod(dn), p, na.rm))
    else .Internal(rowSums(x, prod(dn), p, na.rm))
    if (length(dn) > 1L) {
        dim(z) <- dn
        dimnames(z) <- dimnames(x)[1L:dims]
    }
    else names(z) <- dimnames(x)[[1L]]
    z
}
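The offset idea (several R-level names sharing one C entry point) can be sketched with a small table in plain C. Everything here is hypothetical and chosen for illustration: ToyEntry, toy_do_sum, toy_call and the names "toySum" and "toyMean" do not exist in R, and R's actual table rows carry more fields than shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Signature of a toy C entry point: the offset says which name was called. */
typedef double (*ToyFn)(int offset, const double *x, size_t n);

/* Toy version of a names.c-style row (only four of the columns). */
typedef struct {
    const char *printname;
    ToyFn       cfun;
    int         offset;
    int         arity;
} ToyEntry;

/* One C function serving two names, selected by offset:
 * 0 = sum, 1 = mean (codes invented for this sketch). */
double toy_do_sum(int offset, const double *x, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += x[i];
    if (offset == 1)
        return n > 0 ? total / (double)n : 0.0;
    return total;
}

static const ToyEntry toy_table[] = {
    {"toySum",  toy_do_sum, 0, 1},
    {"toyMean", toy_do_sum, 1, 1},
};

/* Look a name up in the table and call its C entry with that row's offset.
 * Returns 0 on success, -1 if the name is not in the table. */
int toy_call(const char *name, const double *x, size_t n, double *out)
{
    size_t rows = sizeof toy_table / sizeof toy_table[0];
    for (size_t i = 0; i < rows; i++) {
        if (strcmp(toy_table[i].printname, name) == 0) {
            *out = toy_table[i].cfun(toy_table[i].offset, x, n);
            return 0;
        }
    }
    return -1;
}
```

Both rows point at the same function, so the only thing that distinguishes a "toySum" call from a "toyMean" call inside toy_do_sum is the offset taken from the table row.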
The basic C-level pairlist extraction functions are CAR and CDR. (Pairlists are very similar to lists but are implemented as a linked list and are used internally for argument lists.) They have simple R equivalents: x[[1]] and x[-1]. R also provides lots of combinations of the two:
CAAR(x) = CAR(CAR(x)), which is equivalent to x[[1]][[1]]
CADR(x) = CAR(CDR(x)), which is equivalent to x[-1][[1]], i.e. x[[2]]
CADDR(x) = CAR(CDR(CDR(x))), which is equivalent to x[-1][-1][[1]], i.e. x[[3]]
Accessing the nth element of a pairlist is an O(n) operation, unlike accessing the nth element of a list, which is O(1). This is why there aren't nicer functions for accessing the nth element of a pairlist.
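As a rough illustration of both points (the composed accessors and the O(n) cost), the following plain C sketch builds the combinations out of two primitive macros and counts the steps needed to reach the nth cell. The Cell struct, the TOY_* macros and toy_nth are hypothetical names made up for this example; they model the idea and are not R's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Toy linked-list cell; a real pairlist node holds SEXPs instead. */
typedef struct Cell {
    int          value;
    struct Cell *next;
} Cell;

/* Two primitives, and the combinations built purely from them. */
#define TOY_CAR(c)   ((c)->value)
#define TOY_CDR(c)   ((c)->next)
#define TOY_CADR(c)  TOY_CAR(TOY_CDR(c))            /* second element */
#define TOY_CADDR(c) TOY_CAR(TOY_CDR(TOY_CDR(c)))   /* third element  */

/* Return the cell at zero-based position n, or NULL if the list is too
 * short. *steps receives the number of CDR hops taken, which grows
 * linearly with n: there is no way to jump straight to a position. */
const Cell *toy_nth(const Cell *c, size_t n, size_t *steps)
{
    *steps = 0;
    while (c != NULL && n > 0) {
        c = TOY_CDR(c);
        n--;
        (*steps)++;
    }
    return c;
}
```

Reaching the third cell costs two hops here, exactly the two TOY_CDR applications spelled out inside TOY_CADDR, whereas an array-backed list would index it directly.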
Internal/primitive functions don't do matching by name; they only use positional matching, which is why they can use this simple system for extracting the arguments.
Next you need to understand what the arguments to the C function are. I'm not sure where these are documented, so I might not be completely right about the structure, but these should be the general pieces:
call: the complete call, as might be captured by match.call()
op: the index of the .Internal function called from R. This is needed because there is a many-to-1 mapping from .Internal functions to C functions (e.g. do_summary implements sum, mean, min, max and prod). The number is the third entry in names.c; it's always 0 for do_setseed and hence never used.
args: a pairlist of the arguments supplied to the function.
env: the environment from which the function was called.
checkArity is a macro which calls Rf_checkArityCall, which basically looks up the number of arguments (the fifth column in names.c is arity) and makes sure the supplied number matches. You have to follow through quite a few macros and functions in C to see what's going on; it's very helpful to have a local copy of the R source that you can grep through.