What happens during evaluation in R?
I'm interested in how the most basic thing, evaluation, works in R.
I came to R as a biologist, and although I'm interested in everything related to code, evaluation is still a bit mysterious to me.
I think I understand properly:
But technically, what happens behind the curtain when we evaluate something in R, that is, when we press Enter after one or more lines of code?
I found this in the R Language Definition by the core team:
When a user types a command at the prompt (or when an expression is read from a file) the first thing that happens to it is that the command is transformed by the parser into an internal representation. The evaluator executes parsed R expressions and returns the value of the expression. All expressions have a value. This is the core of the language.
But it is abstruse to me (particularly the part in bold), and the subsections do not help me disentangle it.
Do I have to open a fundamental book on computer science to understand this, or is there another way to understand, technically, what I'm doing 8 hours a day?
This is going to be an incomplete answer, but it seems your question is about the nature of the "internal representation." In essence, R's parser takes arbitrary R code, removes irrelevant stuff (like superfluous whitespace), and creates a nested set of expressions to evaluate. We can use pryr::call_tree() to see what is going on.
Take a simple expression that only uses mathematical operators:
> 1 + 2 - 3 * 4 / 5
[1] 0.6
The output of that series of operations respects R's operator precedence rules. But what is actually happening? First, the parser converts whatever is typed into an "expression":
> parse(text = "1 + 2 - 3 * 4 / 5")
expression(1 + 2 - 3 * 4 / 5)
This expression masks a deeper complexity:
> library("pryr")
> call_tree(parse(text = "1 + 2 - 3 * 4 / 5"))
\- ()
  \- `-
  \- ()
    \- `+
    \- 1
    \- 2
  \- ()
    \- `/
    \- ()
      \- `*
      \- 3
      \- 4
    \- 5
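The same structure can be inspected without pryr, using only base R. A minimal sketch (the variable name `e` is just for illustration): a parsed call behaves like a list whose first element is the function and whose remaining elements are the arguments.

```r
# Parse the text and take the first (and only) expression
e <- parse(text = "1 + 2 - 3 * 4 / 5")[[1]]

class(e)        # "call"
e[[1]]          # the outermost function: `-`
e[[2]]          # its first argument:  1 + 2
e[[3]]          # its second argument: 3 * 4 / 5
e[[3]][[1]]     # the function inside the second argument: `/`
```

Each argument that is itself a call can be indexed the same way, which is how the nesting in the tree arises.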
This expression is the nested evaluation of four functions: "+"() for the left-hand side, "*"() and then "/"() for the right-hand side, and finally "-"() applied to the two results. Thus, this can actually be rewritten as a deeply nested expression:
> "-"("+"(1,2), "/"("*"(3,4), 5))
[1] 0.6
> call_tree(parse(text = '"-"("+"(1,2), "/"("*"(3,4), 5))'))
\- ()
  \- `-
  \- ()
    \- `+
    \- 1
    \- 2
  \- ()
    \- `/
    \- ()
      \- `*
      \- 3
      \- 4
    \- 5
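One way to check that the infix and prefix forms really are the same thing to R is to compare the unevaluated calls directly. This sketch uses base R's quote() and identical(), with backticks rather than double quotes around the operator names; the variable names are illustrative only.

```r
# The same computation written with infix operators and as explicit calls
infix  <- quote(1 + 2 - 3 * 4 / 5)
prefix <- quote(`-`(`+`(1, 2), `/`(`*`(3, 4), 5)))

identical(infix, prefix)   # TRUE: both are the same nested call object
eval(prefix)               # prints [1] 0.6
```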
Multi-line input is also parsed into individual expressions:
> parse(text = "1; 2; 3")
expression(1, 2, 3)
> parse(text = "1\n2\n3")
expression(1, 2, 3)
> call_tree(parse(text = "1; 2; 3"))
\- 1
\- 2
\- 3
These call trees are then evaluated.
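Evaluation can also be triggered by hand with base R's eval(). A small sketch, with illustrative variable names: evaluating a single parsed call returns its value, and evaluating a whole expression vector evaluates each element in turn and returns the value of the last one.

```r
# One expression: evaluate the call stored in the first element
exprs <- parse(text = "1 + 2 - 3 * 4 / 5")
eval(exprs[[1]])      # prints [1] 0.6

# Several expressions: each is evaluated, the last value is returned
several <- parse(text = "1; 2; 3")
length(several)       # 3 separate expressions
eval(several)         # 3, the value of the last expression
```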
Thus, when R's read-eval-print loop executes, it parses the code typed in the interpreter or sourced from a file into this call tree structure, then evaluates each function call in turn, and then prints the result (unless an error occurs). Errors occur when a parsable line of code cannot be fully evaluated:
> call_tree(parse(text = "2 + 'A'"))
\- ()
  \- `+
  \- 2
  \- "A"
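The tree above is built without complaint; the failure only shows up once the call is evaluated. A sketch using tryCatch() to capture the error instead of stopping (the exact wording of the message depends on the R version and locale, so it is not reproduced here):

```r
parsed <- parse(text = "2 + 'A'")   # parsing succeeds
is.call(parsed[[1]])                # TRUE: a perfectly valid call object

# Evaluation is where it fails: `+` cannot add a number and a string
msg <- tryCatch(
  eval(parsed[[1]]),
  error = function(e) conditionMessage(e)
)
msg                                 # the captured error message, as a string
```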
And a parsing failure occurs when a typable line of code cannot be parsed into a call tree:
> parse(text = "2 + +")
Error in parse(text = "2 + +") : <text>:2:0: unexpected end of input
1: 2 + +
   ^
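Putting the pieces together, a very rough imitation of the read-eval-print loop can be written in a few lines of base R. This is only a sketch under simplifying assumptions: `mini_repl` is a made-up name, and the real loop is implemented in R's C code and does considerably more. It does show the two distinct places where things can go wrong, parsing and evaluation.

```r
# Hypothetical helper: parse, evaluate and print a string of R code
mini_repl <- function(code, env = new.env(parent = globalenv())) {
  # Read: turn text into a vector of expressions; NULL signals a syntax error
  exprs <- tryCatch(parse(text = code), error = function(e) NULL)
  if (is.null(exprs)) return("parse error")

  for (i in seq_along(exprs)) {
    # Eval: evaluate one expression and record whether its value is visible
    res <- tryCatch(
      withVisible(eval(exprs[[i]], envir = env)),
      error = function(e) NULL
    )
    if (is.null(res)) return("evaluation error")
    # Print: only visible values are shown, as at the prompt
    if (res$visible) print(res$value)
  }
  "ok"
}

mini_repl("x <- 1 + 2\nx * 10")   # prints [1] 30, returns "ok"
mini_repl("2 + 'A'")              # returns "evaluation error"
mini_repl("2 + +")                # returns "parse error"
```

The assignment in the first call prints nothing because its value is invisible, which matches what happens at the real prompt.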
That's not a complete story, but perhaps it gets you part of the way to understanding.