How can I push the current execution state onto a stack so that I can continue from it later?
Imagine a simple grammar:
(a|ab)c
This reads as (a or ab) followed by c. The parse tree would look like this:
    and
   /   \
  or    c
 /  \
a    ab
Now give it this input:
abc
We would first traverse down the left side of the tree and match "a", then go back up a level. Since "a" matched, the "or" is also true, so we move on to the "c". The next input character is "b", so "c" does not match, and we hit the end of the road.
But there was an alternate path it could have taken; had we traversed down to "ab", we would have found a match.
So what I want to do for "or" nodes is essentially this:
Then whenever the parser hits a dead end, I want to pop an item off the stack and continue from there again.
That's the part I can't figure out... how do I essentially save the current call stack? I can save the "ab" node in a stack so that I know I have to execute that one next, but then it still needs to know that it has to fall back up to the "or" afterwards.
I think Chris was on to something. We have to find a way to translate the tree such that it isn't necessary to jump across branches like that. For example, this equivalent parse tree doesn't have that problem:
       or
     /    \
  and      and
  / \      / \
 a   c   ab   c
This time we parse down the left and hit "a"; it passes, so we try the "c" node beside it. That fails, so the "and" fails, and the "or" has to try the right branch: "ab" passes, the other "c" passes, and then the whole expression passes.
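The tree rewrite above can be sketched in Python. This is only an illustration under assumptions of my own: the tuple encoding of nodes and the names `match` and `distribute` are hypothetical, and `match` deliberately commits to the first alternative that succeeds, which is the behaviour that fails on the original tree.

```python
# Hypothetical node encoding: ("lit", text), ("and", left, right), ("or", left, right).

def match(node, text, pos):
    """Return the position after a match, or None. An "or" commits to its first success."""
    kind = node[0]
    if kind == "lit":
        return pos + len(node[1]) if text.startswith(node[1], pos) else None
    if kind == "and":
        mid = match(node[1], text, pos)
        return None if mid is None else match(node[2], text, mid)
    if kind == "or":
        left = match(node[1], text, pos)
        return left if left is not None else match(node[2], text, pos)
    raise ValueError(f"unknown node kind: {kind}")


def distribute(node):
    """Pull every "or" above the "and" nodes: and(or(x, y), z) -> or(and(x, z), and(y, z))."""
    kind = node[0]
    if kind == "lit":
        return node
    left, right = distribute(node[1]), distribute(node[2])
    if kind == "or":
        return ("or", left, right)
    if left[0] == "or":
        return ("or", distribute(("and", left[1], right)),
                      distribute(("and", left[2], right)))
    if right[0] == "or":
        return ("or", distribute(("and", left, right[1])),
                      distribute(("and", left, right[2])))
    return ("and", left, right)


grammar = ("and", ("or", ("lit", "a"), ("lit", "ab")), ("lit", "c"))
print(match(grammar, "abc", 0))              # the original tree fails on "abc"
print(match(distribute(grammar), "abc", 0))  # the rewritten tree matches
```

One caveat with this approach: distributing every "and" over every "or" can make the tree grow very quickly for grammars with many nested alternatives, so it is a workaround for small grammars rather than a general replacement for backtracking.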
You have the answer to your question in the way you posed it.
You need to save the state. The tricky part is identifying the state. Saving it is easy.
Your problem is that the parser "has a state" when it starts parsing some grammar rule. (This gets messier if you use an LALR parser, which merges the parsing of many rules into a single state). That state consists of:
the current input location, the parse stack, "where the parser goes on fail", and "where it goes on success".
When you are parsing and face a choice between alternatives as you have described, you need to "save the state" and run a trial parse on the first alternative. If it succeeds, you can throw away the saved state and continue. If it fails, restore the state and try the 2nd (and nth) alternatives. (It's easier to be brainless and just save the state regardless of whether you face an alternative, but that's up to you).
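The save / trial parse / restore cycle could look roughly like this in Python. It is a minimal sketch, not a definitive implementation: the `Parser` class, its `attempt` method, and the two alternative functions are hypothetical names, the only state in this toy parser is the input position, and each alternative is assumed to include whatever must follow it (here the trailing "c") so that a later failure still counts as a failed trial.

```python
class Parser:
    def __init__(self, text):
        self.text = text
        self.pos = 0      # the only state this toy parser has
        self.saved = []   # stack of saved states

    def literal(self, s):
        """Consume s if it is next in the input."""
        if self.text.startswith(s, self.pos):
            self.pos += len(s)
            return True
        return False

    def attempt(self, *alternatives):
        """Run a trial parse of each alternative in order."""
        for alternative in alternatives:
            self.saved.append(self.pos)    # save the state
            if alternative(self):
                self.saved.pop()           # success: throw the saved state away
                return True
            self.pos = self.saved.pop()    # failure: restore, try the next one
        return False


def a_then_c(p):
    return p.literal("a") and p.literal("c")


def ab_then_c(p):
    return p.literal("ab") and p.literal("c")


parser = Parser("abc")
print(parser.attempt(a_then_c, ab_then_c), parser.pos)
```

Because `attempt` can be called again from inside an alternative, the `saved` stack grows and shrinks in step with the nesting of trial parses.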
How can you save the state? Push it onto a stack. (You typically have a parse stack, and that's a pretty convenient place! If you don't like that, add another stack, but you'll discover that it and the parse stack generally move synchronously; I just make the parse stack contain a record with all the stuff I need, including space for the input. And you'll find the "call stack" convenient for parts of the state; see below).
The first thing is to save the input location; that is likely a file source position and, for optimization reasons, likely a buffer index as well. That's just a scalar, so it is pretty easy to save. Restoring the input stream may be harder; there's no guarantee that the parser's input scanner is anywhere near where it was. So you need to reposition the file, re-read any buffer, and reposition any input buffer pointer. Some simple checks can make this statistically cheap: store the file position of the first character of any buffer; then determining whether you have to re-read the buffer is a matter of comparing the saved file position with the buffer start file position. The rest should be obvious.
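As a rough illustration of the buffer check described above, here is a small Python sketch. The `BufferedInput` class, its tiny default buffer size, and the `reloads` counter are all assumptions made for the example; a real scanner would differ in detail. The saved state is a single absolute file position, and `restore` only re-reads the buffer when that position falls outside the buffer currently held.

```python
import io


class BufferedInput:
    def __init__(self, stream, size=4):
        self.stream = stream
        self.size = size
        self.reloads = 0      # counts buffer reads, for illustration only
        self._load(0)

    def _load(self, file_pos):
        """Reposition the file and read one buffer starting at file_pos."""
        self.stream.seek(file_pos)
        self.buf = self.stream.read(self.size)
        self.buf_start = file_pos   # file position of the first byte in buf
        self.index = 0              # read pointer within buf
        self.reloads += 1

    def tell(self):
        """The scalar to save: absolute position of the next byte."""
        return self.buf_start + self.index

    def next_byte(self):
        if self.index >= len(self.buf):
            self._load(self.buf_start + len(self.buf))
            if not self.buf:
                return None         # end of input
        value = self.buf[self.index]
        self.index += 1
        return value

    def restore(self, saved_pos):
        offset = saved_pos - self.buf_start
        if 0 <= offset <= len(self.buf):
            self.index = offset     # still buffered: just move the pointer
        else:
            self._load(saved_pos)   # otherwise reposition the file and re-read


source = BufferedInput(io.BytesIO(b"abcdefgh"))
source.next_byte()
source.next_byte()
saved = source.tell()
source.next_byte()
source.restore(saved)               # cheap: no buffer re-read needed here
```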
You'll backtrack through fewer buffers (so your parser runs faster) if you rearrange your grammar to minimize that. In your specific grammar, you have "(a | ab) c", which could be rewritten by hand to "a b? c". The latter will at least not backtrack across whatever a represents.
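To make the effect of that rewrite concrete, here is a small Python sketch of a parser for "a b? c", treating a, b, and c as the literal characters from the question. The function name is hypothetical. No state is saved or restored because no decision ever has to be undone; that only holds here because "b" and "c" are distinct, so a single look at the next character settles whether the optional "b" is present.

```python
def parse_a_optb_c(text):
    """Match `a b? c` from the start of text; return the end position or None."""
    pos = 0
    if not text.startswith("a", pos):
        return None
    pos += 1
    if text.startswith("b", pos):    # optional b: consume it if present
        pos += 1
    if not text.startswith("c", pos):
        return None
    return pos + 1


print(parse_a_optb_c("abc"), parse_a_optb_c("ac"), parse_a_optb_c("ab"))
```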
The odd part is saving the parse stack. Well, you don't have to, because your trial parse is only going to extend the parse stack you have, and restore it to the parse state you have whether your subparse succeeds or fails.
"where the parser goes on fail" and "where it goes on success" are just two more scalars. You can represent them as indexes of your parsing code blocks, as program counters (e.g., continuations), or as a return address on your call stack (see? another parallel stack!) followed by a conditional test for success/failure.
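One way to picture the continuation idea is the following Python sketch. It is an assumption-laden illustration, not the approach of any particular parser generator: the combinator names `lit`, `seq`, `alt`, and `matches` are hypothetical. "Where it goes on success" is passed in explicitly as the `on_success` function, while "where it goes on fail" is simply the return address on the Python call stack, followed by a test of the returned boolean.

```python
def lit(s):
    def parse(text, pos, on_success):
        if text.startswith(s, pos):
            return on_success(pos + len(s))   # continue with the rest of the parse
        return False                          # fail: return to whoever called us
    return parse


def seq(first, second):
    def parse(text, pos, on_success):
        # The continuation of `first` is "run `second`, then the original continuation".
        return first(text, pos, lambda mid: second(text, mid, on_success))
    return parse


def alt(left, right):
    def parse(text, pos, on_success):
        # If anything after `left` fails, control returns here and `right` is tried
        # from the same position with the same continuation.
        return left(text, pos, on_success) or right(text, pos, on_success)
    return parse


def matches(parser, text):
    return parser(text, 0, lambda end: end == len(text))


grammar = seq(alt(lit("a"), lit("ab")), lit("c"))
print(matches(grammar, "abc"))
```

In this arrangement the "or" node never needs to be told to fall back: its stack frame is still live while "c" is being tried, so a failure of "c" after "a" returns into `alt`, which then tries "ab".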
If you want some details on the latter, check out my SO answer on hand-coded recursive descent parsers.
If you start building trees, or doing something else as a side effect of the parse, you'll have to figure out how to capture/save the state of the side-effected entity, and restore it. But whatever it is, you'll end up pushing it on a stack.
What you should do is call a method for each possibility. If you hit a dead end, you can return, and you will be right back where you were, ready to try the next option.
You can indicate whether you successfully parsed the branch by returning a value from the parsing method. For example, you could return true for success and false for failure. In this case, if the parse returns false, you try the next option.
Just return your state in addition to your result. Let's take a simple example where you can have an index for each element:
Grammar: (a|ab)c
Translated: AND(OR(a,ab),c)
Input: abc
Call AND
  Call OR
    a matches, return true,1
  c does not match, start over
  Call OR giving 1
    ab matches, return true,2
  c matches, return true
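The trace above can be reproduced with a short Python sketch. The function names and the exact shape of the returned tuple are assumptions for illustration; the idea taken from the trace is that the OR reports the index of the next untried alternative alongside its result, so the AND can call it again "giving 1" after the "c" fails.

```python
def parse_or(alternatives, text, pos, start=0):
    """Try alternatives from index `start`; return (matched, end_pos, next_start)."""
    for i in range(start, len(alternatives)):
        if text.startswith(alternatives[i], pos):
            return True, pos + len(alternatives[i]), i + 1
    return False, pos, len(alternatives)


def parse_and(alternatives, tail, text):
    """AND(OR(alternatives), tail): re-enter the OR each time the tail fails."""
    start = 0
    while True:
        matched, end, start = parse_or(alternatives, text, 0, start)
        if not matched:
            return False    # the OR has no alternatives left
        if text.startswith(tail, end):
            return True     # the tail matched after this alternative
        # otherwise start over, resuming the OR at index `start`


print(parse_or(["a", "ab"], "abc", 0))      # first call: "a" matches
print(parse_or(["a", "ab"], "abc", 0, 1))   # resumed call: "ab" matches
print(parse_and(["a", "ab"], "c", "abc"))
```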
You will need a more complex structure to handle harder cases (whether it is a queue or a stack depends on how you build up and tear down your state when recreating it).
You need to use recursion.
Something like:
in the "or" statement:
    for each statement:
        bool ret = eval(statement)
        if (ret):
            bool recVal = call the recursion
            if (recVal): you have found a path, so stop the recursion
        otherwise, continue the loop and try the next statement
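A runnable reading of that pseudocode, in Python, might look like the sketch below. The representation is an assumption: the grammar is flattened into a list of "or" groups, each a list of literal alternatives, and `find_path` is a hypothetical name. "Call the recursion" is taken to mean "try to match the remaining groups from the position just reached".

```python
def find_path(groups, text, pos=0):
    """groups is a list of "or" groups, e.g. [["a", "ab"], ["c"]] for (a|ab)c."""
    if not groups:
        return pos == len(text)             # nothing left to match: are we at the end?
    for statement in groups[0]:
        ret = text.startswith(statement, pos)
        if ret:
            rec_val = find_path(groups[1:], text, pos + len(statement))
            if rec_val:
                return True                 # found a path, stop the recursion
        # otherwise continue the loop and try the next statement
    return False


print(find_path([["a", "ab"], ["c"]], "abc"))
```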