Prolog - Do plain rules have better performance than lists?

I have a set of DCG rules (in this case, German personal pronouns):

% personal pronoun (person, case, number, genus)
ppers(1,0,sg,_) --> [ich].
ppers(1,1,sg,_) --> [meiner].
ppers(1,2,sg,_) --> [mir].
ppers(1,3,sg,_) --> [mich].
ppers(2,0,sg,_) --> [du].
ppers(2,1,sg,_) --> [deiner].
ppers(2,2,sg,_) --> [dir].
ppers(2,3,sg,_) --> [dich].
...

Because they are semantically connected, it would make sense to me to preserve that connection by moving them into lists (grouped by person, for example) instead of keeping them as unrelated rules. This also makes things a bit neater:

ppers(1,sg,_,[ich, meiner, mir, mich]).
ppers(2,sg,_,[du,deiner,dir,dich]).
...

I would then select the item I want with nth0/3, using the case I need as the index into the list.
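For concreteness, the list-based lookup described above might be wired into a DCG roughly as follows. This is only a sketch: the fact name ppers_forms/4 is hypothetical (renamed so it does not clash with the nonterminal ppers//4), and library(lists) is assumed to provide nth0/3, as it does in SWI-Prolog.

```prolog
:- use_module(library(lists)).   % assumed source of nth0/3 (SWI-Prolog)

% ppers_forms(Person, Number, Genus, FormsIndexedByCase)  -- hypothetical name
ppers_forms(1, sg, _, [ich, meiner, mir, mich]).
ppers_forms(2, sg, _, [du, deiner, dir, dich]).

% Consume one word, then look it up by case index in the matching list.
ppers(Person, Case, Number, Genus) -->
    [Word],
    { ppers_forms(Person, Number, Genus, Forms),
      nth0(Case, Forms, Word) }.
```

With Person and Case unbound, a query such as phrase(ppers(P, C, sg, _), [dir]) has to try each fact and walk each list, which is the crawling the tracer shows.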

However, I noticed while tracing the program that, when it checks a German sentence for correct grammar and tries to find out whether a part is a personal pronoun, Prolog does not step through every clause when I use the first version (plain rules), but crawls through every list when I use the list version.

Does this mean that performance will be worse if I use lists and nth0/3 instead of plain rules? Or does the Prolog tracer just not show the crawling for plain rules as it does for lists?

(I hope my question is clear enough; if not, I will expand.)

Most probably the speed and tracing difference is not caused by indexing (*), but by the difference between clause head unification and a body call to nth0/3. If you really want to take advantage of indexing and want to be portable (**) across most Prolog systems, you would need to reformulate your problem for first-argument indexing.

One way to do this is via an additional predicate. Suppose you originally have these DCG rules:

cat(Attr1, .., Attrn) --> [Terminal1, .., Terminaln].
..

Transform this into:

cat(X1, .., Xn) --> [Y], cat2(Y, X1, .., Xn).

cat2(Terminal1, Attr1, .., Attrn) --> [Terminal2, .., Terminaln].
..

When we apply this to your example, we get:

% personal pronoun (person, case, number, genus)
ppers(X1,X2,X3,X4) --> [Y], ppers2(Y,X1,X2,X3,X4).

% personal pronoun 2 (first word, person, case, number, genus)
ppers2(ich,1,0,sg,_) --> [].
ppers2(meiner,1,1,sg,_) --> [].
ppers2(mir,1,2,sg,_) --> [].
ppers2(mich,1,3,sg,_) --> [].
ppers2(du,2,0,sg,_) --> [].
ppers2(deiner,2,1,sg,_) --> [].
ppers2(dir,2,2,sg,_) --> [].
ppers2(dich,2,3,sg,_) --> [].

You can do this for each category in your code; the result is a kind of lexicon table. The above works independently of how DCGs are translated, and if first-argument indexing is present, it will be lightning fast.
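As a usage sketch, two entries of the lexicon above are repeated here so the block loads on its own. Because the word read from the input becomes the first argument of ppers2, a system with first-argument indexing can pick the matching clause directly from the atom.

```prolog
% Wrapper: read one word, then dispatch on it as the first argument.
ppers(P, C, N, G) --> [W], ppers2(W, P, C, N, G).

% Two lexicon entries (first word, person, case, number, genus).
ppers2(ich,  1, 0, sg, _) --> [].
ppers2(dich, 2, 3, sg, _) --> [].

% ?- phrase(ppers(P, C, N, _), [dich]).
% binds P = 2, C = 3, N = sg
```

The same rules also work in the generating direction, for example phrase(ppers(1, 0, sg, _), Ws) yields Ws = [ich], although in that mode the first argument of ppers2 is unbound and indexing on it does not help.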

Bye

(*) Even if your Prolog system can do multi-argument indexing, it might still not do complex term indexing. To index on [ich|X], the Prolog system would need to descend into the list, but most probably it does not descend and only indexes on the list constructor (.)/2, so that all clauses look the same and indexing has no positive effect.

(**) I guess the only common denominator among Prolog systems, as far as indexing is concerned, is first-argument indexing. Besides that, not all Prolog systems put a terminal into the head. Some might use =/2 in the body and some might use 'C'/3 in the body. DCGs are currently not standardized with respect to the modelling of terminals.

In general, the tracer will show you what actually happens, so yes: if it iterates where the alternative formulation directly accesses the target term via matching, then that will also happen when you're not looking. But to find out whether that actually means worse performance, you have to measure and compare both alternatives in a real scenario. The unification might be slow even though the tracer does not show it as a separate step, or your run-time system might apply optimizations, or even compile code, in ways that don't happen under trace. Or it might be slower, but not enough to worry about. Here, as always, the golden rule is: measure, then optimize.
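A minimal way to measure could look like the sketch below. The helper name bench/3 is hypothetical; statistics(runtime, ...) and between/3 are assumed to be available, as they are in SWI-Prolog and YAP. The failure-driven loop explores all solutions of the goal on every iteration, so it times the full search rather than only the first answer.

```prolog
% bench(+Goal, +N, -Millis): run Goal N times and report CPU milliseconds.
% Hypothetical helper, not part of any library.
bench(Goal, N, Millis) :-
    statistics(runtime, [T0|_]),
    (   between(1, N, _),
        call(Goal),
        fail                     % force backtracking; discards bindings
    ;   true
    ),
    statistics(runtime, [T1|_]),
    Millis is T1 - T0.
```

One would then compare, for instance, bench(phrase(ppers(_, _, _, _), [dich]), 1000000, Ms) for each formulation, loaded one at a time and run outside the tracer.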

Why are you using nth0/3? It may be the performance culprit; consider using memberchk/2 instead.

Apart from this, I think your intuition about performance is well founded, and the reason is 'argument indexing'. DCG rules are usually translated into plain Prolog clauses (I'm using SWI-Prolog here):

ppers(1,0,sg,_) --> [ich].

becomes

ppers(1, 0, sg, _, [ich|A], A).
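To see what your own system produces, a small helper along these lines can print the translated clause. The name show_translation/1 is hypothetical; expand_term/2 and portray_clause/1 are assumed to be available, as they are in SWI-Prolog. The exact shape of the output differs between systems, since some put the terminal into the head and others into the body.

```prolog
% show_translation(+Rule): print the clause a DCG rule is translated into.
% Hypothetical helper for inspection only.
show_translation(Rule) :-
    expand_term(Rule, Clause),
    portray_clause(Clause).

% ?- show_translation((ppers(1,0,sg,_) --> [ich])).
```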

A recent optimization of the SWI-Prolog virtual machine, inspired (I think) by YAP, automatically builds the indexes for predicates having sufficiently bound arguments.

Thus you can expect that parsing (using SWI-Prolog) with your first scheme will be more efficient.

Previously, only 'first-argument indexing' was implemented. In that case (or if you are using a Prolog without indexing capabilities), you should find very similar timings between these schemes.

HTH

Grammar rules are compiled into predicate clauses, which are usually indexed. Most, if not all, Prolog compilers use first-argument indexing (by default) to avoid trying clauses that will never be part of the proof tree when proving a goal. Thus, depending on your call patterns, and as you observed using your Prolog compiler's tracing support, the system will not step into every predicate clause. Moreover, calling the nth0/3 predicate with an instantiated index still requires a linear traversal of the list until the specified index is reached. The same applies if, as others suggested, you use the memberchk/2 predicate. A list is a list.
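To illustrate the linear traversal, here is a simplified definition of index-based access for the case where the index is instantiated. The name my_nth0/3 is hypothetical and this is not the actual library code; it only shows that reaching position I takes I recursive steps.

```prolog
% my_nth0(+Index, +List, -Elem): zero-based access, Index must be bound.
% Illustration only; the library nth0/3 also handles an unbound index.
my_nth0(0, [X|_], X).
my_nth0(I, [_|T], X) :-
    I > 0,
    I1 is I - 1,
    my_nth0(I1, T, X).

% ?- my_nth0(3, [ich, meiner, mir, mich], W).
% binds W = mich after stepping past three elements
```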
