
How to compare features of two different annotations within a Ruta rule?

I am processing a text with UIMA Ruta and want to remove duplicated annotations. I consider an annotation a duplicate if certain features, for instance a name, have the same value as in another annotation. I have tried several approaches without success; the following examples should give an idea of what I am trying to do:

STRING nameVal;
Person {-> GETFEATURE("name", nameVal)}  
ANY+? 
Person.name == nameVal {-> UNMARK(Person)};

I have also tried this variation:

STRING nameVal;
Person {-> GETFEATURE("name", nameVal)}  
ANY+? 
Person {-> UNMARK(Person)} <- { Person.name == nameVal; };

If I replace the variable nameVal with a literal (see the next example), the rule works and comes close to what I want, but not quite.

Person
ANY+? 
Person.name == "Mustermann" {-> UNMARK(Person)};

I believe the problem is that the global variable has not yet been initialized when the comparison is evaluated. Is there a way in Ruta to compare a feature of the first matched annotation with a feature of the last matched annotation inside the same rule?

Yes, the problem is that the actions are executed only once the complete rule has matched, that is, after all conditions have been evaluated. You need an action to assign the feature value to a variable, but you need a condition to compare that variable with another feature, so the variable is still unset when the condition is checked.

Nevertheless, there are many ways to solve this in Ruta, e.g., with additional rules, a BLOCK, or inlined rules used as actions. The best way is to use label expressions. UIMA Ruta 2.5.0 makes our life much easier here. You can write something like this:

p1:Person # p2:Person{p1.name == p2.name -> UNMARK(Person)};

or

p1:Person # Person.name==p1.name{ -> UNMARK(Person)};

You can probably write a faster rule if you use a STRINGLIST: if the value is already contained in the list, unmark the annotation; if not, add the value to the list.
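To make the list-based idea concrete, here is a minimal sketch of the same logic in plain Python rather than Ruta. The Person class, its begin offset, and the drop_duplicates helper are hypothetical stand-ins for illustration only; they are not part of UIMA Ruta, and the sketch assumes annotations are visited in document order.

```python
from dataclasses import dataclass


@dataclass
class Person:
    """Hypothetical stand-in for a Person annotation with a name feature."""
    begin: int  # start offset in the document (assumed, used only for ordering)
    name: str


def drop_duplicates(persons):
    """Keep the first Person for each name, visiting them in document order."""
    seen_names = []  # plays the role of the STRINGLIST
    kept = []
    for person in sorted(persons, key=lambda p: p.begin):
        if person.name in seen_names:
            # Value already in the list: this is where Ruta would UNMARK(Person).
            continue
        # Value not yet in the list: remember it and keep the annotation.
        seen_names.append(person.name)
        kept.append(person)
    return kept


persons = [
    Person(0, "Mustermann"),
    Person(20, "Schmidt"),
    Person(45, "Mustermann"),
]
unique_persons = drop_duplicates(persons)
```

Each annotation is checked against the list only once, which is why this approach may be faster than comparing every pair of Person annotations.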

DISCLAIMER: I am a developer of UIMA Ruta.
