UIMA RUTA annotation at the beginning of sequence
I have a sequence of annotations that are instances of the same type (e.g. a sequence of CW annotations). I need to remove the first of them (more formally: remove the annotation that has no annotation of the same type before it in the document). Less formally: remove an annotation at the beginning of the document.
Example document: "Software StageTools"
So, I tried many variants:
"Software"{-AFTER(CW) -> UNMARK(CW)} CW+; //does not work
"Software"{BEFORE(CW) -> UNMARK(CW)} CW+; //does not work
"Software"{-STARTSWITH(Document) -> UNMARK(CW)} CW+; //does not work
CW{0, 0} "Software"{-> UNMARK(CW)} CW+; //getting parsing error
...and some others. Obviously, none of them works (maybe I could refer to the begin feature of the annotation, but that would not solve the formal problem).
Finally, the question is: how can I tell RUTA to remove an annotation that has no annotation of the same type before it in the document?
There are many ways to do this. Here are two examples:
# cw:CW.ct=="Software"{-> UNMARK(cw)} CW;
Remove the first CW "Software" in the document if there is another CW following.
ANY{-PARTOF(CW)} cw:@CW.ct=="Software"{-> UNMARK(cw)} CW;
Remove any CW "Software" if there is a CW following and there is no CW preceding. If the document can start with the pattern, you need a second rule.
Your second rule actually works for me. The last rule has no valid syntax. The min/max quantifier requires different brackets, like [0,0]. However, this would not have the effect you want.
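For reference, here is a sketch of the asker's last rule with only the quantifier brackets changed, as described above. It is an illustration of the syntax fix, not a solution: with square brackets the rule should parse, but it is not expected to remove only the first CW.

```
// Asker's last rule with the min/max quantifier written in square brackets.
// Intended only to show the bracket syntax; it does not give the desired
// "no CW before" behaviour.
CW[0,0] "Software"{-> UNMARK(CW)} CW+;
```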
DISCLAIMER: I am a developer of UIMA Ruta