Search for an item in a text file using UIMA Ruta
I have been trying to find an item that occurs in a text file.
The text file looks like this, e.g.:
I initially ran a dictionary search for XYZ and found its positions, but I want only the first XYZ and not the rest. XYZ has one useful property: it always sits between the 5-digit code and the text MethodName.
I am unable to do that.
WORDLIST ZipList = 'Zipcode.txt';
DECLARE Zip;
Document{-> MARKFAST(Zip, ZipList)};
DECLARE Method;
"MethodName" -> Method;
WORDLIST typelist = 'typelist.txt';
DECLARE type;
Document{-> MARKFAST(type, typelist)};
Also, how do we use regex in UIMA Ruta?
There are many ways to specify this. Here are some examples (not tested):
// just remove the other annotations (assuming type is the one you want)
type{-> UNMARK(type)} ANY{-STARTSWITH(Method)};
// only keep the first one: remove any annotation if there is one somewhere in front of it
// you can also specify this with POSITION or CURRENTCOUNT, but both are slow
type # @type{-> UNMARK(type)};
// just create a new annotation in between
NUM{REGEXP(".....")} #{-> type} @Method;
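To sanity-check the "in between" idea outside of Ruta, here is a minimal plain-Java sketch of the same extraction using java.util.regex. It is only an analogy, not Ruta code: the class name, the helper firstBetween, and the sample string are made up for illustration, since the sample file from the question is not shown.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BetweenCodeAndMethod {
    // A standalone 5-digit number, then as little text as possible,
    // then the literal "MethodName". Group 1 is the text in between.
    private static final Pattern BETWEEN =
            Pattern.compile("\\b\\d{5}\\b\\s*(.*?)\\s*MethodName", Pattern.DOTALL);

    // Hypothetical helper: returns the text between the first 5-digit code
    // and the next "MethodName", or empty if there is no such span.
    static Optional<String> firstBetween(String text) {
        Matcher m = BETWEEN.matcher(text);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Invented sample line: only the first XYZ sits between code and MethodName.
        String sample = "12345 XYZ MethodName foo XYZ bar XYZ";
        System.out.println(firstBetween(sample).orElse("<none>"));
    }
}
```

The lazy group plays the role of the # wildcard in the rule above: it stops at the next MethodName, so later occurrences of XYZ are never picked up.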
There are two options to use regex in UIMA Ruta:
(find) simple regex rules, like "[A-Za-z]+" -> Type;
the REGEXP condition on a rule element, like ANY{REGEXP("[A-Za-z]+")-> Type};
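The difference between the two options can be pictured with plain java.util.regex. This is a sketch under an assumption: that a simple regex rule searches the document text for every occurrence, while the REGEXP condition requires the pattern to match the whole covered text of the annotation it is attached to. The class and method names below are invented for the illustration and are not part of Ruta.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindVersusMatches {
    // "find" style: collect every occurrence of the pattern inside the text.
    static List<String> findAll(String regex, String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    // "matches" style: the pattern must cover the whole given text,
    // here standing in for the covered text of one annotation.
    static boolean matchesWhole(String regex, String coveredText) {
        return Pattern.compile(regex).matcher(coveredText).matches();
    }

    public static void main(String[] args) {
        System.out.println(findAll("[A-Za-z]+", "abc 123 de4")); // [abc, de]
        System.out.println(matchesWhole("[A-Za-z]+", "abc"));    // true
        System.out.println(matchesWhole("[A-Za-z]+", "de4"));    // false
    }
}
```

Under that assumption, a pattern such as "....." in a REGEXP condition on NUM only accepts numbers that are exactly five characters long.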
Let me know if something is not clear. I will extend the description then.
DISCLAIMER: I am a developer of UIMA Ruta