How to extract the words from the tree structure using a regex pattern

I need to extract the noun phrases from a tree structure, but I am unable to extract the nouns from it using a regex pattern.

Here is the tree structure:

(TOP (ADJP (JJ welcome) (PP (TO to) (NP (NNP Regular) (NNP Expression) (NNS learnings)))))

I need to extract all the words that carry tags like NP, NNP, NNS, etc.; i.e., I need to get words like Regular, Expression, and learnings using a regex pattern.

Can someone please help me with how to do this?

Not sure if this is what you wanted, but this will extract those words for you:

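import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Capture a word (optionally capitalized) that is immediately followed by ")".
// Note: this matches every leaf word in the tree, not only the nouns.
Pattern regexpPattern = Pattern.compile("([A-Z]?[a-z]+)\\)");
Matcher m = regexpPattern.matcher("your string"); // replace with the tree text
while (m.find()) {
    System.out.println(m.group(1));
}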
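
If only the noun words are wanted, one option is to anchor the pattern on the tag itself. Below is a minimal sketch, assuming the tree uses Penn Treebank style noun tags that all start with NN (NN, NNS, NNP, NNPS) and that each leaf looks like "(TAG word)". The class name NounExtractor and the method extractNouns are made up for illustration. Note that NP is a phrase label rather than a word tag, so this sketch collects the noun leaves and does not try to match the nested NP subtree.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class NounExtractor {
    // Matches a leaf such as "(NNP Regular)": an opening parenthesis, a tag
    // starting with NN, whitespace, then the word (captured in group 1),
    // then the closing parenthesis.
    private static final Pattern NOUN_LEAF =
            Pattern.compile("\\((?:NN[A-Z]*)\\s+([^()\\s]+)\\)");

    // Returns the words of all noun leaves, in the order they appear.
    static List<String> extractNouns(String tree) {
        List<String> words = new ArrayList<>();
        Matcher m = NOUN_LEAF.matcher(tree);
        while (m.find()) {
            words.add(m.group(1));
        }
        return words;
    }

    public static void main(String[] args) {
        String tree = "(TOP (ADJP (JJ welcome) (PP (TO to) "
                + "(NP (NNP Regular) (NNP Expression) (NNS learnings)))))";
        System.out.println(extractNouns(tree)); // prints [Regular, Expression, learnings]
    }
}
```

Because "(NP " is followed by another parenthesis rather than a word, and NP does not start with NN, the phrase node itself is skipped, and so are "welcome" (JJ) and "to" (TO).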
