I have written the following code to recognize a grammar consisting of a verb followed by one or more determiners and then one or more nouns. The chunker does not treat a second noun as part of the same chunk (example phrase: "monitoring a parking space"):
Testing sentence in grammar: monitoring a parking space
Grammar Chunk:
(S (MT monitoring/VBG a/DT parking/NN) (MT space/NN))
Is sentence in grammar?: False
Here is the code used in Python 3.5.6:
import nltk

def extractMT(sent):
    # Chunk grammar: optional verb, optional determiner, then a single noun
    grammar = r"""
    MT:
        {<VBG|VBZ|VB>?<DT>?<NN|NNS>}
    """
    chunker = nltk.RegexpParser(grammar)
    ne = set()
    # Tokenize, POS-tag, then chunk the sentence
    chunk = chunker.parse(nltk.pos_tag(nltk.word_tokenize(sent)))
    print("Grammar Chunk: ")
    print(chunk)
    # Join the words of every MT chunk into one phrase
    for tree in chunk.subtrees(filter=lambda t: t.label() == 'MT'):
        returnList = []
        for child in tree.leaves():
            returnList.append(child[0])
        ne.add(' '.join(returnList))
    return ne

testSentence1 = "monitoring a parking space"
print("Testing sentence in grammar: " + testSentence1)
# Convert the membership test to a string before concatenating: + binds tighter than `in`
print("Is sentence in grammar?: " + str(testSentence1 in extractMT(testSentence1)))
As in standard regular expressions, to match multiple elements you need + (which means one or more) or * (which means zero or more):
{<VBG|VBZ|VB>?<DT>?<NN|NNS>+}
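To see why the missing + splits the phrase, here is a minimal sketch using only Python's standard re module. It is an analogy, not NLTK's actual implementation: the tag string and the two pattern names are assumptions made up for illustration, with the POS tags of "monitoring a parking space" written out as one string.

```python
import re

# POS tags for "monitoring a parking space" as a single string.
# This encoding is a hypothetical stand-in for a tagged sentence.
tags = "<VBG><DT><NN><NN>"

# Analogue of the original rule: optional verb, optional determiner, one noun.
one_noun = re.compile(r"(<VBG>|<VBZ>|<VB>)?(<DT>)?(<NN>|<NNS>)")
# Analogue of the corrected rule: the trailing + allows one or more nouns.
many_nouns = re.compile(r"(<VBG>|<VBZ>|<VB>)?(<DT>)?(<NN>|<NNS>)+")

# Collect every non-overlapping match, scanning left to right.
without_plus = [m.group(0) for m in one_noun.finditer(tags)]
with_plus = [m.group(0) for m in many_nouns.finditer(tags)]

print(without_plus)  # two separate matches, the second noun on its own
print(with_plus)     # one match covering all four tags
```

Without the +, the first match stops after the first noun and the second noun starts a match of its own, which mirrors the two MT chunks in the question's output.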
You can also use {,2} to match 0, 1, or 2 elements, {1,2} to match 1 or 2 elements, or {2} to match exactly 2 elements:
{<VBG|VBZ|VB>?<DT>?<NN|NNS>{,2}}
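The bounded quantifiers can be compared the same way with the standard re module. This is a sketch under the assumption that a run of nouns is encoded as repeated "<NN>" tags in one string; the names NOUN, patterns, and accepted are hypothetical and chosen only for this example.

```python
import re

# A single noun tag, as a non-capturing group so a quantifier applies to the whole tag.
NOUN = r"(?:<NN>|<NNS>)"

patterns = {
    "{,2}": re.compile(NOUN + r"{,2}"),    # 0, 1, or 2 nouns
    "{1,2}": re.compile(NOUN + r"{1,2}"),  # 1 or 2 nouns
    "{2}": re.compile(NOUN + r"{2}"),      # exactly 2 nouns
}

# For each quantifier, record which noun counts (0 to 3) match in full.
accepted = {}
for name, pattern in patterns.items():
    accepted[name] = [n for n in range(4) if pattern.fullmatch("<NN>" * n)]

print(accepted)
```

Note that {,2} also accepts zero nouns, so {1,2} is likely the closer fit when at least one noun is required. None of the three accepts a run of three nouns as a single match.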