I would like to extract "all" the noun phrases from a sentence. I'm wondering how I can do it. I have the following code:
import spacy
nlp = spacy.load("en_core_web_sm")  # assumption: any English pipeline with a parser
doc2 = nlp("what is the capital of Bangladesh?")
for chunk in doc2.noun_chunks:
    print(chunk)
It gives me:
1. what
2. the capital
3. Bangladesh
but I also expect:
the capital of Bangladesh
I have tried answers from the spaCy docs and StackOverflow, but nothing worked. It seems only cTAKES and Stanford CoreNLP can give such complex NPs.
Any help is appreciated.
spaCy clearly defines a noun chunk as:
"A base noun phrase, or "NP chunk", is a noun phrase that does not permit other NPs to be nested within it – so no NP-level coordination, no prepositional phrases, and no relative clauses." (https://spacy.io/api/doc#noun_chunks)
If you process the dependency parse differently, allowing prepositional modifiers and nested phrases/chunks, then you can end up with what you're looking for.
I bet you could modify the existing spaCy code fairly easily to do what you want.
For those who are still looking for an answer:
noun_phrases = set()
for nc in doc.noun_chunks:
    # Keep the base chunk and the full subtree span around its root token.
    for np in [nc, doc[nc.root.left_edge.i:nc.root.right_edge.i + 1]]:
        noun_phrases.add(np)
This is how I get all the complex noun phrases.
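To see why slicing from left_edge to right_edge picks up the prepositional phrase, here is a minimal pure-Python sketch of the same idea. It does not use spaCy: the `heads` list is a hand-written dependency tree for the example sentence (an assumption, not real parser output), and `subtree_span`, `phrase` and `base_chunk_roots` are hypothetical names made up for illustration.

```python
# Toy dependency parse: heads[i] is the index of token i's head.
# These head indices are hand-written for illustration, not spaCy output.
words = ["what", "is", "the", "capital", "of", "Bangladesh", "?"]
heads = [1, 1, 3, 1, 3, 4, 1]  # the root ("is") points at itself


def subtree_span(root, heads):
    """Return (start, end) covering root and every token that descends from it."""
    members = {root}
    changed = True
    while changed:
        changed = False
        for i, h in enumerate(heads):
            if h in members and i not in members:
                members.add(i)
                changed = True
    # Leftmost and rightmost descendants play the role of left_edge / right_edge.
    return min(members), max(members) + 1


def phrase(root):
    start, end = subtree_span(root, heads)
    return " ".join(words[start:end])


# Hypothetical root tokens of the base chunks "what", "the capital", "Bangladesh".
base_chunk_roots = [0, 3, 5]
phrases = [phrase(r) for r in base_chunk_roots]
print(phrases)  # ['what', 'the capital of Bangladesh', 'Bangladesh']
```

Because "of" hangs off "capital" and "Bangladesh" hangs off "of", the subtree of "capital" stretches to the end of the prepositional phrase. The spaCy snippet above additionally keeps the base chunk itself, so you end up with both "the capital" and "the capital of Bangladesh".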