How to extract head nouns from phrases in Python?

Question

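I am working on a keyphrase classification task, and for this I need to extract the head noun from each keyphrase in Python. The little help available on the internet has not been of much use, and I am struggling with this.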

Answer 1

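This task is called part-of-speech (PoS) tagging and belongs to the field of natural language processing (NLP). To extract the nouns from a text, you can use nltk: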

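import nltk

# Requires NLTK's tokenizer and tagger data (installed once via nltk.download)
text = 'Your text goes here'

# Penn Treebank noun tags all start with 'NN' (NN, NNS, NNP, NNPS)
is_noun = lambda pos: pos[:2] == 'NN'

# Tokenise the text, tag each token, and keep only the nouns
tokenized = nltk.word_tokenize(text)
nouns = [word for (word, pos) in nltk.pos_tag(tokenized) if is_noun(pos)]
print(nouns)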

Or TextBlob:

from textblob import TextBlob
text= 'Your text goes here'
blob = TextBlob(text)
print(blob.noun_phrases)

If you want to learn more about PoS tagging, you may find the material on the official nltk page very useful.
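The snippets above return every noun, not the head noun the question asks for. One common heuristic for English is that the head of a noun phrase is its rightmost noun, unless a prepositional postmodifier follows ("price of oil"). The sketch below is only an illustration of that heuristic: `head_noun` is a hypothetical helper, not an nltk or TextBlob function, and it works on hand-written (word, tag) pairs in the format that `nltk.pos_tag` returns.

```python
def head_noun(tagged_phrase):
    """Return the head noun of a POS-tagged phrase, or None if it has no noun.

    Heuristic (an assumption, not part of NLTK): stop at the first
    preposition that follows a noun, and take the rightmost noun seen so far.
    """
    head = None
    for word, tag in tagged_phrase:
        if tag == "IN" and head is not None:
            break  # a postmodifier such as "of oil" starts here
        if tag.startswith("NN"):
            head = word
    return head


# Hand-written (word, tag) pairs using Penn Treebank tags
compound = [("oil", "NN"), ("price", "NN"), ("futures", "NNS")]
with_pp = [("price", "NN"), ("of", "IN"), ("oil", "NN")]

print(head_noun(compound))  # futures
print(head_noun(with_pp))   # price
```

The heuristic will fail on phrases whose structure is not visible from the tags alone, which is where a parser becomes useful.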

Answer 2

You can use part-of-speech tagging from the NLTK toolkit to tag the sentence, then extract the tokens tagged as "noun", "verb", and so on:

import nltk

text = '''I am doing a keyphrase classification task and for this i am working with the head noun extraction from keyphrases in python. The little help available on internet is not of good use. i am struggling with this.'''

# Tag every token with its part of speech: a list of (word, tag) pairs
pos_tagged_sent = nltk.pos_tag(nltk.tokenize.word_tokenize(text))

# Keep singular common nouns only; 'NN' does not match NNS, NNP or NNPS
nouns = [tag[0] for tag in pos_tagged_sent if tag[1] == 'NN']

Out (first entries of pos_tagged_sent):

[('I', 'PRP'),
 ('am', 'VBP'),
 ('doing', 'VBG'),
 ('a', 'DT'),
 ('keyphrase', 'NN'),
 ('classification', 'NN'),
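Since an exact comparison with 'NN' skips plural and proper nouns (NNS, NNP, NNPS), matching on the tag prefix may be closer to what is wanted, and the same idea covers verbs (VB, VBD, VBG, ...). A minimal sketch, assuming Penn Treebank tags; `group_by_coarse_tag` is a hypothetical helper and the tagged list is copied by hand from the output above:

```python
from collections import defaultdict


def group_by_coarse_tag(tagged_tokens, prefixes=("NN", "VB")):
    """Group words by the first matching tag prefix, e.g. all NN* as 'NN'."""
    groups = defaultdict(list)
    for word, tag in tagged_tokens:
        for prefix in prefixes:
            if tag.startswith(prefix):
                groups[prefix].append(word)
                break
    return dict(groups)


# Hand-copied (word, tag) pairs, same shape as nltk.pos_tag output
tagged = [("I", "PRP"), ("am", "VBP"), ("doing", "VBG"),
          ("a", "DT"), ("keyphrase", "NN"), ("classification", "NN")]

print(group_by_coarse_tag(tagged))
# {'VB': ['am', 'doing'], 'NN': ['keyphrase', 'classification']}
```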

Answer 3

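You can use the Stanford Parser package in NLTK and get the dependencies. Then use the relations that work for you, such as nn or compound (noun compound modifier). See De Marneffe's typed dependencies manual for details.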

In the manual, the noun phrase "oil price futures" contains a compound with two modifiers and one head.
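To make the idea concrete, here is a small sketch of how a head could be read off such relations. It assumes the dependencies are already available as plain (governor, relation, dependent) string triples; that shape, the `compound_head` helper, and the hand-written triples for "oil price futures" are illustrative assumptions, not the parser's actual output format.

```python
def compound_head(triples):
    """Return (head, modifiers) for the nn/compound relations in triples.

    The head is the governor that never appears as a modifier itself.
    Returns (None, modifiers) when there is no single such governor.
    """
    governors = []
    modifiers = []
    for governor, relation, dependent in triples:
        if relation in ("nn", "compound"):
            governors.append(governor)
            modifiers.append(dependent)
    # dict.fromkeys removes duplicate governors while keeping their order
    heads = [g for g in dict.fromkeys(governors) if g not in modifiers]
    head = heads[0] if len(heads) == 1 else None
    return head, modifiers


# Hand-written triples: both "oil" and "price" modify "futures"
triples = [("futures", "nn", "oil"), ("futures", "nn", "price")]

print(compound_head(triples))  # ('futures', ['oil', 'price'])
```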

You can check the parse tree and dependencies of any sentence in the Stanford Parser demo interface.

Hope this helps,

Cheers

3 answers

Answer 1: 0 2018-09-20 11:40:06

Answer 2: 0 2018-09-20 11:41:02

Answer 3: -1 2018-09-20 11:48:30
