TypeError: sequence item 352: expected str instance, NoneType found
I am trying to perform sentence chunking on my corpus. First I load my tagged data, and then I try to chunk that tagged corpus. Here is my code.
import os
from collections import defaultdict

import nltk

def load_corpus():
    corpus_root = os.path.abspath('../nlp1/dumpfiles')
    mycorpus = nltk.corpus.reader.TaggedCorpusReader(corpus_root, '.*')
    return mycorpus.tagged_sents()

def sents_chunks(tagg_sents, pos_tag_pattern):
    chunk_freq_dict = defaultdict(int)
    chunker = nltk.RegexpParser(pos_tag_pattern)
    for sent in tagg_sents:
        if not all(sent):
            print("NoneType object in \"{}\": {}".format(sent.label(), sent))
            sent = cast_to_tree_function(filter(bool, sent))
        for chk in chunker.parse(sent).subtrees():
            if str(chk).startswith('(NP'):
                phrase = chk.__unicode__()[4:-1]
                #print(phrase)
                if '\n' in phrase:
                    phrase = ' '.join(phrase.split())
                    #print(phrase)
                chunk_freq_dict[phrase] += 1
    #print(chunk_freq_dict)
    return chunk_freq_dict
I get an error somewhere in my corpus and I don't know why. Does anyone know what the problem is and how I can fix it? This is the error:
Traceback (most recent call last):
  File "multiwords1.py", line 184, in <module>
    candidates = main(domain_corpus, PATTERN,MIN_FREQ,MIN_CVAL)
  File "multiwords1.py", line 156, in main
    chunks_freqs = sents_chunks(domain_sents, pos_tag_pattern)
  File "multiwords1.py", line 23, in sents_chunks
    for chk in chunker.parse(sent).subtrees():
  File "/usr/local/lib/python3.5/dist-packages/nltk/chunk/regexp.py", line 1208, in parse
    chunk_struct = parser.parse(chunk_struct, trace=trace)
  File "/usr/local/lib/python3.5/dist-packages/nltk/chunk/regexp.py", line 1023, in parse
    chunkstr = ChunkString(chunk_struct)
  File "/usr/local/lib/python3.5/dist-packages/nltk/chunk/regexp.py", line 98, in __init__
    self._str = '<' + '><'.join(tags) + '>'
TypeError: sequence item 352: expected str instance, NoneType found
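You are getting a TypeError. The message says that item 352 of tags is None (NoneType) rather than a str, which means that sent (handled by the chunker as an nltk.tree.Tree) contains an item whose tag is None.
The line self._str = '<' + '><'.join(tags) + '>' is where the exception is raised, because str.join only accepts str items. You need to check that every item in sent yields a str tag.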
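The failure can be reproduced without NLTK. This is a minimal sketch that only mimics the expression shown in the last frame of the traceback; the tag values and the helper name join_tags are made up for illustration.

```python
# Hypothetical tag list: one token has no tag (None), as in the failing sentence.
tags = ["DT", "NN", None, "VBZ"]


def join_tags(tags):
    """Mimic the expression from the traceback: '<' + '><'.join(tags) + '>'."""
    return "<" + "><".join(tags) + ">"


try:
    join_tags(tags)
    error_message = None
except TypeError as exc:
    # str.join refuses any item that is not a str and reports its index.
    error_message = str(exc)

print(error_message)

# The index reported in the message is the position of the missing tag.
missing = [i for i, tag in enumerate(tags) if tag is None]
print(missing)
```

The item number in the error message is therefore the zero-based position of the untagged token within the sentence being chunked.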
You can use the built-in filter function for this, but the result should be cast back to the Tree type.
filter(bool, sent)  # returns an iterator over the truthy items
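One caveat, assuming the sentences come straight from tagged_sents(): each item is then a (word, tag) tuple, and a tuple such as ('xyz', None) is still truthy, so filter(bool, sent) would keep it. A sketch that filters on the tag itself, using an invented sample sentence:

```python
# Invented sample in the (word, tag) shape that tagged_sents() yields.
sent = [("the", "DT"), ("cat", "NN"), ("xyz", None), ("sleeps", "VBZ")]

# filter(bool, ...) only drops falsy items; a non-empty tuple is truthy,
# so the token with the None tag survives.
kept_by_bool = list(filter(bool, sent))

# Filtering on the tag removes the offending token.
kept_by_tag = [(word, tag) for word, tag in sent if tag is not None]

print(len(kept_by_bool), len(kept_by_tag))
```

The NLTK book examples pass a plain list of (word, tag) tuples to RegexpParser.parse, so the filtered list can probably be handed to the chunker as it is, without a separate conversion to Tree.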
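To check which iterable contains a NoneType item, you can do the following:
if not all(sent):
    print("NoneType object in \"{}\": {}".format(sent.label(), sent))
    # cast_to_tree_function is a placeholder for the conversion back to Tree
    sent = cast_to_tree_function(filter(bool, sent))  # update the sent object to hold only valid items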
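To find out where in the corpus the untagged tokens are, a scan over all sentences can report their positions. The following is a sketch under the assumption that the corpus is a sequence of lists of (word, tag) tuples; find_untagged, fill_missing_tags and the "UNK" placeholder tag are hypothetical names, not part of NLTK.

```python
def find_untagged(tagged_sents):
    """Yield (sentence_index, token_index, word) for every token whose tag is None."""
    for s_idx, sent in enumerate(tagged_sents):
        for t_idx, (word, tag) in enumerate(sent):
            if tag is None:
                yield s_idx, t_idx, word


def fill_missing_tags(sent, placeholder="UNK"):
    """Return a copy of sent in which None tags are replaced by a placeholder tag."""
    return [(word, placeholder if tag is None else tag) for word, tag in sent]


# Invented two-sentence corpus; the second sentence has one untagged token.
corpus = [
    [("a", "DT"), ("dog", "NN")],
    [("it", "PRP"), ("???", None), ("barks", "VBZ")],
]

problems = list(find_untagged(corpus))
print(problems)

cleaned = [fill_missing_tags(sent) for sent in corpus]
print(cleaned[1])
```

Replacing the tag keeps the token in the sentence, whereas dropping the token shortens the sentence and may join words that were not adjacent. Which is preferable depends on how the chunks are used afterwards. Inspecting the reported positions in the source files may also show why the tag is missing, for example a token written without its word/tag separator.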