TypeError: sequence item 352: expected str instance, NoneType found

I am trying to perform sentence chunking on my corpus. First I load my tagged data, then I try to run the chunker over that tagged corpus. Here is my code.

import os
from collections import defaultdict

import nltk

def load_corpus():
    corpus_root = os.path.abspath('../nlp1/dumpfiles')
    mycorpus = nltk.corpus.reader.TaggedCorpusReader(corpus_root,'.*')
    return mycorpus.tagged_sents()

def sents_chunks(tagg_sents, pos_tag_pattern):
    chunk_freq_dict = defaultdict(int)
    chunker = nltk.RegexpParser(pos_tag_pattern)
    for sent in tagg_sents:
        if not all(sent):
            print("NoneType object in \"{}\": {}".format(sent.label(), sent))
            sent = cast_to_tree_function(filter(bool, sent))
        for chk in chunker.parse(sent).subtrees():
            if str(chk).startswith('(NP'):
                phrase = chk.__unicode__()[4:-1]
                #print(phrase)
                if '\n' in phrase:
                    phrase = ' '.join(phrase.split())
                    #print(phrase)
                chunk_freq_dict[phrase] += 1
    #print(chunk_freq_dict)
    return chunk_freq_dict 

I get an error somewhere in my corpus and I don't know why. Does anyone know what the problem is and how I can fix it? Here is the error:

Traceback (most recent call last):
  File "multiwords1.py", line 184, in <module>
    candidates = main(domain_corpus, PATTERN,MIN_FREQ,MIN_CVAL)
  File "multiwords1.py", line 156, in main
    chunks_freqs = sents_chunks(domain_sents, pos_tag_pattern)
  File "multiwords1.py", line 23, in sents_chunks
    for chk in chunker.parse(sent).subtrees():
  File "/usr/local/lib/python3.5/dist-packages/nltk/chunk/regexp.py", line 1208, in parse
    chunk_struct = parser.parse(chunk_struct, trace=trace)
  File "/usr/local/lib/python3.5/dist-packages/nltk/chunk/regexp.py", line 1023, in parse
    chunkstr = ChunkString(chunk_struct)
  File "/usr/local/lib/python3.5/dist-packages/nltk/chunk/regexp.py", line 98, in __init__
    self._str = '<' + '><'.join(tags) + '>'
TypeError: sequence item 352: expected str instance, NoneType found

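You are getting a TypeError. The message says that item 352 of tags is not a str but a NoneType, which means that sent (an nltk.tree.Tree) contains a NoneType object somewhere in its sequence.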

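The line self._str = '<' + '><'.join(tags) + '>' is where the exception is raised, because str.join accepts only str items. You need to check that every item produced by iterating over sent is a str.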
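To see the failure in isolation, here is a minimal sketch that mimics the expression from the last frame of the traceback. The tags list and the join_tags helper are made up for illustration; only the join expression itself comes from the traceback.

```python
# Hypothetical tag list: index 2 is None, as an untagged token would produce.
tags = ["DT", "NN", None, "VBZ"]


def join_tags(tags):
    """Mimic the expression that fails inside ChunkString.__init__."""
    return "<" + "><".join(tags) + ">"


try:
    join_tags(tags)
except TypeError as exc:
    # str.join reports the index of the first item that is not a str.
    message = str(exc)
    print(message)  # sequence item 2: expected str instance, NoneType found
```

The index in the message (352 in your case) is therefore the position of the first bad tag within that one sentence, not a line number in the corpus file.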

You can use the built-in filter function for this, but the result must be cast back to a Tree type.

filter(bool, sent)  # returns an iterator over the truthy items
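One caveat, sketched below under the assumption that sent is a list of (word, tag) tuples, which is what tagged_sents() normally yields: a token written without a tag separator usually comes back as (word, None), and a non-empty tuple is always truthy, so filter(bool, ...) would keep it. Filtering on the tag itself is probably what you want. The sample sentence is invented for illustration.

```python
# Hypothetical tagged sentence; "sat" has no tag, so its tag is None.
sent = [("the", "DT"), ("cat", "NN"), ("sat", None), ("down", "RB")]

# filter(bool, ...) only drops falsy items; ("sat", None) is a non-empty
# tuple and therefore truthy, so nothing is removed here.
kept_by_bool = list(filter(bool, sent))

# Checking the tag explicitly removes the token that would break str.join.
kept_by_tag = [(word, tag) for word, tag in sent if tag is not None]

print(len(kept_by_bool))  # 4
print(kept_by_tag)
```

Whether dropping the token or giving it a fallback tag is better depends on your corpus; dropping it shortens the sentence and may change which chunks are found.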

To find out which iterable contains the NoneType item, you can do the following:

if not all(sent):
    print("NoneType object in \"{}\": {}".format(sent.label(), sent))
    sent = cast_to_tree_function(filter(bool, sent))  # keep only the valid items; cast_to_tree_function stands for your own conversion back to a Tree
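If the sentences are plain lists of (word, tag) tuples rather than Tree objects, sent.label() is not available, so a variant of the check that reports positions may be more useful. This is only a sketch: find_untagged and the sample data are hypothetical names, not part of NLTK.

```python
def find_untagged(tagged_sents):
    """Return (sentence_index, token_index, word) for each token whose tag is not a str."""
    problems = []
    for s_idx, sent in enumerate(tagged_sents):
        for t_idx, (word, tag) in enumerate(sent):
            if not isinstance(tag, str):
                problems.append((s_idx, t_idx, word))
    return problems


# Hypothetical stand-in for the sentences returned by load_corpus().
tagged_sents = [
    [("a", "DT"), ("dog", "NN")],
    [("it", "PRP"), ("barks", None), ("loudly", "RB")],
]

problems = find_untagged(tagged_sents)
print(problems)  # [(1, 1, 'barks')]
```

With the positions in hand you can go back to the dump files and fix the missing tags at the source instead of silently discarding tokens. As far as I know, RegexpParser.parse also accepts a plain list of tagged tokens, so an explicit cast to Tree may not be needed, but check that against your NLTK version.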
