How does NLTK's concordance work?

Question

While working with NLTK I found that concordance does not show the left context for a match at the beginning of a Text:

>>> from nltk.book import *
>>> text3.concordance("beginning",lines=1)
Displaying 1 of 5 matches:
                                   beginning God created the heaven and the ear

Note that "In the" is missing from the output above. However, concordance has no problem at the end of a Text.

>>> text3.concordance("coffin",lines=1)
Displaying 1 of 1 matches:
 embalmed him , and he was put in a coffin in Egypt .

Interestingly, it works better if width is specified (I believe the default is width=79).

>>> text3.concordance("beginning",width=11, lines=1)
Displaying 1 of 5 matches:
In the beginning

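Does anyone have an explanation for this? The documentation on nltk.org says:

Print a concordance for word with the specified context window. Word matching is not case-sensitive.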

Answer 1

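Consider the function below. I modified it from print_concordance in class ConcordanceIndex() in the original source code (HERE), adding commented-out debug prints.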

def print_concordance(self, word, width=35, lines=25):
    """
    Print a concordance for ``word`` with the specified context window.

    :param word: The target word
    :type word: str
    :param width: The width of each line, in characters (default=35 in this modified copy)
    :type width: int
    :param lines: The number of lines to display (default=25)
    :type lines: int
    """
    #print ("inside:")
    #print (width)
    half_width = (width - len(word) - 2) // 2
    #print (half_width)
    context = width // 4 # approx number of words of context
    #print ("Context:"+str(context))
    offsets = self.offsets(word)
    if offsets:
        lines = min(lines, len(offsets))
        print("Displaying %s of %s matches:" % (lines, len(offsets)))
        for i in offsets:
            #print(i)
            if lines <= 0:
                break
            left = (' ' * half_width +
                    ' '.join(self._tokens[i-context:i])) # This is where you have to concentrate: i-context can be negative
            #print(i-context)
            #print(self._tokens[i-context:i])
            right = ' '.join(self._tokens[i+1:i+context])
            left = left[-half_width:]
            right = right[:half_width]
            print(left, self._tokens[i], right)
            lines -= 1
    else:
        print("No matches")

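If you enable the commented-out debug prints, you can see that whenever i-context becomes negative, nothing is printed for the left context.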

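A slice from a positive index to a negative one, [+ve:-ve], works. On a long token list, a slice from a negative index to a small positive one, [-ve:+ve], selects nothing, because the negative start is counted from the end of the list and lands after the stop. The slice is an empty list, so the left context is an empty string.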
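The slicing rule can be checked on a plain list. This is a minimal sketch: the 50-element `tokens` list is a made-up stand-in for self._tokens, and the values i = 2 and context = 19 are assumptions chosen to match a word at the third position with width=79.

```python
# Hypothetical stand-in for self._tokens: 50 dummy tokens.
tokens = ["tok%d" % n for n in range(50)]

i, context = 2, 19  # assumed: match at index 2, context = 79 // 4

# The slice used for the left context: tokens[-17:2].
left_slice = tokens[i - context:i]

# A negative start counts from the end, so -17 resolves to 50 - 17 = 33.
same_slice = tokens[len(tokens) - 17:2]  # tokens[33:2]

print(left_slice)   # []
print(same_slice)   # []
print(tokens[0:2])  # ['tok0', 'tok1']
```

Because the resolved start (33) lies after the stop (2), Python returns an empty list rather than raising an error, which is why the missing context goes unnoticed.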

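For a match near the start of the text, the start index i-context in self._tokens[i-context:i] is non-negative when width is small, since context = width // 4. As width increases it becomes negative, and the left context disappears. For "beginning" at index 2, width=11 gives context = 2 and the slice self._tokens[0:2], while width=79 gives context = 19 and the slice self._tokens[-17:2].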
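One possible workaround is to clamp the start index at 0. The sketch below is not NLTK code: `left_context_original` and `left_context_clamped` are hypothetical helpers that isolate the slice from the function above, and the token list is a short hand-made sample padded with filler so that it is long enough to reproduce the problem.

```python
def left_context_original(tokens, i, width):
    """Left-context slice as written in the function above."""
    context = width // 4  # approx number of words of context
    return tokens[i - context:i]


def left_context_clamped(tokens, i, width):
    """Same slice, but the start index is never allowed below 0."""
    context = width // 4
    start = max(0, i - context)
    return tokens[start:i]


# Hand-made sample (an assumption, not the real text3 token list).
tokens = "In the beginning God created the heaven and the earth .".split()
tokens += ["filler"] * 40  # pad so a negative start lands past the stop
i = tokens.index("beginning")  # 2

print(left_context_original(tokens, i, 79))  # []
print(left_context_original(tokens, i, 11))  # ['In', 'the']
print(left_context_clamped(tokens, i, 79))   # ['In', 'the']
```

With the clamp, the left context no longer depends on whether width is small enough to keep i-context non-negative.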
