Counting values in a list in Python

Question

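I am new to Python and am having trouble figuring out what is wrong with my code.

What I am trying to do here is convert the text into tuples in a list and then count the number of DTs in the list.

Assume the first three lines of the txt file look like this: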

The/DT Fulton/NNP County/NNP Grand/NNP Jury/NNP said/VBD Friday/NNP an/DT investigation/NN of/IN Atlanta/NNP 's/POS recent/JJ primary/JJ election/NN produced/VBD ``/`` no/DT evidence/NN ''/'' that/IN any/DT irregularities/NNS took/VBD place/NN ./. 
The/DT jury/NN further/RB said/VBD in/IN term-end/JJ presentments/NNS that/IN the/DT City/NNP Executive/NNP Committee/NNP ,/, which/WDT had/VBD over-all/JJ charge/NN of/IN the/DT election/NN ,/, ``/`` deserves/VBZ the/DT praise/NN and/CC thanks/NNS of/IN the/DT City/NNP of/IN Atlanta/NNP ''/'' for/IN the/DT manner/NN in/IN which/WDT the/DT election/NN was/VBD conducted/VBN ./.
The/DT September-October/NNP term/NN jury/NN had/VBD been/VBN charged/VBN by/IN Fulton/NNP Superior/NNP Court/NNP Judge/NNP Durwood/NNP Pye/NNP to/TO investigate/VB reports/NNS of/IN possible/JJ ``/`` irregularities/NNS ''/'' in/IN the/DT hard-fought/JJ primary/NN which/WDT was/VBD won/VBN by/IN Mayor-nominate/NNP Ivan/NNP Allen/NNP Jr./NNP ./.

Save this as "practice.txt" in the working directory.

My code looks like this:

with open("practice.txt") as myfile:
    for line in myfile:
        cnt += 1
        word = line.split()
        total_word_per_line += len(word)
        total_type_of_words += len(set(word))
        a = [tuple(i.split('/')) for i in word]

    for x in a:
        DT_sum = 0
        if x[1] == 'DT':
            DT_sum += 1

        total_DT_sum += DT_sum

    print total_DT_sum

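But the output shows total_DT_sum as 2, which means it only counted the DTs in the third list. Any suggestions for counting all the DTs?

The expected output is 5 (the total number of DTs in the three sentences above).

Thanks in advance!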

Answer 1

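Your mistake:

for x in a:
    DT_sum = 0

DT_sum is reset to 0 on every iteration. In addition, this loop runs only after the loop over the file has finished, so a holds just the tuples from the last line, which is why the result is 2.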
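A minimal sketch of a corrected loop, assuming the goal is a single running total across all lines. The helper name count_tag and the in-memory list of lines (used in place of practice.txt so the example is self-contained) are illustrative choices, not part of the original code.

```python
def count_tag(lines, tag="DT"):
    """Count tokens of the form word/TAG across all lines (hypothetical helper)."""
    total = 0  # initialised once, before any loop
    for line in lines:
        pairs = [tuple(token.split("/")) for token in line.split()]
        # Count inside the line loop so every line contributes,
        # not just the last one.
        for pair in pairs:
            if pair[-1] == tag:
                total += 1
    return total


sample = [
    "The/DT jury/NN said/VBD ./.",
    "an/DT investigation/NN of/IN the/DT election/NN ./.",
]
print(count_tag(sample))  # 3
```

Because the function only iterates over its argument, an open file object could be passed in place of the list.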

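If you would rather start from scratch, the simplest approach is to sum the count() of each line:

with open("practice.txt") as myfile:
    nb_dt = sum(line.count("/DT") for line in myfile)

The result is 13, not the 5 you stated (you can verify this by hand).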

This solution does not split the lines into words, so it would also match a tag such as /DTXXX if one were present.
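To illustrate the caveat, here is a small sketch with a made-up tag DTX (it does not occur in the sample text) showing how substring counting and exact tag matching can diverge.

```python
line = "the/DT made-up/DTX token/NN"  # DTX is a hypothetical tag for illustration

# Substring search: also matches the "/DT" at the start of "/DTX".
substring_hits = line.count("/DT")

# Exact match: compare the part after the slash of each word with "DT".
exact_hits = sum(1 for word in line.split() if word.split("/")[-1] == "DT")

print(substring_hits)  # 2
print(exact_hits)      # 1
```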

So slightly more elaborate code is needed:

with open("practice.txt") as myfile:
    nb_dt = sum(1 if word.partition("/")[2]=="DT" else 0 for line in myfile for word in line.split())

This counts 1 for every word, in every line, whose part to the right of the / is DT.
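If counts for several tags are needed, one possible extension is to tally every tag in a single pass with collections.Counter. This is a sketch rather than part of the original answer: tag_counts is a hypothetical helper, and it uses rpartition instead of partition on the assumption that the tag always follows the last slash, which matters for tokens that contain a slash themselves.

```python
from collections import Counter


def tag_counts(lines):
    """Tally the tag after the last '/' of every whitespace-separated token."""
    counts = Counter()
    for line in lines:
        for word in line.split():
            # rpartition splits on the LAST slash: "1/2/CD" -> ("1/2", "/", "CD")
            _, sep, tag = word.rpartition("/")
            if sep:  # skip tokens that contain no slash at all
                counts[tag] += 1
    return counts


sample = ["The/DT jury/NN said/VBD 1/2/CD of/IN the/DT votes/NNS ./."]
counts = tag_counts(sample)
print(counts["DT"])  # 2
print(counts["CD"])  # 1
```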

Answer 2

If you need to store the data in a list of tuples first and then count the 'DT' entries, you can use filter() like this:

my_list = []

with open('practice.txt', 'r') as f:
    for line in f:
        my_list.extend([tuple(i.split('/')) for i in line.split()])

res = list(filter(lambda i: i[1] == 'DT', my_list))  # list() so len() also works on Python 3
print(len(res))  # Output: 13

extend() adds the tuples built from each line to my_list.

filter() returns only the items that have 'DT' in the second position.
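A short sketch of why extend() is used here rather than append(), with two made-up lines standing in for the file. It also shows a list comprehension as one alternative to filter(); the variable names are illustrative and do not come from the original answer.

```python
lines = ["The/DT jury/NN", "an/DT election/NN"]  # stand-in for the file's lines

flat, nested = [], []
for line in lines:
    pairs = [tuple(token.split("/")) for token in line.split()]
    flat.extend(pairs)    # adds each tuple individually: one flat list
    nested.append(pairs)  # adds the whole list as a single element

print(len(flat))    # 4
print(len(nested))  # 2

# A list comprehension selects the same items filter() would.
dt_items = [pair for pair in flat if pair[1] == "DT"]
print(dt_items)  # [('The', 'DT'), ('an', 'DT')]
```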

Output:

>>> res = list(filter(lambda i: i[1] == 'DT', my_list))
>>> res
[('The', 'DT'), ('an', 'DT'), ('no', 'DT'), ('any', 'DT'), ('The', 'DT'), ('the', 'DT'), ('the', 'DT'), ('the', 'DT'), ('the', 'DT'), ('the', 'DT'), ('the', 'DT'), ('The', 'DT'), ('the', 'DT')]
>>>
>>> len(res)
13
