Appending multiple values from textmining to a single list in Python

Question

I have a CSV file that I am importing data from. I am trying to create a list of the words used in two essays and how many times each word is used. I am running a loop over each line of the CSV file, which contains two essays, and the output prints the combined word count for those two essays. However, I have hundreds of lines that have two essays each. I would like to have one list with all words and word counts from all the essays.

import textmining

import csv

with open('2011ShortAnswers.csv', 'rb') as csvfile:
    data = csv.reader(csvfile, delimiter=",")

    for row in data:
        doc1 = row[3]
        doc2 = row[4]

        tdm = textmining.TermDocumentMatrix()

        tdm.add_doc(doc1)
        tdm.add_doc(doc2)

        for row in tdm.rows(cutoff=1):
            print row

Answer 1

Try using a dictionary where you increment the count for each word as you go:

word_count_dictionary = {}
for word in row:
    # a membership test on the dict itself checks its keys
    if word not in word_count_dictionary:
        word_count_dictionary[word] = 1
    else:
        word_count_dictionary[word] += 1
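
To connect this to the loop in the question, here is a rough sketch under a few assumptions: the dictionary is created once, before the loop over the CSV rows, and each essay is split into words with a simple lowercase, whitespace-based tokenizer. The helper name `count_words`, the tokenization, and the sample data are illustrative and not part of the original answer; the code is written for Python 3, whereas the question uses Python 2.

```python
import csv
import io


def count_words(rows, columns=(3, 4)):
    """Accumulate word counts across the given columns of every row."""
    word_count_dictionary = {}  # created once, outside the row loop
    for row in rows:
        for index in columns:
            # Assumption: lowercase + whitespace splitting is good enough here.
            for word in row[index].lower().split():
                if word not in word_count_dictionary:
                    word_count_dictionary[word] = 1
                else:
                    word_count_dictionary[word] += 1
    return word_count_dictionary


# Small in-memory stand-in for the CSV file (made-up sample data).
sample = io.StringIO(
    "1,a,b,the cat sat,the dog sat\n"
    "2,c,d,a cat ran,the dog ran\n"
)
counts = count_words(csv.reader(sample))
print(counts["the"], counts["sat"])
```

Because the dictionary lives outside the loop, the counts keep growing as each row is processed instead of being reset for every pair of essays.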

You can then iterate over the keys to form the list you need:

word_count_list = [(word,word_count_dictionary[word]) for word in word_count_dictionary.keys()]
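
As an alternative sketch, the standard library's `collections.Counter` handles the same bookkeeping, and its `most_common()` method returns the `(word, count)` pairs sorted from most to least frequent. The essay strings below are made up and stand in for `row[3]` and `row[4]` of each CSV row; this is Python 3 and is not taken from the original answer.

```python
from collections import Counter

# Made-up essay texts standing in for the essay columns of each CSV row.
essays = ["the cat sat", "the dog sat", "a cat ran"]

word_counts = Counter()
for essay in essays:
    # update() adds each word's occurrences to the running totals
    word_counts.update(essay.lower().split())

# List of (word, count) tuples, most frequent first
word_count_list = word_counts.most_common()
print(word_count_list)
```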

Answer 1
0 2013-10-30 13:23:34
