Tuple is a dictionary key in Counter - how do I make it a string?

I am new to Python. I used collections.Counter to count the most frequent bigrams in a text:

import collections
from nltk.util import ngrams

# Read the whole file and lowercase it
with open("testin.txt", encoding="utf-8") as input_file:
    text = input_file.read().lower()

tokens = text.split()          # whitespace tokenization
bi_tokens = ngrams(tokens, 2)  # iterator of 2-tuples
bi_freq = collections.Counter(bi_tokens)

If I use:

for row in bi_freq.most_common(100):
    print(row)

The result appears as:

(('star', 'wars'), 29)
(('blu', 'ray'), 21)

If I use:

for row in bi_freq.most_common(1000):
    print(row[0], "\t", row[1])

The result appears a bit cleaner:

('star', 'wars')     29
('blu', 'ray')   21

I would like to get:

star wars    29
blu ray      21

which I would import into a spreadsheet as two columns, with tab as the separator.

So my question is: how do I access each tuple value, when the tuple is a key in a dictionary, so that I can concatenate them into a string? Thanks in advance.

Edit: I did this:

for row in bi_freq.most_common(100):
    wordlist_in_bigram = row[0]
    print(wordlist_in_bigram[0], wordlist_in_bigram[1], "\t", row[1])

And the result seems to be what I wanted:

star wars    29
blu ray      21

Is this a good solution? Thanks

Use str.join() to build a delimited string from a sequence, and unpack each (bigram, count) pair directly in the for loop:

for bigram, count in bi_freq.most_common(1000):
    # Join the words with a space; separate the count with a tab
    print(" ".join(bigram), count, sep="\t")
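
If the goal is a file a spreadsheet can import, one option is to let the csv module write the tab-separated rows. The following is only a sketch under some assumptions: the sample tokens are made up, zip() stands in for nltk.util.ngrams so the example runs without NLTK, bigram_rows is a hypothetical helper name, and io.StringIO stands in for a real output file.

```python
import collections
import csv
import io

# Hypothetical sample tokens standing in for the contents of testin.txt
tokens = "star wars blu ray star wars".split()

# zip() pairs each token with its successor; used here instead of
# nltk.util.ngrams only to keep the sketch self-contained
bi_freq = collections.Counter(zip(tokens, tokens[1:]))


def bigram_rows(counter, n):
    """Return (joined bigram, count) pairs for the n most common bigrams."""
    return [(" ".join(bigram), count) for bigram, count in counter.most_common(n)]


rows = bigram_rows(bi_freq, 2)

# Write one tab-separated line per row; swap the StringIO for
# open("bigrams.tsv", "w", newline="", encoding="utf-8") to write a file
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerows(rows)
print(buffer.getvalue(), end="")
```

Using csv.writer rather than print() means any field that happens to contain a tab or a quote character gets quoted, so the columns should stay aligned on import.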

