Adding a string to all keys in dictionary (Python)
I'm new to Python and PySpark and I'm practicing TF-IDF. I split the sentences in a txt file into words, removed punctuation, removed the words that appear in the stop-words list, and saved the result as a dictionary with the code below.
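# Split each cleaned line into words (the RDD method is flatMap, with a capital M)
x = text_file.flatMap(lambda line: str_clean(line).split())
# Drop stop words
x = x.filter(lambda word: word not in stopwords)
# reduceByKey needs (key, value) pairs, so pair each word with a count of 1
x = x.map(lambda word: (word, 1))
x = x.reduceByKey(lambda a, b: a + b)
x = x.collectAsMap()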
I have 10 different txt files that go through this same process. I'd like to add a string like "@d1" to the keys in the dictionary so that I can indicate that a key comes from document 1.
How can I add "@d1" to all keys in the dictionary?
Essentially my dictionary is in the form:
{'word1': 1, 'word2': 1, 'word3': 2, ....}
And I would like it to be:
{'word1@d1': 1, 'word2@d1': 1, 'word3@d1': 2, ...}
Try a dictionary comprehension:
{k+'@d1': v for k, v in d.items()}
In Python 3.6+, you can use f-strings:
{f'{k}@d1': v for k, v in d.items()}
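Since the question mentions 10 files, the comprehension above could be wrapped in a small helper and applied once per document. This is only a sketch: `tag_keys`, `counts_by_doc`, and the sample counts are made-up names and data standing in for the real `collectAsMap()` results.

```python
def tag_keys(counts, doc_id):
    """Return a new dict whose keys carry an '@<doc_id>' suffix."""
    return {f"{word}@{doc_id}": count for word, count in counts.items()}


# Hypothetical per-document word counts (assumed data, not from the question).
counts_by_doc = {
    "d1": {"word1": 1, "word2": 1, "word3": 2},
    "d2": {"word1": 4, "word4": 1},
}

# Tag every document's dictionary with its own identifier.
tagged_by_doc = {
    doc_id: tag_keys(counts, doc_id) for doc_id, counts in counts_by_doc.items()
}

print(tagged_by_doc["d1"])  # {'word1@d1': 1, 'word2@d1': 1, 'word3@d1': 2}
```

The helper builds a new dictionary rather than renaming keys in place, so the original counts stay untouched.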
You can use the dict constructor to rebuild the dictionary, appending the file number to the end of each key:
>>> d = {'a': 1, 'b': 2}
>>> file_number = 1
>>> dict(("{}@{}".format(k,file_number),v) for k,v in d.items())
{'a@1': 1, 'b@1': 2}
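Once the keys carry a document suffix, dictionaries from different files can be combined without words from one document overwriting another's. A rough sketch, assuming two already-tagged dictionaries; `tagged_d1`, `tagged_d2`, and `split_key` are illustrative names, not part of the question:

```python
# Assumed sample data: two dictionaries whose keys are already tagged.
tagged_d1 = {"a@d1": 1, "b@d1": 2}
tagged_d2 = {"a@d2": 3, "c@d2": 1}

# Keys are unique per document, so update() never clobbers an entry.
merged = {}
for tagged in (tagged_d1, tagged_d2):
    merged.update(tagged)


def split_key(key):
    """Split 'word@doc' back into (word, doc) at the last '@'."""
    word, _, doc_id = key.rpartition("@")
    return word, doc_id


print(merged)             # {'a@d1': 1, 'b@d1': 2, 'a@d2': 3, 'c@d2': 1}
print(split_key("a@d1"))  # ('a', 'd1')
```

Splitting at the last "@" means a word that itself contains "@" still comes back intact.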