Read words from a file, maintaining order

Question

Here is an array of Unicode words used in the Python script.

texts =[u"abc", u"pqr", u"mnp"]

The script works as expected with the three-word example above. The issue is that there are thousands of words in a text file. How do I read them from the text file?

Update: I have two issues. First, the sequence of words from the text file is not maintained in the output. Second, the text file contains Unicode characters, hence the "u" prefix in my original example.

# cat testfile.txt
Testing this file with Python

# cat test.py
#!/usr/bin/python
# -*- coding: utf-8 -*-

f     = open('testfile.txt', 'r')
texts  = set(f.read().split())
print (texts)

# python test.py
set(['this', 'Python', 'Testing', 'with', 'file'])

Answer 1

This is because of how sets work. They don't maintain the order of the items stored in them.

From the documentation:

"A set object is an unordered collection of distinct hashable objects."
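To illustrate the point, here is a minimal sketch using the sentence from the question's testfile.txt. The variable names are only for illustration. A list keeps the words in the order they were read, while a set only tracks which words are present, so its iteration order should not be relied on.

```python
words = "Testing this file with Python".split()

as_list = list(words)  # a list remembers insertion order
as_set = set(words)    # a set only records membership

# The list gives back the words exactly as they appeared.
print(as_list)

# Both hold the same words, but comparing sets ignores order entirely.
print(as_set == set(reversed(words)))
```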

Answer 2

I see no problem with your file reading code. Given that the words in the file are separated by whitespace, and the file is not too big to be gulped with a single read, it should work just fine. The real problem is that the order of the words is lost once you shove them into a set.

If you need the words in the same order as they appear in the file, why are you using a set? Just keep them in a list.
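A minimal sketch of the list approach, assuming the file is UTF-8 encoded (the question does not state the encoding). The helper name read_words and the temporary demo file are hypothetical, added only so the example runs on its own. io.open is used because it decodes to Unicode text on both Python 2.6+ and Python 3.

```python
import io
import os
import tempfile


def read_words(path, encoding="utf-8"):
    """Return the whitespace-separated words of a text file, in file order."""
    # The encoding is an assumption; change it to match the actual file.
    with io.open(path, "r", encoding=encoding) as f:
        return f.read().split()


# Throwaway file standing in for the question's testfile.txt.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with io.open(demo_path, "w", encoding="utf-8") as f:
    f.write(u"Testing this file with Python\n")

texts = read_words(demo_path)  # a list, so the order is preserved
os.remove(demo_path)
print(texts)
```

Because the file is decoded on read, every element of texts is already a Unicode string, which covers the "u" prefix from the original example.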

If you need a set to remove duplicates or for other purposes, then you have the following options:

Use the OrderedDict class, which has been in the standard library since Python 2.7; recipes exist online for earlier versions.
Create an ordered set. There is an SO question with a good discussion of this.
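The OrderedDict option could look roughly like this. It is a sketch, not the only way to do it: the helper name unique_in_order and the sample sentence are made up for illustration. OrderedDict.fromkeys keeps the first occurrence of each key in insertion order, so the keys act as an ordered set.

```python
from collections import OrderedDict


def unique_in_order(words):
    """Drop duplicate words while keeping the order of first appearance."""
    # The dict values are unused (all None); only the ordered keys matter.
    return list(OrderedDict.fromkeys(words))


words = u"to be or not to be".split()
unique_words = unique_in_order(words)
print(unique_words)
```

If you also need fast membership tests, keep the OrderedDict itself around instead of converting it to a list; "word in d" works on its keys.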

Answer 1
2 2011-05-07 08:10:13

Answer 2
2 Accepted 2011-05-07 08:15:27