How to create a list of word pairs from a list

Question

I have a list of words in the file "temp":

 1. the 
 2. of
 3. to
 4. and
 5. bank

and so on.

How can I improve its readability?

import itertools
f = open("temp.txt","r")
lines = f.readlines()
pairs = list(itertools.permutations(lines, 2))
print(pairs)

I'm lost, please help.

Answer 1

import itertools

with open("temp.txt", "r") as f:
    words = [item.split(' ')[-1].strip() for item in f]

pairs = list(itertools.permutations(words, 2))
print(pairs)

This prints (shown here with pprint for readability):

[('the', 'of'),
 ('the', 'to'),
 ('the', 'and'),
 ('the', 'bank'),
 ('of', 'the'),
 ('of', 'to'),
 ('of', 'and'),
 ('of', 'bank'),
 ('to', 'the'),
 ('to', 'of'),
 ('to', 'and'),
 ('to', 'bank'),
 ('and', 'the'),
 ('and', 'of'),
 ('and', 'to'),
 ('and', 'bank'),
 ('bank', 'the'),
 ('bank', 'of'),
 ('bank', 'to'),
 ('bank', 'and')]
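The snippet above calls print, while the output shown was formatted with pprint. Here is a minimal sketch of that step. The in-memory list sample_lines is a hypothetical stand-in for temp.txt and assumes each line looks like "1. the" with no trailing space:

```python
import itertools
from pprint import pprint

# Hypothetical stand-in for the lines of temp.txt
sample_lines = ["1. the\n", "2. of\n", "3. to\n", "4. and\n", "5. bank\n"]

# Same extraction as in the answer: last space-separated item, stripped
words = [item.split(' ')[-1].strip() for item in sample_lines]

pairs = list(itertools.permutations(words, 2))

# pprint puts one tuple per line once the list no longer fits in 80 columns
pprint(pairs)
```

With five words this gives 5 * 4 = 20 ordered pairs.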

Answer 2

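I assume your problem is creating all possible pairs of the words defined in the temp file. This is called a permutation, and you are already using the itertools.permutations function.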
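To make "all possible pairs" concrete, here is a small sketch with three hard-coded words (the list is only for illustration and does not come from the file):

```python
import itertools

words = ["the", "of", "to"]
pairs = list(itertools.permutations(words, 2))

# Both orderings of each pair appear, and no word is paired with itself
print(pairs)

# n words give n * (n - 1) ordered pairs
print(len(pairs) == len(words) * (len(words) - 1))
```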

If you need to actually write the output to a file, your code should look like this:

Code:

import itertools
f = open("temp","r")
lines = [line.split(' ')[-1].strip() for line in f] #1
pairs = list(itertools.permutations(lines, 2)) #2
r = open('result', 'w') #3
r.write("\n".join([" ".join(p) for p in pairs])) #4
r.close() #5

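1. [line.split(' ')[-1].strip() for line in f] reads the whole file and, for each line read, splits it around the space character, picks the last item of the line (negative indexes such as -1 walk backwards through the list), removes any trailing whitespace (such as \n), and puts all the results in one list
2. the pairs are generated just as you already did, but now they have no trailing \n
3. open the result file for writing
4. join the two words of each pair with a space (" "), join the resulting lines with \n, then write everything to the file
5. close the file (and thereby flush it)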
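Steps 1 and 4 can be traced on small values. This is only a sketch: the sample line and the two pairs are made up for illustration and assume the "number, space, word" layout from the question.

```python
# Step 1 on a single line
line = "5. bank\n"
parts = line.split(' ')   # split around the space character
last = parts[-1]          # -1 picks the last item, still with its newline
word = last.strip()       # strip removes the trailing newline
print(parts, repr(last), repr(word))

# Step 4 on two hand-written pairs
pairs = [("the", "of"), ("of", "the")]
text = "\n".join(" ".join(p) for p in pairs)
print(repr(text))
```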

Answer 3

Some improvements, with explanation

import itertools

with open('temp.txt', 'r') as fobj_in, open('out.txt', 'w') as fobj_out:
    words = [item.split()[-1] for item in fobj_in if item.strip()]
    for pair in itertools.permutations(words, 2):
        fobj_out.write('{} {}\n'.format(*pair))

Explanation

with open('temp.txt', 'r') as fobj_in, open('out.txt', 'w') as fobj_out:

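We open both files, one for reading and one for writing, with the help of with. This guarantees that both files are closed as soon as we leave the indentation of the with block, even if an exception occurs somewhere in that block.

We use a list comprehension to get all the words: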

words = [item.split()[-1] for item in fobj_in if item.strip()]

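item.split()[-1] splits at any whitespace and gives us the last entry on the line. Note that it also removes the \n at the end of each line, so .strip() is not needed here. item.split() is usually better than item.split(' ') because it also works with multiple spaces and tabs. We still need if item.strip() to make sure the line is not empty. If nothing is left after removing all whitespace, there is no word on the line and item.split()[-1] would raise an IndexError. We simply discard that line and move on to the next one.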
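A short sketch of these points. The sample lines are invented to include a tab, doubled spaces, a trailing space, and two blank lines:

```python
lines = ["1. the\n", "2.\tof\n", "3.  to \n", "\n", "   \n"]

# split() with no argument splits on any run of whitespace and drops the newline
words = [item.split()[-1] for item in lines if item.strip()]
print(words)

# split(' ') splits on single spaces only, so tabs and trailing spaces get in the way
naive = [item.split(' ')[-1].strip() for item in lines]
print(naive)

# Without the filter, a blank line yields an empty list and [-1] fails
try:
    "\n".split()[-1]
except IndexError as exc:
    print("IndexError:", exc)
```

Only the first variant yields exactly the three words. The second returns the whole tab-separated line for "of" and empty strings for the other damaged lines.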

Now we can iterate over all the pairs and write them to the output file:

for pair in itertools.permutations(words, 2):
    fobj_out.write('{} {}\n'.format(*pair))

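We ask the iterator to give us the next word pair, one pair at a time, and write this pair to the output file. There is no need to convert it to a list. .format(*pair) unpacks the elements of pair and, for our two-element pair, is equivalent to .format(pair[0], pair[1]).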
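The equivalence and the write loop can be checked without touching the disk. In this sketch io.StringIO is an in-memory stand-in for the output file, and the three words are made up:

```python
import io
import itertools

pair = ("the", "of")
unpacked = '{} {}\n'.format(*pair)             # * unpacks the tuple into two arguments
explicit = '{} {}\n'.format(pair[0], pair[1])  # same call written out by hand
print(unpacked == explicit)

words = ["the", "of", "to"]
fobj_out = io.StringIO()  # stand-in for open('out.txt', 'w')

# Pairs are produced one at a time; no intermediate list of pairs is built
for pair in itertools.permutations(words, 2):
    fobj_out.write('{} {}\n'.format(*pair))

output = fobj_out.getvalue()
print(output)
```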

Performance note

A first intuition might be to also use a generator expression to read the words from the file:

words = (item.split()[-1] for item in fobj_in if item.strip())

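But timing measurements show that the list comprehension is faster than the generator expression. This is because itertools.permutations(words) consumes the iterator words in any case. Creating the list first avoids the work of going through all the elements again.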
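The measurement itself is not shown here. One way to repeat it is sketched below with timeit. The generated lines, the function names with_list and with_generator, and the repeat count are all assumptions made for this sketch, and the numbers will vary by machine and Python version:

```python
import itertools
import timeit

# Hypothetical in-memory stand-in for the file's lines
lines = ["{}. word{}\n".format(i, i) for i in range(1, 2001)]

def with_list():
    # Build the full word list first, then hand it to permutations
    words = [item.split()[-1] for item in lines if item.strip()]
    return next(itertools.permutations(words, 2))

def with_generator():
    # Hand a lazy generator to permutations instead
    words = (item.split()[-1] for item in lines if item.strip())
    return next(itertools.permutations(words, 2))

list_time = timeit.timeit(with_list, number=200)
gen_time = timeit.timeit(with_generator, number=200)
print("list comprehension:   {:.4f}s".format(list_time))
print("generator expression: {:.4f}s".format(gen_time))
```

Each function only fetches the first pair, so the timing mostly reflects how the words are collected rather than the cost of producing every pair.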


Answer 1: 4, 2013-06-08 21:05:31

Answer 2: 3, accepted, 2013-06-08 21:18:20

Answer 3: 2, 2013-06-08 22:24:32
