Inputting a text file and a list of words, and printing the words with their line numbers?
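This is my code. However, it only works for one word, and it prints the same word twice. How can I make it loop over a list of words and the text file, and print each word with the numbers of the lines it appears on? For example:
index('raven.txt', ['raven', 'mortal', 'dying', 'ghost', 'ghastly', 'evil', 'demon'])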
ghost 9
dying 9
demon 122
evil 99, 106
ghastly 82
mortal 30
My code:
filename = input("type a filename:")
file = open(filename)
counter = 0
lst = []
while True:
    x = input("type word:")
    for line in file.readlines():
        counter += 1
        if line.find(x) >= 0:
            print(x, counter)
By using sets, you can check for all the keywords more or less at once, without looping over them one by one.
def index(filepath, keywords):
    # Convert the keyword list to a set
    keys = set(keywords)
    data = {}
    with open(filepath, "r") as fd:
        # Line numbers are 0-based here
        for i, line in enumerate(fd.readlines()):
            # Keywords that occur as whitespace-separated words on this line
            for key in set.intersection(keys, set(line.split())):
                data.setdefault(key, []).append(i)
    return data

filepath = input("Enter file: ")
keywords = input("Enter keywords: ").split()
data = index(filepath, keywords)
for key in sorted(data.keys()):
    print("%s :: %s" % (key, ", ".join([str(i) for i in sorted(data[key])])))
With the test file shown after it, the output would be:
>python kaka.py
Enter file: test.txt
Enter keywords: help some test
help :: 3, 7
test :: 0, 3, 7
ssdfsdf test sdfsdf
sdf
sdfsdfs
sdf help test
sdfsdf
sdfsdf help test
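The set-based index function above only matches whole whitespace-separated tokens, so a word followed by a comma, or written with a capital letter, would be missed. The following is a minimal sketch of one way to handle that. It assumes case-insensitive matching and 1-based line numbers are wanted (the answer states neither), and `index_normalized` is a hypothetical name used only for illustration.

```python
import string


def index_normalized(filepath, keywords):
    """Map each keyword to the 1-based line numbers it appears on.

    Assumption: matching is case-insensitive and ignores ASCII
    punctuation attached to the start or end of a word.
    """
    keys = {k.lower() for k in keywords}
    data = {}
    with open(filepath, "r") as fd:
        for line_no, line in enumerate(fd, start=1):
            # Strip surrounding punctuation from each token and lowercase it
            tokens = {tok.strip(string.punctuation).lower() for tok in line.split()}
            # Set intersection finds every keyword present on this line
            for key in keys & tokens:
                data.setdefault(key, []).append(line_no)
    return data
```

Keywords that never occur are simply absent from the returned dictionary, as in the original function.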
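file.readlines() reads from the current position to the end of the file and returns a list of lines. After the first call the file position is at the end of the file, so every later call returns an empty list and only the first word you type is actually searched for. To read the lines again, you would have to reopen the file or rewind it with file.seek(0).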
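A short sketch that makes this visible: once a file object has been read to the end, a second `readlines()` call yields nothing until the file is rewound with `seek(0)`. The helper name `read_twice` and the temporary file are for illustration only.

```python
import os
import tempfile


def read_twice(path):
    """Return the results of three readlines() calls on one file object."""
    with open(path) as f:
        first = f.readlines()         # reads every line; position is now at end of file
        second = f.readlines()        # nothing left to read, so this is empty
        f.seek(0)                     # rewind to the beginning of the file
        after_rewind = f.readlines()  # all lines are available again
    return first, second, after_rewind


# Demonstration with a throwaway two-line file
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("one\ntwo\n")

first, second, after_rewind = read_twice(tmp.name)
os.remove(tmp.name)
print(len(first), len(second), len(after_rewind))  # 2 0 2
```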
Instead, you can store all the lines of the file in a list, and then search that list each time a word is entered into the program.
filename = input("type a filename:")
file = open(filename)
files_lines = [line for line in file]
counter = 0
while True:
    x = input("type word:")
    for line in files_lines:
        counter += 1
        if line.find(x) >= 0:
            print(x, counter)
    # Reset the counter so that, for the next search word,
    # line numbering starts again from 0
    counter = 0
You can improve the code further by using enumerate:
filename = input("type a filename:")
file = open(filename)
files_lines = [line for line in file]
while True:
    x = input("type word:")
    line_nos = []
    for line_no, line in enumerate(files_lines, start=1):
        if line.find(x) >= 0:
            line_nos.append(line_no)
    if line_nos:
        print(x, line_nos)
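Putting the pieces together, here is a minimal sketch of an `index(filename, words)` function in the form the question asks for. It reads the file once and prints each found word followed by its line numbers. It uses substring matching, like `line.find(x) >= 0` above, and 1-based line numbers. `build_index` is a hypothetical helper name, and the alphabetical output order is an assumption, since the example output in the question does not follow an obvious order.

```python
def build_index(filename, words):
    """Return {word: [line numbers]} for every word found in the file."""
    with open(filename) as f:
        lines = f.readlines()  # read once, reuse for every word

    found = {}
    for word in words:
        # Substring match, so "evil" also matches "devil"
        line_nos = [no for no, line in enumerate(lines, start=1) if word in line]
        if line_nos:
            found[word] = line_nos
    return found


def index(filename, words):
    """Print each found word followed by its comma-separated line numbers."""
    for word, line_nos in sorted(build_index(filename, words).items()):
        print(word, ", ".join(str(no) for no in line_nos))
```

Called as in the question, for example `index('raven.txt', ['raven', 'mortal'])`, it prints one line per found word in the `evil 99, 106` style; words that do not occur are skipped.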