
Read a text file and interpret the data

I have to write a program that reads a text file called "terms.txt", sorts its contents, and prints each reference followed by its page numbers. This is what the given file looks like:

3:degree
54:connected
93:adjacent
54:vertex
19:edge
64:neighbor
72:path
55:shortest path
127:tree
99:spanning tree
19:path
28:connected
3:degree
55:graph
64:adjacent
44:breadth first search
77:neighbor
55:degree
55:depth first search
19:degree
27:neighbor
16:Spanning Tree

And this is what the output should look like after the file is run through the program:

adjacent, 64, 93
breadth first search, 44
connected, 28, 54
degree, 3, 19, 55
depth first search, 55
edge, 19
graph, 55
neighbor, 27, 64, 77
path, 19, 72
shortest path, 55
spanning tree, 16, 99
tree, 127
vertex, 54

Right now, this is what I have, and it just prints a list of the page numbers and a list of the references. I'm not sure where to go from here. Anything will help!

def bookIndex2():
    indexList = []
    pageNum = []
    file = open('terms.txt', 'r')
    for line in file:
        pageNumber, reference = line.split(':')  
        pageNum.append(pageNumber)
        indexList.append(reference)
    indexList.sort()
    print(pageNum)
    print(indexList)

You basically need a dictionary that maps each reference to a list of page numbers. As you iterate over the file, add the page number to the list for that reference. Once you have processed the whole file, sort and print the dictionary's items. I recommend using collections.defaultdict instead of a standard dictionary, as it is well suited to building up a dictionary of lists.

from collections import defaultdict

index = defaultdict(list)
with open('terms.txt') as f:
    for line in f:
        page, reference = line.strip().lower().split(':')
        index[reference].append(int(page))

for reference, pages in sorted(index.items()):
    # set() drops repeated page numbers (e.g. "3:degree" appears twice in the file)
    print("{}, {}".format(reference, ', '.join(str(i) for i in sorted(set(pages)))))

Your result looks like the contents of a dictionary, where the keys are terms and the values are lists of page numbers. The idea would be: for each line of input, add that page number to the list for that term (creating the entry/list if needed). Once the dictionary is filled, go through the keys in sorted order to produce the desired output.
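A minimal sketch of that idea with a plain dictionary, assuming the input is already available as a list of "page:term" strings. The names build_index, format_index, and sample are made up for illustration and are not from the question.

```python
def build_index(lines):
    """Map each lower-cased term to the list of page numbers it appears on."""
    index = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        page, term = line.split(':', 1)
        # setdefault creates the empty list the first time a term is seen
        index.setdefault(term.lower(), []).append(int(page))
    return index


def format_index(index):
    """Return one 'term, page, page' string per term, in alphabetical order."""
    return [
        ', '.join([term] + [str(p) for p in sorted(set(pages))])
        for term, pages in sorted(index.items())
    ]


# Hypothetical sample lines in the same format as terms.txt
sample = ["3:degree", "19:degree", "55:degree", "3:degree",
          "16:Spanning Tree", "99:spanning tree"]
for entry in format_index(build_index(sample)):
    print(entry)
```

To run it against the real file, something like build_index(open('terms.txt')) should work, since a file object yields its lines when iterated.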

This might help.

from collections import defaultdict

def bookIndex2():
    occurance_dict = defaultdict(list)
    with open('terms.txt', 'r') as file:
        for line in file:
            # strip the trailing newline so it does not end up in the term
            pageNumber, reference = line.strip().split(':')
            occurance_dict[reference].append(pageNumber)
    for term, occurances in sorted(occurance_dict.items()):
        print([term] + [occurances])

It's very difficult homework...

from collections import defaultdict
def bookIndex2():
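    d = defaultdict(set)  # a set ignores repeated page numbers
    with open('terms.txt', 'r') as file:
        for line in file:
            num, name = line.strip().split(":")
            d[name.lower()].add(num)
    # key=int sorts the page-number strings numerically, not alphabetically
    print("\n".join(map(", ".join, [[name] + sorted(num, key=int) for name, num in sorted(d.items())])))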

Output:

adjacent, 64, 93
breadth first search, 44
connected, 28, 54
degree, 3, 19, 55
depth first search, 55
edge, 19
graph, 55
neighbor, 27, 64, 77
path, 19, 72
shortest path, 55
spanning tree, 16, 99
tree, 127
vertex, 54
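The one-line print in this answer is dense, so here is a possible unpacked version of the same approach, written as a sketch. The function name index_lines is hypothetical, and it takes the lines as an argument instead of opening terms.txt itself.

```python
from collections import defaultdict


def index_lines(lines):
    """Return 'term, page, page' strings, one per term, sorted by term."""
    pages_by_term = defaultdict(set)  # a set drops repeated page numbers
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        num, name = line.split(':')
        pages_by_term[name.lower()].add(num)  # page numbers stay strings

    result = []
    for name, nums in sorted(pages_by_term.items()):
        # key=int puts '3' before '19'; plain string sorting would not
        result.append(', '.join([name] + sorted(nums, key=int)))
    return result
```

Keeping the page numbers as strings means they can go straight into ', '.join, at the cost of needing key=int when sorting.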

If you're looking for something a bit simpler and easier to understand, this might help :)

def bookIndex2():
    appendix = {}

    file = open('terms.txt', 'r')
    for line in file:
        pageNumber, reference = line.split(':')
        reference = reference.rstrip()  # removes the trailing "\n"

        if reference in appendix:
            appendix[reference]=appendix[reference]+', '+pageNumber
        else:
            appendix.update({reference : pageNumber})

    print(appendix)

The above code stores references as keys in the dictionary. If a reference already exists, the page number is appended to that reference's existing value, separated by a comma.
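Printing the dictionary directly does not give the sorted layout the question asks for. One way to get there from a dictionary of comma-joined strings is sketched below. The helper name appendix_lines is an assumption for illustration, and it expects values shaped like '3, 19' as built by the code above.

```python
def appendix_lines(appendix):
    """Turn {'term': '3, 19'} style entries into sorted 'term, 3, 19' lines."""
    merged = {}
    for reference, pages in appendix.items():
        numbers = [int(p) for p in pages.split(', ')]
        # lower() merges entries that differ only in case, e.g. 'Spanning Tree'
        merged.setdefault(reference.lower(), set()).update(numbers)

    lines = []
    for reference in sorted(merged):
        page_list = ', '.join(str(n) for n in sorted(merged[reference]))
        lines.append('{}, {}'.format(reference, page_list))
    return lines


# Hypothetical dictionary in the shape the code above produces
example = {'degree': '3, 55, 19, 3', 'Spanning Tree': '16', 'spanning tree': '99'}
for entry in appendix_lines(example):
    print(entry)
```

Converting the pages to integers before sorting avoids the string ordering in which '19' would come before '3'.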
