
Reading a file and parsing it into sections

Okay, so I have a file that contains an ID number followed by a name, like this:

10 alex de souza

11 robin van persie

9 serhat akin

I need to read this file and break each record up into two fields: the ID and the name. I need to store the entries in a dictionary where the ID is the key and the name is the satellite data. Then I need to output all the entries in the dictionary in two columns, one entry per line, sorted numerically by ID. dict.keys and list.sort might be helpful (I guess). Finally, the input filename needs to be the first command-line argument.

Thanks for your help!

I have this so far, but I can't get any further.

fin = open("ids", "r")    # Open the file for reading

for line in fin:                 # Go through the file line by line
    string = line.split()        # Split the line on whitespace
    if len(string) > 1:          # Separate the ID from the name
        id = int(string[0])
        name = string[1:]
        print(id, name)          # Print results

We need sys.argv to get the command-line argument (careful: the name of the script is always element 0 of that list, so the file name is at index 1).

Next we open the file (there is no error handling here; you should add some) and read its lines into the list "lines". Each element of that list is a string of the form 'number firstname secondname'.

Then create an empty dictionary out and loop over the individual strings in lines, splitting each one on whitespace and storing the pieces in the temporary variable tmp (which is now a list of strings: ['number', 'firstname', 'secondname']). After that we just fill the dictionary, using the number as the key and the rest of the name, joined with spaces, as the value.
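As a small illustration of the split-and-join step on a single line (a Python 3 sketch; the sample string is taken from the question and the variable names key and name are only for illustration):

```python
fullstr = '10 alex de souza\n'   # one line as read from the file

tmp = fullstr.split()            # split on whitespace; the trailing newline is dropped too
key = tmp[0]                     # the ID, still a string at this point
name = ' '.join(tmp[1:])         # glue the remaining words back together

print(tmp)    # ['10', 'alex', 'de', 'souza']
print(key)    # 10
print(name)   # alex de souza
```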

To print the dictionary in sorted order, loop over the list of keys returned by sorted(out), passing key=int so that the keys, which are strings, are compared numerically. Then print the id (the number) followed by the corresponding value, looked up in the dictionary under that id's string key.
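To see why key=int matters, here is a short Python 3 sketch. The dictionary is typed in by hand from the sample records in the question, just for illustration:

```python
# Keys are strings, exactly as they come out of str.split().
out = {'10': 'alex de souza', '11': 'robin van persie', '9': 'serhat akin'}

# Default sorting compares the keys as text, so '10' sorts before '9'.
text_order = sorted(out)

# key=int compares the keys by their numeric value instead.
numeric_order = sorted(out, key=int)

print(text_order)     # ['10', '11', '9']
print(numeric_order)  # ['9', '10', '11']
```

Note that sorted() only changes the order of iteration; the keys it returns are still the original strings, so they can be used directly to index the dictionary.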

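import sys

# The input file name is the first command-line argument;
# sys.argv[0] is the script itself.
try:
    infile = sys.argv[1]
except IndexError:
    # Python 2: raw_input() returns the typed text as a string,
    # whereas input() would try to evaluate it.
    infile = raw_input('Enter file name: ')

with open(infile, 'r') as file:
    lines = file.readlines()

out = {}
for fullstr in lines:
    tmp = fullstr.split()
    if not tmp:
        continue    # skip blank lines
    # ID string as key, the remaining words re-joined as value
    out[tmp[0]] = ' '.join(tmp[1:])

# The keys are strings, so sort them by their integer value.
for id in sorted(out, key=int):
    print id, out[id]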

This works in Python 2.7 with ASCII strings. I'm fairly sure it can handle other encodings as well (German umlauts work, at least), but I can't test that any further. On Python 3, print has to be called as a function, e.g. print(id, out[id]). You may also want to add a lot of error handling in case the input file is formatted differently.
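As a rough idea of what that error handling could look like, here is a Python 3 sketch. The function name parse_records and the choice to collect the numbers of bad lines rather than stop are my own assumptions, not part of the code above; it also stores the IDs as integers, so a plain sorted() is enough:

```python
def parse_records(lines):
    """Return (records, skipped_line_numbers) for an iterable of 'ID name' lines."""
    records = {}
    skipped = []
    for lineno, line in enumerate(lines, start=1):
        parts = line.split(None, 1)      # split once, on the first run of whitespace
        if not parts:
            continue                     # blank line: nothing to record
        if len(parts) < 2:
            skipped.append(lineno)       # an ID with no name after it
            continue
        try:
            key = int(parts[0])
        except ValueError:
            skipped.append(lineno)       # the first field is not a number
            continue
        records[key] = parts[1].strip()
    return records, skipped


# Hand-made sample with a blank line and two malformed lines.
sample = ["10 alex de souza\n", "\n", "eleven robin van persie\n",
          "9 serhat akin\n", "7\n"]
records, skipped = parse_records(sample)
for key in sorted(records):
    print(key, records[key])
print("skipped lines:", skipped)
```

With a real file you would pass the open file object instead of the sample list, and probably wrap the open() call in a try/except for a missing or unreadable file.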

Just a suggestion, this code is probably simpler than the other code posted:

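import sys

# Read all lines from the file named by the first command-line argument.
with open(sys.argv[1], "r") as handle:
    lines = handle.readlines()

# Split each non-blank line once into [id, name] and build the dictionary.
data = dict([i.strip().split(None, 1) for i in lines if i.strip()])

# Sort the string keys by their numeric value.
for idx in sorted(data, key=int):
    print idx, data[idx]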
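Since the question asks for two columns, here is one possible Python 3 take on the same idea that also lines the names up. The helper names load_ids and format_rows are made up for this sketch, padding the IDs to the longest one is just my reading of "two columns", and it assumes every non-blank line has both an ID and a name:

```python
def load_ids(lines):
    # Split each non-blank line once, on the first run of whitespace.
    pairs = (line.split(None, 1) for line in lines if line.strip())
    return {key: name.strip() for key, name in pairs}


def format_rows(data):
    # Right-align every ID to the width of the longest one so the names line up.
    width = max((len(key) for key in data), default=0)
    return [key.rjust(width) + "  " + data[key] for key in sorted(data, key=int)]


# Sample records from the question, in place of a real file.
sample = ["10 alex de souza\n", "11 robin van persie\n", "9 serhat akin\n"]
for row in format_rows(load_ids(sample)):
    print(row)
```

To run it against a file, open the file named by sys.argv[1] and pass the file object to load_ids, since iterating over a file yields its lines.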
