Python, reading lines from a file one by one

Question

I'm making a program which sorts valid and invalid social security numbers.

The program is supposed to sort numbers from a text file on my computer. However, I'm only able to input all the numbers at once (I think). I want the program to check the numbers one by one.

This is how it looks right now:

def fileinput():
    try:
        textfile = open("numberlist.txt","r")
        socialsecuritynumber = textfile.read()
        numberprogram(socialsecuritynumber)
    except IOError:
        print("there's no such file!\n")

Does anyone know how I'm supposed to do this? The text file just contains numbers:

1993-06-11 5570
930611-5570
930611 5570
93 05115570
1993 05 11 55 70
1993 05 11 5570

These are the numbers from my text file.

Answer 1

Always read files using a with statement. That way, if something goes wrong during the read or an exception is raised inside the block, the file is closed automatically.

Then use a for loop to read the file line by line, like this:

with open("numberlist.txt", "r") as textfile:
    for line in textfile:
        print(line)
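Applied to the function from the question, the same loop might look roughly like the sketch below. The asker's numberprogram is not shown, so the version here is only a hypothetical stand-in, and the handler parameter and the True/False return value are assumptions added so the sketch can be tried on its own.

```python
import os
import tempfile


def numberprogram(number):
    """Hypothetical stand-in for the asker's checker, which is not shown."""
    print("checking %r" % number)


def fileinput(path="numberlist.txt", handler=numberprogram):
    """Call handler once per non-empty line; return False if the file is missing."""
    try:
        with open(path, "r") as textfile:
            for line in textfile:
                number = line.strip()  # drop the trailing newline
                if number:             # skip blank lines
                    handler(number)
    except IOError:
        print("there's no such file!\n")
        return False
    return True


# Demo on a throwaway file so the sketch runs without numberlist.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1993-06-11 5570\n\n930611-5570\n")
seen = []
ok = fileinput(tmp.name, seen.append)
os.remove(tmp.name)
```

Each number now reaches the handler separately instead of as one big string, which is what the question asks for.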

Answer 2

Use with, as thefourtheye suggested. You can call the readlines() method, iterate over the resulting list with a for-in loop, and check each line for validity. Note that readlines() loads the whole file into memory; for very large files, iterate over the file object itself, which reads one line at a time.
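Since this answer has no code, here is a small sketch of the readlines() approach next to plain iteration over the file object. The function names and the temporary demo file are made up for illustration; only open() and readlines() come from the answer.

```python
import os
import tempfile


def lines_via_readlines(path):
    """Load every line into a list first (the whole file sits in memory)."""
    with open(path) as f:
        return [line.rstrip("\n") for line in f.readlines()]


def lines_lazily(path):
    """Yield one line at a time; only the current line is held in memory."""
    with open(path) as f:
        for line in f:
            yield line.rstrip("\n")


# Demo on a throwaway file so the sketch runs without numberlist.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("930611-5570\n930611 5570\n")

eager = lines_via_readlines(tmp.name)
lazy = list(lines_lazily(tmp.name))
os.remove(tmp.name)
```

Both give the same lines. The difference only matters for memory use on big files, where the lazy version is the safer default.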

Answer 3

with open("numberlist.txt") as f: # this auto closes the file after reading. It's good practice
    numbers = f.readlines() # numbers is a list of all the numbers(a list of lines in the file)

If there are unwanted spaces in the lines (or just in case there are):

numbers = [n.strip() for n in numbers] # takes out unwanted spaces on the ends

And if you find there are commas or something similar after the numbers, you can do this:

numbers = [n[:-1] for n in numbers] # slices off the last character of each line/list item

for number in numbers:
    numberprogram(number)  # or do whatever you want here

EDIT:

Alternatively you could use a regular expression, and commas and spaces won't matter:

import re

n = ['1993-06-11 5570',
     '930611-5570',
     '930611 5570',
     '93 05115570',
     '1993 05 11 55 70',
     '1993 05 11 5570']

regex = '([0-9]+(?:[- ]?[0-9]+)*)'
match_nums = [re.search(regex, num) for num in n]
results = [i.groups() for i in match_nums]
for i in results:
    print(i)

Output:

('1993-06-11 5570',)
('930611-5570',)
('930611 5570',)
('93 05115570',)
('1993 05 11 55 70',)
('1993 05 11 5570',)

For more information on regular expressions, see the documentation for Python's re module.
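If the goal is to ignore the separators rather than match them, one possible variation (not part of the answer above) is to pull out only the runs of digits with re.findall. The helper name digit_groups is made up for this sketch.

```python
import re

samples = [
    "1993-06-11 5570",
    "930611-5570",
    "93 05115570",
    "1993 05 11 55 70",
]


def digit_groups(text):
    """Return the runs of digits in text, ignoring any separators."""
    return re.findall(r"[0-9]+", text)


groups = [digit_groups(s) for s in samples]
joined = ["".join(g) for g in groups]  # separators removed entirely
```

Here groups[0] is ['1993', '06', '11', '5570'] and joined[0] is '199306115570', so the same number written with different separators ends up in one form.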

Answer 4

Using with for file operations is recommended. On Python 2.5 you have to enable the with statement with a __future__ import (from 2.6 on it is available by default). The simplest solution I could think of for your problem with the numbers is:

from __future__ import with_statement
file_with_ssn = "/absolute/path/to/the/file"

try:
    with open(file_with_ssn) as ssn_file:
        for ssn in ssn_file:
            # Keep only the digits; this drops spaces, dashes and the newline.
            # Since an SSN cannot be negative, dropping "-" is fine here.
            ssn = "".join(filter(str.isdigit, ssn))
            numberprogram(ssn)
except EnvironmentError:
    print("IOError or OSError or WindowsError happened")
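To see what the digit filter actually hands to numberprogram, here is a tiny sketch run on a few lines shaped like the ones in the question. It uses a generator expression with "".join, because in Python 3 filter() returns an iterator rather than a string. The helper name digits_only is made up for illustration.

```python
def digits_only(line):
    """Keep only the digit characters of line (drops spaces, dashes, newline)."""
    return "".join(ch for ch in line if ch.isdigit())


lines = ["930611-5570\n", "1993 05 11 55 70\n", "\n"]
cleaned = [digits_only(line) for line in lines]
# cleaned is ['9306115570', '199305115570', '']
```

Note that the 10-digit and 12-digit forms are still different strings after filtering, and a blank line becomes an empty string, so the checking function has to cope with all three cases.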

4 answers

solution1
4 2013-12-10 12:41:22

solution2
0 2013-12-10 12:44:56

solution3
0 ACCEPTED 2013-12-10 12:45:01

solution4
0 2013-12-17 10:54:18
