Python, reading lines from a file one by one

I'm making a program which sorts out valid and invalid social security numbers.

The program is supposed to be able to sort numbers from a text file on my computer. However, I'm only able to input all the numbers at once (I think). I want the program to check the numbers one by one.

This is how it looks right now:

def fileinput():
    try:
        textfile = open("numberlist.txt","r")
        socialsecuritynumber = textfile.read()
        numberprogram(socialsecuritynumber)
    except IOError:
        print("there's no such file!\n")

Does anyone know how I'm supposed to do this? The text file just contains numbers:

  • 1993-06-11 5570
  • 930611-5570
  • 930611 5570
  • 93 05115570
  • 1993 05 11 55 70
  • 1993 05 11 5570

These are the numbers from my text file.

  1. Always read files using the with statement. If there is a problem during the read, or an exception is raised in the code block, the file is closed automatically.
  2. Then use a for loop to read the file line by line, like this:

     with open("numberlist.txt", "r") as textfile:
         for line in textfile:
             print(line)
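Applied to the function from the question, the two steps above might look like the sketch below. The name check_file and its check parameter are hypothetical; check stands in for whatever validates a single number, such as the asker's numberprogram.

```python
def check_file(path, check):
    """Call check(line) once for every non-empty line in the file at path."""
    try:
        with open(path, "r") as textfile:
            for line in textfile:      # the file object yields one line at a time
                line = line.strip()    # drop the trailing newline and stray spaces
                if line:               # skip blank lines
                    check(line)
    except IOError:
        print("there's no such file!\n")
```

Calling check_file("numberlist.txt", numberprogram) would then pass the numbers to numberprogram one at a time instead of as one big string.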

Use with, as thefourtheye suggested. You can use the readlines() method and iterate over the lines one by one with a for-in loop, checking each one for validity. Note that readlines() loads every line into memory at once; for very large files, iterate over the file object directly instead.

with open("numberlist.txt") as f: # this auto closes the file after reading. It's good practice
    numbers = f.readlines() # numbers is a list of all the numbers(a list of lines in the file)

If there are unwanted spaces in the lines (or just in case there are):

numbers = [n.strip() for n in numbers] # takes out unwanted spaces on the ends

And if you find there are commas or something after the numbers, you can do this:

numbers = [n[:-1] for n in numbers] # slices off the last character of each line/list item

for number in numbers:
    pass  # do whatever you want here
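One caveat with the n[:-1] slice: it also removes the last digit from any line that has no trailing comma. A minimal sketch of a more forgiving cleanup, assuming the only unwanted characters are whitespace and trailing commas (clean_line and the sample lines are made up for illustration):

```python
def clean_line(line):
    """Return the line without surrounding whitespace or trailing commas."""
    return line.strip().rstrip(",").rstrip()

raw_lines = ["930611-5570,\n", "  930611 5570\n", "1993 05 11 5570 ,\n"]
numbers = [clean_line(line) for line in raw_lines]
print(numbers)
# prints ['930611-5570', '930611 5570', '1993 05 11 5570']
```

Because rstrip(",") only removes commas that are actually present, lines without one pass through unchanged.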

EDIT:

Alternatively, you could use a regular expression, and commas and spaces won't matter:

import re

n = ['1993-06-11 5570',
     '930611-5570',
     '930611 5570',
     '93 05115570',
     '1993 05 11 55 70',
     '1993 05 11 5570']

# one or more digit groups, each optionally separated by a single hyphen or space
regex = r'([0-9]+(?:[- ]?[0-9]+)*)'
match_nums = [re.search(regex, num) for num in n]
results = [i.groups() for i in match_nums]
for i in results:
    print(i)

('1993-06-11 5570',)
('930611-5570',)
('930611 5570',)
('93 05115570',)
('1993 05 11 55 70',)
('1993 05 11 5570',)
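The same pattern can also be used to separate lines that look like numbers from lines that do not. The sketch below is only a format check under that assumption, not a real validity test (it knows nothing about dates or check digits); split_by_format and the sample lines are hypothetical, and re.fullmatch needs Python 3.4 or later.

```python
import re

# Same digit-group pattern as above, compiled once for reuse.
NUMBER_FORMAT = re.compile(r"[0-9]+(?:[- ]?[0-9]+)*")

def split_by_format(lines):
    """Return (matching, rejected) lists based on the pattern alone."""
    matching, rejected = [], []
    for line in lines:
        text = line.strip()
        # fullmatch requires the whole line to fit the pattern
        if NUMBER_FORMAT.fullmatch(text):
            matching.append(text)
        else:
            rejected.append(text)
    return matching, rejected

ok, bad = split_by_format(
    ["930611-5570\n", "93 05115570\n", "93o611-5570\n", "1993--06-11 5570\n"]
)
print(ok)   # prints ['930611-5570', '93 05115570']
print(bad)  # prints ['93o611-5570', '1993--06-11 5570']
```

Using fullmatch instead of search means a stray letter or a doubled separator anywhere in the line causes a rejection, rather than being silently skipped.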

For info on regular expressions, see here.

Using with for file operations is recommended. If you are on Python 2.5, you have to enable the with statement through a __future__ import (from Python 2.6 on it is available by default). The simplest solution I could think of for your problem with the numbers is:

from __future__ import with_statement
file_with_ssn = "/absolute/path/to/the/file"

try:
    with open(file_with_ssn) as ssn_file:
        for ssn in ssn_file:
            ssn = "".join(filter(str.isdigit, ssn))
            # This removes anything other than a digit, including "-".
            # Since an SSN cannot be negative, isdigit is fine here.
            numberprogram(ssn)
except EnvironmentError:
    print("IOError or OSError or WindowsError happened")
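In Python 3, filter() returns an iterator rather than a string, so the digits have to be joined back together. A minimal Python 3 sketch of the same idea, written as a generator so that it works on any iterable of lines, including an open file (digits_per_line and the sample data are made up for illustration):

```python
def digits_per_line(lines):
    """Yield the digits of each line, skipping lines that contain none."""
    for line in lines:
        # keep only the digit characters; spaces, hyphens and newlines are dropped
        digits = "".join(ch for ch in line if ch.isdigit())
        if digits:
            yield digits

sample = ["1993-06-11 5570\n", "930611-5570\n", "\n", "93 05115570\n"]
print(list(digits_per_line(sample)))
# prints ['199306115570', '9306115570', '9305115570']
```

With a real file, the open file object can be passed in directly inside a with block, and each yielded string handed to numberprogram.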
