
Reading one integer at a time using Python

How can I read integers from a file? I have a large (512 MB) text file that contains integer data like this:

0 0 0 10 5 0 0 140
0 20 6 0 9 5 0 0

If I use c = file.read(1), I get only one character at a time, but I need one integer at a time, like this:

c = 0
c = 10
c = 5
c = 140 and so on...

Any help would be appreciated. Thanks in advance.

Here's one way:

with open('in.txt', 'r') as f:
  for line in f:
    # split() with no argument handles repeated spaces and the trailing newline
    for s in line.split():
      num = int(s)
      print(num)

Iterating with for line in f reads the file one line at a time (using neither read() nor readlines(), which would load the whole file at once). This matters because your file is large.

Then you split each line on spaces and convert each number as you go.

You can add more error checking than this simple example has; as written, it will raise a ValueError if the file contains corrupted data.
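As a rough illustration of the extra error checking mentioned above, here is a minimal sketch that skips tokens int() cannot parse instead of stopping. The function name read_ints_checked and the choice to report bad tokens on stderr are assumptions made for this example, not part of the original answer.

```python
import sys


def read_ints_checked(path):
    """Yield every valid integer in a whitespace-separated text file.

    Tokens that int() rejects are reported on stderr and skipped,
    so one corrupted value does not abort the whole run.
    """
    with open(path, 'r') as f:
        for line_number, line in enumerate(f, start=1):
            for token in line.split():
                try:
                    value = int(token)
                except ValueError:
                    sys.stderr.write(
                        'skipping bad token %r on line %d\n' % (token, line_number)
                    )
                    continue
                yield value
```

Whether to skip, log, or re-raise depends on how much you trust the data; skipping silently can hide real problems, which is why this sketch at least reports the line number.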

As the comments say, this should be enough for you. Otherwise, if your file might contain extremely long lines, you can do something trickier, such as reading a block at a time.

512 MB is really not that large. If you're going to create a list of the data anyway, I don't see a problem with doing the reading step in one go:

with open('myfile.txt') as f:  # the with block closes the file afterwards
    my_int_list = [int(v) for v in f.read().split()]

If you can structure your code so that you don't need the entire list in memory, it would be better to use a generator:

def my_ints(fname):
    with open(fname) as f:  # closes the file once the generator finishes
        for line in f:
            for val in line.split():
                yield int(val)

and then use it:

for c in my_ints('myfile.txt'):
    print(c)  # do something with c (which is the next int)
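To show why the generator form helps, here is a small sketch that consumes the integers without ever building the full list. my_ints is repeated so the block runs on its own; the helpers first_n and total are hypothetical names added only for illustration.

```python
from itertools import islice


def my_ints(fname):
    """Yield the integers in a whitespace-separated text file, one at a time."""
    with open(fname) as f:
        for line in f:
            for val in line.split():
                yield int(val)


def first_n(fname, n):
    """Return the first n integers; the rest of the file is not parsed."""
    return list(islice(my_ints(fname), n))


def total(fname):
    """Sum every integer while holding only one line in memory at a time."""
    return sum(my_ints(fname))
```

Both helpers pull values lazily, so memory use stays roughly at the size of the longest line rather than the size of the file.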

I would do it this way:

  • buffer = file.read(8192)
  • contents += buffer
  • split contents on spaces
  • remove the last element from the resulting list (it might not be a complete number)
  • replace contents with that last element
  • repeat until buffer is empty (read() returns an empty string at end of file, not None), then process whatever is left in contents
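The steps above can be sketched roughly as follows. This is one interpretation, not the answerer's own code: the function name ints_from_chunks is made up for the example, it splits on any whitespace rather than only spaces, and it holds back the last piece only when the chunk does not end in whitespace.

```python
def ints_from_chunks(path, chunk_size=8192):
    """Yield integers from a whitespace-separated text file, reading
    chunk_size characters at a time instead of whole lines."""
    leftover = ''
    with open(path, 'r') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # read() returns '' at end of file
                break
            contents = leftover + chunk
            parts = contents.split()
            # If the chunk stops mid-number, keep the last piece for the
            # next round so it can be joined with the rest of its digits.
            if parts and not contents[-1].isspace():
                leftover = parts.pop()
            else:
                leftover = ''
            for part in parts:
                yield int(part)
    # Whatever remains after the final chunk is a complete number.
    if leftover:
        yield int(leftover)
```

This only pays off when lines can be extremely long; for ordinary line lengths, iterating over the file line by line is simpler and just as memory-friendly.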
