
Reading lines from a file with offset for readline

I want to read a file line by line, but I want to move the line pointer on every two reads. The file looks like this:

100
200
300
400

So, if I write

line_1 = f.readline()  # 100
line_2 = f.readline()  # 200

Then on the third readline I will get 300. I want to get 100 with a readline and then get 200 with an incremental statement. Then I will put those into a loop, and finally I want to get the lines in this manner:

iteration #1: 100 and 200
iteration #2: 200 and 300
iteration #3: 300 and 400

How can I do that?

You can create a generator (it also removes the EOL character; drop the rstrip calls if you want to keep it):

def readpairsoflines(f):
    l1 = f.readline().rstrip('\n')
    for l2 in f:
        l2 = l2.rstrip('\n')
        yield l1, l2
        l1 = l2

And use it like this:

with open(filename) as f:
    for l1, l2 in readpairsoflines(f):
        # Do something with your pair of lines, for example print them
        print(f'{l1} and {l2}')

Result:

100 and 200
200 and 300
300 and 400

With this approach, only two lines are held in memory at any time. Therefore, it also works with large files where memory is a possible concern.
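For comparison, the same sliding pair can be sketched with itertools.tee and zip from the standard library. This is a different technique from the generator above, not a rewrite of it. The helper name pairs_of_lines is made up for illustration, and io.StringIO stands in for an open file so the snippet runs on its own.

```python
import io
from itertools import tee


def pairs_of_lines(f):
    """Return an iterator of consecutive (previous, current) line pairs."""
    # Strip newlines lazily, one line at a time, so the whole file
    # is never loaded into memory.
    stripped = (line.rstrip('\n') for line in f)
    first, second = tee(stripped)
    # Advance the second iterator by one line; None is a default so an
    # empty file does not raise StopIteration here.
    next(second, None)
    return zip(first, second)


# io.StringIO stands in for a real file (an assumption for this demo).
sample = io.StringIO("100\n200\n300\n400\n")
pairs = list(pairs_of_lines(sample))
for l1, l2 in pairs:
    print(f'{l1} and {l2}')
```

Because the two iterators are always one line apart, tee only needs to buffer a single line, so the memory behaviour should be comparable to the hand-written generator.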

I'm always a fan of simple and readable solutions (though they are sometimes less "pythonic").

with open("example.txt") as f:
    old = f.readline().rstrip()
    
    for line in f:
        line = line.rstrip()
        print("{} and  {}".format(old, line))
        old = line
  • A first read is performed before looping through the remaining lines
  • Then, the desired output is printed, and the old string is updated
  • The rstrip() call is required in order to remove the undesired trailing '\n'
  • I assumed that nothing had to be printed for files with fewer than two lines; the code can easily be modified to handle that special case as needed
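One possible way to handle the special case of short files is sketched below. The function name print_pairs, its return value, and the two messages are assumptions chosen for illustration; the original answer prints nothing in these cases. io.StringIO stands in for an open file.

```python
import io


def print_pairs(f):
    """Print consecutive line pairs and return how many pairs were printed."""
    old = f.readline()
    if not old:
        # readline() returns an empty string only at end of file,
        # so the file has no lines at all (hypothetical handling).
        print("file is empty")
        return 0
    old = old.rstrip()
    count = 0
    for line in f:
        line = line.rstrip()
        print("{} and {}".format(old, line))
        old = line
        count += 1
    if count == 0:
        # Exactly one line was read, so no pair could be formed.
        print("only one line: {}".format(old))
    return count


# io.StringIO objects stand in for real files (an assumption for this demo).
print_pairs(io.StringIO("100\n200\n300\n400\n"))
print_pairs(io.StringIO("100\n"))
print_pairs(io.StringIO(""))
```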

The output:

100 and  200
200 and  300
300 and  400

I would suggest splitting the file contents on newlines, like this:

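with open('params.txt') as file:
    # splitlines() avoids the empty trailing item that split('\n')
    # produces when the file ends with a newline
    data = file.read().splitlines()
for index, item in enumerate(data):
    try:
        # Print the current item together with the next one
        print(item + ' ' + data[index + 1])
    except IndexError:
        # The last item has no successor, so print it alone
        print(item)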

This code builds a list of all the lines (not efficient for very large files) and uses enumerate to get each item's index, so each item is printed together with the next item in the list. The IndexError is caught because the last item has no next item; you can also avoid the exception by using an if/else check on the index.
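The if/else variant mentioned above might look roughly like this. The function name adjacent_items and the inline sample string are assumptions for illustration; in practice the list would come from reading the file.

```python
def adjacent_items(data):
    """Return one string per item: 'item next', or just 'item' for the last one."""
    result = []
    for index, item in enumerate(data):
        if index + 1 < len(data):
            # A next item exists, so pair the current item with it
            result.append(item + ' ' + data[index + 1])
        else:
            # Last item: there is nothing to pair it with
            result.append(item)
    return result


# The sample string stands in for the file contents (an assumption).
lines = "100\n200\n300\n400\n".splitlines()
for text in adjacent_items(lines):
    print(text)
```

Checking the index up front keeps the control flow explicit and avoids relying on an exception for the expected end-of-list case.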
