How do I check the line after next while reading a file in Python and strip the newline character at its end?

I have a very large JavaScript file that I was trying to analyze. Much of the code had its newlines removed, which made the file hard to read, so I used a replace function to find every instance of ; and replace it with ;\u000A (\u000A is the Unicode escape for a newline). This solved my problem and the program became more readable. However, I now have another problem: every for loop got changed.

For instance:

for(i=0; i<someValue; i++)

got changed to

for(i=0;
i<someValue;
i++)

I want to write a Python program to fix this mistake. My thinking was along these lines:

for line in open('index.html', 'r+'):
    if  line.startswith('for(') and line.endswith(';'):
        line.strip('\n')

However, I don't know what code to use to strip the next line's newline character, since the for loop only reads one line at a time. Could anyone please suggest what I need to do?

A Python file object is an iterator, so you can ask it for the next line while looping:

with open(inputfilename) as ifh:
    for line in ifh:
        if line.startswith('for(') and line.endswith(';\n'):
            line = line.rstrip('\n') + next(ifh).rstrip('\n') + next(ifh)

This uses the next() function to retrieve the next two items from the ifh file object and add them to the current line. The outer loop will continue with the line after that.
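As a minimal, self-contained sketch of the same idea, the snippet below runs the join over an in-memory file. Here io.StringIO stands in for the real file, and the helper name join_split_for_loops and the sample text are hypothetical, chosen only for illustration. It assumes every split for header spans exactly three lines and that the two following lines exist.

```python
import io

def join_split_for_loops(fh):
    """Return the text of fh with three-line 'for(' headers joined into one line."""
    out = []
    for line in fh:
        if line.startswith('for(') and line.endswith(';\n'):
            # Pull the next two lines from the same iterator; the outer loop
            # then resumes with whatever follows them.
            line = line.rstrip('\n') + next(fh).rstrip('\n') + next(fh)
        out.append(line)
    return ''.join(out)

# Sample input mimicking the broken file from the question.
sample = io.StringIO('a=1;\nfor(i=0;\ni<someValue;\ni++)\nb=2;\n')
print(join_split_for_loops(sample), end='')
```

If a for( line were the last or second-to-last line of the input, next() would raise StopIteration; passing a default such as next(fh, '') is one way to guard against that.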

To illustrate, look at the output of this iterator loop:

>>> lst = [1, 2, 3, 4]
>>> lst_iter = iter(lst)
>>> for i in lst_iter:
...     print(i)
...     if i == 2:
...         print('skipping ahead to', next(lst_iter))
...
1
2
skipping ahead to 3
4

Here next() advanced the lst_iter iterator to the next item, and the outer for loop then continued with the value after that.

Your next problem is rewriting the file in place: you cannot read and write the same file at the same time and hope to replace just the right parts. Buffering and different line lengths get in the way.

Use the fileinput module to handle replacing the contents of a file:

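import sys
import fileinput

# inplace=True redirects sys.stdout into the file being read
ifh = fileinput.input(inputfilename, inplace=True)
for line in ifh:
    if line.startswith('for(') and line.endswith(';\n'):
        # join the current line with the two lines that follow it
        line = line.rstrip('\n') + next(ifh).rstrip('\n') + next(ifh)
    sys.stdout.write(line)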

Or use my in-place file rewriting context manager:

from inplace import inplace

with inplace(inputfilename) as (ifh, ofh):
    for line in ifh:
        # lines read from a file keep their trailing newline, so test for ';\n'
        if line.startswith('for(') and line.endswith(';\n'):
            line = line.rstrip('\n') + next(ifh).rstrip('\n') + next(ifh)
        ofh.write(line)
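To try the fileinput approach end to end without touching real data, here is a small sketch that runs it against a throwaway temporary file. The function name rejoin_in_place and the sample contents are hypothetical, and it again assumes each broken for header spans exactly three lines.

```python
import fileinput
import os
import tempfile

def rejoin_in_place(path):
    """Rewrite path in place, joining three-line 'for(' headers into one line."""
    with fileinput.input(files=[path], inplace=True) as fh:
        for line in fh:
            if line.startswith('for(') and line.endswith(';\n'):
                line = line.rstrip('\n') + next(fh).rstrip('\n') + next(fh)
            # While inplace=True is active, print() writes into the file.
            print(line, end='')

# Create a temporary file holding a broken for loop, fix it, and read it back.
with tempfile.NamedTemporaryFile('w', suffix='.js', delete=False) as tmp:
    tmp.write('for(i=0;\ni<someValue;\ni++)\nx=1;\n')
rejoin_in_place(tmp.name)
with open(tmp.name) as fh:
    demo_result = fh.read()
os.remove(tmp.name)
print(demo_result, end='')
```

Passing backup='.bak' to fileinput.input() keeps a copy of the original file, which may be worth doing before rewriting a large file you cannot easily regenerate.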

You can use a counter, like this:

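import sys

cnt = 2  # 2 means "not inside a split for loop"
with open('index.html') as fh:
    for line in fh:
        if line.startswith('for(') and line.endswith(';\n'):
            cnt = 0  # strip the newline from this line and the next one
        if cnt < 2:
            line = line.rstrip('\n')
            cnt += 1
        sys.stdout.write(line)  # emit the (possibly joined) line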
