Append current line to previous line

I'm trying to parse an .ldif file but have failed to get the desired output. Any help is much appreciated.

Here is what I'm doing in Python:

lines = open("use.ldif", "r").read().split("\n")
for i, line in enumerate(lines):
   if not line.find(":"):
      lines[i-1] = lines[-1].strip() + line
      lines.pop(i)

open("user_modified.ldif", "w").write("\n".join(lines)+"\n")

use.ldif (input file)

dn: cnh
changetype: add
objectclass: inetOrgPerson
objectclass: cdsUser
objectclass: organizationalPerson
objectclass: Person
objectclass: n
objectclass: Top
objectclass: cd
objectclass: D
objectclass: nshd shdghsf shgdhfjh jhghhghhgh
 hjgfhgfghfhg
street: shgdhgf

dn: cnh
changetype: add
objectclass: inetOrgPerson
objectclass: hjgfhgfghfhg
street: shgdhgf kjsgdhgsjhg shdghsgjfhsfsf
 jgsdhsh
company: xyz

user_modified.ldif (output from my code)

I am getting the same output; nothing is modified. I suspect it's because I'm doing split("\n"), but I can't think of what else to do.

Desired output

dn: cnh
changetype: add
objectclass: inetOrgPerson
objectclass: cdsUser
objectclass: organizationalPerson
objectclass: Person
objectclass: n
objectclass: Top
objectclass: cd
objectclass: D
objectclass: nshd shdghsf shgdhfjh jhghhghhghhjgfhgfghfhg
street: shgdhgf

dn: cnh
changetype: add
objectclass: inetOrgPerson
objectclass: hjgfhgfghfhg
street: shgdhgf kjsgdhgsjhg shdghsgjfhsfsfjgsdhsh
company: xyz

As you can see in my output file user_modified.ldif, the objectclass value in the first entry and the street value in the second entry wrap onto the next line. How can I have them on the same line, as in the desired output?

Thanks in advance.

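with open("use.ldif", "r") as f:
    lines = f.read().split("\n")

merged = []
for line in lines:
    # A non-empty line without ":" continues the previous line (folded LDIF value).
    if merged and line and ":" not in line:
        merged[-1] = merged[-1].rstrip() + line.lstrip()  # drop the fold's leading space
    else:
        merged.append(line)

with open("user_modified.ldif", "w") as f:
    f.write("\n".join(merged) + "\n")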
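
As a side note on why the loop in the question changed nothing: `str.find` returns the index of the match, or -1 when there is none, so `not line.find(":")` is true only for a line that starts with ":". The question's loop also reads `lines[-1]` (the last line of the file) where it presumably meant `lines[i-1]`. A minimal sketch of the difference follows; the two helper names are hypothetical and exist only for illustration.

```python
def is_continuation_buggy(line):
    # Condition from the question: -1 (not found) is truthy, so this is
    # True only when ":" sits at index 0.
    return not line.find(":")


def is_continuation_fixed(line):
    # Membership test: True for a non-empty line that contains no ":".
    return len(line) > 0 and ":" not in line


# Sample lines taken from the input file in the question.
for sample in ["dn: cnh", " hjgfhgfghfhg", ""]:
    print(repr(sample), is_continuation_buggy(sample), is_continuation_fixed(sample))
```

Note that a membership test on ":" is still a heuristic: a folded value that itself contains a colon would not be detected as a continuation.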

Okay, here is my approach:

import re

pattern = re.compile(r"(\w+):(.*)")

with open("use.ldif", "r") as f:
    new_lines = []

    for line in f:
        if line.endswith('\n'):
            line = line[:-1]

        if line == "":
            new_lines.append(line)
            continue

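        # Lines that look like "attribute: value" are kept as they are;
        # anything else is treated as a continuation of the previous line.
        l = pattern.search(line)
        if l:
            new_lines.append(line)
        else:
            # Drop the leading space of the folded line before joining.
            new_lines[-1] += line.lstrip()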

with open("user_modified.ldif", "wt") as f:
    f.write("\n".join(new_lines))

Looking at your code, I suggest reading up a bit on iterating over files. Maybe you are still a beginner with Python, but your code processes the whole file three times: at read(), at split('\n'), and finally in the for statement. When you open a file, what you get back is a file object, and as you can see in my code, you can iterate over it directly, getting one line at each step. For larger files this becomes an important performance trick.
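
The line-by-line iteration described above can also be sketched as a generator that unfolds continuation lines while streaming, so the whole file never has to be held in memory. This is only a sketch under stated assumptions: the function name `unfold_lines` is hypothetical, and it assumes the LDIF convention that a continuation line begins with a single space, which is dropped when the line is joined to the previous one.

```python
import io


def unfold_lines(lines):
    """Yield logical lines, joining each continuation line onto the previous one.

    Assumption: a continuation line starts with one space, and that
    space is removed when the line is joined.
    """
    pending = None  # the logical line currently being assembled
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(" ") and pending is not None:
            pending += line[1:]
        else:
            if pending is not None:
                yield pending
            pending = line
    if pending is not None:
        yield pending


# Usage with an in-memory file; a file object returned by open() can be
# passed in the same way.
sample = io.StringIO(
    "dn: cnh\n"
    "street: shgdhgf kjsgdhgsjhg shdghsgjfhsfsf\n"
    " jgsdhsh\n"
    "company: xyz\n"
)
for logical in unfold_lines(sample):
    print(logical)
```

Detecting continuations by the leading space, rather than by the absence of ":", also handles folded values that happen to contain a colon.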
