Parsing a file with Python and extracting lines between blocks

Question

I have a file that I need to parse and extract some specific lines from. This is an example of the file data:

dn: uid=portaladmin,ou=people,ou=myrealm,dc=portalDomain
objectclass: wlsUser objectclass: top 
objectclass: person 
objectclass: organizationalPerson 
objectclass: inetOrgPerson 
cn: portaladmin 
sn: portaladmin 
description: Admin for portal domain 
uid: portaladmin userpassword:: e3NzaGF9L3JYUldtVERBUklCdWM3NGtBSlJQVFVjQ04yRmNkU3o= 
wlsMemberOf: cn=PortalSystemAdministrators,ou=groups,ou=myrealm,dc=portalDom  ain

dn: uid=weblogic,ou=people,ou=myrealm,dc=portalDomain 
objectclass: wlsUser 
objectclass: top 
objectclass: person 
objectclass: organizationalPerson 
objectclass: inetOrgPerson 
cn: weblogic 
sn: weblogic 
description: This user is the default administrator. 
uid: weblogic 
userpassword:: e3NzaGF9VHhObDZhTlBpZTFSa2VVeTRTak1vWm0yTFJmdlN4RE8= 
wlsMemberOf: cn=Administrators,ou=groups,ou=myrealm,dc=portalDomain 
wlsMemberOf: cn=PortalSystemAdministrators,ou=groups,ou=myrealm,dc=portalDomain

As you can see, the information is in blocks, and I need to extract the lines with cn:, sn:, description:, uid: and userpassword: values. I also need to tell the script to search for specific uid or cn values from a list.

I'm not an experienced programmer, which is why I came here to ask the gurus. Please help, thanks in advance.

Answer 1

Just find the lines using str.startswith, passing a tuple of the substrings:

with open("in.txt") as f:
    for line in f:
        if line.startswith(("cn:","sn:", "description:", "uid:","userpassword:")):
            print(line.rstrip())

Output:

cn: portaladmin
sn: portaladmin
description: Admin for portal domain
uid: portaladmin userpassword:: e3NzaGF9L3JYUldtVERBUklCdWM3NGtBSlJQVFVjQ04yRmNkU3o=
cn: weblogic
sn: weblogic
description: This user is the default administrator.
uid: weblogic
userpassword:: e3NzaGF9VHhObDZhTlBpZTFSa2VVeTRTak1vWm0yTFJmdlN4RE8=

Based on your comment, if you want to search for substrings you can use any:

  if any(sub in line for sub in ("cn: somestring", "sn: somestring", "description: somestring", "uid: somestring", "userpassword: somestring")):
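To put that condition in context, here is a minimal sketch that builds the substrings from a list of names and applies them line by line. The function name `matching_lines`, the `wanted` list and the `sample` lines are hypothetical, used only for illustration.

```python
def matching_lines(lines, wanted, keys=("uid", "cn")):
    """Return the lines that contain '<key>: <name>' for any wanted name."""
    # Build every "key: name" substring once, e.g. "uid: weblogic".
    targets = [f"{key}: {name}" for key in keys for name in wanted]
    return [line.rstrip() for line in lines
            if any(sub in line for sub in targets)]


# Hypothetical sample lines modelled on the data in the question.
sample = [
    "dn: uid=weblogic,ou=people,ou=myrealm,dc=portalDomain\n",
    "cn: weblogic\n",
    "sn: weblogic\n",
    "uid: weblogic\n",
    "cn: portaladmin\n",
]
print(matching_lines(sample, ["weblogic"]))
```

Because this is a substring test, "uid: weblogic" would also match a line such as "uid: weblogic2"; compare the whole stripped line instead if exact matches are needed.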

If the pattern is more complicated, you will probably need a regex, but without knowing exactly what you want to extract, it is not possible to suggest a viable one.
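If the goal is to keep whole blocks whose uid or cn appears in a list, one possible approach is to split the file on blank lines and test each block. This is only a sketch under stated assumptions: blocks are separated by blank lines, each attribute sits on its own line, and the names `extract_blocks`, `FIELDS` and `sample` are made up for the example.

```python
import re

FIELDS = ("cn:", "sn:", "description:", "uid:", "userpassword:")


def extract_blocks(text, wanted):
    """Yield the FIELDS lines of each block whose uid or cn is in wanted."""
    wanted = set(wanted)
    # Assumption: blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        # Collect the uid/cn values of this block for the membership test.
        ids = {ln.split(":", 1)[1].strip()
               for ln in lines if ln.startswith(("uid:", "cn:"))}
        if ids & wanted:
            yield [ln for ln in lines if ln.startswith(FIELDS)]


# Hypothetical, shortened sample modelled on the data in the question.
sample = """\
dn: uid=portaladmin,ou=people,ou=myrealm,dc=portalDomain
objectclass: person
cn: portaladmin
sn: portaladmin
uid: portaladmin

dn: uid=weblogic,ou=people,ou=myrealm,dc=portalDomain
objectclass: person
cn: weblogic
sn: weblogic
description: This user is the default administrator.
uid: weblogic
"""

for fields in extract_blocks(sample, ["weblogic"]):
    print("\n".join(fields))
```

For a real file, the text could be read with `open("in.txt").read()` and passed in the same way. Lines where two attributes have run together on one line would need to be split first, since the sketch takes everything after the first colon as the value.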

Answer 2

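extractedLines = []
prefixes = ("cn:", "sn:", "description:", "uid:", "userpassword:")
with open("file.txt", "r") as f:
    for line in f:
        # Test all substrings at once so that a line containing several
        # of them is appended only once.
        if any(item in line for item in prefixes):
            extractedLines.append(line)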
