Searching for a particular string in Python

I have a file that contains data in the following format:

Name  :  Neha xxxxx
Title  :  ENGINEER.xxxxx xxxxxx
Employee number  :  27xxx
Status : Active
User ID :  nehxxx
Manager ID  :  xxxx
Manager : Krisxxxxxxxx

This data is to be inserted sequentially into a database. For that purpose, I am first building lists with the following code:

filename = "LDAPFile.txt"
lines = open(filename).read().splitlines()

#print lines
for item in lines:
    if('Name') in item:
        Name = item.split(':')[1]
        #print Name[1]
    if('Title') in item:
        Title = item.split(":")[1]
        #print Title[1]
    if('Employee number') in item:
        ENO = item.split(":")[1]
        #print ENO
    if('Status') in item:
        Status = item.split(":")[1]
        #print Status
    if('User ID') in item:
        UID = item.split(":")[1]
        #print UID
    if('Manager ID') in item:
        MID = item.split(":")[1]
        #print MID
        #print len(MID)
    if('Manager') in item:
        MANAGER = item.split(":")
        print MANAGER
        #print len(MANAGER)

However, if('Manager') in item: matches both the Manager ID line and the Manager line. How can I search specifically for Manager?

The minimal change you can make is this:

if item.startswith("Manager :"):

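This is efficient, because only the start of the string is checked rather than the whole string, and it avoids matching the same text elsewhere in the line. Note that it relies on exactly one space between Manager and the colon, as in your sample.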
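To make the difference concrete, here is a small sketch comparing the substring test from the question with the prefix test. The two sample lines are copied from the format in the question; the names substring_hits and prefix_hits are only for illustration.

```python
# Two lines in the format shown in the question (values are placeholders).
lines = [
    "Manager ID  :  xxxx",
    "Manager : Krisxxxxxxxx",
]

# Substring test from the question: "Manager" occurs in both lines.
substring_hits = [item for item in lines if "Manager" in item]

# Prefix test: only the line that begins with exactly "Manager :" qualifies.
prefix_hits = [item for item in lines if item.startswith("Manager :")]

print(substring_hits)
print(prefix_hits)
```

The substring test keeps both lines, while the prefix test keeps only the Manager line.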

However, you can improve the whole code as follows:

data = {}
for item in lines:
    try:
        # split on the first colon only, so a value containing ":" stays intact
        key, value = item.split(":", 1)
    except ValueError:
        pass  # ignore line - no colon, so not in the expected format
    else:
        data[key.strip()] = value.strip()

You can now access the fields through the data dictionary:

data["Manager"] ...
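As a rough end-to-end sketch, this is how the dictionary approach behaves on the sample record from the question. The helper name parse_record is hypothetical, and it uses str.partition rather than split, which is a variation on the code above rather than the same thing.

```python
# Sample record copied from the question (values are placeholders).
sample = """Name  :  Neha xxxxx
Title  :  ENGINEER.xxxxx xxxxxx
Employee number  :  27xxx
Status : Active
User ID :  nehxxx
Manager ID  :  xxxx
Manager : Krisxxxxxxxx"""


def parse_record(lines):
    """Hypothetical helper: build a dict of field name -> value."""
    data = {}
    for item in lines:
        # partition splits at the first colon; sep is empty if there is none
        key, sep, value = item.partition(":")
        if not sep:
            continue  # not a "key : value" line, skip it
        data[key.strip()] = value.strip()
    return data


record = parse_record(sample.splitlines())
print(record["Manager"])     # looked up by exact field name
print(record["Manager ID"])  # a separate key, so no clash with "Manager"
```

Because each field name becomes its own exact key, "Manager" and "Manager ID" can no longer be confused.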

Use a regular expression from Python's re module (import re first). The example below matches Manager only when it is not followed by whitespace and "ID":

if re.match(r"Manager(?!\s+ID)", item):

Remember that this pattern is tailored to your scenario: it only rules out the Manager ID field.
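Here is a short sketch of the negative lookahead in action. The first two lines come from the question; the third line, with two spaces before ID, is a made-up case added only to show that \s+ covers any amount of whitespace.

```python
import re

# Negative lookahead: "Manager" at the start of the line,
# not followed by whitespace and then "ID".
pattern = re.compile(r"Manager(?!\s+ID)")

lines = [
    "Manager ID  :  xxxx",      # from the question
    "Manager : Krisxxxxxxxx",   # from the question
    "Manager  ID : yyyy",       # hypothetical line with extra spacing
]

# re.match anchors at the beginning of each string.
matches = [item for item in lines if pattern.match(item)]
print(matches)
```

Only the plain Manager line survives the filter; both ID variants are rejected by the lookahead.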

Why not split each line first and compare the field name exactly?

for item in lines:
    parts = item.split(':')
    if parts[0].strip() == "Manager":
        pass  # process the item

I think it would be easier to use regular expressions. Here is what I would do:

import re

# read the whole file into a list of lines
with open(filename, "r") as inf:
    read = inf.readlines()

for l in read:
    # "Manager ID" is tested first, so the plain "Manager" branch
    # only sees lines that are not Manager ID lines
    mat1 = re.search(r'Manager ID', l)
    mat2 = re.search(r'Manager', l)
    if mat1:
        MID = l.split(":")[1]
    elif mat2:
        Manager = l.split(":")
