How to replace elements in one list with items from another list?

I have some company legal forms that I need to translate:

ABC GMBH CO & KG
DEF LIMITED LIABILITY CO
XYZ AD
UVW LTEE

The idea is GMBH CO & KG = GMBH; LLC = AD = LTEE = LIMITED LIABILITY CO

I wrote the following code, but it doesn't appear to work. Any ideas why?

file = open("fake.txt","r").read()
col = file.split("\n")

abbr = ['LLC', 'GMBH']
full = [
('LIMITED LIABILITY COMPANY', 'LIMITED LIABILITY CO', 'LTEE', 'LIMITEE','AD', 'AKTZIONERNO DRUZHESTVO'), 
('GMBH CO & KG', 'MBH', 'GESELLSCHAFT MIT BESCHRANKTER HAFTUNG')
]

def trans(col):
    i=0
    while i<len(abbr):
        c=0
        while c<len(full[i]):
            for x in full[i][c]:
                if x in col:
                    col = col.replace(x,abbr[i])
            c+=1    
        i+=1
    return col

print trans(col)

You could create a dictionary in which every long form is a key and its abbreviation is the value. Then iterate over your input lines and, for each line, check whether any of the keys occurs in it.

This is what I mean:

>>> lines = ["ABC GMBH CO & KG",
... "DEF LIMITED LIABILITY CO",
... "XYZ AD",
... "UVW LTEE"]

>>> abbr_dict = {}
>>> abbr_dict['GMBH CO & KG'] = 'GMBH'
>>> abbr_dict['MBH'] = 'GMBH'
>>> abbr_dict['GESELLSCHAFT MIT BESCHRANKTER HAFTUNG'] = 'GMBH'
>>> abbr_dict['LIMITED LIABILITY COMPANY'] = 'LLC'
>>> abbr_dict['LIMITED LIABILITY CO'] = 'LLC'
>>> abbr_dict['LTEE'] = 'LLC'
>>> abbr_dict['LIMITEE'] = 'LLC'
>>> abbr_dict['AD'] = 'LLC'
>>> abbr_dict['AKTZIONERNO DRUZHESTVO'] = 'LLC'

>>> for line in lines:
...     for key in abbr_dict:
...         if key in line:
...             line = line.replace(key, abbr_dict[key])
...             print(line)
...             break # This is to prevent multiple replacements on the same line

This prints:

ABC GMBH
DEF LLC
XYZ LLC
UVW LLC

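Note that this is not a robust solution, because both the in test and replace match substrings rather than whole words. For an input line such as ABC GMBH AD & KG, the key MBH matches inside GMBH, so with the keys tried in the order they were added above, the line becomes ABC GGMBH AD & KG. If AD happened to be tried first, the result would be ABC GMBH LLC & KG. Neither may be what you need.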
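One way to avoid such partial matches is to require that a key is not glued to other word characters, for example with a regular expression. The following is only a minimal sketch, assuming Python 3; the names ABBR, build_pattern, PATTERN and abbreviate are illustrative and not part of the answer above. Unlike the loop above, it replaces every whole-word match on a line instead of stopping at the first one.

```python
import re

# Long form -> abbreviation, same pairs as in abbr_dict above.
ABBR = {
    "GMBH CO & KG": "GMBH",
    "MBH": "GMBH",
    "GESELLSCHAFT MIT BESCHRANKTER HAFTUNG": "GMBH",
    "LIMITED LIABILITY COMPANY": "LLC",
    "LIMITED LIABILITY CO": "LLC",
    "LTEE": "LLC",
    "LIMITEE": "LLC",
    "AD": "LLC",
    "AKTZIONERNO DRUZHESTVO": "LLC",
}


def build_pattern(mapping):
    # Try longer keys first so the most specific long form is preferred.
    keys = sorted(mapping, key=len, reverse=True)
    alternation = "|".join(re.escape(key) for key in keys)
    # (?<!\w) and (?!\w) reject matches that sit inside a longer word,
    # e.g. "MBH" inside "GMBH" or "AD" inside "NOMAD".
    return re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)")


PATTERN = build_pattern(ABBR)


def abbreviate(line):
    # Replace each whole-word long form with its abbreviation.
    return PATTERN.sub(lambda match: ABBR[match.group(0)], line)


if __name__ == "__main__":
    for line in ["ABC GMBH CO & KG", "DEF LIMITED LIABILITY CO",
                 "XYZ AD", "UVW LTEE", "ABC GMBH AD & KG"]:
        print(abbreviate(line))
```

With this sketch, the MBH inside GMBH is left alone, while a standalone AD is still replaced.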

You have two problems in your code:

for x in full[i][c]:

full[i][c] is a single string, so this loop iterates over each character of that string, not over each element of the tuple full[i].

if x in col:

Once the first problem is fixed, this test is still wrong: col is a list of lines, so x in col checks whether x is equal to an entire line, not whether x occurs as a substring of a line.
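A corrected version of the function could look roughly like the sketch below. It assumes Python 3 and works on a list of lines instead of reading fake.txt; the helper name trans_line is made up for illustration. It also avoids calling replace on col itself, which would fail because a list has no replace method.

```python
ABBR = ['LLC', 'GMBH']
FULL = [
    ('LIMITED LIABILITY COMPANY', 'LIMITED LIABILITY CO', 'LTEE', 'LIMITEE',
     'AD', 'AKTZIONERNO DRUZHESTVO'),
    ('GMBH CO & KG', 'MBH', 'GESELLSCHAFT MIT BESCHRANKTER HAFTUNG'),
]


def trans_line(line):
    # Pair each abbreviation with its tuple of long forms.
    for short, long_forms in zip(ABBR, FULL):
        # Iterate over the strings in the tuple, not over their characters.
        for long_form in long_forms:
            # Substring test against one line, not membership in the list.
            if long_form in line:
                # Stop after the first replacement on this line.
                return line.replace(long_form, short)
    return line


def trans(lines):
    # Apply the replacement to every line and return a new list.
    return [trans_line(line) for line in lines]


if __name__ == "__main__":
    col = ["ABC GMBH CO & KG", "DEF LIMITED LIABILITY CO", "XYZ AD", "UVW LTEE"]
    print(trans(col))
```

This still matches substrings, so a short form such as AD or MBH can match inside a longer word.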
