How to replace elements in one list with items from another list?
I have some company legal forms that I need to translate:
ABC GMBH CO & KG
DEF LIMITED LIABILITY CO
XYZ AD
UVW LTEE
The idea is GMBH CO & KG = GMBH; LLC = AD = LTEE = LIMITED LIABILITY CO
I wrote the following code, but it doesn't appear to work. Any ideas why?
    file = open("fake.txt","r").read()
    col = file.split("\n")

    abbr = ['LLC', 'GMBH']
    full = [
        ('LIMITED LIABILITY COMPANY', 'LIMITED LIABILITY CO', 'LTEE', 'LIMITEE','AD', 'AKTZIONERNO DRUZHESTVO'),
        ('GMBH CO & KG', 'MBH', 'GESELLSCHAFT MIT BESCHRANKTER HAFTUNG')
    ]

    def trans(col):
        i=0
        while i<len(abbr):
            c=0
            while c<len(full[i]):
                for x in full[i][c]:
                    if x in col:
                        col = col.replace(x,abbr[i])
                c+=1
            i+=1
        return col

    print trans(col)
You could create a dictionary whose keys are all the strings that map to the same abbreviation and whose values are that abbreviation. Then you would iterate over your input lines and look for those strings in each line.
This is what I mean:
    >>> lines = ["ABC GMBH CO & KG",
    ...          "DEF LIMITED LIABILITY CO",
    ...          "XYZ AD",
    ...          "UVW LTEE"]
    >>> abbr_dict = {}
    >>> abbr_dict['GMBH CO & KG'] = 'GMBH'
    >>> abbr_dict['MBH'] = 'GMBH'
    >>> abbr_dict['GESELLSCHAFT MIT BESCHRANKTER HAFTUNG'] = 'GMBH'
    >>> abbr_dict['LIMITED LIABILITY COMPANY'] = 'LLC'
    >>> abbr_dict['LIMITED LIABILITY CO'] = 'LLC'
    >>> abbr_dict['LTEE'] = 'LLC'
    >>> abbr_dict['LIMITEE'] = 'LLC'
    >>> abbr_dict['AD'] = 'LLC'
    >>> abbr_dict['AKTZIONERNO DRUZHESTVO'] = 'LLC'
    >>> for line in lines:
    ...     for key in abbr_dict:
    ...         if key in line:
    ...             line = line.replace(key, abbr_dict[key])
    ...             print(line)
    ...             break  # This is to prevent multiple replacements on the same line
This prints:
ABC GMBH
DEF LLC
XYZ LLC
UVW LLC
Note that this might not be an optimal solution if an input line contains a string like "ABC GMBH AD & KG". Because the keys are matched as plain substrings, MBH matches inside GMBH and AD matches on its own, so depending on which key is tried first the result is "ABC GGMBH AD & KG" or "ABC GMBH LLC & KG", which might not be what you need.
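One way to reduce the substring problem described above is to match whole tokens only. The following is a minimal sketch, not part of the original answer: it assumes Python 3, reuses the same mapping, and introduces a hypothetical helper named abbreviate. It tries longer keys first and refuses matches that are glued to other word characters, so MBH no longer matches inside GMBH. A standalone AD is still replaced, so whether that is acceptable depends on your data.

```python
import re

# Same mapping as in the answer above: long form -> abbreviation.
abbr_dict = {
    'GMBH CO & KG': 'GMBH',
    'MBH': 'GMBH',
    'GESELLSCHAFT MIT BESCHRANKTER HAFTUNG': 'GMBH',
    'LIMITED LIABILITY COMPANY': 'LLC',
    'LIMITED LIABILITY CO': 'LLC',
    'LTEE': 'LLC',
    'LIMITEE': 'LLC',
    'AD': 'LLC',
    'AKTZIONERNO DRUZHESTVO': 'LLC',
}

# Longer keys come first in the alternation; the lookarounds reject a
# match that has a word character directly before or after it.
keys = sorted(abbr_dict, key=len, reverse=True)
pattern = re.compile(
    r'(?<!\w)(?:' + '|'.join(re.escape(k) for k in keys) + r')(?!\w)'
)

def abbreviate(line):
    """Hypothetical helper: replace every whole-token long form in line."""
    return pattern.sub(lambda m: abbr_dict[m.group(0)], line)

for line in ["ABC GMBH CO & KG", "DEF LIMITED LIABILITY CO", "ABC GMBH AD & KG"]:
    print(abbreviate(line))
```

With this pattern a name such as "ADIDAS GMBH" is left unchanged, because neither AD nor MBH appears there as a separate token.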
You have two problems in your code:
for x in full[i][c]:
This for loop iterates over each character of the string full[i][c], not over each element of full[i].
if x in col:
Once the first problem is fixed, this test compares x against each whole line in the list col, so it only succeeds when a line is exactly equal to x. It does not check whether x is a substring of a line.
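The two fixes can be sketched as follows. This is only an illustration under some assumptions: Python 3, the input given as a list of lines instead of being read from fake.txt, and a hypothetical helper trans_line added so that each line is handled separately. It stops at the first long form found in a line, because otherwise MBH would match again inside the GMBH that was just inserted.

```python
abbr = ['LLC', 'GMBH']
full = [
    ('LIMITED LIABILITY COMPANY', 'LIMITED LIABILITY CO', 'LTEE',
     'LIMITEE', 'AD', 'AKTZIONERNO DRUZHESTVO'),
    ('GMBH CO & KG', 'MBH', 'GESELLSCHAFT MIT BESCHRANKTER HAFTUNG'),
]

def trans_line(line):
    """Hypothetical helper: abbreviate the first long form found in one line."""
    for short, long_forms in zip(abbr, full):
        for long_form in long_forms:   # whole strings, not single characters
            if long_form in line:      # substring test on one line, not on the list
                return line.replace(long_form, short)
    return line

def trans(col):
    """Apply trans_line to every line and return a new list."""
    return [trans_line(line) for line in col]

col = ["ABC GMBH CO & KG", "DEF LIMITED LIABILITY CO", "XYZ AD", "UVW LTEE"]
print(trans(col))
```

The test is still a plain substring match, so a short entry such as AD would also match inside a longer word in a company name.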