How to replace a full string instead of just a substring in Python 3

I have a bunch of input files and need to replace a few strings in them. First I created a dictionary of key-value pairs using regex. The dictionary contains each key (the string to be replaced) and its value (the replacement).

Example line in input file: Details of first student are FullName ="ABC XYZ KLM" FirstName ="ABC" ID = "123"

My dictionary would be:

student = {
    'ABC':'Student Firstname',
    'ABC XYZ KLM':'Student Fullname',
    '123':'Student ID'
    }

I am using str.replace() to do the replacement like this:

for line in inputfile1:
    for src, dst in student.items():
        line = line.replace(src, dst)

My output comes out as: Details of first student are FullName ="Student Firstname XYZ KLM" FirstName ="Student Firstname" ID = "Student ID"

What I am looking for is: Details of first student are FullName ="Student Fullname" FirstName ="Student Firstname" ID = "Student ID"

Can you please help me figure this out?

This happens because the loop applies the replacements in dictionary order, so 'ABC' is replaced first. After that, 'ABC XYZ KLM' no longer occurs in the line and is never matched. You need to make sure that the longest pattern is replaced first. To do that, you can follow one of these options:

Option 1:

Use an OrderedDict instead and put the longest strings to be replaced before the shorter ones (on Python 3.7+ a plain dict also preserves insertion order):

In [3]: from collections import OrderedDict

In [6]: student = OrderedDict([('ABC XYZ KLM', 'Student Fullname'),  ('ABC', 'Student Firstname'),('123', 'Student ID')])

In [7]: list(student.items())
Out[7]: 
[('ABC XYZ KLM', 'Student Fullname'),
 ('ABC', 'Student Firstname'),
 ('123', 'Student ID')]

In [8]: line = 'FullName ="ABC XYZ KLM" FirstName ="ABC" ID = "123"' 

In [9]: for src, dst in student.items():
   ...:     line = line.replace(src, dst)
   ...:

In [10]: line
Out[10]: 'FullName ="Student Fullname" FirstName ="Student Firstname" ID = "Student ID"'

The overall code looks like this:

from collections import OrderedDict

# Longest pattern first, so 'ABC XYZ KLM' is replaced before 'ABC'.
student = OrderedDict([('ABC XYZ KLM', 'Student Fullname'),
                       ('ABC', 'Student Firstname'),
                       ('123', 'Student ID')])
line = 'FullName ="ABC XYZ KLM" FirstName ="ABC" ID = "123"'
for src, dst in student.items():
    line = line.replace(src, dst)
print(line)
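As a quick check of why the ordering matters, the sketch below applies the same replacements to the sample line in both orders. `apply_in_order` is a hypothetical helper name introduced only for this illustration; it is not part of the original answer.

```python
def apply_in_order(line, pairs):
    """Apply each (src, dst) replacement to line, in the order given."""
    for src, dst in pairs:
        line = line.replace(src, dst)
    return line


line = 'FullName ="ABC XYZ KLM" FirstName ="ABC" ID = "123"'

# Same order as the dictionary in the question: 'ABC' comes first.
shortest_first = [('ABC', 'Student Firstname'),
                  ('ABC XYZ KLM', 'Student Fullname'),
                  ('123', 'Student ID')]

# Same pairs, reordered so the longest pattern is tried first.
longest_first = sorted(shortest_first, key=lambda pair: len(pair[0]), reverse=True)

bad = apply_in_order(line, shortest_first)
good = apply_in_order(line, longest_first)

print(bad)   # 'ABC' was consumed before 'ABC XYZ KLM' could match
print(good)  # the full name is replaced as a whole
```

With 'ABC' first, the full name is partially rewritten and 'ABC XYZ KLM' can no longer match. With the longest pattern first, each value is replaced as a whole.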

Option 2:

As suggested by @AlexHal in the comments below, you can also simply use a list of tuples and sort it by pattern length, longest first, before replacing. The code will look like this:

In [2]: student = [('ABC', 'Student Firstname'),('123', 'Student ID'), ('ABC XYZ KLM', 'Student Fullname')]

In [3]: sorted(student, key=lambda x: len(x[0]), reverse=True)
Out[3]: 
[('ABC XYZ KLM', 'Student Fullname'),
 ('ABC', 'Student Firstname'),
 ('123', 'Student ID')]

In [9]: line = ' "Details of first student are FirstName ="ABC" FullName ="ABC XYZ KLM" ID = "123"'

In [10]: for src, dst in sorted(student, key=lambda x: len(x[0]), reverse=True):
    ...:     line = line.replace(src, dst)
    ...:     

In [11]: line
Out[11]: ' "Details of first student are FirstName ="Student Firstname" FullName ="Student Fullname" ID = "Student ID"'

Overall code:

student = [('ABC', 'Student Firstname'),
           ('123', 'Student ID'), 
           ('ABC XYZ KLM', 'Student Fullname')]

line = ' "Details of first student are FirstName ="ABC" FullName ="ABC XYZ KLM" ID = "123"'    
for src, dst in sorted(student, key=lambda x: len(x[0]), reverse=True):
    line = line.replace(src, dst)
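Both options above still run one replace() call per pattern, so text inserted by an earlier replacement could in principle be matched again by a later pattern. A possible alternative, not taken from the original answer, is a single-pass substitution with re.sub. The sketch below assumes the `student` dictionary from the question; `replace_all` is a hypothetical helper name.

```python
import re

student = {
    'ABC': 'Student Firstname',
    'ABC XYZ KLM': 'Student Fullname',
    '123': 'Student ID',
}

# Sort keys longest first: regex alternation tries alternatives left to
# right, so 'ABC XYZ KLM' must come before 'ABC' to win at the same position.
keys = sorted(student, key=len, reverse=True)

# re.escape keeps any special characters in the keys literal.
pattern = re.compile('|'.join(re.escape(key) for key in keys))


def replace_all(line):
    """Replace every key found in line with its value, in one pass."""
    # The callback looks up each matched key; replaced text is not rescanned.
    return pattern.sub(lambda match: student[match.group(0)], line)


line = 'Details of first student are FullName ="ABC XYZ KLM" FirstName ="ABC" ID = "123"'
result = replace_all(line)
print(result)
```

Because the pattern is compiled once, the same `replace_all` can be applied to every line of every input file.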
