
Replace part of string if present in list (Python)

I want to replace part of a string with an empty string if that part is present in a list.

For example:

List:

foo = ['.com', '.net', '.co', '.in']

Convert these strings:

google.com
google.co.in
google.net
google.com/gmail/

to these strings:

google  
google  
google  
google/gmail/

So far I have found this solution: "replace item in a string if it matches an item in the list". Is there a more optimized way to do it?

Similar to George Shulkin's answer.

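import re
suffixes = ['.com', '.co', '.in', '.net']
# re.escape makes the "." literal; unescaped, it would match any character.
patterns = [re.compile(re.escape(suffix)) for suffix in suffixes]

def remove_suffixes(s: str) -> str:
    # Apply each pattern in turn; ".com" comes before ".co" so it is removed whole.
    for pattern in patterns:
        s = pattern.sub("", s)
    return s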

# urls = ["google.com", ...
clean_urls = map(remove_suffixes, urls)
# or clean_urls = [remove_suffixes(url) for url in urls]

You might want to use the list comprehension, because it can be faster than map in many cases. Note that in Python 3, map returns a lazy iterator rather than a list.

The code above also precompiles the regexes, which can help performance when they are used in a loop.

Or, if you decide to use functools.reduce:

from functools import reduce

def remove_suffixes(s: str) -> str:
    return reduce(lambda s, pattern: pattern.sub("", s), patterns, s) 

You need to split this task in two:

  1. Write code that replaces a matched substring with a new string.
  2. Apply this function to every item in the list.

The first can be done with a regular expression (see below). The second can be done with the map function.

Example of code to replace a substring (the dot is escaped so that it matches a literal "." rather than any character):

>>> import re
>>> re.sub(r"\.com", "",  "google.com/gmail/")
'google/gmail/'

Example of using the map function (in Python 3, wrap it in list() to see the results):

>>> list(map(len, ["one", "two", "three"]))
[3, 3, 5]

(It produces a new list holding the length of each element.)

You can combine those two to get what you want.
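Combining the two steps might look like the sketch below. The helper name strip_com and the sample list are made up for illustration, and only the ".com" suffix is handled here.

```python
import re

# Hypothetical helper: removes a literal ".com" wherever it occurs.
def strip_com(url: str) -> str:
    return re.sub(r"\.com", "", url)

# Sample input, assumed for illustration.
sample_urls = ["google.com", "google.com/gmail/", "google.net"]

# map applies the helper to every item; list() materialises the result.
stripped = list(map(strip_com, sample_urls))
print(stripped)  # ['google', 'google/gmail/', 'google.net']
```

Handling several suffixes means either repeating the substitution per suffix or building one pattern that covers them all.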

Using the suggestion of George Shuklin, this is the simplest code I could come up with.


import re

domains = ['.com', '.net', '.co', '.in']

urls = ["google.com","google.co.in","google.net","google.com/gmail/"]

for i in range(len(urls)):
    for domain in domains:
        # re.escape keeps the "." literal instead of matching any character
        urls[i] = re.sub(re.escape(domain), "", urls[i])

print(urls)

This outputs:

['google', 'google', 'google', 'google/gmail/']

You can use re.sub and str.join:

import re
foo = ['.com', '.net', '.co', '.in']
urls = ["google.com","google.co.in","google.net","google.com/gmail/"]
# Escape each suffix so "." is literal, then join them into one alternation.
final_result = [re.sub('|'.join(map(re.escape, foo)), '', i) for i in urls]

Output:

['google', 'google', 'google', 'google/gmail/']
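One thing to watch with a joined alternation is ordering: the regex engine takes the first alternative that matches, so if '.co' were listed before '.com', "google.com" would become "googlem". The sketch below is one possible way to guard against that, by sorting the suffixes longest first and compiling the pattern once. The names suffix_pattern and cleaned are illustrative, and the list is deliberately reordered to show the problem case.

```python
import re

# Deliberately lists ".co" before ".com" to show that ordering is handled.
foo = ['.co', '.com', '.net', '.in']
urls = ["google.com", "google.co.in", "google.net", "google.com/gmail/"]

# Longest suffixes first, so ".com" is tried before its prefix ".co".
ordered = sorted(foo, key=len, reverse=True)

# Escape each suffix (literal ".") and compile the alternation a single time.
suffix_pattern = re.compile("|".join(re.escape(s) for s in ordered))

cleaned = [suffix_pattern.sub("", u) for u in urls]
print(cleaned)  # ['google', 'google', 'google', 'google/gmail/']
```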

Another alternative is to use str.replace() and str.find().

foo = ['.com', '.net', '.co', '.in']
domains = ["google.com", "google.co.in", "google.net", "google.com/gmail/"]

def remove_extensions(domain, extensions):
    for ext in extensions:
        if domain.find(ext) != -1:
            domain = domain.replace(ext, "")
    return domain

list(map(lambda x: remove_extensions(x, foo), domains))

This code snippet outputs the expected result:

['google', 'google', 'google', 'google/gmail/']
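Because str.replace() removes the suffix wherever it appears, a path such as "files/report.net.pdf" would also be altered. If that matters, one option is to strip only the part before the first "/". This is a rough sketch, not part of the answer above: the function name remove_extensions_from_host is hypothetical, and it assumes the input has no scheme such as "https://".

```python
foo = ['.com', '.net', '.co', '.in']

# Hypothetical variant: only the text before the first "/" is modified.
# Assumes inputs carry no scheme prefix such as "https://".
def remove_extensions_from_host(url, extensions):
    host, sep, path = url.partition("/")
    for ext in extensions:
        # str.replace is a no-op when ext is absent, so no find() check is needed
        host = host.replace(ext, "")
    return host + sep + path

print(remove_extensions_from_host("google.com/files/report.net.pdf", foo))
# google/files/report.net.pdf
```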
