Split a string using a list of values at the same time

I have a string and a list:

src = 'ways to learn are read and execute.'
temp = ['ways to','are','and']

What I want is to split the string on all the values of the list temp at the same time and produce:

['learn','read','execute']

I tried a for loop:

for x in temp:
    src.split(x)

This is what it produced:

['', ' learn are read and execute.']
['ways to learn ', ' read and execute.']
['ways to learn are read ', ' execute.']

Each pass splits the original string on only one value. What I want is to take all the values in the list first, then use them together to split the string.

Does anyone have a solution?

re.split is the conventional solution for splitting on multiple separators:

import re

src = 'ways to learn are read and execute.'
temp = ['ways to','are','and']

pattern = "|".join(re.escape(item) for item in temp)
result = re.split(pattern, src)
print(result)

Result:

['', ' learn ', ' read ', ' execute.']

You can also filter out the empty items and strip the spaces and punctuation with a simple list comprehension:

result = [item.strip(" .") for item in result if item]
print(result)

Result:

['learn', 'read', 'execute']
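
One thing to watch with the joined pattern: when one separator is a prefix of another (say 'and' and 'and then'), the regex engine takes the first alternative that matches, so the order of the list matters. Below is a minimal sketch of a reusable helper, assuming you want longer separators tried first and empty pieces dropped. The name split_on_all and its strip_chars parameter are made up for illustration, not part of any library.

```python
import re

def split_on_all(text, separators, strip_chars=" ."):
    # Hypothetical helper, not part of the standard library.
    # Try longer separators first so 'and then' wins over 'and'.
    ordered = sorted(separators, key=len, reverse=True)
    pattern = "|".join(re.escape(sep) for sep in ordered)
    # Strip each piece, then drop the ones that end up empty.
    pieces = (piece.strip(strip_chars) for piece in re.split(pattern, text))
    return [piece for piece in pieces if piece]

src = 'ways to learn are read and execute.'
print(split_on_all(src, ['ways to', 'are', 'and']))
# ['learn', 'read', 'execute']
print(split_on_all('read and then execute', ['and', 'and then']))
# ['read', 'execute']
```

Stripping before filtering also removes pieces that contain nothing but spaces or dots, which the plain `if item` check would keep.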

Here is a pure-Python method that does not rely on regular expressions. It is more verbose and more complex:

result = []
for i, part in enumerate(temp):
    # Text that follows the current separator
    rest = src.split(part)[1]
    if i + 1 < len(temp):
        # Cut it off where the next separator starts
        rest = rest.split(temp[i + 1])[0]
    result.append(rest.strip())
print(result)

You can remove the .strip() call if you want to keep the leading and trailing whitespace in the list entries.

Loop solution. You can add extra steps such as strip if you need them.

src = 'ways to learn are read and execute.'
temp = ['ways to','are','and']
copy_src = src
result = []
for x in temp:
    # Split at the first occurrence only, so unpacking always gets two parts
    left, right = copy_src.split(x, 1)
    if left:
        result.append(left) #or left.strip()
    # Keep working on whatever follows the separator
    copy_src = right
result.append(copy_src) #or copy_src.strip()
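
The same left-to-right idea can also be written with str.partition, which splits at the first occurrence only and does not raise when a separator is absent. This is a rough sketch, assuming the separators occur in the text in the same order as in the list; the function name split_in_order is made up here.

```python
def split_in_order(text, separators):
    # Hypothetical helper for illustration only.
    result = []
    rest = text
    for sep in separators:
        left, found, right = rest.partition(sep)
        if not found:
            # Separator is missing: leave the remaining text untouched.
            continue
        if left.strip():
            result.append(left.strip())
        rest = right
    # Whatever follows the last separator is the final piece.
    if rest.strip():
        result.append(rest.strip())
    return result

src = 'ways to learn are read and execute.'
print(split_in_order(src, ['ways to', 'are', 'and']))
# ['learn', 'read', 'execute.']
```

Note that the trailing period stays on the last item; pass a character set to strip, such as strip(" ."), if you want it removed.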

Just keep it simple:

src = 'ways to learn are read and execute.'
temp = ['ways','to','are','and']
res = ''
for w1 in src.split():
    # Keep each word that is not a separator and has not been added yet
    if w1 not in temp:
        if w1 not in res.split():
            res = res + w1 + " "
print(res)
