
Replace a string located between

Here is my problem: in a text variable that contains commas, I am trying to delete only the commas located between two strings (in fact [ and ]). For example, using the following string:

input =  "The sun shines, that's fine [not, for, everyone] and if it rains, it Will Be better."
output = "The sun shines, that's fine [not for everyone] and if it rains, it Will Be better."

I know how to use .replace on the whole variable, but I cannot apply it to only a part of it. There are some related topics on this site, but I did not manage to adapt them to my own question.

import re
Variable = "The sun shines, that's fine [not, for, everyone] and if it rains, it Will Be better."
# Raw string, so that \[ and \] reach the regex engine unchanged
Variable1 = re.sub(r"\[[^]]*\]", lambda x: x.group(0).replace(',', ''), Variable)

First you need to find the parts of the string that need to be rewritten (you do this with re.sub). Then you rewrite those parts.

The call var1 = re.sub("re", fun, var) means: find all substrings in the variable var that match the pattern "re"; pass the match object for each of them to the function fun; substitute whatever fun returns; and save the resulting string to the variable var1.

The regular expression \[[^]]*\] means: find substrings that start with [ (\[ in the regex), contain anything except ] ([^]]* in the regex) and end with ] (\] in the regex).

For every occurrence found, run a function that converts this occurrence into something new. The function is:

lambda x: x.group(0).replace(',', '')

That means: take the string that was found (x.group(0)), replace ',' with '' (in other words, remove the commas) and return the result.
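Putting the explanation above together, here is a minimal sketch of the same approach with a named function instead of the lambda. The names drop_commas and remove_commas_in_brackets are mine, chosen only for illustration, and the sketch assumes the brackets are not nested.

```python
import re

def drop_commas(match):
    # match.group(0) is the whole bracketed substring, e.g. "[not, for, everyone]"
    return match.group(0).replace(",", "")

def remove_commas_in_brackets(text):
    # \[      a literal opening bracket
    # [^]]*   any run of characters other than "]"
    # \]      a literal closing bracket
    return re.sub(r"\[[^]]*\]", drop_commas, text)

text = "The sun shines, that's fine [not, for, everyone] and if it rains, it Will Be better."
print(remove_commas_in_brackets(text))
# The sun shines, that's fine [not for everyone] and if it rains, it Will Be better.
```

A named function does the same job as the lambda, but it is easier to extend if you later want to do more than remove commas.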

You can use an expression like this to match the commas directly (if the brackets are balanced):

,(?=[^][]*\])

Use it something like this (text is the input string):

re.sub(r",(?=[^][]*\])", "", text)
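To show why the "balanced brackets" condition matters, here is a small sketch of the lookahead approach on two made-up inputs. The names COMMA_BEFORE_CLOSE and remove_commas_lookahead are hypothetical. The lookahead only checks that a ] follows with no other bracket in between, so it cannot tell whether a [ was ever opened.

```python
import re

# A comma that is followed, with no "[" or "]" in between, by a closing "]".
COMMA_BEFORE_CLOSE = re.compile(r",(?=[^][]*\])")

def remove_commas_lookahead(text):
    return COMMA_BEFORE_CLOSE.sub("", text)

balanced = "x, y [a, b] z, w [c, d] end, ok"
print(remove_commas_lookahead(balanced))
# x, y [a b] z, w [c d] end, ok

# With a stray "]" the first comma is removed too, although no "[" was opened.
unbalanced = "a, b] c, d"
print(remove_commas_lookahead(unbalanced))
# a b] c, d
```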

Here is a non-regex method. You can replace your [ and ] delimiters with, say, [/ and /], and then split on the / delimiter (this assumes the text contains no / of its own). Every odd-indexed string in the split list then needs to be processed for comma removal, which can be done while rebuilding the string in a comprehension:

>>> Variable = "The sun shines, that's fine [not, for, everyone] and if it rains, it Will Be better."
>>> chunks = Variable.replace('[','[/').replace(']','/]').split('/')
>>> ''.join(sen.replace(',','') if i%2 else sen for i, sen in enumerate(chunks))
"The sun shines, that's fine [not for everyone] and if it rains, it Will Be better."
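One caveat with the split trick above is that a / already present in the text shifts the odd/even positions. A possible variation, sketched here under my own assumptions, uses a marker character that is unlikely to occur in the text (a NUL character by default) and refuses to run if it does occur. The function name remove_commas_by_split and the marker parameter are hypothetical, and the brackets are still assumed to be balanced and not nested.

```python
def remove_commas_by_split(text, marker="\x00"):
    # The marker must not already occur in the text, otherwise the
    # odd/even bookkeeping below would be thrown off.
    if marker in text:
        raise ValueError("marker already occurs in text")
    chunks = text.replace("[", "[" + marker).replace("]", marker + "]").split(marker)
    # Odd-indexed chunks are the parts that were between "[" and "]".
    return "".join(
        chunk.replace(",", "") if i % 2 else chunk
        for i, chunk in enumerate(chunks)
    )

print(remove_commas_by_split("a/b, c [d, e] f, g"))
# a/b, c [d e] f, g
```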

If you don't fancy learning regular expressions (see the other answers on this page), you can use the partition method of strings.

sentence = "the quick, brown [fox, jumped , over] the lazy dog"
left, bracket, rest = sentence.partition("[")
block, bracket, right = rest.partition("]")

"block" is now the part of the string in between the brackets, "left" is what was to the left of the opening bracket and "right" is what was to the right of the closing bracket.

You can then rebuild the full sentence with:

new_sentence = left + "[" + block.replace(",","") + "]" + right
# Note the double space: the comma in "jumped , over" had a space on each side
print(new_sentence)  # the quick, brown [fox jumped  over] the lazy dog

If you have more than one block, you can put all of this in a loop, applying the partition method to "right" at every step.
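The loop described above could look roughly like this. This is only a sketch: the function name strip_commas_in_brackets is made up, and leaving an unmatched "[" untouched is my own choice rather than something stated in the answer.

```python
def strip_commas_in_brackets(sentence):
    pieces = []
    rest = sentence
    while True:
        # Everything before the next "[" is kept unchanged.
        left, opening, rest = rest.partition("[")
        pieces.append(left)
        if not opening:
            break  # no more opening brackets
        block, closing, rest = rest.partition("]")
        if not closing:
            # Unmatched "[": keep the remaining text as it is.
            pieces.append(opening + block)
            break
        pieces.append(opening + block.replace(",", "") + closing)
    return "".join(pieces)

print(strip_commas_in_brackets("a, b [c, d] e, f [g, h] i, j"))
# a, b [c d] e, f [g h] i, j
```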

Or you could learn regular expressions! It will be worth it in the long run.
