
Replace every odd comma with colon - Regex

I need to write a regex that converts every odd-numbered comma to a colon in Python.

For example,

"[2, 0.2520474110789976, 8, 0.25215388264234934, 3, 0.3560689678084889, 1, 0.3573715347893714, 4, 0.5626369616327825, 5, 0.793617535995843]"

gets converted to

"[2: 0.2520474110789976, 8: 0.25215388264234934, 3: 0.3560689678084889, 1: 0.3573715347893714, 4: 0.5626369616327825, 5: 0.793617535995843]"

I went through other questions on StackOverflow and found the question below. However, the JS version doesn't seem to work in Python.

Regex - replace all odd numbered occurrences of a comma

I did the following based on the link above:

pattern = "(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)(,)(.*?,|)(?=.*?(?:,|$))"
stringa = re.sub(pattern,": ",flat_list_string)

and got this output

"2:  8:  3:  1:  4:  5:  0.793617535995843"

instead of the one mentioned earlier.

I'm pretty new to regex, so I haven't tried much myself. I would appreciate any help. Thanks.

Update 1: Pasted my incorrect output.

You seem to just be using the regex incorrectly. First, you should use a raw string literal r"..." so that backslashes reach the regex engine as written and you don't have to escape them:

pattern = r"(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)(,)(.*?,|)(?=.*?(?:,|$))"

Next, you should change the replacement string to r":\2", which means a colon followed by group 2. The regex matches every odd comma (group 1) together with all the characters after it up to and including the next even comma (group 2). Replacing the match with just ": " therefore deletes those characters as well, which is why every value disappeared from your output. Putting group 2 back after the colon keeps them.

stringa = re.sub(pattern, r":\2",flat_list_string)
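Putting the two changes together, a minimal runnable sketch might look like the following. The pattern and replacement are the ones given above; the helper name odd_commas_to_colons and the shortened sample string are only illustrative assumptions.

```python
import re

# Pattern from the answer: group 1 is the odd comma, group 2 is the text
# after it up to and including the next (even) comma, if there is one.
pattern = r"(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)(,)(.*?,|)(?=.*?(?:,|$))"

def odd_commas_to_colons(text):
    """Hypothetical helper: turn each odd comma into a colon, keeping group 2."""
    return re.sub(pattern, r":\2", text)

# Shortened version of the string from the question.
flat_list_string = "[2, 0.2520474110789976, 8, 0.25215388264234934, 3, 0.3560689678084889]"
print(odd_commas_to_colons(flat_list_string))
# [2: 0.2520474110789976, 8: 0.25215388264234934, 3: 0.3560689678084889]
```

Only the comma in group 1 is dropped; everything captured in group 2 is written back unchanged.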

The JS regex also handles commas inside quotes, which the OP of the other post doesn't want to count, such as:

"hello, world", 1, "bye, world", 2
      ^                ^
these should not be counted as commas

If you do want to count these commas, then you can use this regex:

,([^,]+(?:,|$))

and replace with :\1.
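As a rough sketch of how this simpler pattern behaves in Python (the function name swap_odd_commas is made up for illustration), note that it counts every comma, including the ones inside quotes:

```python
import re

def swap_odd_commas(text):
    """Hypothetical helper: replace every odd comma, ignoring quoting entirely."""
    # Group 1 holds the text after the odd comma, up to and including
    # the next comma (or up to the end of the string).
    return re.sub(r",([^,]+(?:,|$))", r":\1", text)

print(swap_odd_commas("[2, 0.25, 8, 0.26]"))
# [2: 0.25, 8: 0.26]

print(swap_odd_commas('"hello, world", 1, "bye, world", 2'))
# "hello: world", 1: "bye, world": 2
```

The second call shows the trade-off: the commas inside the quoted strings take part in the odd/even count.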

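For this particular input, where every key is a one- or two-digit integer and every value has a longer fractional part, you can achieve the same result with this simpler regex, which replaces the comma after any standalone one- or two-digit number: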

re.sub(r'(\b\d{1,2}),',r'\g<1>:',search_string)
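A small sketch of this approach follows, with one caveat worth checking against your own data. The helper name colon_after_short_ints and the second sample string are assumptions made up for the demonstration. Because the pattern looks at digits rather than at comma positions, a value whose fractional part is only one or two digits long gets a colon too.

```python
import re

def colon_after_short_ints(text):
    """Hypothetical helper: put a colon after any 1-2 digit run that starts
    at a word boundary and is directly followed by a comma."""
    return re.sub(r"(\b\d{1,2}),", r"\g<1>:", text)

# Values with long fractional parts: only the integer keys match.
print(colon_after_short_ints("[2, 0.2520474110789976, 8, 0.25215388264234934]"))
# [2: 0.2520474110789976, 8: 0.25215388264234934]

# A short fractional part such as the 5 in 0.5 also matches, so an even comma is replaced.
print(colon_after_short_ints("[2, 0.5, 8, 0.25]"))
# [2: 0.5: 8: 0.25]
```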

