
Python regex - substitute until certain character

I am looking to replace spaces with commas, but only up to the first /, and tried the following:

import re

txt = "usera   28935 28876  0 Apr25 ?        00:07:20 /xxx/yyyy/foo/bar/zzzzz/Java/jdk-1.8.0_101/xxx/xxx -cp /xxx/yyyy/foo/bar/zzzzz"

rem = (re.sub(' +', ' ', txt))  # convert multiple spaces into single

print(re.sub(' ', ',', rem.lstrip()))

But the output has a comma in place of every space, including those after the first /:

usera,28935,28876,0,Apr25,?,00:07:20,/xxx/yyyy/foo/bar/zzzzz/Java/jdk-1.8.0_101/xxx/xxx,-cp,/xxx/yyyy/foo/bar/zzzzz

Expected output:

usera,28935,28876,0,Apr25,?,00:07:20,/xxx/yyyy/foo/bar/zzzzz/Java/jdk-1.8.0_101/xxx/xxx -cp /xxx/yyyy/foo/bar/zzzzz

i.e. commas should be applied only up to the first /.

I have tried lookahead and lookbehind but was unable to work this out. Could someone advise me on how to achieve this, please?

Whenever you have a problem like this, consider splitting the string before reaching for a regex:

# split the text once at the first /
a, b = txt.split("/", 1)

# do the replacement in the first half
a = re.sub(" +", ",", a)

# join 'em back up
result = "{}/{}".format(a,b)
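
The snippet above assumes that txt contains at least one /; otherwise the two-name unpacking raises a ValueError. A minimal self-contained sketch of the same idea, using str.partition so that a missing / is tolerated (the helper name commas_before_first_slash is made up for illustration):

```python
import re


def commas_before_first_slash(text):
    """Replace runs of spaces with a comma, but only before the first '/'."""
    # partition returns (head, "/", tail), or (text, "", "") if there is no "/"
    head, sep, tail = text.partition("/")
    # only the part before the first "/" is rewritten
    return re.sub(r" +", ",", head) + sep + tail


txt = "usera   28935 28876  0 Apr25 ?        00:07:20 /xxx/yyyy/foo/bar/zzzzz/Java/jdk-1.8.0_101/xxx/xxx -cp /xxx/yyyy/foo/bar/zzzzz"
print(commas_before_first_slash(txt))
```

If the input has no / at all, every run of spaces is replaced. Leading spaces would turn into a leading comma, so call lstrip() on the input first if that can happen.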

You can use a lookbehind, but it needs to be variable-length, so you'll need the third-party regex module:

>>> import regex
>>> txt = "usera   28935 28876  0 Apr25 ?        00:07:20 /xxx/yyyy/foo/bar/zzzzz/Java/jdk-1.8.0_101/xxx/xxx -cp /xxx/yyyy/foo/bar/zzzzz"
>>> regex.sub(r'(?<!/.*) +', ',', txt)
'usera,28935,28876,0,Apr25,?,00:07:20,/xxx/yyyy/foo/bar/zzzzz/Java/jdk-1.8.0_101/xxx/xxx -cp /xxx/yyyy/foo/bar/zzzzz'

# or you can use \G
>>> regex.sub(r'\G([^/ ]*+) +', r'\1,', txt)
'usera,28935,28876,0,Apr25,?,00:07:20,/xxx/yyyy/foo/bar/zzzzz/Java/jdk-1.8.0_101/xxx/xxx -cp /xxx/yyyy/foo/bar/zzzzz'

The first one replaces a run of spaces only if no / character appears earlier in the string.

The second one matches a sequence of characters other than space or /, followed by spaces, as many times as possible from the start of the string. Because \G anchors each match to the point where the previous one ended, matching stops as soon as the first / is reached.
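
If installing the third-party regex module is not an option, the same "only before the first /" restriction can be approximated with the standard re module. This is not one of the patterns shown above but a swapped-in technique: match everything before the first / once, and rewrite that match with a replacement function. A rough sketch (the helper name commas_before_first_slash_re is hypothetical):

```python
import re


def commas_before_first_slash_re(text):
    """Standard-library variant: rewrite only the prefix before the first '/'."""
    # ^[^/]* matches the (possibly empty) prefix that contains no "/";
    # the lambda collapses each run of spaces in that prefix into one comma.
    return re.sub(
        r"^[^/]*",
        lambda m: re.sub(r" +", ",", m.group()),
        text,
        count=1,
    )


txt = "usera   28935 28876  0 Apr25 ?        00:07:20 /xxx/yyyy/foo/bar/zzzzz/Java/jdk-1.8.0_101/xxx/xxx -cp /xxx/yyyy/foo/bar/zzzzz"
print(commas_before_first_slash_re(txt))
```

The nested re.sub keeps the pattern within what the standard re module supports, since re only allows fixed-width lookbehind and has no \G anchor.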
