Python Regular expression - Substitution

I have written the following Python code:

import re

url = "www.google.com";
line = "../../asyouwish.html"

num = re.sub(r'(\.\.\/)*', url, line)
print ("Final : ", num)

My intention is to replace ../ (repeated any number of times) with the given url value. However, I am not getting the correct output. My desired output is "www.google.com/asyouwish.html".

What I get is:

Final :  www.google.comawww.google.comswww.google.comywww.google.comowww.google.
comuwww.google.comwwww.google.comiwww.google.comswww.google.comhwww.google.com.w
ww.google.comhwww.google.comtwww.google.commwww.google.comlwww.google.com

Can anyone help me see where I went wrong? Thanks.

* means 0 or more occurrences; + means 1 or more. You want each match to contain at least one occurrence of ../ , so change the * to + :

import re

url = "www.google.com/"
line = "../../asyouwish.html"

num = re.sub(r'([.]{2}/)+', url, line)
print ("Final : ", num)

yields (under Python 2, where print shows the tuple)

('Final : ', 'www.google.com/asyouwish.html')

Since re.sub removes the whole run of one or more '../' , you need to add a forward slash after url . Above, I've added the forward slash to url itself. If url comes without the forward slash, you can instead add it with

num = re.sub(r'([.]{2}/)+', url+'/', line)
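If it is not known in advance whether url ends with a slash, one option is to normalize it before substituting. The following is only a sketch: the helper name replace_parent_refs is hypothetical and not part of the original answer, and passing a function as the replacement is an extra precaution so that any backslashes in url are used literally.

```python
import re


def replace_parent_refs(url, line):
    """Replace a run of one or more '../' in line with url plus one slash.

    Hypothetical helper for illustration; not from the original answer.
    """
    # Make sure the base ends with exactly one forward slash.
    base = url.rstrip("/") + "/"
    # A function replacement is inserted as-is, so backslashes in url
    # are not treated as escape sequences by re.sub.
    return re.sub(r'([.]{2}/)+', lambda match: base, line)


print(replace_parent_refs("www.google.com", "../../asyouwish.html"))
print(replace_parent_refs("www.google.com/", "../../asyouwish.html"))
```

Both calls should give the same result, with or without the trailing slash on url.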

When you match on 0 or more occurrences, r'([.]{2}/)*' , every position between the characters in line matches the pattern with an empty match, so you get a substitution at each interstice. Splitting the output from the question on the url makes this visible:

In [9]: x = 'www.google.comawww.google.comswww.google.comywww.google.comowww.google.comuwww.google.comwwww.google.comiwww.google.comswww.google.comhwww.google.com.www.google.comhwww.google.comtwww.google.commwww.google.comlwww.google.com'

In [13]: x.split('www.google.com')
Out[13]: ['', 'a', 's', 'y', 'o', 'u', 'w', 'i', 's', 'h', '.', 'h', 't', 'm', 'l', '']
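The empty matches can also be seen directly with re.findall. This is a small sketch rather than part of the original answer; it uses a non-capturing group (?:...) so that findall returns the whole match instead of the group, and the variable names are only illustrative.

```python
import re

line = "../../asyouwish.html"

# With *, the pattern also matches the empty string, so findall reports
# many zero-length matches in addition to the leading '../../'.
star_matches = re.findall(r'(?:[.]{2}/)*', line)

# With +, a match needs at least one '../', so only the leading run is found.
plus_matches = re.findall(r'(?:[.]{2}/)+', line)

empty_count = sum(1 for m in star_matches if m == "")

print("first * match:", repr(star_matches[0]))
print("number of empty * matches:", empty_count)
print("+ matches:", plus_matches)
```

Each of those empty matches is a place where re.sub can insert url, which is why the * version scatters the url through the rest of the string.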

Use something like:

url = "www.google.com"
line = "../../asyouwish.html"
link_part = line.split("/")

final_url = url + "/" + link_part[-1]
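Note that link_part[-1] keeps only the last path segment, so a line such as "../../docs/asyouwish.html" would lose the docs directory. A possible variant, sketched here under the assumption that only the leading ".." segments should be dropped, is shown below; the function name join_dropping_parent_refs is hypothetical.

```python
def join_dropping_parent_refs(url, line):
    """Join url and line after dropping leading '..' path segments.

    Hypothetical helper for illustration; not from the original answer.
    """
    parts = line.split("/")
    # Discard only the '..' segments at the start of the path.
    while parts and parts[0] == "..":
        parts.pop(0)
    # Normalize url so the result has a single slash at the join.
    return url.rstrip("/") + "/" + "/".join(parts)


print(join_dropping_parent_refs("www.google.com", "../../asyouwish.html"))
print(join_dropping_parent_refs("www.google.com", "../../docs/asyouwish.html"))
```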
