
Regex: preserve order in square brackets

The context of the question is as follows. I wish to grab the file that wget is attempting to download, but need to ignore flags which may or may not appear, e.g. wget -qO http://google.com/myfile.sh. The expected output would be: http://google.com/myfile.sh. For this example the regex:

r'wget\s-\w+\s([^\s]*)'

seems to do the trick. However, it will not work when there is no flag.

To handle the (possibly absent) flag I attempted r'wget\s[-\w+\s]?([^\s]*)', which I was hoping would say "you can expect 0 or 1 instance of a dash followed by some characters". However, it seems to treat the order of -\w+\s as optional; at least that is my explanation of the following results:

>>> import re
>>> re.search(r'wget\s-\w+\s([^\s]*)', 'wget -qO http://google.com/myfile.sh').group(1)
'http://google.com/myfile.sh'
>>> re.search(r'wget\s[-\w+\s]?([^\s]*)', 'wget -qO http://google.com/myfile.sh').group(1)
'qO'
>>> re.search(r'wget\s[-\w+\s]*([^\s]*)', 'wget -qO http://google.com/myfile.sh').group(1)
'://google.com/myfile.sh'

Can someone explain the last two results, and show how to make sure that it matches 0 or more flags?

Try the following:

 wget\s*(?:-\w+)?\s*(.*)

See https://regex101.com/r/aDWM3X/1 for reference.
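A quick way to check the suggested pattern, assuming Python's re module as in the question. The second input (the same command without the flag) is added here for illustration and is not from the original post.

```python
import re

# Pattern from the answer: an optional single flag, then the rest of the line.
PATTERN = re.compile(r'wget\s*(?:-\w+)?\s*(.*)')

# The question's example, with and without the flag.
with_flag = PATTERN.search('wget -qO http://google.com/myfile.sh').group(1)
no_flag = PATTERN.search('wget http://google.com/myfile.sh').group(1)

print(with_flag)  # http://google.com/myfile.sh
print(no_flag)    # http://google.com/myfile.sh
```

Note that (?:-\w+)? accepts at most one flag; with two flags, the second one would end up inside the captured group.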

Your example does not work because square brackets define a character class, which means "any one of the following characters or ranges", so the order inside them is irrelevant. The + after \w inside the brackets does not mean "one or more \w"; it is a literal +, so the class matches any single character that is a -, a \w character, a + or whitespace. That is why the ? version consumes only the dash and captures qO, while the * version consumes everything up to the colon. If you use a group instead, you can make the whole group optional with ? (zero or one), or repeat it with * (zero or more).
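The difference between a character class and a group can be sketched as follows. This is an illustration, not part of the original answer: the helper name extract_url and the pattern (?:-\w+\s+)* are assumptions, and the sketch assumes every flag is a single dash followed by word characters and takes no separate argument.

```python
import re

# Inside [...] the order is irrelevant: both classes accept the same characters
# ('-', '+', word characters, whitespace), and '+' is a literal, not a quantifier.
class_a = re.compile(r'[-\w+\s]')
class_b = re.compile(r'[\s+\w-]')
same = all(bool(class_a.fullmatch(c)) == bool(class_b.fullmatch(c)) for c in '-q+ :/')

# A non-capturing group keeps the sequence "dash, word characters, whitespace"
# intact; the trailing * allows zero or more such flags.
URL_AFTER_FLAGS = re.compile(r'wget\s+(?:-\w+\s+)*(\S+)')


def extract_url(command):
    """Return the first non-flag argument after wget, or None if nothing matches."""
    match = URL_AFTER_FLAGS.search(command)
    return match.group(1) if match else None


print(same)                                                   # True
print(extract_url('wget http://google.com/myfile.sh'))        # http://google.com/myfile.sh
print(extract_url('wget -qO http://google.com/myfile.sh'))    # http://google.com/myfile.sh
print(extract_url('wget -q -N http://google.com/myfile.sh'))  # http://google.com/myfile.sh
```

Long options such as --quiet are not covered by -\w+, so the group would need to be widened for real-world command lines.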
