Regex to find only a valid URL + text (Python)

Question

I have a regex:

((http\://|https\://|ftp\://)|(www.)|([a-zA-Z0-9\.-]))+(([a-zA-Z0-9\.-]+\.[a-zA-Z]{2,4}))(/[a-zA-Z0-9%:/-_\?\.'~#-]*)?

which matches valid URLs perfectly.

I have a scenario where the input can be either:

VALID URL + TEXT (www.abc.com testing the regex), or
TEXT + VALID URL (Testing the regex www.abc.com)

Requirement:

What I want is for the regex to first check for a valid URL; if the URL is valid, it should ignore the URL itself and search only the TEXT outside it.

Issues:

I have tried many regexes, but they all pick up the valid URL as well, which I don't want. If the URL is valid, I only want to search the text outside the URL.

No functions, please. I am trying to solve this using regex alone.

Answer 1

Perhaps you want this:

(.*?)((?:(?:http\:\/\/|https\:\/\/|ftp\:\/\/)|(?:www.)|(?:[a-zA-Z0-9\.-]))+(?:(?:[a-zA-Z0-9\.-]+\.[a-zA-Z]{2,4}))(?:\/[a-zA-Z0-9%:\/-_\?\.'~#-]*)?)(.*)
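Since the question mentions Python, here is a minimal sketch of how the pattern above might be applied with the standard `re` module to the two sample inputs from the question. The name `PATTERN` is only illustrative, and the pattern itself is copied unchanged from the line above, merely split across several raw strings for readability.

```python
import re

# The answer's pattern, split into pieces: group 1 = text before the URL,
# group 2 = the URL, group 3 = text after the URL.
PATTERN = re.compile(
    r"(.*?)"
    r"((?:(?:http\:\/\/|https\:\/\/|ftp\:\/\/)|(?:www.)|(?:[a-zA-Z0-9\.-]))+"
    r"(?:(?:[a-zA-Z0-9\.-]+\.[a-zA-Z]{2,4}))"
    r"(?:\/[a-zA-Z0-9%:\/-_\?\.'~#-]*)?)"
    r"(.*)"
)

# The two scenarios from the question.
for sample in ("www.abc.com testing the regex", "Testing the regex www.abc.com"):
    before, url, after = PATTERN.match(sample).groups()
    print(repr(before), repr(url), repr(after))

# Prints:
# '' 'www.abc.com' ' testing the regex'
# 'Testing the regex ' 'www.abc.com' ''
```

Groups 1 and 3 hold the text outside the URL, so those are the parts to search further.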

See demo here:

You will get three groups. You can use named groups to capture the beforeUrl text, the Url, and the afterUrl text, which would look something like this:

(?<beforeUrl>.*?)(?<Url>(?:(?:http\:\/\/|https\:\/\/|ftp\:\/\/)|(?:www.)|(?:[a-zA-Z0-9\.-]))+(?:(?:[a-zA-Z0-9\.-]+\.[a-zA-Z]{2,4}))(?:\/[a-zA-Z0-9%:\/-_\?\.'~#-]*)?)(?<afterUrl>.*)

See demo here.
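One caveat if this is used from Python: the built-in `re` module spells named groups as `(?P<name>...)`, and the `(?<name>...)` form shown above is rejected there. A rough sketch of the same idea in Python syntax follows; the helper `text_outside_url` and the constants `URL` and `NAMED` are hypothetical names introduced only for this illustration, and returning the whole line when no URL is found is an assumption, not something the question specifies.

```python
import re

# URL sub-pattern copied unchanged from the answer above.
URL = (
    r"(?:(?:http\:\/\/|https\:\/\/|ftp\:\/\/)|(?:www.)|(?:[a-zA-Z0-9\.-]))+"
    r"(?:(?:[a-zA-Z0-9\.-]+\.[a-zA-Z]{2,4}))"
    r"(?:\/[a-zA-Z0-9%:\/-_\?\.'~#-]*)?"
)

# Same three groups as the answer, using Python's (?P<name>...) syntax.
NAMED = re.compile(r"(?P<beforeUrl>.*?)(?P<Url>" + URL + r")(?P<afterUrl>.*)")


def text_outside_url(line):
    """Return the text surrounding the first URL-like match in `line`."""
    m = NAMED.match(line)
    if m is None:
        # Assumption: with no URL present, the whole line counts as text.
        return line
    return (m.group("beforeUrl") + m.group("afterUrl")).strip()


print(text_outside_url("www.abc.com testing the regex"))
print(text_outside_url("Testing the regex www.abc.com"))
```

The named groups are also available as a dictionary via `m.groupdict()`, which can be handy when only some of the three parts are needed.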

Answer 1: 0, accepted, 2015-09-08 06:27:39
