Regular expression to select the URL part of a long string

Question

I have a very long string, and somewhere in this string there is a URL. In this example the URL is at the beginning.

"http://localhost:1234/api/$metadata#this_entry_is_variable_and_can_exist_of_numbers_and_characters/$entity","Version":"AAAEEEIIU=""

I'm trying to write a regex in C# for this particular string to extract the URL according to the following rules:

The URL always starts with http:// or https://
After the host, a port is sometimes specified, but not always
After the port there is a path, in this example /api, but it can be any characters
After the path (/api in this example) there is always /$metadata
After /$metadata there is a hash sign # followed by a string of any characters
The last part of the URL always ends with /$entity

This is the regex I have come up with so far:

(^http://\w+(\.\w+)*(:[0-9]+)?\/?(\/[.\^$metadata$(\#(\[a-zA-Z0-9)(\$(\entity$))]*).*?)

When testing this in LinqPad, the following issues occur:

If the string contains more than the URL, there is no match
It does not strictly validate /$metadata; it accepts /$metadata1111
It does not strictly validate /$entity; it accepts /$entity111
Obviously, it does not accept https:// yet.

Can anyone give me a hint on where to continue? I'm stuck.

Answer 1

Your regex doesn't follow the rules for constructing regular expressions, hence there is no match where you expect one. This is what you are trying to express:

https?://[^/]+/[^/]+/\$metadata#[^/]+/\$entity

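As a rough sketch of how this pattern might be applied in C#, the snippet below runs it against the sample string from the question using Regex.Match, written as top-level statements (C# 9 or later). The variable names are illustrative only. Note that each [^/]+ matches a single run of non-slash characters, so this pattern assumes the path between the host and /$metadata is exactly one segment.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample input from the question: the URL is surrounded by other text.
string input = "\"http://localhost:1234/api/$metadata#this_entry_is_variable_and_can_exist_of_numbers_and_characters/$entity\",\"Version\":\"AAAEEEIIU=\"\"";

// Pattern from the answer. There is no ^ anchor, so the URL may appear
// anywhere in the input; \$ matches a literal dollar sign.
var pattern = new Regex(@"https?://[^/]+/[^/]+/\$metadata#[^/]+/\$entity");

Match match = pattern.Match(input);

// match.Value holds only the URL, without the surrounding quotes and JSON.
Console.WriteLine(match.Success ? match.Value : "(no match)");
```

If the real paths can contain several segments (for example /api/v2), the single [^/]+ for the path would have to be relaxed.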

Answer 2

Try this regex:

https?://[\w-]+(?:\.[\w-]+)*(?::\d+)?/.*?\$metadata#.*?\$entity\b

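A minimal sketch of how this pattern behaves on the cases raised in the question, written as C# top-level statements (C# 9 or later). The three input strings are made up for illustration; only the first one follows all the rules. The trailing \b is what rejects /$entity111, and requiring the literal $metadata# is what rejects /$metadata1111.

```csharp
using System;
using System.Text.RegularExpressions;

var pattern = new Regex(
    @"https?://[\w-]+(?:\.[\w-]+)*(?::\d+)?/.*?\$metadata#.*?\$entity\b");

// Hypothetical inputs, not taken from the original post.
string[] inputs =
{
    "see https://example.com/api/v2/$metadata#Orders/$entity for details",
    "http://localhost:1234/api/$metadata1111#Orders/$entity",
    "http://localhost:1234/api/$metadata#Orders/$entity111",
};

foreach (string input in inputs)
{
    Match match = pattern.Match(input);
    // Only the first input should produce a match.
    Console.WriteLine(match.Success ? match.Value : "(no match)");
}
```

Because the path is matched with a lazy .*?, this version also accepts paths with more than one segment.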

To your questions:

1. Your regex matches only when the URL is at the very start because of the ^. It matches only the start of the input string if RegexOptions.Multiline is not set, and also the start of every new line (after newline characters) if RegexOptions.Multiline is set.
2. The regex gets mixed up in the part where $metadata...entity$ is surrounded by []. Inside square brackets the characters form a character class, so they are treated as a set of individual allowed characters rather than as a literal sequence.
3. See 2.
4. Simply make the s optional with ?
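The effect of ^ described above can be checked with a short sketch like the following, written as C# top-level statements (C# 9 or later). The two-line input is a made-up example in which the URL starts the second line rather than the whole string; the variable names are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input: the URL begins on the second line.
string input = "log entry\nhttp://localhost/api/$metadata#Orders/$entity";

// The pattern from this answer, with and without a leading ^ anchor.
string unanchored = @"https?://[\w-]+(?:\.[\w-]+)*(?::\d+)?/.*?\$metadata#.*?\$entity\b";
string anchored = "^" + unanchored;

// By default, ^ matches only at the start of the whole input.
bool defaultAnchored = Regex.IsMatch(input, anchored);

// With RegexOptions.Multiline, ^ also matches right after each newline.
bool multilineAnchored = Regex.IsMatch(input, anchored, RegexOptions.Multiline);

// Without ^, the pattern may match anywhere in the input.
bool anywhere = Regex.IsMatch(input, unanchored);

Console.WriteLine(defaultAnchored);   // False
Console.WriteLine(multilineAnchored); // True
Console.WriteLine(anywhere);          // True
```

For a URL embedded in the middle of a line, as in the question, dropping the ^ is the simpler option; Multiline only helps when the URL starts a line.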

Answer 1: 3 votes, accepted, 2017-04-21 06:07:34
Answer 2: 2 votes, 2017-04-21 06:07:44
