
Regex Matching First URL with word

What I need:

I have a string like this:

Bike's: http://website.net/bikeurl Toys: http://website.net/rc-cars
Calendar: http://website.net/schedule

I want to match the word I specify and the first URL after it. So if I specify the word as "Bike", I should get:

Bike's: http://website.net/bikeurl

Or if possible only the url of the Bike word:

http://website.net/bikeurl

Or if I specify Toys I should get:

Toys: http://website.net/rc-cars

or if possible

http://website.net/rc-cars

What I am using:

I am using this regex:

(Bike)(.*)((https?|ftp):/?/?)(?:(.*?)(?::(.*?)|)@)?([^:/\s]+)(:([^/]*))?(((?:/\w+)*)/)([-\w.]+[^#?\s]*)?(\?([^#]*))?(#(.*))?

Result:

It is matching:

Bike's: http://website.net/bikeurl Toys: http://website.net/rc-cars

I only want:

Bike's: http://website.net/bikeurl

I am not a regex expert. I tried using {n} and {n,}, but the regex either matched nothing or matched the same text as before.

I am using .NET C# so I am testing here http://regexhero.net/tester/

Here is another approach:

Bike(.*?):\s\S*

and here is an example how to get the corresponding URL-candidate only:

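// Requires: using System.Text.RegularExpressions;
var inputString = "Bike’s: http://website.net/bikeurl Toys: http://website.net/rc-cars Calendar: http://website.net/schedule";
var word = "Bike";
// Lazily skip to the first colon after the word, then capture the next run of non-whitespace.
var match = new Regex( Regex.Escape( word ) + @"(.*?):\s(?<URL>\S*)" ).Match( inputString );
// Groups["URL"].Value is empty when nothing matched; Match.Result would throw on a failed match.
var url = match.Groups["URL"].Value;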
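
The snippet above can be wrapped into a small helper. This is only a minimal sketch, not the answerer's code: the class name UrlFinder and the method FirstUrlAfter are hypothetical, and the word is passed through Regex.Escape on the assumption that it should be matched literally.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlFinder
{
    // Returns the first whitespace-free token that follows "<word>...: ",
    // or null when the word (or a colon after it) is not found.
    public static string FirstUrlAfter(string input, string word)
    {
        // Regex.Escape keeps characters such as '.' or '+' in the word literal.
        // .*? is lazy, so the match stops at the first colon after the word.
        var pattern = Regex.Escape(word) + @".*?:\s(?<URL>\S+)";
        var match = Regex.Match(input, pattern);
        return match.Success ? match.Groups["URL"].Value : null;
    }

    public static void Main()
    {
        var input = "Bike's: http://website.net/bikeurl Toys: http://website.net/rc-cars Calendar: http://website.net/schedule";
        Console.WriteLine(FirstUrlAfter(input, "Bike")); // http://website.net/bikeurl
        Console.WriteLine(FirstUrlAfter(input, "Toys")); // http://website.net/rc-cars
    }
}
```

The lazy .*? is what keeps the match from running on to a later URL, which is the problem with the greedy (.*) in the question's regex.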

If I understood your problem correctly, you need a generic regex that selects a URL based on a word. Here is one that selects the URL containing bike:

(.(?<!\s))*\/\/((?!\s).)*bike((?!\s).)*

If you replace bike with any other word, it selects the corresponding URL.

EDIT 1:
Based on your edit, here is one that would select based on the word preceding the URL:

(TOKEN((?!\s).)*\s+)((?!\s).)*

It would select the word + the URL, e.g.:

(Bike((?!\s).)*\s+)((?!\s).)* would select Bike's: http://website.net/bikeurl
(Toy((?!\s).)*\s+)((?!\s).)* would select Toys: http://website.net/rc-cars
(Calendar((?!\s).)*\s+)((?!\s).)* would select Calendar: http://website.net/schedule

If you want to make sure the string contains a URL, you can use this instead:

(TOKEN((?!\s).)*\s+)((?!\s).)*\/\/((?!\s).)*

This ensures that the second part of the match, i.e. the part that is supposed to contain a URL, includes a //.
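
As a rough illustration of how the TOKEN pattern with the // check could be built from a word in C#, here is a sketch. The names TokenUrlMatcher and MatchLabelAndUrl are hypothetical, and escaping the token with Regex.Escape is an added assumption rather than part of the pattern above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TokenUrlMatcher
{
    // Substitutes the token into the pattern above and returns the whole
    // match (label plus URL), or null when nothing matches.
    public static string MatchLabelAndUrl(string input, string token)
    {
        // ((?!\s).)* consumes a run of non-whitespace characters, like \S*.
        var pattern = "(" + Regex.Escape(token) + @"((?!\s).)*\s+)((?!\s).)*\/\/((?!\s).)*";
        var match = Regex.Match(input, pattern);
        return match.Success ? match.Value : null;
    }

    public static void Main()
    {
        var input = "Bike's: http://website.net/bikeurl Toys: http://website.net/rc-cars Calendar: http://website.net/schedule";
        Console.WriteLine(MatchLabelAndUrl(input, "Bike")); // Bike's: http://website.net/bikeurl
        Console.WriteLine(MatchLabelAndUrl(input, "Toy"));  // Toys: http://website.net/rc-cars
    }
}
```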

If you really need to make sure it's a valid URL, look at these questions:

Validate urls with Regex

Regex to check a valid Url

Here's another solution: I would store the Bike's, Toys and Calendar labels as keys in a dictionary, with the URLs as values, and then look up the one I need.

        // Requires: using System.Collections.Generic;
        Dictionary<string, string> myDic = new Dictionary<string, string>()
        {
            { "Bike's",   "http://website.net/bikeurl" },
            { "Toys",     "http://website.net/rc-cars" },
            { "Calendar", "http://website.net/schedule" }
        };

        // Look the label up directly; the key must match the stored label exactly.
        if (myDic.TryGetValue("Bike's", out string url))
        {
            // do something with url
        }
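
If the labels and URLs only exist inside the original string, the dictionary could be filled by parsing that string rather than typing the pairs by hand. The following is a sketch under assumptions: the class LabelUrlParser and its Parse method are hypothetical, and it assumes every entry has the form "Label: URL" with no whitespace inside the label or the URL.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LabelUrlParser
{
    // Splits "Label: URL Label: URL ..." into a label -> URL dictionary.
    public static Dictionary<string, string> Parse(string input)
    {
        var result = new Dictionary<string, string>();
        // The label is the non-whitespace run before ':'; the URL is the
        // next non-whitespace run that contains "//".
        foreach (Match m in Regex.Matches(input, @"(?<label>\S+):\s+(?<url>\S*//\S*)"))
        {
            result[m.Groups["label"].Value] = m.Groups["url"].Value;
        }
        return result;
    }

    public static void Main()
    {
        var input = "Bike's: http://website.net/bikeurl Toys: http://website.net/rc-cars Calendar: http://website.net/schedule";
        var urls = Parse(input);
        Console.WriteLine(urls["Toys"]); // http://website.net/rc-cars
    }
}
```

Note that the keys are the full labels without the colon, so the bike URL is stored under "Bike's", not "Bike".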

Hope one of my ideas helps you.
