
How to get the main domain string using regular expression?

I have just started using regular expressions and I have run into a problem. It would be really nice if someone could help me out with it.

The problem is this: suppose I have a URL as given below:

$url = http://www.blog.domain.com/page/category=?

and I want only the domain. How can I get it using a regular expression in JavaScript?

Thank you.

This should work too, and it is shorter and more restrictive:

var url = "http://www.blog.domain.com/page/category"
var result = url.replace(/^(https?:\/\/)?(.+\.)*(([a-z0-9-]*)\.[a-z]{2,6})(\/.+)$/i,"$4")

If you want "domain.com" and not only "domain", use $3 instead of $4.

Explanation, step by step:

  • A valid domain name consists of letters, digits and "-": /([a-z0-9-]*)/i
  • Add the domain extension (2-6 letters): /(([a-z0-9-]*)\.[a-z]{2,6})/i
  • Allow subdomains: /(.+\.)*(([a-z0-9-]*)\.[a-z]{2,6})/i
  • A URL starts with http or https: /^https?:\/\/(.+\.)*(([a-z0-9-]*)\.[a-z]{2,6})/i
  • The scheme may be omitted when typing a URL: /^(https?:\/\/)?(.+\.)*(([a-z0-9-]*)\.[a-z]{2,6})/i
  • Finally, match whatever follows the first /: /^(https?:\/\/)?(.+\.)*(([a-z0-9-]*)\.[a-z]{2,6})(\/.+)$/i
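
The steps above can be combined into a small helper. This is only a sketch: the function name extractDomain and its withExtension flag are hypothetical and not part of the original answer. It uses exec() rather than replace(), so a URL that does not match (for example one with no path after the host) gives null instead of the unchanged input.

```javascript
// Hypothetical helper wrapping the regex built up in the steps above.
function extractDomain(url, withExtension) {
  var pattern = /^(https?:\/\/)?(.+\.)*(([a-z0-9-]*)\.[a-z]{2,6})(\/.+)$/i;
  var match = pattern.exec(url);
  if (!match) {
    return null; // the pattern requires a path after the host
  }
  // Group 3 is "domain.com", group 4 is "domain".
  return withExtension ? match[3] : match[4];
}

console.log(extractDomain("http://www.blog.domain.com/page/category"));       // "domain"
console.log(extractDomain("http://www.blog.domain.com/page/category", true)); // "domain.com"
console.log(extractDomain("http://domain.com"));                              // null
```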

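Try the code below:

 var url = "http://www.blog.domain.com/page/category=?";
 // Skip an optional scheme and "www.", then capture lazily up to the first "/"
 var match = url.match(/(?:https?:\/\/)?(?:www\.)?(.*?)\//);
 console.log(match[match.length - 1]); // "blog.domain.com"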

Do not use regex for this:

Use hostname:

The URLUtils.hostname property is a DOMString containing the domain of the URL.

var x = new URL("http://www.blog.domain.com/page/category=?").hostname;
console.log(x);

As pointed out by vishwanath, URL has compatibility issues with IE<10, so in those cases a regex will be needed.
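
Note that hostname gives the full host ("www.blog.domain.com"), not just the main domain. One way to narrow it down, sketched here under the assumption that the suffix is a single label such as "com", is to keep the last two dot-separated labels. The function name mainDomainFromUrl is hypothetical, and a suffix like "co.uk" would need a suffix list instead.

```javascript
// Sketch: take the hostname from URL, then keep the last two labels.
// Assumption: a single-label suffix ("com"); "co.uk" is NOT handled here.
function mainDomainFromUrl(urlString) {
  var hostname = new URL(urlString).hostname; // "www.blog.domain.com"
  var labels = hostname.split(".");
  return labels.slice(-2).join(".");
}

console.log(mainDomainFromUrl("http://www.blog.domain.com/page/category=?")); // "domain.com"
```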

Use this:

var str = "http://www.blog.domain.com/page/category=?";
// Escape the dot so it only matches a literal "." before the suffix
var res = str.match(/[^.]*\.(com|net|org|info|coop|int|co\.uk|org\.uk|ac\.uk|uk)/g);
console.log(res);

=> domain.com

The list in the regex can be expanded further depending on your needs. A list of TLDs can be found here
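
If the list grows, it may be easier to keep the suffixes in an array and build the regex from it. The following is a sketch only: the names suffixes and buildDomainRegex are hypothetical, and the lookahead requiring "/" or the end of the string after the suffix is an added assumption, not part of the regex in the answer.

```javascript
// Hypothetical suffix list; extend it as needed.
var suffixes = ["com", "net", "org", "info", "coop", "int", "co.uk", "org.uk", "ac.uk", "uk"];

function buildDomainRegex(list) {
  var alternatives = list
    .slice()
    .sort(function (a, b) { return b.length - a.length; })   // longest suffix first
    .map(function (s) { return s.replace(/\./g, "\\."); });  // escape literal dots
  // One label, a literal dot, a known suffix, then "/" or end of string.
  return new RegExp("[^./]+\\.(?:" + alternatives.join("|") + ")(?=/|$)");
}

var domainRegex = buildDomainRegex(suffixes);
console.log("http://www.blog.domain.com/page/category=?".match(domainRegex)[0]); // "domain.com"
console.log("http://www.shop.example.co.uk/".match(domainRegex)[0]);             // "example.co.uk"
```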

You can get it using the following RegEx: /.*\.(.+)\.(?:com|org|gov)/

You can add all of the supported domain extensions to the group in this regex.

RegEx101 Explanation

Working Code Snippet:

 var url = "http://www.blog.domain.gov/page/category=?";
 // Use a group for the alternatives; [com|org|gov] would be a character class
 var regEx = /.*\.(.+)\.(?:com|org|gov)/;
 alert(url.match(regEx)[1]); // "domain"
