Extracting the domain name from a URL using JavaScript

I need a general script/pattern to extract the main domain name from URLs. My attempt below fails in some cases.

Let us say I have the link link1 and need to extract the main domain name (google.co.uk) without the sub-domain (mail). I wrote this script, which works fine with .co.uk but fails for websites whose top-level domain name has a single part, such as .com.

Is there a better way to extract the main domain name from ANY URL? The URL is constructed as follows:

https://(optional sub-domain)*(domain name with two or three top-level domain name)(optional forward slash followed by text)*

The * means zero or more times.

var link1="https://mail.google.co.uk/link/link/link";
var url = new URL(link1);
var domain = url.hostname.split('.').slice(-3).join('.');
console.log("The domain name is: "+ domain);

In the above code, I expect: google.co.uk

It works because the link has two parts in the top-level domain name (.co.uk), so -3 works. But I need the code to work with this link as well:

var link1="https://mail.google.com/link/link/link";

And I need the output to be: google.com

But the problem is that the code produces:

mail.google.com

And I only want the main domain name: google.com

EDIT: Some examples of the expected output:

1) In mail.google.co.uk it should be: google.co.uk

2) In mail.google.com it should be: google.com

3) In link.mail.google.com/link/link it should be: google.com

4) In link.link2.mail.google.com it should be: google.com

i.e. just the main domain name, without sub-domains or the path after the domain name. The top-level domain name can be in the form of (.com, .net, .org, etc.) or in the form of (.co.uk, .co.us, etc.). The top-level domain name should be captured whether it has one part or two parts (my code handles only the two-part case).
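One possible approach, sketched here under stated assumptions, is to check whether the last two labels of the hostname form a known two-part suffix and keep three labels in that case, otherwise two. The `TWO_PART_SUFFIXES` set and the `getMainDomain` helper are hypothetical names introduced for illustration, and the set is deliberately incomplete; a robust solution would likely need the full Public Suffix List rather than a hand-written list. The function also assumes the input includes a scheme such as https://, since `new URL` throws on a bare hostname.

```javascript
// Hypothetical, deliberately incomplete list of two-part suffixes.
// Assumption: a real solution would use the full Public Suffix List instead.
const TWO_PART_SUFFIXES = new Set(["co.uk", "org.uk", "ac.uk", "com.au", "co.jp"]);

// Returns the main domain of a URL string (hypothetical helper).
function getMainDomain(link) {
  // Split the hostname into its dot-separated labels.
  const labels = new URL(link).hostname.split(".");
  // If the last two labels form a known two-part suffix, keep three labels;
  // otherwise treat the suffix as one part and keep two labels.
  const lastTwo = labels.slice(-2).join(".");
  const keep = TWO_PART_SUFFIXES.has(lastTwo) ? 3 : 2;
  return labels.slice(-keep).join(".");
}

console.log(getMainDomain("https://mail.google.co.uk/link/link/link")); // google.co.uk
console.log(getMainDomain("https://mail.google.com/link/link/link"));   // google.com
console.log(getMainDomain("https://link.link2.mail.google.com"));       // google.com
```

Any suffix missing from the set (for example a two-part suffix not listed here) falls back to the two-label case, so the list has to cover every suffix you expect to meet.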

Sure, if you wanted

"mail.google.co.uk"

you can just use

url.host

or, if you wanted it with the protocol (scheme) as well, use

url.origin
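As a quick illustration of what these properties return for the link from the question (a minimal sketch using the standard `URL` object), note that both still include the sub-domain:

```javascript
// The example link from the question.
const url = new URL("https://mail.google.co.uk/link/link/link");

console.log(url.hostname); // mail.google.co.uk
console.log(url.host);     // mail.google.co.uk (would also include a non-default port, if any)
console.log(url.origin);   // https://mail.google.co.uk
```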

Cheers!
