Problem capturing the top-level domain from a URL

Question

I want a way to capture the top-level domain (TLD) from a URL, but I haven't had any success. The problem in my case is that the URL can vary: a user might enter www.google.com, m.google.com, m.google.uk, google.uk, or www.m.google.com.

I tried using slice, but it didn't work because the TLD can be 2 or 3 characters long. I can't rely on a fixed index after splitting on ".", because I might get 2, 3, or 4 parts. Is there a single-line JavaScript function I can use, or an easy custom function?

All the posts I found explain how to get the host name, but in my case I want to extract just the last 2 or 3 characters of the URL (com, uk, cn, etc.). I could chain multiple if-else statements, but I want to avoid that and check whether there is a simpler solution.

I am looking for output such as 'com', 'uk', or 'cn', depending on the top-level domain of my URL. The URL is entered by the user, which is why it is difficult to predict whether they will enter m.google.com, www.m.google.com, www.google.com, or simply google.com.

Answer 1

One possible approach:

var parser = document.createElement('a');

parser.href = "http://www.google.com/path/";
console.log(parser.hostname); // "www.google.com"

parser.href = "http://m.google.com/path/";
console.log(parser.hostname); // "m.google.com"

parser.href = "http://www.m.google.com/path/";
console.log(parser.hostname); // "www.m.google.com"
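The snippet above stops at the hostname, and it needs a browser document. As a rough sketch of one way to go the last step, the standard URL class (available in modern browsers and in Node.js) can parse the input, and the last dot-separated label of the hostname can then be taken. The helper name tldFromUrl and the choice to prepend "http://" when no scheme is present are assumptions made for illustration, not part of the original answer.

```javascript
// Hypothetical helper (not from the original answer): returns the last
// hostname label, parsing with the standard URL class instead of an <a> element.
function tldFromUrl(input) {
  // Assumption: input without a scheme (e.g. "m.google.uk") is treated as http.
  var hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(input);
  var withScheme = hasScheme ? input : 'http://' + input;

  // new URL(...) throws a TypeError if the string cannot be parsed.
  var hostname = new URL(withScheme).hostname;

  // The TLD is the text after the last dot in the hostname.
  var labels = hostname.split('.');
  return labels[labels.length - 1];
}

console.log(tldFromUrl('http://www.google.com/path/')); // "com"
console.log(tldFromUrl('m.google.uk'));                 // "uk"
console.log(tldFromUrl('www.m.google.com'));            // "com"
```

Because the URL class separates the port, path, and query from the hostname, dots in those parts do not affect the result.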

Answer 2

The code below works for me. Thanks @StephenP for your help. Thanks @Timo as well, but it seems that document is not defined in the Protractor library.

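// Read the site name entered by the user.
var parser = TextBox.siteName;
// Split the host into its dot-separated parts.
var hostParts = parser.split('.');
// The last part is the top-level domain, e.g. 'com' or 'uk'.
var URLdomain = hostParts[hostParts.length - 1];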
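Since the question asks for a single-line function, the same split-and-take-last idea could be condensed roughly as follows. The name lastLabel is made up for this sketch, and it assumes the input is a bare hostname with no scheme, path, or port, as in the question's examples.

```javascript
// Hypothetical one-line helper; assumes a bare hostname such as "www.google.com"
// (no "http://", no path, no port). Surrounding whitespace is trimmed first.
var lastLabel = function (host) { return host.trim().split('.').pop(); };

console.log(lastLabel('www.google.com'));   // "com"
console.log(lastLabel('m.google.uk'));      // "uk"
console.log(lastLabel(' google.cn '));      // "cn"
```

If the input contains no dot at all, this returns the whole trimmed string, so a caller may want to validate the input separately.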

Answer 3

If you can isolate the domain, everything after its last period (.) is the TLD.

Test it out here: https://jsfiddle.net/ubb61wam/2/

var addresses = [
  'google.com',             // should return 'com'
  'https://google.com.uk',  // should return 'uk'
  'yahoo.cn/foo/bar.foo',   // should return 'cn'
  'file:///usr/local'       // should return undefined
];

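// Use an index loop: for...in on an array also visits inherited enumerable keys.
for (var index = 0; index < addresses.length; index++) {
    console.log(tld(addresses[index]));
}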

function tld(address) {
    // handle edge-cases
    if (typeof address == 'undefined' || address.indexOf('file:///') != -1)
        return undefined;

    var part = address;

    //remove http://
    if (part.indexOf('//') != -1)
        part = part.split('//')[1];

    //isolate domain
    if (part.indexOf('/') != -1)
        part = part.split('/')[0];  

    //get tld
    if (part.indexOf('.') != -1) {
        var all = part.split('.');
        part = all[all.length - 1]; 
    }
    return part;
}
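The function above does not strip a port or a query string, so an input like google.com:8080 would come back as com:8080. One possible variant, sketched here with a regular expression, isolates the host in a single match before taking the text after the last period. The name tldRegex is hypothetical, and the sketch assumes the input carries no user:password@ prefix.

```javascript
// Hypothetical regex variant of the steps above (not part of the original answer).
// Assumption: the address has no "user:password@" section before the host.
function tldRegex(address) {
  // Same edge cases as above: non-string input and file URLs give undefined.
  if (typeof address !== 'string' || address.indexOf('file:///') === 0) {
    return undefined;
  }

  // Optional scheme ("http://", "https://", ...), then the host, which ends
  // at the first '/', '?', '#', or ':' (path, query, fragment, or port).
  var match = /^(?:[a-z][a-z0-9+.-]*:\/\/)?([^\/?#:]+)/i.exec(address);
  if (!match) {
    return undefined;
  }

  // Everything after the last period of the host is the TLD.
  var host = match[1];
  return host.slice(host.lastIndexOf('.') + 1);
}

console.log(tldRegex('google.com'));            // "com"
console.log(tldRegex('https://google.com.uk')); // "uk"
console.log(tldRegex('yahoo.cn/foo/bar.foo'));  // "cn"
console.log(tldRegex('google.com:8080/x'));     // "com"
console.log(tldRegex('file:///usr/local'));     // undefined
```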

3 answers

Answer 1
2 2016-11-04 17:37:01

Answer 2
1 Accepted 2016-11-04 18:46:04

Answer 3
0 2016-11-04 19:27:18
