User Agent parsing in JavaScript

I need to extract the name of the operating system and the name of the browser from the user agent string.

Sample user agent:

Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.9) Gecko/20100825 Ubuntu/9.10 (karmic) Firefox/3.6.9

How do I get just the operating system and the browser (for example "Linux i686" and "Firefox 3.6.9")?

Here is my code from the fiddle link:

function getBrowserAndOS(userAgent, elements) {
    var browserList = {
            'Chrome': [/Chrome\/(\S+)/],
            'Firefox': [/Firefox\/(\S+)/],
            'MSIE': [/MSIE (\S+);/],
            'Opera': [
                /Opera\/.*?Version\/(\S+)/,
                /Opera\/(\S+)/
            ],
            'Safari': [/Version\/(\S+).*?Safari\//]
        },
        re, m, browser, version;

    var osList = {
            'Windows': [/Windows\/(\S+)/],
            'Linux': [/Linux\/(\S+)/]
        },
        re2, m2, os;

    if (userAgent === undefined)
        userAgent = navigator.userAgent;

    if (elements === undefined)
        elements = 2;
    else if (elements === 0)
        elements = 1337;

    for (browser in browserList) {
        while (re = browserList[browser].shift()) {
            if (m = userAgent.match(re)) {
                version = (m[1].match(new RegExp('[^.]+(?:\\.[^.]+){0,' + --elements + '}')))[0];
                //version = (m[1].match(new RegExp('[^.]+(?:\\.[^.]+){0,}')))[0];
                //return browser + ' ' + version;
                console.log(browser + ' ' + version);
            }
        }
    }

    for (os in osList) {
        while (re2 = osList[os].shift()) {
            if (m2 = userAgent.match(re2)) {
                //version = (m[1].match(new RegExp('[^.]+(?:\\.[^.]+){0,' + --elements + '}')))[0];
                //version = (m[1].match(new RegExp('[^.]+(?:\\.[^.]+){0,}')))[0];
                //return browser + ' ' + version;
                console.log(os);
            }
        }
    }

    return null;
}

console.log(getBrowserAndOS(navigator.userAgent, 2));

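I only need to extract the OS name and the browser name, with their respective versions. How can I parse the string to get those values?

I wouldn't recommend doing this yourself. I would use a parser such as Platform.js, which works like this: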

<script src="platform.js"></script>
<script>
var os = platform.os;
var browser = platform.name + ' ' + platform.version;
</script>

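Are you planning to control your website's behavior based on the browser "sniffed" from the User-Agent (UA) string?

Please don't; use feature detection instead.

Poorly implemented (non-futureproof) User-Agent sniffing has proven to be the top compatibility problem encountered each time a new version of Internet Explorer ships. As a result, the logic around the user-agent string has grown increasingly complicated over the years. The introduction of Compatibility Modes means that the browser now has more than one UA string, and legacy extensibility of the string was deprecated after years of abuse.

By default, Internet Explorer 11 on Windows 8.1 sends the following User-Agent string: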

Mozilla/5.0 (Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko

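This string is deliberately designed to cause most UA-string sniffing logic to interpret it as either Gecko or WebKit. This design choice was a careful one: the IE team tested many UA string variants to find out which would cause the majority of sites to "just work" for IE11 users.

Here are two links that will actually help you. You may also want to view the original source of much of my comment.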
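As a rough illustration of what feature detection looks like in practice, the sketch below checks for the capability itself rather than guessing from the browser name. The helper name `detectFeatures` and the three features it probes are assumptions chosen for the example, not part of any library.

```javascript
// Hypothetical helper: report which capabilities a global object exposes.
// `root` would be `window` in a browser; the probed features are only examples.
function detectFeatures(root) {
    return {
        fetch: typeof root.fetch === 'function',
        geolocation: Boolean(root.navigator && 'geolocation' in root.navigator),
        promises: typeof root.Promise === 'function'
    };
}

const features = detectFeatures(globalThis);
if (features.fetch) {
    // Safe to call fetch() here.
} else {
    // Fall back to XMLHttpRequest or another transport here.
}
```

The code branches on what the environment can actually do, so it keeps working when a new browser or an unexpected UA string shows up.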

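Here is a native JavaScript solution for identifying the operating system. Note, however, that it has to be updated by hand whenever a new operating system is introduced: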

function getOs(userAgent) {

     // Convert the user agent to lower case so that matching is case-insensitive
     var ua = userAgent.toLowerCase();

     // Fallback in case the operating system can't be identified
     var os = "Unknown OS Platform";

     // Parallel arrays of user-agent substrings and operating system names.
     // The first match wins, so more specific tokens must come first:
     // iPhone/iPod/iPad strings also contain "mac os x", and Ubuntu strings also contain "linux".
     var match = ["windows nt 10","windows nt 6.3","windows nt 6.2","windows nt 6.1","windows nt 6.0","windows nt 5.2","windows nt 5.1","windows xp","windows nt 5.0","windows me","win98","win95","win16","iphone","ipod","ipad","macintosh","mac os x","mac_powerpc","android","ubuntu","linux","blackberry","webos"];
     var result = ["Windows 10","Windows 8.1","Windows 8","Windows 7","Windows Vista","Windows Server 2003/XP x64","Windows XP","Windows XP","Windows 2000","Windows ME","Windows 98","Windows 95","Windows 3.11","iPhone","iPod","iPad","Mac OS X","Mac OS X","Mac OS 9","Android","Ubuntu","Linux","BlackBerry","Mobile"];

     // For each item in the match array
     for (var i = 0; i < match.length; i++) {

              // If the substring is contained in the user agent, set the os and stop
              if (ua.indexOf(match[i]) !== -1) {
                   os = result[i];
                   break;
              }

     }

     // Return the determined os
     return os;
}
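Two parallel arrays are easy to get out of sync when entries are added. One possible variation, sketched below with a deliberately shortened rule list, keeps each substring next to its label in a single ordered list. The names `OS_RULES` and `lookupOs` are made up for this example.

```javascript
// Assumption: a shortened rule list for illustration only; extend it as needed.
// Order still matters, because the first rule whose token is found wins.
const OS_RULES = [
    ['windows nt 10', 'Windows 10'],
    ['windows nt 6.1', 'Windows 7'],
    ['iphone', 'iPhone'],       // must precede "mac os x"
    ['mac os x', 'Mac OS X'],
    ['android', 'Android'],     // must precede "linux"
    ['linux', 'Linux']
];

// Hypothetical helper: return the label of the first matching rule.
function lookupOs(userAgent) {
    const ua = userAgent.toLowerCase();
    const rule = OS_RULES.find(([token]) => ua.includes(token));
    return rule ? rule[1] : 'Unknown OS Platform';
}

console.log(lookupOs('Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.9) Gecko/20100825 Ubuntu/9.10 (karmic) Firefox/3.6.9'));
```

Like the array version, this still needs manual updates whenever a new operating system appears.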

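User agents are not a set of metadata for asking qualitative questions such as "What are you?". They are really only useful for targeted questions such as "Are you Linux?" or "What version of Firefox are you?".

Let me illustrate. Here is a script that converts a user agent into a lovely JSON-serializable object: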
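Targeted questions of that kind can be sketched as small predicate functions. The names `isLinux` and `firefoxVersion` are invented for this example, and the patterns are simple assumptions that cover the sample string from the question, not every user agent in the wild.

```javascript
// Hypothetical helpers, each answering one narrow question about a UA string.

// "Are you Linux?" (note that Android user agents also contain "Linux")
function isLinux(userAgent) {
    return /\bLinux\b/.test(userAgent);
}

// "What version of Firefox are you?" Returns null when the token is absent.
function firefoxVersion(userAgent) {
    const m = /\bFirefox\/(\S+)/.exec(userAgent);
    return m ? m[1] : null;
}

const sample = 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.9) Gecko/20100825 Ubuntu/9.10 (karmic) Firefox/3.6.9';
console.log(isLinux(sample), firefoxVersion(sample));
```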

const parseUA = (() => {
    //user agent strings are just a set of phrases, each optionally followed by a set of properties enclosed in parentheses
    const part = /\s*([^\s/]+)(\/(\S+)|)(\s+\(([^)]+)\)|)/g;
    //these properties are delimited by semicolons
    const delim = /;\s*/;
    //the properties may be simple key-value pairs if:
    const single = [
        //it is a single comma separation,
        /^([^,]+),\s*([^,]+)$/,
        //it is a single space separation,
        /^(\S+)\s+(\S+)$/,
        //it is a single colon separation,
        /^([^:]+):([^:]+)$/,
        //it is a single slash separation
        /^([^/]+)\/([^/]+)$/,
        //or is a special string
        /^(\.NET CLR|Windows)\s+(.+)$/
    ];
    //otherwise it is unparsable because everyone does it differently, looking at you iPhone
    const many = / +/;
    //oh yeah, bots like to use links
    const link = /^\+(.+)$/;

    const inner = (properties, property) => {
        let tmp;

        if (tmp = property.match(link)) {
            properties.link = tmp[1];
        }
        else if (tmp = single.reduce((match, regex) => (match || property.match(regex)), null)) {
            properties[tmp[1]] = tmp[2];
        }
        else if (many.test(property)) {
            if (!properties.properties)
                properties.properties = [];
            properties.properties.push(property);
        }
        else {
            properties[property] = true;
        }

        return properties;
    };

    return (input) => {
        const output = {};
        for (let match; (match = part.exec(input)) !== null;) {
            output[match[1]] = {
                ...(match[5] && match[5].split(delim).reduce(inner, {})),
                ...(match[3] && {version:match[3]})
            };
        }
        return output;
    };
})();
//parseUA('user agent string here');

Using this, we can convert the following user agents:

`Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; WOW64; Trident/4.0; SLCC1; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; .NET4.0E)`

{
    "Mozilla": { "compatible": true, "MSIE": "7.0", "Windows": "NT 6.0", "WOW64": true, "Trident": "4.0", "SLCC1": true, ".NET CLR": "3.0.30729", ".NET4.0C": true, ".NET4.0E": true, "version": "4.0" }
}

`Mozilla/5.0 (SAMSUNG; SAMSUNG-GT-S8500-BOUYGUES/S8500AGJF1; U; Bada/1.0; fr-fr) AppleWebKit/533.1 (KHTML, like Gecko) Dolfin/2.0 Mobile WVGA SMM-MMS/1.2.0 NexPlayer/3.0 profile/MIDP-2.1 configuration/CLDC-1.1 OPN-B`

{
    "Mozilla": { "SAMSUNG": true, "SAMSUNG-GT-S8500-BOUYGUES": "S8500AGJF1", "U": true, "Bada": "1.0", "fr-fr": true, "version": "5.0" },
    "AppleWebKit": { "KHTML": "like Gecko", "version": "533.1" },
    "Dolfin": { "version": "2.0" },
    "Mobile": {},
    "WVGA": {},
    "SMM-MMS": { "version": "1.2.0" },
    "NexPlayer": { "version": "3.0" },
    "profile": { "version": "MIDP-2.1" },
    "configuration": { "version": "CLDC-1.1" },
    "OPN-B": {}
}

`Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/532.5 (KHTML, like Gecko) Comodo_Dragon/4.1.1.11 Chrome/4.1.249.1042 Safari/532.5`

{
    "Mozilla": { "Windows": "NT 5.1", "U": true, "en-US": true, "version": "5.0" },
    "AppleWebKit": { "KHTML": "like Gecko", "version": "532.5" },
    "Comodo_Dragon": { "version": "4.1.1.11" },
    "Chrome": { "version": "4.1.249.1042" },
    "Safari": { "version": "532.5" }
}

`Mozilla/5.0 (X11; Fedora; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36`

{
    "Mozilla": { "X11": true, "Fedora": true, "Linux": "x86_64", "version": "5.0" },
    "AppleWebKit": { "KHTML": "like Gecko", "version": "537.36" },
    "Chrome": { "version": "73.0.3683.86" },
    "Safari": { "version": "537.36" }
}

`Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:66.0) Gecko/20100101 Firefox/66.0`

{
    "Mozilla": { "X11": true, "Fedora": true, "Linux": "x86_64", "rv": "66.0", "version": "5.0" },
    "Gecko": { "version": "20100101" },
    "Firefox": { "version": "66.0" }
}

`Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36`

{
    "Mozilla": { "X11": true, "Linux": "x86_64", "version": "5.0" },
    "AppleWebKit": { "KHTML": "like Gecko", "version": "537.36" },
    "Chrome": { "version": "73.0.3683.103" },
    "Safari": { "version": "537.36" }
}

`Mozilla/5.0 (Linux; Android 6.0.1; SM-G920V Build/MMB29K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.98 Mobile Safari/537.36`

{
    "Mozilla": { "Linux": true, "Android": "6.0.1", "SM-G920V": "Build/MMB29K", "version": "5.0" },
    "AppleWebKit": { "KHTML": "like Gecko", "version": "537.36" },
    "Chrome": { "version": "52.0.2743.98" },
    "Mobile": {},
    "Safari": { "version": "537.36" }
}

`Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1 (compatible; AdsBot-Google-Mobile; +http://www.google.com/mobile/adsbot.html)`

{
    "Mozilla": { "iPhone": true, "properties": [ "CPU iPhone OS 9_1 like Mac OS X" ], "version": "5.0" },
    "AppleWebKit": { "KHTML": "like Gecko", "version": "601.1.46" },
    "Version": { "version": "9.0" },
    "Mobile": { "version": "13B143" },
    "Safari": { "compatible": true, "AdsBot-Google-Mobile": true, "link": "http://www.google.com/mobile/adsbot.html", "version": "601.1" }
}

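If you expand these, you will see that, as a human, you can easily read off the operating system versions: Mozilla.Windows = NT 6.0, Mozilla.Bada = 1.0, Mozilla.Fedora && Mozilla.Linux = x86_64.
But do you see the problem? None of them says OS = "Windows", OS = "Samsung Bada", and so on.

To ask the question you want to ask, you either need to know all the possible values, as @Peter Wetherall attempted above, or say "I only care about this handful of browsers/OSes", as you do in your question.

If that is acceptable, and you are not using the information to change how the code behaves (which you shouldn't, per @Sophit) but only want to display something about the browser, you can use parseUA() above combined with manual checks such as Mozilla.Windows || Mozilla.Linux || //et cetera. That is less error-prone than trying to get lucky with regular expressions on the raw user agent string, which leads to false positives: see the Comodo_Dragon browser, whose string also contains "Chrome".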
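A minimal sketch of those manual checks might look like the following. It takes an object shaped like the parseUA() output shown earlier; the helper names `describeOs` and `describeBrowser`, the handful of keys they inspect, and the display labels are all assumptions for illustration.

```javascript
// Hypothetical helper: derive a display label for the OS from a parsed user agent.
function describeOs(parsed) {
    const m = parsed.Mozilla || {};
    if (m.Windows) return 'Windows ' + m.Windows;
    if (m.Android) return 'Android ' + m.Android;   // checked before Linux on purpose
    if (m.Bada) return 'Samsung Bada ' + m.Bada;
    if (m.Linux) return m.Linux === true ? 'Linux' : 'Linux ' + m.Linux;
    return 'Unknown OS';
}

// Hypothetical helper: pick the most specific known product token.
function describeBrowser(parsed) {
    // Comodo_Dragon comes first because its string also advertises "Chrome".
    const candidates = ['Comodo_Dragon', 'Firefox', 'Chrome'];
    const name = candidates.find((key) => parsed[key]);
    return name ? name + ' ' + parsed[name].version : 'Unknown browser';
}

// Input copied from the parsed Comodo Dragon example above.
const dragon = {
    Mozilla: { Windows: 'NT 5.1', U: true, 'en-US': true, version: '5.0' },
    AppleWebKit: { KHTML: 'like Gecko', version: '532.5' },
    Comodo_Dragon: { version: '4.1.1.11' },
    Chrome: { version: '4.1.249.1042' },
    Safari: { version: '532.5' }
};
console.log(describeOs(dragon) + ', ' + describeBrowser(dragon));
```

Both functions only know about the keys listed in them, so anything outside that list falls through to the "Unknown" labels.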
