How to count word occurrences from a list of URLs in JavaScript?

Question

I have a list of URLs in a JSON object in WordPress. I want to count the occurrences of the second part of each URL.

The code below currently gets the rest of the URL after the prefix https://www.example.co. What I want to do next is count the occurrences of the second part of the URL, which is cat1, cat3, cat2, or xmlrpc.php.

var urlList = [
  {
    "URL": "https://www.example.co/cat1/aa/bb/cc",
    "Last crawled": "Jun 23, 2019"
  },
  {
    "URL": "https://www.example.co/cat2/aa",
    "Last crawled": "Jun 23, 2019"
  },
  {
    "URL": "https://www.example.co/cat1/aa/bb/cc/dd/ee",
    "Last crawled": "Jun 23, 2019"
  },
  {
    "URL": "https://www.example.co/cat3/aa/bb/cc/",
    "Last crawled": "Jun 23, 2019"
  },
  {
    "URL": "https://www.example.co/cat2/aa/bb",
    "Last crawled": "Jun 23, 2019"
  },
  {
    "URL": "https://www.example.co/cat1/aa/bb",
    "Last crawled": "Jun 23, 2019"
  },
  {
    "URL": "https://www.example.co/xmlrpc.php",
    "Last crawled": "Jun 19, 2019"
  }
]

const paths = urlList.map(value => value.URL.replace('https://www.example.co', ''));

//console.log(paths);

paths.forEach(function(item) {
    var urlSecondPart = item.split("/")[1];
    console.log(urlSecondPart);
});

Do you know how I can achieve that with my current forEach loop?

Any help is greatly appreciated. Thanks

Answer 1

Use a regular expression to match the non-/ characters that come after the .co/:

var urlList = [
  { "URL": "https://www.example.co/cat1/aa/bb/cc", "Last crawled": "Jun 23, 2019" },
  { "URL": "https://www.example.co/cat2/aa", "Last crawled": "Jun 23, 2019" },
  { "URL": "https://www.example.co/cat1/aa/bb/cc/dd/ee", "Last crawled": "Jun 23, 2019" },
  { "URL": "https://www.example.co/cat3/aa/bb/cc/", "Last crawled": "Jun 23, 2019" },
  { "URL": "https://www.example.co/cat2/aa/bb", "Last crawled": "Jun 23, 2019" },
  { "URL": "https://www.example.co/cat1/aa/bb", "Last crawled": "Jun 23, 2019" },
  { "URL": "https://www.example.co/xmlrpc.php", "Last crawled": "Jun 19, 2019" }
];

// Capture the first path segment that follows ".co/".
const paths = urlList.map(
  ({ URL }) => URL.match(/\.co\/([^\/]+)/)[1]
);
console.log(paths);

// Tally how many times each segment occurs.
const counts = paths.reduce((a, str) => {
  a[str] = (a[str] || 0) + 1;
  return a;
}, {});
console.log(counts);
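Since the question asks about the existing forEach loop, the same tally can also be sketched without reduce. This is only a minimal illustration: the `prefix` and `counts` names are assumptions introduced here, and the list is trimmed to the URL field.

```javascript
// URLs copied from the question; "Last crawled" is omitted for brevity.
const urlList = [
  { URL: "https://www.example.co/cat1/aa/bb/cc" },
  { URL: "https://www.example.co/cat2/aa" },
  { URL: "https://www.example.co/cat1/aa/bb/cc/dd/ee" },
  { URL: "https://www.example.co/cat3/aa/bb/cc/" },
  { URL: "https://www.example.co/cat2/aa/bb" },
  { URL: "https://www.example.co/cat1/aa/bb" },
  { URL: "https://www.example.co/xmlrpc.php" }
];

const prefix = "https://www.example.co"; // assumed fixed prefix, as in the question
const counts = {};

urlList.forEach(function (entry) {
  // Strip the prefix, then take the segment right after the leading "/".
  const urlSecondPart = entry.URL.replace(prefix, "").split("/")[1];
  // Start at 0 the first time a segment is seen, then increment.
  counts[urlSecondPart] = (counts[urlSecondPart] || 0) + 1;
});

console.log(counts);
// { cat1: 3, cat2: 2, cat3: 1, 'xmlrpc.php': 1 }
```

The counter object lives outside the loop so that every iteration updates the same tally.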

On newer engines, you can use lookbehind instead of extracting the capture group:

const paths = urlList.map(
  ({ URL }) => URL.match(/(?<=\.co\/)[^\/]+/)[0]
);
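Both regex variants throw if a URL does not contain ".co/", because match then returns null. As a possible alternative (a different technique, not the one used in this answer), the built-in URL constructor can parse the path without depending on the domain. The helper name `firstSegment` is hypothetical, and the sketch reads `entry.URL` rather than destructuring `{ URL }`, since that would shadow the global constructor.

```javascript
// URLs copied from the question; "Last crawled" is omitted for brevity.
const urlList = [
  { URL: "https://www.example.co/cat1/aa/bb/cc" },
  { URL: "https://www.example.co/cat2/aa" },
  { URL: "https://www.example.co/cat1/aa/bb/cc/dd/ee" },
  { URL: "https://www.example.co/cat3/aa/bb/cc/" },
  { URL: "https://www.example.co/cat2/aa/bb" },
  { URL: "https://www.example.co/cat1/aa/bb" },
  { URL: "https://www.example.co/xmlrpc.php" }
];

// Hypothetical helper: returns the first path segment, or null if there is none.
function firstSegment(href) {
  // pathname looks like "/cat1/aa/bb"; filter(Boolean) drops empty pieces
  // produced by the leading and trailing slashes.
  const segments = new URL(href).pathname.split("/").filter(Boolean);
  return segments.length > 0 ? segments[0] : null;
}

const paths = urlList.map(entry => firstSegment(entry.URL));
console.log(paths);
// [ 'cat1', 'cat2', 'cat1', 'cat3', 'cat2', 'cat1', 'xmlrpc.php' ]
```

A URL with no path, such as https://www.example.co, yields null here instead of raising a TypeError.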

If you want to keep track of all full URLs used, reduce not only into a count, but also into an array of those full URLs:

var urlList = [
  { "URL": "https://www.example.co/cat1/aa/bb/cc", "Last crawled": "Jun 23, 2019" },
  { "URL": "https://www.example.co/cat2/aa", "Last crawled": "Jun 23, 2019" },
  { "URL": "https://www.example.co/cat1/aa/bb/cc/dd/ee", "Last crawled": "Jun 23, 2019" },
  { "URL": "https://www.example.co/cat3/aa/bb/cc/", "Last crawled": "Jun 23, 2019" },
  { "URL": "https://www.example.co/cat2/aa/bb", "Last crawled": "Jun 23, 2019" },
  { "URL": "https://www.example.co/cat1/aa/bb", "Last crawled": "Jun 23, 2019" },
  { "URL": "https://www.example.co/xmlrpc.php", "Last crawled": "Jun 19, 2019" }
];

// Capture the first path segment that follows ".co/".
const getSecond = url => url.match(/\.co\/([^\/]+)/)[1];

const counts = urlList.reduce((a, { URL }) => {
  const second = getSecond(URL);
  // Create the entry the first time this segment is seen.
  if (!a[second]) {
    a[second] = { count: 0, fullUrls: [] };
  }
  a[second].count++;
  a[second].fullUrls.push(URL);
  return a;
}, {});
console.log(counts);
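As a usage illustration, the grouped result can be turned into a list ordered by frequency. This is a sketch only: the names `grouped` and `ranked` are assumptions introduced here, and the list is trimmed to the URL field.

```javascript
// URLs copied from the question; "Last crawled" is omitted for brevity.
const urlList = [
  { URL: "https://www.example.co/cat1/aa/bb/cc" },
  { URL: "https://www.example.co/cat2/aa" },
  { URL: "https://www.example.co/cat1/aa/bb/cc/dd/ee" },
  { URL: "https://www.example.co/cat3/aa/bb/cc/" },
  { URL: "https://www.example.co/cat2/aa/bb" },
  { URL: "https://www.example.co/cat1/aa/bb" },
  { URL: "https://www.example.co/xmlrpc.php" }
];

const getSecond = url => url.match(/\.co\/([^\/]+)/)[1];

// Same grouping as above: a count plus the full URLs per segment.
const grouped = urlList.reduce((a, entry) => {
  const second = getSecond(entry.URL);
  if (!a[second]) {
    a[second] = { count: 0, fullUrls: [] };
  }
  a[second].count++;
  a[second].fullUrls.push(entry.URL);
  return a;
}, {});

// Order the segments from most to least frequent and format them.
const ranked = Object.entries(grouped)
  .sort(([, a], [, b]) => b.count - a.count)
  .map(([name, { count }]) => `${name}: ${count}`);

console.log(ranked);
console.log(grouped.cat2.fullUrls); // the two cat2 URLs, in input order
```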

Answer 1
1, accepted, 2019-07-02 03:00:52
