Count occurrences of an array's words (12k elements) in another multidimensional array (20k elements) in JS

I've been trying to get this to work, but the only approach I found uses a regex, and it either runs out of memory or throws this error: Uncaught SyntaxError: Invalid regular expression: \b + +\b: Nothing to repeat.

This is the function:

function countSearchTerms() {
  const filteredTerms = nGramsSht.getRange(6, 2, nGramsSht.getLastRow() - 5, 1).getValues().filter(e => e != '');
  const searchTermData = nGramFinalDataSht.getRange(1, 1, nGramFinalDataSht.getLastRow(), 1).getValues().filter(e => e != '');

  let occurrences = [];
  for (let r = 0; r < filteredTerms.length; r++) {
    let count = 0;
    for (let a = 0; a < searchTermData.length; a++) {
      if ((new RegExp("\\b" + filteredTerms[r].toString() + "\\b").test(searchTermData[a]))) {
        count++;
      }
    }
    occurrences.push([count])
  }

  if (occurrences.length > 0) {
    nGramsSht.getRange(6, 3, nGramsSht.getLastRow() - 5, 1).clearContent();
    nGramsSht.getRange(6, 3, occurrences.length, 1).setValues(occurrences);
  }
}

I'd use this answer's approach, but how do I count the words in a that occur in b?

function wordcount() {
  const ss = SpreadsheetApp.getActive();
  const sh = ss.getSheetByName("Sheet0");
  sh.clearContents();
  const a = [["A"], ["earth"], ["20"], ["tunnel"], ["house"], ["earth A"], ["$100"], ["house $100"]];
  const b = [["A"], ["A Plane is expensive"], ["peaceful earth"], ["20 years"], ["tunnel"], ["tiny house"], ["earth B612"], ["$100"], ["house $100"]];
  sh.getRange(1, 1, a.length, a[0].length).setValues(a);
  let o = [...new Set(a.slice().flat().join(' ').split(' '))].map(w => [w, sh.createTextFinder(w).matchCase(true).findAll().length]);
  o.unshift(["Words", "Count"]);
  sh.getRange(sh.getLastRow() + 2, 1, o.length, o[0].length).setValues(o);
}

Thanks a lot!

Try it this way:

function wordcount() {
  const ss = SpreadsheetApp.getActive();
  const sh = ss.getSheetByName("Sheet0");
  sh.clearContents();
  const a = [["A"], ["earth"], ["20"], ["tunnel"], ["house"], ["earth A"], ["$100"], ["house $100"]];
  const b = [["A"], ["A Plane is expensive"], ["peaceful earth"], ["20 years"], ["tunnel"], ["tiny house"], ["earth B612"], ["$100"], ["house $100"]];
  // Write b (the text to search) to the sheet so the TextFinder can scan it.
  sh.getRange(1, 1, b.length, b[0].length).setValues(b);
  // Split a into unique words, then pair each word with the number of cells that contain it (case-sensitive).
  let o = [...new Set(a.slice().flat().join(' ').split(' '))].map(w => [w, sh.createTextFinder(w).matchCase(true).findAll().length]);
  o.unshift(["Words", "Count"]);
  // Leave one blank row, then write the results below the searched data.
  sh.getRange(sh.getLastRow() + 2, 1, o.length, o[0].length).setValues(o);
}

Search
A
A Plane is expensive
peaceful earth
20 years
tunnel
tiny house
earth B612
$100
house $100

Words   Count
A       2
earth   2
20      1
tunnel  1
house   2
$100    2
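
For comparison, the same counts can be reproduced in plain JavaScript without writing b to a sheet. This is only a sketch: `countWords` is a hypothetical helper name, and it assumes the TextFinder call above behaves like a case-sensitive substring match per cell, which is what the table suggests. It counts cells that contain the word, not the total number of occurrences.

```javascript
// a: cells that supply the words; b: cells to search.
// Both are 2D arrays shaped like getValues() output.
function countWords(a, b) {
  // Unique words from a, in first-seen order, ignoring empty strings.
  const words = [...new Set(a.flat().join(" ").split(" "))].filter(w => w !== "");
  const cells = b.flat().map(String);
  // For each word, count the cells of b that contain it (case-sensitive substring).
  return words.map(w => [w, cells.filter(cell => cell.includes(w)).length]);
}

const a = [["A"], ["earth"], ["20"], ["tunnel"], ["house"], ["earth A"], ["$100"], ["house $100"]];
const b = [["A"], ["A Plane is expensive"], ["peaceful earth"], ["20 years"], ["tunnel"], ["tiny house"], ["earth B612"], ["$100"], ["house $100"]];
console.log(countWords(a, b));
```

In Apps Script, a and b would come from `getValues()` and the result could be written back with a single `setValues()` call, which avoids one TextFinder search per word.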

Maybe your data makes the regexp fail. A term that contains regex metacharacters, such as the + in the error message, has to be escaped before it goes into the pattern, and you should also check that the term is not empty. I hope it helps.

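// Prefix every regex metacharacter with a backslash so the term is matched literally.
function escapeRegExp(string) {
  return string.replace(/([.*+?^=!:${}()|\[\]\/\\])/g, "\\$1");
}

// Inside the inner loop of countSearchTerms(), in place of the original test:
const term = escapeRegExp(filteredTerms[r].toString().trim());
// Skip empty terms: with an empty term the pattern is \b\b, which matches any text containing a word character.
if (term && (new RegExp("\\b" + term + "\\b").test(searchTermData[a]))) {
  count++;
}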
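
Putting the pieces together, here is a minimal sketch of the whole counting step as a pure function. `countTermOccurrences` is a hypothetical name, and the inputs are assumed to be single-column 2D arrays like the ones `getValues()` returns. Unlike the loop in the question, it builds each RegExp once per term instead of once per term and row, which may also help with the memory problem, though that has not been measured here.

```javascript
// Escape regex metacharacters so a term is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// terms and rows are assumed to look like getValues() output: [[cell], [cell], ...].
// Returns one [count] row per term, ready for setValues().
function countTermOccurrences(terms, rows) {
  // Convert every row to a string once, up front.
  const texts = rows.map(row => String(row[0]));
  return terms.map(termRow => {
    const term = String(termRow[0]).trim();
    if (!term) return [0]; // empty or whitespace-only term: nothing to count
    // Compile the pattern once per term, not once per (term, row) pair.
    const pattern = new RegExp("\\b" + escapeRegExp(term) + "\\b");
    let count = 0;
    for (const text of texts) {
      if (pattern.test(text)) count++;
    }
    return [count];
  });
}

const terms = [["earth"], ["+ +"], [" "], ["tunnel"]];
const rows = [["peaceful earth"], ["earth B612"], ["earthquake"], ["tunnel"]];
console.log(countTermOccurrences(terms, rows));
```

One caveat: `\b` only matches between a word character and a non-word character, so a term that starts or ends with a symbol (for example `$100` or `+ +`) will not match after a space or at the start of the text even once it is escaped. Such terms need a different boundary test.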

