Is there a better way to merge row data based on criteria (Google Apps Script)?

Alright... Let me start by saying I am self-taught with Google Apps Script... enough said, right!? The script below is functioning, but I want to optimize it or come up with another way to achieve the same result. The script takes 18000 rows and 86 columns of data and combines them into single rows based on an id list. The id list is about 13000 rows long. The short version is this: it filters the data by id, then checks each column for the last row with submitted data and returns that cell. For example:

//sample data
[[311112, 1, 2, 4, 5,"","","","","","", 2, 3],
[323223,"","","","","", 2, 4, 4,"","","",""],
[321321, 1, 2, 4, 5,"","","","","","", 2, 3],
[311112, 4, 1, 6, 7,"", 3,"", 3,"","", 5, 3],
[321233,"","","","","","", 4, 3, 1, 5,"",""],
[321321,"","","","","","","","", 1 ,4,"",""],
[323223,"","","","","", 2, 3,"","","","",""],
[323153,"", 2, 3, 6,"","","","","","","",""],
[321321,"","","","","", 2, 3,"","","","",""],
[321321,"", 5, 3,"", 1,"","","","","","",""]]

//filtered Data by id 321321
[[321321, 1, 2, 4, 5,"","","","","","", 2, 3],
[321321,"","","","","","","","", 1, 4,"",""],
[321321,"","","","","", 2, 3,"","","","",""],
[321321,"", 5, 3,"", 1,"","","","","","",""]]

// returned row is getting the last nonempty value for each column from the filtered data.

[[321321, 1, 5, 3, 5, 1, 2, 3,"", 1, 4, 2, 3]]

It takes about 16-18 minutes for the script to complete. Is there a better way to accomplish this, or any optimization suggestions?

function combineR(startRow, startRange) {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var sheets = ss.getSheets();
  var testSheet = ss.getSheetByName('Raw Scores');
  var cSheet = ss.getSheetByName('Combined Scores');
  var gradingResults = testSheet.getRange(1, 1, testSheet.getLastRow(), testSheet.getLastColumn()).getValues();

  if (startRow > cSheet.getLastRow()) {
    return;
  }

  if (startRow + startRange > cSheet.getLastRow()) {
    startRange = cSheet.getLastRow() - startRow;
  }

  var sID = cSheet.getRange(startRow, 2, startRange).getValues();
  var maxScores = [];
  for (var x = 0; x < sID.length; x++) {
    var filtered = gradingResults.filter(function (dataRow) {
      return dataRow[0] === sID[x][0];
    });

    if (isFinite(filtered)) {
      maxScores.push(['', '', '', '', '', '', '', '', '', '',
        '', '', '', '', '', '', '', '', '', '',
        '', '', '', '', '', '', '', '', '', '',
        '', '', '', '', '', '', '', '', '', '',
        '', '', '', '', '']);
      continue;
    } else {
      maxScores.push(['', getMaxLetter(filtered, 3), lastGraded(filtered, 4), lastGraded(filtered, 5), lastGraded(filtered, 6), lastGraded(filtered, 7), lastGraded(filtered, 8), lastGraded(filtered, 9), lastGraded(filtered, 10), lastGraded(filtered, 11),
        lastGraded(filtered, 12), lastGraded(filtered, 13), lastGraded(filtered, 14), lastGraded(filtered, 15), lastGraded(filtered, 16), lastGraded(filtered, 17), lastGraded(filtered, 18), lastGraded(filtered, 19), lastGraded(filtered, 20), lastGraded(filtered, 21),
        lastGraded(filtered, 22), lastGraded(filtered, 23), lastGraded(filtered, 24), lastGraded(filtered, 25), lastGraded(filtered, 26), lastGraded(filtered, 27), lastGraded(filtered, 28), lastGraded(filtered, 29), lastGraded(filtered, 30), lastGraded(filtered, 31),
        lastGraded(filtered, 32), lastGraded(filtered, 33), lastGraded(filtered, 34), lastGraded(filtered, 35), lastGraded(filtered, 36), lastGraded(filtered, 37), lastGraded(filtered, 38), lastGraded(filtered, 39), lastGraded(filtered, 40), lastGraded(filtered, 41),
        lastGraded(filtered, 42), lastGraded(filtered, 43), lastGraded(filtered, 44), lastGraded(filtered, 45), lastGraded(filtered, 46)]);
    }
  }
  cSheet.getRange(startRow, 11, maxScores.length, maxScores[0].length).setValues(maxScores)
}

function getMaxLetter(arr, idx) {
  var letter = arr.map(function (e) { return e[idx] }).sort().pop();
  return letter;
}

function lastGraded(arr, idx) {
  var newArray = arr.map(function (e) { return e[idx] });
  newArray.reverse();
  for (var x = 0; x < newArray.length; x++) {
    if (typeof newArray[x] == 'number') {
      return newArray[x];
    }
  }
  return '';
}

Raw data: column A has duplicate ids that need to be merged.

Combined data: column B has the unique values that are the final merged product.

Issues:

The script seems to have various issues, but the main one seems to be calling the lastGraded function many times with various indexes. Each call does map, reverse and everything else for that one index, which costs time.
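To make that cost concrete, here is a minimal sketch of the same per-column lookup done with a plain backward loop, so no intermediate array is built or reversed on each call. The name lastGradedBackward is hypothetical (it is not in the question's script), and it keeps the original rule that only numbers count as graded values.

```javascript
// Hypothetical drop-in for lastGraded: walk the rows from last to first
// and return the first numeric value found in column idx.
// No map() copy and no reverse() are needed.
function lastGradedBackward(arr, idx) {
  for (let i = arr.length - 1; i >= 0; --i) {
    if (typeof arr[i][idx] === 'number') {
      return arr[i][idx];
    }
  }
  return ''; // no numeric value in this column
}
```

This only trims the per-column work; the filter over all rows for every id is still there, which is what the approach below addresses.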

Solution:

Given your sample data, I propose the following approach:

  • Get all the input data in one 2D array.

  • Reduce the input data to a Map. The map will have each id as a key, and all the rows that match that id as a 2D array for that key. This will greatly increase performance/speed at the cost of memory. This is better than filtering the array by each id, because:

    • You loop the input array only once,
    • whereas arr.filter has to loop the whole array for each id (with 18000 rows and 13000 ids, that is roughly 234 million row checks).
  • Once reduced to a map, loop through each array in the map: for each empty element in the last row, walk the rows in reverse to find the last non-empty element in that column.

Sample snippet:

const arrMain =
  //sample data
  [
    [311112, 1, 2, 4, 5, '', '', '', '', '', '', 2, 3],
    [323223, '', '', '', '', '', 2, 4, 4, '', '', '', ''],
    [321321, 1, 2, 4, 5, '', '', '', '', '', '', 2, 3],
    [311112, 4, 1, 6, 7, '', 3, '', 3, '', '', 5, 3],
    [321233, '', '', '', '', '', '', 4, 3, 1, 5, '', ''],
    [321321, '', '', '', '', '', '', '', '', 1, 4, '', ''],
    [323223, '', '', '', '', '', 2, 3, '', '', '', '', ''],
    [323153, '', 2, 3, 6, '', '', '', '', '', '', '', ''],
    [321321, '', '', '', '', '', 2, 3, '', '', '', '', ''],
    [321321, '', 5, 3, '', 1, '', '', '', '', '', '', ''],
  ];

//reduce input array to a map of id=>rows
const map = arrMain.reduce((map, row) => {
  if (!map.has(row[0])) map.set(row[0], [row]);
  else map.get(row[0]).push(row);
  return map;
}, new Map());

const out = [];
map.forEach(arr2d => {
  const l = arr2d.length - 1,
    lastRow = arr2d[l].slice(0);
  //iterate lastrow of this id's column elements
  for (let j = 0; j < lastRow.length; ++j) {
    if (lastRow[j] === '') {
      //iterate each row of this id
      for (let i = l; i >= 0; --i) {
        if (arr2d[i][j] !== '') {
          lastRow[j] = arr2d[i][j];
          break;
        }
      }
    }
  }
  out.push(lastRow);
});
console.log(out);
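As a rough sketch of how the snippet might be wired into the sheets from the question, the grouping and merging can live in a pure helper that a thin Apps Script wrapper calls. The sheet names 'Raw Scores' and 'Combined Scores' and the id column (B) come from the question. Everything else is an assumption: the names mergeLastNonEmpty and combineAll, the V8 runtime, and the 'Merged Output' sheet, which is a hypothetical destination that must already exist. The question's getMaxLetter column and its column offsets are not reproduced here.

```javascript
// Pure helper (hypothetical name): group rows by id (column 0), then merge
// each group so every column holds its last non-empty value.
// Returns a Map of id => merged row.
function mergeLastNonEmpty(rows) {
  const groups = rows.reduce((map, row) => {
    if (!map.has(row[0])) map.set(row[0], [row]);
    else map.get(row[0]).push(row);
    return map;
  }, new Map());

  const merged = new Map();
  groups.forEach((arr2d, id) => {
    const last = arr2d.length - 1;
    const result = arr2d[last].slice(0); // start from a copy of the last row
    for (let j = 0; j < result.length; ++j) {
      // walk upwards only while this column is still empty
      for (let i = last; result[j] === '' && i >= 0; --i) {
        if (arr2d[i][j] !== '') result[j] = arr2d[i][j];
      }
    }
    merged.set(id, result);
  });
  return merged;
}

// Apps Script wrapper (hypothetical): one read of each sheet, one write.
function combineAll() {
  const ss = SpreadsheetApp.getActiveSpreadsheet();
  const raw = ss.getSheetByName('Raw Scores').getDataRange().getValues();
  const cSheet = ss.getSheetByName('Combined Scores');
  // ids are read from column B, as in the question's script
  const ids = cSheet.getRange(1, 2, cSheet.getLastRow(), 1).getValues();

  const merged = mergeLastNonEmpty(raw);
  const width = raw[0].length;
  const blank = new Array(width).fill('');
  // keep the order of the id list; ids with no raw rows get a blank row
  const output = ids.map(([id]) => merged.get(id) || blank);

  // 'Merged Output' is an assumed destination sheet, not from the question
  ss.getSheetByName('Merged Output')
    .getRange(1, 1, output.length, width)
    .setValues(output);
}
```

Keeping the merge logic free of SpreadsheetApp calls means it can be checked against the sample data outside the spreadsheet, and the wrapper touches the sheets only three times regardless of how many ids there are.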
