How to perform grouped-by aggregations on an array of objects in fp-ts?

I'm trying to analyze data given as an array of nested objects. I want to use the fp-ts ecosystem, and I'm trying to figure out how to combine a grouped-by calculation with any predefined function (for example, average, median, mode, sum, or standard deviation).

Example

I have an array of objects, where each object holds data about a different student. Here we have 3 students.

const studentsGrades = [
  {
    name: 'john',
    age: 21,
    classes: {
      history: {
        grade: 89,
        semester: 'spring',
        category: 'humanities',
      },
      math: {
        grade: 95,
        semester: 'all_year',
        category: 'quantitative',
      },
      physics: {
        grade: 81,
        semester: 'fall',
        category: 'quantitative',
      },
      literature: {
        grade: 77,
        semester: 'spring',
        category: 'humanities',
      },
    },
  },

  {
    name: 'amanda',
    age: 20,
    classes: {
      history: {
        grade: 95,
        semester: 'spring',
        category: 'humanities',
      },
      math: {
        grade: 99,
        semester: 'all_year',
        category: 'quantitative',
      },
      physics: {
        grade: 89,
        semester: 'fall',
        category: 'quantitative',
      },
      literature: {
        grade: 65,
        semester: 'spring',
        category: 'humanities',
      },
    },
  },

  {
    name: 'rachel',
    age: 19,
    classes: {
      history: {
        grade: 80,
        semester: 'spring',
        category: 'humanities',
      },
      math: {
        grade: 90,
        semester: 'all_year',
        category: 'quantitative',
      },
      physics: {
        grade: 100,
        semester: 'fall',
        category: 'quantitative',
      },
      literature: {
        grade: 88,
        semester: 'spring',
        category: 'humanities',
      },
    },
  },
];

I want to perform different calculations. For example: what is the average grade for physics? What is the median grade for literature? What is the standard deviation of grades in humanities classes?


One way for me to reason about it is to define separate, independent functions that do those calculations on arrays. For example:

average

const calcMean = (arr: number[]): number => {
    return arr.reduce((acc, v, i, a) => acc + v / a.length, 0); // https://stackoverflow.com/a/62372003/6105259
};

median

const calcMedian = (arr: number[]): number | undefined => {
  if (!arr.length) return undefined;
  const s = [...arr].sort((a, b) => a - b);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 === 0 ? ((s[mid - 1] + s[mid]) / 2) : s[mid];
}; // https://stackoverflow.com/a/70806192/6105259

standard deviation

const calcStandardDeviation = (arr: number[]): number => {
  const mean = calcMean(arr);
  const variance = arr.reduce((s, n) => s + (n - mean) ** 2, 0) / (arr.length - 1);
  return Math.sqrt(variance);
}; // https://stackoverflow.com/a/68258974/6105259

Alright, but now what? How can I apply any function of interest (i.e., calcMean(), calcMedian(), or calcStandardDeviation()) to studentsGrades to answer my analysis questions by grouping by the relevant key?

If you're using fp-ts, you should return an Option from calcMedian instead of returning undefined. It's also good practice to type parameters as readonly arrays when the function doesn't modify the array:

import * as O from 'fp-ts/Option';

const calcMean = (arr: readonly number[]): number => {
  return arr.reduce((acc, v) => acc + v, 0) / arr.length;
};

const calcMedian = (arr: readonly number[]): O.Option<number> => {
  if (!arr.length) return O.none;
  const sorted = [...arr].sort((a, b) => a - b);
  const mid = Math.trunc(sorted.length / 2);
  return O.some(
    sorted.length % 2 === 0
      ? (sorted[mid - 1]! + sorted[mid]!) / 2
      : sorted[mid]!
  );
};

const calcStandardDeviation = (arr: readonly number[]): number => {
  const mean = calcMean(arr);
  const variance = arr.reduce((s, n) => s + (n - mean) ** 2, 0) / (arr.length - 1);
  return Math.sqrt(variance);
};
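Because calcMedian now returns an Option, callers have to decide what happens when there are no grades. The following is a minimal sketch of two common ways to do that, using O.match and O.getOrElse from fp-ts/Option. The helper names describeMedian and medianOrNaN are hypothetical and not part of the original answer; calcMedian is repeated only so that the snippet stands alone.

```typescript
import * as O from 'fp-ts/Option';
import {pipe} from 'fp-ts/function';

// calcMedian as defined above, repeated so that this snippet is self-contained.
const calcMedian = (arr: readonly number[]): O.Option<number> => {
  if (!arr.length) return O.none;
  const sorted = [...arr].sort((a, b) => a - b);
  const mid = Math.trunc(sorted.length / 2);
  return O.some(
    sorted.length % 2 === 0
      ? (sorted[mid - 1]! + sorted[mid]!) / 2
      : sorted[mid]!
  );
};

// Hypothetical helper: handle both cases explicitly and produce a string.
const describeMedian = (grades: readonly number[]): string =>
  pipe(
    calcMedian(grades),
    O.match(
      () => 'no grades',
      median => `median: ${median}`
    )
  );

// Hypothetical helper: fall back to a default when the input is empty.
const medianOrNaN = (grades: readonly number[]): number =>
  pipe(calcMedian(grades), O.getOrElse(() => NaN));

console.log(describeMedian([77, 65, 88])); // median: 77
console.log(describeMedian([])); // no grades
```

Note that calcMean still returns NaN for an empty array, and calcStandardDeviation returns NaN for a single value, so the same Option treatment could be applied to them if those inputs can occur.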

To get the grades for a given subject or category:

import * as RA from 'fp-ts/ReadonlyArray';
import * as O from 'fp-ts/Option';
import {pipe} from 'fp-ts/function';

const gradesByClass = (className: string): readonly number[] =>
  pipe(
    studentsGrades,
    RA.filterMap(({classes}) => O.fromNullable(classes[className]?.grade))
  );

const gradesByCategory = (categoryName: string): readonly number[] =>
  pipe(
    studentsGrades,
    RA.chain(({classes}) => Object.values(classes)),
    RA.filterMap(({category, grade}) => category === categoryName ? O.some(grade) : O.none)
  );
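gradesByClass and gradesByCategory each pull out one group at a time. If an aggregate for every group at once is wanted, one possible approach is groupBy from fp-ts/NonEmptyArray followed by map from fp-ts/Record. The following is only a sketch under stated assumptions: the Student and GradeRow types and the toRows, aggregateBy, and mean helpers are hypothetical names introduced here, not part of the original answer, and the sample data is a cut-down copy of studentsGrades.

```typescript
import * as A from 'fp-ts/Array';
import * as NEA from 'fp-ts/NonEmptyArray';
import * as R from 'fp-ts/Record';
import {pipe} from 'fp-ts/function';

// Assumed shape of one student, matching studentsGrades in the question.
interface Student {
  name: string;
  age: number;
  classes: Record<string, {grade: number; semester: string; category: string}>;
}

// Hypothetical flat row: one entry per (student, class) pair.
interface GradeRow {
  subject: string;
  category: string;
  grade: number;
}

// Flatten the nested classes objects into rows.
const toRows = (students: Student[]): GradeRow[] =>
  pipe(
    students,
    A.chain(({classes}) =>
      Object.entries(classes).map(([subject, {category, grade}]) => ({subject, category, grade}))
    )
  );

// Group rows by a string key, then reduce each group's grades with `aggregate`.
const aggregateBy =
  <B>(key: (row: GradeRow) => string, aggregate: (grades: number[]) => B) =>
  (rows: GradeRow[]): Record<string, B> =>
    pipe(
      rows,
      NEA.groupBy(key),
      R.map((group: NEA.NonEmptyArray<GradeRow>) => aggregate(group.map(row => row.grade)))
    );

// Cut-down sample: only history and physics for each student.
const sample: Student[] = [
  {name: 'john', age: 21, classes: {
    history: {grade: 89, semester: 'spring', category: 'humanities'},
    physics: {grade: 81, semester: 'fall', category: 'quantitative'},
  }},
  {name: 'amanda', age: 20, classes: {
    history: {grade: 95, semester: 'spring', category: 'humanities'},
    physics: {grade: 89, semester: 'fall', category: 'quantitative'},
  }},
  {name: 'rachel', age: 19, classes: {
    history: {grade: 80, semester: 'spring', category: 'humanities'},
    physics: {grade: 100, semester: 'fall', category: 'quantitative'},
  }},
];

const mean = (xs: number[]): number => xs.reduce((acc, x) => acc + x, 0) / xs.length;

const meanBySubject = pipe(sample, toRows, aggregateBy(row => row.subject, mean));
const meanByCategory = pipe(sample, toRows, aggregateBy(row => row.category, mean));

console.log(meanBySubject); // { history: 88, physics: 90 }
console.log(meanByCategory); // { humanities: 88, quantitative: 90 }
```

Since each group produced by groupBy is a NonEmptyArray, the aggregate function never sees an empty group, which sidesteps the empty-input case for the mean and median.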

You can then combine gradesByClass and gradesByCategory with the calculation functions like this:

calcMean(gradesByClass('physics'))
calcMedian(gradesByClass('literature'))
calcStandardDeviation(gradesByCategory('humanities'))
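For the sample data, these three calls can be checked by hand. The sketch below is a self-contained check rather than part of the original answer: it hard-codes the arrays that gradesByClass and gradesByCategory would return for studentsGrades, and it uses plain-TypeScript stand-ins (the names mean, medianNonEmpty, and sampleStdDev are hypothetical) so that it runs without fp-ts.

```typescript
// Grades copied by hand from studentsGrades in the question.
const physics = [81, 89, 100];
const literature = [77, 65, 88];
// History and literature grades, in the order students and classes appear.
const humanities = [89, 77, 95, 65, 80, 88];

// Arithmetic mean, same formula as calcMean in the answer.
const mean = (xs: readonly number[]): number =>
  xs.reduce((acc, x) => acc + x, 0) / xs.length;

// Median of a non-empty array; the empty case is what Option covers in the answer.
const medianNonEmpty = (xs: readonly number[]): number => {
  const sorted = [...xs].sort((a, b) => a - b);
  const mid = Math.trunc(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1]! + sorted[mid]!) / 2
    : sorted[mid]!;
};

// Sample standard deviation (divides by n - 1), as in calcStandardDeviation.
const sampleStdDev = (xs: readonly number[]): number => {
  const m = mean(xs);
  return Math.sqrt(xs.reduce((s, x) => s + (x - m) ** 2, 0) / (xs.length - 1));
};

console.log(mean(physics)); // 90
console.log(medianNonEmpty(literature)); // 77
console.log(sampleStdDev(humanities).toFixed(2)); // 10.69
```

With the answer's own functions, the median comes back wrapped, so calcMedian(gradesByClass('literature')) evaluates to O.some(77) rather than the bare number.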
