
What is MATLAB's algorithm to calculate a histogram with the hist function?

I'm translating some old MATLAB code to C++. I noticed that my custom histogram function, which is supposed to be equivalent to MATLAB's [counts,centers] = hist(___), gives different results. I could not find a bug in my implementation, so I used MATLAB Coder to generate a C++ function from the MATLAB code and compared it with my own C++ code. Here is the simple MATLAB function I used to generate the C++ code:

function [counts, centers] = my_hist(values, bins)
    [counts, centers] = hist(values, bins);
    disp(centers);
    disp(counts);
end

And a script that calls it, so that MATLAB can define the inputs:

values = rand(1,1000);
bins = linspace(0.05, 0.95, 10);

[counts, centers] = my_hist(values, bins);

Based on the above, Coder generates this function:

//
// File: my_hist.cpp
//
// MATLAB Coder version            : 5.3
// C/C++ source code generated on  : 17-Nov-2022 15:46:17
//

// Include Files
#include "my_hist.h"
#include "rt_nonfinite.h"
#include <algorithm>
#include <cmath>
#include <cstring>
#include <math.h>

// Function Definitions
//
// MY_HIST Summary of this function goes here
//    Detailed explanation goes here
//
// Arguments    : const double values[1000]
//                const double bins[10]
//                double counts[10]
//                double centers[10]
// Return Type  : void
//
void my_hist(const double values[1000], const double bins[10],
             double counts[10], double centers[10])
{
  double edges[11];
  double nn[11];
  double absx;
  int k;
  int low_i;
  std::copy(&bins[0], &bins[10], &centers[0]);
  for (k = 0; k < 9; k++) {
    absx = bins[k];
    edges[k + 1] = absx + (bins[k + 1] - absx) / 2.0;
  }
  edges[0] = rtMinusInf;
  edges[10] = rtInf;
  for (k = 0; k < 9; k++) {
    double absx_tmp;
    absx_tmp = edges[k + 1];
    absx = std::abs(absx_tmp);
    if ((!std::isinf(absx)) && (!std::isnan(absx))) {
      if (absx <= 2.2250738585072014E-308) {
        absx = 4.94065645841247E-324;
      } else {
        frexp(absx, &low_i);
        absx = std::ldexp(1.0, low_i - 53);
      }
    } else {
      absx = rtNaN;
    }
    edges[k + 1] = absx_tmp + absx;
  }
  std::memset(&nn[0], 0, 11U * sizeof(double));
  low_i = 1;
  int exitg1;
  do {
    exitg1 = 0;
    if (low_i + 1 < 12) {
      if (!(edges[low_i] >= edges[low_i - 1])) {
        for (low_i = 0; low_i < 11; low_i++) {
          nn[low_i] = rtNaN;
        }
        exitg1 = 1;
      } else {
        low_i++;
      }
    } else {
      for (k = 0; k < 1000; k++) {
        low_i = 0;
        absx = values[k];
        if (!std::isnan(absx)) {
          if ((absx >= edges[0]) && (absx < edges[10])) {
            int high_i;
            int low_ip1;
            low_i = 1;
            low_ip1 = 2;
            high_i = 11;
            while (high_i > low_ip1) {
              int mid_i;
              mid_i = (low_i + high_i) >> 1;
              if (values[k] >= edges[mid_i - 1]) {
                low_i = mid_i;
                low_ip1 = mid_i + 1;
              } else {
                high_i = mid_i;
              }
            }
          }
          if (values[k] == edges[10]) {
            low_i = 11;
          }
        }
        if (low_i > 0) {
          nn[low_i - 1]++;
        }
      }
      exitg1 = 1;
    }
  } while (exitg1 == 0);
  std::copy(&nn[0], &nn[10], &counts[0]);
  counts[9] += nn[10];
}

//
// File trailer for my_hist.cpp
//
// [EOF]
//

I don't understand what happens in this chunk of code, or why it is done:

  for (k = 0; k < 9; k++) {
    double absx_tmp;
    absx_tmp = edges[k + 1];
    absx = std::abs(absx_tmp);
    if ((!std::isinf(absx)) && (!std::isnan(absx))) {
      if (absx <= 2.2250738585072014E-308) {
        absx = 4.94065645841247E-324;
      } else {
        frexp(absx, &low_i);
        absx = std::ldexp(1.0, low_i - 53);
      }
    } else {
      absx = rtNaN;
    }
    edges[k + 1] = absx_tmp + absx;
  }

The function shifts the bin edges, but how and why? I would be grateful for help and an explanation!

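That bit of code adds eps to each bin edge except the first and last. More precisely, it adds eps(edge), the spacing between doubles at that edge's magnitude: frexp extracts the binary exponent e of the edge, and ldexp(1.0, e - 53) returns 2^(e - 53), which is eps for a value with that exponent. The absx <= 2.2250738585072014E-308 branch handles zero and subnormal values, where the spacing is the smallest positive double, 4.94065645841247E-324. The first and last edges are -Inf and Inf, so they are left alone.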
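
As a rough illustration of what that loop body computes, here is a minimal C++ sketch. The names eps_of and shift_edge are made up for this example and do not appear in the generated code; the sketch only mirrors the branches shown in the question and is not MATLAB's own implementation.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper mirroring the generated loop body:
// the spacing between doubles at magnitude |x| (what MATLAB calls eps(x)).
double eps_of(double x) {
    const double ax = std::abs(x);
    if (std::isinf(ax) || std::isnan(ax)) {
        return std::numeric_limits<double>::quiet_NaN();
    }
    if (ax <= std::numeric_limits<double>::min()) {        // 2.2250738585072014e-308
        return std::numeric_limits<double>::denorm_min();  // 4.94065645841247e-324
    }
    int exponent = 0;
    std::frexp(ax, &exponent);              // ax = m * 2^exponent with 0.5 <= m < 1
    return std::ldexp(1.0, exponent - 53);  // 2^(exponent - 53)
}

// Hypothetical helper: shift one interior (finite) edge the way the loop does.
double shift_edge(double edge) {
    assert(std::isfinite(edge));  // the generated loop only touches interior edges
    return edge + eps_of(edge);
}
```

For a positive normal edge such as 0.1, shift_edge returns the next representable double above it, which is the same value std::nextafter(0.1, 1.0) gives.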

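It is hard to know why hist does this. The authors were presumably working around some edge case they discovered (probably related to floating-point rounding errors) and decided this was the best or the easiest fix. One effect is visible in the generated code itself: the bin search tests values[k] >= edges[mid_i - 1], so each bin is closed on its left edge, and nudging every interior edge up by eps means a value that lands exactly on the midpoint between two centers is counted in the lower bin rather than the upper one.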
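
To see how the shift changes the counts, the following is a small C++ sketch of a hist-like function. It is an assumption-laden stand-in, not MATLAB's algorithm: hist_like and its shift_edges flag are hypothetical names, centers are assumed sorted and finite, and std::nextafter is used in place of adding eps (the two agree for the positive edges used here, but can differ for some negative edges).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for hist(values, centers); centers must be sorted and finite.
std::vector<double> hist_like(const std::vector<double>& values,
                              const std::vector<double>& centers,
                              bool shift_edges = true) {
    assert(!centers.empty());
    const std::size_t n = centers.size();
    const double inf = std::numeric_limits<double>::infinity();

    // Interior edges are the midpoints between neighbouring centers.
    // The outer edges (-Inf and Inf) are implicit.
    std::vector<double> edges(n - 1);
    for (std::size_t i = 0; i + 1 < n; ++i) {
        const double mid = centers[i] + (centers[i + 1] - centers[i]) / 2.0;
        // Assumption: nextafter stands in for "mid + eps(mid)".
        edges[i] = shift_edges ? std::nextafter(mid, inf) : mid;
    }

    std::vector<double> counts(n, 0.0);
    for (double v : values) {
        if (std::isnan(v)) {
            continue;  // NaN values are not counted
        }
        // Left-closed bins: the bin index is the number of interior edges <= v.
        const std::size_t bin = static_cast<std::size_t>(
            std::upper_bound(edges.begin(), edges.end(), v) - edges.begin());
        counts[bin] += 1.0;
    }
    return counts;
}
```

With centers 0.25 and 0.75 the midpoint is exactly 0.5. Calling hist_like with the single value 0.5 puts it in the first bin when the edges are shifted and in the second bin when they are not, which may be the kind of discrepancy a hand-written port shows against hist.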
