
In C++, how to compute the mean of a vector of integers using a vector view and gsl_stats_mean?

My program manipulates STL vectors of integers but, from time to time, I need to calculate a few statistics on them. Therefore I use the GSL functions. To avoid copying the STL vector into a GSL vector, I create a GSL vector view and pass it to the GSL functions, as in this piece of code:

#include <iostream>
#include <vector>
#include <gsl/gsl_vector.h>
#include <gsl/gsl_statistics.h>
using namespace std;

int main( int argc, char* argv[] )
{
  vector<int> stl_v;
  for( int i=0; i<5; ++i )
    stl_v.push_back( i );

  gsl_vector_int_const_view gsl_v = gsl_vector_int_const_view_array( &stl_v[0], stl_v.size() );

  for( int i=0; i<stl_v.size(); ++i )
    cout << "gsl_v_" << i << "=" << gsl_vector_int_get( &gsl_v.vector, i ) << endl;

  cout << "mean=" << gsl_stats_mean( (double*) gsl_v.vector.data, 1, stl_v.size() ) << endl;
}

Once compiled (gcc -lstdc++ -lgsl -lgslcblas test.cpp), this code outputs the following:

gsl_v_0=0
gsl_v_1=1
gsl_v_2=2
gsl_v_3=3
gsl_v_4=4
mean=5.73266e-310

The vector view is properly created, but I don't understand why the mean is wrong (it should be equal to 10/5=2). Any idea? Thanks in advance.

The cast to double* is very suspicious.

Any time you are tempted to use a cast, think again. Then look for a way to do it without a cast (maybe by introducing a temporary variable if the conversion is implicit). Then think a third time before you cast.

Since the memory region does not actually contain double values, the code is simply interpreting the bit patterns there as if they represented doubles, with predictably undesired effects. Casting an int* to double* is VERY different from casting each element of the array.
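To make that difference concrete, here is a minimal sketch. It assumes a 4-byte int and an 8-byte IEEE 754 double (typical, but not guaranteed), and the two function names are made up for illustration. std::memcpy stands in for the cast so that the demonstration itself has defined behaviour. Under the same size assumptions, the call in the question asks for five doubles (40 bytes) from a buffer that holds only 20 bytes, so it also reads past the end of the vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: reads the first sizeof(double) bytes of the int
// buffer as if they were a double. This is the reinterpretation that the
// (double*) cast performs, done here through std::memcpy.
double first_bytes_as_double(const std::vector<int>& v)
{
    // The buffer must hold at least one double's worth of bytes.
    assert(v.size() * sizeof(int) >= sizeof(double));
    double d;
    std::memcpy(&d, v.data(), sizeof d);
    return d;
}

// Hypothetical helper: converts each element by value (int -> double).
// This is what "casting each element" means.
std::vector<double> converted_copy(const std::vector<int>& v)
{
    return std::vector<double>(v.begin(), v.end());
}
```

Under those assumptions, for the vector {0, 1, 2, 3, 4} the first function returns a tiny subnormal number that has nothing to do with 0 or 1, while the second yields 0.0, 1.0, 2.0, 3.0, 4.0.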

Use the integer statistics functions:

cout << "mean=" << gsl_stats_int_mean( gsl_v.vector.data, 1, stl_v.size() ) << endl;

Note the gsl_stats_int_mean instead of gsl_stats_mean.

Unless you're doing a lot of statistics considerably more complex than the mean, I'd ignore gsl and just use standard algorithms:

double mean = std::accumulate(stl_v.begin(), stl_v.end(), 0.0) / stl_v.size();

When/if using a statistical library is justified, your first choice should probably be to look for something else that's better designed (e.g., Boost Accumulators).

If you decide, for whatever reason, that you really need to use gsl, it looks like you'll have to copy your array of ints to an array of doubles first, then use gsl on the result. This is obviously quite inefficient, especially if you're dealing with a lot of data, hence the earlier advice to use something else instead.

Although I'm not familiar with GSL, the expression (double*) gsl_v.vector.data looks extremely suspicious. Are you sure it's correct to reinterpret_cast that pointer to get double data?

Casting to double* is messing up your data. It does not convert the data to double; it just treats the binary representation of the ints as if it were double.

According to http://www.gnu.org/software/gsl/manual/html_node/Mean-and-standard-deviation-and-variance.html the gsl_stats_mean function takes an array of double. You're taking a vector of int and telling it to use the raw bytes as double, which isn't going to work right.

You'll need to set up a temporary vector of double to pass in:

// Assumes that there's at least one item in stl_v.
std::vector<double> tempForStats(stl_v.begin(), stl_v.end());
gsl_stats_mean(&tempForStats[0], 1, tempForStats.size());

EDIT: You could also use standard library algorithms to compute the mean of the ints yourself:

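// Assumes that there's at least one item in stl_v (otherwise this divides by zero).
// Requires <numeric>. The 0.0 initial value makes the sum accumulate as double rather than int.
double total = std::accumulate(stl_v.begin(), stl_v.end(), 0.0);
double mean = total / stl_v.size();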
