C++ Variance and Standard Deviation
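I created a program that prompts the user to enter a data set. The program stores and sorts the data, then computes the variance and standard deviation of the array. However, I am not getting the correct variance and standard deviation (the answers are slightly off). Does anyone know what the problem might be?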
#include <iostream>
#include <iomanip>
#include <array>
using namespace std;
//function declarations
void GetData(double vals[], int& valCount);
void Sort(double vals[], int& valCount);
void printSort(double vals[], int& valCount);
double Variance(double vals[], int valCount);
double StandardDev(double vals[], int valCount);
double SqRoot(double value); //use for StandardDev function
//function definitions
int main ()
{
double vals = 0;
int valCount = 0; //number of values to be processed
//ask user how many values
cout << "Enter the number of values (0 - 100) to be processed: ";
cin >> valCount;
//process and store input values
GetData(&vals, valCount);
//sort values
Sort(&vals, valCount);
//print sort
cout << "\nValues in Sorted Order: " << endl;
printSort(&vals, valCount);
//print variance
cout << "\nThe variance for the input value list is: " << Variance(&vals, valCount);
//print standard deviation
cout << "\nThe standard deviation for the input list is: " <<StandardDev(&vals, valCount)<< endl;
return 0;
}
//prompt user to get data
void GetData(double vals[], int& valCount)
{
for(int i = 0; i < valCount; i++)
{
cout << "Enter a value: ";
cin >> vals[i];
}
}
//bubble sort values
void Sort(double vals[], int& valCount)
{
for (int i=(valCount-1); i>0; i--)
for (int j=0; j<i; j++)
if (vals[j] > vals[j+1])
swap (vals[j], vals[j+1]);
}
//print sorted values
void printSort(double vals[], int& valCount)
{
for (int i=0; i < valCount; i++)
cout << vals[i] << "\n";
}
//compute variance
double Variance(double vals[], int valCount)
{
//mean
int sum = 0;
double mean = 0;
for (int i = 0; i < valCount; i++)
sum += vals[i];
mean = sum / valCount;
//variance
double squaredDifference = 0;
for (int i = 0; i < valCount; i++)
squaredDifference += (vals[i] - mean) * (vals[i] - mean);
return squaredDifference / valCount;
}
//compute standard deviation
double StandardDev(double vals[], int valCount)
{
double stDev;
stDev = SqRoot(Variance(vals, valCount));
return stDev;
}
//compute square root
double SqRoot(double value)
{
double n = 0.00001;
double s = value;
while ((s - value / s) > n)
{
s = (s + value / s) / 2;
}
return s;
}
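There are a lot of mistakes in the code that led to your error. Types are mismatched, but more importantly, you never created an array to store the values. You treated a plain double as if it were an array, and you are lucky the program never crashed on you.
Below is a working version of your code, verified against a made-up data set and Excel. I left as much of your code in place as possible, commenting it out where appropriate. Anything I commented out I did not modify, so it may still contain errors.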
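Vectors over arrays in this case. You don't know the size up front (at compile time), and vectors are easier to work with than dynamic arrays. You also never had an array in the first place. Vectors know how big they are, so you don't need to pass the size around.
Type mismatches. Your functions consistently expect an array of doubles, but your sum is an int, and there are many other mismatches. You also passed a plain double as if it were an array and wrote to memory that was not yours to change like that.
Best practices to start right away. Stop writing using namespace std;. Just qualify your names when needed, or be more specific with lines like using std::cout; at the top of a function. Your naming is all over the place. Pick a naming scheme and stick with it. Names that start with a capital letter are usually reserved for classes or types.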
#include <iomanip>
#include <iostream>
// #include <array> // You never actually declared a std::array
#include <vector> // You don't know the size ahead of time, vectors are the
// right tool for that job.
// Use what's available
#include <algorithm> // std::sort()
#include <cmath> // std::sqrt(), std::pow()
#include <functional> // std::less
#include <numeric> // std::accumulate()
// function declarations
// Commented out redundant functions, and changed arguments to match
void get_data(std::vector<double>& vals);
// void Sort(double vals[], int& valCount);
void print(const std::vector<double>& vals);
double variance(const std::vector<double>& vals);
double standard_dev(const std::vector<double>& vals);
// double SqRoot(double value); //use for StandardDev function
// function definitions
int main() {
int valCount = 0; // number of values to be processed
// ask user how many values
std::cout << "Enter the number of values (0 - 100) to be processed: ";
std::cin >> valCount;
std::vector<double> vals(valCount, 0);
// Was just a double, but you pass it around like it's an array. That's
// really bad. Either allocate the array on the heap, or use a vector.
// Moved to after getting the count so I could declare the vector with
// that size up front instead of reserving later; personal preference.
// process and store input values
get_data(vals);
// sort values
// Sort(&vals, valCount);
std::sort(vals.begin(), vals.end(), std::less<double>());
// The third argument can be omitted as it's the default behavior, but
// I prefer being explicit. If compiling with C++17, the <double> can
// also be omitted due to a feature called CTAD
// print sort
std::cout << "\nValues in Sorted Order: " << '\n';
print(vals);
// print variance
std::cout << "\nThe variance for the input value list is: " << variance(vals);
// print standard deviation
std::cout << "\nThe standard deviation for the input list is: "
<< standard_dev(vals) << '\n';
return 0;
}
// prompt user to get data
void get_data(std::vector<double>& vals) {
for (unsigned int i = 0; i < vals.size(); i++) {
std::cout << "Enter a value: ";
std::cin >> vals[i];
}
}
// //bubble sort values
// void Sort(double vals[], int& valCount)
// {
// for (int i=(valCount-1); i>0; i--)
// for (int j=0; j<i; j++)
// if (vals[j] > vals[j+1])
// swap (vals[j], vals[j+1]);
// }
// print sorted values
void print(const std::vector<double>& vals) {
for (auto i : vals) {
std::cout << i << ' ';
}
std::cout << '\n';
}
// compute variance
double variance(const std::vector<double>& vals) {
// was int, but your vector now holds doubles. The initial value must be 0.0:
// with an int 0, std::accumulate would sum in int and truncate.
double sum = std::accumulate(vals.begin(), vals.end(), 0.0);
double mean = sum / static_cast<double>(vals.size());
// variance
double squaredDifference = 0;
for (unsigned int i = 0; i < vals.size(); i++)
squaredDifference += std::pow(vals[i] - mean, 2);
// Might be possible to get this with std::accumulate, but my first go didn't
// work.
return squaredDifference / static_cast<double>(vals.size());
}
// compute standard deviation
double standard_dev(const std::vector<double>& vals) {
return std::sqrt(variance(vals));
}
// //compute square root
// double SqRoot(double value)
// {
// double n = 0.00001;
// double s = value;
// while ((s - value / s) > n)
// {
// s = (s + value / s) / 2;
// }
// return s;
// }
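Edit: I did work out how to compute the variance with std::accumulate. It does require knowing about lambdas (anonymous functions, functors). I compiled against the C++14 standard, which has been the default for the major compilers for a while now.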
double variance(const std::vector<double>& vals) {
auto meanOp = [valSize = vals.size()](double accumulator, double val) {
return accumulator += (val / static_cast<double>(valSize));
};
double mean = std::accumulate(vals.begin(), vals.end(), 0.0, meanOp);
auto varianceOp = [mean, valSize = vals.size()](double accumulator,
double val) {
return accumulator +=
(std::pow(val - mean, 2) / static_cast<double>(valSize));
};
return std::accumulate(vals.begin(), vals.end(), 0.0, varianceOp);
}
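The line
mean = sum / valCount;
in Variance is computed with integer arithmetic and only then converted to double. You need to convert to double first:
mean = double(sum) / valCount;
Since sum is declared as int, sum += vals[i] also discards the fractional part of the running total, so declaring sum as double fixes both problems.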
Your SqRoot function computes an approximation. You should use std::sqrt instead, which is faster and more accurate.