Convert from std::vector to std::array

I am trying to use std::array instead of std::vector in my code, but it does not work. This version, which uses vector, works correctly:

void split_dataset(int fold, vector<vector<int>>& vec_X_dataset, vector<int>& vec_Y_dataset,
vector<vector<int>>& vec_X_train, vector<int>& vec_Y_train,
vector<vector<int>>& vec_X_test, vector<int>& vec_Y_test) {
size_t len = vec_X_dataset.size(); //5430
size_t division = static_cast<size_t>(len / 5); //1086

switch (fold) {
case 1:
    for (size_t i = 0; i < len; ++i) {
        if (i < division) {
            vec_X_test.push_back(vec_X_dataset[i]);
            vec_Y_test.push_back(vec_Y_dataset[i]);
        }
        else {
            vec_X_train.push_back(vec_X_dataset[i]);
            vec_Y_train.push_back(vec_Y_dataset[i]);
        }
    }
  
    break;

But when I use std::array instead of vector, I get wrong results. This is the code I have a problem with:

void split_dataset(int fold, array<array<int, 20>, 5430>& array_X_dataset, array<int, 5430>& array_Y_dataset,
        array<array<int, 20>, 5430>& array_X_train, array<int, 5430>& array_Y_train,
        array<array<int, 20>, 5430>& array_X_test, array<int, 5430>& array_Y_test) {
        size_t len = array_X_dataset.size(); //5430
        size_t division = static_cast<size_t>(len / 5); //1086
      //  cout << "len = " << len << " divisoin = " << division;
        switch (fold) {
        case 1:
            for (size_t i = 0; i < len; ++i) {
                if (i < division) {
                    // vec_X_test.push_back(vec_X_dataset[i]);
                     //vec_Y_test.push_back(vec_Y_dataset[i]);
                    array_X_test[i] = array_X_dataset[i];
                    array_Y_test[i] = array_Y_dataset[i];
                }
                else {
                    //vec_X_train.push_back(vec_X_dataset[i]);
                    //vec_Y_train.push_back(vec_Y_dataset[i]);
                    array_X_train[i] = array_X_dataset[i];
                    array_Y_train[i] = array_Y_dataset[i];
                }
            }
    
            break;

This is the main function:

int main()
    {
     static array<array<int, 20>, 5430> array_X_train {};
            static array<int, 5430> array_Y_train {};
            static array<array<int, 20>, 5430> array_X_test  {};
            static array<int, 5430> array_Y_test  {};
             int fold;
            cout << "plz cin fold" << endl;
            cin >> fold;
            split_dataset(fold, array_X_dataset, array_Y_dataset, array_X_train, array_Y_train, array_X_test, array_Y_test);
            cout << endl << endl << "size of array_X_dataset=" << array_X_dataset.size() << endl;
            cout << "size of array_Y_dataset=" << array_Y_dataset.size() << endl;
            cout << "size of train_x_dataset in fold " << fold << " =" << array_X_train.size() << endl;
            cout << "size of test_x_dataset in fold " << fold << " =" << array_X_test.size() << endl << endl << endl;
          
     

With a std::array, you specify the size when you create it. It does not grow as you add to it. In fact you never add to it; you only set the value of specific elements.

When you write array_X_test[i] = array_X_dataset[i]; or array_X_train[i] = array_X_dataset[i]; you are setting the element at index i . For the 'test' arrays this puts the values where you expect them, because i starts from zero there. For the 'train' arrays it does not: the else branch is first reached when i equals division , so the first assignment is array_X_train[division] = array_X_dataset[division] and the elements at indices 0 to division - 1 are never written.
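The gap can be reproduced on a much smaller scale. The following is only a sketch under assumed names: kLen , the division parameter and fill_train_with_source_index are hypothetical stand-ins for the question's 5430-element arrays and its loop, not code from the question.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical miniature of the question's data: 10 samples instead of 5430.
constexpr std::size_t kLen = 10;

// Mirrors the question's else branch: train is indexed with the source index i.
std::array<int, kLen> fill_train_with_source_index(
        const std::array<int, kLen>& dataset, std::size_t division) {
    assert(division <= dataset.size());
    std::array<int, kLen> train{};  // value-initialized: every element starts at 0
    for (std::size_t i = 0; i < dataset.size(); ++i) {
        if (i >= division) {
            // The first write lands at train[division], not at train[0].
            train[i] = dataset[i];
        }
    }
    return train;
}
```

With a dataset holding 1 to 10 and division equal to 2, train[0] and train[1] keep their initial value 0, train[2] holds 3, and train.size() still reports 10.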

You probably want to keep, for each array, a separate variable holding the number of elements stored so far (which is also the next index to write to). Something like:

void split_dataset(int fold, array<array<int, 20>, 5430>& array_X_dataset, array<int, 5430>& array_Y_dataset,
        array<array<int, 20>, 5430>& array_X_train, array<int, 5430>& array_Y_train,
        array<array<int, 20>, 5430>& array_X_test, array<int, 5430>& array_Y_test) {
        size_t len = array_X_dataset.size(); //5430
        size_t division = static_cast<size_t>(len / 5); //1086
      //  cout << "len = " << len << " division = " << division;
        size_t num_elements_in_test = 0;   // next free index in the test arrays
        size_t num_elements_in_train = 0;  // next free index in the train arrays

        switch (fold) {
        case 1:
            for (size_t i = 0; i < len; ++i) {
                if (i < division) {
                    array_X_test[num_elements_in_test ] = array_X_dataset[i];
                    array_Y_test[num_elements_in_test ] = array_Y_dataset[i];
                    num_elements_in_test++;
                }
                else {
                    array_X_train[num_elements_in_train ] = array_X_dataset[i];
                    array_Y_train[num_elements_in_train ] = array_Y_dataset[i];
                    num_elements_in_train++;
                }
            }
    
            break;

Note that the actual size of the arrays will remain as you specified when you created them, but you will have an additional variable with the information you need -- the number of elements actually used. Also, you will always use the first part of the array, leaving the unused entries at the end, which is easier to work with.
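One way to keep the count and the storage together is sketched below. It is an illustration only: the names Filled , append , sum_used and kCapacity are hypothetical and do not appear in the question.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>

constexpr std::size_t kCapacity = 10;  // hypothetical, stands in for 5430

// A fixed-capacity buffer paired with the number of slots actually filled.
struct Filled {
    std::array<int, kCapacity> values{};
    std::size_t count = 0;

    void append(int v) {
        assert(count < kCapacity);  // the array cannot grow
        values[count++] = v;
    }
};

// Sums only the used prefix [begin, begin + count) and ignores the unused tail.
int sum_used(const Filled& f) {
    const auto used_end = f.values.begin() + static_cast<std::ptrdiff_t>(f.count);
    return std::accumulate(f.values.begin(), used_end, 0);
}
```

Any later loop over the data should stop at count rather than at values.size() , otherwise it also visits the unused entries at the end.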

array<T, 5430> always has 5430 T elements. You are assigning to the first 1086 elements of array_X_test and array_Y_test and to the last 4344 elements of array_X_train and array_Y_train .

It sounds like you want smaller test and train arrays.

Rather than switching over the fold to pick the bounds, you can calculate iterators to the relevant ranges directly from the fold number.

As an aside, the standard library has algorithms for copying, such as std::copy :

#include <algorithm>
#include <array>
#include <cstddef>

using X = std::array<int, 20>;
using Y = int;

constexpr size_t Dataset = 5430;
constexpr size_t Test = Dataset / 5;
constexpr size_t Train = Dataset - Test;

void split_dataset(int fold, const std::array<X, Dataset>& array_X_dataset, const std::array<Y, Dataset>& array_Y_dataset,
        std::array<X, Train>& array_X_train, std::array<Y, Train>& array_Y_train,
        std::array<X, Test>& array_X_test, std::array<Y, Test>& array_Y_test) {

    auto X_test_first = array_X_dataset.begin() + (Test * (fold - 1));
    auto X_test_last = array_X_dataset.begin() + (Test * fold);
    auto Y_test_first = array_Y_dataset.begin() + (Test * (fold - 1));
    auto Y_test_last = array_Y_dataset.begin() + (Test * fold);

    // for fold 1 copies nothing and returns train.begin
    auto X_train_mid = std::copy(array_X_dataset.begin(), X_test_first, array_X_train.begin());
    auto Y_train_mid = std::copy(array_Y_dataset.begin(), Y_test_first, array_Y_train.begin());

    std::copy(X_test_first, X_test_last, array_X_test.begin());
    std::copy(Y_test_first, Y_test_last, array_Y_test.begin());

    // for fold 5 copies nothing; otherwise continues where the first copy stopped
    std::copy(X_test_last, array_X_dataset.end(), X_train_mid);
    std::copy(Y_test_last, array_Y_dataset.end(), Y_train_mid);
}
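The iterator-based split can be checked on a small scale. The sketch below assumes a hypothetical helper split_fold , templated on the element type and dataset size, with five folds and a size that is a multiple of 5. It is not the answer's exact signature. The second train copy writes to the iterator returned by the first one, so the two parts end up contiguous.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical small-scale version: N samples, test fold of size N / 5.
template <typename T, std::size_t N>
void split_fold(std::size_t fold,  // 1-based, in [1, 5]
                const std::array<T, N>& dataset,
                std::array<T, N - N / 5>& train,
                std::array<T, N / 5>& test) {
    static_assert(N % 5 == 0, "sketch assumes N is a multiple of 5");
    assert(fold >= 1 && fold <= 5);
    constexpr std::size_t test_size = N / 5;

    const auto test_first = dataset.begin() + test_size * (fold - 1);
    const auto test_last = test_first + test_size;

    // Samples before the test fold go to the front of train ...
    const auto train_mid = std::copy(dataset.begin(), test_first, train.begin());
    // ... the fold itself goes to test ...
    std::copy(test_first, test_last, test.begin());
    // ... and the samples after the fold continue where the first copy stopped.
    std::copy(test_last, dataset.end(), train_mid);
}
```

For a 10-element dataset holding 0 to 9, fold 2 puts 2 and 3 into test and the remaining eight values, in their original order, into train.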
