
Split a 2-D array by columns into two 2-D arrays in Python using NumPy

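I have a 2-D array with 19 rows and 1280 columns. I want to split it into two arrays that both have 19 rows, with 70% of the columns used for training and 30% used for testing. The columns should be selected at random. My code is in Python. Please help me. Thanks.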

Edited to include randomised shuffle

You can use slicing to cut arrays into the desired shape, and numpy.random.shuffle() to obtain randomized array indices.

import numpy as np

# create example data
num_cols, num_rows = 10, 3
arr = np.array([[f'{row}_{col}' for col in range(num_cols)] for row in range(num_rows)])

# create a list of all column indices and shuffle it in place
random_cols = list(range(arr.shape[1]))
np.random.shuffle(random_cols)

# calculate truncation index as 70% of total number of columns
truncation_index = int(arr.shape[1] * 0.7)

# use array slicing to extract two sub-arrays
train_array = arr[:, random_cols[:truncation_index]]
test_array = arr[:, random_cols[truncation_index:]]

print(f'arr: \n{arr} \n')
print(f'train array: \n{train_array} \n')
print(f'test array: \n{test_array} \n')

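This produces output like the following (the column order differs between runs because the shuffle is not seeded):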

arr: 
[['0_0' '0_1' '0_2' '0_3' '0_4' '0_5' '0_6' '0_7' '0_8' '0_9']
 ['1_0' '1_1' '1_2' '1_3' '1_4' '1_5' '1_6' '1_7' '1_8' '1_9']
 ['2_0' '2_1' '2_2' '2_3' '2_4' '2_5' '2_6' '2_7' '2_8' '2_9']] 

train array: 
[['0_5' '0_8' '0_0' '0_7' '0_6' '0_1' '0_4']
 ['1_5' '1_8' '1_0' '1_7' '1_6' '1_1' '1_4']
 ['2_5' '2_8' '2_0' '2_7' '2_6' '2_1' '2_4']] 

test array: 
[['0_3' '0_9' '0_2']
 ['1_3' '1_9' '1_2']
 ['2_3' '2_9' '2_2']]
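
For the 19 x 1280 array described in the question, the same idea can be wrapped in a small helper. This is only a minimal sketch and not part of the original answer: the function name split_columns, its parameters, and the use of a seeded np.random.default_rng() generator are assumptions made here for illustration and reproducibility.

```python
import numpy as np


def split_columns(arr, train_fraction=0.7, seed=None):
    """Hypothetical helper: randomly split the columns of a 2-D array."""
    rng = np.random.default_rng(seed)
    # permutation(n) returns the integers 0..n-1 in random order
    shuffled_cols = rng.permutation(arr.shape[1])
    # number of columns that go into the training part
    n_train = int(round(arr.shape[1] * train_fraction))
    train_part = arr[:, shuffled_cols[:n_train]]
    test_part = arr[:, shuffled_cols[n_train:]]
    return train_part, test_part


# stand-in data with the shape described in the question
data = np.arange(19 * 1280).reshape(19, 1280)
train, test = split_columns(data, train_fraction=0.7, seed=0)

print(train.shape)  # (19, 896)
print(test.shape)   # (19, 384)
```

Every column ends up in exactly one of the two arrays, and passing the same seed reproduces the same split.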

