numpy: how to randomly split/select a matrix into n different matrices

Question

I have a numpy matrix with shape (4601, 58).
I want to split the matrix randomly by rows into 60%, 20%, and 20% parts.
I need this for a machine learning task.
Is there a numpy function that randomly selects rows?

Answer 1

You can use numpy.random.shuffle, which shuffles the rows of the array in place:

import numpy as np

N = 4601
data = np.arange(N*58).reshape(-1, 58)
np.random.shuffle(data)

a = data[:int(N*0.6)]
b = data[int(N*0.6):int(N*0.8)]
c = data[int(N*0.8):]
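
Because np.random.shuffle reorders data in place, the original row order is lost. If that matters, one possible variant is sketched below. It assumes NumPy 1.17 or newer (for np.random.default_rng); the seed 0 and the names rng, shuffled, n_train and n_val are illustrative choices, not part of the answer above.

```python
import numpy as np

N = 4601
data = np.arange(N * 58).reshape(-1, 58)

rng = np.random.default_rng(0)      # fixed seed (arbitrary) for reproducibility
shuffled = rng.permutation(data)    # shuffled copy of the rows; data is unchanged

n_train = int(N * 0.6)              # end of the 60% block
n_val = int(N * 0.8)                # end of the following 20% block
a, b, c = np.split(shuffled, [n_train, n_val])

print(a.shape, b.shape, c.shape)
```

np.split with a list of cut points returns the three consecutive row blocks in one call, which is equivalent to the three slices used above.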

Answer 2

A complement to HYRY's answer, for when you want to shuffle several arrays x, y, z consistently, where all of them share the same first dimension: x.shape[0] == y.shape[0] == z.shape[0] == n_samples.

You can do:

import numpy as np

rng = np.random.RandomState(42)  # reproducible results with a fixed seed
n_samples = x.shape[0]           # x, y, z are your existing arrays
indices = np.arange(n_samples)
rng.shuffle(indices)
x_shuffled = x[indices]
y_shuffled = y[indices]
z_shuffled = z[indices]

And then proceed with the split of each shuffled array as in HYRY's answer.
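
Putting the two steps together, a minimal sketch might look like the following. The toy arrays x and y, the 60/20/20 cut points, and the names cuts, x_train, x_val, x_test (and the y counterparts) are assumptions made for illustration only.

```python
import numpy as np

n_samples = 10
x = np.arange(n_samples * 3).reshape(n_samples, 3)   # toy features
y = np.arange(n_samples)                             # toy labels, y[i] belongs to x[i]

rng = np.random.RandomState(42)
indices = rng.permutation(n_samples)                 # shuffled copy of 0..n_samples-1

# Using the same indices and cut points for every array keeps rows aligned.
cuts = [int(n_samples * 0.6), int(n_samples * 0.8)]
x_train, x_val, x_test = np.split(x[indices], cuts)
y_train, y_val, y_test = np.split(y[indices], cuts)
```

Since both arrays are indexed with the same permutation before splitting, each row of x_train still corresponds to the label at the same position in y_train.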

Answer 3

If you want to randomly select rows, you could just use random.sample from the standard Python library:

import random

population = range(4601) # Your number of rows
choice = random.sample(population, k) # k being the number of samples you require

random.sample samples without replacement, so you don't need to worry about repeated rows ending up in choice. Given a numpy array called matrix, you can select the rows by indexing with that list, like this: matrix[choice].

Of course, k can be equal to the total number of elements in the population, in which case choice contains a random ordering of all the row indices. You can then partition choice as you please, if that's all you need.
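
As a rough illustration of partitioning choice, the sketch below samples every row index once and cuts the resulting list at 60% and 80%. The small stand-in matrix, the seed, and the names n_rows, end_first, end_second, first, second and third are assumptions, not part of the answer.

```python
import random
import numpy as np

matrix = np.arange(20 * 3).reshape(20, 3)   # stand-in for the real (4601, 58) matrix
n_rows = matrix.shape[0]

random.seed(0)                                  # optional, for reproducibility
choice = random.sample(range(n_rows), n_rows)   # every row index once, in random order

end_first = int(n_rows * 0.6)
end_second = int(n_rows * 0.8)
first = matrix[choice[:end_first]]              # indexing with a list of row indices
second = matrix[choice[end_first:end_second]]
third = matrix[choice[end_second:]]
```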

Answer 4

Since you need it for machine learning, here is a method I wrote:

import numpy as np

def split_random(matrix, percent_train=70, percent_test=15):
    """
    Splits matrix data into randomly ordered sets 
    grouped by provided percentages.

    Usage:
    rows = 100
    columns = 2
    matrix = np.random.rand(rows, columns)
    training, testing, validation = \
    split_random(matrix, percent_train=80, percent_test=10)

    percent_validation 10
    training (80, 2)
    testing (10, 2)
    validation (10, 2)

    Returns:
    - training_data: percentage_train e.g. 70%
    - testing_data: percent_test e.g. 15%
    - validation_data: remainder from 100% e.g. 15%
    Created by Uki D. Lucas on Feb. 4, 2017
    """

    percent_validation = 100 - percent_train - percent_test

    if percent_validation < 0:
        print("Make sure that the sum of the " + \
        "training and testing percentages is " + \
        "less than or equal to 100%.")
        percent_validation = 0
    else:
        print("percent_validation", percent_validation)

    rows = matrix.shape[0]
    np.random.shuffle(matrix)  # shuffles rows in place: the caller's array is modified

    end_training = int(rows*percent_train/100)    
    end_testing = end_training + int((rows * percent_test/100))

    training = matrix[:end_training]
    testing = matrix[end_training:end_testing]
    validation = matrix[end_testing:]
    return training, testing, validation

# TEST:
rows = 100
columns = 2
matrix = np.random.rand(rows, columns)
training, testing, validation = split_random(matrix, percent_train=80, percent_test=10) 

print("training",training.shape)
print("testing",testing.shape)
print("validation",validation.shape)

print(split_random.__doc__)

training (80, 2)
testing (10, 2)
validation (10, 2)
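
Note that split_random shuffles the matrix passed to it in place and only prints a warning when the percentages add up to more than 100. If neither behavior is wanted, a variant along the following lines could be used. This is a sketch, not the author's code: the name split_random_copy and the seed parameter are hypothetical.

```python
import numpy as np

def split_random_copy(matrix, percent_train=70, percent_test=15, seed=None):
    """Hypothetical variant: returns (training, testing, validation)
    as shuffled copies and leaves the input matrix unmodified."""
    if percent_train < 0 or percent_test < 0 or percent_train + percent_test > 100:
        raise ValueError("percent_train and percent_test must be "
                         "non-negative and sum to at most 100")

    rows = matrix.shape[0]
    rng = np.random.RandomState(seed)
    shuffled = matrix[rng.permutation(rows)]   # integer-array indexing returns a copy

    end_training = int(rows * percent_train / 100)
    end_testing = end_training + int(rows * percent_test / 100)
    return (shuffled[:end_training],
            shuffled[end_training:end_testing],
            shuffled[end_testing:])

matrix = np.random.rand(100, 2)
original = matrix.copy()
training, testing, validation = split_random_copy(matrix, 80, 10, seed=0)
```

Raising ValueError instead of printing makes a bad percentage combination hard to miss, and passing a seed makes the split repeatable.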

4 answers

Answer 1: score 18, accepted, posted 2012-02-01 02:21:43
Answer 2: score 7, posted 2012-02-01 08:18:21
Answer 3: score 3, posted 2012-02-01 00:49:00
Answer 4: score 2, posted 2017-02-04 19:57:48
