Numpy: How to randomly split/select a matrix into n different matrices

I have a NumPy matrix with shape (4601, 58). I need to split it randomly by rows for a machine learning task. Is there a NumPy function that randomly selects rows?
```
import random

n_rows = 4601  # your number of rows
k = 100        # number of rows to sample (example value; choose your own)

population = range(n_rows)
choice = random.sample(population, k)  # k distinct row indices, in random order
```
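If you want to randomly select rows, you can simply use random.sample from the standard Python library, as in the snippet above. random.sample samples without replacement, so you don't need to worry about repeated rows ending up in choice.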
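Since the question asks for a NumPy function specifically, here is a minimal sketch of the same idea using NumPy's `Generator.choice` with `replace=False`. The seed, the value of `k`, and the stand-in `matrix` are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
matrix = np.arange(4601 * 58).reshape(4601, 58)  # stand-in for the real data

k = 100  # hypothetical number of rows to draw
# Draw k distinct row indices (sampling without replacement)
row_idx = rng.choice(matrix.shape[0], size=k, replace=False)
subset = matrix[row_idx]  # shape (k, 58)
```

As with random.sample, no row index can appear twice because the draw is without replacement.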

2012-02-01 daydreamer

You can use numpy.random.shuffle:

```
import numpy as np

N = 4601
data = np.arange(N * 58).reshape(-1, 58)  # example data with the question's shape
np.random.shuffle(data)  # shuffles the rows in place

# 60% / 20% / 20% split along the first axis
a = data[:int(N * 0.6)]
b = data[int(N * 0.6):int(N * 0.8)]
c = data[int(N * 0.8):]
```
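The shuffle-then-slice approach can be generalized to n parts. The following is only a sketch: the helper name `random_split`, its `fractions` and `seed` parameters, and the use of `np.split` on a permuted copy are my own choices for illustration, not part of the answer above.

```python
import numpy as np

def random_split(matrix, fractions, seed=None):
    """Split the rows of `matrix` into len(fractions) random, disjoint parts.

    `fractions` should sum to 1; the last part receives any rounding remainder.
    """
    rng = np.random.default_rng(seed)
    n_rows = matrix.shape[0]
    shuffled = matrix[rng.permutation(n_rows)]  # shuffled copy; input is untouched
    # Cumulative cut points, e.g. [0.6, 0.2, 0.2] gives cuts at 60% and 80%
    cuts = (np.cumsum(fractions)[:-1] * n_rows).astype(int)
    return np.split(shuffled, cuts)

data = np.arange(4601 * 58).reshape(-1, 58)
a, b, c = random_split(data, [0.6, 0.2, 0.2], seed=0)
print(a.shape, b.shape, c.shape)
```

Unlike np.random.shuffle, indexing with a permutation leaves the original array in its original order, which can be convenient if the data is reused elsewhere.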

Source

2012-02-01 02:21:43 HYRY

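Given the list of row indices choice produced by random.sample above and a NumPy array called matrix, you can select those rows by indexing, as in matrix[choice].

Of course, k can equal the total number of elements in the population, in which case choice contains a random ordering of all the row indices. You can then partition choice however you need.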
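A minimal sketch of that last idea, assuming a stand-in `matrix` with the question's shape and a 60% / 20% / 20% partition (the split points and the names `train`, `test`, and `validation` are assumptions for illustration):

```python
import random
import numpy as np

matrix = np.arange(4601 * 58).reshape(4601, 58)  # stand-in for the real data
n_rows = matrix.shape[0]

# k == population size: choice is a random ordering of every row index
choice = random.sample(range(n_rows), n_rows)

# Partition the index list, then use each piece to index the matrix
first_cut = int(n_rows * 0.6)
second_cut = int(n_rows * 0.8)
train = matrix[choice[:first_cut]]
test = matrix[choice[first_cut:second_cut]]
validation = matrix[choice[second_cut:]]
```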

Source

2012-02-01 00:49:00

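A complement to HYRY's answer, for when you want to shuffle several arrays x, y, z consistently along the same first dimension, i.e. x.shape[0] == y.shape[0] == z.shape[0] == n_samples.

You can do:

```
import numpy as np

# x, y and z are assumed to be existing arrays with the same number of rows
n_samples = x.shape[0]

rng = np.random.RandomState(42)  # reproducible results with a fixed seed
indices = np.arange(n_samples)
rng.shuffle(indices)
x_shuffled = x[indices]
y_shuffled = y[indices]
z_shuffled = z[indices]
```

Then proceed with splitting each shuffled array as in HYRY's answer.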
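To make the consistent-shuffle idea concrete, here is a small self-contained sketch with made-up arrays. The data, the 60/40 split point, and the `*_train` / `*_test` names are assumptions for illustration only.

```python
import numpy as np

n_samples = 10
x = np.arange(n_samples * 3).reshape(n_samples, 3)  # features (made-up data)
y = np.arange(n_samples)                            # labels; y[i] belongs to x[i]
z = np.arange(n_samples) * 10                       # a third aligned array

rng = np.random.RandomState(42)
indices = rng.permutation(n_samples)  # one permutation reused for all arrays

cut = int(n_samples * 0.6)  # hypothetical 60/40 split point
train_idx, test_idx = indices[:cut], indices[cut:]

# Indexing every array with the same indices keeps rows aligned
x_train, x_test = x[train_idx], x[test_idx]
y_train, y_test = y[train_idx], y[test_idx]
z_train, z_test = z[train_idx], z[test_idx]
```

Because the same index arrays are applied to x, y, and z, row i of x_train still corresponds to element i of y_train and z_train.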

Source

2012-02-01 08:18:21 ogrisel

Since you need it for machine learning, here is a method I wrote:

```
import numpy as np

def split_random(matrix, percent_train=70, percent_test=15):
    """
    Splits matrix data into randomly ordered sets
    grouped by provided percentages.

    Usage:
    rows = 100
    columns = 2
    matrix = np.random.rand(rows, columns)
    training, testing, validation = split_random(
        matrix, percent_train=80, percent_test=10)

    percent_validation 10
    training (80, 2)
    testing (10, 2)
    validation (10, 2)

    Returns:
    - training_data: percent_train e.g. 70%
    - testing_data: percent_test e.g. 15%
    - validation_data: remainder from 100% e.g. 15%
    Created by Uki D. Lucas on Feb. 4, 2017
    """
    percent_validation = 100 - percent_train - percent_test

    if percent_validation < 0:
        print("Make sure that the provided sum of "
              "training and testing percentages is equal to, "
              "or less than, 100%.")
        percent_validation = 0
    else:
        print("percent_validation", percent_validation)

    rows = matrix.shape[0]
    np.random.shuffle(matrix)  # note: shuffles the caller's array in place

    end_training = int(rows * percent_train / 100)
    end_testing = end_training + int(rows * percent_test / 100)

    training = matrix[:end_training]
    testing = matrix[end_training:end_testing]
    validation = matrix[end_testing:]
    return training, testing, validation

# TEST:
rows = 100
columns = 2
matrix = np.random.rand(rows, columns)
training, testing, validation = split_random(matrix, percent_train=80, percent_test=10)

print("training", training.shape)
print("testing", testing.shape)
print("validation", validation.shape)

print(split_random.__doc__)
```

training (80, 2)
testing (10, 2)
validation (10, 2)

Source

2017-02-04 19:57:48
