Am I using cudaMemcpy incorrectly?

I implemented this CuArray class, with Rows and Columns as properties, to simplify working with arrays:

#include <cuda_runtime_api.h>
#include <cuda.h>

template<class TType>
class CuArray
{
public:

    int Rows;
    int Columns;
    int Elements;
    TType *ArrayPointer;   // points to Rows * Columns elements in device memory

    CuArray(int rows, int columns = 1)
    {
        this->Rows = rows;
        this->Columns = columns;
        Elements = this->Rows * this->Columns;

        // Allocate the element storage in device memory.
        cudaMalloc(&this->ArrayPointer, sizeof(TType) * this->Elements);
    }

    // Builds a CuArray on the host, then copies the object itself to the device.
    static CuArray<TType>* GpuCreate(int rows, int columns = 1)
    {
        CuArray<TType>* cuArray = new CuArray<TType>(rows, columns);
        CuArray<TType>* gpuCuArray;
        size_t size = sizeof(CuArray<TType>);
        cudaMalloc(&gpuCuArray, size);
        cudaMemcpy(gpuCuArray, cuArray, size, cudaMemcpyHostToDevice);
        return gpuCuArray;   // pointer into device memory
    }
};

cudaMemcpy does not seem to work as expected, and I can't tell what I am doing wrong.

These are the variable values (and pointer addresses) for the call CuArray<int*>::GpuCreate(11);

Debugging with Nsight Eclipse 7.5 on Ubuntu 14.04, 64-bit:

cuArray = {0xb6e8b0, Rows = 11, Columns = 1, Elements = 11}
size = 32
gpuCuArray = {0x7053e3600, Rows = 0, Columns = 0, Elements = 0}

The pointer values returned by new and cudaMalloc look fine to me, but the cudaMemcpy that follows does not seem to work.

What am I doing wrong?

2016-07-21 Jens

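Normally, the following code should be enough to represent a 2-D array stored on the GPU. You do not need to store Rows, Columns, and so on in device memory; that information is usually needed only on the host side. If that is not true in your case, you may want to explain your design considerations in more detail. Code showing how you intend to use the CuArray object would help.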

#include <cuda_runtime_api.h>
#include <cuda.h>

template<class TType>
class CuArray
{
public:

    int Rows;
    int Columns;
    int Elements;
    TType *ArrayPointer;   // points to Rows * Columns elements in device memory

    CuArray(int rows, int columns = 1)
    {
        this->Rows = rows;
        this->Columns = columns;
        Elements = this->Rows * this->Columns;

        // Allocate the element storage in device memory.
        cudaMalloc(&this->ArrayPointer, sizeof(TType) * this->Elements);
    }

    // Returns a host-side object; only ArrayPointer refers to device memory.
    static CuArray<TType>* GpuCreate(int rows, int columns = 1)
    {
        CuArray<TType>* cuArray = new CuArray<TType>(rows, columns);
        return cuArray;
    }
};
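
To illustrate the point that the dimensions can stay on the host, here is a minimal sketch that passes them to a kernel as ordinary arguments next to the device pointer. The names FillWithIndex and RunFill are hypothetical and do not come from the code above; the sketch assumes an int array in row-major order and reports failure by returning an empty vector.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: each thread writes its own linear index into a
// row-major rows x columns array. The dimensions are plain kernel arguments.
__global__ void FillWithIndex(int *data, int rows, int columns)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < rows * columns)
        data[i] = i;
}

// Hypothetical host helper: allocates device storage, launches the kernel,
// and copies the result back. Returns an empty vector if a CUDA call fails.
std::vector<int> RunFill(int rows, int columns)
{
    int elements = rows * columns;
    std::vector<int> host(elements, -1);

    int *device = nullptr;
    if (cudaMalloc(&device, sizeof(int) * elements) != cudaSuccess)
        return {};

    int threads = 128;
    int blocks = (elements + threads - 1) / threads;
    FillWithIndex<<<blocks, threads>>>(device, rows, columns);

    // This blocking copy waits for the kernel to finish before reading.
    cudaError_t err = cudaMemcpy(host.data(), device, sizeof(int) * elements,
                                 cudaMemcpyDeviceToHost);
    cudaFree(device);

    if (err != cudaSuccess)
        return {};
    return host;
}
```

Calling RunFill(11, 1) mirrors the 11 x 1 case from the question. The cost is two extra int arguments per kernel launch, which are passed by value and need no separate device allocation.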

2016-07-21 11:40:09 kangshiyin

I am a beginner at CUDA, but your argument seems sound: when I launch a kernel, I launch it from the host, so I can pass the rows and columns to the kernel as parameters. That does mean more parameters, though. – Jens

However, your solution does not solve the cudaMemcpy problem. It would be great to understand why it is not working in my code. – Jens

@JensHorstmann What do you expect from the original code, and what exactly does not seem to work? You may want to provide more details about what you are working on. – kangshiyin
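
One way to narrow down the cudaMemcpy question is to copy the structure back to the host and compare it with the original, instead of reading gpuCuArray's fields in a host-side debugger view. gpuCuArray holds a device address, and a host debugger may not be able to read the memory behind it, so zeros shown there do not by themselves prove the copy failed. The following is a minimal sketch under that assumption. ArrayHeader and RoundTripHeader are hypothetical names standing in for CuArray's fields; they are not part of the original code.

```cuda
#include <cuda_runtime.h>

// Hypothetical stand-in for the data members of CuArray<int>.
struct ArrayHeader
{
    int Rows;
    int Columns;
    int Elements;
    int *ArrayPointer;
};

// Copies a header host -> device -> host and reports whether every CUDA call
// succeeded and the fields survived the round trip unchanged.
bool RoundTripHeader(int rows, int columns)
{
    ArrayHeader host = { rows, columns, rows * columns, nullptr };
    ArrayHeader check = { 0, 0, 0, nullptr };
    ArrayHeader *device = nullptr;

    if (cudaMalloc(&device, sizeof(ArrayHeader)) != cudaSuccess)
        return false;

    // Check the return codes instead of assuming success.
    cudaError_t err = cudaMemcpy(device, &host, sizeof(ArrayHeader),
                                 cudaMemcpyHostToDevice);
    if (err == cudaSuccess)
        err = cudaMemcpy(&check, device, sizeof(ArrayHeader),
                         cudaMemcpyDeviceToHost);
    cudaFree(device);

    return err == cudaSuccess
        && check.Rows == host.Rows
        && check.Columns == host.Columns
        && check.Elements == host.Elements
        && check.ArrayPointer == host.ArrayPointer;
}
```

If RoundTripHeader(11, 1) returns true, the host-to-device copy itself is working and the zeros are more likely an artifact of how the pointer is being inspected.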
