How does Caffe's convolution actually work?

I was playing around with pycaffe's convolution function, implemented as part of a basic convolution layer.

name: "convolution" 
input: "data" 
input_dim: 1 
input_dim: 1 
input_dim: 227 
input_dim: 227 

layer { 
    name: "conv" 
    type: "Convolution" 
    bottom: "data" 
    top: "conv" 
    convolution_param {
        num_output: 96
        kernel_size: 11
        stride: 1
    }
}

The listing above is my convolution.prototxt file. These parameters are the same as those of the first CONV layer of AlexNet (except for the stride, which is actually 4).
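For orientation, the output size of this layer at each stride can be worked out by hand. The sketch below assumes the usual unpadded, undilated formula (input + 2*pad - kernel) // stride + 1, which I believe is what Caffe's Convolution layer uses. `conv_output_size` is a hypothetical helper, not part of pycaffe.

```python
def conv_output_size(input_size, kernel_size, stride, pad=0):
    """Spatial output size of a convolution (assumed formula, no dilation)."""
    return (input_size + 2 * pad - kernel_size) // stride + 1


# Layer from convolution.prototxt: 227x227 input, 11x11 kernel, 96 filters.
for stride in (1, 2, 3, 4):
    side = conv_output_size(227, 11, stride)
    # Each output value needs 11 * 11 multiply-adds (one input channel).
    print("stride %d: output 96 x %d x %d, %d output values"
          % (stride, side, side, 96 * side * side))
```

With these settings the output side is 217, 109, 73 and 55 for strides 1 to 4, so the number of output values shrinks roughly with the square of the stride.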

I have a MacBook Pro with an NVIDIA GeForce GT 650M GPU with 1024 MB of memory. I don't know whether it matters, but my laptop also has integrated Intel HD 4000 graphics.

I ran some tests on my laptop, varying the stride hyperparameter, first in GPU mode and then in CPU mode.

1) Varying the stride after calling caffe.set_device(0); caffe.set_mode_gpu():

Stride 1: 27.26 ms 
Stride 2: 14.27 ms 
Stride 3: 10.57 ms 
Stride 4: 7.45 ms

2) Varying the stride after calling caffe.set_mode_cpu():

Stride 1: 49.77 ms # expected 
Stride 2: 9.92 ms # this and the results after this don't make sense 
Stride 3: 4.50 ms 
Stride 4: 1.96 ms

(Each figure is the average of 3 runs.)
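One way to read these timings is to compare how much the amount of work shrinks with the stride against how much each mode actually speeds up. The sketch below is plain arithmetic on the numbers listed above. It assumes the work is proportional to the number of output positions, and `output_positions` and `speedup_table` are hypothetical helper names.

```python
# Timings copied from the two lists above (milliseconds, mean of 3 runs).
gpu_ms = {1: 27.26, 2: 14.27, 3: 10.57, 4: 7.45}
cpu_ms = {1: 49.77, 2: 9.92, 3: 4.50, 4: 1.96}


def output_positions(stride, input_size=227, kernel_size=11):
    """Number of output positions per filter (assumes no padding)."""
    side = (input_size - kernel_size) // stride + 1
    return side * side


def speedup_table():
    """Rows of (stride, work reduction, CPU speedup, GPU speedup) vs stride 1."""
    base = float(output_positions(1))
    rows = []
    for stride in (1, 2, 3, 4):
        rows.append((stride,
                     base / output_positions(stride),
                     cpu_ms[1] / cpu_ms[stride],
                     gpu_ms[1] / gpu_ms[stride]))
    return rows


for stride, work, cpu, gpu in speedup_table():
    print("stride %d: work / %.2f, CPU %.2fx faster, GPU %.2fx faster"
          % (stride, work, cpu, gpu))
```

In these numbers the CPU time falls faster than the count of output positions, while the GPU time falls much more slowly. With only three runs per setting, the ratios are rough.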

I'm just trying to understand how Caffe's convolution works based on these tests. Can anyone shed some light on this? Why does CPU mode run faster than GPU mode?
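As a reference for what the layer is expected to compute, here is a minimal NumPy sketch of an unpadded, strided convolution (strictly a cross-correlation) with the same shapes as the prototxt. It ignores the bias term and is only a direct loop version for illustration, not Caffe's implementation. `conv2d_naive` is a hypothetical name.

```python
import numpy as np


def conv2d_naive(image, kernels, stride=1):
    """Strided, unpadded cross-correlation.

    image:   array of shape (H, W), a single input channel
    kernels: array of shape (K, kh, kw), one filter per output channel
    returns: array of shape (K, out_h, out_w)
    """
    h, w = image.shape
    k, kh, kw = kernels.shape
    out_h = (h - kh) // stride + 1
    out_w = (w - kw) // stride + 1
    out = np.zeros((k, out_h, out_w), dtype=np.float64)
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            patch = image[top:top + kh, left:left + kw]
            # Dot product of the patch with every filter at once.
            out[:, i, j] = np.tensordot(kernels, patch, axes=([1, 2], [0, 1]))
    return out


# Same shapes as the test: 227x227 input, 96 filters of 11x11, stride 4.
image = np.random.randn(227, 227)
kernels = np.random.randn(96, 11, 11)
print(conv2d_naive(image, kernels, stride=4).shape)  # (96, 55, 55)
```

The loop makes the cost structure visible: the number of iterations is the number of output positions, and each iteration does a fixed amount of work per filter.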

If you're interested in seeing for yourself, here is the test code I used:

import numpy as np
import caffe
import time

caffe.set_device(0)
caffe.set_mode_gpu()  # replace with caffe.set_mode_cpu() for the CPU runs

net = caffe.Net('convolution.prototxt', caffe.TEST)
total = 0.0
for _ in range(3):
    # Fill the input blob and the filter weights with random values.
    net.blobs['data'].data[...] = np.random.randn(1, 1, 227, 227)  # there really is an ellipsis there
    net.params['conv'][0].data[...] = np.random.randn(96, 1, 11, 11)
    # Time only the forward pass.
    s = time.time()
    r = net.forward()
    e = time.time()
    total += (e - s)

# Mean forward time over the 3 runs, in milliseconds.
print(total / 3 * 1000)
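The loop above averages only three calls and includes the very first one, which may carry one-time setup costs (an assumption, not something measured here). A slightly more careful timing helper could look like the sketch below. `time_call` is a hypothetical helper written for Python 3, and it takes any zero-argument callable, so it does not depend on Caffe.

```python
import time


def time_call(fn, warmup=1, repeats=10):
    """Return (best_ms, mean_ms) for fn(), after discarding warm-up calls."""
    for _ in range(warmup):
        fn()  # not timed: absorbs any first-call overhead
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return min(samples), sum(samples) / len(samples)


# Hypothetical usage with the net from the test code above:
#   best_ms, mean_ms = time_call(net.forward, warmup=1, repeats=10)
```

Reporting the minimum alongside the mean gives a sense of how noisy the runs are.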

Source

2016-07-08 cᴏʟᴅsᴘᴇᴇᴅ