http://software.intel.com/en-us/articles/intel-64-architecture-processor-topology-enumeration
This is the official tool for determining that.
Here is a sample run from a machine with two physical Intel X5560 processors (4 cores + 4 HT per package, going by the output below) running CentOS 5.3 (it may be outdated).
Package 0 Cache and Thread details
Box Description:
Cache is cache level designator
Size is cache size
OScpu# is cpu # as seen by OS
Core is core#[_thread# if > 1 thread/core] inside socket
AffMsk is AffinityMask(extended hex) for core and thread
CmbMsk is Combined AffinityMask(extended hex) for hw threads sharing cache
CmbMsk will differ from AffMsk if > 1 hw_thread/cache
Extended Hex replaces trailing zeroes with 'z#'
where # is number of zeroes (so '8z5' is '0x800000')
L1D is Level 1 Data cache, size(KBytes)= 32, Cores/cache= 2, Caches/package= 4
L1I is Level 1 Instruction cache, size(KBytes)= 32, Cores/cache= 2, Caches/package= 4
L2 is Level 2 Unified cache, size(KBytes)= 256, Cores/cache= 2, Caches/package= 4
L3 is Level 3 Unified cache, size(KBytes)= 8192, Cores/cache= 8, Caches/package= 1
      +-----------+-----------+-----------+-----------+
Cache |  L1D      |  L1D      |  L1D      |  L1D      |
Size  |  32K      |  32K      |  32K      |  32K      |
OScpu#|    0     8|    1     9|    2    10|    3    11|
Core  |c0_t0 c0_t1|c1_t0 c1_t1|c2_t0 c2_t1|c3_t0 c3_t1|
AffMsk|    1   100|    2   200|    4   400|    8   800|
CmbMsk|  101      |  202      |  404      |  808      |
      +-----------+-----------+-----------+-----------+
Cache |  L1I      |  L1I      |  L1I      |  L1I      |
Size  |  32K      |  32K      |  32K      |  32K      |
      +-----------+-----------+-----------+-----------+
Cache |   L2      |   L2      |   L2      |   L2      |
Size  |  256K     |  256K     |  256K     |  256K     |
      +-----------+-----------+-----------+-----------+
Cache |   L3                                          |
Size  |   8M                                          |
CmbMsk|  f0f                                          |
      +-----------------------------------------------+
Combined socket AffinityMask= 0xf0f
Package 1 Cache and Thread details
Box Description:
Cache is cache level designator
Size is cache size
OScpu# is cpu # as seen by OS
Core is core#[_thread# if > 1 thread/core] inside socket
AffMsk is AffinityMask(extended hex) for core and thread
CmbMsk is Combined AffinityMask(extended hex) for hw threads sharing cache
CmbMsk will differ from AffMsk if > 1 hw_thread/cache
Extended Hex replaces trailing zeroes with 'z#'
where # is number of zeroes (so '8z5' is '0x800000')
      +-----------+-----------+-----------+-----------+
Cache |  L1D      |  L1D      |  L1D      |  L1D      |
Size  |  32K      |  32K      |  32K      |  32K      |
OScpu#|    4    12|    5    13|    6    14|    7    15|
Core  |c0_t0 c0_t1|c1_t0 c1_t1|c2_t0 c2_t1|c3_t0 c3_t1|
AffMsk|   10   1z3|   20   2z3|   40   4z3|   80   8z3|
CmbMsk|  1010     |  2020     |  4040     |  8080     |
      +-----------+-----------+-----------+-----------+
Cache |  L1I      |  L1I      |  L1I      |  L1I      |
Size  |  32K      |  32K      |  32K      |  32K      |
      +-----------+-----------+-----------+-----------+
Cache |   L2      |   L2      |   L2      |   L2      |
Size  |  256K     |  256K     |  256K     |  256K     |
      +-----------+-----------+-----------+-----------+
Cache |   L3                                          |
Size  |   8M                                          |
CmbMsk|  f0f0                                         |
      +-----------------------------------------------+
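To read the masks in this listing, it can help to expand the tool's "extended hex" notation and list the OS CPU numbers each mask covers. The following is only a minimal sketch in Python; the function names `parse_extended_hex` and `cpus_in_mask` are made up for illustration and are not part of Intel's tool.

```python
def parse_extended_hex(mask):
    """Expand an 'extended hex' string such as '8z5' into an integer.

    Per the Box Description, 'z#' stands for # trailing hex zeroes,
    so '8z5' means 0x800000. Strings without 'z' are plain hex.
    """
    digits, sep, zeros = mask.lower().partition("z")
    value = int(digits, 16)
    if sep:
        # Each trailing hex zero is a shift by 4 bits.
        value <<= 4 * int(zeros)
    return value


def cpus_in_mask(mask):
    """Return the OS CPU numbers whose bits are set in an integer mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


if __name__ == "__main__":
    # Masks copied from the listing above.
    for text in ("f0f", "f0f0", "1z3", "1010"):
        print(text, cpus_in_mask(parse_extended_hex(text)))
```

Applied to the L3 CmbMsk values, this gives the logical CPUs that belong to each package, which is the same grouping the OScpu# rows show.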
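Sorry, I don't know the answer. But you have piqued my curiosity: now I'd like to know what you need this information for. – nfechner
@nfechner: If I know how the cores are ordered, I can arrange my threads accordingly. Currently I am not able to make use of all 12 cores; the performance with 8 threads is much better than with 12 threads. – veda
Thanks for the information. – nfechner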
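For the kind of thread placement veda describes, one possible approach is to pick one logical CPU per physical core from the OScpu# rows and pin workers to those. This is a rough sketch only, assuming Linux and Python's `os.sched_setaffinity`; the `SIBLINGS` table is copied by hand from the sample output in the answer (it will differ on other machines), and the helper names are hypothetical.

```python
import os

# (package, core) -> OS CPU numbers of the hardware threads on that core,
# transcribed from the OScpu# rows of the sample output. Machine-specific.
SIBLINGS = {
    (0, 0): [0, 8], (0, 1): [1, 9], (0, 2): [2, 10], (0, 3): [3, 11],
    (1, 0): [4, 12], (1, 1): [5, 13], (1, 2): [6, 14], (1, 3): [7, 15],
}


def one_cpu_per_core(siblings):
    """Return the first logical CPU of every physical core, ordered by
    (package, core), so no two chosen CPUs share a core."""
    return [cpus[0] for _, cpus in sorted(siblings.items())]


def pin_to_cpu(cpu):
    """Restrict the caller to a single logical CPU (Linux only).

    Passing 0 as the pid targets the caller itself.
    """
    os.sched_setaffinity(0, {cpu})


if __name__ == "__main__":
    targets = one_cpu_per_core(SIBLINGS)
    print("one logical CPU per physical core:", targets)
    # Example: pin this process to the first target, if the OS allows it.
    if hasattr(os, "sched_setaffinity") and targets[0] in os.sched_getaffinity(0):
        pin_to_cpu(targets[0])
```

Filling physical cores first and only then using the second hardware thread of each core is a common starting point when adding Hyper-Threading siblings seems to hurt, but whether it helps here depends on the workload.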