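GNU Parallel: "Max jobs to run" does not equal the number of jobs specified when using GNU Parallel on a remote server?
I am trying to run many small serial jobs with GNU Parallel on a PBS cluster. Each compute node has 16 cores, and because I want to use multiple compute nodes, I pass the -S $SERVERNAME option to GNU Parallel. What I observe is that when I use -S $SERVERNAME and intend to launch more than 9 jobs, the number of jobs started on the node is not the same as the number I specified. My observations are below: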
[fchen14@shelob001 ~]$ parallel --version
GNU parallel 20160922
Copyright (C) 2007,2008,2009,2010,2011,2012,2013,2014,2015,2016
Ole Tange and Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
GNU parallel comes with no warranty.
Web site: http://www.gnu.org/software/parallel
When using programs that use GNU Parallel to process data for publication
please cite as described in 'parallel --citation'.
[fchen14@shelob001 ~]$ hostname # this shows my hostname
shelob001
When I run GNU Parallel on the local host, without -S $SERVERNAME, there is no problem: I spawn 10 jobs and GNU Parallel starts 10 jobs:
[fchen14@shelob001 ~]$ parallel --progress echo ::: `seq 1 10`
Computers/CPU cores/Max jobs to run
1:local/16/10 # 10 jobs spawned, no problem
Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
local:10/0/100%/0.0s 1
local:9/1/100%/0.0s 2
local:8/2/100%/0.0s 3
local:7/3/100%/0.0s 4
local:6/4/100%/0.0s 5
local:5/5/100%/0.0s 6
local:4/6/100%/0.0s 7
local:3/7/100%/0.0s 8
local:2/8/100%/0.0s 9
local:1/9/100%/0.0s 10
local:0/10/100%/0.0s
When I run GNU Parallel with -S $SERVERNAME and spawn fewer than 10 jobs, there is also no problem:
[fchen14@shelob001 ~]$ parallel -S shelob001 --progress echo ::: `seq 1 1`
Computers/CPU cores/Max jobs to run
1:shelob001/16/1 # When the number of jobs is less than 10, no problem
Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
shelob001:1/0/100%/0.0s 1
shelob001:0/1/100%/1.0s
[fchen14@shelob001 ~]$ parallel -S shelob001 --progress echo ::: `seq 1 8`
Computers/CPU cores/Max jobs to run
1:shelob001/16/8 # When the number of jobs is less than 10, no problem
Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
shelob001:8/0/100%/0.0s 1
shelob001:7/1/100%/1.0s 7
shelob001:6/2/100%/0.5s 3
shelob001:5/3/100%/0.3s 8
shelob001:4/4/100%/0.2s 5
shelob001:3/5/100%/0.2s 2
shelob001:2/6/100%/0.2s 6
shelob001:1/7/100%/0.1s 4
shelob001:0/8/100%/0.1s
[fchen14@shelob001 ~]$ parallel -S shelob001 --progress echo ::: `seq 1 9`
Computers/CPU cores/Max jobs to run
1:shelob001/16/9 # When the number of jobs is less than 10, no problem
Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
shelob001:9/0/100%/0.0s 1
shelob001:8/1/100%/1.0s 5
shelob001:7/2/100%/0.5s 8
shelob001:6/3/100%/0.3s 2
shelob001:5/4/100%/0.2s 6
shelob001:4/5/100%/0.2s 9
shelob001:3/6/100%/0.2s 3
shelob001:2/7/100%/0.1s 4
shelob001:1/8/100%/0.1s 7
shelob001:0/9/100%/0.1s
What confuses me is what happens when I request 10 or more jobs: the number of jobs started is always one less than I wanted. Here I want to start 10 jobs, but only 9 are started:
[fchen14@shelob001 ~]$ parallel -S shelob001 --progress echo ::: `seq 1 10` # I want to start 10 jobs
Computers/CPU cores/Max jobs to run
1:shelob001/16/9 # why is "Max jobs to run" 9 here?
Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
shelob001:9/0/100%/0.0s 2
shelob001:9/1/100%/3.0s 1
shelob001:8/2/100%/1.5s 7
shelob001:7/3/100%/1.0s 4
shelob001:6/4/100%/0.8s 9
shelob001:5/5/100%/0.6s 8
shelob001:4/6/100%/0.5s 3
shelob001:3/7/100%/0.4s 5
shelob001:2/8/100%/0.4s 6
shelob001:1/9/100%/0.4s 10
shelob001:0/10/100%/0.4s
[fchen14@shelob001 ~]$ parallel -S shelob001 --progress echo ::: `seq 1 11`
Computers/CPU cores/Max jobs to run
1:shelob001/16/10 # the number of jobs started seems to be one less than I specified
Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
shelob001:10/0/100%/0.0s 1
shelob001:10/1/100%/3.0s 2
shelob001:9/2/100%/1.5s 8
shelob001:8/3/100%/1.0s 3
shelob001:7/4/100%/0.8s 4
shelob001:6/5/100%/0.6s 5
shelob001:5/6/100%/0.5s 7
shelob001:4/7/100%/0.4s 10
shelob001:3/8/100%/0.4s 9
shelob001:2/9/100%/0.3s 6
shelob001:1/10/100%/0.4s 11
shelob001:0/11/100%/0.4s
[fchen14@shelob001 ~]$
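To compare the requested job count with what GNU Parallel reports, the "Max jobs to run" field can be pulled out of the --progress header. This is only a minimal sketch: the helper name max_jobs is hypothetical, it assumes the header layout shown in the runs above (N:host/cores/maxjobs), and the sample input is copied from the seq 1 10 run rather than produced by a live call.

```shell
#!/bin/sh
# Hypothetical helper: read `parallel --progress` output on stdin and print
# the "Max jobs to run" value from the line that follows the header.
max_jobs() {
    awk -F/ '
        found {
            gsub(/^[ \t]+/, "", $3)    # drop leading blanks, if any
            sub(/[ \t#].*$/, "", $3)   # drop trailing annotations
            print $3
            exit
        }
        /^Computers/ { found = 1 }     # the next line carries the values
    '
}

# Sample header copied from the seq 1 10 run above.
requested=10
reported=$(printf '%s\n' \
    'Computers/CPU cores/Max jobs to run' \
    '1:shelob001/16/9' | max_jobs)
echo "requested=$requested reported=$reported"

# Against a live run (assumption: the progress report goes to stderr):
# parallel -S shelob001 --progress echo ::: `seq 1 10` 2>&1 >/dev/null | max_jobs
```

With the sample header this prints requested=10 reported=9, which matches the discrepancy described above.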
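I checked the status of the compute node with "top", and only 9 CPUs are used when I run with seq 1 10. I hope I have made my problem clear. Can anyone point out the cause of this problem? Any suggestions are greatly appreciated.
Thanks!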
Thanks. But -j+1 does not help:
[fchen14@shelob001 ~]$ parallel -j+1 -S 16/shelob001 --progress echo ::: `seq 1 10`
Computers/CPU cores/Max jobs to run
1:shelob001/16/9
Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
shelob001:9/0/100%/0.0s 1
shelob001:9/1/100%/2.0s 2
shelob001:8/2/100%/1.0s 9
shelob001:7/3/100%/0.7s 7
shelob001:6/4/100%/0.5s 3
shelob001:5/5/100%/0.4s 8
shelob001:4/6/100%/0.3s 5
shelob001:3/7/100%/0.3s 4
shelob001:2/8/100%/0.2s 6
shelob001:1/9/100%/0.3s 10
shelob001:0/10/100%/0.3s