How do I pass wildcard arguments to a Perl script in a Snakefile?

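I am trying to write a rule in a Snakefile that runs a custom Perl script. The script takes two input files and writes one output file. I want to run it on several files, so the input and output file names contain wildcards. However, when I use expand to generate the different input and output files, the Perl script receives all possible input files at once, whereas I want it to process them one at a time. What should I do to "feed" the input files to Perl one by one? This is my code: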

DOMAINS= ["Metallophos", "PP2C", "Y_phosphatase"] 
SUPERGROUPS=["2supergroups","5supergroups"] 

rule add_supergroups: 
    input: 
     newick=expand("data/{domain}/{supergroup}/RAxML_bipartitionsBranchLabels.bbhlist.txt.{domain}.fa.aligned.rp.me-25.id.phylip",domain=DOMAINS, supergroup=SUPERGROUPS), 
     sup="data/species.v3.1.1.supergroups.txt" 
    output: 
     expand("results/{domain}/{supergroup}/RAxML_bipartitionsBranchLabels.bbhlist.txt.{domain}.fa.aligned.rp.me-25.id.phylip.supergroups", domain=DOMAINS, supergroup=SUPERGROUPS) 
    shell: 
     "perl scripts/change_newick.pl {input.sup} {input.newick} {output}"

2017-03-16 lvw

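The reason is simple: the expand() function.

As you know, expand builds a list of Python strings, which is very convenient for file management in Snakemake.

But in your example you want the rule to run the Perl script on a single file from {input.newick}, together with {input.sup}, and to produce a single output file, not a list of files.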
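To make the effect concrete, here is a minimal plain-Python sketch of what happens when expand is used inside the rule. The helper expand_sketch is a hypothetical, simplified stand-in for Snakemake's expand (it only handles the keyword-list case), and the file name tree.{domain}.phylip is a shortened placeholder for the long RAxML name in the question.

```python
from itertools import product

DOMAINS = ["Metallophos", "PP2C", "Y_phosphatase"]
SUPERGROUPS = ["2supergroups", "5supergroups"]


def expand_sketch(pattern, **wildcards):
    """Simplified stand-in for expand(): one string per combination of values."""
    names = list(wildcards)
    return [
        pattern.format(**dict(zip(names, combo)))
        for combo in product(*(wildcards[name] for name in names))
    ]


newick = expand_sketch(
    "data/{domain}/{supergroup}/tree.{domain}.phylip",  # shortened, hypothetical file name
    domain=DOMAINS,
    supergroup=SUPERGROUPS,
)

# In a shell command, a list-valued input is joined with spaces,
# so the Perl script receives every path in a single invocation.
command = (
    "perl scripts/change_newick.pl data/species.v3.1.1.supergroups.txt "
    + " ".join(newick)
)
print(len(newick))
print(command)
```

With three domains and two supergroups the list holds six paths, and all six end up on one command line, which is the behavior described in the question.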

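You can easily solve the problem by not using the expand function on the input and output.

But then how does Snakemake know that it has to create all the files? By adding a target rule before rule add_supergroups whose input is the full list of files that add_supergroups should produce.

Let's look at some code: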

DOMAINS= ["Metallophos", "PP2C", "Y_phosphatase"] 
SUPERGROUPS=["2supergroups","5supergroups"] 

rule target: 
    input: 
     expand("results/{domain}/{supergroup}/RAxML_bipartitionsBranchLabels.bbhlist.txt.{domain}.fa.aligned.rp.me-25.id.phylip.supergroups", domain=DOMAINS, supergroup=SUPERGROUPS) 

rule add_supergroups: 
    input: 
     newick="data/{domain}/{supergroup}/RAxML_bipartitionsBranchLabels.bbhlist.txt.{domain}.fa.aligned.rp.me-25.id.phylip", 
     sup="data/species.v3.1.1.supergroups.txt" 
    output: 
     "results/{domain}/{supergroup}/RAxML_bipartitionsBranchLabels.bbhlist.txt.{domain}.fa.aligned.rp.me-25.id.phylip.supergroups" 
    shell: 
     "perl scripts/change_newick.pl {input.sup} {input.newick} {output}"

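This should work now. Snakemake needs the list of files in the target rule, so it searches all the rules to find out whether one of them can create these files.

In this case, it recognizes that each file name matches the output pattern of add_supergroups. It then fills in the wildcards automatically with the values from DOMAINS and SUPERGROUPS, and rule add_supergroups runs on one file at a time.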
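The matching step can be illustrated in plain Python. This is only a rough sketch of the idea, not Snakemake's actual implementation: pattern_to_regex and infer_input are hypothetical helpers, the file names are shortened placeholders, and the sketch assumes that an unconstrained wildcard matches like the regex .+ and that a repeated wildcard must match the same text each time.

```python
import re


def pattern_to_regex(pattern):
    """Turn 'a/{x}/b.{x}' into a regex; a repeated wildcard becomes a backreference."""
    seen = set()
    parts = []
    pos = 0
    for m in re.finditer(r"\{(\w+)\}", pattern):
        parts.append(re.escape(pattern[pos:m.start()]))  # literal text between wildcards
        name = m.group(1)
        if name in seen:
            parts.append(f"(?P={name})")  # same wildcard again: must repeat the same value
        else:
            parts.append(f"(?P<{name}>.+)")  # first occurrence: capture the value
            seen.add(name)
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("".join(parts) + "$")


# Shortened, hypothetical stand-ins for the long RAxML file names.
OUTPUT = "results/{domain}/{supergroup}/tree.{domain}.phylip.supergroups"
INPUT = "data/{domain}/{supergroup}/tree.{domain}.phylip"


def infer_input(target):
    """Match one requested file against the output pattern and fill in the input."""
    match = pattern_to_regex(OUTPUT).match(target)
    if match is None:
        return None  # this rule cannot produce the requested file
    return INPUT.format(**match.groupdict())


print(infer_input("results/PP2C/5supergroups/tree.PP2C.phylip.supergroups"))
```

Each file requested by the target rule is matched on its own, which is why the shell command then sees exactly one input.newick and one output per job.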

2017-03-16 16:13:27

Hehe, you were obviously faster than me ;-) – rioualen

I like that there are two good answers by two new users. Keep it up. :-) – simbabque

We are both part of the French "Snakemake community" :) – rioualen

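You can remove the expand() function from the rule and define the targets in a rule "all" instead. The wildcard values for rule add_supergroups are then inferred automatically from these target files.

You can even use different wildcard names in rule "add_supergroups", as long as Snakemake can recognize and match the pattern.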

DOMAINS= ["Metallophos", "PP2C", "Y_phosphatase"] 
SUPERGROUPS=["2supergroups","5supergroups"] 

rule all: 
    input: expand("results/{domain}/{supergroup}/RAxML_bipartitionsBranchLabels.bbhlist.txt.{domain}.fa.aligned.rp.me-25.id.phylip.supergroups", domain=DOMAINS, supergroup=SUPERGROUPS) 

rule add_supergroups: 
    input: 
     newick="data/{domain}/{supergroup}/RAxML_bipartitionsBranchLabels.bbhlist.txt.{domain}.fa.aligned.rp.me-25.id.phylip", 
     sup="data/species.v3.1.1.supergroups.txt" 
    output: 
     "results/{domain}/{supergroup}/RAxML_bipartitionsBranchLabels.bbhlist.txt.{domain}.fa.aligned.rp.me-25.id.phylip.supergroups" 
    shell: 
     "perl scripts/change_newick.pl {input.sup} {input.newick} {output}"

In theory, it should also work like this:

DOMAINS= ["Metallophos", "PP2C", "Y_phosphatase"] 
SUPERGROUPS=["2supergroups","5supergroups"] 

rule all: 
    input: expand("results/{domain}/{supergroup}/RAxML_bipartitionsBranchLabels.bbhlist.txt.{domain}.fa.aligned.rp.me-25.id.phylip.supergroups", domain=DOMAINS, supergroup=SUPERGROUPS) 

rule add_supergroups: 
    input: 
     newick="data/{foo}", 
     sup="data/species.v3.1.1.supergroups.txt" 
    output: 
     "results/{foo}.supergroups" 
    shell: 
     "perl scripts/change_newick.pl {input.sup} {input.newick} {output}"

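A quick way to see why a single {foo} wildcard can stand in for the whole path is the following plain-Python sketch. It assumes that an unconstrained wildcard matches like the regex .+ and can therefore span "/" characters; the target name is a shortened, hypothetical placeholder.

```python
import re

# Assumption: an unconstrained wildcard behaves like ".+", so it may contain "/".
OUTPUT_RE = re.compile(r"results/(?P<foo>.+)\.supergroups$")

# Shortened, hypothetical target name requested by rule all.
target = "results/PP2C/5supergroups/tree.PP2C.phylip.supergroups"

foo = OUTPUT_RE.match(target).group("foo")
newick = f"data/{foo}"  # the input pattern "data/{foo}" with the wildcard filled in
print(foo)
print(newick)
```

Here foo captures the domain directory, the supergroup directory and the file name in one go, so the input path is rebuilt without naming the individual parts.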
2017-03-16 17:37:29 rioualen
