まず、質問タイトルは質問をよく説明していないと思います。タイトルを変更したり、より良いものをお勧めします。パンダのデータフレームを行の名前で変更する
"sample","module","status","tot.seq","seq.length","pct.gc","pct.dup"
"ERR435952_cleaned_1","Basic Statistics","PASS","15529112","62",47,41.66
"ERR435952_cleaned_1","Per base sequence quality","FAIL","15529112","62",47,41.66
"ERR435952_cleaned_1","Per tile sequence quality","FAIL","15529112","62",47,41.66
"ERR435952_cleaned_1","Per sequence quality scores","PASS","15529112","62",47,41.66
"ERR435952_cleaned_1","Per base sequence content","PASS","15529112","62",47,41.66
"ERR435952_cleaned_1","Per sequence GC content","PASS","15529112","62",47,41.66
"ERR435952_cleaned_1","Per base N content","PASS","15529112","62",47,41.66
"ERR435952_cleaned_1","Sequence Length Distribution","PASS","15529112","62",47,41.66
"ERR435952_cleaned_1","Sequence Duplication Levels","WARN","15529112","62",47,41.66
"ERR435952_cleaned_1","Overrepresented sequences","WARN","15529112","62",47,41.66
"ERR435952_cleaned_1","Adapter Content","PASS","15529112","62",47,41.66
"ERR435952_cleaned_1","Kmer Content","FAIL","15529112","62",47,41.66
"ERR435952_cleaned_2","Basic Statistics","PASS","15529112","62",48,42.44
"ERR435952_cleaned_2","Per base sequence quality","PASS","15529112","62",48,42.44
"ERR435952_cleaned_2","Per tile sequence quality","WARN","15529112","62",48,42.44
"ERR435952_cleaned_2","Per sequence quality scores","PASS","15529112","62",48,42.44
"ERR435952_cleaned_2","Per base sequence content","PASS","15529112","62",48,42.44
"ERR435952_cleaned_2","Per sequence GC content","WARN","15529112","62",48,42.44
"ERR435952_cleaned_2","Per base N content","PASS","15529112","62",48,42.44
"ERR435952_cleaned_2","Sequence Length Distribution","PASS","15529112","62",48,42.44
"ERR435952_cleaned_2","Sequence Duplication Levels","WARN","15529112","62",48,42.44
"ERR435952_cleaned_2","Overrepresented sequences","WARN","15529112","62",48,42.44
"ERR435952_cleaned_2","Adapter Content","PASS","15529112","62",48,42.44
"ERR435952_cleaned_2","Kmer Content","FAIL","15529112","62",48,42.44
そして、私はこのような何かに変換したいので、私は(PASS/FAIL /値をWARNに基づいて、単純なヒートマップを作成することができます総読み取り数を含む:tot.seq):
私は行の数をカウントすることでこれを行うことができます(各モジュール/機能値の間隔には相関があります)が、これはまったくきれいではありません大規模なデータセットでも効率的かどうかはわかりません。
ありがとう!最初の質問では、各サンプル(モジュールごとに繰り返す)の繰り返し値であるため、読み取り回数(tot.seq)を追加するのを忘れました。どのように1回だけ追加できますか? – Siddharth