Reading a complex table with pandas ('task-spooler')
The following table is the output of task-spooler.
It is easy for a human to parse, but I cannot read it into a pandas DataFrame.
ID State Output E-Level Times(r/u/s) Command [run=1/2]
6 running /tmp/ts-out.FzVneG [l1]python infloop.py
0 finished /tmp/ts-out.ixWHm2 0 0.00/0.00/0.00 bash -c echo 1
1 finished /tmp/ts-out.ZzwS11 0 0.00/0.00/0.00 bash -c echo 1
2 finished /tmp/ts-out.GJlyge 2 0.00/0.00/0.00 bash -c
4 finished /tmp/ts-out.lIVMYH 2 0.00/0.00/0.00 bash -c -h
5 finished /tmp/ts-out.8EKHy1 -1 141.23/0.00/0.00 python infloop.py
3 finished /tmp/ts-out.lBr4Wy -1 2545.36/0.00/0.02 bash -c python infloop.py
7 finished /tmp/ts-out.kxCczi 2 0.01/0.00/0.00 bash -c
8 finished /tmp/ts-out.3VkfNh 0 0.00/0.00/0.00 echo
9 finished /tmp/ts-out.8ewxzl 0 0.01/0.00/0.00 echo
10 finished /tmp/ts-out.ahSLaY 0 0.00/0.00/0.00 bash -c echo $GPUID
11 finished /a/home/cc/cs/yuvval/tmp/ts-out.3dpaBO 0 0.00/0.00/0.00 bash -c ls
12 finished /tmp/ts-out.ADWkve 0 0.00/0.00/0.00 bash -c ls
13 finished /a/home/cc/cs/yuvval/tmp/ts-out.xm0jtn -1 130.67/0.00/0.02 bash -c python infloop.py
14 finished /tmp/ts-out.HxBqkm 0 0.00/0.00/0.00 bash -c echo 11
15 finished /tmp/ts-out.ERNuaE 0 0.00/0.00/0.00 bash -c echo
16 finished /tmp/ts-out.9j6hkS 0 0.00/0.00/0.00 bash -c echo $GPUID
17 finished /tmp/ts-out.Y5QDNa 0 0.00/0.00/0.00 bash -c echo $GPUID
18 finished /tmp/ts-out.EIHhoX -1 0.00/0.00/0.00 %s
19 finished /tmp/ts-out.LLw2Wl -1 0.00/0.00/0.00
20 finished /tmp/ts-out.deWAJR -1 0.01/0.00/0.00 echo $GPUID
21 finished /tmp/ts-out.AdZFIf -1 0.00/0.00/0.00 echo 12
22 finished /tmp/ts-out.NBOCVv 0 0.00/0.00/0.00 echo 12
23 finished /tmp/ts-out.5WpfPu 0 0.00/0.00/0.00 echo
24 finished /tmp/ts-out.1lw4bS -1 0.00/0.00/0.00 echo
25 finished /tmp/ts-out.7MNGLQ 0 0.00/0.00/0.00 bash -c echo $GPUID
26 finished /tmp/ts-out.8FZ3on 0 0.00/0.00/0.00 bash -c echo $GPUID
My best attempt was:
import pandas as pd
from io import StringIO as sIO  # on Python 2: from StringIO import StringIO as sIO
std = ...  # the table text
pd.read_table(sIO(std), sep=r'\s+', engine='python')
Error:
ValueError: Expected 7 fields in line 2, saw 9
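A likely reason for this error is that the Command column itself contains spaces, so splitting on whitespace yields a different number of fields on each line. The short check below is only an illustration, using three lines copied from the table as displayed above; `lines` and `field_counts` are names introduced here for the example.

```python
# Three lines copied from the table above, with whitespace as shown there.
lines = [
    "ID State Output E-Level Times(r/u/s) Command [run=1/2]",
    "6 running /tmp/ts-out.FzVneG [l1]python infloop.py",
    "0 finished /tmp/ts-out.ixWHm2 0 0.00/0.00/0.00 bash -c echo 1",
]

# Count whitespace-separated tokens per line, which is what sep=r'\s+' splits on.
field_counts = [len(line.split()) for line in lines]
print(field_counts)  # [7, 5, 9]
```

The header yields 7 tokens, while the first finished job yields 9, which matches the "Expected 7 fields ... saw 9" message.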
EDIT: The source code that generates the table is available. The statement that generates each line is shown below. Does this help with reading the table into a DataFrame?
if (p->label)
    snprintf(line, maxlen, "%-4i %-10s %-20s %-8i %0.2f/%0.2f/%0.2f %s[%s]"
            "%s\n",
            p->jobid,
            jobstate,
            output_filename,
            p->result.errorlevel,
            p->result.real_ms,
            p->result.user_ms,
            p->result.system_ms,
            dependstr,
            p->label,
            p->command);
else
    snprintf(line, maxlen, "%-4i %-10s %-20s %-8i %0.2f/%0.2f/%0.2f %s%s\n",
            p->jobid,
            jobstate,
            output_filename,
            p->result.errorlevel,
            p->result.real_ms,
            p->result.user_ms,
            p->result.system_ms,
            dependstr,
            p->command);
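Since the format string fixes the order of the fields (job id, state, output file, error level, three times separated by slashes, then the command), one possible approach is to match each line with a regular expression and build the DataFrame from the captured groups. The following is a minimal sketch under assumptions, not a definitive parser: the names `LINE_RE`, `COLUMNS`, and `parse_ts_output` are hypothetical, the error level and times are treated as optional because the running job in the table does not show them, and any dependency string or `[label]` prefix is left inside the command column.

```python
import re
import pandas as pd

# Assumed line layout, following the snprintf format above:
#   id  state  output  [errorlevel  real/user/system]  command
# The bracketed part is optional because running jobs do not show it.
LINE_RE = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<state>\S+)\s+(?P<output>\S+)"
    r"(?:\s+(?P<errorlevel>-?\d+)"
    r"\s+(?P<real>[\d.]+)/(?P<user>[\d.]+)/(?P<system>[\d.]+))?"
    r"\s*(?P<command>.*)$"
)
COLUMNS = ["id", "state", "output", "errorlevel",
           "real", "user", "system", "command"]


def parse_ts_output(text):
    """Parse task-spooler listing text into a DataFrame (illustrative sketch)."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m is None:
            continue  # header line or anything that does not start with a job id
        row = m.groupdict()
        row["id"] = int(row["id"])
        for key in ("errorlevel", "real", "user", "system"):
            # Missing values (running jobs) become NaN.
            row[key] = float(row[key]) if row[key] is not None else float("nan")
        rows.append(row)
    return pd.DataFrame(rows, columns=COLUMNS)


# A few lines taken from the table in the question.
sample = """ID State Output E-Level Times(r/u/s) Command [run=1/2]
6 running /tmp/ts-out.FzVneG [l1]python infloop.py
5 finished /tmp/ts-out.8EKHy1 -1 141.23/0.00/0.00 python infloop.py
19 finished /tmp/ts-out.LLw2Wl -1 0.00/0.00/0.00
11 finished /a/home/cc/cs/yuvval/tmp/ts-out.3dpaBO 0 0.00/0.00/0.00 bash -c ls
"""
df = parse_ts_output(sample)
print(df)
```

Because the pattern relies on `\s+` rather than fixed column widths, it should tolerate output paths longer than the 20-character field. It has only been checked against the line shapes shown in the question; a command that itself begins with something resembling an error level and times would be misread.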
Is it tab-separated? Try `sep='\t'`. – EdChum
@EdChum, no. Using '\t' puts all the columns into a single column. – yuval
How about `df = pd.read_csv('file', sep=r'\s{2,}', engine='python')`? The separator is a regular expression meaning 'two or more whitespace characters'. – jezrael