2017-05-11 12 views
2

[OK]をので、私はpandas pivot table documentationを見て、以下同様の質問している:私の問題は、出力timedeltasたときに集約されるしかしフォーマット集計はtimedelta

彼らは1 day 12:30:00のフォーマットを表示して、それを36:30:00と表示したいとします。これを行う方法はありますか?

CODE

pd.pivot_table(
    df, 
    index=["date","task_type"], 
    columns=["calendar"], 
    values="duration", 
    aggfunc=np.sum, 
    fill_value=timedelta(hours=0), 
    margins=True, 
    margins_name="TOTAL" 
) 

OUTPUT

calendar     Proj1 Proj2   Proj3   TOTAL 
date  task_type 
2017-05-04    01:00:00 00:00:00 0 days 00:00:00 0 days 01:00:00 
      development 00:00:00 00:00:00 0 days 02:00:00 0 days 02:00:00 
      training 00:00:00 00:00:00 0 days 04:00:00 0 days 04:00:00 
2017-05-05 admin  00:00:00 02:00:00 0 days 00:00:00 0 days 02:00:00 
      development 00:30:00 00:00:00 0 days 03:30:00 0 days 04:00:00 
      meeting  00:00:00 00:00:00 0 days 01:00:00 0 days 01:00:00 
2017-05-08    01:30:00 00:00:00 0 days 00:00:00 0 days 01:30:00 
      admin  00:00:00 02:00:00 0 days 00:00:00 0 days 02:00:00 
      development 00:00:00 00:00:00 0 days 05:00:00 0 days 05:00:00 
2017-05-09 admin  02:00:00 01:00:00 0 days 00:00:00 0 days 03:00:00 
      development 00:00:00 00:00:00 0 days 01:00:00 0 days 01:00:00 
      research 01:00:00 00:00:00 0 days 00:00:00 0 days 01:00:00 
      training 00:00:00 00:00:00 0 days 03:30:00 0 days 03:30:00 
2017-05-10 admin  00:00:00 01:30:00 0 days 00:00:00 0 days 01:30:00 
      development 00:00:00 00:00:00 0 days 02:30:00 0 days 02:30:00 
      meeting  02:00:00 00:00:00 0 days 00:00:00 0 days 02:00:00 
      training 00:00:00 00:00:00 0 days 02:00:00 0 days 02:00:00 
2017-05-11 admin  00:00:00 01:00:00 0 days 00:00:00 0 days 01:00:00 
      development 00:00:00 02:30:00 0 days 00:00:00 0 days 02:30:00 
TOTAL     08:00:00 10:00:00 1 days 00:30:00 1 days 18:30:00 

そう簡単ではない

calendar     Proj1 Proj2   Proj3   TOTAL 
date  task_type 
2017-05-04    01:00:00 00:00:00  00:00:00  01:00:00 
      development 00:00:00 00:00:00  02:00:00  02:00:00 
      training 00:00:00 00:00:00  04:00:00  04:00:00 
2017-05-05 admin  00:00:00 02:00:00  00:00:00  02:00:00 
      development 00:30:00 00:00:00  03:30:00  04:00:00 
      meeting  00:00:00 00:00:00  01:00:00  01:00:00 
2017-05-08    01:30:00 00:00:00  00:00:00  01:30:00 
      admin  00:00:00 02:00:00  00:00:00  02:00:00 
      development 00:00:00 00:00:00  05:00:00  05:00:00 
2017-05-09 admin  02:00:00 01:00:00  00:00:00  03:00:00 
      development 00:00:00 00:00:00  01:00:00  01:00:00 
      research 01:00:00 00:00:00  00:00:00  01:00:00 
      training 00:00:00 00:00:00  03:30:00  03:30:00 
2017-05-10 admin  00:00:00 01:30:00  00:00:00  01:30:00 
      development 00:00:00 00:00:00  02:30:00  02:30:00 
      meeting  02:00:00 00:00:00  00:00:00  02:00:00 
      training 00:00:00 00:00:00  02:00:00  02:00:00 
2017-05-11 admin  00:00:00 01:00:00  00:00:00  01:00:00 
      development 00:00:00 02:30:00  00:00:00  02:30:00 
TOTAL     08:00:00 10:00:00  24:30:00  42:30:00 

答えて

1

を期待したが、私は0を実装してみてくださいapplyとソリューション:

def f(x): 
    ts = x.dt.total_seconds() 
    hours, remainder = divmod(ts, 3600) 
    minutes, seconds = divmod(remainder, 60) 
    return hours.abs().astype(int).astype(str).str.zfill(2) + ':' + \ 
      minutes.abs().astype(int).astype(str).str.zfill(2) 


df1 = df.apply(f) 
print (df1) 
calendar      Proj1 Proj2 Proj3 TOTAL 
date    task_type        
2017-04-03 15:30:00 c   00:00 00:00 00:00 00:00 
        e   10:30 10:30 00:00 21:00 
2017-04-03 19:00:00 d   24:30 00:00 10:30 35:00 
2017-04-04 05:30:00 a   00:00 03:30 00:00 03:30 
2017-04-04 19:30:00 a   00:00 21:00 00:00 21:00 
2017-04-04 23:00:00 a   00:00 00:00 03:30 03:30 
        c   00:00 00:00 17:30 17:30 
        d   00:00 17:30 00:00 17:30 
TOTAL       35:00 52:30 31:30 119:00 

...とapplymapを持つ:

def f(x): 
    ts = x.total_seconds() 
    hours, remainder = divmod(ts, 3600) 
    minutes, seconds = divmod(remainder, 60) 
    return ('{:02d}:{:02d}').format(int(hours), int(minutes)) 

df2 = df.applymap(f) 
print (df2) 
calendar      Proj1 Proj2 Proj3 TOTAL 
date    task_type        
2017-04-03 15:30:00 c   00:00 00:00 00:00 00:00 
        e   10:30 10:30 00:00 21:00 
2017-04-03 19:00:00 d   24:30 00:00 10:30 35:00 
2017-04-04 05:30:00 a   00:00 03:30 00:00 03:30 
2017-04-04 19:30:00 a   00:00 21:00 00:00 21:00 
2017-04-04 23:00:00 a   00:00 00:00 03:30 03:30 
        c   00:00 00:00 17:30 17:30 
        d   00:00 17:30 00:00 17:30 
TOTAL       35:00 52:30 31:30 119:00 

サンプルのセットアップ:

N = 10 
np.random.seed(10) 
rng = pd.date_range('2017-04-03 15:30:00', periods=N, freq='3.5H') 
df = pd.DataFrame({'date': np.random.choice(rng, size=N), 
        'task_type': np.random.choice(list('abcde'), size=N), 
        'calendar': np.random.choice(['Proj1','Proj2','Proj3'], size=N), 
        'duration': np.abs(np.random.choice(rng, size=N) - 
             np.random.choice(rng, size=N)) }) 
#print (df) 



df = pd.pivot_table(
    df, 
    index=["date","task_type"], 
    columns=["calendar"], 
    values="duration", 
    aggfunc=np.sum, 
    fill_value=pd.Timedelta(hours=0), 
    margins=True, 
    margins_name="TOTAL" 
) 

print (df) 
calendar        Proj1   Proj2   Proj3 \ 
date    task_type             
2017-04-03 15:30:00 c   0 days 00:00:00 0 days 00:00:00 0 days 00:00:00 
        e   0 days 10:30:00 0 days 10:30:00 0 days 00:00:00 
2017-04-03 19:00:00 d   1 days 00:30:00 0 days 00:00:00 0 days 10:30:00 
2017-04-04 05:30:00 a   0 days 00:00:00 0 days 03:30:00 0 days 00:00:00 
2017-04-04 19:30:00 a   0 days 00:00:00 0 days 21:00:00 0 days 00:00:00 
2017-04-04 23:00:00 a   0 days 00:00:00 0 days 00:00:00 0 days 03:30:00 
        c   0 days 00:00:00 0 days 00:00:00 0 days 17:30:00 
        d   0 days 00:00:00 0 days 17:30:00 0 days 00:00:00 
TOTAL       1 days 11:00:00 2 days 04:30:00 1 days 07:30:00 

calendar        TOTAL 
date    task_type     
2017-04-03 15:30:00 c   0 days 00:00:00 
        e   0 days 21:00:00 
2017-04-03 19:00:00 d   1 days 11:00:00 
2017-04-04 05:30:00 a   0 days 03:30:00 
2017-04-04 19:30:00 a   0 days 21:00:00 
2017-04-04 23:00:00 a   0 days 03:30:00 
        c   0 days 17:30:00 
        d   0 days 17:30:00 
TOTAL       4 days 23:00:00 
+0

私は 'applymap'オプションと一緒に行きました。 –