Create new columns based on existing values in Python

I am trying to create new columns based on a date range, so that I can see how much EMI falls in each month.

Start Date  End Date  EMI
01/12/16    01/12/17  4800
09/01/16    09/01/17  3000
01/07/15    01/05/16  2300

The above is the input file. I would like the output file to look like this, and would appreciate suggestions on how to implement it in Python:

Start Date  End Date  EMI   06/16  07/16  08/16  09/16  10/16  11/16  12/16  01/17  02/17
01/12/16    01/12/17  4800  4800   4800   4800   4800   4800   4800   4800   4800   0
09/01/16    09/01/17  3000  0      0      0      3000   3000   3000   3000   3000   3000
01/07/15    01/05/16  2300  0      0      0      0      0      0      0      0      0

Please advise how this can be done.

Source

2016-11-09 yasin mohammed

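I'm completely confused! How did you arrive at the columns in the output? What determines the values? – piRSquared

If a month falls within the date range, the column for that month should be populated with the EMI value. I have edited the sample file. –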
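The rule described in this comment can be sketched for a single row as follows. This is only an illustration, not code from the thread: the helper name `monthly_emi` is hypothetical, the dates are read as MM/DD/YY, and the end month is treated as inclusive because that is what the sample output in the question suggests.

```python
import pandas as pd

def monthly_emi(start, end, emi, months):
    """Return `emi` for each month inside [start, end] (inclusive), else 0.

    Hypothetical helper for illustration; `months` is an iterable of
    monthly pandas Periods.
    """
    start_m = pd.Period(start, freq="M")
    end_m = pd.Period(end, freq="M")
    return [emi if start_m <= m <= end_m else 0 for m in months]

# The month columns shown in the question's desired output.
months = pd.period_range("2016-06", "2017-02", freq="M")

# Second sample row, read as MM/DD/YY: 1 Sep 2016 to 1 Sep 2017, EMI 3000.
row = monthly_emi(pd.Timestamp("2016-09-01"), pd.Timestamp("2017-09-01"), 3000, months)
print(dict(zip(months.strftime("%m/%y"), row)))
```

For the second sample row this gives zeros for 06/16 to 08/16 and 3000 from 09/16 onwards, matching the desired output in the question.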

IIUC：

import pandas as pd

# reshape the two date columns into one and use it as the index
df1 = (pd.melt(df.reset_index(), id_vars=['EMI', 'index'], value_name='date')
         .set_index('date'))
# parse the index as dates (day first) and convert it to a monthly PeriodIndex
df1.index = (pd.to_datetime(df1.index, format='%d/%m/%y', errors='coerce')
               .to_period('M'))
# group by the original row number ('index') and resample by month
df1 = (df1.groupby('index')
          .resample('M')
          .ffill()
          .drop(['variable', 'index'], axis=1)
          .reset_index())
# pivot, fill NaN with 0, cast floats to int
df1 = (df1.pivot(index='index', columns='date', values='EMI')
          .fillna(0)
          .astype(int))
# change the format of the column labels
df1.columns = df1.columns.strftime('%m/%y')
# concatenate with the original dataframe
df = pd.concat([df, df1], axis=1)

print (df) 
  Start Date  End Date   EMI  07/15  08/15  09/15  10/15  11/15  12/15  01/16  \
0   01/12/16  01/12/17  4800      0      0      0      0      0      0      0
1   09/01/16  09/01/17  3000      0      0      0      0      0      0   3000
2   01/07/15  01/05/16  2300   2300   2300   2300   2300   2300   2300   2300

        03/17  04/17  05/17  06/17  07/17  08/17  09/17  10/17  11/17  12/17
0  ...   4800   4800   4800   4800   4800   4800   4800   4800   4800   4800
1  ...      0      0      0      0      0      0      0      0      0      0
2  ...      0      0      0      0      0      0      0      0      0      0

[3 rows x 33 columns]

Source

2016-11-09 18:20:10 jezrael

Can you check my solution? Is '01/12/16' 'DDMMYY' or 'MMDDYY'? – jezrael

The date format is MMDDYY. I made this change in the statement 'df1.index = pd.to_datetime(df1.index, format='%d/%m/%y', errors='coerce')'. –
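To show why the format matters, here is a small sketch (not from the thread) that parses the three sample start dates both ways. The variable names are illustrative only.

```python
import pandas as pd

raw = pd.Series(["01/12/16", "09/01/16", "01/07/15"])

# Day-first reading (DD/MM/YY), as used in the answer's code.
day_first = pd.to_datetime(raw, format="%d/%m/%y", errors="coerce")
# Month-first reading (MM/DD/YY), as stated in the comment above.
month_first = pd.to_datetime(raw, format="%m/%d/%y", errors="coerce")

day_first_months = day_first.dt.to_period("M").astype(str).tolist()
month_first_months = month_first.dt.to_period("M").astype(str).tolist()

print(day_first_months)    # months under the DD/MM/YY reading
print(month_first_months)  # months under the MM/DD/YY reading
```

The same strings land in different months under each reading, so the resulting month columns shift accordingly.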

Also, I have been running this code for 3 hours, although my total file size is only 180 MB: 'df1 = df1.groupby('index').resample('M').ffill().drop(['variable', 'index'], axis=1).reset_index()' –
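One possible way to avoid the per-group resample is to compare month numbers with NumPy broadcasting. The sketch below is an assumption-laden alternative, not the approach from the answer: it reads dates as MM/DD/YY, treats the end month as inclusive, and builds a rows-by-months array in memory, which may itself be large for many rows or a wide date span. Whether it is actually faster on the 180 MB file has not been measured here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Start Date": ["01/12/16", "09/01/16", "01/07/15"],
    "End Date": ["01/12/17", "09/01/17", "01/05/16"],
    "EMI": [4800, 3000, 2300],
})

# Assumption: dates are MM/DD/YY, as stated in the comments.
start = pd.to_datetime(df["Start Date"], format="%m/%d/%y")
end = pd.to_datetime(df["End Date"], format="%m/%d/%y")

# Every month between the earliest start and the latest end.
months = pd.period_range(start.min().to_period("M"),
                         end.max().to_period("M"), freq="M")

# Represent each month as a single integer (year * 12 + month).
start_ord = (start.dt.year * 12 + start.dt.month).to_numpy()
end_ord = (end.dt.year * 12 + end.dt.month).to_numpy()
month_ord = (months.year * 12 + months.month).to_numpy()

# Boolean (rows x months) mask: True where the month lies in [start, end].
active = (start_ord[:, None] <= month_ord[None, :]) & \
         (month_ord[None, :] <= end_ord[:, None])

# Multiply the mask by each row's EMI to get EMI or 0 per month.
wide = pd.DataFrame(active * df["EMI"].to_numpy()[:, None],
                    index=df.index, columns=months.strftime("%m/%y"))
result = pd.concat([df, wide], axis=1)
print(result)
```

The month columns here span the whole range found in the data; selecting a subset such as 06/16 to 02/17 afterwards is a plain column selection on `result`.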
