2016-11-02 5 views
I want to write a function like this:

def copy_blanks(df, column):

Please suggest how to implement it.

Input: 

e-mail,number 
[email protected],0 
[email protected],1 
[email protected],0 
[email protected],0 
[email protected],1 
[email protected],0 
,0 

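copy_blanks(df, column) should copy the values of the original column into a new last column, including the rows where the original column is blank. Here, however, I am also using a default_value option, which can be set to any value. When this option is used, that value is filled in for the blank rows, like this: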

e-mail,number 
[email protected],0 
[email protected],1 
[email protected],0 
[email protected],0 
[email protected],1 
[email protected],0 
NA,0 

However, my output needs both a default value and a skip_blank option. When skip_blank is true, the default value must not be used; when skip_blank is false, the default value is used.

My output:

e-mail,number,e-mail_clean 
[email protected],0,[email protected] 
[email protected],1,[email protected] 
[email protected],0,[email protected] 
[email protected],0,[email protected] 
[email protected],1,[email protected] 
[email protected],0,[email protected] 
,0, 

Answers

If I understand you correctly, consider your sample df:

import pandas as pd

df = pd.DataFrame([
    ['[email protected]', 0], 
    ['[email protected]', 1], 
    ['[email protected]', 0], 
    ['[email protected]', 0], 
    ['[email protected]', 1], 
    ['[email protected]', 0], 
    ['', 0] 
], columns=['e-mail','number']) 

print(df) 

              e-mail  number
0  [email protected]       0
1  [email protected]       1
2  [email protected]       0
3  [email protected]       0
4  [email protected]       1
5  [email protected]       0
6                          0

Consider this function:

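def copy_blanks(df, column, skip_blanks=False, default_value='NA'):
    # Work on a copy so the caller's DataFrame is left unchanged
    df = df.copy()
    s = df[column]
    if not skip_blanks:
        # Fill empty strings with the default value
        s = s.replace('', default_value)
    # Store the result in a new '<column>_clean' column
    df['{}_clean'.format(column)] = s
    return df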

copy_blanks(df, 'e-mail', skip_blanks=False) 

              e-mail  number       e-mail_clean
0  [email protected]       0  [email protected]
1  [email protected]       1  [email protected]
2  [email protected]       0  [email protected]
3  [email protected]       0  [email protected]
4  [email protected]       1  [email protected]
5  [email protected]       0  [email protected]
6                          0                 NA

copy_blanks(df, 'e-mail', skip_blanks=True) 

              e-mail  number       e-mail_clean
0  [email protected]       0  [email protected]
1  [email protected]       1  [email protected]
2  [email protected]       0  [email protected]
3  [email protected]       0  [email protected]
4  [email protected]       1  [email protected]
5  [email protected]       0  [email protected]
6                          0

copy_blanks(df, 'e-mail', skip_blanks=False, default_value='[email protected]') 

              e-mail  number       e-mail_clean
0  [email protected]       0  [email protected]
1  [email protected]       1  [email protected]
2  [email protected]       0  [email protected]
3  [email protected]       0  [email protected]
4  [email protected]       1  [email protected]
5  [email protected]       0  [email protected]
6                          0  [email protected]
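
One caveat if the data comes from a CSV file like the input in the question: with its default settings, pd.read_csv loads an empty field as NaN rather than as an empty string, so replace('', default_value) would not touch it. The sketch below is one possible variant that treats both NaN and '' as blank. The name copy_blanks_na, the sample address a@example.com and the inline CSV text are made up for illustration and are not part of the original question or answer.

```python
import io

import pandas as pd


def copy_blanks_na(df, column, skip_blanks=False, default_value='NA'):
    """Hypothetical variant of copy_blanks that also treats NaN as blank."""
    out = df.copy()
    s = out[column]
    if not skip_blanks:
        # A cell counts as blank if it is missing (NaN) or an empty string
        blank = s.isna() | (s == '')
        # mask() replaces the entries where the condition is True
        s = s.mask(blank, default_value)
    out['{}_clean'.format(column)] = s
    return out


# Illustrative CSV text; the second row has an empty e-mail field
csv_text = "e-mail,number\na@example.com,0\n,1\n"
df_csv = pd.read_csv(io.StringIO(csv_text))

result = copy_blanks_na(df_csv, 'e-mail')
print(result)
```

An alternative is to pass keep_default_na=False to pd.read_csv so that empty fields stay as empty strings, in which case the original copy_blanks works unchanged.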