2016-08-26 8 views



h1 h2 h3 h4 h5 h6 
A1 A2 A3 A4 A5 A6 
B1 B2 B3 B4 B5 B6 


h1 h2 h3 h4 h5 h6 
A1 A2 
A1 A2 A3 A4 
A1 A2 A3 A4 A5 A6 
B1 B2 
B1 B2 B3 B4 
B1 B2 B3 B4 B5 B6 


import pandas as pd

# read_csv already returns a DataFrame, so no extra pd.DataFrame() wrapper is needed
df = pd.read_csv('file.csv')

index_cols = ['h1'] 
cols = [c for c in df if c not in index_cols] 

df2 = df.set_index(index_cols).stack().reset_index(level=1, drop=True).to_frame('Value') 

df2 = pd.concat([pd.Series([v if i % len(cols) == n else '' 
         for i, v in enumerate(df2.Value)], name=col) 
      for n, col in enumerate(cols)], axis=1).set_index(df2.index) 


h1 h2 h3 h4 h5 h6 
A1 A2 
A1  A3 
A1   A4 
A1    A5 
A1     A6 


import numpy as np
import pandas as pd

# take number of columns and divide by 2
# this is the number of pairs
pairs = df.shape[1] // 2

# np.repeat turns the row positions 0..n-1 into an index array in which
# each position appears `pairs` times; indexing df.values with it repeats
# every row, so the result has pairs * len(df) rows
a = df.values[np.repeat(np.arange(df.shape[0]), pairs)]

# row values to condition with as column vector 
dim0 = (np.arange(a.shape[0]) % (pairs))[:, None ] 

# column values to condition with as row vector 
dim1 = np.repeat(np.arange(pairs), 2) 

# boolean mask to use in np.where generated 
# via the magic of numpy broadcasting 
mask = dim0 >= dim1 

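# keep the cells where mask is True, blank out the rest,
# and rebuild a DataFrame with the original column names
pd.DataFrame(np.where(mask, a, ''), columns=df.columns)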
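
The broadcasting step is easier to follow when the mask is printed on its own. The following is only a minimal sketch: the sizes `n_rows = 2` and `pairs = 3` are assumptions chosen to match the 2-row, 6-column example above, and no DataFrame is involved.

```python
import numpy as np

n_rows, pairs = 2, 3  # assumed sizes: 2 input rows, 6 columns = 3 pairs

# position of each repeated row inside its group of `pairs` rows, shape (6, 1)
dim0 = (np.arange(n_rows * pairs) % pairs)[:, None]

# pair number of each column, shape (6,): [0, 0, 1, 1, 2, 2]
dim1 = np.repeat(np.arange(pairs), 2)

# comparing (6, 1) against (6,) broadcasts to a (6, 6) boolean grid
mask = dim0 >= dim1

print(mask.astype(int))
# [[1 1 0 0 0 0]
#  [1 1 1 1 0 0]
#  [1 1 1 1 1 1]
#  [1 1 0 0 0 0]
#  [1 1 1 1 0 0]
#  [1 1 1 1 1 1]]
```

Each 1 marks a cell that `np.where` keeps from `a`; each 0 becomes an empty string.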



It needs to go through every row of the input file. – mrh5028


@mrh5028 updated – piRSquared


Worked perfectly! Now I'll work on understanding exactly what is going on. – mrh5028



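import pandas as pd

# take the first 2, 4, 6, ... columns of df, one slice per pair count
# (DataFrame.append was removed in pandas 2.0, so the slices go through pd.concat)
pairs = df.shape[1] // 2
dfNew = pd.concat([df[df.columns[:2 * ct]] for ct in range(1, pairs + 1)])

# a stable sort keeps the shorter slices first within each h1 group;
# the result must be assigned back, otherwise it is discarded
dfNew = dfNew.sort_values(['h1'], kind='mergesort').reset_index(drop=True).fillna("")
print(dfNew)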

   h1  h2  h3  h4  h5  h6
0  A1  A2
1  A1  A2  A3  A4
2  A1  A2  A3  A4  A5  A6
3  B1  B2
4  B1  B2  B3  B4
5  B1  B2  B3  B4  B5  B6
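
Sorting on `h1` groups the rows correctly only when every input row has a distinct `h1` value and sorted order is acceptable. A possible variation, sketched here under the assumption that the input looks like the two-row example (the frame is built inline for illustration rather than read from `file.csv`), sorts on the original row index instead, so the input order is preserved:

```python
import pandas as pd

# hypothetical sample data mirroring the example tables
df = pd.DataFrame(
    [['A1', 'A2', 'A3', 'A4', 'A5', 'A6'],
     ['B1', 'B2', 'B3', 'B4', 'B5', 'B6']],
    columns=['h1', 'h2', 'h3', 'h4', 'h5', 'h6'],
)

pairs = df.shape[1] // 2

# one slice per pair count: the first 2, 4, ... columns of every row
parts = [df.iloc[:, :2 * k] for k in range(1, pairs + 1)]

# stack the slices, then stable-sort on the original row index so each
# input row's slices stay together, shortest first
out = (pd.concat(parts)
         .sort_index(kind='mergesort')
         .reset_index(drop=True)
         .fillna(''))
print(out)
```

The stable `mergesort` matters here: it keeps rows with equal index labels in the order they were concatenated, which is what puts the two-column slice before the four-column one.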