2017-09-19 17 views
3

Pandas DataFrame - MultiIndex on rows and columns?

Imagine this is my input data:

data = [("France", "Paris",  "Male", "1"), 
      ("France", "Paris",  "Female", "6"), 
      ("France", "Nice",  "Male", "2"), 
      ("France", "Nice",  "Female", "7"), 
      ("Germany", "Berlin",  "Male", "3"), 
      ("Germany", "Berlin",  "Female", "8"), 
      ("Germany", "Munchen", "Male", "4"), 
      ("Germany", "Munchen", "Female", "9"), 
      ("Germany", "Koln",  "Male", "5"), 
      ("Germany", "Koln",  "Female", "10")] 

I would like to put it into a DataFrame like this:

Country  City     Sex
                  Male  Female
France   Paris    1     6
         Nice     2     7
Germany  Berlin   3     8
         Munchen  4     9
         Koln     5     10

The first part is easy:

import pandas as pd
df = pd.DataFrame(data, columns=["country", "city", "sex", "count"])
df = df.set_index(["country", "city"])

which gives me this output:

                   sex count
country city
France  Paris     Male     1
        Paris   Female     6
        Nice      Male     2
        Nice    Female     7
Germany Berlin    Male     3
        Berlin  Female     8
        Munchen   Male     4
        Munchen Female     9
        Koln      Male     5
        Koln    Female    10

The rows are fine, but now I would like to put the values of the 'sex' column into a column MultiIndex. Is that possible, and if so, how?

Answers

2

Add the column sex to the list passed to set_index, then call unstack:

# df is the flat DataFrame here, before any set_index call
df = df.set_index(["country", "city", "sex"]).unstack()
# Clean-up: drop the column level names and rename the top level 'count' to 'Sex'
df = df.rename_axis((None, None), axis=1).rename(columns={"count": "Sex"})
print(df)
                   Sex
                Female Male
country city
France  Nice         7    2
        Paris        6    1
Germany Berlin       8    3
        Koln        10    5
        Munchen      9    4
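
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the set_index + unstack approach on a shortened copy of the question's data. Two things in it are my own assumptions rather than part of the question: the counts are converted to integers (the question stores them as strings), and the columns are reordered to Male, Female at the end, because unstack sorts the new column level alphabetically. The name `wide` is just an illustrative variable.

```python
import pandas as pd

# Shortened copy of the question's data (counts are strings, as in the question).
data = [("France", "Paris", "Male", "1"),
        ("France", "Paris", "Female", "6"),
        ("Germany", "Berlin", "Male", "3"),
        ("Germany", "Berlin", "Female", "8")]

# Start from the flat frame, with no index set yet.
df = pd.DataFrame(data, columns=["country", "city", "sex", "count"])

# Assumption: numeric counts are wanted, so convert the strings to int.
df["count"] = df["count"].astype(int)

# Put all three keys in the row index, then move 'sex' into the columns.
wide = df.set_index(["country", "city", "sex"]).unstack("sex")

# Optional: select the columns explicitly to get the Male, Female order
# from the question instead of the alphabetical order unstack produces.
wide = wide[[("count", "Male"), ("count", "Female")]]

print(wide)
```

Individual cells can then be read with a row tuple and a column tuple, for example `wide.loc[("France", "Paris"), ("count", "Male")]`.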
0

Another way is to use pivot in place of unstack (both do almost the same thing), i.e.:

# df is the flat DataFrame here, before any set_index call
df.set_index(["country", "city"]).pivot(columns="sex")
    
                 count
sex             Female Male
country city
France  Nice         7    2
        Paris        6    1
Germany Berlin       8    3
        Koln        10    5
        Munchen      9    4
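
To check the "almost the same" claim, a small sketch like the following builds the frame both ways and compares the results. It uses a shortened copy of the question's data, and the names `via_unstack`, `via_pivot` and `same` are only illustrative. It assumes a pandas version where pivot can be called with columns alone and falls back to the existing row index.

```python
import pandas as pd

# Shortened copy of the question's data.
data = [("France", "Paris", "Male", "1"),
        ("France", "Paris", "Female", "6"),
        ("Germany", "Berlin", "Male", "3"),
        ("Germany", "Berlin", "Female", "8")]

df = pd.DataFrame(data, columns=["country", "city", "sex", "count"])

# Route 1: all keys into the row index, then unstack the last level ('sex').
via_unstack = df.set_index(["country", "city", "sex"]).unstack()

# Route 2: only country/city in the row index, then pivot on 'sex'.
via_pivot = df.set_index(["country", "city"]).pivot(columns="sex")

# Compare labels and values of the two results.
same = via_unstack.equals(via_pivot)
print(same)
```

Both routes need each (country, city, sex) combination to appear only once; with duplicate combinations neither can decide which value goes in the cell, and an aggregating reshape such as pivot_table would be the usual alternative.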