How do I find the average of two tables in pandas?

I have a table with thousands of rows that looks like this (file1):

apples1 + hate 0 0 0 2 4 6 0 1 
apples2 + hate 0 2 0 4 4 6 0 2 
apples4 + hate 0 2 0 4 4 6 0 2

and another file, file2, with the same headers (NB: some of its headers are missing from file1):

apples1 + hate 0 0 0 1 4 6 0 2 
apples2 + hate 0 1 0 6 4 6 0 2 
apples3 + hate 0 2 0 4 4 6 0 2 
apples4 + hate 0 1 0 3 4 3 0 1

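I want to compare the two files in pandas and average the entries they have in common. I don't want to print entries that appear in only one file. The resulting file should look like this: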

apples1 + hate 0 0 0 1.5 4 6 0 1.5 
apples2 + hate 0 1.5 0 5 4 6 0 2 
apples4 + hate 0 1.5 0 3.5 4 4.5 0 1.5

Source

2017-11-28 Alex Trevylan

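Just use groupby for that. – Wen

Do an inner join of the two files on the first columns, which leaves out the entries that would contain NaN, then take the mean as @Wen suggested. – user32185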

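This solution has two steps.

1. Concatenate all the dataframes with pandas.concat(...), stacking them vertically (axis=0, the default) and specifying join='inner' so that only the columns present in every dataframe are kept.
2. Call the mean(...) function on the resulting dataframe.

Example: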

In [1]: df1 = pd.DataFrame([[1,2,3], [4,5,6]], columns=['a','b','c'])
In [2]: df2 = pd.DataFrame([[1,2],[3,4]], columns=['a','c'])
In [3]: df1
Out[3]:
   a  b  c
0  1  2  3
1  4  5  6

In [4]: df2
Out[4]:
   a  c
0  1  2
1  3  4

In [5]: df3 = pd.concat([df1, df2], join='inner')
In [6]: df3
Out[6]:
   a  c
0  1  3
1  4  6
0  1  2
1  3  4

In [7]: df3.mean()
Out[7]:
a    2.25
c    3.75
dtype: float64
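The example above averages each column over all rows. For the question's files, where each row is identified by its first three fields, the same concat step can be followed by a groupby on those fields. This is only a minimal sketch, assuming the files are whitespace-separated with no header row and that each header occurs at most once per file. The inline FILE1/FILE2 strings and the read_table helper are illustrative stand-ins, not part of the original answer.

```python
import io

import pandas as pd

# Illustrative stand-ins for file1 and file2 from the question.
FILE1 = """\
apples1 + hate 0 0 0 2 4 6 0 1
apples2 + hate 0 2 0 4 4 6 0 2
apples4 + hate 0 2 0 4 4 6 0 2
"""
FILE2 = """\
apples1 + hate 0 0 0 1 4 6 0 2
apples2 + hate 0 1 0 6 4 6 0 2
apples3 + hate 0 2 0 4 4 6 0 2
apples4 + hate 0 1 0 3 4 3 0 1
"""


def read_table(text):
    # Hypothetical helper: parse whitespace-separated rows without a header.
    return pd.read_csv(io.StringIO(text), sep=r"\s+", header=None)


df1 = read_table(FILE1)
df2 = read_table(FILE2)

# Step 1: stack the frames vertically, keeping only shared columns.
stacked = pd.concat([df1, df2], join="inner", ignore_index=True)

# Step 2: average per header (columns 0-2), in order of first appearance.
grouped = stacked.groupby([0, 1, 2], sort=False)

# Keep only headers found in both files (assumes one row per header per file).
in_both = grouped.size() == 2
result = grouped.mean()[in_both].reset_index()
print(result)
```

Rows such as apples3, which appear in only one file, are dropped by the size check rather than by the join, because join='inner' in concat only restricts the columns.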

Source

2017-11-28 17:30:30 SciGuyMcQ

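Let's try this:

import pandas as pd

# sep=r'\s+' assumes the files are whitespace-separated
df1 = pd.read_csv('file1', sep=r'\s+', header=None)
df2 = pd.read_csv('file2', sep=r'\s+', header=None)

Set the index to the first three columns, i.e. "apples1 + hate":

df1 = df1.set_index([0,1,2])
df2 = df2.set_index([0,1,2])

Use merge to inner-join the two frames on their indexes, then groupby the columns that share a name and aggregate with mean: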

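merged = df1.merge(df2, left_index=True, right_index=True)  # inner join on the index
# Strip the _x/_y suffixes so each pair of columns shares one group label
labels = merged.columns.str.extract(r'(\w+)_[xy]', expand=False)
# groupby(axis=1) is deprecated in recent pandas versions, so group the transposed frame
merged.T.groupby(labels, sort=False).mean().T.reset_index()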

Output:

         0  1     2    3    4    5    6    7    8    9   10
0  apples1  +  hate  0.0  0.0  0.0  1.5  4.0  6.0  0.0  1.5
1  apples2  +  hate  0.0  1.5  0.0  5.0  4.0  6.0  0.0  2.0
2  apples4  +  hate  0.0  1.5  0.0  3.5  4.0  4.5  0.0  1.5
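For anyone who wants to run this without the files on disk, here is a self-contained sketch of the same merge-then-group idea. It assumes whitespace-separated input and a pandas version where groupby(axis=1) is unavailable or deprecated, so it groups the transposed frame instead. The FILE1/FILE2 strings and the read_indexed helper are illustrative and not from the original post.

```python
import io

import pandas as pd

# Illustrative stand-ins for file1 and file2 from the question.
FILE1 = """\
apples1 + hate 0 0 0 2 4 6 0 1
apples2 + hate 0 2 0 4 4 6 0 2
apples4 + hate 0 2 0 4 4 6 0 2
"""
FILE2 = """\
apples1 + hate 0 0 0 1 4 6 0 2
apples2 + hate 0 1 0 6 4 6 0 2
apples3 + hate 0 2 0 4 4 6 0 2
apples4 + hate 0 1 0 3 4 3 0 1
"""


def read_indexed(text):
    # Hypothetical helper: parse the rows and index them by the first
    # three fields ("apples1", "+", "hate").
    df = pd.read_csv(io.StringIO(text), sep=r"\s+", header=None)
    return df.set_index([0, 1, 2])


df1 = read_indexed(FILE1)
df2 = read_indexed(FILE2)

# Inner join on the index: only headers present in both files survive.
# Overlapping value columns are renamed to '3_x', '3_y', and so on.
merged = df1.merge(df2, left_index=True, right_index=True)

# Map '3_x' and '3_y' to the shared label '3'.
labels = merged.columns.str.extract(r"(\w+)_[xy]", expand=False)

# Group the transposed frame by label, average, and transpose back.
result = merged.T.groupby(labels, sort=False).mean().T.reset_index()
print(result)
```

After reset_index the three key columns keep their integer names (0, 1, 2), while the averaged columns carry string labels ('3' to '10') because merge builds the suffixed names as strings.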

Source

2017-11-28 20:09:16
