Convert two boolean columns to a class ID in pandas

I have two boolean columns in a pandas DataFrame:

import pandas as pd

df = pd.DataFrame([[True, True],
                   [True, False],
                   [False, True],
                   [True, True],
                   [False, False]],
                  columns=['col1', 'col2'])

I need to generate a new column that identifies which unique combination each row belongs to.

result = pd.Series([0, 1, 2, 0, 3])

It seems like there should be a very simple way to do this, but it's escaping me. Maybe something using sklearn.preprocessing? A simple pandas or NumPy solution would be equally welcome.

EDIT: It would be really nice if the solution could scale to more than two columns.

Asked 2017-03-04 by Chris

The simplest way is to build one tuple per row and pass the tuples to factorize:

print(pd.Series(pd.factorize(df.apply(tuple, axis=1))[0]))
0    0
1    1
2    2
3    0
4    3
dtype: int64
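Since the question's EDIT asks about more than two columns, here is a minimal sketch of the same tuple-plus-factorize idea wrapped in a function. The helper name `class_ids` and the three-column data are made up for illustration and do not come from the thread.

```python
import pandas as pd


def class_ids(frame):
    """Hypothetical helper: one integer ID per unique row combination."""
    # One tuple per row, whatever the number of columns.
    row_tuples = frame.apply(tuple, axis=1)
    # codes: the ID of each row, numbered in order of first appearance.
    # uniques: the distinct tuples, positioned by their ID.
    codes, uniques = pd.factorize(row_tuples)
    return pd.Series(codes, index=frame.index), list(uniques)


# Made-up three-column data, not taken from the question.
df3 = pd.DataFrame([[True, True, False],
                    [True, False, False],
                    [True, True, False],
                    [False, False, True]],
                   columns=['col1', 'col2', 'col3'])

ids, combos = class_ids(df3)
print(ids.tolist())  # [0, 1, 0, 2]
print(combos)        # the combination behind each ID, in ID order
```

Returning the unique tuples alongside the IDs makes it possible to look up which combination a given ID stands for.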

Another solution, casting to string and summing across each row:

print(pd.Series(pd.factorize(df.astype(str).sum(axis=1))[0]))
0    0
1    1
2    2
3    0
4    3
dtype: int64
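As a possible variation on the string approach, the booleans in each row can be packed into a single integer instead of a concatenated string, then factorized the same way. This is only a sketch and was not proposed in the thread; the names `weights` and `keys` are illustrative, and it assumes few enough columns (roughly 60 or fewer) for the packed value to fit in a 64-bit integer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[True, True],
                   [True, False],
                   [False, True],
                   [True, True],
                   [False, False]],
                  columns=['col1', 'col2'])

# Assumption: one bit per column, so column i contributes 2**i to the row key.
weights = np.left_shift(np.int64(1), np.arange(df.shape[1], dtype=np.int64))

# Treat True/False as 1/0 and combine each row into a single integer key.
keys = df.to_numpy(dtype=np.int64) @ weights

# factorize numbers the distinct keys in order of first appearance.
codes, uniques = pd.factorize(keys)
result = pd.Series(codes, index=df.index)

print(keys.tolist())    # [3, 1, 2, 3, 0]
print(result.tolist())  # [0, 1, 2, 0, 3]
```

The integer keys depend on column order, but the factorized IDs match the expected result in the question.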

Answered 2017-03-04 19:51:18 by jezrael

The simplest approach, creating tuples and factorizing them, is exactly what I was looking for. I knew there was a one-liner out there. Thanks! – Chris

Thank you. Glad I could help! Good luck! – jezrael

I've never used pandas before, but here is a plain Python solution that I'm sure wouldn't be hard to adapt to pandas:

a = [[True, True],
     [True, False],
     [False, True],
     [True, True],
     [False, False]]

# ids keeps a list of previously seen items; result keeps the ID of each row
ids, result = [], []

for x in a:
    if x in ids:                # x has been seen before
        class_id = ids.index(x) # find its old ID
        result.append(class_id)
    else:                       # x hasn't been seen before
        class_id = len(ids)     # create a new ID
        result.append(class_id)
        ids.append(x)

print(result) # [0, 1, 2, 0, 3]

This works with any number of columns. To get the result as a Series, simply use:

result = pd.Series(result)
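The list-based lookup above scans `ids` for every row, which can get slow on large inputs. A minimal sketch of the same idea using a dictionary keyed by tuples follows; the names `rows` and `seen` are illustrative and not from the original answer.

```python
rows = [[True, True],
        [True, False],
        [False, True],
        [True, True],
        [False, False]]

seen = {}    # maps each row (as a tuple) to the ID it was first given
result = []  # one ID per input row

for row in rows:
    key = tuple(row)            # lists are not hashable, tuples are
    if key not in seen:
        seen[key] = len(seen)   # next unused ID
    result.append(seen[key])

print(result)  # [0, 1, 2, 0, 3]
```

Dictionary lookups take roughly constant time on average, so the loop stays close to linear in the number of rows.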

Answered 2017-03-04 19:48:58
