Majority and average within a list of tuples

Assume the following list of tuples, which represents sentiment estimates from three different methods:

[('pos', 0.2), ('neu', 0.1), ('pos', 0.4)]

I was wondering what the most efficient way is to find the majority sentiment and the average of its scores, which for this example would be

result=('pos', 0.3)

Thanks

Source

2017-07-14 Nicholas

Can you use NumPy or Pandas? –

How efficient do you want it to be? Efficient in CPU time, memory, or development time? – skyking

CPU time. The sentiments come from thousands of API calls per second. Thanks – Nicholas
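Since CPU time is the stated concern, one option is to keep a running count and running sum per label, so the list is walked only once. The following is a minimal sketch under that assumption, not a benchmarked recommendation; the function name `majority_average` is hypothetical and does not come from the thread.

```python
from collections import defaultdict


def majority_average(reports):
    """Return (label, mean score) for the most frequent label in reports."""
    # label -> [count, sum of scores], filled in a single pass
    totals = defaultdict(lambda: [0, 0.0])
    for label, score in reports:
        entry = totals[label]
        entry[0] += 1
        entry[1] += score
    # Most frequent label; on a tie, the label that was seen first wins.
    label, (count, total) = max(totals.items(), key=lambda kv: kv[1][0])
    return label, total / count


print(majority_average([('pos', 0.2), ('neu', 0.1), ('pos', 0.4)]))
```

An empty input makes `max` raise `ValueError`, so a caller that may receive no reports would need to handle that case separately.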

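import collections

reports = [('pos', 0.2), ('neu', 0.1), ('pos', 0.4)]

# Group the scores by mood.
oracle = collections.defaultdict(list)
for mood, score in reports:
    oracle[mood].append(score)

# Count how many scores each mood received.
counts = {mood: len(scores) for mood, scores in oracle.items()}

# Pick the mood with the highest count; a bare max(counts) would compare the keys alphabetically.
mood = max(counts, key=counts.get)  # gives 'pos'

sum(oracle[mood]) / len(oracle[mood])  # gives 0.30000000000000004, i.e. about 0.3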

Source

2017-07-14 11:26:44

Try changing 'neu' to 'zeu'. That breaks it. –

Thanks - that is very comprehensive – Nicholas
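The 'zeu' remark can be checked with a short sketch. Calling `max` on a dictionary compares its keys, so the result depends on alphabetical order rather than on the counts. The `counts` dictionary below is a hand-written stand-in for the counts built from the example data with 'neu' renamed.

```python
# Counts as they would look after renaming 'neu' to 'zeu' (hand-written stand-in).
counts = {'pos': 2, 'zeu': 1}

by_key = max(counts)                    # compares the keys as strings
by_count = max(counts, key=counts.get)  # compares the stored counts

print(by_key)    # zeu
print(by_count)  # pos
```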

import itertools

l = [('pos', 0.2), ('neu', 0.1), ('pos', 0.4)]

First group by sentiment (note that they need to be sorted first):

sentiments = [list(j[1]) for j in itertools.groupby(sorted(l), lambda i: i[0])]
# sentiments = [[('neu', 0.1)], [('pos', 0.2), ('pos', 0.4)]]

Then you can figure out which sentiment is the most common (i.e. has the longest group):

majority = max(sentiments, key=len)
# majority = [('pos', 0.2), ('pos', 0.4)]

Finally, compute its average:

values = [i[1] for i in majority]
average = (majority[0][0], sum(values)/len(values))
# average = ('pos', 0.30000000000000004)

Source

2017-07-14 11:29:41 CoryKramer

Using 'l' as a variable name is a leading cause of cancer. –

Thanks - I thought [this answer](https://stackoverflow.com/questions/31212260/group-and-compute-the-average-in-list-of-tuples) would be overkill in this case, but apparently it isn't. – Nicholas

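It is better to use a dictionary. The key is the sentiment name, and the value is another dictionary with two keys: 'numbers', which holds the list of sentiment values, and 'count', which holds the number of occurrences of that sentiment. For example:

sentiment = {'pos': {'numbers': [0.2, 0.4], 'count': 2}, 'neu': {'numbers': [0.1], 'count': 1}}
sentiment['pos']['numbers']  # [0.2, 0.4]
sentiment['pos']['count']    # 2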
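The answer describes the layout of the dictionary but not how to fill it. One possible way to build it from the list in the question and then read off the majority average is sketched below; the names `reports`, `entry`, and `majority` are illustrative assumptions, not part of the answer.

```python
reports = [('pos', 0.2), ('neu', 0.1), ('pos', 0.4)]

# Build {name: {'numbers': [...], 'count': n}} in one pass over the list.
sentiment = {}
for name, score in reports:
    entry = sentiment.setdefault(name, {'numbers': [], 'count': 0})
    entry['numbers'].append(score)
    entry['count'] += 1

# The majority sentiment is the one with the largest 'count'.
majority = max(sentiment, key=lambda name: sentiment[name]['count'])
numbers = sentiment[majority]['numbers']
result = (majority, sum(numbers) / len(numbers))
print(result)
```

Storing 'count' separately is redundant with `len(numbers)`, but it matches the layout the answer proposes.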

Source

2017-07-14 11:30:29 pooya

Using the collections and statistics modules you can do this, although given that you are looking for efficiency I prefer CoryKramer's answer:

from collections import Counter 
from statistics import mean 

lst = [('pos', 0.2), ('neu', 0.1), ('pos', 0.4)] 
count = Counter(item[0] for item in lst) # Counter({'pos': 2, 'neu': 1}) 
maj = count.most_common(1)[0][0]   # pos 
mn = mean(item[1] for item in lst if item[0] == maj) 
result = (maj, mn) 

print(result) # ('pos', 0.30000000000000004)
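Because the thread is about CPU time, it may be worth measuring rather than guessing which variant is faster. The sketch below times the `Counter` approach against the `groupby` approach with `timeit`; the function names, the batch size of 3000 tuples, and the repeat count are arbitrary assumptions, and the timings will vary by machine and Python version.

```python
import itertools
import timeit
from collections import Counter
from statistics import mean

# Hypothetical batch: the example list repeated to get a measurable workload.
data = [('pos', 0.2), ('neu', 0.1), ('pos', 0.4)] * 1000


def with_counter(items):
    # Count labels, then average the scores of the most common one.
    maj = Counter(label for label, _ in items).most_common(1)[0][0]
    return maj, mean(score for label, score in items if label == maj)


def with_groupby(items):
    # Sort, group by label, and take the longest group.
    groups = [list(g) for _, g in itertools.groupby(sorted(items), key=lambda t: t[0])]
    majority = max(groups, key=len)
    values = [score for _, score in majority]
    return majority[0][0], sum(values) / len(values)


for func in (with_counter, with_groupby):
    seconds = timeit.timeit(lambda: func(data), number=20)
    print(f"{func.__name__}: {seconds:.4f} s for 20 runs")
```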

Source

2017-07-14 11:31:01

Thanks for your answer and the pointers – Nicholas

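my_tuple_list = [('pos', 0.2), ('neu', 0.1), ('pos', 0.4)]  # example input from the question

# Sort by score, highest first.
sorted_tuples = sorted(my_tuple_list, key=lambda x: x[-1], reverse=True)

# Caveat: this takes the label of the top-scoring tuple, which is not necessarily the most frequent label.
majority_sentiment = sorted_tuples[0][0]
majority_sentiment_score = 0
num_items = 0

# Sum and count the scores that carry that label.
for sentiment_tup in sorted_tuples:
    if sentiment_tup[0] == majority_sentiment:
        majority_sentiment_score += sentiment_tup[1]
        num_items += 1

avg_sentiment_score = majority_sentiment_score / num_items

result = (majority_sentiment, avg_sentiment_score)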

That should do it.

Source

2017-07-14 11:52:13

This only finds a single majority item and does not compute the average. Also, 'sorted(my_tuple_list, key=lambda x: x[-1], reverse=True)[1]' returns the other 'pos' element – Nicholas

Ah, I misread the question. I'll edit it. –
