Why is the list of aligned words printed with duplicates?

I am trying to implement the Sultan monolingual aligner, using NLTK WordNet synsets to look up the synsets of each word.

I have two lists:

word1 = ['move', 'buy','learn']
word2 = ['study', 'purchase']

According to the aligner's rules, if the synsets of word1[i] in word1 are similar to the synsets of word2[j] in word2, then word1[i] and word2[j] are aligned.

from nltk.corpus import wordnet as wn 

def getSynonyms(word):
    synonymList1 = []
    wordnetSynset1 = wn.synsets(word)
    tempList1 = []
    for synset1 in wordnetSynset1:
        synLemmas = synset1.lemma_names()
        for i in xrange(len(synLemmas)):
            word = synLemmas[i].replace('_', ' ')
            if word not in tempList1:
                tempList1.append(word)
    synonymList1.append(tempList1)
    return synonymList1

def cekSynonyms(word1, word2):
    newlist = []
    for i in xrange(len(word1)):
        for j in xrange(len(word2)):
            getsyn1 = getSynonyms(word1[i])
            getsyn2 = getSynonyms(word2[j])
            ds1 = [x for y in getsyn1 for x in y]
            ds2 = [x for y in getsyn2 for x in y]
            print ds1,"---align to--->",ds2,"\n"
            for k in xrange(len(ds1)):
                for l in xrange(len(ds2)):
                    if ds1[k] == ds2[l]:
                        #newsim = [ds1[k], ds2[l]]
                        newsim = [word1[i], word2[j]]
                        newlist.append(newsim)
    return newlist

word1 = ['move', 'buy','learn'] 
word2 = ['study', 'purchase'] 
print cekSynonyms(word1, word2)

And yes, I can find the synsets of each word with the code above. Here is the output:

[u'move', u'relocation', u'motion', u'movement', u'motility', u'travel', u'go', u'locomote', u'displace', u'proceed', u'be active', u'act', u'affect', u'impress', u'strike', u'motivate', u'actuate', u'propel', u'prompt', u'incite', u'run', u'make a motion'] ---align to---> [u'survey', u'study', u'work', u'report', u'written report', u'discipline', u'subject', u'subject area', u'subject field', u'field', u'field of study', u'bailiwick', u'sketch', u'cogitation', u'analyze', u'analyse', u'examine', u'canvass', u'canvas', u'consider', u'learn', u'read', u'take', u'hit the books', u'meditate', u'contemplate'] 

[u'move', u'relocation', u'motion', u'movement', u'motility', u'travel', u'go', u'locomote', u'displace', u'proceed', u'be active', u'act', u'affect', u'impress', u'strike', u'motivate', u'actuate', u'propel', u'prompt', u'incite', u'run', u'make a motion'] ---align to---> [u'purchase', u'leverage', u'buy'] 

[u'bargain', u'buy', u'steal', u'purchase', u'bribe', u'corrupt', u"grease one's palms"] ---align to---> [u'survey', u'study', u'work', u'report', u'written report', u'discipline', u'subject', u'subject area', u'subject field', u'field', u'field of study', u'bailiwick', u'sketch', u'cogitation', u'analyze', u'analyse', u'examine', u'canvass', u'canvas', u'consider', u'learn', u'read', u'take', u'hit the books', u'meditate', u'contemplate'] 

[u'bargain', u'buy', u'steal', u'purchase', u'bribe', u'corrupt', u"grease one's palms"] ---align to---> [u'purchase', u'leverage', u'buy'] 

[u'learn', u'larn', u'acquire', u'hear', u'get word', u'get wind', u'pick up', u'find out', u'get a line', u'discover', u'see', u'memorize', u'memorise', u'con', u'study', u'read', u'take', u'teach', u'instruct', u'determine', u'check', u'ascertain', u'watch'] ---align to---> [u'survey', u'study', u'work', u'report', u'written report', u'discipline', u'subject', u'subject area', u'subject field', u'field', u'field of study', u'bailiwick', u'sketch', u'cogitation', u'analyze', u'analyse', u'examine', u'canvass', u'canvas', u'consider', u'learn', u'read', u'take', u'hit the books', u'meditate', u'contemplate'] 

[u'learn', u'larn', u'acquire', u'hear', u'get word', u'get wind', u'pick up', u'find out', u'get a line', u'discover', u'see', u'memorize', u'memorise', u'con', u'study', u'read', u'take', u'teach', u'instruct', u'determine', u'check', u'ascertain', u'watch'] ---align to---> [u'purchase', u'leverage', u'buy'] 

[['buy', 'purchase'], ['buy', 'purchase'], ['learn', 'study'], ['learn', 'study'], ['learn', 'study'], ['learn', 'study']]

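The six lines above show each word in word1 compared with each word in word2 by their synsets.

The last line is the list of aligned words.

As the synsets show, ['buy','purchase'] and ['learn','study'] are the aligned words.

Why is each pair printed multiple times, like this? >> [['buy', 'purchase'], ['buy', 'purchase'], ['learn', 'study'], ['learn', 'study'], ['learn', 'study'], ['learn', 'study']]

How can I print each pair only once, without repetition?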

Source

2017-07-29 sang

You can remove duplicates from a list like this by converting it to a set, although you have to go through tuples on the way because lists are not hashable:

a = [['buy', 'purchase'], ['buy', 'purchase'], ['learn', 'study'],
     ['learn', 'study'], ['learn', 'study'], ['learn', 'study']]
a = [list(x) for x in set([tuple(x) for x in a])]
print(a)

This gives:

[['buy', 'purchase'], ['learn', 'study']]
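One caveat with the set-based approach: a set does not guarantee any particular order, so the pairs may come back in a different order than they were found. If the order matters, a minimal sketch along these lines might help. It is written for Python 3, and `dedupe_pairs` is a hypothetical helper name, not something from the question or from NLTK.

```python
def dedupe_pairs(pairs):
    """Return the pairs with duplicates removed, keeping first-seen order."""
    seen = set()      # tuples already emitted (tuples are hashable, lists are not)
    unique = []
    for pair in pairs:
        key = tuple(pair)
        if key not in seen:
            seen.add(key)
            unique.append(list(pair))
    return unique


a = [['buy', 'purchase'], ['buy', 'purchase'], ['learn', 'study'],
     ['learn', 'study'], ['learn', 'study'], ['learn', 'study']]
print(dedupe_pairs(a))
```

The tuple conversion is still needed for the same reason as above; the only difference is that the result list is built in the order the pairs first appear.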

Source

2017-07-29 05:08:33 nbubis

Wow, it works! Thank you – sang

Based on Mr. nbubis's answer, here is how I wrapped the tuple conversion in a function:

def tupleSynonyms(word1, word2): 
    a = cekSynonyms(word1, word2) 
    anew = [list(x) for x in set([tuple(x) for x in a])] 
    return anew 

print tupleSynonyms(word1, word2)
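As for why the duplicates appear in the first place: judging from the printed output, the inner `k`/`l` loops in `cekSynonyms` append the pair once for every synonym the two words share. 'buy' and 'purchase' share two lemmas, and 'learn' and 'study' share four, which matches the two and four repeats. A possible sketch that appends each pair at most once is shown below. It is written for Python 3, and to keep it runnable without the WordNet corpus it uses a hypothetical `SYNONYMS` dictionary (lemma lists abridged from the output above) in place of `getSynonyms`; `shared_lemmas` and `align_words` are illustrative names.

```python
# Hypothetical stand-in for getSynonyms(): abridged lemma lists taken from
# the output printed in the question.
SYNONYMS = {
    'move': ['move', 'relocation', 'motion', 'travel', 'go'],
    'buy': ['bargain', 'buy', 'steal', 'purchase', 'bribe', 'corrupt',
            "grease one's palms"],
    'learn': ['learn', 'larn', 'acquire', 'study', 'read', 'take', 'teach'],
    'study': ['survey', 'study', 'work', 'learn', 'read', 'take', 'meditate'],
    'purchase': ['purchase', 'leverage', 'buy'],
}


def shared_lemmas(w1, w2, synonyms):
    """Return the sorted lemmas that appear in both words' synonym lists."""
    return sorted(set(synonyms[w1]) & set(synonyms[w2]))


def align_words(words1, words2, synonyms):
    """Align w1 with w2 when they share at least one lemma, once per pair."""
    aligned = []
    for w1 in words1:
        for w2 in words2:
            # One membership test per pair, instead of one append per match.
            if shared_lemmas(w1, w2, synonyms):
                aligned.append([w1, w2])
    return aligned


word1 = ['move', 'buy', 'learn']
word2 = ['study', 'purchase']
print(shared_lemmas('buy', 'purchase', SYNONYMS))
print(shared_lemmas('learn', 'study', SYNONYMS))
print(align_words(word1, word2, SYNONYMS))
```

This assumes neither input list contains the same word twice; if it does, the deduplication step is still needed.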

Source

2017-07-29 05:52:14 sang
