2017-02-16 10 views

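I am adding keywords from the NYT API to a list. For each article, a list of keywords is returned. I want to combine all the words into a list using key -> value, as shown below. Before appending them, I want to strip the "u" from the list entries. Then I want to count the words the two lists have in common and return the result.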

Example lists returned from dic['keywords']:

Article 1 returns:

[ 
    { 
    u'value': u'Dunford, Joseph F Jr', 
    u'name': u'persons', 
    u'rank': u'1' 
    }, 
    { 
    u'value': u'Afghanistan', 
    u'name': u'glocations', 
    u'rank': u'1' 
    }, 
    { 
    u'value': u'Afghan National Police', 
    u'name': u'organizations', 
    u'rank': u'1' 
    }, 
    { 
    u'value': u'Afghanistan War (2001-)', 
    u'name': u'subject', 
    u'rank': u'1' 
    }, 
    { 
    u'value': u'Defense and Military Forces', 
    u'name': u'subject', 
    u'rank': u'2' 
    } 
] 

Article 2 returns:

[ 
    { 
    u'value': u'Gall, Carlotta', 
    u'name': u'persons', 
    u'rank': u'1' 
    }, 
    { 
    u'value': u'Gannon, Kathy', 
    u'name': u'persons', 
    u'rank': u'2' 
    }, 
    { 
    u'value': u'Niedringhaus, Anja (1965-2014)', 
    u'name': u'persons', 
    u'rank': u'3' 
    }, 
    { 
    u'value': u'Kabul (Afghanistan)', 
    u'name': u'glocations', 
    u'rank': u'2' 
    }, 
    { 
    u'value': u'Afghanistan', 
    u'name': u'glocations', 
    u'rank': u'1' 
    }, 
    { 
    u'value': u'Afghan National Police', 
    u'name': u'organizations', 
    u'rank': u'1' 
    }, 
    { 
    u'value': u'Afghanistan War (2001-)', 
    u'name': u'subject', 
    u'rank': u'1' 
    } 
] 

Desired output:

List1 = ['Dunford, Joseph F Jr', 'Afghanistan', 'Afghan National Police', 'Afghanistan War (2001-)', 'Defense and Military Forces']
List2 = ['Gall, Carlotta', 'Gannon, Kathy', 'Niedringhaus, Anja (1965-2014)', 'Afghanistan']

Common keywords: 2

My code is as follows:

from flask import Flask, render_template, request, session, g, redirect, url_for
from nytimesarticle import articleAPI

api = articleAPI('X')

articles = api.search(q='Afghan War',
                      fq={'headline': '', 'source': ['Reuters', 'AP', 'The New York Times']},
                      begin_date=20111231)


def parse_articles(articles):
    '''
    This function takes in a response to the NYT api and parses
    the articles into a list of dictionaries
    '''
    news = []
    for i in articles['response']['docs']:
        dic = {}
        dic['id'] = i['_id']
        if i['abstract'] is not None:
            dic['abstract'] = i['abstract'].encode("utf8")
        dic['headline'] = i['headline']['main'].encode("utf8")
        dic['desk'] = i['news_desk']
        dic['date'] = i['pub_date'][0:10]  # cutting time of day.
        dic['section'] = i['section_name']
        dic['keywords'] = i['keywords']
        print(dic['keywords'])
        if i['snippet'] is not None:
            dic['snippet'] = i['snippet'].encode("utf8")
        dic['source'] = i['source']
        dic['type'] = i['type_of_material']
        dic['url'] = i['web_url']
        dic['word_count'] = i['word_count']
        # locations: values of keywords whose name contains 'glocations'
        locations = []
        for kw in i['keywords']:
            if 'glocations' in kw['name']:
                locations.append(kw['value'])
        dic['locations'] = locations
        # subjects: values of keywords whose name contains 'subject'
        subjects = []
        for kw in i['keywords']:
            if 'subject' in kw['name']:
                subjects.append(kw['value'])
        dic['subjects'] = subjects
        news.append(dic)
    return news


print(parse_articles(articles))

Answer

You can use a list comprehension to build the list from the given dictionaries.

d = [{u'value': u'Dunford, Joseph F Jr', u'name': u'persons', u'rank': u'1'}, {u'value': u'Afghanistan', u'name': u'glocations', u'rank': u'1'}, {u'value': u'Afghan National Police', u'name': u'organizations', u'rank': u'1'}, {u'value': u'Afghanistan War (2001-)', u'name': u'subject', u'rank': u'1'}, {u'value': u'Defense and Military Forces', u'name': u'subject', u'rank': u'2'}] 
print [v['value'] for v in d] # prints [u'Dunford, Joseph F Jr', u'Afghanistan', u'Afghan National Police', u'Afghanistan War (2001-)', u'Defense and Military Forces'] 
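The question also asks about stripping the "u" and counting common keywords, so here is a minimal sketch of how the same comprehension might be extended. The `u` prefix is only how Python 2 displays unicode strings when a list is printed. It is not part of the string, so nothing needs to be removed. The helper names `keyword_values` and `common_keywords` are made up for illustration, and the sample data is shortened to the `value` and `name` fields from the question.

```python
def keyword_values(keywords):
    """Return the 'value' field of every keyword dict, in order."""
    return [kw['value'] for kw in keywords]


def common_keywords(keywords_a, keywords_b):
    """Return the set of values that appear in both keyword lists."""
    return set(keyword_values(keywords_a)) & set(keyword_values(keywords_b))


# Sample data copied from the question (the 'rank' field is omitted for brevity).
article1 = [
    {'value': u'Dunford, Joseph F Jr', 'name': u'persons'},
    {'value': u'Afghanistan', 'name': u'glocations'},
    {'value': u'Afghan National Police', 'name': u'organizations'},
    {'value': u'Afghanistan War (2001-)', 'name': u'subject'},
    {'value': u'Defense and Military Forces', 'name': u'subject'},
]
article2 = [
    {'value': u'Gall, Carlotta', 'name': u'persons'},
    {'value': u'Gannon, Kathy', 'name': u'persons'},
    {'value': u'Niedringhaus, Anja (1965-2014)', 'name': u'persons'},
    {'value': u'Kabul (Afghanistan)', 'name': u'glocations'},
    {'value': u'Afghanistan', 'name': u'glocations'},
    {'value': u'Afghan National Police', 'name': u'organizations'},
    {'value': u'Afghanistan War (2001-)', 'name': u'subject'},
]

list1 = keyword_values(article1)
list2 = keyword_values(article2)
shared = common_keywords(article1, article2)

# The u prefix is display-only: the unicode string equals the plain literal.
print(list1[1] == 'Afghanistan')
print(sorted(shared))
print('Common keywords: %d' % len(shared))
```

With the full keyword lists from the question, the intersection contains three values ('Afghan National Police', 'Afghanistan' and 'Afghanistan War (2001-)'). The question's hand-written List2 leaves out some of article 2's keywords, which may be why it expects a different count. Note that a set intersection compares whole strings exactly, so 'Kabul (Afghanistan)' does not match 'Afghanistan'.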