Django DB object filter().first() does not fetch new items

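For some reason, when I run this code it keeps looping over the same object and never fetches new items from the database. In other words, the print output shows the same object over and over, when it should be iterating through the items in the list.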

article = Article.objects.filter(is_locked=False, is_downloaded=False).first() 
while article: 
    article.is_locked = True 
    article.save() 

    print '******************************' 
    date = article.datetime 
    title = article.title 
    url = article.url 
    print('date: %s' % date) 
    print('url: %s' % url) 
    print('title: %s' % title) 

    get_article(url, title, article) 

    article = Article.objects.filter(is_locked=False, is_downloaded=False).first()

Here is mldb.models:

from django.db import models 


class Article(models.Model): 
    url = models.CharField(max_length=1028) 
    title = models.CharField(max_length=1028) 
    category = models.CharField(max_length=128) 
    locale = models.CharField(max_length=128) 
    section = models.CharField(max_length=512) 
    tag = models.CharField(max_length=128) 
    author = models.CharField(max_length=256) 
    datetime = models.DateTimeField() 
    description = models.TextField() 
    article = models.TextField() 
    is_locked = models.BooleanField(default=False) 
    is_downloaded = models.BooleanField(default=False) 

    def __str__(self):    # __unicode__ on Python 2
        # The model defines no `name` field, so return the title instead.
        return self.title

    class Meta:
        app_label = 'mldb'

I also tried this, but it does not loop through the objects either (the loop just repeats the same object over and over):

articles = Article.objects.filter(is_locked=False, is_downloaded=False) 
for article in articles: 
    ...

Here is get_article(). This seems to be what is causing the problem (if I remove the call to this function, everything works fine):

def get_article(url, title, article):
    failed_attempts = 0
    while True:
        try:
            content = urllib2.urlopen(url).read()
            soup = BeautifulSoup(content, "html5lib")

            description = soup.find(property="og:description")["content"] if soup.find(property="og:description") else ''
            locale = soup.find(property="og:locale")["content"] if soup.find(property="og:locale") else ''
            section = soup.find(property="og:article:section")["content"] if soup.find(property="og:article:section") else ''
            tag = soup.find(property="og:article:tag")["content"] if soup.find(property="og:article:tag") else ''
            author = soup.find(property="og:article:author")["content"] if soup.find(property="og:article:author") else ''
            date = soup.find(property="og:article:published_time")["content"] if soup.find(property="og:article:published_time") else ''
            print 'date'
            print date

            body = ''
            for body_tag in soup.findAll("div", {"class" : re.compile('ArticleBody_body.*')}):
                body += body_tag.text

            # datetime.strptime (ts, "%Y")
            # 2012-01-02T04:32:57+0000
            dt = dateutil.parser.parse(date, fuzzy=True)
            print dt
            print url

            article.title = title.encode('utf-8')
            article.url = url.encode('utf-8')
            article.description = description.encode('utf-8')
            article.locale = locale.encode('utf-8')
            article.section = section.encode('utf-8')
            article.tag = tag.encode('utf-8')
            article.author = author.encode('utf-8')
            article.body = body.encode('utf-8')
            article.is_downloaded = True
            article.article = body
            article.save()

            print(description.encode('utf-8'))

        except (urllib2.HTTPError, ValueError) as err:
            print err
            time.sleep(20)
            failed_attempts += 1
            if failed_attempts < 10:
                continue

Any ideas?

Source

2017-09-30 Rob

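Why you expect the first one to come up with a different article on each iteration is beyond me. The second suggestion should loop through the whole queryset. Maybe you need to fix the indentation in your post! – schwobaseggl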

I fixed the indentation. The reason .first() should return a different article each time is the lines "article.is_locked = True" and "article.save()": the filter only fetches articles with "is_locked=False". – Rob

get_article()? – dahrens

Your get_article() function contains an infinite loop.

For illustration, consider this simplified version of your get_article():

def get_article(url, title, article):
    failed_attempts = 0
    # Note how this while loop runs endlessly.
    while True:
        try:
            # doing something here without calling `return` anywhere
            # I'll just write `pass` for the purpose of simplification
            pass
        except (urllib2.HTTPError, ValueError) as err:
            failed_attempts += 1
            if failed_attempts < 10:
                # you're calling `continue` here but you're not calling
                # `break` or `return` anywhere if failed_attempts >= 10
                # and therefore you're still stuck in the while-loop
                continue

Note that merely not calling continue does not stop the while loop:

while True:
    print('infinite loop!')
    if some_condition:
        # if some_condition is truthy, continue
        continue
    # but if it's not, we will continue anyway. the above if-condition
    # therefore doesn't make sense
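To see this behaviour in isolation, here is a minimal Python 3 sketch that mimics the control flow of the loop without any network or database access. The function name run_retry_loop, the should_fail flag and the max_iterations safety cap are made up for this demo and are not part of the original code; the cap exists only so that the demo terminates.

```python
def run_retry_loop(should_fail, max_iterations=25):
    """Mimic the control flow of the original get_article() loop.

    `max_iterations` is a safety cap added for this demo only; the
    original loop has no such exit.
    """
    failed_attempts = 0
    iterations = 0
    while True:
        if iterations == max_iterations:
            return iterations, failed_attempts
        iterations += 1
        try:
            if should_fail:
                raise ValueError("simulated download failure")
            # Success path: nothing here leaves the loop either.
        except ValueError:
            failed_attempts += 1
            if failed_attempts < 10:
                continue
            # From the 10th failure on we simply fall off the end of the
            # loop body, and `while True` starts the next iteration.


print(run_retry_loop(should_fail=True))   # (25, 25)
print(run_retry_loop(should_fail=False))  # (25, 0)
```

In both cases the loop only ends because of the artificial cap: it keeps running well past the 10th failure, and it also keeps running when nothing fails at all.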

A fixed version might look like this (I have left out the details):

def get_article(url, title, article):
    failed_attempts = 0
    while True:
        try:
            # it's considered good practice to only put the throwing
            # statement you want to catch in the try-block
            content = urllib2.urlopen(url).read()
        except (urllib2.HTTPError, ValueError) as err:
            failed_attempts += 1
            if failed_attempts == 10:
                # if it's the 10th attempt, break the while loop.
                # consider throwing an error here which you can handle
                # where you're calling `get_article` from. otherwise
                # the caller doesn't know something went wrong
                break
        else:
            # do your work here
            soup = BeautifulSoup(content, "html5lib")
            # ...
            article.save()
            # and call return!
            return
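The suggestion in the comments above, raising an error that the caller can handle instead of silently breaking, could be sketched roughly as follows in Python 3. Everything named here is hypothetical and only for illustration: DownloadError, fetch_with_retries and the injected fetch callable are not part of the original code, and the real download call would take the place of fetch. OSError is caught on the assumption that Python 3's urllib.error.HTTPError is used, which is a subclass of it.

```python
import time


class DownloadError(Exception):
    """Hypothetical exception raised when every retry attempt has failed."""


def fetch_with_retries(fetch, url, max_attempts=10, delay=0.0):
    """Call fetch(url) until it succeeds or max_attempts is reached."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            # Returning here is what ends the retry loop on success.
            return fetch(url)
        except (OSError, ValueError) as err:
            last_error = err
            if attempt < max_attempts:
                time.sleep(delay)
    # All attempts failed: tell the caller instead of returning silently.
    raise DownloadError(
        "giving up on %s after %d attempts" % (url, max_attempts)
    ) from last_error


def always_fails(url):
    # Stand-in for a download that never succeeds.
    raise ValueError("simulated failure")


try:
    fetch_with_retries(always_fails, "http://example.com/a", max_attempts=3)
except DownloadError as err:
    # The caller decides what to do, e.g. skip this article and move on.
    print("skipping article:", err)
```

A bounded for loop makes the exit conditions explicit: either the function returns on success or it raises after the last attempt, so there is no path on which it can spin forever.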

Source

2017-09-30 18:35:58 olieidel

'failed_attempts += 1; if failed_attempts < 10:' should handle this. – dahrens

I don't think so. How would "not calling continue" stop the loop? I have added some explanatory code in the middle (the second code example). – olieidel

Correct, good catch! – dahrens
