How to get the full URL from a resource URL in Python

Resources such as images, CSS, and JavaScript are loaded by the client's web browser when a web page embeds them with <img>, <link>, and <script> tags respectively.

A resource URL can take different forms. For example, it can be a full URL:

http://cdn.mysite.com/images/animage.jpg

It can be a relative path:

images/animage.jpg 
../images/animage.jpg

Or just a reference from the root:

/images/animage.jpg

How can I write a function in Python that takes the page URL and the resource URL and ensures that a full URL is returned? For example:

def resource_url(page, resource):
    # If the resource is a full URL, return that.
    # If not, use the page URL and the resource to return the full URL.
    ...

Source

2012-02-23 Alex Coplan

Have you looked at the urllib.parse.urljoin method? http://docs.python.org/release/3.1.3/library/urllib.parse.html – Peter
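To make the suggestion above concrete, here is a minimal sketch of how urljoin resolves each form listed in the question. It assumes Python 3, and the page URL http://www.mysite.com/blog/post.html is made up purely for illustration.

```python
from urllib.parse import urljoin

# Hypothetical page URL, chosen only for this example.
page = "http://www.mysite.com/blog/post.html"

# An absolute resource URL is returned unchanged.
full = urljoin(page, "http://cdn.mysite.com/images/animage.jpg")
# -> http://cdn.mysite.com/images/animage.jpg

# A relative path is resolved against the page's directory (/blog/).
relative = urljoin(page, "images/animage.jpg")
# -> http://www.mysite.com/blog/images/animage.jpg

# "../" climbs one directory up from /blog/.
parent = urljoin(page, "../images/animage.jpg")
# -> http://www.mysite.com/images/animage.jpg

# A leading slash resolves against the site root.
rooted = urljoin(page, "/images/animage.jpg")
# -> http://www.mysite.com/images/animage.jpg
```

Because urljoin already handles the absolute case, a wrapper function may not need its own check for full URLs.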

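try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin  # Python 2

def resource_url(page, resource):
    if not resource.startswith(page):
        # Doesn't start with the page URL, e.g. http://example.com.
        # urljoin leaves an absolute resource URL unchanged.
        resource = urljoin(page, resource)
    return resource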

Source

2012-02-23 14:19:40 platinummonkey
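Since the question is about resources embedded with <img>, <link>, and <script> tags, the following is one possible sketch of collecting those URLs from a page and resolving each one with urljoin. The ResourceCollector class, the page URL, and the sample markup are all hypothetical names and data introduced for illustration. It assumes Python 3 and uses only the standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collect absolute URLs of <img>, <script> and <link> resources."""

    # Attribute that carries the resource URL for each tag of interest.
    URL_ATTRS = {"img": "src", "script": "src", "link": "href"}

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        wanted = self.URL_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # urljoin keeps absolute URLs and resolves relative ones.
                self.resources.append(urljoin(self.page_url, value))


# Hypothetical page URL and markup, for illustration only.
html = (
    '<link rel="stylesheet" href="/css/site.css">'
    '<img src="images/animage.jpg">'
    '<script src="http://cdn.mysite.com/js/app.js"></script>'
)
collector = ResourceCollector("http://www.mysite.com/blog/post.html")
collector.feed(html)
collector.close()

for url in collector.resources:
    print(url)
```

For real-world pages a dedicated HTML library may be more robust. This sketch only shows where urljoin would fit.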
