HTML Agility Pack: parsing a long URL with params in an image src attribute

I'm having a problem with the image src attribute when using HAP to parse an HTML document. If the value of the src attribute is a long URL with parameters, for example:

<img border='0' title='Kommunelogo' alt='Kommunelogo' style='margin-top: 5px;' src='http://livskraftig.bedrekommune.no/more/reports/profilechart.jsp?legend=Y&graphtype=xy&profileid=19433213274429306&element=72&addyears=true' />

then HAP seems to split up the params, thinking they are attributes, and parses the image like this:

<img border='0' title='Kommunelogo' alt='Kommunelogo' style='margin-top: 5px;' src='http://livskraftig.bedrekommune.no/more/reports/profilechart.jsp?legend="Y"&amp;graphtype="xy"&amp;profileid="19433213274429306"&amp;element="72"&amp;addyears="tru"e'/>

My code:

HtmlDocument doc = new HtmlDocument();
doc.OptionOutputAsXml = true;
doc.OptionAutoCloseOnEnd = true;
doc.OptionFixNestedTags = true;
doc.LoadHtml(input_which_is_a_whole_html_file);

HtmlAgilityPack.HtmlNodeCollection imageNodes = doc.DocumentNode.SelectNodes("//img");
if (imageNodes != null)
{
    foreach (HtmlAgilityPack.HtmlNode imgNode in imageNodes)
    {
        string imgSrc = imgNode.Attributes["src"].Value;
    }
}

Any ideas on how I can avoid this?

Thanks!

Source

2010-11-23 byte_slave

Please show the whole document you are parsing. When I test your code on a document containing just that 'img', HAP returns the URL intact.

The following works fine, so your code is probably doing something odd:

HtmlDocument doc = new HtmlDocument();
doc.LoadHtml("<img border='0' title='Kommunelogo' alt='Kommunelogo' style='margin-top: 5px;' src='http://livskraftig.bedrekommune.no/more/reports/profilechart.jsp?legend=Y&graphtype=xy&profileid=19433213274429306&element=72&addyears=true' />");
doc.Save(Console.Out);

Do you have a repro?
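
To help narrow it down, here is a minimal sketch that reads the src value back directly instead of saving the whole document, which makes it easy to see whether the URL survives parsing. It assumes the HtmlAgilityPack package is referenced. The class and method names (ImgSrcCheck, ReadImageSources) are made up for illustration and are not from the original post. It also decodes entities with HtmlEntity.DeEntitize, since ampersands in attribute values are strictly supposed to be written as &amp;.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Illustrative helper; the class and method names are assumptions, not from the original post.
public static class ImgSrcCheck
{
    // Returns the src value of every <img> in the given HTML, with entities decoded.
    public static List<string> ReadImageSources(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var sources = new List<string>();

        // By default SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection imageNodes = doc.DocumentNode.SelectNodes("//img");
        if (imageNodes == null)
        {
            return sources;
        }

        foreach (HtmlNode imgNode in imageNodes)
        {
            // GetAttributeValue falls back to the default instead of throwing when src is missing.
            string rawSrc = imgNode.GetAttributeValue("src", string.Empty);

            // Decode entities such as &amp; so the query string reads as a plain URL.
            sources.Add(HtmlEntity.DeEntitize(rawSrc));
        }

        return sources;
    }

    public static void Main()
    {
        string html =
            "<img border='0' title='Kommunelogo' alt='Kommunelogo' style='margin-top: 5px;' " +
            "src='http://livskraftig.bedrekommune.no/more/reports/profilechart.jsp" +
            "?legend=Y&graphtype=xy&profileid=19433213274429306&element=72&addyears=true' />";

        foreach (string src in ReadImageSources(html))
        {
            Console.WriteLine(src);
        }
    }
}
```

If this prints the URL in one piece for the isolated tag but not for the full page, the difference is likely in the surrounding markup or in the options set before LoadHtml, so trimming the real input down step by step should give a repro.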

Source

2010-11-23 10:06:42

I edited the question and added some code. Thanks!
