PHP regex for parsing links

I have a PHP script that parses the POST content of a form (a message) and converts URLs into real HTML links. These are the two regular expressions I use:

$dbQueryList['sb_message'] = preg_replace("#(^|[\n ])([\w]+?://[^ \"\n\r\t<]*)#is", "\\1<a href=\"\\2\" target=\"_blank\">\\2</a>", $dbQueryList['sb_message']); 

$dbQueryList['sb_message'] = preg_replace("#(^|[\n ])((www|ftp)\.[^ \"\t\n\r<]*)#is", "\\1<a href=\"http://\\2\" target=\"_blank\">\\2</a>", $dbQueryList['sb_message']);

OK, that works well, but now, in another script, I want to do the opposite. In my $dbQueryList['sb_message'] I may have a link like "<a href="http://google.com" target="_blank">Google</a>", and I would like to end up with just "http://google.com".

I can't manage to write a regex that does this. Could you help me? Thanks :)

2010-11-21 Jensen

Something like this, I think:

echo preg_replace('/<a href="([^"]*)([^<\/]*)<\/a>/i', "$1", 'moofoo <a href="http://google.com" target="_blank"> Google </a> helloworld');
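
The pattern above expects href to be the first attribute and the link text to contain no "/" or "<". A minimal sketch of a more tolerant variant follows, assuming href values are always quoted; the function name unlinkAnchors is hypothetical and not part of the original answer. It is still a regex-based approach, so it can be fooled by unusual markup.

```php
<?php

// Hypothetical helper (not from the original answer): replaces each
// <a ...>...</a> element with the value of its href attribute.
// Assumption: href values are wrapped in single or double quotes.
function unlinkAnchors(string $html): string
{
    // \bhref avoids matching the tail of a longer word such as "xhref",
    // (["\']) captures the opening quote so \1 requires the same closing quote,
    // and .*? with the s flag skips the link text, even across newlines.
    $pattern = '#<a\b[^>]*\bhref\s*=\s*(["\'])(.*?)\1[^>]*>.*?</a>#is';

    // $2 is the captured URL; everything else in the anchor is dropped.
    return preg_replace($pattern, '$2', $html);
}

echo unlinkAnchors('moofoo <a target="_blank" href="http://google.com/search">Google / Search</a> helloworld');
// prints: moofoo http://google.com/search helloworld
```

Capturing the quote character and reusing it with \1 lets the same pattern accept both href="..." and href='...' without duplicating the expression.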

2010-11-21 19:02:05 shybovycha

It is safer to use DOMDocument instead of regular expressions to parse HTML content.

Try this code:

<?php 

function extractAnchors($html) 
{ 
    $dom = new DOMDocument(); 
    // loadHtml() needs mb_convert_encoding() to work well with UTF-8 encoding 
    $dom->loadHtml(mb_convert_encoding($html, 'HTML-ENTITIES', "UTF-8")); 

    $xpath = new DOMXPath($dom); 

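    foreach ($xpath->query('//a') as $node) 
    { 
        if ($node->hasAttribute('href')) 
        { 
            // a text node is escaped on output, so hrefs containing "&" are safe; 
            // appendXML() would fail on such values because they are not well-formed XML 
            $newNode = $dom->createTextNode($node->getAttribute('href')); 
            $node->parentNode->replaceChild($newNode, $node); 
        } 
    } 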

    // get only the body tag with its contents, then trim the body tag itself to get only the original content 
    return mb_substr($dom->saveXML($xpath->query('//body')->item(0)), 6, -7, "UTF-8"); 
} 

$html = 'Some text <a href="http://www.google.com">Google</a> some text <img src="http://dontextract.it" alt="alt"> some text.'; 
echo extractAnchors($html);
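
If only the URLs themselves are needed, rather than the message with its links replaced, the same DOM approach can collect them into an array. This is a rough sketch under that assumption; listAnchorUrls is a hypothetical name, and the input is assumed to be plain ASCII or already correctly encoded for loadHTML().

```php
<?php

// Hypothetical helper (not from the original answer): returns the href
// value of every <a> element found in an HTML fragment.
function listAnchorUrls(string $html): array
{
    $dom = new DOMDocument();

    // User-submitted markup is often invalid; collect libxml warnings
    // internally instead of emitting them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = [];
    foreach ($dom->getElementsByTagName('a') as $anchor) {
        if ($anchor->hasAttribute('href')) {
            // getAttribute() returns the decoded value (&amp; becomes &).
            $urls[] = $anchor->getAttribute('href');
        }
    }

    return $urls;
}

print_r(listAnchorUrls('Some text <a href="http://www.google.com">Google</a> and <a href="http://example.com/?a=1&amp;b=2">more</a>.'));
```

Because nothing in the document is modified here, there is no need to serialize the body back to a string, which avoids the substring trimming used above.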

2010-11-21 19:14:13
