Mine phrases (up to 3 words) from a given text
I asked before for a simple solution to my problem.
Someone kindly provided this code... (it uses the Sphinx search service) but I'm getting nowhere with it.
<?php
/**
* $Project: GeoGraph $
* $Id$
*
* GeoGraph geographic photo archive project
* This file copyright (C) 2005 Barry Hunter ([email protected])
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version 2
* of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
*/
/**
* Provides the methods for updating the worknet tables
*
* @package Geograph
* @author Barry Hunter <[email protected]>
* @version $Revision$
*/
function addTwoLetterPhrase($phrase) {
global $w2;
$w2[$phrase] = (isset($w2[$phrase]))?($w2[$phrase]+1):1;
}
function addThreeLetterPhrase($phrase) {
global $w3;
$w3[$phrase] = (isset($w3[$phrase]))?($w3[$phrase]+1):1;
}
function updateWordnet(&$db,$text,$field,$id) {
global $w1,$w2,$w3;
$alltext = strtolower(preg_replace('/\W+/',' ',str_replace("'",'',$text)));
if (strlen($text)< 1)
return;
$words = preg_split('/ /',$alltext);
$w1 = array();
$w2 = array();
$w3 = array();
//build a list of one word phrases
foreach ($words as $word) {
$w1[$word] = (isset($w1[$word]))?($w1[$word]+1):1;
}
//build a list of two word phrases
$text = $alltext;
$text = preg_replace('/(\w+) (\w+)/e','addTwoLetterPhrase("$1 $2")',$text);
$text = $alltext;
$text = preg_replace('/(\w+)/','',$text,1);
$text = preg_replace('/(\w+) (\w+)/e','addTwoLetterPhrase("$1 $2")',$text);
//build a list of three word phrases
$text = $alltext;
$text = preg_replace('/(\w+) (\w+) (\w+)/e','addThreeLetterPhrase("$1 $2 $3")',$text);
$text = $alltext;
$text = preg_replace('/(\w+)/','',$text,1);
$text = preg_replace('/(\w+) (\w+) (\w+)/e','addThreeLetterPhrase("$1 $2 $3")',$text);
$text = $alltext;
$text = preg_replace('/(\w+) (\w+)/','',$text,1);
$text = preg_replace('/(\w+) (\w+) (\w+)/e','addThreeLetterPhrase("$1 $2 $3")',$text);
foreach ($w1 as $word=>$count) {
$db->Execute("insert into wordnet1 set gid = $id,words = '$word',$field = $count");// ON DUPLICATE KEY UPDATE $field=$field+$count");
}
foreach ($w2 as $word=>$count) {
$db->Execute("insert into wordnet2 set gid = $id,words = '$word',$field = $count");
}
foreach ($w3 as $word=>$count) {
$db->Execute("insert into wordnet3 set gid = $id,words = '$word',$field = $count");
}
}
?>
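It works fine, except....... it's almost exactly what I need.... but it isn't UTF-8 friendly... I mean... it splits whole words into parts (at the special characters) where it shouldn't!
So my guess is that I need to use the multibyte functions instead of the regular preg_replace.
For the 2- and 3-word phrases I tried replacing preg_replace with mb_ereg_replace, but it doesn't work as it should... at least not for the phrases.
Any ideas?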
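One way to sidestep the word-splitting, as a rough sketch rather than a drop-in fix: split on Unicode letters and digits with the `/u` modifier, lowercase with `mb_strtolower`, and count the 1-, 2- and 3-word phrases with a plain loop instead of regex replacements. The function name `minePhrases` and its return shape are made up for illustration, the sketch assumes the mbstring extension is loaded and the input is valid UTF-8, and unlike `\w` it does not treat the underscore as a word character.

```php
<?php
// Hypothetical helper, not part of the original GeoGraph code.
// Returns [1 => [word => count], 2 => [phrase => count], 3 => [...]].
function minePhrases(string $text, int $maxWords = 3): array
{
    // mb_strtolower() handles non-ASCII letters; strtolower() is only safe for ASCII.
    $text = mb_strtolower(str_replace("'", '', $text), 'UTF-8');

    // Split on runs of anything that is not a Unicode letter or digit.
    // The /u modifier makes PCRE treat the subject as UTF-8.
    $words = preg_split('/[^\p{L}\p{N}]+/u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($words === false) {
        return []; // preg_split() failed, e.g. on invalid UTF-8 input
    }

    $counts = [];
    $total = count($words);
    for ($n = 1; $n <= $maxWords; $n++) {
        $counts[$n] = [];
        // Slide a window of $n words over the text, one word at a time.
        for ($i = 0; $i + $n <= $total; $i++) {
            $phrase = implode(' ', array_slice($words, $i, $n));
            $counts[$n][$phrase] = ($counts[$n][$phrase] ?? 0) + 1;
        }
    }
    return $counts;
}

$result = minePhrases('Żółta łódź, żółta łódź płynie');
```

Here `$result[1]`, `$result[2]` and `$result[3]` would play the role of `$w1`, `$w2` and `$w3` in the listing above; the database inserts are left out on purpose.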
I did add /u ... like this: ...... $alltext = strtolower(preg_replace('/\W+/u', ' ', str_replace("'", '', $text))); ...... and ...... $text = preg_replace('/(\w+)/e/u', 'addTwoLetterPhrase("$1 $2")', $text); but I'm getting errors.... Sorry, I suck at regular expressions –
Well, duh. Read the manual. Adding a modifier means '/.../u' or '/.../eu', not adding a second delimiter. – mario
I think it's working somehow now :)) .... THANKS A LOT! –
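For anyone landing here later, the point about delimiters can be sketched as follows. Modifier letters all go after the single closing delimiter, so a second `/` is read as an unknown modifier. Note also that the `/e` modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so the sketch uses `preg_replace_callback` in its place; the variable names are illustrative, and the explicit `\p{L}\p{N}` classes are my choice here instead of `\w`.

```php
<?php
// Illustrative only: variable names are not from the original thread.

// Valid: one closing delimiter, then the modifier letter(s).
$clean = preg_replace('/[^\p{L}\p{N}]+/u', ' ', 'żółta, łódź!');

// Invalid: a second delimiter after the modifiers. PHP raises a warning
// and the call fails; @ only silences the warning for this demo.
$broken = @preg_match('/(\w+)/u/u', 'abc');

// Replacement for the removed /e modifier: pass a callback that
// receives the match groups and records each two-word phrase.
$pairs = [];
preg_replace_callback(
    '/([\p{L}\p{N}]+) ([\p{L}\p{N}]+)/u',
    function (array $m) use (&$pairs) {
        $phrase = $m[1] . ' ' . $m[2];
        $pairs[$phrase] = ($pairs[$phrase] ?? 0) + 1;
        return ''; // the replaced text is discarded, as in the original code
    },
    'żółta łódź płynie dalej'
);
```

As in the original listing, a single pass only sees non-overlapping pairs, so a second pass with the first word stripped is still needed to catch the pairs that start at odd positions.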