Solr: Data Import Handler and Solr Cell

Is it possible to index rich documents (PDF, Office) with the Data Import Handler by using Solr Cell?

I am using Solr 3.2.

Thank you.

2011-07-13 bobosh

Solr Cell, a.k.a. ExtractingRequestHandler, uses Apache Tika behind the scenes, and Tika can easily be plugged into the DataImportHandler:

<dataConfig> 
<!-- use any of type DataSource<InputStream> --> 
    <dataSource type="BinURLDataSource"/> 
    <document> 
    <!-- The value of format can be text|xml|html|none. This is the format in which the body is emitted (the 'text' field). The implicit field 'text' will have that format. 
      The default value is 'text' (if not specified). format="none" means the body is not emitted. --> 
    <entity processor="TikaEntityProcessor" tikaConfig="tikaconfig.xml" url="${some.var.goes.here}" format="text"> 
     <!-- Do appropriate mapping here. meta="true" means it is a metadata field. --> 
     <field column="Author" meta="true" name="author"/> 
     <field column="title" meta="true" name="docTitle"/> 
     <!-- 'text' is an implicit field emitted by TikaEntityProcessor. Map it appropriately. --> 
     <field column="text"/> 
    </entity> 
    </document> 
</dataConfig>

This feature was implemented in SOLR-1358.
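
For the config above to take effect, the DataImportHandler also has to be registered in solrconfig.xml. What follows is only a minimal sketch under stated assumptions: the file name data-config.xml is hypothetical, and the `<lib>` paths assume the directory layout of the stock Solr 3.x example distribution, so adjust both to your install.

```xml
<!-- Assumption: jar locations follow the stock Solr 3.x example layout. -->
<!-- DataImportHandler itself plus the "extras" jar that contains TikaEntityProcessor -->
<lib dir="../../dist/" regex="apache-solr-dataimporthandler-.*\.jar" />
<!-- Tika and its parser dependencies, shipped with the extraction contrib -->
<lib dir="../../contrib/extraction/lib" />

<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <!-- Hypothetical file name; it should point at the dataConfig shown above -->
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>
```

With a handler registered under that name, an import would typically be started by requesting /dataimport?command=full-import on the Solr core.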

Source

2011-07-13 09:14:02 opyate

I found this a few minutes ago, but I get an error: GRAVE: Full Import failed: org.apache.solr.handler.dataimport.DataImportHandlerException: No dataSource :bin available for entity :94600730275216 Processing Document # 1. I don't know why. – bobosh

But I did configure the data source. – bobosh

Your question was "Is it possible?". Please ask a separate question. – opyate
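
For readers who hit the same "No dataSource :bin available" error: the message names a data source called bin, which suggests the entity refers to dataSource="bin" while no `<dataSource>` element carries that name. This is a guess, since the failing config was never posted in the thread. A minimal sketch of a config where the two names match, with the entity name tika being purely illustrative:

```xml
<dataConfig>
  <!-- The name here must match the dataSource attribute on the entity below -->
  <dataSource name="bin" type="BinURLDataSource"/>
  <document>
    <!-- name="tika" is a hypothetical label; the url placeholder is kept from the answer above -->
    <entity name="tika" processor="TikaEntityProcessor" dataSource="bin"
            url="${some.var.goes.here}" format="text">
      <!-- 'text' is the implicit body field emitted by TikaEntityProcessor -->
      <field column="text"/>
    </entity>
  </document>
</dataConfig>
```

If the entity omits the dataSource attribute entirely, as in the answer's config, the unnamed `<dataSource>` is used as the default instead.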
