Training models in OpenNLP

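The format is described on the page above (each model requires a different format). Once you have created this file, run it through the API or the opennlp application (from the command line) to generate a .bin file. Once you have the .bin file, you can load it into a model and use it (following the API on the website above).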

2013-10-28 15:57:38

You could also have saved some typing by just saying RTFM. – demongolem

See the latest documentation at http://opennlp.apache.org/docs/1.8.1/manual/opennlp.html.

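First, you need training data annotated with the entities you want to detect.

Sentences must be separated by a newline character (\n). Values must be separated from the <START> and <END> tags by a space character.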

<START:medicine> Augmentin-Duo <END> is a penicillin antibiotic that contains two medicines - <START:medicine> amoxicillin trihydrate <END> and <START:medicine> potassium clavulanate <END> .
They work together to kill certain types of bacteria and are used to treat certain types of bacterial infections.

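This is what the data should look like if, say, you want to create a medicine entity model. You can also refer to a sample dataset for an example. The training data should contain at least 15,000 sentences to get better results.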
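Since the format is strict about spaces around the tags, it can help to sanity-check each line before training. The following is a minimal sketch using only the Java standard library; the class `TrainingDataCheck` and its `check` method are hypothetical helpers written for illustration and are not part of OpenNLP. It assumes tokens are split on whitespace, as the format description above implies.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of OpenNLP): checks one annotated training line. */
public class TrainingDataCheck {

    /** Returns the problems found in a single training line; an empty list means none. */
    public static List<String> check(String line) {
        List<String> problems = new ArrayList<>();
        boolean insideEntity = false;
        // Tokens are whitespace-separated, so a tag glued to other text is not seen as a tag.
        for (String token : line.trim().split("\\s+")) {
            if (token.startsWith("<START") && token.endsWith(">")) {
                if (insideEntity) {
                    problems.add("nested " + token);
                }
                insideEntity = true;
            } else if (token.equals("<END>")) {
                if (!insideEntity) {
                    problems.add("<END> without <START>");
                }
                insideEntity = false;
            } else if (token.contains("<START") || token.contains("<END>")) {
                problems.add("tag not separated by spaces: " + token);
            }
        }
        if (insideEntity) {
            problems.add("missing <END>");
        }
        return problems;
    }

    public static void main(String[] args) {
        // A well-formed line: every tag stands alone as its own token.
        System.out.println(check("<START:medicine> Augmentin-Duo <END> is a penicillin antibiotic ."));
        // A malformed line: the period is glued to <END>, so the entity is never closed.
        System.out.println(check("<START:medicine> potassium clavulanate <END>. They work together"));
    }
}
```

Running `check` over every line of the training file and printing the line numbers of non-empty results is usually enough to catch glued punctuation such as `<END>.` before the trainer sees it.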

You can then use the OpenNLP TokenNameFinderTrainer; the output file is in .bin format. For an example, see "Writing a custom NameFinder model in OpenNLP"; for more details, see the OpenNLP documentation.

2016-06-08 07:27:13

Copy the data into data.txt and run the code below to get your own mymodel.bin.

For the data, you can refer to https://github.com/mccraigmccraig/opennlp/blob/master/src/test/resources/opennlp/tools/namefind/AnnotatedSentencesWithTypes.txt

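import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.Collections;

import opennlp.tools.namefind.NameFinderME;
import opennlp.tools.namefind.NameSample;
import opennlp.tools.namefind.NameSampleDataStream;
import opennlp.tools.namefind.TokenNameFinderModel;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;

// Uses the older NameFinderME.train(...) and PlainTextByLineStream(InputStream, Charset)
// signatures as posted; newer OpenNLP releases changed both.
public class Training {
    // where the trained model is written
    static String onlpModelPath = "mymodel.bin";
    // training data set, one annotated sentence per line
    static String trainingDataFilePath = "data.txt";

    public static void main(String[] args) throws IOException {
        Charset charset = Charset.forName("UTF-8");
        ObjectStream<String> lineStream = new PlainTextByLineStream(
                new FileInputStream(trainingDataFilePath), charset);
        ObjectStream<NameSample> sampleStream = new NameSampleDataStream(lineStream);
        TokenNameFinderModel model = null;
        try {
            // "en" is the language code, "drugs" the entity type; no extra resources
            model = NameFinderME.train("en", "drugs", sampleStream,
                    Collections.<String, Object>emptyMap());
        } finally {
            sampleStream.close();
        }
        // serialize the trained model to mymodel.bin
        BufferedOutputStream modelOut = null;
        try {
            modelOut = new BufferedOutputStream(new FileOutputStream(onlpModelPath));
            model.serialize(modelOut);
        } finally {
            if (modelOut != null)
                modelOut.close();
        }
    }
}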
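Once mymodel.bin exists, loading it and tagging a sentence might look roughly like the sketch below. It needs the OpenNLP jar on the classpath. The class name `Tagging` and the example sentence are assumptions made for illustration; the sentence is tokenized on whitespace to mirror the training format.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import opennlp.tools.namefind.NameFinderME;
import opennlp.tools.namefind.TokenNameFinderModel;
import opennlp.tools.tokenize.WhitespaceTokenizer;
import opennlp.tools.util.Span;

public class Tagging {
    public static void main(String[] args) throws IOException {
        // Load the serialized model written by the training step.
        TokenNameFinderModel model;
        try (InputStream modelIn = new FileInputStream("mymodel.bin")) {
            model = new TokenNameFinderModel(modelIn);
        }
        NameFinderME nameFinder = new NameFinderME(model);

        // Illustrative sentence (an assumption), split on whitespace like the training data.
        String[] tokens = WhitespaceTokenizer.INSTANCE.tokenize(
                "Augmentin-Duo contains amoxicillin trihydrate and potassium clavulanate .");

        // Each Span covers a token range and carries the entity type.
        Span[] spans = nameFinder.find(tokens);
        String[] names = Span.spansToStrings(spans, tokens);
        for (int i = 0; i < spans.length; i++) {
            System.out.println(spans[i].getType() + ": " + names[i]);
        }

        // Reset document-level adaptive data before moving on to an unrelated document.
        nameFinder.clearAdaptiveData();
    }
}
```

What gets printed depends entirely on the trained model, so no output is shown here.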

2016-09-21 13:33:28 user6858643

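Welcome! While this code may help solve the problem, it does not explain _why_ and/or _how_ it answers the question. Providing this additional context would significantly improve its long-term educational value. Please edit your answer to add an explanation, including what limitations and assumptions apply.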
