How to use the spaCy lemmatizer to get a word's base form

I'm trying to learn spaCy and want to use its lemmatizer function, but I don't know how to use it. I want to pass in a string of words and get back a string with each word in its base form, for example 'words' => 'word' and 'did' => 'do'. Thank you.

Source

2016-08-04 yi wang

http://textminingonline.com/getting-started-with-spacy – RAVI

https://spacy.io/docs – RAVI

I have looked at that site before, thank you, but it doesn't explain the details. OK, I'll try the code from the site. Thanks again. –

Code:

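# Written for an older spaCy release: spacy.en and LOCAL_DATA_DIR are not part of current versions.
from __future__ import print_function  # so print() behaves the same on Python 2 and 3

import os
from spacy.en import English, LOCAL_DATA_DIR

# Use the SPACY_DATA environment variable if it is set, otherwise spaCy's default data directory.
data_dir = os.environ.get('SPACY_DATA', LOCAL_DATA_DIR)

nlp = English(data_dir=data_dir)

doc3 = nlp(u"this is spacy lemmatize testing. programming books are more better than others")

for token in doc3:
    # token.lemma is the integer ID of the lemma; token.lemma_ is the lemma as a string.
    print(token, token.lemma, token.lemma_)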

Output:

this 496 this 
is 488 be 
spacy 173779 spacy 
lemmatize 1510965 lemmatize 
testing 2900 testing 
. 419 . 
programming 3408 programming 
books 1011 book 
are 488 be 
more 529 more 
better 615 better 
than 555 than 
others 871 others

Example reference: here

Source

2016-08-04 14:46:18 RAVI

In nlp = English(data_dir=data_dir), what does data_dir=data_dir mean? Both sides are the same name. –

It passes a variable. The English() method takes an argument named data_dir, so you pass "data_dir=local_variable_name". For example: d_dir = os.environ.get('SPACY_DATA', LOCAL_DATA_DIR) followed by nlp = English(data_dir=d_dir). That's basic Python. – RAVI

OK, I'll try these. –
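The keyword-argument point in the comment above can be shown without spaCy at all. This is a minimal sketch: `load_pipeline` is a hypothetical stand-in for `English()`, and the fallback path is made up for the example.

```python
import os


def load_pipeline(data_dir=None):
    """Hypothetical stand-in for English(); it only reports the argument it received."""
    return "pipeline using data from {}".format(data_dir)


# Right-hand side of "=": an ordinary local variable.
# The fallback path is invented for this example.
d_dir = os.environ.get("SPACY_DATA", "/tmp/spacy-data")

# Left-hand side of "=": the parameter name declared by load_pipeline.
print(load_pipeline(data_dir=d_dir))

# Giving the local variable the same name as the parameter changes nothing.
data_dir = d_dir
assert load_pipeline(data_dir=data_dir) == load_pipeline(data_dir=d_dir)
```

In `data_dir=data_dir` the name on the left is the function's parameter and the name on the right is the caller's variable; they only happen to be spelled the same.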

The previous answer is convoluted and can't be edited, so here is a more conventional one.

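# Make sure you have downloaded the English model with "python -m spacy download en".
# (The 'en' shortcut is from older spaCy releases; spaCy 3 uses names such as 'en_core_web_sm'.)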

import spacy 
nlp = spacy.load('en') 

doc = nlp(u"Apples and oranges are similar. Boots and hippos aren't.") 

for token in doc: 
    print(token, token.lemma, token.lemma_)

Output:

Apples 6617 apples 
and 512 and 
oranges 7024 orange 
are 536 be 
similar 1447 similar 
. 453 . 
Boots 4622 boot 
and 512 and 
hippos 98365 hippo 
are 536 be 
n't 538 not 
. 453 .

Official Lightning tour

Source
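The question asks for a string back rather than a printed table. One way to get that, sketched here under the assumption that `nlp` is a pipeline loaded as in the answer above, is to join the `lemma_` strings. The helper name `lemmatize_text` is made up for this example.

```python
def lemmatize_text(nlp, text):
    """Return `text` with every token replaced by its lemma string.

    `nlp` is any callable that returns tokens with a `lemma_` attribute,
    such as a loaded spaCy pipeline.
    """
    doc = nlp(text)
    # token.lemma_ is the readable string; token.lemma is its integer ID.
    return " ".join(token.lemma_ for token in doc)


# Usage, assuming the model is loaded as in the answer above:
#   nlp = spacy.load('en')
#   lemmatize_text(nlp, u"Boots and hippos aren't.")
```

Joining with single spaces also puts a space before punctuation tokens, so the result is a normalized string rather than a faithful copy of the original spacing.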

2017-03-24 15:48:06 damio

Do you need to indicate that the text is unicode before passing it to 'nlp'? See [here](https://spacy.io/docs/usage/lightning-tour#examples-resources) –

@PhilipO'Brien Maybe in Python 2; I'm using Python 3 here – damio

Ah, in Python 2 you have to state explicitly that it is unicode. Thanks! (I really should switch to 3!) –
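The Python 2 versus Python 3 difference discussed in these comments can be checked directly. A small sketch, meant to be run on Python 3, where the `u` prefix is accepted but has no effect:

```python
text_with_prefix = u"Apples and oranges are similar."
text_without_prefix = "Apples and oranges are similar."

# On Python 3 both literals are the same type (str) and compare equal,
# so the u prefix is optional when calling nlp(...).
print(type(text_with_prefix) is type(text_without_prefix))
print(text_with_prefix == text_without_prefix)
```

On Python 2 the unprefixed literal is a byte string instead, which is why the examples in this thread write `u"..."`.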

If you only want to use the Lemmatizer on its own, you can do it as follows.

from spacy.lemmatizer import Lemmatizer 
from spacy.lang.en import LEMMA_INDEX, LEMMA_EXC, LEMMA_RULES 

lemmatizer = Lemmatizer(LEMMA_INDEX, LEMMA_EXC, LEMMA_RULES) 
lemmas = lemmatizer(u'ducks', u'NOUN') 
print(lemmas)

Output:

['duck']
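To give a rough feel for why the Lemmatizer above is built from three tables, here is a toy sketch of the general idea: check a table of irregular forms first, then try suffix rules and keep only results found in a set of known base forms. This is not spaCy's implementation; `toy_lemmatize` and the `TOY_*` tables are invented for illustration and are far smaller than real ones.

```python
# Toy tables, invented for illustration only.
TOY_INDEX = {"duck", "book", "do"}        # known base forms
TOY_EXCEPTIONS = {"did": ["do"]}          # irregular forms
TOY_RULES = [("s", ""), ("ing", "")]      # (suffix, replacement) pairs


def toy_lemmatize(word, index, exceptions, rules):
    """Return candidate lemmas for `word`: exceptions first, then suffix rules."""
    word = word.lower()
    if word in exceptions:
        return list(exceptions[word])
    candidates = []
    for suffix, replacement in rules:
        if word.endswith(suffix):
            candidate = word[: len(word) - len(suffix)] + replacement
            # Keep a rewritten form only if it is a known base form.
            if candidate in index and candidate not in candidates:
                candidates.append(candidate)
    # Fall back to the word itself when nothing matched.
    return candidates or [word]


print(toy_lemmatize("ducks", TOY_INDEX, TOY_EXCEPTIONS, TOY_RULES))  # ['duck']
print(toy_lemmatize("did", TOY_INDEX, TOY_EXCEPTIONS, TOY_RULES))    # ['do']
```

Like the real call above, the toy version returns a list, since a word can have more than one candidate lemma.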

Source

2018-02-23 13:08:16 joel
