Why does sklearn's CountVectorizer ignore the pronoun "I"?
CountVectorizer drops the token 'I' when I build word bigrams:
from sklearn.feature_extraction.text import CountVectorizer

ngram_vectorizer = CountVectorizer(analyzer="word", ngram_range=(2, 2), min_df=1)
ngram_vectorizer.fit_transform(['HE GAVE IT TO I'])
<1x3 sparse matrix of type '<class 'numpy.int64'>'
ngram_vectorizer.get_feature_names()
['gave it', 'he gave', 'it to']
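
One likely explanation is that the "word" analyzer's default `token_pattern`, `r"(?u)\b\w\w+\b"`, only matches tokens of two or more word characters, so the single-letter "I" never becomes a token and the bigram "to i" is never formed. The sketch below imitates that tokenization with the standard `re` module rather than calling scikit-learn. The helper `word_bigrams` and the relaxed pattern `r"(?u)\b\w+\b"` are illustrative assumptions, not part of the original post.

```python
import re

# Default token_pattern documented for CountVectorizer: 2+ word characters.
DEFAULT_PATTERN = r"(?u)\b\w\w+\b"
# Assumed alternative that also accepts single-character tokens.
SINGLE_CHAR_PATTERN = r"(?u)\b\w+\b"


def word_bigrams(text, pattern):
    """Hypothetical helper: lowercase, tokenize with the regex, pair neighbours."""
    tokens = re.findall(pattern, text.lower())
    # Join each adjacent pair of tokens and return the unique bigrams, sorted.
    return sorted({" ".join(pair) for pair in zip(tokens, tokens[1:])})


default_bigrams = word_bigrams("HE GAVE IT TO I", DEFAULT_PATTERN)
relaxed_bigrams = word_bigrams("HE GAVE IT TO I", SINGLE_CHAR_PATTERN)

print(default_bigrams)  # "I" is dropped, so there is no "to i"
print(relaxed_bigrams)  # "I" is kept as the token "i"
```

If this is indeed the cause, passing `token_pattern=r"(?u)\b\w+\b"` to `CountVectorizer` should keep single-character words such as "I" in the vocabulary.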