2017-12-17 22 views

I would like to try TextCat. It would be most convenient if I could run it from Python, because I want to see how well it works on a personal dataset. How can I use TextCat from Python?

I tried languagedet, but according to the following snippet it offers far fewer languages than the 69 that the TextCat website claims:

from languagedet.mixed import MixedDetector
det = MixedDetector()
print(det.available)

I also tried pylibtextcat, but when I try to install it I get:

Collecting pylibtextcat 
    Using cached pylibtextcat-0.2.tar.bz2 
Building wheels for collected packages: pylibtextcat 
    Running setup.py bdist_wheel for pylibtextcat ... error 
    Complete output from command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-1dkslney/pylibtextcat/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/tmpyct9pyfepip-wheel- --python-tag cp35: 
    running bdist_wheel 
    running build 
    running build_ext 
    building 'textcat' extension 
    creating build 
    creating build/temp.linux-x86_64-3.5 
    x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION="0.2" -I/usr/include/python3.5m -c libtextcat.c -o build/temp.linux-x86_64-3.5/libtextcat.o -Wall -Wextra 
    libtextcat.c:7:32: fatal error: libtextcat/textcat.h: No such file or directory 
    compilation terminated. 
    error: command 'x86_64-linux-gnu-gcc' failed with exit status 1 

    ---------------------------------------- 
    Failed building wheel for pylibtextcat 
    Running setup.py clean for pylibtextcat 
Failed to build pylibtextcat 
Installing collected packages: pylibtextcat 
    Running setup.py install for pylibtextcat ... error 
    Complete output from command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-1dkslney/pylibtextcat/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-lwxglu50-record/install-record.txt --single-version-externally-managed --compile: 
    running install 
    running build 
    running build_ext 
    building 'textcat' extension 
    creating build 
    creating build/temp.linux-x86_64-3.5 
    x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION="0.2" -I/usr/include/python3.5m -c libtextcat.c -o build/temp.linux-x86_64-3.5/libtextcat.o -Wall -Wextra 
    libtextcat.c:7:32: fatal error: libtextcat/textcat.h: No such file or directory 
    compilation terminated. 
    error: command 'x86_64-linux-gnu-gcc' failed with exit status 1 

    ---------------------------------------- 
Command "/usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-1dkslney/pylibtextcat/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-lwxglu50-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-1dkslney/pylibtextcat/ 

(I do have libexttextcat-2.0-0, libexttextcat-data and libexttextcat-dev installed.)
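The fatal error in the log says the compiler cannot find `libtextcat/textcat.h`. One possible explanation, which is an assumption and not confirmed by the log, is that libexttextcat-dev ships its headers under a differently named directory than the one `libtextcat.c` includes. The sketch below only checks which of the two header paths exist; the path `libexttextcat/textcat.h` and the list of include directories are guesses, and `find_headers` is a hypothetical helper.

```python
from pathlib import Path

# Header that pylibtextcat's libtextcat.c asks for (taken from the error message).
WANTED = "libtextcat/textcat.h"
# Assumed location used by libexttextcat-dev; this path is a guess.
ALTERNATIVE = "libexttextcat/textcat.h"


def find_headers(include_dirs, candidates=(WANTED, ALTERNATIVE)):
    """Return {relative header path: [include directories that contain it]}."""
    found = {}
    for rel in candidates:
        found[rel] = [str(d) for d in include_dirs if (Path(d) / rel).is_file()]
    return found


if __name__ == "__main__":
    # Common include directories on Debian-like systems (an assumption).
    for rel, dirs in find_headers(["/usr/include", "/usr/local/include"]).items():
        print(rel, "->", dirs or "not found")
```

If only the alternative path turns up, the installed development package probably does not provide the header that pylibtextcat expects, which would be consistent with the failed build.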

Can I use TextCat from Python?


I just saw http://thomas.mangin.com//content/texcat-in-python.html

Answer


It does not seem to be exactly the same thing, but NLTK has a TextCat implementation:

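from nltk.classify import textcat

# TextCat() loads its language profiles from NLTK's crubadan corpus,
# so that corpus has to be downloaded before this runs.
text = "This is a simple example."
cls = textcat.TextCat()

distances = cls.lang_dists(text) # a dict of 437 elements (language -> distance)
cls.guess_language(text) # a str naming the closest language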
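Since the question is about how well this works on a personal dataset, here is a minimal sketch of how that could be measured. The `evaluate` helper is hypothetical and not part of NLTK. It takes any callable that maps a text to a language label, so it makes no assumptions about NLTK beyond the `guess_language` method shown above.

```python
from collections import Counter


def evaluate(guess, samples):
    """Score a language guesser on (text, expected_label) pairs.

    `guess` is any callable mapping a text to a language label.
    Returns (accuracy, Counter of (expected, guessed) mistakes).
    """
    mistakes = Counter()
    correct = 0
    for text, expected in samples:
        guessed = guess(text)
        if guessed == expected:
            correct += 1
        else:
            mistakes[(expected, guessed)] += 1
    # Avoid dividing by zero on an empty dataset.
    accuracy = correct / len(samples) if samples else 0.0
    return accuracy, mistakes
```

With the snippet above, this could be called as `evaluate(cls.guess_language, samples)`, where `samples` is your own list of `(text, label)` pairs. The labels have to use the same codes that `guess_language` returns, so it is worth printing a few guesses first to see what they look like.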