Running on Windows#

UnicodeDecodeError: ‘charmap’ codec can’t decode byte#

To solve this, enable UTF-8 as the system-wide code page:

  1. Open Windows Settings > Administrative language settings > Change system locale.

  2. Check "Beta: Use Unicode UTF-8 for worldwide language support".

  3. Restart Windows.

After the restart, everything works well.

For the full discussion, see issue 25.
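
The error typically comes from Python reading a UTF-8 file with the legacy Windows code page. As a minimal sketch (not part of Malaya; `read_text_utf8` is a hypothetical helper), the snippet below prints the encoding that `open()` falls back to and shows that passing `encoding="utf-8"` explicitly reads non-ASCII text correctly. This only helps for files you open yourself; the system locale change is what affects reads inside the library.

```python
import locale
import os
import tempfile


def read_text_utf8(path):
    """Hypothetical helper: read a text file as UTF-8, ignoring the system locale."""
    with open(path, encoding="utf-8") as f:
        return f.read()


# Encoding used by open() when none is given; often a 'cp125x' code page on Windows.
print("default encoding:", locale.getpreferredencoding(False))

# Sample text with a non-ASCII character (an en dash).
sample = "nasi lemak \u2013 sedap"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    # Write the file as UTF-8, then read it back with the same explicit encoding.
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)
    roundtrip = read_text_utf8(path)

print(roundtrip == sample)
```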

youtokentome failed to build#

YouTokenToMe requires Cython and Microsoft Visual C++ 14.0 to compile, and Windows users usually get stuck at this step, so we need to install Malaya without YouTokenToMe.

pip install malaya --no-deps
pip install "tensorflow>=1.15"

If we skip YouTokenToMe, we are not able to use,

  1. language-detection module,

  2. True Case module,

  3. Multinomial model in emotion analysis,

  4. Multinomial model in sentiment analysis,

  5. Multinomial model in subjectivity analysis,

  6. Multinomial model in toxicity analysis.
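
To find out whether the modules listed above can be used in the current environment, one option is to check whether the `youtokentome` package is installed before calling them. This is a minimal sketch, not a Malaya API; `has_module` is a hypothetical helper built on the standard library.

```python
import importlib.util


def has_module(name):
    """Hypothetical helper: return True if a top-level package can be found."""
    return importlib.util.find_spec(name) is not None


if has_module("youtokentome"):
    print("YouTokenToMe is installed; the modules listed above should be usable.")
else:
    print("YouTokenToMe is missing; skip the modules listed above.")
```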

If you still need these models, you need to install Cython,

pip install cython

Then install Visual Studio, choosing Visual Studio 2019 Build Tools (vs_buildtools.exe).

And follow

Unable to use any T5 models#

T5 depends on tensorflow-text, and currently there is no official tensorflow-text binary released for Windows, so T5 models are not available to Windows users.

List of T5 models,

  1. malaya.summarization.abstractive.transformer

  2. malaya.generator.transformer

  3. malaya.paraphrase.transformer
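
A script that has to run on several platforms can guard its T5 calls. The sketch below is one assumed way to do this, not something Malaya provides: `t5_available` is a hypothetical helper that returns False on Windows and otherwise checks that the `tensorflow_text` package can be found.

```python
import importlib.util
import platform


def t5_available(system=None):
    """Hypothetical helper: True only off Windows and with tensorflow_text installed."""
    # Allow the platform name to be passed in, defaulting to the current one.
    system = system or platform.system()
    if system == "Windows":
        return False
    return importlib.util.find_spec("tensorflow_text") is not None


if t5_available():
    print("T5 models can be loaded.")
else:
    print("T5 models are unavailable here; use a non-T5 model instead.")
```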