Contributions are welcome and are greatly appreciated! Every little bit helps, and credit will always be given.

Code Formatting#

We use AutoPEP8 for code formatting and standardization. See the pyproject.toml file in the root directory of the repository.
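To preview what the formatter would change before you commit, something like the following may help. This is a minimal sketch, assuming the `autopep8` package is installed from PyPI; `format_source` is a hypothetical helper written for this example, not part of Malaya, and it simply returns the input unchanged when `autopep8` is missing.

```python
def format_source(source: str) -> str:
    """Return `source` formatted by autopep8, or unchanged if autopep8 is not installed.

    Hypothetical helper for illustration only; it is not part of Malaya.
    """
    try:
        import autopep8  # third-party package: pip install autopep8
    except ImportError:
        # Assumption: falling back silently is acceptable for a quick preview.
        return source
    # fix_code takes a string of Python source and returns the formatted string.
    return autopep8.fix_code(source)


if __name__ == "__main__":
    messy = "def add( a,b ):\n    return a+b\n"
    print(format_source(messy))
```

For whole files, running the `autopep8` command-line tool from the root directory is usually more convenient than calling the library directly.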

Report Bugs#

Report bugs by opening a GitHub issue.

Please include all relevant information and, preferably, code that reproduces the problem.

Do not email us about issues. We will not respond to emails, so submit a proper GitHub issue instead.
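Version details are often part of the relevant information. The snippet below is a minimal sketch that collects them using only the Python standard library; `environment_report` is a hypothetical helper, and the package names `malaya`, `torch` and `transformers` are assumed to be the PyPI distribution names, so adjust the list to match your setup.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("malaya", "torch", "transformers")):
    """Build a short plain-text summary of the environment for a bug report.

    Hypothetical helper; the default package names are assumptions.
    """
    lines = [
        f"Python: {sys.version.split()[0]}",
        f"Platform: {platform.platform()}",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            version = "not installed"
        lines.append(f"{name}: {version}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(environment_report())
```

Paste the printed output into the issue together with the smallest piece of code that still triggers the bug.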

Fix Bugs#

Look through the GitHub issues for bugs. Any open bug is available to whoever wants to fix it.

Implement Features#

Look through the GitHub issues or Malaya-project for feature requests. Any unassigned improvement issue is open to whoever wants to implement it.

We use PyTorch and rely heavily on HuggingFace Transformers as the backend.

Dataset Contributions#

Create a new GitHub issue for your data and either include a link to the data or attach it to the issue directly. If you want to improve one of our current datasets, check Malaya-Dataset.

Alternatively, you can email your data if you do not want to expose it to the public. Malaya will not expose your data, but our trained models that are based on your data will be released to the public.

Thanks to:

  1. Fake news, contributed by syazanihussin

  2. Speech voice, contributed by Khalil Nooh

  3. Speech voice, contributed by Mas Aisyah Ahmad

  4. Singlish text dump, contributed by brytjy

  5. Singapore news, contributed by brytjy

Documentation Improvements#

Malaya could always use better documentation. There may be typos or incorrect object names.

Submit Feedback#

The best way to send feedback is to open a GitHub issue.

Unit Test#

Help test every step of the program flow! You can check the currently available unit tests here.

Feel free to help Malaya write unit tests. Fork the repository to get started!
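If you have not written a unit test before, the general shape looks roughly like this. It is a minimal sketch using the standard library `unittest` module; `normalize_whitespace` is a hypothetical function defined only for this example and is not part of Malaya, and the project's actual test layout and framework may differ.

```python
import unittest


def normalize_whitespace(text: str) -> str:
    """Hypothetical function under test: collapse whitespace runs into single spaces."""
    return " ".join(text.split())


class TestNormalizeWhitespace(unittest.TestCase):
    def test_collapses_repeated_spaces(self):
        self.assertEqual(
            normalize_whitespace("saya  suka   makan"), "saya suka makan"
        )

    def test_strips_leading_and_trailing_whitespace(self):
        self.assertEqual(normalize_whitespace("  hello\n"), "hello")

    def test_empty_string_stays_empty(self):
        self.assertEqual(normalize_whitespace(""), "")
```

Saved as a file such as `test_example.py` (a name chosen here for illustration), it can be run with `python -m unittest test_example`. Each test method checks one behaviour, which makes a failure easy to trace back to its cause.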