{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Isi Penting Generator article style" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Generate a long, article-style text from a list of isi penting (important facts)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "This tutorial is available as an IPython notebook at [Malaya/example/isi-penting-generator-article-style](https://github.com/huseinzol05/Malaya/tree/master/example/isi-penting-generator-article-style).\n", " \n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "Results generated using stochastic methods.\n", " \n", "
" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 2.66 s, sys: 4.06 s, total: 6.73 s\n", "Wall time: 1.96 s\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/home/husein/dev/malaya/malaya/tokenizer.py:214: FutureWarning: Possible nested set at position 3397\n", " self.tok = re.compile(r'({})'.format('|'.join(pipeline)))\n", "/home/husein/dev/malaya/malaya/tokenizer.py:214: FutureWarning: Possible nested set at position 3927\n", " self.tok = re.compile(r'({})'.format('|'.join(pipeline)))\n" ] } ], "source": [ "%%time\n", "import malaya\n", "from pprint import pprint" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### List available HuggingFace" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'mesolitica/finetune-isi-penting-generator-t5-small-standard-bahasa-cased': {'Size (MB)': 242,\n", " 'ROUGE-1': 0.24620333,\n", " 'ROUGE-2': 0.05896076,\n", " 'ROUGE-L': 0.15158954,\n", " 'Suggested length': 1024},\n", " 'mesolitica/finetune-isi-penting-generator-t5-base-standard-bahasa-cased': {'Size (MB)': 892,\n", " 'ROUGE-1': 0.24620333,\n", " 'ROUGE-2': 0.05896076,\n", " 'ROUGE-L': 0.15158954,\n", " 'Suggested length': 1024}}" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "malaya.generator.isi_penting.available_huggingface" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Load HuggingFace\n", "\n", "The Transformer Generator in Malaya differs from most text-generation models found on the internet, such as GPT-2 or Markov chains, which simply continue a prefix supplied by the user. 
The goal here is to generate a high-school-style article, or karangan (essay), from the 'isi penting' the user provides.\n", "\n", "```python\n", "def huggingface(\n", " model: str = 'mesolitica/finetune-isi-penting-generator-t5-base-standard-bahasa-cased',\n", " force_check: bool = True,\n", " **kwargs,\n", "):\n", " \"\"\"\n", " Load HuggingFace model to generate text based on isi penting.\n", "\n", " Parameters\n", " ----------\n", " model: str, optional (default='mesolitica/finetune-isi-penting-generator-t5-base-standard-bahasa-cased')\n", " Check available models at `malaya.generator.isi_penting.available_huggingface`.\n", " force_check: bool, optional (default=True)\n", " Force check model one of malaya model.\n", " Set to False if you have your own huggingface model.\n", "\n", " Returns\n", " -------\n", " result: malaya.torch_model.huggingface.IsiPentingGenerator\n", " \"\"\"\n", "```" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "a5f8a10a9f0e45f18dad98726eca0a73", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Downloading (…)okenizer_config.json: 0%| | 0.00/2.54k [00:00. If you see this, DO NOT PANIC! This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thouroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "9b40e67d35ee43c1b4923e1efc793a7c", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Downloading (…)lve/main/config.json: 0%| | 0.00/822 [00:00