GPU Environment#

This tutorial is available as an IPython notebook at Malaya/example/gpu-environment.

[1]:
%%time

import malaya
import logging
logging.basicConfig(level = logging.INFO)
CPU times: user 2.9 s, sys: 3.7 s, total: 6.6 s
Wall time: 2.09 s
/home/husein/dev/malaya/malaya/tokenizer.py:214: FutureWarning: Possible nested set at position 3397
  self.tok = re.compile(r'({})'.format('|'.join(pipeline)))
/home/husein/dev/malaya/malaya/tokenizer.py:214: FutureWarning: Possible nested set at position 3927
  self.tok = re.compile(r'({})'.format('|'.join(pipeline)))

List available GPUs#

You must install the PyTorch GPU build first to enable GPU hardware acceleration.
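A minimal sketch of installing a CUDA-enabled build with pip; the `cu118` index URL below is an assumption, so match it to your CUDA driver using the selector at https://pytorch.org:

# install a CUDA 11.8 wheel of PyTorch (adjust cu118 to your CUDA version)
!pip3 install torch --index-url https://download.pytorch.org/whl/cu118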

[2]:
import torch

torch.cuda.device_count()
[2]:
1
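To confirm which devices PyTorch can actually see, a quick sketch using the standard `torch.cuda` API:

import torch

# print the index and name of every visible CUDA device
for i in range(torch.cuda.device_count()):
    print(i, torch.cuda.get_device_name(i))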

Run model inside GPU#

Once you call the `cuda()` method on the underlying PyTorch object, all inputs will automatically be cast to CUDA.

[3]:
malaya.translation.available_huggingface
[3]:
{'mesolitica/translation-t5-tiny-standard-bahasa-cased': {'Size (MB)': 139,
  'Suggested length': 1536,
  'en-ms chrF2++': 65.91,
  'ms-en chrF2++': 61.3,
  'ind-ms chrF2++': 58.15,
  'jav-ms chrF2++': 49.33,
  'pasar ms-ms chrF2++': 58.46,
  'pasar ms-en chrF2++': 55.76,
  'manglish-ms chrF2++': 51.04,
  'manglish-en chrF2++': 52.2,
  'from lang': ['en', 'ms', 'ind', 'jav', 'bjn', 'manglish', 'pasar ms'],
  'to lang': ['en', 'ms']},
 'mesolitica/translation-t5-small-standard-bahasa-cased': {'Size (MB)': 242,
  'Suggested length': 1536,
  'en-ms chrF2++': 67.37,
  'ms-en chrF2++': 63.79,
  'ind-ms chrF2++': 58.09,
  'jav-ms chrF2++': 52.11,
  'pasar ms-ms chrF2++': 62.49,
  'pasar ms-en chrF2++': 60.77,
  'manglish-ms chrF2++': 52.84,
  'manglish-en chrF2++': 53.65,
  'from lang': ['en', 'ms', 'ind', 'jav', 'bjn', 'manglish', 'pasar ms'],
  'to lang': ['en', 'ms']},
 'mesolitica/translation-t5-base-standard-bahasa-cased': {'Size (MB)': 892,
  'Suggested length': 1536,
  'en-ms chrF2++': 67.62,
  'ms-en chrF2++': 64.41,
  'ind-ms chrF2++': 59.25,
  'jav-ms chrF2++': 52.86,
  'pasar ms-ms chrF2++': 62.99,
  'pasar ms-en chrF2++': 62.06,
  'manglish-ms chrF2++': 54.4,
  'manglish-en chrF2++': 54.14,
  'from lang': ['en', 'ms', 'ind', 'jav', 'bjn', 'manglish', 'pasar ms'],
  'to lang': ['en', 'ms']},
 'mesolitica/translation-nanot5-tiny-malaysian-cased': {'Size (MB)': 205,
  'Suggested length': 2048,
  'en-ms chrF2++': 63.61,
  'ms-en chrF2++': 59.55,
  'ind-ms chrF2++': 56.38,
  'jav-ms chrF2++': 47.68,
  'mandarin-ms chrF2++': 36.61,
  'mandarin-en chrF2++': 39.78,
  'pasar ms-ms chrF2++': 58.74,
  'pasar ms-en chrF2++': 54.87,
  'manglish-ms chrF2++': 50.76,
  'manglish-en chrF2++': 53.16,
  'from lang': ['en',
   'ms',
   'ind',
   'jav',
   'bjn',
   'manglish',
   'pasar ms',
   'mandarin',
   'pasar mandarin'],
  'to lang': ['en', 'ms']},
 'mesolitica/translation-nanot5-small-malaysian-cased': {'Size (MB)': 358,
  'Suggested length': 2048,
  'en-ms chrF2++': 66.98,
  'ms-en chrF2++': 63.52,
  'ind-ms chrF2++': 58.1,
  'jav-ms chrF2++': 51.55,
  'mandarin-ms chrF2++': 46.09,
  'mandarin-en chrF2++': 44.13,
  'pasar ms-ms chrF2++': 63.2,
  'pasar ms-en chrF2++': 59.78,
  'manglish-ms chrF2++': 54.09,
  'manglish-en chrF2++': 55.27,
  'from lang': ['en',
   'ms',
   'ind',
   'jav',
   'bjn',
   'manglish',
   'pasar ms',
   'mandarin',
   'pasar mandarin'],
  'to lang': ['en', 'ms']},
 'mesolitica/translation-nanot5-base-malaysian-cased': {'Size (MB)': 990,
  'Suggested length': 2048,
  'en-ms chrF2++': 67.87,
  'ms-en chrF2++': 64.79,
  'ind-ms chrF2++': 56.98,
  'jav-ms chrF2++': 51.21,
  'mandarin-ms chrF2++': 47.39,
  'mandarin-en chrF2++': 48.78,
  'pasar ms-ms chrF2++': 65.06,
  'pasar ms-en chrF2++': 64.03,
  'manglish-ms chrF2++': 57.91,
  'manglish-en chrF2++': 55.66,
  'from lang': ['en',
   'ms',
   'ind',
   'jav',
   'bjn',
   'manglish',
   'pasar ms',
   'mandarin',
   'pasar mandarin'],
  'to lang': ['en', 'ms']}}
[4]:
model = malaya.translation.huggingface(model = 'mesolitica/translation-t5-tiny-standard-bahasa-cased')
Loading the tokenizer from the `special_tokens_map.json` and the `added_tokens.json` will be removed in `transformers 5`,  it is kept for forward compatibility, but it is recommended to update your `tokenizer_config.json` by uploading it again. You will see the new `added_tokens_decoder` attribute that will store the relevant information.
You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. If you see this, DO NOT PANIC! This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thouroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[5]:
model.cuda()
[5]:
T5ForConditionalGeneration(
  (shared): Embedding(32103, 384)
  (encoder): T5Stack(
    (embed_tokens): Embedding(32103, 384)
    (block): ModuleList(
      (0): T5Block(
        (layer): ModuleList(
          (0): T5LayerSelfAttention(
            (SelfAttention): T5Attention(
              (q): Linear(in_features=384, out_features=768, bias=False)
              (k): Linear(in_features=384, out_features=768, bias=False)
              (v): Linear(in_features=384, out_features=768, bias=False)
              (o): Linear(in_features=768, out_features=384, bias=False)
              (relative_attention_bias): Embedding(32, 12)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (1): T5LayerFF(
            (DenseReluDense): T5DenseActDense(
              (wi): Linear(in_features=384, out_features=1344, bias=False)
              (wo): Linear(in_features=1344, out_features=384, bias=False)
              (dropout): Dropout(p=0.1, inplace=False)
              (act): ReLU()
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
      (1-3): 3 x T5Block(
        (layer): ModuleList(
          (0): T5LayerSelfAttention(
            (SelfAttention): T5Attention(
              (q): Linear(in_features=384, out_features=768, bias=False)
              (k): Linear(in_features=384, out_features=768, bias=False)
              (v): Linear(in_features=384, out_features=768, bias=False)
              (o): Linear(in_features=768, out_features=384, bias=False)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (1): T5LayerFF(
            (DenseReluDense): T5DenseActDense(
              (wi): Linear(in_features=384, out_features=1344, bias=False)
              (wo): Linear(in_features=1344, out_features=384, bias=False)
              (dropout): Dropout(p=0.1, inplace=False)
              (act): ReLU()
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
    )
    (final_layer_norm): T5LayerNorm()
    (dropout): Dropout(p=0.1, inplace=False)
  )
  (decoder): T5Stack(
    (embed_tokens): Embedding(32103, 384)
    (block): ModuleList(
      (0): T5Block(
        (layer): ModuleList(
          (0): T5LayerSelfAttention(
            (SelfAttention): T5Attention(
              (q): Linear(in_features=384, out_features=768, bias=False)
              (k): Linear(in_features=384, out_features=768, bias=False)
              (v): Linear(in_features=384, out_features=768, bias=False)
              (o): Linear(in_features=768, out_features=384, bias=False)
              (relative_attention_bias): Embedding(32, 12)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (1): T5LayerCrossAttention(
            (EncDecAttention): T5Attention(
              (q): Linear(in_features=384, out_features=768, bias=False)
              (k): Linear(in_features=384, out_features=768, bias=False)
              (v): Linear(in_features=384, out_features=768, bias=False)
              (o): Linear(in_features=768, out_features=384, bias=False)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (2): T5LayerFF(
            (DenseReluDense): T5DenseActDense(
              (wi): Linear(in_features=384, out_features=1344, bias=False)
              (wo): Linear(in_features=1344, out_features=384, bias=False)
              (dropout): Dropout(p=0.1, inplace=False)
              (act): ReLU()
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
      (1-3): 3 x T5Block(
        (layer): ModuleList(
          (0): T5LayerSelfAttention(
            (SelfAttention): T5Attention(
              (q): Linear(in_features=384, out_features=768, bias=False)
              (k): Linear(in_features=384, out_features=768, bias=False)
              (v): Linear(in_features=384, out_features=768, bias=False)
              (o): Linear(in_features=768, out_features=384, bias=False)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (1): T5LayerCrossAttention(
            (EncDecAttention): T5Attention(
              (q): Linear(in_features=384, out_features=768, bias=False)
              (k): Linear(in_features=384, out_features=768, bias=False)
              (v): Linear(in_features=384, out_features=768, bias=False)
              (o): Linear(in_features=768, out_features=384, bias=False)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (2): T5LayerFF(
            (DenseReluDense): T5DenseActDense(
              (wi): Linear(in_features=384, out_features=1344, bias=False)
              (wo): Linear(in_features=1344, out_features=384, bias=False)
              (dropout): Dropout(p=0.1, inplace=False)
              (act): ReLU()
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
    )
    (final_layer_norm): T5LayerNorm()
    (dropout): Dropout(p=0.1, inplace=False)
  )
  (lm_head): Linear(in_features=384, out_features=32103, bias=False)
)
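To double-check that the weights really moved, you can inspect the device of any parameter. The attribute `model.model` for the wrapped Hugging Face module is an assumption about Malaya internals; adjust it if your version exposes the module under a different name:

# assumes the wrapped Hugging Face module is exposed as `model.model`
print(next(model.model.parameters()).device)  # expected: cuda:0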
[ ]:
model.generate(['i like chicken'])
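Extra keyword arguments are typically forwarded to the underlying Hugging Face `generate`; whether this Malaya version forwards them is an assumption, but standard generation parameters such as `max_length` would then apply:

# assumes extra kwargs reach Hugging Face `generate`
model.generate(['i like chicken'], max_length=256)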