Malaya Cache#
This tutorial is available as an IPython notebook at Malaya/example/caching.
This module is only useful if you use the default Malaya repository. For Hugging Face models, read more at https://huggingface.co/docs/datasets/v1.12.0/cache.html
Default Cache location#
You can check where your default Malaya cache folder is. The cache folder stores any models, vocabularies, and rules downloaded for specific modules.
[1]:
import malaya
[2]:
malaya.__home__
[2]:
'/Users/huseinzolkepli/Malaya'
Change default Cache location#
To change the default cache location, set the MALAYA_CACHE environment variable before importing Malaya:
export MALAYA_CACHE=/Users/huseinzolkepli/Documents/Malaya
To make the change permanent, add this line to your Bash startup file (for example `~/.bashrc`).
[1]:
import os
os.environ['MALAYA_CACHE'] = '/Users/huseinzolkepli/Documents/malaya-cache'
[2]:
import malaya
malaya.__home__
[2]:
'/Users/huseinzolkepli/Documents/malaya-cache'
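The variable has to be set before the import, which suggests the path is read once when the package loads. A minimal sketch of that kind of lookup follows. `resolve_cache_dir` is a hypothetical helper, not part of Malaya, and the `~/Malaya` default is an assumption based on the output shown earlier.

```python
import os
from pathlib import Path


def resolve_cache_dir(environ=None):
    """Hypothetical helper: return MALAYA_CACHE if set, else ~/Malaya (assumed default)."""
    environ = os.environ if environ is None else environ
    override = environ.get('MALAYA_CACHE')
    if override:
        # An explicit override wins over the default location.
        return Path(override).expanduser()
    # Assumption: the default cache is a `Malaya` folder in the home directory.
    return Path.home() / 'Malaya'


# With the variable set, the override is used.
print(resolve_cache_dir({'MALAYA_CACHE': '/tmp/malaya-cache'}))
# Without it, the assumed default under the home directory is used.
print(resolve_cache_dir({}))
```

Because a lookup like this runs once, changing `os.environ` after `import malaya` would most likely have no effect until the interpreter is restarted.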
Cache subdirectories#
Starting from version 1.0, Malaya stores models in subdirectories. You can print them with:
[3]:
malaya.utils.print_cache()
Malaya/
├── keyword-extraction/
│   ├── alxlnet/
│   │   ├── model.pb
│   │   ├── sp10m.cased.v9.model
│   │   ├── sp10m.cased.v9.vocab
│   │   └── version
│   └── tiny-bert/
│       ├── model.pb
│       ├── sp10m.cased.bert.model
│       ├── sp10m.cased.bert.vocab
│       └── version
├── qa-squad/
│   ├── albert/
│   │   ├── model.pb
│   │   ├── sp10m.cased.v10.model
│   │   ├── sp10m.cased.v10.vocab
│   │   └── version
│   ├── albert-quantized/
│   │   ├── model.pb
│   │   ├── sp10m.cased.v10.model
│   │   ├── sp10m.cased.v10.vocab
│   │   └── version
│   ├── alxlnet/
│   │   ├── model.pb
│   │   ├── sp10m.cased.v9.model
│   │   ├── sp10m.cased.v9.vocab
│   │   └── version
│   ├── bert/
│   ├── tiny-bert/
│   │   ├── model.pb
│   │   ├── sp10m.cased.bert.model
│   │   ├── sp10m.cased.bert.vocab
│   │   └── version
│   ├── xlnet/
│   │   ├── model.pb
│   │   ├── sp10m.cased.v9.model
│   │   ├── sp10m.cased.v9.vocab
│   │   └── version
│   └── xlnet-quantized/
│       ├── model.pb
│       ├── sp10m.cased.v9.model
│       ├── sp10m.cased.v9.vocab
│       └── version
├── sentiment/
│   ├── albert/
│   │   ├── model.pb
│   │   ├── sp10m.cased.v10.model
│   │   ├── sp10m.cased.v10.vocab
│   │   └── version
│   ├── alxlnet/
│   │   ├── model.pb
│   │   ├── sp10m.cased.v9.model
│   │   ├── sp10m.cased.v9.vocab
│   │   └── version
│   ├── bert/
│   │   ├── model.pb
│   │   ├── sp10m.cased.bert.model
│   │   ├── sp10m.cased.bert.vocab
│   │   └── version
│   ├── xlnet/
│   │   └── model.pb
│   └── xlnet-quantized/
│       ├── model.pb
│       ├── sp10m.cased.v9.model
│       ├── sp10m.cased.v9.vocab
│       └── version
├── similarity/
│   ├── albert/
│   │   ├── model.pb
│   │   ├── sp10m.cased.v10.model
│   │   ├── sp10m.cased.v10.vocab
│   │   └── version
│   ├── alxlnet/
│   │   ├── model.pb
│   │   ├── sp10m.cased.v9.model
│   │   ├── sp10m.cased.v9.vocab
│   │   └── version
│   ├── alxlnet-quantized/
│   │   ├── model.pb
│   │   ├── sp10m.cased.v9.model
│   │   ├── sp10m.cased.v9.vocab
│   │   └── version
│   └── tiny-bert/
│       ├── model.pb
│       ├── sp10m.cased.bert.model
│       ├── sp10m.cased.bert.vocab
│       └── version
├── stem/
│   └── lstm-bahdanau/
│       ├── model.pb
│       ├── stemmer.yttm
│       └── version
├── translation-en-ms/
├── version
└── wordvector/
    └── news/
        ├── version
        ├── wordvector.json
        └── wordvector.npy
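A listing like the one above can be produced with the standard library alone. The sketch below shows one way to do it. `render_tree` is a hypothetical helper, not Malaya's `print_cache` implementation, and it assumes entries are sorted by name.

```python
import tempfile
from pathlib import Path


def render_tree(root):
    """Return tree-listing lines for `root`; directories get a trailing '/'."""
    root = Path(root)
    lines = [f'{root.name}/']

    def walk(directory, prefix):
        entries = sorted(directory.iterdir(), key=lambda p: p.name)
        for index, entry in enumerate(entries):
            last = index == len(entries) - 1
            connector = '└── ' if last else '├── '
            suffix = '/' if entry.is_dir() else ''
            lines.append(f'{prefix}{connector}{entry.name}{suffix}')
            if entry.is_dir():
                # Keep drawing the vertical bar only while siblings remain.
                walk(entry, prefix + ('    ' if last else '│   '))

    walk(root, '')
    return lines


# Build a small throwaway cache layout and print it.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp) / 'Malaya'
    (demo_root / 'stem' / 'lstm-bahdanau').mkdir(parents=True)
    (demo_root / 'stem' / 'lstm-bahdanau' / 'model.pb').touch()
    (demo_root / 'version').touch()
    print('\n'.join(render_tree(demo_root)))
```

The relative paths visible in the listing (for example `stem/lstm-bahdanau`) are the kind of value passed to `malaya.utils.delete_cache`.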
Deleting specific model#
Say you want to free up some disk space. Starting from version 1.0, you can choose exactly which model to delete.
[4]:
malaya.utils.delete_cache('wordvector/news')
[4]:
True
What happens if a directory does not exist?
[7]:
malaya.utils.delete_cache('wordvector/news2')
---------------------------------------------------------------------------
Exception Traceback (most recent call last)
<ipython-input-7-1104f6734f26> in <module>
----> 1 malaya.utils.delete_cache('wordvector/news2')
~/Documents/tf-1.15/env/lib/python3.7/site-packages/malaya_boilerplate-0.0.1-py3.7.egg/malaya_boilerplate/utils.py in delete_cache(location)
188 if not os.path.exists(location):
189 raise Exception(
--> 190 f'folder not exist, please check path from `{__package__}.utils.print_cache()`'
191 )
192 if not os.path.isdir(location):
Exception: folder not exist, please check path from `malaya.utils.print_cache()`
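If a missing folder should not interrupt a script, one option is to wrap the call in `try`/`except Exception`. The existence check itself can also be sketched with the standard library. `delete_cache_dir` below is a hypothetical stand-in, not Malaya's implementation. It assumes paths are given relative to the cache root, and it returns `False` instead of raising when the folder is missing.

```python
import shutil
import tempfile
from pathlib import Path


def delete_cache_dir(cache_root, relative_path):
    """Hypothetical stand-in: remove cache_root/relative_path.

    Returns True if the folder was removed, False if it does not exist.
    """
    root = Path(cache_root).resolve()
    target = (root / relative_path).resolve()
    # Guard against paths such as '../..' that escape the cache root,
    # and against deleting the cache root itself.
    if root not in target.parents:
        raise ValueError('refusing to delete outside the cache root')
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True


# Try it on a throwaway directory rather than a real cache.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / 'wordvector' / 'news').mkdir(parents=True)
    print(delete_cache_dir(tmp, 'wordvector/news'))
    print(delete_cache_dir(tmp, 'wordvector/news2'))
```

Returning a boolean rather than raising is a design choice made for this sketch only. The real function raises, as the traceback shows.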
Purge cache#
You can also delete every cached model and purge the cache entirely, using this function:
[8]:
malaya.utils.delete_all_cache
[8]:
<function malaya_boilerplate.utils.delete_all_cache()>
I am not going to run it, because I prefer to keep my cache for now.
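Before purging everything, it can help to see how much space each module actually takes, since deleting one large folder may be enough. The sketch below totals file sizes per top-level cache folder. `cache_usage` is a hypothetical helper, not part of Malaya.

```python
import tempfile
from pathlib import Path


def cache_usage(root):
    """Hypothetical helper: map each top-level folder in `root` to its size in bytes."""
    usage = {}
    for entry in sorted(Path(root).iterdir()):
        if entry.is_dir():
            # Sum the size of every regular file below this folder.
            usage[entry.name] = sum(
                f.stat().st_size for f in entry.rglob('*') if f.is_file()
            )
    return usage


# Demonstrate on a throwaway layout with files of known size.
with tempfile.TemporaryDirectory() as tmp:
    stem_dir = Path(tmp) / 'stem' / 'lstm-bahdanau'
    stem_dir.mkdir(parents=True)
    (stem_dir / 'model.pb').write_bytes(b'x' * 10)
    news_dir = Path(tmp) / 'wordvector' / 'news'
    news_dir.mkdir(parents=True)
    (news_dir / 'wordvector.npy').write_bytes(b'y' * 4)
    print(cache_usage(tmp))
```

On a real installation you would pass the path reported by `malaya.__home__`.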