NSFW Detection#
This tutorial is available as an IPython notebook at Malaya/example/nsfw.
This module detects whether a text is NSFW. In the examples below, each string is labelled 'sex', 'gambling', or 'negative'.
[1]:
%%time
import malaya
CPU times: user 4.05 s, sys: 741 ms, total: 4.79 s
Wall time: 4.59 s
Load lexicon model#
The lexicon model is naive but effective. Its lexicon was gathered at Malay-Dataset/corpus/nsfw.
def lexicon(**kwargs):
    """
    Load Lexicon NSFW model.

    Returns
    -------
    result : malaya.text.lexicon.nsfw.Lexicon class
    """
[3]:
lexicon_model = malaya.nsfw.lexicon()
[4]:
string1 = 'xxx sgt panas, best weh'
string2 = 'jmpa dekat kl sentral'
string3 = 'Rolet Dengan Wang Sebenar'
Predict batch of strings#
[5]:
lexicon_model.predict([string1, string2, string3])
[5]:
['sex', 'negative', 'gambling']
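To illustrate the idea behind a lexicon approach, here is a minimal sketch in plain Python. It is not Malaya's implementation: `TOY_LEXICON` and `toy_lexicon_predict` are hypothetical names, and the word sets are tiny stand-ins for the real lexicon in Malay-Dataset/corpus/nsfw. The sketch assumes a string takes the label whose word list it hits most often, and falls back to 'negative' when nothing matches.

```python
import re

# Hypothetical toy lexicon; the real word lists live in Malay-Dataset/corpus/nsfw.
TOY_LEXICON = {
    'sex': {'xxx'},
    'gambling': {'rolet', 'kasino'},
}


def toy_lexicon_predict(strings):
    """Label each string with the category whose lexicon words it hits most often."""
    labels = []
    for text in strings:
        # Lowercase and split into word tokens so 'Rolet' matches 'rolet'.
        tokens = re.findall(r'\w+', text.lower())
        counts = {
            label: sum(token in words for token in tokens)
            for label, words in TOY_LEXICON.items()
        }
        best = max(counts, key=counts.get)
        # No lexicon hit at all means the text is treated as not NSFW.
        labels.append(best if counts[best] > 0 else 'negative')
    return labels


print(toy_lexicon_predict([
    'xxx sgt panas, best weh',
    'jmpa dekat kl sentral',
    'Rolet Dengan Wang Sebenar',
]))
# ['sex', 'negative', 'gambling']
```

A lookup like this needs no training and is easy to audit, but it only catches words that are already in the list.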
Load multinomial model#
Starting with v3.4, all model interfaces follow the sklearn interface.
def multinomial(**kwargs):
    """
    Load multinomial NSFW model.

    Returns
    -------
    result : malaya.model.ml.BAYES class
    """
[7]:
model = malaya.nsfw.multinomial()
Predict batch of strings#
[8]:
model.predict([string1, string2, string3])
[8]:
['sex', 'negative', 'gambling']
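For readers unfamiliar with multinomial naive Bayes, the following is a rough, self-contained sketch of the technique with an sklearn-style `fit` / `predict` / `predict_proba` interface. It is an illustration only, not the code behind `malaya.nsfw.multinomial()`: the class `ToyMultinomialNB`, the whitespace tokenizer, the Laplace smoothing, and the three training sentences are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


class ToyMultinomialNB:
    """Tiny multinomial naive Bayes over whitespace tokens (illustrative only)."""

    def fit(self, texts, labels):
        self.classes_ = sorted(set(labels))
        self.class_counts_ = Counter(labels)
        self.word_counts_ = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts_[label].update(text.lower().split())
        self.vocab_ = {w for c in self.word_counts_.values() for w in c}
        return self

    def _log_scores(self, text):
        total_docs = sum(self.class_counts_.values())
        scores = {}
        for label in self.classes_:
            counts = self.word_counts_[label]
            # Laplace (add-one) smoothing over the training vocabulary.
            denom = sum(counts.values()) + len(self.vocab_)
            score = math.log(self.class_counts_[label] / total_docs)
            for word in text.lower().split():
                if word in self.vocab_:  # words never seen in training are ignored
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return scores

    def predict_proba(self, texts):
        results = []
        for text in texts:
            scores = self._log_scores(text)
            top = max(scores.values())
            # Subtract the max log score before exponentiating for stability.
            exp = {label: math.exp(s - top) for label, s in scores.items()}
            norm = sum(exp.values())
            results.append({label: v / norm for label, v in exp.items()})
        return results

    def predict(self, texts):
        return [max(p, key=p.get) for p in self.predict_proba(texts)]


# Hypothetical three-sentence training set, one sentence per label.
toy_model = ToyMultinomialNB().fit(
    ['xxx video panas', 'rolet wang sebenar', 'jumpa dekat kl sentral'],
    ['sex', 'gambling', 'negative'],
)
print(toy_model.predict(['xxx sgt panas', 'rolet dengan wang']))
# ['sex', 'gambling']
```

Unlike a lexicon lookup, this kind of model weighs every known word in the string, so it can also report how confident it is in each label.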
Predict batch of strings with probability#
[9]:
model.predict_proba([string1, string2, string3])
[9]:
[{'sex': 0.9357058034930408,
  'gambling': 0.02616353532998711,
  'negative': 0.03813066117697173},
 {'sex': 0.027541900360621846,
  'gambling': 0.03522626245360637,
  'negative': 0.9372318371857732},
 {'sex': 0.01865380888750343,
  'gambling': 0.9765340760395791,
  'negative': 0.004812115072918792}]
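One way to use these probabilities is to accept a label only when the model is confident enough. The sketch below works directly on the dictionaries shown above. The helper `top_label`, its `threshold` parameter, and the choice to fall back to 'negative' are assumptions for illustration, not part of Malaya.

```python
# Probabilities copied from the predict_proba output above.
probabilities = [
    {'sex': 0.9357058034930408,
     'gambling': 0.02616353532998711,
     'negative': 0.03813066117697173},
    {'sex': 0.027541900360621846,
     'gambling': 0.03522626245360637,
     'negative': 0.9372318371857732},
    {'sex': 0.01865380888750343,
     'gambling': 0.9765340760395791,
     'negative': 0.004812115072918792},
]


def top_label(probs, threshold=0.5, fallback='negative'):
    """Return the most probable label, or `fallback` if it scores below `threshold`."""
    label = max(probs, key=probs.get)
    return label if probs[label] >= threshold else fallback


print([top_label(p) for p in probabilities])
# ['sex', 'negative', 'gambling']
print([top_label(p, threshold=0.95) for p in probabilities])
# ['negative', 'negative', 'gambling']
```

Raising the threshold trades recall for precision: with 0.95, the first string is no longer flagged because its 'sex' probability is about 0.936.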