{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Toxicity Analysis" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "This tutorial is available as an IPython notebook at [Malaya/example/toxicity](https://github.com/huseinzol05/Malaya/tree/master/example/toxicity).\n", " \n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "This module was trained on both standard and local (including social media) language structures, so it is safe to use for both.\n", " \n", "
" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 5.54 s, sys: 821 ms, total: 6.36 s\n", "Wall time: 5.67 s\n" ] } ], "source": [ "%%time\n", "import malaya" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Models accuracy\n", "\n", "We use `sklearn.metrics.classification_report` for accuracy reporting; see https://malaya.readthedocs.io/en/latest/models-accuracy.html#toxicity-analysis" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Labels supported\n", "\n", "Default labels for the toxicity module." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['severe toxic',\n", " 'obscene',\n", " 'identity attack',\n", " 'insult',\n", " 'threat',\n", " 'asian',\n", " 'atheist',\n", " 'bisexual',\n", " 'buddhist',\n", " 'christian',\n", " 'female',\n", " 'heterosexual',\n", " 'indian',\n", " 'homosexual, gay or lesbian',\n", " 'intellectual or learning disability',\n", " 'male',\n", " 'muslim',\n", " 'other disability',\n", " 'other gender',\n", " 'other race or ethnicity',\n", " 'other religion',\n", " 'other sexual orientation',\n", " 'physical disability',\n", " 'psychiatric or mental illness',\n", " 'transgender',\n", " 'malay',\n", " 'chinese']" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "malaya.toxicity.label" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "string = 'Benda yg SALAH ni, jgn lah didebatkan. Yg SALAH xkan jadi betul. Ingat tu. Mcm mana kesat sekalipun org sampaikan mesej, dan memang benda tu salah, diam je. Xyah nk tunjuk kau open sangat nk tegur cara org lain berdakwah. 
'\n", "another_string = 'melayu bodoh, dah la gay, sokong lgbt lagi, memang tak guna'\n", "string1 = 'Sis, students from overseas were brought back because they are not in their countries which is if something happens to them, its not the other countries’ responsibility. Student dalam malaysia ni dah dlm tggjawab kerajaan. Mana part yg tak faham?'\n", "string2 = 'Harap kerajaan tak bukak serentak. Slowly release week by week. Focus on economy related industries dulu'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Load multinomial model\n", "\n", "```python\n", "def multinomial(**kwargs):\n", " \"\"\"\n", " Load multinomial toxicity model.\n", "\n", " Returns\n", " -------\n", " result : malaya.model.ml.MultilabelBayes class\n", " \"\"\"\n", "```" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "model = malaya.toxicity.multinomial()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Predict batch of strings\n", "\n", "```python\n", "def predict(self, strings: List[str]):\n", " \"\"\"\n", " classify list of strings.\n", "\n", " Parameters\n", " ----------\n", " strings: List[str]\n", "\n", " Returns\n", " -------\n", " result: List[List[str]]\n", " \"\"\"\n", "```" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[['severe toxic',\n", " 'obscene',\n", " 'identity attack',\n", " 'insult',\n", " 'indian',\n", " 'malay',\n", " 'chinese']]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model.predict([string])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Predict batch of strings with probability\n", "\n", "```python\n", "def predict_proba(self, strings: List[str]):\n", " \"\"\"\n", " classify list of strings and return probability.\n", "\n", " Parameters\n", " ----------\n", " strings: List[str]\n", "\n", " Returns\n", " -------\n", " result: List[dict[str, float]]\n", " 
\"\"\"\n", "```" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'severe toxic': 0.997487040981572,\n", " 'obscene': 0.9455379277616331,\n", " 'identity attack': 0.8274699625500679,\n", " 'insult': 0.5607594945618526,\n", " 'threat': 0.024772971511820983,\n", " 'asian': 0.0221240002096628,\n", " 'atheist': 0.013774558637508741,\n", " 'bisexual': 0.0024495807483865223,\n", " 'buddhist': 0.004640372956039871,\n", " 'christian': 0.052795457745171054,\n", " 'female': 0.05289744129561423,\n", " 'heterosexual': 0.008128507494633362,\n", " 'indian': 0.9023637357823499,\n", " 'homosexual, gay or lesbian': 0.04385664232535533,\n", " 'intellectual or learning disability': 0.0014981591337876019,\n", " 'male': 0.07976929455558882,\n", " 'muslim': 0.08806420077375651,\n", " 'other disability': 0.0,\n", " 'other gender': 0.0,\n", " 'other race or ethnicity': 0.0017014040578187566,\n", " 'other religion': 0.0017333144620482767,\n", " 'other sexual orientation': 0.00122606681013474,\n", " 'physical disability': 0.001489522998169223,\n", " 'psychiatric or mental illness': 0.027125947355667267,\n", " 'transgender': 0.012349564445375391,\n", " 'malay': 0.9991900346707605,\n", " 'chinese': 0.9886782229459774}]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model.predict_proba([string])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### List available Transformer models" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\"><th></th><th>Size (MB)</th><th>Quantized Size (MB)</th><th>micro precision</th><th>micro recall</th><th>micro f1-score</th></tr>\n", " </thead>\n", " <tbody>\n", " <tr><th>bert</th><td>425.6</td><td>111.00</td><td>0.86098</td><td>0.77313</td><td>0.81469</td></tr>\n", " <tr><th>tiny-bert</th><td>57.4</td><td>15.40</td><td>0.83535</td><td>0.79611</td><td>0.81526</td></tr>\n", " <tr><th>albert</th><td>48.6</td><td>12.80</td><td>0.86054</td><td>0.76973</td><td>0.81261</td></tr>\n", " <tr><th>tiny-albert</th><td>22.4</td><td>5.98</td><td>0.83535</td><td>0.79611</td><td>0.81526</td></tr>\n", " <tr><th>xlnet</th><td>446.6</td><td>118.00</td><td>0.77904</td><td>0.83829</td><td>0.80758</td></tr>\n", " <tr><th>alxlnet</th><td>46.8</td><td>13.30</td><td>0.83376</td><td>0.80221</td><td>0.81768</td></tr>\n", " <tr><th>fastformer</th><td>446.6</td><td>118.00</td><td>0.88249</td><td>0.74826</td><td>0.80985</td></tr>\n", " <tr><th>tiny-fastformer</th><td>77.3</td><td>19.60</td><td>0.85131</td><td>0.76620</td><td>0.80652</td></tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Size (MB) Quantized Size (MB) micro precision \\\n", "bert 425.6 111.00 0.86098 \n", "tiny-bert 57.4 15.40 0.83535 \n", "albert 48.6 12.80 0.86054 \n", "tiny-albert 22.4 5.98 0.83535 \n", "xlnet 446.6 118.00 0.77904 \n", "alxlnet 46.8 13.30 0.83376 \n", "fastformer 446.6 118.00 0.88249 \n", "tiny-fastformer 77.3 19.60 0.85131 \n", "\n", " micro recall micro f1-score \n", "bert 0.77313 0.81469 \n", "tiny-bert 0.79611 0.81526 \n", "albert 0.76973 0.81261 \n", "tiny-albert 0.79611 0.81526 \n", "xlnet 0.83829 0.80758 \n", "alxlnet 0.80221 0.81768 \n", "fastformer 0.74826 0.80985 \n", "tiny-fastformer 0.76620 0.80652 " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "malaya.toxicity.available_transformer()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Load Transformer model\n", "\n", "```python\n", "def transformer(model: str = 'xlnet', quantized: bool = False, **kwargs):\n", " \"\"\"\n", " Load Transformer toxicity model.\n", "\n", " Parameters\n", " ----------\n", " model : str, optional (default='xlnet')\n", " Model architecture supported. 
Allowed values:\n", "\n", " * ``'bert'`` - Google BERT BASE parameters.\n", " * ``'tiny-bert'`` - Google BERT TINY parameters.\n", " * ``'albert'`` - Google ALBERT BASE parameters.\n", " * ``'tiny-albert'`` - Google ALBERT TINY parameters.\n", " * ``'xlnet'`` - Google XLNET BASE parameters.\n", " * ``'alxlnet'`` - Malaya ALXLNET BASE parameters.\n", " * ``'fastformer'`` - FastFormer BASE parameters.\n", " * ``'tiny-fastformer'`` - FastFormer TINY parameters.\n", "\n", " quantized : bool, optional (default=False)\n", " if True, will load 8-bit quantized model.\n", " A quantized model is not necessarily faster; it depends on the machine.\n", "\n", " Returns\n", " -------\n", " result: model\n", " List of model classes:\n", "\n", " * if `bert` in model, will return `malaya.model.bert.SigmoidBERT`.\n", " * if `xlnet` in model, will return `malaya.model.xlnet.SigmoidXLNET`.\n", " * if `fastformer` in model, will return `malaya.model.fastformer.SigmoidFastFormer`.\n", " \"\"\"\n", "```" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████| 47.0/46.8 [01:05<00:00, 1.39s/MB]\n", "184%|██████████| 1.00/0.54 [00:02<-1:59:59, 2.43s/MB]\n", "135%|██████████| 1.00/0.74 [00:03<00:00, 3.48s/MB]\n" ] } ], "source": [ "model = malaya.toxicity.transformer(model = 'alxlnet')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Load Quantized model\n", "\n", "To load an 8-bit quantized model, pass `quantized = True`; the default is `False`.\n", "\n", "Expect a slight accuracy drop from the quantized model. It is not necessarily faster than the normal 32-bit float model; that depends entirely on the machine."
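, "\n", "\n", "A minimal sketch for checking this trade-off on your own machine: it times a `predict_proba`-style callable and measures the largest per-label probability gap between two outputs. The helper names `mean_latency` and `max_abs_diff` and the two stand-in functions are hypothetical and not part of Malaya; the stand-in values are copied from the `another_string` outputs shown later in this notebook so that the block runs without downloading a model.\n", "\n", "
```python
import time
from typing import Callable, Dict, List

Probs = List[Dict[str, float]]


def mean_latency(predict_proba: Callable[[List[str]], Probs],
                 strings: List[str], repeats: int = 3) -> float:
    # Average wall-clock seconds per call over the given number of runs.
    start = time.perf_counter()
    for _ in range(repeats):
        predict_proba(strings)
    return (time.perf_counter() - start) / repeats


def max_abs_diff(a: Probs, b: Probs) -> float:
    # Largest per-label probability gap between two models' outputs.
    return max(
        abs(row_a[label] - row_b[label])
        for row_a, row_b in zip(a, b)
        for label in row_a
    )


# Hypothetical stand-ins (assumption): they return fixed values for any
# input, taken from the float and quantized alxlnet outputs in this notebook.
def fake_float_model(strings):
    return [{'insult': 0.7667657, 'malay': 0.9658916} for _ in strings]


def fake_quantized_model(strings):
    return [{'insult': 0.49216133, 'malay': 0.7579823} for _ in strings]


texts = ['contoh ayat']
gap = max_abs_diff(fake_float_model(texts), fake_quantized_model(texts))
print('largest probability gap: %.4f' % gap)
print('seconds per call: %.6f' % mean_latency(fake_float_model, texts))
```
", "\n", "In the notebook you would pass `model.predict_proba` and `quantized_model.predict_proba` in place of the stand-ins, ideally with a larger batch of strings so the timing is meaningful."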
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "WARNING:root:Load quantized model will cause accuracy drop.\n" ] } ], "source": [ "quantized_model = malaya.toxicity.transformer(model = 'alxlnet', quantized = True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Predict batch of strings\n", "\n", "```python\n", "def predict(self, strings: List[str]):\n", " \"\"\"\n", " classify list of strings.\n", "\n", " Parameters\n", " ----------\n", " strings: List[str]\n", "\n", " Returns\n", " -------\n", " result: List[List[str]]\n", " \"\"\"\n", "```" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[['obscene'],\n", " ['severe toxic', 'obscene', 'identity attack', 'insult', 'malay']]" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model.predict([string,another_string])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Predict batch of strings with probability\n", "\n", "```python\n", "def predict_proba(self, strings: List[str]):\n", " \"\"\"\n", " classify list of strings and return probability.\n", "\n", " Parameters\n", " ----------\n", " strings : List[str]\n", "\n", " Returns\n", " -------\n", " result: List[dict[str, float]]\n", " \"\"\"\n", "```" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'severe toxic': 0.30419078,\n", " 'obscene': 0.07300964,\n", " 'identity attack': 0.02309686,\n", " 'insult': 0.14792377,\n", " 'threat': 0.0043829083,\n", " 'asian': 0.00018724799,\n", " 'atheist': 0.0013933778,\n", " 'bisexual': 0.0005682409,\n", " 'buddhist': 0.0006982982,\n", " 'christian': 0.00010216236,\n", " 'female': 0.0062876344,\n", " 'heterosexual': 3.6597252e-05,\n", " 'indian': 0.020283729,\n", " 'homosexual, gay or lesbian': 0.0008122027,\n", " 'intellectual or learning 
disability': 0.00015977025,\n", " 'male': 0.0007993579,\n", " 'muslim': 0.054483294,\n", " 'other disability': 0.00017657876,\n", " 'other gender': 0.00018069148,\n", " 'other race or ethnicity': 6.273389e-05,\n", " 'other religion': 0.0011053085,\n", " 'other sexual orientation': 0.0013027787,\n", " 'physical disability': 0.00010755658,\n", " 'psychiatric or mental illness': 0.00078335404,\n", " 'transgender': 0.00080055,\n", " 'malay': 0.0033579469,\n", " 'chinese': 0.20889702},\n", " {'severe toxic': 0.99571323,\n", " 'obscene': 0.91805434,\n", " 'identity attack': 0.95676684,\n", " 'insult': 0.7667657,\n", " 'threat': 0.02582252,\n", " 'asian': 0.00074103475,\n", " 'atheist': 0.0012175143,\n", " 'bisexual': 0.07754475,\n", " 'buddhist': 0.004547477,\n", " 'christian': 0.0019699335,\n", " 'female': 0.03404945,\n", " 'heterosexual': 0.029964417,\n", " 'indian': 0.021356285,\n", " 'homosexual, gay or lesbian': 0.13626209,\n", " 'intellectual or learning disability': 0.021410972,\n", " 'male': 0.029543608,\n", " 'muslim': 0.06485465,\n", " 'other disability': 0.0006414652,\n", " 'other gender': 0.04015115,\n", " 'other race or ethnicity': 0.010606945,\n", " 'other religion': 0.001650244,\n", " 'other sexual orientation': 0.04054076,\n", " 'physical disability': 0.0025109593,\n", " 'psychiatric or mental illness': 0.0022883855,\n", " 'transgender': 0.01127643,\n", " 'malay': 0.9658916,\n", " 'chinese': 0.33373892}]" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model.predict_proba([string,another_string])" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'severe toxic': 0.28386846,\n", " 'obscene': 0.25873762,\n", " 'identity attack': 0.021321118,\n", " 'insult': 0.19023287,\n", " 'threat': 0.005617261,\n", " 'asian': 0.00022211671,\n", " 'atheist': 0.000109523535,\n", " 'bisexual': 0.0019034147,\n", " 'buddhist': 0.00038090348,\n", " 'christian': 
0.0016773939,\n", " 'female': 0.007807076,\n", " 'heterosexual': 0.0001899302,\n", " 'indian': 0.049388766,\n", " 'homosexual, gay or lesbian': 0.00043603778,\n", " 'intellectual or learning disability': 0.0012571216,\n", " 'male': 0.0043218136,\n", " 'muslim': 0.018054605,\n", " 'other disability': 0.0011820793,\n", " 'other gender': 0.00044164062,\n", " 'other race or ethnicity': 0.00012764335,\n", " 'other religion': 0.0009614825,\n", " 'other sexual orientation': 0.0040558875,\n", " 'physical disability': 0.0005840957,\n", " 'psychiatric or mental illness': 0.0023525357,\n", " 'transgender': 0.003135711,\n", " 'malay': 0.0013717413,\n", " 'chinese': 0.0051787198},\n", " {'severe toxic': 0.9966523,\n", " 'obscene': 0.82459927,\n", " 'identity attack': 0.97338796,\n", " 'insult': 0.49216133,\n", " 'threat': 0.010962069,\n", " 'asian': 0.0034621954,\n", " 'atheist': 0.0007635355,\n", " 'bisexual': 0.044597328,\n", " 'buddhist': 0.0061615705,\n", " 'christian': 0.0029616058,\n", " 'female': 0.023250878,\n", " 'heterosexual': 0.0038115382,\n", " 'indian': 0.0068957508,\n", " 'homosexual, gay or lesbian': 0.084989995,\n", " 'intellectual or learning disability': 0.006228268,\n", " 'male': 0.070231974,\n", " 'muslim': 0.055434316,\n", " 'other disability': 0.00017631054,\n", " 'other gender': 0.02043128,\n", " 'other race or ethnicity': 0.0032926202,\n", " 'other religion': 0.0035361946,\n", " 'other sexual orientation': 0.018447628,\n", " 'physical disability': 0.0007721717,\n", " 'psychiatric or mental illness': 0.004228982,\n", " 'transgender': 0.0046984255,\n", " 'malay': 0.7579823,\n", " 'chinese': 0.8585954}]" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "quantized_model.predict_proba([string,another_string])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Open toxicity visualization dashboard\n", "\n", "Default when you call `predict_words` it will open a browser with visualization dashboard, 
you can disable this by passing `visualization=False`.\n", "\n", "```python\n", "def predict_words(\n", " self,\n", " string: str,\n", " method: str = 'last',\n", " bins_size: float = 0.05,\n", " visualization: bool = True,\n", "):\n", " \"\"\"\n", " classify words.\n", "\n", " Parameters\n", " ----------\n", " string : str\n", " method : str, optional (default='last')\n", " Attention layer supported. Allowed values:\n", "\n", " * ``'last'`` - attention from last layer.\n", " * ``'first'`` - attention from first layer.\n", " * ``'mean'`` - average attentions from all layers.\n", " bins_size: float, optional (default=0.05)\n", " default bins size for word distribution histogram.\n", " visualization: bool, optional (default=True)\n", " If True, it will open the visualization dashboard.\n", "\n", " Returns\n", " -------\n", " dictionary: results\n", " \"\"\"\n", "```\n" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "model.predict_words(another_string)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Vectorize\n", "\n", "To visualize sentence-level or word-level vectors in a lower dimension, use `model.vectorize`:\n", "\n", "```python\n", "def vectorize(self, strings: List[str], method: str = 'first'):\n", " \"\"\"\n", " vectorize list of strings.\n", "\n", " Parameters\n", " ----------\n", " strings: List[str]\n", " method : str, optional (default='first')\n", " Vectorization layer supported. Allowed values:\n", "\n", " * ``'last'`` - vector from last sequence.\n", " * ``'first'`` - vector from first sequence.\n", " * ``'mean'`` - average vectors from all sequences.\n", " * ``'word'`` - average vectors based on tokens.\n", "\n", " Returns\n", " -------\n", " result: np.array\n", " \"\"\"\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Sentence level" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "texts = [string, another_string, string1, string2]\n", "r = quantized_model.vectorize(texts, method = 'first')" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(4, 2)" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.manifold import TSNE\n", "import matplotlib.pyplot as plt\n", "\n", "tsne = TSNE().fit_transform(r)\n", "tsne.shape" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.figure(figsize = (7, 7))\n", "plt.scatter(tsne[:, 0], tsne[:, 1])\n", "labels = texts\n", "for label, x, y in zip(\n", "    labels, tsne[:, 0], tsne[:, 1]\n", "):\n", "    label = (\n", "        '%s, %.3f' % (label[0], label[1])\n", "        if isinstance(label, list)\n", "        else label\n", "    )\n", "    plt.annotate(\n", "        label,\n", "        xy = (x, y),\n", "        xytext = (0, 0),\n", "        textcoords = 'offset points',\n", "    )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Word level" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "r = quantized_model.vectorize(texts, method = 'word')" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "# each row holds (word, vector) pairs for one input string\n", "words, vectors = [], []\n", "for row in r:\n", "    words.extend([i[0] for i in row])\n", "    vectors.extend([i[1] for i in row])" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(107, 2)" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tsne = TSNE().fit_transform(vectors)\n", "tsne.shape" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "scrolled": false }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.figure(figsize = (7, 7))\n", "plt.scatter(tsne[:, 0], tsne[:, 1])\n", "labels = words\n", "for label, x, y in zip(\n", "    labels, tsne[:, 0], tsne[:, 1]\n", "):\n", "    label = (\n", "        '%s, %.3f' % (label[0], label[1])\n", "        if isinstance(label, list)\n", "        else label\n", "    )\n", "    plt.annotate(\n", "        label,\n", "        xy = (x, y),\n", "        xytext = (0, 0),\n", "        textcoords = 'offset points',\n", "    )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pretty good: the outliers are the toxic words." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Stacking models\n", "\n", "For more information, see [https://malaya.readthedocs.io/en/latest/Stack.html](https://malaya.readthedocs.io/en/latest/Stack.html)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:loading sentence piece model\n" ] } ], "source": [ "albert = malaya.toxicity.transformer(model = 'albert')" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'severe toxic': 0.9968317,\n", " 'obscene': 0.43022493,\n", " 'identity attack': 0.90531594,\n", " 'insult': 0.42289576,\n", " 'threat': 0.0058603976,\n", " 'asian': 0.000983668,\n", " 'atheist': 0.0005495089,\n", " 'bisexual': 0.0009623809,\n", " 'buddhist': 0.0003632398,\n", " 'christian': 0.0018632574,\n", " 'female': 0.006050684,\n", " 'heterosexual': 0.0025569045,\n", " 'indian': 0.0056869243,\n", " 'homosexual, gay or lesbian': 0.012232827,\n", " 'intellectual or learning disability': 0.00091394753,\n", " 'male': 0.011594971,\n", " 'muslim': 0.0042621437,\n", " 'other disability': 0.00027529505,\n", " 'other gender': 0.0010361207,\n", " 'other race or ethnicity': 0.0012320877,\n", " 'other religion': 0.00091365684,\n", " 'other sexual orientation': 0.0027996385,\n", " 'physical 
disability': 0.00010540871,\n", " 'psychiatric or mental illness': 0.000815311,\n", " 'transgender': 0.0016718076,\n", " 'malay': 0.96644485,\n", " 'chinese': 0.05199418}]" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "malaya.stack.predict_stack([model, albert], [another_string])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.7" } }, "nbformat": 4, "nbformat_minor": 2 }