Lexicon Generator#

This tutorial is available as an IPython notebook at Malaya/example/lexicon.

[1]:
%%time
import malaya
import numpy as np
CPU times: user 4.47 s, sys: 1.01 s, total: 5.48 s
Wall time: 5.37 s

Why lexicon#

A lexicon is a curated list of words associated with particular domains, for example words that carry negative or positive sentiment.

For example, the word suka can represent positive sentiment. If suka appears in a sentence, we can say that the sentence carries positive sentiment.

Lexicon-based classification is a common and very fast way to classify text. It is, however, fairly naive, because a word can be semantically ambiguous.
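
To make the idea concrete, here is a minimal sketch of lexicon-based classification in plain Python. The `toy_lexicon` and the `classify` helper are hypothetical and are not part of Malaya; the sketch only counts how many tokens of a sentence appear under each label.

```python
from collections import Counter

# Hypothetical toy lexicon for illustration; not the lexicon shipped with Malaya.
toy_lexicon = {
    'positive': ['suka', 'gembira'],
    'negative': ['benci', 'marah'],
}


def classify(text, lexicon):
    """Return the label whose words occur most often in `text`, or None."""
    tokens = text.lower().split()
    hits = Counter()
    for label, words in lexicon.items():
        vocabulary = set(words)
        # Count the tokens that belong to this label's word list.
        hits[label] = sum(token in vocabulary for token in tokens)
    label, count = hits.most_common(1)[0]
    # No lexicon word found means no decision.
    return label if count > 0 else None


plain = classify('saya suka makan', toy_lexicon)
negated = classify('saya tak suka', toy_lexicon)
```

Both `plain` and `negated` come out as `'positive'`, even though the second sentence negates suka with tak. This is the kind of ambiguity that makes a pure word lookup naive.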

sentiment lexicon#

Malaya provides a small sample sentiment lexicon:

[6]:
sentiment_lexicon = malaya.lexicon.sentiment
sentiment_lexicon.keys()
[6]:
dict_keys(['negative', 'positive'])

emotion lexicon#

Malaya provides a small sample emotion lexicon:

[3]:
emotion_lexicon = malaya.lexicon.emotion
emotion_lexicon.keys()
[3]:
dict_keys(['anger', 'fear', 'joy', 'love', 'sadness', 'surprise'])

Lexicon generator#

Building a lexicon is time consuming, because it requires domain experts to populate the words related to each domain. With the help of word vectors, we can induce words for specific domains from a small annotated lexicon.

Why induce a lexicon from word vectors? Even though a word such as suka commonly represents positive sentiment, the word vectors may have learnt suka in contexts of a different polarity. If its nearest words also carry that different polarity, suka will tend towards negative sentiment.

Malaya provides a lexicon induction interface, built on top of the paper Inducing Domain-Specific Sentiment Lexicons from Unlabeled Corpora.

Say you have a lexicon based on standard language (bahasa baku) and you want to find a similar lexicon in a social media context. You can use the malaya.lexicon interface for this. To use it, we must first load word vectors with malaya.wordvector.load.

You also need at least a small sample lexicon like this:

{'label1': ['word1', 'word2'], 'label2': ['word3', 'word4']}

There can be more than two labels; for example, malaya.lexicon.emotion has 6 different labels.
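
Before passing a hand-built lexicon to the interface, it can help to check that it has the shape shown above. The helper below is a hypothetical sketch, not a Malaya function; requiring at least two labels is an assumption made for this example.

```python
def validate_seed_lexicon(lexicon):
    """Check that `lexicon` looks like {'label1': [str], 'label2': [str]}."""
    # Assumption for this sketch: induction needs at least two labels.
    if not isinstance(lexicon, dict) or len(lexicon) < 2:
        raise ValueError('lexicon must be a dict with at least two labels')
    for label, words in lexicon.items():
        if not isinstance(words, (list, tuple)) or len(words) == 0:
            raise ValueError(f'label {label!r} must map to a non-empty list')
        if not all(isinstance(word, str) for word in words):
            raise ValueError(f'label {label!r} must contain only strings')
    return True


seed = {'label1': ['word1', 'word2'], 'label2': ['word3', 'word4']}
is_valid = validate_seed_lexicon(seed)
```

The check runs entirely on the dictionary, so malformed input fails before any word vectors are loaded.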

[5]:
vocab, embedded = malaya.wordvector.load(model = 'socialmedia')
wordvector = malaya.wordvector.WordVector(embedded, vocab)

random walk#

The random walk is the main technique used by the paper; you can read more in section 3.2, Propagating polarities from a seed set.
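
The propagation step can be sketched in NumPy as follows. This is a simplified illustration and not Malaya's implementation: the toy vectors, the function names and the row-normalised transition matrix are assumptions made for this example. It uses the damped update `scores = beta * T @ scores + (1 - beta) * seed`, where a `beta` closer to 1.0 lets scores spread further from the seed words.

```python
import numpy as np


def transition_matrix(vectors):
    """Build a row-normalised similarity graph from word vectors."""
    # L2-normalise so that dot products are cosine similarities.
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    cosine = np.clip(unit @ unit.T, -1.0, 1.0)
    # arccos(-cosine) is larger for more similar words (range 0 to pi).
    edges = np.arccos(-cosine)
    np.fill_diagonal(edges, 0.0)
    # Simplification: normalise each row so its weights sum to 1.
    return edges / edges.sum(axis=1, keepdims=True)


def propagate(transition, seed_indices, beta=0.9, n_iter=100):
    """Spread seed mass over the graph with a damped random walk."""
    seed = np.zeros(transition.shape[0])
    seed[seed_indices] = 1.0 / len(seed_indices)
    scores = seed.copy()
    for _ in range(n_iter):
        scores = beta * (transition @ scores) + (1.0 - beta) * seed
    return scores


# Hypothetical 2-D vectors: words 0 and 1 point one way, words 2 and 3 the other.
vectors = np.array([[1.0, 0.1], [0.9, 0.2], [-1.0, 0.1], [-0.9, 0.2]])
transition = transition_matrix(vectors)
positive = propagate(transition, [0])  # word 0 is the positive seed
negative = propagate(transition, [2])  # word 2 is the negative seed
polarity = positive / (positive + negative)
```

In this toy graph, word 1 is not a seed but ends up with a polarity above 0.5 because it is close to the positive seed, and word 3 ends up below 0.5 for the opposite reason.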

def random_walk(
    lexicon,
    wordvector,
    pool_size = 10,
    top_n = 20,
    similarity_power = 100.0,
    beta = 0.9,
    arccos = True,
    normalization = True,
    soft = False,
    silent = False,
):

    """
    Induce a lexicon using the random walk technique, as used in the paper https://arxiv.org/pdf/1606.02820.pdf

    Parameters
    ----------

    lexicon: dict
        curated lexicon from domain experts, {'label1': [str], 'label2': [str]}.
    wordvector: object
        wordvector interface object.
    pool_size: int, optional (default=10)
        pick the top `pool_size` nearest words from each lexicon.
    top_n: int, optional (default=20)
        the `top_n` entries for each vector will be multiplied with `similarity_power`.
    similarity_power: float, optional (default=100.0)
        extra score for `top_n`; a lower value induces less bias but has a higher chance of an unbalanced outcome.
    beta: float, optional (default=0.9)
        penalty score; values closer to 1.0 mean less penalty. 0 < beta < 1.
    arccos: bool, optional (default=True)
        covariance distribution for embedded.dot(embedded.T). If false, covariance + 1.
    normalization: bool, optional (default=True)
        normalize word vectors using L2 norm. L2 is good to penalize skewed vectors.
    soft: bool, optional (default=False)
        if True, a word not in the dictionary will be replaced with the nearest word by Jaro-Winkler ratio.
        if False, an exception is raised when a word is not in the dictionary.
    silent: bool, optional (default=False)
        if True, will not print any logs.

    Returns
    -------
    tuple: (labels[argmax(scores, axis = 1)], scores, labels)

    """
[5]:
%%time

results, scores, labels = malaya.lexicon.random_walk(sentiment_lexicon, wordvector, pool_size = 5)
populating nearest words from wordvector
populating vectors from populated nearest words
random walking from populated vectors

CPU times: user 1min 36s, sys: 16.1 s, total: 1min 52s
Wall time: 28.1 s
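
The first element of the returned tuple is described in the docstring as the label with the highest score for each word. A minimal NumPy sketch of that selection, assuming `scores` is a 2-D array with one row per word and one column per label (the numbers here are made up for illustration):

```python
import numpy as np

toy_labels = np.array(['negative', 'positive'])

# Hypothetical scores: one row per word, one column per label.
toy_scores = np.array([
    [0.80, 0.20],
    [0.30, 0.70],
    [0.55, 0.45],
])

# For every row, pick the label whose column holds the largest score.
predicted = toy_labels[np.argmax(toy_scores, axis=1)]
```

The third row shows why it can be worth looking at the scores themselves: the word is labelled negative, but only by a narrow margin.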
[6]:
np.unique(list(results.values()), return_counts = True)
[6]:
(array(['negative', 'positive'], dtype='<U8'), array([2260, 2922]))
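
The same summary can be built with the standard library, which also makes it easy to list the words under each label. The sketch below uses a small hand-copied subset of the induced results rather than the full dictionary.

```python
from collections import Counter, defaultdict

# A few entries copied from the induced results, for illustration only.
sample_results = {
    'serang': 'negative',
    'cilegon': 'positive',
    'culik': 'negative',
    'luka': 'negative',
    'keyakinan': 'positive',
}

# Number of induced words per label.
label_counts = Counter(sample_results.values())

# Invert the mapping: label -> list of words.
words_by_label = defaultdict(list)
for word, label in sample_results.items():
    words_by_label[label].append(word)
```

Grouping by label makes manual review easier, which matters here because the induced output also contains noisy tokens such as place names and phone numbers.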
[7]:
results
[7]:
{'serang': 'negative',
 'cilegon': 'positive',
 'culik': 'negative',
 'tanjungpinang': 'positive',
 'jenguk': 'negative',
 'luka': 'negative',
 'jerawat': 'negative',
 'infeksi': 'negative',
 'migrain': 'negative',
 'penyakit': 'negative',
 'penaklukan': 'negative',
 '4ir': 'positive',
 'renjer': 'positive',
 'kezhaliman': 'positive',
 'proklamator': 'positive',
 'kelucahan': 'negative',
 'pablisiti': 'positive',
 'terjwp': 'positive',
 '33100': 'positive',
 'impos': 'positive',
 'kritikan': 'negative',
 'mandat': 'negative',
 'teguran': 'negative',
 'persepsi': 'negative',
 'pembelaan': 'negative',
 'muflis': 'negative',
 'mempelajarinya': 'negative',
 'melarat': 'positive',
 'dihabisi': 'positive',
 'kooperatif': 'positive',
 'kelemahan': 'negative',
 'keyakinan': 'positive',
 'kehendak': 'negative',
 'keburukan': 'negative',
 'gerombolan': 'negative',
 'kelakuan': 'negative',
 'antek': 'negative',
 'politikus': 'negative',
 'ulah': 'negative',
 'debu': 'negative',
 'kotoran': 'negative',
 'polusi': 'negative',
 'kuman': 'negative',
 'keringat': 'negative',
 'sinis': 'negative',
 'misterius': 'positive',
 'menggemaskan': 'positive',
 'emosional': 'negative',
 'progresif': 'positive',
 'bocor': 'negative',
 'pecah': 'negative',
 'retak': 'negative',
 'rosak': 'negative',
 'terbalik': 'negative',
 'kekacauan': 'negative',
 'penindasan': 'negative',
 'perdebatan': 'negative',
 'kesombongan': 'negative',
 'pengamatan': 'negative',
 'permusuhan': 'negative',
 'ketidakadilan': 'negative',
 'empati': 'negative',
 'perpecahan': 'negative',
 'menghasut': 'negative',
 'menghukum': 'negative',
 'memfitnah': 'negative',
 'memaki': 'negative',
 'memprovokasi': 'negative',
 'bersedih': 'negative',
 'mengalah': 'negative',
 'terlena': 'negative',
 'cemburu': 'negative',
 'dikenang': 'negative',
 'jatuh': 'negative',
 'terjatuh': 'negative',
 'putus': 'negative',
 'hilang': 'negative',
 'hancur': 'negative',
 'dipakai': 'negative',
 'digunakan': 'negative',
 'dikonsumsi': 'negative',
 'dipake': 'negative',
 'diminum': 'negative',
 'harapan': 'negative',
 'kebahagiaan': 'positive',
 'impian': 'positive',
 'cita2': 'negative',
 'senyuman': 'positive',
 'beban': 'negative',
 'resiko': 'negative',
 'kerugian': 'negative',
 'tekanan': 'negative',
 'risiko': 'negative',
 'mencaci': 'negative',
 'dicaci': 'negative',
 'mengejek': 'negative',
 'disia': 'negative',
 'bengkak': 'negative',
 'berair': 'negative',
 'lebam': 'negative',
 'lenguh': 'negative',
 'toksik': 'negative',
 'toksin': 'negative',
 'pepejal': 'positive',
 'kafein': 'negative',
 'buih': 'negative',
 'terperangkap': 'negative',
 'dijumpai': 'negative',
 'tersimpan': 'negative',
 'tergabung': 'negative',
 'bertarung': 'negative',
 'rahsia': 'negative',
 'cabaran': 'positive',
 'petua': 'negative',
 'persamaan': 'negative',
 'punca': 'negative',
 'fail': 'negative',
 'failed': 'negative',
 'approve': 'negative',
 'consider': 'negative',
 'freehair': 'negative',
 'munafik': 'negative',
 'dungu': 'negative',
 'liberal': 'negative',
 'rasis': 'negative',
 'konservatif': 'negative',
 'parasit': 'negative',
 'klorofil': 'negative',
 'klorin': 'positive',
 'fibroid': 'negative',
 'antibakteri': 'negative',
 'menyesal': 'negative',
 'nyesal': 'negative',
 'terkejut': 'positive',
 'terliur': 'positive',
 'sebak': 'positive',
 'pemberontakan': 'negative',
 'kudeta': 'negative',
 'feminisme': 'negative',
 'keragaman': 'negative',
 'kesangsian': 'negative',
 'nelponke': 'positive',
 'datebook': 'negative',
 '4dalzk': 'negative',
 'ketidakpentinganku': 'positive',
 'fasis': 'negative',
 'portugis': 'negative',
 'ateisme': 'positive',
 'illuminati': 'negative',
 'malang': 'negative',
 'depok': 'positive',
 'kediri': 'positive',
 'semarang': 'positive',
 'cirebon': 'positive',
 'mendatangkan': 'negative',
 'menimbulkan': 'negative',
 'memupuk': 'negative',
 'mengundang': 'negative',
 'menghianati': 'negative',
 'kejatuhan': 'negative',
 'pelemahan': 'negative',
 'lonjakan': 'negative',
 'ketiadaan': 'negative',
 'pengubahan': 'negative',
 'memusnahkan': 'negative',
 'mengadopsi': 'negative',
 'merampas': 'negative',
 'mengangkut': 'negative',
 'mengarahkan': 'negative',
 'kemarahan': 'negative',
 'keimanan': 'positive',
 'penderitaan': 'negative',
 'wabak': 'negative',
 'letupan': 'negative',
 'jangkitan': 'negative',
 'serangan': 'negative',
 'jenayah': 'negative',
 'tragedi': 'negative',
 'peristiwa': 'negative',
 'insiden': 'negative',
 'kejadian': 'negative',
 'menganggur': 'negative',
 'dioptimalkan': 'positive',
 'menyakitimu': 'positive',
 'bernafsu': 'positive',
 'derhaka': 'negative',
 'menakan': 'negative',
 'sulung': 'positive',
 'bongsu': 'negative',
 'teruna': 'negative',
 'merungut': 'negative',
 'komplen': 'negative',
 'giveup': 'negative',
 'melalak': 'negative',
 'melawa': 'negative',
 'berdarah': 'negative',
 'bengkok': 'negative',
 'layu': 'negative',
 'ngeri': 'negative',
 'serem': 'negative',
 'kocak': 'negative',
 'mantep': 'positive',
 'miris': 'negative',
 'menghina': 'negative',
 'menuduh': 'negative',
 'membenci': 'negative',
 'menyalahkan': 'negative',
 'menyekat': 'negative',
 'menggenjot': 'negative',
 'mengevaluasi': 'negative',
 'mengalirkan': 'negative',
 'melemahkan': 'negative',
 'keengganan': 'negative',
 'vendon': 'positive',
 'koturno': 'positive',
 'spesialisasikan': 'positive',
 "'pembongkaran": 'positive',
 'neraka': 'negative',
 'surga': 'negative',
 'syurga': 'positive',
 'kubur': 'negative',
 'mesjid': 'negative',
 'gerun': 'negative',
 'betui2': 'positive',
 'bankrup': 'positive',
 'gamak': 'positive',
 'mendobi': 'negative',
 'penghapusan': 'negative',
 'proyeksi': 'negative',
 'realisasi': 'negative',
 'pengendalian': 'negative',
 'maraknya': 'negative',
 'strike': 'negative',
 'adop': 'positive',
 'seats': 'positive',
 'sponsored': 'positive',
 'script': 'positive',
 'pengangguran': 'negative',
 'pns': 'negative',
 'koruptor': 'negative',
 'oposisi': 'negative',
 'stunting': 'negative',
 'mengamuk': 'negative',
 'membebel': 'negative',
 'menjerit': 'negative',
 'meroyan': 'negative',
 'bergaduh': 'negative',
 'keruntuhan': 'negative',
 'maxxie': 'positive',
 '081266267925': 'positive',
 'evvadiki': 'positive',
 'digibdulu': 'positive',
 'kekuatan': 'negative',
 'kepercayaan': 'positive',
 'kesadaran': 'negative',
 'hasrat': 'negative',
 'radikal': 'negative',
 'sekuler': 'negative',
 'intoleran': 'negative',
 'sosialis': 'negative',
 'penagih': 'negative',
 'penagihan': 'positive',
 'professor': 'negative',
 'keldai': 'negative',
 'penebar': 'negative',
 'menghentam': 'negative',
 'jagungbakar': 'positive',
 'pembakaram': 'positive',
 'bajucoplemurah': 'positive',
 'ma3i': 'positive',
 'pembakar': 'negative',
 'limpahan': 'positive',
 'melarutkan': 'positive',
 'pencegah': 'negative',
 'merendam': 'positive',
 'membakar': 'negative',
 'mengikat': 'negative',
 'membersihkan': 'positive',
 'menghancurkan': 'negative',
 'pembakaran': 'negative',
 'penat': 'negative',
 'letih': 'negative',
 'stress': 'negative',
 'bosan': 'negative',
 'mengantuk': 'negative',
 'binasa': 'negative',
 'membengkak': 'positive',
 'terpejam': 'positive',
 'menggumpal': 'positive',
 'bergoyang': 'negative',
 'diasingkan': 'negative',
 'difokuskan': 'negative',
 'melindungimu': 'positive',
 'terselamatkan': 'positive',
 'tertid': 'positive',
 'mengelak': 'negative',
 'menyiasat': 'negative',
 'menghindar': 'negative',
 'mengelakkan': 'negative',
 'dilepaskan': 'negative',
 'tempur': 'negative',
 'migas': 'negative',
 'nuklir': 'negative',
 'manufaktur': 'negative',
 'ilegal': 'negative',
 'discrimination': 'negative',
 'dramaticnyer': 'positive',
 'disuwek': 'positive',
 '6066030438': 'positive',
 'fahdy': 'positive',
 'merugikan': 'negative',
 'meresahkan': 'negative',
 'menimpa': 'negative',
 'meyakinkan': 'positive',
 'membanggakan': 'positive',
 'membingungkan': 'negative',
 'diperlihatkan': 'negative',
 'dilakukannya': 'positive',
 'disegani': 'positive',
 'dititipkan': 'negative',
 'fatal': 'positive',
 'provokatif': 'positive',
 'memprihatinkan': 'positive',
 'ambisius': 'positive',
 'mendasar': 'positive',
 'peredaran': 'negative',
 'sirkulasi': 'negative',
 'pembuluh': 'negative',
 'murka': 'negative',
 'dilaknat': 'negative',
 'diijabah': 'negative',
 'berkehendak': 'negative',
 'terusik': 'positive',
 'virus': 'negative',
 'hama': 'negative',
 'stroke': 'negative',
 'perkauman': 'negative',
 'lgbt': 'negative',
 'icerd': 'negative',
 'rasuah': 'negative',
 'politik': 'negative',
 'kehancuran': 'negative',
 'kedewasaan': 'negative',
 'penjajahan': 'negative',
 'menurun': 'negative',
 'meningkat': 'negative',
 'berkurang': 'negative',
 'membaik': 'negative',
 'meroket': 'negative',
 'mengetepikan': 'negative',
 'kuimplankan': 'positive',
 'mountaineer': 'positive',
 'chapalein': 'positive',
 '40365036': 'positive',
 'penjara': 'negative',
 'lokap': 'negative',
 'mengekori': 'negative',
 'c4uf5s': 'positive',
 '085602974529': 'positive',
 'kebiasqan': 'positive',
 'teamgoals': 'positive',
 'bimbang': 'negative',
 'khawatir': 'negative',
 'kesal': 'positive',
 'sungkan': 'negative',
 'pemabuk': 'negative',
 'adibrunner': 'positive',
 'eppii': 'positive',
 '3s3bju': 'positive',
 'jakwir': 'positive',
 'pemukul': 'negative',
 'seminaronline7': 'positive',
 'gemoksaya': 'positive',
 'gabisabisa': 'positive',
 'berocorak': 'positive',
 'penentangan': 'negative',
 'livescreen': 'positive',
 'meliriktelegramdan': 'positive',
 '081334186600': 'positive',
 'indox': 'positive',
 'terdesak': 'negative',
 'desperate': 'negative',
 'bebal': 'negative',
 'fobia': 'negative',
 'nekad': 'positive',
 'tahi': 'negative',
 'taik': 'negative',
 'bangkai': 'negative',
 'seekor': 'negative',
 'ulat': 'negative',
 'kesusahan': 'negative',
 'kesedihan': 'negative',
 'keraguan': 'negative',
 'berdepan': 'negative',
 'dikaitkan': 'negative',
 'dimulakan': 'negative',
 'mengesan': 'negative',
 'dikejutkan': 'negative',
 'tamak': 'negative',
 'biadap': 'negative',
 'bongkak': 'negative',
 'angkuh': 'negative',
 'pendarahan': 'negative',
 'alahan': 'negative',
 'pembengkakan': 'negative',
 'kegatalan': 'negative',
 'komplikasi': 'negative',
 'dirosakkan': 'negative',
 'sajadahmasjid': 'positive',
 'wisatalumajang': 'positive',
 'dsmua': 'positive',
 'otogod': 'positive',
 'kekufuran': 'negative',
 'auratnya': 'positive',
 'kebhinekaan': 'positive',
 'kekuatannya': 'negative',
 'maksiat': 'negative',
 'zina': 'negative',
 'provokasi': 'negative',
 'syirik': 'negative',
 'dicemari': 'negative',
 'bergandingan': 'negative',
 'diperankan': 'positive',
 'dihalang': 'negative',
 'bpuasa': 'positive',
 'merobohkan': 'negative',
 'wediaraya': 'positive',
 'pliharaku': 'positive',
 'diinfor': 'positive',
 'ivgfood': 'positive',
 'mencuri': 'negative',
 'pecahkan': 'negative',
 'sumbang': 'negative',
 'meminjam': 'negative',
 'curi': 'negative',
 'disembelih': 'negative',
 'terobati': 'negative',
 'diangetin': 'positive',
 'berharta': 'positive',
 'dituliskan': 'positive',
 'pengepungan': 'negative',
 'menyamoaikan': 'positive',
 'kihoii': 'positive',
 'sukasukanya': 'positive',
 '085740709892': 'positive',
 'menyeleweng': 'negative',
 'bukanyah': 'positive',
 'terlangkap': 'positive',
 'nurulady_sandwich': 'positive',
 'spupet': 'positive',
 'krisis': 'negative',
 'konflik': 'negative',
 'kekhawatiran': 'negative',
 'keterbatasan': 'negative',
 'ancaman': 'negative',
 'dipadamkan': 'negative',
 'diagungkan': 'positive',
 'digunapakai': 'positive',
 'dikenalpasti': 'negative',
 'digariskan': 'positive',
 'sumpahan': 'negative',
 'busuknya': 'negative',
 'raklu': 'positive',
 'adela': 'negative',
 'sgguh': 'positive',
 'merebut': 'negative',
 'memindahkan': 'negative',
 'menyelamatkan': 'negative',
 'memperluas': 'negative',
 'pembangkang': 'negative',
 'ppbm': 'negative',
 'bn': 'negative',
 'tmj': 'negative',
 'pkr': 'negative',
 'bercanggah': 'negative',
 'berkerjasama': 'negative',
 'diberhentikan': 'negative',
 'terpalit': 'negative',
 'selari': 'negative',
 'penalty': 'negative',
 'lipliner': 'positive',
 'glasses': 'positive',
 'kdak': 'positive',
 'logbook': 'positive',
 'tergantung': 'negative',
 'beda': 'negative',
 'berbeda': 'positive',
 'gatau': 'negative',
 'berdasarkan': 'negative',
 'longgar': 'negative',
 'ketat': 'positive',
 'sendat': 'positive',
 'ramping': 'positive',
 'dijahit': 'negative',
 'kontroversi': 'negative',
 'kezaliman': 'negative',
 'penolakan': 'negative',
 'menakutkan': 'negative',
 'menyedihkan': 'negative',
 'mengerikan': 'negative',
 'mendebarkan': 'positive',
 'dibenci': 'negative',
 'mengusik': 'negative',
 'memberkahi': 'positive',
 'menyirami': 'negative',
 'memantulkan': 'negative',
 'menampar': 'negative',
 'problem': 'negative',
 'prob': 'positive',
 'down': 'negative',
 'error': 'negative',
 'function': 'positive',
 'pelarian': 'negative',
 'pengemis': 'negative',
 'jurnalis': 'negative',
 'primadona': 'negative',
 'buzzer': 'negative',
 'lengkap': 'negative',
 'lengkapnya': 'positive',
 'komplit': 'positive',
 'pengirim': 'negative',
 'simpel': 'positive',
 'bencana': 'negative',
 'musibah': 'negative',
 'tsunami': 'negative',
 'kerusuhan': 'negative',
 'rompakan': 'negative',
 'samun': 'negative',
 'lynas': 'negative',
 'rusuhan': 'negative',
 'penyelewengan': 'negative',
 'meletup': 'negative',
 'tercabut': 'negative',
 'terkencing': 'negative',
 'pitam': 'negative',
 'letup': 'negative',
 'membosankan': 'negative',
 'menyebalkan': 'negative',
 'rumit': 'negative',
 'bantahan': 'negative',
 'cenderahati': 'negative',
 'instruksi': 'negative',
 'ketertarikan': 'negative',
 'penghasut': 'negative',
 'hasanudin': 'positive',
 'astuti': 'positive',
 'kurva': 'positive',
 'gerd': 'positive',
 'ribut': 'negative',
 'ngeluh': 'negative',
 'rusuh': 'negative',
 'berantem': 'negative',
 'ngumpul': 'negative',
 'bergelut': 'negative',
 'disibukkan': 'negative',
 'berkolaborasi': 'negative',
 'berkutat': 'negative',
 'khinzir': 'negative',
 'cmnie': 'positive',
 'kecikk': 'positive',
 'instafemes': 'positive',
 'siuk': 'positive',
 'gangguan': 'negative',
 'kerusakan': 'negative',
 'permasalahan': 'negative',
 'berisiko': 'negative',
 'beresiko': 'positive',
 'rentan': 'negative',
 'berpotensi': 'negative',
 'disyaki': 'negative',
 'mengetuk': 'negative',
 'membukakan': 'negative',
 'bukain': 'negative',
 'ngetok': 'negative',
 'bukakan': 'negative',
 'memutuskan': 'negative',
 'berkomitmen': 'positive',
 'berencana': 'negative',
 'berniat': 'negative',
 'diminta': 'negative',
 'penceroboh': 'negative',
 'keperpercayaan': 'positive',
 'coherence': 'positive',
 'lgdnya': 'positive',
 "deto'x": 'positive',
 'sindiran': 'negative',
 'heroik': 'positive',
 'ceramahnya': 'positive',
 'petuah': 'negative',
 'ketegasan': 'negative',
 'hukuman': 'negative',
 'pidana': 'negative',
 'sanksi': 'negative',
 'najis': 'negative',
 'cicak': 'negative',
 'iblis': 'negative',
 'depresi': 'negative',
 'mengharamkan': 'negative',
 'memaknai': 'negative',
 'meragukan': 'negative',
 'mengedepankan': 'negative',
 'kelaparan': 'negative',
 'kesepian': 'negative',
 'tenggelam': 'negative',
 'gelisah': 'negative',
 'terluka': 'negative',
 'korupsi': 'negative',
 'makar': 'negative',
 'kriminal': 'negative',
 'vandalisme': 'negative',
 'penipuan': 'negative',
 'kebencian': 'negative',
 'kebohongan': 'negative',
 'hoaks': 'negative',
 'dusta': 'negative',
 'inflasi': 'negative',
 'apbn': 'negative',
 'trauma': 'negative',
 'mual': 'negative',
 'stres': 'negative',
 'badmood': 'negative',
 'keradangan': 'negative',
 'pigmentasi': 'negative',
 'peradangan': 'negative',
 'keletihan': 'negative',
 'selulit': 'negative',
 'kesilapan': 'negative',
 'kesalahan': 'negative',
 'kemusnahan': 'negative',
 'perbendeharaan': 'positive',
 'romanticist': 'positive',
 'deseu2': 'positive',
 'menyjilat': 'positive',
 'benci': 'negative',
 'menyampah': 'positive',
 'jijik': 'negative',
 'kagum': 'positive',
 'geli': 'positive',
 'mendesak': 'negative',
 'mengkritik': 'negative',
 'menggesa': 'negative',
 'menghimbau': 'negative',
 'diperintah': 'negative',
 'tahap': 'negative',
 'level': 'negative',
 'fasa': 'negative',
 'tingkat': 'negative',
 'babak': 'negative',
 'praktikal': 'negative',
 'kaunseling': 'negative',
 'stpm': 'negative',
 'pt3': 'negative',
 'practical': 'negative',
 'dahsyat': 'negative',
 'tragis': 'negative',
 'dasyat': 'negative',
 'kematian': 'negative',
 'pembunuhan': 'negative',
 'kekalahan': 'negative',
 'kebodohan': 'negative',
 'pembelotan': 'negative',
 'bis2lo': 'negative',
 'nepisnya': 'positive',
 'stabizernya': 'negative',
 'dziewczynka': 'negative',
 'mengkhianati': 'negative',
 'mengabaikan': 'negative',
 'menyembah': 'negative',
 'meremehkan': 'negative',
 'perbuatannya': 'negative',
 'protes': 'negative',
 'kritik': 'negative',
 'dibela': 'negative',
 'rekonsiliasi': 'negative',
 'diusir': 'negative',
 'tuduhan': 'negative',
 'dakwaan': 'negative',
 'perbuatan': 'negative',
 'tuntutan': 'negative',
 'dadah': 'negative',
 'hey': 'positive',
 'astagfirullah': 'negative',
 'heh': 'negative',
 'fak': 'positive',
 'ditakuti': 'negative',
 'diharamkan': 'negative',
 'dicintai': 'positive',
 'nasionalis': 'negative',
 'mengalir': 'negative',
 'tumpah': 'negative',
 'merebak': 'negative',
 'dimasukkan': 'negative',
 'terjun': 'negative',
 'mencederakan': 'negative',
 'mummuy': 'positive',
 'pkdnya': 'positive',
 'dilepasi': 'positive',
 'tolak': 'negative',
 'keluarkan': 'negative',
 'tuntut': 'negative',
 'pegang': 'negative',
 'kutip': 'negative',
 'khianat': 'negative',
 'bersaksi': 'negative',
 'dipersalahkan': 'positive',
 'menyeksa': 'negative',
 'morah2': 'positive',
 'hakimnegara': 'positive',
 'princemmed': 'positive',
 'bedaken': 'positive',
 'kemelesetan': 'negative',
 'raauww': 'positive',
 "'aiyok": 'positive',
 '15dan': 'positive',
 'huina': 'positive',
 'melumpuhkan': 'negative',
 'dipercayakan': 'positive',
 'direbut': 'negative',
 'menyasar': 'positive',
 'mengetuai': 'negative',
 'kesengsaraan': 'negative',
 'kebermanfaatan': 'positive',
 'kegelisahan': 'negative',
 'berkabung': 'negative',
 'berbasikal': 'positive',
 'berbisnes': 'negative',
 'memuncak': 'positive',
 'berbahas': 'negative',
 'pengakuan': 'negative',
 'kesaksian': 'negative',
 'pernyataan': 'negative',
 'perang': 'negative',
 'neraca': 'negative',
 'negosiasi': 'negative',
 'kebangkitan': 'positive',
 'menyerahkan': 'negative',
 'menyalurkan': 'negative',
 'membagikan': 'negative',
 'serahkan': 'negative',
 'mengajukan': 'negative',
 'hutang': 'negative',
 'utang': 'negative',
 'pendapatan': 'negative',
 'pajak': 'negative',
 'cukai': 'negative',
 'saingan': 'negative',
 'trofi': 'positive',
 'pertarungan': 'negative',
 'kompetisi': 'negative',
 'klasemen': 'negative',
 'mengeruhkan': 'negative',
 'zuaini': 'positive',
 'sedip': 'positive',
 '7572687': 'positive',
 'sesiapo': 'positive',
 'mengemis': 'negative',
 'tanyaa': 'negative',
 'feeling2': 'positive',
 'berdendam': 'negative',
 'bermasalah': 'negative',
 'sensitif': 'positive',
 'terganggu': 'negative',
 'berjerawat': 'positive',
 'menghitam': 'positive',
 'disaster': 'negative',
 'ngisahin': 'positive',
 'butoset': 'positive',
 'stuffed': 'positive',
 'kayk': 'positive',
 'rapuh': 'negative',
 'rebah': 'negative',
 'mengering': 'positive',
 'kaku': 'negative',
 'hti': 'negative',
 'syaitan': 'negative',
 'pembohong': 'negative',
 'opposition': 'negative',
 'accord': 'positive',
 'hone': 'positive',
 'writternya': 'positive',
 'memahat': 'positive',
 'dikawal': 'negative',
 'ditangani': 'negative',
 'diselamatkan': 'negative',
 'diselesaikan': 'negative',
 'dilewati': 'negative',
 'beracun': 'negative',
 'lazim': 'positive',
 'merbahaya': 'positive',
 'mengkilap': 'positive',
 'berbahaya': 'negative',
 'gross': 'negative',
 'paint': 'positive',
 'bunny': 'positive',
 'teriyaki': 'positive',
 'panther': 'positive',
 'menghantui': 'negative',
 'menyiksa': 'negative',
 'menuntun': 'negative',
 'cintakan': 'negative',
 'membohongi': 'negative',
 'bodoh': 'negative',
 'bangang': 'negative',
 'bodo': 'positive',
 'noob': 'negative',
 'merenggangkan': 'negative',
 'nowel2': 'positive',
 'memmpesonahh': 'positive',
 'sotoguk': 'positive',
 'promotinggal2harilagiburuuaann': 'positive',
 'polemik': 'negative',
 'penahanan': 'negative',
 'usulan': 'negative',
 'pertikaian': 'negative',
 'sejarahnya': 'negative',
 'kejanggalan': 'negative',
 'petaka': 'negative',
 'tamparan': 'negative',
 'takut': 'negative',
 'risau': 'negative',
 'malu': 'negative',
 'segan': 'negative',
 'ketinggalan': 'negative',
 'kehabisan': 'negative',
 'kebagian': 'negative',
 'lewatkan': 'negative',
 'terlepas': 'negative',
 'paksaan': 'negative',
 'kejelasan': 'negative',
 'batasnya': 'negative',
 'halangan': 'negative',
 'bingung': 'negative',
 'penasaran': 'positive',
 'mikir': 'negative',
 'kepikiran': 'negative',
 'males': 'negative',
 'ditinggalkan': 'negative',
 'dibunuh': 'negative',
 'dihina': 'negative',
 'dijalani': 'negative',
 'dilanda': 'negative',
 'mengidap': 'negative',
 'picu': 'negative',
 'memicu': 'negative',
 'terjangkit': 'negative',
 'penyerang': 'negative',
 'gelandang': 'negative',
 'pembalap': 'negative',
 'manajer': 'negative',
 'kiper': 'negative',
 'mencurigai': 'negative',
 'zemwah': 'positive',
 'enenenenenene': 'positive',
 'destroyers': 'positive',
 'norsyida': 'positive',
 'memarahi': 'negative',
 'dereta': 'positive',
 'pengambil': 'positive',
 'menjudge': 'positive',
 'disodorin': 'positive',
 'disentuh': 'negative',
 'memakainya': 'negative',
 'membacanya': 'negative',
 'dicerna': 'negative',
 'dihilangkan': 'negative',
 'membimbangkan': 'negative',
 'dibaiat': 'positive',
 'memenatkan': 'negative',
 'diingati': 'positive',
 'perosak': 'negative',
 'penghianat': 'negative',
 'pembela': 'negative',
 'perusak': 'negative',
 'minoriti': 'negative',
 'kemudaratan': 'negative',
 'kainavailable': 'positive',
 'angesti': 'positive',
 'konsta': 'positive',
 'togor2': 'positive',
 'menangkis': 'negative',
 'gobindh': 'positive',
 "k'sasar": 'positive',
 'mgnr': 'positive',
 'kemesu': 'positive',
 'rugi': 'negative',
 'untung': 'negative',
 'berdosa': 'negative',
 'berbaloi': 'positive',
 'terasa': 'negative',
 'merasa': 'negative',
 'berdebar': 'negative',
 'terlihat': 'positive',
 'berasa': 'negative',
 'tebusan': 'negative',
 '082257468845': 'positive',
 'penghakiman': 'positive',
 'dihafal': 'positive',
 'kecelaruan': 'negative',
 'pakvwi': 'positive',
 'mwamuna': 'positive',
 'hapepend': 'positive',
 'mengekuarkan': 'positive',
 'kasar': 'negative',
 'kotor': 'negative',
 'halus': 'positive',
 'kusam': 'positive',
 'memaksa': 'negative',
 'menyayangi': 'negative',
 'menyuruh': 'negative',
 'menyakiti': 'negative',
 'fanatik': 'negative',
 'toleran': 'positive',
 'zalim': 'negative',
 'atheis': 'negative',
 'kemiskinan': 'negative',
 'pelampau': 'negative',
 'dicekal': 'positive',
 'ysfheartnezia': 'positive',
 'photograther': 'positive',
 'ntuh': 'positive',
 'takot': 'negative',
 'teror': 'negative',
 'menyerang': 'negative',
 'membunuh': 'negative',
 'membela': 'negative',
 'menolong': 'negative',
 'menjatuhkan': 'negative',
 'menyamakan': 'negative',
 'meninggalkan': 'negative',
 'menemui': 'negative',
 'tinggalkan': 'negative',
 'menemukan': 'negative',
 'mengubah': 'negative',
 'miskin': 'negative',
 'goblok': 'negative',
 'jelek': 'negative',
 'jomblo': 'negative',
 'bego': 'negative',
 'siber': 'negative',
 'undang2': 'negative',
 'menangis': 'negative',
 'nangis': 'negative',
 'tertidur': 'negative',
 'tertunggak': 'negative',
 'langsai': 'positive',
 'rm2k': 'positive',
 'rm450': 'negative',
 'xsilap': 'positive',
 'lucah': 'negative',
 'porno': 'negative',
 'semburit': 'negative',
 'seks': 'negative',
 '3gp': 'negative',
 'mengalami': 'negative',
 'menderita': 'negative',
 'merasakan': 'negative',
 'menyebabkan': 'negative',
 'musnah': 'negative',
 'lenyap': 'negative',
 'sengsara': 'negative',
 'stereotaip': 'negative',
 'ahmbs': 'positive',
 'radangmembaik': 'positive',
 'escapepenang': 'positive',
 'f7szfx': 'positive',
 'ironinya': 'negative',
 'moyez': 'positive',
 'mauloee': 'positive',
 'ndakanamirana': 'positive',
 'skf3013': 'positive',
 'pergolakan': 'negative',
 'gelembung': 'negative',
 'menghadkan': 'negative',
 'wardrobenya': 'positive',
 'anrara': 'positive',
 'tukaanza': 'positive',
 'tersebutnya': 'positive',
 'hamba': 'negative',
 'hambanya': 'negative',
 'firman': 'negative',
 'takdir': 'negative',
 'rasul': 'negative',
 'memburukkan': 'negative',
 'tubuhkan': 'negative',
 'menggulingkan': 'negative',
 'meruntuhkan': 'negative',
 'membantai': 'negative',
 'haiwan': 'negative',
 'dajjal': 'negative',
 'penyamun': 'negative',
 'sampah': 'negative',
 'rumput': 'negative',
 'racun': 'negative',
 'rokok': 'negative',
 'dengki': 'negative',
 'jeles': 'positive',
 'sombong': 'negative',
 'hasutan': 'negative',
 'palsu': 'negative',
 'negatif': 'negative',
 ...}
[8]:
%%time

results_emotion, scores_emotion, labels_emotion = malaya.lexicon.random_walk(emotion_lexicon,
                                                                             wordvector,
                                                                             pool_size = 10)
populating nearest words from wordvector
populating vectors from populated nearest words
random walking from populated vectors

CPU times: user 5.9 s, sys: 3.13 s, total: 9.03 s
Wall time: 1.5 s
[9]:
np.unique(list(results_emotion.values()), return_counts = True)
[9]:
(array(['anger', 'fear', 'joy', 'love', 'sadness', 'surprise'], dtype='<U8'),
 array([ 76, 156,  14, 132,  40,  34]))
[10]:
results_emotion
[10]:
{'sebal': 'anger',
 'gesture': 'anger',
 'se7': 'anger',
 'ziraa': 'love',
 'mantepp': 'love',
 'mesem': 'love',
 'nggapapa': 'love',
 'maen2': 'love',
 'gacocok': 'anger',
 'jeongwoo': 'love',
 'bergelora': 'anger',
 'mereda': 'anger',
 'skeptis': 'anger',
 'gebus': 'love',
 'tyrion': 'love',
 'memuncak': 'anger',
 'mewabah': 'love',
 'mengenaskan': 'anger',
 'kesasar': 'love',
 'kepedean': 'love',
 'annoying': 'anger',
 'awkward': 'fear',
 'scary': 'fear',
 'handsome': 'fear',
 'nervous': 'fear',
 'cringe': 'fear',
 'menyampah': 'fear',
 'kelakar': 'fear',
 'cute': 'fear',
 'cuak': 'fear',
 'bodoh': 'anger',
 'bangang': 'anger',
 'bebal': 'anger',
 'bodo': 'fear',
 'noob': 'fear',
 'bengap': 'fear',
 'celaka': 'fear',
 'biadap': 'fear',
 'pukimak': 'fear',
 'berang': 'anger',
 'buru': 'anger',
 'nerus': 'anger',
 'kangsar': 'anger',
 'lipis': 'anger',
 'pilah': 'anger',
 'besut': 'anger',
 'krai': 'anger',
 'klawang': 'anger',
 'ketil': 'anger',
 'amuk': 'anger',
 'mbatin': 'love',
 'sebarin': 'love',
 'sebarisan': 'love',
 'ngalami': 'love',
 'tikt': 'love',
 'diharga': 'love',
 'threesome': 'love',
 'shizuka': 'love',
 'bokondini': 'love',
 'mendidih': 'anger',
 'mengental': 'anger',
 'sebati': 'anger',
 'mengembang': 'anger',
 'layu': 'anger',
 'kecoklatan': 'anger',
 'matang': 'anger',
 'meresap': 'anger',
 'mengering': 'anger',
 'direbus': 'anger',
 'pengecut': 'anger',
 'bajingan': 'anger',
 'pembohong': 'anger',
 'pecundang': 'anger',
 'dungu': 'anger',
 'pemberani': 'anger',
 'negarawan': 'anger',
 'jahil': 'anger',
 'biadab': 'anger',
 'provokator': 'anger',
 'bengang': 'anger',
 'menyirap': 'fear',
 'meluat': 'fear',
 'frust': 'fear',
 'rimas': 'fear',
 'annoyed': 'fear',
 'lonely': 'fear',
 'berdukacita': 'anger',
 'menyakitimu': 'anger',
 'bersinggungan': 'love',
 'bermesra': 'love',
 'meridhoi': 'anger',
 'menyelubungi': 'love',
 'empukk': 'love',
 'berserban': 'love',
 'diracuni': 'love',
 'dibayangi': 'love',
 'jengkel': 'anger',
 'gugup': 'anger',
 'dibiasain': 'love',
 'mubazir': 'anger',
 'amnesia': 'anger',
 'psikopat': 'anger',
 'gumoh': 'love',
 'diurusin': 'love',
 'ngangenin': 'anger',
 'purging': 'anger',
 'babi': 'anger',
 'sial': 'fear',
 'kimak': 'fear',
 'anjing': 'anger',
 'pundek': 'fear',
 'cibai': 'fear',
 'setan': 'anger',
 'lembu': 'anger',
 'pedar': 'anger',
 'sanwya': 'love',
 'qabaya': 'love',
 '5pac': 'love',
 'wa082336409906': 'love',
 'mpibg': 'love',
 'honachahthu': 'anger',
 'unieleven': 'love',
 'mengepilkan': 'anger',
 'ciknorzaidi': 'love',
 'benci': 'anger',
 'jijik': 'fear',
 'kagum': 'surprise',
 'geli': 'fear',
 'insecure': 'fear',
 'geram': 'fear',
 'respect': 'fear',
 'jealous': 'fear',
 'marah': 'anger',
 'maki': 'fear',
 'merajuk': 'fear',
 'marah2': 'surprise',
 'perli': 'fear',
 'jeles': 'fear',
 'tegur': 'fear',
 'kecam': 'fear',
 'cemburu': 'surprise',
 'bitter': 'fear',
 'ngeri': 'fear',
 'serem': 'fear',
 'kocak': 'fear',
 'mantep': 'fear',
 'miris': 'fear',
 'ngeselin': 'fear',
 'nyesek': 'fear',
 'kesel': 'fear',
 'sebel': 'fear',
 'lebay': 'fear',
 'phobia': 'fear',
 'mendem': 'love',
 'berideologi': 'love',
 'niru': 'love',
 'nyicip': 'love',
 'ngerawat': 'fear',
 'riweuh': 'anger',
 'nmun': 'love',
 'ngancam': 'love',
 'bencong': 'love',
 'anxiety': 'fear',
 'glasses': 'love',
 'manners': 'fear',
 'satan': 'love',
 'popularity': 'love',
 'curl': 'love',
 'impossible': 'love',
 'mayb': 'love',
 'sperm': 'love',
 'nyumpah': 'love',
 'fitnah': 'fear',
 'hoax': 'fear',
 'provokasi': 'fear',
 'kebencian': 'fear',
 'dusta': 'fear',
 'hoaks': 'fear',
 'kebohongan': 'fear',
 'bohong': 'fear',
 'rasis': 'anger',
 'ngibul': 'fear',
 'horror': 'fear',
 'horor': 'fear',
 'romance': 'fear',
 'day6': 'fear',
 'dokumenter': 'fear',
 'porno': 'fear',
 'anime': 'fear',
 'sinetron': 'fear',
 'drakor': 'fear',
 'dangdut': 'fear',
 'takut': 'fear',
 'risau': 'fear',
 'malu': 'fear',
 'khawatir': 'sadness',
 'segan': 'fear',
 'kecewa': 'sadness',
 'takot': 'fear',
 'bimbang': 'sadness',
 'takutnya': 'fear',
 'sedih': 'sadness',
 'panic': 'fear',
 'loud': 'love',
 'impressed': 'love',
 'expected': 'love',
 'dying': 'love',
 'rush': 'fear',
 'shitty': 'love',
 'smoke': 'fear',
 'suck': 'fear',
 'cheap': 'fear',
 'emo': 'fear',
 'boring': 'fear',
 'gelabah': 'fear',
 'ngantok': 'fear',
 'syok': 'joy',
 'seronok': 'joy',
 'busy': 'fear',
 'serabut': 'fear',
 'syiok': 'fear',
 'sendu': 'fear',
 'riang': 'joy',
 'ceria': 'sadness',
 'takbir': 'joy',
 'bersuka': 'anger',
 'emma': 'love',
 'barakah': 'anger',
 'telemovie': 'anger',
 'riuh': 'anger',
 'ria': 'joy',
 'khutbah': 'joy',
 'sebak': 'fear',
 'excited': 'fear',
 'terharu': 'surprise',
 'terliur': 'fear',
 'girang': 'joy',
 'ditikung': 'love',
 'ambis': 'anger',
 'rafa': 'love',
 'digangguin': 'love',
 'nyiksa': 'anger',
 'maruk': 'love',
 'tamvan': 'love',
 'pengap': 'love',
 'iklas': 'love',
 'puas': 'joy',
 'muak': 'sadness',
 'kenyang': 'fear',
 'lega': 'fear',
 'bosan': 'fear',
 'berbaloi': 'fear',
 'berpuas': 'sadness',
 'lelah': 'sadness',
 'bahagia': 'joy',
 'menyenangkan': 'sadness',
 'gelisah': 'sadness',
 'nyaman': 'sadness',
 'indah': 'sadness',
 'sukses': 'sadness',
 'sehat': 'sadness',
 'damai': 'sadness',
 'suka': 'joy',
 'sukanya': 'fear',
 'doyan': 'fear',
 'demen': 'fear',
 'suke': 'fear',
 'gasuka': 'fear',
 'gemar': 'fear',
 'sukaa': 'fear',
 'prefer': 'fear',
 'happy': 'joy',
 'hepi': 'love',
 'wish': 'fear',
 'nice': 'fear',
 'cerita': 'joy',
 'citer': 'fear',
 'cite': 'fear',
 'crita': 'fear',
 'kisah': 'love',
 'percakapan': 'joy',
 'tweet': 'fear',
 'drama': 'fear',
 'lagu': 'fear',
 'ceramah': 'joy',
 'cinta': 'love',
 'kebahagiaan': 'love',
 'cintanya': 'sadness',
 'cintaku': 'sadness',
 'persahabatan': 'love',
 'cintamu': 'sadness',
 'kesabaran': 'love',
 'dendam': 'sadness',
 'kesedihan': 'sadness',
 'asa': 'love',
 'baby': 'love',
 'daddy': 'love',
 'mira': 'fear',
 'princess': 'love',
 'bella': 'love',
 'farah': 'love',
 'mommy': 'love',
 'sister': 'love',
 'mummy': 'love',
 'lisa': 'love',
 'love': 'love',
 'luv': 'love',
 'hate': 'love',
 'thought': 'fear',
 'mean': 'fear',
 'want': 'fear',
 'see': 'fear',
 'need': 'fear',
 'hope': 'fear',
 'peace': 'fear',
 'syang': 'love',
 'noi': 'love',
 'bilang2': 'love',
 'syng': 'love',
 'mut': 'love',
 'ribbey': 'love',
 'seneng2': 'love',
 'butoset': 'love',
 'manly': 'love',
 'twet': 'love',
 'syg': 'love',
 'sayangg': 'love',
 'sayang': 'love',
 'bby': 'love',
 'cntik': 'fear',
 'knl': 'surprise',
 'anon': 'fear',
 'sistur': 'love',
 'sayang2': 'love',
 'bgus': 'fear',
 'rindukn': 'love',
 'ajeb2an': 'love',
 'hshakjsjsbs': 'love',
 'miliknyamencatat': 'love',
 'p6a': 'love',
 'ahsjahhaa': 'love',
 'diwajibk': 'love',
 'protese': 'love',
 'botaqin': 'love',
 'kruntel': 'love',
 'rindu': 'love',
 'sayangku': 'love',
 'sayangkan': 'love',
 'sayangnya': 'love',
 'disayang': 'anger',
 'moody': 'fear',
 'rindukan': 'love',
 'merindui': 'love',
 'takutkan': 'love',
 'banggakan': 'love',
 'cintakan': 'love',
 'perbuat': 'surprise',
 'merindukan': 'love',
 'ceraikan': 'love',
 'jumpai': 'love',
 'rindunya': 'fear',
 'teringat': 'fear',
 'rinduu': 'fear',
 'lapar': 'fear',
 'kempunan': 'fear',
 'teringin': 'fear',
 'kangen': 'fear',
 'confuse': 'fear',
 'stress': 'fear',
 'letih': 'fear',
 'penat': 'fear',
 'stres': 'sadness',
 'mengantuk': 'fear',
 'tertekan': 'sadness',
 'terganggu': 'sadness',
 'tertipu': 'surprise',
 'keliru': 'surprise',
 'mengeluh': 'sadness',
 'merosot': 'sadness',
 'susut': 'sadness',
 'terjebak': 'surprise',
 'terpengaruh': 'surprise',
 'kesal': 'sadness',
 'terkejut': 'surprise',
 'bersalah': 'sadness',
 'berdosa': 'fear',
 'dihargai': 'sadness',
 'janggal': 'anger',
 'resah': 'sadness',
 'kesepian': 'sadness',
 'gundah': 'sadness',
 'goyah': 'sadness',
 'disakiti': 'sadness',
 'takjub': 'sadness',
 'sengsara': 'sadness',
 'seram': 'fear',
 'menyebalkan': 'sadness',
 'merana': 'fear',
 'melarat': 'anger',
 'angkuh': 'sadness',
 'rakus': 'sadness',
 'terpuruk': 'sadness',
 'pengsan': 'surprise',
 'tertido': 'fear',
 'pitam': 'surprise',
 'terlelap': 'surprise',
 'terberak': 'surprise',
 'nanges': 'fear',
 'mengamuk': 'fear',
 'tdoq': 'fear',
 'termuntah': 'surprise',
 'tidor': 'surprise',
 'bangga': 'surprise',
 'surprise': 'surprise',
 'suprise': 'surprise',
 'makan2': 'surprise',
 'attention': 'fear',
 'kejutan': 'surprise',
 'assignment': 'fear',
 'comeback': 'surprise',
 'chance': 'fear',
 'homework': 'surprise',
 'appointment': 'surprise',
 'wtf': 'surprise',
 'huh': 'fear',
 'seriously': 'fear',
 'omg': 'fear',
 'aik': 'fear',
 'wth': 'fear',
 'shit': 'fear',
 'apoo': 'fear',
 'hah': 'fear',
 'damn': 'fear',
 'stun': 'surprise',
 'pinafsueun': 'love',
 'neelehh': 'love',
 'rudgard': 'love',
 '016344981': 'love',
 'pramaandika': 'love',
 'hamidibahawa': 'love',
 'spesialers': 'love',
 'superpignan': 'love',
 '082187486748': 'love',
 'tertanya2': 'surprise',
 'terperanjat': 'surprise',
 'cubaa': 'surprise',
 'stuju': 'surprise',
 'stayback': 'love',
 'cakaplah': 'surprise',
 'melebih': 'anger',
 'tanyaa': 'surprise',
 'ngandung': 'surprise'}

propagate probabilistic#

def propagate_probabilistic(
    lexicon,
    wordvector,
    pool_size = 10,
    top_n = 20,
    similarity_power = 10.0,
    arccos = True,
    normalization = True,
    soft = False,
    silent = False,
):

    """
    Learns polarity scores via standard label propagation from lexicon sets.

    Parameters
    ----------

    lexicon: dict
        curated lexicon from domain experts, {'label1': [str], 'label2': [str]}.
    wordvector: object
        wordvector interface object.
    pool_size: int, optional (default=10)
        pick the top `pool_size` words for each lexicon.
    top_n: int, optional (default=20)
        the `top_n` entries for each vector will be multiplied by `similarity_power`.
    similarity_power: float, optional (default=10.0)
        extra score for `top_n`; a lower value induces less bias but has a higher chance of an unbalanced outcome.
    arccos: bool, optional (default=True)
        if True, apply arccos to the covariance embedded.dot(embedded.T). If False, use covariance + 1.
    normalization: bool, optional (default=True)
        normalize word vectors using the L2 norm. L2 is good at penalizing skewed vectors.
    soft: bool, optional (default=False)
        if True, a word not in the dictionary will be replaced with the nearest word by Jaro-Winkler ratio.
        if False, an exception is raised if a word is not in the dictionary.
    silent: bool, optional (default=False)
        if True, will not print any logs.

    Returns
    -------
    tuple: (labels[argmax(scores, axis = 1)], scores, labels)
    """
[11]:
%%time

results_emotion, scores_emotion, labels_emotion = malaya.lexicon.propagate_probabilistic(
    emotion_lexicon,
    wordvector,
    pool_size = 10,
)
populating nearest words from wordvector
populating vectors from populated nearest words
propagating probabilistic from populated vectors

CPU times: user 5.64 s, sys: 2.05 s, total: 7.68 s
Wall time: 1.29 s
[12]:
np.unique(list(results_emotion.values()), return_counts = True)
[12]:
(array(['anger', 'fear', 'joy', 'love', 'sadness', 'surprise'], dtype='<U8'),
 array([315,  66,  10,  21,  28,  12]))
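The counts above can be turned into shares to see how balanced the induced lexicon is. A small sketch using only the counts printed in the output above; the variable names are illustrative.

```python
import numpy as np

labels = np.array(['anger', 'fear', 'joy', 'love', 'sadness', 'surprise'])
counts = np.array([315, 66, 10, 21, 28, 12])

# Share of the induced words assigned to each label.
shares = counts / counts.sum()
dominant = labels[np.argmax(counts)]

for label, share in zip(labels, shares):
    print(f'{label:<9}{share:.1%}')
```

Here roughly 70% of the 452 words land in `anger`. The docstring notes that `similarity_power` affects how balanced the outcome is, so it may be worth varying when the result is this skewed.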
[13]:
results_emotion
[13]:
{'sebal': 'anger',
 'gesture': 'anger',
 'se7': 'anger',
 'ziraa': 'anger',
 'mantepp': 'anger',
 'mesem': 'anger',
 'nggapapa': 'anger',
 'maen2': 'anger',
 'gacocok': 'anger',
 'jeongwoo': 'anger',
 'bergelora': 'anger',
 'mereda': 'anger',
 'skeptis': 'anger',
 'gebus': 'anger',
 'tyrion': 'anger',
 'memuncak': 'anger',
 'mewabah': 'anger',
 'mengenaskan': 'anger',
 'kesasar': 'anger',
 'kepedean': 'anger',
 'annoying': 'anger',
 'awkward': 'fear',
 'scary': 'fear',
 'handsome': 'anger',
 'nervous': 'fear',
 'cringe': 'fear',
 'menyampah': 'fear',
 'kelakar': 'anger',
 'cute': 'anger',
 'cuak': 'fear',
 'bodoh': 'anger',
 'bangang': 'anger',
 'bebal': 'anger',
 'bodo': 'anger',
 'noob': 'anger',
 'bengap': 'anger',
 'celaka': 'anger',
 'biadap': 'anger',
 'pukimak': 'anger',
 'berang': 'anger',
 'buru': 'anger',
 'nerus': 'anger',
 'kangsar': 'anger',
 'lipis': 'anger',
 'pilah': 'anger',
 'besut': 'anger',
 'krai': 'anger',
 'klawang': 'anger',
 'ketil': 'anger',
 'amuk': 'anger',
 'mbatin': 'anger',
 'sebarin': 'anger',
 'sebarisan': 'anger',
 'ngalami': 'anger',
 'tikt': 'anger',
 'diharga': 'anger',
 'threesome': 'anger',
 'shizuka': 'anger',
 'bokondini': 'anger',
 'mendidih': 'anger',
 'mengental': 'anger',
 'sebati': 'anger',
 'mengembang': 'anger',
 'layu': 'anger',
 'kecoklatan': 'anger',
 'matang': 'anger',
 'meresap': 'anger',
 'mengering': 'anger',
 'direbus': 'anger',
 'pengecut': 'anger',
 'bajingan': 'anger',
 'pembohong': 'anger',
 'pecundang': 'anger',
 'dungu': 'anger',
 'pemberani': 'anger',
 'negarawan': 'anger',
 'jahil': 'anger',
 'biadab': 'anger',
 'provokator': 'anger',
 'bengang': 'anger',
 'menyirap': 'fear',
 'meluat': 'anger',
 'frust': 'fear',
 'rimas': 'fear',
 'annoyed': 'fear',
 'lonely': 'fear',
 'berdukacita': 'anger',
 'menyakitimu': 'anger',
 'bersinggungan': 'anger',
 'bermesra': 'anger',
 'meridhoi': 'anger',
 'menyelubungi': 'anger',
 'empukk': 'anger',
 'berserban': 'anger',
 'diracuni': 'anger',
 'dibayangi': 'anger',
 'jengkel': 'anger',
 'gugup': 'anger',
 'dibiasain': 'anger',
 'mubazir': 'anger',
 'amnesia': 'anger',
 'psikopat': 'anger',
 'gumoh': 'anger',
 'diurusin': 'anger',
 'ngangenin': 'anger',
 'purging': 'anger',
 'babi': 'anger',
 'sial': 'anger',
 'kimak': 'anger',
 'anjing': 'anger',
 'pundek': 'anger',
 'cibai': 'anger',
 'setan': 'anger',
 'lembu': 'anger',
 'pedar': 'anger',
 'sanwya': 'anger',
 'qabaya': 'anger',
 '5pac': 'anger',
 'wa082336409906': 'anger',
 'mpibg': 'anger',
 'honachahthu': 'anger',
 'unieleven': 'anger',
 'mengepilkan': 'anger',
 'ciknorzaidi': 'anger',
 'benci': 'anger',
 'jijik': 'fear',
 'kagum': 'surprise',
 'geli': 'anger',
 'insecure': 'fear',
 'geram': 'anger',
 'respect': 'anger',
 'jealous': 'fear',
 'marah': 'anger',
 'maki': 'anger',
 'merajuk': 'anger',
 'marah2': 'anger',
 'perli': 'anger',
 'jeles': 'fear',
 'tegur': 'anger',
 'kecam': 'anger',
 'cemburu': 'surprise',
 'bitter': 'anger',
 'ngeri': 'fear',
 'serem': 'anger',
 'kocak': 'anger',
 'mantep': 'anger',
 'miris': 'fear',
 'ngeselin': 'anger',
 'nyesek': 'anger',
 'kesel': 'fear',
 'sebel': 'fear',
 'lebay': 'anger',
 'phobia': 'fear',
 'mendem': 'anger',
 'berideologi': 'anger',
 'niru': 'anger',
 'nyicip': 'anger',
 'ngerawat': 'anger',
 'riweuh': 'anger',
 'nmun': 'anger',
 'ngancam': 'anger',
 'bencong': 'anger',
 'anxiety': 'fear',
 'glasses': 'anger',
 'manners': 'anger',
 'satan': 'anger',
 'popularity': 'anger',
 'curl': 'anger',
 'impossible': 'anger',
 'mayb': 'anger',
 'sperm': 'anger',
 'nyumpah': 'anger',
 'fitnah': 'fear',
 'hoax': 'anger',
 'provokasi': 'anger',
 'kebencian': 'anger',
 'dusta': 'anger',
 'hoaks': 'anger',
 'kebohongan': 'anger',
 'bohong': 'anger',
 'rasis': 'anger',
 'ngibul': 'anger',
 'horror': 'fear',
 'horor': 'fear',
 'romance': 'anger',
 'day6': 'anger',
 'dokumenter': 'anger',
 'porno': 'anger',
 'anime': 'anger',
 'sinetron': 'anger',
 'drakor': 'anger',
 'dangdut': 'anger',
 'takut': 'fear',
 'risau': 'fear',
 'malu': 'fear',
 'khawatir': 'sadness',
 'segan': 'fear',
 'kecewa': 'sadness',
 'takot': 'fear',
 'bimbang': 'sadness',
 'takutnya': 'anger',
 'sedih': 'sadness',
 'panic': 'fear',
 'loud': 'anger',
 'impressed': 'anger',
 'expected': 'anger',
 'dying': 'anger',
 'rush': 'anger',
 'shitty': 'anger',
 'smoke': 'anger',
 'suck': 'anger',
 'cheap': 'anger',
 'emo': 'fear',
 'boring': 'fear',
 'gelabah': 'fear',
 'ngantok': 'fear',
 'syok': 'joy',
 'seronok': 'joy',
 'busy': 'fear',
 'serabut': 'fear',
 'syiok': 'anger',
 'sendu': 'fear',
 'riang': 'joy',
 'ceria': 'joy',
 'takbir': 'anger',
 'bersuka': 'anger',
 'emma': 'love',
 'barakah': 'anger',
 'telemovie': 'anger',
 'riuh': 'anger',
 'ria': 'anger',
 'khutbah': 'anger',
 'sebak': 'fear',
 'excited': 'fear',
 'terharu': 'surprise',
 'terliur': 'fear',
 'girang': 'joy',
 'ditikung': 'anger',
 'ambis': 'anger',
 'rafa': 'anger',
 'digangguin': 'anger',
 'nyiksa': 'anger',
 'maruk': 'anger',
 'tamvan': 'anger',
 'pengap': 'anger',
 'iklas': 'anger',
 'puas': 'joy',
 'muak': 'fear',
 'kenyang': 'fear',
 'lega': 'fear',
 'bosan': 'fear',
 'berbaloi': 'fear',
 'berpuas': 'sadness',
 'lelah': 'sadness',
 'bahagia': 'joy',
 'menyenangkan': 'sadness',
 'gelisah': 'sadness',
 'nyaman': 'sadness',
 'indah': 'sadness',
 'sukses': 'anger',
 'sehat': 'sadness',
 'damai': 'sadness',
 'suka': 'joy',
 'sukanya': 'anger',
 'doyan': 'anger',
 'demen': 'anger',
 'suke': 'anger',
 'gasuka': 'anger',
 'gemar': 'anger',
 'sukaa': 'anger',
 'prefer': 'anger',
 'happy': 'joy',
 'hepi': 'anger',
 'wish': 'anger',
 'nice': 'fear',
 'cerita': 'joy',
 'citer': 'fear',
 'cite': 'fear',
 'crita': 'anger',
 'kisah': 'love',
 'percakapan': 'anger',
 'tweet': 'fear',
 'drama': 'anger',
 'lagu': 'anger',
 'ceramah': 'anger',
 'cinta': 'love',
 'kebahagiaan': 'anger',
 'cintanya': 'anger',
 'cintaku': 'anger',
 'persahabatan': 'anger',
 'cintamu': 'anger',
 'kesabaran': 'anger',
 'dendam': 'sadness',
 'kesedihan': 'anger',
 'asa': 'sadness',
 'baby': 'love',
 'daddy': 'love',
 'mira': 'love',
 'princess': 'anger',
 'bella': 'love',
 'farah': 'love',
 'mommy': 'love',
 'sister': 'love',
 'mummy': 'love',
 'lisa': 'love',
 'love': 'love',
 'luv': 'love',
 'hate': 'anger',
 'thought': 'anger',
 'mean': 'anger',
 'want': 'anger',
 'see': 'anger',
 'need': 'anger',
 'hope': 'anger',
 'peace': 'anger',
 'syang': 'love',
 'noi': 'anger',
 'bilang2': 'anger',
 'syng': 'anger',
 'mut': 'anger',
 'ribbey': 'anger',
 'seneng2': 'anger',
 'butoset': 'anger',
 'manly': 'anger',
 'twet': 'anger',
 'syg': 'love',
 'sayangg': 'anger',
 'sayang': 'love',
 'bby': 'anger',
 'cntik': 'anger',
 'knl': 'anger',
 'anon': 'anger',
 'sistur': 'anger',
 'sayang2': 'anger',
 'bgus': 'anger',
 'rindukn': 'love',
 'ajeb2an': 'anger',
 'hshakjsjsbs': 'anger',
 'miliknyamencatat': 'anger',
 'p6a': 'anger',
 'ahsjahhaa': 'anger',
 'diwajibk': 'anger',
 'protese': 'anger',
 'botaqin': 'anger',
 'kruntel': 'anger',
 'rindu': 'love',
 'sayangku': 'anger',
 'sayangkan': 'anger',
 'sayangnya': 'love',
 'disayang': 'anger',
 'moody': 'fear',
 'rindukan': 'love',
 'merindui': 'anger',
 'takutkan': 'anger',
 'banggakan': 'anger',
 'cintakan': 'anger',
 'perbuat': 'anger',
 'merindukan': 'anger',
 'ceraikan': 'anger',
 'jumpai': 'anger',
 'rindunya': 'fear',
 'teringat': 'fear',
 'rinduu': 'fear',
 'lapar': 'fear',
 'kempunan': 'fear',
 'teringin': 'fear',
 'kangen': 'fear',
 'confuse': 'fear',
 'stress': 'sadness',
 'letih': 'fear',
 'penat': 'fear',
 'stres': 'sadness',
 'mengantuk': 'fear',
 'tertekan': 'sadness',
 'terganggu': 'sadness',
 'tertipu': 'surprise',
 'keliru': 'sadness',
 'mengeluh': 'sadness',
 'merosot': 'anger',
 'susut': 'anger',
 'terjebak': 'sadness',
 'terpengaruh': 'surprise',
 'kesal': 'sadness',
 'terkejut': 'surprise',
 'bersalah': 'sadness',
 'berdosa': 'fear',
 'dihargai': 'sadness',
 'janggal': 'anger',
 'resah': 'sadness',
 'kesepian': 'sadness',
 'gundah': 'anger',
 'goyah': 'anger',
 'disakiti': 'anger',
 'takjub': 'anger',
 'sengsara': 'sadness',
 'seram': 'fear',
 'menyebalkan': 'anger',
 'merana': 'sadness',
 'melarat': 'anger',
 'angkuh': 'anger',
 'rakus': 'anger',
 'terpuruk': 'anger',
 'pengsan': 'surprise',
 'tertido': 'anger',
 'pitam': 'anger',
 'terlelap': 'anger',
 'terberak': 'anger',
 'nanges': 'anger',
 'mengamuk': 'anger',
 'tdoq': 'anger',
 'termuntah': 'anger',
 'tidor': 'anger',
 'bangga': 'surprise',
 'surprise': 'surprise',
 'suprise': 'anger',
 'makan2': 'anger',
 'attention': 'anger',
 'kejutan': 'anger',
 'assignment': 'fear',
 'comeback': 'anger',
 'chance': 'fear',
 'homework': 'anger',
 'appointment': 'anger',
 'wtf': 'surprise',
 'huh': 'anger',
 'seriously': 'anger',
 'omg': 'anger',
 'aik': 'anger',
 'wth': 'anger',
 'shit': 'anger',
 'apoo': 'fear',
 'hah': 'anger',
 'damn': 'anger',
 'stun': 'surprise',
 'pinafsueun': 'anger',
 'neelehh': 'anger',
 'rudgard': 'anger',
 '016344981': 'anger',
 'pramaandika': 'anger',
 'hamidibahawa': 'anger',
 'spesialers': 'anger',
 'superpignan': 'anger',
 '082187486748': 'anger',
 'tertanya2': 'surprise',
 'terperanjat': 'anger',
 'cubaa': 'anger',
 'stuju': 'anger',
 'stayback': 'anger',
 'cakaplah': 'anger',
 'melebih': 'anger',
 'tanyaa': 'anger',
 'ngandung': 'anger'}

propagate graph#

def propagate_graph(
    lexicon,
    wordvector,
    pool_size = 10,
    top_n = 20,
    similarity_power = 10.0,
    normalization = True,
    soft = False,
    silent = False,
):

    """
    Graph propagation method adapted from Velikovich, Leonid, et al. "The viability of web-derived polarity lexicons." http://www.aclweb.org/anthology/N10-1119

    Parameters
    ----------

    lexicon: dict
        curated lexicon from domain experts, {'label1': [str], 'label2': [str]}.
    wordvector: object
        wordvector interface object.
    pool_size: int, optional (default=10)
        pick the top `pool_size` words for each lexicon.
    top_n: int, optional (default=20)
        the `top_n` entries for each vector will be multiplied by `similarity_power`.
    similarity_power: float, optional (default=10.0)
        extra score for `top_n`; a lower value induces less bias but has a higher chance of an unbalanced outcome.
    normalization: bool, optional (default=True)
        normalize word vectors using the L2 norm. L2 is good at penalizing skewed vectors.
    soft: bool, optional (default=False)
        if True, a word not in the dictionary will be replaced with the nearest word by Jaro-Winkler ratio.
        if False, an exception is raised if a word is not in the dictionary.
    silent: bool, optional (default=False)
        if True, will not print any logs.

    Returns
    -------
    tuple: (labels[argmax(scores, axis = 1)], scores, labels)
    """
[14]:
%%time

results_emotion, scores_emotion, labels_emotion = malaya.lexicon.propagate_graph(
    emotion_lexicon,
    wordvector,
    pool_size = 10,
)
populating nearest words from wordvector
populating vectors from populated nearest words
propagate graph from populated nearest words
100%|██████████| 452/452 [00:00<00:00, 1830.24it/s]
CPU times: user 16.5 s, sys: 2.2 s, total: 18.7 s
Wall time: 11.8 s

[15]:
np.unique(list(results_emotion.values()), return_counts = True)
[15]:
(array(['anger', 'fear', 'joy', 'love', 'sadness', 'surprise'], dtype='<U8'),
 array([149,  61,  49,  69,  46,  78]))
[16]:
results_emotion
[16]:
{'sebal': 'anger',
 'gesture': 'fear',
 'se7': 'anger',
 'ziraa': 'anger',
 'mantepp': 'anger',
 'mesem': 'fear',
 'nggapapa': 'anger',
 'maen2': 'anger',
 'gacocok': 'fear',
 'jeongwoo': 'anger',
 'bergelora': 'anger',
 'mereda': 'anger',
 'skeptis': 'anger',
 'gebus': 'love',
 'tyrion': 'fear',
 'memuncak': 'anger',
 'mewabah': 'anger',
 'mengenaskan': 'anger',
 'kesasar': 'love',
 'kepedean': 'anger',
 'annoying': 'anger',
 'awkward': 'fear',
 'scary': 'fear',
 'handsome': 'love',
 'nervous': 'fear',
 'cringe': 'anger',
 'menyampah': 'anger',
 'kelakar': 'anger',
 'cute': 'love',
 'cuak': 'fear',
 'bodoh': 'anger',
 'bangang': 'anger',
 'bebal': 'anger',
 'bodo': 'anger',
 'noob': 'anger',
 'bengap': 'anger',
 'celaka': 'anger',
 'biadap': 'anger',
 'pukimak': 'anger',
 'berang': 'anger',
 'buru': 'joy',
 'nerus': 'anger',
 'kangsar': 'fear',
 'lipis': 'anger',
 'pilah': 'fear',
 'besut': 'anger',
 'krai': 'anger',
 'klawang': 'anger',
 'ketil': 'anger',
 'amuk': 'anger',
 'mbatin': 'love',
 'sebarin': 'anger',
 'sebarisan': 'fear',
 'ngalami': 'joy',
 'tikt': 'anger',
 'diharga': 'anger',
 'threesome': 'anger',
 'shizuka': 'anger',
 'bokondini': 'anger',
 'mendidih': 'anger',
 'mengental': 'anger',
 'sebati': 'surprise',
 'mengembang': 'anger',
 'layu': 'surprise',
 'kecoklatan': 'anger',
 'matang': 'sadness',
 'meresap': 'surprise',
 'mengering': 'anger',
 'direbus': 'anger',
 'pengecut': 'anger',
 'bajingan': 'fear',
 'pembohong': 'fear',
 'pecundang': 'fear',
 'dungu': 'fear',
 'pemberani': 'anger',
 'negarawan': 'anger',
 'jahil': 'anger',
 'biadab': 'fear',
 'provokator': 'fear',
 'bengang': 'anger',
 'menyirap': 'joy',
 'meluat': 'surprise',
 'frust': 'surprise',
 'rimas': 'sadness',
 'annoyed': 'anger',
 'lonely': 'love',
 'berdukacita': 'anger',
 'menyakitimu': 'surprise',
 'bersinggungan': 'anger',
 'bermesra': 'anger',
 'meridhoi': 'love',
 'menyelubungi': 'anger',
 'empukk': 'anger',
 'berserban': 'anger',
 'diracuni': 'anger',
 'dibayangi': 'fear',
 'jengkel': 'anger',
 'gugup': 'anger',
 'dibiasain': 'joy',
 'mubazir': 'anger',
 'amnesia': 'fear',
 'psikopat': 'fear',
 'gumoh': 'anger',
 'diurusin': 'fear',
 'ngangenin': 'joy',
 'purging': 'joy',
 'babi': 'anger',
 'sial': 'surprise',
 'kimak': 'surprise',
 'anjing': 'fear',
 'pundek': 'surprise',
 'cibai': 'surprise',
 'setan': 'fear',
 'lembu': 'anger',
 'pedar': 'anger',
 'sanwya': 'love',
 'qabaya': 'love',
 '5pac': 'love',
 'wa082336409906': 'love',
 'mpibg': 'love',
 'honachahthu': 'love',
 'unieleven': 'love',
 'mengepilkan': 'surprise',
 'ciknorzaidi': 'love',
 'benci': 'anger',
 'jijik': 'fear',
 'kagum': 'surprise',
 'geli': 'fear',
 'insecure': 'sadness',
 'geram': 'sadness',
 'respect': 'love',
 'jealous': 'anger',
 'marah': 'anger',
 'maki': 'surprise',
 'merajuk': 'surprise',
 'marah2': 'surprise',
 'perli': 'joy',
 'jeles': 'love',
 'tegur': 'surprise',
 'kecam': 'fear',
 'cemburu': 'sadness',
 'bitter': 'surprise',
 'ngeri': 'fear',
 'serem': 'anger',
 'kocak': 'fear',
 'mantep': 'fear',
 'miris': 'anger',
 'ngeselin': 'fear',
 'nyesek': 'fear',
 'kesel': 'sadness',
 'sebel': 'anger',
 'lebay': 'fear',
 'phobia': 'fear',
 'mendem': 'joy',
 'berideologi': 'anger',
 'niru': 'anger',
 'nyicip': 'anger',
 'ngerawat': 'love',
 'riweuh': 'joy',
 'nmun': 'anger',
 'ngancam': 'surprise',
 'bencong': 'fear',
 'anxiety': 'fear',
 'glasses': 'love',
 'manners': 'fear',
 'satan': 'fear',
 'popularity': 'love',
 'curl': 'surprise',
 'impossible': 'fear',
 'mayb': 'love',
 'sperm': 'anger',
 'nyumpah': 'fear',
 'fitnah': 'fear',
 'hoax': 'fear',
 'provokasi': 'anger',
 'kebencian': 'love',
 'dusta': 'love',
 'hoaks': 'anger',
 'kebohongan': 'love',
 'bohong': 'anger',
 'rasis': 'sadness',
 'ngibul': 'anger',
 'horror': 'fear',
 'horor': 'joy',
 'romance': 'love',
 'day6': 'anger',
 'dokumenter': 'anger',
 'porno': 'anger',
 'anime': 'joy',
 'sinetron': 'joy',
 'drakor': 'joy',
 'dangdut': 'joy',
 'takut': 'fear',
 'risau': 'surprise',
 'malu': 'fear',
 'khawatir': 'sadness',
 'segan': 'anger',
 'kecewa': 'sadness',
 'takot': 'surprise',
 'bimbang': 'sadness',
 'takutnya': 'love',
 'sedih': 'sadness',
 'panic': 'fear',
 'loud': 'love',
 'impressed': 'surprise',
 'expected': 'surprise',
 'dying': 'joy',
 'rush': 'surprise',
 'shitty': 'anger',
 'smoke': 'surprise',
 'suck': 'love',
 'cheap': 'fear',
 'emo': 'anger',
 'boring': 'joy',
 'gelabah': 'surprise',
 'ngantok': 'surprise',
 'syok': 'joy',
 'seronok': 'joy',
 'busy': 'joy',
 'serabut': 'sadness',
 'syiok': 'surprise',
 'sendu': 'joy',
 'riang': 'joy',
 'ceria': 'joy',
 'takbir': 'joy',
 'bersuka': 'love',
 'emma': 'love',
 'barakah': 'anger',
 'telemovie': 'joy',
 'riuh': 'surprise',
 'ria': 'joy',
 'khutbah': 'joy',
 'sebak': 'sadness',
 'excited': 'joy',
 'terharu': 'surprise',
 'terliur': 'surprise',
 'girang': 'joy',
 'ditikung': 'anger',
 'ambis': 'anger',
 'rafa': 'anger',
 'digangguin': 'anger',
 'nyiksa': 'fear',
 'maruk': 'love',
 'tamvan': 'anger',
 'pengap': 'anger',
 'iklas': 'love',
 'puas': 'joy',
 'muak': 'sadness',
 'kenyang': 'joy',
 'lega': 'joy',
 'bosan': 'love',
 'berbaloi': 'sadness',
 'berpuas': 'sadness',
 'lelah': 'sadness',
 'bahagia': 'joy',
 'menyenangkan': 'sadness',
 'gelisah': 'sadness',
 'nyaman': 'sadness',
 'indah': 'sadness',
 'sukses': 'sadness',
 'sehat': 'sadness',
 'damai': 'sadness',
 'suka': 'joy',
 'sukanya': 'love',
 'doyan': 'fear',
 'demen': 'anger',
 'suke': 'love',
 'gasuka': 'love',
 'gemar': 'love',
 'sukaa': 'love',
 'prefer': 'love',
 'happy': 'joy',
 'hepi': 'love',
 'wish': 'love',
 'nice': 'surprise',
 'cerita': 'joy',
 'citer': 'surprise',
 'cite': 'surprise',
 'crita': 'surprise',
 'kisah': 'love',
 'percakapan': 'fear',
 'tweet': 'love',
 'drama': 'joy',
 'lagu': 'joy',
 'ceramah': 'surprise',
 'cinta': 'love',
 'kebahagiaan': 'sadness',
 'cintanya': 'anger',
 'cintaku': 'sadness',
 'persahabatan': 'joy',
 'cintamu': 'anger',
 'kesabaran': 'fear',
 'dendam': 'sadness',
 'kesedihan': 'sadness',
 'asa': 'sadness',
 'baby': 'love',
 'daddy': 'love',
 'mira': 'love',
 'princess': 'love',
 'bella': 'joy',
 'farah': 'surprise',
 'mommy': 'love',
 'sister': 'surprise',
 'mummy': 'love',
 'lisa': 'joy',
 'love': 'love',
 'luv': 'love',
 'hate': 'surprise',
 'thought': 'surprise',
 'mean': 'surprise',
 'want': 'joy',
 'see': 'surprise',
 'need': 'joy',
 'hope': 'surprise',
 'peace': 'anger',
 'syang': 'love',
 'noi': 'fear',
 'bilang2': 'anger',
 'syng': 'anger',
 'mut': 'fear',
 'ribbey': 'anger',
 'seneng2': 'anger',
 'butoset': 'anger',
 'manly': 'anger',
 'twet': 'anger',
 'syg': 'love',
 'sayangg': 'love',
 'sayang': 'love',
 'bby': 'surprise',
 'cntik': 'anger',
 'knl': 'surprise',
 'anon': 'anger',
 'sistur': 'surprise',
 'sayang2': 'surprise',
 'bgus': 'anger',
 'rindukn': 'love',
 'ajeb2an': 'surprise',
 'hshakjsjsbs': 'anger',
 'miliknyamencatat': 'anger',
 'p6a': 'anger',
 'ahsjahhaa': 'surprise',
 'diwajibk': 'anger',
 'protese': 'surprise',
 'botaqin': 'surprise',
 'kruntel': 'anger',
 'rindu': 'love',
 'sayangku': 'anger',
 'sayangkan': 'love',
 'sayangnya': 'fear',
 'disayang': 'joy',
 'moody': 'surprise',
 'rindukan': 'love',
 'merindui': 'surprise',
 'takutkan': 'surprise',
 'banggakan': 'surprise',
 'cintakan': 'surprise',
 'perbuat': 'surprise',
 'merindukan': 'joy',
 'ceraikan': 'surprise',
 'jumpai': 'anger',
 'rindunya': 'surprise',
 'teringat': 'surprise',
 'rinduu': 'surprise',
 'lapar': 'sadness',
 'kempunan': 'surprise',
 'teringin': 'joy',
 'kangen': 'joy',
 'confuse': 'anger',
 'stress': 'sadness',
 'letih': 'joy',
 'penat': 'joy',
 'stres': 'sadness',
 'mengantuk': 'joy',
 'tertekan': 'sadness',
 'terganggu': 'sadness',
 'tertipu': 'sadness',
 'keliru': 'sadness',
 'mengeluh': 'sadness',
 'merosot': 'sadness',
 'susut': 'surprise',
 'terjebak': 'sadness',
 'terpengaruh': 'sadness',
 'kesal': 'sadness',
 'terkejut': 'surprise',
 'bersalah': 'sadness',
 'berdosa': 'anger',
 'dihargai': 'sadness',
 'janggal': 'surprise',
 'resah': 'sadness',
 'kesepian': 'sadness',
 'gundah': 'surprise',
 'goyah': 'surprise',
 'disakiti': 'anger',
 'takjub': 'anger',
 'sengsara': 'sadness',
 'seram': 'anger',
 'menyebalkan': 'fear',
 'merana': 'surprise',
 'melarat': 'surprise',
 'angkuh': 'anger',
 'rakus': 'anger',
 'terpuruk': 'anger',
 'pengsan': 'surprise',
 'tertido': 'anger',
 'pitam': 'surprise',
 'terlelap': 'anger',
 'terberak': 'surprise',
 'nanges': 'surprise',
 'mengamuk': 'surprise',
 'tdoq': 'anger',
 'termuntah': 'surprise',
 'tidor': 'anger',
 'bangga': 'anger',
 'surprise': 'surprise',
 'suprise': 'love',
 'makan2': 'fear',
 'attention': 'fear',
 'kejutan': 'fear',
 'assignment': 'fear',
 'comeback': 'fear',
 'chance': 'love',
 'homework': 'fear',
 'appointment': 'fear',
 'wtf': 'surprise',
 'huh': 'love',
 'seriously': 'love',
 'omg': 'love',
 'aik': 'love',
 'wth': 'love',
 'shit': 'anger',
 'apoo': 'anger',
 'hah': 'anger',
 'damn': 'love',
 'stun': 'surprise',
 'pinafsueun': 'anger',
 'neelehh': 'anger',
 'rudgard': 'anger',
 '016344981': 'anger',
 'pramaandika': 'anger',
 'hamidibahawa': 'love',
 'spesialers': 'anger',
 'superpignan': 'anger',
 '082187486748': 'anger',
 'tertanya2': 'anger',
 'terperanjat': 'anger',
 'cubaa': 'anger',
 'stuju': 'anger',
 'stayback': 'anger',
 'cakaplah': 'anger',
 'melebih': 'anger',
 'tanyaa': 'anger',
 'ngandung': 'anger'}
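Because `results_emotion` is overwritten by each call, comparing methods requires keeping each result under its own name. A sketch that measures agreement between two result dictionaries, using the first four entries printed above for `propagate_probabilistic` and `propagate_graph`; the function name `agreement` and the variable names are hypothetical.

```python
def agreement(results_a, results_b):
    """Fraction of shared words that receive the same label in both results."""
    shared = results_a.keys() & results_b.keys()
    if not shared:
        return 0.0
    same = sum(results_a[word] == results_b[word] for word in shared)
    return same / len(shared)

# First four entries of each printed output.
probabilistic = {'sebal': 'anger', 'gesture': 'anger', 'se7': 'anger', 'ziraa': 'anger'}
graph = {'sebal': 'anger', 'gesture': 'fear', 'se7': 'anger', 'ziraa': 'anger'}

rate = agreement(probabilistic, graph)
```

Words on which the methods disagree, such as `gesture` here, are natural candidates for manual review before the induced lexicon is used.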