Lexicon Generator#
This tutorial is available as an IPython notebook at Malaya/example/lexicon.
[1]:
%%time
import malaya
import numpy as np
CPU times: user 4.47 s, sys: 1.01 s, total: 5.48 s
Wall time: 5.37 s
Why lexicon#
A lexicon is a list of words associated with a particular domain, for example words that signal negative or positive sentiment.
For example, the word suka can represent positive sentiment: if suka appears in a sentence, we can say that the sentence carries positive sentiment.
Lexicon-based classification is a common and very fast way to classify text. It is also fairly naive, because a word can be semantically ambiguous.
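To make the idea concrete, here is a minimal sketch of lexicon-based classification. The `toy_lexicon` words and the `classify` helper are hypothetical and are not part of Malaya; the sketch only counts lexicon hits per label.

```python
from collections import Counter

# Hypothetical toy lexicon, in the same {'label': [words]} shape used in this tutorial.
toy_lexicon = {
    'positive': ['suka', 'gembira', 'bagus'],
    'negative': ['benci', 'marah', 'teruk'],
}

def classify(text, lexicon, default='neutral'):
    """Label a text by counting lexicon hits; ties and texts with no hits get `default`."""
    # Map each word to its label for fast lookup.
    word_to_label = {word: label for label, words in lexicon.items() for word in words}
    counts = Counter(
        word_to_label[token] for token in text.lower().split() if token in word_to_label
    )
    if not counts:
        return default
    ranked = counts.most_common(2)
    # Two labels with the same number of hits: refuse to decide.
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return default
    return ranked[0][0]

print(classify('saya suka makanan ini', toy_lexicon))        # positive
print(classify('saya tidak suka makanan ini', toy_lexicon))  # positive
```

The second call shows the naivety: the negation tidak is ignored, so the sentence is still labelled positive just because suka is present.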
sentiment lexicon#
Malaya provides a small sample sentiment lexicon:
[6]:
sentiment_lexicon = malaya.lexicon.sentiment
sentiment_lexicon.keys()
[6]:
dict_keys(['negative', 'positive'])
emotion lexicon#
Malaya provides a small sample emotion lexicon:
[3]:
emotion_lexicon = malaya.lexicon.emotion
emotion_lexicon.keys()
[3]:
dict_keys(['anger', 'fear', 'joy', 'love', 'sadness', 'surprise'])
Lexicon generator#
Building a lexicon is time consuming, because it requires domain experts to collect the words related to each domain. With the help of word vectors, we can induce additional words for specific domains from a small annotated lexicon. Why induce a lexicon from word vectors? Even though a word such as suka commonly represents positive sentiment, if the word vectors learnt suka in contexts of a different polarity, and its nearest words also carry a different polarity, then suka tends to be induced as negative sentiment.
Malaya provides a lexicon induction interface, built on top of Inducing Domain-Specific Sentiment Lexicons from Unlabeled Corpora.
Say you have a lexicon based on standard language (bahasa baku) and you want to find a similar lexicon for the social media context. You can use the malaya.lexicon interface for this. To use it, we must first load word vectors with malaya.wordvector.load.
We also need at least a small sample lexicon like this,
{'label1': ['word1', 'word2'], 'label2': ['word3', 'word4']}
There can be more than 2 labels; for example, malaya.lexicon.emotion has 6 different labels.
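Before inducing, it can help to check which seed words actually exist in the word vector vocabulary, since random_walk (documented further down) raises an exception for unknown words unless soft = True. The sketch below is only an illustration: `check_seed_lexicon` and the toy `vocabulary` set are hypothetical names, not part of Malaya, and the two-label minimum is an assumption read from the example format above.

```python
def check_seed_lexicon(lexicon, vocabulary):
    """Split each label's seed words into in-vocabulary and missing words."""
    # Assumption: a useful seed lexicon has at least two labels to contrast.
    if len(lexicon) < 2:
        raise ValueError('a seed lexicon needs at least two labels')
    known, missing = {}, {}
    for label, words in lexicon.items():
        known[label] = [word for word in words if word in vocabulary]
        missing[label] = [word for word in words if word not in vocabulary]
    return known, missing

# Hypothetical vocabulary; in practice this would come from the loaded word vectors.
vocabulary = {'suka', 'benci', 'marah'}
seed = {'positive': ['suka', 'gembira'], 'negative': ['benci']}
known, missing = check_seed_lexicon(seed, vocabulary)
print(known)    # {'positive': ['suka'], 'negative': ['benci']}
print(missing)  # {'positive': ['gembira'], 'negative': []}
```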
[5]:
vocab, embedded = malaya.wordvector.load(model = 'socialmedia')
wordvector = malaya.wordvector.WordVector(embedded, vocab)
random walk#
Random walk is the main technique used by the paper; you can read more in section 3.2, Propagating polarities from a seed set.
def random_walk(
    lexicon,
    wordvector,
    pool_size = 10,
    top_n = 20,
    similarity_power = 100.0,
    beta = 0.9,
    arccos = True,
    normalization = True,
    soft = False,
    silent = False,
):
    """
    Induce lexicon by using random walk technique, used in the paper, https://arxiv.org/pdf/1606.02820.pdf

    Parameters
    ----------
    lexicon: dict
        curated lexicon from expert domain, {'label1': [str], 'label2': [str]}.
    wordvector: object
        wordvector interface object.
    pool_size: int, optional (default=10)
        pick top-pool size from each lexicon.
    top_n: int, optional (default=20)
        top_n for each vector will be multiplied by `similarity_power`.
    similarity_power: float, optional (default=100.0)
        extra score for `top_n`; a lower value induces less bias but has a higher chance of an unbalanced outcome.
    beta: float, optional (default=0.9)
        penalty score; closer to 1.0 means less penalty. 0 < beta < 1.
    arccos: bool, optional (default=True)
        covariance distribution for embedded.dot(embedded.T). If False, covariance + 1.
    normalization: bool, optional (default=True)
        normalize word vectors using L2 norm. L2 is good to penalize skewed vectors.
    soft: bool, optional (default=False)
        if True, a word not in the dictionary will be replaced with the nearest word by Jaro-Winkler ratio.
        if False, it will throw an exception if a word is not in the dictionary.
    silent: bool, optional (default=False)
        if True, will not print any logs.

    Returns
    -------
    tuple: (labels[argmax(scores, axis = 1)], scores, labels)
    """
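The propagation idea behind the beta, arccos and normalization parameters can be sketched with NumPy alone. This is a simplified illustration under stated assumptions, not Malaya's implementation: the `propagate` helper, the 2-dimensional toy vectors and the fully connected graph are hypothetical (the paper restricts edges to nearest neighbours, and `top_n` / `similarity_power` are not modelled here).

```python
import numpy as np

def propagate(vectors, seed_indices, beta=0.9, n_iter=100):
    """Random-walk score propagation from seed words over a similarity graph."""
    # L2-normalize rows so that the dot product is cosine similarity.
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    cosine = np.clip(unit @ unit.T, -1.0, 1.0)
    # Edge weight arccos(-cosine): larger for more similar words, in [0, pi].
    edges = np.arccos(-cosine)
    np.fill_diagonal(edges, 0.0)  # no self-loops
    # Symmetric degree normalization of the edge matrix.
    inv_sqrt_degree = 1.0 / np.sqrt(edges.sum(axis=1))
    transition = edges * inv_sqrt_degree[:, None] * inv_sqrt_degree[None, :]
    # Seed vector: uniform mass on the seed words, zero elsewhere.
    seeds = np.zeros(len(vectors))
    seeds[seed_indices] = 1.0 / len(seed_indices)
    scores = np.full(len(vectors), 1.0 / len(vectors))
    for _ in range(n_iter):
        # beta close to 1.0 lets scores travel further from the seeds.
        scores = beta * transition @ scores + (1.0 - beta) * seeds
    return scores

# Hypothetical toy embedding: two similar "positive" and two similar "negative" words.
words = ['suka', 'gemar', 'benci', 'meluat']
vectors = np.array([[1.0, 0.1], [0.9, 0.2], [-1.0, 0.1], [-0.9, 0.3]])
labels = np.array(['negative', 'positive'])

neg = propagate(vectors, seed_indices=[2])  # seed: benci
pos = propagate(vectors, seed_indices=[0])  # seed: suka
scores = np.stack([neg, pos], axis=1)       # shape (n_words, n_labels)
induced = labels[np.argmax(scores, axis=1)]
print(dict(zip(words, induced.tolist())))
```

Each label gets its own walk started from its seed words, and every word takes the label whose walk reaches it with the highest score, which mirrors the labels[argmax(scores)] shape described in the Returns section.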
[5]:
%%time
results, scores, labels = malaya.lexicon.random_walk(sentiment_lexicon, wordvector, pool_size = 5)
populating nearest words from wordvector
populating vectors from populated nearest words
random walking from populated vectors
CPU times: user 1min 36s, sys: 16.1 s, total: 1min 52s
Wall time: 28.1 s
[6]:
np.unique(list(results.values()), return_counts = True)
[6]:
(array(['negative', 'positive'], dtype='<U8'), array([2260, 2922]))
[7]:
results
[7]:
{'serang': 'negative',
'cilegon': 'positive',
'culik': 'negative',
'tanjungpinang': 'positive',
'jenguk': 'negative',
'luka': 'negative',
'jerawat': 'negative',
'infeksi': 'negative',
'migrain': 'negative',
'penyakit': 'negative',
'penaklukan': 'negative',
'4ir': 'positive',
'renjer': 'positive',
'kezhaliman': 'positive',
'proklamator': 'positive',
'kelucahan': 'negative',
'pablisiti': 'positive',
'terjwp': 'positive',
'33100': 'positive',
'impos': 'positive',
'kritikan': 'negative',
'mandat': 'negative',
'teguran': 'negative',
'persepsi': 'negative',
'pembelaan': 'negative',
'muflis': 'negative',
'mempelajarinya': 'negative',
'melarat': 'positive',
'dihabisi': 'positive',
'kooperatif': 'positive',
'kelemahan': 'negative',
'keyakinan': 'positive',
'kehendak': 'negative',
'keburukan': 'negative',
'gerombolan': 'negative',
'kelakuan': 'negative',
'antek': 'negative',
'politikus': 'negative',
'ulah': 'negative',
'debu': 'negative',
'kotoran': 'negative',
'polusi': 'negative',
'kuman': 'negative',
'keringat': 'negative',
'sinis': 'negative',
'misterius': 'positive',
'menggemaskan': 'positive',
'emosional': 'negative',
'progresif': 'positive',
'bocor': 'negative',
'pecah': 'negative',
'retak': 'negative',
'rosak': 'negative',
'terbalik': 'negative',
'kekacauan': 'negative',
'penindasan': 'negative',
'perdebatan': 'negative',
'kesombongan': 'negative',
'pengamatan': 'negative',
'permusuhan': 'negative',
'ketidakadilan': 'negative',
'empati': 'negative',
'perpecahan': 'negative',
'menghasut': 'negative',
'menghukum': 'negative',
'memfitnah': 'negative',
'memaki': 'negative',
'memprovokasi': 'negative',
'bersedih': 'negative',
'mengalah': 'negative',
'terlena': 'negative',
'cemburu': 'negative',
'dikenang': 'negative',
'jatuh': 'negative',
'terjatuh': 'negative',
'putus': 'negative',
'hilang': 'negative',
'hancur': 'negative',
'dipakai': 'negative',
'digunakan': 'negative',
'dikonsumsi': 'negative',
'dipake': 'negative',
'diminum': 'negative',
'harapan': 'negative',
'kebahagiaan': 'positive',
'impian': 'positive',
'cita2': 'negative',
'senyuman': 'positive',
'beban': 'negative',
'resiko': 'negative',
'kerugian': 'negative',
'tekanan': 'negative',
'risiko': 'negative',
'mencaci': 'negative',
'dicaci': 'negative',
'mengejek': 'negative',
'disia': 'negative',
'bengkak': 'negative',
'berair': 'negative',
'lebam': 'negative',
'lenguh': 'negative',
'toksik': 'negative',
'toksin': 'negative',
'pepejal': 'positive',
'kafein': 'negative',
'buih': 'negative',
'terperangkap': 'negative',
'dijumpai': 'negative',
'tersimpan': 'negative',
'tergabung': 'negative',
'bertarung': 'negative',
'rahsia': 'negative',
'cabaran': 'positive',
'petua': 'negative',
'persamaan': 'negative',
'punca': 'negative',
'fail': 'negative',
'failed': 'negative',
'approve': 'negative',
'consider': 'negative',
'freehair': 'negative',
'munafik': 'negative',
'dungu': 'negative',
'liberal': 'negative',
'rasis': 'negative',
'konservatif': 'negative',
'parasit': 'negative',
'klorofil': 'negative',
'klorin': 'positive',
'fibroid': 'negative',
'antibakteri': 'negative',
'menyesal': 'negative',
'nyesal': 'negative',
'terkejut': 'positive',
'terliur': 'positive',
'sebak': 'positive',
'pemberontakan': 'negative',
'kudeta': 'negative',
'feminisme': 'negative',
'keragaman': 'negative',
'kesangsian': 'negative',
'nelponke': 'positive',
'datebook': 'negative',
'4dalzk': 'negative',
'ketidakpentinganku': 'positive',
'fasis': 'negative',
'portugis': 'negative',
'ateisme': 'positive',
'illuminati': 'negative',
'malang': 'negative',
'depok': 'positive',
'kediri': 'positive',
'semarang': 'positive',
'cirebon': 'positive',
'mendatangkan': 'negative',
'menimbulkan': 'negative',
'memupuk': 'negative',
'mengundang': 'negative',
'menghianati': 'negative',
'kejatuhan': 'negative',
'pelemahan': 'negative',
'lonjakan': 'negative',
'ketiadaan': 'negative',
'pengubahan': 'negative',
'memusnahkan': 'negative',
'mengadopsi': 'negative',
'merampas': 'negative',
'mengangkut': 'negative',
'mengarahkan': 'negative',
'kemarahan': 'negative',
'keimanan': 'positive',
'penderitaan': 'negative',
'wabak': 'negative',
'letupan': 'negative',
'jangkitan': 'negative',
'serangan': 'negative',
'jenayah': 'negative',
'tragedi': 'negative',
'peristiwa': 'negative',
'insiden': 'negative',
'kejadian': 'negative',
'menganggur': 'negative',
'dioptimalkan': 'positive',
'menyakitimu': 'positive',
'bernafsu': 'positive',
'derhaka': 'negative',
'menakan': 'negative',
'sulung': 'positive',
'bongsu': 'negative',
'teruna': 'negative',
'merungut': 'negative',
'komplen': 'negative',
'giveup': 'negative',
'melalak': 'negative',
'melawa': 'negative',
'berdarah': 'negative',
'bengkok': 'negative',
'layu': 'negative',
'ngeri': 'negative',
'serem': 'negative',
'kocak': 'negative',
'mantep': 'positive',
'miris': 'negative',
'menghina': 'negative',
'menuduh': 'negative',
'membenci': 'negative',
'menyalahkan': 'negative',
'menyekat': 'negative',
'menggenjot': 'negative',
'mengevaluasi': 'negative',
'mengalirkan': 'negative',
'melemahkan': 'negative',
'keengganan': 'negative',
'vendon': 'positive',
'koturno': 'positive',
'spesialisasikan': 'positive',
"'pembongkaran": 'positive',
'neraka': 'negative',
'surga': 'negative',
'syurga': 'positive',
'kubur': 'negative',
'mesjid': 'negative',
'gerun': 'negative',
'betui2': 'positive',
'bankrup': 'positive',
'gamak': 'positive',
'mendobi': 'negative',
'penghapusan': 'negative',
'proyeksi': 'negative',
'realisasi': 'negative',
'pengendalian': 'negative',
'maraknya': 'negative',
'strike': 'negative',
'adop': 'positive',
'seats': 'positive',
'sponsored': 'positive',
'script': 'positive',
'pengangguran': 'negative',
'pns': 'negative',
'koruptor': 'negative',
'oposisi': 'negative',
'stunting': 'negative',
'mengamuk': 'negative',
'membebel': 'negative',
'menjerit': 'negative',
'meroyan': 'negative',
'bergaduh': 'negative',
'keruntuhan': 'negative',
'maxxie': 'positive',
'081266267925': 'positive',
'evvadiki': 'positive',
'digibdulu': 'positive',
'kekuatan': 'negative',
'kepercayaan': 'positive',
'kesadaran': 'negative',
'hasrat': 'negative',
'radikal': 'negative',
'sekuler': 'negative',
'intoleran': 'negative',
'sosialis': 'negative',
'penagih': 'negative',
'penagihan': 'positive',
'professor': 'negative',
'keldai': 'negative',
'penebar': 'negative',
'menghentam': 'negative',
'jagungbakar': 'positive',
'pembakaram': 'positive',
'bajucoplemurah': 'positive',
'ma3i': 'positive',
'pembakar': 'negative',
'limpahan': 'positive',
'melarutkan': 'positive',
'pencegah': 'negative',
'merendam': 'positive',
'membakar': 'negative',
'mengikat': 'negative',
'membersihkan': 'positive',
'menghancurkan': 'negative',
'pembakaran': 'negative',
'penat': 'negative',
'letih': 'negative',
'stress': 'negative',
'bosan': 'negative',
'mengantuk': 'negative',
'binasa': 'negative',
'membengkak': 'positive',
'terpejam': 'positive',
'menggumpal': 'positive',
'bergoyang': 'negative',
'diasingkan': 'negative',
'difokuskan': 'negative',
'melindungimu': 'positive',
'terselamatkan': 'positive',
'tertid': 'positive',
'mengelak': 'negative',
'menyiasat': 'negative',
'menghindar': 'negative',
'mengelakkan': 'negative',
'dilepaskan': 'negative',
'tempur': 'negative',
'migas': 'negative',
'nuklir': 'negative',
'manufaktur': 'negative',
'ilegal': 'negative',
'discrimination': 'negative',
'dramaticnyer': 'positive',
'disuwek': 'positive',
'6066030438': 'positive',
'fahdy': 'positive',
'merugikan': 'negative',
'meresahkan': 'negative',
'menimpa': 'negative',
'meyakinkan': 'positive',
'membanggakan': 'positive',
'membingungkan': 'negative',
'diperlihatkan': 'negative',
'dilakukannya': 'positive',
'disegani': 'positive',
'dititipkan': 'negative',
'fatal': 'positive',
'provokatif': 'positive',
'memprihatinkan': 'positive',
'ambisius': 'positive',
'mendasar': 'positive',
'peredaran': 'negative',
'sirkulasi': 'negative',
'pembuluh': 'negative',
'murka': 'negative',
'dilaknat': 'negative',
'diijabah': 'negative',
'berkehendak': 'negative',
'terusik': 'positive',
'virus': 'negative',
'hama': 'negative',
'stroke': 'negative',
'perkauman': 'negative',
'lgbt': 'negative',
'icerd': 'negative',
'rasuah': 'negative',
'politik': 'negative',
'kehancuran': 'negative',
'kedewasaan': 'negative',
'penjajahan': 'negative',
'menurun': 'negative',
'meningkat': 'negative',
'berkurang': 'negative',
'membaik': 'negative',
'meroket': 'negative',
'mengetepikan': 'negative',
'kuimplankan': 'positive',
'mountaineer': 'positive',
'chapalein': 'positive',
'40365036': 'positive',
'penjara': 'negative',
'lokap': 'negative',
'mengekori': 'negative',
'c4uf5s': 'positive',
'085602974529': 'positive',
'kebiasqan': 'positive',
'teamgoals': 'positive',
'bimbang': 'negative',
'khawatir': 'negative',
'kesal': 'positive',
'sungkan': 'negative',
'pemabuk': 'negative',
'adibrunner': 'positive',
'eppii': 'positive',
'3s3bju': 'positive',
'jakwir': 'positive',
'pemukul': 'negative',
'seminaronline7': 'positive',
'gemoksaya': 'positive',
'gabisabisa': 'positive',
'berocorak': 'positive',
'penentangan': 'negative',
'livescreen': 'positive',
'meliriktelegramdan': 'positive',
'081334186600': 'positive',
'indox': 'positive',
'terdesak': 'negative',
'desperate': 'negative',
'bebal': 'negative',
'fobia': 'negative',
'nekad': 'positive',
'tahi': 'negative',
'taik': 'negative',
'bangkai': 'negative',
'seekor': 'negative',
'ulat': 'negative',
'kesusahan': 'negative',
'kesedihan': 'negative',
'keraguan': 'negative',
'berdepan': 'negative',
'dikaitkan': 'negative',
'dimulakan': 'negative',
'mengesan': 'negative',
'dikejutkan': 'negative',
'tamak': 'negative',
'biadap': 'negative',
'bongkak': 'negative',
'angkuh': 'negative',
'pendarahan': 'negative',
'alahan': 'negative',
'pembengkakan': 'negative',
'kegatalan': 'negative',
'komplikasi': 'negative',
'dirosakkan': 'negative',
'sajadahmasjid': 'positive',
'wisatalumajang': 'positive',
'dsmua': 'positive',
'otogod': 'positive',
'kekufuran': 'negative',
'auratnya': 'positive',
'kebhinekaan': 'positive',
'kekuatannya': 'negative',
'maksiat': 'negative',
'zina': 'negative',
'provokasi': 'negative',
'syirik': 'negative',
'dicemari': 'negative',
'bergandingan': 'negative',
'diperankan': 'positive',
'dihalang': 'negative',
'bpuasa': 'positive',
'merobohkan': 'negative',
'wediaraya': 'positive',
'pliharaku': 'positive',
'diinfor': 'positive',
'ivgfood': 'positive',
'mencuri': 'negative',
'pecahkan': 'negative',
'sumbang': 'negative',
'meminjam': 'negative',
'curi': 'negative',
'disembelih': 'negative',
'terobati': 'negative',
'diangetin': 'positive',
'berharta': 'positive',
'dituliskan': 'positive',
'pengepungan': 'negative',
'menyamoaikan': 'positive',
'kihoii': 'positive',
'sukasukanya': 'positive',
'085740709892': 'positive',
'menyeleweng': 'negative',
'bukanyah': 'positive',
'terlangkap': 'positive',
'nurulady_sandwich': 'positive',
'spupet': 'positive',
'krisis': 'negative',
'konflik': 'negative',
'kekhawatiran': 'negative',
'keterbatasan': 'negative',
'ancaman': 'negative',
'dipadamkan': 'negative',
'diagungkan': 'positive',
'digunapakai': 'positive',
'dikenalpasti': 'negative',
'digariskan': 'positive',
'sumpahan': 'negative',
'busuknya': 'negative',
'raklu': 'positive',
'adela': 'negative',
'sgguh': 'positive',
'merebut': 'negative',
'memindahkan': 'negative',
'menyelamatkan': 'negative',
'memperluas': 'negative',
'pembangkang': 'negative',
'ppbm': 'negative',
'bn': 'negative',
'tmj': 'negative',
'pkr': 'negative',
'bercanggah': 'negative',
'berkerjasama': 'negative',
'diberhentikan': 'negative',
'terpalit': 'negative',
'selari': 'negative',
'penalty': 'negative',
'lipliner': 'positive',
'glasses': 'positive',
'kdak': 'positive',
'logbook': 'positive',
'tergantung': 'negative',
'beda': 'negative',
'berbeda': 'positive',
'gatau': 'negative',
'berdasarkan': 'negative',
'longgar': 'negative',
'ketat': 'positive',
'sendat': 'positive',
'ramping': 'positive',
'dijahit': 'negative',
'kontroversi': 'negative',
'kezaliman': 'negative',
'penolakan': 'negative',
'menakutkan': 'negative',
'menyedihkan': 'negative',
'mengerikan': 'negative',
'mendebarkan': 'positive',
'dibenci': 'negative',
'mengusik': 'negative',
'memberkahi': 'positive',
'menyirami': 'negative',
'memantulkan': 'negative',
'menampar': 'negative',
'problem': 'negative',
'prob': 'positive',
'down': 'negative',
'error': 'negative',
'function': 'positive',
'pelarian': 'negative',
'pengemis': 'negative',
'jurnalis': 'negative',
'primadona': 'negative',
'buzzer': 'negative',
'lengkap': 'negative',
'lengkapnya': 'positive',
'komplit': 'positive',
'pengirim': 'negative',
'simpel': 'positive',
'bencana': 'negative',
'musibah': 'negative',
'tsunami': 'negative',
'kerusuhan': 'negative',
'rompakan': 'negative',
'samun': 'negative',
'lynas': 'negative',
'rusuhan': 'negative',
'penyelewengan': 'negative',
'meletup': 'negative',
'tercabut': 'negative',
'terkencing': 'negative',
'pitam': 'negative',
'letup': 'negative',
'membosankan': 'negative',
'menyebalkan': 'negative',
'rumit': 'negative',
'bantahan': 'negative',
'cenderahati': 'negative',
'instruksi': 'negative',
'ketertarikan': 'negative',
'penghasut': 'negative',
'hasanudin': 'positive',
'astuti': 'positive',
'kurva': 'positive',
'gerd': 'positive',
'ribut': 'negative',
'ngeluh': 'negative',
'rusuh': 'negative',
'berantem': 'negative',
'ngumpul': 'negative',
'bergelut': 'negative',
'disibukkan': 'negative',
'berkolaborasi': 'negative',
'berkutat': 'negative',
'khinzir': 'negative',
'cmnie': 'positive',
'kecikk': 'positive',
'instafemes': 'positive',
'siuk': 'positive',
'gangguan': 'negative',
'kerusakan': 'negative',
'permasalahan': 'negative',
'berisiko': 'negative',
'beresiko': 'positive',
'rentan': 'negative',
'berpotensi': 'negative',
'disyaki': 'negative',
'mengetuk': 'negative',
'membukakan': 'negative',
'bukain': 'negative',
'ngetok': 'negative',
'bukakan': 'negative',
'memutuskan': 'negative',
'berkomitmen': 'positive',
'berencana': 'negative',
'berniat': 'negative',
'diminta': 'negative',
'penceroboh': 'negative',
'keperpercayaan': 'positive',
'coherence': 'positive',
'lgdnya': 'positive',
"deto'x": 'positive',
'sindiran': 'negative',
'heroik': 'positive',
'ceramahnya': 'positive',
'petuah': 'negative',
'ketegasan': 'negative',
'hukuman': 'negative',
'pidana': 'negative',
'sanksi': 'negative',
'najis': 'negative',
'cicak': 'negative',
'iblis': 'negative',
'depresi': 'negative',
'mengharamkan': 'negative',
'memaknai': 'negative',
'meragukan': 'negative',
'mengedepankan': 'negative',
'kelaparan': 'negative',
'kesepian': 'negative',
'tenggelam': 'negative',
'gelisah': 'negative',
'terluka': 'negative',
'korupsi': 'negative',
'makar': 'negative',
'kriminal': 'negative',
'vandalisme': 'negative',
'penipuan': 'negative',
'kebencian': 'negative',
'kebohongan': 'negative',
'hoaks': 'negative',
'dusta': 'negative',
'inflasi': 'negative',
'apbn': 'negative',
'trauma': 'negative',
'mual': 'negative',
'stres': 'negative',
'badmood': 'negative',
'keradangan': 'negative',
'pigmentasi': 'negative',
'peradangan': 'negative',
'keletihan': 'negative',
'selulit': 'negative',
'kesilapan': 'negative',
'kesalahan': 'negative',
'kemusnahan': 'negative',
'perbendeharaan': 'positive',
'romanticist': 'positive',
'deseu2': 'positive',
'menyjilat': 'positive',
'benci': 'negative',
'menyampah': 'positive',
'jijik': 'negative',
'kagum': 'positive',
'geli': 'positive',
'mendesak': 'negative',
'mengkritik': 'negative',
'menggesa': 'negative',
'menghimbau': 'negative',
'diperintah': 'negative',
'tahap': 'negative',
'level': 'negative',
'fasa': 'negative',
'tingkat': 'negative',
'babak': 'negative',
'praktikal': 'negative',
'kaunseling': 'negative',
'stpm': 'negative',
'pt3': 'negative',
'practical': 'negative',
'dahsyat': 'negative',
'tragis': 'negative',
'dasyat': 'negative',
'kematian': 'negative',
'pembunuhan': 'negative',
'kekalahan': 'negative',
'kebodohan': 'negative',
'pembelotan': 'negative',
'bis2lo': 'negative',
'nepisnya': 'positive',
'stabizernya': 'negative',
'dziewczynka': 'negative',
'mengkhianati': 'negative',
'mengabaikan': 'negative',
'menyembah': 'negative',
'meremehkan': 'negative',
'perbuatannya': 'negative',
'protes': 'negative',
'kritik': 'negative',
'dibela': 'negative',
'rekonsiliasi': 'negative',
'diusir': 'negative',
'tuduhan': 'negative',
'dakwaan': 'negative',
'perbuatan': 'negative',
'tuntutan': 'negative',
'dadah': 'negative',
'hey': 'positive',
'astagfirullah': 'negative',
'heh': 'negative',
'fak': 'positive',
'ditakuti': 'negative',
'diharamkan': 'negative',
'dicintai': 'positive',
'nasionalis': 'negative',
'mengalir': 'negative',
'tumpah': 'negative',
'merebak': 'negative',
'dimasukkan': 'negative',
'terjun': 'negative',
'mencederakan': 'negative',
'mummuy': 'positive',
'pkdnya': 'positive',
'dilepasi': 'positive',
'tolak': 'negative',
'keluarkan': 'negative',
'tuntut': 'negative',
'pegang': 'negative',
'kutip': 'negative',
'khianat': 'negative',
'bersaksi': 'negative',
'dipersalahkan': 'positive',
'menyeksa': 'negative',
'morah2': 'positive',
'hakimnegara': 'positive',
'princemmed': 'positive',
'bedaken': 'positive',
'kemelesetan': 'negative',
'raauww': 'positive',
"'aiyok": 'positive',
'15dan': 'positive',
'huina': 'positive',
'melumpuhkan': 'negative',
'dipercayakan': 'positive',
'direbut': 'negative',
'menyasar': 'positive',
'mengetuai': 'negative',
'kesengsaraan': 'negative',
'kebermanfaatan': 'positive',
'kegelisahan': 'negative',
'berkabung': 'negative',
'berbasikal': 'positive',
'berbisnes': 'negative',
'memuncak': 'positive',
'berbahas': 'negative',
'pengakuan': 'negative',
'kesaksian': 'negative',
'pernyataan': 'negative',
'perang': 'negative',
'neraca': 'negative',
'negosiasi': 'negative',
'kebangkitan': 'positive',
'menyerahkan': 'negative',
'menyalurkan': 'negative',
'membagikan': 'negative',
'serahkan': 'negative',
'mengajukan': 'negative',
'hutang': 'negative',
'utang': 'negative',
'pendapatan': 'negative',
'pajak': 'negative',
'cukai': 'negative',
'saingan': 'negative',
'trofi': 'positive',
'pertarungan': 'negative',
'kompetisi': 'negative',
'klasemen': 'negative',
'mengeruhkan': 'negative',
'zuaini': 'positive',
'sedip': 'positive',
'7572687': 'positive',
'sesiapo': 'positive',
'mengemis': 'negative',
'tanyaa': 'negative',
'feeling2': 'positive',
'berdendam': 'negative',
'bermasalah': 'negative',
'sensitif': 'positive',
'terganggu': 'negative',
'berjerawat': 'positive',
'menghitam': 'positive',
'disaster': 'negative',
'ngisahin': 'positive',
'butoset': 'positive',
'stuffed': 'positive',
'kayk': 'positive',
'rapuh': 'negative',
'rebah': 'negative',
'mengering': 'positive',
'kaku': 'negative',
'hti': 'negative',
'syaitan': 'negative',
'pembohong': 'negative',
'opposition': 'negative',
'accord': 'positive',
'hone': 'positive',
'writternya': 'positive',
'memahat': 'positive',
'dikawal': 'negative',
'ditangani': 'negative',
'diselamatkan': 'negative',
'diselesaikan': 'negative',
'dilewati': 'negative',
'beracun': 'negative',
'lazim': 'positive',
'merbahaya': 'positive',
'mengkilap': 'positive',
'berbahaya': 'negative',
'gross': 'negative',
'paint': 'positive',
'bunny': 'positive',
'teriyaki': 'positive',
'panther': 'positive',
'menghantui': 'negative',
'menyiksa': 'negative',
'menuntun': 'negative',
'cintakan': 'negative',
'membohongi': 'negative',
'bodoh': 'negative',
'bangang': 'negative',
'bodo': 'positive',
'noob': 'negative',
'merenggangkan': 'negative',
'nowel2': 'positive',
'memmpesonahh': 'positive',
'sotoguk': 'positive',
'promotinggal2harilagiburuuaann': 'positive',
'polemik': 'negative',
'penahanan': 'negative',
'usulan': 'negative',
'pertikaian': 'negative',
'sejarahnya': 'negative',
'kejanggalan': 'negative',
'petaka': 'negative',
'tamparan': 'negative',
'takut': 'negative',
'risau': 'negative',
'malu': 'negative',
'segan': 'negative',
'ketinggalan': 'negative',
'kehabisan': 'negative',
'kebagian': 'negative',
'lewatkan': 'negative',
'terlepas': 'negative',
'paksaan': 'negative',
'kejelasan': 'negative',
'batasnya': 'negative',
'halangan': 'negative',
'bingung': 'negative',
'penasaran': 'positive',
'mikir': 'negative',
'kepikiran': 'negative',
'males': 'negative',
'ditinggalkan': 'negative',
'dibunuh': 'negative',
'dihina': 'negative',
'dijalani': 'negative',
'dilanda': 'negative',
'mengidap': 'negative',
'picu': 'negative',
'memicu': 'negative',
'terjangkit': 'negative',
'penyerang': 'negative',
'gelandang': 'negative',
'pembalap': 'negative',
'manajer': 'negative',
'kiper': 'negative',
'mencurigai': 'negative',
'zemwah': 'positive',
'enenenenenene': 'positive',
'destroyers': 'positive',
'norsyida': 'positive',
'memarahi': 'negative',
'dereta': 'positive',
'pengambil': 'positive',
'menjudge': 'positive',
'disodorin': 'positive',
'disentuh': 'negative',
'memakainya': 'negative',
'membacanya': 'negative',
'dicerna': 'negative',
'dihilangkan': 'negative',
'membimbangkan': 'negative',
'dibaiat': 'positive',
'memenatkan': 'negative',
'diingati': 'positive',
'perosak': 'negative',
'penghianat': 'negative',
'pembela': 'negative',
'perusak': 'negative',
'minoriti': 'negative',
'kemudaratan': 'negative',
'kainavailable': 'positive',
'angesti': 'positive',
'konsta': 'positive',
'togor2': 'positive',
'menangkis': 'negative',
'gobindh': 'positive',
"k'sasar": 'positive',
'mgnr': 'positive',
'kemesu': 'positive',
'rugi': 'negative',
'untung': 'negative',
'berdosa': 'negative',
'berbaloi': 'positive',
'terasa': 'negative',
'merasa': 'negative',
'berdebar': 'negative',
'terlihat': 'positive',
'berasa': 'negative',
'tebusan': 'negative',
'082257468845': 'positive',
'penghakiman': 'positive',
'dihafal': 'positive',
'kecelaruan': 'negative',
'pakvwi': 'positive',
'mwamuna': 'positive',
'hapepend': 'positive',
'mengekuarkan': 'positive',
'kasar': 'negative',
'kotor': 'negative',
'halus': 'positive',
'kusam': 'positive',
'memaksa': 'negative',
'menyayangi': 'negative',
'menyuruh': 'negative',
'menyakiti': 'negative',
'fanatik': 'negative',
'toleran': 'positive',
'zalim': 'negative',
'atheis': 'negative',
'kemiskinan': 'negative',
'pelampau': 'negative',
'dicekal': 'positive',
'ysfheartnezia': 'positive',
'photograther': 'positive',
'ntuh': 'positive',
'takot': 'negative',
'teror': 'negative',
'menyerang': 'negative',
'membunuh': 'negative',
'membela': 'negative',
'menolong': 'negative',
'menjatuhkan': 'negative',
'menyamakan': 'negative',
'meninggalkan': 'negative',
'menemui': 'negative',
'tinggalkan': 'negative',
'menemukan': 'negative',
'mengubah': 'negative',
'miskin': 'negative',
'goblok': 'negative',
'jelek': 'negative',
'jomblo': 'negative',
'bego': 'negative',
'siber': 'negative',
'undang2': 'negative',
'menangis': 'negative',
'nangis': 'negative',
'tertidur': 'negative',
'tertunggak': 'negative',
'langsai': 'positive',
'rm2k': 'positive',
'rm450': 'negative',
'xsilap': 'positive',
'lucah': 'negative',
'porno': 'negative',
'semburit': 'negative',
'seks': 'negative',
'3gp': 'negative',
'mengalami': 'negative',
'menderita': 'negative',
'merasakan': 'negative',
'menyebabkan': 'negative',
'musnah': 'negative',
'lenyap': 'negative',
'sengsara': 'negative',
'stereotaip': 'negative',
'ahmbs': 'positive',
'radangmembaik': 'positive',
'escapepenang': 'positive',
'f7szfx': 'positive',
'ironinya': 'negative',
'moyez': 'positive',
'mauloee': 'positive',
'ndakanamirana': 'positive',
'skf3013': 'positive',
'pergolakan': 'negative',
'gelembung': 'negative',
'menghadkan': 'negative',
'wardrobenya': 'positive',
'anrara': 'positive',
'tukaanza': 'positive',
'tersebutnya': 'positive',
'hamba': 'negative',
'hambanya': 'negative',
'firman': 'negative',
'takdir': 'negative',
'rasul': 'negative',
'memburukkan': 'negative',
'tubuhkan': 'negative',
'menggulingkan': 'negative',
'meruntuhkan': 'negative',
'membantai': 'negative',
'haiwan': 'negative',
'dajjal': 'negative',
'penyamun': 'negative',
'sampah': 'negative',
'rumput': 'negative',
'racun': 'negative',
'rokok': 'negative',
'dengki': 'negative',
'jeles': 'positive',
'sombong': 'negative',
'hasutan': 'negative',
'palsu': 'negative',
'negatif': 'negative',
...}
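The result maps each word to a label, while the seed lexicon maps each label to a list of words. If you want the induced words back in the seed format, a small helper can regroup them. `group_by_label` is a hypothetical helper, not a Malaya function, and the `sample` entries are copied from the output above.

```python
from collections import defaultdict

def group_by_label(results):
    """Turn {'word': 'label'} into {'label': ['word', ...]}, keeping insertion order."""
    grouped = defaultdict(list)
    for word, label in results.items():
        grouped[label].append(word)
    return dict(grouped)

# A handful of entries copied from the induced sentiment output.
sample = {
    'luka': 'negative',
    'keyakinan': 'positive',
    'kebahagiaan': 'positive',
    'muflis': 'negative',
}
induced_lexicon = group_by_label(sample)
print(induced_lexicon)
# {'negative': ['luka', 'muflis'], 'positive': ['keyakinan', 'kebahagiaan']}
```

Induced output is noisy (phone numbers and misspellings appear among the words above), so it is worth reviewing the grouped lists by hand before reusing them as a lexicon.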
[8]:
%%time
results_emotion, scores_emotion, labels_emotion = malaya.lexicon.random_walk(
    emotion_lexicon,
    wordvector,
    pool_size = 10,
)
populating nearest words from wordvector
populating vectors from populated nearest words
random walking from populated vectors
CPU times: user 5.9 s, sys: 3.13 s, total: 9.03 s
Wall time: 1.5 s
[9]:
np.unique(list(results_emotion.values()), return_counts = True)
[9]:
(array(['anger', 'fear', 'joy', 'love', 'sadness', 'surprise'], dtype='<U8'),
array([ 76, 156, 14, 132, 40, 34]))
[10]:
results_emotion
[10]:
{'sebal': 'anger',
'gesture': 'anger',
'se7': 'anger',
'ziraa': 'love',
'mantepp': 'love',
'mesem': 'love',
'nggapapa': 'love',
'maen2': 'love',
'gacocok': 'anger',
'jeongwoo': 'love',
'bergelora': 'anger',
'mereda': 'anger',
'skeptis': 'anger',
'gebus': 'love',
'tyrion': 'love',
'memuncak': 'anger',
'mewabah': 'love',
'mengenaskan': 'anger',
'kesasar': 'love',
'kepedean': 'love',
'annoying': 'anger',
'awkward': 'fear',
'scary': 'fear',
'handsome': 'fear',
'nervous': 'fear',
'cringe': 'fear',
'menyampah': 'fear',
'kelakar': 'fear',
'cute': 'fear',
'cuak': 'fear',
'bodoh': 'anger',
'bangang': 'anger',
'bebal': 'anger',
'bodo': 'fear',
'noob': 'fear',
'bengap': 'fear',
'celaka': 'fear',
'biadap': 'fear',
'pukimak': 'fear',
'berang': 'anger',
'buru': 'anger',
'nerus': 'anger',
'kangsar': 'anger',
'lipis': 'anger',
'pilah': 'anger',
'besut': 'anger',
'krai': 'anger',
'klawang': 'anger',
'ketil': 'anger',
'amuk': 'anger',
'mbatin': 'love',
'sebarin': 'love',
'sebarisan': 'love',
'ngalami': 'love',
'tikt': 'love',
'diharga': 'love',
'threesome': 'love',
'shizuka': 'love',
'bokondini': 'love',
'mendidih': 'anger',
'mengental': 'anger',
'sebati': 'anger',
'mengembang': 'anger',
'layu': 'anger',
'kecoklatan': 'anger',
'matang': 'anger',
'meresap': 'anger',
'mengering': 'anger',
'direbus': 'anger',
'pengecut': 'anger',
'bajingan': 'anger',
'pembohong': 'anger',
'pecundang': 'anger',
'dungu': 'anger',
'pemberani': 'anger',
'negarawan': 'anger',
'jahil': 'anger',
'biadab': 'anger',
'provokator': 'anger',
'bengang': 'anger',
'menyirap': 'fear',
'meluat': 'fear',
'frust': 'fear',
'rimas': 'fear',
'annoyed': 'fear',
'lonely': 'fear',
'berdukacita': 'anger',
'menyakitimu': 'anger',
'bersinggungan': 'love',
'bermesra': 'love',
'meridhoi': 'anger',
'menyelubungi': 'love',
'empukk': 'love',
'berserban': 'love',
'diracuni': 'love',
'dibayangi': 'love',
'jengkel': 'anger',
'gugup': 'anger',
'dibiasain': 'love',
'mubazir': 'anger',
'amnesia': 'anger',
'psikopat': 'anger',
'gumoh': 'love',
'diurusin': 'love',
'ngangenin': 'anger',
'purging': 'anger',
'babi': 'anger',
'sial': 'fear',
'kimak': 'fear',
'anjing': 'anger',
'pundek': 'fear',
'cibai': 'fear',
'setan': 'anger',
'lembu': 'anger',
'pedar': 'anger',
'sanwya': 'love',
'qabaya': 'love',
'5pac': 'love',
'wa082336409906': 'love',
'mpibg': 'love',
'honachahthu': 'anger',
'unieleven': 'love',
'mengepilkan': 'anger',
'ciknorzaidi': 'love',
'benci': 'anger',
'jijik': 'fear',
'kagum': 'surprise',
'geli': 'fear',
'insecure': 'fear',
'geram': 'fear',
'respect': 'fear',
'jealous': 'fear',
'marah': 'anger',
'maki': 'fear',
'merajuk': 'fear',
'marah2': 'surprise',
'perli': 'fear',
'jeles': 'fear',
'tegur': 'fear',
'kecam': 'fear',
'cemburu': 'surprise',
'bitter': 'fear',
'ngeri': 'fear',
'serem': 'fear',
'kocak': 'fear',
'mantep': 'fear',
'miris': 'fear',
'ngeselin': 'fear',
'nyesek': 'fear',
'kesel': 'fear',
'sebel': 'fear',
'lebay': 'fear',
'phobia': 'fear',
'mendem': 'love',
'berideologi': 'love',
'niru': 'love',
'nyicip': 'love',
'ngerawat': 'fear',
'riweuh': 'anger',
'nmun': 'love',
'ngancam': 'love',
'bencong': 'love',
'anxiety': 'fear',
'glasses': 'love',
'manners': 'fear',
'satan': 'love',
'popularity': 'love',
'curl': 'love',
'impossible': 'love',
'mayb': 'love',
'sperm': 'love',
'nyumpah': 'love',
'fitnah': 'fear',
'hoax': 'fear',
'provokasi': 'fear',
'kebencian': 'fear',
'dusta': 'fear',
'hoaks': 'fear',
'kebohongan': 'fear',
'bohong': 'fear',
'rasis': 'anger',
'ngibul': 'fear',
'horror': 'fear',
'horor': 'fear',
'romance': 'fear',
'day6': 'fear',
'dokumenter': 'fear',
'porno': 'fear',
'anime': 'fear',
'sinetron': 'fear',
'drakor': 'fear',
'dangdut': 'fear',
'takut': 'fear',
'risau': 'fear',
'malu': 'fear',
'khawatir': 'sadness',
'segan': 'fear',
'kecewa': 'sadness',
'takot': 'fear',
'bimbang': 'sadness',
'takutnya': 'fear',
'sedih': 'sadness',
'panic': 'fear',
'loud': 'love',
'impressed': 'love',
'expected': 'love',
'dying': 'love',
'rush': 'fear',
'shitty': 'love',
'smoke': 'fear',
'suck': 'fear',
'cheap': 'fear',
'emo': 'fear',
'boring': 'fear',
'gelabah': 'fear',
'ngantok': 'fear',
'syok': 'joy',
'seronok': 'joy',
'busy': 'fear',
'serabut': 'fear',
'syiok': 'fear',
'sendu': 'fear',
'riang': 'joy',
'ceria': 'sadness',
'takbir': 'joy',
'bersuka': 'anger',
'emma': 'love',
'barakah': 'anger',
'telemovie': 'anger',
'riuh': 'anger',
'ria': 'joy',
'khutbah': 'joy',
'sebak': 'fear',
'excited': 'fear',
'terharu': 'surprise',
'terliur': 'fear',
'girang': 'joy',
'ditikung': 'love',
'ambis': 'anger',
'rafa': 'love',
'digangguin': 'love',
'nyiksa': 'anger',
'maruk': 'love',
'tamvan': 'love',
'pengap': 'love',
'iklas': 'love',
'puas': 'joy',
'muak': 'sadness',
'kenyang': 'fear',
'lega': 'fear',
'bosan': 'fear',
'berbaloi': 'fear',
'berpuas': 'sadness',
'lelah': 'sadness',
'bahagia': 'joy',
'menyenangkan': 'sadness',
'gelisah': 'sadness',
'nyaman': 'sadness',
'indah': 'sadness',
'sukses': 'sadness',
'sehat': 'sadness',
'damai': 'sadness',
'suka': 'joy',
'sukanya': 'fear',
'doyan': 'fear',
'demen': 'fear',
'suke': 'fear',
'gasuka': 'fear',
'gemar': 'fear',
'sukaa': 'fear',
'prefer': 'fear',
'happy': 'joy',
'hepi': 'love',
'wish': 'fear',
'nice': 'fear',
'cerita': 'joy',
'citer': 'fear',
'cite': 'fear',
'crita': 'fear',
'kisah': 'love',
'percakapan': 'joy',
'tweet': 'fear',
'drama': 'fear',
'lagu': 'fear',
'ceramah': 'joy',
'cinta': 'love',
'kebahagiaan': 'love',
'cintanya': 'sadness',
'cintaku': 'sadness',
'persahabatan': 'love',
'cintamu': 'sadness',
'kesabaran': 'love',
'dendam': 'sadness',
'kesedihan': 'sadness',
'asa': 'love',
'baby': 'love',
'daddy': 'love',
'mira': 'fear',
'princess': 'love',
'bella': 'love',
'farah': 'love',
'mommy': 'love',
'sister': 'love',
'mummy': 'love',
'lisa': 'love',
'love': 'love',
'luv': 'love',
'hate': 'love',
'thought': 'fear',
'mean': 'fear',
'want': 'fear',
'see': 'fear',
'need': 'fear',
'hope': 'fear',
'peace': 'fear',
'syang': 'love',
'noi': 'love',
'bilang2': 'love',
'syng': 'love',
'mut': 'love',
'ribbey': 'love',
'seneng2': 'love',
'butoset': 'love',
'manly': 'love',
'twet': 'love',
'syg': 'love',
'sayangg': 'love',
'sayang': 'love',
'bby': 'love',
'cntik': 'fear',
'knl': 'surprise',
'anon': 'fear',
'sistur': 'love',
'sayang2': 'love',
'bgus': 'fear',
'rindukn': 'love',
'ajeb2an': 'love',
'hshakjsjsbs': 'love',
'miliknyamencatat': 'love',
'p6a': 'love',
'ahsjahhaa': 'love',
'diwajibk': 'love',
'protese': 'love',
'botaqin': 'love',
'kruntel': 'love',
'rindu': 'love',
'sayangku': 'love',
'sayangkan': 'love',
'sayangnya': 'love',
'disayang': 'anger',
'moody': 'fear',
'rindukan': 'love',
'merindui': 'love',
'takutkan': 'love',
'banggakan': 'love',
'cintakan': 'love',
'perbuat': 'surprise',
'merindukan': 'love',
'ceraikan': 'love',
'jumpai': 'love',
'rindunya': 'fear',
'teringat': 'fear',
'rinduu': 'fear',
'lapar': 'fear',
'kempunan': 'fear',
'teringin': 'fear',
'kangen': 'fear',
'confuse': 'fear',
'stress': 'fear',
'letih': 'fear',
'penat': 'fear',
'stres': 'sadness',
'mengantuk': 'fear',
'tertekan': 'sadness',
'terganggu': 'sadness',
'tertipu': 'surprise',
'keliru': 'surprise',
'mengeluh': 'sadness',
'merosot': 'sadness',
'susut': 'sadness',
'terjebak': 'surprise',
'terpengaruh': 'surprise',
'kesal': 'sadness',
'terkejut': 'surprise',
'bersalah': 'sadness',
'berdosa': 'fear',
'dihargai': 'sadness',
'janggal': 'anger',
'resah': 'sadness',
'kesepian': 'sadness',
'gundah': 'sadness',
'goyah': 'sadness',
'disakiti': 'sadness',
'takjub': 'sadness',
'sengsara': 'sadness',
'seram': 'fear',
'menyebalkan': 'sadness',
'merana': 'fear',
'melarat': 'anger',
'angkuh': 'sadness',
'rakus': 'sadness',
'terpuruk': 'sadness',
'pengsan': 'surprise',
'tertido': 'fear',
'pitam': 'surprise',
'terlelap': 'surprise',
'terberak': 'surprise',
'nanges': 'fear',
'mengamuk': 'fear',
'tdoq': 'fear',
'termuntah': 'surprise',
'tidor': 'surprise',
'bangga': 'surprise',
'surprise': 'surprise',
'suprise': 'surprise',
'makan2': 'surprise',
'attention': 'fear',
'kejutan': 'surprise',
'assignment': 'fear',
'comeback': 'surprise',
'chance': 'fear',
'homework': 'surprise',
'appointment': 'surprise',
'wtf': 'surprise',
'huh': 'fear',
'seriously': 'fear',
'omg': 'fear',
'aik': 'fear',
'wth': 'fear',
'shit': 'fear',
'apoo': 'fear',
'hah': 'fear',
'damn': 'fear',
'stun': 'surprise',
'pinafsueun': 'love',
'neelehh': 'love',
'rudgard': 'love',
'016344981': 'love',
'pramaandika': 'love',
'hamidibahawa': 'love',
'spesialers': 'love',
'superpignan': 'love',
'082187486748': 'love',
'tertanya2': 'surprise',
'terperanjat': 'surprise',
'cubaa': 'surprise',
'stuju': 'surprise',
'stayback': 'love',
'cakaplah': 'surprise',
'melebih': 'anger',
'tanyaa': 'surprise',
'ngandung': 'surprise'}
propagate probabilistic#
def propagate_probabilistic(
lexicon,
wordvector,
pool_size = 10,
top_n = 20,
similarity_power = 10.0,
arccos = True,
normalization = True,
soft = False,
silent = False,
):
"""
Learns polarity scores via standard label propagation from lexicon sets.
Parameters
----------
lexicon: dict
curated lexicon from expert domain, {'label1': [str], 'label2': [str]}.
wordvector: object
wordvector interface object.
pool_size: int, optional (default=10)
number of top words (pool size) to pick for each lexicon.
top_n: int, optional (default=20)
the `top_n` scores for each vector are multiplied by `similarity_power`.
similarity_power: float, optional (default=10.0)
extra score for `top_n`; a lower value induces less bias but has a higher chance of an unbalanced outcome.
arccos: bool, optional (default=True)
covariance distribution for embedded.dot(embedded.T). If False, covariance + 1 is used.
normalization: bool, optional (default=True)
normalize word vectors using L2 norm. L2 is good to penalize skewed vectors.
soft: bool, optional (default=False)
if True, a word not in the dictionary is replaced with its nearest match by Jaro-Winkler ratio.
if False, an exception is raised when a word is not in the dictionary.
silent: bool, optional (default=False)
if True, will not print any logs.
Returns
-------
tuple: (labels[argmax(scores, axis = 1)], scores, labels)
"""
[11]:
%%time
results_emotion, scores_emotion, labels_emotion = malaya.lexicon.propagate_probabilistic(emotion_lexicon,
wordvector,
pool_size = 10)
populating nearest words from wordvector
populating vectors from populated nearest words
propagating probabilistic from populated vectors
CPU times: user 5.64 s, sys: 2.05 s, total: 7.68 s
Wall time: 1.29 s
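How propagated scores turn into labels can be illustrated on a toy graph. The following is a minimal sketch of generic label propagation, not Malaya's actual implementation. The function `toy_label_propagation`, the `beta` damping factor, and the 4-word similarity matrix are all assumptions made for illustration.

```python
import numpy as np

def toy_label_propagation(similarity, seeds, n_labels, beta=0.9, n_iter=50):
    """Propagate seed labels over a similarity graph (illustrative only)."""
    # Row-normalise the similarity matrix into a transition matrix.
    transition = similarity / similarity.sum(axis=1, keepdims=True)
    # One-hot seed matrix: rows are words, columns are labels.
    seed_scores = np.zeros((similarity.shape[0], n_labels))
    for word_index, label_index in seeds.items():
        seed_scores[word_index, label_index] = 1.0
    scores = seed_scores.copy()
    for _ in range(n_iter):
        # Mix the neighbours' scores with the original seed signal.
        scores = beta * transition @ scores + (1 - beta) * seed_scores
    return scores

# Hypothetical graph: words 0 and 1 are close, words 2 and 3 are close.
similarity = np.array([
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.9, 1.0],
])
labels = np.array(['positive', 'negative'])

# Word 0 is seeded as 'positive', word 2 as 'negative'.
scores = toy_label_propagation(similarity, seeds={0: 0, 2: 1}, n_labels=2)

# Same shape as the documented return value: labels[argmax(scores, axis = 1)].
predicted = labels[np.argmax(scores, axis=1)]
print(predicted)
```

Each unlabeled word ends up with the label of the seed it is most strongly connected to, which is the intuition behind inducing a lexicon from a handful of annotated words.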
[12]:
np.unique(list(results_emotion.values()), return_counts = True)
[12]:
(array(['anger', 'fear', 'joy', 'love', 'sadness', 'surprise'], dtype='<U8'),
array([315, 66, 10, 21, 28, 12]))
[13]:
results_emotion
[13]:
{'sebal': 'anger',
'gesture': 'anger',
'se7': 'anger',
'ziraa': 'anger',
'mantepp': 'anger',
'mesem': 'anger',
'nggapapa': 'anger',
'maen2': 'anger',
'gacocok': 'anger',
'jeongwoo': 'anger',
'bergelora': 'anger',
'mereda': 'anger',
'skeptis': 'anger',
'gebus': 'anger',
'tyrion': 'anger',
'memuncak': 'anger',
'mewabah': 'anger',
'mengenaskan': 'anger',
'kesasar': 'anger',
'kepedean': 'anger',
'annoying': 'anger',
'awkward': 'fear',
'scary': 'fear',
'handsome': 'anger',
'nervous': 'fear',
'cringe': 'fear',
'menyampah': 'fear',
'kelakar': 'anger',
'cute': 'anger',
'cuak': 'fear',
'bodoh': 'anger',
'bangang': 'anger',
'bebal': 'anger',
'bodo': 'anger',
'noob': 'anger',
'bengap': 'anger',
'celaka': 'anger',
'biadap': 'anger',
'pukimak': 'anger',
'berang': 'anger',
'buru': 'anger',
'nerus': 'anger',
'kangsar': 'anger',
'lipis': 'anger',
'pilah': 'anger',
'besut': 'anger',
'krai': 'anger',
'klawang': 'anger',
'ketil': 'anger',
'amuk': 'anger',
'mbatin': 'anger',
'sebarin': 'anger',
'sebarisan': 'anger',
'ngalami': 'anger',
'tikt': 'anger',
'diharga': 'anger',
'threesome': 'anger',
'shizuka': 'anger',
'bokondini': 'anger',
'mendidih': 'anger',
'mengental': 'anger',
'sebati': 'anger',
'mengembang': 'anger',
'layu': 'anger',
'kecoklatan': 'anger',
'matang': 'anger',
'meresap': 'anger',
'mengering': 'anger',
'direbus': 'anger',
'pengecut': 'anger',
'bajingan': 'anger',
'pembohong': 'anger',
'pecundang': 'anger',
'dungu': 'anger',
'pemberani': 'anger',
'negarawan': 'anger',
'jahil': 'anger',
'biadab': 'anger',
'provokator': 'anger',
'bengang': 'anger',
'menyirap': 'fear',
'meluat': 'anger',
'frust': 'fear',
'rimas': 'fear',
'annoyed': 'fear',
'lonely': 'fear',
'berdukacita': 'anger',
'menyakitimu': 'anger',
'bersinggungan': 'anger',
'bermesra': 'anger',
'meridhoi': 'anger',
'menyelubungi': 'anger',
'empukk': 'anger',
'berserban': 'anger',
'diracuni': 'anger',
'dibayangi': 'anger',
'jengkel': 'anger',
'gugup': 'anger',
'dibiasain': 'anger',
'mubazir': 'anger',
'amnesia': 'anger',
'psikopat': 'anger',
'gumoh': 'anger',
'diurusin': 'anger',
'ngangenin': 'anger',
'purging': 'anger',
'babi': 'anger',
'sial': 'anger',
'kimak': 'anger',
'anjing': 'anger',
'pundek': 'anger',
'cibai': 'anger',
'setan': 'anger',
'lembu': 'anger',
'pedar': 'anger',
'sanwya': 'anger',
'qabaya': 'anger',
'5pac': 'anger',
'wa082336409906': 'anger',
'mpibg': 'anger',
'honachahthu': 'anger',
'unieleven': 'anger',
'mengepilkan': 'anger',
'ciknorzaidi': 'anger',
'benci': 'anger',
'jijik': 'fear',
'kagum': 'surprise',
'geli': 'anger',
'insecure': 'fear',
'geram': 'anger',
'respect': 'anger',
'jealous': 'fear',
'marah': 'anger',
'maki': 'anger',
'merajuk': 'anger',
'marah2': 'anger',
'perli': 'anger',
'jeles': 'fear',
'tegur': 'anger',
'kecam': 'anger',
'cemburu': 'surprise',
'bitter': 'anger',
'ngeri': 'fear',
'serem': 'anger',
'kocak': 'anger',
'mantep': 'anger',
'miris': 'fear',
'ngeselin': 'anger',
'nyesek': 'anger',
'kesel': 'fear',
'sebel': 'fear',
'lebay': 'anger',
'phobia': 'fear',
'mendem': 'anger',
'berideologi': 'anger',
'niru': 'anger',
'nyicip': 'anger',
'ngerawat': 'anger',
'riweuh': 'anger',
'nmun': 'anger',
'ngancam': 'anger',
'bencong': 'anger',
'anxiety': 'fear',
'glasses': 'anger',
'manners': 'anger',
'satan': 'anger',
'popularity': 'anger',
'curl': 'anger',
'impossible': 'anger',
'mayb': 'anger',
'sperm': 'anger',
'nyumpah': 'anger',
'fitnah': 'fear',
'hoax': 'anger',
'provokasi': 'anger',
'kebencian': 'anger',
'dusta': 'anger',
'hoaks': 'anger',
'kebohongan': 'anger',
'bohong': 'anger',
'rasis': 'anger',
'ngibul': 'anger',
'horror': 'fear',
'horor': 'fear',
'romance': 'anger',
'day6': 'anger',
'dokumenter': 'anger',
'porno': 'anger',
'anime': 'anger',
'sinetron': 'anger',
'drakor': 'anger',
'dangdut': 'anger',
'takut': 'fear',
'risau': 'fear',
'malu': 'fear',
'khawatir': 'sadness',
'segan': 'fear',
'kecewa': 'sadness',
'takot': 'fear',
'bimbang': 'sadness',
'takutnya': 'anger',
'sedih': 'sadness',
'panic': 'fear',
'loud': 'anger',
'impressed': 'anger',
'expected': 'anger',
'dying': 'anger',
'rush': 'anger',
'shitty': 'anger',
'smoke': 'anger',
'suck': 'anger',
'cheap': 'anger',
'emo': 'fear',
'boring': 'fear',
'gelabah': 'fear',
'ngantok': 'fear',
'syok': 'joy',
'seronok': 'joy',
'busy': 'fear',
'serabut': 'fear',
'syiok': 'anger',
'sendu': 'fear',
'riang': 'joy',
'ceria': 'joy',
'takbir': 'anger',
'bersuka': 'anger',
'emma': 'love',
'barakah': 'anger',
'telemovie': 'anger',
'riuh': 'anger',
'ria': 'anger',
'khutbah': 'anger',
'sebak': 'fear',
'excited': 'fear',
'terharu': 'surprise',
'terliur': 'fear',
'girang': 'joy',
'ditikung': 'anger',
'ambis': 'anger',
'rafa': 'anger',
'digangguin': 'anger',
'nyiksa': 'anger',
'maruk': 'anger',
'tamvan': 'anger',
'pengap': 'anger',
'iklas': 'anger',
'puas': 'joy',
'muak': 'fear',
'kenyang': 'fear',
'lega': 'fear',
'bosan': 'fear',
'berbaloi': 'fear',
'berpuas': 'sadness',
'lelah': 'sadness',
'bahagia': 'joy',
'menyenangkan': 'sadness',
'gelisah': 'sadness',
'nyaman': 'sadness',
'indah': 'sadness',
'sukses': 'anger',
'sehat': 'sadness',
'damai': 'sadness',
'suka': 'joy',
'sukanya': 'anger',
'doyan': 'anger',
'demen': 'anger',
'suke': 'anger',
'gasuka': 'anger',
'gemar': 'anger',
'sukaa': 'anger',
'prefer': 'anger',
'happy': 'joy',
'hepi': 'anger',
'wish': 'anger',
'nice': 'fear',
'cerita': 'joy',
'citer': 'fear',
'cite': 'fear',
'crita': 'anger',
'kisah': 'love',
'percakapan': 'anger',
'tweet': 'fear',
'drama': 'anger',
'lagu': 'anger',
'ceramah': 'anger',
'cinta': 'love',
'kebahagiaan': 'anger',
'cintanya': 'anger',
'cintaku': 'anger',
'persahabatan': 'anger',
'cintamu': 'anger',
'kesabaran': 'anger',
'dendam': 'sadness',
'kesedihan': 'anger',
'asa': 'sadness',
'baby': 'love',
'daddy': 'love',
'mira': 'love',
'princess': 'anger',
'bella': 'love',
'farah': 'love',
'mommy': 'love',
'sister': 'love',
'mummy': 'love',
'lisa': 'love',
'love': 'love',
'luv': 'love',
'hate': 'anger',
'thought': 'anger',
'mean': 'anger',
'want': 'anger',
'see': 'anger',
'need': 'anger',
'hope': 'anger',
'peace': 'anger',
'syang': 'love',
'noi': 'anger',
'bilang2': 'anger',
'syng': 'anger',
'mut': 'anger',
'ribbey': 'anger',
'seneng2': 'anger',
'butoset': 'anger',
'manly': 'anger',
'twet': 'anger',
'syg': 'love',
'sayangg': 'anger',
'sayang': 'love',
'bby': 'anger',
'cntik': 'anger',
'knl': 'anger',
'anon': 'anger',
'sistur': 'anger',
'sayang2': 'anger',
'bgus': 'anger',
'rindukn': 'love',
'ajeb2an': 'anger',
'hshakjsjsbs': 'anger',
'miliknyamencatat': 'anger',
'p6a': 'anger',
'ahsjahhaa': 'anger',
'diwajibk': 'anger',
'protese': 'anger',
'botaqin': 'anger',
'kruntel': 'anger',
'rindu': 'love',
'sayangku': 'anger',
'sayangkan': 'anger',
'sayangnya': 'love',
'disayang': 'anger',
'moody': 'fear',
'rindukan': 'love',
'merindui': 'anger',
'takutkan': 'anger',
'banggakan': 'anger',
'cintakan': 'anger',
'perbuat': 'anger',
'merindukan': 'anger',
'ceraikan': 'anger',
'jumpai': 'anger',
'rindunya': 'fear',
'teringat': 'fear',
'rinduu': 'fear',
'lapar': 'fear',
'kempunan': 'fear',
'teringin': 'fear',
'kangen': 'fear',
'confuse': 'fear',
'stress': 'sadness',
'letih': 'fear',
'penat': 'fear',
'stres': 'sadness',
'mengantuk': 'fear',
'tertekan': 'sadness',
'terganggu': 'sadness',
'tertipu': 'surprise',
'keliru': 'sadness',
'mengeluh': 'sadness',
'merosot': 'anger',
'susut': 'anger',
'terjebak': 'sadness',
'terpengaruh': 'surprise',
'kesal': 'sadness',
'terkejut': 'surprise',
'bersalah': 'sadness',
'berdosa': 'fear',
'dihargai': 'sadness',
'janggal': 'anger',
'resah': 'sadness',
'kesepian': 'sadness',
'gundah': 'anger',
'goyah': 'anger',
'disakiti': 'anger',
'takjub': 'anger',
'sengsara': 'sadness',
'seram': 'fear',
'menyebalkan': 'anger',
'merana': 'sadness',
'melarat': 'anger',
'angkuh': 'anger',
'rakus': 'anger',
'terpuruk': 'anger',
'pengsan': 'surprise',
'tertido': 'anger',
'pitam': 'anger',
'terlelap': 'anger',
'terberak': 'anger',
'nanges': 'anger',
'mengamuk': 'anger',
'tdoq': 'anger',
'termuntah': 'anger',
'tidor': 'anger',
'bangga': 'surprise',
'surprise': 'surprise',
'suprise': 'anger',
'makan2': 'anger',
'attention': 'anger',
'kejutan': 'anger',
'assignment': 'fear',
'comeback': 'anger',
'chance': 'fear',
'homework': 'anger',
'appointment': 'anger',
'wtf': 'surprise',
'huh': 'anger',
'seriously': 'anger',
'omg': 'anger',
'aik': 'anger',
'wth': 'anger',
'shit': 'anger',
'apoo': 'fear',
'hah': 'anger',
'damn': 'anger',
'stun': 'surprise',
'pinafsueun': 'anger',
'neelehh': 'anger',
'rudgard': 'anger',
'016344981': 'anger',
'pramaandika': 'anger',
'hamidibahawa': 'anger',
'spesialers': 'anger',
'superpignan': 'anger',
'082187486748': 'anger',
'tertanya2': 'surprise',
'terperanjat': 'anger',
'cubaa': 'anger',
'stuju': 'anger',
'stayback': 'anger',
'cakaplah': 'anger',
'melebih': 'anger',
'tanyaa': 'anger',
'ngandung': 'anger'}
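The word-to-label mapping above can be turned back into the `{'label': [words]}` shape that the lexicon interface expects as input. This is a small sketch using a few entries copied from the output above; `group_by_label` is a hypothetical helper, not part of Malaya, and in the notebook you would pass `results_emotion` itself.

```python
from collections import defaultdict

# A few entries copied from the probabilistic output above.
sample = {
    'sebal': 'anger',
    'awkward': 'fear',
    'scary': 'fear',
    'syok': 'joy',
    'cinta': 'love',
    'sedih': 'sadness',
    'terkejut': 'surprise',
}

def group_by_label(results):
    """Invert a word -> label mapping into label -> list of words."""
    grouped = defaultdict(list)
    for word, label in results.items():
        grouped[label].append(word)
    return dict(grouped)

induced_lexicon = group_by_label(sample)
print(induced_lexicon['fear'])
```

Grouping this way makes it easier to review each label's induced words by hand before reusing them.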
propagate graph#
def propagate_graph(
lexicon,
wordvector,
pool_size = 10,
top_n = 20,
similarity_power = 10.0,
normalization = True,
soft = False,
silent = False,
):
"""
Graph propagation method adapted from Velikovich, Leonid, et al. "The viability of web-derived polarity lexicons." http://www.aclweb.org/anthology/N10-1119
Parameters
----------
lexicon: dict
curated lexicon from expert domain, {'label1': [str], 'label2': [str]}.
wordvector: object
wordvector interface object.
pool_size: int, optional (default=10)
number of top words (pool size) to pick for each lexicon.
top_n: int, optional (default=20)
the `top_n` scores for each vector are multiplied by `similarity_power`.
similarity_power: float, optional (default=10.0)
extra score for `top_n`; a lower value induces less bias but has a higher chance of an unbalanced outcome.
normalization: bool, optional (default=True)
normalize word vectors using L2 norm. L2 is good to penalize skewed vectors.
soft: bool, optional (default=False)
if True, a word not in the dictionary is replaced with its nearest match by Jaro-Winkler ratio.
if False, an exception is raised when a word is not in the dictionary.
silent: bool, optional (default=False)
if True, will not print any logs.
Returns
-------
tuple: (labels[argmax(scores, axis = 1)], scores, labels)
"""
[14]:
%%time
results_emotion, scores_emotion, labels_emotion = malaya.lexicon.propagate_graph(emotion_lexicon,
wordvector,
pool_size = 10)
populating nearest words from wordvector
populating vectors from populated nearest words
propagate graph from populated nearest words
100%|██████████| 452/452 [00:00<00:00, 1830.24it/s]
CPU times: user 16.5 s, sys: 2.2 s, total: 18.7 s
Wall time: 11.8 s
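The idea behind graph propagation can be sketched as follows: each seed word spreads its label along the strongest path to every other word, where a path's strength is the product of its edge weights. This is a simplified illustration in the spirit of Velikovich et al., not Malaya's implementation; `best_path_scores`, `toy_graph_propagation`, `max_hops`, and the toy weight matrix are assumptions.

```python
import numpy as np

def best_path_scores(weights, seed, max_hops=3):
    """Strongest path (max product of edge weights) from seed to every node."""
    best = np.zeros(weights.shape[0])
    best[seed] = 1.0
    for _ in range(max_hops):
        # Extend every known path by one edge and keep the strongest per node.
        extended = (best[:, None] * weights).max(axis=0)
        best = np.maximum(best, extended)
    return best

def toy_graph_propagation(weights, seeds_by_label, max_hops=3):
    """Sum, per label, the strongest-path scores coming from that label's seeds."""
    scores = np.zeros((weights.shape[0], len(seeds_by_label)))
    for column, seeds in enumerate(seeds_by_label.values()):
        for seed in seeds:
            scores[:, column] += best_path_scores(weights, seed, max_hops)
    return scores

# Hypothetical similarity graph over 4 words (symmetric, zero diagonal).
weights = np.array([
    [0.0, 0.8, 0.1, 0.0],
    [0.8, 0.0, 0.2, 0.0],
    [0.1, 0.2, 0.0, 0.9],
    [0.0, 0.0, 0.9, 0.0],
])
seeds_by_label = {'joy': [0], 'sadness': [3]}
labels = np.array(list(seeds_by_label))

scores = toy_graph_propagation(weights, seeds_by_label)
predicted = labels[np.argmax(scores, axis=1)]
print(predicted)
```

Because only the single strongest path counts, a word is less affected by many weak connections than it would be under a random-walk style propagation.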
[15]:
np.unique(list(results_emotion.values()), return_counts = True)
[15]:
(array(['anger', 'fear', 'joy', 'love', 'sadness', 'surprise'], dtype='<U8'),
array([149, 61, 49, 69, 46, 78]))
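One way to quantify how balanced these label counts are is normalized entropy, where 1.0 means perfectly uniform. The sketch below compares the counts shown here with the counts from the probabilistic run earlier in this tutorial; `normalized_entropy` is a hypothetical helper, not a Malaya function.

```python
import numpy as np

def normalized_entropy(counts):
    """Shannon entropy of the label distribution, scaled to [0, 1]."""
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()
    p = p[p > 0]  # skip empty labels to avoid log(0)
    return float(-(p * np.log(p)).sum() / np.log(len(counts)))

# Counts copied from the two outputs (anger, fear, joy, love, sadness, surprise).
probabilistic_counts = [315, 66, 10, 21, 28, 12]
graph_counts = [149, 61, 49, 69, 46, 78]

print('probabilistic:', round(normalized_entropy(probabilistic_counts), 3))
print('graph:', round(normalized_entropy(graph_counts), 3))
```

The graph run scores higher, matching what the raw counts suggest: its labels are spread more evenly, while the probabilistic run assigns most words to `anger`.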
[16]:
results_emotion
[16]:
{'sebal': 'anger',
'gesture': 'fear',
'se7': 'anger',
'ziraa': 'anger',
'mantepp': 'anger',
'mesem': 'fear',
'nggapapa': 'anger',
'maen2': 'anger',
'gacocok': 'fear',
'jeongwoo': 'anger',
'bergelora': 'anger',
'mereda': 'anger',
'skeptis': 'anger',
'gebus': 'love',
'tyrion': 'fear',
'memuncak': 'anger',
'mewabah': 'anger',
'mengenaskan': 'anger',
'kesasar': 'love',
'kepedean': 'anger',
'annoying': 'anger',
'awkward': 'fear',
'scary': 'fear',
'handsome': 'love',
'nervous': 'fear',
'cringe': 'anger',
'menyampah': 'anger',
'kelakar': 'anger',
'cute': 'love',
'cuak': 'fear',
'bodoh': 'anger',
'bangang': 'anger',
'bebal': 'anger',
'bodo': 'anger',
'noob': 'anger',
'bengap': 'anger',
'celaka': 'anger',
'biadap': 'anger',
'pukimak': 'anger',
'berang': 'anger',
'buru': 'joy',
'nerus': 'anger',
'kangsar': 'fear',
'lipis': 'anger',
'pilah': 'fear',
'besut': 'anger',
'krai': 'anger',
'klawang': 'anger',
'ketil': 'anger',
'amuk': 'anger',
'mbatin': 'love',
'sebarin': 'anger',
'sebarisan': 'fear',
'ngalami': 'joy',
'tikt': 'anger',
'diharga': 'anger',
'threesome': 'anger',
'shizuka': 'anger',
'bokondini': 'anger',
'mendidih': 'anger',
'mengental': 'anger',
'sebati': 'surprise',
'mengembang': 'anger',
'layu': 'surprise',
'kecoklatan': 'anger',
'matang': 'sadness',
'meresap': 'surprise',
'mengering': 'anger',
'direbus': 'anger',
'pengecut': 'anger',
'bajingan': 'fear',
'pembohong': 'fear',
'pecundang': 'fear',
'dungu': 'fear',
'pemberani': 'anger',
'negarawan': 'anger',
'jahil': 'anger',
'biadab': 'fear',
'provokator': 'fear',
'bengang': 'anger',
'menyirap': 'joy',
'meluat': 'surprise',
'frust': 'surprise',
'rimas': 'sadness',
'annoyed': 'anger',
'lonely': 'love',
'berdukacita': 'anger',
'menyakitimu': 'surprise',
'bersinggungan': 'anger',
'bermesra': 'anger',
'meridhoi': 'love',
'menyelubungi': 'anger',
'empukk': 'anger',
'berserban': 'anger',
'diracuni': 'anger',
'dibayangi': 'fear',
'jengkel': 'anger',
'gugup': 'anger',
'dibiasain': 'joy',
'mubazir': 'anger',
'amnesia': 'fear',
'psikopat': 'fear',
'gumoh': 'anger',
'diurusin': 'fear',
'ngangenin': 'joy',
'purging': 'joy',
'babi': 'anger',
'sial': 'surprise',
'kimak': 'surprise',
'anjing': 'fear',
'pundek': 'surprise',
'cibai': 'surprise',
'setan': 'fear',
'lembu': 'anger',
'pedar': 'anger',
'sanwya': 'love',
'qabaya': 'love',
'5pac': 'love',
'wa082336409906': 'love',
'mpibg': 'love',
'honachahthu': 'love',
'unieleven': 'love',
'mengepilkan': 'surprise',
'ciknorzaidi': 'love',
'benci': 'anger',
'jijik': 'fear',
'kagum': 'surprise',
'geli': 'fear',
'insecure': 'sadness',
'geram': 'sadness',
'respect': 'love',
'jealous': 'anger',
'marah': 'anger',
'maki': 'surprise',
'merajuk': 'surprise',
'marah2': 'surprise',
'perli': 'joy',
'jeles': 'love',
'tegur': 'surprise',
'kecam': 'fear',
'cemburu': 'sadness',
'bitter': 'surprise',
'ngeri': 'fear',
'serem': 'anger',
'kocak': 'fear',
'mantep': 'fear',
'miris': 'anger',
'ngeselin': 'fear',
'nyesek': 'fear',
'kesel': 'sadness',
'sebel': 'anger',
'lebay': 'fear',
'phobia': 'fear',
'mendem': 'joy',
'berideologi': 'anger',
'niru': 'anger',
'nyicip': 'anger',
'ngerawat': 'love',
'riweuh': 'joy',
'nmun': 'anger',
'ngancam': 'surprise',
'bencong': 'fear',
'anxiety': 'fear',
'glasses': 'love',
'manners': 'fear',
'satan': 'fear',
'popularity': 'love',
'curl': 'surprise',
'impossible': 'fear',
'mayb': 'love',
'sperm': 'anger',
'nyumpah': 'fear',
'fitnah': 'fear',
'hoax': 'fear',
'provokasi': 'anger',
'kebencian': 'love',
'dusta': 'love',
'hoaks': 'anger',
'kebohongan': 'love',
'bohong': 'anger',
'rasis': 'sadness',
'ngibul': 'anger',
'horror': 'fear',
'horor': 'joy',
'romance': 'love',
'day6': 'anger',
'dokumenter': 'anger',
'porno': 'anger',
'anime': 'joy',
'sinetron': 'joy',
'drakor': 'joy',
'dangdut': 'joy',
'takut': 'fear',
'risau': 'surprise',
'malu': 'fear',
'khawatir': 'sadness',
'segan': 'anger',
'kecewa': 'sadness',
'takot': 'surprise',
'bimbang': 'sadness',
'takutnya': 'love',
'sedih': 'sadness',
'panic': 'fear',
'loud': 'love',
'impressed': 'surprise',
'expected': 'surprise',
'dying': 'joy',
'rush': 'surprise',
'shitty': 'anger',
'smoke': 'surprise',
'suck': 'love',
'cheap': 'fear',
'emo': 'anger',
'boring': 'joy',
'gelabah': 'surprise',
'ngantok': 'surprise',
'syok': 'joy',
'seronok': 'joy',
'busy': 'joy',
'serabut': 'sadness',
'syiok': 'surprise',
'sendu': 'joy',
'riang': 'joy',
'ceria': 'joy',
'takbir': 'joy',
'bersuka': 'love',
'emma': 'love',
'barakah': 'anger',
'telemovie': 'joy',
'riuh': 'surprise',
'ria': 'joy',
'khutbah': 'joy',
'sebak': 'sadness',
'excited': 'joy',
'terharu': 'surprise',
'terliur': 'surprise',
'girang': 'joy',
'ditikung': 'anger',
'ambis': 'anger',
'rafa': 'anger',
'digangguin': 'anger',
'nyiksa': 'fear',
'maruk': 'love',
'tamvan': 'anger',
'pengap': 'anger',
'iklas': 'love',
'puas': 'joy',
'muak': 'sadness',
'kenyang': 'joy',
'lega': 'joy',
'bosan': 'love',
'berbaloi': 'sadness',
'berpuas': 'sadness',
'lelah': 'sadness',
'bahagia': 'joy',
'menyenangkan': 'sadness',
'gelisah': 'sadness',
'nyaman': 'sadness',
'indah': 'sadness',
'sukses': 'sadness',
'sehat': 'sadness',
'damai': 'sadness',
'suka': 'joy',
'sukanya': 'love',
'doyan': 'fear',
'demen': 'anger',
'suke': 'love',
'gasuka': 'love',
'gemar': 'love',
'sukaa': 'love',
'prefer': 'love',
'happy': 'joy',
'hepi': 'love',
'wish': 'love',
'nice': 'surprise',
'cerita': 'joy',
'citer': 'surprise',
'cite': 'surprise',
'crita': 'surprise',
'kisah': 'love',
'percakapan': 'fear',
'tweet': 'love',
'drama': 'joy',
'lagu': 'joy',
'ceramah': 'surprise',
'cinta': 'love',
'kebahagiaan': 'sadness',
'cintanya': 'anger',
'cintaku': 'sadness',
'persahabatan': 'joy',
'cintamu': 'anger',
'kesabaran': 'fear',
'dendam': 'sadness',
'kesedihan': 'sadness',
'asa': 'sadness',
'baby': 'love',
'daddy': 'love',
'mira': 'love',
'princess': 'love',
'bella': 'joy',
'farah': 'surprise',
'mommy': 'love',
'sister': 'surprise',
'mummy': 'love',
'lisa': 'joy',
'love': 'love',
'luv': 'love',
'hate': 'surprise',
'thought': 'surprise',
'mean': 'surprise',
'want': 'joy',
'see': 'surprise',
'need': 'joy',
'hope': 'surprise',
'peace': 'anger',
'syang': 'love',
'noi': 'fear',
'bilang2': 'anger',
'syng': 'anger',
'mut': 'fear',
'ribbey': 'anger',
'seneng2': 'anger',
'butoset': 'anger',
'manly': 'anger',
'twet': 'anger',
'syg': 'love',
'sayangg': 'love',
'sayang': 'love',
'bby': 'surprise',
'cntik': 'anger',
'knl': 'surprise',
'anon': 'anger',
'sistur': 'surprise',
'sayang2': 'surprise',
'bgus': 'anger',
'rindukn': 'love',
'ajeb2an': 'surprise',
'hshakjsjsbs': 'anger',
'miliknyamencatat': 'anger',
'p6a': 'anger',
'ahsjahhaa': 'surprise',
'diwajibk': 'anger',
'protese': 'surprise',
'botaqin': 'surprise',
'kruntel': 'anger',
'rindu': 'love',
'sayangku': 'anger',
'sayangkan': 'love',
'sayangnya': 'fear',
'disayang': 'joy',
'moody': 'surprise',
'rindukan': 'love',
'merindui': 'surprise',
'takutkan': 'surprise',
'banggakan': 'surprise',
'cintakan': 'surprise',
'perbuat': 'surprise',
'merindukan': 'joy',
'ceraikan': 'surprise',
'jumpai': 'anger',
'rindunya': 'surprise',
'teringat': 'surprise',
'rinduu': 'surprise',
'lapar': 'sadness',
'kempunan': 'surprise',
'teringin': 'joy',
'kangen': 'joy',
'confuse': 'anger',
'stress': 'sadness',
'letih': 'joy',
'penat': 'joy',
'stres': 'sadness',
'mengantuk': 'joy',
'tertekan': 'sadness',
'terganggu': 'sadness',
'tertipu': 'sadness',
'keliru': 'sadness',
'mengeluh': 'sadness',
'merosot': 'sadness',
'susut': 'surprise',
'terjebak': 'sadness',
'terpengaruh': 'sadness',
'kesal': 'sadness',
'terkejut': 'surprise',
'bersalah': 'sadness',
'berdosa': 'anger',
'dihargai': 'sadness',
'janggal': 'surprise',
'resah': 'sadness',
'kesepian': 'sadness',
'gundah': 'surprise',
'goyah': 'surprise',
'disakiti': 'anger',
'takjub': 'anger',
'sengsara': 'sadness',
'seram': 'anger',
'menyebalkan': 'fear',
'merana': 'surprise',
'melarat': 'surprise',
'angkuh': 'anger',
'rakus': 'anger',
'terpuruk': 'anger',
'pengsan': 'surprise',
'tertido': 'anger',
'pitam': 'surprise',
'terlelap': 'anger',
'terberak': 'surprise',
'nanges': 'surprise',
'mengamuk': 'surprise',
'tdoq': 'anger',
'termuntah': 'surprise',
'tidor': 'anger',
'bangga': 'anger',
'surprise': 'surprise',
'suprise': 'love',
'makan2': 'fear',
'attention': 'fear',
'kejutan': 'fear',
'assignment': 'fear',
'comeback': 'fear',
'chance': 'love',
'homework': 'fear',
'appointment': 'fear',
'wtf': 'surprise',
'huh': 'love',
'seriously': 'love',
'omg': 'love',
'aik': 'love',
'wth': 'love',
'shit': 'anger',
'apoo': 'anger',
'hah': 'anger',
'damn': 'love',
'stun': 'surprise',
'pinafsueun': 'anger',
'neelehh': 'anger',
'rudgard': 'anger',
'016344981': 'anger',
'pramaandika': 'anger',
'hamidibahawa': 'love',
'spesialers': 'anger',
'superpignan': 'anger',
'082187486748': 'anger',
'tertanya2': 'anger',
'terperanjat': 'anger',
'cubaa': 'anger',
'stuju': 'anger',
'stayback': 'anger',
'cakaplah': 'anger',
'melebih': 'anger',
'tanyaa': 'anger',
'ngandung': 'anger'}
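Since each call in this tutorial overwrites `results_emotion`, comparing two methods requires keeping their outputs under separate names. The sketch below assumes hypothetical variables `probabilistic_results` and `graph_results`, filled here with a few entries copied from the two outputs; `compare_results` is an illustrative helper, not part of Malaya.

```python
# A few entries copied from the propagate_probabilistic output.
probabilistic_results = {
    'sebal': 'anger', 'gesture': 'anger', 'scary': 'fear',
    'handsome': 'anger', 'cute': 'anger', 'syok': 'joy',
}
# The same words as labelled by propagate_graph.
graph_results = {
    'sebal': 'anger', 'gesture': 'fear', 'scary': 'fear',
    'handsome': 'love', 'cute': 'love', 'syok': 'joy',
}

def compare_results(first, second):
    """Return the agreement rate and the words the two mappings disagree on."""
    shared = sorted(set(first) & set(second))
    disagreements = {
        word: (first[word], second[word])
        for word in shared
        if first[word] != second[word]
    }
    agreement = 1 - len(disagreements) / len(shared) if shared else 0.0
    return agreement, disagreements

agreement, disagreements = compare_results(probabilistic_results, graph_results)
print(agreement)
print(disagreements)
```

Words on which the methods disagree, such as `handsome` and `cute` here, are good candidates for manual review before they are added to a curated lexicon.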