3 Confusion Matrix of the Modelling Results OK
[195]: import pandas as pd

# Input files used in other runs, kept for reference:
#04-topik-per-judul-confus-mtr.csv
#03-topik-per-judul-confus-mtr.csv
#03-4cluster.csv
#05-topik-per-judul_b4.csv
#03-topik-per-judul_b-confus-mtr.csv
#03-topik-per-judul.csv
bln = '10'  # file-name prefix (presumably the month; "bln" likely abbreviates "bulan")
#inputfile='dataset/'+bln+'-topik-per-judul-str.csv'
inputfile = 'dataset/' + bln + '-topik-per-judul.csv'
data = pd.read_csv(inputfile, sep=',', encoding='latin-1')
[196]: data.head()
                                            Keywords  \
0  uu_cipta, warga, omnibus_law, protokol_kesehat…
1  gempa_m, demo_omnibus, hari_ini, libur_panjang…
2  gempa_m, demo_omnibus, hari_ini, libur_panjang…
3  uu_cipta, warga, omnibus_law, protokol_kesehat…
4  uu_cipta, warga, omnibus_law, protokol_kesehat…

                                                Text  \
0  ['jokowi', 'tinjau', 'progres', 'wisata', 'pre…
1  ['update', 'covid-19', 'jatim:', '314', 'kasus…
2  ['azerbaijan', 'vs', 'armenia', 'perang,', 'ke…
3  ['perkumpulan', 'warga', 'minang', 'surabaya',…
4  ['azerbaijan', 'vs', 'armenia', 'perang,', 'ri…

                                                Asal
0  jokowi tinjau progres wisata premium labuan ba…
1  update covid-19 jatim: 314 kasus positif baru,…
2  azerbaijan vs armenia perang, kemlu ri: semua …
3  perkumpulan warga minang surabaya dukung machf…
4  azerbaijan vs armenia perang, ri serukan genca…
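The `Text` column in the preview looks like a Python list literal stored as a string, which is a common side effect of writing a list column to CSV. A minimal sketch, assuming that is the case, of recovering the token lists with the standard library. The sample strings are shortened stand-ins rather than real rows, and `parse_tokens` is a hypothetical helper name.

```python
import ast


def parse_tokens(cell):
    """Turn a stringified list such as "['a', 'b']" back into a list of tokens."""
    tokens = ast.literal_eval(cell)
    if not isinstance(tokens, list):
        raise ValueError("expected a list literal, got %s" % type(tokens).__name__)
    return [str(t) for t in tokens]


# Illustrative stand-ins for values of data['Text'] (not real rows)
sample_cells = [
    "['update', 'covid-19', 'jatim:']",
    "['azerbaijan', 'vs', 'armenia']",
]
token_lists = [parse_tokens(c) for c in sample_cells]
joined = [" ".join(tokens) for tokens in token_lists]

print(token_lists[0])
print(joined[1])
```

On the real DataFrame the same helper could be applied with `data['Text'].apply(parse_tokens)`. The default `CountVectorizer` tokenizer ignores brackets and quotes anyway, so this step matters mainly when a custom analyzer is used.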
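from sklearn.feature_extraction.text import CountVectorizer
# Build a bag-of-words matrix from the title tokens
count_vect = CountVectorizer()
countsv = count_vect.fit_transform(data['Text'])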
[201]: print(countsv[:5])
(0, 1045) 0.26265835012301936
(1, 12789) 0.4865069079637143
(1, 12787) 0.13427113605737787
(1, 11823) 0.12535024990481136
(1, 10996) 0.1707753898017361
(1, 10993) 0.12355543101275517
(1, 9858) 0.09629203648599599
(1, 5643) 0.4047431962748549
(1, 5632) 0.08249875186455764
(1, 5243) 0.11305965105757178
(1, 2636) 0.08407602342699545
(1, 1437) 0.3285681803271481
(1, 625) 0.41761471252258714
(1, 277) 0.2002009357807956
(1, 269) 0.20880735626129357
(1, 132) 0.34547486946410433
(2, 13073) 0.31993340458739944
(2, 12914) 0.32661014029531127
(2, 11019) 0.3464965733082987
(2, 10419) 0.24891541956174817
(2, 9330) 0.32460798562589627
(2, 5935) 0.3870418329130274
(2, 1225) 0.34071978906273015
(2, 1099) 0.33434120611855406
(2, 874) 0.35482943783014903
(3, 12984) 0.21849274041960373
(3, 11664) 0.26269513938718436
(3, 9431) 0.49306188197186135
(3, 7944) 0.3938875695892951
(3, 7793) 0.49306188197186135
(3, 6997) 0.3693175439087316
(3, 3858) 0.32479113519810243
(4, 12914) 0.3146165664353413
(4, 11141) 0.3323189370063989
(4, 11049) 0.3417996125269308
(4, 10419) 0.23977490277711946
(4, 9330) 0.3126879336409208
(4, 4326) 0.39926813999704075
(4, 4325) 0.38422281169893363
(4, 1225) 0.32820808948113606
(4, 1099) 0.3220637368814121
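The weights printed above are fractional, whereas `CountVectorizer` on its own yields integer counts, so the matrix was presumably reweighted in a cell that is not shown. A minimal sketch, assuming a `TfidfTransformer` step with default settings; the transformation the author actually used is unknown, and `docs` is a made-up toy corpus.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

# Toy corpus for illustration only (not taken from the dataset)
docs = [
    "gempa hari ini",
    "demo omnibus hari ini",
    "libur panjang demo",
]

count_vect = CountVectorizer()
counts = count_vect.fit_transform(docs)   # sparse matrix of integer term counts

tfidf = TfidfTransformer()                # defaults: smoothed IDF, L2-normalised rows
weights = tfidf.fit_transform(counts)     # sparse matrix of float TF-IDF weights

print(counts.dtype, weights.dtype)
print(weights[:2])                        # printed as (row, column) weight entries

# With L2 normalisation every non-empty row has unit length
row_norms = np.sqrt(weights.multiply(weights).sum(axis=1))
```

If this assumption holds, each value in the printout is the TF-IDF weight of one vocabulary column in one document, which explains why all entries lie between 0 and 1.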
0.8265602322206096
with open('dataset/' + bln + '333-skor-cnf-matrix.txt', 'w') as fjmltop:
    fjmltop.write(str(skorcfm))
[[372  33  48]
 [ 52 375  42]
 [ 36  28 392]]
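The score 0.8265602322206096 reported above can be cross-checked against this matrix. A small sketch, assuming the scikit-learn convention that rows are true labels and columns are predicted labels: accuracy is the diagonal sum divided by the total, and dividing each row by its sum gives the kind of row-normalised matrix a `normalize=True` option would be expected to plot.

```python
import numpy as np

# Confusion matrix as printed in the notebook output
cm = np.array([[372,  33,  48],
               [ 52, 375,  42],
               [ 36,  28, 392]])

total = cm.sum()               # all test samples
correct = np.trace(cm)         # samples on the diagonal (predicted == true)
accuracy = correct / total     # fraction classified correctly

# Row-normalise: each row then sums to 1 and the diagonal holds per-class recall
cm_normalized = cm.astype(float) / cm.sum(axis=1, keepdims=True)

np.set_printoptions(precision=2)
print(total, correct, accuracy)
print(cm_normalized)
```

The diagonal sum is 1139 out of 1378 samples, which agrees with the reported score, so the saved value appears to be plain accuracy.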
print(cm)
fig, ax = plt.subplots()
im = ax.imshow(cm, interpolation='nearest', cmap=cmap)
ax.figure.colorbar(im, ax=ax)
# We want to show all ticks...
ax.set(xticks=np.arange(cm.shape[1]),
       yticks=np.arange(cm.shape[0]),
       # ... and label them with the respective list entries
       xticklabels=classes, yticklabels=classes,
       title=title,
       ylabel='True label',
       xlabel='Predicted label')
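The fragment above is the body of a plotting helper whose `def` line is not shown in this export. A sketch of how such a helper might be assembled, modelled on the call `plot_confusion_matrix(y_test, predicted, classes=..., normalize=True, title=...)` that appears in the notebook. The signature, the default title and colormap, and the toy labels are assumptions, not the author's code.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import confusion_matrix


def plot_confusion_matrix(y_true, y_pred, classes, normalize=False,
                          title='Confusion matrix', cmap=plt.cm.Blues):
    """Plot a confusion matrix (assumed signature, inferred from the call site)."""
    cm = confusion_matrix(y_true, y_pred)
    if normalize:
        # Divide each row by its total so that rows sum to 1
        cm = cm.astype(float) / cm.sum(axis=1, keepdims=True)
    print(cm)

    fig, ax = plt.subplots()
    im = ax.imshow(cm, interpolation='nearest', cmap=cmap)
    ax.figure.colorbar(im, ax=ax)
    # Show all ticks and label them with the class names
    ax.set(xticks=np.arange(cm.shape[1]),
           yticks=np.arange(cm.shape[0]),
           xticklabels=classes, yticklabels=classes,
           title=title,
           ylabel='True label',
           xlabel='Predicted label')
    return ax


# Toy labels for illustration only
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
ax = plot_confusion_matrix(y_true, y_pred, classes=['A', 'B', 'C'],
                           normalize=True, title='Normalized confusion matrix')
```

In an interactive notebook the `matplotlib.use("Agg")` line can be dropped, and `plt.show()` after the call displays the figure.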
[207]: np.set_printoptions(precision=2)
#dff = data['Dominant_Topic']
plt.show()
[208]: plot_confusion_matrix(y_test, predicted, classes=dff, normalize=True,
title='Normalized confusion matrix')
plt.show()
[166]: #confusion_matrix(y_test, predicted)
[179]: # Accuracy
from sklearn.metrics import accuracy_score
#accuracy_score(y_true, y_pred)
print("Accuracy Score : ",accuracy_score(y_test, predicted))
[169]: # Recall
from sklearn.metrics import recall_score
#recall_score(y_true, y_pred, average=None)
print(recall_score(y_test, predicted, average=None))
[170]: # Precision
from sklearn.metrics import precision_score
#precision_score(y_true, y_pred, average=None)
print(precision_score(y_test, predicted, average=None))
[171]: # Source: https://medium.com/@ksnugroho/confusion-matrix-untuk-evaluasi-model-pada-unsupervised-machine-learning-bc4b1ae9ae3f
Classification Report :
precision recall f1-score support
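The body of the report did not survive in this export. A sketch of how per-class precision, recall, and F1 could be regenerated with `classification_report`, assuming the 3x3 confusion matrix printed earlier in the notebook with rows as true labels. Since `y_test` and `predicted` are not available here, label vectors with the same counts are rebuilt from the matrix, and the class ids 0 to 2 are placeholders for the real topic labels.

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# Confusion matrix as printed in the notebook output
cm = np.array([[372,  33,  48],
               [ 52, 375,  42],
               [ 36,  28, 392]])

# Rebuild label vectors: cell (i, j) contributes cm[i, j] samples
# with true label i and predicted label j.
true_idx, pred_idx = np.indices(cm.shape)
y_true = np.repeat(true_idx.ravel(), cm.ravel())
y_pred = np.repeat(pred_idx.ravel(), cm.ravel())

# Sanity check: the rebuilt vectors reproduce the original matrix
rebuilt = confusion_matrix(y_true, y_pred)

print("Classification Report :")
print(classification_report(y_true, y_pred))
```

The order of the rebuilt samples differs from the original test set, but precision, recall, F1, and support depend only on the counts, so the report should match what the original cell produced under the stated assumption.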