Research Article

Hindawi Publishing Corporation
BioMed Research International

Volume 2014, Article ID 831751, 15 pages
http://dx.doi.org/10.1155/2014/831751
Supervised Clustering Based on DPClusO:
Prediction of Plant-Disease Relations Using Jamu
Formulas of KNApSAcK Database
Sony Hartono Wijaya,1,2 Husnawati Husnawati,3 Farit Mochamad Afendi,4 Irmanida Batubara,5 Latifah K. Darusman,5 Md. Altaf-Ul-Amin,1 Tetsuo Sato,1 Naoaki Ono,1 Tadao Sugiura,1 and Shigehiko Kanaya1
1 Graduate School of Information Science, Nara Institute of Science and Technology, 8916-5 Takayama, Ikoma, Nara 630-0192, Japan
2 Department of Computer Science, Bogor Agricultural University, Kampus IPB Dramaga, Jl. Meranti, Bogor 16680, Indonesia
3 Department of Biochemistry, Bogor Agricultural University, Kampus IPB Dramaga, Jl. Meranti, Bogor 16680, Indonesia
4 Department of Statistics, Bogor Agricultural University, Kampus IPB Dramaga, Jl. Meranti, Bogor 16680, Indonesia
5 Biopharmaca Research Center, Bogor Agricultural University, Kampus IPB Taman Kencana, Jl. Taman Kencana No. 3, Bogor 16151, Indonesia
Correspondence should be addressed to Shigehiko Kanaya; skanaya@gtc.naist.jp
Received 30 November 2013; Accepted 18 February 2014; Published 7 April 2014
Academic Editor: Samuel Kuria Kiboi
Copyright © 2014 Sony Hartono Wijaya et al. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly
cited.
Indonesia has the largest number of medicinal plant species in the world, and these plants are used as Jamu medicines. Jamu medicines are popular traditional medicines from Indonesia, and we need to systematize the formulation of Jamu and develop basic scientific principles of Jamu to meet the requirements of the Indonesian Healthcare System. We propose a new approach to predict the relation between plant and disease using network analysis and supervised clustering. As a preliminary step, we assigned 3138 Jamu formulas to 116 diseases of the International Classification of Diseases (ver. 10), which belong to 18 classes of disease from the National Center for Biotechnology Information. The correlation measures between Jamu pairs were determined based on their ingredient similarity. Networks were constructed and analyzed by selecting highly correlated Jamu pairs. Clusters were then generated using the network clustering algorithm DPClusO. Using the matching score of a cluster, the dominant disease and the high-frequency plant associated with the cluster were determined. The plant-disease relations predicted by our method were evaluated in the context of previously published results and were found to produce around 90% successful predictions.
1. Introduction

Big data biology, which is a discipline of data-intensive science, has emerged because of the rapid increase of data in omics fields such as genomics, transcriptomics, proteomics, and metabolomics as well as in several other fields such as ethnomedicinal survey. The number of medicinal plants is estimated to be 40,000 to 70,000 around the world [1] and many countries utilize these plants as blended herbal medicines, for example, China (traditional Chinese medicine), Japan (Kampo medicine), India (Ayurveda, Siddha, and Unani), and Indonesia (Jamu). Nowadays, the use of traditional medicines is rapidly increasing [2, 3]. These medicines consist of ingredients made from plants, animals, minerals, or a combination of them. Traditional medicines have been used for generations for the treatment of diseases or for maintaining health, and the most popular form of traditional medicine is herbal medicine. Blended herbal medicines as well as single herb medicines include a large number of constituent substances which exert effects on human physiology through a variety of biological pathways. The KNApSAcK Family database systems can be used to comprehensively understand the medicinal usage of plants based upon traditional and modern knowledge [4, 5]. This
Table 1: List of diseases using International Classification of Diseases ver. 10 (class of disease IDs correspond to Table 2).
ID | Disease | Class of disease
1 | Abdominal pain | 3
2 | Abdominal pain, diarrhea | 3
3 | Acne | 16
4 | Acne, skin problems (cosmetics) | 16
5 | Amenorrhoea, dysmenorrhea | 6
6 | Amenorrhoea, irregular menstruation | 6
7 | Anaemia | 1
8 | Appendicitis, urinary tract infection, tonsillitis | 3
9 | Arthralgia | 11
10 | Arthralgia, arthritis | 11
11 | Asthma | 15
12 | Benign prostatic hyperplasia (Bph) | 10
13 | Breast disorder | 6
14 | Bromhidrosis | 16
15 | Bronchitis | 15
16 | Cancer | 2
17 | Cancer pain | 2
18 | Cancer, inflammation | 2
19 | Colic abdomen, bloating (in infant) | 3
20 | Common cold | 15
21 | Common cold, dyspepsia, insect bites | 15, 3, 16
22 | Common cold, influenza | 15
23 | Cough | 15
24 | Degenerative disease | 14
25 | Dermatitis, urticaria, erythema | 16
26 | Diabetes | 14
27 | Diabetic gangrene | 16
28 | Diarrhea | 3
29 | Diarrhea, abdominal pain | 3
30 | Diseases of the eye | 5
31 | Disorders in pregnancy | 6
32 | Dysmenorrhea | 6
33 | Dysmenorrhea, irregular menstruation | 6
34 | Dysmenorrhea, menstrual syndrome | 6
35 | Dyspepsia | 3
36 | Dyspnoea | 15
37 | Dyspnoea, cough, orthopnoea | 15
38 | Fatigue | 11
39 | Fatigue, anaemia, loss appetite | 1
40 | Fatigue, lack of sexual function | 6
41 | Fatigue, low back pain | 11
42 | Fatigue, myalgia, arthralgia | 11
43 | Fatigue, osteoarthritis | 11
44 | Fertility problem | 6, 10
45 | Fever | 0
46 | Gastritis, gastric ulcer | 3
47 | Haemorrhoids | 1
48 | Headache | 13
49 | Heart diseases | 8
50 | Heartburn | 3, 8
51 | Hepatitis, other diseases of liver | 3
52 | Hypercholesterolaemia | 14
53 | Hypertension | 8
54 | Hypertension, diabetes | 14
55 | Hypertension, hypercholesterolaemia | 14
56 | Hyperuricemia | 1
57 | Immunodefficiency | 9
58 | Indigestion (K.30) | 3
59 | Indigestion, lose appetite | 3
60 | Infertility | 6, 10
61 | Irregular menstruation, menstruation syndrome | 6
62 | Kidney diseases | 17
63 | Lactation problems | 6
64 | Leukorrhoea (Vaginalis) | 6
65 | Leukorrhoea (Vaginalis), dysmenorrhoea | 6
66 | Lose appetite | 3
67 | Lose appetite, underweight | 14
68 | Low back pain, myalgia, arthralgia | 11
69 | Low back pain, myalgia, constipation | 11
70 | Low back pain, urinary tract infection | 17
71 | Lung diseases | 15
72 | Malaise and Fatigue | 11
73 | Malaise and Fatigue, Constipation | 11
74 | Malaise and Fatigue, Fertility Problems | 10, 11
75 | Malaise and Fatigue, Low Back Pain | 11
76 | Malaise and Fatigue, Sexual Dysfunction | 11, 6, 10
77 | Malaise and Fatigue, Skin Problems (Cosmetics) | 16
78 | Malaria, anaemia | 1
79 | Meno-metrorrhagia | 6
80 | Menopausal syndrome | 6
81 | Menopause/menstrual syndrome, leukorrhoea (vaginalis) | 6
82 | Menstrual syndrome | 6
83 | Menstrual syndrome, fatigue | 6
84 | Migraine | 13
85 | Mood disorder | 18
86 | Myalgia, arthralgia | 11
87 | Nausea/vomiting of pregnancy | 6
88 | Osteoarthritis | 11
89 | Osteoarthritis, fatigue | 11
Table 1: Continued.
ID | Disease | Class of disease
90 | Overweight, obesity | 14
91 | Paralysis | 13
92 | Post partum syndrome | 6
93 | Prevent from overweight | 14
94 | Respiratory infection due to smoking | 15
95 | Respiratory tract infection | 15
96 | Rheumatoid arthritis, gout | 11
97 | Secondary amenorrhea | 6
98 | Secondary amenorrhea, irregular menstruation | 6
99 | Sexual dysfunction, fatigue | 6, 10
100 | Skin diseases | 16
101 | Skin problems (cosmetics) | 16
102 | Sleeping and Mood Disorders | 18
103 | Sleeping disorders | 18
104 | Stomatitis | 3
105 | Stomatitis, gingivitis, tonsilitis | 3
106 | Stone in kidney (N20.0) | 17
107 | Stone in kidney (N20.0), urinary bladder stone (N21.0) | 17
108 | Tonsilitis | 4
109 | Tonsilofaringitis | 4
110 | Toothache | 13
111 | Typhoid, dyspepsia | 3
112 | Ulcer of anus and rectum | 3
113 | Underweight, lose appetite | 3
114 | Urinary tract infection (urethritis) | 17
115 | Vaginal discharges | 6
116 | Vaginal diseases | 6

database has information about the selected herbal ingredients, that is, the formulas of Kampo and Jamu, omics information of plants and humans, and physiological activities in humans. Jamu is generally composed based on the experience of the users for decades or even hundreds of years. However, diverse scientific analyses are needed to support their efficacy and their safety. Attaining this objective is in accordance with the 2010 policy of the Ministry of Health of the Indonesian Government about scientification of Jamu. Thus, it is required to systematize the formulations and develop basic scientific principles of Jamu to meet the requirements of the Indonesian Healthcare System. Afendi et al. initiated and conducted scientific analysis of Jamu for finding the correlation between plants, Jamu, and their efficacy using statistical methods [6–8]. They used Biplot, partial least squares (PLS), and bootstrapping methods to summarize the data and also focused on prediction of Jamu formulations. These methods give a good understanding of the relationship between plants, Jamu, and their efficacy. Among 465 plants used in 3138 Jamu, 190 plants were shown to be effective for at least one efficacy and these plants were considered to be the main ingredients of Jamu. The other 275 plants are considered to be supporting ingredients in Jamu because their efficacy has not been established yet.

Network biology can be defined as the study of the network representations of molecular interactions, both to analyze such networks and to use them as a tool to make biological predictions [9]. This study includes modelling, analysis, and visualization, which play an important role in life science today [10]. Network analysis has been increasingly utilized in interpreting high-throughput data on omics information, including transcriptional regulatory networks [11], coexpression networks [12], and protein-protein interactions [13]. We can easily describe relationships between entities in the network and also concentrate on the part of the network consisting of important nodes or edges. These advantages can be adopted for analyzing the medicinal usage of plants in Jamu and diseases. Network analysis provides information about groups of Jamu that are closely related to each other in terms of ingredient similarity and thus allows precise investigation to relate plants to diseases. On the other hand, multivariate statistical methods such as PLS can assign plants to efficacy by global linear modeling of the Jamu ingredients and efficacy. However, there is still a lack of appropriate network-based methods to learn how and why many plants are grouped in certain Jamu formulas and the combination rules embedded in numerous Jamu formulas.

It is necessary to explore the relationship between Indonesian herbal plants used in Jamu medicines and the diseases which are treated using Jamu medicines. When the effectiveness of a plant against a disease is firmly established, further analysis of that plant can proceed to the molecular level to pinpoint the drug targets. The present study developed a network-based approach for prediction of plant-disease relations. We utilized the Jamu data from the KNApSAcK database. A Jamu network was constructed based on the similarity of their ingredients and then Jamu clusters were generated using the network clustering algorithm DPClusO [14, 15]. Plant-disease relations were then predicted by determining the dominant diseases and plants associated with selected Jamu clusters.

2. Methods

2.1. Concept of the Methodology. Jamu medicines consist of combinations of medicinal plants and are used to treat various diseases. In this work we exploit the ingredient similarity between Jamu medicines to predict plant-disease relations. The concept of the proposed method is depicted in Figure 1. In step 1 a network is constructed where a node is a Jamu medicine and an edge represents high ingredient similarity between the corresponding Jamu pair. In Figure 1, the nodes of the same color indicate the Jamu medicines used for the same disease. The similarity is represented by the Pearson correlation coefficient [16, 17]; that is,

$$\mathrm{corr}(i, j) = \frac{\sum_{k=1}^{N} (x_{ik} - \bar{x}_i)(x_{jk} - \bar{x}_j)}{\sqrt{\sum_{k=1}^{N} (x_{ik} - \bar{x}_i)^2}\,\sqrt{\sum_{k=1}^{N} (x_{jk} - \bar{x}_j)^2}}, \tag{1}$$

where $x_{ik}$ is the weight of plant $k$ in Jamu $i$, $x_{jk}$ is the weight of plant $k$ in Jamu $j$, $\bar{x}_i$ is the mean of Jamu $i$, $\bar{x}_j$ is the mean of Jamu $j$, and the sums run over all $N$ plants. The higher the similarity between a Jamu pair, the higher the correlation value. In the present study, $x_{ik}$ and $x_{jk}$ are assigned 1 or 0 when the $k$th plant is, respectively, included or not included in the formula. Under this condition, the Pearson correlation corresponds to the fourfold point correlation coefficient; that is,

$$\mathrm{corr}(i, j) = \frac{ad - bc}{\sqrt{(a + b)(c + d)(a + c)(b + d)}}, \tag{2}$$

where $a$, $b$, $c$, and $d$ represent the numbers of plants included in both $i$ and $j$, in only $i$, in only $j$, and in neither $i$ nor $j$, respectively.

In step 2 the Jamu clusters are generated using the network clustering algorithm DPClusO. DPClusO can generate clusters characterized by high density and identified by periphery; that is, the Jamu medicines belonging to a cluster are highly cohesive and separated by a natural boundary. Such clusters contain potential information about plant-disease relations.

In step 3 we assess disease-dominant clusters based on the matching score represented by the following equation:

$$\text{matching score} = \frac{\text{number of Jamu belonging to the same disease}}{\text{total number of Jamu in the cluster}}. \tag{3}$$

The matching score of a cluster is the ratio of the highest number of Jamu associated with a single disease to the total number of Jamu in the cluster. We assign a disease to a cluster for which the matching score is greater than a threshold value. In step 4, we determine the frequency of plants associated with a cluster if and only if a disease is assigned to it in the previous step. The highest frequency plant associated with a cluster is considered to be related to the disease assigned to that cluster. The true positive rate (TPR), or sensitivity, was used to evaluate the resulting plants. TPR is the proportion of true positive predictions out of all actual positives, defined by the following formula [18]:

$$\mathrm{TPR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \tag{4}$$

where true positive (TP) is the number of correctly classified entities and false negative (FN) is the number of incorrectly rejected entities. We refer to the proposed method as supervised clustering because after generation of the clusters we narrow down the candidate clusters for further analysis based on supervised learning and thus improve the accuracy of prediction of the proposed method.

Table 2: Distribution of Jamu formulas according to 18 classes of disease (classes of diseases are determined by NCBI in ID1 to ID16 and by the present study in ID17 and ID18 represented by asterisks in Ref. columns).
ID | Class of disease (NCBI) | Ref. | Number of Jamu | Percentage
1 | Blood and lymph diseases | NCBI | 201 | 6.41
2 | Cancers | NCBI | 32 | 1.02
3 | The digestive system | NCBI | 457 | 14.56
4 | Ear, nose, and throat | NCBI | 2 | 0.06
5 | Diseases of the eye | NCBI | 1 | 0.03
6 | Female-specific diseases | NCBI | 382 | 12.17
7 | Glands and hormones | NCBI | 0 |
8 | The heart and blood vessels | NCBI | 57 | 1.82
9 | Diseases of the immune system | NCBI | 22 | 0.70
10 | Male-specific diseases | NCBI | 17 | 0.54
11 | Muscle and bone | NCBI | 649 | 20.68
12 | Neonatal diseases | NCBI | 0 |
13 | The nervous system | NCBI | 32 | 1.02
14 | Nutritional and metabolic diseases | NCBI | 576 | 18.36
15 | Respiratory diseases | NCBI | 313 | 9.97
16 | Skin and connective tissue | NCBI | 163 | 5.19
17 | The urinary system | * | 90 | 2.87
18 | Mental and behavioral disorders | * | 21 | 0.67
The number of Jamu classified into multiple disease classes | | | 119 | 3.79
The number of Jamu unclassified | | | 4 | 0.13
Total Jamu formulas | | | 3138 | 100.00

3. Result and Discussion

3.1. Construction and Comparison of Jamu and Random Networks. We used the same number of Jamu formulas from previous research [6], 3138 Jamu formulas, and the set union
[Figure 1 flowchart. Input: Jamu formulas. Step 1: constructing ingredient correlation network. Step 2: extracting highly connected Jamu (groups labeled A, B, C, and D). Step 3: supervised analysis for voting utilization. Step 4: listing ingredients. Output: plant-disease relations.]
Figure 1: Concept of the methodology: network construction based on ingredient similarity between individual Jamu medicines, network clustering, and classification of medicinal plants to dominant disease.
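
Steps 1, 3, and 4 of the methodology summarized in Figure 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation. The Jamu IDs, plant names, disease labels, both cutoff values, and all function names are hypothetical. Step 2 (DPClusO) is not implemented here, so the cluster is supplied by hand, and each Jamu is assumed to carry a single disease label.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Toy data (hypothetical): Jamu ID -> set of plants, Jamu ID -> one disease label.
formulas = {
    "J1": {"ginger", "turmeric"},
    "J2": {"ginger", "turmeric", "galangal"},
    "J3": {"ginger", "clove"},
    "J4": {"aloe"},
}
disease_of = {"J1": "dyspepsia", "J2": "dyspepsia", "J3": "cough", "J4": "acne"}
all_plants = sorted(set().union(*formulas.values()))


def pearson(x, y):
    """Pearson correlation of two equal-length vectors, as in equation (1)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x)) * sqrt(sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0


def phi(plants_i, plants_j, n_plants):
    """Fourfold point correlation of two plant sets, as in equation (2)."""
    a = len(plants_i & plants_j)   # plants in both formulas
    b = len(plants_i - plants_j)   # plants only in Jamu i
    c = len(plants_j - plants_i)   # plants only in Jamu j
    d = n_plants - a - b - c       # plants in neither formula
    den = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / den if den else 0.0


def build_edges(formulas, cutoff):
    """Step 1: connect Jamu pairs whose correlation exceeds an assumed cutoff."""
    n = len(set().union(*formulas.values()))
    return [(i, j) for i, j in combinations(sorted(formulas), 2)
            if phi(formulas[i], formulas[j], n) > cutoff]


def matching_score(cluster, disease_of):
    """Step 3: dominant disease of a cluster and its matching score, equation (3)."""
    disease, count = Counter(disease_of[j] for j in cluster).most_common(1)[0]
    return disease, count / len(cluster)


def predict_relation(cluster, formulas, disease_of, threshold):
    """Steps 3 and 4: return (plant, disease) if the cluster passes the threshold."""
    disease, score = matching_score(cluster, disease_of)
    if score <= threshold:
        return None
    plant_counts = Counter(p for j in cluster for p in formulas[j])
    return plant_counts.most_common(1)[0][0], disease


def true_positive_rate(tp, fn):
    """Equation (4): TP / (TP + FN)."""
    return tp / (tp + fn)


# Binary plant vectors show that equations (1) and (2) agree for 0/1 weights.
x1 = [1 if p in formulas["J1"] else 0 for p in all_plants]
x2 = [1 if p in formulas["J2"] else 0 for p in all_plants]
print(pearson(x1, x2), phi(formulas["J1"], formulas["J2"], len(all_plants)))
print(build_edges(formulas, cutoff=0.5))
print(predict_relation(["J1", "J2", "J3"], formulas, disease_of, threshold=0.6))
```

In this toy setting the hand-picked cluster contains two dyspepsia Jamu out of three, so it passes the assumed threshold of 0.6 and its most frequent plant is paired with that disease. A real run would take the clusters from DPClusO and would need a rule for Jamu assigned to several diseases, which this sketch leaves out.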
J01353 J02809 J03084 J03083
J02157 J02037 J02039
J01548 J02258
J02948 J00735
J01564 J02786
J01336
J01685 J00146
J01122 J02469
J00161 J01683
J01830 J02034 J02810
J00053 J03094
J02949
J00066 J00820
J01271 J02968 J00077
J02031 J01583
J01527
J02401
J01360 J01121
J02529
J00460 J02085 J02991 J01571 J02945 J00652 J01188 J02875 J02908 J00612 J02609 J02700 J01508 J01405 J02759 J02313 J02433 J01877 J02392 J02778 J00486 J02926 J02909 J02901 J01491 J00552 J01329 J01849 J02096 J00615
J02983
J02993
J02990
J02316 J02083 J02838 J01910 J01180 J02861 J02636 J02247 J01852 J02667 J00697
J02305 J01087 J02874 J00711 J02610
J00916 J01330 J02453 J01587 J02391 J02777 J00438 J02897 J02882 J02887 J01493 J00514 J01326
J01952 J00635 J02876 J00611 J01865 J02302
J02084 J02771 J00123 J01234 J02899 J01834 J01539

J02320
J01268
J00238 J01581 J02650 J00069 J02618 J01108 J02911 J00563 J00339 J02569 J01435 J02846 J02314 J01998 J01818 J00959 J01479 J02761 J00232 J01387 J02298 J01436 J01118 J01080 J00566 J01871 J02675 J01981 J01758 J02709 J01996
J00548 J01872 J02745 J01982 J01759 J02719 J02023
J00364 J01582 J02651 J00070 J02619 J01107 J02129 J02050 J02763 J02669 J02607 J02319 J01437 J01119 J01082
J02890 J00556 J00316 J00954 J00018 J02842 J02301 J00490 J01817
Figure 2: The network consisting of 0.7% Jamu pairs (correlation value above or equal to 0.596).
Table 3: Statistics of three datasets.

Parameters                                         0.7%             0.5%             0.3%
Total pairs                                        34,454           24,610           14,766
Minimum correlation                                0.596            0.665            0.718
Number of Jamu formulas                            2,779            2,496            2,085
Network statistics
  Average degree                                   24.8             19.7             14.2
    (Random network: ER)                           (24.8 ± 0.0)     (19.7 ± 0.0)     (14.2 ± 0.0)
    (Random network: BA)                           (24.7 ± 0.1)     (19.7 ± 0.1)     (14.1 ± 0.1)
    (Random network: CNN)                          (24.7 ± 0.4)     (19.7 ± 0.4)     (14.0 ± 0.4)
  Clustering coefficient                           0.521            0.520            0.540
    (Random network: BA)                           (0.030 ± 0.001)  (0.028 ± 0.001)  (0.026 ± 0.001)
    (Random network: CNN)                          (0.246 ± 0.008)  (0.239 ± 0.008)  (0.233 ± 0.010)
  Number of connected components                   69               119              254
    (Random networks: ER, BA, CNN)                 (1)              (1)              (1)
  Network diameter                                 15               17               20
  Network density                                  0.008            0.008            0.007
DPClusO
  Total number of clusters                         1,746            1,411            938
  Number of clusters with more than 2 Jamu         1,296            873              453
    (%)                                            (74.2)           (61.9)           (48.3)
  Number of Jamu formulas in the biggest cluster   118              104              89
of all formulas consists of 465 plants. We assigned 3138 Jamu formulas to 116 diseases of the International Classification of Diseases (ICD) version 10 from the World Health Organization (WHO, Table 1) [19]. Those 116 diseases are mapped to 18 classes of disease, which comprise 16 classes of disease from the National Center for Biotechnology Information (NCBI) [20] and 2 additional classes. Table 2 shows the distribution of the 3138 Jamu formulas over the 18 classes of disease. According to this classification, most Jamu formulas are useful for relieving muscle and bone, nutritional and metabolic, and digestive system diseases. Furthermore, no Jamu formula is classified into the glands and hormones or neonatal disease classes. We excluded from the evaluation process 4 Jamu formulas that are used to treat fever, because this symptom is very general and appears in almost all disease classes. Jamu-plant-disease relations can be represented using 2 matrices: the first matrix is the Jamu-plant relation with dimension 3138 × 465 and the second matrix is the Jamu-disease relation with dimension 3138 × 18.

After completion of the data acquisition process, we calculated the similarity between Jamu pairs using a correlation measure. The similarity between two Jamu formulas was determined from their ingredients. For $N$ Jamu formulas (3138 in the present case), there can be at most $N(N-1)/2 = 3138 \times 3137/2 = 4{,}921{,}953$ Jamu pairs. We sorted the Jamu pairs by correlation value in descending order and selected the top 0.7%, 0.5%, and 0.3% of pairs to create 3 sets of Jamu pairs. The 0.7%, 0.5%, and 0.3% datasets contain 34,454, 24,610, and 14,766 pairs, and the corresponding minimum correlation values are 0.596, 0.665, and 0.718, respectively. The three datasets of Jamu pairs can be regarded as three undirected networks (step 1 in Figure 1) consisting of 2779, 2496, and 2085 Jamu formulas, respectively (Table 3). Figure 2 shows a visualization of the 0.7% Jamu network using the Cytoscape Spring Embedded layout. We verified that the degree distributions of the Jamu networks are close to those of scale-free networks, that is, they roughly follow a power law. However, in the high-degree region the power law structure is broken (Figure 3). A nearly exact power-law relation between medicinal herbs and the number of formulas utilizing them was observed in the Jamu system but not in Kampo (the Japanese crude drug system) [4]. The difference between Jamu and Kampo formulas can be explained by herb selection by medicinal researchers based on an optimization process [4]. Thus, the broken power-law structure of the Jamu networks is associated with the fact that selecting Jamu pairs based on ingredient correlation leads to nonrandom selection. We also constructed random networks according to the Erdős-Rényi (ER) model [21], the Barabási-Albert (BA) model [22], and Vázquez's Connecting Nearest Neighbor (CNN) model [23], each of the same size as the corresponding real Jamu network. We used the Cytoscape NetworkAnalyzer plugin [24] and R software to analyze the characteristics of both the Jamu and the random networks.

Figure 3: Degree distributions of the three Jamu networks (panels: 0.7%, 0.5%, and 0.3%) roughly follow a power law. The x-axis corresponds to the log of the degree of a node in the Jamu network and the y-axis corresponds to the log of the number of Jamu (frequency).

We determined five statistical indexes, that is, average degree, clustering coefficient, number of connected components, network diameter, and network density, for each Jamu network and also for each random network. The clustering coefficient of a node $n$ is defined as $C_n = 2e_n/(k_n(k_n-1))$, where $k_n$ is the number of neighbors of $n$ and $e_n$ is the number of connected pairs between all neighbors of $n$. The network diameter is the largest distance between any two nodes. If a network is disconnected, its diameter is the maximum of the diameters of its connected components. A network's density is the ratio of the number of edges in the network to the total number of possible edges between all pairs of nodes (which is $V(V-1)/2$ for an undirected graph, where $V$ is the number of vertices). The average number of neighbors and the network density are the same for the real and random networks of the same size, as shown in Table 3. For the 0.7% and 0.5% real networks the clustering coefficient is roughly the same, and for the 0.3% network it is somewhat larger. The number of connected components and the diameter of the Jamu networks gradually decrease as the network grows by the addition of more nodes and edges.
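The network construction and two of the indexes above can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes that the correlation measure is the Pearson coefficient computed on binary plant-usage vectors, and the function names (`pearson`, `top_pairs`, `adjacency`, `clustering_coefficient`, `density`) are hypothetical helpers, not part of Cytoscape, R, or DPClusO.

```python
from itertools import combinations
from math import sqrt

def pearson(u, v):
    """Pearson correlation of two equal-length numeric vectors (assumed measure)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su and sv else 0.0

def top_pairs(formulas, fraction):
    """Rank all N(N-1)/2 formula pairs by correlation; keep the top fraction."""
    scored = [(pearson(formulas[a], formulas[b]), a, b)
              for a, b in combinations(sorted(formulas), 2)]
    scored.sort(reverse=True)
    keep = round(fraction * len(scored))
    return [(a, b) for _, a, b in scored[:keep]]

def adjacency(edges):
    """Undirected adjacency sets built from a list of (node, node) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def clustering_coefficient(adj, n):
    """C_n = 2 e_n / (k_n (k_n - 1)); taken as 0 when n has fewer than 2 neighbors."""
    k = len(adj[n])
    if k < 2:
        return 0.0
    e = sum(1 for a, b in combinations(adj[n], 2) if b in adj[a])
    return 2 * e / (k * (k - 1))

def density(adj):
    """Number of edges divided by the V(V-1)/2 possible edges."""
    v = len(adj)
    m = sum(len(nb) for nb in adj.values()) // 2
    return m / (v * (v - 1) / 2)

# Pair count for the 3138 formulas in the text.
n_pairs = 3138 * 3137 // 2  # 4,921,953
```

With `n_pairs` as above, `round(0.007 * n_pairs)` gives 34,454, which agrees with the size of the 0.7% dataset in Table 3.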
Figure 4: Distribution of clusters based on matching score for the (a) 0.7%, (b) 0.5%, and (c) 0.3% datasets. The x-axis is the matching score (0.0 to 1.0) and the y-axis is the number of clusters.
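The quantity on the x-axis of Figure 4 follows the definition given in Section 3.2: the matching score of a cluster is the largest number of its Jamu formulas that share one disease class, divided by the cluster size. A minimal sketch, assuming each formula carries exactly one disease label and using hypothetical names (`matching_score`, `disease_of`):

```python
from collections import Counter

def matching_score(cluster, disease_of):
    """Return (dominant disease, matching score) for one cluster of formula IDs."""
    counts = Counter(disease_of[j] for j in cluster)
    disease, hits = counts.most_common(1)[0]
    return disease, hits / len(cluster)

# Hypothetical toy cluster: three formulas labelled D11 and one labelled D3.
disease_of = {"J1": "D11", "J2": "D11", "J3": "D11", "J4": "D3"}
dominant, score = matching_score(["J1", "J2", "J3", "J4"], disease_of)
```

For this toy cluster the dominant disease is D11 and the matching score is 3/4.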
Figure 5: (a) Success rate (ratio of number of clusters to total clusters) and (b) number of predicted plants with respect to matching score thresholds (0.0 to 1.0), for the 0.7%, 0.5%, and 0.3% datasets.
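The two curves in Figure 5 can be illustrated with the definitions used in Sections 3.2 and 3.3: the success rate is the share of clusters whose matching score exceeds a threshold, and each such cluster contributes its most frequently used plant, paired with its dominant disease. The sketch below is illustrative only; the function names and the tuple layout of `clusters` are assumptions, and ties between equally frequent plants are not handled specially.

```python
from collections import Counter

def success_rate(scores, threshold):
    """Fraction of clusters whose matching score is larger than the threshold."""
    return sum(s > threshold for s in scores) / len(scores)

def dominant_plant(cluster, plants_of):
    """Plant used by the largest number of Jamu formulas in the cluster."""
    counts = Counter(p for j in cluster for p in set(plants_of[j]))
    return counts.most_common(1)[0][0]

def predict_relations(clusters, threshold):
    """clusters: iterable of (dominant disease, matching score, dominant plant)."""
    return {(plant, disease)
            for disease, score, plant in clusters if score > threshold}

# Hypothetical toy data for illustration.
plants_of = {
    "J1": ["Curcuma longa", "Zingiber officinale"],
    "J2": ["Curcuma longa"],
    "J3": ["Curcuma longa", "Piper nigrum"],
}
plant = dominant_plant(["J1", "J2", "J3"], plants_of)
relations = predict_relations(
    [("D11", 0.9, plant), ("D3", 0.5, "Piper nigrum")], threshold=0.6)
```

Counting the distinct plants in `relations` for a range of thresholds gives a curve of the kind shown in panel (b).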
The markedly different values of the clustering coefficient, the number of connected components, and the network diameter imply that the Jamu networks are quite different from all 3 types of random networks. The differences between the Jamu networks and the ER random networks are the largest. Random networks constructed with the other two models are also substantially different from the Jamu networks. Because the random networks constructed with all three models differ from the Jamu networks, it can be concluded that the structure of the Jamu networks is reasonably biased and thus might contain information about plant-disease relations. In particular, the much higher clustering coefficient indicates that there are clusters in the networks worth investigating. To extract clusters from the Jamu networks (step 2 in Figure 1) we applied the DPClusO network clustering algorithm [14] to generate overlapping clusters based on density and periphery tracking.

3.2. Supervised Clustering Based on DPClusO. DPClusO is a general-purpose clustering algorithm and useful for finding overlapping cohesive groups in an undirected simple graph
Table 4: List of plants assigned to each disease.

Number  Plants name  Hit-miss status

A. Disease: blood and lymph diseases
1  Tamarindus indica  Hit
2  Allium sativum  Hit
3  Tinospora tuberculata  Hit
4  Piper retrofractum  Hit
5  Syzygium aromaticum  Hit
6  Bupleurum falcatum  Hit
7  Graptophyllum pictum  Hit
8  Plantago major  Hit
9  Zingiber officinale  Hit
10  Cinnamomum burmannii  Hit
11  Soya max  Miss
12  Kaempferia galanga  Hit
13  Curcuma longa  Hit
14  Piper nigrum  Hit
15  Zingiber aromaticum  Hit
16  Phyllanthus urinaria  Hit
17  Oryza sativa  Hit
18  Myristica fragrans  Hit
19  Alstonia scholaris  Hit
20  Syzygium polyanthum  Miss
21  Andrographis paniculata  Hit
22  Sida rhombifolia  Miss
23  Cyperus rotundus  Hit
24  Sonchus arvensis  Miss
25  Curcuma aeruginosa  Hit
26  Curcuma xanthorrhiza  Hit

B. Disease: cancers
1  Catharanthus roseus  Hit

C. Disease: the digestive system
1  Foeniculum vulgare  Hit
2  Glycyrrhiza uralensis  Hit
4  Zingiber purpureum  Hit
5  Physalis peruviana  Hit
6  Punica granatum  Hit
7  Echinacea purpurea  Hit
8  Zingiber officinale  Hit
9  Psidium guajava  Hit
10  Baeckea frutescens  Hit
11  Amomum compactum  Hit
12  Cinnamomum burmannii  Hit
13  Melaleuca leucadendra  Hit
14  Caesalpinia sappan  Hit
15  Parkia roxburghii  Hit
16  Rheum tanguticum  Hit
17  Kaempferia galanga  Hit
18  Coriandrum sativum  Hit
19  Curcuma longa  Hit
20  Zingiber aromaticum  Hit
21  Phyllanthus urinaria  Hit
22  Myristica fragrans  Hit
23  Hydrocotyle asiatica  Hit
24  Carica papaya  Hit
25  Mentha arvensis  Hit
26  Lepiniopsis ternatensis  Hit
27  Helicteres isora  Hit
28  Andrographis paniculata  Hit
29  Symplocos odoratissima  Hit
30  Schisandra chinensis  Hit
31  Blumea balsamifera  Hit
32  Silybum marianum  Hit
33  Cinnamomum sintoc  Hit
34  Elephantopus scaber  Hit
35  Curcuma aeruginosa  Hit
36  Kaempferia pandurata  Hit
37  Curcuma xanthorrhiza  Hit
38  Curcuma mangga  Hit
39  Curcuma zedoaria  Hit
40  Daucus carota  Hit
41  Matricaria chamomilla  Hit
42  Cymbopogon nardus  Hit

D. Disease: female-specific diseases
1  Foeniculum vulgare  Hit
2  Imperata cylindrica  Hit
3  Tamarindus indica  Hit
4  Pluchea indica  Hit
5  Piper retrofractum  Hit
6  Punica granatum  Hit
7  Uncaria rhynchophylla  Hit
8  Zingiber officinale  Hit
9  Guazuma ulmifolia  Hit
10  Nigella sativa  Hit
11  Terminalia bellirica  Hit
12  Baeckea frutescens  Hit
13  Phaseolus radiatus  Hit
14  Amomum compactum  Hit
15  Sauropus androgynus  Hit
16  Usnea misaminensis  Hit
17  Cinnamomum burmannii  Hit
18  Melaleuca leucadendra  Hit
19  Parameria laevigata  Hit
20  Parkia roxburghii  Hit
21  Piper cubeba  Hit
22  Kaempferia galanga  Hit
Table 4: Continued.

23  Coriandrum sativum  Hit
24  Kaempferia angustifolia  Hit
25  Curcuma longa  Hit
26  Zingiber aromaticum  Hit
27  Languas galanga  Hit
28  Galla lusitania  Hit
29  Quercus lusitanica  Hit
30  Hydrocotyle asiatica  Hit
31  Areca catechu  Hit
32  Lepiniopsis ternatensis  Hit
33  Helicteres isora  Hit
34  Piper betle  Hit
35  Elephantopus scaber  Hit
36  Kaempferia pandurata  Hit
37  Curcuma xanthorrhiza  Hit
38  Sesbania grandiflora  Hit

E. Disease: the heart and blood vessels
1  Allium sativum  Hit
2  Curcuma longa  Hit
3  Morinda citrifolia  Hit
4  Homalomena occulta  Hit
5  Hydrocotyle asiatica  Hit
6  Alstonia scholaris  Hit
7  Syzygium polyanthum  Miss
8  Andrographis paniculata  Hit
9  Apium graveolens  Miss
10  Imperata cylindrica  Hit

F. Disease: male-specific diseases
1  Cucurbita pepo  Miss
2  Serenoa repens  Miss
3  Baeckea frutescens  Hit
4  Phaseolus radiatus  Hit
5  Curcuma longa  Hit
6  Elephantopus scaber  Hit

G. Disease: muscle and bone
1  Foeniculum vulgare  Hit
2  Clausena anisum-olens  Hit
3  Zingiber purpureum  Hit
4  Allium sativum  Hit
5  Strychnos ligustrina  Hit
6  Tinospora tuberculata  Hit
7  Piper retrofractum  Hit
8  Syzygium aromaticum  Hit
9  Cola nitida  Hit
10  Ginkgo biloba  Hit
11  Panax ginseng  Hit
12  Equisetum debile  Hit
13  Zingiber officinale  Hit
14  Ganoderma lucidum  Hit
15  Nigella sativa  Hit
16  Terminalia bellirica  Hit
17  Baeckea frutescens  Hit
18  Amomum compactum  Hit
19  Cinnamomum burmannii  Hit
20  Melaleuca leucadendra  Hit
21  Parameria laevigata  Hit
22  Psophocarpus tetragonolobus  Hit
23  Parkia roxburghii  Hit
24  Piper cubeba  Hit
25  Kaempferia galanga  Hit
26  Coriandrum sativum  Hit
27  Cola acuminata  Hit
28  Coffea arabica  Hit
29  Orthosiphon stamineus  Hit
30  Curcuma longa  Hit
31  Piper nigrum  Hit
32  Alpinia galanga  Hit
33  Vitex trifolia  Hit
34  Zingiber amaricans  Hit
35  Zingiber zerumbet  Hit
36  Zingiber aromaticum  Hit
37  Languas galanga  Hit
38  Massoia aromatica  Hit
39  Morinda citrifolia  Hit
40  Carum copticum  Hit
41  Panax pseudoginseng  Hit
42  Oryza sativa  Hit
43  Myristica fragrans  Hit
44  Pandanus amaryllifolius  Hit
45  Eurycoma longifolia  Hit
46  Hydrocotyle asiatica  Hit
47  Areca catechu  Hit
48  Mentha arvensis  Hit
49  Lepiniopsis ternatensis  Hit
50  Pimpinella pruatjan  Hit
51  Andrographis paniculata  Hit
52  Blumea balsamifera  Hit
53  Cymbopogon nardus  Hit
54  Sida rhombifolia  Hit
55  Cinnamomum sintoc  Hit
56  Piper betle  Hit
57  Talinum paniculatum  Hit
58  Elephantopus scaber  Hit
59  Cyperus rotundus  Hit
60  Curcuma aeruginosa  Hit
61  Kaempferia pandurata  Hit
Table 4: Continued.

62  Curcuma xanthorrhiza  Hit
63  Tribulus terrestris  Hit
64  Corydalis yanhusuo  Hit
65  Pausinystalia yohimbe  Hit

H. Disease: nutritional and metabolic diseases
1  Foeniculum vulgare  Hit
2  Glycyrrhiza uralensis  Hit
3  Zingiber purpureum  Hit
4  Allium sativum  Hit
5  Tinospora tuberculata  Hit
6  Pandanus conoideus  Hit
7  Syzygium aromaticum  Hit
8  Punica granatum  Hit
9  Zingiber officinale  Hit
10  Guazuma ulmifolia  Hit
11  Nigella sativa  Hit
12  Amomum compactum  Hit
13  Cinnamomum burmannii  Hit
14  Parameria laevigata  Hit
15  Caesalpinia sappan  Hit
16  Soya max  Hit
17  Cocos nucifera  Hit
18  Rheum tanguticum  Hit
19  Piper cubeba  Hit
20  Murraya paniculata  Hit
21  Kaempferia galanga  Hit
22  Coffea arabica  Hit
23  Orthosiphon stamineus  Hit
24  Curcuma longa  Hit
25  Piper nigrum  Hit
26  Zingiber aromaticum  Hit
27  Aloe vera  Hit
28  Phaleria papuana  Hit
29  Galla lusitania  Hit
30  Quercus lusitanica  Hit
31  Morinda citrifolia  Hit
32  Myristica fragrans  Hit
33  Momordica charantia  Hit
34  Areca catechu  Hit
35  Lepiniopsis ternatensis  Hit
36  Alstonia scholaris  Hit
37  Hibiscus sabdariffa  Hit
38  Laminaria japonica  Hit
39  Syzygium polyanthum  Hit
40  Andrographis paniculata  Hit
41  Sindora sumatrana  Hit
42  Cassia angustifolia  Hit
43  Woodfordia floribunda  Hit
44  Piper betle  Hit
45  Spirulina  Hit
46  Stevia rebaudiana  Hit
47  Theae sinensis  Hit
48  Sonchus arvensis  Hit
49  Curcuma heyneana  Hit
50  Curcuma aeruginosa  Hit
51  Kaempferia pandurata  Hit
52  Curcuma xanthorrhiza  Hit
53  Curcuma zedoaria  Hit
54  Olea europaea  Hit

I. Disease: respiratory diseases
1  Foeniculum vulgare  Hit
2  Clausena anisum-olens  Hit
3  Glycyrrhiza uralensis  Hit
4  Zingiber purpureum  Hit
5  Piper retrofractum  Hit
6  Syzygium aromaticum  Hit
7  Gaultheria punctata  Hit
8  Panax ginseng  Hit
9  Equisetum debile  Hit
10  Zingiber officinale  Hit
11  Citrus aurantium  Hit
12  Nigella sativa  Hit
13  Amomum compactum  Hit
14  Cinnamomum burmannii  Hit
15  Melaleuca leucadendra  Hit
16  Parkia roxburghii  Hit
17  Cocos nucifera  Hit
18  Piper cubeba  Hit
19  Kaempferia galanga  Hit
20  Coriandrum sativum  Hit
21  Curcuma longa  Hit
22  Piper nigrum  Hit
23  Zingiber aromaticum  Hit
24  Languas galanga  Hit
25  Mentha piperita  Hit
26  Oryza sativa  Hit
27  Myristica fragrans  Hit
28  Pandanus amaryllifolius  Hit
29  Hydrocotyle asiatica  Hit
30  Mentha arvensis  Hit
31  Lepiniopsis ternatensis  Hit
32  Helicteres isora  Hit
33  Blumea balsamifera  Hit
34  Cymbopogon nardus  Hit
35  Piper betle  Hit
36  Curcuma xanthorrhiza  Hit
Table 4: Continued.

Number  Plants name  Hit-miss status

37  Salix alba  Hit
38  Matricaria chamomilla  Miss

J. Disease: skin and connective tissue
1  Strychnos ligustrina  Hit
2  Merremia mammosa  Hit
3  Piper retrofractum  Hit
4  Santalum album  Hit
5  Zingiber officinale  Hit
6  Citrus aurantium  Hit
7  Citrus hystrix  Hit
8  Cassia siamea  Hit
9  Cocos nucifera  Hit
10  Trigonella foenum-graecum  Hit
11  Orthosiphon stamineus  Hit
12  Curcuma longa  Hit
13  Vetiveria zizanioides  Hit
14  Aloe vera  Hit
15  Rosa chinensis  Hit
16  Jasminum sambac  Hit
17  Phyllanthus urinaria  Hit
18  Mentha piperita  Hit
19  Oryza sativa  Hit
20  Myristica fragrans  Hit
21  Hydrocotyle asiatica  Hit
22  Lepiniopsis ternatensis  Hit
23  Alstonia scholaris  Hit
24  Andrographis paniculata  Hit
25  Cymbopogon nardus  Hit
26  Piper betle  Hit
27  Theae sinensis  Hit
28  Curcuma heyneana  Hit
29  Kaempferia pandurata  Hit
30  Curcuma xanthorrhiza  Hit
31  Melaleuca leucadendra  Hit
32  Matricaria chamomilla  Miss

K. Disease: the urinary system
1  Foeniculum vulgare  Hit
3  Strychnos ligustrina  Hit
4  Plantago major  Hit
5  Zingiber officinale  Hit
6  Cinnamomum burmannii  Hit
7  Strobilanthes crispus  Hit
8  Kaempferia galanga  Hit
9  Orthosiphon stamineus  Hit
10  Phyllanthus urinaria  Hit
11  Blumea balsamifera  Hit
12  Sonchus arvensis  Hit
13  Curcuma xanthorrhiza  Hit

indicates that the plant will not be assigned if we use matching score >0.7.

Figure 6: Distribution of 135 plants assigned based on 0.7% dataset with respect to the number of diseases they are assigned to. The x-axis is the number of diseases (1 to 9) and the y-axis is the number of plants; the bar values, from 1 to 9 diseases, are 63, 24, 18, 14, 5, 6, 2, 2, and 1.

for any type of application. It ensures coverage and performs robustly in case of random addition, removal, and rearrangement of edges in protein-protein interaction (PPI) networks [14]. When applying DPClusO, we set the density and cluster property parameters to 0.9 and 0.5, respectively [15]. Table 3 summarizes the clustering results of DPClusO. Because clusters consisting of two Jamu formulas are trivial, for the next steps we only use clusters that consist of 3 or more Jamu formulas. The total number of clusters increases with the size of the dataset, although the threshold correlation between Jamu pairs decreases. We evaluated the clustering result using the matching score to determine the dominant disease for every cluster (step 3 in Figure 1). The matching score of a cluster is the ratio of the highest number of Jamu associated with the same disease to the total number of Jamu in the cluster. Thus the matching score measures how strongly a disease is associated with a cluster. Figure 4 shows the distribution of the clusters with respect to matching score for the three datasets. All datasets have the highest frequency of clusters at matching score >0.9, and overall most clusters have high matching scores, which means that most of the DPClusO-generated clusters can be confidently related to a dominant disease. Furthermore, in the 0.3% dataset the number of clusters with matching score >0.9 is remarkably larger than the number in any other range of matching score (Figure 4(c)). Comparing the proportion of clusters with matching score >0.9 across datasets, the 0.3% dataset has the highest proportion at 40.84% (of 453), compared to 29.67% (of 873) and 21.91% (of 1296) for the 0.5% and 0.7% datasets, respectively. Thus, the most reliable species-to-disease relations can be predicted at matching score >0.9 from the clusters generated from the 0.3% dataset. Figure 5(a) shows the success rate for all 3 datasets with respect to threshold matching scores. Success rate is defined as the ratio of the number of clusters with matching score larger than the threshold to the total number of clusters. As expected, the success rate tends to be lower if we decrease the correlation value used to create the datasets. However, more clusters are generated and more information can be extracted when we lower the threshold correlation value. The success rate increases rapidly as the matching score decreases
Table 5: Relation between disease classes in NCBI and efficacy classes reported by Afendi et al. [6].

Class of disease                           Ref.   Efficacy class
D1   Blood and lymph diseases              NCBI   E7 Pain/inflammation (PIN)
D2   Cancers                               NCBI   E7 Pain/inflammation (PIN)
D3   The digestive system                  NCBI   E4 Gastrointestinal disorders (GST); E7 Pain/inflammation (PIN)
D4   Ear, nose, and throat                 NCBI   E7 Pain/inflammation (PIN)
D5   Diseases of the eye                   NCBI   E7 Pain/inflammation (PIN)
D6   Female-specific diseases              NCBI   E5 Female reproductive organ problems (FML)
D7   Glands and hormones                   NCBI   E7 Pain/inflammation (PIN)
D8   The heart and blood vessels           NCBI   E7 Pain/inflammation (PIN)
D9   Diseases of the immune system         NCBI   E7 Pain/inflammation (PIN)
D10  Male-specific diseases                NCBI   E6 Musculoskeletal and connective tissue disorders (MSC)
D11  Muscle and bone                       NCBI   E6 Musculoskeletal and connective tissue disorders (MSC)
D12  Neonatal diseases                     NCBI   E7 Pain/inflammation (PIN)
D13  The nervous system                    NCBI   E7 Pain/inflammation (PIN)
D14  Nutritional and metabolic diseases    NCBI   E2 Disorders of appetite (DOA); E4 Gastrointestinal disorders (GST)
D15  Respiratory diseases                  NCBI   E8 Respiratory disease (RSP); E7 Pain/inflammation (PIN)
D16  Skin and connective tissue            NCBI   E9 Wounds and skin infections (WND)
D17  The urinary system                    -      E1 Urinary related problems (URI)
D18  Mental and behavioural disorders      -      E3 Disorders of mood and behavior (DMB)
from 0.9 to 0.6, and after that the success rate increases more slowly. Therefore, in this study we empirically chose 0.6 as the threshold matching score to predict plant-disease relations.

3.3. Assignment of Plants to Disease. Using the clusters produced by DPClusO, we assigned plants to classes of disease. Based on a threshold matching score, we assigned a dominant disease to each cluster. We then assigned a plant to the cluster by analyzing the ingredients of the Jamu formulas belonging to that cluster and determining the highest frequency plant, that is, the plant used in the largest number of Jamu formulas belonging to that cluster (step 4 in Figure 1). Thus we assign a disease and a plant to each cluster whose matching score is greater than the threshold. Our hypothesis is that the disease and the plant assigned to the same cluster are related.

The total number of assigned plants depends on the matching score threshold. Figure 5(b) shows the number of predicted plants that can be assigned to diseases as a function of the matching score. With a higher matching score threshold, the number of predicted plants assigned to classes of disease is expected to stay similar or decrease, while the reliability of the prediction increases. In Figure 5(b) a sudden change in the number of predicted plants is seen at matching score 0.6, which we take as the empirical threshold in this work. The 0.7% dataset yielded the largest number of plants assigned to diseases (135 plants, Table 4). Of these, 63 plants are assigned to only one class of disease, whereas the other 72 plants are assigned to two or more classes of disease (Figure 6).

3.4. Evaluation of the Supervised Clustering Based on DPClusO. We used previously published results [6] as the gold standard to evaluate our results. The previous study assigned plants to 9 kinds of efficacy, whereas we assigned the plants to 18 disease classes (16 from NCBI and 2 additional classes). For the sake of evaluation, a professional doctor mapped the 18 disease classes to the 9 efficacy classes, as shown in Table 5. Table 6 shows the prediction result of plant-disease relations for all 3 datasets, corresponding to clusters with matching score greater than 0.6. Table 6 also shows the corresponding efficacy, the number of assigned plants, the number of correctly predicted plants, and the true positive rate (TPR).

We determined the TPR corresponding to a disease/efficacy class by calculating the ratio of the number of correct predictions to the number of all predictions. When a disease corresponds to more than one kind of efficacy, the highest TPR can be considered the TPR for the corresponding disease. For all 3 datasets the TPR corresponding to each disease is roughly 90% or more. The 0.3% dataset consists of Jamu pairs with higher correlation values, and based on this dataset 117 plants are assigned to 14 disease classes. The 0.7% dataset contains more Jamu pairs and assigned plants to 11 disease classes, one less than the 0.5% dataset. The two disease classes covered by the 0.3% dataset but not by the 0.5% and 0.7% datasets are the nervous system (D13) and diseases of the immune system (D9). The only disease class covered by the 0.3% and 0.5% datasets but not by the 0.7% dataset is mental and behavioural disorders (D18). The larger dataset network tends to have
Table 6: The prediction result of plant-disease relations using matching score >0.6. For each dataset the columns give the number of assigned plants, the number of correct predictions, and the true positive rate (TPR).

                              0.7% dataset               0.5% dataset               0.3% dataset
Class of  Corresponding  Assigned  Correct  TPR     Assigned  Correct  TPR     Assigned  Correct  TPR
disease   efficacy
D1        E7             26        22       0.85    24        20       0.83    24        20       0.83
D2        E7             1         1        1.00    5         5        1.00    1         1        1.00
D3        E4             42        42       1.00    33        33       1.00    28        28       1.00
D3        E7             42        38       0.90    33        30       0.91    28        25       0.89
D4        E7             0         0        -       0         0        -       0         0        -
D5        E7             0         0        -       0         0        -       0         0        -
D6        E5             38        38       1.00    37        37       1.00    32        32       1.00
D7        E7             0         0        -       0         0        -       0         0        -
D8        E7             10        8        0.80    8         7        0.88    6         5        0.83
D9        E7             0         0        -       0         0        -       1         1        1.00
D10       E6             6         4        0.67    2         0        -       3         1        0.33
D11       E6             65        65       1.00    71        71       1.00    60        60       1.00
D12       E7             0         0        -       0         0        -       0         0        -
D13       E7             0         0        -       0         0        -       5         5        1.00
D14       E2             54        44       0.81    45        36       0.80    35        26       0.74
D14       E4             54        54       1.00    45        45       1.00    35        35       1.00
D15       E7             38        37       0.97    34        34       1.00    33        33       1.00
D15       E8             38        31       0.82    34        30       0.88    33        29       0.88
D16       E9             32        31       0.97    32        32       1.00    27        27       1.00
D17       E1             13        13       1.00    9         9        1.00    8         8        1.00
D18       E3             0         0        -       5         5        1.00    4         4        1.00
Total assigned plants    135                        129                        117
lower coverage of disease classes. The number of Jamu pairs, that is, the number of edges in the network, affects the number of clusters produced by DPClusO and the number of Jamu formulas per cluster. As a consequence, for the larger dataset networks, the success rate and the coverage of disease classes become lower, but more plant-disease relations can be predicted.

4. Conclusions
This paper introduces a novel method called supervised clustering for analyzing big biological data by integrating network clustering and selection of clusters based on supervised learning. In the present work we applied the method to data mining of the Jamu formulas accumulated in the KNApSAcK database. Jamu networks were constructed based on correlation similarities between Jamu formulas, and then the network clustering algorithm DPClusO was applied to generate high density Jamu modules. For the subsequent analysis, potential clusters were selected by supervised learning. The successful clusters, which contain several Jamu related to the same disease, might be useful for finding the main ingredient plant for that disease, whereas clusters with lower matching scores will be associated with varying plants, which might be supporting ingredients. By applying the proposed method, important plants from Jamu formulas were determined for every class of disease. The plant-to-disease relations predicted by the proposed network-based method were evaluated against previously published results and were found to produce a TPR of 90%. For the larger dataset networks, the success rate and the coverage of disease classes become lower, but more plant-disease relations can be predicted.

Conflict of Interests
The authors declare that there is no financial interest or conflict of interests regarding the publication of this paper.

Acknowledgments
This work was supported by the National Bioscience Database Center in Japan and the Ministry of Education, Culture, Sports, Science, and Technology of Japan (Grant-in-Aid for Scientific Research on Innovation Areas "Biosynthetic Machinery. Deciphering and Regulating the System for Creating Structural Diversity of Bioactivity Metabolites (2007)").
References

[1] R. Verpoorte, H. K. Kim, and Y. H. Choi, "Plants as source of medicines," in Medicinal and Aromatic Plants, R. J. Bogers, L. E. Craker, and D. Lange, Eds., chapter 19, pp. 261-273, 2006.
[2] A. Furnham, "Why do people choose and use complementary therapies?" in Complementary Medicine: An Objective Appraisal, E. Ernst, Ed., pp. 71-88, Butterworth-Heinemann, Oxford, UK, 1996.
[3] E. Ernst, "Herbal medicines put into context," British Medical Journal, vol. 327, no. 7420, pp. 881-882, 2003.
[4] F. M. Afendi, T. Okada, M. Yamazaki et al., "KNApSAcK family databases: integrated metabolite-plant species databases for multifaceted plant research," Plant and Cell Physiology, vol. 53, no. 2, p. e1, 2012.
[5] F. M. Afendi, N. Ono, Y. Nakamura et al., "Data mining methods for omics and knowledge of crude medicinal plants toward big data biology," Computational and Structural Biotechnology Journal, vol. 4, no. 5, Article ID e201301010, 2013.
[6] F. M. Afendi, L. K. Darusman, A. Hirai et al., "System biology approach for elucidating the relationship between Indonesian herbal plants and the efficacy of Jamu," in Proceedings of the 10th IEEE International Conference on Data Mining Workshops (ICDMW '10), pp. 661-668, Sydney, Australia, December 2010.
[7] F. M. Afendi, L. K. Darusman, A. H. Morita et al., "Efficacy of Jamu formulations by PLS modeling," Current Computer-Aided Drug Design, vol. 9, pp. 46-59, 2013.
[8] F. M. Afendi, L. K. Darusman, M. Fukuyama, M. Altaf-Ul-Amin, and S. Kanaya, "A bootstrapping approach for investigating the consistency of assignment of plants to Jamu efficacy by PLS-DA model," Malaysian Journal of Mathematical Sciences, vol. 6, no. 2, pp. 147-164, 2012.
[9] W. Winterbach, P. V. Mieghem, M. Reinders, H. Wang, and D. de Ridder, "Topology of molecular interaction networks," BMC Systems Biology, vol. 7, article 90, 2013.
[10] C. Bachmaier, U. Brandes, and F. Schreiber, "Biological network," in Handbook of Graph Drawing and Visualization, pp. 621-651, CRC Press, 2013.
[11] X. Chen, M. Chen, and K. Ning, "BNArray: an R package for constructing gene regulatory networks from microarray data by using Bayesian network," Bioinformatics, vol. 22, no. 23, pp. 2952-2954, 2006.
[12] P. Langfelder and S. Horvath, "WGCNA: an R package for weighted correlation network analysis," BMC Bioinformatics, vol. 9, article 559, 2008.
[13] A. Martin, M. E. Ochagavia, L. C. Rabasa, J. Miranda, J. Fernandez-de-Cossio, and R. Bringas, "BisoGenet: a new tool for gene network building, visualization and analysis," BMC Bioinformatics, vol. 11, article 91, 2010.
[14] M. Altaf-Ul-Amin, M. Wada, and S. Kanaya, "Partitioning a PPI network into overlapping modules constrained by high-density and periphery tracking," ISRN Biomathematics, vol. 2012, Article ID 726429, 11 pages, 2012.
[15] M. Altaf-Ul-Amin, H. Tsuji, K. Kurokawa, H. Asahi, Y. Shinbo, and S. Kanaya, "DPClus: a density-periphery based graph clustering software mainly focused on detection of protein complexes in interaction networks," Journal of Computer Aided Chemistry, vol. 7, pp. 150-156, 2006.
[16] S. K. Kachigan, Multivariate Statistical Analysis: A Conceptual Introduction, Radius Press, New York, NY, USA, 1991.
[17] J. L. Rodgers and W. A. Nicewander, "Thirteen ways to look at the correlation coefficient," The American Statistician, vol. 42, pp. 59-66, 1995.
[18] M. Li, J.-E. Chen, J.-X. Wang, B. Hu, and G. Chen, "Modifying the DPClus algorithm for identifying protein complexes based on new topological structures," BMC Bioinformatics, vol. 9, article 398, 2008.
[19] World Health Organization, International Classification of Diseases (ICD) 10, 2010, http://www.who.int/classifications/icd/en/.
[20] National Center for Biotechnology Information, Genes and Disease, NCBI, Bethesda, Md, USA, 1998.
[21] P. Erdos and A. Renyi, "On the evolution of random graph," Publicationes Mathematicae Debrecen, vol. 6, pp. 290-297, 1959.
[22] A.-L. Barabasi and R. Albert, "Emergence of scaling in random networks," Science, vol. 286, no. 5439, pp. 509-512, 1999.
[23] A. Vazquez, "Growing network with local rules: preferential attachment, clustering hierarchy, and degree correlations," Physical Review E: Statistical, Nonlinear, and Soft Matter Physics, vol. 67, no. 5, Article ID 056104, 15 pages, 2003.
[24] Max Planck Institut Informatik, NetworkAnalyzer, 2013, http://med.bioinf.mpi-inf.mpg.de/netanalyzer/index.php.
