
Available online at www.sciencedirect.com
ScienceDirect
Procedia Computer Science 159 (2019) 1347–1356
www.elsevier.com/locate/procedia

23rd International Conference on Knowledge-Based and Intelligent Information & Engineering Systems
A Method of Extracting and Classifying Local Community
Problems from Citizen-Report Data using Text Mining
Eiji Kano a,b,*, Yoshikatsu Fujita c, Kazuhiko Tsuda a
a University of Tsukuba, Graduate School of Business Sciences, Bunkyo-ku, Tokyo 112-0012, Japan
b Institute of Administrative Information Systems, Dept. of Research and Publication, Chiyoda-ku, Tokyo 100-0012, Japan
c Teikyo University, Dept. of Sociology, Hachioji, Tokyo 192-0395, Japan

Abstract
Local governments are required to appropriately prioritize and respond to regional issues that are becoming diversified and
complicated under the constraints of manpower and budget. In addition, it is required to understand regional issues based on
objective data analysis from the viewpoint of “evidence-based policymaking.” Under these circumstances, the “Citizen-Report”
mechanism, which has recently been introduced in local governments, may contribute not only to prompt resolution of individual
field problems but also to the clarification of the tendencies of problem occurrence. However, the classification of problems is not
necessarily set for reflecting the actual occurrence tendencies. Therefore, this study proposes a method to extract and classify
problems in an objective and reproducible manner that reflects the tendencies of actual problem occurrences by analyzing the
content of the Citizen-Report using text mining. We verify this method using the data of Chiba City as an example, and the result
shows that the tendencies of real problem occurrence, which were not able to be understood by classifying based on the category
of the department in charge in local government such as “roads” or “parks,” became clear. This method is also applicable to other
Citizen-Report data and can be expected to be used for understanding regional issues in various local governments.
© 2019 The Author(s). Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/)
Peer-review under responsibility of KES International.
Keywords: Text mining, Citizen Report, Chiba-Repo, government

* Corresponding author. Tel.: +81-3-3500-1121; fax: +81-3-3500-1122.
E-mail address: kanoeiji0626@gmail.com
1877-0509 © 2019 The Author(s). Published by Elsevier B.V.
10.1016/j.procs.2019.09.305
1348 Eiji Kano et al. / Procedia Computer Science 159 (2019) 1347–1356

1. Introduction

In recent years, the circumstances surrounding the operation of local governments have become increasingly severe.
Given the declining working-age population caused by a falling birthrate and an aging population, and the financial
tightness caused by decreasing tax revenue and increasing social security costs, the shortage of administrative
staff has become serious. At the same time, the issues that local governments have to address are becoming more
complicated and diversified, and as a result, it is becoming difficult to maintain a sufficient level of community
services. Therefore, local governments are required to prioritize appropriately and respond to issues within the
constraints of manpower and budget [1]. In addition, they are required to understand regional issues based on objective
data analysis from the viewpoint of “evidence-based policymaking” [2].
Under such circumstances, local governments have in recent years become increasingly interested in the “Citizen-Report”
mechanism, in which residents report local community problems such as road damage and broken lights to the local
government via smartphone applications and similar channels.
With a Citizen-Report system in place, residents can contact the administration at any time, and problems can be
resolved promptly. Local governments can also expect to improve the efficiency of their operations and
services, for example by reducing the load of handling telephone calls and by saving on road inspection costs. Beyond these direct
benefits, the reports collected through the system can be analyzed to clarify the tendencies of problem occurrence.
However, the problem categories that form the basis of such analysis are not necessarily set according to the actual tendency of
problem occurrence. The main reason is that no classification method that reflects the actual problem occurrence
tendency has been established.
Therefore, this study proposes a method to extract and classify local community problems in an objective and
reproducible manner that reflects the actual problem occurrence tendency by using text mining. In addition, this study
tries to clarify the characteristics of the tendencies of problem occurrences in Chiba City, which started a service of
the Citizen-Report system called “Chiba-Repo” ahead of other local governments in Japan and has received the largest
number of reports.

2. Significance of classification of Citizen-Report

2.1. Research on Citizen-Report

The forerunner of the Citizen-Report system was FixMyStreet in the United Kingdom, which launched its service
in 2007. Since then, similar systems have been introduced in many countries, as shown in Table 1. This effort has
attracted attention from the beginning as an innovative attempt to redefine the relationship between citizens and governments,
and various studies have examined its significance and social functions and have proposed
new uses [3, 4, 5]. Recently, research has also positioned citizens who find problems
as a type of sensor and has tried to explain the social significance of this role [6]. In Japan, the mechanism and
social significance of the Citizen-Report system have been discussed mainly by organizations and
stakeholders involved in operating Citizen-Report services [7, 8].
Among analyses using Citizen-Report data, there have been studies that analyzed trends in the number of
registrants and studies that used the data in workshops to analyze differences in residents’ problem recognition [9, 10].
As a related example that did not deal with Citizen-Report, one study tried to classify residents’
comments using machine learning [11].

2.2. Subject of previous research

Previous studies had the following issues.


First, as described in 2.1, studies were conducted to elucidate the mechanism and significance of Citizen-Report
and to evaluate its achievements. On the other hand, none of the research analyzed the content of the Citizen-Report
for understanding regional issues. The content of regional issues can be understood only through a category set by the
local government as the service operator.

Second, text mining can be an effective means of analyzing report content in order to extract and classify local
community problems from Citizen-Report, but no studies have applied this method to Citizen-Report.
The significance of this study is to clarify the structure of regional issues by analyzing Citizen-Report using text
mining.

3. Citizen-Report system and data to be analyzed

This section gives an overview of the flow of problem resolution through the Citizen-Report system and of the
characteristics and contents of the data generated by the mechanism.

3.1. Flow of problem resolution by Citizen-Report system

The Citizen-Report system is operated by various local governments and other operating bodies, as shown in Table
1, but the flow of problem resolution is largely the same. As an example, in “Chiba-Repo,” problems are resolved
as follows [12]:
1) Register an account
An applicant applies for account registration. There is no examination, and non-Chiba citizens can also
register.
2) Report and share problems
A user reports problems using the website or the app. After the user enters text for “title” and “content,” selects a “field,”
and attaches photos, registration is completed. The registered information is published on the “Chiba-Repo”
website, except when it violates the terms and conditions.
3) Respond to problems
The registered information is circulated to the relevant departments at Chiba City Hall, and the problem is
addressed. In the meantime, Chiba City Hall updates the response status on the website, for example
“accepted,” “in progress,” or “completed.”

Table 1. Examples of Citizen-Report


Japan                                                        Other countries
My City Report    FixMyStreet Japan    Original Systems      FixMyStreet             SeeClickFix
Chiba City        Handa City           Otsu City             United Kingdom          Oakland (US)
Muroran City      Beppu City           Sagamihara City       Norway                  California (US)
Numazu City       Koriyama City        Hamamatsu City        Canada                  Detroit (US)
Hirosaki City     Ikoma City           Ashiya City           New Zealand             Meatuses (US)
Adachi Ward       Iwaki City           Kusatsu City          Brussels (Belgium)      Connecticut (US)
Sumida Ward       Kumagaya City        Izumisano City        Zurich (Switzerland)    Sonora (Mexico)
                  Cities and more      Cities and more       etc.                    etc.
Under verification test

3.2. Characteristics of data used in this study

Table 2 shows the data items of “Chiba-Repo.”

The manner of entering “title” and “content” is not controlled, so these items have no regularity or consistency. The item
“field” is selected from four options: “road,” “park,” “garbage,” and “others.” At first glance, this is simple
and easy to understand, but it does not allow specific problems to be classified or regional
issues to be understood. In addition, this classification is not necessarily mutually exclusive and collectively exhaustive. There are many
cases in which a report does not fit well in any of the four fields, and the field selection can differ depending on the reporter.
For example, a report titled “streetlamp is broken” as illustrated in Table 2 can be classified as either “road” or “park”
depending on where the problem occurred, even though it is the same problem. If a category were set specifically
for lamp breakage, such a difference in classification would not occur.

On the other hand, since there is no restriction on the scope of Citizen-Report and the categories are not set based on
the content of problems, any local government bias affecting the content of the reports should
be small, and the residents’ subjective problem recognition should be strongly reflected.

Table 2. Data Item of “Chiba Repo”.


Input method           Data Item        Examples
Automatically Input    ID               Incident ID, User ID, etc.
                       Date and Time    Reception Date
                                        Report Date/Time
                                        Complete Date
                       Address          Place of Report
                                        Longitude/Latitude
Input by reporter      Subject          Request for Road Repair
                       Comment          A cave-in of a road. It’s dangerous, please repair it.
                       Field            Road, Park, Garbage, Others

In addition, as shown in Table 3, the methods of classifying categories are different and inconsistent among local
governments. Some Citizen-Report systems such as those specialized for roads do not set any categories at the
beginning.

Table 3. Examples of categories of Citizen-Report


Chiba-City   Koriyama-City     Handa-City                   Sendai-City    United Kingdom
Road         Road              Problem of Water Channel     Road           Abandoned vehicles     Road traffic signs
Park         River             Problem of Weed              Gutter         Bus stops              Roads/highways
Garbage      Park              Problem of Traffic Safety    Guardrail      Car parking            Rubbish (refuse and recycling)
Others       Security Light    Problem of Park              Curve-mirror   Dog fouling            Street cleaning
             Garbage           Problem of Public Facility   Street Light   Flyposting             Street lighting
             Public Facility                                               Flytipping             Street nameplates
             Other                                                         Graffiti               Traffic lights
                                                                           Parks/landscapes       Trees
                                                                           Pavements/footpaths    Other
                                                                           Potholes
                                                                           Public toilets

If categories are not set according to the content of local community problems, it is difficult to classify reports and understand
tendencies based on the content of the problems. On the other hand, if a local government sets categories without
considering the real problem occurrence tendency, the classification will be biased. Therefore, this study tries to
establish new categories based on the actual tendencies of local community problems by applying text mining
to the text of the title and content, and to use these categories to understand regional issues.
This study uses 4,574 Citizen-Reports submitted from August 2014, when “Chiba-Repo” was launched, to March 2018.

4. Analysis of Citizen-Report using text mining

This section describes the method and results of classifying Citizen-Report into a small number of categories
using text mining.

4.1. Text mining

Text mining is a method that divides sentences into morphemes, the smallest units that carry meaning, applies
natural language processing such as morphological analysis to identify parts of speech, analyzes
appearance frequency, modification (dependency) structure, and co-occurrence relations, and thereby extracts useful information [13].
Although there are various previous studies on classification using text mining [14, 15], there is no example of its
application to Citizen-Report, so this study establishes a new method to classify problems.

4.2. Classification using modification structure

The following steps were implemented to ensure accurate problem classification in an objective and reproducible
manner according to the content of Citizen-Report:
1) Natural language processing
First, a morphological analysis was conducted using Text Mining Studio ver. 6.0.3.
2) Grouping using modification structure
Next, a dependency analysis was conducted using the same software. Modifying words that share the same
modified word, such as “go out,” were regarded as synonymous. For example, “electric lamp” and
“street lamp” were classified into the same group, as shown in Fig. 1.

[Figure 1 diagram: the modifying parts “electric lamp,” “street lamp,” “fluorescent light,” “electric light,” “outside light,” “street light,” and “electric wire” are linked to the modified part “go out,” illustrating the grouping.]

Fig. 1. Grouping using modification structure

However, in order to exclude dependency combinations that are not useful for grouping, only
the 706 combinations that satisfy the following conditions were extracted:
(a) A combination of “noun-verb” or “noun-noun of the sa-row irregular conjugation” (a noun that becomes a verb when followed by “suru”)
(b) A combination that occurs two or more times
3) Removal of expressions in which problems are not identifiable
In addition, combinations of the following expressions that are not useful for meaningful classification were
excluded:
(a) Expressions that do not lead to the identification of the problem
(e.g., rainfall, photograph-take, people-pass)
(b) Supplementary expression
(e.g., do, please, think)

As a result, 397 combinations of dependencies were extracted. This included 32 modified words as shown in Fig.
2. This means that 32 groups were formed by the commonality of the modified words. Hereafter, these groups are
named the “first group.”

4) Create categories
In order to consolidate the first groups into a smaller number of categories, modified words that share the same
modifying word were classified into one group. When groups were created from modifying words that have two modified words,
the resulting groups were considered to represent the characteristics of local community problems well.
Hereafter, these groups formed by the commonality of modifying words are named the “second group.” In contrast,
when grouping by modifying words that have three or more modified words, useful groups were not formed because the
meaning of the words became too general. Finally, as a result of organizing the second groups according to the
dependency relationship, seven categories were derived.
Figure 2 shows the classification results. The categories are named based on their meanings: Lamp Breakage, Road
Damage, Facility Defect, Invisibility, Overgrowth, Clogging, and Leaving.

[Figure 2 diagram: two panels with the columns Category, Second Group, and First Group, mapping modifying words (e.g., electric light, fallen leaf, guardrail) and modified words (e.g., go out, clogged, snapped) to the seven categories.]

Fig. 2. Category of “Chiba Repo” using commonality of modified words
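
The grouping steps in Section 4.2 can be illustrated with a minimal Python sketch. It is not the procedure implemented in Text Mining Studio. The dependency pairs, the pattern labels, the exclusion list, and the function names are all hypothetical and were chosen only to show how filtering by pattern and frequency, grouping by shared modified word (first group), and grouping by shared modifying word (second group) might fit together.

```python
from collections import Counter, defaultdict

# Hypothetical dependency pairs: (modifying word, modified word, pattern).
# In the study these would come from the dependency analysis of the reports.
pairs = [
    ("electric lamp", "go out", "noun-verb"),
    ("electric lamp", "go out", "noun-verb"),
    ("street lamp", "go out", "noun-verb"),
    ("street lamp", "go out", "noun-verb"),
    ("street lamp", "turned off", "noun-verb"),
    ("street lamp", "turned off", "noun-verb"),
    ("gutter", "clogged", "noun-verb"),
    ("gutter", "clogged", "noun-verb"),
    ("photograph", "take", "noun-verb"),
    ("photograph", "take", "noun-verb"),
    ("weed", "overgrown", "noun-verb"),   # occurs only once, so it is dropped
    ("road", "wide", "noun-adjective"),   # pattern not allowed, so it is dropped
    ("road", "wide", "noun-adjective"),
]

ALLOWED_PATTERNS = {"noun-verb", "noun-sahen noun"}  # assumed label names
EXCLUDED_PAIRS = {("photograph", "take")}  # illustrative list for step 3


def filter_pairs(pairs, min_count=2):
    """Keep allowed-pattern pairs that occur at least min_count times."""
    counts = Counter((m, h) for m, h, pattern in pairs if pattern in ALLOWED_PATTERNS)
    return {pair for pair, n in counts.items()
            if n >= min_count and pair not in EXCLUDED_PAIRS}


def first_groups(kept):
    """Group modifying words by the modified word they share (first group)."""
    groups = defaultdict(set)
    for modifying, modified in kept:
        groups[modified].add(modifying)
    return dict(groups)


def second_groups(kept, size=2):
    """Group modified words under a modifying word that has exactly `size` of them."""
    by_modifying = defaultdict(set)
    for modifying, modified in kept:
        by_modifying[modifying].add(modified)
    return {m: heads for m, heads in by_modifying.items() if len(heads) == size}


kept = filter_pairs(pairs)
print(first_groups(kept))
print(second_groups(kept))
```

In this toy input, “electric lamp” and “street lamp” fall into the same first group because both modify “go out,” and “street lamp” links “go out” and “turned off” into one second group. The final consolidation into seven named categories relied on the authors' judgment and is not reproduced here.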

4.3. Set automatic classification rules

Based on the categories derived in Section 4.2, the data of “Chiba-Repo” were classified. Specifically, when a
report included a modified word, a modifying word, or a synonym thereof that belonged to a category, the report was classified into
that category.
However, a report can include words belonging to multiple categories, for instance when it contains multiple requests.
In this case, the report is classified according to the following rules:
(a) Prioritize conformance to “title” over conformance to “content”
If a word contained in the “title” item and a word contained in the “content” item are classified in different
categories, priority is given to the category of the word in the “title.”
(b) Prioritize the word that appeared earlier
If multiple words in the “content” item of the same report belong to different categories, the
word that appears first takes precedence.
With the above steps, any report can be classified into a category automatically, with two exceptions (hereinafter
referred to as “errors”): reports in which no word matches the category classification criteria, and reports excluded as
noise because a particular reporter posted many overlapping reports on the same issue. The error ratio was 12.4%
(62 of the 500 sample reports described in Section 4.4).
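
As an illustration of rules (a) and (b), the following Python sketch classifies a report from its title and content. The keyword dictionary is a small hypothetical stand-in for the actual classification criteria, and the function names are assumptions. The sketch also applies the earliest-word rule inside the title, which the text states only for the content, and it covers only the “no matching word” kind of error, not the removal of repeated reports.

```python
# Hypothetical keyword dictionary: word -> category. The real criteria would be
# the modified words, modifying words, and synonyms behind the seven categories.
KEYWORDS = {
    "street lamp": "Lamp Breakage",
    "go out": "Lamp Breakage",
    "cave in": "Road Damage",
    "crack": "Road Damage",
    "overgrown": "Overgrowth",
    "clogged": "Clogging",
}


def first_match(text):
    """Return the category of the keyword that appears earliest in text, or None."""
    lowered = text.lower()
    best_pos, best_cat = None, None
    for word, category in KEYWORDS.items():
        pos = lowered.find(word)
        if pos != -1 and (best_pos is None or pos < best_pos):
            best_pos, best_cat = pos, category
    return best_cat


def classify(title, content):
    """Rule (a): a match in the title wins. Rule (b): otherwise the earliest match in the content."""
    return first_match(title) or first_match(content) or "Error"


print(classify("Street lamp is broken", "Weeds are overgrown around the pole."))
print(classify("Request", "The gutter is clogged and the road has a crack."))
print(classify("Hello", "Thank you for your work."))
```

Plain substring matching is used here for brevity. With Japanese report text, matching would more plausibly be done on the morphemes produced by the morphological analysis.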

4.4. Verification of automatic classification rule

In order to verify the validity of the automatic classification rules of Section 4.3, the 500 oldest reports were
extracted, and the result of the automatic classification rules was compared with correct-answer data classified
manually based on the categories in Fig. 2. Consequently, the difference in the composition ratio
of each category after error elimination was less than 5%, as shown in Fig. 3.

[Figure 3 chart: two stacked bars, “Rule Execution” and “Correct Answer,” showing the composition of Lamp Breakage, Road Damage, Facility Defect, Leaving, Overgrowth, Clogging, and Invisibility, with the per-category differences annotated.]

Fig. 3. Verification result of automatic classification rule

From the above result, the automatic classification rule of section 4.3 was verified to be suitable for accurately
classifying Citizen-Report.
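
The comparison of composition ratios can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' verification code: the label lists are made up, the function names are hypothetical, and reports labeled “Error” are simply dropped from each list before the shares are computed.

```python
from collections import Counter


def composition(labels):
    """Share of each category in percent, after dropping reports labeled "Error"."""
    kept = [label for label in labels if label != "Error"]
    counts = Counter(kept)
    return {cat: 100.0 * n / len(kept) for cat, n in counts.items()}


def max_gap(rule_labels, manual_labels):
    """Largest absolute difference in composition ratio, in percentage points."""
    rule, manual = composition(rule_labels), composition(manual_labels)
    categories = set(rule) | set(manual)
    return max(abs(rule.get(c, 0.0) - manual.get(c, 0.0)) for c in categories)


# Made-up labels for ten reports; these are not data from the study.
rule_labels = ["Road Damage"] * 4 + ["Lamp Breakage"] * 3 + ["Clogging"] * 2 + ["Error"]
manual_labels = ["Road Damage"] * 4 + ["Lamp Breakage"] * 4 + ["Clogging"] * 2

gap = max_gap(rule_labels, manual_labels)
print(f"largest gap: {gap:.1f} percentage points")
```

A check of this kind compares category shares in aggregate. It does not measure whether each individual report received the same label under both classifications, which would require a per-report comparison.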

4.5. Actual composition of local community problems in Chiba City

According to the existing default categories, shown in the left bar graph in Fig. 4, problems
with roads account for three quarters of the data. At first glance, most of the local community problems for the residents
seem to be road obstructions. However, the right bar graph in Fig. 4, obtained by
applying the automatic classification rules of Section 4.3 to all cases, implies that the share of road
problems is not particularly high and that, in fact, seven major problems (Lamp Breakage, Road Damage, Facility
Defect, Invisibility, Overgrowth, Clogging, and Leaving) occur in comparable proportions. Actual road problems account for only 16% of
the data.
The classification proposed in this study thus appears to classify problems more finely and appropriately
than the default categories of “Chiba-Repo,” as the right bar graph in Fig. 4 shows.

[Figure 4 chart: two stacked bars. “Default Category” shows Road, Park, Garbage, and Others. “Proposing Category” shows Lamp Breakage, Road Damage, Facility Defect, Leaving, Overgrowth, Clogging, Invisibility, and Error.]

Fig. 4. Comparison between default and proposing categories

4.6. Difference of problem recognition by area and generation

In “Chiba-Repo,” metadata such as the location and attributes of reporters are also collected. By utilizing them, it
is possible to analyze the characteristics of regional issues and local problem occurrences in more detail. In the
following analysis, differences in the residents’ recognition of local community problems by area and generation are
clarified.
1) Differences in characteristics by region
Problems reported in “Chiba-Repo” have different characteristics depending on the area, even in the same city. Fig.
5 shows the composition of classified local community problems derived in section 4.2 for each of the six
administrative districts of Chiba City, based on the address data attached to each report. The balloon in the figure
indicates problems for which the composition ratio is particularly large.

Fig. 5. Difference in problem recognition by administrative district

Figure 5 implies that there are differences in the priority of residents’ problem recognition by administrative district.
The most frequently reported problem in Chuo Ward, Wakaba Ward, and Mihama Ward is Road Damage. In
Hanamigawa Ward, Facility Defect is reported the most. In Inage Ward, it is Invisibility. In Midori Ward, it is Lamp
Breakage.
2) Differences in characteristics by generation
By linking “Chiba-Repo” data and the attribute data of applicants using the user ID as a key, differences in the
composition of problem recognition by the reporter’s generation can be clarified, as shown in Fig. 6.

Fig. 6. Differences in problem recognition among generations

Figure 6 shows that Road Damage is recognized as a common problem in every generation [(a) in Fig. 6]. For
other problems, however, there are differences between generations. For example, reporters in their twenties are most
interested in Leaving, and those in their seventies are interested in Overgrowth at the same level
as Road Damage [(b) and (c)]. In contrast, the middle generations in their forties and fifties, who account for 60% or more of
the reports, report all types of issues without any major bias (d).
Thus, by combining the problem classification method with attribute data, it is possible to analyze the characteristics
and tendencies of local community problem occurrences in more detail, and to use the results for more finely tuned
responses.
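
The linkage by user ID described above can be sketched in Python as follows. The records, field names, generation labels, and function name are hypothetical and serve only to show how classified reports might be joined to reporter attributes and summarized as a per-generation composition. The same pattern would apply to administrative districts by keying on the district instead of the generation.

```python
from collections import Counter, defaultdict

# Hypothetical records; field names and values are illustrative only.
reports = [
    {"user_id": "u1", "category": "Road Damage"},
    {"user_id": "u1", "category": "Leaving"},
    {"user_id": "u2", "category": "Overgrowth"},
    {"user_id": "u2", "category": "Road Damage"},
    {"user_id": "u3", "category": "Leaving"},
]
users = {"u1": "20s", "u2": "70s", "u3": "20s"}  # user ID -> generation


def composition_by_generation(reports, users):
    """Join reports to user attributes by user ID and return category shares per generation."""
    counts = defaultdict(Counter)
    for report in reports:
        generation = users.get(report["user_id"])
        if generation is None:  # skip reports whose user has no attribute record
            continue
        counts[generation][report["category"]] += 1
    return {
        generation: {cat: n / sum(c.values()) for cat, n in c.items()}
        for generation, c in counts.items()
    }


result = composition_by_generation(reports, users)
for generation in sorted(result):
    print(generation, result[generation])
```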

5. Conclusion

This study classified Citizen-Report based on the content of the text by applying text mining. As a result,
it was verified that automatic classification based on the content of the actual Citizen-Report reflects the problems more accurately
than the default categories do. In addition, the composition of the local community problems that the
residents recognize was clarified in detail.
The fact that the characteristics of local community problems could be clearly understood by analyzing
Citizen-Report implies that the new perspective of utilizing data for solving regional issues can be effective.
Remaining challenges are to analyze differences in local community problems between regions by adding data from
other municipalities, and to examine the relationship between the characteristics of the classification results obtained in this
study and other external factors.

References

[1] Ministry of Internal Affairs and Communications, The Second Report of the study group for designing local government strategy toward 2040,
<http://www.soumu.go.jp/menu_news/s-news/01gyosei04_02000068.html> Accessed April 1, 2019.
[2] Statistics Reform Promotion Council, Final Report, 2017, <https://www.kantei.go.jp/jp/singi/toukeikaikaku/pdf/saishu_honbun.pdf> Accessed
April 1, 2019.
[3] Stephen F. King, Paul Brown, “Fix My Street or Else: Using the Internet to Voice Local Public Service Concerns,” ICEGOV2007, December
10–13, pp. 72–80, 2007.
[4] Burcu Baykurt, Redefining Citizenship and Civic Engagement: political values embodied in FixMyStreet.com, AoIR Selected Papers of Internet
Research; IR 12,2011
[5] Nils Walravens, Validating a Business Model Framework for Smart City Services: The Case of FixMyStreet, paper presented at the 27th IEEE
International Conference on Advanced Information Networking and Applications Workshops (Barcelona, Spain, 2013)
[6] Lasse Berntzen, Marius Rohde Johannessen, Stephan Böhm, Roberto Morales, “Citizens as Sensors: Human Sensors as a Smart City Data
Source,” SMART 2018, The Seventh International Conference on Smart Cities, Systems, Devices and Technologies, Barcelona, Spain, pp. 11–
18, 2018.
[7] Hiromasa Hijikata, “Solve Problems in Community and Town by Smartphone: Data Driven Management via ‘FixMyStreet Japan,’” 2018
<https://www.nttdata-strategy.com/aboutus/newsrelease/180705/> Accessed April 1, 2019.

[8] Kinzoku Yoshihiko, “A New Communication Tool to Connect Citizens and Government,” Administration & Information Systems, vol. 51, no.
2, pp. 8–14, 2015.
[9] Masami Honda, “Estimation of the Penetration Process of Administrative Application Based on the Registrant Information of ‘Chiba Repo’
Converted to Open Data,” Journal of Information and Knowledge Society, vol. 26, no. 2, pp. 187–194, 2016.
[10] Shota Nakatogawa, Toshikazu Seto, “The Possibilities of Citizen Participatory GIS from the View of Difference in Problem Recognition
Based on Social Attributes: In the Case of ‘Chiba Repo,’” Proceedings of the Geography Society of Japan, vol. S, no. 0, 2016.
[11] Yuta Sano, Kohei Yamaguchi, Tsunenori Mine, “Automatic Classification of Complaint Reports about City Park,” Information Engineering
Express International Institute of Applied Informatics, vol. 1, no. 4, pp. 119–130, 2015.
[12] City of Chiba, Chibarepo "Chiba Citizen Collaboration Report" <https://www.city.chiba.jp/shimin/shimin/kohokocho/chibarepo.html>
Accessed April 1, 2019.
[13] Tetsuya Nasukawa, Technology of Using/Making Text Mining: The Essence and Utilization Method Derived from Basic Technology and
Application Cases, Tokyo Denki University Press, 2006.
[14] Akihiro Saito, Application of Textmining in Japan, The society for economic studies, The University of Kitakyushu, Working Paper Series
2012, No. 2011-12. <http://www.kitakyu-u.ac.jp/economy/study/pdf/2011/2011_11.pdf> Accessed April 1, 2019.
[15] Matsumura Masahiro, Miura Asako, Text Mining for Humanities and Social Sciences [revised edition], Seishin Shobo, 2014.
