
An Efficient Technique to Predict Crimes in Smart Cities Using

SVM and CNN


(Modified Title)
A Comparative Study and Analysis on Crime Predictions Based
on SVM and CNN for Smart Cities

Introduction
The 21st Century is frequently referenced as the "Century of the City", reflecting the
unprecedented global migration into urban areas now under way. This steadily increasing
urbanization is bringing huge social, economic and environmental transformations and, at the
same time, presenting challenges in city management issues such as resource planning (water,
electricity), traffic, air and water quality, public policy and public safety services. Moreover,
given that larger cities tend to have higher crime rates, crime spiking is becoming one of the
most important social problems in large urban areas, because it affects public safety, child
development, and adult socio-economic status. With the ever-increasing ability of public
organizations and police departments to collect and store detailed data tracking crime events,
a significant amount of data with spatial and temporal information is collected daily. This
offers the opportunity to apply data analytics methodologies to extract useful predictive
models of crime events, which can enable police departments to better utilize their limited
resources and develop effective strategies for crime prevention. In particular, extensive
criminal justice research shows that the incidence of criminal events is not equally distributed
within a city. In fact, crime rates can change with respect to the geographic location of the
area (there are low-risk and high-risk areas), and crime trends can vary (seasonal patterns,
peaks, dips) with respect to the period of the year. For this reason, an accurate predictive
model must be able to automatically detect both which areas in the city are most affected by
crime events and how the crime rate of each specific area varies over time. This knowledge
can enable police departments to allocate their resources to specific crime hot spots more
efficiently, allowing for the effective deployment of officers to areas of high risk, or the
removal of officers from areas seeing decreasing levels of crime, thus more efficiently
preventing or quickly responding to criminal activity.
Abstract

The steadily increasing urbanization is causing significant economic and social
transformations in urban areas and poses several challenges in city management issues. In
particular, given that larger cities tend to have higher crime rates, crime spiking is becoming
one of the most important social problems in large urban areas. To handle the increase in
crime, new technologies are enabling police departments to access growing volumes of
crime-related data that can be analyzed to understand patterns and trends, aimed at an
efficient deployment of police officers over the territory and more effective crime prevention.
This paper presents an approach based on spatial analysis and auto-regressive models to
automatically detect high-risk crime regions in urban areas and reliably forecast crime trends
in each region. The final result of the algorithm is a spatio-temporal crime forecasting model,
composed of a set of crime dense regions and a set of associated crime predictors, each one
representing a predictive model for forecasting the number of crimes that will happen in its
specific region. The experimental evaluation, performed on real-world data collected in a big
area of Chicago, shows that the proposed approach achieves good accuracy in spatial and
temporal crime forecasting over rolling time horizons.

Along with this, we propose new techniques based on the CNN and SVM algorithms,
which classify the clustered data, provide accurate results, and increase the performance of
the entire process. Owing to this higher accuracy, crime events can be reduced substantially.
Datasets covering the crime-occurring areas are analyzed and compared using the CNN and
SVM algorithms to obtain an accurate result.
Objective
 An approach has been exploited to identify key members in criminal networks and to
study interaction patterns among them.

 Provides a clear structure of the criminal database, which enhances the complete system.

 It is possible to accurately forecast selected crimes one month ahead in small areas,
such as police precincts.

 The goal of the analysis is to find models for reliably predicting the number and
location of crimes at a given timestamp.
DATA MINING:

Data mining (the analysis step of the "Knowledge Discovery in Databases" process,
or KDD), a field at the intersection of computer science and statistics, is the process that
attempts to discover patterns in large data sets. It utilizes methods at the intersection of
artificial intelligence, machine learning, statistics, and database systems. The overall goal of
the data mining process is to extract information from a data set and transform it into an
understandable structure for further use. Aside from the raw analysis step, it involves database
and data management aspects, data preprocessing, model and inference considerations,
interestingness metrics, complexity considerations, post-processing of discovered structures,
visualization, and online updating. Generally, data mining (sometimes called data or
knowledge discovery) is the process of analyzing data from different perspectives and
summarizing it into useful information - information that can be used to increase revenue,
cut costs, or both. Data mining software is one of a number of analytical tools for analyzing
data. It allows users to analyze data from many different dimensions or angles, categorize it,
and summarize the relationships identified. Technically, data mining is the process of finding
correlations or patterns among dozens of fields in large relational databases.

[Figure omitted: raw data is extracted from repositories, cleansed and loaded into the data
mining database, transformed, mined for patterns using clustering and classification
algorithms, and the results are visualized and interpreted.]

Fig 1.1: Process of data mining
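To make the pipeline in Fig 1.1 concrete, here is a minimal sketch in Python (the project's
stated front end), assuming a hypothetical crime_records.csv export with Date, Latitude and
Longitude columns; pandas handles the cleansing and transformation steps, and a simple
clustering pass stands in for the pattern discovery stage.

import pandas as pd
from sklearn.cluster import KMeans

# Extraction: load raw crime records exported from a repository (hypothetical file).
raw = pd.read_csv("crime_records.csv")

# Data cleansing and loading: drop incomplete rows before they enter the mining database.
clean = raw.dropna(subset=["Latitude", "Longitude"]).copy()

# Transformation: derive features the mining step can use.
clean["Hour"] = pd.to_datetime(clean["Date"]).dt.hour

# Pattern discovery: a simple clustering pass over the spatial coordinates.
clean["Region"] = KMeans(n_clusters=5, n_init=10).fit_predict(
    clean[["Latitude", "Longitude"]])

# Visualization and interpretation: summarize the discovered patterns.
print(clean.groupby("Region")["Hour"].describe())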


Data

Data are any facts, numbers, or text that can be processed by a computer. Today,
organizations are accumulating vast and growing amounts of data in different formats and
different databases. This includes:
 Operational or transactional data, such as sales, cost, inventory, payroll, and
accounting
 Non-operational data, such as industry sales, forecast data, and macroeconomic data
 Meta data - data about the data itself, such as logical database design or data
dictionary definitions
Information
The patterns, associations, or relationships among all this data can provide
information. For example, analysis of retail point of sale transaction data can yield
information on which products are selling and when.
Knowledge
Information can be converted into knowledge about historical patterns and future
trends. For example, summary information on retail supermarket sales can be analyzed in
light of promotional efforts to provide knowledge of consumer buying behavior. Thus, a
manufacturer or retailer could determine which items are most susceptible to promotional
efforts.
Data Warehouses
In computing, a data warehouse (DW or DWH) is a database used for reporting and
data analysis. It is a central repository of data created by integrating data from multiple
disparate sources. Data warehouses store current as well as historical data and are commonly
used for creating trending reports for senior management, such as annual and quarterly
comparisons. The data stored in the warehouse are uploaded from the operational systems
(such as marketing and sales). The data may pass through an operational data store for
additional operations before they are used in the DW for reporting. The typical ETL-based
data warehouse uses staging, integration, and access layers to house its key functions. The
staging layer or staging database stores raw data extracted from each of the disparate source
data systems. The integration layer integrates the disparate data sets by transforming the data
from the staging layer, often storing this transformed data in an operational data store (ODS)
database.
A data warehouse constructed from integrated data source systems does not require
ETL, staging databases, or operational data store databases. The integrated data source
systems may be considered to be a part of a distributed operational data store layer. Data
federation methods or data virtualization methods may be used to access the distributed
integrated source data systems to consolidate and aggregate data directly into the data
warehouse database tables. Unlike the ETL-based data warehouse, the integrated source data
systems and the data warehouse are all integrated since there is no transformation of
dimensional or reference data. This integrated data warehouse architecture supports the drill
down from the aggregate data of the data warehouse to the transactional data of the integrated
source data systems.
Data warehouses can be subdivided into data marts. Data marts store subsets of data
from a warehouse. This definition of the data warehouse focuses on data storage. The main
source of the data is cleaned, transformed, cataloged and made available for use by managers
and other business professionals for data mining, online analytical processing, market
research and decision support. However, the means to retrieve and analyze data, to extract,
transform and load data, and to manage the data dictionary are also considered essential
components of a data warehousing system. Many references to data warehousing use this
broader context. Thus, an expanded definition for data warehousing includes business
intelligence tools, tools to extract, transform and load data into the repository, and tools to
manage and retrieve metadata. Dramatic advances in data capture, processing power, data
transmission, and storage capabilities are enabling organizations to integrate their various
databases into data warehouses. Data warehousing is defined as a process of centralized data
management and retrieval. Data warehousing, like data mining, is a relatively new term,
although the concept itself has been around for years. Data warehousing represents an ideal
vision of maintaining a central repository of all organizational data. Centralization of data is
needed to maximize user access and analysis. Dramatic technological advances are making
this vision a reality for many companies, and equally dramatic advances in data analysis
software are allowing users to access this data freely. The data analysis software is what
supports data mining. It enables these companies to determine relationships among "internal"
factors such as price, product positioning, or staff skills, and "external" factors such as
economic indicators, competition, and customer demographics. It enables them to determine
the impact on sales, customer satisfaction, and corporate profits. Finally, it enables them to
"drill down" into summary information to view detailed transactional data.

[Figure omitted: the stages of the data mining life cycle - problem specification, resourcing,
data cleaning, preprocessing, exploration, data mining, evaluation, and interpretation -
leading from data to knowledge.]

Fig 1.2: Levels of data mining

Data mining elements


 Extract, transform, and load transaction data onto the data warehouse system.
 Store and manage the data in a multidimensional database system.
 Provide data access to business analysts and information technology professionals.
 Analyze the data using application software.
 Present the data in a useful format, such as a graph or table.

Different levels of analysis


 Artificial neural networks: Non-linear predictive models that learn through training
and resemble biological neural networks in structure.
 Genetic algorithms: Optimization techniques that use processes such as genetic
combination, mutation, and natural selection in a design based on the concepts of natural
evolution.
 Decision trees: Tree-shaped structures that represent sets of decisions. These
decisions generate rules for the classification of a dataset. Specific decision tree methods
include Classification and Regression Trees (CART) and Chi Square Automatic Interaction
Detection (CHAID). CART and CHAID are decision tree techniques used for classification
of a dataset. They provide a set of rules that you can apply to a new (unclassified) dataset to
predict which records will have a given outcome. CART segments a dataset by creating 2-
way splits while CHAID segments using chi square tests to create multi-way splits. CART
typically requires less data preparation than CHAID.

 Nearest neighbor method: A technique that classifies each record in a dataset based on
a combination of the classes of the k record(s) most similar to it in a historical dataset (where
k ≥ 1). Sometimes called the k-nearest neighbor technique (see the sketch after this list).

 Rule induction: The extraction of useful if-then rules from data based on statistical
significance.

 Data visualization: The visual interpretation of complex relationships in
multidimensional data. Graphics tools are used to illustrate data relationships.
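To make two of these levels of analysis concrete, the sketch below (an illustration, not part
of the original study) trains a decision tree (scikit-learn's implementation is CART-style,
using 2-way splits) and a k-nearest neighbor classifier on a toy dataset.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# A toy dataset stands in for historical records with known outcomes.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Decision tree: learns if-then split rules from the training records.
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# k-nearest neighbor: classifies each record by the k most similar training records.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)

print("tree accuracy:", tree.score(X_test, y_test))
print("knn accuracy:", knn.score(X_test, y_test))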
Data Mining Techniques
Several major data mining techniques have been developed and used in data mining
projects recently, including association, classification, clustering, prediction and sequential
patterns.

Association
Association is one of the best known data mining techniques. In association, a pattern
is discovered based on a relationship of a particular item to other items in the same
transaction. For example, the association technique is used in market basket analysis to
identify which products customers frequently purchase together. Based on this data,
businesses can run corresponding marketing campaigns to sell more products and make more
profit.
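As an illustrative sketch (not part of the original work), the apriori implementation in the
mlxtend library can mine frequent itemsets and if-then rules from a handful of invented
market-basket transactions:

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

# Toy market-basket transactions (hypothetical data).
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "butter"],
]

# One-hot encode the transactions, then mine itemsets occurring in at least half of them.
te = TransactionEncoder()
onehot = pd.DataFrame(te.fit(transactions).transform(transactions), columns=te.columns_)
frequent = apriori(onehot, min_support=0.5, use_colnames=True)

# Derive if-then rules such as "bread -> milk" together with their confidence.
rules = association_rules(frequent, metric="confidence", min_threshold=0.6)
print(rules[["antecedents", "consequents", "support", "confidence"]])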
[Figure omitted: data mining draws on database technology, statistics, machine learning,
visualization, information science, and other disciplines.]

Fig 1.3: Techniques of data mining

Classification
Classification is a classic data mining technique based on machine learning. Basically,
classification is used to classify each item in a set of data into one of a predefined set of
classes or groups. Classification methods make use of mathematical techniques such as
decision trees, linear programming, neural networks and statistics. In classification, we build
software that can learn how to classify data items into groups. For example, we can apply
classification in an application such as "given all past records of employees who left the
company, predict which current employees are likely to leave in the future." In this case, we
divide the employee records into two groups, "leave" and "stay".
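Since SVM is one of the two algorithms this study compares, a minimal classification sketch
follows, applying scikit-learn's SVC to the leave/stay example above; the employee features
and records are invented purely for illustration.

from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical employee records: [years_of_service, salary_band, overtime_hours]
X = [[1, 2, 30], [8, 4, 5], [2, 1, 25], [10, 5, 2], [3, 2, 20], [7, 4, 8]]
y = ["leave", "stay", "leave", "stay", "leave", "stay"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, stratify=y, random_state=0)

# Feature scaling matters for SVMs; an RBF-kernel SVC learns the leave/stay boundary.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X_train, y_train)
print(model.predict(X_test))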

Clustering
Clustering is a data mining technique that forms meaningful or useful clusters of
objects with similar characteristics using an automatic technique. Unlike classification, the
clustering technique defines the classes itself and then puts objects in them, whereas in
classification objects are assigned to predefined classes. To make the concept clearer, take a
library as an example. In a library, books cover a wide range of topics. The challenge is how
to shelve those books in a way that lets readers take several books on a specific topic without
hassle.
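A brief clustering sketch, assuming a handful of invented incident coordinates: DBSCAN (a
density-based method that also appears in the literature survey below) discovers the groups
itself, with no predefined classes, and flags isolated points as noise.

import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical incident coordinates; dense groups become clusters, isolated points noise.
points = np.array([
    [41.880, -87.630], [41.881, -87.629], [41.879, -87.631],  # dense group
    [41.970, -87.700], [41.969, -87.701],                     # second group
    [41.700, -87.550],                                        # isolated point
])

# DBSCAN groups nearby points without being told how many clusters to find.
labels = DBSCAN(eps=0.01, min_samples=2).fit_predict(points)
print(labels)  # -1 marks noise; other integers are discovered cluster ids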
What is Web Mining

Web mining is the use of data mining techniques to automatically discover and extract
information from Web documents and services.

There are three general classes of information that can be discovered by web mining:

 Web activity, from server logs and Web browser activity tracking.
 Web graph, from links between pages, people and other data.
 Web content, for the data found on Web pages and inside of documents.

At Scale Unlimited we focus on the last one – extracting value from web pages and
other documents found on the web.

Note that there’s no explicit reference to “search” in the above description. While
search is the biggest web miner by far, and generates the most revenue, there are many other
valuable end uses for web mining results. A partial list includes:

 Business intelligence
 Competitive intelligence
 Pricing analysis
 Events
 Product data
 Popularity
 Reputation

Four Steps in Content Web Mining

When extracting Web content information using web mining, there are four typical steps.

1. Collect – fetch the content from the Web


2. Parse – extract usable data from formatted data (HTML, PDF, etc)
3. Analyze – tokenize, rate, classify, cluster, filter, sort, etc.
4. Produce – turn the results of analysis into something useful (report, search index, etc)
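A minimal sketch of these four steps using the requests and BeautifulSoup libraries on a
single page (a real crawler would add the politeness controls discussed in the next section):

from collections import Counter

import requests
from bs4 import BeautifulSoup

# 1. Collect: fetch the content from the Web.
resp = requests.get("https://example.com", timeout=10)

# 2. Parse: extract usable data from the HTML.
soup = BeautifulSoup(resp.text, "html.parser")
title = soup.title.string if soup.title else ""
words = soup.get_text().split()

# 3. Analyze: a trivial tokenize-and-count pass.
top = Counter(w.lower() for w in words).most_common(5)

# 4. Produce: turn the analysis into something useful.
print(title, top)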
Web Mining versus Data Mining

When comparing web mining with traditional data mining, there are three main
differences to consider:

1. Scale – In traditional data mining, processing 1 million records from a database would
be a large job. In web mining, even 10 million pages wouldn't be a big number.
2. Access – When doing data mining of corporate information, the data is private and
often requires access rights to read. For web mining, the data is public and rarely
requires access rights. But web mining has additional constraints, due to the implicit
agreement with webmasters regarding automated (non-user) access to this data. This
implicit agreement is that a webmaster allows crawlers access to useful data on the
website, and in return the crawler (a) promises not to overload the site, and (b) has the
potential to drive more traffic to the website once the search index is published. With
web mining, there often is no such index, which means the crawler has to be extra
careful/polite during the crawling process, to avoid causing any problems for the
webmaster.
3. Structure – A traditional data mining task gets information from a database, which
provides some level of explicit structure. A typical web mining task is processing
unstructured or semi-structured data from web pages. Even when the underlying
information for web pages comes from a database, this often is obscured by HTML
markup.

Note that by “traditional” data mining we mean the type of analysis supported by most
vendor tools, which assumes you’re processing table-oriented data that typically comes from
a database.

Sentiment Analysis

Sentiment analysis is a type of data mining that measures the inclination of people's
opinions through natural language processing (NLP), computational linguistics and text
analysis, which are used to extract and analyze subjective information from the Web - mostly
social media and similar sources. The analyzed data quantifies the general public's sentiments
or reactions toward certain products, people or ideas and reveals the contextual polarity of the
information. Sentiment analysis is also known as opinion mining.
Sentiment analysis uses data mining processes and techniques to extract and capture data
for analysis in order to discern the subjective opinion of a document or collection of
documents, like blog posts, reviews, news articles and social media feeds like tweets and
status updates.
Sentiment analysis allows organizations to track the following:

 Brand reception and popularity


 New product perception and anticipation
 Company reputation
 Flame/rant detection

Sentiment Analysis: Concept, Analysis and Applications


Sentiment analysis is contextual mining of text that identifies and extracts subjective
information in source material, helping a business to understand the social sentiment of its
brand, product or service while monitoring online conversations. However, analysis of social
media streams is usually restricted to basic sentiment analysis and count-based metrics. This
is akin to just scratching the surface and missing out on the high-value insights that are
waiting to be discovered. So what should a brand do to capture that low-hanging fruit?

With the recent advances in deep learning, the ability of algorithms to analyse text has
improved considerably. Creative use of advanced artificial intelligence techniques can be an
effective tool for doing in-depth research. We believe it is important to classify incoming
customer conversation about a brand along the following lines:

1. Key aspects of a brand’s product and service that customers care about.

2. Users’ underlying intentions and reactions concerning those aspects.

These basic concepts, when used in combination, become a very important tool for
analyzing millions of brand conversations with human-level accuracy. In this post, we take
the example of Uber and demonstrate how this works. Read on!
Text classifier- The basic building blocks

Sentiment Analysis

Sentiment Analysis is the most common text classification tool. It analyses an
incoming message and tells whether the underlying sentiment is positive, negative or neutral.
You can input a sentence of your choice and gauge the underlying sentiment by playing with
the demo.
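For instance, a one-line polarity check with the TextBlob library (an illustrative choice, not
necessarily the tool used in this work):

from textblob import TextBlob

# Polarity ranges from -1 (negative) to +1 (positive); values near zero read as neutral.
for msg in ["The service was excellent!", "My order arrived late and damaged."]:
    polarity = TextBlob(msg).sentiment.polarity
    label = "positive" if polarity > 0.1 else "negative" if polarity < -0.1 else "neutral"
    print(f"{label:8} ({polarity:+.2f})  {msg}")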

Intent Analysis

Intent analysis steps up the game by analyzing the user's intention behind a message
and identifying whether it relates to an opinion, news, marketing, a complaint, a suggestion,
appreciation or a query.

Contextual Semantic Search (CSS)

Now this is where things get really interesting. To derive actionable insights, it is
important to understand which aspect of the brand a user is discussing. For example, Amazon
would want to segregate messages related to late deliveries, billing issues, promotion-related
queries, product reviews etc. On the other hand, Starbucks would want to classify messages
based on whether they relate to staff behaviour, new coffee flavours, hygiene feedback,
online orders, store name and location etc. But how can one do that?

We introduce an intelligent search algorithm called Contextual Semantic Search
(a.k.a. CSS). The way CSS works is that it takes thousands of messages and a concept (like
Price) as input and filters all the messages that closely match the given concept. CSS
represents a major improvement over the existing methods used by the industry.
Existing approach vs Contextual Semantic Search

A conventional approach to filtering all Price-related messages is to do a keyword
search on Price and other closely related words (pricing, charge, $, paid). This method,
however, is not very effective, as it is almost impossible to think of all the relevant keywords
and their variants that represent a particular concept. CSS, on the other hand, just takes the
name of the concept (Price) as input and filters all the contextually similar messages, even
where the obvious variants of the concept keyword are not mentioned.
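A toy sketch of the idea, assuming a tiny invented message corpus: word vectors are trained
with gensim's Word2Vec and each message is scored by cosine similarity to a small set of
concept words standing in for "Price". A production system would use embeddings pretrained
on a much larger corpus, so that the single concept name alone suffices.

from gensim.models import Word2Vec

# Tiny hypothetical corpus; a real system would train on millions of messages.
messages = [
    "the ride was too expensive for such a short trip",
    "driver was friendly and the car was clean",
    "i was charged extra money for one trip",
    "pickup was quick and smooth",
]
tokens = [m.split() for m in messages]
model = Word2Vec(sentences=tokens, vector_size=32, min_count=1, seed=1, epochs=100)

# Rank messages by cosine similarity between the concept vectors and each
# message's mean vector; high-scoring messages pass the "Price" filter.
concept = ["expensive", "charged", "money"]  # stand-in for the single concept "Price"
for message, words in zip(messages, tokens):
    print(round(float(model.wv.n_similarity(concept, words)), 2), message)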
Literature Survey
Title: SMPTE Periodical - Audiovisual Programs at the United Nations Conference on Human Settlements
Author & Year: M. Cobley; G. Graham (Volume: 87, Issue: 4, April 1978)
Description: An overview of resources, facilities and services provided by Canada, as Host Country, to meet the requirements for multiple language versions and access systems.
Dataset: Urban Area Dataset
Advantages: Provides needed facilities for people.
Disadvantages: Consumes a large amount of time and data.
Application: In maintaining the people database; can provide efficient information to the government.
Future Scope: Can be applied in various fields like automobiles, wildlife etc.

Title: Selective versus Non-Selective Acquisition of Crowd-Solicited IoT Data and Its Dependability
Author & Year: Venkat Surya Dasari; Maryam Pouryazdan; Burak Kantarci
Description: It performs a thorough feasibility analysis of two possible data acquisition approaches for crowd-solicited IoT data.
Dataset: Population Dataset
Advantages: Provides detailed information about the citizens.
Disadvantages: Privacy for citizens is quite a complicated issue.
Application: Can be applied in various public sectors.
Future Scope: Can be implemented in numerous fields like corporate etc.
Title: Crimetracer: Activity space based crime location prediction
Author & Year: Mohammad A. Tayebi; Martin Ester; Uwe Glässer; Patricia L. Brantingham, 16-10 Oct. 2013
Description: This phenomenon has drawn attention to spatial crime analysis, focusing on crime areas with higher crime density. In this paper we present CRIMETRACER, a personalized random walk based approach to spatial crime analysis and crime location prediction outside of hotspots.
Dataset: Crime Analysis in Delhi
Advantages: Reduces crime rates in the concerned city.
Disadvantages: Hard to collect the data.
Application: Designed for police departments to reduce the crime rates in the city.
Future Scope: Can be enhanced with more efficient features like predicting locations using tracking options.
Title: Crime Rate Inference Using Tensor Decomposition
Author & Year: Liang Ge; Junling Liu; Aoli Zhou; Hang Li, 8-12 Oct. 2018
Description: We infer the fine-grained crime situation of different times in a year for each community of Chicago, IL in the USA, by using the Chicago crime datasets. We model the crime situation of Chicago with a three-dimensional tensor, where the three dimensions stand for communities, crime categories, and time slots, respectively.
Datasets: Chicago Crime Datasets.
Advantages: Reduces the crime level in the city.
Disadvantages: Prediction analysis is a vague process.
Application: Applied for cops, predicting culprits etc.
Future Scope: In future it can be applied in numerous fields such as police departments, law, courts etc.
Title: Plenario: An open data discovery and exploration platform for urban science
Author & Year: Henrik I Christensen, 2014
Description: This course covers the general area of Simultaneous Localization and Mapping (SLAM). Initially the problems of localization, mapping, and SLAM are introduced from a methodological point of view. Different methods for representation of uncertainty will be introduced, including their ability to handle single and multi-mode uncertainty representations.
Dataset: Locating and Tracking Mapping Datasets.
Advantages: Gets a clear status of all information.
Disadvantages: Chance of misspelled data.
Application: Can be applied in various fields such as marketing, logistics, crime predictions etc.
Future Scope: Implemented in attaining accurate and perfect results in crime predictions, including current status.
Title: Crime Pattern Detection Using Data Mining
Author & Year: Shyam Varan Nath
Description: Here we look at the use of clustering algorithms for a data mining approach to help detect crime patterns and speed up the process of solving crime.
Datasets: Crime Occurrences in cities.
Advantages: Reduces the latency in the system.
Disadvantages: Increases the manual work.
Application: Applied to reduce crime occurrences.
Future Scope: Various patterns of crime analysis can be predicted and solved.
Title: Time, Place, and Modus Operandi: A Simple Apriori Algorithm Experiment for Crime Pattern Detection
Author & Year: Peng Chen; Justin Kurland, 23-25 July 2018
Description: This paper aims to solve the problem of identifying potential serial offending patterns using previously underutilised attributes from police-recorded crime data.
Dataset: Crime data processing.
Advantages: Provides detailed status of crime data.
Disadvantages: Maintenance is a major failure.
Application: Used to identify the crime files.
Future Scope: Details including current status and their punishments can be added.
Title: Data mining approaches to criminal career analysis
Author & Year: Jeroen S. De Bruin; Tim K. Cocx; Walter A. Kosters; Jeroen F. J. Laros; Joost N. Kok, 18-22 Dec. 2006
Description: Narrative reports and criminal records are stored digitally across individual police departments, enabling the collection of this data to compile a nation-wide database of criminals.
Dataset: Criminal Datasets of various fields.
Advantages: Accessible throughout the nation.
Disadvantages: Chance of lack of security of data.
Application: In identification of culprits and their records in a prominent way.
Future Scope: Data can be implemented in cloud server and encrypted with better privacy.
Title: Surveylance: Automatically Detecting Online Survey Scams
Author & Year: Amin Kharraz; William Robertson; Engin Kirda, 20-24 May 2018
Description: We present SURVEYLANCE, the first system that automatically identifies survey scams using machine learning techniques.
Datasets: SCAM Dataset
Advantages: Easier to identify the SCAM and its occurrences.
Disadvantages: Chance of misguidance.
Application: Can be used in finding and predicting the SCAM and more reliable for justice.
Future Scope: Can be initiated with higher accuracy which provides better and efficient results.
Title: A multivariate time series clustering approach for crime trends prediction
Author & Year: B. Chandra; Manish Gupta; M. P. Gupta, 02-04 Oct 2012
Description: Clustering multivariate time series has potential for analyzing large volumes of crime data at different time points, as law enforcement agencies are interested in finding crime trends of various police administrations.
Datasets: Clustering Crime Analysis
Advantages: Results of different areas can be easily predicted.
Disadvantages: Quite complicated to acquire data, as a large amount of data is used.
Application: Used to predict criminals in various areas.
Future Scope: Likewise, criminal actors in various fields can be analysed and predicted.
Title: Crime Pattern Detection Using Data Mining
Author & Year: Shyam Varan Nath, 18-22 Dec. 2006
Description: Here we look at the use of clustering algorithms for a data mining approach to help detect crime patterns and speed up the process of solving crime. We will look at k-means clustering with some enhancements to aid in the process of identification of crime patterns.
Dataset: Crime Rate Dataset
Advantages: Decreases the crime occurrences and prediction is accurate.
Disadvantages: Chance of latency.
Application: Mostly useful for cyber crime.
Title: Crime data mining: a general framework and some examples
Author & Year: H. Chen, W. Chung, J. Xu, G. Wang, Y. Qin, and M. Chau, vol. 37, no. 4, pp. 50–56, 2004.
Description: A major challenge facing all law-enforcement and intelligence-gathering organizations is accurately and efficiently analyzing the growing volumes of crime data. Detecting cybercrime can likewise be difficult because busy network traffic and frequent online transactions generate large amounts of data, only a small portion of which relates to illegal activities.
Dataset: Cyber Crime Datasets
Advantages: Reduces online fraudulent activities.
Disadvantages: Prediction may be inaccurate.
Future Scope: Covers most fields running in the cloud.
Title: Crime forecasting using data mining techniques
Author & Year: C.-H. Yu, M. Ward, M. Morabito, and W. Ding, IEEE 11th International Conference on, 2011, pp. 779–786.
Description: We first discuss our approach to architecting datasets from original crime records. The datasets contain aggregated counts of crime and crime-related events categorized by the police department. The location and time of these events is embedded in the data.
Dataset: Crime Records.
Advantages: Increases the efficiency of the system.
Disadvantages: Latency level and prediction is quite complicated.
Application: Useful in law and order maintenance.
Future Work: Can be used with further improvement for accurate output, which reduces crimes.
Title: Crime hot spot forecasting
Author & Year: Y. Zhuang, M. Almeida, M. Morabito, and W. Ding, IEEE International Conference on Big Knowledge (ICBK), Aug 2017, pp. 143–150.
Description: It allows the effective deployment of officers to high-risk crime areas and elimination from areas with a decreasing crime trend, as well as developing effective crime prevention strategies. The purpose of this paper is to show the usefulness of analytic algorithms in predicting crimes.
Dataset: Lucknow (UP) city Crime Dataset.
Advantages: Reduces and provides advance prediction in crime occurring areas.
Disadvantages: A slow process.
Application: Increases the efficiency of the city.
Future Scope: The entire metropolitan city can be covered instead of a single area.
Title: Crime rate inference with big data
Author & Year: H. Wang, D. Kifer, C. Graif, and Z. Li, ACM, 2016, pp. 635–644
Description: In this paper we infer the fine-grained crime situation of different times in a year for each community of Chicago, IL in the USA, by using the Chicago crime datasets. We model the crime situation of Chicago with a three-dimensional tensor, where the three dimensions stand for communities, crime categories, and time slots, respectively.
Datasets: Chicago crime datasets
Advantages: Reduces the crime in the city.
Disadvantages: Prediction level is quite low.
Application: Useful for maintaining the crime rate to a limited extent.
Future Scope: Applied to be developed for the entire nation, which reduces the crime rates up to the core.

Title: Time series analysis for crime forecasting
Author & Year: Grzegorz Borowik; Zbigniew M. Wawrzyniak; Paweł Cichosz, vol. 34, no. 4, pp. 50–56, 2009.
Description: The purpose is to show the usefulness of analytic algorithms in predicting crimes; there are applications of such analyses in the area of law enforcement, such as defining criminal hot spots, creating criminal profiles, and detecting crime trends.
Datasets: Algorithm Collections
Advantages: Decreases the presence of crimes in a specific hotspot.
Disadvantages: Detection process is quite slow.
Application: Useful in algorithm comparison to select a perfect one.
Future Scope: Chooses the perfect method to predict crimes and spots in a city.
Title: Short-term forecasting of crime
Author & Year: International Journal of Forecasting, vol. 19, no. 4, pp. 579–594, 2003.
Description: In this paper, the ARIMA time series model is used to make short-term forecasts of property crime for one city of China. With the given data of property crime for 50 weeks, an ARIMA model is determined and the crime amount one week ahead is predicted. The model's fitting and forecasting results are compared with the SES and HES.
Dataset: Property Crimes
Advantages: Useful in civil crime prediction.
Disadvantages: Chance of error occurrences.
Application: Useful to maintain the city with a low level of crime presence.
Future Scope: More fields can be added along with property.

Title: Forecasting crime using the ARIMA model
Author & Year: P. Chen, H. Yuan, and X. Shu, FSKD '08, Fifth International Conference on, vol. 5, 2008, pp. 627–630.
Description: In this paper, the ARIMA time series model is used to make short-term forecasts of property crime for one city of China.
Datasets: Crime Prediction Dataset in China.
Advantages: Reduces the crime rates.
Disadvantages: Maintenance is most difficult.
Application: Decreases the crime hotspot areas.
Future Scope: Accuracy level can be increased.
Title: Forecasting crimes using autoregressive models
Author & Year: E. Cesario, C. Catlett, and D. Talia, 2016 IEEE 2nd Int. Conf. on Big Data Intelligence and Computing and Cyber Science and Technology, 2016, pp. 795–802.
Description: Crime is undesired anti-social behavior and poses a serious threat to society. Civilized societies make everything possible to reduce crime within their regime of influence. Alarming the crime-prone areas in advance is one of the best strategies for crime to cease to happen.
Datasets: City Crime Datasets
Advantages: Provides information about the city crimes.
Disadvantages: Information may vary.
Application: Used to reduce the crime presence in a specific area.
Future Scope: Areas can be covered with a perfect tracking objective.
Title: Predicting the future in time series using auto regressive linear regression modeling
Author & Year: S Selva Priya; Lavanya Gupta, 9-11 Sept. 2015
Description: In this paper, various prediction methods are compared based on performance for an example time series, given the crime data and demographic features like the sex ratio, population density and religious composition of a region.
Datasets: City Performance Dataset
Advantages: Maintains the entire information and data in a single click.
Disadvantages: Prediction level may be low.
Application: Makes the city a crime-free one.
Future Scope: This paper delivers a method to predict the region-specific crime rate for the future.
Title: Forecasting Violent Extremist Cyber Recruitment
Author & Year: Jacob R. Scanlon; Matthew S. Gerber, IEEE Transactions on Information Forensics and Security (Volume: 10, Issue: 11, Nov. 2015)
Description: This paper presents research on forecasting the daily level of cyber-recruitment activity of VE groups. We used a previously developed support vector machine model to identify recruitment posts within a Western jihadist discussion forum.
Dataset: The textual content of this data set was analyzed with latent Dirichlet allocation (LDA).
Advantages: Enhanced cyber security.
Disadvantages: Comparison level may be low.
Application: Builds in online crime predictions.
Future Scope: Can effectively analyze online crimes and social media frauds.
Title: Intelligent Crime Anomaly Detection in Smart Cities Using Deep Learning
Author & Year: Sharmila Chackravarthy; Steven Schmitt; Li Yang, 2018 IEEE 4th International Conference on Collaboration and Internet Computing (CIC), 18-20 Oct. 2018
Description: The quick and accurate identification of criminal activity is paramount to securing any residence. With the rapid growth of smart cities, the integration of crime detection systems seeks to improve this security.
Dataset: Area and Resident Dataset.
Advantages: Secures residents with a crime-free environment.
Disadvantages: Unable to maintain a large amount of data.
Application: Can be applied as area-wise data.
Future Scope: Useful to maintain the metropolitan city with high security.
Title: Forecasting Cyberattacks as Time Series with Different Aggregation Granularity
Author & Year: Gordon Werner; Ahmet Okutan; Shanchieh Yang; Katie McConky, 2018 IEEE International Symposium on Technologies for Homeland Security (HST), 23-24 Oct. 2018
Description: It investigates the use of Auto-Regressive Integrated Moving Average (ARIMA) models and Bayesian Networks (BN) to predict future cyber attack occurrences and intensities against target entities. In addition to incident count forecasting, categorical and binary occurrence metrics are proposed to represent volume forecasts to a victim.
Datasets: Bayesian Networks
Advantages: Increases the efficiency of finding online fraudulent activities.
Disadvantages: Prediction and analysis may be slow and inefficient.
Application: Can be applied to entire online processes.
Future Scope: Cyber attacks can be minimized to an extent.
Title: Safe cities. A participatory sensing approach
Author & Year: Jaime Ballesteros; Mahmudur Rahman; Bogdan Carbunar; Naphtali Rishe, 22-25 Oct. 2012
Description: This work takes steps toward implementing smart, safe cities by combining the use of personal devices and social networks to make users aware of the safety of their surroundings. We propose novel metrics to define location and user based safety values.
Dataset: Smart City Crime Prediction.
Advantages: Minimizes the crime activities in smart cities.
Disadvantages: Comparison and prediction is vague.
Application: Reduces the occurrences of crimes.
Future Scope: We evaluate the ability of forecasting techniques including ARIMA and artificial neural networks (ANN) to predict future safety values.
Title: A Fast Density-Based Clustering Algorithm for Large Databases
Author & Year: Bing Liu, 2006 International Conference on Machine Learning and Cybernetics, 13-16 Aug. 2006
Description: In this paper, a fast density-based clustering algorithm is presented based on DBSCAN. After sorting objects by a certain dimensional coordinate, the new algorithm selects orderly unlabelled points outside a core object's neighborhood as seeds to expand clusters, so that the execution frequency of region queries can be decreased.
Datasets: Density Based on Population
Advantages: Increases the security measures in densely populated areas.
Disadvantages: Prediction level may be lowered due to the large amount of data.
Application: Can be applied in smart cities.
Future Scope: Performance and efficiency of smart cities can be improved.
Title: An improved sampling-based DBSCAN for large spatial databases
Author & Year: International Conference on Intelligent Sensing and Information Processing, 2004, Proceedings, 4-7 Jan. 2004
Description: Spatial data clustering is one of the important data mining techniques for extracting knowledge from the large amount of spatial data collected in various applications, such as remote sensing, GIS, computer cartography, environmental assessment and planning, etc.
Dataset: DBSCAN
Advantages: Noise points can be reduced.
Disadvantages: Identification is a slow process.
Application: Can be applied to guess the remote sensing values in cyber networks and also in blockchain.
Future Scope: Increases security in networking and IoT.
Title: Internet of things technologies in smart cities
Author & Year: Nomusa Dlodlo; Oscar Gcaba; Andrew Smith, 2016 IST-Africa Week Conference, 11-13 May 2016
Description: This paper is on potential smart cities applications as applied to the domains of smart transport, smart tourism and recreation, smart health, ambient-assisted living, crime prevention and community safety, governance, monitoring and infrastructure, disaster management, environment management, refuse collection and sewer management, smart homes and smart energy.
Datasets: JARVIS
Advantages: Increases safety and environment management.
Disadvantages: Data collection is a difficult process.
Application: Provides detailed information and data about the environment and smart cities.
Future Scope: Efficiently improves the environment.
Title: A Data-Driven Approach for Spatio-Temporal Crime Predictions in Smart Cities
Author & Year: Charlie Catlett; Eugenio Cesario; Domenico Talia; Andrea Vinci, 2018 IEEE International Conference on Smart Computing (SMARTCOMP), 31 July 2018
Description: This paper presents an approach based on spatial analysis and auto-regressive models to automatically detect high-risk crime regions in urban areas and reliably forecast crime trends in each region.
Datasets: Spatial temporal
Advantages: Prediction level is accurate and easy.
Disadvantages: Analyzing the dataset is complicated.
Application: It can be useful in predicting crime levels in smart cities in advance.
Future Scope: In future it can be applied in various fields like software, data markets and so on.
Title: "How fear of crime affects needs for privacy & safety": Acceptance of surveillance technologies in smart cities
Author & Year: Julia van Heek; Katrin Arning; Martina Ziefle, 2016 5th International Conference on Smart Cities and Green ICT Systems (SMARTGREENS), 23-25 April 2016
Description: In this empirical study, we explore users' perceptions of safety and privacy in the context of surveillance systems in urban environments.
Dataset: Traffic Analysis
Advantages: Useful in analyzing and maintaining traffic and traffic-based data for smart cities.
Disadvantages: Chance of collapse in the data that has been provided.
Application: Traffic management.
Future Scope: Can be extended to cover the entire smart city and urban areas.
Title: Dynamic Network Model for Smart City Data-Loss Resilience Case Study: City-to-City Network for Crime Analytics
Author & Year: Olivera Kotevska; A. Gilad Kusne; Daniel V. Samarov; Ahmed Lbath, IEEE Access (Volume: 5), 12 October 2017
Description: Today's cities generate tremendous amounts of data, thanks to a boom in affordable smart devices and sensors. The resulting big data creates opportunities to develop diverse sets of context-aware services and systems, ensuring smart city services are optimized to the dynamic city environment.
Dataset: Crime Data Analytics
Advantages: Increases security and confidentiality.
Disadvantages: Chance of presence of irrelevant data.
Application: May be used to optimize the distribution of police, medical, and emergency services.
Future Scope: Data and information can be gained from various sources.
SYSTEM STUDY
Existing System
The existing system is the design and implementation of an approach based on spatial
analysis and auto-regressive models to automatically detect high-risk crime regions in urban
areas and to reliably forecast crime trends in each region. The algorithm is composed of
several steps. First, high crime density areas (called crime dense regions or crime hotspots)
are discovered through a spatial analysis approach, where the shapes of the detected regions
are traced automatically by the algorithm without any pre-fixed division into areas. Then, a
specific crime prediction model is discovered for each detected region, analyzing the
partitions discovered during the previous step. The final result of the algorithm is a
spatio-temporal crime forecasting model, composed of a set of crime dense regions and a set
of associated crime predictors, each one representing a predictive model to forecast the
number of crimes that will happen in its specific region. As a case study, we present here the
analysis of crimes within a big area of Chicago involving about two million crime events.
Crime data has been gathered from a Web framework that provides public access to more
than one hundred urban datasets. The results of the experimental evaluation show the
effectiveness of the approach, achieving good accuracy in spatial and temporal crime
forecasting over rolling time horizons.
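A minimal sketch of the per-region auto-regressive forecasting step, using the ARIMA
implementation in statsmodels on synthetic weekly crime counts; the numbers and the model
order are placeholders, not those of the cited study.

import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Synthetic weekly crime counts for one detected dense region (hypothetical numbers);
# the real algorithm fits one such predictor per region found by the spatial step.
rng = np.random.default_rng(0)
weeks = pd.date_range("2015-01-04", periods=104, freq="W")
counts = 50 + 10 * np.sin(np.arange(104) * 2 * np.pi / 52) + rng.normal(0, 3, 104)
series = pd.Series(counts, index=weeks)

# Fit an auto-regressive model and forecast the next 4 weeks; a rolling
# horizon would refit as each new week of data arrives.
model = ARIMA(series, order=(2, 0, 1)).fit()
print(model.forecast(steps=4))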

Disadvantages

 Due to a lack of information gathering, taking specific action for occurring crime
sequences is a slow process.

 Accuracy cannot be maintained due to extensive paperwork and manual processing,
which interrupts the pattern of the system.

 A major deficiency of the system is that targeted city crime predictions cannot be
maintained with better efficiency.
Proposed System
Crime has been prevalent in our society for a very long time and it continues to be so
even today. Currently, many cities have released crime-related data as part of an open data
initiative. Using this as input, we can apply analytics to be able to predict and hopefully
prevent crime in the future. In this work, we applied data analytics to the crime dataset, as
collected and available through the Open Data initiative. The main focus is to perform an in-
depth analysis of the major types of crimes that occurred in the city, observe the trend over
the years, and determine how various attributes contribute to specific crimes. Furthermore,
we leverage the results of the exploratory data analysis to inform the data preprocessing
process, prior to training various machine learning models for crime type prediction. More
specifically, the model predicts the type of crime that will occur in each district of the city.
We observe that the provided dataset is highly imbalanced, so metrics used in previous
research focus mainly on the majority class, disregarding the performance of the classifiers
on minority classes; we therefore propose a methodology to address this issue. The proposed
model finds applications in resource allocation of law enforcement in a Smart City.
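A minimal sketch of the imbalance-aware training step, with synthetic data standing in for the
real crime dataset: scikit-learn's class_weight='balanced' option re-weights the minority crime
types, and a per-class report exposes the performance that a plain majority-class accuracy
would hide.

from sklearn.datasets import make_classification
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Synthetic stand-in for the imbalanced crime dataset: 90% majority class.
X, y = make_classification(n_samples=2000, n_classes=2, weights=[0.9, 0.1],
                           n_informative=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# class_weight='balanced' re-weights errors so minority classes are not ignored;
# the per-class precision/recall/F1 report reflects every class, not just the majority.
clf = LinearSVC(class_weight="balanced").fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))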

Advantages
 Reduces the latency up to the core and increases the reliability of the system.

 Enhances the effectiveness and performance of the approach.

 Large areas can be covered and managed in an efficient way.
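For the CNN side of the comparison, a minimal Keras sketch follows, assuming hypothetical
16x16 daily crime-count grids over city districts and five crime-type classes; the architecture,
shapes and random data are placeholders, not the final model.

import numpy as np
from tensorflow.keras import layers, models

# Hypothetical input: 200 daily 16x16 crime-count grids, each labelled with
# the dominant crime type (5 classes) observed the following day.
X = np.random.rand(200, 16, 16, 1).astype("float32")
y = np.random.randint(0, 5, size=200)

# A minimal CNN: convolutions learn local spatial crime patterns, pooling
# summarizes them, and a softmax head predicts the crime type.
model = models.Sequential([
    layers.Input(shape=(16, 16, 1)),
    layers.Conv2D(16, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.GlobalAveragePooling2D(),
    layers.Dense(5, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=3, batch_size=32, verbose=0)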


SYSTEM SPECIFICATION

Hardware Requirements

The hardware requirements may serve as the basis for a contract for the
implementation of the system and should therefore be a complete and consistent specification
of the whole system. They are used by software engineers as the starting point for the system
design.

 Processor : Intel processor 3.0 GHz


 RAM : 2GB
 Hard disk : 500 GB
 Compact Disk : 650 Mb
 Keyboard : Standard keyboard
 Mouse : Logitech mouse
 Monitor : 15 inch color monitor
Software Requirements:

The software requirements document is the specification of the system. It should


include both a definition and a specification of requirements. It is useful in estimating cost,
planning team activities and performing tasks throughout the development activity.

 Front End : Python


 Back End : MYSQL
 Server : WAMP
 Operating System : Windows OS
 System type : 32-bit or 64-bit Operating System
 IDE : Python 3.5.7
 DLL : Depends upon the title
Software Description

PHP - Overview

PHP is a recursive acronym for "PHP: Hypertext Preprocessor". PHP is a server side
scripting language that is embedded in HTML. It is used to manage dynamic content,
databases, session tracking, even build entire e-commerce sites. The PHP Hypertext
Preprocessor (PHP) is a programming language that allows web developers to create dynamic
content that interacts with databases. PHP is basically used for developing web based
software applications. This tutorial helps you to build your base with PHP.

Why to Learn PHP?

PHP started out as a small open source project that evolved as more and more people
found out how useful it was. Rasmus Lerdorf unleashed the first version of PHP way back in
1994.

PHP is a MUST for students and working professionals to become great Software
Engineers, especially when they are working in the Web Development domain. I will list
down some of the key advantages of learning PHP:

 PHP is a recursive acronym for "PHP: Hypertext Preprocessor".


 PHP is a server side scripting language that is embedded in HTML. It is used to
manage dynamic content, databases, session tracking, even build entire e-commerce sites.
 It is integrated with a number of popular databases, including MySQL, PostgreSQL,
Oracle, Sybase, Informix, and Microsoft SQL Server.
 PHP is pleasingly zippy in its execution, especially when compiled as an Apache
module on the Unix side. The MySQL server, once started, executes even very complex
queries with huge result sets in record-setting time.
 PHP supports a large number of major protocols such as POP3, IMAP, and LDAP.
PHP4 added support for Java and distributed object architectures (COM and CORBA),
making n-tier development a possibility for the first time.
 PHP is forgiving: PHP language tries to be as forgiving as possible.
 PHP Syntax is C-Like.
Fig 1: Basic View of PHP

Characteristics of PHP

Five important characteristics make PHP's practical nature possible −

 Simplicity
 Efficiency
 Security
 Flexibility
 Familiarity

Hello World using PHP.

Just to give you a little excitement about PHP, here is a small conventional PHP Hello
World program. You can try it using the demo link.

<html>
<head>
<title>Hello World</title>
</head>
<body>
<?php echo "Hello, World!"; ?>
</body>
</html>
Applications of PHP

As mentioned before, PHP is one of the most widely used language over the web. I'm
going to list few of them here:

PHP performs system functions: from files on a system it can create, open, read,
write, and close them. PHP can handle forms: it can gather data from files, save data to a file,
send data through email, and return data to the user. You can add, delete, and modify
elements within your database through PHP, and access and set cookie variables. Using PHP,
you can restrict users from accessing some pages of your website, and encrypt data.

Architecture Overview

This section explains how all the different parts of the driver fit together: from the
different language runtimes, through the extension, to the PHP libraries on top. This new
architecture has replaced the old mongo extension. We refer to the new one as
the mongodb extension.

Fig 2: Overview of PHP

At the top of this stack sits a pure PHP library, which we will distribute as a
Composer package. This library will provide an API similar to what users have come to
expect from the old mongo driver (e.g. CRUD methods, database and collection objects,
command helpers) and we expect it to be a common dependency for most applications built
with MongoDB. This library will also implement common specifications, in the interest of
improving API consistency across all of the drivers maintained by MongoDB (and hopefully
some community drivers, too). Sitting below that library we have the lower level driver. This
extension will effectively form the glue between PHP and our system libraries. It will expose
an identical public API for the most essential and performance-sensitive functionality:

 Connection management
 BSON encoding and decoding
 Object document serialization (to support ODM libraries)
 Executing commands and write operations
 Handling queries and cursors

Prerequisites

Before proceeding with this tutorial, you should have at least a basic understanding of
computer programming, the Internet, databases, and MySQL.

Common uses of PHP

 PHP performs system functions, i.e. from files on a system it can create, open, read,
write, and close them.
 PHP can handle forms, i.e. gather data from files, save data to a file, through email
you can send data, return data to the user.
 You can add, delete, and modify elements within your database through PHP, access
cookie variables and set cookies. Using PHP, you can restrict users from accessing some
pages of your website. It can encrypt data.


In order to develop and run PHP Web pages, three vital components need to be
installed on your computer system.

 Web Server − PHP will work with virtually all Web Server software, including
Microsoft's Internet Information Server (IIS), but the most often used is the freely available
Apache Server. Download Apache for free here − https://httpd.apache.org/download.cgi

 Database − PHP will work with virtually all database software, including Oracle and
Sybase, but the most commonly used is the freely available MySQL database. Download
MySQL for free here − https://www.mysql.com/downloads/

 PHP Parser − In order to process PHP script instructions, a parser must be installed to
generate HTML output that can be sent to the Web browser. This tutorial will guide you on
how to install a PHP parser on your computer.
PHP Parser Installation

Before you proceed, it is important to make sure that you have a proper environment
set up on your machine to develop your web programs using PHP.

Type the following address into your browser's address box.

http://127.0.0.1/info.php

If this displays a page showing your PHP installation related information, then you
have PHP and the web server installed properly. Otherwise, you have to follow the given
procedure to install PHP on your computer.

This section will guide you to install and configure PHP over the following four
platforms −

 PHP Installation on Linux or Unix with Apache

 PHP Installation on Mac OS X with Apache

 PHP Installation on Windows NT/2000/XP with IIS

 PHP Installation on Windows NT/2000/XP with Apache

Apache Configuration

If you are using Apache as a Web Server then this section will guide you to edit
Apache Configuration Files.

Just Check it here − PHP Configuration in Apache Server

PHP.INI File Configuration

The PHP configuration file, php.ini, is the final and most immediate way to affect
PHP's functionality.

Just Check it here − PHP.INI File Configuration


Windows IIS Configuration

To configure IIS on your Windows machine you can refer your IIS Reference Manual
shipped along with IIS.

The main way to store information in the middle of a PHP program is by using a
variable.

Here are the most important things to know about variables in PHP.

 All variables in PHP are denoted with a leading dollar sign ($).

 The value of a variable is the value of its most recent assignment.

 Variables are assigned with the = operator, with the variable on the left-hand side and
the expression to be evaluated on the right.

 Variables can, but do not need to, be declared before assignment.

 Variables in PHP do not have intrinsic types - a variable does not know in advance
whether it will be used to store a number or a string of characters.

 Variables used before they are assigned have default values.

 PHP does a good job of automatically converting types from one to another when
necessary.

 PHP variables are Perl-like.
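
A minimal sketch of these rules in action (the variable names are ours, purely illustrative):

<?php
$count = 10;          // no declaration needed before assignment
$count = "ten";       // the same variable may later hold a string
$price = 4.5;
$total = $price * 2;  // PHP converts types automatically
echo $total;          // prints 9
?>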

PHP has a total of eight data types which we use to construct our variables −

• Integers − whole numbers, without a decimal point, like 4195.
• Doubles − floating-point numbers, like 3.14159 or 49.1.
• Booleans − have only two possible values, true or false.
• NULL − a special type that only has one value: NULL.
• Strings − sequences of characters, like 'PHP supports string operations.'
• Arrays − named and indexed collections of other values.
• Objects − instances of programmer-defined classes, which can package up both other kinds of values and functions that are specific to the class.
• Resources − special variables that hold references to resources external to PHP (such as database connections).
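
A short sketch touching each of the eight types (the class, the values, and the data.txt file are illustrative only, not from the report):

<?php
class Crime { public $type = "theft"; }         // a programmer-defined class

$pages    = 4195;                               // Integer
$pi       = 3.14159;                            // Double
$isActive = true;                               // Boolean
$nothing  = NULL;                               // NULL
$greeting = 'PHP supports string operations.';  // String
$colors   = array("red", "green", "blue");      // Array
$record   = new Crime();                        // Object
$handle   = fopen("data.txt", "w");             // Resource (a file handle)
fclose($handle);
?>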

Conclusion

FINAL THOUGHT: it's very important to learn an entire subject matter. As a programmer-in-the-making, you may be inclined to take what you've learned and start coding immediately, before you've learned enough of the topic at large. In reality this will lead to you coding away and then eventually spending hours researching how to solve one little aspect you need. If you had learned the whole subject matter of, say, procedural PHP, you most likely would have naturally encountered that solution, and in a fraction of the time! It can often take many hours to research one small solution that results in one line of code, whereas learning that trick as a natural part of studying the whole subject might require only five minutes in between learning many other tricks. In other words, a developer who constantly has to seek out solutions to things he or she doesn't know will waste a lot more time in aggregate than someone who mastered the subject as a whole and then went on to apply it. You're just more relaxed and in a better learning mode when you're focused on nothing but learning. But when you're focused on producing results and have to learn at the same time, it can be stressful and waste tons of time going back and forth between testing each of the tens of wrong solutions you're trying out and googling until you find the right one.

Fig 3: Evolution of Various Scripts

MYSQL

MySQL is the most popular Open Source Relational SQL Database Management
System. MySQL is one of the best RDBMS being used for developing various web-based
software applications. MySQL is developed, marketed and supported by MySQL AB, which
is a Swedish company. This tutorial will give you a quick start to MySQL and make you
comfortable with MySQL programming.

Fig 4: Structure of Data Directory


MySQL Database

MySQL is a fast, easy-to-use RDBMS being used for many small and big businesses.
MySQL is developed, marketed and supported by MySQL AB, which is a Swedish
company. MySQL is becoming so popular because of many good reasons −

• MySQL is released under an open-source license, so you have nothing to pay to use it.
• MySQL is a very powerful program in its own right. It handles a large subset of the functionality of the most expensive and powerful database packages.
• MySQL uses a standard form of the well-known SQL data language.
• MySQL works on many operating systems and with many languages including PHP, PERL, C, C++, JAVA, etc.
• MySQL works very quickly and works well even with large data sets.
• MySQL is very friendly to PHP, the most appreciated language for web development.
• MySQL supports large databases, up to 50 million rows or more in a table. The default file size limit for a table is 4GB, but you can increase this (if your operating system can handle it) to a theoretical limit of 8 million terabytes (TB).
• MySQL is customizable. The open-source GPL license allows programmers to modify the MySQL software to fit their own specific environments.

MYSQL Functions

Here is the list of the most important MySQL functions. Each function is explained briefly, and a combined example follows the list.

• MySQL GROUP BY Clause − The MySQL GROUP BY statement is used along with the SQL aggregate functions, like SUM, to group the result dataset by certain database table column(s).
• MySQL COUNT Function − The MySQL COUNT aggregate function is used to count the number of rows in a database table.
• MySQL MAX Function − The MySQL MAX aggregate function allows us to select the highest (maximum) value for a certain column.
• MySQL MIN Function − The MySQL MIN aggregate function allows us to select the lowest (minimum) value for a certain column.
• MySQL SUM Function − The MySQL SUM aggregate function allows selecting the total for a numeric column.
• MySQL CONCAT Function − This is used to concatenate any strings inside any MySQL command.
• MySQL DATE and Time Functions − Complete list of MySQL date- and time-related functions.
• MySQL Numeric Functions − Complete list of MySQL functions required to manipulate numbers in MySQL.
• MySQL String Functions − Complete list of MySQL functions required to manipulate strings in MySQL.
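
As a combined illustration of several of these functions, assuming a hypothetical crimes(region, crime_type, occurred_on, severity) table (the report does not give its schema):

-- crimes summarized per region with GROUP BY and the aggregate functions
SELECT region,
       COUNT(*)      AS total_crimes,
       MAX(severity) AS worst_case,
       MIN(severity) AS mildest_case,
       SUM(severity) AS severity_sum
FROM crimes
GROUP BY region;

-- CONCAT and a date function used together
SELECT CONCAT(region, ' / ', crime_type) AS label,
       YEAR(occurred_on)                 AS crime_year
FROM crimes;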


Angular JS - Overview

AngularJS is an open-source web application framework. It was originally developed in 2009 by Misko Hevery and Adam Abrons and is now maintained by Google. Its latest version is 1.2.21.

AngularJS is a very powerful JavaScript framework used in Single Page Application (SPA) projects. It extends the HTML DOM with additional attributes and makes it more responsive to user actions. AngularJS is open source, completely free, and used by thousands of developers around the world. It is licensed under the Apache License version 2.0.

General Features

• AngularJS is an efficient framework that can create Rich Internet Applications (RIA).
• AngularJS provides developers options to write client-side applications using JavaScript in a clean Model View Controller (MVC) way.
• Applications written in AngularJS are cross-browser compliant. AngularJS automatically handles JavaScript code suitable for each browser.
• AngularJS is open source, completely free, and used by thousands of developers around the world. It is licensed under the Apache License version 2.0.

Overall, AngularJS is a framework to build large-scale, high-performance, and easy-to-maintain web applications.

Core Features

The core features of AngularJS are as follows −

• Data-binding − the automatic synchronization of data between model and view components.

• Scope − objects that refer to the model; they act as a glue between controller and view.

• Controller − JavaScript functions bound to a particular scope.

• Services − AngularJS comes with several built-in services, such as $http to make XMLHttpRequests. These are singleton objects which are instantiated only once in an app.

• Filters − these select a subset of items from an array and return a new array.

• Directives − markers on DOM elements such as elements, attributes, css, and more. These can be used to create custom HTML tags that serve as new, custom widgets. AngularJS has built-in directives such as ngBind, ngModel, etc.

• Templates − the rendered view with information from the controller and model. These can be a single file (such as index.html) or multiple views in one page using partials.

• Routing − the concept of switching views.

• Model View Whatever − MVW is a design pattern for dividing an application into different parts called Model, View, and Controller, each with distinct responsibilities. AngularJS does not implement MVC in the traditional sense, but rather something closer to MVVM (Model-View-ViewModel); the AngularJS team humorously refers to it as Model View Whatever.

• Deep Linking − deep linking allows the state of the application to be encoded in the URL so that it can be bookmarked; the application can then be restored from the URL to the same state.

• Dependency Injection − AngularJS has a built-in dependency injection subsystem that helps the developer create, understand, and test applications easily.
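
A minimal sketch of a controller putting data on the scope and a built-in filter shaping the view (the module, controller and property names are ours, not the report's):

<!DOCTYPE html>
<html ng-app="crimeApp">
<head>
  <script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.2.21/angular.min.js"></script>
  <script>
    // Controller: a JavaScript function bound to a scope
    angular.module('crimeApp', [])
      .controller('RegionController', function ($scope) {
        $scope.region = 'downtown';  // model data lives on the scope
      });
  </script>
</head>
<body ng-controller="RegionController">
  <!-- data-binding keeps the view in sync; 'uppercase' is a built-in filter -->
  <p>High-risk region: {{ region | uppercase }}</p>
</body>
</html>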

Fig 5: Overview - Angular JS

Advantages of AngularJS

The advantages of AngularJS are −

• It provides the capability to create Single Page Applications in a very clean and maintainable way.
• It provides data-binding capability to HTML, thus giving the user a rich and responsive experience.
• AngularJS code is unit testable.
• AngularJS uses dependency injection and makes use of separation of concerns.
• AngularJS provides reusable components.
• With AngularJS, developers can achieve more functionality with less code.
• In AngularJS, views are pure HTML pages, and controllers written in JavaScript do the business processing.

On top of everything, AngularJS applications can run on all major browsers and smartphones, including Android- and iOS-based phones and tablets.

Disadvantages of AngularJS

Though AngularJS comes with a lot of merits, here are some points of concern −

• Not Secure − Being a JavaScript-only framework, applications written in AngularJS are not safe. Server-side authentication and authorization are a must to keep an application secure.

• Not degradable − If the user of your application disables JavaScript, then nothing will be visible except the basic page.

AngularJS Directives

The AngularJS framework can be divided into three major parts −

• ng-app − This directive defines and links an AngularJS application to HTML.
• ng-model − This directive binds the values of AngularJS application data to HTML input controls.
• ng-bind − This directive binds the AngularJS application data to HTML tags.
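
As a minimal sketch, the three directives cooperate as follows: ng-app roots the application, ng-model binds the input, and ng-bind writes the value back out (the city field is purely illustrative):

<!DOCTYPE html>
<html>
<head>
  <script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.2.21/angular.min.js"></script>
</head>
<body>
  <div ng-app="">  <!-- ng-app: roots the AngularJS application -->
    <p>City: <input type="text" ng-model="city"></p>  <!-- ng-model: binds the input -->
    <p>Predicting crime for: <span ng-bind="city"></span></p>  <!-- ng-bind: outputs the data -->
  </div>
</body>
</html>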

WAMP Server

WAMP is an acronym that stands for Windows, Apache, MySQL, and PHP. It’s a
software stack which means installing WAMP installs Apache, MySQL, and PHP on your
operating system (Windows in the case of WAMP). Even though you can install them
separately, they are usually bundled up, and for a good reason too.

What’s good to know is that WAMP derives from LAMP (the L stands for Linux). The only difference between the two is that WAMP is used for Windows, while LAMP is used for Linux-based operating systems.

Let’s quickly go over what each letter represents. “W” stands for Windows; there’s also LAMP (for Linux) and MAMP (for Mac). “A” stands for Apache, the server software that is responsible for serving web pages: when you request a page, Apache grants your request over HTTP and shows you the site. “M” stands for MySQL, whose job is to be the database management system for your server; it stores all of the relevant information like your site’s content, user profiles, etc. “P” stands for PHP, the programming language that was used to write WordPress. It acts like glue for this whole software stack, running in conjunction with Apache and communicating with MySQL.

Fig 6: WAMP Structure


Instead of installing and testing WordPress on your hosting account, you can do it on your personal computer (localhost).

WAMP acts like a virtual server on your computer. It allows you to test all WordPress
features without any consequences since it’s localized on your machine and is not connected
to the web.

First of all, this means that you don’t need to wait until files are uploaded to your site,
and secondly – this makes creating backups much easier.

WAMP speeds up the work process for both developers and theme designers alike. What is more, you also get the benefit of playing around with your site to your heart’s content. However, to actually make the website go live, you need to get some form of hosting service and a domain. In essence, WAMP is used as a safe space to work on your website without needing to actually host it online. WAMP also has a control panel: once you install the software package, all of the services mentioned above (excluding the operating system, that is) will be installed on your local machine. Whether you use WAMP or the equivalent software packages for the other operating systems, it’s a great way to save time. You won’t have to upload files to a site and will be able to learn how to develop in a safe and carefree environment.
Architecture Diagram

Feature Extraction

Initially, we collect the crime dataset of a specific city and preprocess the data contained in it. The data and information held in the dataset are analyzed and preprocessed with various algorithms, and the preprocessed data are then extracted according to the features to which they relate.

During operation, the input is the crime dataset of a specific city; from that dataset the data are preprocessed using the salient features of the algorithms. The preprocessed data are extracted, and current crime occurrences are compared with the crime occurrences recorded in the dataset, so that the details can be detected in an efficient way. The detected result is analyzed, and in case of any errors or defects the same procedure is repeated to attain correct results and evaluation. The data are compared with the help of the Support Vector Machine (SVM) algorithm, which classifies the given data and compares them for proper predictions.

After comparing the data with the help of SVM, the result is gathered, and the gathered data are clustered using the K-Means clustering algorithm to obtain an accurate result for the entire system.

The data can be updated whenever new crimes occur, and those updates greatly enhance the efficiency of the system. As a result, new crime data are added to the previous dataset containing the list of crime events of the cities, and the entire data can be compared and clustered with the SVM and K-Means algorithms to retrieve better results. This process repeats continuously, so that every new crime event and its related contents are added to the dataset for better future performance.

After analyzing and extracting the data, we obtain clear information about the crime events of the city. This also provides an efficient crime prediction result, and using that information the occurrence of crime events can be detected or greatly reduced. In real life, it can be implemented in future smart cities to keep crime rates under control or to reduce them.
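
The following is a minimal sketch of this pipeline, assuming Python with scikit-learn and a hypothetical city_crimes.csv whose columns (latitude, longitude, hour, month, crime_type) are ours, not the report's; it illustrates the idea rather than reproducing the authors' implementation.

import pandas as pd
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# hypothetical dataset of crime records for one city
data = pd.read_csv("city_crimes.csv")
X = data[["latitude", "longitude", "hour", "month"]].values
y = data["crime_type"].values

# preprocessing: scale the features before training the SVM
X = StandardScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# the SVM classifies crime records; new events are compared against it
svm = SVC(kernel="rbf").fit(X_train, y_train)
print("classification accuracy:", svm.score(X_test, y_test))

# K-Means then groups the records into dense spatial clusters (here, 5 regions)
data["region"] = KMeans(n_clusters=5, n_init=10).fit_predict(X)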

Dataflow Diagram

Level 0: the Admin interacts with the Crime Prediction process, which is backed by the Database.

Level 1: the Admin drives the flow Acquire Dataset → Fetch Crime Details → Preprocess Data → Analyze Information → Data Segmentation, with the Database as the data store.

Level 2: the Admin drives the flow SVM Classifier → Classify Segmented Data → Predict Accuracy, followed by K-Means Clustering → Data Retrieval → Feature Extraction, again backed by the Database.
UML Diagram
The Unified Modeling Language (UML) is a general-purpose, developmental,
modeling language in the field of software engineering that is intended to provide a standard
way to visualize the design of a system.

Use Case Diagram


Use case diagrams are usually referred to as behaviour diagrams used to describe a set
of actions (use cases) that some system or systems (subject) should or can perform in
collaboration with one or more external users of the system (actors).

A use case diagram at its simplest is a representation of a user's interaction with the
system that shows the relationship between the user and the different use cases in which the
user is involved.
The use case diagram for this system involves two actors, the training phase and the testing phase, and the following use cases: Acquire Dataset, Retrieve Data, Preprocess Retrieved Data, Crime Data Segmentation, SVM Classifier, Classification of Data, Predict Accuracy Using K-Means, and Feature Extraction.
Class Diagram
The class diagram is the main building block of object-oriented modelling. It is used for general conceptual modelling of the systematics of the application, and for detailed modelling, translating the models into programming code. Class diagrams can also be used for data modelling.
In the diagram, classes are represented with boxes that contain three compartments:
The top compartment contains the name of the class. It is printed in bold and centered,
and the first letter is capitalized.

The middle compartment contains the attributes of the class. They are left-aligned and
the first letter is lowercase.

The bottom compartment contains the operations the class can execute. They are also
left-aligned and the first letter is lowercase.
In the design of a system, a number of classes are identified and grouped together in a
class diagram that helps to determine the static relations between them. With detailed
modelling, the classes of the conceptual design are often split into a number of subclasses. In
order to further describe the behaviour of systems, these class diagrams can be complemented
by a state diagram or UML state machine.
The class diagram comprises three classes:

TrainingPhase − attributes: Acquire, Retrieve, Preprocess, Extraction; operations: Acquire Dataset(), Retrieve Data(), Preprocess Data(), Feature Extraction().

TestingPhase − attributes: Acquire, Retrieve, Preprocess, Segmentation, Classify, Predict, Extraction; operations: Acquire Dataset(), Retrieve Data(), Preprocess Data(), Data Segmentation(), Classification of Data(), Predict Accuracy(), Feature Extraction().

Database Server − attributes: Store, View; operations: Store Data(), View Data().

Activity Diagram
An activity diagram visually presents a series of actions or flow of control in a system
similar to a flowchart or a data flow diagram. Activity diagrams are often used in business
process modelling. They can also describe the steps in a use case diagram. Activities
modelled can be sequential and concurrent.
The basic purpose of activity diagrams is similar to that of the other four diagrams: it captures the dynamic behaviour of the system. However, while the other four diagrams are used to show the message flow from one object to another, the activity diagram is used to show the flow from one activity to another. Activity diagrams are constructed from a limited number of shapes, connected with arrows.
The activity diagrams for the training and testing phases both begin with Acquire Dataset → Retrieve Data → Preprocess Retrieved Data; the testing phase then continues with Crime Data Segmentation → SVM Classifier → Classification of Data → Predict Accuracy Using K-Means → Feature Extraction.
Sequence Diagram
The sequence diagram is a good diagram to use to document a system's requirements and to flesh out a system's design. The reason the sequence diagram is so useful is that it shows the interaction logic between the objects in the system in the time order in which the interactions take place.

A sequence diagram shows object interactions arranged in time sequence. It depicts the objects and classes involved in the scenario and the sequence of messages exchanged between the objects needed to carry out the functionality of the scenario. A sequence diagram shows, as parallel vertical lines (lifelines), different processes or objects that live simultaneously, and, as horizontal arrows, the messages exchanged between them in the order in which they occur. This allows the specification of simple runtime scenarios in a graphical manner.

The lifelines are the Training Phase, the Data Server, and the Testing Phase; the messages exchanged are:

1: Provide Dataset

2: Fetch Data

3: Acquire Data

4: Preprocess Acquired Data

5: Crime Data Segmentation

6: SVM Classifier

7: Data Classification

8: Predict Accuracy Using K - Means Clustering

9: Predict Accuracy Using K - Means Clustering

10: Feature Extraction


Collaboration Diagram
A collaboration diagram, also called a communication diagram or interaction diagram, is an illustration of the relationships and interactions among software objects in the Unified Modelling Language (UML). UML collaboration diagrams illustrate the relationship and interaction between software objects.

They require use cases, system operation contracts, and domain model to already
exist. The collaboration diagram illustrates messages being sent between classes and objects
(instances). A collaboration diagram shows the objects and relationships involved in an
interaction, and the sequence of messages exchanged among the objects during the
interaction. Communication diagrams model the interactions between objects in sequence.
They describe both the static structure and the dynamic behaviour of a system. In many ways,
a communication diagram is a simplified version of a collaboration diagram introduced in
UML 2.0.

The collaboration diagram connects the Training Phase, Data Server, and Testing Phase objects through the same numbered messages as the sequence diagram: 1: Provide Dataset; 2: Fetch Data; 3: Acquire Data; 4: Preprocess Acquired Data; 5: Crime Data Segmentation; 6: SVM Classifier; 7: Data Classification; 8 and 9: Predict Accuracy Using K-Means Clustering; 10: Feature Extraction.
SYSTEM IMPLEMENTATION

Modules
• Dataset acquisition
• Preprocessing
• Classification
• Feature Extraction
• Prediction
• Evaluation criteria
Module Description

Data Acquisition

Initially, we collect the crime dataset of a specific city and preprocess the data contained in it. The data and information held in the dataset are analyzed and preprocessed with various algorithms, and the preprocessed data are extracted according to the features to which they relate.

During operation, the input is the crime dataset of a specific city; from that dataset the data are preprocessed using the salient features of the algorithms.

Preprocessing
The preprocessed data are extracted, and current crime occurrences are compared with the crime occurrences recorded in the dataset, so that the details can be detected in an efficient way. The detected result is analyzed, and in case of any errors or defects the same procedure is repeated to attain correct results and evaluation.

Classification

The data are compared with the help of the Support Vector Machine (SVM) algorithm, which classifies the given data and compares them for proper predictions. After the comparison with SVM, the result is gathered, and the gathered data are clustered using the K-Means clustering algorithm to obtain an accurate result for the entire system.

Segmentation

Once the data have been classified by SVM, the K-Means clustering provides the accurate value needed to attain the results. The data can be updated whenever new crimes occur, and those updates greatly enhance the efficiency of the system. As a result, new crime data are added to the previous dataset containing the list of crime events of the cities, and the entire data can be compared and clustered with the SVM and K-Means algorithms to retrieve better results.
Classification
This step performs a spatial clustering of the data set, where each cluster represents a dense region of crimes. The density-based notion is a common approach for clustering, whose inspiring idea is that objects forming a dense region should be grouped together into one cluster. In our implementation, this step is performed by applying a popular density-based clustering algorithm that finds clusters starting from the estimated density distribution of the considered data. We have chosen the SVM algorithm because it has the ability to discover clusters with arbitrary shapes (linear, concave, oval, etc.) and, differently from other proposed clustering algorithms, it does not require the number of clusters to be fixed in advance.

Feature Extraction
Given a specific dense region, the CRIMEPREDICTOR method discovers a predictive model to forecast the number of crimes that will happen in its specific area. In our implementation, this is performed by the Seasonal Auto-Regressive Integrated Moving Average (SARIMA) model, which is defined as a combination of auto-regression, moving-average and difference modeling. Briefly, denoting by X_t the value of the time series at timestamp t, an (already differenced) ARIMA model can be written as

X_t = c + φ_1 X_{t-1} + … + φ_p X_{t-p} + θ_1 ε_{t-1} + … + θ_q ε_{t-q} + ε_t

where φ_1 … φ_p are the regression coefficients of the auto-regressive part, θ_1 … θ_q are the coefficients of the moving-average part, X_{t-i} and ε_{t-j} are the lagged values and lagged errors, and ε_t is the white noise that takes into account the forecast error.
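
A minimal sketch of this forecasting step, assuming Python's statsmodels library and a hypothetical region_counts.csv holding monthly crime counts for one region (the report does not state the tooling used):

import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

# hypothetical series: one column 'crimes', indexed by month
counts = pd.read_csv("region_counts.csv", index_col="month",
                     parse_dates=True)["crimes"]

# SARIMA(p,d,q)(P,D,Q,s): auto-regression, differencing and moving average,
# plus a yearly (s = 12) seasonal component
model = SARIMAX(counts, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
result = model.fit(disp=False)

# forecast the next 12 months of crime counts for this region
print(result.forecast(steps=12))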

Prediction
As described, crime dense regions are detected by applying our ad-hoc modified version of SVM, which exploits a decay factor giving higher weight to recent crime events than to older ones. Moreover, in order to detect high-quality crime dense regions, it is necessary to profitably tune the key parameters of the algorithm so as to improve the results' performance. In particular, the values of the SVM's parameters determine the size of the clusters, as they represent the minimum crime density required for an area to be part of a cluster. On the one hand, the lower these values, the larger the extension of the dense regions detected: this results in the discovery of large regions that actually are no longer dense. On the other hand, the higher these values, the smaller the cluster sizes, resulting in a high number of dense regions that may not be significant for the analysis; growing the value also increases the fragmentation of the produced clustering assignment. These values are therefore a key factor for the accuracy of the dense-region detection phase and for the right balance among separability, compactness and significance of clusters.
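
The report does not spell out its tuning procedure. As one common illustration (not the authors' method), the sketch below selects the number of K-Means clusters by silhouette score, which rewards compact, well-separated regions and penalizes over-fragmentation; the data here is synthetic.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
points = rng.random((300, 2))  # stand-in for scaled crime coordinates

best_k, best_score = None, -1.0
for k in range(2, 10):
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(points)
    score = silhouette_score(points, labels)  # higher is better
    if score > best_score:
        best_k, best_score = k, score

print("chosen number of regions:", best_k, "silhouette:", round(best_score, 3))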

Evaluation Criteria
To evaluate the performance and effectiveness of the approach described in this paper, we carried out an experimental evaluation by analyzing crimes occurring in a large area of the city. The main goal is to detect the most significant crime dense regions and to discover a predictive model for each one, to forecast the number of crimes that will happen in each area in the future. The following subsections describe the main issues of our analysis: data description and gathering, crime dense region detection, the training of a regressive model for each region, and the evaluation of the model on the test set. The evaluation reports the K-Means clustering values for the whole area and for the top three largest crime dense regions, considering one-year-ahead, two-year-ahead and three-year-ahead prediction horizons.
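
As an illustrative sketch of how such horizon-wise accuracy could be computed (assuming Python, with purely made-up actual and forecast yearly counts for one region; the report does not publish its numbers):

import numpy as np

actual    = np.array([120, 135, 150])   # hypothetical observed counts: 1, 2, 3 years ahead
predicted = np.array([115, 140, 160])   # hypothetical model forecasts for the same horizons

# absolute error per prediction horizon
for horizon, (a, p) in enumerate(zip(actual, predicted), start=1):
    print(f"{horizon}-year-ahead absolute error: {abs(a - p)}")

# mean absolute error over all horizons, a common accuracy measure
print("MAE:", np.mean(np.abs(actual - predicted)))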
SYSTEM TESTING

Test Case
File-level deduplication saves a relatively large amount of memory space. In general, file-level deduplication views multiple copies of the same file: it stores the first file and then links the other references to it, so only one copy is stored. In testing, even when file names are the same, the system is able to detect duplication. If we upload the same file under different names, only the content is considered, not the names; thus redundant data is avoided.

In the registration phase, a user may not have registered before and types in their information. If the user is a new user, an alert message is displayed saying that the user has not registered before.

Fig 7: System Testing.


Unit Testing
Unit testing is the testing of an individual unit or a group of related units. It is done by the programmer to verify that the implementation produces the expected output for a given input, and it falls under white-box testing. Unit testing is done to check whether the user is properly registered into the cloud, and to check whether a file is properly uploaded into the cloud. Encryption and decryption are also checked with unit testing to confirm that data is converted properly. Finally, deduplication is checked with unit testing.
Integration Testing
All the modules should be integrated into a single module, and integration testing should check that the combined system still works.

System Testing
System testing is done by putting the software in different environments and checking that it still works. Here, it is done by uploading the same file to the cloud and checking whether any duplicate file exists.
Software Testing
Software testing is the process of evaluating a software item to detect differences between the given input and the expected output, and to assess the features of the software item. Testing assesses the quality of the product. It is a process that should be carried out during the development process; in other words, software testing is a verification and validation process.

There are two types of software testing.


1. Black box testing
2. White box testing

Verification
Verification is the process to make sure the product satisfies the conditions imposed at
the start of the development phase. In other words, to make sure the product behaves the way
we want it to.

Validation
Validation is the process to make sure the product satisfies the specified requirements
at the end of the development phase. In other words, to make sure the product is built as per
customer requirements.
Black Box Testing
Black-box testing ignores the internal mechanism of the system and focuses on the output generated for any given input during the execution of the system. It is done for validation. Here, it is done to check encryption and decryption after uploading a file into the cloud.

White Box Testing


White-box testing takes into account the internal mechanism of the system and is done for verification. Here, it is done through content verification: it verifies whether the same content already exists in the cloud.
Conclusion
This paper presented a general algorithm for spatio-temporal crime prediction in urban areas that takes advantage of partitioning the whole analyzed area into crime dense regions (of arbitrary shapes). Such regions are then analyzed, and a different forecasting autoregressive model is tailored specifically for each detected region. Experimental evaluation, performed on crime data from a wide area of a city, showed that the proposed methodology can forecast the number of crimes with high accuracy. Furthermore, the approach gives fine-grained information about where crime events are expected to occur.
Future Enhancement

In future work, other research issues may be investigated. First, we may further explore the application of other spatial analysis approaches for the detection of crime dense regions and for modeling and forecasting crime trends in such regions. Second, we may perform an extended experimental evaluation on a wider urban territory, to assess the results obtained in the case study reported here. Third, we may apply such an approach to the spatio-temporal prediction of other kinds of events, different from crimes.
References
[1] United Nations Human Settlements Programme, The State of the World's Cities 2004/2005: Globalization and Urban Culture. Earthscan, 2004.

[2] “Cities: The century of the city,” Nature, vol. 467, pp. 900–901, 2010.

[3] F. Cicirelli, A. Guerrieri, G. Spezzano, and A. Vinci, “An edge-based platform for
dynamic smart city applications,” Future Generation Comp. Syst., vol. 76, 2017.

[4] M. Tayebi, M. Ester, U. Glasser, and P. Brantingham, “Crimetracer: Activity space based
crime location prediction,” in Advances in Social Networks Analysis and Mining (ASONAM),
2014 IEEE/ACM International Conference on, 2014, pp. 472–480.

[5] H. Wang, D. Kifer, C. Graif, and Z. Li, “Crime rate inference with big data,” in
Proceedings of the 22Nd ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining, ser. KDD ’16. ACM, 2016, pp. 635–644.

[6] C. Catlett, T. Malik, B. Goldstein, J. Giuffrida, Y. Shao, A. Panella, D. Eder, E. van Zanten, R. Mitchum, S. Thaler, and I. T. Foster, “Plenario: An open data discovery and exploration platform for urban science,” IEEE Data Eng. Bull., vol. 37, no. 4, 2014.

[7] T. Wang, C. Rudin, D. Wagner, and R. Sevieri, “Learning to detect patterns of crime,” in
Machine Learning and Knowledge Discovery in Databases - European Conference, ECML
PKDD 2013, 2013.

[8] D. E. Brown and S. Hagen, “Data association methods with applications to law
enforcement,” Decision Support Systems, vol. 34, no. 4, pp. 369– 378, 2003.

[9] J. S. d. Bruin, T. K. Cocx, W. A. Kosters, J. F. J. Laros, and J. N. Kok, “Data mining approaches to criminal career analysis,” in Proceedings of the Sixth International Conference on Data Mining, ser. ICDM ’06, 2006, pp. 171–177.
[10] G. Wang, H. Chen, and H. Atabakhsh, “Automatically detecting deceptive criminal
identities,” Commun. ACM, vol. 47, no. 3, pp. 70–76, 2004.

[11] B. Chandra, M. Gupta, and M. Gupta, “A multivariate time series clustering approach
for crime trends prediction,” in Systems, Man and Cybernetics, 2008. SMC 2008. IEEE
International Conference on, 2008, pp. 892–896.

[12] S. V. Nath, “Crime pattern detection using data mining,” in Web Intelligence and
Intelligent Agent Technology Workshops, 2006. WIIAT 2006 Workshops. 2006
IEEE/WIC/ACM International Conference on, 2006, pp. 41–44.

[13] H. Chen, W. Chung, J. Xu, G. Wang, Y. Qin, and M. Chau, “Crime data mining: a
general framework and some examples,” Computer, vol. 37, no. 4, pp. 50–56, 2004.

[14] C.-H. Yu, M. Ward, M. Morabito, and W. Ding, “Crime forecasting using data mining
techniques,” in Data Mining Workshops (ICDMW), 2011 IEEE 11th International
Conference on, 2011, pp. 779–786.

[15] Y. Zhuang, M. Almeida, M. Morabito, and W. Ding, “Crime hot spot forecasting: A
recurrent model with spatial and temporal information,” in 2017 IEEE International
Conference on Big Knowledge (ICBK), Aug 2017, pp. 143–150.

[16] H. Wang, D. Kifer, C. Graif, and Z. Li, “Crime rate inference with big data,” in
Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining. ACM, 2016, pp. 635–644.

[17] W. Gorr, A. Olligschlaeger, and Y. Thompson, “Short-term forecasting of crime,” International Journal of Forecasting, vol. 19, no. 4, pp. 579–594, 2003.

[18] P. Chen, H. Yuan, and X. Shu, “Forecasting crime using the arima model,” in Fuzzy
Systems and Knowledge Discovery, 2008. FSKD ’08. Fifth International Conference on, vol.
5, 2008, pp. 627–630.
[19] E. Cesario, C. Catlett, and D. Talia, “Forecasting crimes using autoregressive models,”
in 2016 IEEE 2nd Int. Conf. on Big Data Intelligence and Computing and Cyber Science and
Technology, 2016, pp. 795–802.

[20] M. Ester, H.-P. Kriegel, J. Sander, and X. Xu, “A density-based algorithm for
discovering clusters in large spatial databases with noise,” in Proceedings of the Second
International Conference on Knowledge Discovery and Data Mining, ser. KDD’96. AAAI
Press, 1996.

[21] R. J. Hyndman and G. Athanasopoulos, Forecasting: Principles and Practice. OTexts.com, 2014.
1. WORKPLAN

S.No.  Project Module                                    Project Completion Period
1.     Collecting Base Paper                             1st Week of January 2019
2.     Studying the Base Paper and Proposed Features     2nd and 3rd Week of January 2019
3.     Collecting the Datasets and Needed Information    1st and 2nd Week of February 2019
4.     Implementation Works                              3rd and 4th Week of February 2019
5.     Project Completion and Further Modification       1st and 2nd Week of March 2019
6.     Preparation of Report                             3rd Week of March 2020

2. BUDGET

Requirements                                           Total
System Price                                           12,500
Godrej Security Solutions Seethru HD IR CCTV Camera     1,100
Project Development Cost (collection of raw data)       1,500
Total Expected Budget                                  15,100 (Fifteen Thousand and One Hundred only)
