
Document Clustering for Forensic Analysis: An Approach for Improving Computer Inspection
PROBLEM DEFINITION & ANALYSIS

In a practical and realistic scenario, domain experts (e.g., forensic
examiners) are scarce and have limited time available for performing
examinations. Thus, it is reasonable to assume that, after finding a relevant
document, the examiner could prioritize the analysis of other documents
belonging to the same cluster, because these are likely to be relevant
to the investigation as well. Such an approach, based on document clustering, can
improve the analysis of seized computers, as will be discussed in more
detail later.
EXISTING SYSTEM:

No clustering system currently exists for crime detection. As a result, it can be
difficult to retrieve the history of crime data, and detecting crimes takes much more time.
PROPOSED SYSTEM:

In this context, automated methods of analysis are of great interest. In particular,
algorithms for clustering documents can facilitate the discovery of new and useful
knowledge from the documents under analysis. We present an approach that
applies document clustering algorithms to the forensic analysis of computers seized
in police investigations.
Clustering algorithms are typically used for exploratory data analysis, where
there is little or no prior knowledge about the data. This is precisely the case in
several applications of Computer Forensics, including the one addressed in our
work.
The goal is to formulate crime pattern detection as a machine learning task and
thereby use data mining to support police detectives in solving crimes.
Cluster details:
- based on crime
- probability
- crime nature

The whole system that we have proposed is divided mainly into three modules:
Login and Registration Module:
There are three users of the Crime Analyser: the Administrator, an Inspector from
every station, and the Head of the Department (here, the Inspector General). The three
users are given different user names and passwords. This module handles the login of
the three users and the registration of a new police station in the system.
Each of the three users has a home page with that user's specified
privileges.
Crime Registration Module:
a. The crime registration module deals with crime registration,
criminal registration, and all entries that make up a crime registration.
The Administrator is given the privilege to make the following registrations:
Station registration
State entry
District entry
Place entry
Main crime entry
Crime entry
Weapon entry
Criminal entry
Case entry
Designation entry
Criminal to Case entry
b. It is the Administrator's duty to enter basic details such as states and
districts; the Inspector and the Head are not given those privileges.
The Inspector from each police station is given the privilege to register
the following:
Crime registration entry
Suspect details entry
Set criminal to case
The Head of the Department (here, the Inspector General) is not given
any privileges to register a crime.
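
The division of privileges among the three users can be sketched as a small role model. This is only an illustration under stated assumptions: the class, enum, and constant names are hypothetical, the Administrator's individual entry screens are grouped under a single MASTER_DATA_ENTRY privilege, and CRIME_ANALYSIS reflects the analysis privilege that the crime cluster processing module grants to the Inspector and the Head.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical privilege model; all names are illustrative, not taken from the system. */
final class CrimeAnalyserAccess {

    /** Coarse-grained privileges assumed for this sketch. */
    enum Privilege { MASTER_DATA_ENTRY, CRIME_REGISTRATION, CRIME_ANALYSIS }

    enum Role {
        // Enters stations, states, districts, places, weapons, and similar basic details.
        ADMINISTRATOR(EnumSet.of(Privilege.MASTER_DATA_ENTRY)),
        // Registers crimes and suspects for a station and may analyse crimes.
        INSPECTOR(EnumSet.of(Privilege.CRIME_REGISTRATION, Privilege.CRIME_ANALYSIS)),
        // Head of the Department: analysis only, no registration.
        INSPECTOR_GENERAL(EnumSet.of(Privilege.CRIME_ANALYSIS));

        private final Set<Privilege> privileges;

        Role(Set<Privilege> privileges) {
            this.privileges = privileges;
        }

        /** Returns true if this role holds the given privilege. */
        boolean can(Privilege privilege) {
            return privileges.contains(privilege);
        }
    }

    private CrimeAnalyserAccess() {}
}
```

A home page could then show or hide each registration form by calling, for example, Role.INSPECTOR.can(Privilege.CRIME_REGISTRATION).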

Crime cluster processing:

Clustering algorithms have been studied for decades, and the literature on the
subject is huge. Therefore, we decided to choose a representative algorithm in
order to show the potential of the proposed approach, namely the partitional
K-means clustering algorithm.
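
To make the choice of algorithm concrete, the following is a minimal sketch of partitional K-means, not the system's actual implementation. It assumes that each document or crime record has already been turned into a numeric vector (for example, term frequencies), that squared Euclidean distance is the dissimilarity measure, and that the first k vectors serve as the initial centroids so the result is deterministic. None of these choices is specified in this document, and the class and method names are hypothetical.

```java
import java.util.Arrays;

/** Minimal K-means sketch over dense vectors; names and design choices are assumptions. */
final class KMeansSketch {

    /** Returns, for each input vector, the index of the cluster it was assigned to. */
    static int[] cluster(double[][] points, int k, int maxIterations) {
        if (k < 1 || k > points.length) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        int dim = points[0].length;

        // Assumption: seed the centroids with the first k points (deterministic, not optimal).
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = points[c].clone();
        }

        int[] assignment = new int[points.length];
        Arrays.fill(assignment, -1);

        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: attach every point to its nearest centroid.
            boolean changed = false;
            for (int i = 0; i < points.length; i++) {
                int best = nearest(points[i], centroids);
                if (best != assignment[i]) {
                    assignment[i] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged: no point switched cluster
            }

            // Update step: move each centroid to the mean of its assigned points.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int i = 0; i < points.length; i++) {
                counts[assignment[i]]++;
                for (int d = 0; d < dim; d++) {
                    sums[assignment[i]][d] += points[i][d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous centroid
                }
                for (int d = 0; d < dim; d++) {
                    centroids[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return assignment;
    }

    /** Index of the centroid closest to p under squared Euclidean distance. */
    static int nearest(double[] p, double[][] centroids) {
        int best = 0;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centroids.length; c++) {
            double distance = squaredDistance(p, centroids[c]);
            if (distance < bestDistance) {
                bestDistance = distance;
                best = c;
            }
        }
        return best;
    }

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }
}
```

Records that receive the same index belong to the same cluster, so once an examiner marks one record as relevant, the others sharing its index are natural candidates to review next.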
This final module deals with crime analysis. Only the Inspector and the Head
are given the privilege to perform crime analysis, because the Administrator may or may
not be from the police department. The crime analyser analyses crime patterns
using the history of crimes stored in the database. The evidence and
observations from the crime scene are collected and entered. The crime pattern
analyser then compares the new entry with the crime history to find a similarity or
a pattern among the crimes that have occurred. The pattern obtained helps speed up the
investigation process for the detectives.
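
The comparison of a new entry against the crime history could look like the sketch below. The document does not say which similarity measure the analyser uses, so this example assumes that each case is described by a set of attribute strings (crime type, weapon, place, and so on) and ranks past cases by Jaccard similarity to the new entry. The class name, method names, and attribute encoding are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical matcher that ranks past cases by attribute overlap with a new entry. */
final class CrimePatternMatcher {

    /** Jaccard similarity: intersection size divided by union size; 0 for two empty sets. */
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    /** Returns case IDs ordered from most to least similar; ties are broken by case ID. */
    static List<String> rankBySimilarity(Set<String> newEntry, Map<String, Set<String>> history) {
        List<String> caseIds = new ArrayList<>(history.keySet());
        caseIds.sort(Comparator
                .comparingDouble((String id) -> jaccard(newEntry, history.get(id)))
                .reversed()
                .thenComparing(Comparator.<String>naturalOrder()));
        return caseIds;
    }

    private CrimePatternMatcher() {}
}
```

In a real deployment the history map would be loaded from the database, and the top-ranked cases would be presented to the Inspector or the Head as candidate patterns.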

Platform: Java

Database: MySQL
