Data Analytics On Cybercrime - An Indian Perspective
Authors:
1. Leo Gladwin L
Research Scholar,
CHRIST (Deemed to be University)
Email: l.leo@res.christuniversity.in
2. Dr. Sangeetha R
Associate Professor,
Department of Management Studies
CHRIST (Deemed to be University)
Email: sangeetha.r@christuniversity.in
Abstract
Data science has a major role to play for the present generation, as the physical world is
transformed into a digital one. Citizens of the physical world are now called Netizens in
the virtual world, meaning people involved in online communities or the Internet. In this
growing digital age, we have become more dependent on the Internet for many of our daily
activities, and global digital transactions have increased, especially during the pandemic. At every
phase, the world is witnessing new forms of cybercrime, which is a big challenge for every citizen. People
need to be aware of cybercrimes, how they are committed and how many occur, in order to protect themselves and
their families from such fraudulent activities. Big data analytics can do a great job in
anticipating future cyber-attacks, especially when Artificial Intelligence (AI) and
Machine Learning (ML) are integrated into the platform. Anticipating attacks is the most
effective way to fight cybercrime. ML uses various algorithms to read and
learn from data and to understand the consequences. Incorporating big data analytics helps
organisations to classify the type of threat, detect anomalies and measure how frequently cyber
threats occur. Hence, it allows organisations and individuals to actively secure their data. This
paper explores the role of data science, using data analytics techniques to
project future cybercrimes based on historical data from 2016 to 2019. The data is
drawn from secondary sources. The study found that electronic crimes are growing
tremendously, with the motive of stealing information, such as an organisation’s confidential data
and debit and credit card details, through Internet banking frauds and OTP frauds. This paper also offers
recommendations and tips for citizens to surf safely in spite of growing threats everywhere.
Keywords: Electronic crime, cybercrime statistics, Digital victims, hacking, bank frauds
1. Introduction
Cybercrime, also known as Internet crime or digital crime, is a fast-growing mode of crime at
present. Digital platforms have opened up plenty of positive opportunities, but at the same time they are
misused by criminals for fraudulent activity. Even the Twitter account of Rahul Gandhi
(MP) was hacked, and the cyber criminals threatened to release all the secret
communications of the party (Business Standard, 2016).
A Kaspersky Security Network (2020) report shows that it detected and blocked 5,28,20,874 (about 5.28
crore) cyber threats in India from January 2020 to March 2020. India saw a 37 per cent
rise in cyber-attacks between the two consecutive quarterly reports of 2019-20. Accordingly, India
ranked 27th globally in Q1-2020, compared with 32nd in Q4-2019.
Data science is an important field that can come to the rescue in combating cybercrimes.
If companies are able to advertise their products on the Internet based on an individual’s
browsing behavior, then it should also be possible to identify the malicious
behavior of an individual. Here, data science can play a crucial role through four major
categories.
Firstly, clustering and classification algorithms help in identifying good and bad transactions.
Secondly, anomaly detection algorithms can be applied to detect bad transactions based on
browsing history and behavior.
Thirdly, regression algorithms can be applied to predict what a criminal is about to do.
Lastly, Artificial Intelligence (AI) using machine learning can block such hackers. The AI of an
Internet Service Provider (ISP) can even go to the extent of blocking their phones,
computers, internet connections and even SIM cards. If the transactions are highly suspicious, it may even be
possible to locate such criminals.
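The second category, anomaly detection, can be illustrated with a minimal sketch. It assumes a modified z-score rule based on the median and the median absolute deviation, with 3.5 as a commonly used cutoff. The function name `flag_anomalies` and the transaction amounts are hypothetical and are not taken from this study.

```python
from statistics import median


def flag_anomalies(amounts, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold."""
    med = median(amounts)
    # Median absolute deviation (MAD): a spread measure that outliers barely affect.
    mad = median(abs(x - med) for x in amounts)
    if mad == 0:
        return []  # no spread to measure against, so nothing is flagged
    return [x for x in amounts if 0.6745 * abs(x - med) / mad > threshold]


# Hypothetical transaction amounts (illustrative only, not from the paper).
transactions = [120, 95, 130, 110, 105, 99, 125, 5000]
suspicious = flag_anomalies(transactions)
print(suspicious)
```

A median-based rule is used here rather than the mean and standard deviation because a single extreme value inflates the standard deviation and can hide itself in a small sample.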
Infosecurity (2019), a prominent magazine, reported that data scientists depend upon machine
learning to detect would-be malicious transactions. It allows them to predict risks based on
past behavior patterns. Regression is a useful tool which, together with an Intrusion Detection
System (IDS), helps to project forthcoming cyber-attacks.
Review of Literature
A range of literature was reviewed with the intention of identifying the research gap in
cybercrime, along with the impact of GDP, unemployment and literacy on cybercrime.
Rennie Naidoo (2020) analysed and interpreted COVID-19 related cybercrime data from across
the globe. The researcher noted that the number of cybercrimes has increased, especially
during the pandemic, involving fake charities, relief programmes, social media profiles,
websites, emails, products and work-from-home jobs. Kamini Dashora (2011) studied
various problems related to the Internet and suggested some ideas to overcome them. Some
of the problems include hacking, phishing, cybersquatting and cyber terrorism. The study concludes
that preventive measures help, such as updating antivirus software, controlling cookies and
avoiding the use of credit or debit cards. Md Shamimul Hasan (2015) carried out research to
protect students by creating awareness and to provide empirical evidence to assist policy
makers in combating cybercrime. That paper also suggests remedies for the younger generation to
reduce the risk of becoming victims of cybercrime.
Lin & Lu (2011) identified some of the major factors that lead people to join social media. They
also found that males and females have dissimilar influencing factors. Females are influenced
by the number of their friends and family members on social media, whereas males are not influenced by the number of their
peers. O'Keeffe & Clarke (2011) concluded that engagement with social
networking sites is common among youth and children in this generation. Gaming sites, simulated worlds, video sites
(YouTube) and blogs offer great entertainment for them. Dwyer, Hiltz, & Passerini (2007)
studied popular social media sites such as Myspace and Facebook to examine trust and
privacy concerns. No difference was identified in privacy matters. The study
concludes that privacy and trust do not really matter as far as social media sites are concerned.
Malathi A, Babboo SS and Anbarasi A (2011) state that data mining plays an important role in
combating crime-related issues. Law administration and intelligence analysis can further
help in decision-making. McClendon, L., & Meghanathan, N (2015) show that data mining is an
important tool that can be used to detect and prevent crime. Mittal, GLM and SJ.K (2019)
convey that machine learning tools are handy for analysing crime data and determining the
economic factors affecting crime in the country. They found a correlation between
the unemployment rate and robbery. Tayal, D. K. et al. (2015) say that crimes in India are
increasing due to factors such as illiteracy, relocation, joblessness, poverty and
corruption. Their approach has various modules such as Data Extraction, Data Processing, Google
map, Clustering and the WEKA (Waikato Environment for Knowledge Analysis) software. Hussain
KZ et al (2012) examined in detail the data mining methods used to project crime and the
decision-making processes that were automated from information extracted from legal
documents. Saeed et al. (2015) estimated fraudulent activities using different classification
techniques such as decision trees and Naïve Bayes. They found that decision trees are less
suitable than Naïve Bayes classifiers for examining crime data. They used the communities
dataset from the UCI machine learning repository.
Previous studies on cybercrime were mainly based on primary data and theoretical
perspectives. The review identified a need to explore the pattern
of various cybercrimes along with the impact of economic factors such as the Gross Domestic
Product (GDP), unemployment and literacy. Hence, this paper gives special attention to
statistical facts on cybercrime in India, especially for 2019.
2. Data and Methodology
The research is based on descriptive analysis, in which the data and statistics are mainly taken
from secondary sources, such as websites, NCRB reports, journals and articles. The study
explores cybercrime statistics for the past four years (2016-2019), while the figures for 2020
are estimated using linear regression. It covers different types of cybercrime, such as identity
theft, banking frauds, online cheating, tampering with computers, publishing obscene materials and
fake accounts. The analysis is carried out using statistical techniques such as correlation, standard
deviation and regression in relation to GDP, unemployment and literacy rates.
Table 4.1: Statewise top ten Cyber Crimes from 2016 to 2020
SN  State              2016   2017   2018   2019   2020         % Share  Population  Crime Rate
                                                   (Estimated)  (2019)   (Lakhs)     (2019)
1   Karnataka          1101   3174   5839   12020  14389        27       659.7       18.2
2   Telangana          593    1209   1205   2691   2997         6        372.8       7.2
3   Assam              696    1120   2022   2231   2894         5        344.2       6.5
4   Uttar Pradesh      2639   4971   6280   11416  13237        25.6     2259.7      5.1
5   Maharashtra        2380   3604   3511   4967   5533         11.2     1225.3      4.1
6   Andhra Pradesh     616    931    1207   1886   2182         4.2      523.2       3.6
7   Odisha             317    824    843    1485   1748         3.3      437.3       3.4
8   Jharkhand          259    720    930    1095   1431         2.5      375.8       2.9
9   Meghalaya          39     39     74     89     107          0.2      32.3        2.8
10  Rajasthan          941    1304   1104   1762   1844         4        776         2.3
11  Union Territories  130    203    151    244    255          0.3      240.8       0.6
12  Other States       2606   3697   4082   4660   5109         10.7     6129.1      0.76
    Total              12317  21796  27248  44546  51723        100      13376.1     3.3
Source: NCRB Report
*Estimation for 2020 is calculated using linear regression
*Crime Rate = No. of crimes / Population (in lakhs)
*Population from 2011 census
Table 4.1 shows continuous growth in cybercrime cases from 2016 to 2019.
The highest number of crimes (12,020) in 2019 took place in Karnataka. This could be
because Bengaluru in Karnataka is regarded as the Silicon Valley of India due to its leading role in
the IT sector. The number of cases in Uttar Pradesh, 11,416 (about 25.6%), is also high.
Fortunately, the crime rate in the Union Territories (UTs) is low even after combining all the UTs,
including the capital city, Delhi. The total number of crimes estimated for 2020 is
51,723.
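The 2020 estimates in Table 4.1 are described as linear regression projections. A minimal sketch of such a projection is given below, assuming an ordinary least-squares line fitted to the four yearly counts of one state. The function name `linear_forecast` is hypothetical, and this may not be the exact procedure used in the study, although it reproduces the Karnataka estimate of 14,389 from the table.

```python
def linear_forecast(years, counts, target_year):
    """Fit an ordinary least-squares line to (year, count) pairs and extrapolate."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(counts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, counts))
    sxx = sum((x - mean_x) ** 2 for x in years)
    slope = sxy / sxx
    # Evaluate the fitted line relative to the mean year for numerical stability.
    return mean_y + slope * (target_year - mean_x)


# Karnataka cybercrime cases for 2016-2019, as listed in Table 4.1.
years = [2016, 2017, 2018, 2019]
karnataka = [1101, 3174, 5839, 12020]

estimate_2020 = linear_forecast(years, karnataka, 2020)
print(round(estimate_2020))  # 14389
```

A straight-line fit on four points is a coarse estimate, so the projected values are best read as indicative of the trend.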
Table 4.2: Number of cybercrimes with their types in India from 2016 to 2020
Cyber Crime Type  2016  2017  2018  2019  2020 (Estimated)  2019 (%)
In Table 4.2, the crime type with the highest number of reported cases every year is identity and data theft, which means
obtaining the personal or financial information of another person to commit fraud, such as
making unauthorised banking transactions or purchases. Identity information
includes mobile number, Aadhaar number, PAN details, email address and bank details. A
majority of crimes are related to financial or banking scams, in which the criminals mostly commit
money-related offences such as hacking bank accounts.
Table 4.3 shows that 26,891 criminals (60% in 2019) were motivated to commit
fraudulent activities through identity theft, cheating and fake profiles. It is surprising to note
that some criminals were even politically motivated in 2019 (0.15%), acting to
show support for a particular party or to spread hatred. Sometimes fun ends in crime:
172 cases (0.93% in 2019) were committed in the name of pranks. Sexual exploitation happens not only
in the physical world but also in the digital world, with about 531 cases (2.89%) in 2019.
It can also be noted that piracy cases were high (671) in 2018 compared to earlier years.
Table 4.4: Top ten cybercrimes in metropolitan cities of India from 2016 to 2020
City               2016  2017  2018   2019   2020         %       Population  Crime Rate
                                             (Estimated)  (2019)  (Lakhs)     (2019)
Bengaluru          762   2743  5253   10555  12801        57.45   85          124.2
Lucknow            361   608   962    1262   1563         6.87    29          43.5
Hyderabad          291   328   428    1379   1448         7.51    77.5        17.8
Jaipur             532   685   415    544    486          2.96    30.7        17.7
Ghaziabad          62    118   191    347    412          1.89    23.6        14.7
Mumbai             980   1362  1482   2527   2778         13.75   184.1       13.7
Kanpur             136   229   229    365    412          1.99    29.2        12.5
Patna              167   79    115    202    176          1.10    20.5        9.9
Pune               269   318   153    309    251          1.68    50.5        6.1
Surat              66    105   155    228    273          1.24    45.8        5
Other Metro Cities 546   687   715    654    739          3.56    564.5       1.2
Total              4172  7262  10098  18372  21335        100     1140.4      16.1
Source: NCRB Report
*Estimation for 2020 is calculated using linear regression
*Crime Rate 2019 = No. of crimes / Population (in lakhs)
Table 4.4 lists the ten metropolitan cities with the highest cybercrime rates. Out of the total
digital crimes in India in 2019 (44,546), 18,372, or about 41%, are from metropolitan cities. The largest share (57.45%)
of metropolitan cybercrime belongs to Bengaluru (10,555 cases in 2019).
Bengaluru is popularly regarded as the Silicon city because of its leading role in Information
Technology (IT), and at the same time the above data indicates that it is also leading in
cybercrimes, not only in 2019 but also in 2017 and 2018. The lowest cybercrime rate among the
top ten belongs to Surat (5) in 2019.
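The crime rate column can be checked with a short sketch, assuming the rate is the number of 2019 cases divided by the population in lakhs, which is the reading that matches the figures printed in the table. The helper name `crime_rate` is hypothetical.

```python
def crime_rate(cases, population_lakhs):
    """Cybercrime cases per lakh of population."""
    return cases / population_lakhs


# 2019 cases and population (in lakhs) as listed in Table 4.4.
cities = {
    "Bengaluru": (10555, 85.0),
    "Lucknow": (1262, 29.0),
    "Surat": (228, 45.8),
}

rates = {city: round(crime_rate(cases, pop), 1) for city, (cases, pop) in cities.items()}
print(rates)  # {'Bengaluru': 124.2, 'Lucknow': 43.5, 'Surat': 5.0}
```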
Table 4.5: Analysis result of multi-variable linear regression co-efficients and correlation
Variable              Coefficient  Standard Error  t Stat   P-value  Upper 95.0%  Results          Correlation
Intercept             8462.79      18536.34        0.4565   0.6671   56111.98987
X1 Other Crimes       0.007076     0.013102        0.5400   0.6123   0.040757299  significant      0.59
X2 GDP                2.33         3.4803          0.6707   0.5321   0.000112808  significant      0.65
X3 Unemployment Rate  -195.415     275.7972        -0.7082  0.5106   513.543952   Not significant  -0.16
X4 Literacy rate      -100.556     264.2940        -0.3802  0.7192   578.833718   Not significant  0.20
Table 4.5 presents the correlation of cybercrimes with other crimes (0.59) and
GDP (0.65) in India, both of which show a moderate positive correlation. It was assumed that only literate
people commit cybercrime because of their basic knowledge of computers; however, the
data show only a low correlation (0.20) with the literacy rate. The analysis also examined whether
unemployed people commit cybercrime to get money, but the correlation with the unemployment rate is weak and
negative (-0.16). In the Results column of Table 4.5, X1 and X2 are marked as significant
whereas X3 and X4 are not, although the p-values of all four coefficients exceed 0.05.
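The relationship between the columns of Table 4.5 can be sketched as follows. The sketch assumes the usual convention that the t statistic is the coefficient divided by its standard error and that a coefficient is statistically significant when its p-value is below 0.05. The values are the three-decimal figures printed in the table, and the variable names are illustrative.

```python
ALPHA = 0.05  # conventional significance level (an assumption, not stated in the table)

# (coefficient, standard error, p-value) as printed in Table 4.5.
rows = {
    "Intercept": (8462.79, 18536.34, 0.667),
    "X1 Other Crimes": (0.007076, 0.013102, 0.612),
    "X2 GDP": (2.33, 3.4803, 0.532),
    "X3 Unemployment Rate": (-195.415, 275.7972, 0.510),
    "X4 Literacy rate": (-100.556, 264.2940, 0.719),
}

# t statistic = coefficient / standard error.
t_stats = {name: coef / se for name, (coef, se, _) in rows.items()}

# A coefficient is flagged only when its p-value falls below ALPHA.
significant = {name: p < ALPHA for name, (_, _, p) in rows.items()}

for name in rows:
    print(f"{name}: t = {t_stats[name]:.3f}, significant at 5% = {significant[name]}")
```

Under this rule none of the coefficients is flagged, which agrees with the overall model test reported in Table 4.6.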
Table 4.6: Multi variable regression test result – Model fit summary
Model       Sum of squares  df  Mean Square  F        Significance
Regression  81322273.69     4   20330568     1.22035  0.407057522
Residual    83298107.91     5   16659622
Total       164620381.6     9
Multiple R: 0.70285    R Square: 0.493999
Table 4.6 gives the result of the regression test, in which the study takes the number of cybercrimes
as the dependent variable (Y), while other crimes apart from cybercrime (X1), GDP (X2),
unemployment (X3) and literacy rate (X4) are taken as independent variables. The regression
analysis is conducted under the null hypothesis “There is no significant influence of
independent variables on the dependent variable”. As the p-value (0.407) is higher than 0.05, the
null hypothesis cannot be rejected. This means that the independent variables (X1, X2, X3 and X4) do not
have a statistically significant impact on the dependent variable (Y). Therefore, it can be stated that there may be
other omitted variables influencing the cybercrime rate.
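The model fit figures in Table 4.6 follow from the sums of squares. A minimal sketch of that arithmetic is shown below, assuming the standard ANOVA relations for a linear regression. The variable names are illustrative.

```python
import math

# Sums of squares and degrees of freedom as printed in Table 4.6.
ss_regression, df_regression = 81322273.69, 4
ss_residual, df_residual = 83298107.91, 5
ss_total = ss_regression + ss_residual

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

f_statistic = ms_regression / ms_residual  # overall F test of the model
r_square = ss_regression / ss_total        # share of variance explained
multiple_r = math.sqrt(r_square)

print(f"F = {f_statistic:.5f}, R Square = {r_square:.6f}, Multiple R = {multiple_r:.5f}")
```

An R Square of about 0.49 means the four variables together explain roughly half of the variation in cybercrime counts, while the F test does not reach significance with only ten observations.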
Cybercrime is among the biggest threats and creates huge losses for business, the economy and the state. As
technology and infrastructure expand, cybercrime also increases. According to National Cyber Security (2020),
India witnessed a loss of Rs 1.25 lakh crore in 2019 due to cybercrimes. Losses are expected to continue to
surge as the nation plans the rollout of fifth generation networks to set up smart cities. Due to
cybercrimes, business is affected, as consumers lose confidence and become reluctant
to make digital transactions because they might face fake products, defective items and
online payment theft. This, in turn, slows business organisations down. Nath (2006)
explains that the crime rate is determined by various economic factors, such as the
unemployment rate, income levels, GDP and CPI (Consumer Price Index).
Derek Manky (2013) conveys that cybercrime has continued to grow in a highly organised
form. It has evolved into a big business with emerging markets for cyber deception, and
has expanded into a highly organised structure involving leaders and
developers. Cybercrime has become a business providing a variety of services, ranging from attack
data to consulting, services and advertising.
Table 4.3 shows that the motive behind cybercrimes is not always related to finance,
as about 40% of them have non-monetary motives such as sexual exploitation, revenge or anger,
pranks, political motives, illegal business, inciting hate against the country or defaming someone
online. Cybercrimes sometimes result in physical violence.
The study found that cybercrimes are continuously growing at an alarming rate of
more than 30%, and nowhere were they shown to be declining. Danny Maher (2017) states that it
is hard to evade cyber threats in industry. Artificial intelligence (AI) and associated technologies,
such as deep learning, automated network monitoring and machine learning, must be put in place.
R. Ramirez and N. Choucri (2016) have said that connections and interactions of critical
infrastructure with society are lagging behind. Jain and Bhatnagar (2016) suggest Big Data
Analysis to make the right decisions that can help sustain law and order. For example, if
cybercrimes are high in a state, then additional security must be provided. Meanwhile, the
Government of India has increased the funding for Digital India programmes by 23% to Rs 3,958
crore for 2020-21 (Economic Times, 2020).
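The growth rate mentioned at the start of this paragraph can be checked against the national totals in Table 4.1. The sketch below assumes two common measures, year-on-year growth and the compound annual growth rate (CAGR). The study does not state which measure it used, so both are shown.

```python
# National cybercrime totals for 2016-2019, as listed in Table 4.1.
totals = {2016: 12317, 2017: 21796, 2018: 27248, 2019: 44546}
years = sorted(totals)

# Year-on-year growth: change relative to the previous year's total.
yoy_growth = {y: totals[y] / totals[y - 1] - 1 for y in years[1:]}

# Compound annual growth rate over the whole 2016-2019 period.
periods = years[-1] - years[0]
cagr = (totals[years[-1]] / totals[years[0]]) ** (1 / periods) - 1

for year, growth in yoy_growth.items():
    print(f"{year}: {growth:.1%}")
print(f"CAGR {years[0]}-{years[-1]}: {cagr:.1%}")
```

With these totals, year-on-year growth is about 77%, 25% and 63%, and the compound annual growth rate is about 53%.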
It is found that there is a slight correlation (0.20) between literacy and cybercrime. Innocent people with
low digital knowledge fall prey to cybercrimes. Deloitte (2016) states that, despite the rising number of
smartphone and Internet users, digital literacy in India has been low. It would be good if Digital India
programmes reached all sections of the population in order to improve digital literacy.
During the study, it was noticed (Table 4.3) that cybercrimes may target women and children
through forms such as morphing, uploading photos and videos taken without their knowledge, cyber
bullying, online harassment, child pornography and defamation. Women are unsafe not only in
the physical world but also in the digital world. The digital world is very attractive to children, and
they are easily exploited by criminals. To be safe in the virtual world, one must follow safe
practices and be aware of the types of cybercrime, which helps to keep the digital experience hassle-free.
Muhammad Dharma (2018) mentions that the ways to overcome this crime can be
classified into three categories: cyber law, education and strict policy-making. This
research can be extended to other parts of the globe, and the major factors influencing cybercrimes
can be explored further.
References: