
MAIN CODE

# -*- coding: utf-8 -*-
"""Copy of finished final dheepak project .ipynb

Automatically generated by Colaboratory.

Original file is located at
    https://colab.research.google.com/drive/14MK2xdc3vvL94zraESLzfB3rAGZj2an4
"""

import csv
import nltk
from nltk.tokenize import word_tokenize
from transformers import pipeline, BartTokenizer, BartForConditionalGeneration
import plotly.express as px

# Download NLTK data for word tokenization
# (recent NLTK releases may also require the 'punkt_tab' resource)
nltk.download('punkt')

def load_dataset_from_csv(file_path, encoding='latin-1'):
    """Read the CSV file into a list of dicts, one per row."""
    dataset = []
    with open(file_path, newline='', encoding=encoding) as csvfile:
        reader = csv.DictReader(csvfile)
        for row in reader:
            dataset.append(row)
    return dataset

def case_insensitive_contains(word_list, target_word):
    """Return True if target_word equals any word in word_list, ignoring case."""
    return any(target_word.lower() == word.lower() for word in word_list)

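def filter_crime_data(dataset, month, year, country):
    """Return the rows matching the given month, year, and country."""
    filtered_data = []
    for entry in dataset:
        # Each field is tokenized and compared word by word, so a multi-word
        # input such as "United States" will not equal any single token.
        if (
            case_insensitive_contains(word_tokenize(entry['month']), month)
            and int(entry['year']) == int(year)
            and case_insensitive_contains(word_tokenize(entry['country']), country)
        ):
            filtered_data.append(entry)
    return filtered_data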

def generate_report(summary_text):
    """Summarize summary_text with the pre-trained BART CNN model."""
    tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
    model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")
    inputs = tokenizer(summary_text, max_length=1024, return_tensors="pt",
                       truncation=True)
    summary_ids = model.generate(**inputs, max_length=150, min_length=50,
                                 length_penalty=2.0, num_beams=4,
                                 early_stopping=True)
    summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
    return summary

def visualize_data(filtered_data):
    """Plot the 'crime' column against 'year' for the filtered rows."""
    fig = px.line(filtered_data, x='year', y='crime',
                  title='Cybercrime Trends Over Years')
    fig.show()

def calculate_percentage_crime_by_country(dataset):
    """Return each country's share of all rows, as a percentage."""
    country_crime_percentage = {}
    # First pass: count the rows recorded for each country.
    for entry in dataset:
        country = entry['country']
        crime_count = country_crime_percentage.get(country, 0)
        country_crime_percentage[country] = crime_count + 1

    # Second pass: convert each count into a percentage of the total.
    total_crimes = len(dataset)
    for country, crime_count in country_crime_percentage.items():
        country_crime_percentage[country] = (crime_count / total_crimes) * 100

    return country_crime_percentage

def visualize_country_percentage(country_crime_percentage):
    """Plot the percentage of rows recorded for each country."""
    fig = px.line(x=list(country_crime_percentage.keys()),
                  y=list(country_crime_percentage.values()),
                  title='Percentage of Cybercrimes by Country Over Years')
    fig.update_layout(xaxis_title='Country',
                      yaxis_title='Percentage of Crime')
    fig.show()

def main():
    file_path = "/content/DHEEPAK CSV DATA S.csv"
    dataset = load_dataset_from_csv(file_path)

    month = input("Enter the month: ")
    year = input("Enter the year: ")
    country = input("Enter the country: ")

    filtered_data = filter_crime_data(dataset, month, year, country)

    if filtered_data:
        print("\nCrime Details:")
        for entry in filtered_data:
            print(f"Month: {entry['month']}, Year: {entry['year']}, "
                  f"Country: {entry['country']}")
            print(f"Type of Attack: {entry['type_of_attack']}")
            print(f"Crime: {entry['crime']}")
            print(f"Motive: {entry['motive']}")
            print("\n")

        # Generate report
        attack_types = ', '.join(entry['type_of_attack'] for entry in filtered_data)
        summary_text = f"Crimes in {month} {year} in {country}: {attack_types}."
        report_summary = generate_report(summary_text)
        print("\nAutomated Report Summary:")
        print(report_summary)

        # Visualize filtered data
        visualize_data(filtered_data)
    else:
        print("No matching records found.")

    # Calculate percentage of crime by country and visualize
    country_crime_percentage = calculate_percentage_crime_by_country(dataset)
    visualize_country_percentage(country_crime_percentage)


if __name__ == "__main__":
    main()

EXPLANATION

Let's break down the script in simpler terms:

1. **Importing Libraries**: Think of libraries as tools that help the computer do certain tasks. Just
like how you need different tools to fix different things, computers need different libraries to perform
different tasks. In this script, we're importing libraries that help with reading data, analyzing text,
creating reports, and making graphs.

2. **Function Definitions**:

- `load_dataset_from_csv`: This function is like a helper that reads information from a special type
of file called a CSV file. It then organizes this information in a way that the computer can understand.

- `case_insensitive_contains`: Imagine you're looking for a specific word in a book, but you're not
sure if it's written in lowercase or uppercase letters. This function checks whether a word appears in
a list of words while ignoring that difference. The word must match in full; part of a word does not
count.

- `filter_crime_data`: This function helps the computer find specific information about cybercrimes
based on the month, year, and country you tell it.

- `generate_report`: This function helps the computer create a summary report about cybercrimes
automatically.

- `visualize_data` and `visualize_country_percentage`: These functions help the computer create
graphs to show trends and percentages of cybercrimes.
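
To make the matching rule concrete, here is a minimal sketch of `case_insensitive_contains` run on a few inputs. It assumes plain `str.split()` as a stand-in for NLTK's `word_tokenize`, so the example runs without any extra downloads; the sample text is made up for illustration.

```python
def case_insensitive_contains(word_list, target_word):
    """Return True if target_word equals any word in word_list, ignoring case."""
    target = target_word.lower()
    return any(target == word.lower() for word in word_list)


# str.split() stands in for nltk's word_tokenize (an assumption of this sketch).
tokens = "United States".split()

print(case_insensitive_contains(tokens, "united"))         # True
print(case_insensitive_contains(tokens, "UNITED STATES"))  # False
print(case_insensitive_contains(tokens, "unit"))           # False
```

The second call shows a side effect worth knowing about: because the text is split into single words first, a two-word input is never equal to any one token.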

3. **Main Functionality** (`main()`):

- First, it reads the cybercrime data from a special file.

- Then, it asks you to tell it which month, year, and country you're interested in.

- After that, it looks through the data to find cybercrimes that match what you told it.

- If it finds any matches, it shows you the details of those crimes, creates a summary report, and
makes a graph to show the trends.

- Finally, whether or not any matches were found, it calculates each country's share of all the
records in the file as a percentage and shows that information in another graph.
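
The percentage step can be sketched in a few lines. This is an illustrative variant that uses `collections.Counter` instead of the script's manual dictionary counting; the function name `percentage_by_country` and the sample rows are hypothetical, since the real CSV contents are not shown.

```python
from collections import Counter


def percentage_by_country(rows):
    """Map each country to its share of all rows, as a percentage."""
    if not rows:
        return {}
    counts = Counter(row["country"] for row in rows)
    total = len(rows)
    return {country: count / total * 100 for country, count in counts.items()}


# Hypothetical sample rows; each row counts as one recorded crime.
sample = [
    {"country": "India"},
    {"country": "India"},
    {"country": "Canada"},
    {"country": "Brazil"},
]
shares = percentage_by_country(sample)
print(shares)  # {'India': 50.0, 'Canada': 25.0, 'Brazil': 25.0}
```

Note that the result is a share of rows in the file, so it depends on how many records each country has in the dataset rather than on any external crime statistics.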

4. **Execution**: When you run this script, it will start doing all these things automatically.

5. **Data Source**: The information about cybercrimes comes from a CSV file stored on the
computer (the script reads it from `/content/DHEEPAK CSV DATA S.csv`).

6. **Model**: The computer uses a pre-trained summarization model (`facebook/bart-large-cnn`) to
help it write the summary report.

In simpler terms, this script helps you find and understand information about cybercrimes without
having to read through a lot of data by yourself. It's like having a smart assistant that does all the
hard work for you!

ALTERNATIVE EXPLANATION

This Python script is a part of a web application, likely built with Django, that deals with cybercrime
data analysis and visualization. Let me simplify the components and functionalities for someone
unfamiliar with computers:

1. **Importing Libraries**: These are tools that provide additional functionality. Imagine them as
tools in a toolbox that help perform specific tasks. For example, the `csv` library helps read and write
CSV files, `nltk` helps with natural language processing tasks, and `transformers` helps with advanced
text processing using pre-trained models.

2. **Function Definitions**: These are like predefined sets of instructions that can be reused
whenever needed. They perform specific tasks like loading data from a CSV file, filtering data based
on criteria like month, year, and country, generating reports summarizing the data, and visualizing
the data trends.
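
As a rough illustration of the load-and-filter functions, the sketch below parses CSV text from memory and keeps the matching rows. The column names are the ones the script reads; the data rows, `load_rows`, and `filter_rows` are hypothetical. Unlike the script, it compares whole fields rather than individual tokens, which is a simplification chosen for this example.

```python
import csv
import io

# Hypothetical CSV contents using the column names the script reads.
CSV_TEXT = """month,year,country,type_of_attack,crime,motive
January,2021,India,Phishing,Credential theft,Financial
january,2021,india,Ransomware,Data encryption,Financial
March,2022,Canada,DDoS,Service outage,Political
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def filter_rows(rows, month, year, country):
    """Keep rows whose month, year, and country match, ignoring case."""
    return [
        row for row in rows
        if row["month"].strip().lower() == month.strip().lower()
        and int(row["year"]) == int(year)
        and row["country"].strip().lower() == country.strip().lower()
    ]


rows = load_rows(CSV_TEXT)
matches = filter_rows(rows, "January", "2021", "India")
print([row["type_of_attack"] for row in matches])  # ['Phishing', 'Ransomware']
```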

3. **Main Functionality**: This is the main part of the program where everything comes together.
It's like the conductor of an orchestra, directing different parts to work together harmoniously. In this
case, it loads data from a CSV file, filters it based on user input (month, year, country), generates a
summary report, visualizes the data, and calculates the percentage of cybercrimes by country.

4. **Web Application Integration**: The project appears to pair this analysis logic with a web
application built with Django: the front page form includes a `{% csrf_token %}` tag and posts to
`Main`. The views named here (`Index`, `main`, `output`) are not part of the listing above, which is a
standalone Colab script driven by `input()` prompts. In the web version, such views would handle
HTTP requests and responses, for instance by rendering HTML templates and returning data.

5. **User Interface**: This script likely serves as the backend logic for a web application. The actual
user interface (HTML templates, forms, etc.) would be defined elsewhere in the Django project.

Overall, this script forms a crucial part of a web application that allows users to analyze, summarize,
and visualize cybercrime data. It interacts with users through a web interface, processes their input,
and presents them with meaningful insights derived from the data.

RESULT PAGE HTML CODE

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Crime Data</title>
<script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
<style>
body {
font-family: Arial, sans-serif;
background-color: #f2f2f2;
color: #333;
margin: 0;
padding: 0;
}

.container {
width: 80%;
margin: 20px auto;
background-color: #fff;
box-shadow: 0 4px 8px rgba(0, 0, 0, 0.1);
padding: 20px;
border-radius: 10px;
}

.crime-entry {
margin-bottom: 20px;
}

h1 {
color: #007bff;
text-align: center;
}
</style>
</head>
<body>
<div class="container">
<h1>Crime Data</h1>
<div class="crime-entry">
<p><strong>Month:</strong> January</p>
<p><strong>Year:</strong> 2023</p>
<p><strong>Country:</strong> Country A</p>
<p><strong>Type of Attack:</strong> Terrorism</p>
<p><strong>Crime:</strong> Bombing</p>
<p><strong>Motive:</strong> Political</p>
<p><strong>Percentage of Crime:</strong> 10%</p>
<canvas id="crimeChart1" width="400" height="200"></canvas>
</div>
<div class="crime-entry">
<p><strong>Month:</strong> February</p>
<p><strong>Year:</strong> 2023</p>
<p><strong>Country:</strong> Country B</p>
<p><strong>Type of Attack:</strong> Assault</p>
<p><strong>Crime:</strong> Assault and Battery</p>
<p><strong>Motive:</strong> Personal</p>
<p><strong>Percentage of Crime:</strong> 15%</p>
<canvas id="crimeChart2" width="400" height="200"></canvas>
</div>
<!-- Add more crime entries as needed -->
</div>

<script>
    // Crime data
    const crimeData = [
        {
            month: "January",
            year: 2023,
            country: "Country A",
            typeOfAttack: "Terrorism",
            crime: "Bombing",
            motive: "Political",
            percentage: 10
        },
        {
            month: "February",
            year: 2023,
            country: "Country B",
            typeOfAttack: "Assault",
            crime: "Assault and Battery",
            motive: "Personal",
            percentage: 15
        }
        // Add more crime entries as needed
    ];

    // Generate charts for each crime entry
    crimeData.forEach((crime, index) => {
        const ctx = document.getElementById(`crimeChart${index + 1}`).getContext('2d');
        const crimeChart = new Chart(ctx, {
            type: 'bar',
            data: {
                labels: ['Percentage'],
                datasets: [{
                    label: `Percentage of Crime - ${crime.month}`,
                    data: [crime.percentage],
                    backgroundColor: ['#007bff']
                }]
            },
            options: {
                scales: {
                    y: {
                        beginAtZero: true
                    }
                }
            }
        });
    });
</script>
</body>
</html>

This HTML code represents a webpage displaying crime data in a simplified format. Let's break it
down for someone who might not be familiar with computers:

1. **HTML Structure**:

- The code starts with `<!DOCTYPE html>`, which defines the document type and version of HTML
being used.

- The `<html>` element is the root element of the HTML document, and it contains all other HTML
elements.

- The `<head>` element contains meta-information about the HTML document, such as character
encoding, viewport settings, and the page title.

- The `<body>` element contains the visible content of the HTML document, including text, images,
and other elements.

2. **Styling**:

- The `<style>` element contains CSS (Cascading Style Sheets) rules that define the visual
appearance of the webpage.

- CSS rules define properties like font family, background color, text color, margins, padding, and
border radius to style various elements.

3. **Content**:

- Inside the `<body>` element, there's a `<div>` element with the class `container`. This `<div>` acts
as a container for the entire content of the webpage.

- Inside the container, there's an `<h1>` element with the text "Crime Data". This represents the
main heading of the webpage.

- Following the heading, there are multiple `<div>` elements with the class `crime-entry`. Each of
these `<div>` elements represents a single entry of crime data.

- Inside each `crime-entry` `<div>`, there are `<p>` elements containing details about the crime,
such as month, year, country, type of attack, crime description, motive, and percentage of crime.

- Additionally, each crime entry includes a `<canvas>` element with a unique `id`, which will be used
to render a chart representing the percentage of crime.
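
The entry structure described above is typed by hand in the page, but it could also be produced from a data record. The following is a minimal Python sketch under that assumption; `FIELDS`, `render_entry`, and the sample record are hypothetical names and data, and values are passed through `html.escape` so that special characters display safely.

```python
from html import escape

# Field labels and record keys, in the order the result page displays them.
FIELDS = [
    ("Month", "month"),
    ("Year", "year"),
    ("Country", "country"),
    ("Type of Attack", "type_of_attack"),
    ("Crime", "crime"),
    ("Motive", "motive"),
    ("Percentage of Crime", "percentage"),
]


def render_entry(record, index):
    """Build one crime-entry block: an escaped <p> per field plus a canvas."""
    lines = ['<div class="crime-entry">']
    for label, key in FIELDS:
        value = escape(str(record[key]))
        suffix = "%" if key == "percentage" else ""
        lines.append(f"  <p><strong>{label}:</strong> {value}{suffix}</p>")
    lines.append(
        f'  <canvas id="crimeChart{index}" width="400" height="200"></canvas>'
    )
    lines.append("</div>")
    return "\n".join(lines)


# Hypothetical record; the ampersand shows the escaping at work.
sample = {
    "month": "January", "year": 2023, "country": "Country A & B",
    "type_of_attack": "Phishing", "crime": "Credential theft",
    "motive": "Financial", "percentage": 10,
}
entry_html = render_entry(sample, 1)
print(entry_html)
```

The `index` argument keeps each canvas `id` unique (`crimeChart1`, `crimeChart2`, and so on), which is what the page's chart code relies on to find the right canvas.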

4. **JavaScript**:

- The `<script>` element contains JavaScript code that handles the dynamic generation of charts
based on the crime data.

- Crime data is represented as an array of JavaScript objects, where each object contains details
about a specific crime entry.

- The JavaScript code iterates over each crime entry, retrieves the corresponding `<canvas>`
element using its `id`, and generates a bar chart using the Chart.js library.

- The generated charts visually represent the percentage of crime for each crime entry.
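
The page hard-codes its `crimeData` array. As a sketch of how such an array might instead be prepared from the Python side, the example below converts rows with the CSV column names into JSON whose keys match the JavaScript objects. `KEY_MAP`, `to_chart_records`, and the sample row are assumptions for illustration; the document does not show how the two halves are actually connected.

```python
import json

# Maps the Python script's column names to the keys used in crimeData.
KEY_MAP = {
    "month": "month",
    "year": "year",
    "country": "country",
    "type_of_attack": "typeOfAttack",
    "crime": "crime",
    "motive": "motive",
    "percentage": "percentage",
}


def to_chart_records(rows):
    """Rename keys and convert numeric fields so the chart code can use them."""
    records = []
    for row in rows:
        record = {js_key: row[py_key] for py_key, js_key in KEY_MAP.items()}
        record["year"] = int(record["year"])
        record["percentage"] = float(record["percentage"])
        records.append(record)
    return records


# Hypothetical row mirroring the first sample entry on the result page.
rows = [{
    "month": "January", "year": "2023", "country": "Country A",
    "type_of_attack": "Terrorism", "crime": "Bombing",
    "motive": "Political", "percentage": "10",
}]
payload = json.dumps(to_chart_records(rows), indent=2)
print(payload)
```

Converting `year` and `percentage` to numbers matters here because values read from a CSV file arrive as strings, while the chart expects a numeric `percentage`.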

Overall, this HTML code creates a webpage that visually presents crime data in a structured format,
with each crime entry accompanied by a corresponding chart representing the percentage of crime.

FRONT PAGE HTML

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>CYBER CRIME DATA EXPLORER</title>
<style>
body {
font-family: Arial, sans-serif;
background-color: #f4f4f4;
margin: 0;
padding: 0;
display: flex;
justify-content: center;
align-items: center;
height: 100vh;
}

h1 {
text-align: center;
background-color: #3498db;
color: #fff;
padding: 10px;
text-transform: uppercase;
margin-bottom: 20px;
}

form {
max-width: 400px;
background-color: #fff;
padding: 20px;
border-radius: 5px;
box-shadow: 0 0 10px rgba(0, 0, 0, 0.1);
}

label {
display: block;
margin-bottom: 8px;
}

select,
input[type="number"],
input[type="text"] {
width: 100%;
padding: 8px;
margin-bottom: 15px;
box-sizing: border-box;
border: 1px solid #ccc;
border-radius: 4px;
}

input[type="submit"],
input[type="reset"] {
background-color: #3498db;
color: #fff;
padding: 10px 15px;
border: none;
border-radius: 4px;
cursor: pointer;
}

input[type="submit"]:hover,
input[type="reset"]:hover {
background-color: #555;
}
</style>
</head>
<body>

<h1>CYBER CRIME DATA EXPLORER</h1>

<form action="Main" method="POST">
{% csrf_token %}
<label for="month">Month:</label>
<select id="month" name="month">
<option value="January">January</option>
<option value="February">February</option>
<option value="March">March</option>
<option value="April">April</option>
<option value="May">May</option>
<option value="June">June</option>
<option value="July">July</option>
<option value="August">August</option>
<option value="September">September</option>
<option value="October">October</option>
<option value="November">November</option>
<option value="December">December</option>
</select>

<br>

<label for="year">Year:</label>
<input type="number" id="year" name="year" min="2003" max="2023">

<br>

<label for="country">Country:</label>
<input type="text" id="country" name="country">

<br>

<label for="Details">Additional Details: </label>
<input type="text" id="Details" name="Details">
<!-- Options for types of attacks -->
<!-- Add more types as needed -->

<br>

<input type="submit" value="Submit">
<input type="reset" value="Reset">
</form>

</body>
</html>

EXPLANATION

Imagine you're entering a building, but before you can go inside, there's a reception area
where you need to fill out a form. This form is similar to what you see on the webpage:

1. **Big Title**: At the top, there's a big title that says "CYBER CRIME DATA EXPLORER." It's like the
big sign outside the building that tells you what the place is about.

2. **Form**: Below the title, there's a section with some questions, just like a paper form at the
reception desk. Here's what each part of the form is for:

- **Month Dropdown**: You see a list where you can pick the month. It's like choosing the month
of the year from a menu, like January, February, March, etc.

- **Year Input**: There's a box where you can type in a number. This is where you put the year, like
2023 or 2010. You can't type letters here, only numbers, and the form accepts years from 2003 to
2023.

- **Country Input**: Another box, but this time, you can type in letters. This is where you write the
name of the country, like "United States" or "Canada."

- **Additional Details**: This is a box where you can write anything extra you want to tell them. It's
like a space for any special notes or comments you might have.

3. **Buttons**: At the bottom of the form, there are two buttons:

- **Submit Button**: When you're done filling out the form, you click this button to send it to the
people who work there. It's like handing over your completed paper form to the receptionist.

- **Reset Button**: If you make a mistake and want to start over, you can click this button. It's like
erasing everything you wrote on the paper form and starting fresh.

That's basically what the webpage is all about. It's a way for you to tell them what month, year, and
country you're interested in, and if you have any special notes, you can write them there too. Once
you're done, you click the submit button, and they'll take it from there!

This HTML code represents a simple web form for exploring cybercrime data. Let me explain it in a
way that someone unfamiliar with computers can understand:

1. **Basic Structure**:

- The code starts by declaring that it's an HTML document using `<!DOCTYPE html>`.

- The `<html>` element is the root element of the HTML document.

- The `<head>` section contains meta-information about the document, such as character encoding
and the page title.

- The `<body>` section contains the visible content of the webpage.

2. **Styling**:

- CSS (Cascading Style Sheets) rules inside the `<style>` tags control the appearance of various
elements on the webpage.

- For example, `body` sets the font family, background color, and margin, while `h1` styles the main
heading.

3. **Content**:

- The main heading `<h1>` displays "CYBER CRIME DATA EXPLORER" and is styled to have a
background color, text color, padding, and text transformation.

- Inside the `<form>` element, users can input information to explore cybercrime data.

- Labels `<label>` provide descriptions for the input fields.

- Input fields include a dropdown menu for selecting the month, a number input for the year, and a
text input for entering the country.

- Additionally, there's an input field labeled "Additional Details" where users can provide additional
information if needed.

4. **Interaction**:

- When users fill in the form and click "Submit," the form data is sent to the URL specified in the
`action` attribute of the form tag (in this case, "Main").

- The `method` attribute specifies the HTTP method to be used for submitting the form data (POST
method).

- The CSRF token `{% csrf_token %}` is included for security purposes to prevent CSRF (Cross-Site
Request Forgery) attacks.

- The "Reset" button clears all the form fields when clicked.
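
To show what the receiving side might do with a submission, here is a minimal sketch that decodes a URL-encoded form body (the default encoding browsers use for a POST form like this one) and checks the fields. It uses only the standard library; `parse_form` and `MONTHS` are hypothetical names, and in the Django project the framework would normally do the decoding itself, so this is an illustration of the idea rather than the project's actual handler.

```python
from urllib.parse import parse_qs

MONTHS = {
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
}


def parse_form(body):
    """Decode a URL-encoded form body and validate month, year, and country."""
    # parse_qs returns a list per key; keep the first value of each.
    fields = {key: values[0].strip() for key, values in parse_qs(body).items()}

    month = fields.get("month", "")
    if month not in MONTHS:
        raise ValueError("month must be one of the twelve month names")

    year_text = fields.get("year", "")
    if not year_text.isdigit():
        raise ValueError("year must be a number")
    year = int(year_text)
    if not 2003 <= year <= 2023:  # same range as the form's min and max
        raise ValueError("year must be between 2003 and 2023")

    country = fields.get("country", "")
    if not country:
        raise ValueError("country is required")

    return {
        "month": month,
        "year": year,
        "country": country,
        "details": fields.get("Details", ""),
    }


result = parse_form("month=January&year=2021&country=United+States&Details=")
print(result)
```

Checking the year range again on the server is deliberate: the `min` and `max` attributes in the form only guide the browser, and a request sent by other means can carry any value.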

Overall, this HTML code creates a user-friendly interface for inputting parameters to explore
cybercrime data. Users can select the month, enter the year and country, and provide additional
details if necessary, all within a clean and organized form layout.
