I Have A Set of Data in A Cluster W

To identify if attack data in a cluster is present in labelled attack data: 1) Load the cluster and labelled data into Pandas dataframes 2) Create a unique identifier column by concatenating relevant columns in both dataframes 3) Merge the dataframes on the unique identifier using an inner join 4) Check the size of the merged dataframe - if greater than zero, there is overlap between the data

Uploaded by

Abhishek Mishra

I have a set of data in a cluster, which is attack data from the NSL-KDD dataset.
I also have labelled data in which the different types of attack are labelled.
How can I identify whether the attack data in the cluster is also present in the
labelled attack data using Python?

To identify whether the attack data in the cluster is also present in the labelled
attack data using Python, you can follow these steps:

1. Load both the cluster data and labelled data into separate Pandas dataframes.

2. Create a new column in both dataframes that contains a unique identifier for each
row. This identifier can be a concatenation of relevant columns that appear in both
dataframes. NSL-KDD records have no IP address or timestamp fields, so use feature
columns such as duration, protocol_type, service, flag, src_bytes, and dst_bytes.

3. Merge the two dataframes on the unique identifier column using an inner join. This
will create a new dataframe that only contains rows that are present in both the
cluster and labelled data.

4. Check the size of the new dataframe. If the size is greater than zero, then there
is overlap between the cluster and labelled data. If the size is zero, then there
is no overlap.

Here's a sample code snippet that demonstrates these steps:

import pandas as pd

# Load the cluster data and the labelled data into Pandas dataframes
cluster_data = pd.read_csv('cluster_data.csv')
labelled_data = pd.read_csv('labelled_data.csv')

# Columns used to identify a record. NSL-KDD has no IP address or timestamp
# fields, so this example uses some of its feature columns instead.
key_columns = ['duration', 'protocol_type', 'service', 'flag', 'src_bytes', 'dst_bytes']

# Create the unique identifier column in both dataframes.
# astype(str) is needed because some of these columns are numeric.
cluster_data['unique_id'] = cluster_data[key_columns].astype(str).agg('_'.join, axis=1)
labelled_data['unique_id'] = labelled_data[key_columns].astype(str).agg('_'.join, axis=1)

# Merge the dataframes on the unique identifier column using an inner join
merged_data = pd.merge(cluster_data, labelled_data, on='unique_id', how='inner')

# Check the size of the merged dataframe
if merged_data.shape[0] > 0:
    print('Overlap exists between cluster and labelled data')
else:
    print('No overlap between cluster and labelled data')

Note that the column names and the unique identifier formula used in this example
may need to be adjusted to match the structure of your data. The more columns the
identifier includes, the less likely it is that two different records are treated
as the same one.
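
If you also want to know which cluster rows were found and which attack type each one carries, a variation on the steps above is to merge directly on the shared feature columns instead of building a string identifier. The following is only a minimal sketch under stated assumptions: the toy values are invented for illustration, the helper name match_cluster_rows is hypothetical, and "label" is an assumed name for the attack-type column in the labelled data.

```python
import pandas as pd

# Hypothetical toy data. The feature names follow NSL-KDD, but the values
# and the "label" column name are assumptions made for this illustration.
cluster_data = pd.DataFrame({
    "duration": [0, 0, 2],
    "protocol_type": ["tcp", "icmp", "tcp"],
    "service": ["private", "ecr_i", "ftp_data"],
    "src_bytes": [0, 1032, 334],
})
labelled_data = pd.DataFrame({
    "duration": [0, 0, 0],
    "protocol_type": ["tcp", "icmp", "udp"],
    "service": ["private", "ecr_i", "private"],
    "src_bytes": [0, 1032, 28],
    "label": ["neptune", "smurf", "teardrop"],
})


def match_cluster_rows(cluster, labelled, label_column="label"):
    """Left-join cluster rows onto labelled rows using all shared feature columns."""
    key_columns = [
        c for c in cluster.columns
        if c in labelled.columns and c != label_column
    ]
    # Keep one labelled row per key so a cluster row cannot match several times.
    # If the same key carries different labels, only the first one is kept.
    labelled_unique = labelled.drop_duplicates(subset=key_columns)
    merged = cluster.merge(
        labelled_unique, on=key_columns, how="left", indicator=True
    )
    # indicator=True adds a "_merge" column: "both" means the row was found.
    merged["in_labelled"] = merged["_merge"] == "both"
    return merged.drop(columns="_merge")


result = match_cluster_rows(cluster_data, labelled_data)
print(result)
print(result["in_labelled"].sum(), "of", len(result), "cluster rows found in the labelled data")
```

The left join keeps every cluster row, so unmatched rows remain visible with a missing label, whereas the inner join only tells you whether any overlap exists. Matching is exact, so columns holding floating-point values may need rounding before the merge.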
