Design and Implementation of Data Leakage Detection and Prevention Software For Campus Network


EAST POINT COLLEGE OF ENGINEERING FOR WOMEN

Jnana Prabha, Bidarahalli, Virgonagar Post, Bangalore 560 049.


DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
A Project on

Design and Implementation of Data Leakage Detection and Prevention Software for Campus Network

Anussha.V.S     1EC12CS006
Archana.S       1EC12CS007
Kavya.S         1EC12CS025
Malashree.G     1EC12CS032

Under the guidance of
Ms. Saranya.A

Design and Implementation of DLP for campus network

Problem Statement
Design and implement data leakage detection and prevention software for a campus network. The work covers two parts:
DLD - Data Leakage Detection
DLP - Data Leakage Prevention


System Architecture
[Figure: campus network diagram showing Node A, Node B, Node X, Node Y, the gateway, and the Internet]


System Architecture
[Figure: architecture diagram with the components Admin, Keywords (add/delete/update keywords), Database, Alerts, and GUI]


Modules
Implementing the pattern matching algorithms.
Designing DLDP software that detects the leakage.
Creating the database.
Developing a GUI to alert the admin in case of data leakage.


Use Case Diagram
[Figure: use case diagram with the actors User 1, User 2, and Admin, and the elements Packets, Detection algorithms, Pattern recognition, Pattern database, GUI, and Alerts]


Sequence Diagram
Data Leakage Detection:
Participants: :user1, :System, :user2, :Admin
user1 sends an E-mail
System scans the database and the E-mail
System alerts the GUI
GUI alerts the admin
Admin warns the user


Data Leakage Prevention:
Participants: :user1, :System, :Admin, :user2
user1 sends an E-mail
System scans the database and the E-mail
System alerts the GUI
GUI alerts the admin
Admin drops the E-mail


Dataflow Diagram
Start
Packets arrive
DLP software scans the packets
Leakage detected?
    No: Stop
    Yes: GUI alerts admin
         Admin warns the user
         Admin drops packets
         Stop
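
The flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's code: the names `Alert`, `scan_payload`, and `handle_packet` are hypothetical, the scan is a plain substring test standing in for the pattern matching algorithms, and the `prevent` flag is an assumed way to choose between the detection outcome (warn) and the prevention outcome (drop).

```python
from dataclasses import dataclass


@dataclass
class Alert:
    """Hypothetical record shown to the admin when leakage is found."""
    sender: str
    matched: list


def scan_payload(payload, keywords):
    """Return the sensitive keywords found in the payload.

    Assumption: a case-insensitive substring test stands in for the
    real pattern matching algorithm.
    """
    text = payload.lower()
    return sorted(k for k in keywords if k.lower() in text)


def handle_packet(sender, payload, keywords, alerts, prevent=False):
    """Scan one packet and decide what happens to it."""
    matched = scan_payload(payload, keywords)
    if not matched:
        return "forward"                      # no leakage: nothing to do
    alerts.append(Alert(sender, matched))     # the GUI would show this alert
    return "drop" if prevent else "warn"      # prevention vs. detection


alerts = []
keywords = {"exam paper", "salary"}           # hypothetical keyword database
print(handle_packet("user1", "Exam Paper attached", keywords, alerts))
print(handle_packet("user1", "see you at lunch", keywords, alerts))
```

Keeping the scan separate from the decision means the substring test could later be swapped for one of the pattern matching algorithms without touching the alerting logic.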


Algorithms
Boyer-Moore-Horspool algorithm
Aho-Corasick algorithm


Boyer-Moore-Horspool Algorithm

Given a pattern string P of length m and a text string T of length n, we would like to know whether there exists an occurrence of P in T.


Example
T : GCATCGCAGAGAGTATACAGTACG
P : GCAGAGAG

Shift table d for P (m = 8):
Letter   A   C   G   T
Value    1   6   2   8

Positions in T are numbered 0 to 23. The window starts at pos, and the shift is d[t_(pos+7)], the table value of the text character aligned with the last pattern position.

GCATCGCAGAGAGTATACAGTACG
GCAGAGAG
pos = 0: t_(0+7) = A, so pos = 0 + d[A] = 1

GCATCGCAGAGAGTATACAGTACG
 GCAGAGAG
pos = 1: t_(1+7) = G, so pos = 1 + d[G] = 3

GCATCGCAGAGAGTATACAGTACG
   GCAGAGAG
pos = 3: t_(3+7) = G, so pos = 3 + d[G] = 5

GCATCGCAGAGAGTATACAGTACG
     GCAGAGAG
pos = 5: every character matches, so the inner loop (While j > 0 And t_(pos+j) = p_j Do j ← j-1) ends with j = 0 and an occurrence is reported.
t_(5+7) = G, so pos = 5 + d[G] = 7

GCATCGCAGAGAGTATACAGTACG
       GCAGAGAG
pos = 7: t_(7+7) = A, so pos = 7 + d[A] = 8

GCATCGCAGAGAGTATACAGTACG
        GCAGAGAG
pos = 8: t_(8+7) = T, so pos = 8 + d[T] = 16
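
The shift values used in this example can be recomputed with a short Python sketch. The function name `horspool_shift_table` and the explicit `alphabet` argument are assumptions made for illustration. Indices are 0-based here, so the shift for the character at index j is m - 1 - j.

```python
def horspool_shift_table(pattern, alphabet):
    """Build the Horspool shift table d for the given pattern."""
    m = len(pattern)
    d = {c: m for c in alphabet}      # default shift: the whole pattern length
    for j in range(m - 1):            # every position except the last one
        d[pattern[j]] = m - 1 - j     # later occurrences overwrite earlier ones
    return d


table = horspool_shift_table("GCAGAGAG", "ACGT")
print(table)
```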


Algorithm Steps
Input : pattern P = p1 p2 ... pm of length m and text T = t1 t2 ... tn of length n
Output : positions of the occurrences of pattern P in T
For c ∈ Σ Do d[c] ← m
For j ∈ 1 ... m-1 Do d[p_j] ← m - j
pos ← 0
While pos ≤ n-m Do
    j ← m
    While j > 0 And t_(pos+j) = p_j Do j ← j-1
    If j = 0 Then report an occurrence at pos+1
    pos ← pos + d[t_(pos+m)]
End of while
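
A minimal Python sketch of these steps follows. The function name `horspool_search` is an assumption. Indices are 0-based, so the function returns the window start pos where the pseudocode reports pos+1. Characters missing from the table fall back to a shift of m through `dict.get`, which replaces the explicit loop over the alphabet.

```python
def horspool_search(pattern, text):
    """Return the 0-based start positions of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Preprocessing: shift for each pattern character except the last one.
    d = {}
    for j in range(m - 1):
        d[pattern[j]] = m - 1 - j
    hits = []
    pos = 0
    while pos <= n - m:
        j = m - 1
        # Compare right to left inside the current window.
        while j >= 0 and text[pos + j] == pattern[j]:
            j -= 1
        if j < 0:
            hits.append(pos)
        # Shift by the table value of the character under the window's end.
        pos += d.get(text[pos + m - 1], m)
    return hits


print(horspool_search("GCAGAGAG", "GCATCGCAGAGAGTATACAGTACG"))
```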


Aho-Corasick Algorithm
Locates all occurrences of any of a finite number of keywords in a string of text.
Consists of two parts:
constructing a finite state pattern matching machine from the keywords
using the pattern matching machine to process the text string in a single pass.


Pattern Matching Machine (1)

Our problem is to locate and identify all substrings of x which are keywords in K.
K : K = {y1, y2, ..., yk} is a finite set of strings, which we shall call keywords.
x : x is an arbitrary string, which we shall call the text string.
The behavior of the pattern matching machine is dictated by three functions: a goto function g, a failure function f, and an output function output.


Pattern Matching Machine (2)

g(s, a) = s' or fail: the goto function maps a pair consisting of a state and an input symbol into a state or the message fail.
f(s) = s': the failure function maps a state into a state, and is consulted whenever the goto function reports fail.
output(s) = keywords: the output function associates a set of keywords with every state.


Pattern Matching Machine Example


with keywords {he,she,his,hers}


Start state is state 0.

Let s be the current state and a the current symbol of the input string x.
Operating cycle
If g(s, a) = s', the machine makes a goto transition: it enters state s' and the next symbol of x becomes the current input symbol.
If g(s, a) = fail, the machine makes a failure transition. If f(s) = s', the machine repeats the cycle with s' as the current state and a as the current input symbol.


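Example
Text:       u   s   h   e   r   s
State:  0   0   3   4   5   8   9
                          2
(State 2 is entered through the failure transition f(5) = 2 before the machine reaches state 8.)
In state 4, since g(4, e) = 5, the machine enters state 5, finds the keywords she and he ending at position four of the text string, and emits output(5).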


Contd.
In state 5 on input symbol r, the machine makes two state transitions in its operating cycle.
Since g(5, r) = fail, M enters state 2 = f(5). Then, since g(2, r) = 8, M enters state 8 and advances to the next input symbol.
No output is generated in this operating cycle.


Algorithm 1. Pattern matching machine.

Input. A text string x = a1 a2 ... an where each ai is an input symbol, and a pattern matching machine M with goto function g, failure function f, and output function output, as described above.
Output. Locations at which keywords occur in x.


Method.
begin
    state ← 0
    for i ← 1 until n do
    begin
        while g(state, a_i) = fail do state ← f(state)
        state ← g(state, a_i)
        if output(state) ≠ empty then
        begin
            print i
            print output(state)
        end
    end
end
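
Algorithm 1 can be tried out with a short Python sketch. The tables below are written out by hand for the keyword set {he, she, his, hers}, using the state numbers from the example (states 0 to 9). The names `GOTO`, `FAILURE`, `OUTPUT`, `g`, and `match` are assumptions made for this sketch.

```python
FAIL = None  # marker for "the goto function reports fail"

# Goto transitions for the keywords {he, she, his, hers}.
GOTO = {
    (0, "h"): 1, (0, "s"): 3,
    (1, "e"): 2, (1, "i"): 6,
    (2, "r"): 8,
    (3, "h"): 4,
    (4, "e"): 5,
    (6, "s"): 7,
    (8, "s"): 9,
}
FAILURE = {1: 0, 2: 0, 3: 0, 4: 1, 5: 2, 6: 0, 7: 3, 8: 0, 9: 3}
OUTPUT = {2: {"he"}, 5: {"she", "he"}, 7: {"his"}, 9: {"hers"}}


def g(state, symbol):
    """Goto function: state 0 loops on every symbol without an edge."""
    if (state, symbol) in GOTO:
        return GOTO[(state, symbol)]
    return 0 if state == 0 else FAIL


def match(text):
    """Return (end position, keywords) pairs; positions count from 1."""
    state = 0
    found = []
    for i, a in enumerate(text, start=1):
        while g(state, a) is FAIL:
            state = FAILURE[state]        # failure transition
        state = g(state, a)               # goto transition
        if OUTPUT.get(state):
            found.append((i, sorted(OUTPUT[state])))
    return found


print(match("ushers"))
```

On the text ushers the sketch reports she and he at position 4 and hers at position 6, in line with the state sequence traced in the example.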


Algorithm 2
Algorithm 2. Construction of the goto function.
Input. Set of keywords K = {y1, y2, ..., yk}.
Output. Goto function g and a partially computed output function output.
Method. We assume output(s) is empty when state s is first created, and g(s, a) = fail if a is undefined or if g(s, a) has not yet been defined. The procedure enter(y) inserts into the goto graph a path that spells out y.


begin
    newstate ← 0
    for i ← 1 until k do enter(y_i)
    for all a such that g(0, a) = fail do g(0, a) ← 0
end

procedure enter(a1 a2 ... am):
begin
    state ← 0; j ← 1
    while g(state, a_j) ≠ fail do
    begin
        state ← g(state, a_j)
        j ← j + 1
    end
    for p ← j until m do
    begin
        newstate ← newstate + 1
        g(state, a_p) ← newstate
        state ← newstate
    end
    output(state) ← {a1 a2 ... am}
end
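
A possible Python rendering of Algorithm 2 is sketched below. The name `build_goto` and the dictionary representation of g are assumptions. Two details differ from the pseudocode: the inner loop also checks that j stays inside the keyword, and the loop g(0, a) ← 0 is not stored explicitly, so a lookup routine has to treat a missing edge out of state 0 as a transition to state 0.

```python
def build_goto(keywords):
    """Build the goto graph and the partially computed output function.

    Returns (goto, output, number_of_states); goto maps
    (state, symbol) -> state and output maps state -> set of keywords.
    """
    goto = {}
    output = {}
    newstate = 0
    for word in keywords:
        # procedure enter(word): follow the existing path as far as it goes ...
        state, j = 0, 0
        while j < len(word) and (state, word[j]) in goto:
            state = goto[(state, word[j])]
            j += 1
        # ... then create one new state per remaining symbol.
        for symbol in word[j:]:
            newstate += 1
            goto[(state, symbol)] = newstate
            state = newstate
        output.setdefault(state, set()).add(word)
    return goto, output, newstate + 1


goto, output, n_states = build_goto(["he", "she", "his", "hers"])
print(n_states, sorted(goto.items()))
```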


Algorithm 3
Algorithm 3. Construction of the failure function.
Input. Goto function g and output function output from Algorithm 2.
Output. Failure function f and output function output.


Method.
begin
    queue ← empty
    for each a such that g(0, a) = s ≠ 0 do
    begin
        queue ← queue ∪ {s}
        f(s) ← 0
    end
    while queue ≠ empty do
    begin
        let r be the next state in queue
        queue ← queue - {r}
        for each a such that g(r, a) = s ≠ fail do
        begin
            queue ← queue ∪ {s}
            state ← f(r)
            while g(state, a) = fail do state ← f(state)
            f(s) ← g(state, a)
            output(s) ← output(s) ∪ output(f(s))
        end
    end
end
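
The breadth-first construction of Algorithm 3 might look as follows in Python. The name `build_failure` is an assumption, and the goto and output tables for {he, she, his, hers} are written out by hand so that the sketch runs on its own. The queue is a `collections.deque`, which gives the first-in, first-out order the method relies on.

```python
from collections import deque


def build_failure(goto, output):
    """Compute the failure function f and the completed output function."""
    def g(state, symbol):
        if (state, symbol) in goto:
            return goto[(state, symbol)]
        return 0 if state == 0 else None      # None plays the role of fail

    # Group outgoing edges by source state for easy traversal.
    children = {}
    for (state, symbol), target in goto.items():
        children.setdefault(state, []).append((symbol, target))

    failure = {}
    out = {s: set(words) for s, words in output.items()}
    queue = deque()
    for _, s in children.get(0, []):          # states at depth 1
        failure[s] = 0
        queue.append(s)
    while queue:
        r = queue.popleft()
        for a, s in children.get(r, []):
            queue.append(s)
            state = failure[r]
            while g(state, a) is None:
                state = failure[state]
            failure[s] = g(state, a)
            out[s] = out.get(s, set()) | out.get(failure[s], set())
    return failure, out


GOTO = {(0, "h"): 1, (0, "s"): 3, (1, "e"): 2, (1, "i"): 6, (2, "r"): 8,
        (3, "h"): 4, (4, "e"): 5, (6, "s"): 7, (8, "s"): 9}
PARTIAL_OUTPUT = {2: {"he"}, 5: {"she"}, 7: {"his"}, 9: {"hers"}}
failure, full_output = build_failure(GOTO, PARTIAL_OUTPUT)
print(failure)
```

Because states are visited in order of depth, f(s) always points to a shallower state whose output set is already complete when it is merged into output(s).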


Algorithm 4
Algorithm 4. Construction of a deterministic finite automaton.
Input. Goto function g from Algorithm 2 and failure function f from
Algorithm 3.
Output. Next move function δ.


Method.
begin
    queue ← empty
    for each symbol a do
    begin
        δ(0, a) ← g(0, a)
        if g(0, a) ≠ 0 then queue ← queue ∪ {g(0, a)}
    end
    while queue ≠ empty do
    begin
        let r be the next state in queue
        queue ← queue - {r}
        for each symbol a do
            if g(r, a) = s ≠ fail do
            begin
                queue ← queue ∪ {s}
                δ(r, a) ← s
            end
            else δ(r, a) ← δ(f(r), a)
    end
end
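
Algorithm 4 can be sketched in Python as below. The name `build_dfa`, the explicit `alphabet` argument, and the hand-written tables for {he, she, his, hers} are assumptions made so that the example is self-contained. With δ in place, the matcher makes exactly one transition per input symbol and never consults f while scanning.

```python
from collections import deque


def build_dfa(goto, failure, alphabet):
    """Build the next move function delta as a dict (state, symbol) -> state."""
    delta = {}
    queue = deque()
    for a in alphabet:
        target = goto.get((0, a), 0)          # g(0, a) never fails
        delta[(0, a)] = target
        if target != 0:
            queue.append(target)
    while queue:
        r = queue.popleft()
        for a in alphabet:
            if (r, a) in goto:
                s = goto[(r, a)]
                queue.append(s)
                delta[(r, a)] = s
            else:
                # f(r) is shallower than r, so its row is already filled in.
                delta[(r, a)] = delta[(failure[r], a)]
    return delta


GOTO = {(0, "h"): 1, (0, "s"): 3, (1, "e"): 2, (1, "i"): 6, (2, "r"): 8,
        (3, "h"): 4, (4, "e"): 5, (6, "s"): 7, (8, "s"): 9}
FAILURE = {1: 0, 2: 0, 3: 0, 4: 1, 5: 2, 6: 0, 7: 3, 8: 0, 9: 3}
delta = build_dfa(GOTO, FAILURE, "ehirsu")

state = 0
for symbol in "ushers":
    state = delta[(state, symbol)]            # one move per symbol
print(state)
```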


OUTPUT OF BOYER-MOORE-HORSPOOL ALGORITHM


Thank You
