Email Security and Anti-Spam Technology: MSC Vuong Thi Nhung
Anti-spam technology
Email Basics
• What is an Email – an electronic message
transmitted over a network from one user to
another.
• Can be as simple as a few lines of text, or
include attachments such as pictures or
documents.
• Email made up 75% of network traffic soon after
the introduction of the internet.
What Makes Up An Email
• The Header
– Who sent the email.
– To whom the mail is sent.
– When the email was sent.
– The email subject.
– The size of the email.
What Makes Up An Email
• The Body
– Contains the message.
– May also contain an attachment.
• Attachments
– If not embedded within the body, attachments
are sent along with the email.
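The header/body split described above can be illustrated with Python's standard `email` package. This is only a sketch: the addresses, date, and text in `RAW_MESSAGE` are made up for illustration and do not come from the lecture.

```python
from email import message_from_string

# Hypothetical raw message: header lines, one blank line, then the body.
RAW_MESSAGE = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Date: Mon, 01 Jan 2024 09:00:00 +0000\n"
    "Subject: Lunch\n"
    "\n"
    "Hi Bob,\n"
    "Are you free for lunch today?\n"
)

message = message_from_string(RAW_MESSAGE)

# Header fields: who sent it, to whom, when, and the subject.
header = {name: message[name] for name in ("From", "To", "Date", "Subject")}

# For a simple (non-multipart) message the payload is the body text.
body = message.get_payload()

print(header)
print(body)
```

A message with attachments would be multipart, in which case `get_payload()` returns a list of parts instead of a string.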
How does email messaging system work?
[Figure: message flow with links labelled SMTP, SMTP, and POP3/IMAP, plus a webpage containing the message]
Content
• How emails work
• Email security
• What is Spam?
• Anti-Spam framework
• Statistical rules for SpamAssassin
• Dynamic Sender Policy Framework
• Personalized Email Prioritization
#1a: Encrypt Your POP & IMAP Traffic
#1b. Securing SMTP protocol
[Figure: sending e-mail client, mail server, receiving e-mail client]
#1c. End-to-end email encryption
• E-Mail Encryption
– PGP and S/MIME for encryption
– S/MIME usually built in if available at all
– S/MIME uses PKI technology for encryption
– PGP usually a cumbersome add-on to e-mail
Cryptographic Protection for E-Mail
#2. Neutralize Viruses and Worms
#3. Port 25 filtering
#4. SMTP relay
Open SMTP relay issues
[Figure: (1) spam is sent to your Exchange server's open SMTP relay; (2) the relay forwards it to the destination SMTP server; blacklist]
Open SMTP relay issue
• An open SMTP relay gives spammers free, reliable delivery of their spam through your SMTP relay server all the way to the destination.
• The spam messages are marked as originating from your domain name. Results:
– Your domain is placed on a DNS-based Blackhole List (DNSBL), which is used to block email from listed sources.
– Users in other domains can no longer receive email from your domain.
Best practices for SMTP relay
[Figure: diagram labelled maildomain.com]
Content
• How emails work
• Email security
• What is Spam?
• Anti-Spam framework
• SpamAssassin
What is spam?
• Spam is unsolicited bulk email (illegitimate
email that you don’t want to receive)
• Ham is the legitimate email you want to receive
or “good mail”.
• Reasons for spam
– Sender is anonymous
– Low cost to send an email
– Demand for advertising
Example of spam
Integrated anti-spam software
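[Figure: two mail domains (vuongnhung.hanu@gmail.com and nhungfit@hanu.edu.vn), each located through a DNS MX record and each served by a server running anti-spam software. The servers exchange mail over SMTP on port 25. Clients: Outlook (POP3 on port 110, SMTP on port 25) and IE (port 80).]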
Client antispam software
Email gateway
[Figure: two mail domains (vuongnhung.hanu@gmail.com and nhungfit@hanu.edu.vn), each located through a DNS MX record. Mail passes over port 25 through an email gateway. Clients: Outlook (POP3/IMAP, SMTP) and IE.]
POP3 anti-spam proxy
[Figure: two mail domains (vuongnhung.hanu@gmail.com and nhungfit@hanu.edu.vn), each located through a DNS MX record. Servers exchange mail over SMTP on port 25; Outlook retrieves mail on port 110 (POP3).]
Anti-spam approaches
• SMTP server
– Real-time Blackhole List, auto whitelisting, greylisting
• Artificial intelligence
– Rule-based, SpamAssassin, text categorization,
ontology
• Sender authentication
– SPF, DomainKeys, SenderID
• Collaboration
– Peer to Peer, Grid computing
• Social networks
– UserRank
Real-time Blackhole List (RBL)
• A third party provides the Real-time Blackhole List service
• It keeps a list of IP addresses that send spam
• The service: check whether an IP address is on the RBL
• How “bad” IP addresses are collected
– Spam reports
– Provided by a large email server
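The lookup step can be sketched as follows. DNS-based lists are conventionally queried by reversing the octets of the IPv4 address, appending the list's zone, and asking for an A record; any answer means the address is listed. The zone `dnsbl.example.org` is a placeholder, not a real service, and the `resolver` parameter is an assumption added so the logic can be exercised without network access.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up: reversed IPv4 octets + list zone."""
    address = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(address).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str, resolver=socket.gethostbyname) -> bool:
    """Return True if the list answers for this IP, False if the name does not resolve."""
    try:
        resolver(dnsbl_query_name(ip, zone))
        return True
    except OSError:  # socket.gaierror (name not found) is a subclass of OSError
        return False


print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

A mail server would typically call `is_listed` with the connecting client's IP address before accepting the message.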
Auto whitelisting
• Each user is assigned a credit score
– If user A sends a ham message, A's credit score increases
– If user A sends a spam message, A's credit score decreases
• Users with a high credit score are automatically added to the whitelist
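A minimal sketch of this bookkeeping is shown below. The step size of one point per message and the whitelist threshold of 3 are assumptions chosen for illustration; the slides do not specify them.

```python
from collections import defaultdict


class AutoWhitelist:
    """Tracks a per-sender credit score (illustrative values only)."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold          # assumed cut-off for whitelisting
        self.scores = defaultdict(int)      # sender -> credit score

    def record(self, sender: str, is_spam: bool) -> None:
        # Ham raises the sender's credit, spam lowers it.
        self.scores[sender] += -1 if is_spam else 1

    def is_whitelisted(self, sender: str) -> bool:
        return self.scores.get(sender, 0) >= self.threshold


awl = AutoWhitelist(threshold=3)
for _ in range(3):
    awl.record("alice@example.com", is_spam=False)
print(awl.is_whitelisted("alice@example.com"))
```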
Greylisting
• An MTA uses greylisting to "temporarily reject" any email from a sender it does not recognize.
• When the first SMTP attempt gets a temporary error
– Normal senders retry
– Spammers don’t retry
– If the email is resent, put the sender on the whitelist; otherwise mark it as spam and reject it permanently
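The retry logic can be sketched as below. Identifying a sender by the triple (client IP, envelope sender, recipient) and requiring a minimum delay of 300 seconds before a retry counts are assumptions for illustration; the expiry step that permanently rejects senders who never retry is left out.

```python
class Greylist:
    """Temporarily rejects unknown senders and whitelists those that retry."""

    def __init__(self, min_delay: int = 300):
        self.min_delay = min_delay   # assumed minimum wait before a retry counts
        self.first_seen = {}         # triple -> time of first attempt
        self.whitelist = set()       # triples that retried successfully

    def check(self, client_ip: str, sender: str, recipient: str, now: float) -> str:
        key = (client_ip, sender, recipient)
        if key in self.whitelist:
            return "accept"
        first = self.first_seen.get(key)
        if first is None:
            self.first_seen[key] = now
            return "tempfail"        # corresponds to an SMTP 4xx reply
        if now - first >= self.min_delay:
            self.whitelist.add(key)  # the sender retried like a normal MTA
            return "accept"
        return "tempfail"            # retried too quickly


greylist = Greylist()
print(greylist.check("192.0.2.1", "a@example.com", "b@example.org", now=0))
print(greylist.check("192.0.2.1", "a@example.com", "b@example.org", now=600))
```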
Rule-based Filtering
• Defines many rules
• Each rule has a score
• If a rule matches, then add the score of
that rule to the total score of the email.
• If the total score of the email > a threshold, then the email is spam
Example
• Rules:
– Subject contains “free”: score 1
– Body contains “free”: score 1
– Attachment name is “love.exe”: score 3
• Threshold: 5
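The three rules and the threshold above can be written out as a small scorer. The `Email` class, the rule names, and the case-insensitive matching are assumptions made for this sketch; only the rule conditions, scores, and threshold come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Email:
    subject: str
    body: str
    attachments: list = field(default_factory=list)  # attachment file names


# (name, test, score) triples mirroring the rules on the slide.
RULES = [
    ("SUBJECT_FREE", lambda m: "free" in m.subject.lower(), 1),
    ("BODY_FREE", lambda m: "free" in m.body.lower(), 1),
    ("LOVE_EXE", lambda m: "love.exe" in [a.lower() for a in m.attachments], 3),
]
THRESHOLD = 5


def total_score(mail: Email) -> int:
    """Sum the scores of every rule that matches."""
    return sum(score for _, test, score in RULES if test(mail))


def is_spam(mail: Email) -> bool:
    """Spam only if the total is strictly greater than the threshold."""
    return total_score(mail) > THRESHOLD


mail = Email("Free gift", "Claim your free prize", ["love.exe"])
print(total_score(mail), is_spam(mail))
```

With a strict "greater than" comparison, an email that matches all three rules scores exactly 5 and is therefore still classified as ham under these rules.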
Example (cont.)
• Determine the score of this email
• Is it spam or ham?
• Case 1: (email shown on the slide)
• Case 2: (email shown on the slide)
Sender authentication
• SPF (Sender Policy Framework)
– SPF vs. DNS MX
• SenderID
– SPF + PRA
• DomainKeys
– Public keys
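As a rough illustration of the SPF idea, a domain publishes a DNS TXT record listing the hosts allowed to send its mail, and the receiver checks the connecting IP address against it. The sketch below handles only the `ip4:` and `all` mechanisms and skips everything else, which a real evaluator must not do; the record and addresses are documentation examples, not taken from the lecture.

```python
import ipaddress

QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def check_spf(record: str, client_ip: str) -> str:
    """Evaluate a simplified SPF record against the connecting client's IP."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"                      # not an SPF version 1 record
    ip = ipaddress.ip_address(client_ip)
    for term in terms[1:]:
        qualifier = "+"                    # default qualifier is "pass"
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        if term.lower() == "all":
            return QUALIFIERS[qualifier]   # matches every sender
        if term.lower().startswith("ip4:"):
            network = ipaddress.ip_network(term[4:], strict=False)
            if ip in network:
                return QUALIFIERS[qualifier]
        # Other mechanisms (a, mx, include, ...) are ignored in this sketch.
    return "neutral"                       # no mechanism matched


RECORD = "v=spf1 ip4:192.0.2.0/24 -all"
print(check_spf(RECORD, "192.0.2.55"))
print(check_spf(RECORD, "198.51.100.7"))
```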
SPF v1
[Figure: SPF v1 check for a message to recipient you@to.com]
SenderID
DomainKeys
– Basically public/private keys for authenticating
client mail and the servers along the path
– Acts as a chain of custody from the source
client machine to the destination client
machine
– Backward compatible with existing technology
– Google and Yahoo have already deployed it
Peer to Peer
• Spam fingerprint database
• Exchange spam fingerprints by using a P2P network
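One way to picture a fingerprint database is sketched below. Using a SHA-256 hash of the whitespace-normalized, lowercased body as the fingerprint is an assumption made for illustration; practical systems tend to use fuzzier digests that survive small edits, and the P2P transport itself is reduced here to a plain `merge` call.

```python
import hashlib
import re


def fingerprint(body: str) -> str:
    """Hash a normalized copy of the body (assumed fingerprint scheme)."""
    normalized = re.sub(r"\s+", " ", body).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class FingerprintDB:
    """A peer's local store of known spam fingerprints."""

    def __init__(self):
        self.known = set()

    def report_spam(self, body: str) -> None:
        self.known.add(fingerprint(body))

    def merge(self, peer_fingerprints) -> None:
        # Stand-in for receiving fingerprints from another peer.
        self.known |= set(peer_fingerprints)

    def is_known_spam(self, body: str) -> bool:
        return fingerprint(body) in self.known


peer_a, peer_b = FingerprintDB(), FingerprintDB()
peer_a.report_spam("Buy cheap watches now")
peer_b.merge(peer_a.known)
print(peer_b.is_known_spam("buy  cheap watches NOW"))
```

Only fingerprints are exchanged between peers, so the message text itself never leaves the peer that reported it.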
Grid computing
• Filter spam on a large scale
• Cloud-based anti-spam service
* Adapted from ACM Computing Surveys, Vol. 44, No. 2, Article 9
Social networks
• Consider an email network as a network of nodes (users).
• Links between nodes are email transaction activities
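The graph view can be sketched with a simple edge counter. The function names and the sample transactions are hypothetical; how a ranking is then computed over this graph is outside this sketch.

```python
from collections import Counter


def build_email_graph(transactions):
    """Count messages per directed (sender, recipient) link."""
    edges = Counter()
    for sender, recipient in transactions:
        edges[(sender, recipient)] += 1
    return edges


def distinct_recipients(edges, user):
    """Number of different users this user has sent mail to."""
    return len({recipient for (sender, recipient) in edges if sender == user})


transactions = [
    ("alice", "bob"),
    ("alice", "bob"),
    ("bob", "alice"),
    ("mallory", "alice"),
    ("mallory", "bob"),
    ("mallory", "carol"),
]
graph = build_email_graph(transactions)
print(graph[("alice", "bob")], distinct_recipients(graph, "mallory"))
```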
SpamAssassin review
• SpamAssassin
– Open source, widely used
– Rule-based approach
body DEAR_FRIEND /^\s*Dear Friend\b/i
describe DEAR_FRIEND Dear Friend? That's not very dear!
score DEAR_FRIEND 0.542
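To see what the DEAR_FRIEND pattern matches, the same regular expression can be tried in Python. This is only an approximation for experimenting with the pattern: the function name is hypothetical, and SpamAssassin applies body rules to its own rendered form of the message rather than to a raw string like this.

```python
import re

# Same pattern as the rule above; re.IGNORECASE plays the role of the /i flag.
DEAR_FRIEND = re.compile(r"^\s*Dear Friend\b", re.IGNORECASE)
DEAR_FRIEND_SCORE = 0.542


def dear_friend_score(text: str) -> float:
    """Return the rule's score if the text starts with 'Dear Friend'."""
    return DEAR_FRIEND_SCORE if DEAR_FRIEND.search(text) else 0.0


print(dear_friend_score("  dear friend, you have won a prize"))
print(dear_friend_score("To my dear friend"))
```

The `^` anchor and `\s*` mean the phrase must open the text, optionally after whitespace, and `\b` requires the word to end after "Friend".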
Regular expression
• Abbreviated as regex or regexp
• A regular expression, often called a pattern, is an expression that describes a set of strings.
• It gives a concise description of a set without having to list all of its elements
Regex operation
• Alternation: A vertical bar separates
alternatives
– gray|grey
• can match "gray" or "grey".
• Grouping: Parentheses are used to define
the scope and precedence of the operators
– gr(a|e)y
• describes the set of "gray" and "grey"
• More operations will be discussed in the tutorial
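Both operations can be tried with Python's `re` module. The word list is made up for the demonstration; `fullmatch` is used so that each pattern must describe the whole word.

```python
import re

alternation = re.compile(r"gray|grey")   # either alternative
grouping = re.compile(r"gr(a|e)y")       # the choice is limited to one letter

words = ["gray", "grey", "groy", "gry"]

matched_by_alternation = [w for w in words if alternation.fullmatch(w)]
matched_by_grouping = [w for w in words if grouping.fullmatch(w)]

print(matched_by_alternation)
print(matched_by_grouping)
```

The two patterns describe the same set of strings; grouping just avoids repeating the shared "gr" and "y".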
Rules directory
• C:\Program Files\SpamAssassin\Rules
10_misc.cf 30_text_de.cf
20_body_tests.cf 30_text_es.cf
20_head_tests.cf 30_text_fr.cf
20_uri_tests.cf 30_text_pl.cf
25_body_tests_es.cf 40_spam_phrases.cf
25_body_tests_pl.cf 50_scores.cf
25_head_tests_es.cf 60_whitelist.cf
25_head_tests_pl.cf local.cf
Example
Headers added to spam by SpamAssassin
Reminders
• This week, each class monitor needs to submit the groups of presenters via the FIT portal
– Only students who do not yet have enough discussion marks need to make a presentation (students with 0 or 1 discussion marks are required to present). Presentation topics vary from week to week. Check the topics on the FIT portal.
– From the 10th week, students are required to deliver presentations during lecture time.
• Next week (the 9th week), lectures and tutorials will be off for the midterm.
• Remember that before every video, you need to type your name, your class, and what you’re going to do.
Presentation Schedule
Final projects
• The final project is launched after the midterm.
• In the 10th week, class monitors need to submit the final project groups along with their project topics
Tutorial this week
• Regular Expression
• SpamAssassin rules