Steganography: By: Joe Jupin Supervised By: Dr. Longin Jan Latecki

Steganography
By: Joe Jupin

Supervised by: Dr. Longin Jan Latecki
Overview
Introduction
Clandestine Communication
Digital Applications of Steganography
Background
Uncompressed Images
Compressed Images
Steganalysis
The Images Used
Finding and Extracting Messages from Bitmaps
Detecting Messages in jpegs
Future Work
Introduction
Clandestine Communication
Cryptography
Scrambles the message into cipher
Steganography
Hides the message in unexpected places
Digital Applications of Steganography
Can be hidden in digital data
MS Word (doc)
Web pages (htm)
Executables (exe)
Sound files (mp3, wav, cda)
Video files (mpeg, avi)
Digital images (bmp, gif, jpg)
Background
Uncompressed Images
Grayscale Bitmap images (bmp)
256 shades of intensity from black to white
Can be obtained from color images
Arranged into a 2-D matrix
Messages are hidden in the least significant bits (LSBs)
Matrix values change slightly
Interested in patterns that form messages
Character   Integer    Binary
Space       32         00100000
0-9         48-57      00110000 - 00111001
A-Z         65-90      01000001 - 01011010
a-z         97-122     01100001 - 01111010
Length = 12
Message = Hello Stego!
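
The LSB idea above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the project: the function names, the flat pixel list standing in for the 2-D matrix, the start offset, and the most-significant-bit-first ordering are all assumptions made for the example.

```python
def embed_lsb(pixels, message, start=0):
    """Return a copy of pixels with the message bits written into the LSBs."""
    bits = []
    for byte in message.encode("ascii"):
        # Assumed bit order: most significant bit first, 8 bits per character.
        bits.extend((byte >> shift) & 1 for shift in range(7, -1, -1))
    if start + len(bits) > len(pixels):
        raise ValueError("message does not fit in the image")
    out = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        out[start + i] = (out[start + i] & ~1) | bit
    return out


def extract_lsb(pixels, length, start=0):
    """Read `length` characters back from the LSBs beginning at `start`."""
    chars = []
    for c in range(length):
        value = 0
        for k in range(8):
            value = (value << 1) | (pixels[start + 8 * c + k] & 1)
        chars.append(chr(value))
    return "".join(chars)


# A flat list of grayscale values (0-255) stands in for a 320x240 image.
cover = [(37 * i + 11) % 256 for i in range(320 * 240)]
stego = embed_lsb(cover, "Hello Stego!", start=1000)
recovered = extract_lsb(stego, 12, start=1000)
# Each pixel changes by at most one intensity level.
max_change = max(abs(a - b) for a, b in zip(cover, stego))
```

Because only the lowest bit of each pixel is touched, every matrix value moves by at most one gray level, which is why the change is invisible to the eye.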

Background
Compressed Images
Grayscale jpeg images (jpg)
Joint Photographic Experts Group (jpeg)
Converts image to YCbCr colorspace
Divides into 8x8 blocks
Uses Discrete Cosine Transform (DCT)
Obtain frequency coefficients
Scaled by quantization to remove some frequencies
At a high quality setting the loss will not be noticed
Huffman Coding
Affects the image's statistical properties
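
The block transform and quantization steps can be illustrated with a small pure-Python sketch. It is a simplification under stated assumptions: a real JPEG encoder uses an 8x8 quantization table chosen by the quality setting, whereas the single uniform step here, and the names dct_2d, quantize and count_nonzero, are hypothetical and used only for illustration.

```python
import math

N = 8  # JPEG block size


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an 8x8 block given as a list of 8 rows."""
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


def quantize(coeffs, step):
    """Divide every coefficient by a uniform step and round (assumed scheme)."""
    return [[round(c / step) for c in row] for row in coeffs]


def count_nonzero(block):
    return sum(1 for row in block for value in row if value != 0)


# A flat block (intensity 100, level-shifted by 128) has only a DC term.
flat = [[100 - 128] * N for _ in range(N)]
flat_coeffs = dct_2d(flat)
flat_quantized = quantize(flat_coeffs, 16)

# A horizontal ramp has several frequency terms; a coarser step removes more.
ramp = [[8 * y - 28 for y in range(N)] for _ in range(N)]
fine = quantize(dct_2d(ramp), 2)
coarse = quantize(dct_2d(ramp), 40)
```

A larger step sends more coefficients to zero, which is the sense in which quantization removes frequencies and changes the statistics a detector can measure.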
Background
Steganalysis
The Images Used
From Star Trek Website
1,000 color jpeg images
320x240 or 240x320
www.startrek.com
There will be Klingons
Finding and Extracting
Messages from Bitmaps
Problem
Messages can be hidden in lsbs
May be anywhere in image
Cannot see message in image
Would take forever to be processed by a
human
Procedure
Inject messages into images
Take a Boolean snapshot of even and odd pixels
Construct a string of all possible characters
An n-pixel image has n-7 individual character
enumerations (320 x 240 - 7 = 76,793)
Use character properties to match a message
pattern in the enumerated string
Define a message (pattern of message characters)
Define message characters (used in messages)
Use stego stems (patterns)
A test can be performed faster by using tiled
samples
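
The enumeration procedure above can be sketched as follows. This is only an illustration under assumptions: the set of message characters, the scoring rule (longest run of message characters read every eighth window), and the function names are hypothetical and are not the project's actual stego stems or pattern score. The cover image is contrived so that every LSB starts at zero, which keeps the example deterministic.

```python
MESSAGE_CHARS = set(
    " !,.?'0123456789"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
)


def enumerate_chars(pixels):
    """Decode every 8-bit window of the LSB snapshot (n pixels, n - 7 windows)."""
    chars = []
    value = 0
    for i, pixel in enumerate(pixels):
        bit = pixel & 1  # Boolean snapshot: odd pixel -> 1, even pixel -> 0
        value = ((value << 1) | bit) & 0xFF
        if i >= 7:
            chars.append(chr(value))
    return chars


def longest_run(chars):
    """Return (length, start) of the longest run of message characters,
    reading every 8th window for each of the 8 possible bit alignments."""
    best_len, best_start = 0, -1
    for offset in range(8):
        run, run_start = 0, offset
        for i in range(offset, len(chars), 8):
            if chars[i] in MESSAGE_CHARS:
                if run == 0:
                    run_start = i
                run += 1
                if run > best_len:
                    best_len, best_start = run, run_start
            else:
                run = 0
    return best_len, best_start


# Contrived cover: all pixel values even, so the clean snapshot is all zeros.
pixels = [(2 * i) % 256 for i in range(320 * 240)]
bit_index = 1000
for ch in "Hello Stego!":
    for shift in range(7, -1, -1):
        pixels[bit_index] |= (ord(ch) >> shift) & 1
        bit_index += 1

chars = enumerate_chars(pixels)
score, start = longest_run(chars)
found = "".join(chars[start + 8 * k] for k in range(score))
```

On a real image the clean LSBs are noisy, so short accidental runs appear everywhere; a threshold on the run length is what separates them from an injected message.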

Steganography is the art and science of communicating in a
way which hides the existence of the communication. In
contrast to cryptography, where the "enemy" is allowed to
detect, intercept and modify messages without being able to
violate certain security premises guaranteed by a
cryptosystem, the goal of steganography is to hide messages
inside other "harmless" messages in a way that does not
allow any "enemy" to even detect that there is a second
secret message present [Markus Kuhn 1995-07-03].

Observation
Only considered linear unencrypted
messages
Trial performed on 100 grayscale bitmaps
97 clean
3 stego
Took an average of 9 seconds per image to
find with 100% accuracy (no training -- cold)
Occasionally some garbage text at head or tail
Took an average of 3 seconds per image to
test with 100% accuracy
Clean images had pattern scores of less than 10
Stego images had pattern scores of 31 or more
Conclusion
Messages are detectable and extractable
from non-encrypted uncompressed images
Linear messages can be found in any
direction with more computation
This method can be foiled by hashing the
message into the image
Detecting Messages in
jpegs
Problem
Cannot use an enumeration scheme to
detect or find a message
May only be able to detect because of
encoding schemes and encryption
Cannot see message in image
Statistical properties of an image change
when a message is injected
jpegs
Procedure
Obtain the 4-level 2-D wavelet decomposition of
the images
Obtain the orientation decomposition of frequency
space statistics
72 features plus the class (0 = clean, 1=stego)
Includes: mean, variance, skewness and kurtosis of
coefficients and error for prediction in subband
Normalize the data by 0-1 min-max
Train Fisher Linear Discriminant (FLD)
Test the FLD threshold
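
The last three steps (normalize, train, test against a threshold) can be sketched in pure Python. This is a minimal two-class Fisher linear discriminant under assumptions: the toy data has 2 features instead of 72, the threshold is placed midway between the projected class means, and the small ridge term and all function names are hypothetical choices for the example. The wavelet feature extraction itself is not shown.

```python
def min_max_normalize(rows):
    """Scale each feature column to [0, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(row, lows, highs)] for row in rows]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def train_fld(rows, labels, ridge=1e-6):
    """Return (w, threshold) with w = Sw^-1 (m1 - m0)."""
    d = len(rows[0])
    groups = {0: [], 1: []}
    for row, label in zip(rows, labels):
        groups[label].append(row)
    means = {k: [sum(col) / len(g) for col in zip(*g)]
             for k, g in groups.items()}
    # Within-class scatter matrix, with a small ridge for numerical stability.
    sw = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
    for k, g in groups.items():
        for row in g:
            diff = [v - mu for v, mu in zip(row, means[k])]
            for i in range(d):
                for j in range(d):
                    sw[i][j] += diff[i] * diff[j]
    w = solve(sw, [a - b for a, b in zip(means[1], means[0])])
    proj = {k: sum(wi * mi for wi, mi in zip(w, means[k])) for k in groups}
    return w, (proj[0] + proj[1]) / 2.0


def predict(w, threshold, row):
    """Class 1 (stego) if the projection exceeds the threshold, else 0."""
    return 1 if sum(wi * v for wi, v in zip(w, row)) > threshold else 0


# Toy data: 0 = clean, 1 = stego.
raw = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.3], [0.3, 0.25],
       [0.8, 0.9], [0.9, 0.7], [0.7, 0.85], [1.0, 0.8]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
data = min_max_normalize(raw)
w, threshold = train_fld(data, labels)
predictions = [predict(w, threshold, row) for row in data]
```

In practice the minimum and maximum used for normalization would be taken from the training set and reused on the test set, and the threshold could be tuned to trade false positives against missed detections.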
-0.004 17.120 120.485 0.059 0.363 1.041 3.809 -0.291
-0.146 838.622 97.874 0.887 0.034 1.391 3.948 -0.703
-2.200 15627.538 47.077 -1.128 -0.465 2.060 3.726 -0.738
0.011 15.318 90.017 0.594 0.268 0.969 3.877 -0.172
-0.523 920.19 62.226 -1.366 -0.146 1.326 3.944 -0.705
4.418 15572.229 23.531 -0.123 -0.541 1.980 3.571 -0.705
-0.004 0.935 182.339 -1.808 0.601 1.226 4.692 0.205
-0.079 193.451 364.874 -9.569 -0.116 1.133 4.244 -0.577
1.899 3640.213 24.731 0.766 -0.349 1.681 3.426 -0.625
0
0.590963 0.050189 0.080103 0.345166 0.343829 0.332710 0.001311 0.021374
0.482941 0.094929 0.084698 0.411032 0.331954 0.572352 0.260870 0.337264
0.135543 0.065238 0.079329 0.542244 0.187500 0.603208 0.306227 0.424866
0.370270 0.032725 0.025054 0.381317 0.412698 0.385321 0.001666 0.043085
0.402427 0.053992 0.155397 0.553661 0.476190 0.432629 0.237224 0.271698
0.422609 0.096439 0.087974 0.463496 0.471598 0.242233 0.153389 0.360447
0.395349 0.026724 0.044753 0.738226 0.479060 0.367367 0.073430 0.361345
0.427911 0.042625 0.055986 0.558653 0.350634 0.332762 0.165738 0.301011
0.611057 0.054988 0.166710 0.497393 0.518569 0.373766 0.153005 0.320611
0

Feature labels, in three groups with suffixes 12, 23 and 34 (24 labels each, 72 in total), followed by the class:
meanV12 meanH12 meanD12 varV12 varH12 varD12
skwV12 skwH12 skwD12 krtV12 krtH12 krtD12
meanEv12 meanEh12 meanEd12 varEv12 varEh12 varEd12
skwEv12 skwEh12 skwEd12 krtEv12 krtEh12 krtEd12
meanV23 meanH23 meanD23 varV23 varH23 varD23
skwV23 skwH23 skwD23 krtV23 krtH23 krtD23
meanEv23 meanEh23 meanEd23 varEv23 varEh23 varEd23
skwEv23 skwEh23 skwEd23 krtEv23 krtEh23 krtEd23
meanV34 meanH34 meanD34 varV34 varH34 varD34
skwV34 skwH34 skwD34 krtV34 krtH34 krtD34
meanEv34 meanEh34 meanEd34 varEv34 varEh34 varEd34
skwEv34 skwEh34 skwEd34 krtEv34 krtEh34 krtEd34
class
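
Each label above names one of four statistics (mean, variance, skewness, kurtosis) computed over a set of coefficients or prediction errors. A minimal sketch of those four statistics follows. It assumes population (divide-by-n) moments and non-excess kurtosis, and the function name is hypothetical; the slides do not say which conventions were used.

```python
def four_moments(values):
    """Return (mean, variance, skewness, kurtosis) of a list of numbers.

    Assumed conventions: population variance, standardized third and
    fourth moments, kurtosis not shifted by 3.
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0:
        # A constant input has no spread; report zero higher moments.
        return mean, 0.0, 0.0, 0.0
    std = var ** 0.5
    skew = sum(((v - mean) / std) ** 3 for v in values) / n
    kurt = sum(((v - mean) / std) ** 4 for v in values) / n
    return mean, var, skew, kurt


mean, var, skew, kurt = four_moments([1.0, 2.0, 3.0, 4.0, 5.0])
```

Applying this to each orientation subband, and to the corresponding prediction errors, at each of the three suffix groups gives the 72 values in one feature vector.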
jpegs
Observation
Trials performed on 2000 images
1000 clean and 1000 stego
Random selection of 1000 instances
without replacement (500 each class)
Messages in stego had sufficient size
Results show overwhelming accuracy
Bior3.1 True Neg 100%, True Pos 98.6%
Rbio5.5 True Neg 99.8%, True Pos 98.8%
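
The sampling and scoring steps can be sketched as below. The image ids, the seed, the toy labels and the function names are hypothetical; the sketch only shows how a balanced sample is drawn without replacement and how true negative and true positive rates are computed, and it does not reproduce the reported figures.

```python
import random


def sample_balanced(clean_ids, stego_ids, per_class, seed=0):
    """Draw per_class ids from each class without replacement."""
    rng = random.Random(seed)
    return rng.sample(clean_ids, per_class), rng.sample(stego_ids, per_class)


def rates(labels, predictions):
    """Return (true negative rate, true positive rate); 0 = clean, 1 = stego."""
    pairs = list(zip(labels, predictions))
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    return tn / labels.count(0), tp / labels.count(1)


# 1000 clean ids (0-999) and 1000 stego ids (1000-1999), 500 drawn from each.
clean_pick, stego_pick = sample_balanced(
    list(range(1000)), list(range(1000, 2000)), 500)

# Toy predictions: one clean image flagged, one stego image missed.
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_predictions = [0, 0, 0, 1, 1, 1, 1, 0]
tn_rate, tp_rate = rates(toy_labels, toy_predictions)
```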
jpegs
Conclusion
Messages of sufficient size can be detected
in stego images with great accuracy
Improved accuracy may be due to a large
training set
1000 (800/200)
500 (400/100)
Restricted domain
Many similar images
jpegs
Problems
Authors did not handle log of zero problem
Replaced with small value
Differing jpeg sizes need differing message
sizes
Dynamic message injection
jpegs
Other Classifiers
Tests were run on J4.8, SMO, Logistic and
Naive Bayes for bior3.1 and rbio5.5 with
80/20 split and default settings
Results
Future Work
Would like to find optimal stems
Pattern matching
Text mining
Cryptanalysis
Would like to optimize TestMsg code
C/assembly code
References
Petitcolas, F.A.P., Anderson, R., Kuhn, M.G., "Information Hiding - A
Survey", July 1999, URL:
http://www.cl.cam.ac.uk/~fapp2/publications/ieee99-infohiding.pdf
(11/26/01 17:00)
Farid, Hany, "Detecting Steganographic Messages in Digital Images",
Department of Computer Science, Dartmouth College, Hanover, NH
03755
Moby Words II, Copyright (c) 1988-93, Grady Ward. All Rights
Reserved.
Lyu, Siwei and Farid, Hany, "Steganalysis Using Color Wavelet Statistics
and One-Class Support Vector Machines", Department of Computer
Science, Dartmouth College, Hanover, NH 03755, USA
Farid, Hany, "Detecting Hidden Messages Using Higher Order Statistical
Models", Department of Computer Science, Dartmouth College, Hanover,
NH 03755
Spy Vs. Spy
by Antonio Prohias
from MAD Magazine
Have a good Winter Break!
