Sobel Edge Detector in VB.NET
I recently stumbled onto a C tutorial on edge detection and decided to implement the algorithm in VB.NET.
Edge detection is a machine vision technique that attempts to identify interesting parts of an image, such as
where one object ends and another begins. One way of finding these areas is to search for sharp changes in
intensity between each pixel and its neighbors. A source image is processed and an output image is created
that highlights the edges found in the source. The output image is a visualization of what the algorithm detected
and ultimately these results can be applied to solve a specific problem.
Although the background and mathematical basis for the algorithm are interesting, I'm only going to discuss
them briefly and focus on the implementation and performance issues in .NET. If you are curious about the
details behind the algorithm, you should check out the original articles I found.
The Algorithm
To find relative changes in intensity level, the algorithm processes the image one pixel at a time. It looks at the
change in intensity to the left and right of the current pixel, stores it and then checks the change in intensity in
the pixels above and below the current pixel and stores it as well. The actual process happens one dimension at a time for each pixel (horizontal and then vertical), and then combines the results into two-dimensional pixel data: the output image.
As each pixel is encountered, its neighboring pixels' intensity levels are calculated and then subtracted from the neighbors on the opposite side. The resulting sum is the relative change in intensity at that location. If opposite neighbor pixels are the same color, subtracting one from the other yields zero (black). If the two neighboring pixels have radically different intensity levels, the output will be greater than zero. The mechanism that weights the pixels is a pair of weighted matrices, one for horizontal changes (the xMask) and one for vertical (the yMask):
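The matrices themselves were dropped from this copy; the standard Sobel pair looks like this:

```
xMask (horizontal change)      yMask (vertical change)
   -1   0  +1                     +1  +2  +1
   -2   0  +2                      0   0   0
   -1   0  +1                     -1  -2  -1
```

Each element multiplies the intensity of the corresponding neighbor, so the two sides of each mask carry opposite signs.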
For the vertical yMask, the strictly horizontal elements are zero, and the opposite is true for the xMask. The
elements of each matrix are multiplied by the border pixels' intensity levels, and then the results are summed.
Since the opposite sides are also opposite signs, the sum is the change in intensity we were looking for. The
final step is to add the absolute value of the horizontal and vertical differences in intensity. This process is
actually a rough approximation of the mathematical gradient of the image.
First Implementation: GDI+
To get things started, I wanted to keep the details of working with the image data to a minimum, so I created the
algorithm to work with the infinitely slow GDI+ Bitmap object using the GetPixel and SetPixel methods. The
implementation is straightforward:
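The original VB.NET listing was not preserved in this copy; as a sketch of the same per-pixel loop (in Python for brevity, with plain list indexing standing in for GetPixel/SetPixel):

```python
# Naive per-pixel Sobel pass. `image` is a 2-D list of grayscale
# intensities (0-255); the 1-pixel border is skipped.

X_MASK = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal changes
Y_MASK = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]   # vertical changes

def sobel(image):
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            # weight each neighbor by the mask element, then sum
            for j in range(3):
                for i in range(3):
                    p = image[y + j - 1][x + i - 1]
                    gx += X_MASK[j][i] * p
                    gy += Y_MASK[j][i] * p
            # combine the two directions by summing absolute values
            out[y][x] = min(255, abs(gx) + abs(gy))
    return out
```

A flat region produces zero output, while a hard vertical boundary saturates to white, which matches the behavior described above.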
This implementation typically runs at about 3K pixels/second, which sounds fast but actually takes about 160 seconds to process an 800 x 600 pixel image (480K pixels). So for real-time processing, this method is absolutely out. Although it's very slow, this method works as expected and was useful to me as a reference renderer as I tested new approaches.
Take Two: Direct Pixel Access
The GDI+ Bitmap object offers a handy pair of methods, LockBits()/UnlockBits(), that expose the raw bytes of memory composing the Bitmap object. By using them, it is possible to read all pixels in one call, process them, and then write them back in a single call. This is much, much faster than the Get/SetPixel() methods, which operate on a single pixel per read and write. The serious downside of LockBits is that you no longer have friendly access to pixels by X and Y coordinates, and the documentation is pretty thin.
When calling LockBits(), you specify what format you want the data to be returned in. The pixels get returned as
a contiguous array of bit data, and you are charged with picking it apart. For my purposes, I've forced the format
to always be 24 bit RGB. The function returns a one dimensional array of bytes. Each pixel is encoded
according to the format specified, so in my case, there are 3 bytes for each pixel (R, G and B). To emulate two
dimensions, a stride value is given, which lets you know how many bytes there are per line, along with a
height which is the total number of scan lines. So each pixel can still be accessed with X and Y coordinates by
Pixel.Red = Array[stride * Y + X * 3]
Pixel.Green = Array[stride * Y + X * 3 + 1]
Pixel.Blue = Array[stride * Y + X * 3 + 2]
Notice that the Y value is multiplied by the width of the scan line and X is multiplied by the number of bytes per pixel; the 3 here is for RGB. The offsets 0, 1 and 2 select the three color channels at that location. (Strictly speaking, GDI+'s Format24bppRgb lays the bytes out in blue, green, red order in memory, but for an intensity-based filter the channel order does not matter.)
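The index arithmetic can be checked with a quick sketch (Python here for illustration; note that GDI+ pads each scan line to a 4-byte boundary, which is why stride must be used instead of width * 3):

```python
def row_stride(width, bytes_per_pixel=3):
    """Bytes per scan line; GDI+ pads each line to a 4-byte boundary."""
    return (width * bytes_per_pixel + 3) & ~3

def pixel_offset(x, y, stride, bytes_per_pixel=3):
    """Byte offset of pixel (x, y) in a LockBits-style 1-D buffer."""
    return stride * y + x * bytes_per_pixel

# A 10-pixel-wide 24bpp image needs 30 data bytes per line,
# padded up to 32, so pixel (2, 1) starts at byte 32 + 6 = 38.
print(pixel_offset(2, 1, row_stride(10)))  # 38
```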
To make this logic less painful, I created a wrapper class for the bitmap object which has its own GetPixel and
SetPixel methods. This class locks the bits on the image when it loads and then operates on the array. It
implements IDisposable, and when Dispose is called, it calls UnlockBits on the original bitmap image,
committing all changes to the image at once.
This implementation runs at around 50Kp/s, a huge improvement over the original. Processing the pixels in blocks greatly improved performance, but this is still a little too slow for any real-time application. The 800 x 600 image still takes about 10 seconds to process with this new method.
Take Three: Divide and Conquer
The next approach I took was to reduce the input data by splitting the image in two. The actual split is done by
creating Rectangle objects and then passing these rectangles to LockBits when retrieving the image data.
Since I wanted to test different numbers of splits, it became very important to *neatly* keep track of work units,
that is, what part of the image was actually being processed. Also, I had another idea in mind for the next
implementation, so I wanted this idea of work units to be reusable. To facilitate this, I created a class called
ImageWorkUnit with the following properties:
...
Now here is the new processing loop for the work unit based implementation:
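The processing loop itself did not survive in this copy, but the splitting idea can be sketched as follows (Python for illustration; the `ImageWorkUnit` field names are assumptions based on the description above, not the original class):

```python
# Divide an image into horizontal strips ("work units"), mirroring the
# Rectangle objects handed to LockBits. Field names are illustrative.
from dataclasses import dataclass

@dataclass
class ImageWorkUnit:
    x: int       # left edge of the strip
    y: int       # top edge of the strip
    width: int
    height: int

def split_into_work_units(width, height, parts):
    """Split the image into `parts` horizontal strips."""
    units = []
    base = height // parts
    top = 0
    for i in range(parts):
        # the last strip absorbs any leftover rows
        h = base if i < parts - 1 else height - top
        units.append(ImageWorkUnit(0, top, width, h))
        top += h
    return units
```

Each work unit can then be locked, processed, and unlocked independently, which is also what makes the next step (parallel processing) possible.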
Low-Pass and High-Pass Filters in Digital Image Processing
The differences between the spatial domain and the frequency domain:

Spatial domain:
- Based on the concept of row and column coordinates.
- Processing is pixel-by-pixel.
- Computation is slow (especially for images with large spatial dimensions).

Frequency domain:
- Based on the concept of frequency: pixel-to-pixel intensity changes (low and high frequencies).
- Processing is based on selecting which frequencies are filtered and which are not.
- Computation is relatively fast (especially for images with large spatial dimensions).
The basis for linear filtering in both the spatial and frequency domains is the convolution theorem, which can be written as:

f(x,y) * h(x,y) <=> F(u,v) H(u,v)

That is, convolution in the spatial domain gives the same result as multiplying F(u,v) by H(u,v) in the frequency domain.
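The theorem can be checked numerically with a short sketch (Python/NumPy for illustration; note that the DFT implements *circular* convolution):

```python
import numpy as np

# Circular convolution via the FFT: conv(f, h) == IFFT(F * H).
rng = np.random.default_rng(0)
f = rng.random((8, 8))
h = rng.random((8, 8))

# Spatial-domain circular convolution, computed directly.
direct = np.zeros_like(f)
for x in range(8):
    for y in range(8):
        for s in range(8):
            for t in range(8):
                direct[x, y] += f[s, t] * h[(x - s) % 8, (y - t) % 8]

# Frequency-domain product of the two spectra.
via_fft = np.real(np.fft.ifft2(np.fft.fft2(f) * np.fft.fft2(h)))

print(np.allclose(direct, via_fft))  # True
```

This is also why frequency-domain filtering is fast for large images: one multiplication per frequency replaces a full spatial convolution.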
Lowpass Filter
Low-pass filtering is a method for smoothing a signal or an image. Smoothing (blurring) is achieved in the frequency domain by attenuating the high frequencies. Smoothing can help remove noise, because noise/interference shows up as high-frequency content.
Ideal Lowpass Filter (ILPF)
A 2-D lowpass filter that passes without attenuation all frequencies inside a circle of radius D0 from the origin, and cuts off all frequencies outside that circle, is called the Ideal Lowpass Filter (ILPF). It is defined by the function:

H(u,v) = 1 if D(u,v) <= D0
H(u,v) = 0 if D(u,v) > D0

where D0 is a positive constant and D(u,v) is the distance between a point (u,v) in the frequency domain and the center of the frequency rectangle. For an image of size P x Q:

D(u,v) = [ (u - P/2)^2 + (v - Q/2)^2 ]^(1/2)
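A minimal sketch of the ILPF transfer function (Python/NumPy for illustration, assuming the centered-spectrum convention):

```python
import numpy as np

def ideal_lowpass(P, Q, D0):
    """H(u,v) = 1 where D(u,v) <= D0, else 0 (centered spectrum)."""
    u = np.arange(P).reshape(-1, 1)
    v = np.arange(Q).reshape(1, -1)
    # distance from the center of the P x Q frequency rectangle
    D = np.sqrt((u - P / 2) ** 2 + (v - Q / 2) ** 2)
    return (D <= D0).astype(float)
```

Applied as G(u,v) = H(u,v) F(u,v) on a shifted spectrum, this keeps everything inside the radius-D0 circle and zeroes everything outside it.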
Highpass Filtering
High-pass filtering is the opposite of low-pass filtering: it makes a signal or image less smooth by attenuating the low-frequency components in the frequency domain. High-pass filtering is commonly used for unsharp masking, deconvolution, edge detection, and reducing blur, at the cost of amplifying noise. (Morse, 2010)
Ideal Highpass Filter (IHPF)
The Ideal Highpass Filter passes all high frequencies and cuts off all low frequencies. The 2-D IHPF is written as:

H(u,v) = 0 if D(u,v) <= D0
H(u,v) = 1 if D(u,v) > D0

where D0 is a positive constant and D(u,v) is, as before, the distance between a point (u,v) in the frequency domain and the center of the P x Q frequency rectangle:

D(u,v) = [ (u - P/2)^2 + (v - Q/2)^2 ]^(1/2)
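Since the IHPF is simply the complement of the corresponding lowpass filter, it can be sketched directly (Python/NumPy for illustration, centered-spectrum convention assumed):

```python
import numpy as np

def ideal_highpass(P, Q, D0):
    """H(u,v) = 0 where D(u,v) <= D0, else 1 (centered spectrum)."""
    u = np.arange(P).reshape(-1, 1)
    v = np.arange(Q).reshape(1, -1)
    # distance from the center of the P x Q frequency rectangle
    D = np.sqrt((u - P / 2) ** 2 + (v - Q / 2) ** 2)
    return (D > D0).astype(float)
```

Everything inside the radius-D0 circle (including the DC term at the center) is removed, which is why the filtered image keeps only edges and fine detail.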