Image Processing Notes

0. MOTIVATION

Image processing is important across science, and in data science in
particular. Typical applications include digit recognition, convolutional neural
networks and generative adversarial networks. We will use NumPy and Matplotlib,
but libraries such as Pillow or OpenCV are good choices for going deeper into the
subject.

1. BASIC INVESTIGATION

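An image is an array of shape (Nx,Ny,3), where Nx and Ny are the number of
pixels along the first two dimensions and the third dimension holds the values of
the three color channels: red, green and blue. The values are in the range
[0,255]. We can load images in png or jpg format with 'plt.imread(file)'. Note
that Matplotlib returns png images as floats in the range [0,1] rather than as
integers in [0,255], and that a png with transparency has a fourth (alpha)
channel.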
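A minimal sketch of loading an image and inspecting it. So that it runs without an external file, it first writes a small synthetic image to a temporary png; the file name, the random pixel data and the 'Agg' backend are assumptions made for the example, not part of the original notes.

```python
import os
import tempfile

import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt

# Synthetic 4x6 RGB image standing in for a real photo (illustrative data only).
rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.png")  # hypothetical file name
    plt.imsave(path, original)
    loaded = plt.imread(path)

print(loaded.shape, loaded.dtype)

# Rescale to integers in [0,255] if the loader returned floats in [0,1].
if np.issubdtype(loaded.dtype, np.floating):
    as_uint8 = np.rint(loaded * 255).astype(np.uint8)
else:
    as_uint8 = loaded.astype(np.uint8)

# Keep only red, green and blue, dropping a possible alpha channel.
rgb = as_uint8[:, :, :3]
```

Checking 'shape' and 'dtype' right after loading is a cheap way to find out which of the two value ranges the rest of the code has to deal with.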

1.1. Plotting

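We display the image with 'plt.imshow(image)'. We always pass
'interpolation=None'. Note that in Matplotlib the Python value None selects the
default interpolation, while the string 'none' disables interpolation altogether.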
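A short sketch of the plotting call, assuming a small synthetic image and the 'Agg' backend so that it runs without a display (both are choices made for this example only).

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt

# Synthetic 4x6 RGB image (illustrative data only).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)

fig, ax = plt.subplots()
shown = ax.imshow(image, interpolation=None)  # the setting used in these notes
ax.set_axis_off()  # optional: hide the pixel-index ticks

# The image object keeps the array it displays.
displayed_shape = shown.get_array().shape
plt.close(fig)
```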

1.2. Histograms

Histograms show how the pixels are distributed in terms of intensity. To
create them, we split the array along its third dimension and ravel each color
channel into a one-dimensional array. We then pass the list of arrays to the
histogram function with the 'stacked=True' argument, so that the pixel counts of
the channels are placed on top of each other.
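The histogram recipe above can be sketched as follows. The random image, the choice of 32 bins, the bar colors and the 'Agg' backend are assumptions for the example.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt

# Synthetic RGB image (illustrative data only).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 48, 3), dtype=np.uint8)

# Split the third dimension and flatten each channel to 1D.
channels = [image[:, :, c].ravel() for c in range(3)]

fig, ax = plt.subplots()
counts, bin_edges, _ = ax.hist(
    channels,
    bins=32,
    range=(0, 256),
    stacked=True,  # draw the three channels on top of each other
    color=["red", "green", "blue"],
    label=["R", "G", "B"],
)
ax.set_xlabel("intensity")
ax.set_ylabel("number of pixels")
ax.legend()
plt.close(fig)
```

With 'stacked=True' the returned counts are cumulative across the channels, so the last row gives the total height of each stacked bar.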

1.3. Color and greyscales

Because an image is just an array, cropping it amounts to indexing the part
we want. We can also separate the color channels and plot each one on its own,
using 'cmap=' with 'Reds_r', 'Greens_r' or 'Blues_r' for the respective color.
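A minimal sketch of cropping and of plotting the channels separately. The synthetic image, the crop limits and the 'Agg' backend are assumptions for the example.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt

# Synthetic RGB image with 40 rows and 60 columns (illustrative data only).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(40, 60, 3), dtype=np.uint8)

# Cropping is plain slicing: rows 10..29 and columns 20..49, all channels.
crop = image[10:30, 20:50]

# Each channel is a 2D array of intensities.
red, green, blue = (image[:, :, c] for c in range(3))

fig, axes = plt.subplots(1, 3)
for ax, channel, cmap in zip(axes, (red, green, blue),
                             ("Reds_r", "Greens_r", "Blues_r")):
    # Fix the color scale to the full [0,255] range for every channel.
    ax.imshow(channel, cmap=cmap, vmin=0, vmax=255)
    ax.set_axis_off()
plt.close(fig)
```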

One way to get a greyscale image (among several possibilities) is luminance
preservation. With R, G and B the three color channels of the image, it uses the
following code:

R, G, B = image[:, :, 0], image[:, :, 1], image[:, :, 2]
pixels = np.array(0.299*R + 0.587*G + 0.114*B, dtype=np.uint8)
im_gs = np.stack([pixels, pixels, pixels], axis=2)
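As a quick check of the weights, the sketch below applies the same formula to four pure-color pixels. The helper name 'to_greyscale' and the test pixels are hypothetical, and the input is assumed to be a uint8 image with values in [0,255].

```python
import numpy as np

def to_greyscale(image):
    """Luminance-weighted greyscale of a uint8 RGB image (hypothetical helper)."""
    R, G, B = image[:, :, 0], image[:, :, 1], image[:, :, 2]
    # The weighted sum is a float; the cast to uint8 drops the fractional part.
    pixels = np.array(0.299 * R + 0.587 * G + 0.114 * B, dtype=np.uint8)
    # Repeat the grey values in all three channels so the result is still RGB.
    return np.stack([pixels, pixels, pixels], axis=2)

# One row of four pixels: red, green, blue, black.
palette = np.array(
    [[[255, 0, 0], [0, 255, 0], [0, 0, 255], [0, 0, 0]]], dtype=np.uint8
)
grey = to_greyscale(palette)
print(grey[0, :, 0])  # grey values: 76, 149, 29, 0
```

Green contributes the most and blue the least, so a pure green pixel ends up much lighter than a pure blue one.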

2. NUMERICAL OPERATIONS ON IMAGES

Since images are arrays, we can apply numerical operations to them. Let's see
what happens with different operations, even if the result does not always have a
clear meaning.

2.1. Addition and subtraction

We can try to add or subtract images. We have to make sure, though, that the
new pixel values stay inside the range [0,255]. Images are stored as uint8, which
only allows integers from 0 to 255, and arithmetic on uint8 wraps around instead
of going past those limits. For this reason the calculation is done in floats:
we convert with '.astype(float)', which lets us obtain values above 255 or below
0, and then limit them to the range so that very bright pixels stay bright
instead of wrapping around. Finally we convert the result back with
'.astype(np.uint8)' so that it has the right format. Addition gives a generally
brighter result and subtraction a darker one.
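A minimal sketch of the float round trip, assuming that out-of-range values are limited with 'np.clip' (the notes do not name the function used) and using two small constant images as illustrative data.

```python
import numpy as np

# Two constant images (illustrative data only).
a = np.full((2, 2, 3), 200, dtype=np.uint8)
b = np.full((2, 2, 3), 100, dtype=np.uint8)

# Naive uint8 addition wraps around: 200 + 100 = 300, stored as 300 - 256 = 44.
wrapped = a + b

# Float arithmetic can leave [0,255]; clip, then convert back to uint8.
added = np.clip(a.astype(float) + b.astype(float), 0, 255).astype(np.uint8)
subtracted = np.clip(b.astype(float) - a.astype(float), 0, 255).astype(np.uint8)

print(wrapped[0, 0, 0], added[0, 0, 0], subtracted[0, 0, 0])
```

The wrapped value shows why the detour through floats is needed: a sum that should be very bright comes out dark when it is computed directly in uint8.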

2.2. Modifying certain pixels

Now we want to modify the pixels that satisfy a certain condition. To do so,
we build a mask (an array of booleans) from the condition and use it to index
each color channel, which selects the matching pixels as a 1D array that we can
modify. Then we only have to stack the channels back together to rebuild the full
image.
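A minimal sketch of the masking steps. The condition (red value above 200), the replacement color and the synthetic image are hypothetical choices for the example.

```python
import numpy as np

# Small synthetic image: left half black, right half bright (illustrative data).
image = np.zeros((4, 6, 3), dtype=np.uint8)
image[:, 3:] = 220

# Work on copies of the channels so the original image is left untouched.
R, G, B = (image[:, :, c].copy() for c in range(3))

# Hypothetical condition: pixels whose red value exceeds 200.
mask = R > 200

# Paint the selected pixels pure blue, channel by channel.
R[mask] = 0
G[mask] = 0
B[mask] = 255

# Stack the channels back into a full (rows, columns, 3) image.
modified = np.stack([R, G, B], axis=2)
```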

2.3. Modifying regions

We can use a meshgrid to modify certain regions of the image. In particular,
it lets us build a boolean array that tells, for every pixel, whether or not it
satisfies the condition we want. For a circular region, the steps are:

- Create a meshgrid containing the indices of the pixels.
- With the xs and ys, create an array where each pixel holds its distance to
the center of the circle we want.
- Return the mask where the distance is lower than the chosen radius.

Then we just index the image with the mask to change those values.
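The three steps for the circle can be sketched as follows. The helper name 'circle_mask', the image size, the center, the radius and the fill color are all hypothetical; the center is given as (column, row).

```python
import numpy as np

def circle_mask(shape, center, radius):
    """Boolean mask of the pixels closer than radius to center (hypothetical helper)."""
    n_rows, n_cols = shape[:2]
    # Step 1: xs holds the column index and ys the row index of every pixel.
    xs, ys = np.meshgrid(np.arange(n_cols), np.arange(n_rows))
    # Step 2: distance of every pixel to the center of the circle.
    cx, cy = center
    distance = np.sqrt((xs - cx) ** 2 + (ys - cy) ** 2)
    # Step 3: True where the pixel lies inside the circle.
    return distance < radius

image = np.zeros((21, 31, 3), dtype=np.uint8)  # 21 rows, 31 columns, all black
mask = circle_mask(image.shape, center=(15, 10), radius=5)

# Index with the mask to change only the pixels inside the circle.
painted = image.copy()
painted[mask] = (255, 0, 0)
```

Because the mask has the same first two dimensions as the image, one assignment updates all three channels of the selected pixels at once.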

3. IMAGE FILTERS (NOT DONE, MAYBE WORTH LOOKING AT THE SECOND PART)

3.1. Kernels and image blocks versus windows

3.1.1. Intuitive but inefficient approach

3.1.2. Fast NumPy based approach

3.2. A few typical filters

3.2.1. Utility functions

3.2.2. Blurry filter

3.2.3. Edge detection

3.2.4. Sharpen filter
