Fruit Recognition On A Raspberry Pi

5/5/24, 19:30 Fruit Recognition on a Raspberry Pi - Real Time Applications and Software Techniques
Real Time Applications and Software Techniques

The Deep Learning HS-AlbSig Blog Site
Fruit Recognition on a Raspberry Pi

As an instructor, I offer a class in which master's students choose a topic for practical work during the semester. I usually give them a rough description; last time it included a Raspberry Pi, a camera, and a neural network.
Some students chose to work on fruit recognition with a camera. The scenario is the following: The camera is connected to a Raspberry Pi and observes a clear table. As soon as a user puts a fruit onto the table, the user can press a button on a shield attached to the Raspberry Pi. The button triggers the camera to take an image. The image is then fed into a trained neural network for image classification, and the resulting category is passed to a speech synthesizer, which speaks it out loud.
The type of neural network my students and I used is a multi-class neural network: an image goes in, and a single category comes out as the output.
Preparing the Data
In the beginning we chose fruit images from a database which is available on GitHub. You find it here. It had about 120 different categories of fruits and vegetables available. The problem we found with these images is that the fruits and vegetables look perfect, which is not the case in reality. The variation of fruit images within one category also seemed to be very limited. On the one hand, there are many images in each category; on the other hand, it looks as if all images of one category come from a single perfect fruit photographed in different positions.
The fruits also fill out the complete image. When you photograph a fruit on a table, this is generally not the case. The left part of Figure 1 shows an orange which fills only part of the image.
https://www3.hs-albsig.de/wordpress/point2pointmotion/2020/03/26/fruit-recognition-on-a-raspberry-pi/
What is more, the background of the images from the database is extremely bright. This is not a realistic background; we find that backgrounds are much darker when you take pictures inside a building. In Figure 2 you can see two different backgrounds, which are the surfaces of two different tables. Both have relatively low brightness.
Cropping the images
The first task was to prepare the data for training the neural network. We decided to crop the images to the size of the fruits, so that we get some kind of standardization of the images. Below you find the code which crops the images to the size of the fruit. In this case we have the fruit images inside the directory addfolder. Inside addfolder we first have two more directories, Testing and Training. Below these directories you find one directory per fruit. We limit the number of fruits to six. The fruits we use are listed in dirlist, whose entries are also the directory names.
The code iterates through the Testing and Training directories and the fruit directories in dirlist and loads every image with the OpenCV function imread. It converts the loaded image to grayscale and filters it with the OpenCV threshold function. After this we apply the findContours function, which returns a list of contours of the image. The contour with the second largest area is taken (the largest contour has the size of the image itself), and the position, width and height of its bounding rectangle are retrieved with boundingRect. This second largest contour is the fruit portion of the image. The application copies a square at the position of the second largest contour from the original image, resizes it to 100×100 pixels and saves it into a new directory, destfolder.
import glob
import os

import cv2

srcfolder = '/home/inf/Bilder/Scale/orig/'
destfolder = '/home/inf/Bilder/Scale/cropped/'
addfolder = '/home/inf/Bilder/Scale/added/'
processedfolder = '/home/inf/Bilder/Scale/processed/'

dirtraintest = ['Testing', 'Training']
dirlist = ['Apfel', 'Gurke', 'Kartoffel', 'Orange', 'Tomate', 'Zwiebel']

pattern = "*.jpg"
img_size = (100, 100)

for traintest in dirtraintest:
    for fruit in dirlist:
        count = 0
        for file in glob.glob(os.path.join(addfolder, traintest, fruit, pattern)):
            im = cv2.imread(file)
            imgray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
            ret, thresh = cv2.threshold(imgray, 127, 200, 0)
            # OpenCV 4 returns two values here (OpenCV 3.x returns three)
            contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
            if len(contours) > 1:
                # Sort by area: the largest contour is the image itself, the second largest the fruit
                cnt = sorted(contours, key=cv2.contourArea)
                x, y, w, h = cv2.boundingRect(cnt[-2])
                # Use a square whose side is the longer side of the bounding box
                w = max((h, w))
                h = w
                crop_img = im[y:y+h, x:x+w]
                im = cv2.resize(crop_img, img_size)
                cv2.imwrite(os.path.join(destfolder, traintest, fruit, "cropped_img_" + str(count) + ".jpg"), im)
                count += 1
Figure 1 shows how the application crops an image of an orange. On the left side, the orange fills only part of the image. On the right side, the orange fills the complete image.
Figure 1: Original Image and Cropped Image
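One detail of the cropping listing is worth a small illustration: the square of side max(w, h) starts at the top-left corner of the bounding box, so for a fruit close to the right or bottom edge the slice is silently clipped by NumPy and is no longer square before it is resized. The following is a minimal sketch of one way to keep the square inside the image. The helper square_crop_box is hypothetical and not part of the original code, and it centres the square on the bounding box, which the original does not do.

```python
def square_crop_box(x, y, w, h, img_w, img_h):
    """Return (x0, y0, side) of a square crop that stays inside the image.

    Hypothetical helper for illustration; not part of the original code.
    (x, y, w, h) is a bounding box as returned by cv2.boundingRect.
    """
    # The square cannot be larger than the image itself.
    side = min(max(w, h), img_w, img_h)
    # Centre the square on the bounding box ...
    cx = x + w // 2
    cy = y + h // 2
    # ... then shift it back so that it lies completely inside the image.
    x0 = min(max(cx - side // 2, 0), img_w - side)
    y0 = min(max(cy - side // 2, 0), img_h - side)
    return x0, y0, side


# A wide box touching the right edge of a 100x100 image
box_near_edge = square_crop_box(80, 10, 40, 20, 100, 100)
# A box well inside the image
box_inside = square_crop_box(10, 10, 20, 30, 100, 100)
```

The returned values could then be used as im[y0:y0+side, x0:x0+side] in place of the slice in the listing.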
Changing the backgrounds
Because of the extremely bright background of the images from the database, we decided to replace the bright backgrounds with new ones. In Figure 2, you can see two different table surfaces, photographed with the camera we used.
Figure 2: Backgrounds
The code below shows how each image from the directory structure (which I explained above) is loaded into the variable pixels with the OpenCV imread function. Each pixel is checked on all three color channels to see whether it reaches a brightness threshold. We assume that a pixel reaching the threshold on every channel is a background pixel (which is not always the case). The application then replaces that pixel with the pixel at the same position in a background image such as those shown in Figure 2. It saves the new image to the directory processedfolder.
# Continues the cropping listing: cv2, glob, os and the folder variables are defined there
background = cv2.imread("background.jpg")

# Scale the background to the size of the cropped fruit images
bg = cv2.resize(background, img_size)
bgData = bg.copy()

threshold = (100, 100, 100)

for traintest in dirtraintest:
    for fruit in dirlist:
        count = 0
        for name in glob.glob(os.path.join(destfolder, traintest, fruit, pattern)):
            # glob already returns the full path of each file
            pixels = cv2.imread(name)
            pixelsData = pixels.copy()

            for i in range(pixels.shape[0]):  # for every pixel:
                for j in range(pixels.shape[1]):
                    # A pixel that is bright on all three channels is treated as background
                    if (pixelsData[i, j][0] >= threshold[0] and
                            pixelsData[i, j][1] >= threshold[1] and
                            pixelsData[i, j][2] >= threshold[2]):
                        pixelsData[i, j] = bgData[i, j]
            cv2.imwrite(os.path.join(processedfolder, traintest, fruit, "processed_img_" + str(count) + ".jpg"), pixelsData)
            count += 1
Figure 3 shows the output of the code above for two images: the same orange with two different backgrounds.
Figure 3: Orange with two different Backgrounds
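The per-pixel loop of the background replacement is easy to follow but slow in pure Python. As a rough sketch of an alternative, the same rule (a pixel counts as background if every channel reaches the threshold) can be written with a NumPy boolean mask. The function name replace_bright_pixels and the tiny 2×2 test image are made up for this illustration; they are not part of the original code.

```python
import numpy as np


def replace_bright_pixels(image, background, threshold=(100, 100, 100)):
    """Copy background pixels to every position where all channels reach the threshold.

    Hypothetical helper; image and background must have the same shape (H, W, 3).
    """
    limits = np.array(threshold, dtype=image.dtype)
    # True where all three channels are at or above their threshold
    mask = np.all(image >= limits, axis=2)
    result = image.copy()
    result[mask] = background[mask]
    return result


# Tiny synthetic example: two bright pixels and two pixels with one dark channel
image = np.array([[[255, 255, 255], [20, 120, 200]],
                  [[100, 100, 100], [250, 250, 90]]], dtype=np.uint8)
background = np.full((2, 2, 3), 30, dtype=np.uint8)
out = replace_bright_pixels(image, background)
```

Only the two pixels that are bright on all three channels are replaced; the other two keep their original values.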
Training the Model
Below is the code of the neural network model. It consists of four convolutional layers. The number of filters increases with each layer. After each convolutional layer there is a max pooling layer to reduce the image size for the input of the following layer. A flatten layer follows and is fed into a dense layer. Finally there is another dense layer with six neurons, one for each of our categories. Each layer uses the ReLU activation function, except the last layer, where we use the softmax activation function. The reason for softmax, and not sigmoid, is that we expect exactly one of the six categories to be true for a given input image. The predicted category is the output neuron with the highest value. For optimization, we use stochastic gradient descent with momentum.
model = Sequential()
model.add(Conv2D(16, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same', input_shape=input_shape))
model.add(MaxPooling2D((2, 2)))
# [Listing incomplete in the source: the opening of the next Conv2D call is missing;
#  only its continuation "kernel_initializer='he_uniform', padding='same'))" survives.]
model.add(Dropout(0.1))
# [Listing incomplete in the source: original lines 7 to 12 are missing.]
model.add(Flatten())
model.add(Dense(256, activation='relu', kernel_initializer='he_uniform'))
# [Original line 15 is missing in the source.]
model.add(Dense(6, activation='softmax'))

# Newer Keras versions call the lr argument learning_rate
opt = SGD(lr=0.001, momentum=0.9)
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
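To make the choice of softmax over sigmoid in the last layer more concrete, here is a small NumPy sketch. The six raw output values (logits) are made up for illustration. Softmax turns them into a probability distribution that sums to one, which fits the assumption that exactly one category is true, while sigmoid scores each output independently.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = np.asarray(z, dtype=np.float64)
    e = np.exp(z - z.max())
    return e / e.sum()


def sigmoid(z):
    """Element-wise logistic function."""
    z = np.asarray(z, dtype=np.float64)
    return 1.0 / (1.0 + np.exp(-z))


# Made-up raw outputs of the last layer for the six categories
logits = np.array([2.0, 0.5, -1.0, 0.0, 3.0, 1.0])

p_soft = softmax(logits)   # competes across categories, sums to 1
p_sig = sigmoid(logits)    # independent scores, sum is not constrained

predicted_index = int(np.argmax(p_soft))
```

Both functions rank the outputs in the same order, but only the softmax values can be read as probabilities over mutually exclusive categories, which is also what the categorical_crossentropy loss expects.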
We load all training and validation images from the directories train_path and valid_path with the Keras ImageDataGenerator. The ImageDataGenerator rescales the images and, for the training data, augments them by shifting and flipping. The method flow_from_directory returns the iterators train_it and valid_it, which deliver the images in batches. The method makes this task easy since it also takes the directory structure below train_path and valid_path into account and derives the labels from it. In our case, we have the directories Apfel, Gurke, Kartoffel, Orange, Tomate and Zwiebel below train_path and valid_path. In each of these directories you find the corresponding images (all apple images in directory Apfel, all cucumber images in directory Gurke, etc.).
train_datagen = ImageDataGenerator(rescale=1.0/255.0, width_shift_range=0.1, height_shift_range=0.1, horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1.0/255.0)

train_it = train_datagen.flow_from_directory(train_path, class_mode='categorical', batch_size=64, target_size=image_size)
valid_it = test_datagen.flow_from_directory(valid_path, class_mode='categorical', batch_size=64, target_size=image_size)
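The Keras documentation states that flow_from_directory infers the classes from the subdirectory names and assigns the label indices in alphanumeric order. The following pure-Python sketch mimics that documented ordering for our six directories; it is not Keras code, and the helper name class_indices_from_dirs is made up for this illustration.

```python
def class_indices_from_dirs(dir_names):
    """Map directory names to label indices in sorted (alphanumeric) order.

    Hypothetical helper that imitates the ordering documented for
    flow_from_directory; it does not call Keras.
    """
    return {name: index for index, name in enumerate(sorted(dir_names))}


# The order in which the directories are listed does not matter
dirs = ['Zwiebel', 'Apfel', 'Tomate', 'Gurke', 'Orange', 'Kartoffel']
class_indices = class_indices_from_dirs(dirs)

# Inverse mapping, useful to turn a predicted index back into a name
index_to_label = {index: name for name, index in class_indices.items()}
```

With this ordering, output neuron 0 of the network corresponds to Apfel and output neuron 5 to Zwiebel.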
The training is started with the Keras fit_generator command. It uses the iterators train_it and valid_it as inputs. We defined a list of callbacks: EarlyStopping, ReduceLROnPlateau, and ModelCheckpoint, which saves a checkpoint of the neural network weights each time the validation loss improves.
callbacks = [
    EarlyStopping(patience=10, verbose=1),
    ReduceLROnPlateau(factor=0.1, patience=3, min_lr=0.00001, verbose=1),
    # Save only the weights, and only when the validation loss improves
    ModelCheckpoint('modelmulticat.h5', verbose=1, save_best_only=True, save_weights_only=True)
]

# fit_generator and evaluate_generator are deprecated in newer Keras versions,
# where fit and evaluate accept generators directly
history = model.fit_generator(train_it, steps_per_epoch=len(train_it), validation_data=valid_it, validation_steps=len(valid_it), epochs=10, callbacks=callbacks, verbose=1)

_, acc = model.evaluate_generator(valid_it, steps=len(valid_it), verbose=0)
print('> %.3f' % (acc * 100.0))

# Save the model structure (without the weights) as JSON
model_json = model.to_json()
with open("modelmulticat.json", "w") as json_file:
    json_file.write(model_json)
Finally the structure of the trained model is saved to a JSON file.
The training time with this model is about three minutes on an NVIDIA graphics card. Altogether, we use about 6000 images for training and 2000 images for validation. The validation accuracy was 96%, which was above the training accuracy and indicates slight underfitting.
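A side note on the callback settings in the training listing: EarlyStopping is configured with patience=10 while the training runs for epochs=10. The sketch below is a simplified model of the patience rule (stop after a given number of consecutive epochs without a new best validation loss). It is not the Keras implementation and ignores options such as min_delta; the loss values are made up.

```python
def early_stop_epoch(val_losses, patience):
    """Return the 0-based epoch at which training would stop, or None.

    Simplified model of patience-based early stopping, for illustration only.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            # New best value: remember it and reset the counter
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Worst case for 10 epochs: the loss never improves after the first epoch
losses = [1.0] + [1.1] * 9

stop_with_patience_10 = early_stop_epoch(losses, patience=10)
stop_with_patience_3 = early_stop_epoch(losses, patience=3)
```

Under this rule, ten epochs can accumulate at most nine epochs without improvement, so a patience of 10 never stops a 10-epoch run early. A smaller patience or a larger number of epochs would be needed for early stopping to take effect.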
Testing the Model
We tested the model with the code below. First, we loaded the image into the variable img with the OpenCV function imread. Right after this, we have to take care of the color channels. OpenCV stores the channels in BGR order, whereas the images that Keras loaded for training are in RGB order, so the red and blue channels are switched. For this reason, we apply the cvtColor function, which swaps the red and blue channels. The image is then normalized by dividing its pixel values by 255. Finally the predict method is used to classify the image. Figure 4 shows an example of an input image, which is displayed by the matplotlib function imshow. The method predict returns a probability vector, predictions. The index with the highest value of the vector corresponds to the category. The category name can be retrieved from class_indices, a dictionary that maps category names to indices.
img = cv2.imread(os.path.join(valid_path, "Apfel/cropped_img_592.jpg"), 1)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
imshow(img)
img = np.array(img, dtype=np.float32)
img *= 1.0/255.0
# predict expects a batch, so add a leading batch dimension
predictions = model.predict(np.expand_dims(img, axis=0))
print(predictions)
# Index of the highest probability
result = int(np.argmax(predictions[0]))
# class_indices maps names to indices; its keys are in index order
print(list(valid_it.class_indices)[result])
We tested a few times with different images and saw that the prediction delivered pretty good results.
Figure 4: Prediction Image
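The preprocessing steps before the prediction (channel swap, scaling, batch dimension) can also be shown without OpenCV or Keras. The following NumPy-only sketch applies them to a single made-up pixel; the function name preprocess_bgr is hypothetical. Reversing the last axis has the same effect on a three-channel image as cvtColor with COLOR_BGR2RGB.

```python
import numpy as np


def preprocess_bgr(img_bgr):
    """Turn a BGR uint8 image of shape (H, W, 3) into a model input batch.

    Hypothetical helper for illustration; returns float32 of shape (1, H, W, 3).
    """
    # Reverse the channel axis: BGR -> RGB
    rgb = img_bgr[..., ::-1]
    # Scale the pixel values from 0..255 to 0..1
    scaled = rgb.astype(np.float32) / 255.0
    # Add the batch dimension expected by predict
    return np.expand_dims(scaled, axis=0)


# One made-up pixel in BGR order: blue=255, green=0, red=51
pixel = np.array([[[255, 0, 51]]], dtype=np.uint8)
batch = preprocess_bgr(pixel)
```

After the call, the red value comes first (51/255 = 0.2) and the blue value last (1.0), and the array has a leading batch dimension of one.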
The Raspberry Pi application
The setup of the experiment is shown in Figure 5. The Raspberry Pi 4, a power supply and a socket are mounted on a top-hat rail. On the Raspberry Pi you see a PiFace shield attached. The shield had to be mechanically modified to fit on a Raspberry Pi 4. The shield provides buttons in case they are needed. Additionally we have a relay and a power socket. The relay can be triggered by the PiFace, so that the relay applies 230 V to the socket. On top of the construction you find a USB camera.
Figure 5: Experiment Setup
We defined a function getCrop, see code below, which determines the square region of the image that contains the fruit. This procedure was already explained above. Here we introduced the variable threshset, which lets the user modify the threshold value of the OpenCV threshold function using keys. This is explained later.
threshset = 100

def getCrop(im):
    global threshset
    imgray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
    ret, thresh = cv2.threshold(imgray, threshset, 255, 0)
    contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    if len(contours) >= 1:
        # Check the contours from the largest to the smallest area
        cnts = sorted(contours, key=cv2.contourArea, reverse=True)
        for cnt in cnts:
            x, y, w, h = cv2.boundingRect(cnt)
            # Accept a box between 20% and 95% of the frame size.
            # im.shape is (height, width, channels), so the width is compared with
            # shape[1] and the height with shape[0] (swapped in the original listing).
            if w > im.shape[1]*20//100 and w < im.shape[1]*95//100:
                if h > im.shape[0]*20//100 and h < im.shape[0]*95//100:
                    w = max((h, w))
                    h = w
                    return x, y, w
    # No suitable contour found
    return 0, 0, 0
In the beginning we faced the problem that the neural network did not predict very well because we had too few training images. Therefore we introduced a function to easily save badly predicted images. The name of the function is saveimg. It simply saves an image img to a directory whose path contains the parameters dircat and fruit. The image name contains the date and the time.
from datetime import datetime

def saveimg(img, dircat, fruit):
    global croppedfolder
    now = datetime.now()
    # Timestamp with a resolution of one second, used as part of the file name
    dt_string = now.strftime("%d_%m_%Y_%H_%M_%S")
    resized = cv2.resize(img, image_size, interpolation=cv2.INTER_AREA)
    cv2.imwrite(os.path.join(croppedfolder, dircat, fruit, "img_" + dt_string + ".jpg"), resized)
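Since saveimg names the files with a timestamp at a resolution of one second, two images saved within the same second would get the same name, and the second would overwrite the first. A possible variation, shown here only as a sketch, adds the microseconds to the name. The helper image_filename and its parameters are made up for this illustration.

```python
from datetime import datetime


def image_filename(now=None, prefix="img_"):
    """Build a file name from a timestamp including microseconds.

    Hypothetical helper; passing now explicitly makes the result reproducible.
    """
    if now is None:
        now = datetime.now()
    # %f appends the microseconds as six zero-padded digits
    return prefix + now.strftime("%d_%m_%Y_%H_%M_%S_%f") + ".jpg"


# Fixed example timestamp: 26 March 2020, 14:05:09.123456
name = image_filename(datetime(2020, 3, 26, 14, 5, 9, 123456))
```

For the example timestamp, the name is img_26_03_2020_14_05_09_123456.jpg.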
Below you find the Raspberry Pi application code. In the beginning it sets up the OpenCV video capture. Inside the while loop, an image frame is read from the USB camera and copied into the image objectfr. The function getCrop is used to get the fruit portion of the image, and a rectangle is drawn around it. The function putText also writes the current value of threshset into the image objectfr. The application then shows the modified image on a display, see Figure 6. The OpenCV function waitKey checks for a pressed key. If a key was pressed, the code belonging to that key is executed.
cam = cv2.VideoCapture(0)
cv2.namedWindow("object")
img_counter = 0  # incremented below but not initialised in the original listing

while True:
    ret, frame = cam.read()
    if not ret:
        print("cam.read something wrong")
        break
    objectfr = frame.copy()
    x, y, w = getCrop(objectfr)
    cv2.rectangle(objectfr, (x, y), (x+w, y+w), (0, 255, 0), 1)
    cv2.putText(objectfr, "thresh: {}".format(threshset), (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 1, cv2.LINE_AA)
    cv2.imshow("object", objectfr)
    key = cv2.waitKey(1) & 0xFF
    if key == ord('q'):
        break
    elif w == 0 and key in (ord('n'), ord('a'), ord('z'), ord('o'), ord('k')):
        # getCrop found no fruit region, so there is nothing to classify or save
        continue
    elif key == ord('n'):
        resized = cv2.resize(frame[y:y+w, x:x+w, :], image_size, interpolation=cv2.INTER_AREA)
        cv2.imwrite("checkpic.jpg", resized)
        resized = cv2.cvtColor(resized, cv2.COLOR_BGR2RGB)
        resized = np.array(resized, dtype=np.float32)
        resized *= 1.0/255.0
        # predict expects a batch, so add a leading batch dimension
        predictions = model.predict(np.expand_dims(resized, axis=0))
        print(predictions)
        result = int(np.argmax(predictions[0]))
        print(result)
        label = list(valid_it.class_indices)[result]
        print(label)
        # Speak the category name with the German espeak voice
        os.system("espeak -vde {}".format(label))
    elif key == ord('a'):
        saveimg(frame[y:y+w, x:x+w, :], "Training", "Apfel")
        img_counter += 1
    elif key == ord('z'):
        saveimg(frame[y:y+w, x:x+w, :], "Training", "Zwiebel")
    elif key == ord('o'):
        saveimg(frame[y:y+w, x:x+w, :], "Training", "Orange")
    elif key == ord('k'):
        saveimg(frame[y:y+w, x:x+w, :], "Training", "Kartoffel")
    elif key == ord('+'):
        threshset += 5
        if threshset > 255:
            threshset = 255
    elif key == ord('-'):
        threshset -= 5
        if threshset < 0:
            threshset = 0

cam.release()
cv2.destroyAllWindows()
If the key ‘q’ is pressed, the application stops. If the key ‘n’ is pressed, the image inside the rectangle is taken and the category is predicted with the Keras predict method. The category name is handed over to the espeak application, which speaks it out on the speaker attached to the Raspberry Pi. The keys ‘a’, ‘z’, ‘o’ and ‘k’ execute the saveimg function with different parameters. The purpose of these keys is that the user can save an image in case of a bad prediction. The next time the model is trained, the saved image will be included in the training data. Finally we have the ‘+’ and ‘-’ keys, which modify the threshset value. The effect is that the rectangle (Figure 6, green rectangle) grows or shrinks, depending on how the shadow on the background is thresholded.
Figure 6: Displayed Image
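The key handling described above follows two simple rules: some keys select a category directory for saving, and the ‘+’ and ‘-’ keys change the threshold in steps of 5 within the range 0 to 255. As a small sketch, both rules can be isolated into functions that are testable without a camera. The names SAVE_KEYS, category_for_key and adjust_threshold are made up for this illustration; the table contains only the four save keys that appear in the listing.

```python
# Save keys shown in the application listing and their target directories
SAVE_KEYS = {'a': 'Apfel', 'z': 'Zwiebel', 'o': 'Orange', 'k': 'Kartoffel'}


def category_for_key(key):
    """Return the directory name for a save key, or None for any other key."""
    return SAVE_KEYS.get(key)


def adjust_threshold(value, key, step=5):
    """Apply a '+' or '-' key press and clamp the result to the range 0..255."""
    if key == '+':
        value += step
    elif key == '-':
        value -= step
    return max(0, min(255, value))


threshset = 100
threshset = adjust_threshold(threshset, '+')   # one step up
near_top = adjust_threshold(253, '+')          # clamped at the upper limit
near_bottom = adjust_threshold(2, '-')         # clamped at the lower limit
```

A table like SAVE_KEYS would also make it easy to add keys for the remaining categories, Gurke and Tomate, for which the listing shows no save key.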
Conclusion
The application works amazingly well for the few fruits it has to predict, considering the relatively small amount of training data. In the beginning we had to retrain the model a couple of times with newly generated images, using the application keys described above.
As soon as we take, e.g., an apple with different colors, there is a high chance that the prediction fails. In such cases we have to take more images and retrain again.
Acknowledgement
Thanks to Carmen Furch and Armin Weisser for providing the data preparation code and the Raspberry Pi application.
Also special thanks to the University of Applied Science Albstadt-Sigmaringen for offering a classroom and appliances to enable this research.
26. March 2020 | DocD | AI, Classification, Deep Learning, KI, Neural Network