
Animal Prediction and Classification Using SVM

Abstract

The major purpose of this study is to create a system that can recognize photos of cats and dogs. After reviewing a provided image, the system predicts its class. The concept may be expanded to a website or any mobile device as needed. A Dogs vs. Cats database may be found on the Kaggle website; pictures of cats and dogs make up the database. The model's main purpose is to learn a range of cat and dog characteristics, and once model training is complete, the model can tell the difference between photographs of cats and dogs. The Dogs vs. Cats database is a popular computer vision database for categorizing photographs as containing either a dog or a cat. Although the problem looks simple, it has only lately been addressed effectively, using deep convolutional neural networks. And although the database is effectively solved, it may still be used to study and practice how to develop, evaluate, and apply deep convolutional neural networks for image classification.

Introduction

Animal identification and categorization is an essential topic that has received little attention. Classifying photographs of different animal species is a straightforward task for humans, but the data reveals that even in simple cases like cats and dogs, automatic categorization is challenging. Animals have a flexible structure that allows them to disguise themselves, and they frequently occur in intricate scenes. They may also appear under varied lighting conditions, angles, and scales, as do all objects. There have been attempts to apply identification algorithms to animal photos, but the topic of animal classification has only lately piqued interest. Many existing approaches for human face recognition that show promise cannot adequately capture the range of animal groups, with their complicated intraclass variability and interclass similarities. There are various techniques for tackling this challenge, each with its own set of benefits and drawbacks. The first method creates complex features that better represent and differentiate the sample pictures; however, constructing such a feature is difficult and problem-dependent. The second technique concatenates the features retrieved by several approaches to create a more powerful feature vector. This study also covers how to create a rigorous test harness to assess model performance, how to explore model improvements, and how to save a model and load it later to generate predictions on fresh data.

The Dogs vs. Cats data set refers to the data used in the 2013 Kaggle machine learning competition. Photos of dogs and cats are included in the database, drawn from a library of 3 million manually labeled images. The database was created through a collaboration between Petfinder.com and Microsoft, to serve as a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart): a seemingly simple task that cannot be performed mechanically and is used on websites to distinguish between human and bot users. The project was dubbed "Asirra" (Animal Species Image Recognition for Restricted Access), a form of CAPTCHA. This paper also shows how to create a convolutional neural network that can distinguish between photos of dogs and cats. A Convolutional Neural Network (CNN) is an algorithm that takes an image as input and assigns learnable weights and biases to different parts of the picture so that they can be separated from one another. Collections of photos, each with a label identifying the true class (cat or dog here), may be used to train such networks. A collection might contain from ten to hundreds of photographs. Across the collection, the network's prediction is compared to the label attached to each image, and the gap between the network's guess and reality is calculated. The network parameters are then tweaked to shorten that distance, increasing the network's predictive capacity. The training procedure is carried out consistently across the collections.
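This predict/compare/update loop is what a deep learning framework automates. The following minimal sketch (our own illustration, not the exact setup used later in the paper; the tiny architecture and the random stand-in data are assumptions) shows the loop in Keras:

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Stand-in for a labeled photo collection (cat = 0, dog = 1).
images = np.random.rand(32, 200, 200, 3).astype("float32")
labels = np.random.randint(0, 2, size=(32,))

model = keras.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(200, 200, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(1, activation="sigmoid"),  # predicted probability of "dog"
])

# The loss measures the gap between prediction and label; the optimizer
# tweaks the network parameters to shrink that gap, batch after batch.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(images, labels, epochs=2, batch_size=8)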
data. As a result, in this activity, it is common to attain
Background /related work:
roughly 80% accuracy with a convolutional neural
The best previous performance was attained by an SVM, presented in the article "Machine learning attacks against the Asirra CAPTCHA" [25] shortly after the competition was advertised. That article demonstrated that soon after the project was proposed, the task was no longer a suitable CAPTCHA: it describes a classifier with 82.7 percent accuracy in categorizing the photos of cats and dogs used in Asirra, built from support vector machines that use color and texture features extracted from the photos. The findings imply that relying on Asirra alone for defense is a bad idea.

The authors of Asirra [19] introduced it as a CAPTCHA that asks visitors to identify the cats in a batch of 12 photos that includes both cats and dogs. Asirra is simple for users; user surveys reveal that it can be done in less than 30 seconds 99.6% of the time. Barring big improvements in machine vision, computers were anticipated to have no better than a 1 in 54,000 probability of solving it. Asirra's photo supply benefits from a unique and mutually beneficial partnership with Petfinder.com: in exchange for the use of their photos, an "adopt me" link is shown beneath each of their 3 million photographs, furthering Petfinder's core mission of finding homes for unwanted animals. The Asirra paper describes the system's design, analyzes security issues, and reports on early deployment experience. It also describes two unique methods that may be applied to many existing CAPTCHAs to widen the skill gap between people and machines.

The Kaggle contest featured 25,000 labeled photographs: 12,500 dogs and 12,500 cats. Predictions were then required on a test database of 12,500 unlabeled photos. Pierre Sermanet (now a research scientist at Google Brain) won the competition with a classification accuracy of 98.914 percent on a 70% subsample of the test database. His method was later described in "OverFeat: Integrated Recognition, Localization, and Detection Using Convolutional Networks," a 2013 article. The database is simple and small enough to fit in memory. As a result, it has become a "hello world" or starter data set for beginners working with convolutional neural networks. On this task, it is common to attain roughly 80% accuracy with a convolutional neural network designed from scratch and 90%+ accuracy using transfer learning.

The Dataset

On this premise, Microsoft Research (MSR) presented the challenge of distinguishing cats from dogs as a test for distinguishing people from robots, and the ASIRRA test [19] was devised. The assumption is that, out of a collection of twelve photographs of pets, any machine would guess the family of at least one of them erroneously, but humans would not. The ASIRRA test has been used to safeguard a number of websites against uninvited bot access. However, the accuracy of the classifier available to a bot determines the test's reliability: if the classifier has 95% accuracy, for example, the bot will fool the ASIRRA test around half of the time (0.95^12 ≈ 54%). The whole MSR ASIRRA system is built on a database of millions of pet photos, evenly split between cats and dogs. Classifiers are put to the test on the 24,990 photos that have been made publicly available for research and assessment. About 2,000 to 2,500 photos were acquired from various data sources for each of the 37 breeds to generate a pool of candidates for inclusion in the collection. Images were removed from the candidate list if any of the following conditions were met, as determined by the annotators: (i) the image was grayscale, (ii) another image depicting the same animal existed (which happens frequently on Flickr), (iii) the lighting was poor, (iv) the pet was not centered in the image, or (v) the pet was dressed. Mistakes in the breed designations were nevertheless the most prevalent fault across all data sources; as a result, human annotators examined and corrected the labels wherever feasible.

Evaluation protocol. There are three tasks: pet family classification (cat versus dog, a two-class problem), breed classification given the family (a 12-class problem for cats and a 25-class problem for dogs), and joint breed and family classification (a 37-class problem). In all cases, performance is assessed by the average per-class classification accuracy. This is the fraction of properly categorized pictures in each class, calculated as the average of the diagonal of the row-normalized confusion matrix. A random classifier's average accuracy is therefore 1/2 = 50% for the family classification task and 1/37 ≈ 3% for breed and family classification. Algorithms are trained on the train subset and evaluated on the test subset.
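To make the metric concrete, here is a small sketch (our own illustration) of the average per-class accuracy as the mean of the diagonal of the row-normalized confusion matrix:

import numpy as np

def average_per_class_accuracy(confusion: np.ndarray) -> float:
    """Mean of the diagonal of the row-normalized confusion matrix.

    confusion[i, j] counts images of true class i predicted as class j.
    """
    row_sums = confusion.sum(axis=1, keepdims=True)
    normalized = confusion / row_sums           # each row now sums to 1
    return float(np.mean(np.diag(normalized)))  # per-class accuracy, averaged

# Toy two-class example (family classification, cat vs dog):
confusion = np.array([[45, 5],    # 45 of 50 cats classified correctly
                      [10, 40]])  # 40 of 50 dogs classified correctly
print(average_per_class_accuracy(confusion))  # (0.9 + 0.8) / 2 = 0.85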
A breed discrimination model

A pet's size, shape, fur type, and color are all influenced by its breed. Because measuring a pet's size from a photograph without an absolute reference is impossible, our model focuses on capturing the pet's shape and the appearance of its fur. Automatically segmenting the pet from the image background is also part of the model. Two variants are then considered and compared: the flat approach, in which both feature types are used to predict the pet's family and breed simultaneously, and the hierarchical approach, in which the family is determined first from the shape features alone, and appearance is then used to predict the breed conditioned on the family. In order to fit the model to a picture, the animal must be separated from its surroundings.

Shape Model

We use the deformable part model of [23] to represent shape. In this model, an object is formed by connecting a root part to eight smaller parts at a finer scale via springs. Each part's appearance is represented by a HOG filter, which captures the local distribution of edges in the picture; inference (detection) uses dynamic programming to find the optimal balance between accurately matching each part to the image and not over-deforming the springs. This model, while strong, falls short of accurately representing the flexibility and diversity of a pet's body. Examining the performance of this detector on cats and dogs in the recent PASCAL VOC 2011 challenge data demonstrates this: the deformable parts detector achieves an Average Precision (AP) of only 31.7 percent and 22.1 percent on cats and dogs, respectively, whereas an easier category such as bicycle has an AP of 54 percent. In the PASCAL VOC challenge, however, the goal is to detect the animal's whole body. We instead apply the deformable part model to detect specific stable and distinctive body components. The head annotations in the Pet data are used to learn deformable part models of the cat and dog faces, respectively ([24, 29, 45] also focus on modeling the faces of pets).

Appearance Model

The visual vocabulary is learned by running k-means on features randomly sampled from the training data. The quantized SIFT features are aggregated into a spatial histogram with a dimension equal to 4,000 times the number of spatial bins, yielding an image descriptor. The histograms are then normalized and used to classify the data with a support vector machine (SVM) based on the exponential chi-squared kernel.

Other forms of spatial histograms may be created by aligning the spatial bins with specific geometric elements of the pet. Image descriptors were computed using three alternative spatial layouts; in each case, the picture descriptor is produced by concatenating the histograms computed on the separate spatial components of the layout. (In the corresponding figure, yellow-black lines mark the spatial bins, along with additional bins computed on the foreground object region, its complement, and the region of the pet head.)

The layout of the image

This layout consists of five spatial bins, arranged as a 1x1 and a 2x2 grid covering the whole picture area. It yields a 20,000-dimensional feature vector.

The layout of the image and head

This layout adds to the image layout a spatial bin in correspondence with the head bounding box (as identified by the deformable part model of the pet face), as well as one for the complement of this box. Neither of these two regions is subdivided further. Concatenating the histograms for all of the spatial bins in this layout yields a 28,000-dimensional feature vector.

Layout consisting of an image, head, and a body

This layout combines the spatial tiles of the image layout with the head bin and additional bins computed on the foreground object region. The foreground region is obtained either by automatic pet body segmentation or from ground-truth segmentation, the latter providing a best-case baseline. As with the image layout, the foreground region is divided into five spatial bins. One more bin is obtained by removing the head area from the foreground region, with no further spatial subdivisions. Concatenating the histograms for all of the spatial bins in this layout yields a 48,000-dimensional feature vector.
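The following sketch (our own illustration, not code from the paper) shows the two ingredients just described: concatenating per-bin histograms into one descriptor, and an exponential chi-squared kernel that can be passed to an SVM as a precomputed kernel. The gamma value is an assumption.

import numpy as np

VOCAB_SIZE = 4000  # visual words from k-means on SIFT features

def image_descriptor(bin_histograms):
    """Concatenate the normalized histograms of each spatial bin.

    With 5 bins (image layout) this gives a 20,000-D vector; with 7 bins
    (image + head) 28,000-D; with 12 bins (image + head + body) 48,000-D.
    """
    hists = [h / max(h.sum(), 1e-12) for h in bin_histograms]  # L1-normalize
    return np.concatenate(hists)

def exp_chi2_kernel(X, Y, gamma=1.0):
    """Exponential chi-squared kernel: k(x, y) = exp(-gamma * chi2(x, y))."""
    K = np.zeros((len(X), len(Y)))
    for i, x in enumerate(X):
        for j, y in enumerate(Y):
            chi2 = 0.5 * np.sum((x - y) ** 2 / (x + y + 1e-12))
            K[i, j] = np.exp(-gamma * chi2)
    return K

# A precomputed kernel like this can be used with, e.g.,
# sklearn.svm.SVC(kernel="precomputed").fit(exp_chi2_kernel(X, X), labels)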

Methods and Techniques

Convolutional neural networks were used to create this model. Yann LeCun introduced convolutional neural networks (CNNs) as a specialized design of artificial neural network in 1988. A CNN takes advantage of some of the properties of the visual cortex. Once pre-processing is completed, the neural network may be implemented; it will use three convolution layers with two-by-two max-pooling. Artificial neural networks are one of the most important technologies in machine learning. They are, as the name implies, brain-inspired systems designed to mimic the way humans learn. In most cases, neural networks have input and output layers, as well as one or more hidden layers of units that turn the input into something the output layer can utilize. They are great at spotting patterns that would be impossible for a human programmer to extract and teach the computer to recognize. Although neural networks have been around since the 1940s, they have only recently become a prominent aspect of artificial intelligence. This is due to the introduction of a method known as "backpropagation," which allows networks to adjust their hidden layers of neurons when the output does not match what the creator expects, such as when a network meant to detect dogs misidentifies a cat.

Choose a standard photo size

Before modeling, the images need to be adjusted so that they are all the same size, typically a small square picture. There are a number of ways to accomplish this; the most frequent is to use a basic resize function that stretches each picture (ignoring its aspect ratio) and forces it into the new dimensions. We could load all of the photographs, examine the distribution of widths and heights, and then design a new image size that best reflects what we are likely to see in practice. A small input means a model that can be trained quickly, and this concern generally dictates the picture size. We will load the photos in the training database and resize them to 200 by 200 square images. Each image's label is likewise determined by its file name. Finally, we save the prepared photographs and labels.
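For instance, Keras can force a photo to a fixed square size at load time; a minimal sketch (the file name is illustrative):

from tensorflow.keras.preprocessing.image import load_img, img_to_array

# load_img stretches the photo to the target size, ignoring aspect ratio.
photo = load_img('train/dog.0.jpg', target_size=(200, 200))
pixels = img_to_array(photo)
print(pixels.shape)  # (200, 200, 3)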
Pre-Process Photo Sizes

We can estimate that loading all of the photos into memory will take roughly 12 GB of RAM: 25,000 pictures of 200x200x3 pixels apiece is 3,000,000,000 pixel values at 32 bits each. All of the photographs may be loaded, resized, and saved as a single NumPy array. This fits into RAM on many current machines, but not all of them, especially if you only have 8 GB to work with. We can write custom code to load the photos into memory, resize them during loading, and store them ready for modeling. The sketch below uses the Keras image processing API to load all 25,000 photographs in the training database.
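A sketch of this loading step, assuming the Kaggle file-naming convention (dog.0.jpg, cat.0.jpg); the output file names are our own choice:

from os import listdir
import numpy as np
from tensorflow.keras.preprocessing.image import load_img, img_to_array

folder = 'train/'  # Kaggle "Dogs vs. Cats" training folder (assumed location)
photos, labels = [], []
for filename in listdir(folder):
    labels.append(1.0 if filename.startswith('dog') else 0.0)  # dog=1, cat=0
    photo = load_img(folder + filename, target_size=(200, 200))
    photos.append(img_to_array(photo))

photos = np.asarray(photos, dtype='float32')  # shape (25000, 200, 200, 3)
labels = np.asarray(labels, dtype='float32')

# Save once; later sessions can np.load() these instead of re-reading JPEGs.
np.save('dogs_vs_cats_photos.npy', photos)
np.save('dogs_vs_cats_labels.npy', labels)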
Pre-Process Photos into Standard Directories

Alternatively, we may use the Keras ImageDataGenerator class and the flow_from_directory() API to load the photos progressively. This will be slower to train from, but it runs on most machines. This API prefers the data to be divided into separate train/ and test/ directories, with a sub-directory for each class beneath each of them, such as train/dogs/ and train/cats/; the images are then arranged by their labels. With this standard layout in place, we may write a script (sketched below) to make a reorganized copy of the database, placing 25% of the photos (about 6,250), chosen at random, in the test database.
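One way to build the directory layout that flow_from_directory() expects; the 25% split comes from the text, while the directory name dataset_dogs_vs_cats/ and the fixed random seed are our own choices:

import os
import random
from shutil import copyfile

random.seed(1)  # make the random 25% test split reproducible (our choice)
val_ratio = 0.25

# Create dataset_dogs_vs_cats/{train,test}/{dogs,cats}/
for subset in ['train', 'test']:
    for cls in ['dogs', 'cats']:
        os.makedirs(os.path.join('dataset_dogs_vs_cats', subset, cls),
                    exist_ok=True)

for filename in os.listdir('train/'):  # original Kaggle folder
    subset = 'test' if random.random() < val_ratio else 'train'
    cls = 'dogs' if filename.startswith('dog') else 'cats'
    copyfile(os.path.join('train/', filename),
             os.path.join('dataset_dogs_vs_cats', subset, cls, filename))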
Create a CNN Baseline Model

In this part we develop a baseline convolutional neural network for the dogs vs. cats database. The baseline establishes the performance of a simple model against which all of our other models can be compared, as well as a model framework for study and improvement. The general architectural principles of the VGG models are a good place to start, since they performed well in the ILSVRC 2014 competition and the building-block structure of the architecture is simple to understand and implement. See the 2015 paper "Very Deep Convolutional Networks for Large-Scale Image Recognition" for additional details on the VGG model. The architecture stacks convolutional layers with small 3x3 filters, followed by a max-pooling layer. Together these layers form a block, and the blocks can be repeated, with the number of filters rising with network depth: 32, 64, 128, and 256 in the first four blocks of the model. Padding is added to the convolutional layers so that the width and height of the output feature maps match the input. We can explore this architecture on the dogs vs. cats problem and compare models with 1, 2, and 3 blocks. Each layer will use the ReLU activation function and He weight initialization, which are generally best practice. Consider a VGG structure with a single convolutional and pooling layer in each block; one-, two-, and three-block models (VGG1, VGG2, VGG3) are then implemented.
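A sketch of the one-block baseline under these conventions (3x3 filters, ReLU, He initialization, 'same' padding); the dense-layer width and SGD settings are assumptions, and VGG2/VGG3 simply repeat the Conv+Pool block with 64 and 128 filters:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dense, Flatten
from tensorflow.keras.optimizers import SGD

def define_model():
    model = Sequential([
        # Block 1: 3x3 filters, ReLU, He init; padding keeps width/height.
        Conv2D(32, (3, 3), activation='relu',
               kernel_initializer='he_uniform', padding='same',
               input_shape=(200, 200, 3)),
        MaxPooling2D((2, 2)),
        # For VGG2/VGG3, repeat Conv2D+MaxPooling2D blocks with 64, 128 filters.
        Flatten(),
        Dense(128, activation='relu', kernel_initializer='he_uniform'),
        Dense(1, activation='sigmoid'),  # cat (0) vs dog (1)
    ])
    opt = SGD(learning_rate=0.001, momentum=0.9)
    model.compile(optimizer=opt, loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model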
Augmenting Image Data

Image data augmentation can be used to artificially expand the size of a training database by producing modified copies of the photos in it. Training deep learning neural network models on the additional data can enhance the ability of the resulting models to generalize what they have learned to new images. Small changes to the input photos of dogs and cats, such as small shifts and horizontal flips, may be beneficial in this case. These augmentations can be specified as parameters to the ImageDataGenerator used for the training data set. Because we want to assess model performance on unmodified photos, the augmentations should not be applied to the test data set. This necessitates separate data generator instances for the train and test sets, i.e. a different ImageDataGenerator for each.
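A sketch of the two generators (the 10% shifts and horizontal flip follow the small-changes idea above; the rescaling and batch size are assumptions):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation only for training: small shifts plus horizontal flips.
train_datagen = ImageDataGenerator(rescale=1.0 / 255.0,
                                   width_shift_range=0.1,
                                   height_shift_range=0.1,
                                   horizontal_flip=True)
# No augmentation for the test data: evaluate on unmodified photos.
test_datagen = ImageDataGenerator(rescale=1.0 / 255.0)

train_it = train_datagen.flow_from_directory('dataset_dogs_vs_cats/train/',
                                             class_mode='binary',
                                             batch_size=64,
                                             target_size=(200, 200))
test_it = test_datagen.flow_from_directory('dataset_dogs_vs_cats/test/',
                                           class_mode='binary',
                                           batch_size=64,
                                           target_size=(200, 200))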
Experimental Section

The experiment comprised the following steps.

1. Installing the required Python libraries. TensorFlow, Keras, and their dependencies (tensorboard, google-auth, termcolor, and others) were installed with pip.
2. Importing the libraries.
3. Providing the file path to the dataset.
4. Converting the Cat and Dog class values into 0 and 1 (Cat = 0, Dog = 1); a sketch of this mapping follows the list.
5. Generating the model summary.
6. Training the model.
7. Fitting the model. Training ran for 100 epochs of 63 batches each; the loss fell from 0.5477 after the first epoch to roughly 1.3e-06 by epoch 100, with each epoch taking about 2 seconds (around 30 ms per step).
8. Classifying the cats and dogs.
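For step 4, the label conversion can be as simple as mapping the class name embedded in each file name to an integer; a sketch (the real script may differ):

# Step 4 sketch: map class names to numeric labels (Cat = 0, Dog = 1).
def label_from_filename(filename: str) -> int:
    return 1 if 'dog' in filename.lower() else 0

assert label_from_filename('dog.123.jpg') == 1
assert label_from_filename('cat.42.jpg') == 0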
In this paper, a model for classifying various animals has been built, based on a specific structure for animal classification using an appropriate deep learning approach. A scientific technique for measuring accuracy was used to evaluate this model, which is constructed using a Convolutional Neural Network (CNN). Several tests were conducted to compare the suggested system's performance against existing state-of-the-art approaches. The dataset description, as well as the experimental setup and results, are described in the subsections that follow. Model testing can begin once the model has been trained: a test set of data is loaded at this step, and because the model has never encountered this data, its genuine accuracy can be verified. Finally, the model is saved and ready to use in the real world, meaning it can be applied to fresh data. A small piece of code has been used in Fig. 6.3 to illustrate that the testing is done correctly using the images on the desktop.
Discussion of the results

The model development process may go on for as long as we have ideas, time, and resources to test them. At some point, a final model configuration must be chosen and accepted. We'll keep things simple here and utilize the VGG-16 transfer learning approach as our final model. First, we finalize the model by fitting it on the full database and saving it for later use. The saved model can then be loaded and used to generate a prediction on a single image. The final model is fit on all accessible data, such as the combination of the train and test data sets. We prepare generally the same database for use with the flow_from_directory() method of the ImageDataGenerator class: we establish a new directory with all of the training photos sorted into dogs/ and cats/ sub-directories, without dividing them into train/ and test/ directories. This may be accomplished by adapting the script we wrote at the start. In this situation, we make a new finalize folder, dogs_vs_cats/, containing the dogs/ and cats/ folders.
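A sketch of that final save/load/predict round trip (the file names final_model.h5 and sample_image.jpg are illustrative):

from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.image import load_img, img_to_array

# model.save('final_model.h5')  # after fitting on all available data

model = load_model('final_model.h5')  # restore the finalized model
img = load_img('sample_image.jpg', target_size=(200, 200))
arr = img_to_array(img).reshape(1, 200, 200, 3) / 255.0
prediction = model.predict(arr)[0][0]  # probability of "dog"
print('dog' if prediction > 0.5 else 'cat')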
Conclusion

It was a challenging assignment to categorize a supplied image as a cat or a dog; however, thanks to CNNs, it has gotten a lot simpler. With the aid of this trained model, it is possible to determine whether a provided image is of a cat or a dog. Many alternative machine learning approaches have been applied to this problem in the past, but those models predicted with considerably less accuracy, which was insufficient for the user. With the CNN, the model's accuracy has grown to more than 95 percent.
References

1. American Kennel Club. http://www.akc.org/.
2. The Cat Fanciers' Association Inc. http://www.cfa.org/Client/home.aspx.
3. Cats in Sinks. http://catsinsinks.com/.
4. Catster. http://www.catster.com/.
5. Dogster. http://www.dogster.com/.
6. Flickr! http://www.flickr.com/.
7. Google Images. http://images.google.com/.
8. The International Cat Association. http://www.tica.org/.
9. My Cat Space. http://www.mycatspace.com/.
10. My Dog Space. http://www.mydogspace.com/.
11. Petfinder. http://www.petfinder.com/index.
12. World Canine Organisation. http://www.fci.be/.
13. P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik. From contours to regions: An empirical evaluation. In Proc. CVPR, 2009.
14. S. Branson, C. Wah, F. Schroff, B. Babenko, P. Welinder, P. Perona, and S. Belongie. Visual recognition with humans in the loop. In Proc. ECCV, 2010.
15. Y. Chai, V. Lempitsky, and A. Zisserman. BiCoS: A bi-level co-segmentation method for image classification. In Proc. ICCV, 2011.
16. G. Csurka, C. R. Dance, L. Fan, J. Willamowski, and C. Bray. Visual categorization with bags of keypoints. In Proc. ECCV Workshop on Stat. Learn. in Comp. Vision, 2004.
17. N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In Proc. CVPR, 2005.
18. J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In Proc. CVPR, 2009.
19. J. Elson, J. Douceur, J. Howell, and J. Saul. Asirra: A CAPTCHA that exploits interest-aligned manual image categorization. In Conf. on Computer and Communications Security (CCS), 2007.
20. M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The PASCAL Visual Object Classes Challenge 2011 (VOC2011) Results. http://www.pascalnetwork.org/challenges/VOC/voc2011/workshop/index.html.
21. R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin. LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research, 9, 2008.
22. L. Fei-Fei, R. Fergus, and P. Perona. A Bayesian approach to unsupervised one-shot learning of object categories. In Proc. ICCV, 2003.
23. P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part based models. PAMI, 2009.
24. F. Fleuret and D. Geman. Stationary features and cat detection. Journal of Machine Learning Research, 9, 2008.
25. P. Golle. Machine learning attacks against the Asirra CAPTCHA. In 15th ACM Conference on Computer and Communications Security (CCS), 2008.
26. G. Griffin, A. Holub, and P. Perona. Caltech-256 object category dataset. Technical report, California Institute of Technology, 2007.
27. A. Khosla, N. Jayadevaprakash, B. Yao, and F.-F. Li. Novel dataset for fine-grained image categorization. In First Workshop on Fine-Grained Visual Categorization, CVPR, 2011.
28. C. Lampert, H. Nickisch, and S. Harmeling. Learning to detect unseen object classes by between-class attribute transfer. In Proc. CVPR, 2009.
29. I. Laptev. Improvements of object detection using boosted histograms. In Proc. BMVC, 2006.
30. S. Lazebnik, C. Schmid, and J. Ponce. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In Proc. CVPR, 2006.
31. D. G. Lowe. Object recognition from local scale-invariant features. In Proc. ICCV, 1999.
32. M.-E. Nilsback and A. Zisserman. A visual vocabulary for flower classification. In Proc. CVPR, 2006.
33. M.-E. Nilsback and A. Zisserman. Automated flower classification over a large number of classes. In Proc. ICVGIP, 2008.
34. O. Parkhi, A. Vedaldi, C. V. Jawahar, and A. Zisserman. The truth about cats and dogs. In Proc. ICCV, 2011.
