SJSU ScholarWorks
Master's Projects Master's Theses and Graduate Research
Spring 5-18-2015
Recommended Citation
Kang, Tim, "Using Neural Networks for Image Classification" (2015). Master's Projects. 395.
http://scholarworks.sjsu.edu/etd_projects/395
This Master's Project is brought to you for free and open access by the Master's Theses and Graduate Research at SJSU ScholarWorks. It has been
accepted for inclusion in Master's Projects by an authorized administrator of SJSU ScholarWorks. For more information, please contact
scholarworks@sjsu.edu.
Using Neural Networks for Image Classification
San Jose State University, CS298 Spring 2015
Author: Tim Kang (timothykang.x@gmail.com), San Jose State University
Advisor: Robert Chun (robert.chun@sjsu.edu), San Jose State University
Committee: Thomas Austin (thomas.austin@sjsu.edu), San Jose State University
Committee: Thomas Howell (thomas.howell@sjsu.edu), San Jose State University
Abstract
This paper will focus on applying neural network machine learning methods to images for the purpose of automatic detection and classification. The main advantages of using neural network methods in this project are their adeptness at fitting nonlinear data and their ability to work as an unsupervised algorithm. The algorithms will be run on common, publicly available data sets, namely MNIST and CIFAR10, so that our results will be easily reproducible.
Table of Contents
Introduction
Background
    Short History of Artificial Neural Networks
    Quick Explanation of Artificial Neural Networks
    A Little Regarding Image Classification
Related Work
    Deep Learning with COTS HPC Systems
    Learning New Facts from Knowledge Bases with Neural Tensor Networks and Semantic Word Vectors
    Convolutional Recursive Deep Learning for 3D Object Classification
    A Fast Algorithm for Deep Belief Nets
    Reducing the Dimensionality of Data with Neural Networks
    To Recognize Shapes, First Learn to Generate Images
    Learning Multiple Layers of Representation
    Learning Methods for Generic Object Recognition with Invariance to Pose and Lighting
    Scaling Algorithms Towards AI
    Comparing SVM and Convolutional Networks for Epileptic Seizure Prediction from Intracranial EEG
    Google
    Facebook
Proposed Approach
    Machine Learning Libraries
    Data Sets
    Hardware
Implementation
    Overview
    MNIST Digits and Cifar10 Details
    Torch7 Details
    Data Preparation & Feature Engineering
    Running the Test
Results
    MNIST Results
    CIFAR10 Results
    Other Thoughts
Conclusion
References
Introduction
Computer vision is a problem that has existed for a long time. In this paper, we will be focusing on the task of classification of computer images into different preset categories. This specific part of computer vision has many diverse real world applications, ranging from video games to self-driving cars. However, it also has been traditionally very difficult to pull off successfully, due to the enormous number of different factors (camera angles, lighting, color balance, resolution, etc.) that go into creating an image.
We will be focusing on using artificial neural networks for image classification. While artificial neural networks are some of the oldest machine learning algorithms in existence, they have not been widely used in the field of computer vision. More recent improvements in the methods of training artificial neural networks have made them worth looking into once again for the task of image classification.
Background
Artificial neural networks were designed to be modeled after the structure of the brain. They were first devised in 1943 by researchers Warren McCulloch and Walter Pitts [1]. Back then, the model was initially called threshold logic, which branched into two different approaches: one inspired more by biological processes and one focused on artificial intelligence applications.
Although artificial neural networks initially saw a lot of research and development, their popularity soon declined and research slowed because of technical limitations. Computers at the time did not have sufficient computational power and would take too long to train neural networks. As a result, other machine learning techniques became more popular and artificial neural networks were mostly neglected.
However, one important algorithm related to artificial neural networks was developed during this time: backpropagation, discovered by Paul Werbos [2]. Backpropagation is a way of training artificial neural networks by attempting to minimize the error. This algorithm allowed scientists to train artificial neural networks much more quickly.
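The core idea can be illustrated with a minimal gradient descent sketch (written here in Python for illustration, rather than the Lua/Torch7 used later in this project): a single neuron's weight is repeatedly nudged against the error gradient, which is the step backpropagation applies through every layer of a network via the chain rule. The numbers are made up for the example.

```python
# Minimal illustration of gradient-based training: one linear neuron
# y = w * x, trained to map x = 2.0 to target t = 1.0 by minimizing
# the squared error E = (w*x - t)^2.
x, t = 2.0, 1.0
w = 0.0                     # initial weight
learning_rate = 0.1

errors = []
for step in range(20):
    y = w * x               # forward pass
    errors.append((y - t) ** 2)
    grad = 2 * (y - t) * x  # dE/dw via the chain rule
    w -= learning_rate * grad

print(round(w, 3))          # converges towards t / x = 0.5
```

Each iteration shrinks the error, which is exactly the "attempting to minimize the error" described above, just for one weight instead of a whole network.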
Artificial neural networks became popular once again in the late 2000s when companies like Google and Facebook showed the advantages of using machine learning techniques on big data sets collected from everyday users. These algorithms are nowadays mostly used for deep learning, which is an area of machine learning that tries to model relationships that are more complicated, for example nonlinear relationships.
Each artificial neural network consists of many hidden layers. Each hidden layer in the artificial neural network consists of multiple nodes. Each node is linked to other nodes using incoming and outgoing connections. Each of the connections can have a different, adjustable weight. Data is passed through these many hidden layers and the output is eventually interpreted as different results.
In this example diagram, there are three input nodes, shown in red. Each input node represents a parameter from the data set being used. Ideally the data from the data set would be preprocessed and normalized before being put into the input nodes. There is only one hidden layer in this example and it is represented by the nodes in blue. This hidden layer has four nodes in it. Some artificial neural networks have more than one hidden layer.
The output layer in this example diagram is shown in green and has two nodes. The connections between all the nodes (represented by black arrows in this diagram) are weighted differently during the training process.
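The forward pass through a network with this 3-4-2 shape can be sketched in a few lines (a Python illustration only; the weight values are arbitrary made-up numbers, and training would be what adjusts them):

```python
import math

# Forward pass through a 3-4-2 network like the one described above:
# three input nodes, one hidden layer of four nodes, two output nodes.
# Each row holds the incoming connection weights for one node.
W_hidden = [[0.2, -0.5, 0.1],
            [0.4, 0.3, -0.2],
            [-0.1, 0.6, 0.5],
            [0.3, -0.4, 0.2]]
W_output = [[0.5, -0.3, 0.2, 0.1],
            [-0.2, 0.4, 0.3, -0.5]]

def layer(weights, values):
    # weighted sum into each node, squashed by a tanh activation
    return [math.tanh(sum(w * v for w, v in zip(row, values)))
            for row in weights]

inputs = [0.5, -1.0, 0.25]          # preprocessed, normalized inputs
hidden = layer(W_hidden, inputs)    # four hidden activations
outputs = layer(W_output, hidden)   # two output activations
print(len(hidden), len(outputs))    # prints: 4 2
```

The two output values would then be interpreted as the result, for example as scores for two classes.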
Image recognition and classification is a problem that has been around for a long time and has many real world applications. Police can use image recognition and classification to help identify suspects in security footage. Banks can use it to help sort out checks. More recently, Google has been using it in their self-driving car program. Traditionally, a lot of different machine learning algorithms have been utilized for image classification, including template matching, support vector machines, kNN, and hidden Markov models. Image classification remains one of the most difficult problems in machine learning, even today.
Related Work
In academia, there is related work done by Professor Andrew Ng at Stanford University, Professor Geoffrey Hinton at the University of Toronto, Professor Yann LeCun at New York University, and Professor Michael Jordan at UC Berkeley. Much of their work deals with applying artificial neural networks or other machine learning algorithms. Following is a sampling of a few papers from the large body of work that is available. These papers are all fairly recent and are more geared towards specific applications of machine learning and artificial neural networks. There is also a lot of research involving machine learning and artificial neural networks going on in industry, most specifically at Google and Facebook, which we will talk about briefly.
Deep Learning with COTS HPC Systems
This is a collaboration between Andrew Ng's Stanford research group and NVIDIA [3]. One of the main problems facing machine learning today is training the systems. As data sets get larger and larger, more and more computing power is required to train the models. In fact, artificial neural networks are especially hard to train.
The paper presents a more inexpensive alternative for training machine learning models by using COTS HPC (commodity off-the-shelf high performance computing) hardware. COTS HPC hardware refers to computer hardware that can be bought at your typical computer hardware store: things like Intel or AMD CPUs. In this case, the COTS HPC hardware used was NVIDIA GPUs.
Their setup consisted of 16 servers. Each server had two quad-core CPUs, four NVIDIA GTX 680 GPUs, and an FDR Infiniband adapter (for low latency communication). They chose this configuration specifically to balance the number of GPUs with CPUs, citing past examples where too many GPUs overwhelmed the systems through I/O, cooling, CPU compute, and power issues.
Their software was written in C++ and built on top of the existing MVAPICH MPI framework, chosen to make communication between different processes easier. The code also includes GPU code written in NVIDIA's CUDA language.
In the end, they were able to train deep learning systems (including neural networks) with over 11 billion parameters, which is several times what others were able to do before.
Learning New Facts from Knowledge Bases with Neural Tensor Networks and Semantic Word Vectors
This is another paper from Andrew Ng's Stanford research group, this time focused on using neural networks to try to extract data and insights from unannotated text [4]. It focuses on lexical databases like WordNet and Yago. These databases store information about English words, specifically definition and usage, and also provide information about the relationships between different English words. They are commonly used in artificial intelligence and text processing research.
The paper uses a specific type of neural network called a neural tensor network. This type of neural network is used because it is easier to adapt to words and their relationships. A big advantage of using this type of model is that it can relate two inputs directly. The models were trained using gradient-based methods.
Using the WordNet data set, the trained model was asked to classify whether an arbitrary triplet of entities and relations is true or not. False examples were created by purposely corrupting existing known triplets by switching entities and relations. This model was able to achieve 75.8% accuracy, which is much better than what other researchers were able to achieve before with similarity based models (66.7%) and Hadamard based models (71.9%).
Convolutional Recursive Deep Learning for 3D Object Classification
This is yet another paper from Andrew Ng's Stanford research group. This time the paper is about using neural networks to classify images of objects, specifically focusing on RGB-D images [5]. RGB-D images are images that also have depth information in addition to the typical color information included in images. A good, everyday example of a device that captures RGB-D information is the Kinect sensor created by Microsoft for the Xbox One and the Xbox 360.
The paper uses a type of neural network called the convolutional recursive neural network (CNN-RNN) for learning and classifying RGB-D images. This actually consists of two different networks. The convolutional neural network is first trained in an unsupervised manner by clustering the images; this is then used to create the filters for the CNN. The resulting features are then fed into the RNNs, which classify the images.
The data set used in this paper was the RGB-D Object Dataset from the University of Washington, organized by Kevin Lai. The paper was successful in classification of the objects in the Lai data set, outperforming most previous attempts made using the same data set.
A Fast Algorithm for Deep Belief Nets
This paper is from Geoffrey Hinton's group at the University of Toronto and deals with using the concept of "complementary priors" to both speed up and improve neural networks that have a large number of hidden layers [6]. This is done by creating a neural network with the top two layers as undirected associative memory and the rest of the hidden layers as an acyclic graph. There are a few advantages of doing this, most notably that it allows the neural network to find a decent set of parameters much more rapidly.
The paper uses the commonly referenced MNIST handwritten digits database to test out this new way of creating a neural network. Another advantage gained from using MNIST is that a lot of research has already been published using it, so it would be easy for the researchers to find other methods to compare against. The results this paper got after running against the MNIST database were favorable compared to other results obtained by using the more common feedforward neural networks.
While the initial results were good, the paper also outlines several problems that could limit the power of this particular method. For example, it treats non-binary images as probabilities, which won't work for natural images.
Reducing the Dimensionality of Data with Neural Networks
This is a publication by Geoffrey Hinton originally appearing in Science magazine in July 2006 and deals with the problem of using neural networks to reduce the dimensionality of data [7]. The problem of dimensionality reduction has been traditionally tackled using methods like principal components analysis (PCA). Principal components analysis basically looks for the greatest variance in the data set and can be done with algorithms like singular value decomposition.
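As a concrete point of comparison, the PCA baseline can be sketched in a few lines (a pure-Python illustration on made-up 2D data, not code from the paper; for a 2x2 covariance matrix the leading eigenvector has a closed form, so no linear algebra library is needed):

```python
import math

# PCA on a toy 2D data set: find the direction of greatest variance
# and project each point onto it (dimensionality 2 -> 1).
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9), (5.0, 5.1)]

n = len(points)
mx = sum(x for x, _ in points) / n
my = sum(y for _, y in points) / n

# covariance matrix entries [[cxx, cxy], [cxy, cyy]]
cxx = sum((x - mx) ** 2 for x, _ in points) / n
cyy = sum((y - my) ** 2 for _, y in points) / n
cxy = sum((x - mx) * (y - my) for x, y in points) / n

# angle of the leading eigenvector of a symmetric 2x2 matrix
theta = 0.5 * math.atan2(2 * cxy, cxx - cyy)
direction = (math.cos(theta), math.sin(theta))

# 1D coordinates of the points along the principal direction
projected = [(x - mx) * direction[0] + (y - my) * direction[1]
             for x, y in points]
print(direction)
```

Since the toy points lie nearly on the line y = x, the recovered direction is close to (0.707, 0.707). An autoencoder replaces this single linear projection with a learned, multilayer nonlinear encoding.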
This paper discusses using a type of neural network called an "autoencoder" as an alternative to using principal components analysis. An autoencoder is a multilayer neural network that can take high dimensional data and encode it into a low dimensional format, and also has the ability to try to reconstruct the original high dimensional data from the low dimensional representation.
The paper tests the trained autoencoder on several different data sets. One of the data sets used was a custom randomly generated set of curves in two dimensional space. For this data set, the autoencoder was able to produce much better reconstructions than PCA. The MNIST digits data set, Olivetti face data set, and a data set of documents were also used. Once again, the autoencoder was able to perform better than PCA.
To Recognize Shapes, First Learn to Generate Images
This paper is from Geoffrey Hinton at the University of Toronto and the Canadian Institute for Advanced Research and deals with the problem of training multilayer neural networks [8]. It is an overview of the many different strategies that are used to train multilayer neural networks today.
First it discusses "five strategies for learning neural networks," which include denial, evolution, procrastination, calculus, and generative. Out of these five strategies, the most significant ones are strategies four and five. Calculus includes the strategy of backpropagation, which has been independently discovered by multiple researchers. Generative includes the "wake-sleep" algorithm.
The rest of the paper goes over many common ways of training a multilayer neural network, including "learning feature detectors with no supervision, learning one layer of feature detectors (with restricted Boltzmann machines), a greedy algorithm for learning multiple hidden layers, using backpropagation for discriminative fine-tuning, and using contrastive wake-sleep for generative fine-tuning."
Learning Multiple Layers of Representation
This is a paper by Geoffrey Hinton that deals with the problem of training multilayer neural networks [9]. Training multilayer neural networks has been done mostly using the backpropagation algorithm. Backpropagation is also the first computationally feasible algorithm that can be used to train multiple layers. However, backpropagation also has several limitations, including requiring labeled data and becoming slow when used on neural networks with a large number of layers.
This paper proposes using generative models to solve that problem. Neural networks can be run "bottom-up" in order to make recognition models or "top-down" to make generative connections. Running "top-down" through a neural network with stochastic neurons will result in creating an input vector. The paper suggests training models by tweaking the weights on the top level connections and trying to achieve the maximum similarity with the original training data, with the reasoning being that the higher level features will be able to affect the outcome more than the lower level features. According to the paper, the key to making this work is the restricted Boltzmann machine (RBM).
The paper tests its trained models on two different data sets, the MNIST handwritten digits data set and also a data set of sequential images. In the end, Hinton attributes the success of this model to three different factors: using a generative model instead of attempting to classify, using restricted Boltzmann machines to learn one layer at a time, and having a separate fine-tuning stage.
Learning Methods for Generic Object Recognition with Invariance to Pose and Lighting
This paper is by Yann LeCun from the Courant Institute at New York University and deals with object recognition [10]. Recognition of objects using just shape information, without accounting for pose, lighting, and backgrounds, is a problem that has not been dealt with frequently in the field of computer vision.
The paper uses the NORB data set, which is a large data set with 97,200 image pairs of 50 objects that belong to five different categories, specifically human figures, animals, airplanes, trucks, and cars. This NORB data set was used to generate other data sets where the images vary in location, scale, brightness, etc.
A wide range of different machine learning methods were used to test the images from the data sets, including a linear classifier, kNN (nearest neighbor with Euclidean distance), pairwise SVM (support vector machines with a Gaussian kernel), and convolutional neural networks. For the most part, convolutional neural networks ended up with good performance compared to the other methods, except surprisingly on the jittered-cluttered data set. Many of the other methods also ran into significant trouble because of limitations in CPU power and time and could not be trained reasonably on many of the data sets tested.
Scaling Algorithms Towards AI
This is a collaboration between Yoshua Bengio of the University of Montreal and Yann LeCun of New York University and deals with the long term problem of training algorithms to work with artificial intelligence [11]. The paper discusses many of the common limitations found when working with artificial intelligence, mainly shallow architecture and local estimators.
The paper does not attempt to find a general learning method, saying that such a task is doomed to failure. Instead, they attempt to look for learning methods for specific tasks, reasoning that finding out these methods will bring them closer to creating an artificially intelligent agent.
The paper then goes into more detail on the advantages and disadvantages of different algorithm setups, for example deep versus shallow. It also compares various algorithms like support vector machines and convolutional neural networks against each other, using data sets like the MNIST handwritten digits data set and the NORB data set.
Comparing SVM and Convolutional Networks for Epileptic Seizure Prediction from Intracranial EEG
This is a paper from Yann LeCun of New York University that focuses on using machine learning methods like support vector machines and convolutional neural networks in order to try to predict when epileptic seizures will occur [12]. Epilepsy is a neural disease that affects around one to two percent of the world population and causes its victims to have seizures occasionally. There has been a lot of other research done on trying to predict when these seizures will occur, but almost none using modern machine learning methods.
The paper uses data gathered from an electroencephalography machine, which records the local voltage of millions of brain cells at a time. Current methods of seizure prediction suffer from a tradeoff between being able to predict the seizures and avoiding false alarms when predicting the seizures. The most common approach currently used is binary classification, which is susceptible to these problems. The machine learning algorithms used by this paper can mitigate these problems because of their ability to classify nonlinear features in a high dimensional feature space.
The paper then uses MATLAB to implement the support vector machines and convolutional neural networks used. The results were highly successful, especially for the convolutional neural networks, which were able to achieve no false alarms on all of the patients except for one.
Google
Google has been focusing much more on its machine learning and data science departments. Almost every product at Google uses some sort of machine learning or data science. For example, Google Adsense uses data science to better target ads towards customers and Picasa uses machine learning to recognize faces in images.
One of the more interesting Google products using machine learning and data science is DeepMind. DeepMind Technologies was a tech startup based in London that was acquired by Google near the beginning of 2014 [13]. Their goal is to combine the best techniques from machine learning and systems neuroscience to build powerful general-purpose learning algorithms.
DeepMind has, in fact, trained a neural network to play video games, including classics like Pong and Space Invaders [42].
Facebook
Like Google, Facebook has also been focusing a lot on machine learning and data science. This is because, as a company relying heavily on advertising for revenue, and as a company with huge amounts of personal user data, machine learning and data science will allow them to target their ads better and get more revenue.
Facebook's machine learning operation is currently based in New York City. In 2014, Facebook hired New York University professor and famed neural network researcher Yann LeCun to help head and lead this operation.
Proposed Approach
Machine Learning Libraries
We considered many different machine learning libraries, including:
Torch7
Theano/PyLearn
Caffe
Torch7 is a machine learning library that is being developed at New York University, the Idiap Research Institute, and NEC Laboratories America [14]. According to their description:
Torch7 is a scientific computing framework with wide support for machine learning algorithms. It is easy to use and provides a very efficient implementation, thanks to an easy and fast scripting language, LuaJIT, and an underlying C implementation.
Theano is a machine learning library being developed mainly at the University of Montreal [15]. It is a Python library and is more of a general purpose computer algebra system library with an emphasis on matrix operations.
Caffe is a machine learning library being developed at UC Berkeley [16]. According to their description:
Caffe is a deep learning framework developed with cleanliness, readability, and speed in mind. It was created by Yangqing Jia during his PhD at UC Berkeley, and is in active development by the Berkeley Vision and Learning Center (BVLC) and by community contributors. Caffe is released under the BSD 2-Clause license.
It is important to note that all three libraries have a focus on the deep learning aspect of machine learning, but are also configurable in many different ways and can support many other algorithms.
After some preliminary evaluation, we decided on using the Torch7 machine learning library. This library was chosen out of the different machine learning frameworks that support artificial neural networks for several reasons. Speed-wise, it is a lot faster than the alternatives. It also supports interfacing with C and CUDA code easily. Finally, out of the frameworks considered, at this point in time, it is the most commonly used in industry. Both Facebook and Google have teams that are using Torch7 for machine learning research.
Data Sets
We also considered many different image data sets, including:
Caltech 101
PASCAL VOC
MNIST Digits
Flower classification data set
Stanford Dogs
Animals with Attributes
Cifar10
When considering the different image data sets, we took into consideration the size of the data set, the content of the images, and the format in which the data in the data set is presented. We wanted a data set that already had the images well formatted and would be easy to work with using our machine learning libraries.
The Caltech 101 data set [17] is, according to their description:
Pictures of objects belonging to 101 categories. About 40 to 800 images per category. Most categories have about 50 images. Collected in September 2003 by Fei-Fei Li, Marco Andreetto, and Marc'Aurelio Ranzato. The size of each image is roughly 300 x 200 pixels.
This data set of images also contains outline annotations of the different objects shown, which possibly could come in useful later.
The PASCAL VOC data set [18] is from the 2009 challenge run by the PASCAL2 Network of Excellence on Pattern Analysis, Statistical Modelling, and Computational Learning and funded by the EU.
According to their website, PASCAL VOC contains twenty categories of everyday objects:
Person: person
Animal: bird, cat, cow, dog, horse, sheep
Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor
The MNIST digits data set [25] is a large collection of images of handwritten digits. It has a training set of 60,000 examples and a testing set of 10,000 examples. The images have already been centered and normalized, and the whole data set is a smaller subset of a larger data set available from NIST (the National Institute of Standards and Technology).
The Flower Classification data set is from the Visual Geometry Group at the University of Oxford. There are actually two versions of the data set, one with 17 categories and one with 102 categories. The flowers are common flowers seen in the United Kingdom [19].
The Stanford Dogs data set [20] is a data set from Stanford University. According to their website:
The Stanford Dogs dataset contains images of 120 breeds of dogs from around the world. This dataset has been built using images and annotation from ImageNet for the task of fine-grained image categorization.
It contains 20,580 images of dogs sorted into 120 different categories with class labels and bounding boxes.
Animals with Attributes [21] is a data set from the Max Planck Institute for Biological Cybernetics, which is located in Tübingen, Baden-Württemberg, Germany. According to their description:
This dataset provides a platform to benchmark transfer-learning algorithms, in particular attribute-based classification. It consists of 30,475 images of 50 animal classes with six pre-extracted feature representations for each image. The animal classes are aligned with Osherson's classical class/attribute matrix, thereby providing 85 numeric attribute values for each class. Using the shared attributes, it is possible to transfer information between different classes.
Cifar10 [36] is a data set put together by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton from the University of Toronto. It is a smaller subset of the larger 80 million tiny images data set, but with the advantage of having everything labeled. There are 50,000 training images and 10 different categories: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.
After looking over all the different data sets, it was decided that we will be mainly using the MNIST digits data set. A lot of prior research has been done using this data set, so we can easily compare results to tests that others have run before. We will also be running additional tests using the Cifar10 data set due to the excellent support it has with our other tools.
Hardware
All of this will be run on a computer running Ubuntu Linux 14.04 Trusty Tahr with the following specifications:
AMD Phenom II X3 720
4 GB RAM
Nvidia Geforce 750 Ti
500 GB 7200 rpm HDD
The Nvidia Geforce 750 Ti graphics card can be especially useful because Torch7 is also coded to include support for the Nvidia CUDA library. Nvidia CUDA allows programs to use the massively parallel computing power of a graphics card. The rest of the hardware was chosen simply because it was readily available.
Implementation
Overview
The implementation of our neural network requires many different steps.
The first step required is data set preparation. Even for those that are commonly used, data sets come in many different formats. It is often necessary to write a few short scripts that will take in the examples from the data set and then format them properly for the machine learning tools that will be used.
It is also often necessary to do what is known as feature engineering. Examples from data sets can have too many features. Running a training algorithm on a data set with too many features can cause the algorithm to become confused and produce subpar results. Therefore, it is necessary to pick out which features to keep and which to remove (or to give less weight to). This can be done manually or using an algorithm like PCA (principal components analysis).
Finally, even after successfully running the algorithm on a data set, it may be helpful to tweak some parameters and rerun the algorithm.
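A very simple form of this feature selection step can be sketched as follows (illustrative Python with made-up data, not part of the actual Torch7 pipeline; dropping near-constant features by variance is one crude stand-in for methods like PCA):

```python
# Crude feature engineering: drop features whose variance across the
# examples is near zero, since a near-constant feature carries little
# information for the classifier.
examples = [
    [0.9, 1.0, 0.12],
    [0.1, 1.0, 0.55],
    [0.4, 1.0, 0.33],
    [0.8, 1.0, 0.70],
]

def column_variance(rows, j):
    col = [row[j] for row in rows]
    mean = sum(col) / len(col)
    return sum((v - mean) ** 2 for v in col) / len(col)

keep = [j for j in range(len(examples[0]))
        if column_variance(examples, j) > 1e-6]
reduced = [[row[j] for j in keep] for row in examples]
print(keep)  # prints: [0, 2] -- feature 1 is constant and gets dropped
```

PCA goes further than this by combining correlated features into new ones ranked by variance, but the goal is the same: feed the training algorithm fewer, more informative features.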
MNIST Digits and Cifar10 Details
As mentioned above, the MNIST Digits is a data set of handwritten digits. There are 60,000 different handwritten digit files available in this particular data set, designed to be used for training the selected algorithm. There are also 10,000 digit files from the data set designed to be available for testing out the algorithm after it has been trained [25].
It is available online from the NYU Courant Institute and is a subset of a larger NIST (National Institute of Standards and Technology) data set of handwritten digits. The original NIST data set consists of many special databases, which are handwritten digits collected from different sources and are organized into groups called Special Digits.
The MNIST data set uses digits from the NIST Special Digits 1 and NIST Special Digits 3. The digits from Special Digits 1 are from Census Bureau employees while the digits from Special Digits 3 were collected from high school students. MNIST uses an even mixture of 30,000 digits from Special Digits 1 and 30,000 digits from Special Digits 3 for the training set and an even mixture of 5,000 digits from each for the testing set.
The digit files are images of the Arabic numerals 0 to 9. Each image is 28 by 28 pixels and is normalized so that the numerals fit while also keeping the same aspect ratio. The images have also been centered to fit into the 28 by 28 pixel area. The following is a sampling of images from the MNIST data set:
Cifar10 and Cifar100 are data sets from the University of Toronto. These data sets are a subset of the 80 million tiny images data set, with the advantage of having everything labeled. The Cifar10 data set contains 60,000 images that are sorted into 10 different classes while the Cifar100 contains 60,000 images that are sorted into 100 different classes. We will be using the Cifar10 data set for our experiments.
The images in the data set are 32 x 32 color images. They are categorized into 10 different classes: airplane, automobile, bird, cat, dog, deer, frog, horse, ship, and truck. The classes contain no overlap with each other; for example, if something is labeled as a car it will not be labeled as a truck. Following is a sampling of images from the data set:
Torch7 Details
As mentioned above, Torch7 is an open source machine learning library being developed primarily at New York University. It uses the Lua scripting language as its default language of choice, although it also allows snippets of C code to be inserted as well as interfacing with NVIDIA CUDA, when speed is especially important.
The Lua programming language is an unusual choice by the developers of Torch7. According to its website [26], Lua describes itself as:
Lua is a powerful, fast, lightweight, embeddable scripting language. Lua combines simple procedural syntax with powerful data description constructs, ideal for configuration, scripting, and rapid prototyping.
It was developed at the Pontifical Catholic University of Rio de Janeiro in Brazil by three computer scientists: Luiz Henrique de Figueiredo, Roberto Ierusalimschy, and Waldemar Celes [26], who were part of Tecgraf (the Computer Graphics Technology Group) at the time. Lua was developed during a period of time when Brazil had enacted many trade barriers, especially in regards to technology. As a result, Lua was created almost from scratch and has many strange quirks. For example, it is customary for Lua array indices to start at 1 instead of the standard 0 used in most other programming languages.
Although it has been used by many large companies, including Adobe, Bombardier, Disney, Electronic Arts, Intel, LucasArts, Microsoft, NASA, Olivetti, and Philips [26], its usage in the general programming community remains quite low.
The TIOBE Index is a ranking of programming language popularity that is maintained by TIOBE Software [27]. While it is not an exact measurement by any means, it is a good way to get a rough estimate of a particular programming language's popularity with the community. According to the TIOBE Index for January 2015, Lua is ranked 31st in popularity, being used in about 0.649% of all programming applications, even behind old languages such as Ada and Pascal.
Torch7 uses the LuaJIT compiler for most general purposes [28]. LuaJIT is an open source Lua compiler that aims to provide a JIT (just-in-time) compiler for the Lua language. Many other languages like Java also use just-in-time compilation. The advantage of just-in-time compilation is that it allows code to be executed more quickly than code that is interpreted.
Torch7 also allows for the use of LuaRocks, which is an open source package management system for Lua [29]. Programs can be bundled together in the form of a package called a LuaRock. Many of the core Torch7 packages are hosted at LuaRocks and can be installed easily from the command line. The following command is an example of how LuaRocks can be used to install a package called somepackage:

$ luarocks install somepackage
There is also a custom command line interpreter included with the default Torch7 install. This can be accessed through the th command from the terminal, once Torch7 is installed and all the PATH settings are configured correctly. This custom command line interpreter is called TREPL, which stands for torch read-eval-print loop. TREPL has several advantages over the default Lua interpreter because it has many extra features designed to make working with Torch7 Lua code easier, such as tab completion and history. This is an example of the th command, taken from the Torch7 website [14]:
$ th

  ______             __   |  Torch7
 /_  __/__  ________/ /   |  Scientific computing for Lua.
  / / / _ \/ __/ __/ _ \  |
 /_/  \___/_/  \__/_//_/  |  https://github.com/torch
                          |  http://torch.ch

th> torch.Tensor{1,2,3}
 1
 2
 3
[torch.DoubleTensor of dimension 3]
th>
Out of the numerous LuaRocks available through the package management system, an especially important one for this project is dp. This is a library designed to facilitate the process of using Torch7 for deep learning. dp was developed by Nicholas Leonard while he was a graduate student working in the LISA lab under the supervision of Yoshua Bengio and Aaron Courville [30].
It describes itself on its homepage as a high level framework that abstracts away common usage patterns of the nn and torch7 packages, such as loading data sets and early stopping; it provides hyperparameter optimization facilities for sampling and running experiments from the command line or from prior hyperparameter distributions, as well as facilities for storing and analysing hyperparameters and results using a PostgreSQL database backend, which facilitates distributing experiments over different machines.
Data Preparation & Feature Engineering
Both the MNIST digits data set and the Cifar10 data set do not come in a standard image format. They come in their own special formats designed for storing vectors and multidimensional matrices. Usually when working with these types of data sets, one is required to write a small program to parse the special format. However, the dp Luarocks module (which is designed to eliminate common repetitive tasks) makes this unnecessary because it already includes a small amount of code to facilitate the loading of the data from many common data sets (including MNIST and Cifar10).
The dp Luarocks module also contains a few methods to help preprocess and standardize the data. This is accomplished using the dp.Standardize method, which subtracts the mean and then divides by the standard deviation. While the MNIST data set is already formatted nicely for the most part, we apply it anyway. This gets rid of any anomalies and is good practice in general when doing machine learning. The Cifar10 data set does not work well with dp.Standardize, so we leave it out when running tests on Cifar10.
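The effect of this standardization step is easy to see in a small sketch (plain Python on made-up values for illustration; dp.Standardize applies the same subtract-mean, divide-by-standard-deviation transformation across the data set):

```python
import math

# Standardize a feature: subtract the mean, then divide by the
# standard deviation, as described for dp.Standardize above.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = sum(values) / len(values)                              # 5.0
std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))  # 2.0
standardized = [(v - mean) / std for v in values]

new_mean = sum(standardized) / len(standardized)
new_std = math.sqrt(sum(v * v for v in standardized) / len(standardized))
print(new_mean, new_std)  # prints: 0.0 1.0
```

After the transformation every feature has mean 0 and standard deviation 1, so no single feature dominates training simply because of its scale.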
The first step in running the test is creating a set of parameters that will be used for the test. Storing all these parameters in a table in the form of a Lua variable is a good idea because it lets us keep track of things more easily and also allows us to change parameters as we see fit. The code for that would look something like this:
myparams = {
   hiddenunits = 100,
   learningrate = 0.1,
   momentum = 0.9,
   maximumnorm = 1,
   batchsize = 128,
   maxtries = 100,
   maxiterations = 1000
}
These are the basic parameters that we need to supply to Torch7 and dp in order to run our test. Here is a quick explanation of what each of these parameters means:

hiddenunits: This represents the number of nodes in the hidden layer (shown as the blue colored nodes in a previous diagram).

learningrate: This variable determines the learning rate for the neural network. A smaller learning rate makes the system learn in finer increments, but can also drastically increase the time required to train the system.

momentum: The momentum adds part of the previous weight update to the current update. This is done to try to prevent the system from settling on a local minimum when training. A high momentum can make the system train faster, but can also end up overshooting the minimum.

maximumnorm: This is used to determine how much to update the neuron weights.

batchsize: This is the batch size for the training examples.

maxtries: This determines when to stop the training process early, if after a certain number of tries the error has not decreased.

maxiterations: This determines the maximum number of times to iterate overall.
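To make the momentum parameter concrete, here is a tiny plain-Lua sketch of one weight update (the function and variable names are ours, not part of Torch7 or dp): a velocity term carries over a fraction of the previous update.

```lua
-- One gradient step with momentum (illustrative only).
-- velocity accumulates a fraction (momentum) of the previous update,
-- which smooths the descent and helps step over shallow local minima.
local function momentum_step(weight, velocity, grad, lr, momentum)
   velocity = momentum * velocity - lr * grad
   return weight + velocity, velocity
end

local w, v = 1.0, 0.0
w, v = momentum_step(w, v, 0.5, 0.1, 0.9)  -- first step: w = 0.95
w, v = momentum_step(w, v, 0.5, 0.1, 0.9)  -- second step is larger: w = 0.855
```

Note how the second step moves further than the first on the same gradient, which is exactly the acceleration (and potential overshooting) described above.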
We then need to build a model to represent our neural network. We use the dp.Neural class, which represents a layer in the neural network. We use the parameters that we set above in myparams. A single layer looks something like this:
dp.Neural {
input_size = mydata:featureSize(),
output_size = myparams.hiddenunits,
transfer = nn.Tanh()
}
We create several of these layers and combine them together using the dp.Sequential module to form our neural network. The transfer variable is used to set the transfer function. Transfer functions are used to allow for more complexity than what a typical logistic regression function would provide.
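Putting these pieces together, a two-layer network might look like the following sketch. The models key and the ten-unit LogSoftMax output layer follow the dp documentation's examples (both MNIST and Cifar10 have ten classes); treat the details as assumptions rather than the author's exact code:

```lua
-- Hedged sketch: stack two dp.Neural layers into one network.
local model = dp.Sequential{
   models = {
      -- hidden layer: flattened image -> hiddenunits, tanh transfer
      dp.Neural{
         input_size = mydata:featureSize(),
         output_size = myparams.hiddenunits,
         transfer = nn.Tanh()
      },
      -- output layer: hiddenunits -> 10 class log-probabilities
      dp.Neural{
         input_size = myparams.hiddenunits,
         output_size = 10,
         transfer = nn.LogSoftMax()
      }
   }
}
```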
Next we set up three propagators: train, valid, and test. These determine how the neural network will train the system and determine what is good and what is bad.
The train propagator is an instance of the dp.Optimizer class and requires a few parameters to be provided:

loss: This represents the typical machine learning loss function, which is a function that the system wants to minimize.
visitor: Here we use some of the parameters we set above in myparams, namely learningrate, momentum, and maximumnorm.

feedback: This determines how feedback is provided after each iteration of training. We use dp.Confusion, which is just a confusion matrix.

sampler: This determines which order to iterate through the dataset. For the train propagator we use dp.ShuffleSampler, which randomly shuffles the dataset before each iteration.
The valid and test propagators are instances of the dp.Evaluator class and also require a few parameters to be provided:

loss: (same as train propagator)

feedback: (same as train propagator)

sampler: For the valid and test propagators, we use a different sampler than the train propagator. Instead of dp.ShuffleSampler, we use dp.Sampler, which iterates through the dataset in order.
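The three propagators described above can be assembled roughly as follows. The class names (dp.NLL for the negative log-likelihood loss; dp.Momentum, dp.Learn, and dp.MaxNorm as the visitors that apply our momentum, learningrate, and maximumnorm parameters) come from the dp documentation, but the exact option names are assumptions, so this is a sketch rather than the author's code:

```lua
-- Hedged sketch of the three propagators.
local train = dp.Optimizer{
   loss = dp.NLL(),                       -- negative log-likelihood loss
   visitor = {                            -- applied to the gradients in order
      dp.Momentum{momentum_factor = myparams.momentum},
      dp.Learn{learning_rate = myparams.learningrate},
      dp.MaxNorm{max_out_norm = myparams.maximumnorm}
   },
   feedback = dp.Confusion(),             -- confusion-matrix feedback
   sampler = dp.ShuffleSampler{batch_size = myparams.batchsize}
}

local valid = dp.Evaluator{
   loss = dp.NLL(),
   feedback = dp.Confusion(),
   sampler = dp.Sampler{}                 -- iterates in order, no shuffling
}

local test = dp.Evaluator{
   loss = dp.NLL(),
   feedback = dp.Confusion(),
   sampler = dp.Sampler{}
}
```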
Finally, we set up the experiment and prepare for training. We use the dp.Experiment class, which takes in the following parameters:

model: This is set using the model that we set up before, using dp.Neural to form layers and dp.Sequential to combine them into a neural network.
optimizer: This is set using the train propagator that we created above.

validator: This is set using the valid propagator that we created above.

tester: This is set using the test propagator that we created above.

observer: Observer is a feature of dp that listens as the model is being trained and then calls specific functions when certain events occur. Our observer uses dp.EarlyStopper, which ends the training process early if no additional results are being obtained, and dp.FileLogger, which writes the results to a file.

max_epoch: This is the maximum number of iterations that the experiment will go through when training. We set this to myparams.maxiterations.
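Collecting everything, the experiment object might be constructed like this (again a sketch; the max_epochs option for dp.EarlyStopper is an assumption based on the dp documentation):

```lua
-- Hedged sketch: wire the model, propagators, and observers together.
local myexperiment = dp.Experiment{
   model = model,
   optimizer = train,
   validator = valid,
   tester = test,
   observer = {
      dp.FileLogger(),                                 -- log results to a file
      dp.EarlyStopper{max_epochs = myparams.maxtries}  -- stop if no progress
   },
   max_epoch = myparams.maxiterations
}
```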
The final step is running the experiment on the datasets, which can be accomplished by the following line of code:
myexperiment:run(mydata)
Now that we have run the experiment, let us look at the results.
Results
MNIST Results
One of the main advantages of the MNIST dataset is that it is widely used, and so there are a lot of previous tests run on the MNIST dataset that we can compare our results to. Following are a few papers with results run on the MNIST Digits dataset. I have decided to use a sampling of results run using many different machine learning methods, which will allow us to make a better comparison and see a wider picture of how our results measure up.
CIFAR10 Results
The Cifar10 dataset has fewer previous tests run on it when compared to the MNIST dataset, but there are still enough to make a comparison. The table below has a sampling of prior results as well as our own results. When reporting results from Cifar10, it is the accuracy rate that is commonly reported (unlike MNIST, where the error rate is reported). Also note that the images to be classified in Cifar10 are much more complex than the ones in MNIST.

Our NN Results: 50.46%
Other Thoughts
We trained the neural network and ran our test a few times using many different parameters. Our main tests were all set to run for a max of 1000 iterations with a learning rate of 0.1 and to stop early if after 100 iterations no further progress was made. The models all had one hidden layer, with the number of hidden nodes (also sometimes referred to as hidden units) being variable each time the test was run. Following are a few tables and graphs with more detailed information on all the numerous experiments run:
MNIST Table

MNIST Graphs

Cifar10 Table

Cifar10 Graphs
One disadvantage of neural networks is the long training times. We can use our experiences training MNIST to demonstrate (our experiences with Cifar10 are similar). When we set the number of hidden nodes to 100, each iteration took about six seconds to run, and our program went through 193 iterations before deciding to stop early due to lack of progress. The ideal solution was found on the 92nd iteration. We end up getting an error rate of 3.8%. When we set the number of hidden nodes to 300, each iteration took about ten seconds to run, and our program went through 589 iterations before deciding to stop early due to lack of progress. The ideal solution was found on the 488th iteration. We end up getting an error rate of 2.7%. We should also note that increasing the number of hidden nodes from 100 to 300 dramatically increased the time it took to train the neural network, from around 30 minutes to over 3 hours.
Interestingly enough, the number of iterations required before reaching the ideal solution seems to have no correlation with the number of hidden units. This is because the algorithm chooses a random spot to start walking towards the ideal solution and may sometimes land in a more favorable spot initially. However, in general, more hidden units result in more favorable results, especially for MNIST.
The Cifar10 dataset resulted in much worse results than the MNIST. This can be explained by the much more complex images contained in the dataset. In order to improve results, it may be necessary to increase the number of hidden units, increase the number of layers in the model, or use a more aggressive learning rate. Unfortunately, this is unfeasible with the limited hardware that we have access to.
Overall, these results are somewhat typical of neural networks, which seem to have a large variation in error rate percentage. On the MNIST dataset, other results range from 4.7% in a test run by Yann LeCun [31] to 0.35% in a test run by Dan Ciresan [34]. The current best result for Cifar10 [37] also uses a variation on neural networks, showing that neural networks are indeed a good candidate for classification of more complex imagery as well.
Nevertheless, these are pretty favorable results (especially when considering the limited hardware that the test was run on) and aptly demonstrate the potential that neural networks have when solving these sorts of problems.
Conclusion
In conclusion, neural networks are shown to be a viable choice when doing image recognition, especially of handwritten digit images. This is because neural networks are especially useful for solving problems with nonlinear solutions, which applies in the case of handwritten digit images, since the hidden units are able to effectively model such problems. We must also note that this is with the caveat of having the necessary computational hardware and time. Without such resources, results can be subpar, as shown by the Cifar10 tests.
While there are many advantages to using neural networks, there are also a few drawbacks. One drawback to neural networks is in training. Since the system has to go through many iterations during the training phase, this may cause training to take a long time, especially when run on computers using older hardware.
Another drawback to neural networks (although this applies to all machine learning in general) is picking the right parameters (such as number of hidden nodes per layer, learning rate, and number of layers). The right parameters can cause a huge difference in results, with a massive decrease in error rate percentage. However, it is difficult to balance, and the wrong choices can cause extremely long training times or inaccurate results.
Future work may include running tests on models that have more hidden units and layers, as well as using a more aggressive learning rate. To accommodate the large hardware demands that are required to do such work, we may look into running our computations in the cloud. Amazon has the EC2 cloud, which may be able to offload a lot of the work. Another possible way to drastically improve hardware performance may be to use the graphics card to help with computations. NVIDIA has the CUDA library, which is excellent at running computations in parallel, and Torch7 actually has some built-in support for it.
With the rise of tools such as Torch7 (and dp), neural networks are now more useful than ever before and will probably be applied to many other problems in the future. This can range from item recommendation at a shopping service like Amazon to things like self-driving cars. We live in an era ruled by data, and I am excited to see what will come next.
References
[1] Warren S. McCulloch and Walter Pitts. A Logical Calculus of the Ideas Immanent in Nervous Activity. Published 1943. Accessed Oct 2014.
[2] Paul J. Werbos. Backpropagation Through Time: What It Does and How to Do It. Published 1990. Accessed Oct 2014.
[3] Adam Coates, Brody Huval, Tao Wang, David J. Wu, Bryan Catanzaro and Andrew Y. Ng. Deep Learning with COTS HPC Systems. Published Jul 2013. Accessed Oct 2014.
[4] Danqi Chen, Richard Socher, Christopher D. Manning and Andrew Y. Ng. Learning New Facts From Knowledge Bases with Neural Tensor Networks and Semantic Word Vectors. Published Mar 2013. Accessed Oct 2014.
[5] Richard Socher, Brody Huval, Bharath Bhat, Christopher D. Manning and Andrew Y. Ng. Convolutional Recursive Deep Learning for 3D Object Classification. Published 2012. Accessed Oct 2014.
[6] Geoffrey E. Hinton, Simon Osindero, and Yee Whye Teh. A Fast Learning Algorithm for Deep Belief Nets. Published 2006. Accessed Oct 2014.
[7] Geoffrey E. Hinton and R. R. Salakhutdinov. Reducing the Dimensionality of Data with Neural Networks. Published 2006. Accessed Oct 2014.
[8] Geoffrey E. Hinton. To Recognize Shapes, First Learn to Generate Images. Published Oct 2006. Accessed Oct 2014.
[9] Geoffrey E. Hinton. Learning Multiple Layers of Representation. Published Oct 2007. Accessed Oct 2014.
[10] Yann LeCun, Fu Jie Huang, and Leon Bottou. Learning Methods for Generic Object Recognition with Invariance to Pose and Lighting. Published CVPR, 2004. Accessed Oct 2014.
[11] Yoshua Bengio and Yann LeCun. Scaling Learning Algorithms Towards AI. Published MIT Press, 2007. Accessed Oct 2014.
[12] Piotr W. Mirowski, Yann LeCun, Deepak Madhavan, and Ruben Kuzniecky. Comparing SVM and Convolutional Networks for Epileptic Seizure Prediction from Intracranial EEG. Published 2008. Accessed Oct 2014.
[13] Antonio Regalado. Is Google Cornering the Market on Deep Learning? Published MIT Technology Review, Jan 2014. Accessed Oct 2014.
[14] Torch7. Accessed Oct 2014. <http://torch.ch/>
[15] Theano. Accessed Oct 2014. <http://deeplearning.net/software/theano/>
[16] Caffe. Accessed Oct 2014. <http://caffe.berkeleyvision.org/>
[17] Caltech 101. Accessed Oct 2014. <http://www.vision.caltech.edu/Image_Datasets/Caltech101/>
[18] PASCAL VOC. Accessed Oct 2014. <http://pascallin.ecs.soton.ac.uk/challenges/VOC/>
[19] Oxford Flowers. Accessed Oct 2014. <http://www.robots.ox.ac.uk/~vgg/data/flowers/>
[20] Stanford Dogs. Accessed Oct 2014. <http://vision.stanford.edu/aditya86/ImageNetDogs/>
[21] Animals with Attributes. Accessed Oct 2014. <http://attributes.kyb.tuebingen.mpg.de/>
[22] Thomas Serre, Lior Wolf, and Tomaso Poggio. Object Recognition with Features Inspired by Visual Cortex. Published 2005. Accessed Nov 2014.
[23] Kristen Grauman and Trevor Darrell. The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features. Published ICCV, 2005. Accessed Nov 2014.
[24] Jianchao Yang, Kai Yu, Yihong Gong, and Thomas Huang. Linear Spatial Pyramid Matching Using Sparse Coding for Image Classification. Published CVPR, 2009. Accessed Nov 2014.
[25] MNIST Handwritten digit database. Accessed Dec 2014. <http://yann.lecun.com/exdb/mnist/>
[26] Roberto Ierusalimschy, Luiz Henrique de Figueiredo, and Waldemar Celes. The Evolution of Lua. Accessed Jan 2015. <http://www.lua.org/doc/hopl.pdf>
[27] TIOBE Software: TIOBE Index. Accessed Jan 2015. <http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html>
[28] LuaJIT. Accessed Jan 2015. <http://luajit.org/luajit.html>
[29] LuaRocks. Accessed Jan 2015. <http://luarocks.org/>
[30] dp. Accessed Jan 2015. <http://dp.readthedocs.org/en/latest/index.html>
[31] Yann LeCun, Leon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-Based Learning Applied to Document Recognition. Published Nov 1998. Accessed Feb 2015.
[32] Daniel Keysers, Thomas Deselaers, Christian Gollan, and Hermann Ney. Deformation Models for Image Recognition. Published 2007. Accessed Feb 2015.
[33] Dennis Decoste and Bernhard Scholkopf. Training Invariant Support Vector Machines. Published 2002. Accessed Feb 2015.
[34] Dan Ciresan, Ueli Meier, Luca Gambardella, and Juergen Schmidhuber. Deep Big Simple Neural Nets Excel on Handwritten Digit Recognition. Published Mar 2010. Accessed Feb 2015.
[35] MNIST Nearest Neighbor Results. Accessed Feb 2015. <http://finmath.uchicago.edu/~wilder/Mnist/>
[36] CIFAR10 and CIFAR100 Datasets. Accessed April 2015. <http://www.cs.toronto.edu/~kriz/cifar.html>
[37] Chen-Yu Lee, Saining Xie, Patrick Gallagher, Zhengyou Zhang, and Zhuowen Tu. Deeply-Supervised Nets. Published 2014. Accessed April 2015.
[38] Min Lin, Qiang Chen, and Shuicheng Yan. Network In Network. Published 2013. Accessed April 2015.
[39] Robert Gens and Pedro Domingos. Discriminative Learning of Sum-Product Networks. Published 2012. Accessed April 2015.
[40] Julien Mairal, Piotr Koniusz, Zaid Harchaoui, and Cordelia Schmid. Convolutional Kernel Networks. Published 2014. Accessed April 2015.
[41] Tsung-Han Chan, Kui Jia, Shenghua Gao, Jiwen Lu, Zinan Zeng, and Yi Ma. PCANet: A Simple Deep Learning Baseline for Image Classification. Published 2014. Accessed April 2015.
[42] The Last AI Breakthrough DeepMind Made Before Google Bought It For $400m. Accessed April 2015. <https://medium.com/thephysicsarxivblog/the-last-ai-breakthrough-deepmind-made-before-google-bought-it-for-400m-7952031ee5e1>