Download as pdf or txt
Download as pdf or txt
You are on page 1of 225

12/31/2015 MQ - The Guide - MQ - The Guide

MQTheGuide
ByPieterHintjens,CEOofiMatix

Pleaseusetheissuetrackerforallcommentsanderrata.ThisversioncoversthelateststablereleaseofZeroMQ(3.2).Ifyouare
usingolderversionsofZeroMQthensomeoftheexamplesandexplanationswon'tbeaccurate.

TheGuideisoriginallyinC,butalsoinPHP,Python,Lua,andHaxe.We'vealsotranslatedmostoftheexamplesintoC++,C#,
CL,Delphi,Erlang,F#,Felix,Haskell,Java,ObjectiveC,Ruby,Ada,Basic,Clojure,Go,Haxe,Node.js,ooc,Perl,andScala.

Preface topprevnext

ZeroMQinaHundredWords topprevnext

ZeroMQ(alsoknownasMQ,0MQ,orzmq)lookslikeanembeddablenetworkinglibrarybutactslikeaconcurrencyframework.
Itgivesyousocketsthatcarryatomicmessagesacrossvarioustransportslikeinprocess,interprocess,TCP,andmulticast.You
canconnectsocketsNtoNwithpatternslikefanout,pubsub,taskdistribution,andrequestreply.It'sfastenoughtobethe
fabricforclusteredproducts.ItsasynchronousI/Omodelgivesyouscalablemulticoreapplications,builtasasynchronous
messageprocessingtasks.IthasascoreoflanguageAPIsandrunsonmostoperatingsystems.ZeroMQisfromiMatixandis
LGPLv3opensource.

HowItBegan topprevnext

WetookanormalTCPsocket,injecteditwithamixofradioactiveisotopesstolenfromasecretSovietatomicresearchproject,
bombardeditwith1950eracosmicrays,andputitintothehandsofadrugaddledcomicbookauthorwithabadlydisguised
fetishforbulgingmusclescladinspandex.Yes,ZeroMQsocketsaretheworldsavingsuperheroesofthenetworkingworld.

Figure1Aterribleaccident

http://zguide.zeromq.org/page:all 1/225
12/31/2015 MQ - The Guide - MQ - The Guide

TheZenofZero topprevnext

TheinZeroMQisallabouttradeoffs.OntheonehandthisstrangenamelowersZeroMQ'svisibilityonGoogleandTwitter.On
theotherhanditannoystheheckoutofsomeDanishfolkwhowriteusthingslike"MGrtfl",and"isnotafunnylooking
zero!"and"Rdgrdmedflde!",whichisapparentlyaninsultthatmeans"mayyourneighboursbethedirectdescendantsof
Grendel!"Seemslikeafairtrade.

OriginallythezeroinZeroMQwasmeantas"zerobroker"and(ascloseto)"zerolatency"(aspossible).Sincethen,ithascome
toencompassdifferentgoals:zeroadministration,zerocost,zerowaste.Moregenerally,"zero"referstothecultureof
minimalismthatpermeatestheproject.Weaddpowerbyremovingcomplexityratherthanbyexposingnewfunctionality.

Audience topprevnext

Thisbookiswrittenforprofessionalprogrammerswhowanttolearnhowtomakethemassivelydistributedsoftwarethatwill
dominatethefutureofcomputing.WeassumeyoucanreadCcode,becausemostoftheexampleshereareinCeventhough
ZeroMQisusedinmanylanguages.Weassumeyoucareaboutscale,becauseZeroMQsolvesthatproblemaboveallothers.
Weassumeyouneedthebestpossibleresultswiththeleastpossiblecost,becauseotherwiseyouwon'tappreciatethetrade
offsthatZeroMQmakes.Otherthanthatbasicbackground,wetrytopresentalltheconceptsinnetworkinganddistributed
computingyouwillneedtouseZeroMQ.

Acknowledgements topprevnext

ThankstoAndyOramformakingtheO'Reillybookhappen,andeditingthistext.

ThankstoBillDesmarais,BrianDorsey,DanielLin,EricDesgranges,GonzaloDiethelm,GuidoGoldstein,HunterFord,Kamil
Shakirov,MartinSustrik,MikeCastleman,NaveenChawla,NicolaPeduzzi,OliverSmith,OlivierChamoux,PeterAlexander,
PierreRouleau,RandyDryburgh,JohnUnwin,AlexThomas,MihailMinkov,JeremyAvnet,MichaelCompton,KamilKisiel,Mark
Kharitonov,GuillaumeAubert,IanBarber,MikeSheridan,FarukAkgul,OlegSidorov,LevGivon,AllisterMacLeod,Alexander
D'Archangel,AndreasHoelzlwimmer,HanHoll,RobertG.Jakabosky,FelipeCruz,MarcusMcCurdy,MikhailKulemin,Dr.Gerg
rdi,PavelZhukov,AlexanderElse,GiovanniRuggiero,Rick"Technoweenie",DanielLundin,DaveHoover,SimonJefford,
BenjaminPeterson,JustinCase,DevonWeller,RichardSmith,AlexanderMorland,WadimGrasza,MichaelJakl,Uwe
Dauernheim,SebastianNowicki,SimoneDeponti,AaronRaddon,DanColish,MarkusSchirp,BenoitLarroque,Jonathan
Palardy,IsaiahPeng,ArkadiuszOrzechowski,UmutAydin,MatthewHorsfall,JeremyW.Sherman,EricPugh,TylerSellon,John
E.Vincent,PavelMitin,MinRK,IgorWiedler,Olofkesson,PatrickLucas,HeowGoodman,SenthilPalanisami,JohnGallagher,
TomasRoos,StephenMcQuay,ErikAllik,ArnaudCogolugnes,RobGagnon,DanWilliams,EdwardSmith,JamesTucker,
KristianKristensen,VadimShalts,MartinTrojer,TomvanLeeuwen,HitenPandya,HarmAarts,MarcHarter,IskrenIvov
http://zguide.zeromq.org/page:all 2/225
12/31/2015 MQ - The Guide - MQ - The Guide
Chernev,JayHan,SoniaHamilton,NathanStocks,NaveenPalli,andZedShawfortheircontributionstothiswork.

Chapter1Basics topprevnext

FixingtheWorld topprevnext

HowtoexplainZeroMQ?Someofusstartbysayingallthewonderfulthingsitdoes.It'ssocketsonsteroids.It'slikemailboxes
withrouting.It'sfast!Otherstrytosharetheirmomentofenlightenment,thatzappowkaboomsatoriparadigmshiftmoment
whenitallbecameobvious.Thingsjustbecomesimpler.Complexitygoesaway.Itopensthemind.Otherstrytoexplainby
comparison.It'ssmaller,simpler,butstilllooksfamiliar.Personally,IliketorememberwhywemadeZeroMQatall,because
that'smostlikelywhereyou,thereader,stillaretoday.

Programmingissciencedressedupasartbecausemostofusdon'tunderstandthephysicsofsoftwareandit'srarely,ifever,
taught.Thephysicsofsoftwareisnotalgorithms,datastructures,languagesandabstractions.Thesearejusttoolswemake,
use,throwaway.Therealphysicsofsoftwareisthephysicsofpeoplespecifically,ourlimitationswhenitcomestocomplexity,
andourdesiretoworktogethertosolvelargeproblemsinpieces.Thisisthescienceofprogramming:makebuildingblocksthat
peoplecanunderstandanduseeasily,andpeoplewillworktogethertosolvetheverylargestproblems.

Weliveinaconnectedworld,andmodernsoftwarehastonavigatethisworld.Sothebuildingblocksfortomorrow'sverylargest
solutionsareconnectedandmassivelyparallel.It'snotenoughforcodetobe"strongandsilent"anymore.Codehastotalkto
code.Codehastobechatty,sociable,wellconnected.Codehastorunlikethehumanbrain,trillionsofindividualneuronsfiring
offmessagestoeachother,amassivelyparallelnetworkwithnocentralcontrol,nosinglepointoffailure,yetabletosolve
immenselydifficultproblems.Andit'snoaccidentthatthefutureofcodelookslikethehumanbrain,becausetheendpointsof
everynetworkare,atsomelevel,humanbrains.

Ifyou'vedoneanyworkwiththreads,protocols,ornetworks,you'llrealizethisisprettymuchimpossible.It'sadream.Even
connectingafewprogramsacrossafewsocketsisplainnastywhenyoustarttohandlereallifesituations.Trillions?Thecost
wouldbeunimaginable.Connectingcomputersissodifficultthatsoftwareandservicestodothisisamultibilliondollarbusiness.

Soweliveinaworldwherethewiringisyearsaheadofourabilitytouseit.Wehadasoftwarecrisisinthe1980s,whenleading
softwareengineerslikeFredBrooksbelievedtherewasno"SilverBullet"to"promiseevenoneorderofmagnitudeof
improvementinproductivity,reliability,orsimplicity".

Brooksmissedfreeandopensourcesoftware,whichsolvedthatcrisis,enablingustoshareknowledgeefficiently.Todayweface
anothersoftwarecrisis,butit'sonewedon'ttalkaboutmuch.Onlythelargest,richestfirmscanaffordtocreateconnected
applications.Thereisacloud,butit'sproprietary.Ourdataandourknowledgeisdisappearingfromourpersonalcomputersinto
cloudsthatwecannotaccessandwithwhichwecannotcompete.Whoownsoursocialnetworks?ItislikethemainframePC
revolutioninreverse.

Wecanleavethepoliticalphilosophyforanotherbook.ThepointisthatwhiletheInternetoffersthepotentialofmassively
connectedcode,therealityisthatthisisoutofreachformostofus,andsolargeinterestingproblems(inhealth,education,
economics,transport,andsoon)remainunsolvedbecausethereisnowaytoconnectthecode,andthusnowaytoconnectthe
brainsthatcouldworktogethertosolvetheseproblems.

Therehavebeenmanyattemptstosolvethechallengeofconnectedcode.TherearethousandsofIETFspecifications,each
solvingpartofthepuzzle.Forapplicationdevelopers,HTTPisperhapstheonesolutiontohavebeensimpleenoughtowork,but
itarguablymakestheproblemworsebyencouragingdevelopersandarchitectstothinkintermsofbigserversandthin,stupid
clients.

SotodaypeoplearestillconnectingapplicationsusingrawUDPandTCP,proprietaryprotocols,HTTP,andWebsockets.It
remainspainful,slow,hardtoscale,andessentiallycentralized.DistributedP2Parchitecturesaremostlyforplay,notwork.How
manyapplicationsuseSkypeorBittorrenttoexchangedata?

Whichbringsusbacktothescienceofprogramming.Tofixtheworld,weneededtodotwothings.One,tosolvethegeneral
problemof"howtoconnectanycodetoanycode,anywhere".Two,towrapthatupinthesimplestpossiblebuildingblocksthat
peoplecouldunderstandanduseeasily.

Itsoundsridiculouslysimple.Andmaybeitis.That'skindofthewholepoint.
http://zguide.zeromq.org/page:all 3/225
12/31/2015 MQ - The Guide - MQ - The Guide

StartingAssumptions topprevnext

Weassumeyouareusingatleastversion3.2ofZeroMQ.WeassumeyouareusingaLinuxboxorsomethingsimilar.We
assumeyoucanreadCcode,moreorless,asthat'sthedefaultlanguagefortheexamples.Weassumethatwhenwewrite
constantslikePUSHorSUBSCRIBE,youcanimaginetheyarereallycalledZMQ_PUSHorZMQ_SUBSCRIBEiftheprogramming
languageneedsit.

GettingtheExamples topprevnext

TheexamplesliveinapublicGitHubrepository.Thesimplestwaytogetalltheexamplesistoclonethisrepository:

gitclonedepth=1https://github.com/imatix/zguide.git

Next,browsetheexamplessubdirectory.You'llfindexamplesbylanguage.Ifthereareexamplesmissinginalanguageyouuse,
you'reencouragedtosubmitatranslation.Thisishowthistextbecamesouseful,thankstotheworkofmanypeople.All
examplesarelicensedunderMIT/X11.

AskandYeShallReceive topprevnext

Solet'sstartwithsomecode.WestartofcoursewithaHelloWorldexample.We'llmakeaclientandaserver.Theclientsends
"Hello"totheserver,whichreplieswith"World".Here'stheserverinC,whichopensaZeroMQsocketonport5555,reads
requestsonit,andreplieswith"World"toeachrequest:

hwserver:HelloWorldserverinC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|Perl|PHP|Python|Q|Racket|Ruby|Scala
|Tcl|Ada|Basic|ooc

Figure2RequestReply

TheREQREPsocketpairisinlockstep.Theclientissueszmq_send()andthenzmq_recv(),inaloop(oronceifthat'sallit
needs).Doinganyothersequence(e.g.,sendingtwomessagesinarow)willresultinareturncodeof1fromthesendorrecv
call.Similarly,theserviceissueszmq_recv()andthenzmq_send()inthatorder,asoftenasitneedsto.

http://zguide.zeromq.org/page:all 4/225
12/31/2015 MQ - The Guide - MQ - The Guide
ZeroMQusesCasitsreferencelanguageandthisisthemainlanguagewe'lluseforexamples.Ifyou'rereadingthisonline,the
linkbelowtheexampletakesyoutotranslationsintootherprogramminglanguages.Let'scomparethesameserverinC++:

//
//HelloWorldserverinC++
//BindsREPsockettotcp://*:5555
//Expects"Hello"fromclient,replieswith"World"
//
#include<zmq.hpp>
#include<string>
#include<iostream>
#ifndef_WIN32
#include<unistd.h>
#else
#include<windows.h>

#definesleep(n)Sleep(n)
#endif

intmain(){
//Prepareourcontextandsocket
zmq::context_tcontext(1)
zmq::socket_tsocket(context,ZMQ_REP)
socket.bind("tcp://*:5555")

while(true){
zmq::message_trequest

//Waitfornextrequestfromclient
socket.recv(&request)
std::cout<<"ReceivedHello"<<std::endl

//Dosome'work'
sleep(1)

//Sendreplybacktoclient
zmq::message_treply(5)
memcpy((void*)reply.data(),"World",5)
socket.send(reply)
}
return0
}

hwserver.cpp:HelloWorldserver

YoucanseethattheZeroMQAPIissimilarinCandC++.InalanguagelikePHPorJava,wecanhideevenmoreandthecode
becomeseveneasiertoread:

<?php
/*
*HelloWorldserver
*BindsREPsockettotcp://*:5555
*Expects"Hello"fromclient,replieswith"World"
*@authorIanBarber<ian(dot)barber(at)gmail(dot)com>
*/

$context=newZMQContext(1)

//Sockettotalktoclients
$responder=newZMQSocket($context,ZMQ::SOCKET_REP)
$responder>bind("tcp://*:5555")

http://zguide.zeromq.org/page:all 5/225
12/31/2015 MQ - The Guide - MQ - The Guide
while(true){
//Waitfornextrequestfromclient
$request=$responder>recv()
printf("Receivedrequest:[%s]\n",$request)

//Dosome'work'
sleep(1)

//Sendreplybacktoclient
$responder>send("World")
}

hwserver.php:HelloWorldserver

//
//HelloWorldserverinJava
//BindsREPsockettotcp://*:5555
//Expects"Hello"fromclient,replieswith"World"
//

importorg.zeromq.ZMQ

publicclasshwserver{

publicstaticvoidmain(String[]args)throwsException{
ZMQ.Contextcontext=ZMQ.context(1)

//Sockettotalktoclients
ZMQ.Socketresponder=context.socket(ZMQ.REP)
responder.bind("tcp://*:5555")

while(!Thread.currentThread().isInterrupted()){
//Waitfornextrequestfromtheclient
byte[]request=responder.recv(0)
System.out.println("ReceivedHello")

//Dosome'work'
Thread.sleep(1000)

//Sendreplybacktoclient
Stringreply="World"
responder.send(reply.getBytes(),0)
}
responder.close()
context.term()
}
}

hwserver.java:HelloWorldserver

Theserverinotherlanguages:

hwserver:HelloWorldserverinC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|Perl|PHP|Python|Q|Racket|Ruby|Scala
|Tcl|Ada|Basic|ooc

Here'stheclientcode:

hwclient:HelloWorldclientinC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|Perl|PHP|Python|Q|Racket|Ruby|Scala
|Tcl|Ada|Basic|ooc

http://zguide.zeromq.org/page:all 6/225
12/31/2015 MQ - The Guide - MQ - The Guide
Nowthislookstoosimpletoberealistic,butZeroMQsocketshave,aswealreadylearned,superpowers.Youcouldthrow
thousandsofclientsatthisserver,allatonce,anditwouldcontinuetoworkhappilyandquickly.Forfun,trystartingtheclient
andthenstartingtheserver,seehowitallstillworks,thenthinkforasecondwhatthismeans.

Letusexplainbrieflywhatthesetwoprogramsareactuallydoing.TheycreateaZeroMQcontexttoworkwith,andasocket.
Don'tworrywhatthewordsmean.You'llpickitup.TheserverbindsitsREP(reply)sockettoport5555.Theserverwaitsfora
requestinaloop,andrespondseachtimewithareply.Theclientsendsarequestandreadsthereplybackfromtheserver.

Ifyoukilltheserver(CtrlC)andrestartit,theclientwon'trecoverproperly.Recoveringfromcrashingprocessesisn'tquitethat
easy.Makingareliablerequestreplyflowiscomplexenoughthatwewon'tcoverituntilChapter4ReliableRequestReply
Patterns.

Thereisalothappeningbehindthescenesbutwhatmatterstousprogrammersishowshortandsweetthecodeis,andhow
oftenitdoesn'tcrash,evenunderaheavyload.Thisistherequestreplypattern,probablythesimplestwaytouseZeroMQ.It
mapstoRPCandtheclassicclient/servermodel.

AMinorNoteonStrings topprevnext

ZeroMQdoesn'tknowanythingaboutthedatayousendexceptitssizeinbytes.Thatmeansyouareresponsibleforformattingit
safelysothatapplicationscanreaditback.Doingthisforobjectsandcomplexdatatypesisajobforspecializedlibrarieslike
ProtocolBuffers.Butevenforstrings,youneedtotakecare.

InCandsomeotherlanguages,stringsareterminatedwithanullbyte.Wecouldsendastringlike"HELLO"withthatextranull
byte:

zmq_send(requester,"Hello",6,0)

However,ifyousendastringfromanotherlanguage,itprobablywillnotincludethatnullbyte.Forexample,whenwesendthat
samestringinPython,wedothis:

socket.send("Hello")

Thenwhatgoesontothewireisalength(onebyteforshorterstrings)andthestringcontentsasindividualcharacters.

Figure3AZeroMQstring

AndifyoureadthisfromaCprogram,youwillgetsomethingthatlookslikeastring,andmightbyaccidentactlikeastring(ifby
luckthefivebytesfindthemselvesfollowedbyaninnocentlylurkingnull),butisn'taproperstring.Whenyourclientandserver
don'tagreeonthestringformat,youwillgetweirdresults.

WhenyoureceivestringdatafromZeroMQinC,yousimplycannottrustthatit'ssafelyterminated.Everysingletimeyoureada
string,youshouldallocateanewbufferwithspaceforanextrabyte,copythestring,andterminateitproperlywithanull.

Solet'sestablishtherulethatZeroMQstringsarelengthspecifiedandaresentonthewirewithoutatrailingnull.Inthe
simplestcase(andwe'lldothisinourexamples),aZeroMQstringmapsneatlytoaZeroMQmessageframe,whichlookslikethe
abovefigurealengthandsomebytes.

Hereiswhatweneedtodo,inC,toreceiveaZeroMQstringanddeliverittotheapplicationasavalidCstring:

//ReceiveZeroMQstringfromsocketandconvertintoCstring
//Chopsstringat255chars,ifit'slonger
staticchar*

http://zguide.zeromq.org/page:all 7/225
12/31/2015 MQ - The Guide - MQ - The Guide
s_recv(void*socket){
charbuffer[256]
intsize=zmq_rec
v(socket,buffer,255,0)
if(size==1)

returnNULL
if(size>255)
size=255
buffer[size]=
0
returnstrdup(buf
fer)
}

Thismakesahandyhelperfunctionandinthespiritofmakingthingswecanreuseprofitably,let'swriteasimilars_sendfunction
thatsendsstringsinthecorrectZeroMQformat,andpackagethisintoaheaderfilewecanreuse.

Theresultiszhelpers.h,whichletsuswritesweeterandshorterZeroMQapplicationsinC.Itisafairlylongsource,andonly
funforCdevelopers,soreaditatleisure.

VersionReporting topprevnext

ZeroMQdoescomeinseveralversionsandquiteoften,ifyouhitaproblem,it'llbesomethingthat'sbeenfixedinalaterversion.
Soit'sausefultricktoknowexactlywhatversionofZeroMQyou'reactuallylinkingwith.

Hereisatinyprogramthatdoesthat:

version:ZeroMQversionreportinginC

C++|C#|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Java|Lua|Node.js|ObjectiveC|Perl|PHP|Python|Q|Ruby|Scala|Tcl|Ada|Basic|
Clojure|Haxe|ooc|Racket

GettingtheMessageOut topprevnext

Thesecondclassicpatternisonewaydatadistribution,inwhichaserverpushesupdatestoasetofclients.Let'sseean
examplethatpushesoutweatherupdatesconsistingofazipcode,temperature,andrelativehumidity.We'llgeneraterandom
values,justliketherealweatherstationsdo.

Here'stheserver.We'lluseport5556forthisapplication:

wuserver:WeatherupdateserverinC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|Perl|PHP|Python|Racket|Ruby|Scala|
Tcl|Ada|Basic|ooc|Q

There'snostartandnoendtothisstreamofupdates,it'slikeaneverendingbroadcast.

Hereistheclientapplication,whichlistenstothestreamofupdatesandgrabsanythingtodowithaspecifiedzipcode,bydefault
NewYorkCitybecausethat'sagreatplacetostartanyadventure:

wuclient:WeatherupdateclientinC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|Perl|PHP|Python|Racket|Ruby|Scala|
Tcl|Ada|Basic|ooc|Q

Figure4PublishSubscribe

http://zguide.zeromq.org/page:all 8/225
12/31/2015 MQ - The Guide - MQ - The Guide

NotethatwhenyouuseaSUBsocketyoumustsetasubscriptionusingzmq_setsockopt()andSUBSCRIBE,asinthiscode.
Ifyoudon'tsetanysubscription,youwon'tgetanymessages.It'sacommonmistakeforbeginners.Thesubscribercansetmany
subscriptions,whichareaddedtogether.Thatis,ifanupdatematchesANYsubscription,thesubscriberreceivesit.The
subscribercanalsocancelspecificsubscriptions.Asubscriptionisoften,butnotnecessarilyaprintablestring.See
zmq_setsockopt()forhowthisworks.

ThePUBSUBsocketpairisasynchronous.Theclientdoeszmq_recv(),inaloop(oronceifthat'sallitneeds).Tryingtosend
amessagetoaSUBsocketwillcauseanerror.Similarly,theservicedoeszmq_send()asoftenasitneedsto,butmustnotdo
zmq_recv()onaPUBsocket.

IntheorywithZeroMQsockets,itdoesnotmatterwhichendconnectsandwhichendbinds.However,inpracticethereare
undocumenteddifferencesthatI'llcometolater.Fornow,bindthePUBandconnecttheSUB,unlessyournetworkdesignmakes
thatimpossible.

ThereisonemoreimportantthingtoknowaboutPUBSUBsockets:youdonotknowpreciselywhenasubscriberstartstoget
messages.Evenifyoustartasubscriber,waitawhile,andthenstartthepublisher,thesubscriberwillalwaysmissthefirst
messagesthatthepublishersends.Thisisbecauseasthesubscriberconnectstothepublisher(somethingthattakesasmall
butnonzerotime),thepublishermayalreadybesendingmessagesout.

This"slowjoiner"symptomhitsenoughpeopleoftenenoughthatwe'regoingtoexplainitindetail.RememberthatZeroMQdoes
asynchronousI/O,i.e.,inthebackground.Sayyouhavetwonodesdoingthis,inthisorder:

Subscriberconnectstoanendpointandreceivesandcountsmessages.
Publisherbindstoanendpointandimmediatelysends1,000messages.

Thenthesubscriberwillmostlikelynotreceiveanything.You'llblink,checkthatyousetacorrectfilterandtryagain,andthe
subscriberwillstillnotreceiveanything.

MakingaTCPconnectioninvolvestoandfromhandshakingthattakesseveralmillisecondsdependingonyournetworkandthe
numberofhopsbetweenpeers.Inthattime,ZeroMQcansendmanymessages.Forsakeofargumentassumeittakes5msecs
toestablishaconnection,andthatsamelinkcanhandle1Mmessagespersecond.Duringthe5msecsthatthesubscriberis
connectingtothepublisher,ittakesthepublisheronly1msectosendoutthose1Kmessages.

InChapter2SocketsandPatternswe'llexplainhowtosynchronizeapublisherandsubscriberssothatyoudon'tstarttopublish
datauntilthesubscribersreallyareconnectedandready.Thereisasimpleandstupidwaytodelaythepublisher,whichisto
sleep.Don'tdothisinarealapplication,though,becauseitisextremelyfragileaswellasinelegantandslow.Usesleepsto
provetoyourselfwhat'shappening,andthenwaitforChapter2SocketsandPatternstoseehowtodothisright.

Thealternativetosynchronizationistosimplyassumethatthepublisheddatastreamisinfiniteandhasnostartandnoend.One

http://zguide.zeromq.org/page:all 9/225
12/31/2015 MQ - The Guide - MQ - The Guide
alsoassumesthatthesubscriberdoesn'tcarewhattranspiredbeforeitstartedup.Thisishowwebuiltourweatherclient
example.

Sotheclientsubscribestoitschosenzipcodeandcollects100updatesforthatzipcode.Thatmeansabouttenmillionupdates
fromtheserver,ifzipcodesarerandomlydistributed.Youcanstarttheclient,andthentheserver,andtheclientwillkeep
working.Youcanstopandrestarttheserverasoftenasyoulike,andtheclientwillkeepworking.Whentheclienthascollected
itshundredupdates,itcalculatestheaverage,printsit,andexits.

Somepointsaboutthepublishsubscribe(pubsub)pattern:

Asubscribercanconnecttomorethanonepublisher,usingoneconnectcalleachtime.Datawillthenarriveandbe
interleaved("fairqueued")sothatnosinglepublisherdrownsouttheothers.

Ifapublisherhasnoconnectedsubscribers,thenitwillsimplydropallmessages.

Ifyou'reusingTCPandasubscriberisslow,messageswillqueueuponthepublisher.We'lllookathowtoprotect
publishersagainstthisusingthe"highwatermark"later.

FromZeroMQv3.x,filteringhappensatthepublishersidewhenusingaconnectedprotocol(tcp://oripc://).Using
theepgm://protocol,filteringhappensatthesubscriberside.InZeroMQv2.x,allfilteringhappenedatthesubscriber
side.

Thisishowlongittakestoreceiveandfilter10Mmessagesonmylaptop,whichisan2011eraInteli5,decentbutnothing
special:

$timewuclient
Collectingupdatesfromweatherserver...
Averagetemperatureforzipcode'10001'was28F

real0m4.470s
user0m0.000s
sys0m0.008s

DivideandConquer topprevnext

Figure5ParallelPipeline

http://zguide.zeromq.org/page:all 10/225
12/31/2015 MQ - The Guide - MQ - The Guide

Asafinalexample(youaresurelygettingtiredofjuicycodeandwanttodelvebackintophilologicaldiscussionsabout
comparativeabstractivenorms),let'sdoalittlesupercomputing.Thencoffee.Oursupercomputingapplicationisafairlytypical
parallelprocessingmodel.Wehave:

Aventilatorthatproducestasksthatcanbedoneinparallel
Asetofworkersthatprocesstasks
Asinkthatcollectsresultsbackfromtheworkerprocesses

Inreality,workersrunonsuperfastboxes,perhapsusingGPUs(graphicprocessingunits)todothehardmath.Hereisthe
ventilator.Itgenerates100tasks,eachamessagetellingtheworkertosleepforsomenumberofmilliseconds:

taskvent:ParalleltaskventilatorinC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|Perl|PHP|Python|Ruby|Scala|Tcl|Ada|
Basic|ooc|Q|Racket

Hereistheworkerapplication.Itreceivesamessage,sleepsforthatnumberofseconds,andthensignalsthatit'sfinished:

taskwork:ParalleltaskworkerinC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|Perl|PHP|Python|Ruby|Scala|Tcl|Ada|
Basic|ooc|Q|Racket

Hereisthesinkapplication.Itcollectsthe100tasks,thencalculateshowlongtheoverallprocessingtook,sowecanconfirmthat
theworkersreallywererunninginparalleliftherearemorethanoneofthem:

tasksink:ParalleltasksinkinC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|Perl|PHP|Python|Ruby|Scala|Tcl|Ada|
Basic|ooc|Q|Racket

Theaveragecostofabatchis5seconds.Whenwestart1,2,or4workerswegetresultslikethisfromthesink:

http://zguide.zeromq.org/page:all 11/225
12/31/2015 MQ - The Guide - MQ - The Guide
1worker:totalelapsedtime:5034msecs.
2workers:totalelapsedtime:2421msecs.
4workers:totalelapsedtime:1018msecs.

Let'slookatsomeaspectsofthiscodeinmoredetail:

Theworkersconnectupstreamtotheventilator,anddownstreamtothesink.Thismeansyoucanaddworkersarbitrarily.
Iftheworkersboundtotheirendpoints,youwouldneed(a)moreendpointsand(b)tomodifytheventilatorand/orthesink
eachtimeyouaddedaworker.Wesaythattheventilatorandsinkarestablepartsofourarchitectureandtheworkersare
dynamicpartsofit.

Wehavetosynchronizethestartofthebatchwithallworkersbeingupandrunning.Thisisafairlycommongotchain
ZeroMQandthereisnoeasysolution.Thezmq_connectmethodtakesacertaintime.Sowhenasetofworkersconnect
totheventilator,thefirstonetosuccessfullyconnectwillgetawholeloadofmessagesinthatshorttimewhiletheothers
arealsoconnecting.Ifyoudon'tsynchronizethestartofthebatchsomehow,thesystemwon'truninparallelatall.Try
removingthewaitintheventilator,andseewhathappens.

Theventilator'sPUSHsocketdistributestaskstoworkers(assumingtheyareallconnectedbeforethebatchstartsgoing
out)evenly.Thisiscalledloadbalancingandit'ssomethingwe'lllookatagaininmoredetail.

Thesink'sPULLsocketcollectsresultsfromworkersevenly.Thisiscalledfairqueuing.

Figure6FairQueuing

Thepipelinepatternalsoexhibitsthe"slowjoiner"syndrome,leadingtoaccusationsthatPUSHsocketsdon'tloadbalance
properly.IfyouareusingPUSHandPULL,andoneofyourworkersgetswaymoremessagesthantheothers,it'sbecausethat
PULLsockethasjoinedfasterthantheothers,andgrabsalotofmessagesbeforetheothersmanagetoconnect.Ifyouwant
properloadbalancing,youprobablywanttolookattheloadbalancingpatterninChapter3AdvancedRequestReplyPatterns.

ProgrammingwithZeroMQ topprevnext

Havingseensomeexamples,youmustbeeagertostartusingZeroMQinsomeapps.Beforeyoustartthat,takeadeepbreath,
chillax,andreflectonsomebasicadvicethatwillsaveyoumuchstressandconfusion.

LearnZeroMQstepbystep.It'sjustonesimpleAPI,butithidesaworldofpossibilities.Takethepossibilitiesslowlyand
mastereachone.

Writenicecode.Uglycodehidesproblemsandmakesithardforotherstohelpyou.Youmightgetusedtomeaningless
variablenames,butpeoplereadingyourcodewon't.Usenamesthatarerealwords,thatsaysomethingotherthan"I'm
toocarelesstotellyouwhatthisvariableisreallyfor".Useconsistentindentationandcleanlayout.Writenicecodeand
yourworldwillbemorecomfortable.

Testwhatyoumakeasyoumakeit.Whenyourprogramdoesn'twork,youshouldknowwhatfivelinesaretoblame.This
isespeciallytruewhenyoudoZeroMQmagic,whichjustwon'tworkthefirstfewtimesyoutryit.
http://zguide.zeromq.org/page:all 12/225
12/31/2015 MQ - The Guide - MQ - The Guide
Whenyoufindthatthingsdon'tworkasexpected,breakyourcodeintopieces,testeachone,seewhichoneisnot
working.ZeroMQletsyoumakeessentiallymodularcodeusethattoyouradvantage.

Makeabstractions(classes,methods,whatever)asyouneedthem.Ifyoucopy/pastealotofcode,you'regoingto
copy/pasteerrors,too.

GettingtheContextRight topprevnext

ZeroMQapplicationsalwaysstartbycreatingacontext,andthenusingthatforcreatingsockets.InC,it'sthezmq_ctx_new()
call.Youshouldcreateanduseexactlyonecontextinyourprocess.Technically,thecontextisthecontainerforallsocketsina
singleprocess,andactsasthetransportforinprocsockets,whicharethefastestwaytoconnectthreadsinoneprocess.Ifat
runtimeaprocesshastwocontexts,thesearelikeseparateZeroMQinstances.Ifthat'sexplicitlywhatyouwant,OK,but
otherwiseremember:

Callzmq_ctx_new()onceatthestartofaprocess,andzmq_ctx_destroy()onceattheend.

Ifyou'reusingthefork()systemcall,dozmq_ctx_new()aftertheforkandatthebeginningofthechildprocesscode.In
general,youwanttodointeresting(ZeroMQ)stuffinthechildren,andboringprocessmanagementintheparent.

MakingaCleanExit topprevnext

Classyprogrammerssharethesamemottoasclassyhitmen:alwayscleanupwhenyoufinishthejob.WhenyouuseZeroMQin
alanguagelikePython,stuffgetsautomaticallyfreedforyou.ButwhenusingC,youhavetocarefullyfreeobjectswhenyou're
finishedwiththemorelseyougetmemoryleaks,unstableapplications,andgenerallybadkarma.

Memoryleaksareonething,butZeroMQisquitefinickyabouthowyouexitanapplication.Thereasonsaretechnicaland
painful,buttheupshotisthatifyouleaveanysocketsopen,thezmq_ctx_destroy()functionwillhangforever.Andevenifyou
closeallsockets,zmq_ctx_destroy()willbydefaultwaitforeveriftherearependingconnectsorsendsunlessyousetthe
LINGERtozeroonthosesocketsbeforeclosingthem.

TheZeroMQobjectsweneedtoworryaboutaremessages,sockets,andcontexts.Luckilyit'squitesimple,atleastinsimple
programs:

Usezmq_send()andzmq_recv()whenyoucan,asitavoidstheneedtoworkwithzmq_msg_tobjects.

Ifyoudousezmq_msg_recv(),alwaysreleasethereceivedmessageassoonasyou'redonewithit,bycalling
zmq_msg_close().

Ifyouareopeningandclosingalotofsockets,that'sprobablyasignthatyouneedtoredesignyourapplication.Insome
casessockethandleswon'tbefreeduntilyoudestroythecontext.

Whenyouexittheprogram,closeyoursocketsandthencallzmq_ctx_destroy().Thisdestroysthecontext.

ThisisatleastthecaseforCdevelopment.Inalanguagewithautomaticobjectdestruction,socketsandcontextswillbe
destroyedasyouleavethescope.Ifyouuseexceptionsyou'llhavetodothecleanupinsomethinglikea"final"block,thesame
asforanyresource.

Ifyou'redoingmultithreadedwork,itgetsrathermorecomplexthanthis.We'llgettomultithreadinginthenextchapter,but
becausesomeofyouwill,despitewarnings,trytorunbeforeyoucansafelywalk,belowisthequickanddirtyguidetomakinga
cleanexitinamultithreadedZeroMQapplication.

First,donottrytousethesamesocketfrommultiplethreads.Pleasedon'texplainwhyyouthinkthiswouldbeexcellentfun,just
pleasedon'tdoit.Next,youneedtoshutdowneachsocketthathasongoingrequests.TheproperwayistosetalowLINGER
value(1second),andthenclosethesocket.Ifyourlanguagebindingdoesn'tdothisforyouautomaticallywhenyoudestroya
context,I'dsuggestsendingapatch.

Finally,destroythecontext.Thiswillcauseanyblockingreceivesorpollsorsendsinattachedthreads(i.e.,whichsharethe
samecontext)toreturnwithanerror.Catchthaterror,andthensetlingeron,andclosesocketsinthatthread,andexit.Donot
destroythesamecontexttwice.Thezmq_ctx_destroyinthemainthreadwillblockuntilallsocketsitknowsaboutaresafely
http://zguide.zeromq.org/page:all 13/225
12/31/2015 MQ - The Guide - MQ - The Guide
closed.

Voila!It'scomplexandpainfulenoughthatanylanguagebindingauthorworthhisorhersaltwilldothisautomaticallyandmake
thesocketclosingdanceunnecessary.

WhyWeNeededZeroMQ topprevnext

Nowthatyou'veseenZeroMQinaction,let'sgobacktothe"why".

Manyapplicationsthesedaysconsistofcomponentsthatstretchacrosssomekindofnetwork,eitheraLANortheInternet.So
manyapplicationdevelopersendupdoingsomekindofmessaging.Somedevelopersusemessagequeuingproducts,butmost
ofthetimetheydoitthemselves,usingTCPorUDP.Theseprotocolsarenothardtouse,butthereisagreatdifferencebetween
sendingafewbytesfromAtoB,anddoingmessaginginanykindofreliableway.

Let'slookatthetypicalproblemswefacewhenwestarttoconnectpiecesusingrawTCP.Anyreusablemessaginglayerwould
needtosolveallormostofthese:

HowdowehandleI/O?Doesourapplicationblock,ordowehandleI/Ointhebackground?Thisisakeydesigndecision.
BlockingI/Ocreatesarchitecturesthatdonotscalewell.ButbackgroundI/Ocanbeveryhardtodoright.

Howdowehandledynamiccomponents,i.e.,piecesthatgoawaytemporarily?Doweformallysplitcomponentsinto
"clients"and"servers"andmandatethatserverscannotdisappear?Whatthenifwewanttoconnectserverstoservers?
Dowetrytoreconnecteveryfewseconds?

Howdowerepresentamessageonthewire?Howdoweframedatasoit'seasytowriteandread,safefrombuffer
overflows,efficientforsmallmessages,yetadequatefortheverylargestvideosofdancingcatswearingpartyhats?

Howdowehandlemessagesthatwecan'tdeliverimmediately?Particularly,ifwe'rewaitingforacomponenttocome
backonline?Dowediscardmessages,putthemintoadatabase,orintoamemoryqueue?

Wheredowestoremessagequeues?Whathappensifthecomponentreadingfromaqueueisveryslowandcausesour
queuestobuildup?What'sourstrategythen?

Howdowehandlelostmessages?Dowewaitforfreshdata,requestaresend,ordowebuildsomekindofreliabilitylayer
thatensuresmessagescannotbelost?Whatifthatlayeritselfcrashes?

Whatifweneedtouseadifferentnetworktransport.Say,multicastinsteadofTCPunicast?OrIPv6?Doweneedto
rewritetheapplications,oristhetransportabstractedinsomelayer?

Howdoweroutemessages?Canwesendthesamemessagetomultiplepeers?Canwesendrepliesbacktoanoriginal
requester?

HowdowewriteanAPIforanotherlanguage?Dowereimplementawirelevelprotocolordowerepackagealibrary?If
theformer,howcanweguaranteeefficientandstablestacks?Ifthelatter,howcanweguaranteeinteroperability?

Howdowerepresentdatasothatitcanbereadbetweendifferentarchitectures?Doweenforceaparticularencodingfor
datatypes?Howfaristhisthejobofthemessagingsystemratherthanahigherlayer?

Howdowehandlenetworkerrors?Dowewaitandretry,ignorethemsilently,orabort?

TakeatypicalopensourceprojectlikeHadoopZookeeperandreadtheCAPIcodeinsrc/c/src/zookeeper.c.WhenIread
thiscode,inJanuary2013,itwas4,200linesofmysteryandinthereisanundocumented,client/servernetworkcommunication
protocol.Iseeit'sefficientbecauseitusespollinsteadofselect.Butreally,Zookeepershouldbeusingagenericmessaging
layerandanexplicitlydocumentedwirelevelprotocol.Itisincrediblywastefulforteamstobebuildingthisparticularwheelover
andover.

Buthowtomakeareusablemessaginglayer?Why,whensomanyprojectsneedthistechnology,arepeoplestilldoingitthe
hardwaybydrivingTCPsocketsintheircode,andsolvingtheproblemsinthatlonglistoverandover?

Itturnsoutthatbuildingreusablemessagingsystemsisreallydifficult,whichiswhyfewFOSSprojectsevertried,andwhy
commercialmessagingproductsarecomplex,expensive,inflexible,andbrittle.In2006,iMatixdesignedAMQPwhichstartedto
giveFOSSdevelopersperhapsthefirstreusablerecipeforamessagingsystem.AMQPworksbetterthanmanyotherdesigns,
butremainsrelativelycomplex,expensive,andbrittle.Ittakesweekstolearntouse,andmonthstocreatestablearchitectures
thatdon'tcrashwhenthingsgethairy.

http://zguide.zeromq.org/page:all 14/225
12/31/2015 MQ - The Guide - MQ - The Guide
Figure7MessagingasitStarts

Mostmessagingprojects,likeAMQP,thattrytosolvethislonglistofproblemsinareusablewaydosobyinventinganew
concept,the"broker",thatdoesaddressing,routing,andqueuing.Thisresultsinaclient/serverprotocolorasetofAPIsontopof
someundocumentedprotocolthatallowsapplicationstospeaktothisbroker.Brokersareanexcellentthinginreducingthe
complexityoflargenetworks.ButaddingbrokerbasedmessagingtoaproductlikeZookeeperwouldmakeitworse,notbetter.It
wouldmeanaddinganadditionalbigbox,andanewsinglepointoffailure.Abrokerrapidlybecomesabottleneckandanewrisk
tomanage.Ifthesoftwaresupportsit,wecanaddasecond,third,andfourthbrokerandmakesomefailoverscheme.Peopledo
this.Itcreatesmoremovingpieces,morecomplexity,andmorethingstobreak.

Andabrokercentricsetupneedsitsownoperationsteam.Youliterallyneedtowatchthebrokersdayandnight,andbeatthem
withastickwhentheystartmisbehaving.Youneedboxes,andyouneedbackupboxes,andyouneedpeopletomanagethose
boxes.Itisonlyworthdoingforlargeapplicationswithmanymovingpieces,builtbyseveralteamsofpeopleoverseveralyears.

Figure8MessagingasitBecomes

http://zguide.zeromq.org/page:all 15/225
12/31/2015 MQ - The Guide - MQ - The Guide
Sosmalltomediumapplicationdevelopersaretrapped.Eithertheyavoidnetworkprogrammingandmakemonolithic
applicationsthatdonotscale.Ortheyjumpintonetworkprogrammingandmakebrittle,complexapplicationsthatarehardto
maintain.Ortheybetonamessagingproduct,andendupwithscalableapplicationsthatdependonexpensive,easilybroken
technology.Therehasbeennoreallygoodchoice,whichismaybewhymessagingislargelystuckinthelastcenturyandstirs
strongemotions:negativeonesforusers,gleefuljoyforthosesellingsupportandlicenses.

Whatweneedissomethingthatdoesthejobofmessaging,butdoesitinsuchasimpleandcheapwaythatitcanworkinany
application,withclosetozerocost.Itshouldbealibrarywhichyoujustlink,withoutanyotherdependencies.Noadditional
movingpieces,sonoadditionalrisk.ItshouldrunonanyOSandworkwithanyprogramminglanguage.

AndthisisZeroMQ:anefficient,embeddablelibrarythatsolvesmostoftheproblemsanapplicationneedstobecomenicely
elasticacrossanetwork,withoutmuchcost.

Specifically:

IthandlesI/Oasynchronously,inbackgroundthreads.Thesecommunicatewithapplicationthreadsusinglockfreedata
structures,soconcurrentZeroMQapplicationsneednolocks,semaphores,orotherwaitstates.

ComponentscancomeandgodynamicallyandZeroMQwillautomaticallyreconnect.Thismeansyoucanstart
componentsinanyorder.Youcancreate"serviceorientedarchitectures"(SOAs)whereservicescanjoinandleavethe
networkatanytime.

Itqueuesmessagesautomaticallywhenneeded.Itdoesthisintelligently,pushingmessagesascloseaspossibletothe
receiverbeforequeuingthem.

Ithaswaysofdealingwithoverfullqueues(called"highwatermark").Whenaqueueisfull,ZeroMQautomaticallyblocks
senders,orthrowsawaymessages,dependingonthekindofmessagingyouaredoing(thesocalled"pattern").

Itletsyourapplicationstalktoeachotheroverarbitrarytransports:TCP,multicast,inprocess,interprocess.Youdon't
needtochangeyourcodetouseadifferenttransport.

Ithandlesslow/blockedreaderssafely,usingdifferentstrategiesthatdependonthemessagingpattern.

Itletsyouroutemessagesusingavarietyofpatternssuchasrequestreplyandpubsub.Thesepatternsarehowyou
createthetopology,thestructureofyournetwork.

Itletsyoucreateproxiestoqueue,forward,orcapturemessageswithasinglecall.Proxiescanreducetheinterconnection
complexityofanetwork.

Itdeliverswholemessagesexactlyastheyweresent,usingasimpleframingonthewire.Ifyouwritea10kmessage,you
willreceivea10kmessage.

Itdoesnotimposeanyformatonmessages.Theyareblobsfromzerotogigabyteslarge.Whenyouwanttorepresent
datayouchoosesomeotherproductontop,suchasmsgpack,Google'sprotocolbuffers,andothers.

Ithandlesnetworkerrorsintelligently,byretryingautomaticallyincaseswhereitmakessense.

Itreducesyourcarbonfootprint.DoingmorewithlessCPUmeansyourboxesuselesspower,andyoucankeepyourold
boxesinuseforlonger.AlGorewouldloveZeroMQ.

ActuallyZeroMQdoesrathermorethanthis.Ithasasubversiveeffectonhowyoudevelopnetworkcapableapplications.
Superficially,it'sasocketinspiredAPIonwhichyoudozmq_recv()andzmq_send().Butmessageprocessingrapidly
becomesthecentralloop,andyourapplicationsoonbreaksdownintoasetofmessageprocessingtasks.Itiselegantand
natural.Anditscales:eachofthesetasksmapstoanode,andthenodestalktoeachotheracrossarbitrarytransports.Two
nodesinoneprocess(nodeisathread),twonodesononebox(nodeisaprocess),ortwonodesononenetwork(nodeisabox)
it'sallthesame,withnoapplicationcodechanges.

SocketScalability topprevnext

Let'sseeZeroMQ'sscalabilityinaction.Hereisashellscriptthatstartstheweatherserverandthenabunchofclientsinparallel:

wuserver&
wuclient12345&

http://zguide.zeromq.org/page:all 16/225
12/31/2015 MQ - The Guide - MQ - The Guide
wuclient23456&
wuclient34567&
wuclient45678&
wuclient56789&

Astheclientsrun,wetakealookattheactiveprocessesusingthetopcommand',andweseesomethinglike(ona4corebox):

PIDUSERPRNIVIRTRESSHRS%CPU%MEMTIME+COMMAND
7136ph2001040m959m1156R15712.016:25.47wuserver
7966ph2009860818041372S330.00:03.94wuclient
7963ph2003311617481372S140.00:00.76wuclient
7965ph2003311617841372S60.00:00.47wuclient
7964ph2003311617881372S50.00:00.25wuclient
7967ph2003307217401372S50.00:00.35wuclient

Let'sthinkforasecondaboutwhatishappeninghere.Theweatherserverhasasinglesocket,andyetherewehaveitsending
datatofiveclientsinparallel.Wecouldhavethousandsofconcurrentclients.Theserverapplicationdoesn'tseethem,doesn't
talktothemdirectly.SotheZeroMQsocketisactinglikealittleserver,silentlyacceptingclientrequestsandshovingdataoutto
themasfastasthenetworkcanhandleit.Andit'samultithreadedserver,squeezingmorejuiceoutofyourCPU.

UpgradingfromZeroMQv2.2toZeroMQv3.2 topprevnext

CompatibleChanges topprevnext

Thesechangesdon'timpactexistingapplicationcodedirectly:

Pubsubfilteringisnowdoneatthepublishersideinsteadofsubscriberside.Thisimprovesperformancesignificantlyin
manypubsubusecases.Youcanmixv3.2andv2.1/v2.2publishersandsubscriberssafely.

ZeroMQv3.2hasmanynewAPImethods(zmq_disconnect(),zmq_unbind(),zmq_monitor(),zmq_ctx_set(),
etc.)

IncompatibleChanges topprevnext

Thesearethemainareasofimpactonapplicationsandlanguagebindings:

Changedsend/recvmethods:zmq_send()andzmq_recv()haveadifferent,simplerinterface,andtheoldfunctionality
isnowprovidedbyzmq_msg_send()andzmq_msg_recv().Symptom:compileerrors.Solution:fixupyourcode.

Thesetwomethodsreturnpositivevaluesonsuccess,and1onerror.Inv2.xtheyalwaysreturnedzeroonsuccess.
Symptom:apparenterrorswhenthingsactuallyworkfine.Solution:teststrictlyforreturncode=1,notnonzero.

zmq_poll()nowwaitsformilliseconds,notmicroseconds.Symptom:applicationstopsresponding(infactresponds
1000timesslower).Solution:usetheZMQ_POLL_MSECmacrodefinedbelow,inallzmq_pollcalls.

ZMQ_NOBLOCKisnowcalledZMQ_DONTWAIT.Symptom:compilefailuresontheZMQ_NOBLOCKmacro.

TheZMQ_HWMsocketoptionisnowbrokenintoZMQ_SNDHWMandZMQ_RCVHWM.Symptom:compilefailuresonthe
ZMQ_HWMmacro.

Mostbutnotallzmq_getsockopt()optionsarenowintegervalues.Symptom:runtimeerrorreturnson

http://zguide.zeromq.org/page:all 17/225
12/31/2015 MQ - The Guide - MQ - The Guide
zmq_setsockoptandzmq_getsockopt.

TheZMQ_SWAPoptionhasbeenremoved.Symptom:compilefailuresonZMQ_SWAP.Solution:redesignanycodethatuses
thisfunctionality.

SuggestedShimMacros topprevnext

Forapplicationsthatwanttorunonbothv2.xandv3.2,suchaslanguagebindings,ouradviceistoemulatec3.2asfaras
possible.HereareCmacrodefinitionsthathelpyourC/C++codetoworkacrossbothversions(takenfromCZMQ):

#ifndefZMQ_DONTWAIT
#defineZMQ_DONTWAITZMQ_NOBLOCK
#endif
#ifZMQ_VERSION_MAJOR==2
#definezmq_msg_send(msg,sock,opt)zmq_send(sock,msg,opt)
#definezmq_msg_recv(msg,sock,opt)zmq_recv(sock,msg,opt)
#definezmq_ctx_destroy(context)zmq_term(context)
#defineZMQ_POLL_MSEC1000//zmq_pollisusec
#defineZMQ_SNDHWMZMQ_HWM
#defineZMQ_RCVHWMZMQ_HWM
#elifZMQ_VERSION_MAJOR==3
#defineZMQ_POLL_MSEC1//zmq_pollismsec
#endif

Warning:UnstableParadigms! topprevnext

Traditionalnetworkprogrammingisbuiltonthegeneralassumptionthatonesockettalkstooneconnection,onepeer.Thereare
multicastprotocols,buttheseareexotic.Whenweassume"onesocket=oneconnection",wescaleourarchitecturesincertain
ways.Wecreatethreadsoflogicwhereeachthreadworkwithonesocket,onepeer.Weplaceintelligenceandstateinthese
threads.

IntheZeroMQuniverse,socketsaredoorwaystofastlittlebackgroundcommunicationsenginesthatmanageawholesetof
connectionsautomagicallyforyou.Youcan'tsee,workwith,open,close,orattachstatetotheseconnections.Whetheryouuse
blockingsendorreceive,orpoll,allyoucantalktoisthesocket,nottheconnectionsitmanagesforyou.Theconnectionsare
privateandinvisible,andthisisthekeytoZeroMQ'sscalability.

Thisisbecauseyourcode,talkingtoasocket,canthenhandleanynumberofconnectionsacrosswhatevernetworkprotocols
arearound,withoutchange.AmessagingpatternsittinginZeroMQscalesmorecheaplythanamessagingpatternsittinginyour
applicationcode.

Sothegeneralassumptionnolongerapplies.Asyoureadthecodeexamples,yourbrainwilltrytomapthemtowhatyouknow.
Youwillread"socket"andthink"ah,thatrepresentsaconnectiontoanothernode".Thatiswrong.Youwillread"thread"and
yourbrainwillagainthink,"ah,athreadrepresentsaconnectiontoanothernode",andagainyourbrainwillbewrong.

Ifyou'rereadingthisGuideforthefirsttime,realizethatuntilyouactuallywriteZeroMQcodeforadayortwo(andmaybethree
orfourdays),youmayfeelconfused,especiallybyhowsimpleZeroMQmakesthingsforyou,andyoumaytrytoimposethat
generalassumptiononZeroMQ,anditwon'twork.Andthenyouwillexperienceyourmomentofenlightenmentandtrust,that
zappowkaboomsatoriparadigmshiftmomentwhenitallbecomesclear.

Chapter2SocketsandPatterns topprevnext

http://zguide.zeromq.org/page:all 18/225
12/31/2015 MQ - The Guide - MQ - The Guide
InChapter1BasicswetookZeroMQforadrive,withsomebasicexamplesofthemainZeroMQpatterns:requestreply,pub
sub,andpipeline.Inthischapter,we'regoingtogetourhandsdirtyandstarttolearnhowtousethesetoolsinrealprograms.

We'llcover:

HowtocreateandworkwithZeroMQsockets.
Howtosendandreceivemessagesonsockets.
HowtobuildyourappsaroundZeroMQ'sasynchronousI/Omodel.
Howtohandlemultiplesocketsinonethread.
Howtohandlefatalandnonfatalerrorsproperly.
HowtohandleinterruptsignalslikeCtrlC.
HowtoshutdownaZeroMQapplicationcleanly.
HowtocheckaZeroMQapplicationformemoryleaks.
Howtosendandreceivemultipartmessages.
Howtoforwardmessagesacrossnetworks.
Howtobuildasimplemessagequeuingbroker.
HowtowritemultithreadedapplicationswithZeroMQ.
HowtouseZeroMQtosignalbetweenthreads.
HowtouseZeroMQtocoordinateanetworkofnodes.
Howtocreateandusemessageenvelopesforpubsub.
UsingtheHWM(highwatermark)toprotectagainstmemoryoverflows.

TheSocketAPI topprevnext

Tobeperfectlyhonest,ZeroMQdoesakindofswitchandbaitonyou,forwhichwedon'tapologize.It'sforyourowngoodandit
hurtsusmorethanithurtsyou.ZeroMQpresentsafamiliarsocketbasedAPI,whichrequiresgreateffortforustohideabunch
ofmessageprocessingengines.However,theresultwillslowlyfixyourworldviewabouthowtodesignandwritedistributed
software.

SocketsarethedefactostandardAPIfornetworkprogramming,aswellasbeingusefulforstoppingyoureyesfromfallingonto
yourcheeks.OnethingthatmakesZeroMQespeciallytastytodevelopersisthatitusessocketsandmessagesinsteadofsome
otherarbitrarysetofconcepts.KudostoMartinSustrikforpullingthisoff.Itturns"MessageOrientedMiddleware",aphrase
guaranteedtosendthewholeroomofftoCatatonia,into"ExtraSpicySockets!",whichleavesuswithastrangecravingforpizza
andadesiretoknowmore.

Likeafavoritedish,ZeroMQsocketsareeasytodigest.Socketshavealifeinfourparts,justlikeBSDsockets:

Creatinganddestroyingsockets,whichgotogethertoformakarmiccircleofsocketlife(seezmq_socket(),
zmq_close()).

Configuringsocketsbysettingoptionsonthemandcheckingthemifnecessary(seezmq_setsockopt(),
zmq_getsockopt()).

PluggingsocketsintothenetworktopologybycreatingZeroMQconnectionstoandfromthem(seezmq_bind(),
zmq_connect()).

Usingthesocketstocarrydatabywritingandreceivingmessagesonthem(seezmq_msg_send(),zmq_msg_recv()).

Notethatsocketsarealwaysvoidpointers,andmessages(whichwe'llcometoverysoon)arestructures.SoinCyoupass
socketsassuch,butyoupassaddressesofmessagesinallfunctionsthatworkwithmessages,likezmq_msg_send()and
zmq_msg_recv().Asamnemonic,realizethat"inZeroMQ,allyoursocketsarebelongtous",butmessagesarethingsyou
actuallyowninyourcode.

Creating,destroying,andconfiguringsocketsworksasyou'dexpectforanyobject.ButrememberthatZeroMQisan
asynchronous,elasticfabric.Thishassomeimpactonhowweplugsocketsintothenetworktopologyandhowweusethe
socketsafterthat.

PluggingSocketsintotheTopology topprevnext

http://zguide.zeromq.org/page:all 19/225
12/31/2015 MQ - The Guide - MQ - The Guide
Tocreateaconnectionbetweentwonodes,youusezmq_bind()inonenodeandzmq_connect()intheother.Asageneral
ruleofthumb,thenodethatdoeszmq_bind()isa"server",sittingonawellknownnetworkaddress,andthenodewhichdoes
zmq_connect()isa"client",withunknownorarbitrarynetworkaddresses.Thuswesaythatwe"bindasockettoanendpoint"
and"connectasockettoanendpoint",theendpointbeingthatwellknownnetworkaddress.

ZeroMQconnectionsaresomewhatdifferentfromclassicTCPconnections.Themainnotabledifferencesare:

Theygoacrossanarbitrarytransport(inproc,ipc,tcp,pgm,orepgm).Seezmq_inproc(),zmq_ipc(),zmq_tcp(),
zmq_pgm(),andzmq_epgm().

Onesocketmayhavemanyoutgoingandmanyincomingconnections.

Thereisnozmq_accept()method.Whenasocketisboundtoanendpointitautomaticallystartsacceptingconnections.

Thenetworkconnectionitselfhappensinthebackground,andZeroMQwillautomaticallyreconnectifthenetwork
connectionisbroken(e.g.,ifthepeerdisappearsandthencomesback).

Yourapplicationcodecannotworkwiththeseconnectionsdirectlytheyareencapsulatedunderthesocket.

Manyarchitecturesfollowsomekindofclient/servermodel,wheretheserveristhecomponentthatismoststatic,andtheclients
arethecomponentsthataremostdynamic,i.e.,theycomeandgothemost.Therearesometimesissuesofaddressing:servers
willbevisibletoclients,butnotnecessarilyviceversa.Somostlyit'sobviouswhichnodeshouldbedoingzmq_bind()(the
server)andwhichshouldbedoingzmq_connect()(theclient).Italsodependsonthekindofsocketsyou'reusing,withsome
exceptionsforunusualnetworkarchitectures.We'lllookatsockettypeslater.

Now,imaginewestarttheclientbeforewestarttheserver.Intraditionalnetworking,wegetabigredFailflag.ButZeroMQlets
usstartandstoppiecesarbitrarily.Assoonastheclientnodedoeszmq_connect(),theconnectionexistsandthatnodecan
starttowritemessagestothesocket.Atsomestage(hopefullybeforemessagesqueueupsomuchthattheystarttoget
discarded,ortheclientblocks),theservercomesalive,doesazmq_bind(),andZeroMQstartstodelivermessages.

Aservernodecanbindtomanyendpoints(thatis,acombinationofprotocolandaddress)anditcandothisusingasingle
socket.Thismeansitwillacceptconnectionsacrossdifferenttransports:

zmq_bind(socket,"tcp://*:5555")
zmq_bind(socket,"tcp://*:9999")
zmq_bind(socket,"inproc://somename")

Withmosttransports,youcannotbindtothesameendpointtwice,unlikeforexampleinUDP.Theipctransportdoes,however,
letoneprocessbindtoanendpointalreadyusedbyafirstprocess.It'smeanttoallowaprocesstorecoverafteracrash.

AlthoughZeroMQtriestobeneutralaboutwhichsidebindsandwhichsideconnects,therearedifferences.We'llseethesein
moredetaillater.Theupshotisthatyoushouldusuallythinkintermsof"servers"asstaticpartsofyourtopologythatbindto
moreorlessfixedendpoints,and"clients"asdynamicpartsthatcomeandgoandconnecttotheseendpoints.Then,designyour
applicationaroundthismodel.Thechancesthatitwill"justwork"aremuchbetterlikethat.

Socketshavetypes.Thesockettypedefinesthesemanticsofthesocket,itspoliciesforroutingmessagesinwardsandoutwards,
queuing,etc.Youcanconnectcertaintypesofsockettogether,e.g.,apublishersocketandasubscribersocket.Socketswork
togetherin"messagingpatterns".We'lllookatthisinmoredetaillater.

It'stheabilitytoconnectsocketsinthesedifferentwaysthatgivesZeroMQitsbasicpowerasamessagequeuingsystem.There
arelayersontopofthis,suchasproxies,whichwe'llgettolater.Butessentially,withZeroMQyoudefineyournetwork
architecturebypluggingpiecestogetherlikeachild'sconstructiontoy.

SendingandReceivingMessages topprevnext

Tosendandreceivemessagesyouusethezmq_msg_send()andzmq_msg_recv()methods.Thenamesareconventional,
butZeroMQ'sI/OmodelisdifferentenoughfromtheclassicTCPmodelthatyouwillneedtimetogetyourheadaroundit.

Figure9TCPsocketsare1to1

http://zguide.zeromq.org/page:all 20/225
12/31/2015 MQ - The Guide - MQ - The Guide

Let'slookatthemaindifferencesbetweenTCPsocketsandZeroMQsocketswhenitcomestoworkingwithdata:

ZeroMQsocketscarrymessages,likeUDP,ratherthanastreamofbytesasTCPdoes.AZeroMQmessageislength
specifiedbinarydata.We'llcometomessagesshortlytheirdesignisoptimizedforperformanceandsoalittletricky.

ZeroMQsocketsdotheirI/Oinabackgroundthread.Thismeansthatmessagesarriveinlocalinputqueuesandaresent
fromlocaloutputqueues,nomatterwhatyourapplicationisbusydoing.

ZeroMQsocketshaveonetoNroutingbehaviorbuiltin,accordingtothesockettype.

Thezmq_send()methoddoesnotactuallysendthemessagetothesocketconnection(s).ItqueuesthemessagesothattheI/O
threadcansenditasynchronously.Itdoesnotblockexceptinsomeexceptioncases.Sothemessageisnotnecessarilysent
whenzmq_send()returnstoyourapplication.

UnicastTransports topprevnext

ZeroMQprovidesasetofunicasttransports(inproc,ipc,andtcp)andmulticasttransports(epgm,pgm).Multicastisan
advancedtechniquethatwe'llcometolater.Don'tevenstartusingitunlessyouknowthatyourfanoutratioswillmake1toN
unicastimpossible.

Formostcommoncases,usetcp,whichisadisconnectedTCPtransport.Itiselastic,portable,andfastenoughformostcases.
WecallthisdisconnectedbecauseZeroMQ'stcptransportdoesn'trequirethattheendpointexistsbeforeyouconnecttoit.
Clientsandserverscanconnectandbindatanytime,cangoandcomeback,anditremainstransparenttoapplications.

Theinterprocessipctransportisdisconnected,liketcp.Ithasonelimitation:itdoesnotyetworkonWindows.Byconvention
weuseendpointnameswithan".ipc"extensiontoavoidpotentialconflictwithotherfilenames.OnUNIXsystems,ifyouuseipc
endpointsyouneedtocreatethesewithappropriatepermissionsotherwisetheymaynotbeshareablebetweenprocesses
runningunderdifferentuserIDs.Youmustalsomakesureallprocessescanaccessthefiles,e.g.,byrunninginthesame
workingdirectory.

Theinterthreadtransport,inproc,isaconnectedsignalingtransport.Itismuchfasterthantcporipc.Thistransporthasa
specificlimitationcomparedtotcpandipc:theservermustissueabindbeforeanyclientissuesaconnect.Thisis
somethingfutureversionsofZeroMQmayfix,butatpresentthisdefineshowyouuseinprocsockets.Wecreateandbindone
socketandstartthechildthreads,whichcreateandconnecttheothersockets.

ZeroMQisNotaNeutralCarrier topprevnext

AcommonquestionthatnewcomerstoZeroMQask(it'soneI'veaskedmyself)is,"howdoIwriteanXYZserverinZeroMQ?"
Forexample,"howdoIwriteanHTTPserverinZeroMQ?"TheimplicationisthatifweusenormalsocketstocarryHTTP
http://zguide.zeromq.org/page:all 21/225
12/31/2015 MQ - The Guide - MQ - The Guide
requestsandresponses,weshouldbeabletouseZeroMQsocketstodothesame,onlymuchfasterandbetter.

Theanswerusedtobe"thisisnothowitworks".ZeroMQisnotaneutralcarrier:itimposesaframingonthetransportprotocolsit
uses.Thisframingisnotcompatiblewithexistingprotocols,whichtendtousetheirownframing.Forexample,compareanHTTP
requestandaZeroMQrequest,bothoverTCP/IP.

Figure10HTTPontheWire

TheHTTPrequestusesCRLFasitssimplestframingdelimiter,whereasZeroMQusesalengthspecifiedframe.Soyoucould
writeanHTTPlikeprotocolusingZeroMQ,usingforexampletherequestreplysocketpattern.ButitwouldnotbeHTTP.

Figure11ZeroMQontheWire

Sincev3.3,however,ZeroMQhasasocketoptioncalledZMQ_ROUTER_RAWthatletsyoureadandwritedatawithouttheZeroMQ
framing.YoucouldusethistoreadandwriteproperHTTPrequestsandresponses.HardeepSinghcontributedthischangeso
thathecouldconnecttoTelnetserversfromhisZeroMQapplication.Attimeofwritingthisisstillsomewhatexperimental,butit
showshowZeroMQkeepsevolvingtosolvenewproblems.Maybethenextpatchwillbeyours.

I/OThreads topprevnext

WesaidthatZeroMQdoesI/Oinabackgroundthread.OneI/Othread(forallsockets)issufficientforallbutthemostextreme
applications.Whenyoucreateanewcontext,itstartswithoneI/Othread.ThegeneralruleofthumbistoallowoneI/Othread
pergigabyteofdatainoroutpersecond.ToraisethenumberofI/Othreads,usethezmq_ctx_set()callbeforecreatingany
sockets:

intio_threads=4
void*context=zmq_ctx_new()
zmq_ctx_set(context,ZMQ_IO_THREADS,io_threads)
assert(zmq_ctx_get(context,ZMQ_IO_THREADS)==io_threads)

We'veseenthatonesocketcanhandledozens,eventhousandsofconnectionsatonce.Thishasafundamentalimpactonhow
youwriteapplications.Atraditionalnetworkedapplicationhasoneprocessoronethreadperremoteconnection,andthat
processorthreadhandlesonesocket.ZeroMQletsyoucollapsethisentirestructureintoasingleprocessandthenbreakitupas
necessaryforscaling.

IfyouareusingZeroMQforinterthreadcommunicationsonly(i.e.,amultithreadedapplicationthatdoesnoexternalsocketI/O)
youcansettheI/Othreadstozero.It'snotasignificantoptimizationthough,moreofacuriosity.

MessagingPatterns topprevnext

UnderneaththebrownpaperwrappingofZeroMQ'ssocketAPIliestheworldofmessagingpatterns.Ifyouhaveabackgroundin
enterprisemessaging,orknowUDPwell,thesewillbevaguelyfamiliar.ButtomostZeroMQnewcomers,theyareasurprise.
We'resousedtotheTCPparadigmwhereasocketmapsonetoonetoanothernode.

http://zguide.zeromq.org/page:all 22/225
12/31/2015 MQ - The Guide - MQ - The Guide
Let'srecapbrieflywhatZeroMQdoesforyou.Itdeliversblobsofdata(messages)tonodes,quicklyandefficiently.Youcanmap
nodestothreads,processes,ornodes.ZeroMQgivesyourapplicationsasinglesocketAPItoworkwith,nomatterwhatthe
actualtransport(likeinprocess,interprocess,TCP,ormulticast).Itautomaticallyreconnectstopeersastheycomeandgo.It
queuesmessagesatbothsenderandreceiver,asneeded.Itlimitsthesequeuestoguardprocessesagainstrunningoutof
memory.Ithandlessocketerrors.ItdoesallI/Oinbackgroundthreads.Ituseslockfreetechniquesfortalkingbetweennodes,so
thereareneverlocks,waits,semaphores,ordeadlocks.

Butcuttingthroughthat,itroutesandqueuesmessagesaccordingtopreciserecipescalledpatterns.Itisthesepatternsthat
provideZeroMQ'sintelligence.Theyencapsulateourhardearnedexperienceofthebestwaystodistributedataandwork.
ZeroMQ'spatternsarehardcodedbutfutureversionsmayallowuserdefinablepatterns.

ZeroMQpatternsareimplementedbypairsofsocketswithmatchingtypes.Inotherwords,tounderstandZeroMQpatternsyou
needtounderstandsockettypesandhowtheyworktogether.Mostly,thisjusttakesstudythereislittlethatisobviousatthis
level.

ThebuiltincoreZeroMQpatternsare:

Requestreply,whichconnectsasetofclientstoasetofservices.Thisisaremoteprocedurecallandtaskdistribution
pattern.

Pubsub,whichconnectsasetofpublisherstoasetofsubscribers.Thisisadatadistributionpattern.

Pipeline,whichconnectsnodesinafanout/faninpatternthatcanhavemultiplestepsandloops.Thisisaparalleltask
distributionandcollectionpattern.

Exclusivepair,whichconnectstwosocketsexclusively.Thisisapatternforconnectingtwothreadsinaprocess,notto
beconfusedwith"normal"pairsofsockets.

WelookedatthefirstthreeoftheseinChapter1Basics,andwe'llseetheexclusivepairpatternlaterinthischapter.The
zmq_socket()manpageisfairlyclearaboutthepatternsit'sworthreadingseveraltimesuntilitstartstomakesense.These
arethesocketcombinationsthatarevalidforaconnectbindpair(eithersidecanbind):

PUBandSUB
REQandREP
REQandROUTER(takecare,REQinsertsanextranullframe)
DEALERandREP(takecare,REPassumesanullframe)
DEALERandROUTER
DEALERandDEALER
ROUTERandROUTER
PUSHandPULL
PAIRandPAIR

You'llalsoseereferencestoXPUBandXSUBsockets,whichwe'llcometolater(they'relikerawversionsofPUBandSUB).Any
othercombinationwillproduceundocumentedandunreliableresults,andfutureversionsofZeroMQwillprobablyreturnerrorsif
youtrythem.Youcanandwill,ofcourse,bridgeothersockettypesviacode,i.e.,readfromonesockettypeandwritetoanother.

HighLevelMessagingPatterns topprevnext

ThesefourcorepatternsarecookedintoZeroMQ.TheyarepartoftheZeroMQAPI,implementedinthecoreC++library,and
areguaranteedtobeavailableinallfineretailstores.

Ontopofthose,weaddhighlevelmessagingpatterns.WebuildthesehighlevelpatternsontopofZeroMQandimplementthem
inwhateverlanguagewe'reusingforourapplication.Theyarenotpartofthecorelibrary,donotcomewiththeZeroMQpackage,
andexistintheirownspaceaspartoftheZeroMQcommunity.ForexampletheMajordomopattern,whichweexploreinChapter
4ReliableRequestReplyPatterns,sitsintheGitHubMajordomoprojectintheZeroMQorganization.

Oneofthethingsweaimtoprovideyouwithinthisbookareasetofsuchhighlevelpatterns,bothsmall(howtohandle
messagessanely)andlarge(howtomakeareliablepubsubarchitecture).

WorkingwithMessages topprevnext

http://zguide.zeromq.org/page:all 23/225
12/31/2015 MQ - The Guide - MQ - The Guide

ThelibzmqcorelibraryhasinfacttwoAPIstosendandreceivemessages.Thezmq_send()andzmq_recv()methodsthat
we'vealreadyseenandusedaresimpleoneliners.Wewillusetheseoften,butzmq_recv()isbadatdealingwitharbitrary
messagesizes:ittruncatesmessagestowhateverbuffersizeyouprovide.Sothere'sasecondAPIthatworkswithzmq_msg_t
structures,witharicherbutmoredifficultAPI:

Initialiseamessage:zmq_msg_init(),zmq_msg_init_size(),zmq_msg_init_data().
Sendingandreceivingamessage:zmq_msg_send(),zmq_msg_recv().
Releaseamessage:zmq_msg_close().
Accessmessagecontent:zmq_msg_data(),zmq_msg_size(),zmq_msg_more().
Workwithmessageproperties:zmq_msg_get(),zmq_msg_set().
Messagemanipulation:zmq_msg_copy(),zmq_msg_move().

Onthewire,ZeroMQmessagesareblobsofanysizefromzeroupwardsthatfitinmemory.Youdoyourownserializationusing
protocolbuffers,msgpack,JSON,orwhateverelseyourapplicationsneedtospeak.It'swisetochooseadatarepresentationthat
isportable,butyoucanmakeyourowndecisionsabouttradeoffs.

Inmemory,ZeroMQmessagesarezmq_msg_tstructures(orclassesdependingonyourlanguage).Herearethebasicground
rulesforusingZeroMQmessagesinC:

Youcreateandpassaroundzmq_msg_tobjects,notblocksofdata.

Toreadamessage,youusezmq_msg_init()tocreateanemptymessage,andthenyoupassthatto
zmq_msg_recv().

Towriteamessagefromnewdata,youusezmq_msg_init_size()tocreateamessageandatthesametimeallocate
ablockofdataofsomesize.Youthenfillthatdatausingmemcpy,andpassthemessagetozmq_msg_send().

Torelease(notdestroy)amessage,youcallzmq_msg_close().Thisdropsareference,andeventuallyZeroMQwill
destroythemessage.

Toaccessthemessagecontent,youusezmq_msg_data().Toknowhowmuchdatathemessagecontains,use
zmq_msg_size().

Donotusezmq_msg_move(),zmq_msg_copy(),orzmq_msg_init_data()unlessyoureadthemanpagesandknow
preciselywhyyouneedthese.

Afteryoupassamessagetozmq_msg_send(),MQwillclearthemessage,i.e.,setthesizetozero.Youcannotsend
thesamemessagetwice,andyoucannotaccessthemessagedataaftersendingit.

Theserulesdon'tapplyifyouusezmq_send()andzmq_recv(),towhichyoupassbytearrays,notmessagestructures.

Ifyouwanttosendthesamemessagemorethanonce,andit'ssizable,createasecondmessage,initializeitusing
zmq_msg_init(),andthenusezmq_msg_copy()tocreateacopyofthefirstmessage.Thisdoesnotcopythedatabutcopies
areference.Youcanthensendthemessagetwice(ormore,ifyoucreatemorecopies)andthemessagewillonlybefinally
destroyedwhenthelastcopyissentorclosed.

ZeroMQalsosupportsmultipartmessages,whichletyousendorreceivealistofframesasasingleonthewiremessage.Thisis
widelyusedinrealapplicationsandwe'lllookatthatlaterinthischapterandinChapter3AdvancedRequestReplyPatterns.

Frames(alsocalled"messageparts"intheZeroMQreferencemanualpages)arethebasicwireformatforZeroMQmessages.A
frameisalengthspecifiedblockofdata.Thelengthcanbezeroupwards.Ifyou'vedoneanyTCPprogrammingyou'llappreciate
whyframesareausefulanswertothequestion"howmuchdataamIsupposedtoreadofthisnetworksocketnow?"

ThereisawirelevelprotocolcalledZMTPthatdefineshowZeroMQreadsandwritesframesonaTCPconnection.Ifyou're
interestedinhowthisworks,thespecisquiteshort.

Originally,aZeroMQmessagewasoneframe,likeUDP.Welaterextendedthiswithmultipartmessages,whicharequitesimply
seriesofframeswitha"more"bitsettoone,followedbyonewiththatbitsettozero.TheZeroMQAPIthenletsyouwrite
messageswitha"more"flagandwhenyoureadmessages,itletsyoucheckifthere's"more".

InthelowlevelZeroMQAPIandthereferencemanual,therefore,there'ssomefuzzinessaboutmessagesversusframes.So
here'sausefullexicon:

Amessagecanbeoneormoreparts.
Thesepartsarealsocalled"frames".
Eachpartisazmq_msg_tobject.
http://zguide.zeromq.org/page:all 24/225
12/31/2015 MQ - The Guide - MQ - The Guide
Yousendandreceiveeachpartseparately,inthelowlevelAPI.
HigherlevelAPIsprovidewrapperstosendentiremultipartmessages.

Someotherthingsthatareworthknowingaboutmessages:

Youmaysendzerolengthmessages,e.g.,forsendingasignalfromonethreadtoanother.

ZeroMQguaranteestodeliveralltheparts(oneormore)foramessage,ornoneofthem.

ZeroMQdoesnotsendthemessage(singleormultipart)rightaway,butatsomeindeterminatelatertime.Amultipart
messagemustthereforefitinmemory.

Amessage(singleormultipart)mustfitinmemory.Ifyouwanttosendfilesofarbitrarysizes,youshouldbreaktheminto
piecesandsendeachpieceasseparatesinglepartmessages.Usingmultipartdatawillnotreducememoryconsumption.

Youmustcallzmq_msg_close()whenfinishedwithareceivedmessage,inlanguagesthatdon'tautomaticallydestroy
objectswhenascopecloses.Youdon'tcallthismethodaftersendingamessage.

Andtoberepetitive,donotusezmq_msg_init_data()yet.Thisisazerocopymethodandisguaranteedtocreatetroublefor
you.TherearefarmoreimportantthingstolearnaboutZeroMQbeforeyoustarttoworryaboutshavingoffmicroseconds.

ThisrichAPIcanbetiresometoworkwith.Themethodsareoptimizedforperformance,notsimplicity.Ifyoustartusingthese
youwillalmostdefinitelygetthemwronguntilyou'vereadthemanpageswithsomecare.Sooneofthemainjobsofagood
languagebindingistowrapthisAPIupinclassesthatareeasiertouse.

HandlingMultipleSockets topprevnext

Inalltheexamplessofar,themainloopofmostexampleshasbeen:

1. Waitformessageonsocket.
2. Processmessage.
3. Repeat.

Whatifwewanttoreadfrommultipleendpointsatthesametime?Thesimplestwayistoconnectonesockettoalltheendpoints
andgetZeroMQtodothefaninforus.Thisislegaliftheremoteendpointsareinthesamepattern,butitwouldbewrongto
connectaPULLsockettoaPUBendpoint.

Toactuallyreadfrommultiplesocketsallatonce,usezmq_poll().Anevenbetterwaymightbetowrapzmq_poll()ina
frameworkthatturnsitintoaniceeventdrivenreactor,butit'ssignificantlymoreworkthanwewanttocoverhere.

Let'sstartwithadirtyhack,partlyforthefunofnotdoingitright,butmainlybecauseitletsmeshowyouhowtodononblocking
socketreads.Hereisasimpleexampleofreadingfromtwosocketsusingnonblockingreads.Thisratherconfusedprogramacts
bothasasubscribertoweatherupdates,andaworkerforparalleltasks:

msreader:MultiplesocketreaderinC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Java|Lua|ObjectiveC|Perl|PHP|Python|Ruby|Scala|Tcl|Ada|Basic|Haskell|Haxe|
Node.js|ooc|Q|Racket

Thecostofthisapproachissomeadditionallatencyonthefirstmessage(thesleepattheendoftheloop,whenthereareno
waitingmessagestoprocess).Thiswouldbeaprobleminapplicationswheresubmillisecondlatencywasvital.Also,youneedto
checkthedocumentationfornanosleep()orwhateverfunctionyouusetomakesureitdoesnotbusyloop.

Youcantreatthesocketsfairlybyreadingfirstfromone,thenthesecondratherthanprioritizingthemaswedidinthisexample.

Nowlet'sseethesamesenselesslittleapplicationdoneright,usingzmq_poll():

mspoller:MultiplesocketpollerinC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Java|Lua|Node.js|ObjectiveC|Perl|PHP|Python|Ruby|Scala|Tcl|Ada|Basic|
Haxe|ooc|Q|Racket

Theitemsstructurehasthesefourmembers:

http://zguide.zeromq.org/page:all 25/225
12/31/2015 MQ - The Guide - MQ - The Guide

typedefstruct{
void*socket//ZeroMQsockettopollon
intfd//OR,nativefilehandletopollon
shortevents//Eventstopollon
shortrevents//Eventsreturnedafterpoll
}zmq_pollitem_t

MultipartMessages topprevnext

ZeroMQletsuscomposeamessageoutofseveralframes,givingusa"multipartmessage".Realisticapplicationsusemultipart
messagesheavily,bothforwrappingmessageswithaddressinformationandforsimpleserialization.We'lllookatreply
envelopeslater.

Whatwe'lllearnnowissimplyhowtoblindlyandsafelyreadandwritemultipartmessagesinanyapplication(suchasaproxy)
thatneedstoforwardmessageswithoutinspectingthem.

Whenyouworkwithmultipartmessages,eachpartisazmq_msgitem.E.g.,ifyouaresendingamessagewithfiveparts,you
mustconstruct,send,anddestroyfivezmq_msgitems.Youcandothisinadvance(andstorethezmq_msgitemsinanarrayor
otherstructure),orasyousendthem,onebyone.

Hereishowwesendtheframesinamultipartmessage(wereceiveeachframeintoamessageobject):

zmq_msg_send(&message,socket,ZMQ_SNDMORE)

zmq_msg_send(&message,socket,ZMQ_SNDMORE)

zmq_msg_send(&message,socket,0)

Hereishowwereceiveandprocessallthepartsinamessage,beitsinglepartormultipart:

while(1){
zmq_msg_tmessage
zmq_msg_init(&message)
zmq_msg_recv(&message,socket,0)
//Processthemessageframe

zmq_msg_close(&message)
if(!zmq_msg_more(&message))
break//Lastmessageframe
}

Somethingstoknowaboutmultipartmessages:

Whenyousendamultipartmessage,thefirstpart(andallfollowingparts)areonlyactuallysentonthewirewhenyou
sendthefinalpart.
Ifyouareusingzmq_poll(),whenyoureceivethefirstpartofamessage,alltheresthasalsoarrived.
Youwillreceiveallpartsofamessage,ornoneatall.
Eachpartofamessageisaseparatezmq_msgitem.
Youwillreceiveallpartsofamessagewhetherornotyoucheckthemoreproperty.
Onsending,ZeroMQqueuesmessageframesinmemoryuntilthelastisreceived,thensendsthemall.
Thereisnowaytocancelapartiallysentmessage,exceptbyclosingthesocket.

IntermediariesandProxies topprevnext

http://zguide.zeromq.org/page:all 26/225
12/31/2015 MQ - The Guide - MQ - The Guide

ZeroMQaimsfordecentralizedintelligence,butthatdoesn'tmeanyournetworkisemptyspaceinthemiddle.It'sfilledwith
messageawareinfrastructureandquiteoften,webuildthatinfrastructurewithZeroMQ.TheZeroMQplumbingcanrangefrom
tinypipestofullblownserviceorientedbrokers.Themessagingindustrycallsthisintermediation,meaningthatthestuffinthe
middledealswitheitherside.InZeroMQ,wecalltheseproxies,queues,forwarders,device,orbrokers,dependingonthe
context.

Thispatternisextremelycommonintherealworldandiswhyoursocietiesandeconomiesarefilledwithintermediarieswho
havenootherrealfunctionthantoreducethecomplexityandscalingcostsoflargernetworks.Realworldintermediariesare
typicallycalledwholesalers,distributors,managers,andsoon.

TheDynamicDiscoveryProblem topprevnext

Oneoftheproblemsyouwillhitasyoudesignlargerdistributedarchitecturesisdiscovery.Thatis,howdopiecesknowabout
eachother?It'sespeciallydifficultifpiecescomeandgo,sowecallthisthe"dynamicdiscoveryproblem".

Thereareseveralsolutionstodynamicdiscovery.Thesimplestistoentirelyavoiditbyhardcoding(orconfiguring)thenetwork
architecturesodiscoveryisdonebyhand.Thatis,whenyouaddanewpiece,youreconfigurethenetworktoknowaboutit.

Figure12SmallScalePubSubNetwork

Inpractice,thisleadstoincreasinglyfragileandunwieldyarchitectures.Let'ssayyouhaveonepublisherandahundred
subscribers.Youconnecteachsubscribertothepublisherbyconfiguringapublisherendpointineachsubscriber.That'seasy.
Subscribersaredynamicthepublisherisstatic.Nowsayyouaddmorepublishers.Suddenly,it'snotsoeasyanymore.Ifyou
continuetoconnecteachsubscribertoeachpublisher,thecostofavoidingdynamicdiscoverygetshigherandhigher.

Figure13PubSubNetworkwithaProxy

http://zguide.zeromq.org/page:all 27/225
12/31/2015 MQ - The Guide - MQ - The Guide

Therearequiteafewanswerstothis,buttheverysimplestansweristoaddanintermediarythatis,astaticpointinthenetwork
towhichallothernodesconnect.Inclassicmessaging,thisisthejobofthemessagebroker.ZeroMQdoesn'tcomewitha
messagebrokerassuch,butitletsusbuildintermediariesquiteeasily.

Youmightwonder,ifallnetworkseventuallygetlargeenoughtoneedintermediaries,whydon'twesimplyhaveamessage
brokerinplaceforallapplications?Forbeginners,it'safaircompromise.Justalwaysuseastartopology,forgetabout
performance,andthingswillusuallywork.However,messagebrokersaregreedythingsintheirroleascentralintermediaries,
theybecometoocomplex,toostateful,andeventuallyaproblem.

It'sbettertothinkofintermediariesassimplestatelessmessageswitches.AgoodanalogyisanHTTPproxyit'sthere,but
doesn'thaveanyspecialrole.Addingapubsubproxysolvesthedynamicdiscoveryprobleminourexample.Wesettheproxyin
the"middle"ofthenetwork.TheproxyopensanXSUBsocket,anXPUBsocket,andbindseachtowellknownIPaddressesand
ports.Then,allotherprocessesconnecttotheproxy,insteadoftoeachother.Itbecomestrivialtoaddmoresubscribersor
publishers.

Figure14ExtendedPubSub

http://zguide.zeromq.org/page:all 28/225
12/31/2015 MQ - The Guide - MQ - The Guide

WeneedXPUBandXSUBsocketsbecauseZeroMQdoessubscriptionforwardingfromsubscriberstopublishers.XSUBand
XPUBareexactlylikeSUBandPUBexcepttheyexposesubscriptionsasspecialmessages.Theproxyhastoforwardthese
subscriptionmessagesfromsubscribersidetopublisherside,byreadingthemfromtheXSUBsocketandwritingthemtothe
XPUBsocket.ThisisthemainusecaseforXSUBandXPUB.

SharedQueue(DEALERandROUTERsockets) topprevnext

IntheHelloWorldclient/serverapplication,wehaveoneclientthattalkstooneservice.However,inrealcasesweusuallyneed
toallowmultipleservicesaswellasmultipleclients.Thisletsusscaleupthepoweroftheservice(manythreadsorprocessesor
nodesratherthanjustone).Theonlyconstraintisthatservicesmustbestateless,allstatebeingintherequestorinsomeshared
storagesuchasadatabase.

Figure15RequestDistribution

http://zguide.zeromq.org/page:all 29/225
12/31/2015 MQ - The Guide - MQ - The Guide
Therearetwowaystoconnectmultipleclientstomultipleservers.Thebruteforcewayistoconnecteachclientsockettomultiple
serviceendpoints.Oneclientsocketcanconnecttomultipleservicesockets,andtheREQsocketwillthendistributerequests
amongtheseservices.Let'ssayyouconnectaclientsockettothreeserviceendpointsA,B,andC.Theclientmakesrequests
R1,R2,R3,R4.R1andR4gotoserviceA,R2goestoB,andR3goestoserviceC.

Thisdesignletsyouaddmoreclientscheaply.Youcanalsoaddmoreservices.Eachclientwilldistributeitsrequeststothe
services.Buteachclienthastoknowtheservicetopology.Ifyouhave100clientsandthenyoudecidetoaddthreemore
services,youneedtoreconfigureandrestart100clientsinorderfortheclientstoknowaboutthethreenewservices.

That'sclearlynotthekindofthingwewanttobedoingat3a.m.whenoursupercomputingclusterhasrunoutofresourcesand
wedesperatelyneedtoaddacoupleofhundredofnewservicenodes.Toomanystaticpiecesarelikeliquidconcrete:
knowledgeisdistributedandthemorestaticpiecesyouhave,themoreeffortitistochangethetopology.Whatwewantis
somethingsittinginbetweenclientsandservicesthatcentralizesallknowledgeofthetopology.Ideally,weshouldbeabletoadd
andremoveservicesorclientsatanytimewithouttouchinganyotherpartofthetopology.

Sowe'llwritealittlemessagequeuingbrokerthatgivesusthisflexibility.Thebrokerbindstotwoendpoints,afrontendforclients
andabackendforservices.Itthenuseszmq_poll()tomonitorthesetwosocketsforactivityandwhenithassome,itshuttles
messagesbetweenitstwosockets.Itdoesn'tactuallymanageanyqueuesexplicitlyZeroMQdoesthatautomaticallyoneach
socket.

WhenyouuseREQtotalktoREP,yougetastrictlysynchronousrequestreplydialog.Theclientsendsarequest.Theservice
readstherequestandsendsareply.Theclientthenreadsthereply.Ifeithertheclientortheservicetrytodoanythingelse(e.g.,
sendingtworequestsinarowwithoutwaitingforaresponse),theywillgetanerror.

Butourbrokerhastobenonblocking.Obviously,wecanusezmq_poll()towaitforactivityoneithersocket,butwecan'tuse
REPandREQ.

Figure16ExtendedRequestReply

Luckily,therearetwosocketscalledDEALERandROUTERthatletyoudononblockingrequestresponse.You'llseeinChapter
3AdvancedRequestReplyPatternshowDEALERandROUTERsocketsletyoubuildallkindsofasynchronousrequestreply
flows.Fornow,we'rejustgoingtoseehowDEALERandROUTERletusextendREQREPacrossanintermediary,thatis,our
littlebroker.

Inthissimpleextendedrequestreplypattern,REQtalkstoROUTERandDEALERtalkstoREP.InbetweentheDEALERand
ROUTER,wehavetohavecode(likeourbroker)thatpullsmessagesofftheonesocketandshovesthemontotheother.

Therequestreplybrokerbindstotwoendpoints,oneforclientstoconnectto(thefrontendsocket)andoneforworkersto
connectto(thebackend).Totestthisbroker,youwillwanttochangeyourworkerssotheyconnecttothebackendsocket.Here
isaclientthatshowswhatImean:

rrclient:RequestreplyclientinC
http://zguide.zeromq.org/page:all 30/225
12/31/2015 MQ - The Guide - MQ - The Guide

C++|C#|Clojure|CL|Delphi|Erlang|F#|Go|Haskell|Haxe|Java|Lua|Node.js|Perl|PHP|Python|Racket|Ruby|Scala|Tcl|Ada|Basic|
Felix|ObjectiveC|ooc|Q

Hereistheworker:

rrworker:RequestreplyworkerinC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Go|Haskell|Haxe|Java|Lua|Node.js|Perl|PHP|Python|Racket|Ruby|Scala|Tcl|Ada|Basic|
Felix|ObjectiveC|ooc|Q

Andhereisthebroker,whichproperlyhandlesmultipartmessages:

rrbroker:RequestreplybrokerinC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Go|Haskell|Haxe|Java|Lua|Node.js|Perl|PHP|Python|Ruby|Scala|Tcl|Ada|Basic|Felix|
ObjectiveC|ooc|Q|Racket

Figure17RequestReplyBroker

Usingarequestreplybrokermakesyourclient/serverarchitectureseasiertoscalebecauseclientsdon'tseeworkers,and
workersdon'tseeclients.Theonlystaticnodeisthebrokerinthemiddle.

ZeroMQ'sBuiltInProxyFunction topprevnext

Itturnsoutthatthecoreloopintheprevioussection'srrbrokerisveryuseful,andreusable.Itletsusbuildpubsubforwarders
andsharedqueuesandotherlittleintermediarieswithverylittleeffort.ZeroMQwrapsthisupinasinglemethod,zmq_proxy():

zmq_proxy(frontend,backend,capture)

Thetwo(orthreesockets,ifwewanttocapturedata)mustbeproperlyconnected,bound,andconfigured.Whenwecallthe
http://zguide.zeromq.org/page:all 31/225
12/31/2015 MQ - The Guide - MQ - The Guide
zmq_proxymethod,it'sexactlylikestartingthemainloopofrrbroker.Let'srewritetherequestreplybrokertocall
zmq_proxy,andrebadgethisasanexpensivesounding"messagequeue"(peoplehavechargedhousesforcodethatdidless):

msgqueue:MessagequeuebrokerinC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Go|Haskell|Haxe|Java|Lua|Perl|PHP|Python|Q|Ruby|Tcl|Ada|Basic|Felix|Node.js|
ObjectiveC|ooc|Racket|Scala

Ifyou'relikemostZeroMQusers,atthisstageyourmindisstartingtothink,"WhatkindofevilstuffcanIdoifIplugrandom
sockettypesintotheproxy?"Theshortansweris:tryitandworkoutwhatishappening.Inpractice,youwouldusuallystickto
ROUTER/DEALER,XSUB/XPUB,orPULL/PUSH.

TransportBridging topprevnext

AfrequentrequestfromZeroMQusersis,"HowdoIconnectmyZeroMQnetworkwithtechnologyX?"whereXissomeother
networkingormessagingtechnology.

Figure18PubSubForwarderProxy

Thesimpleansweristobuildabridge.Abridgeisasmallapplicationthatspeaksoneprotocolatonesocket,andconverts
to/fromasecondprotocolatanothersocket.Aprotocolinterpreter,ifyoulike.AcommonbridgingprobleminZeroMQistobridge
twotransportsornetworks.

Asanexample,we'regoingtowritealittleproxythatsitsinbetweenapublisherandasetofsubscribers,bridgingtwonetworks.
Thefrontendsocket(SUB)facestheinternalnetworkwheretheweatherserverissitting,andthebackend(PUB)faces
subscribersontheexternalnetwork.Itsubscribestotheweatherserviceonthefrontendsocket,andrepublishesitsdataonthe
backendsocket.

http://zguide.zeromq.org/page:all 32/225
12/31/2015 MQ - The Guide - MQ - The Guide
wuproxy:WeatherupdateproxyinC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Go|Haskell|Haxe|Java|Lua|Node.js|Perl|PHP|Python|Ruby|Scala|Tcl|Ada|Basic|Felix|
ObjectiveC|ooc|Q|Racket

Itlooksverysimilartotheearlierproxyexample,butthekeypartisthatthefrontendandbackendsocketsareontwodifferent
networks.Wecanusethismodelforexampletoconnectamulticastnetwork(pgmtransport)toatcppublisher.

HandlingErrorsandETERM topprevnext

ZeroMQ'serrorhandlingphilosophyisamixoffailfastandresilience.Processes,webelieve,shouldbeasvulnerableas
possibletointernalerrors,andasrobustaspossibleagainstexternalattacksanderrors.Togiveananalogy,alivingcellwillself
destructifitdetectsasingleinternalerror,yetitwillresistattackfromtheoutsidebyallmeanspossible.

Assertions,whichpeppertheZeroMQcode,areabsolutelyvitaltorobustcodetheyjusthavetobeontherightsideofthe
cellularwall.Andthereshouldbesuchawall.Ifitisunclearwhetherafaultisinternalorexternal,thatisadesignflawtobefixed.
InC/C++,assertionsstoptheapplicationimmediatelywithanerror.Inotherlanguages,youmaygetexceptionsorhalts.

WhenZeroMQdetectsanexternalfaultitreturnsanerrortothecallingcode.Insomerarecases,itdropsmessagessilentlyif
thereisnoobviousstrategyforrecoveringfromtheerror.

InmostoftheCexampleswe'veseensofarthere'sbeennoerrorhandling.Realcodeshoulddoerrorhandlingonevery
singleZeroMQcall.Ifyou'reusingalanguagebindingotherthanC,thebindingmayhandleerrorsforyou.InC,youdoneedto
dothisyourself.Therearesomesimplerules,startingwithPOSIXconventions:

MethodsthatcreateobjectsreturnNULLiftheyfail.
Methodsthatprocessdatamayreturnthenumberofbytesprocessed,or1onanerrororfailure.
Othermethodsreturn0onsuccessand1onanerrororfailure.
Theerrorcodeisprovidedinerrnoorzmq_errno().
Adescriptiveerrortextforloggingisprovidedbyzmq_strerror().

Forexample:

void*context=zmq_ctx_new()
assert(context)
void*socket=zmq_socket(context,ZMQ_REP)
assert(socket)
intrc=zmq_bind(socket,"tcp://*:5555")
if(rc==1){
printf("E:bindfailed:%s\n",strerror(errno))
return1
}

Therearetwomainexceptionalconditionsthatyoushouldhandleasnonfatal:

WhenyourcodereceivesamessagewiththeZMQ_DONTWAIToptionandthereisnowaitingdata,ZeroMQwillreturn1
andseterrnotoEAGAIN.

Whenonethreadcallszmq_ctx_destroy(),andotherthreadsarestilldoingblockingwork,thezmq_ctx_destroy()
callclosesthecontextandallblockingcallsexitwith1,anderrnosettoETERM.

InC/C++,assertscanberemovedentirelyinoptimizedcode,sodon'tmakethemistakeofwrappingthewholeZeroMQcallinan
assert().Itlooksneatthentheoptimizerremovesalltheassertsandthecallsyouwanttomake,andyourapplicationbreaks
inimpressiveways.

Figure19ParallelPipelinewithKillSignaling

http://zguide.zeromq.org/page:all 33/225
12/31/2015 MQ - The Guide - MQ - The Guide

Let'sseehowtoshutdownaprocesscleanly.We'lltaketheparallelpipelineexamplefromtheprevioussection.Ifwe'vestarted
awholelotofworkersinthebackground,wenowwanttokillthemwhenthebatchisfinished.Let'sdothisbysendingakill
messagetotheworkers.Thebestplacetodothisisthesinkbecauseitreallyknowswhenthebatchisdone.

Howdoweconnectthesinktotheworkers?ThePUSH/PULLsocketsareonewayonly.Wecouldswitchtoanothersockettype,
orwecouldmixmultiplesocketflows.Let'strythelatter:usingapubsubmodeltosendkillmessagestotheworkers:

ThesinkcreatesaPUBsocketonanewendpoint.
Workersbindtheirinputsockettothisendpoint.
Whenthesinkdetectstheendofthebatch,itsendsakilltoitsPUBsocket.
Whenaworkerdetectsthiskillmessage,itexits.

Itdoesn'ttakemuchnewcodeinthesink:

void*controller=zmq_socket(context,ZMQ_PUB)
zmq_bind(controller,"tcp://*:5559")

//Sendkillsignaltoworkers
s_send(controller,"KILL")

Hereistheworkerprocess,whichmanagestwosockets(aPULLsocketgettingtasks,andaSUBsocketgettingcontrol
commands),usingthezmq_poll()techniquewesawearlier:

taskwork2:ParalleltaskworkerwithkillsignalinginC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|Perl|PHP|Python|Ruby|Scala|Tcl|Ada|Basic
|Felix|ooc|Q|Racket

http://zguide.zeromq.org/page:all 34/225
12/31/2015 MQ - The Guide - MQ - The Guide
Hereisthemodifiedsinkapplication.Whenit'sfinishedcollectingresults,itbroadcastsakillmessagetoallworkers:

tasksink2:ParalleltasksinkwithkillsignalinginC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|Perl|PHP|Python|Ruby|Scala|Tcl|Ada|Basic
|Felix|ooc|Q|Racket

HandlingInterruptSignals topprevnext

RealisticapplicationsneedtoshutdowncleanlywheninterruptedwithCtrlCoranothersignalsuchasSIGTERM.Bydefault,
thesesimplykilltheprocess,meaningmessageswon'tbeflushed,fileswon'tbeclosedcleanly,andsoon.

Hereishowwehandleasignalinvariouslanguages:

interrupt:HandlingCtrlCcleanlyinC

C++|C#|Delphi|Erlang|Go|Haskell|Haxe|Java|Lua|Node.js|Perl|PHP|Python|Ruby|Scala|Ada|Basic|Clojure|CL|F#|Felix|Objective
C|ooc|Q|Racket|Tcl

Theprogramprovidess_catch_signals(),whichtrapsCtrlC(SIGINT)andSIGTERM.Wheneitherofthesesignalsarrive,the
s_catch_signals()handlersetstheglobalvariables_interrupted.Thankstoyoursignalhandler,yourapplicationwillnot
dieautomatically.Instead,youhaveachancetocleanupandexitgracefully.Youhavetonowexplicitlycheckforaninterrupt
andhandleitproperly.Dothisbycallings_catch_signals()(copythisfrominterrupt.c)atthestartofyourmaincode.
Thissetsupthesignalhandling.TheinterruptwillaffectZeroMQcallsasfollows:

Ifyourcodeisblockinginablockingcall(sendingamessage,receivingamessage,orpolling),thenwhenasignalarrives,
thecallwillreturnwithEINTR.
Wrapperslikes_recv()returnNULLiftheyareinterrupted.

SocheckforanEINTRreturncode,aNULLreturn,and/ors_interrupted.

Hereisatypicalcodefragment:

s_catch_signals()
client=zmq_socket(...)
while(!s_interrupted){
char*message=s_recv(client)
if(!message)
break//CtrlCused
}
zmq_close(client)

Ifyoucalls_catch_signals()anddon'ttestforinterrupts,thenyourapplicationwillbecomeimmunetoCtrlCandSIGTERM,
whichmaybeuseful,butisusuallynot.

DetectingMemoryLeaks topprevnext

Anylongrunningapplicationhastomanagememorycorrectly,oreventuallyit'lluseupallavailablememoryandcrash.Ifyou
usealanguagethathandlesthisautomaticallyforyou,congratulations.IfyouprograminCorC++oranyotherlanguagewhere
you'reresponsibleformemorymanagement,here'sashorttutorialonusingvalgrind,whichamongotherthingswillreportonany
leaksyourprogramshave.

Toinstallvalgrind,e.g.,onUbuntuorDebian,issuethiscommand:

sudoaptgetinstallvalgrind

http://zguide.zeromq.org/page:all 35/225
12/31/2015 MQ - The Guide - MQ - The Guide
Bydefault,ZeroMQwillcausevalgrindtocomplainalot.Toremovethesewarnings,createafilecalledvg.suppthat
containsthis:

{
<socketcall_sendto>
Memcheck:Param
socketcall.sendto(msg)
fun:send
...
}
{
<socketcall_sendto>
Memcheck:Param
socketcall.send(msg)
fun:send
...
}

FixyourapplicationstoexitcleanlyafterCtrlC.Foranyapplicationthatexitsbyitself,that'snotneeded,butforlong
runningapplications,thisisessential,otherwisevalgrindwillcomplainaboutallcurrentlyallocatedmemory.

BuildyourapplicationwithDDEBUGifit'snotyourdefaultsetting.Thatensuresvalgrindcantellyouexactlywhere
memoryisbeingleaked.

Finally,runvalgrindthus:

valgrindtool=memcheckleakcheck=fullsuppressions=vg.suppsomeprog

Andafterfixinganyerrorsitreported,youshouldgetthepleasantmessage:

==30536==ERRORSUMMARY:0errorsfrom0contexts...

MultithreadingwithZeroMQ topprevnext

ZeroMQisperhapsthenicestwayevertowritemultithreaded(MT)applications.WhereasZeroMQsocketsrequiresome
readjustmentifyouareusedtotraditionalsockets,ZeroMQmultithreadingwilltakeeverythingyouknowaboutwritingMT
applications,throwitintoaheapinthegarden,pourgasolineoverit,andsetitalight.It'sararebookthatdeservesburning,but
mostbooksonconcurrentprogrammingdo.

TomakeutterlyperfectMTprograms(andImeanthatliterally),wedon'tneedmutexes,locks,oranyotherformofinter
threadcommunicationexceptmessagessentacrossZeroMQsockets.

By"perfectMTprograms",Imeancodethat'seasytowriteandunderstand,thatworkswiththesamedesignapproachinany
programminglanguage,andonanyoperatingsystem,andthatscalesacrossanynumberofCPUswithzerowaitstatesandno
pointofdiminishingreturns.

Ifyou'vespentyearslearningtrickstomakeyourMTcodeworkatall,letalonerapidly,withlocksandsemaphoresandcritical
sections,youwillbedisgustedwhenyourealizeitwasallfornothing.Ifthere'sonelessonwe'velearnedfrom30+yearsof
concurrentprogramming,itis:justdon'tsharestate.It'sliketwodrunkardstryingtoshareabeer.Itdoesn'tmatterifthey'regood
buddies.Soonerorlater,they'regoingtogetintoafight.Andthemoredrunkardsyouaddtothetable,themoretheyfighteach
otheroverthebeer.ThetragicmajorityofMTapplicationslooklikedrunkenbarfights.

ThelistofweirdproblemsthatyouneedtofightasyouwriteclassicsharedstateMTcodewouldbehilariousifitdidn'ttranslate
directlyintostressandrisk,ascodethatseemstoworksuddenlyfailsunderpressure.Alargefirmwithworldbeatingexperience
inbuggycodereleaseditslistof"11LikelyProblemsInYourMultithreadedCode",whichcoversforgottensynchronization,
incorrectgranularity,readandwritetearing,lockfreereordering,lockconvoys,twostepdance,andpriorityinversion.

http://zguide.zeromq.org/page:all 36/225
12/31/2015 MQ - The Guide - MQ - The Guide
Yeah,wecountedsevenproblems,noteleven.That'snotthepointthough.Thepointis,doyoureallywantthatcoderunningthe
powergridorstockmarkettostartgettingtwosteplockconvoysat3p.m.onabusyThursday?Whocareswhattheterms
actuallymean?Thisisnotwhatturnedusontoprogramming,fightingevermorecomplexsideeffectswithevermorecomplex
hacks.

Somewidelyusedmodels,despitebeingthebasisforentireindustries,arefundamentallybroken,andsharedstateconcurrency
isoneofthem.CodethatwantstoscalewithoutlimitdoesitliketheInternetdoes,bysendingmessagesandsharingnothing
exceptacommoncontemptforbrokenprogrammingmodels.

YoushouldfollowsomerulestowritehappymultithreadedcodewithZeroMQ:

Isolatedataprivatelywithinitsthreadandneversharedatainmultiplethreads.TheonlyexceptiontothisareZeroMQ
contexts,whicharethreadsafe.

Stayawayfromtheclassicconcurrencymechanismslikeasmutexes,criticalsections,semaphores,etc.Thesearean
antipatterninZeroMQapplications.

CreateoneZeroMQcontextatthestartofyourprocess,andpassthattoallthreadsthatyouwanttoconnectviainproc
sockets.

Useattachedthreadstocreatestructurewithinyourapplication,andconnectthesetotheirparentthreadsusingPAIR
socketsoverinproc.Thepatternis:bindparentsocket,thencreatechildthreadwhichconnectsitssocket.

Usedetachedthreadstosimulateindependenttasks,withtheirowncontexts.Connecttheseovertcp.Lateryoucan
movethesetostandaloneprocesseswithoutchangingthecodesignificantly.

AllinteractionbetweenthreadshappensasZeroMQmessages,whichyoucandefinemoreorlessformally.

Don'tshareZeroMQsocketsbetweenthreads.ZeroMQsocketsarenotthreadsafe.Technicallyit'spossibletomigratea
socketfromonethreadtoanotherbutitdemandsskill.Theonlyplacewhereit'sremotelysanetosharesocketsbetween
threadsareinlanguagebindingsthatneedtodomagiclikegarbagecollectiononsockets.

Ifyouneedtostartmorethanoneproxyinanapplication,forexample,youwillwanttoruneachintheirownthread.Itiseasyto
maketheerrorofcreatingtheproxyfrontendandbackendsocketsinonethread,andthenpassingthesocketstotheproxyin
anotherthread.Thismayappeartoworkatfirstbutwillfailrandomlyinrealuse.Remember:Donotuseorclosesocketsexcept
inthethreadthatcreatedthem.

Ifyoufollowtheserules,youcanquiteeasilybuildelegantmultithreadedapplications,andlatersplitoffthreadsintoseparate
processesasyouneedto.Applicationlogiccansitinthreads,processes,ornodes:whateveryourscaleneeds.

ZeroMQusesnativeOSthreadsratherthanvirtual"green"threads.Theadvantageisthatyoudon'tneedtolearnanynew
threadingAPI,andthatZeroMQthreadsmapcleanlytoyouroperatingsystem.YoucanusestandardtoolslikeIntel's
ThreadCheckertoseewhatyourapplicationisdoing.ThedisadvantagesarethatnativethreadingAPIsarenotalwaysportable,
andthatifyouhaveahugenumberofthreads(inthethousands),someoperatingsystemswillgetstressed.

Let'sseehowthisworksinpractice.We'llturnouroldHelloWorldserverintosomethingmorecapable.Theoriginalserverranin
asinglethread.Iftheworkperrequestislow,that'sfine:oneMQthreadcanrunatfullspeedonaCPUcore,withnowaits,
doinganawfullotofwork.Butrealisticservershavetodonontrivialworkperrequest.Asinglecoremaynotbeenoughwhen
10,000clientshittheserverallatonce.Soarealisticserverwillstartmultipleworkerthreads.Itthenacceptsrequestsasfastas
itcananddistributesthesetoitsworkerthreads.Theworkerthreadsgrindthroughtheworkandeventuallysendtheirreplies
back.

Youcan,ofcourse,doallthisusingaproxybrokerandexternalworkerprocesses,butoftenit'seasiertostartoneprocessthat
gobblesupsixteencoresthansixteenprocesses,eachgobblinguponecore.Further,runningworkersasthreadswillcutouta
networkhop,latency,andnetworktraffic.

TheMTversionoftheHelloWorldservicebasicallycollapsesthebrokerandworkersintoasingleprocess:

mtserver:MultithreadedserviceinC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Go|Haskell|Haxe|Java|Lua|Perl|PHP|Python|Q|Ruby|Scala|Ada|Basic|Felix|Node.js|
ObjectiveC|ooc|Racket|Tcl

Figure20MultithreadedServer

http://zguide.zeromq.org/page:all 37/225
12/31/2015 MQ - The Guide - MQ - The Guide

Allthecodeshouldberecognizabletoyoubynow.Howitworks:

Theserverstartsasetofworkerthreads.EachworkerthreadcreatesaREPsocketandthenprocessesrequestsonthis
socket.Workerthreadsarejustlikesinglethreadedservers.Theonlydifferencesarethetransport(inprocinsteadof
tcp),andthebindconnectdirection.

TheservercreatesaROUTERsockettotalktoclientsandbindsthistoitsexternalinterface(overtcp).

TheservercreatesaDEALERsockettotalktotheworkersandbindsthistoitsinternalinterface(overinproc).

Theserverstartsaproxythatconnectsthetwosockets.Theproxypullsincomingrequestsfairlyfromallclients,and
distributesthoseouttoworkers.Italsoroutesrepliesbacktotheirorigin.

Notethatcreatingthreadsisnotportableinmostprogramminglanguages.ThePOSIXlibraryispthreads,butonWindowsyou
havetouseadifferentAPI.Inourexample,thepthread_createcallstartsupanewthreadrunningtheworker_routine
functionwedefined.We'llseeinChapter3AdvancedRequestReplyPatternshowtowrapthisinaportableAPI.

Herethe"work"isjustaonesecondpause.Wecoulddoanythingintheworkers,includingtalkingtoothernodes.Thisiswhat
theMTserverlookslikeintermsofMQsocketsandnodes.NotehowtherequestreplychainisREQROUTERqueue
DEALERREP.

SignalingBetweenThreads(PAIRSockets) topprevnext

WhenyoustartmakingmultithreadedapplicationswithZeroMQ,you'llencounterthequestionofhowtocoordinateyourthreads.
Thoughyoumightbetemptedtoinsert"sleep"statements,orusemultithreadingtechniquessuchassemaphoresormutexes,
theonlymechanismthatyoushoulduseareZeroMQmessages.RememberthestoryofTheDrunkardsandTheBeer
http://zguide.zeromq.org/page:all 38/225
12/31/2015 MQ - The Guide - MQ - The Guide
Bottle.

Let'smakethreethreadsthatsignaleachotherwhentheyareready.Inthisexample,weusePAIRsocketsovertheinproc
transport:

mtrelay:MultithreadedrelayinC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Go|Haskell|Haxe|Java|Lua|Perl|PHP|Python|Q|Ruby|Scala|Ada|Basic|Felix|Node.js|
ObjectiveC|ooc|Racket|Tcl

Figure21TheRelayRace

ThisisaclassicpatternformultithreadingwithZeroMQ:

1. Twothreadscommunicateoverinproc,usingasharedcontext.
2. Theparentthreadcreatesonesocket,bindsittoaninproc://endpoint,andthenstartsthechildthread,passingthe
contexttoit.
3. Thechildthreadcreatesthesecondsocket,connectsittothatinproc://endpoint,andthensignalstotheparentthread
thatit'sready.

Notethatmultithreadingcodeusingthispatternisnotscalableouttoprocesses.Ifyouuseinprocandsocketpairs,youare
buildingatightlyboundapplication,i.e.,onewhereyourthreadsarestructurallyinterdependent.Dothiswhenlowlatencyis
reallyvital.Theotherdesignpatternisalooselyboundapplication,wherethreadshavetheirowncontextandcommunicateover
ipcortcp.Youcaneasilybreaklooselyboundthreadsintoseparateprocesses.

Thisisthefirsttimewe'veshownanexampleusingPAIRsockets.WhyusePAIR?Othersocketcombinationsmightseemto
work,buttheyallhavesideeffectsthatcouldinterferewithsignaling:

YoucanusePUSHforthesenderandPULLforthereceiver.Thislookssimpleandwillwork,butrememberthatPUSH
willdistributemessagestoallavailablereceivers.Ifyoubyaccidentstarttworeceivers(e.g.,youalreadyhaveone
runningandyoustartasecond),you'll"lose"halfofyoursignals.PAIRhastheadvantageofrefusingmorethanone
connectionthepairisexclusive.

YoucanuseDEALERforthesenderandROUTERforthereceiver.ROUTER,however,wrapsyourmessageinan
"envelope",meaningyourzerosizesignalturnsintoamultipartmessage.Ifyoudon'tcareaboutthedataandtreat
anythingasavalidsignal,andifyoudon'treadmorethanoncefromthesocket,thatwon'tmatter.If,however,youdecide
tosendrealdata,youwillsuddenlyfindROUTERprovidingyouwith"wrong"messages.DEALERalsodistributes
http://zguide.zeromq.org/page:all 39/225
12/31/2015 MQ - The Guide - MQ - The Guide
outgoingmessages,givingthesameriskasPUSH.

YoucanusePUBforthesenderandSUBforthereceiver.Thiswillcorrectlydeliveryourmessagesexactlyasyousent
themandPUBdoesnotdistributeasPUSHorDEALERdo.However,youneedtoconfigurethesubscriberwithanempty
subscription,whichisannoying.

Forthesereasons,PAIRmakesthebestchoiceforcoordinationbetweenpairsofthreads.

NodeCoordination topprevnext

Whenyouwanttocoordinateasetofnodesonanetwork,PAIRsocketswon'tworkwellanymore.Thisisoneofthefewareas
wherethestrategiesforthreadsandnodesaredifferent.Principally,nodescomeandgowhereasthreadsareusuallystatic.
PAIRsocketsdonotautomaticallyreconnectiftheremotenodegoesawayandcomesback.

Figure22PubSubSynchronization

Thesecondsignificantdifferencebetweenthreadsandnodesisthatyoutypicallyhaveafixednumberofthreadsbutamore
variablenumberofnodes.Let'stakeoneofourearlierscenarios(theweatherserverandclients)andusenodecoordinationto
ensurethatsubscribersdon'tlosedatawhenstartingup.

Thisishowtheapplicationwillwork:

Thepublisherknowsinadvancehowmanysubscribersitexpects.Thisisjustamagicnumberitgetsfromsomewhere.

Thepublisherstartsupandwaitsforallsubscriberstoconnect.Thisisthenodecoordinationpart.Eachsubscriber
subscribesandthentellsthepublisherit'sreadyviaanothersocket.

Whenthepublisherhasallsubscribersconnected,itstartstopublishdata.

Inthiscase,we'lluseaREQREPsocketflowtosynchronizesubscribersandpublisher.Hereisthepublisher:

syncpub:SynchronizedpublisherinC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Go|Haskell|Haxe|Java|Lua|Node.js|Perl|PHP|Python|Racket|Ruby|Scala|Tcl|Ada|Basic|
Felix|ObjectiveC|ooc|Q

Andhereisthesubscriber:

syncsub:SynchronizedsubscriberinC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Go|Haskell|Haxe|Java|Lua|Node.js|Perl|PHP|Python|Racket|Ruby|Scala|Tcl|Ada|Basic|
Felix|ObjectiveC|ooc|Q

ThisBashshellscriptwillstarttensubscribersandthenthepublisher:

http://zguide.zeromq.org/page:all 40/225
12/31/2015 MQ - The Guide - MQ - The Guide

echo"Startingsubscribers..."
for((a=0a<10a++))do
syncsub&
done
echo"Startingpublisher..."
syncpub

Whichgivesusthissatisfyingoutput:

Startingsubscribers...
Startingpublisher...
Received1000000updates
Received1000000updates
...
Received1000000updates
Received1000000updates

Wecan'tassumethattheSUBconnectwillbefinishedbythetimetheREQ/REPdialogiscomplete.Therearenoguarantees
thatoutboundconnectswillfinishinanyorderwhatsoever,ifyou'reusinganytransportexceptinproc.So,theexampledoesa
bruteforcesleepofonesecondbetweensubscribing,andsendingtheREQ/REPsynchronization.

Amorerobustmodelcouldbe:

PublisheropensPUBsocketandstartssending"Hello"messages(notdata).
SubscribersconnectSUBsocketandwhentheyreceiveaHellomessagetheytellthepublisherviaaREQ/REPsocket
pair.
Whenthepublisherhashadallthenecessaryconfirmations,itstartstosendrealdata.

ZeroCopy topprevnext

ZeroMQ'smessageAPIletsyousendandreceivemessagesdirectlyfromandtoapplicationbufferswithoutcopyingdata.We
callthiszerocopy,anditcanimproveperformanceinsomeapplications.

Youshouldthinkaboutusingzerocopyinthespecificcasewhereyouaresendinglargeblocksofmemory(thousandsofbytes),
atahighfrequency.Forshortmessages,orforlowermessagerates,usingzerocopywillmakeyourcodemessierandmore
complexwithnomeasurablebenefit.Likealloptimizations,usethiswhenyouknowithelps,andmeasurebeforeandafter.

Todozerocopy,youusezmq_msg_init_data()tocreateamessagethatreferstoablockofdataalreadyallocatedwith
malloc()orsomeotherallocator,andthenyoupassthattozmq_msg_send().Whenyoucreatethemessage,youalsopassa
functionthatZeroMQwillcalltofreetheblockofdata,whenithasfinishedsendingthemessage.Thisisthesimplestexample,
assumingbufferisablockof1,000bytesallocatedontheheap:

voidmy_free(void*data,void*hint){
free(data)
}
//Sendmessagefrombuffer,whichweallocateandZeroMQwillfreeforus
zmq_msg_tmessage
zmq_msg_init_data(&message,buffer,1000,my_free,NULL)
zmq_msg_send(&message,socket,0)

Notethatyoudon'tcallzmq_msg_close()aftersendingamessagelibzmqwilldothisautomaticallywhenit'sactuallydone
sendingthemessage.

Thereisnowaytodozerocopyonreceive:ZeroMQdeliversyouabufferthatyoucanstoreaslongasyouwish,butitwillnot
writedatadirectlyintoapplicationbuffers.

http://zguide.zeromq.org/page:all 41/225
12/31/2015 MQ - The Guide - MQ - The Guide
Onwriting,ZeroMQ'smultipartmessagesworknicelytogetherwithzerocopy.Intraditionalmessaging,youneedtomarshal
differentbufferstogetherintoonebufferthatyoucansend.Thatmeanscopyingdata.WithZeroMQ,youcansendmultiple
bufferscomingfromdifferentsourcesasindividualmessageframes.Sendeachfieldasalengthdelimitedframe.Tothe
application,itlookslikeaseriesofsendandreceivecalls.Butinternally,themultiplepartsgetwrittentothenetworkandread
backwithsinglesystemcalls,soit'sveryefficient.

PubSubMessageEnvelopes topprevnext

Inthepubsubpattern,wecansplitthekeyintoaseparatemessageframethatwecallanenvelope.Ifyouwanttousepubsub
envelopes,makethemyourself.It'soptional,andinpreviouspubsubexampleswedidn'tdothis.Usingapubsubenvelopeisa
littlemoreworkforsimplecases,butit'scleanerespeciallyforrealcases,wherethekeyandthedataarenaturallyseparate
things.

Figure23PubSubEnvelopewithSeparateKey

Recallthatsubscriptionsdoaprefixmatch.Thatis,theylookfor"allmessagesstartingwithXYZ".Theobviousquestionis:how
todelimitkeysfromdatasothattheprefixmatchdoesn'taccidentallymatchdata.Thebestansweristouseanenvelope
becausethematchwon'tcrossaframeboundary.Hereisaminimalistexampleofhowpubsubenvelopeslookincode.This
publishersendsmessagesoftwotypes,AandB.

Theenvelopeholdsthemessagetype:

psenvpub:PubSubenvelopepublisherinC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Go|Haskell|Haxe|Java|Lua|Node.js|Perl|PHP|Python|Ruby|Scala|Tcl|Ada|Basic|Felix|
ObjectiveC|ooc|Q|Racket

ThesubscriberwantsonlymessagesoftypeB:

psenvsub:PubSubenvelopesubscriberinC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Go|Haskell|Haxe|Java|Lua|Node.js|Perl|PHP|Python|Ruby|Scala|Tcl|Ada|Basic|Felix|
ObjectiveC|ooc|Q|Racket

Whenyourunthetwoprograms,thesubscribershouldshowyouthis:

[B]Wewouldliketoseethis
[B]Wewouldliketoseethis
[B]Wewouldliketoseethis
...

Thisexampleshowsthatthesubscriptionfilterrejectsoracceptstheentiremultipartmessage(keyplusdata).Youwon'tgetpart
ofamultipartmessage,ever.Ifyousubscribetomultiplepublishersandyouwanttoknowtheiraddresssothatyoucansend
themdataviaanothersocket(andthisisatypicalusecase),createathreepartmessage.

Figure24PubSubEnvelopewithSenderAddress

http://zguide.zeromq.org/page:all 42/225
12/31/2015 MQ - The Guide - MQ - The Guide

HighWaterMarks topprevnext

Whenyoucansendmessagesrapidlyfromprocesstoprocess,yousoondiscoverthatmemoryisapreciousresource,andone
thatcanbetriviallyfilledup.Afewsecondsofdelaysomewhereinaprocesscanturnintoabacklogthatblowsupaserver
unlessyouunderstandtheproblemandtakeprecautions.

Theproblemisthis:imagineyouhaveprocessAsendingmessagesathighfrequencytoprocessB,whichisprocessingthem.
SuddenlyBgetsverybusy(garbagecollection,CPUoverload,whatever),andcan'tprocessthemessagesforashortperiod.It
couldbeafewsecondsforsomeheavygarbagecollection,oritcouldbemuchlonger,ifthere'samoreseriousproblem.What
happenstothemessagesthatprocessAisstilltryingtosendfrantically?SomewillsitinB'snetworkbuffers.Somewillsitonthe
Ethernetwireitself.SomewillsitinA'snetworkbuffers.AndtherestwillaccumulateinA'smemory,asrapidlyastheapplication
behindAsendsthem.Ifyoudon'ttakesomeprecaution,Acaneasilyrunoutofmemoryandcrash.

Itisaconsistent,classicproblemwithmessagebrokers.Whatmakesithurtmoreisthatit'sB'sfault,superficially,andBis
typicallyauserwrittenapplicationwhichAhasnocontrolover.

Whataretheanswers?Oneistopasstheproblemupstream.Aisgettingthemessagesfromsomewhereelse.Sotellthat
process,"Stop!"Andsoon.Thisiscalledflowcontrol.Itsoundsplausible,butwhatifyou'resendingoutaTwitterfeed?Doyou
tellthewholeworldtostoptweetingwhileBgetsitsacttogether?

Flowcontrolworksinsomecases,butnotinothers.Thetransportlayercan'ttelltheapplicationlayerto"stop"anymorethana
subwaysystemcantellalargebusiness,"pleasekeepyourstaffatworkforanotherhalfanhour.I'mtoobusy".Theanswerfor
messagingistosetlimitsonthesizeofbuffers,andthenwhenwereachthoselimits,totakesomesensibleaction.Insome
cases(notforasubwaysystem,though),theansweristothrowawaymessages.Inothers,thebeststrategyistowait.

ZeroMQusestheconceptofHWM(highwatermark)todefinethecapacityofitsinternalpipes.Eachconnectionoutofasocket
orintoasockethasitsownpipe,andHWMforsending,and/orreceiving,dependingonthesockettype.Somesockets(PUB,
PUSH)onlyhavesendbuffers.Some(SUB,PULL,REQ,REP)onlyhavereceivebuffers.Some(DEALER,ROUTER,PAIR)
havebothsendandreceivebuffers.

InZeroMQv2.x,theHWMwasinfinitebydefault.Thiswaseasybutalsotypicallyfatalforhighvolumepublishers.InZeroMQ
v3.x,it'ssetto1,000bydefault,whichismoresensible.Ifyou'restillusingZeroMQv2.x,youshouldalwayssetaHWMonyour
sockets,beit1,000tomatchZeroMQv3.xoranotherfigurethattakesintoaccountyourmessagesizesandexpectedsubscriber
performance.

WhenyoursocketreachesitsHWM,itwilleitherblockordropdatadependingonthesockettype.PUBandROUTERsockets
willdropdataiftheyreachtheirHWM,whileothersockettypeswillblock.Overtheinproctransport,thesenderandreceiver
sharethesamebuffers,sotherealHWMisthesumoftheHWMsetbybothsides.

Lastly,theHWMsarenotexactwhileyoumaygetupto1,000messagesbydefault,therealbuffersizemaybemuchlower(as
littleashalf),duetothewaylibzmqimplementsitsqueues.

MissingMessageProblemSolver topprevnext

AsyoubuildapplicationswithZeroMQ,youwillcomeacrossthisproblemmorethanonce:losingmessagesthatyouexpectto
receive.Wehaveputtogetheradiagramthatwalksthroughthemostcommoncausesforthis.

Figure25MissingMessageProblemSolver

http://zguide.zeromq.org/page:all 43/225
12/31/2015 MQ - The Guide - MQ - The Guide

http://zguide.zeromq.org/page:all 44/225
12/31/2015 MQ - The Guide - MQ - The Guide
Here'sasummaryofwhatthegraphicsays:

OnSUBsockets,setasubscriptionusingzmq_setsockopt()withZMQ_SUBSCRIBE,oryouwon'tgetmessages.
Becauseyousubscribetomessagesbyprefix,ifyousubscribeto""(anemptysubscription),youwillgeteverything.

IfyoustarttheSUBsocket(i.e.,establishaconnectiontoaPUBsocket)afterthePUBsockethasstartedsendingout
data,youwilllosewhateveritpublishedbeforetheconnectionwasmade.Ifthisisaproblem,setupyourarchitectureso
theSUBsocketstartsfirst,thenthePUBsocketstartspublishing.

EvenifyousynchronizeaSUBandPUBsocket,youmaystilllosemessages.It'sduetothefactthatinternalqueues
aren'tcreateduntilaconnectionisactuallycreated.Ifyoucanswitchthebind/connectdirectionsotheSUBsocketbinds,
andthePUBsocketconnects,youmayfinditworksmoreasyou'dexpect.

Ifyou'reusingREPandREQsockets,andyou'renotstickingtothesynchronoussend/recv/send/recvorder,ZeroMQwill
reporterrors,whichyoumightignore.Then,itwouldlooklikeyou'relosingmessages.IfyouuseREQorREP,sticktothe
send/recvorder,andalways,inrealcode,checkforerrorsonZeroMQcalls.

Ifyou'reusingPUSHsockets,you'llfindthatthefirstPULLsockettoconnectwillgrabanunfairshareofmessages.The
accuraterotationofmessagesonlyhappenswhenallPULLsocketsaresuccessfullyconnected,whichcantakesome
milliseconds.AsanalternativetoPUSH/PULL,forlowerdatarates,considerusingROUTER/DEALERandtheload
balancingpattern.

Ifyou'resharingsocketsacrossthreads,don't.Itwillleadtorandomweirdness,andcrashes.

Ifyou'reusinginproc,makesurebothsocketsareinthesamecontext.Otherwisetheconnectingsidewillinfactfail.
Also,bindfirst,thenconnect.inprocisnotadisconnectedtransportliketcp.

Ifyou'reusingROUTERsockets,it'sremarkablyeasytolosemessagesbyaccident,bysendingmalformedidentity
frames(orforgettingtosendanidentityframe).IngeneralsettingtheZMQ_ROUTER_MANDATORYoptiononROUTER
socketsisagoodidea,butdoalsocheckthereturncodeoneverysendcall.

Lastly,ifyoureallycan'tfigureoutwhat'sgoingwrong,makeaminimaltestcasethatreproducestheproblem,andaskfor
helpfromtheZeroMQcommunity.

Chapter3AdvancedRequestReplyPatterns topprevnext

InChapter2SocketsandPatternsweworkedthroughthebasicsofusingZeroMQbydevelopingaseriesofsmallapplications,
eachtimeexploringnewaspectsofZeroMQ.We'llcontinuethisapproachinthischapterasweexploreadvancedpatternsbuilt
ontopofZeroMQ'scorerequestreplypattern.

We'llcover:

Howtherequestreplymechanismswork
HowtocombineREQ,REP,DEALER,andROUTERsockets
HowROUTERsocketswork,indetail
Theloadbalancingpattern
Buildingasimpleloadbalancingmessagebroker
DesigningahighlevelAPIforZeroMQ
Buildinganasynchronousrequestreplyserver
Adetailedinterbrokerroutingexample

TheRequestReplyMechanisms topprevnext

Wealreadylookedbrieflyatmultipartmessages.Let'snowlookatamajorusecase,whichisreplymessageenvelopes.An
envelopeisawayofsafelypackagingupdatawithanaddress,withouttouchingthedataitself.Byseparatingreplyaddresses
intoanenvelopewemakeitpossibletowritegeneralpurposeintermediariessuchasAPIsandproxiesthatcreate,read,and
removeaddressesnomatterwhatthemessagepayloadorstructureis.

http://zguide.zeromq.org/page:all 45/225
12/31/2015 MQ - The Guide - MQ - The Guide
Intherequestreplypattern,theenvelopeholdsthereturnaddressforreplies.ItishowaZeroMQnetworkwithnostatecan
createroundtriprequestreplydialogs.

WhenyouuseREQandREPsocketsyoudon'tevenseeenvelopesthesesocketsdealwiththemautomatically.Butformostof
theinterestingrequestreplypatterns,you'llwanttounderstandenvelopesandparticularlyROUTERsockets.We'llworkthrough
thisstepbystep.

TheSimpleReplyEnvelope topprevnext

Arequestreplyexchangeconsistsofarequestmessage,andaneventualreplymessage.Inthesimplerequestreplypattern,
there'sonereplyforeachrequest.Inmoreadvancedpatterns,requestsandrepliescanflowasynchronously.However,thereply
envelopealwaysworksthesameway.

TheZeroMQreplyenvelopeformallyconsistsofzeroormorereplyaddresses,followedbyanemptyframe(theenvelope
delimiter),followedbythemessagebody(zeroormoreframes).Theenvelopeiscreatedbymultiplesocketsworkingtogetherin
achain.We'llbreakthisdown.

We'llstartbysending"Hello"throughaREQsocket.TheREQsocketcreatesthesimplestpossiblereplyenvelope,whichhasno
addresses,justanemptydelimiterframeandthemessageframecontainingthe"Hello"string.Thisisatwoframemessage.

Figure26RequestwithMinimalEnvelope

TheREPsocketdoesthematchingwork:itstripsofftheenvelope,uptoandincludingthedelimiterframe,savesthewhole
envelope,andpassesthe"Hello"stringuptheapplication.ThusouroriginalHelloWorldexampleusedrequestreplyenvelopes
internally,buttheapplicationneversawthem.

Ifyouspyonthenetworkdataflowingbetweenhwclientandhwserver,thisiswhatyou'llsee:everyrequestandeveryreply
isinfacttwoframes,anemptyframeandthenthebody.Itdoesn'tseemtomakemuchsenseforasimpleREQREPdialog.
Howeveryou'llseethereasonwhenweexplorehowROUTERandDEALERhandleenvelopes.

TheExtendedReplyEnvelope topprevnext

Nowlet'sextendtheREQREPpairwithaROUTERDEALERproxyinthemiddleandseehowthisaffectsthereplyenvelope.
ThisistheextendedrequestreplypatternwealreadysawinChapter2SocketsandPatterns.Wecan,infact,insertany
numberofproxysteps.Themechanicsarethesame.

Figure27ExtendedRequestReplyPattern

http://zguide.zeromq.org/page:all 46/225
12/31/2015 MQ - The Guide - MQ - The Guide

Theproxydoesthis,inpseudocode:

preparecontext,frontendandbackendsockets
whiletrue:
pollonbothsockets
iffrontendhadinput:
readallframesfromfrontend
sendtobackend
ifbackendhadinput:
readallframesfrombackend
sendtofrontend

TheROUTERsocket,unlikeothersockets,trackseveryconnectionithas,andtellsthecalleraboutthese.Thewayittellsthe
calleristosticktheconnectionidentityinfrontofeachmessagereceived.Anidentity,sometimescalledanaddress,isjusta
binarystringwithnomeaningexcept"thisisauniquehandletotheconnection".Then,whenyousendamessageviaaROUTER
socket,youfirstsendanidentityframe.

Thezmq_socket()manpagedescribesitthus:

WhenreceivingmessagesaZMQ_ROUTERsocketshallprependamessagepartcontainingtheidentityoftheoriginating
peertothemessagebeforepassingittotheapplication.Messagesreceivedarefairqueuedfromamongallconnected
peers.WhensendingmessagesaZMQ_ROUTERsocketshallremovethefirstpartofthemessageanduseittodetermine
theidentityofthepeerthemessageshallberoutedto.

Asahistoricalnote,ZeroMQv2.2andearlieruseUUIDsasidentities,andZeroMQv3.0andlateruseshortintegers.There's
someimpactonnetworkperformance,butonlywhenyouusemultipleproxyhops,whichisrare.Mostlythechangewasto
simplifybuildinglibzmqbyremovingthedependencyonaUUIDlibrary.

Identitiesareadifficultconcepttounderstand,butit'sessentialifyouwanttobecomeaZeroMQexpert.TheROUTERsocket
inventsarandomidentityforeachconnectionwithwhichitworks.IftherearethreeREQsocketsconnectedtoaROUTER
socket,itwillinventthreerandomidentities,oneforeachREQsocket.

Soifwecontinueourworkedexample,let'ssaytheREQsockethasa3byteidentityABC.Internally,thismeanstheROUTER
socketkeepsahashtablewhereitcansearchforABCandfindtheTCPconnectionfortheREQsocket.

WhenwereceivethemessageofftheROUTERsocket,wegetthreeframes.

Figure28RequestwithOneAddress

http://zguide.zeromq.org/page:all 47/225
12/31/2015 MQ - The Guide - MQ - The Guide

Thecoreoftheproxyloopis"readfromonesocket,writetotheother",soweliterallysendthesethreeframesoutonthe
DEALERsocket.Ifyounowsniffedthenetworktraffic,youwouldseethesethreeframesflyingfromtheDEALERsockettothe
REPsocket.TheREPsocketdoesasbefore,stripsoffthewholeenvelopeincludingthenewreplyaddress,andonceagain
deliversthe"Hello"tothecaller.

IncidentallytheREPsocketcanonlydealwithonerequestreplyexchangeatatime,whichiswhyifyoutrytoreadmultiple
requestsorsendmultiplereplieswithoutstickingtoastrictrecvsendcycle,itgivesanerror.

Youshouldnowbeabletovisualizethereturnpath.Whenhwserversends"World"back,theREPsocketwrapsthatwiththe
envelopeitsaved,andsendsathreeframereplymessageacrossthewiretotheDEALERsocket.

Figure29ReplywithoneAddress

NowtheDEALERreadsthesethreeframes,andsendsallthreeoutviatheROUTERsocket.TheROUTERtakesthefirstframe
forthemessage,whichistheABCidentity,andlooksuptheconnectionforthis.Ifitfindsthat,itthenpumpsthenexttwoframes
outontothewire.

Figure30ReplywithMinimalEnvelope

TheREQsocketpicksthismessageup,andchecksthatthefirstframeistheemptydelimiter,whichitis.TheREQsocket
discardsthatframeandpasses"World"tothecallingapplication,whichprintsitouttotheamazementoftheyoungeruslooking
atZeroMQforthefirsttime.

What'sThisGoodFor? topprevnext

Tobehonest,theusecasesforstrictrequestreplyorextendedrequestreplyaresomewhatlimited.Foronething,there'sno
easywaytorecoverfromcommonfailuresliketheservercrashingduetobuggyapplicationcode.We'llseemoreaboutthisin
Chapter4ReliableRequestReplyPatterns.Howeveronceyougraspthewaythesefoursocketsdealwithenvelopes,andhow
theytalktoeachother,youcandoveryusefulthings.WesawhowROUTERusesthereplyenvelopetodecidewhichclientREQ
sockettorouteareplybackto.Nowlet'sexpressthisanotherway:

http://zguide.zeromq.org/page:all 48/225
12/31/2015 MQ - The Guide - MQ - The Guide
EachtimeROUTERgivesyouamessage,ittellsyouwhatpeerthatcamefrom,asanidentity.
Youcanusethiswithahashtable(withtheidentityaskey)totracknewpeersastheyarrive.
ROUTERwillroutemessagesasynchronouslytoanypeerconnectedtoit,ifyouprefixtheidentityasthefirstframeofthe
message.

ROUTERsocketsdon'tcareaboutthewholeenvelope.Theydon'tknowanythingabouttheemptydelimiter.Alltheycareabout
isthatoneidentityframethatletsthemfigureoutwhichconnectiontosendamessageto.

RecapofRequestReplySockets topprevnext

Let'srecapthis:

TheREQsocketsends,tothenetwork,anemptydelimiterframeinfrontofthemessagedata.REQsocketsare
synchronous.REQsocketsalwayssendonerequestandthenwaitforonereply.REQsocketstalktoonepeeratatime.
IfyouconnectaREQsockettomultiplepeers,requestsaredistributedtoandrepliesexpectedfromeachpeeroneturnat
atime.

TheREPsocketreadsandsavesallidentityframesuptoandincludingtheemptydelimiter,thenpassesthefollowing
frameorframestothecaller.REPsocketsaresynchronousandtalktoonepeeratatime.IfyouconnectaREPsocketto
multiplepeers,requestsarereadfrompeersinfairfashion,andrepliesarealwayssenttothesamepeerthatmadethe
lastrequest.

TheDEALERsocketisoblivioustothereplyenvelopeandhandlesthislikeanymultipartmessage.DEALERsocketsare
asynchronousandlikePUSHandPULLcombined.Theydistributesentmessagesamongallconnections,andfairqueue
receivedmessagesfromallconnections.

TheROUTERsocketisoblivioustothereplyenvelope,likeDEALER.Itcreatesidentitiesforitsconnections,andpasses
theseidentitiestothecallerasafirstframeinanyreceivedmessage.Conversely,whenthecallersendsamessage,it
usesthefirstmessageframeasanidentitytolookuptheconnectiontosendto.ROUTERSareasynchronous.

RequestReplyCombinations topprevnext

Wehavefourrequestreplysockets,eachwithacertainbehavior.We'veseenhowtheyconnectinsimpleandextendedrequest
replypatterns.Butthesesocketsarebuildingblocksthatyoucanusetosolvemanyproblems.

Thesearethelegalcombinations:

REQtoREP
DEALERtoREP
REQtoROUTER
DEALERtoROUTER
DEALERtoDEALER
ROUTERtoROUTER

Andthesecombinationsareinvalid(andI'llexplainwhy):

REQtoREQ
REQtoDEALER
REPtoREP
REPtoROUTER

Herearesometipsforrememberingthesemantics.DEALERislikeanasynchronousREQsocket,andROUTERislikean
asynchronousREPsocket.WhereweuseaREQsocket,wecanuseaDEALERwejusthavetoreadandwritetheenvelope
ourselves.WhereweuseaREPsocket,wecanstickaROUTERwejustneedtomanagetheidentitiesourselves.

ThinkofREQandDEALERsocketsas"clients"andREPandROUTERsocketsas"servers".Mostly,you'llwanttobindREPand
ROUTERsockets,andconnectREQandDEALERsocketstothem.It'snotalwaysgoingtobethissimple,butitisacleanand
memorableplacetostart.

http://zguide.zeromq.org/page:all 49/225
12/31/2015 MQ - The Guide - MQ - The Guide

TheREQtoREPCombination topprevnext

We'vealreadycoveredaREQclienttalkingtoaREPserverbutlet'stakeoneaspect:theREQclientmustinitiatethemessage
flow.AREPservercannottalktoaREQclientthathasn'tfirstsentitarequest.Technically,it'snotevenpossible,andtheAPI
alsoreturnsanEFSMerrorifyoutryit.

TheDEALERtoREPCombination topprevnext

Now,let'sreplacetheREQclientwithaDEALER.ThisgivesusanasynchronousclientthatcantalktomultipleREPservers.If
werewrotethe"HelloWorld"clientusingDEALER,we'dbeabletosendoffanynumberof"Hello"requestswithoutwaitingfor
replies.

WhenweuseaDEALERtotalktoaREPsocket,wemustaccuratelyemulatetheenvelopethattheREQsocketwouldhave
sent,ortheREPsocketwilldiscardthemessageasinvalid.So,tosendamessage,we:

SendanemptymessageframewiththeMOREflagsetthen
Sendthemessagebody.

Andwhenwereceiveamessage,we:

Receivethefirstframeandifit'snotempty,discardthewholemessage
Receivethenextframeandpassthattotheapplication.

TheREQtoROUTERCombination topprevnext

InthesamewaythatwecanreplaceREQwithDEALER,wecanreplaceREPwithROUTER.Thisgivesusanasynchronous
serverthatcantalktomultipleREQclientsatthesametime.Ifwerewrotethe"HelloWorld"serverusingROUTER,we'dbeable
toprocessanynumberof"Hello"requestsinparallel.WesawthisintheChapter2SocketsandPatternsmtserverexample.

WecanuseROUTERintwodistinctways:

Asaproxythatswitchesmessagesbetweenfrontendandbackendsockets.
Asanapplicationthatreadsthemessageandactsonit.

Inthefirstcase,theROUTERsimplyreadsallframes,includingtheartificialidentityframe,andpassesthemonblindly.Inthe
secondcasetheROUTERmustknowtheformatofthereplyenvelopeit'sbeingsent.AstheotherpeerisaREQsocket,the
ROUTERgetstheidentityframe,anemptyframe,andthenthedataframe.

TheDEALERtoROUTERCombination topprevnext

NowwecanswitchoutbothREQandREPwithDEALERandROUTERtogetthemostpowerfulsocketcombination,whichis
DEALERtalkingtoROUTER.Itgivesusasynchronousclientstalkingtoasynchronousservers,wherebothsideshavefullcontrol
overthemessageformats.

BecausebothDEALERandROUTERcanworkwitharbitrarymessageformats,ifyouhopetousethesesafely,youhaveto
becomealittlebitofaprotocoldesigner.AttheveryleastyoumustdecidewhetheryouwishtoemulatetheREQ/REPreply
envelope.Itdependsonwhetheryouactuallyneedtosendrepliesornot.

TheDEALERtoDEALERCombination topprevnext

http://zguide.zeromq.org/page:all 50/225
12/31/2015 MQ - The Guide - MQ - The Guide
YoucanswapaREPwithaROUTER,butyoucanalsoswapaREPwithaDEALER,iftheDEALERistalkingtooneandonly
onepeer.

WhenyoureplaceaREPwithaDEALER,yourworkercansuddenlygofullasynchronous,sendinganynumberofrepliesback.
Thecostisthatyouhavetomanagethereplyenvelopesyourself,andgetthemright,ornothingatallwillwork.We'llseea
workedexamplelater.Let'sjustsayfornowthatDEALERtoDEALERisoneofthetrickierpatternstogetright,andhappilyit's
rarethatweneedit.

TheROUTERtoROUTERCombination topprevnext

ThissoundsperfectforNtoNconnections,butit'sthemostdifficultcombinationtouse.Youshouldavoidituntilyouarewell
advancedwithZeroMQ.We'llseeoneexampleitintheFreelancepatterninChapter4ReliableRequestReplyPatterns,andan
alternativeDEALERtoROUTERdesignforpeertopeerworkinChapter8AFrameworkforDistributedComputing.

InvalidCombinations topprevnext

Mostly,tryingtoconnectclientstoclients,orserverstoserversisabadideaandwon'twork.However,ratherthangivegeneral
vaguewarnings,I'llexplainindetail:

REQtoREQ:bothsideswanttostartbysendingmessagestoeachother,andthiscouldonlyworkifyoutimedthingsso
thatbothpeersexchangedmessagesatthesametime.Ithurtsmybraintoeventhinkaboutit.

REQtoDEALER:youcouldintheorydothis,butitwouldbreakifyouaddedasecondREQbecauseDEALERhasno
wayofsendingareplytotheoriginalpeer.ThustheREQsocketwouldgetconfused,and/orreturnmessagesmeantfor
anotherclient.

REPtoREP:bothsideswouldwaitfortheothertosendthefirstmessage.

REPtoROUTER:theROUTERsocketcanintheoryinitiatethedialogandsendaproperlyformattedrequest,ifitknows
theREPsockethasconnectedanditknowstheidentityofthatconnection.It'smessyandaddsnothingoverDEALERto
ROUTER.

ThecommonthreadinthisvalidversusinvalidbreakdownisthataZeroMQsocketconnectionisalwaysbiasedtowardsonepeer
thatbindstoanendpoint,andanotherthatconnectstothat.Further,thatwhichsidebindsandwhichsideconnectsisnot
arbitrary,butfollowsnaturalpatterns.Thesidewhichweexpectto"bethere"binds:it'llbeaserver,abroker,apublisher,a
collector.Thesidethat"comesandgoes"connects:it'llbeclientsandworkers.Rememberingthiswillhelpyoudesignbetter
ZeroMQarchitectures.

ExploringROUTERSockets topprevnext

Let'slookatROUTERsocketsalittlecloser.We'vealreadyseenhowtheyworkbyroutingindividualmessagestospecific
connections.I'llexplaininmoredetailhowweidentifythoseconnections,andwhataROUTERsocketdoeswhenitcan'tsenda
message.

IdentitiesandAddresses topprevnext

TheidentityconceptinZeroMQrefersspecificallytoROUTERsocketsandhowtheyidentifytheconnectionstheyhavetoother
sockets.Morebroadly,identitiesareusedasaddressesinthereplyenvelope.Inmostcases,theidentityisarbitraryandlocalto
theROUTERsocket:it'salookupkeyinahashtable.Independently,apeercanhaveanaddressthatisphysical(anetwork
endpointlike"tcp://192.168.55.117:5670")orlogical(aUUIDoremailaddressorotheruniquekey).

http://zguide.zeromq.org/page:all 51/225
12/31/2015 MQ - The Guide - MQ - The Guide
AnapplicationthatusesaROUTERsockettotalktospecificpeerscanconvertalogicaladdresstoanidentityifithasbuiltthe
necessaryhashtable.BecauseROUTERsocketsonlyannouncetheidentityofaconnection(toaspecificpeer)whenthatpeer
sendsamessage,youcanonlyreallyreplytoamessage,notspontaneouslytalktoapeer.

ThisistrueevenifyoufliptherulesandmaketheROUTERconnecttothepeerratherthanwaitforthepeertoconnecttothe
ROUTER.HoweveryoucanforcetheROUTERsockettousealogicaladdressinplaceofitsidentity.Thezmq_setsockopt
referencepagecallsthissettingthesocketidentity.Itworksasfollows:

ThepeerapplicationsetstheZMQ_IDENTITYoptionofitspeersocket(DEALERorREQ)beforebindingorconnecting.
UsuallythepeerthenconnectstothealreadyboundROUTERsocket.ButtheROUTERcanalsoconnecttothepeer.
Atconnectiontime,thepeersockettellstheroutersocket,"pleaseusethisidentityforthisconnection".
Ifthepeersocketdoesn'tsaythat,theroutergeneratesitsusualarbitraryrandomidentityfortheconnection.
TheROUTERsocketnowprovidesthislogicaladdresstotheapplicationasaprefixidentityframeforanymessages
cominginfromthatpeer.
TheROUTERalsoexpectsthelogicaladdressastheprefixidentityframeforanyoutgoingmessages.

HereisasimpleexampleoftwopeersthatconnecttoaROUTERsocket,onethatimposesalogicaladdress"PEER2":

identity:IdentitycheckinC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Go|Haskell|Haxe|Java|Lua|Node.js|Perl|PHP|Python|Q|Ruby|Scala|Tcl|Ada|Basic|Felix|
ObjectiveC|ooc|Racket

Hereiswhattheprogramprints:


[005]006B8B4567
[000]
[026]ROUTERusesageneratedUUID

[005]PEER2
[000]
[038]ROUTERusesREQ'ssocketidentity

ROUTERErrorHandling topprevnext

ROUTERsocketsdohaveasomewhatbrutalwayofdealingwithmessagestheycan'tsendanywhere:theydropthemsilently.
It'sanattitudethatmakessenseinworkingcode,butitmakesdebugginghard.The"sendidentityasfirstframe"approachis
trickyenoughthatweoftengetthiswrongwhenwe'relearning,andtheROUTER'sstonysilencewhenwemessupisn'tvery
constructive.

SinceZeroMQv3.2there'sasocketoptionyoucansettocatchthiserror:ZMQ_ROUTER_MANDATORY.SetthatontheROUTER
socketandthenwhenyouprovideanunroutableidentityonasendcall,thesocketwillsignalanEHOSTUNREACHerror.

TheLoadBalancingPattern topprevnext

Nowlet'slookatsomecode.We'llseehowtoconnectaROUTERsockettoaREQsocket,andthentoaDEALERsocket.These
twoexamplesfollowthesamelogic,whichisaloadbalancingpattern.ThispatternisourfirstexposuretousingtheROUTER
socketfordeliberaterouting,ratherthansimplyactingasareplychannel.

Theloadbalancingpatternisverycommonandwe'llseeitseveraltimesinthisbook.Itsolvesthemainproblemwithsimple
roundrobinrouting(asPUSHandDEALERoffer)whichisthatroundrobinbecomesinefficientiftasksdonotallroughlytakethe
sametime.

It'sthepostofficeanalogy.Ifyouhaveonequeuepercounter,andyouhavesomepeoplebuyingstamps(afast,simple
transaction),andsomepeopleopeningnewaccounts(averyslowtransaction),thenyouwillfindstampbuyersgettingunfairly
stuckinqueues.Justasinapostoffice,ifyourmessagingarchitectureisunfair,peoplewillgetannoyed.

http://zguide.zeromq.org/page:all 52/225
12/31/2015 MQ - The Guide - MQ - The Guide
Thesolutioninthepostofficeistocreateasinglequeuesothatevenifoneortwocountersgetstuckwithslowwork,other
counterswillcontinuetoserveclientsonafirstcome,firstservebasis.

OnereasonPUSHandDEALERusethesimplisticapproachissheerperformance.IfyouarriveinanymajorUSairport,you'll
findlongqueuesofpeoplewaitingatimmigration.Theborderpatrolofficialswillsendpeopleinadvancetoqueueupateach
counter,ratherthanusingasinglequeue.Havingpeoplewalkfiftyyardsinadvancesavesaminuteortwoperpassenger.And
becauseeverypassportchecktakesroughlythesametime,it'smoreorlessfair.ThisisthestrategyforPUSHandDEALER:
sendworkloadsaheadoftimesothatthereislesstraveldistance.

ThisisarecurringthemewithZeroMQ:theworld'sproblemsarediverseandyoucanbenefitfromsolvingdifferentproblems
eachintherightway.Theairportisn'tthepostofficeandonesizefitsnoone,reallywell.

Let'sreturntothescenarioofaworker(DEALERorREQ)connectedtoabroker(ROUTER).Thebrokerhastoknowwhenthe
workerisready,andkeepalistofworkerssothatitcantaketheleastrecentlyusedworkereachtime.

Thesolutionisreallysimple,infact:workerssenda"ready"messagewhentheystart,andaftertheyfinisheachtask.Thebroker
readsthesemessagesonebyone.Eachtimeitreadsamessage,itisfromthelastusedworker.Andbecausewe'reusinga
ROUTERsocket,wegetanidentitythatwecanthenusetosendataskbacktotheworker.

It'satwistonrequestreplybecausethetaskissentwiththereply,andanyresponseforthetaskissentasanewrequest.The
followingcodeexamplesshouldmakeitclearer.

ROUTERBrokerandREQWorkers topprevnext

HereisanexampleoftheloadbalancingpatternusingaROUTERbrokertalkingtoasetofREQworkers:

rtreq:ROUTERtoREQinC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Go|Haskell|Haxe|Java|Lua|Node.js|Perl|PHP|Python|Ruby|Scala|Tcl|Ada|Basic|Felix|
ObjectiveC|ooc|Q|Racket

Theexamplerunsforfivesecondsandtheneachworkerprintshowmanytaskstheyhandled.Iftheroutingworked,we'dexpect
afairdistributionofwork:

Completed:20tasks
Completed:18tasks
Completed:21tasks
Completed:23tasks
Completed:19tasks
Completed:21tasks
Completed:17tasks
Completed:17tasks
Completed:25tasks
Completed:19tasks

Totalktotheworkersinthisexample,wehavetocreateaREQfriendlyenvelopeconsistingofanidentityplusanempty
envelopedelimiterframe.

Figure31RoutingEnvelopeforREQ

http://zguide.zeromq.org/page:all 53/225
12/31/2015 MQ - The Guide - MQ - The Guide

ROUTERBrokerandDEALERWorkers topprevnext

AnywhereyoucanuseREQ,youcanuseDEALER.Therearetwospecificdifferences:

TheREQsocketalwayssendsanemptydelimiterframebeforeanydataframestheDEALERdoesnot.
TheREQsocketwillsendonlyonemessagebeforeitreceivesareplytheDEALERisfullyasynchronous.

Thesynchronousversusasynchronousbehaviorhasnoeffectonourexamplebecausewe'redoingstrictrequestreply.Itismore
relevantwhenweaddressrecoveringfromfailures,whichwe'llcometoinChapter4ReliableRequestReplyPatterns.

Nowlet'slookatexactlythesameexamplebutwiththeREQsocketreplacedbyaDEALERsocket:

rtdealer:ROUTERtoDEALERinC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Go|Haskell|Haxe|Java|Lua|Node.js|Perl|PHP|Python|Ruby|Scala|Tcl|Ada|Basic|Felix|
ObjectiveC|ooc|Q|Racket

ThecodeisalmostidenticalexceptthattheworkerusesaDEALERsocket,andreadsandwritesthatemptyframebeforethe
dataframe.ThisistheapproachIusewhenIwanttokeepcompatibilitywithREQworkers.

However,rememberthereasonforthatemptydelimiterframe:it'stoallowmultihopextendedrequeststhatterminateinaREP
socket,whichusesthatdelimitertosplitoffthereplyenvelopesoitcanhandthedataframestoitsapplication.

IfweneverneedtopassthemessagealongtoaREPsocket,wecansimplydroptheemptydelimiterframeatbothsides,which
makesthingssimpler.ThisisusuallythedesignIuseforpureDEALERtoROUTERprotocols.

ALoadBalancingMessageBroker topprevnext

Thepreviousexampleishalfcomplete.Itcanmanageasetofworkerswithdummyrequestsandreplies,butithasnowaytotalk
toclients.IfweaddasecondfrontendROUTERsocketthatacceptsclientrequests,andturnourexampleintoaproxythatcan
switchmessagesfromfrontendtobackend,wegetausefulandreusabletinyloadbalancingmessagebroker.

Figure32LoadBalancingBroker

http://zguide.zeromq.org/page:all 54/225
12/31/2015 MQ - The Guide - MQ - The Guide
Thisbrokerdoesthefollowing:

Acceptsconnectionsfromasetofclients.
Acceptsconnectionsfromasetofworkers.
Acceptsrequestsfromclientsandholdstheseinasinglequeue.
Sendstheserequeststoworkersusingtheloadbalancingpattern.
Receivesrepliesbackfromworkers.
Sendstheserepliesbacktotheoriginalrequestingclient.

Thebrokercodeisfairlylong,butworthunderstanding:

lbbroker:LoadbalancingbrokerinC

C++|C#|Clojure|CL|Delphi|Erlang|F#|Go|Haskell|Haxe|Java|Lua|Node.js|Perl|PHP|Python|Ruby|Scala|Tcl|Ada|Basic|Felix|
ObjectiveC|ooc|Q|Racket

Thedifficultpartofthisprogramis(a)theenvelopesthateachsocketreadsandwrites,and(b)theloadbalancingalgorithm.
We'lltaketheseinturn,startingwiththemessageenvelopeformats.

Let'swalkthroughafullrequestreplychainfromclienttoworkerandback.Inthiscodewesettheidentityofclientandworker
socketstomakeiteasiertotracethemessageframes.Inreality,we'dallowtheROUTERsocketstoinventidentitiesfor
connections.Let'sassumetheclient'sidentityis"CLIENT"andtheworker'sidentityis"WORKER".Theclientapplicationsendsa
singleframecontaining"Hello".

Figure33MessagethatClientSends

BecausetheREQsocketaddsitsemptydelimiterframeandtheROUTERsocketaddsitsconnectionidentity,theproxyreadsoff
thefrontendROUTERsockettheclientaddress,emptydelimiterframe,andthedatapart.

Figure34MessageCominginonFrontend

Thebrokersendsthistotheworker,prefixedbytheaddressofthechosenworker,plusanadditionalemptyparttokeeptheREQ
attheotherendhappy.

Figure35MessageSenttoBackend

http://zguide.zeromq.org/page:all 55/225
12/31/2015 MQ - The Guide - MQ - The Guide

ThiscomplexenvelopestackgetschewedupfirstbythebackendROUTERsocket,whichremovesthefirstframe.Thenthe
REQsocketintheworkerremovestheemptypart,andprovidestheresttotheworkerapplication.

Figure36MessageDeliveredtoWorker

Theworkerhastosavetheenvelope(whichisallthepartsuptoandincludingtheemptymessageframe)andthenitcando
what'sneededwiththedatapart.NotethataREPsocketwoulddothisautomatically,butwe'reusingtheREQROUTERpattern
sothatwecangetproperloadbalancing.

Onthereturnpath,themessagesarethesameaswhentheycomein,i.e.,thebackendsocketgivesthebrokeramessagein
fiveparts,andthebrokersendsthefrontendsocketamessageinthreeparts,andtheclientgetsamessageinonepart.

Nowlet'slookattheloadbalancingalgorithm.ItrequiresthatbothclientsandworkersuseREQsockets,andthatworkers
correctlystoreandreplaytheenvelopeonmessagestheyget.Thealgorithmis:

Createapollsetthatalwayspollsthebackend,andpollsthefrontendonlyifthereareoneormoreworkersavailable.

Pollforactivitywithinfinitetimeout.

Ifthereisactivityonthebackend,weeitherhavea"ready"messageorareplyforaclient.Ineithercase,westorethe
workeraddress(thefirstpart)onourworkerqueue,andiftherestisaclientreply,wesenditbacktothatclientviathe
frontend.

Ifthereisactivityonthefrontend,wetaketheclientrequest,popthenextworker(whichisthelastused),andsendthe
requesttothebackend.Thismeanssendingtheworkeraddress,emptypart,andthenthethreepartsoftheclientrequest.

Youshouldnowseethatyoucanreuseandextendtheloadbalancingalgorithmwithvariationsbasedontheinformationthe
workerprovidesinitsinitial"ready"message.Forexample,workersmightstartupanddoaperformanceselftest,thentellthe
brokerhowfasttheyare.Thebrokercanthenchoosethefastestavailableworkerratherthantheoldest.

AHighLevelAPIforZeroMQ topprevnext

We'regoingtopushrequestreplyontothestackandopenadifferentarea,whichistheZeroMQAPIitself.There'sareasonfor
thisdetour:aswewritemorecomplexexamples,thelowlevelZeroMQAPIstartstolookincreasinglyclumsy.Lookatthecoreof
theworkerthreadfromourloadbalancingbroker:

while(true){
//Getoneaddressframeandemptydelimiter
char*address=s_recv(worker)
char*empty=s_recv(worker)
assert(*empty==0)
free(empty)

//Getrequest,sendreply
char*request=s_recv(worker)
printf("Worker:%s\n",request)
free(request)

s_sendmore(worker,address)

http://zguide.zeromq.org/page:all 56/225
12/31/2015 MQ - The Guide - MQ - The Guide
s_sendmore(worker,"")

s_send(worker,"OK")
free(address)
}

Thatcodeisn'tevenreusablebecauseitcanonlyhandleonereplyaddressintheenvelope,anditalreadydoessomewrapping
aroundtheZeroMQAPI.IfweusedthelibzmqsimplemessageAPIthisiswhatwe'dhavetowrite:

while(true){
//Getoneaddressframeandemptydelimiter
charaddress[255]
intaddress_size=zmq_recv(worker,address,255,0)
if(address_size==1)
break

charempty[1]
intempty_size=zmq_recv(worker,empty,1,0)
zmq_recv(worker,&empty,0)
assert(empty_size<=0)
if(empty_size==1)
break

//Getrequest,sendreply
charrequest[256]
intrequest_size=zmq_recv(worker,request,255,0)
if(request_size==1)
returnNULL
request[request_size]=0
printf("Worker:%s\n",request)

zmq_send(worker,address,address_size,ZMQ_SNDMORE)
zmq_send(worker,empty,0,ZMQ_SNDMORE)
zmq_send(worker,"OK",2,0)
}

Andwhencodeistoolongtowritequickly,it'salsotoolongtounderstand.Upuntilnow,I'vestucktothenativeAPIbecause,as
ZeroMQusers,weneedtoknowthatintimately.Butwhenitgetsinourway,wehavetotreatitasaproblemtosolve.

Wecan'tofcoursejustchangetheZeroMQAPI,whichisadocumentedpubliccontractonwhichthousandsofpeopleagreeand
depend.Instead,weconstructahigherlevelAPIontopbasedonourexperiencesofar,andmostspecifically,ourexperience
fromwritingmorecomplexrequestreplypatterns.

WhatwewantisanAPIthatletsusreceiveandsendanentiremessageinoneshot,includingthereplyenvelopewithany
numberofreplyaddresses.Onethatletsusdowhatwewantwiththeabsoluteleastlinesofcode.

MakingagoodmessageAPIisfairlydifficult.Wehaveaproblemofterminology:ZeroMQuses"message"todescribeboth
multipartmessages,andindividualmessageframes.Wehaveaproblemofexpectations:sometimesit'snaturaltoseemessage
contentasprintablestringdata,sometimesasbinaryblobs.Andwehavetechnicalchallenges,especiallyifwewanttoavoid
copyingdataaroundtoomuch.

ThechallengeofmakingagoodAPIaffectsalllanguages,thoughmyspecificusecaseisC.Whateverlanguageyouuse,think
abouthowyoucouldcontributetoyourlanguagebindingtomakeitasgood(orbetter)thantheCbindingI'mgoingtodescribe.

FeaturesofaHigherLevelAPI topprevnext

Mysolutionistousethreefairlynaturalandobviousconcepts:string(alreadythebasisforours_sendands_recv)helpers,
frame(amessageframe),andmessage(alistofoneormoreframes).Hereistheworkercode,rewrittenontoanAPIusing
theseconcepts:

http://zguide.zeromq.org/page:all 57/225
12/31/2015 MQ - The Guide - MQ - The Guide

while(true){
zmsg_t*msg=zmsg_recv(worker)
zframe_reset(zmsg_last(msg),"OK",2)
zmsg_send(&msg,worker)
}

Cuttingtheamountofcodeweneedtoreadandwritecomplexmessagesisgreat:theresultsareeasytoreadandunderstand.
Let'scontinuethisprocessforotheraspectsofworkingwithZeroMQ.Here'sawishlistofthingsI'dlikeinahigherlevelAPI,
basedonmyexperiencewithZeroMQsofar:

Automatichandlingofsockets.Ifinditcumbersometohavetoclosesocketsmanually,andtohavetoexplicitlydefinethe
lingertimeoutinsome(butnotall)cases.It'dbegreattohaveawaytoclosesocketsautomaticallywhenIclosethe
context.

Portablethreadmanagement.EverynontrivialZeroMQapplicationusesthreads,butPOSIXthreadsaren'tportable.Soa
decenthighlevelAPIshouldhidethisunderaportablelayer.

Pipingfromparenttochildthreads.It'sarecurrentproblem:howtosignalbetweenparentandchildthreads.OurAPI
shouldprovideaZeroMQmessagepipe(usingPAIRsocketsandinprocautomatically.

Portableclocks.Evengettingthetimetoamillisecondresolution,orsleepingforsomemilliseconds,isnotportable.
RealisticZeroMQapplicationsneedportableclocks,soourAPIshouldprovidethem.

Areactortoreplacezmq_poll().Thepollloopissimple,butclumsy.Writingalotofthese,weendupdoingthesame
workoverandover:calculatingtimers,andcallingcodewhensocketsareready.Asimplereactorwithsocketreadersand
timerswouldsavealotofrepeatedwork.

ProperhandlingofCtrlC.Wealreadysawhowtocatchaninterrupt.Itwouldbeusefulifthishappenedinallapplications.

TheCZMQHighLevelAPI topprevnext

TurningthiswishlistintorealityfortheClanguagegivesusCZMQ,aZeroMQlanguagebindingforC.Thishighlevelbinding,in
fact,developedoutofearlierversionsoftheexamples.ItcombinesnicersemanticsforworkingwithZeroMQwithsome
portabilitylayers,and(importantlyforC,butlessforotherlanguages)containerslikehashesandlists.CZMQalsousesan
elegantobjectmodelthatleadstofranklylovelycode.

HereistheloadbalancingbrokerrewrittentouseahigherlevelAPI(CZMQfortheCcase):

lbbroker2:LoadbalancingbrokerusinghighlevelAPIinC

Delphi|Haxe|Java|Lua|PHP|Python|Scala|Ada|Basic|C++|C#|Clojure|CL|Erlang|F#|Felix|Go|Haskell|Node.js|ObjectiveC|ooc|Perl
|Q|Racket|Ruby|Tcl

OnethingCZMQprovidesiscleaninterrupthandling.ThismeansthatCtrlCwillcauseanyblockingZeroMQcalltoexitwitha
returncode1anderrnosettoEINTR.ThehighlevelrecvmethodswillreturnNULLinsuchcases.So,youcancleanlyexita
looplikethis:

while(true){
zstr_send(client,"Hello")
char*reply=zstr_recv(client)
if(!reply)
break//Interrupted
printf("Client:%s\n",reply)
free(reply)
sleep(1)
}

Or,ifyou'recallingzmq_poll(),testonthereturncode:

http://zguide.zeromq.org/page:all 58/225
12/31/2015 MQ - The Guide - MQ - The Guide
if(zmq_poll(items,2,1000*1000)==1)
break//Interrupted

Thepreviousexamplestilluseszmq_poll().Sohowaboutreactors?TheCZMQzloopreactorissimplebutfunctional.Itlets
you:

Setareaderonanysocket,i.e.,codethatiscalledwheneverthesockethasinput.
Cancelareaderonasocket.
Setatimerthatgoesoffonceormultipletimesatspecificintervals.
Cancelatimer.

zloopofcourseuseszmq_poll()internally.Itrebuildsitspollseteachtimeyouaddorremovereaders,anditcalculatesthe
polltimeouttomatchthenexttimer.Then,itcallsthereaderandtimerhandlersforeachsocketandtimerthatneedattention.

Whenweuseareactorpattern,ourcodeturnsinsideout.Themainlogiclookslikethis:

zloop_t*reactor=zloop_new()
zloop_reader(reactor,self>backend,s_handle_backend,self)
zloop_start(reactor)
zloop_destroy(&reactor)

Theactualhandlingofmessagessitsinsidededicatedfunctionsormethods.Youmaynotlikethestyleit'samatteroftaste.
Whatitdoeshelpwithismixingtimersandsocketactivity.Intherestofthistext,we'llusezmq_poll()insimplercases,and
zloopinmorecomplexexamples.

Hereistheloadbalancingbrokerrewrittenonceagain,thistimetousezloop:

lbbroker3:LoadbalancingbrokerusingzloopinC

Haxe|Java|Python|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|Q|
Racket|Ruby|Scala|Tcl

GettingapplicationstoproperlyshutdownwhenyousendthemCtrlCcanbetricky.Ifyouusethezctxclassit'llautomatically
setupsignalhandling,butyourcodestillhastocooperate.Youmustbreakanyloopifzmq_pollreturns1orifanyofthe
zstr_recv,zframe_recv,orzmsg_recvmethodsreturnNULL.Ifyouhavenestedloops,itcanbeusefultomaketheouter
onesconditionalon!zctx_interrupted.

Ifyou'reusingchildthreads,theywon'treceivetheinterrupt.Totellthemtoshutdown,youcaneither:

Destroythecontext,iftheyaresharingthesamecontext,inwhichcaseanyblockingcallstheyarewaitingonwillendwith
ETERM.
Sendthemshutdownmessages,iftheyareusingtheirowncontexts.Forthisyou'llneedsomesocketplumbing.

TheAsynchronousClient/ServerPattern topprevnext

IntheROUTERtoDEALERexample,wesawa1toNusecasewhereoneservertalksasynchronouslytomultipleworkers.We
canturnthisupsidedowntogetaveryusefulNto1architecturewherevariousclientstalktoasingleserver,anddothis
asynchronously.

Figure37AsynchronousClient/Server

http://zguide.zeromq.org/page:all 59/225
12/31/2015 MQ - The Guide - MQ - The Guide

Here'showitworks:

Clientsconnecttotheserverandsendrequests.
Foreachrequest,theserversends0ormorereplies.
Clientscansendmultiplerequestswithoutwaitingforareply.
Serverscansendmultiplereplieswithoutwaitingfornewrequests.

Here'scodethatshowshowthisworks:

asyncsrv:Asynchronousclient/serverinC

C++|C#|Clojure|Delphi|Erlang|F#|Go|Haskell|Haxe|Java|Lua|Node.js|PHP|Python|Ruby|Scala|Tcl|Ada|Basic|CL|Felix|ObjectiveC
|ooc|Perl|Q|Racket

Theexamplerunsinoneprocess,withmultiplethreadssimulatingarealmultiprocessarchitecture.Whenyouruntheexample,
you'llseethreeclients(eachwitharandomID),printingouttherepliestheygetfromtheserver.Lookcarefullyandyou'llsee
eachclienttaskgets0ormorerepliesperrequest.

Somecommentsonthiscode:

Theclientssendarequestoncepersecond,andgetzeroormorerepliesback.Tomakethisworkusingzmq_poll(),we
can'tsimplypollwitha1secondtimeout,orwe'dendupsendinganewrequestonlyonesecondafterwereceivedthe
lastreply.Sowepollatahighfrequency(100timesat1/100thofasecondperpoll),whichisapproximatelyaccurate.

Theserverusesapoolofworkerthreads,eachprocessingonerequestsynchronously.Itconnectsthesetoitsfrontend
socketusinganinternalqueue.Itconnectsthefrontendandbackendsocketsusingazmq_proxy()call.

Figure38DetailofAsynchronousServer

http://zguide.zeromq.org/page:all 60/225
12/31/2015 MQ - The Guide - MQ - The Guide

Notethatwe'redoingDEALERtoROUTERdialogbetweenclientandserver,butinternallybetweentheservermainthreadand
workers,we'redoingDEALERtoDEALER.Iftheworkerswerestrictlysynchronous,we'duseREP.However,becausewewant
tosendmultiplereplies,weneedanasyncsocket.Wedonotwanttoroutereplies,theyalwaysgotothesingleserverthread
thatsentustherequest.

Let'sthinkabouttheroutingenvelope.Theclientsendsamessageconsistingofasingleframe.Theserverthreadreceivesa
twoframemessage(originalmessageprefixedbyclientidentity).Wesendthesetwoframesontotheworker,whichtreatsitasa
normalreplyenvelope,returnsthattousasatwoframemessage.Wethenusethefirstframeasanidentitytoroutethesecond
framebacktotheclientasareply.

Itlookssomethinglikethis:

clientserverfrontendworker
[DEALER]<>[ROUTER<>DEALER<>DEALER]
1part2parts2parts

Nowforthesockets:wecouldusetheloadbalancingROUTERtoDEALERpatterntotalktoworkers,butit'sextrawork.Inthis
case,aDEALERtoDEALERpatternisprobablyfine:thetradeoffislowerlatencyforeachrequest,buthigherriskofunbalanced
workdistribution.Simplicitywinsinthiscase.

Whenyoubuildserversthatmaintainstatefulconversationswithclients,youwillrunintoaclassicproblem.Iftheserverkeeps
somestateperclient,andclientskeepcomingandgoing,eventuallyitwillrunoutofresources.Evenifthesameclientskeep
connecting,ifyou'reusingdefaultidentities,eachconnectionwilllooklikeanewone.

Wecheatintheaboveexamplebykeepingstateonlyforaveryshorttime(thetimeittakesaworkertoprocessarequest)and
thenthrowingawaythestate.Butthat'snotpracticalformanycases.Toproperlymanageclientstateinastatefulasynchronous
server,youhaveto:

http://zguide.zeromq.org/page:all 61/225
12/31/2015 MQ - The Guide - MQ - The Guide
Doheartbeatingfromclienttoserver.Inourexample,wesendarequestoncepersecond,whichcanreliablybeusedasa
heartbeat.

Storestateusingtheclientidentity(whethergeneratedorexplicit)askey.

Detectastoppedheartbeat.Ifthere'snorequestfromaclientwithin,say,twoseconds,theservercandetectthisand
destroyanystateit'sholdingforthatclient.

WorkedExample:InterBrokerRouting topprevnext

Let'stakeeverythingwe'veseensofar,andscalethingsuptoarealapplication.We'llbuildthisstepbystepoverseveral
iterations.Ourbestclientcallsusurgentlyandasksforadesignofalargecloudcomputingfacility.Hehasthisvisionofacloud
thatspansmanydatacenters,eachaclusterofclientsandworkers,andthatworkstogetherasawhole.Becausewe'resmart
enoughtoknowthatpracticealwaysbeatstheory,weproposetomakeaworkingsimulationusingZeroMQ.Ourclient,eagerto
lockdownthebudgetbeforehisownbosschangeshismind,andhavingreadgreatthingsaboutZeroMQonTwitter,agrees.

EstablishingtheDetails topprevnext

Severalespressoslater,wewanttojumpintowritingcode,butalittlevoicetellsustogetmoredetailsbeforemakinga
sensationalsolutiontoentirelythewrongproblem."Whatkindofworkistheclouddoing?",weask.

Theclientexplains:

Workersrunonvariouskindsofhardware,buttheyareallabletohandleanytask.Thereareseveralhundredworkersper
cluster,andasmanyasadozenclustersintotal.

Clientscreatetasksforworkers.Eachtaskisanindependentunitofworkandalltheclientwantsistofindanavailable
worker,andsenditthetask,assoonaspossible.Therewillbealotofclientsandthey'llcomeandgoarbitrarily.

Therealdifficultyistobeabletoaddandremoveclustersatanytime.Aclustercanleaveorjointhecloudinstantly,
bringingallitsworkersandclientswithit.

Iftherearenoworkersintheirowncluster,clients'taskswillgoofftootheravailableworkersinthecloud.

Clientssendoutonetaskatatime,waitingforareply.Iftheydon'tgetananswerwithinXseconds,they'lljustsendout
thetaskagain.Thisisn'tourconcerntheclientAPIdoesitalready.

Workersprocessonetaskatatimetheyareverysimplebeasts.Iftheycrash,theygetrestartedbywhateverscript
startedthem.

Sowedoublechecktomakesurethatweunderstoodthiscorrectly:

"Therewillbesomekindofsuperdupernetworkinterconnectbetweenclusters,right?",weask.Theclientsays,"Yes,of
course,we'renotidiots."

"Whatkindofvolumesarewetalkingabout?",weask.Theclientreplies,"Uptoathousandclientspercluster,eachdoing
atmosttenrequestspersecond.Requestsaresmall,andrepliesarealsosmall,nomorethan1Kbyteseach."

SowedoalittlecalculationandseethatthiswillworknicelyoverplainTCP.2,500clientsx10/secondx1,000bytesx2
directions=50MB/secor400Mb/sec,notaproblemfora1Gbnetwork.

It'sastraightforwardproblemthatrequiresnoexotichardwareorprotocols,justsomecleverroutingalgorithmsandcareful
design.Westartbydesigningonecluster(onedatacenter)andthenwefigureouthowtoconnectclusterstogether.

ArchitectureofaSingleCluster topprevnext

http://zguide.zeromq.org/page:all 62/225
12/31/2015 MQ - The Guide - MQ - The Guide
Workersandclientsaresynchronous.Wewanttousetheloadbalancingpatterntoroutetaskstoworkers.Workersareall
identicalourfacilityhasnonotionofdifferentservices.Workersareanonymousclientsneveraddressthemdirectly.Wemake
noattemptheretoprovideguaranteeddelivery,retry,andsoon.

Forreasonswealreadyexamined,clientsandworkerswon'tspeaktoeachotherdirectly.Itmakesitimpossibletoaddorremove
nodesdynamically.Soourbasicmodelconsistsoftherequestreplymessagebrokerwesawearlier.

Figure39ClusterArchitecture

ScalingtoMultipleClusters topprevnext

Nowwescalethisouttomorethanonecluster.Eachclusterhasasetofclientsandworkers,andabrokerthatjoinsthese
together.

Figure40MultipleClusters

http://zguide.zeromq.org/page:all 63/225
12/31/2015 MQ - The Guide - MQ - The Guide

Thequestionis:howdowegettheclientsofeachclustertalkingtotheworkersoftheothercluster?Thereareafewpossibilities,
eachwithprosandcons:

Clientscouldconnectdirectlytobothbrokers.Theadvantageisthatwedon'tneedtomodifybrokersorworkers.But
clientsgetmorecomplexandbecomeawareoftheoveralltopology.Ifwewanttoaddathirdorforthcluster,forexample,
alltheclientsareaffected.Ineffectwehavetomoveroutingandfailoverlogicintotheclientsandthat'snotnice.

Workersmightconnectdirectlytobothbrokers.ButREQworkerscan'tdothat,theycanonlyreplytoonebroker.We
mightuseREPsbutREPsdon'tgiveuscustomizablebrokertoworkerroutinglikeloadbalancingdoes,onlythebuiltin
loadbalancing.That'safailifwewanttodistributeworktoidleworkers,wepreciselyneedloadbalancing.Onesolution
wouldbetouseROUTERsocketsfortheworkernodes.Let'slabelthis"Idea#1".

Brokerscouldconnecttoeachother.Thislooksneatestbecauseitcreatesthefewestadditionalconnections.Wecan't
addclustersonthefly,butthatisprobablyoutofscope.Nowclientsandworkersremainignorantoftherealnetwork
topology,andbrokerstelleachotherwhentheyhavesparecapacity.Let'slabelthis"Idea#2".

Let'sexploreIdea#1.Inthismodel,wehaveworkersconnectingtobothbrokersandacceptingjobsfromeitherone.

Figure41Idea1:CrossconnectedWorkers

Itlooksfeasible.However,itdoesn'tprovidewhatwewanted,whichwasthatclientsgetlocalworkersifpossibleandremote
workersonlyifit'sbetterthanwaiting.Alsoworkerswillsignal"ready"tobothbrokersandcangettwojobsatonce,whileother
workersremainidle.Itseemsthisdesignfailsbecauseagainwe'reputtingroutinglogicattheedges.

http://zguide.zeromq.org/page:all 64/225
12/31/2015 MQ - The Guide - MQ - The Guide
So,idea#2then.Weinterconnectthebrokersanddon'ttouchtheclientsorworkers,whichareREQslikewe'reusedto.

Figure42Idea2:BrokersTalkingtoEachOther

Thisdesignisappealingbecausetheproblemissolvedinoneplace,invisibletotherestoftheworld.Basically,brokersopen
secretchannelstoeachotherandwhisper,likecameltraders,"Hey,I'vegotsomesparecapacity.Ifyouhavetoomanyclients,
givemeashoutandwe'lldeal".

Ineffectitisjustamoresophisticatedroutingalgorithm:brokersbecomesubcontractorsforeachother.Thereareotherthingsto
likeaboutthisdesign,evenbeforeweplaywithrealcode:

Ittreatsthecommoncase(clientsandworkersonthesamecluster)asdefaultanddoesextraworkfortheexceptional
case(shufflingjobsbetweenclusters).

Itletsususedifferentmessageflowsforthedifferenttypesofwork.Thatmeanswecanhandlethemdifferently,e.g.,
usingdifferenttypesofnetworkconnection.

Itfeelslikeitwouldscalesmoothly.Interconnectingthreeormorebrokersdoesn'tgetoverlycomplex.Ifwefindthistobe
aproblem,it'seasytosolvebyaddingasuperbroker.

We'llnowmakeaworkedexample.We'llpackanentireclusterintooneprocess.Thatisobviouslynotrealistic,butitmakesit
simpletosimulate,andthesimulationcanaccuratelyscaletorealprocesses.ThisisthebeautyofZeroMQyoucandesignat
themicrolevelandscalethatuptothemacrolevel.Threadsbecomeprocesses,andthenbecomeboxesandthepatternsand
logicremainthesame.Eachofour"cluster"processescontainsclientthreads,workerthreads,andabrokerthread.

Weknowthebasicmodelwellbynow:

TheREQclient(REQ)threadscreateworkloadsandpassthemtothebroker(ROUTER).
TheREQworker(REQ)threadsprocessworkloadsandreturntheresultstothebroker(ROUTER).
Thebrokerqueuesanddistributesworkloadsusingtheloadbalancingpattern.

FederationVersusPeering topprevnext

Thereareseveralpossiblewaystointerconnectbrokers.Whatwewantistobeabletotellotherbrokers,"wehavecapacity",
andthenreceivemultipletasks.Wealsoneedtobeabletotellotherbrokers,"stop,we'refull".Itdoesn'tneedtobeperfect
sometimeswemayacceptjobswecan'tprocessimmediately,thenwe'lldothemassoonaspossible.

Thesimplestinterconnectisfederation,inwhichbrokerssimulateclientsandworkersforeachother.Wewoulddothisby
connectingourfrontendtotheotherbroker'sbackendsocket.Notethatitislegaltobothbindasockettoanendpointand
connectittootherendpoints.

Figure43CrossconnectedBrokersinFederationModel

http://zguide.zeromq.org/page:all 65/225
12/31/2015 MQ - The Guide - MQ - The Guide

Thiswouldgiveussimplelogicinbothbrokersandareasonablygoodmechanism:whentherearenoworkers,telltheother
broker"ready",andacceptonejobfromit.Theproblemisalsothatitistoosimpleforthisproblem.Afederatedbrokerwouldbe
abletohandleonlyonetaskatatime.Ifthebrokeremulatesalockstepclientandworker,itisbydefinitionalsogoingtobelock
step,andifithaslotsofavailableworkerstheywon'tbeused.Ourbrokersneedtobeconnectedinafullyasynchronousfashion.

Thefederationmodelisperfectforotherkindsofrouting,especiallyserviceorientedarchitectures(SOAs),whichroutebyservice
nameandproximityratherthanloadbalancingorroundrobin.Sodon'tdismissitasuseless,it'sjustnotrightforallusecases.

Insteadoffederation,let'slookatapeeringapproachinwhichbrokersareexplicitlyawareofeachotherandtalkoverprivileged
channels.Let'sbreakthisdown,assumingwewanttointerconnectNbrokers.Eachbrokerhas(N1)peers,andallbrokersare
usingexactlythesamecodeandlogic.Therearetwodistinctflowsofinformationbetweenbrokers:

Eachbrokerneedstotellitspeershowmanyworkersithasavailableatanytime.Thiscanbefairlysimpleinformation
justaquantitythatisupdatedregularly.Theobvious(andcorrect)socketpatternforthisispubsub.Soeverybroker
opensaPUBsocketandpublishesstateinformationonthat,andeverybrokeralsoopensaSUBsocketandconnectsthat
tothePUBsocketofeveryotherbrokertogetstateinformationfromitspeers.

Eachbrokerneedsawaytodelegatetaskstoapeerandgetrepliesback,asynchronously.We'lldothisusingROUTER
socketsnoothercombinationworks.Eachbrokerhastwosuchsockets:onefortasksitreceivesandonefortasksit
delegates.Ifwedidn'tusetwosockets,itwouldbemoreworktoknowwhetherwewerereadingarequestorareplyeach
time.Thatwouldmeanaddingmoreinformationtothemessageenvelope.

Andthereisalsotheflowofinformationbetweenabrokeranditslocalclientsandworkers.

TheNamingCeremony topprevnext

Threeflowsxtwosocketsforeachflow=sixsocketsthatwehavetomanageinthebroker.Choosinggoodnamesisvitalto
keepingamultisocketjugglingactreasonablycoherentinourminds.Socketsdosomethingandwhattheydoshouldformthe
basisfortheirnames.It'saboutbeingabletoreadthecodeseveralweekslateronacoldMondaymorningbeforecoffee,andnot
feelanypain.

Let'sdoashamanisticnamingceremonyforthesockets.Thethreeflowsare:

Alocalrequestreplyflowbetweenthebrokeranditsclientsandworkers.
Acloudrequestreplyflowbetweenthebrokeranditspeerbrokers.
Astateflowbetweenthebrokeranditspeerbrokers.

Findingmeaningfulnamesthatareallthesamelengthmeansourcodewillalignnicely.It'snotabigthing,butattentiontodetails
helps.Foreachflowthebrokerhastwosocketsthatwecanorthogonallycallthefrontendandbackend.We'veusedthese
namesquiteoften.Afrontendreceivesinformationortasks.Abackendsendsthoseouttootherpeers.Theconceptualflowis
fromfronttoback(withrepliesgoingintheoppositedirectionfrombacktofront).

http://zguide.zeromq.org/page:all 66/225
12/31/2015 MQ - The Guide - MQ - The Guide
Soinallthecodewewriteforthistutorial,wewillusethesesocketnames:

localfeandlocalbeforthelocalflow.
cloudfeandcloudbeforthecloudflow.
statefeandstatebeforthestateflow.

Forourtransportandbecausewe'resimulatingthewholethingononebox,we'lluseipcforeverything.Thishastheadvantage
ofworkingliketcpintermsofconnectivity(i.e.,it'sadisconnectedtransport,unlikeinproc),yetwedon'tneedIPaddressesor
DNSnames,whichwouldbeapainhere.Instead,wewilluseipcendpointscalledsomethinglocal,somethingcloud,and
somethingstate,wheresomethingisthenameofoursimulatedcluster.

Youmightbethinkingthatthisisalotofworkforsomenames.Whynotcallthems1,s2,s3,s4,etc.?Theansweristhatifyour
brainisnotaperfectmachine,youneedalotofhelpwhenreadingcode,andwe'llseethatthesenamesdohelp.It'seasierto
remember"threeflows,twodirections"than"sixdifferentsockets".

Figure44BrokerSocketArrangement

Notethatweconnectthecloudbeineachbrokertothecloudfeineveryotherbroker,andlikewiseweconnectthestatebeineach
brokertothestatefeineveryotherbroker.

PrototypingtheStateFlow topprevnext

Becauseeachsocketflowhasitsownlittletrapsfortheunwary,wewilltesttheminrealcodeonebyone,ratherthantryto

http://zguide.zeromq.org/page:all 67/225
12/31/2015 MQ - The Guide - MQ - The Guide
throwthewholelotintocodeinonego.Whenwe'rehappywitheachflow,wecanputthemtogetherintoafullprogram.We'll
startwiththestateflow.

Figure45TheStateFlow

Hereishowthisworksincode:

peering1:PrototypestateflowinC

C#|Clojure|Delphi|F#|Go|Haskell|Haxe|Java|Lua|PHP|Python|Racket|Ruby|Scala|Tcl|Ada|Basic|C++|CL|Erlang|Felix|Node.js|
ObjectiveC|ooc|Perl|Q

Notesaboutthiscode:

Eachbrokerhasanidentitythatweusetoconstructipcendpointnames.ArealbrokerwouldneedtoworkwithTCPand
amoresophisticatedconfigurationscheme.We'lllookatsuchschemeslaterinthisbook,butfornow,usinggenerated
ipcnamesletsusignoretheproblemofwheretogetTCP/IPaddressesornames.

Weuseazmq_poll()loopasthecoreoftheprogram.Thisprocessesincomingmessagesandsendsoutstate
messages.Wesendastatemessageonlyifwedidnotgetanyincomingmessagesandwewaitedforasecond.Ifwe
sendoutastatemessageeachtimewegetonein,we'llgetmessagestorms.

Weuseatwopartpubsubmessageconsistingofsenderaddressanddata.Notethatwewillneedtoknowtheaddressof
thepublisherinordertosendittasks,andtheonlywayistosendthisexplicitlyasapartofthemessage.

Wedon'tsetidentitiesonsubscribersbecauseifwedidthenwe'dgetoutdatedstateinformationwhenconnectingto
runningbrokers.

http://zguide.zeromq.org/page:all 68/225
12/31/2015 MQ - The Guide - MQ - The Guide
Wedon'tsetaHWMonthepublisher,butifwewereusingZeroMQv2.xthatwouldbeawiseidea.

Wecanbuildthislittleprogramandrunitthreetimestosimulatethreeclusters.Let'scallthemDC1,DC2,andDC3(thenames
arearbitrary).Werunthesethreecommands,eachinaseparatewindow:

peering1DC1DC2DC3#StartDC1andconnecttoDC2andDC3
peering1DC2DC1DC3#StartDC2andconnecttoDC1andDC3
peering1DC3DC1DC2#StartDC3andconnecttoDC1andDC2

You'llseeeachclusterreportthestateofitspeers,andafterafewsecondstheywillallhappilybeprintingrandomnumbersonce
persecond.Trythisandsatisfyyourselfthatthethreebrokersallmatchupandsynchronizetopersecondstateupdates.

Inreallife,we'dnotsendoutstatemessagesatregularintervals,butratherwheneverwehadastatechange,i.e.,whenevera
workerbecomesavailableorunavailable.Thatmayseemlikealotoftraffic,butstatemessagesaresmallandwe'veestablished
thattheinterclusterconnectionsaresuperfast.

Ifwewantedtosendstatemessagesatpreciseintervals,we'dcreateachildthreadandopenthestatebesocketinthatthread.
We'dthensendirregularstateupdatestothatchildthreadfromourmainthreadandallowthechildthreadtoconflatetheminto
regularoutgoingmessages.Thisismoreworkthanweneedhere.

PrototypingtheLocalandCloudFlows topprevnext

Let'snowprototypetheflowoftasksviathelocalandcloudsockets.Thiscodepullsrequestsfromclientsandthendistributes
themtolocalworkersandcloudpeersonarandombasis.

Figure46TheFlowofTasks

http://zguide.zeromq.org/page:all 69/225
12/31/2015 MQ - The Guide - MQ - The Guide

Beforewejumpintothecode,whichisgettingalittlecomplex,let'ssketchthecoreroutinglogicandbreakitdownintoasimple
yetrobustdesign.

Weneedtwoqueues,oneforrequestsfromlocalclientsandoneforrequestsfromcloudclients.Oneoptionwouldbetopull
messagesoffthelocalandcloudfrontends,andpumptheseontotheirrespectivequeues.Butthisiskindofpointlessbecause
ZeroMQsocketsarequeuesalready.Solet'susetheZeroMQsocketbuffersasqueues.

Thiswasthetechniqueweusedintheloadbalancingbroker,anditworkednicely.Weonlyreadfromthetwofrontendswhen
thereissomewheretosendtherequests.Wecanalwaysreadfromthebackends,astheygiveusrepliestorouteback.Aslong
asthebackendsaren'ttalkingtous,there'snopointinevenlookingatthefrontends.

Soourmainloopbecomes:

Pollthebackendsforactivity.Whenwegetamessage,itmaybe"ready"fromaworkeroritmaybeareply.Ifit'sareply,
routebackviathelocalorcloudfrontend.

Ifaworkerreplied,itbecameavailable,sowequeueitandcountit.

Whilethereareworkersavailable,takearequest,ifany,fromeitherfrontendandroutetoalocalworker,orrandomly,toa
cloudpeer.

Randomlysendingtaskstoapeerbrokerratherthanaworkersimulatesworkdistributionacrossthecluster.It'sdumb,butthatis
fineforthisstage.

Weusebrokeridentitiestoroutemessagesbetweenbrokers.Eachbrokerhasanamethatweprovideonthecommandlinein
thissimpleprototype.Aslongasthesenamesdon'toverlapwiththeZeroMQgeneratedUUIDsusedforclientnodes,wecan
figureoutwhethertorouteareplybacktoaclientortoabroker.

http://zguide.zeromq.org/page:all 70/225
12/31/2015 MQ - The Guide - MQ - The Guide
Hereishowthisworksincode.Theinterestingpartstartsaroundthecomment"Interestingpart".

peering2:PrototypelocalandcloudflowinC

C#|Delphi|F#|Go|Haskell|Haxe|Java|Lua|PHP|Python|Ruby|Scala|Tcl|Ada|Basic|C++|Clojure|CL|Erlang|Felix|Node.js|ObjectiveC
|ooc|Perl|Q|Racket

Runthisby,forinstance,startingtwoinstancesofthebrokerintwowindows:

peering2meyou
peering2youme

Somecommentsonthiscode:

IntheCcodeatleast,usingthezmsgclassmakeslifemucheasier,andourcodemuchshorter.It'sobviouslyan
abstractionthatworks.IfyoubuildZeroMQapplicationsinC,youshoulduseCZMQ.

Becausewe'renotgettinganystateinformationfrompeers,wenaivelyassumetheyarerunning.Thecodepromptsyouto
confirmwhenyou'vestartedallthebrokers.Intherealcase,we'dnotsendanythingtobrokerswhohadnottoldusthey
exist.

Youcansatisfyyourselfthatthecodeworksbywatchingitrunforever.Iftherewereanymisroutedmessages,clientswouldend
upblocking,andthebrokerswouldstopprintingtraceinformation.Youcanprovethatbykillingeitherofthebrokers.Theother
brokertriestosendrequeststothecloud,andonebyoneitsclientsblock,waitingforananswer.

PuttingitAllTogether topprevnext

Let'sputthistogetherintoasinglepackage.Asbefore,we'llrunanentireclusterasoneprocess.We'regoingtotakethetwo
previousexamplesandmergethemintooneproperlyworkingdesignthatletsyousimulateanynumberofclusters.

Thiscodeisthesizeofbothpreviousprototypestogether,at270LoC.That'sprettygoodforasimulationofaclusterthat
includesclientsandworkersandcloudworkloaddistribution.Hereisthecode:

peering3:FullclustersimulationinC

Delphi|F#|Go|Haskell|Haxe|Java|Lua|PHP|Python|Ruby|Tcl|Ada|Basic|C++|C#|Clojure|CL|Erlang|Felix|Node.js|ObjectiveC|ooc|
Perl|Q|Racket|Scala

It'sanontrivialprogramandtookaboutadaytogetworking.Thesearethehighlights:

Theclientthreadsdetectandreportafailedrequest.Theydothisbypollingforaresponseandifnonearrivesaftera
while(10seconds),printinganerrormessage.

Clientthreadsdon'tprintdirectly,butinsteadsendamessagetoamonitorsocket(PUSH)thatthemainloopcollects
(PULL)andprintsoff.Thisisthefirstcasewe'veseenofusingZeroMQsocketsformonitoringandloggingthisisabig
usecasethatwe'llcomebacktolater.

Clientssimulatevaryingloadstogetthecluster100%atrandommoments,sothattasksareshiftedovertothecloud.The
numberofclientsandworkers,anddelaysintheclientandworkerthreadscontrolthis.Feelfreetoplaywiththemtoseeif
youcanmakeamorerealisticsimulation.

Themainloopusestwopollsets.Itcouldinfactusethree:information,backends,andfrontends.Asintheearlier
prototype,thereisnopointintakingafrontendmessageifthereisnobackendcapacity.

Thesearesomeoftheproblemsthataroseduringdevelopmentofthisprogram:

Clientswouldfreeze,duetorequestsorrepliesgettinglostsomewhere.RecallthattheROUTERsocketdropsmessages
itcan'troute.Thefirsttacticherewastomodifytheclientthreadtodetectandreportsuchproblems.Secondly,Iput
zmsg_dump()callsaftereveryreceiveandbeforeeverysendinthemainloop,untiltheoriginoftheproblemswasclear.

Themainloopwasmistakenlyreadingfrommorethanonereadysocket.Thiscausedthefirstmessagetobelost.Ifixed
thatbyreadingonlyfromthefirstreadysocket.

http://zguide.zeromq.org/page:all 71/225
12/31/2015 MQ - The Guide - MQ - The Guide
ThezmsgclasswasnotproperlyencodingUUIDsasCstrings.ThiscausedUUIDsthatcontain0bytestobecorrupted.I
fixedthatbymodifyingzmsgtoencodeUUIDsasprintablehexstrings.

Thissimulationdoesnotdetectdisappearanceofacloudpeer.Ifyoustartseveralpeersandstopone,anditwasbroadcasting
capacitytotheothers,theywillcontinuetosenditworkevenifit'sgone.Youcantrythis,andyouwillgetclientsthatcomplainof
lostrequests.Thesolutionistwofold:first,onlykeepthecapacityinformationforashorttimesothatifapeerdoesdisappear,its
capacityisquicklysettozero.Second,addreliabilitytotherequestreplychain.We'lllookatreliabilityinthenextchapter.

Chapter4ReliableRequestReplyPatterns topprevnext

Chapter3AdvancedRequestReplyPatternscoveredadvancedusesofZeroMQ'srequestreplypatternwithworking
examples.Thischapterlooksatthegeneralquestionofreliabilityandbuildsasetofreliablemessagingpatternsontopof
ZeroMQ'scorerequestreplypattern.

Inthischapter,wefocusheavilyonuserspacerequestreplypatterns,reusablemodelsthathelpyoudesignyourownZeroMQ
architectures:

TheLazyPiratepattern:reliablerequestreplyfromtheclientside
TheSimplePiratepattern:reliablerequestreplyusingloadbalancing
TheParanoidPiratepattern:reliablerequestreplywithheartbeating
TheMajordomopattern:serviceorientedreliablequeuing
TheTitanicpattern:diskbased/disconnectedreliablequeuing
TheBinaryStarpattern:primarybackupserverfailover
TheFreelancepattern:brokerlessreliablerequestreply

Whatis"Reliability"? topprevnext

Mostpeoplewhospeakof"reliability"don'treallyknowwhattheymean.Wecanonlydefinereliabilityintermsoffailure.Thatis,
ifwecanhandleacertainsetofwelldefinedandunderstoodfailures,thenwearereliablewithrespecttothosefailures.No
more,noless.Solet'slookatthepossiblecausesoffailureinadistributedZeroMQapplication,inroughlydescendingorderof
probability:

Applicationcodeistheworstoffender.Itcancrashandexit,freezeandstoprespondingtoinput,runtooslowlyforits
input,exhaustallmemory,andsoon.

SystemcodesuchasbrokerswewriteusingZeroMQcandieforthesamereasonsasapplicationcode.Systemcode
shouldbemorereliablethanapplicationcode,butitcanstillcrashandburn,andespeciallyrunoutofmemoryifittriesto
queuemessagesforslowclients.

Messagequeuescanoverflow,typicallyinsystemcodethathaslearnedtodealbrutallywithslowclients.Whenaqueue
overflows,itstartstodiscardmessages.Soweget"lost"messages.

Networkscanfail(e.g.,WiFigetsswitchedofforgoesoutofrange).ZeroMQwillautomaticallyreconnectinsuchcases,
butinthemeantime,messagesmaygetlost.

Hardwarecanfailandtakewithitalltheprocessesrunningonthatbox.

Networkscanfailinexoticways,e.g.,someportsonaswitchmaydieandthosepartsofthenetworkbecome
inaccessible.

Entiredatacenterscanbestruckbylightning,earthquakes,fire,ormoremundanepowerorcoolingfailures.

Tomakeasoftwaresystemfullyreliableagainstallofthesepossiblefailuresisanenormouslydifficultandexpensivejoband
goesbeyondthescopeofthisbook.

Becausethefirstfivecasesintheabovelistcover99.9%ofrealworldrequirementsoutsidelargecompanies(accordingtoa
highlyscientificstudyIjustran,whichalsotoldmethat78%ofstatisticsaremadeuponthespot,andmoreovernevertotrusta
statisticthatwedidn'tfalsifyourselves),that'swhatwe'llexamine.Ifyou'realargecompanywithmoneytospendonthelasttwo

http://zguide.zeromq.org/page:all 72/225
12/31/2015 MQ - The Guide - MQ - The Guide
cases,contactmycompanyimmediately!There'salargeholebehindmybeachhousewaitingtobeconvertedintoanexecutive
swimmingpool.

DesigningReliability topprevnext

Sotomakethingsbrutallysimple,reliabilityis"keepingthingsworkingproperlywhencodefreezesorcrashes",asituationwe'll
shortento"dies".However,thethingswewanttokeepworkingproperlyaremorecomplexthanjustmessages.Weneedtotake
eachcoreZeroMQmessagingpatternandseehowtomakeitwork(ifwecan)evenwhencodedies.

Let'stakethemonebyone:

Requestreply:iftheserverdies(whileprocessingarequest),theclientcanfigurethatoutbecauseitwon'tgetananswer
back.Thenitcangiveupinahuff,waitandtryagainlater,findanotherserver,andsoon.Asfortheclientdying,wecan
brushthatoffas"someoneelse'sproblem"fornow.

Pubsub:iftheclientdies(havinggottensomedata),theserverdoesn'tknowaboutit.Pubsubdoesn'tsendany
informationbackfromclienttoserver.Buttheclientcancontacttheserveroutofband,e.g.,viarequestreply,andask,
"pleaseresendeverythingImissed".Asfortheserverdying,that'soutofscopeforhere.Subscriberscanalsoselfverify
thatthey'renotrunningtooslowly,andtakeaction(e.g.,warntheoperatoranddie)iftheyare.

Pipeline:ifaworkerdies(whileworking),theventilatordoesn'tknowaboutit.Pipelines,likethegrindinggearsoftime,
onlyworkinonedirection.Butthedownstreamcollectorcandetectthatonetaskdidn'tgetdone,andsendamessage
backtotheventilatorsaying,"hey,resendtask324!"Iftheventilatororcollectordies,whateverupstreamclientoriginally
senttheworkbatchcangettiredofwaitingandresendthewholelot.It'snotelegant,butsystemcodeshouldreallynotdie
oftenenoughtomatter.

Inthischapterwe'llfocusjustonrequestreply,whichisthelowhangingfruitofreliablemessaging.

Thebasicrequestreplypattern(aREQclientsocketdoingablockingsend/receivetoaREPserversocket)scoreslowon
handlingthemostcommontypesoffailure.Iftheservercrasheswhileprocessingtherequest,theclientjusthangsforever.Ifthe
networklosestherequestorthereply,theclienthangsforever.

RequestreplyisstillmuchbetterthanTCP,thankstoZeroMQ'sabilitytoreconnectpeerssilently,toloadbalancemessages,
andsoon.Butit'sstillnotgoodenoughforrealwork.Theonlycasewhereyoucanreallytrustthebasicrequestreplypatternis
betweentwothreadsinthesameprocesswherethere'snonetworkorseparateserverprocesstodie.

However,withalittleextrawork,thishumblepatternbecomesagoodbasisforrealworkacrossadistributednetwork,andwe
getasetofreliablerequestreply(RRR)patternsthatIliketocallthePiratepatterns(you'lleventuallygetthejoke,Ihope).

Thereare,inmyexperience,roughlythreewaystoconnectclientstoservers.Eachneedsaspecificapproachtoreliability:

Multipleclientstalkingdirectlytoasingleserver.Usecase:asinglewellknownservertowhichclientsneedtotalk.Types
offailureweaimtohandle:servercrashesandrestarts,andnetworkdisconnects.

Multipleclientstalkingtoabrokerproxythatdistributesworktomultipleworkers.Usecase:serviceorientedtransaction
processing.Typesoffailureweaimtohandle:workercrashesandrestarts,workerbusylooping,workeroverload,queue
crashesandrestarts,andnetworkdisconnects.

Multipleclientstalkingtomultipleserverswithnointermediaryproxies.Usecase:distributedservicessuchasname
resolution.Typesoffailureweaimtohandle:servicecrashesandrestarts,servicebusylooping,serviceoverload,and
networkdisconnects.

Eachoftheseapproacheshasitstradeoffsandoftenyou'llmixthem.We'lllookatallthreeindetail.

ClientSideReliability(LazyPiratePattern) topprevnext

Wecangetverysimplereliablerequestreplywithsomechangestotheclient.WecallthistheLazyPiratepattern.Ratherthan
doingablockingreceive,we:

PolltheREQsocketandreceivefromitonlywhenit'ssureareplyhasarrived.

http://zguide.zeromq.org/page:all 73/225
12/31/2015 MQ - The Guide - MQ - The Guide
Resendarequest,ifnoreplyhasarrivedwithinatimeoutperiod.
Abandonthetransactionifthereisstillnoreplyafterseveralrequests.

IfyoutrytouseaREQsocketinanythingotherthanastrictsend/receivefashion,you'llgetanerror(technically,theREQsocket
implementsasmallfinitestatemachinetoenforcethesend/receivepingpong,andsotheerrorcodeiscalled"EFSM").Thisis
slightlyannoyingwhenwewanttouseREQinapiratepattern,becausewemaysendseveralrequestsbeforegettingareply.

TheprettygoodbruteforcesolutionistocloseandreopentheREQsocketafteranerror:

lpclient:LazyPirateclientinC

C++|C#|Clojure|Delphi|Go|Haskell|Haxe|Java|Lua|Perl|PHP|Python|Ruby|Tcl|Ada|Basic|CL|Erlang|F#|Felix|Node.js|ObjectiveC|
ooc|Q|Racket|Scala

Runthistogetherwiththematchingserver:

lpserver:LazyPirateserverinC

C++|C#|Clojure|Delphi|Go|Haskell|Haxe|Java|Lua|Perl|PHP|Python|Ruby|Scala|Tcl|Ada|Basic|CL|Erlang|F#|Felix|Node.js|
ObjectiveC|ooc|Q|Racket

Figure47TheLazyPiratePattern

Torunthistestcase,starttheclientandtheserverintwoconsolewindows.Theserverwillrandomlymisbehaveafterafew
messages.Youcanchecktheclient'sresponse.Hereistypicaloutputfromtheserver:

I:normalrequest(1)
I:normalrequest(2)
I:normalrequest(3)
I:simulatingCPUoverload
I:normalrequest(4)
I:simulatingacrash

Andhereistheclient'sresponse:

I:connectingtoserver...
I:serverrepliedOK(1)
I:serverrepliedOK(2)
I:serverrepliedOK(3)
W:noresponsefromserver,retrying...

http://zguide.zeromq.org/page:all 74/225
12/31/2015 MQ - The Guide - MQ - The Guide
I:connectingtoserver...
W:noresponsefromserver,retrying...
I:connectingtoserver...
E:serverseemstobeoffline,abandoning

Theclientsequenceseachmessageandchecksthatrepliescomebackexactlyinorder:thatnorequestsorrepliesarelost,and
norepliescomebackmorethanonce,oroutoforder.Runthetestafewtimesuntilyou'reconvincedthatthismechanism
actuallyworks.Youdon'tneedsequencenumbersinaproductionapplicationtheyjusthelpustrustourdesign.

TheclientusesaREQsocket,anddoesthebruteforceclose/reopenbecauseREQsocketsimposethatstrictsend/receive
cycle.YoumightbetemptedtouseaDEALERinstead,butitwouldnotbeagooddecision.First,itwouldmeanemulatingthe
secretsaucethatREQdoeswithenvelopes(ifyou'veforgottenwhatthatis,it'sagoodsignyoudon'twanttohavetodoit).
Second,itwouldmeanpotentiallygettingbackrepliesthatyoudidn'texpect.

Handlingfailuresonlyattheclientworkswhenwehaveasetofclientstalkingtoasingleserver.Itcanhandleaservercrash,but
onlyifrecoverymeansrestartingthatsameserver.Ifthere'sapermanenterror,suchasadeadpowersupplyontheserver
hardware,thisapproachwon'twork.Becausetheapplicationcodeinserversisusuallythebiggestsourceoffailuresinany
architecture,dependingonasingleserverisnotagreatidea.

So,prosandcons:

Pro:simpletounderstandandimplement.
Pro:workseasilywithexistingclientandserverapplicationcode.
Pro:ZeroMQautomaticallyretriestheactualreconnectionuntilitworks.
Con:doesn'tfailovertobackuporalternateservers.

BasicReliableQueuing(SimplePiratePattern) topprevnext

OursecondapproachextendstheLazyPiratepatternwithaqueueproxythatletsustalk,transparently,tomultipleservers,
whichwecanmoreaccuratelycall"workers".We'lldevelopthisinstages,startingwithaminimalworkingmodel,theSimple
Piratepattern.

InallthesePiratepatterns,workersarestateless.Iftheapplicationrequiressomesharedstate,suchasashareddatabase,we
don'tknowaboutitaswedesignourmessagingframework.Havingaqueueproxymeansworkerscancomeandgowithout
clientsknowinganythingaboutit.Ifoneworkerdies,anothertakesover.Thisisanice,simpletopologywithonlyonereal
weakness,namelythecentralqueueitself,whichcanbecomeaproblemtomanage,andasinglepointoffailure.

Figure48TheSimplePiratePattern

http://zguide.zeromq.org/page:all 75/225
12/31/2015 MQ - The Guide - MQ - The Guide

ThebasisforthequeueproxyistheloadbalancingbrokerfromChapter3AdvancedRequestReplyPatterns.Whatisthevery
minimumweneedtodotohandledeadorblockedworkers?Turnsout,it'ssurprisinglylittle.Wealreadyhavearetrymechanism
intheclient.Sousingtheloadbalancingpatternwillworkprettywell.ThisfitswithZeroMQ'sphilosophythatwecanextenda
peertopeerpatternlikerequestreplybypluggingnaiveproxiesinthemiddle.

Wedon'tneedaspecialclientwe'restillusingtheLazyPirateclient.Hereisthequeue,whichisidenticaltothemaintaskofthe
loadbalancingbroker:

spqueue:SimplePiratequeueinC

C++|C#|Clojure|Delphi|Go|Haskell|Haxe|Java|Lua|PHP|Python|Tcl|Ada|Basic|CL|Erlang|F#|Felix|Node.js|ObjectiveC|ooc|Perl|
Q|Racket|Ruby|Scala

Hereistheworker,whichtakestheLazyPirateserverandadaptsitfortheloadbalancingpattern(usingtheREQ"ready"
signaling):

spworker:SimplePirateworkerinC

C++|C#|Clojure|Delphi|Go|Haskell|Haxe|Java|Lua|PHP|Python|Tcl|Ada|Basic|CL|Erlang|F#|Felix|Node.js|ObjectiveC|ooc|Perl|
Q|Racket|Ruby|Scala

Totestthis,startahandfulofworkers,aLazyPirateclient,andthequeue,inanyorder.You'llseethattheworkerseventuallyall
crashandburn,andtheclientretriesandthengivesup.Thequeueneverstops,andyoucanrestartworkersandclientsad
nauseam.Thismodelworkswithanynumberofclientsandworkers.

RobustReliableQueuing(ParanoidPiratePattern) topprevnext

Figure49TheParanoidPiratePattern

http://zguide.zeromq.org/page:all 76/225
12/31/2015 MQ - The Guide - MQ - The Guide

TheSimplePirateQueuepatternworksprettywell,especiallybecauseit'sjustacombinationoftwoexistingpatterns.Still,itdoes
havesomeweaknesses:

It'snotrobustinthefaceofaqueuecrashandrestart.Theclientwillrecover,buttheworkerswon't.WhileZeroMQwill
reconnectworkers'socketsautomatically,asfarasthenewlystartedqueueisconcerned,theworkershaven'tsignaled
ready,sodon'texist.Tofixthis,wehavetodoheartbeatingfromqueuetoworkersothattheworkercandetectwhenthe
queuehasgoneaway.

Thequeuedoesnotdetectworkerfailure,soifaworkerdieswhileidle,thequeuecan'tremoveitfromitsworkerqueue
untilthequeuesendsitarequest.Theclientwaitsandretriesfornothing.It'snotacriticalproblem,butit'snotnice.To
makethisworkproperly,wedoheartbeatingfromworkertoqueue,sothatthequeuecandetectalostworkeratany
stage.

We'llfixtheseinaproperlypedanticParanoidPiratePattern.

WepreviouslyusedaREQsocketfortheworker.FortheParanoidPirateworker,we'llswitchtoaDEALERsocket.Thishasthe
advantageoflettingussendandreceivemessagesatanytime,ratherthanthelockstepsend/receivethatREQimposes.The
downsideofDEALERisthatwehavetodoourownenvelopemanagement(rereadChapter3AdvancedRequestReply
Patternsforbackgroundonthisconcept).

We'restillusingtheLazyPirateclient.HereistheParanoidPiratequeueproxy:

ppqueue:ParanoidPiratequeueinC

C++|C#|Go|Haskell|Haxe|Java|Lua|PHP|Python|Tcl|Ada|Basic|Clojure|CL|Delphi|Erlang|F#|Felix|Node.js|ObjectiveC|ooc|Perl|
Q|Racket|Ruby|Scala

Thequeueextendstheloadbalancingpatternwithheartbeatingofworkers.Heartbeatingisoneofthose"simple"thingsthatcan
bedifficulttogetright.I'llexplainmoreaboutthatinasecond.

http://zguide.zeromq.org/page:all 77/225
12/31/2015 MQ - The Guide - MQ - The Guide
HereistheParanoidPirateworker:

ppworker:ParanoidPirateworkerinC

C++|C#|Go|Haskell|Haxe|Java|Lua|PHP|Python|Tcl|Ada|Basic|Clojure|CL|Delphi|Erlang|F#|Felix|Node.js|ObjectiveC|ooc|Perl|
Q|Racket|Ruby|Scala

Somecommentsaboutthisexample:

Thecodeincludessimulationoffailures,asbefore.Thismakesit(a)veryhardtodebug,and(b)dangeroustoreuse.
Whenyouwanttodebugthis,disablethefailuresimulation.

TheworkerusesareconnectstrategysimilartotheonewedesignedfortheLazyPirateclient,withtwomajordifferences:
(a)itdoesanexponentialbackoff,and(b)itretriesindefinitely(whereastheclientretriesafewtimesbeforereportinga
failure).

Trytheclient,queue,andworkers,suchasbyusingascriptlikethis:

ppqueue&
foriin1234do
ppworker&
sleep1
done
lpclient&

Youshouldseetheworkersdieonebyoneastheysimulateacrash,andtheclienteventuallygiveup.Youcanstopandrestart
thequeueandbothclientandworkerswillreconnectandcarryon.Andnomatterwhatyoudotoqueuesandworkers,theclient
willnevergetanoutoforderreply:thewholechaineitherworks,ortheclientabandons.

Heartbeating topprevnext

Heartbeatingsolvestheproblemofknowingwhetherapeerisaliveordead.ThisisnotanissuespecifictoZeroMQ.TCPhasa
longtimeout(30minutesorso),thatmeansthatitcanbeimpossibletoknowwhetherapeerhasdied,beendisconnected,or
goneonaweekendtoPraguewithacaseofvodka,aredhead,andalargeexpenseaccount.

It'sisnoteasytogetheartbeatingright.WhenwritingtheParanoidPirateexamples,ittookaboutfivehourstogetthe
heartbeatingworkingproperly.Therestoftherequestreplychaintookperhapstenminutes.Itisespeciallyeasytocreate"false
failures",i.e.,whenpeersdecidethattheyaredisconnectedbecausetheheartbeatsaren'tsentproperly.

We'lllookatthethreemainanswerspeopleuseforheartbeatingwithZeroMQ.

ShruggingItOff topprevnext

Themostcommonapproachistodonoheartbeatingatallandhopeforthebest.ManyifnotmostZeroMQapplicationsdothis.
ZeroMQencouragesthisbyhidingpeersinmanycases.Whatproblemsdoesthisapproachcause?

WhenweuseaROUTERsocketinanapplicationthattrackspeers,aspeersdisconnectandreconnect,theapplication
willleakmemory(resourcesthattheapplicationholdsforeachpeer)andgetslowerandslower.

WhenweuseSUBorDEALERbaseddatarecipients,wecan'ttellthedifferencebetweengoodsilence(there'snodata)
andbadsilence(theotherenddied).Whenarecipientknowstheothersidedied,itcanforexampleswitchovertoa
backuproute.

IfweuseaTCPconnectionthatstayssilentforalongwhile,itwill,insomenetworks,justdie.Sendingsomething
(technically,a"keepalive"morethanaheartbeat),willkeepthenetworkalive.

http://zguide.zeromq.org/page:all 78/225
12/31/2015 MQ - The Guide - MQ - The Guide

OneWayHeartbeats topprevnext

Asecondoptionistosendaheartbeatmessagefromeachnodetoitspeerseverysecondorso.Whenonenodehearsnothing
fromanotherwithinsometimeout(severalseconds,typically),itwilltreatthatpeerasdead.Soundsgood,right?Sadly,no.This
worksinsomecasesbuthasnastyedgecasesinothers.

Forpubsub,thisdoeswork,andit'stheonlymodelyoucanuse.SUBsocketscannottalkbacktoPUBsockets,butPUB
socketscanhappilysend"I'malive"messagestotheirsubscribers.

Asanoptimization,youcansendheartbeatsonlywhenthereisnorealdatatosend.Furthermore,youcansendheartbeats
progressivelyslowerandslower,ifnetworkactivityisanissue(e.g.,onmobilenetworkswhereactivitydrainsthebattery).As
longastherecipientcandetectafailure(sharpstopinactivity),that'sfine.

Herearethetypicalproblemswiththisdesign:

Itcanbeinaccuratewhenwesendlargeamountsofdata,asheartbeatswillbedelayedbehindthatdata.Ifheartbeatsare
delayed,youcangetfalsetimeoutsanddisconnectionsduetonetworkcongestion.Thus,alwaystreatanyincomingdata
asaheartbeat,whetherornotthesenderoptimizesoutheartbeats.

Whilethepubsubpatternwilldropmessagesfordisappearedrecipients,PUSHandDEALERsocketswillqueuethem.
Soifyousendheartbeatstoadeadpeeranditcomesback,itwillgetalltheheartbeatsyousent,whichcanbe
thousands.Whoa,whoa!

Thisdesignassumesthatheartbeattimeoutsarethesameacrossthewholenetwork.Butthatwon'tbeaccurate.Some
peerswillwantveryaggressiveheartbeatinginordertodetectfaultsrapidly.Andsomewillwantveryrelaxed
heartbeating,inordertoletsleepingnetworkslieandsavepower.

PingPongHeartbeats topprevnext

Thethirdoptionistouseapingpongdialog.Onepeersendsapingcommandtotheother,whichreplieswithapongcommand.
Neithercommandhasanypayload.Pingsandpongsarenotcorrelated.Becausetherolesof"client"and"server"arearbitraryin
somenetworks,weusuallyspecifythateitherpeercaninfactsendapingandexpectaponginresponse.However,becausethe
timeoutsdependonnetworktopologiesknownbesttodynamicclients,itisusuallytheclientthatpingstheserver.

ThisworksforallROUTERbasedbrokers.Thesameoptimizationsweusedinthesecondmodelmakethisworkevenbetter:
treatanyincomingdataasapong,andonlysendapingwhennototherwisesendingdata.

HeartbeatingforParanoidPirate topprevnext

ForParanoidPirate,wechosethesecondapproach.Itmightnothavebeenthesimplestoption:ifdesigningthistoday,I'd
probablytryapingpongapproachinstead.Howevertheprinciplesaresimilar.Theheartbeatmessagesflowasynchronouslyin
bothdirections,andeitherpeercandecidetheotheris"dead"andstoptalkingtoit.

Intheworker,thisishowwehandleheartbeatsfromthequeue:

Wecalculatealiveness,whichishowmanyheartbeatswecanstillmissbeforedecidingthequeueisdead.Itstartsat
threeandwedecrementiteachtimewemissaheartbeat.
Wewait,inthezmq_pollloop,foronesecondeachtime,whichisourheartbeatinterval.
Ifthere'sanymessagefromthequeueduringthattime,weresetourlivenesstothree.
Ifthere'snomessageduringthattime,wecountdownourliveness.
Ifthelivenessreacheszero,weconsiderthequeuedead.
Ifthequeueisdead,wedestroyoursocket,createanewone,andreconnect.
Toavoidopeningandclosingtoomanysockets,wewaitforacertainintervalbeforereconnecting,andwedoublethe
intervaleachtimeuntilitreaches32seconds.

Andthisishowwehandleheartbeatstothequeue:

http://zguide.zeromq.org/page:all 79/225
12/31/2015 MQ - The Guide - MQ - The Guide
Wecalculatewhentosendthenextheartbeatthisisasinglevariablebecausewe'retalkingtoonepeer,thequeue.
Inthezmq_pollloop,wheneverwepassthistime,wesendaheartbeattothequeue.

Here'stheessentialheartbeatingcodefortheworker:

#defineHEARTBEAT_LIVENESS3//35isreasonable
#defineHEARTBEAT_INTERVAL1000//msecs
#defineINTERVAL_INIT1000//Initialreconnect
#defineINTERVAL_MAX32000//Afterexponentialbackoff


//Iflivenesshitszero,queueisconsidereddisconnected
size_tliveness=HEARTBEAT_LIVENESS
size_tinterval=INTERVAL_INIT

//Sendoutheartbeatsatregularintervals
uint64_theartbeat_at=zclock_time()+HEARTBEAT_INTERVAL

while(true){
zmq_pollitem_titems[]={{worker,0,ZMQ_POLLIN,0}}
intrc=zmq_poll(items,1,HEARTBEAT_INTERVAL*ZMQ_POLL_MSEC)

if(items[0].revents&ZMQ_POLLIN){
//Receiveanymessagefromqueue
liveness=HEARTBEAT_LIVENESS
interval=INTERVAL_INIT
}
else
if(liveness==0){
zclock_sleep(interval)
if(interval<INTERVAL_MAX)
interval*=2
zsocket_destroy(ctx,worker)

liveness=HEARTBEAT_LIVENESS
}
//Sendheartbeattoqueueifit'stime
if(zclock_time()>heartbeat_at){
heartbeat_at=zclock_time()+HEARTBEAT_INTERVAL
//Sendheartbeatmessagetoqueue
}
}

Thequeuedoesthesame,butmanagesanexpirationtimeforeachworker.

Herearesometipsforyourownheartbeatingimplementation:

Usezmq_pollorareactorasthecoreofyourapplication'smaintask.

Startbybuildingtheheartbeatingbetweenpeers,testitbysimulatingfailures,andthenbuildtherestofthemessageflow.
Addingheartbeatingafterwardsismuchtrickier.

Usesimpletracing,i.e.,printtoconsole,togetthisworking.Tohelpyoutracetheflowofmessagesbetweenpeers,usea
dumpmethodsuchaszmsgoffers,andnumberyourmessagesincrementallysoyoucanseeiftherearegaps.

Inarealapplication,heartbeatingmustbeconfigurableandusuallynegotiatedwiththepeer.Somepeerswillwant
aggressiveheartbeating,aslowas10msecs.Otherpeerswillbefarawayandwantheartbeatingashighas30seconds.

Ifyouhavedifferentheartbeatintervalsfordifferentpeers,yourpolltimeoutshouldbethelowest(shortesttime)ofthese.
Donotuseaninfinitetimeout.

Doheartbeatingonthesamesocketyouuseformessages,soyourheartbeatsalsoactasakeepalivetostopthe
networkconnectionfromgoingstale(somefirewallscanbeunkindtosilentconnections).

http://zguide.zeromq.org/page:all 80/225
12/31/2015 MQ - The Guide - MQ - The Guide

ContractsandProtocols topprevnext

Ifyou'repayingattention,you'llrealizethatParanoidPirateisnotinteroperablewithSimplePirate,becauseoftheheartbeats.But
howdowedefine"interoperable"?Toguaranteeinteroperability,weneedakindofcontract,anagreementthatletsdifferent
teamsindifferenttimesandplaceswritecodethatisguaranteedtoworktogether.Wecallthisa"protocol".

It'sfuntoexperimentwithoutspecifications,butthat'snotasensiblebasisforrealapplications.Whathappensifwewanttowrite
aworkerinanotherlanguage?Dowehavetoreadcodetoseehowthingswork?Whatifwewanttochangetheprotocolfor
somereason?Evenasimpleprotocolwill,ifit'ssuccessful,evolveandbecomemorecomplex.

Lackofcontractsisasuresignofadisposableapplication.Solet'swriteacontractforthisprotocol.Howdowedothat?

There'sawikiatrfc.zeromq.orgthatwemadeespeciallyasahomeforpublicZeroMQcontracts.
Tocreateanewspecification,registeronthewikiifneeded,andfollowtheinstructions.It'sfairlystraightforward,thoughwriting
technicaltextsisnoteveryone'scupoftea.

IttookmeaboutfifteenminutestodraftthenewPiratePatternProtocol.It'snotabigspecification,butitdoescaptureenoughto
actasthebasisforarguments("yourqueueisn'tPPPcompatiblepleasefixit!").

TurningPPPintoarealprotocolwouldtakemorework:

ThereshouldbeaprotocolversionnumberintheREADYcommandsothatit'spossibletodistinguishbetweendifferent
versionsofPPP.

Rightnow,READYandHEARTBEATarenotentirelydistinctfromrequestsandreplies.Tomakethemdistinct,wewould
needamessagestructurethatincludesa"messagetype"part.

ServiceOrientedReliableQueuing(MajordomoPattern) topprevnext

Figure50TheMajordomoPattern

Thenicethingaboutprogressishowfastithappenswhenlawyersandcommitteesaren'tinvolved.TheonepageMDP
specificationturnsPPPintosomethingmoresolid.Thisishowweshoulddesigncomplexarchitectures:startbywritingdownthe
contracts,andonlythenwritesoftwaretoimplementthem.

TheMajordomoProtocol(MDP)extendsandimprovesonPPPinoneinterestingway:itaddsa"servicename"torequeststhat
http://zguide.zeromq.org/page:all 81/225
12/31/2015 MQ - The Guide - MQ - The Guide
theclientsends,andasksworkerstoregisterforspecificservices.AddingservicenamesturnsourParanoidPiratequeueintoa
serviceorientedbroker.ThenicethingaboutMDPisthatitcameoutofworkingcode,asimplerancestorprotocol(PPP),anda
precisesetofimprovementsthateachsolvedaclearproblem.Thismadeiteasytodraft.

ToimplementMajordomo,weneedtowriteaframeworkforclientsandworkers.It'sreallynotsanetoaskeveryapplication
developertoreadthespecandmakeitwork,whentheycouldbeusingasimplerAPIthatdoestheworkforthem.

Sowhileourfirstcontract(MDPitself)defineshowthepiecesofourdistributedarchitecturetalktoeachother,oursecond
contractdefineshowuserapplicationstalktothetechnicalframeworkwe'regoingtodesign.

Majordomohastwohalves,aclientsideandaworkerside.Becausewe'llwritebothclientandworkerapplications,wewillneed
twoAPIs.HereisasketchfortheclientAPI,usingasimpleobjectorientedapproach:

//MajordomoProtocolclientexample
//UsesthemdcliAPItohideallMDPaspects

//Letsusbuildthissourcewithoutcreatingalibrary
#include"mdcliapi.c"

intmain(intargc,char*argv[])
{
intverbose=(argc>1&&streq(argv[1],"v"))
mdcli_t*session=mdcli_new("tcp://localhost:5555",verbose)

intcount
for(count=0count<100000count++){
zmsg_t*request=zmsg_new()
zmsg_pushstr(request,"Helloworld")
zmsg_t*reply=mdcli_send(session,"echo",&request)
if(reply)
zmsg_destroy(&reply)
else
break//Interruptorfailure
}
printf("%drequests/repliesprocessed\n",count)
mdcli_destroy(&session)
return0
}

That'sit.Weopenasessiontothebroker,sendarequestmessage,getareplymessageback,andeventuallyclosethe
connection.Here'sasketchfortheworkerAPI:

//MajordomoProtocolworkerexample
//UsesthemdwrkAPItohideallMDPaspects

//Letsusbuildthissourcewithoutcreatingalibrary
#include"mdwrkapi.c"

intmain(intargc,char*argv[])
{
intverbose=(argc>1&&streq(argv[1],"v"))
mdwrk_t*session=mdwrk_new(
"tcp://localhost:5555","echo",verbose)

zmsg_t*reply=NULL
while(true){
zmsg_t*request=mdwrk_recv(session,&reply)
if(request==NULL)
break//Workerwasinterrupted
reply=request//Echoiscomplex:)
}
mdwrk_destroy(&session)

http://zguide.zeromq.org/page:all 82/225
12/31/2015 MQ - The Guide - MQ - The Guide
return0
}

It'smoreorlesssymmetrical,buttheworkerdialogisalittledifferent.Thefirsttimeaworkerdoesarecv(),itpassesanullreply.
Thereafter,itpassesthecurrentreply,andgetsanewrequest.

TheclientandworkerAPIswerefairlysimpletoconstructbecausethey'reheavilybasedontheParanoidPiratecodewealready
developed.HereistheclientAPI:

mdcliapi:MajordomoclientAPIinC

Go|Haxe|Java|Lua|PHP|Python|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Haskell|Node.js|ObjectiveC|ooc|Perl|
Q|Racket|Ruby|Scala

Let'sseehowtheclientAPIlooksinaction,withanexampletestprogramthatdoes100Krequestreplycycles:

mdclient:MajordomoclientapplicationinC

C++|Go|Haskell|Haxe|Java|Lua|PHP|Python|Tcl|Ada|Basic|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Node.js|ObjectiveC|ooc|Perl|
Q|Racket|Ruby|Scala

AndhereistheworkerAPI:

mdwrkapi:MajordomoworkerAPIinC

Go|Haxe|Java|Lua|PHP|Python|Ruby|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Haskell|Node.js|ObjectiveC|ooc|
Perl|Q|Racket|Scala

Let'sseehowtheworkerAPIlooksinaction,withanexampletestprogramthatimplementsanechoservice:

mdworker:MajordomoworkerapplicationinC

C++|Go|Haskell|Haxe|Java|Lua|PHP|Python|Ruby|Tcl|Ada|Basic|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Node.js|ObjectiveC|ooc|
Perl|Q|Racket|Scala

HerearesomethingstonoteabouttheworkerAPIcode:

TheAPIsaresinglethreaded.Thismeans,forexample,thattheworkerwon'tsendheartbeatsinthebackground.Happily,
thisisexactlywhatwewant:iftheworkerapplicationgetsstuck,heartbeatswillstopandthebrokerwillstopsending
requeststotheworker.

TheworkerAPIdoesn'tdoanexponentialbackoffit'snotworththeextracomplexity.

TheAPIsdon'tdoanyerrorreporting.Ifsomethingisn'tasexpected,theyraiseanassertion(orexceptiondependingon
thelanguage).Thisisidealforareferenceimplementation,soanyprotocolerrorsshowimmediately.Forrealapplications,
theAPIshouldberobustagainstinvalidmessages.

YoumightwonderwhytheworkerAPIismanuallyclosingitssocketandopeninganewone,whenZeroMQwillautomatically
reconnectasocketifthepeerdisappearsandcomesback.LookbackattheSimplePirateandParanoidPirateworkersto
understand.AlthoughZeroMQwillautomaticallyreconnectworkersifthebrokerdiesandcomesbackup,thisisn'tsufficienttore
registertheworkerswiththebroker.Iknowofatleasttwosolutions.Thesimplest,whichweusehere,isfortheworkertomonitor
theconnectionusingheartbeats,andifitdecidesthebrokerisdead,tocloseitssocketandstartafreshwithanewsocket.The
alternativeisforthebrokertochallengeunknownworkerswhenitgetsaheartbeatfromtheworkerandaskthemtoreregister.
Thatwouldrequireprotocolsupport.

Nowlet'sdesigntheMajordomobroker.Itscorestructureisasetofqueues,oneperservice.Wewillcreatethesequeuesas
workersappear(wecoulddeletethemasworkersdisappear,butforgetthatfornowbecauseitgetscomplex).Additionally,we
keepaqueueofworkersperservice.

Andhereisthebroker:

mdbroker:MajordomobrokerinC

C++|Go|Haskell|Haxe|Java|Lua|PHP|Python|Ruby|Tcl|Ada|Basic|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Node.js|ObjectiveC|ooc|
Perl|Q|Racket|Scala

Thisisbyfarthemostcomplexexamplewe'veseen.It'salmost500linesofcode.Towritethisandmakeitsomewhatrobust
tooktwodays.However,thisisstillashortpieceofcodeforafullserviceorientedbroker.

http://zguide.zeromq.org/page:all 83/225
12/31/2015 MQ - The Guide - MQ - The Guide
Herearesomethingstonoteaboutthebrokercode:

TheMajordomoProtocolletsushandlebothclientsandworkersonasinglesocket.Thisisnicerforthosedeployingand
managingthebroker:itjustsitsononeZeroMQendpointratherthanthetwothatmostproxiesneed.

ThebrokerimplementsallofMDP/0.1properly(asfarasIknow),includingdisconnectionifthebrokersendsinvalid
commands,heartbeating,andtherest.

Itcanbeextendedtorunmultiplethreads,eachmanagingonesocketandonesetofclientsandworkers.Thiscouldbe
interestingforsegmentinglargearchitectures.TheCcodeisalreadyorganizedaroundabrokerclasstomakethistrivial.

Aprimary/failoverorlive/livebrokerreliabilitymodeliseasy,asthebrokeressentiallyhasnostateexceptservice
presence.It'suptoclientsandworkerstochooseanotherbrokeriftheirfirstchoiceisn'tupandrunning.

Theexamplesusefivesecondheartbeats,mainlytoreducetheamountofoutputwhenyouenabletracing.Realistic
valueswouldbelowerformostLANapplications.However,anyretryhastobeslowenoughtoallowforaserviceto
restart,say10secondsatleast.

WelaterimprovedandextendedtheprotocolandtheMajordomoimplementation,whichnowsitsinitsownGithubproject.Ifyou
wantaproperlyusableMajordomostack,usetheGitHubproject.

AsynchronousMajordomoPattern topprevnext

TheMajordomoimplementationintheprevioussectionissimpleandstupid.TheclientisjusttheoriginalSimplePirate,wrapped
upinasexyAPI.WhenIfireupaclient,broker,andworkeronatestbox,itcanprocess100,000requestsinabout14seconds.
Thatispartiallyduetothecode,whichcheerfullycopiesmessageframesaroundasifCPUcycleswerefree.Butthereal
problemisthatwe'redoingnetworkroundtrips.ZeroMQdisablesNagle'salgorithm,butroundtrippingisstillslow.

Theoryisgreatintheory,butinpractice,practiceisbetter.Let'smeasuretheactualcostofroundtrippingwithasimpletest
program.Thissendsabunchofmessages,firstwaitingforareplytoeachmessage,andsecondasabatch,readingallthe
repliesbackasabatch.Bothapproachesdothesamework,buttheygiveverydifferentresults.Wemockupaclient,broker,and
worker:

tripping:RoundtripdemonstratorinC

C++|Go|Haskell|Haxe|Java|Lua|PHP|Python|Tcl|Ada|Basic|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Node.js|ObjectiveC|ooc|Perl|
Q|Racket|Ruby|Scala

Onmydevelopmentbox,thisprogramsays:

Settinguptest...
Synchronousroundtriptest...
9057calls/second
Asynchronousroundtriptest...
173010calls/second

Notethattheclientthreaddoesasmallpausebeforestarting.Thisistogetaroundoneofthe"features"oftheroutersocket:if
yousendamessagewiththeaddressofapeerthat'snotyetconnected,themessagegetsdiscarded.Inthisexamplewedon't
usetheloadbalancingmechanism,sowithoutthesleep,iftheworkerthreadistooslowtoconnect,itwilllosemessages,making
amessofourtest.

Aswesee,roundtrippinginthesimplestcaseis20timesslowerthantheasynchronous,"shoveitdownthepipeasfastasit'll
go"approach.Let'sseeifwecanapplythistoMajordomotomakeitfaster.

First,wemodifytheclientAPItosendandreceiveintwoseparatemethods:

mdcli_t*mdcli_new(char*broker)
voidmdcli_destroy(mdcli_t**self_p)
intmdcli_send(mdcli_t*self,char*service,zmsg_t**request_p)
zmsg_t*mdcli_recv(mdcli_t*self)

http://zguide.zeromq.org/page:all 84/225
12/31/2015 MQ - The Guide - MQ - The Guide
It'sliterallyafewminutes'worktorefactorthesynchronousclientAPItobecomeasynchronous:

mdcliapi2:MajordomoasynchronousclientAPIinC

Haxe|Java|Lua|PHP|Python|Ruby|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Node.js|ObjectiveC|ooc|
Perl|Q|Racket|Scala

Thedifferencesare:

WeuseaDEALERsocketinsteadofREQ,soweemulateREQwithanemptydelimiterframebeforeeachrequestand
eachresponse.
Wedon'tretryrequestsiftheapplicationneedstoretry,itcandothisitself.
Webreakthesynchronoussendmethodintoseparatesendandrecvmethods.
Thesendmethodisasynchronousandreturnsimmediatelyaftersending.Thecallercanthussendanumberof
messagesbeforegettingaresponse.
Therecvmethodwaitsfor(withatimeout)oneresponseandreturnsthattothecaller.

Andhere'sthecorrespondingclienttestprogram,whichsends100,000messagesandthenreceives100,000back:

mdclient2:MajordomoclientapplicationinC

C++|Haxe|Java|Lua|PHP|Python|Ruby|Tcl|Ada|Basic|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Node.js|ObjectiveC|ooc|
Perl|Q|Racket|Scala

Thebrokerandworkerareunchangedbecausewe'venotmodifiedtheprotocolatall.Weseeanimmediateimprovementin
performance.Here'sthesynchronousclientchuggingthrough100Krequestreplycycles:

$timemdclient
100000requests/repliesprocessed

real0m14.088s
user0m1.310s
sys0m2.670s

Andhere'stheasynchronousclient,withasingleworker:

$timemdclient2
100000repliesreceived

real0m8.730s
user0m0.920s
sys0m1.550s

Twiceasfast.Notbad,butlet'sfireup10workersandseehowithandlesthetraffic

$timemdclient2
100000repliesreceived

real0m3.863s
user0m0.730s
sys0m0.470s

Itisn'tfullyasynchronousbecauseworkersgettheirmessagesonastrictlastusedbasis.Butitwillscalebetterwithmore
workers.OnmyPC,aftereightorsoworkers,itdoesn'tgetanyfaster.Fourcoresonlystretchessofar.Butwegota4x
improvementinthroughputwithjustafewminutes'work.Thebrokerisstillunoptimized.Itspendsmostofitstimecopying
messageframesaround,insteadofdoingzerocopy,whichitcould.Butwe'regetting25Kreliablerequest/replycallsasecond,
withprettyloweffort.

However,theasynchronousMajordomopatternisn'tallroses.Ithasafundamentalweakness,namelythatitcannotsurvivea
brokercrashwithoutmorework.Ifyoulookatthemdcliapi2codeyou'llseeitdoesnotattempttoreconnectafterafailure.A
properreconnectwouldrequirethefollowing:

http://zguide.zeromq.org/page:all 85/225
12/31/2015 MQ - The Guide - MQ - The Guide
Anumberoneveryrequestandamatchingnumberoneveryreply,whichwouldideallyrequireachangetotheprotocolto
enforce.
TrackingandholdingontoalloutstandingrequestsintheclientAPI,i.e.,thoseforwhichnoreplyhasyetbeenreceived.
Incaseoffailover,fortheclientAPItoresendalloutstandingrequeststothebroker.

It'snotadealbreaker,butitdoesshowthatperformanceoftenmeanscomplexity.IsthisworthdoingforMajordomo?Itdepends
onyourusecase.Foranamelookupserviceyoucalloncepersession,no.Forawebfrontendservingthousandsofclients,
probablyyes.

ServiceDiscovery topprevnext

So,wehaveaniceserviceorientedbroker,butwehavenowayofknowingwhetheraparticularserviceisavailableornot.We
knowwhetherarequestfailed,butwedon'tknowwhy.Itisusefultobeabletoaskthebroker,"istheechoservicerunning?"The
mostobviouswaywouldbetomodifyourMDP/Clientprotocoltoaddcommandstoaskthis.ButMDP/Clienthasthegreatcharm
ofbeingsimple.AddingservicediscoverytoitwouldmakeitascomplexastheMDP/Workerprotocol.

Anotheroptionistodowhatemaildoes,andaskthatundeliverablerequestsbereturned.Thiscanworkwellinanasynchronous
world,butitalsoaddscomplexity.Weneedwaystodistinguishreturnedrequestsfromrepliesandtohandletheseproperly.

Let'strytousewhatwe'vealreadybuilt,buildingontopofMDPinsteadofmodifyingit.Servicediscoveryis,itself,aservice.It
mightindeedbeoneofseveralmanagementservices,suchas"disableserviceX","providestatistics",andsoon.Whatwewant
isageneral,extensiblesolutionthatdoesn'taffecttheprotocolorexistingapplications.

Sohere'sasmallRFCthatlayersthisontopofMDP:theMajordomoManagementInterface(MMI).Wealreadyimplementeditin
thebroker,thoughunlessyoureadthewholethingyouprobablymissedthat.I'llexplainhowitworksinthebroker:

Whenaclientrequestsaservicethatstartswithmmi.,insteadofroutingthistoaworker,wehandleitinternally.

Wehandlejustoneserviceinthisbroker,whichismmi.service,theservicediscoveryservice.

Thepayloadfortherequestisthenameofanexternalservice(arealone,providedbyaworker).

Thebrokerreturns"200"(OK)or"404"(Notfound),dependingonwhetherthereareworkersregisteredforthatserviceor
not.

Here'showweusetheservicediscoveryinanapplication:

mmiecho:ServicediscoveryoverMajordomoinC

Go|Haxe|Java|Lua|PHP|Python|Ruby|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Haskell|Node.js|ObjectiveC|ooc|
Perl|Q|Racket|Scala

Trythiswithandwithoutaworkerrunning,andyoushouldseethelittleprogramreport"200"or"404"accordingly.The
implementationofMMIinourexamplebrokerisflimsy.Forexample,ifaworkerdisappears,servicesremain"present".In
practice,abrokershouldremoveservicesthathavenoworkersaftersomeconfigurabletimeout.

IdempotentServices topprevnext

Idempotencyisnotsomethingyoutakeapillfor.Whatitmeansisthatit'ssafetorepeatanoperation.Checkingtheclockis
idempotent.Lendingonescreditcardtooneschildrenisnot.Whilemanyclienttoserverusecasesareidempotent,someare
not.Examplesofidempotentusecasesinclude:

Statelesstaskdistribution,i.e.,apipelinewheretheserversarestatelessworkersthatcomputeareplybasedpurelyon
thestateprovidedbyarequest.Insuchacase,it'ssafe(thoughinefficient)toexecutethesamerequestmanytimes.

Anameservicethattranslateslogicaladdressesintoendpointstobindorconnectto.Insuchacase,it'ssafetomakethe
samelookuprequestmanytimes.

Andhereareexamplesofanonidempotentusecases:

Aloggingservice.Onedoesnotwantthesameloginformationrecordedmorethanonce.
http://zguide.zeromq.org/page:all 86/225
12/31/2015 MQ - The Guide - MQ - The Guide
Anyservicethathasimpactondownstreamnodes,e.g.,sendsoninformationtoothernodes.Ifthatservicegetsthesame
requestmorethanonce,downstreamnodeswillgetduplicateinformation.

Anyservicethatmodifiesshareddatainsomenonidempotentwaye.g.,aservicethatdebitsabankaccountisnot
idempotentwithoutextrawork.

Whenourserverapplicationsarenotidempotent,wehavetothinkmorecarefullyaboutwhenexactlytheymightcrash.Ifan
applicationdieswhenit'sidle,orwhileit'sprocessingarequest,that'susuallyfine.Wecanusedatabasetransactionstomake
sureadebitandacreditarealwaysdonetogether,ifatall.Iftheserverdieswhilesendingitsreply,that'saproblem,becauseas
farasit'sconcerned,ithasdoneitswork.

Ifthenetworkdiesjustasthereplyismakingitswaybacktotheclient,thesameproblemarises.Theclientwillthinktheserver
diedandwillresendtherequest,andtheserverwilldothesameworktwice,whichisnotwhatwewant.

Tohandlenonidempotentoperations,usethefairlystandardsolutionofdetectingandrejectingduplicaterequests.Thismeans:

Theclientmuststampeveryrequestwithauniqueclientidentifierandauniquemessagenumber.

Theserver,beforesendingbackareply,storesitusingthecombinationofclientIDandmessagenumberasakey.

Theserver,whengettingarequestfromagivenclient,firstcheckswhetherithasareplyforthatclientIDandmessage
number.Ifso,itdoesnotprocesstherequest,butjustresendsthereply.

DisconnectedReliability(TitanicPattern) topprevnext

OnceyourealizethatMajordomoisa"reliable"messagebroker,youmightbetemptedtoaddsomespinningrust(thatis,
ferrousbasedharddiskplatters).Afterall,thisworksforalltheenterprisemessagingsystems.It'ssuchatemptingideathatit'sa
littlesadtohavetobenegativetowardit.Butbrutalcynicismisoneofmyspecialties.So,somereasonsyoudon'twantrust
basedbrokerssittinginthecenterofyourarchitectureare:

Asyou'veseen,theLazyPirateclientperformssurprisinglywell.Itworksacrossawholerangeofarchitectures,from
directclienttoservertodistributedqueueproxies.Itdoestendtoassumethatworkersarestatelessandidempotent.But
wecanworkaroundthatlimitationwithoutresortingtorust.

Rustbringsawholesetofproblems,fromslowperformancetoadditionalpiecesthatyouhavetomanage,repair,and
handle6a.m.panicsfrom,astheyinevitablybreakatthestartofdailyoperations.ThebeautyofthePiratepatternsin
generalistheirsimplicity.Theywon'tcrash.Andifyou'restillworriedaboutthehardware,youcanmovetoapeertopeer
patternthathasnobrokeratall.I'llexplainlaterinthischapter.

Havingsaidthis,however,thereisonesaneusecaseforrustbasedreliability,whichisanasynchronousdisconnectednetwork.
ItsolvesamajorproblemwithPirate,namelythataclienthastowaitforananswerinrealtime.Ifclientsandworkersareonly
sporadicallyconnected(thinkofemailasananalogy),wecan'tuseastatelessnetworkbetweenclientsandworkers.Wehaveto
putstateinthemiddle.

So,here'stheTitanicpattern,inwhichwewritemessagestodisktoensuretheynevergetlost,nomatterhowsporadically
clientsandworkersareconnected.Aswedidforservicediscovery,we'regoingtolayerTitanicontopofMDPratherthanextend
it.It'swonderfullylazybecauseitmeanswecanimplementourfireandforgetreliabilityinaspecializedworker,ratherthaninthe
broker.Thisisexcellentforseveralreasons:

Itismucheasierbecausewedivideandconquer:thebrokerhandlesmessageroutingandtheworkerhandlesreliability.
Itletsusmixbrokerswritteninonelanguagewithworkerswritteninanother.
Itletsusevolvethefireandforgettechnologyindependently.

Theonlydownsideisthatthere'sanextranetworkhopbetweenbrokerandharddisk.Thebenefitsareeasilyworthit.

Therearemanywaystomakeapersistentrequestreplyarchitecture.We'llaimforonethatissimpleandpainless.Thesimplest
designIcouldcomeupwith,afterplayingwiththisforafewhours,isa"proxyservice".Thatis,Titanicdoesn'taffectworkersat
all.Ifaclientwantsareplyimmediately,ittalksdirectlytoaserviceandhopestheserviceisavailable.Ifaclientishappytowait
awhile,ittalkstoTitanicinsteadandasks,"hey,buddy,wouldyoutakecareofthisformewhileIgobuymygroceries?"

Figure51TheTitanicPattern

http://zguide.zeromq.org/page:all 87/225
12/31/2015 MQ - The Guide - MQ - The Guide

Titanicisthusbothaworkerandaclient.ThedialogbetweenclientandTitanicgoesalongtheselines:

Client:Pleaseacceptthisrequestforme.Titanic:OK,done.
Client:Doyouhaveareplyforme?Titanic:Yes,hereitis.Or,no,notyet.
Client:OK,youcanwipethatrequestnow,I'mhappy.Titanic:OK,done.

WhereasthedialogbetweenTitanicandbrokerandworkergoeslikethis:

Titanic:Hey,Broker,isthereancoffeeservice?Broker:Uhm,Yeah,seemslike.
Titanic:Hey,coffeeservice,pleasehandlethisforme.
Coffee:Sure,hereyouare.
Titanic:Sweeeeet!

Youcanworkthroughthisandthepossiblefailurescenarios.Ifaworkercrasheswhileprocessingarequest,Titanicretries
indefinitely.Ifareplygetslostsomewhere,Titanicwillretry.Iftherequestgetsprocessedbuttheclientdoesn'tgetthereply,it
willaskagain.IfTitaniccrasheswhileprocessingarequestorareply,theclientwilltryagain.Aslongasrequestsarefully
committedtosafestorage,workcan'tgetlost.

Thehandshakingispedantic,butcanbepipelined,i.e.,clientscanusetheasynchronousMajordomopatterntodoalotofwork
andthengettheresponseslater.

Weneedsomewayforaclienttorequestitsreplies.We'llhavemanyclientsaskingforthesameservices,andclientsdisappear
andreappearwithdifferentidentities.Hereisasimple,reasonablysecuresolution:

EveryrequestgeneratesauniversallyuniqueID(UUID),whichTitanicreturnstotheclientafterithasqueuedtherequest.
Whenaclientasksforareply,itmustspecifytheUUIDfortheoriginalrequest.

Inarealisticcase,theclientwouldwanttostoreitsrequestUUIDssafely,e.g.,inalocaldatabase.

Beforewejumpoffandwriteyetanotherformalspecification(fun,fun!),let'sconsiderhowtheclienttalkstoTitanic.Onewayis
touseasingleserviceandsenditthreedifferentrequesttypes.Anotherway,whichseemssimpler,istousethreeservices:

titanic.request:storearequestmessage,andreturnaUUIDfortherequest.
titanic.reply:fetchareply,ifavailable,foragivenrequestUUID.
titanic.close:confirmthatareplyhasbeenstoredandprocessed.

http://zguide.zeromq.org/page:all 88/225
12/31/2015 MQ - The Guide - MQ - The Guide
We'lljustmakeamultithreadedworker,whichaswe'veseenfromourmultithreadingexperiencewithZeroMQ,istrivial.However,
let'sfirstsketchwhatTitanicwouldlooklikeintermsofZeroMQmessagesandframes.ThisgivesustheTitanicServiceProtocol
(TSP).

UsingTSPisclearlymoreworkforclientapplicationsthanaccessingaservicedirectlyviaMDP.Here'stheshortestrobust
"echo"clientexample:

ticlient:TitanicclientexampleinC

Haxe|Java|PHP|Python|Ruby|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Lua|Node.js|ObjectiveC|ooc|
Perl|Q|Racket|Scala

Ofcoursethiscanbe,andshouldbe,wrappedupinsomekindofframeworkorAPI.It'snothealthytoaskaverageapplication
developerstolearnthefulldetailsofmessaging:ithurtstheirbrains,coststime,andofferstoomanywaystomakebuggy
complexity.Additionally,itmakesithardtoaddintelligence.

Forexample,thisclientblocksoneachrequestwhereasinarealapplication,we'dwanttobedoingusefulworkwhiletasksare
executed.Thisrequiressomenontrivialplumbingtobuildabackgroundthreadandtalktothatcleanly.It'sthekindofthingyou
wanttowrapinanicesimpleAPIthattheaveragedevelopercannotmisuse.It'sthesameapproachthatweusedforMajordomo.

Here'stheTitanicimplementation.Thisserverhandlesthethreeservicesusingthreethreads,asproposed.Itdoesfull
persistencetodiskusingthemostbrutalapproachpossible:onefilepermessage.It'ssosimple,it'sscary.Theonlycomplexpart
isthatitkeepsaseparatequeueofallrequests,toavoidreadingthedirectoryoverandover:

titanic:TitanicbrokerexampleinC

Haxe|Java|PHP|Python|Ruby|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Lua|Node.js|ObjectiveC|ooc|
Perl|Q|Racket|Scala

Totestthis,startmdbrokerandtitanic,andthenrunticlient.Nowstartmdworkerarbitrarily,andyoushouldseethe
clientgettingaresponseandexitinghappily.

Somenotesaboutthiscode:

Notethatsomeloopsstartbysending,othersbyreceivingmessages.ThisisbecauseTitanicactsbothasaclientanda
workerindifferentroles.
TheTitanicbrokerusestheMMIservicediscoveryprotocoltosendrequestsonlytoservicesthatappeartoberunning.
SincetheMMIimplementationinourlittleMajordomobrokerisquitepoor,thiswon'tworkallthetime.
Weuseaninprocconnectiontosendnewrequestdatafromthetitanic.requestservicethroughtothemain
dispatcher.Thissavesthedispatcherfromhavingtoscanthediskdirectory,loadallrequestfiles,andsortthemby
date/time.

Theimportantthingaboutthisexampleisnotperformance(which,althoughIhaven'ttestedit,issurelyterrible),buthowwellit
implementsthereliabilitycontract.Totryit,startthemdbrokerandtitanicprograms.Thenstarttheticlient,andthenstartthe
mdworkerechoservice.Youcanrunallfouroftheseusingthevoptiontodoverboseactivitytracing.Youcanstopandrestart
anypieceexcepttheclientandnothingwillgetlost.

IfyouwanttouseTitanicinrealcases,you'llrapidlybeasking"howdowemakethisfaster?"

Here'swhatI'ddo,startingwiththeexampleimplementation:

Useasinglediskfileforalldata,ratherthanmultiplefiles.Operatingsystemsareusuallybetterathandlingafewlarge
filesthanmanysmallerones.
Organizethatdiskfileasacircularbuffersothatnewrequestscanbewrittencontiguously(withveryoccasional
wraparound).Onethread,writingfullspeedtoadiskfile,canworkrapidly.
Keeptheindexinmemoryandrebuildtheindexatstartuptime,fromthediskbuffer.Thissavestheextradiskheadflutter
neededtokeeptheindexfullysafeondisk.Youwouldwantanfsyncaftereverymessage,oreveryNmillisecondsifyou
werepreparedtolosethelastMmessagesincaseofasystemfailure.
Useasolidstatedriveratherthanspinningironoxideplatters.
Preallocatetheentirefile,orallocateitinlargechunks,whichallowsthecircularbuffertogrowandshrinkasneeded.
Thisavoidsfragmentationandensuresthatmostreadsandwritesarecontiguous.

Andsoon.WhatI'dnotrecommendisstoringmessagesinadatabase,notevena"fast"key/valuestore,unlessyoureallylikea
specificdatabaseanddon'thaveperformanceworries.Youwillpayasteeppricefortheabstraction,tentoathousandtimesover
arawdiskfile.

IfyouwanttomakeTitanicevenmorereliable,duplicatetherequeststoasecondserver,whichyou'dplaceinasecondlocation
justfarawayenoughtosurviveanuclearattackonyourprimarylocation,yetnotsofarthatyougettoomuchlatency.

http://zguide.zeromq.org/page:all 89/225
12/31/2015 MQ - The Guide - MQ - The Guide
IfyouwanttomakeTitanicmuchfasterandlessreliable,storerequestsandrepliespurelyinmemory.Thiswillgiveyouthe
functionalityofadisconnectednetwork,butrequestswon'tsurviveacrashoftheTitanicserveritself.

HighAvailabilityPair(BinaryStarPattern) topprevnext

Figure52HighAvailabilityPair,NormalOperation

TheBinaryStarpatternputstwoserversinaprimarybackuphighavailabilitypair.Atanygiventime,oneofthese(theactive)
acceptsconnectionsfromclientapplications.Theother(thepassive)doesnothing,butthetwoserversmonitoreachother.Ifthe
activedisappearsfromthenetwork,afteracertaintimethepassivetakesoverasactive.

WedevelopedtheBinaryStarpatternatiMatixforourOpenAMQserver.Wedesignedit:

Toprovideastraightforwardhighavailabilitysolution.
Tobesimpleenoughtoactuallyunderstandanduse.
Tofailoverreliablywhenneeded,andonlywhenneeded.

AssumingwehaveaBinaryStarpairrunning,herearethedifferentscenariosthatwillresultinafailover:

Thehardwarerunningtheprimaryserverhasafatalproblem(powersupplyexplodes,machinecatchesfire,orsomeone
simplyunplugsitbymistake),anddisappears.Applicationsseethis,andreconnecttothebackupserver.
Thenetworksegmentonwhichtheprimaryserversitscrashesperhapsaroutergetshitbyapowerspikeand
applicationsstarttoreconnecttothebackupserver.
Theprimaryservercrashesoriskilledbytheoperatoranddoesnotrestartautomatically.

Figure53HighavailabilityPairDuringFailover

http://zguide.zeromq.org/page:all 90/225
12/31/2015 MQ - The Guide - MQ - The Guide

Recoveryfromfailoverworksasfollows:

Theoperatorsrestarttheprimaryserverandfixwhateverproblemswerecausingittodisappearfromthenetwork.
Theoperatorsstopthebackupserveratamomentwhenitwillcauseminimaldisruptiontoapplications.
Whenapplicationshavereconnectedtotheprimaryserver,theoperatorsrestartthebackupserver.

Recovery(tousingtheprimaryserverasactive)isamanualoperation.Painfulexperienceteachesusthatautomaticrecoveryis
undesirable.Thereareseveralreasons:

Failovercreatesaninterruptionofservicetoapplications,possiblylasting1030seconds.Ifthereisarealemergency,this
ismuchbetterthantotaloutage.Butifrecoverycreatesafurther1030secondoutage,itisbetterthatthishappensoff
peak,whenusershavegoneoffthenetwork.

Whenthereisanemergency,theabsolutefirstpriorityiscertaintyforthosetryingtofixthings.Automaticrecoverycreates
uncertaintyforsystemadministrators,whocannolongerbesurewhichserverisinchargewithoutdoublechecking.

Automaticrecoverycancreatesituationswherenetworksfailoverandthenrecover,placingoperatorsinthedifficult
positionofanalyzingwhathappened.Therewasaninterruptionofservice,butthecauseisn'tclear.

Havingsaidthis,theBinaryStarpatternwillfailbacktotheprimaryserverifthisisrunning(again)andthebackupserverfails.In
fact,thisishowweprovokerecovery.

TheshutdownprocessforaBinaryStarpairistoeither:

1. Stopthepassiveserverandthenstoptheactiveserveratanylatertime,or
2. Stopbothserversinanyorderbutwithinafewsecondsofeachother.

Stoppingtheactiveandthenthepassiveserverwithanydelaylongerthanthefailovertimeoutwillcauseapplicationsto
disconnect,thenreconnect,andthendisconnectagain,whichmaydisturbusers.

DetailedRequirements topprevnext

BinaryStarisassimpleasitcanbe,whilestillworkingaccurately.Infact,thecurrentdesignisthethirdcompleteredesign.Each
ofthepreviousdesignswefoundtobetoocomplex,tryingtodotoomuch,andwestrippedoutfunctionalityuntilwecametoa
designthatwasunderstandable,easytouse,andreliableenoughtobeworthusing.

Theseareourrequirementsforahighavailabilityarchitecture:

Thefailoverismeanttoprovideinsuranceagainstcatastrophicsystemfailures,suchashardwarebreakdown,fire,
accident,andsoon.Therearesimplerwaystorecoverfromordinaryservercrashesandwealreadycoveredthese.

Failovertimeshouldbeunder60secondsandpreferablyunder10seconds.

Failoverhastohappenautomatically,whereasrecoverymusthappenmanually.Wewantapplicationstoswitchoverto
http://zguide.zeromq.org/page:all 91/225
12/31/2015 MQ - The Guide - MQ - The Guide
thebackupserverautomatically,butwedonotwantthemtoswitchbacktotheprimaryserverexceptwhentheoperators
havefixedwhateverproblemtherewasanddecidedthatitisagoodtimetointerruptapplicationsagain.

Thesemanticsforclientapplicationsshouldbesimpleandeasyfordeveloperstounderstand.Ideally,theyshouldbe
hiddenintheclientAPI.

Thereshouldbeclearinstructionsfornetworkarchitectsonhowtoavoiddesignsthatcouldleadtosplitbrainsyndrome,
inwhichbothserversinaBinaryStarpairthinktheyaretheactiveserver.

Thereshouldbenodependenciesontheorderinwhichthetwoserversarestarted.

Itmustbepossibletomakeplannedstopsandrestartsofeitherserverwithoutstoppingclientapplications(thoughthey
maybeforcedtoreconnect).

Operatorsmustbeabletomonitorbothserversatalltimes.

Itmustbepossibletoconnectthetwoserversusingahighspeeddedicatednetworkconnection.Thatis,failover
synchronizationmustbeabletouseaspecificIProute.

Wemakethefollowingassumptions:

Asinglebackupserverprovidesenoughinsurancewedon'tneedmultiplelevelsofbackup.

Theprimaryandbackupserversareequallycapableofcarryingtheapplicationload.Wedonotattempttobalanceload
acrosstheservers.

Thereissufficientbudgettocoverafullyredundantbackupserverthatdoesnothingalmostallthetime.

Wedon'tattempttocoverthefollowing:

Theuseofanactivebackupserverorloadbalancing.InaBinaryStarpair,thebackupserverisinactiveanddoesno
usefulworkuntiltheprimaryservergoesoffline.

Thehandlingofpersistentmessagesortransactionsinanyway.Weassumetheexistenceofanetworkofunreliable(and
probablyuntrusted)serversorBinaryStarpairs.

Anyautomaticexplorationofthenetwork.TheBinaryStarpairismanuallyandexplicitlydefinedinthenetworkandis
knowntoapplications(atleastintheirconfigurationdata).

Replicationofstateormessagesbetweenservers.Allserversidestatemustberecreatedbyapplicationswhentheyfail
over.

HereisthekeyterminologythatweuseinBinaryStar:

Primary:theserverthatisnormallyorinitiallyactive.

Backup:theserverthatisnormallypassive.Itwillbecomeactiveifandwhentheprimaryserverdisappearsfromthe
network,andwhenclientapplicationsaskthebackupservertoconnect.

Active:theserverthatacceptsclientconnections.Thereisatmostoneactiveserver.

Passive:theserverthattakesoveriftheactivedisappears.NotethatwhenaBinaryStarpairisrunningnormally,the
primaryserverisactive,andthebackupispassive.Whenafailoverhashappened,therolesareswitched.

ToconfigureaBinaryStarpair,youneedto:

1. Telltheprimaryserverwherethebackupserverislocated.
2. Tellthebackupserverwheretheprimaryserverislocated.
3. Optionally,tunethefailoverresponsetimes,whichmustbethesameforbothservers.

Themaintuningconcernishowfrequentlyyouwanttheserverstochecktheirpeeringstatus,andhowquicklyyouwantto
activatefailover.Inourexample,thefailovertimeoutvaluedefaultsto2,000msec.Ifyoureducethis,thebackupserverwilltake
overasactivemorerapidlybutmaytakeoverincaseswheretheprimaryservercouldrecover.Forexample,youmayhave
wrappedtheprimaryserverinashellscriptthatrestartsitifitcrashes.Inthatcase,thetimeoutshouldbehigherthanthetime
neededtorestarttheprimaryserver.

ForclientapplicationstoworkproperlywithaBinaryStarpair,theymust:

1. Knowbothserveraddresses.

http://zguide.zeromq.org/page:all 92/225
12/31/2015 MQ - The Guide - MQ - The Guide
2. Trytoconnecttotheprimaryserver,andifthatfails,tothebackupserver.
3. Detectafailedconnection,typicallyusingheartbeating.
4. Trytoreconnecttotheprimary,andthenbackup(inthatorder),withadelaybetweenretriesthatisatleastashighasthe
serverfailovertimeout.
5. Recreateallofthestatetheyrequireonaserver.
6. Retransmitmessageslostduringafailover,ifmessagesneedtobereliable.

It'snottrivialwork,andwe'dusuallywrapthisinanAPIthathidesitfromrealenduserapplications.

ThesearethemainlimitationsoftheBinaryStarpattern:

AserverprocesscannotbepartofmorethanoneBinaryStarpair.
Aprimaryservercanhaveasinglebackupserver,andnomore.
Thepassiveserverdoesnousefulwork,andisthuswasted.
Thebackupservermustbecapableofhandlingfullapplicationloads.
Failoverconfigurationcannotbemodifiedatruntime.
Clientapplicationsmustdosomeworktobenefitfromfailover.

PreventingSplitBrainSyndrome topprevnext

Splitbrainsyndromeoccurswhendifferentpartsofaclusterthinktheyareactiveatthesametime.Itcausesapplicationstostop
seeingeachother.BinaryStarhasanalgorithmfordetectingandeliminatingsplitbrain,whichisbasedonathreewaydecision
mechanism(aserverwillnotdecidetobecomeactiveuntilitgetsapplicationconnectionrequestsanditcannotseeitspeer
server).

However,itisstillpossibleto(mis)designanetworktofoolthisalgorithm.AtypicalscenariowouldbeaBinaryStarpair,thatis
distributedbetweentwobuildings,whereeachbuildingalsohadasetofapplicationsandwheretherewasasinglenetworklink
betweenbothbuildings.Breakingthislinkwouldcreatetwosetsofclientapplications,eachwithhalfoftheBinaryStarpair,and
eachfailoverserverwouldbecomeactive.

Topreventsplitbrainsituations,wemustconnectaBinaryStarpairusingadedicatednetworklink,whichcanbeassimpleas
pluggingthembothintothesameswitchor,better,usingacrossovercabledirectlybetweentwomachines.

WemustnotsplitaBinaryStararchitectureintotwoislands,eachwithasetofapplications.Whilethismaybeacommontypeof
networkarchitecture,youshouldusefederation,nothighavailabilityfailover,insuchcases.

Asuitablyparanoidnetworkconfigurationwouldusetwoprivateclusterinterconnects,ratherthanasingleone.Further,the
networkcardsusedfortheclusterwouldbedifferentfromthoseusedformessagetraffic,andpossiblyevenondifferentpathson
theserverhardware.Thegoalistoseparatepossiblefailuresinthenetworkfrompossiblefailuresinthecluster.Networkports
canhavearelativelyhighfailurerate.

BinaryStarImplementation topprevnext

Withoutfurtherado,hereisaproofofconceptimplementationoftheBinaryStarserver.Theprimaryandbackupserversrunthe
samecode,youchoosetheirroleswhenyourunthecode:

bstarsrv:BinaryStarserverinC

Haxe|Java|Python|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|
Q|Racket|Ruby|Scala

Andhereistheclient:

bstarcli:BinaryStarclientinC

Haxe|Java|Python|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|
Q|Racket|Ruby|Scala

TotestBinaryStar,starttheserversandclientinanyorder:

http://zguide.zeromq.org/page:all 93/225
12/31/2015 MQ - The Guide - MQ - The Guide
bstarsrvp#Startprimary
bstarsrvb#Startbackup
bstarcli

Youcanthenprovokefailoverbykillingtheprimaryserver,andrecoverybyrestartingtheprimaryandkillingthebackup.Note
howit'stheclientvotethattriggersfailover,andrecovery.

Binarystarisdrivenbyafinitestatemachine.Eventsarethepeerstate,so"PeerActive"meanstheotherserverhastoldusit's
active."ClientRequest"meanswe'vereceivedaclientrequest."ClientVote"meanswe'vereceivedaclientrequestANDour
peerisinactivefortwoheartbeats.

NotethattheserversusePUBSUBsocketsforstateexchange.Noothersocketcombinationwillworkhere.PUSHandDEALER
blockifthereisnopeerreadytoreceiveamessage.PAIRdoesnotreconnectifthepeerdisappearsandcomesback.ROUTER
needstheaddressofthepeerbeforeitcansenditamessage.

Figure54BinaryStarFiniteStateMachine

BinaryStarReactor topprevnext

BinaryStarisusefulandgenericenoughtopackageupasareusablereactorclass.Thereactorthenrunsandcallsourcode
wheneverithasamessagetoprocess.Thisismuchnicerthancopying/pastingtheBinaryStarcodeintoeachserverwherewe
wantthatcapability.

InC,wewraptheCZMQzloopclassthatwesawbefore.zloopletsyouregisterhandlerstoreactonsocketandtimerevents.
IntheBinaryStarreactor,weprovidehandlersforvotersandforstatechanges(activetopassive,andviceversa).Hereisthe
bstarAPI:

//bstarclassBinaryStarreactor

#include"bstar.h"

http://zguide.zeromq.org/page:all 94/225
12/31/2015 MQ - The Guide - MQ - The Guide
//Stateswecanbeinatanypointintime
typedefenum{
STATE_PRIMARY=1,//Primary,waitingforpeertoconnect
STATE_BACKUP=2,//Backup,waitingforpeertoconnect
STATE_ACTIVE=3,//Activeacceptingconnections
STATE_PASSIVE=4//Passivenotacceptingconnections
}state_t

//Events,whichstartwiththestatesourpeercanbein
typedefenum{
PEER_PRIMARY=1,//HApeerispendingprimary
PEER_BACKUP=2,//HApeerispendingbackup
PEER_ACTIVE=3,//HApeerisactive
PEER_PASSIVE=4,//HApeerispassive
CLIENT_REQUEST=5//Clientmakesrequest
}event_t

//Structureofourclass

struct_bstar_t{
zctx_t*ctx//Ourprivatecontext
zloop_t*loop//Reactorloop
void*statepub//Statepublisher
void*statesub//Statesubscriber
state_tstate//Currentstate
event_tevent//Currentevent
int64_tpeer_expiry//Whenpeerisconsidered'dead'
zloop_fn*voter_fn//Votingsockethandler
void*voter_arg//Argumentsforvotinghandler
zloop_fn*active_fn//Callwhenbecomeactive
void*active_arg//Argumentsforhandler
zloop_fn*passive_fn//Callwhenbecomepassive
void*passive_arg//Argumentsforhandler
}

//Thefinitestatemachineisthesameasintheproofofconceptserver.
//Tounderstandthisreactorindetail,firstreadtheCZMQzloopclass.

//Wesendstateinformationeverythisoften
//Ifpeerdoesn'trespondintwoheartbeats,itis'dead'
#defineBSTAR_HEARTBEAT1000//Inmsecs

//BinaryStarfinitestatemachine(applieseventtostate)
//Returns1iftherewasanexception,0ifeventwasvalid.

staticint
s_execute_fsm(bstar_t*self)
{
intrc=0
//Primaryserveriswaitingforpeertoconnect
//AcceptsCLIENT_REQUESTeventsinthisstate
if(self>state==STATE_PRIMARY){
if(self>event==PEER_BACKUP){
zclock_log("I:connectedtobackup(passive),readyasactive")
self>state=STATE_ACTIVE
if(self>active_fn)
(self>active_fn)(self>loop,NULL,self>active_arg)
}
else
if(self>event==PEER_ACTIVE){
zclock_log("I:connectedtobackup(active),readyaspassive")
self>state=STATE_PASSIVE
if(self>passive_fn)

http://zguide.zeromq.org/page:all 95/225
12/31/2015 MQ - The Guide - MQ - The Guide
(self>passive_fn)(self>loop,NULL,self>passive_arg)
}
e
lse
elf>event==CLIENT_REQUEST){
if(s
usintotheactiveifwe've
//Allowclientrequeststoturn
//waitedsufficientlylongtobelievethebackupisnot
//currentlyactingasactive(i.e.,afterafailover)
assert(self>peer_expiry>0)

if(zclock_time()>=self>peer_expiry){
adyasactive")
zclock_log("I:requestfromclient,re
self>state=STATE_ACTIVE
if(self>active_fn)
(self>loop,NULL,self>active_arg)
(self>active_fn)
}else

//Don'trespondtoclientsyetit'spossiblewe're
//performingafailbackandthebackupiscurrently
active
rc=1
}
}

else
ackupserveriswaitingforpeertoconnect
//B
//RejectsCLIENT_REQUESTeventsinthisstate
if(self>state==STATE_BACKUP){
{
if(self>event==PEER_ACTIVE)
rimary(active),readyaspassive")
zclock_log("I:connectedtop
self>state=STATE_PASSIVE
if(self>passive_fn)
(self>loop,NULL,self>passive_arg)
(self>passive_fn)
}

else
elf>event==CLIENT_REQUEST)
if(s
rc=1
}
e
lse
erverisactive
//S
//AcceptsCLIENT_R
EQUESTeventsinthisstate
//TheonlywayoutofACTIVEisdeath
if(self>state==STATE_ACTIVE){
{
if(self>event==PEER_ACTIVE)
itbrain
//Twoactiveswouldmeanspl
zclock_log("E:fatalerrordualact
ives,aborting")
rc=1
}
}

else
erverispassive
//S
//CLIENT_REQUESTev
entscantriggerfailoverifpeerlooksdead
if(self>state==STATE_PASSIVE){

if(self>event==PEER_PRIMARY){
active,peerwillgopassive
//Peerisrestartingbecome
zclock_log("I:primary(passive)isrestarting,readyasac
tive")
self>state=STATE_ACTIVE
}
e
lse
elf>event==PEER_BACKUP){
if(s
eactive,peerwillgopassive
//Peerisrestartingbecom
zclock_log("I:backup(passive)isrestarting,readyasact
ive")
self>state=STATE_ACTIVE
}
e
lse
elf>event==PEER_PASSIVE){
if(s
sterwouldbenonresponsive
//Twopassiveswouldmeanclu

http://zguide.zeromq.org/page:all 96/225
12/31/2015 MQ - The Guide - MQ - The Guide
zclock_log("E:fatalerrordualpassives,aborting")
rc=1
}

else
elf>event==CLIENT_REQUEST){
if(s
thaspassed
//Peerbecomesactiveiftimeou
//It'stheclientrequestthattriggersthe
failover
assert(self>peer_expiry>0)

if(zclock_time()>=self>peer_expiry){
vestate
//Ifpeerisdead,switchtotheacti
zclock_log("I:failoversuccessful,readyasa
ctive")
self>state=STATE_ACTIVE
}
e
lse
/Ifpeerisalive,rejectconnections
/
rc=1
}
/
/Callstatechangehandlerifnecessary
if(self>state==STATE_ACTIVE&&self>a
ctive_fn)
e_arg)
(self>active_fn)(self>loop,NULL,self>activ
}
r
eturnrc
}

staticvoid
s_update_peer_expiry(bstar_t*self)
{
self>peer_expiry=zclock_time()+2*BSTAR_HEARTBEAT
}

//Reactoreventhandlers

//Publishourstatetopeer
ints_send_state(zloop_t*loop,inttimer_id,void*arg)
{
bstar_t*self=(bstar_t*)arg
zstr_sendf(self>statepub,"%d",self>state)
return0
}

//Receivestatefrompeer,executefinitestatemachine
ints_recv_state(zloop_t*loop,zmq_pollitem_t*poller,void*arg)
{
bstar_t*self=(bstar_t*)arg
char*state=zstr_recv(poller>socket)
if(state){
self>event=atoi(state)
s_update_peer_expiry(self)
free(state)
}
returns_execute_fsm(self)
}

//Applicationwantstospeaktous,seeifit'spossible
ints_voter_ready(zloop_t*loop,zmq_pollitem_t*poller,void*arg)
{
bstar_t*self=(bstar_t*)arg
//Ifservercanacceptinputnow,callapplhandler
self>event=CLIENT_REQUEST
if(s_execute_fsm(self)==0)
(self>voter_fn)(self>loop,poller,self>voter_arg)
else{
//Destroywaitingmessage,noonetoreadit
http://zguide.zeromq.org/page:all 97/225
12/31/2015 MQ - The Guide - MQ - The Guide
zmsg_t*msg=zmsg_recv(poller>socket)
zmsg_destroy(&msg)
}
r
eturn0
}

//Thisistheconstructorforourbstarclass.Wehavetotellit
//whetherwe'reprimaryorbackupserver,aswellasourlocaland
//remoteendpointstobindandconnectto:

bstar_t*
bstar_new(intprimary,char*local,char*remote)
{
bstar_t
*self

self=(bstar_t*)zmalloc(sizeof(bstar_t))

//InitializetheBinaryStar
self>ctx=zctx_new()
self>loop=zloop_new()
self>state=primary?STATE_PRIMARY:STATE_BACKUP

//Createpublisherforstategoingtopeer
self>statepub=zsocket_new(self>ctx,ZMQ_PUB)
zsocket_bind(self>statepub,local)

//Createsubscriberforstatecomingfrompeer
self>statesub=zsocket_new(self>ctx,ZMQ_SUB)
zsocket_set_subscribe(self>statesub,"")
zsocket_connect(self>statesub,remote)

//Setupbasicreactorevents
zloop_timer(self>loop,BSTAR_HEARTBEAT,0,s_send_state,self)
zmq_pollitem_tpoller={self>statesub,0,ZMQ_POLLIN}
zloop_poller(self>loop,&poller,s_recv_state,self)
returnself
}

//Thedestructorshutsdownthebstarreactor:

void
bstar_destroy(bstar_t**self_p)
{
assert(self_p)
if(*self_p){
bstar_t*self=*self_p
zloop_destroy(&self>loop)
zctx_destroy(&self>ctx)
free(self)
*self_p=NULL
}
}

//Thismethodreturnstheunderlyingzloopreactor,sowecanadd
//additionaltimersandreaders:

zloop_t*
bstar_zloop(bstar_t*self)
{
returnself>loop
}

//Thismethodregistersaclientvotersocket.Messagesreceived

http://zguide.zeromq.org/page:all 98/225
12/31/2015 MQ - The Guide - MQ - The Guide
//onthissocketprovidetheCLIENT_REQUESTeventsfortheBinaryStar
//FSMandarepassedtotheprovidedapplicationhandler.Werequire
//exactlyonevoterperbstarinstance:

int
bstar_voter(bstar_t*self,char*endpoint,inttype,zloop_fnhandler,
void*arg)
{
//Holdactualhandler+argsowecancallthislater
void*socket=zsocket_new(self>ctx,type)
zsocket_bind(socket,endpoint)
assert(!self>voter_fn)
self>voter_fn=handler
self>voter_arg=arg
zmq_pollitem_tpoller={socket,0,ZMQ_POLLIN}
returnzloop_poller(self>loop,&poller,s_voter_ready,self)
}

//Registerhandlerstobecalledeachtimethere'sastatechange:

void
bstar_new_active(bstar_t*self,zloop_fnhandler,void*arg)
{
assert(!self>active_fn)
self>active_fn=handler
self>active_arg=arg
}

void
bstar_new_passive(bstar_t*self,zloop_fnhandler,void*arg)
{
assert(!self>passive_fn)
self>passive_fn=handler
self>passive_arg=arg
}

//Enable/disableverbosetracing,fordebugging:

voidbstar_set_verbose(bstar_t*self,boolverbose)
{
zloop_set_verbose(self>loop,verbose)
}

//Finally,starttheconfiguredreactor.Itwillendifanyhandler
//returns1tothereactor,oriftheprocessreceivesSIGINTorSIGTERM:

int
bstar_start(bstar_t*self)
{
assert(self>voter_fn)
s_update_peer_expiry(self)
returnzloop_start(self>loop)
}

Andhereistheclassimplementation:

bstar:BinaryStarcoreclassinC

Haxe|Java|Python|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|
Q|Racket|Ruby|Scala

Thisgivesusthefollowingshortmainprogramfortheserver:

bstarsrv2:BinaryStarserver,usingcoreclassinC
http://zguide.zeromq.org/page:all 99/225
12/31/2015 MQ - The Guide - MQ - The Guide

Haxe|Java|Python|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|
Q|Racket|Ruby|Scala

BrokerlessReliability(FreelancePattern) topprevnext

Itmightseemironictofocussomuchonbrokerbasedreliability,whenweoftenexplainZeroMQas"brokerlessmessaging".
However,inmessaging,asinreallife,themiddlemanisbothaburdenandabenefit.Inpractice,mostmessagingarchitectures
benefitfromamixofdistributedandbrokeredmessaging.Yougetthebestresultswhenyoucandecidefreelywhattradeoffs
youwanttomake.ThisiswhyIcandrivetwentyminutestoawholesalertobuyfivecasesofwineforaparty,butIcanalsowalk
tenminutestoacornerstoretobuyonebottleforadinner.Ourhighlycontextsensitiverelativevaluationsoftime,energy,and
costareessentialtotherealworldeconomy.Andtheyareessentialtoanoptimalmessagebasedarchitecture.

ThisiswhyZeroMQdoesnotimposeabrokercentricarchitecture,thoughitdoesgiveyouthetoolstobuildbrokers,akaproxies,
andwe'vebuiltadozenorsodifferentonessofar,justforpractice.

Sowe'llendthischapterbydeconstructingthebrokerbasedreliabilitywe'vebuiltsofar,andturningitbackintoadistributed
peertopeerarchitectureIcalltheFreelancepattern.Ourusecasewillbeanameresolutionservice.Thisisacommonproblem
withZeroMQarchitectures:howdoweknowtheendpointtoconnectto?HardcodingTCP/IPaddressesincodeisinsanely
fragile.Usingconfigurationfilescreatesanadministrationnightmare.Imagineifyouhadtohandconfigureyourwebbrowser,on
everyPCormobilephoneyouused,torealizethat"google.com"was"74.125.230.82".

AZeroMQnameservice(andwe'llmakeasimpleimplementation)mustdothefollowing:

Resolvealogicalnameintoatleastabindendpoint,andaconnectendpoint.Arealisticnameservicewouldprovide
multiplebindendpoints,andpossiblymultipleconnectendpointsaswell.

Allowustomanagemultipleparallelenvironments,e.g.,"test"versus"production",withoutmodifyingcode.

Bereliable,becauseifitisunavailable,applicationswon'tbeabletoconnecttothenetwork.

PuttinganameservicebehindaserviceorientedMajordomobrokeriscleverfromsomepointsofview.However,it'ssimplerand
muchlesssurprisingtojustexposethenameserviceasaservertowhichclientscanconnectdirectly.Ifwedothisright,the
nameservicebecomestheonlyglobalnetworkendpointweneedtohardcodeinourcodeorconfigurationfiles.

Figure55TheFreelancePattern

Thetypesoffailureweaimtohandleareservercrashesandrestarts,serverbusylooping,serveroverload,andnetworkissues.
Togetreliability,we'llcreateapoolofnameserverssoifonecrashesorgoesaway,clientscanconnecttoanother,andsoon.In
practice,twowouldbeenough.Butfortheexample,we'llassumethepoolcanbeanysize.

Inthisarchitecture,alargesetofclientsconnecttoasmallsetofserversdirectly.Theserversbindtotheirrespectiveaddresses.
It'sfundamentallydifferentfromabrokerbasedapproachlikeMajordomo,whereworkersconnecttothebroker.Clientshavea
coupleofoptions:

UseREQsocketsandtheLazyPiratepattern.Easy,butwouldneedsomeadditionalintelligencesoclientsdon'tstupidly

http://zguide.zeromq.org/page:all 100/225
12/31/2015 MQ - The Guide - MQ - The Guide
trytoreconnecttodeadserversoverandover.

UseDEALERsocketsandblastoutrequests(whichwillbeloadbalancedtoallconnectedservers)untiltheygetareply.
Effective,butnotelegant.

UseROUTERsocketssoclientscanaddressspecificservers.Buthowdoestheclientknowtheidentityoftheserver
sockets?Eithertheserverhastopingtheclientfirst(complex),ortheserverhastouseahardcoded,fixedidentityknown
totheclient(nasty).

We'lldevelopeachoftheseinthefollowingsubsections.

ModelOne:SimpleRetryandFailover topprevnext

Soourmenuappearstooffer:simple,brutal,complex,ornasty.Let'sstartwithsimpleandthenworkoutthekinks.WetakeLazy
Pirateandrewriteittoworkwithmultipleserverendpoints.

Startoneorseveralserversfirst,specifyingabindendpointastheargument:

flserver1:Freelanceserver,ModelOneinC

C#|Java|Lua|PHP|Python|Tcl|Ada|Basic|C++|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Node.js|ObjectiveC|ooc|Perl|
Q|Racket|Ruby|Scala

Thenstarttheclient,specifyingoneormoreconnectendpointsasarguments:

flclient1:Freelanceclient,ModelOneinC

C#|Java|PHP|Python|Tcl|Ada|Basic|C++|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Lua|Node.js|ObjectiveC|ooc|Perl|
Q|Racket|Ruby|Scala

Asamplerunis:

flserver1tcp://*:5555&
flserver1tcp://*:5556&
flclient1tcp://localhost:5555tcp://localhost:5556

AlthoughthebasicapproachisLazyPirate,theclientaimstojustgetonesuccessfulreply.Ithastwotechniques,dependingon
whetheryouarerunningasingleserverormultipleservers:

Withasingleserver,theclientwillretryseveraltimes,exactlyasforLazyPirate.
Withmultipleservers,theclientwilltryeachserveratmostonceuntilit'sreceivedareplyorhastriedallservers.

ThissolvesthemainweaknessofLazyPirate,namelythatitcouldnotfailovertobackuporalternateservers.

However,thisdesignwon'tworkwellinarealapplication.Ifwe'reconnectingmanysocketsandourprimarynameserveris
down,we'regoingtoexperiencethispainfultimeouteachtime.

ModelTwo:BrutalShotgunMassacre topprevnext

Let'sswitchourclienttousingaDEALERsocket.Ourgoalhereistomakesurewegetareplybackwithintheshortestpossible
time,nomatterwhetheraparticularserverisupordown.Ourclienttakesthisapproach:

Wesetthingsup,connectingtoallservers.
Whenwehavearequest,weblastitoutasmanytimesaswehaveservers.
Wewaitforthefirstreply,andtakethat.
Weignoreanyotherreplies.

Whatwillhappeninpracticeisthatwhenallserversarerunning,ZeroMQwilldistributetherequestssothateachservergetsone
requestandsendsonereply.Whenanyserverisofflineanddisconnected,ZeroMQwilldistributetherequeststotheremaining
http://zguide.zeromq.org/page:all 101/225
12/31/2015 MQ - The Guide - MQ - The Guide
servers.Soaservermayinsomecasesgetthesamerequestmorethanonce.

What'smoreannoyingfortheclientisthatwe'llgetmultiplerepliesback,butthere'snoguaranteewe'llgetaprecisenumberof
replies.Requestsandrepliescangetlost(e.g.,iftheservercrasheswhileprocessingarequest).

Sowehavetonumberrequestsandignoreanyrepliesthatdon'tmatchtherequestnumber.OurModelOneserverwillwork
becauseit'sanechoserver,butcoincidenceisnotagreatbasisforunderstanding.Sowe'llmakeaModelTwoserverthatchews
upthemessageandreturnsacorrectlynumberedreplywiththecontent"OK".We'llusemessagesconsistingoftwoparts:a
sequencenumberandabody.

Startoneormoreservers,specifyingabindendpointeachtime:

flserver2:Freelanceserver,ModelTwoinC

C#|Java|Lua|PHP|Python|Tcl|Ada|Basic|C++|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Node.js|ObjectiveC|ooc|Perl|
Q|Racket|Ruby|Scala

Thenstarttheclient,specifyingtheconnectendpointsasarguments:

flclient2:Freelanceclient,ModelTwoinC

C#|Java|PHP|Python|Tcl|Ada|Basic|C++|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Lua|Node.js|ObjectiveC|ooc|Perl|
Q|Racket|Ruby|Scala

Herearesomethingstonoteabouttheclientimplementation:

TheclientisstructuredasanicelittleclassbasedAPIthathidesthedirtyworkofcreatingZeroMQcontextsandsockets
andtalkingtotheserver.Thatis,ifashotgunblasttothemidriffcanbecalled"talking".

Theclientwillabandonthechaseifitcan'tfindanyresponsiveserverwithinafewseconds.

TheclienthastocreateavalidREPenvelope,i.e.,addanemptymessageframetothefrontofthemessage.

Theclientperforms10,000nameresolutionrequests(fakeones,asourserverdoesessentiallynothing)andmeasuresthe
averagecost.Onmytestbox,talkingtooneserver,thisrequiresabout60microseconds.Talkingtothreeservers,ittakesabout
80microseconds.

Theprosandconsofourshotgunapproachare:

Pro:itissimple,easytomakeandeasytounderstand.
Pro:itdoesthejoboffailover,andworksrapidly,solongasthereisatleastoneserverrunning.
Con:itcreatesredundantnetworktraffic.
Con:wecan'tprioritizeourservers,i.e.,Primary,thenSecondary.
Con:theservercandoatmostonerequestatatime,period.

ModelThree:ComplexandNasty topprevnext

Theshotgunapproachseemstoogoodtobetrue.Let'sbescientificandworkthroughallthealternatives.We'regoingtoexplore
thecomplex/nastyoption,evenifit'sonlytofinallyrealizethatwepreferredbrutal.Ah,thestoryofmylife.

WecansolvethemainproblemsoftheclientbyswitchingtoaROUTERsocket.Thatletsussendrequeststospecificservers,
avoidserversweknowaredead,andingeneralbeassmartaswewanttobe.Wecanalsosolvethemainproblemoftheserver
(singlethreadedness)byswitchingtoaROUTERsocket.

ButdoingROUTERtoROUTERbetweentwoanonymoussockets(whichhaven'tsetanidentity)isnotpossible.Bothsides
generateanidentity(fortheotherpeer)onlywhentheyreceiveafirstmessage,andthusneithercantalktotheotheruntilithas
firstreceivedamessage.Theonlywayoutofthisconundrumistocheat,andusehardcodedidentitiesinonedirection.The
properwaytocheat,inaclient/servercase,istolettheclient"know"theidentityoftheserver.Doingittheotherwayaround
wouldbeinsane,ontopofcomplexandnasty,becauseanynumberofclientsshouldbeabletoariseindependently.Insane,
complex,andnastyaregreatattributesforagenocidaldictator,butterribleonesforsoftware.

Ratherthaninventyetanotherconcepttomanage,we'llusetheconnectionendpointasidentity.Thisisauniquestringonwhich
bothsidescanagreewithoutmorepriorknowledgethantheyalreadyhavefortheshotgunmodel.It'sasneakyandeffectiveway
toconnecttwoROUTERsockets.

http://zguide.zeromq.org/page:all 102/225
12/31/2015 MQ - The Guide - MQ - The Guide
RememberhowZeroMQidentitieswork.TheserverROUTERsocketsetsanidentitybeforeitbindsitssocket.Whenaclient
connects,theydoalittlehandshaketoexchangeidentities,beforeeithersidesendsarealmessage.TheclientROUTERsocket,
havingnotsetanidentity,sendsanullidentitytotheserver.TheservergeneratesarandomUUIDtodesignatetheclientforits
ownuse.Theserversendsitsidentity(whichwe'veagreedisgoingtobeanendpointstring)totheclient.

Thismeansthatourclientcanrouteamessagetotheserver(i.e.,sendonitsROUTERsocket,specifyingtheserverendpointas
identity)assoonastheconnectionisestablished.That'snotimmediatelyafterdoingazmq_connect(),butsomerandomtime
thereafter.Hereinliesoneproblem:wedon'tknowwhentheserverwillactuallybeavailableandcompleteitsconnection
handshake.Iftheserverisonline,itcouldbeafterafewmilliseconds.Iftheserverisdownandthesysadminisouttolunch,it
couldbeanhourfromnow.

There'sasmallparadoxhere.Weneedtoknowwhenserversbecomeconnectedandavailableforwork.IntheFreelance
pattern,unlikethebrokerbasedpatternswesawearlierinthischapter,serversaresilentuntilspokento.Thuswecan'ttalktoa
serveruntilit'stoldusit'sonline,whichitcan'tdountilwe'veaskedit.

Mysolutionistomixinalittleoftheshotgunapproachfrommodel2,meaningwe'llfire(harmless)shotsatanythingwecan,and
ifanythingmoves,weknowit'salive.We'renotgoingtofirerealrequests,butratherakindofpingpongheartbeat.

Thisbringsustotherealmofprotocolsagain,sohere'sashortspecthatdefineshowaFreelanceclientandserverexchange
pingpongcommandsandrequestreplycommands.

Itisshortandsweettoimplementasaserver.Here'sourechoserver,ModelThree,nowspeakingFLP:

flserver3:Freelanceserver,ModelThreeinC

C#|Java|Lua|Python|Tcl|Ada|Basic|C++|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Node.js|ObjectiveC|ooc|Perl|PHP|
Q|Racket|Ruby|Scala

TheFreelanceclient,however,hasgottenlarge.Forclarity,it'ssplitintoanexampleapplicationandaclassthatdoesthehard
work.Here'sthetoplevelapplication:

flclient3:Freelanceclient,ModelThreeinC

C#|Java|Python|Tcl|Ada|Basic|C++|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|
Q|Racket|Ruby|Scala

Andhere,almostascomplexandlargeastheMajordomobroker,istheclientAPIclass:

flcliapi:FreelanceclientAPIinC

C#|Java|Python|Tcl|Ada|Basic|C++|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|
Q|Racket|Ruby|Scala

ThisAPIimplementationisfairlysophisticatedandusesacoupleoftechniquesthatwe'venotseenbefore.

MultithreadedAPI:theclientAPIconsistsoftwoparts,asynchronousflcliapiclassthatrunsintheapplicationthread,
andanasynchronousagentclassthatrunsasabackgroundthread.RememberhowZeroMQmakesiteasytocreate
multithreadedapps.Theflcliapiandagentclassestalktoeachotherwithmessagesoveraninprocsocket.AllZeroMQ
aspects(suchascreatinganddestroyingacontext)arehiddenintheAPI.Theagentineffectactslikeaminibroker,
talkingtoserversinthebackground,sothatwhenwemakearequest,itcanmakeabestefforttoreachaserverit
believesisavailable.

Ticklesspolltimer:inpreviouspollloopswealwaysusedafixedtickinterval,e.g.,1second,whichissimpleenoughbut
notexcellentonpowersensitiveclients(suchasnotebooksormobilephones),wherewakingtheCPUcostspower.For
fun,andtohelpsavetheplanet,theagentusesaticklesstimer,whichcalculatesthepolldelaybasedonthenexttimeout
we'reexpecting.Aproperimplementationwouldkeepanorderedlistoftimeouts.Wejustcheckalltimeoutsandcalculate
thepolldelayuntilthenextone.

Conclusion topprevnext

Inthischapter,we'veseenavarietyofreliablerequestreplymechanisms,eachwithcertaincostsandbenefits.Theexample
codeislargelyreadyforrealuse,thoughitisnotoptimized.Ofallthedifferentpatterns,thetwothatstandoutforproductionuse
aretheMajordomopattern,forbrokerbasedreliability,andtheFreelancepattern,forbrokerlessreliability.

http://zguide.zeromq.org/page:all 103/225
12/31/2015 MQ - The Guide - MQ - The Guide

Chapter5AdvancedPubSubPatterns topprevnext

InChapter3AdvancedRequestReplyPatternsandChapter4ReliableRequestReplyPatternswelookedatadvanceduseof
ZeroMQ'srequestreplypattern.Ifyoumanagedtodigestallthat,congratulations.Inthischapterwe'llfocusonpublishsubscribe
andextendZeroMQ'scorepubsubpatternwithhigherlevelpatternsforperformance,reliability,statedistribution,and
monitoring.

We'llcover:

Whentousepublishsubscribe
Howtohandletooslowsubscribers(theSuicidalSnailpattern)
Howtodesignhighspeedsubscribers(theBlackBoxpattern)
Howtomonitorapubsubnetwork(theEspressopattern)
Howtobuildasharedkeyvaluestore(theClonepattern)
Howtousereactorstosimplifycomplexservers
HowtousetheBinaryStarpatterntoaddfailovertoaserver

ProsandConsofPubSub topprevnext

ZeroMQ'slowlevelpatternshavetheirdifferentcharacters.Pubsubaddressesanoldmessagingproblem,whichismulticastor
groupmessaging.IthasthatuniquemixofmeticuloussimplicityandbrutalindifferencethatcharacterizesZeroMQ.It'sworth
understandingthetradeoffsthatpubsubmakes,howthesebenefitus,andhowwecanworkaroundthemifneeded.

First,PUBsendseachmessageto"allofmany",whereasPUSHandDEALERrotatemessagesto"oneofmany".Youcannot
simplyreplacePUSHwithPUBorviceversaandhopethatthingswillwork.Thisbearsrepeatingbecausepeopleseemtoquite
oftensuggestdoingthis.

Moreprofoundly,pubsubisaimedatscalability.Thismeanslargevolumesofdata,sentrapidlytomanyrecipients.Ifyouneed
millionsofmessagespersecondsenttothousandsofpoints,you'llappreciatepubsubalotmorethanifyouneedafew
messagesasecondsenttoahandfulofrecipients.

Togetscalability,pubsubusesthesametrickaspushpull,whichistogetridofbackchatter.Thismeansthatrecipientsdon't
talkbacktosenders.Therearesomeexceptions,e.g.,SUBsocketswillsendsubscriptionstoPUBsockets,butit'sanonymous
andinfrequent.

Killingbackchatterisessentialtorealscalability.Withpubsub,it'showthepatterncanmapcleanlytothePGMmulticast
protocol,whichishandledbythenetworkswitch.Inotherwords,subscribersdon'tconnecttothepublisheratall,theyconnectto
amulticastgroupontheswitch,towhichthepublishersendsitsmessages.

Whenweremovebackchatter,ouroverallmessageflowbecomesmuchsimpler,whichletsusmakesimplerAPIs,simpler
protocols,andingeneralreachmanymorepeople.Butwealsoremoveanypossibilitytocoordinatesendersandreceivers.What
thismeansis:

Publisherscan'ttellwhensubscribersaresuccessfullyconnected,bothoninitialconnections,andonreconnectionsafter
networkfailures.

Subscriberscan'ttellpublishersanythingthatwouldallowpublisherstocontroltherateofmessagestheysend.Publishers
onlyhaveonesetting,whichisfullspeed,andsubscribersmusteitherkeepuporlosemessages.

Publisherscan'ttellwhensubscribershavedisappearedduetoprocessescrashing,networksbreaking,andsoon.

Thedownsideisthatweactuallyneedalloftheseifwewanttodoreliablemulticast.TheZeroMQpubsubpatternwilllose
messagesarbitrarilywhenasubscriberisconnecting,whenanetworkfailureoccurs,orjustifthesubscriberornetworkcan't
keepupwiththepublisher.

Theupsideisthattherearemanyusecaseswherealmostreliablemulticastisjustfine.Whenweneedthisbackchatter,wecan
eitherswitchtousingROUTERDEALER(whichItendtodoformostnormalvolumecases),orwecanaddaseparatechannel
forsynchronization(we'llseeanexampleofthislaterinthischapter).

Pubsubislikearadiobroadcastyoumisseverythingbeforeyoujoin,andthenhowmuchinformationyougetdependsonthe

http://zguide.zeromq.org/page:all 104/225
12/31/2015 MQ - The Guide - MQ - The Guide
qualityofyourreception.Surprisingly,thismodelisusefulandwidespreadbecauseitmapsperfectlytorealworlddistributionof
information.ThinkofFacebookandTwitter,theBBCWorldService,andthesportsresults.

Aswedidforrequestreply,let'sdefinereliabilityintermsofwhatcangowrong.Herearetheclassicfailurecasesforpubsub:

Subscribersjoinlate,sotheymissmessagestheserveralreadysent.
Subscriberscanfetchmessagestooslowly,soqueuesbuildupandthenoverflow.
Subscriberscandropoffandlosemessageswhiletheyareaway.
Subscriberscancrashandrestart,andlosewhateverdatatheyalreadyreceived.
Networkscanbecomeoverloadedanddropdata(specifically,forPGM).
Networkscanbecometooslow,sopublishersidequeuesoverflowandpublisherscrash.

Alotmorecangowrongbutthesearethetypicalfailuresweseeinarealisticsystem.Sincev3.x,ZeroMQforcesdefaultlimits
onitsinternalbuffers(thesocalledhighwatermarkorHWM),sopublishercrashesarerarerunlessyoudeliberatelysetthe
HWMtoinfinite.

Allofthesefailurecaseshaveanswers,thoughnotalwayssimpleones.Reliabilityrequirescomplexitythatmostofusdon'tneed,
mostofthetime,whichiswhyZeroMQdoesn'tattempttoprovideitoutofthebox(eveniftherewasoneglobaldesignfor
reliability,whichthereisn't).

PubSubTracing(EspressoPattern) topprevnext

Let'sstartthischapterbylookingatawaytotracepubsubnetworks.InChapter2SocketsandPatternswesawasimpleproxy
thatusedthesetodotransportbridging.Thezmq_proxy()methodhasthreearguments:afrontendandbackendsocketthatit
bridgestogether,andacapturesockettowhichitwillsendallmessages.

Thecodeisdeceptivelysimple:

espresso:EspressoPatterninC

C#|Java|Python|Ada|Basic|C++|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|Q|
Racket|Ruby|Scala|Tcl

EspressoworksbycreatingalistenerthreadthatreadsaPAIRsocketandprintsanythingitgets.ThatPAIRsocketisoneendof
apipetheotherend(anotherPAIR)isthesocketwepasstozmq_proxy().Inpractice,you'dfilterinterestingmessagestoget
theessenceofwhatyouwanttotrack(hencethenameofthepattern).

Thesubscriberthreadsubscribesto"A"and"B",receivesfivemessages,andthendestroysitssocket.Whenyourunthe
example,thelistenerprintstwosubscriptionmessages,fivedatamessages,twounsubscribemessages,andthensilence:

[002]0141
[002]0142
[007]B91164
[007]B12979
[007]A52599
[007]A06417
[007]A45770
[002]0041
[002]0042

Thisshowsneatlyhowthepublishersocketstopssendingdatawhentherearenosubscribersforit.Thepublisherthreadisstill
sendingmessages.Thesocketjustdropsthemsilently.

LastValueCaching topprevnext

Ifyou'veusedcommercialpubsubsystems,youmaybeusedtosomefeaturesthataremissinginthefastandcheerfulZeroMQ
pubsubmodel.Oneoftheseislastvaluecaching(LVC).Thissolvestheproblemofhowanewsubscribercatchesupwhenit
joinsthenetwork.Thetheoryisthatpublishersgetnotifiedwhenanewsubscriberjoinsandsubscribestosomespecifictopics.
http://zguide.zeromq.org/page:all 105/225
12/31/2015 MQ - The Guide - MQ - The Guide
Thepublishercanthenrebroadcastthelastmessageforthosetopics.

I'vealreadyexplainedwhypublishersdon'tgetnotifiedwhentherearenewsubscribers,becauseinlargepubsubsystems,the
volumesofdatamakeitprettymuchimpossible.Tomakereallylargescalepubsubnetworks,youneedaprotocollikePGMthat
exploitsanupscaleEthernetswitch'sabilitytomulticastdatatothousandsofsubscribers.TryingtodoaTCPunicastfromthe
publishertoeachofthousandsofsubscribersjustdoesn'tscale.Yougetweirdspikes,unfairdistribution(somesubscribers
gettingthemessagebeforeothers),networkcongestion,andgeneralunhappiness.

PGMisaonewayprotocol:thepublishersendsamessagetoamulticastaddressattheswitch,whichthenrebroadcaststhatto
allinterestedsubscribers.Thepublisherneverseeswhensubscribersjoinorleave:thisallhappensintheswitch,whichwedon't
reallywanttostartreprogramming.

However,inalowervolumenetworkwithafewdozensubscribersandalimitednumberoftopics,wecanuseTCPandthenthe
XSUBandXPUBsocketsdotalktoeachotheraswejustsawintheEspressopattern.

CanwemakeanLVCusingZeroMQ?Theanswerisyes,ifwemakeaproxythatsitsbetweenthepublisherandsubscribersan
analogforthePGMswitch,butonewecanprogramourselves.

I'llstartbymakingapublisherandsubscriberthathighlighttheworstcasescenario.Thispublisherispathological.Itstartsby
immediatelysendingmessagestoeachofathousandtopics,andthenitsendsoneupdateasecondtoarandomtopic.A
subscriberconnects,andsubscribestoatopic.WithoutLVC,asubscriberwouldhavetowaitanaverageof500secondstoget
anydata.Toaddsomedrama,let'spretendthere'sanescapedconvictcalledGregorthreateningtoriptheheadoffRogerthe
toybunnyifwecan'tfixthat8.3minutes'delay.

Here'sthepublishercode.Notethatithasthecommandlineoptiontoconnecttosomeaddress,butotherwisebindstoan
endpoint.We'llusethislatertoconnecttoourlastvaluecache:

pathopub:PathologicPublisherinC

C#|Java|Python|Ruby|Ada|Basic|C++|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|
Q|Racket|Scala|Tcl

Andhere'sthesubscriber:

pathosub:PathologicSubscriberinC

C#|Java|Python|Ruby|Ada|Basic|C++|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|
Q|Racket|Scala|Tcl

Trybuildingandrunningthese:firstthesubscriber,thenthepublisher.You'llseethesubscriberreportsgetting"SaveRoger"as
you'dexpect:

./pathosub&
./pathopub

It'swhenyourunasecondsubscriberthatyouunderstandRoger'spredicament.Youhavetoleaveitanawfullongtimebeforeit
reportsgettinganydata.So,here'sourlastvaluecache.AsIpromised,it'saproxythatbindstotwosocketsandthenhandles
messagesonboth:

lvcache:LastValueCachingProxyinC

C#|Java|Python|Ruby|Ada|Basic|C++|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|
Q|Racket|Scala|Tcl

Now,runtheproxy,andthenthepublisher:

./lvcache&
./pathopubtcp://localhost:5557

Andnowrunasmanyinstancesofthesubscriberasyouwanttotry,eachtimeconnectingtotheproxyonport5558:

./pathosubtcp://localhost:5558

http://zguide.zeromq.org/page:all 106/225
12/31/2015 MQ - The Guide - MQ - The Guide
Eachsubscriberhappilyreports"SaveRoger",andGregortheEscapedConvictslinksbacktohisseatfordinnerandanicecup
ofhotmilk,whichisallhereallywantedinthefirstplace.

Onenote:bydefault,theXPUBsocketdoesnotreportduplicatesubscriptions,whichiswhatyouwantwhenyou'renaively
connectinganXPUBtoanXSUB.Ourexamplesneakilygetsaroundthisbyusingrandomtopicssothechanceofitnotworking
isoneinamillion.InarealLVCproxy,you'llwanttousetheZMQ_XPUB_VERBOSEoptionthatweimplementinChapter6The
ZeroMQCommunityasanexercise.

SlowSubscriberDetection(SuicidalSnailPattern) topprevnext

Acommonproblemyouwillhitwhenusingthepubsubpatterninreallifeistheslowsubscriber.Inanidealworld,westream
dataatfullspeedfrompublisherstosubscribers.Inreality,subscriberapplicationsareoftenwrittenininterpretedlanguages,or
justdoalotofwork,orarejustbadlywritten,totheextentthattheycan'tkeepupwithpublishers.

Howdowehandleaslowsubscriber?Theidealfixistomakethesubscriberfaster,butthatmighttakeworkandtime.Someof
theclassicstrategiesforhandlingaslowsubscriberare:

Queuemessagesonthepublisher.ThisiswhatGmaildoeswhenIdon'treadmyemailforacoupleofhours.Butin
highvolumemessaging,pushingqueuesupstreamhasthethrillingbutunprofitableresultofmakingpublishersrunoutof
memoryandcrashespeciallyiftherearelotsofsubscribersandit'snotpossibletoflushtodiskforperformancereasons.

Queuemessagesonthesubscriber.Thisismuchbetter,andit'swhatZeroMQdoesbydefaultifthenetworkcankeep
upwiththings.Ifanyone'sgoingtorunoutofmemoryandcrash,it'llbethesubscriberratherthanthepublisher,whichis
fair.Thisisperfectfor"peaky"streamswhereasubscribercan'tkeepupforawhile,butcancatchupwhenthestream
slowsdown.However,it'snoanswertoasubscriberthat'ssimplytooslowingeneral.

Stopqueuingnewmessagesafterawhile.ThisiswhatGmaildoeswhenmymailboxoverflowsitspreciousgigabytes
ofspace.Newmessagesjustgetrejectedordropped.Thisisagreatstrategyfromtheperspectiveofthepublisher,and
it'swhatZeroMQdoeswhenthepublishersetsaHWM.However,itstilldoesn'thelpusfixtheslowsubscriber.Nowwe
justgetgapsinourmessagestream.

Punishslowsubscriberswithdisconnect.ThisiswhatHotmail(rememberthat?)didwhenIdidn'tloginfortwoweeks,
whichiswhyIwasonmyfifteenthHotmailaccountwhenithitmethattherewasperhapsabetterway.It'sanicebrutal
strategythatforcessubscriberstositupandpayattentionandwouldbeideal,butZeroMQdoesn'tdothis,andthere'sno
waytolayeritontopbecausesubscribersareinvisibletopublisherapplications.

Noneoftheseclassicstrategiesfit,soweneedtogetcreative.Ratherthandisconnectthepublisher,let'sconvincethe
subscribertokillitself.ThisistheSuicidalSnailpattern.Whenasubscriberdetectsthatit'srunningtooslowly(where"tooslowly"
ispresumablyaconfiguredoptionthatreallymeans"soslowlythatifyouevergethere,shoutreallyloudlybecauseIneedto
know,soIcanfixthis!"),itcroaksanddies.

Howcanasubscriberdetectthis?Onewaywouldbetosequencemessages(numbertheminorder)anduseaHWMatthe
publisher.Now,ifthesubscriberdetectsagap(i.e.,thenumberingisn'tconsecutive),itknowssomethingiswrong.Wethentune
theHWMtothe"croakanddieifyouhitthis"level.

Therearetwoproblemswiththissolution.One,ifwehavemanypublishers,howdowesequencemessages?Thesolutionisto
giveeachpublisherauniqueIDandaddthattothesequencing.Second,ifsubscribersuseZMQ_SUBSCRIBEfilters,theywillget
gapsbydefinition.Ourprecioussequencingwillbefornothing.

Someusecaseswon'tusefilters,andsequencingwillworkforthem.Butamoregeneralsolutionisthatthepublishertimestamps
eachmessage.Whenasubscribergetsamessage,itchecksthetime,andifthedifferenceismorethan,say,onesecond,it
doesthe"croakanddie"thing,possiblyfiringoffasquawktosomeoperatorconsolefirst.

TheSuicideSnailpatternworksespeciallywhensubscribershavetheirownclientsandservicelevelagreementsandneedto
guaranteecertainmaximumlatencies.Abortingasubscribermaynotseemlikeaconstructivewaytoguaranteeamaximum
latency,butit'stheassertionmodel.Aborttoday,andtheproblemwillbefixed.Allowlatedatatoflowdownstream,andthe
problemmaycausewiderdamageandtakelongertoappearontheradar.

HereisaminimalexampleofaSuicidalSnail:

suisnail:SuicidalSnailinC

C++|C#|Java|Lua|PHP|Python|Tcl|Ada|Basic|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Node.js|ObjectiveC|ooc|Perl|
Q|Racket|Ruby|Scala

http://zguide.zeromq.org/page:all 107/225
12/31/2015 MQ - The Guide - MQ - The Guide
HerearesomethingstonoteabouttheSuicidalSnailexample:

Themessagehereconsistssimplyofthecurrentsystemclockasanumberofmilliseconds.Inarealisticapplication,you'd
haveatleastamessageheaderwiththetimestampandamessagebodywithdata.

Theexamplehassubscriberandpublisherinasingleprocessastwothreads.Inreality,theywouldbeseparate
processes.Usingthreadsisjustconvenientforthedemonstration.

HighSpeedSubscribers(BlackBoxPattern) topprevnext

Nowletslookatonewaytomakeoursubscribersfaster.Acommonusecaseforpubsubisdistributinglargedatastreamslike
marketdatacomingfromstockexchanges.Atypicalsetupwouldhaveapublisherconnectedtoastockexchange,takingprice
quotes,andsendingthemouttoanumberofsubscribers.Ifthereareahandfulofsubscribers,wecoulduseTCP.Ifwehavea
largernumberofsubscribers,we'dprobablyusereliablemulticast,i.e.,PGM.

Figure56TheSimpleBlackBoxPattern

Let'simagineourfeedhasanaverageof100,000100bytemessagesasecond.That'satypicalrate,afterfilteringmarketdata
wedon'tneedtosendontosubscribers.Nowwedecidetorecordaday'sdata(maybe250GBin8hours),andthenreplayitto
asimulationnetwork,i.e.,asmallgroupofsubscribers.While100KmessagesasecondiseasyforaZeroMQapplication,we
wanttoreplayitmuchfaster.

Sowesetupourarchitecturewithabunchofboxesoneforthepublisherandoneforeachsubscriber.Thesearewellspecified
boxeseightcores,twelveforthepublisher.

Andaswepumpdataintooursubscribers,wenoticetwothings:

1. Whenwedoeventheslightestamountofworkwithamessage,itslowsdownoursubscribertothepointwhereitcan't
catchupwiththepublisheragain.

1. We'rehittingaceiling,atbothpublisherandsubscriber,toaround6Mmessagesasecond,evenaftercarefuloptimization
andTCPtuning.
http://zguide.zeromq.org/page:all 108/225
12/31/2015 MQ - The Guide - MQ - The Guide
Thefirstthingwehavetodoisbreakoursubscriberintoamultithreadeddesignsothatwecandoworkwithmessagesinoneset
ofthreads,whilereadingmessagesinanother.Typically,wedon'twanttoprocesseverymessagethesameway.Rather,the
subscriberwillfiltersomemessages,perhapsbyprefixkey.Whenamessagematchessomecriteria,thesubscriberwillcalla
workertodealwithit.InZeroMQterms,thismeanssendingthemessagetoaworkerthread.

Sothesubscriberlookssomethinglikeaqueuedevice.Wecouldusevarioussocketstoconnectthesubscriberandworkers.If
weassumeonewaytrafficandworkersthatareallidentical,wecanusePUSHandPULLanddelegatealltheroutingworkto
ZeroMQ.Thisisthesimplestandfastestapproach.

ThesubscribertalkstothepublisheroverTCPorPGM.Thesubscribertalkstoitsworkers,whichareallinthesameprocess,
overinproc://.

Figure57MadBlackBoxPattern

Nowtobreakthatceiling.Thesubscriberthreadhits100%ofCPUandbecauseitisonethread,itcannotusemorethanone
core.Asinglethreadwillalwayshitaceiling,beitat2M,6M,ormoremessagespersecond.Wewanttosplittheworkacross
multiplethreadsthatcanruninparallel.

Theapproachusedbymanyhighperformanceproducts,whichworkshere,issharding.Usingsharding,wesplittheworkinto
parallelandindependentstreams,suchashalfofthetopickeysinonestream,andhalfinanother.Wecouldusemanystreams,
butperformancewon'tscaleunlesswehavefreecores.Solet'sseehowtoshardintotwostreams.

Withtwostreams,workingatfullspeed,wewouldconfigureZeroMQasfollows:

TwoI/Othreads,ratherthanone.
Twonetworkinterfaces(NIC),onepersubscriber.
EachI/OthreadboundtoaspecificNIC.
Twosubscriberthreads,boundtospecificcores.
TwoSUBsockets,onepersubscriberthread.
Theremainingcoresassignedtoworkerthreads.
WorkerthreadsconnectedtobothsubscriberPUSHsockets.

Ideally,wewanttomatchthenumberoffullyloadedthreadsinourarchitecturewiththenumberofcores.Whenthreadsstartto
fightforcoresandCPUcycles,thecostofaddingmorethreadsoutweighsthebenefits.Therewouldbenobenefit,forexample,
increatingmoreI/Othreads.

http://zguide.zeromq.org/page:all 109/225
12/31/2015 MQ - The Guide - MQ - The Guide

ReliablePubSub(ClonePattern) topprevnext

Asalargerworkedexample,we'lltaketheproblemofmakingareliablepubsubarchitecture.We'lldevelopthisinstages.The
goalistoallowasetofapplicationstosharesomecommonstate.Hereareourtechnicalchallenges:

Wehavealargesetofclientapplications,saythousandsortensofthousands.
Theywilljoinandleavethenetworkarbitrarily.
Theseapplicationsmustshareasingleeventuallyconsistentstate.
Anyapplicationcanupdatethestateatanypointintime.

Let'ssaythatupdatesarereasonablylowvolume.Wedon'thaverealtimegoals.Thewholestatecanfitintomemory.Some
plausibleusecasesare:

Aconfigurationthatissharedbyagroupofcloudservers.
Somegamestatesharedbyagroupofplayers.
Exchangeratedatathatisupdatedinrealtimeandavailabletoapplications.

CentralizedVersusDecentralized topprevnext

Afirstdecisionwehavetomakeiswhetherweworkwithacentralserverornot.Itmakesabigdifferenceintheresultingdesign.
Thetradeoffsarethese:

Conceptually,acentralserverissimplertounderstandbecausenetworksarenotnaturallysymmetrical.Withacentral
server,weavoidallquestionsofdiscovery,bindversusconnect,andsoon.

Generally,afullydistributedarchitectureistechnicallymorechallengingbutendsupwithsimplerprotocols.Thatis,each
nodemustactasserverandclientintherightway,whichisdelicate.Whendoneright,theresultsaresimplerthanusinga
centralserver.WesawthisintheFreelancepatterninChapter4ReliableRequestReplyPatterns.

Acentralserverwillbecomeabottleneckinhighvolumeusecases.Ifhandlingscaleintheorderofmillionsofmessages
asecondisrequired,weshouldaimfordecentralizationrightaway.

Ironically,acentralizedarchitecturewillscaletomorenodesmoreeasilythanadecentralizedone.Thatis,it'seasierto
connect10,000nodestooneserverthantoeachother.

So,fortheClonepatternwe'llworkwithaserverthatpublishesstateupdatesandasetofclientsthatrepresentapplications.

RepresentingStateasKeyValuePairs topprevnext

We'lldevelopCloneinstages,solvingoneproblematatime.First,let'slookathowtoupdateasharedstateacrossasetof
clients.Weneedtodecidehowtorepresentourstate,aswellastheupdates.Thesimplestplausibleformatisakeyvaluestore,
whereonekeyvaluepairrepresentsanatomicunitofchangeinthesharedstate.

WehaveasimplepubsubexampleinChapter1Basics,theweatherserverandclient.Let'schangetheservertosendkey
valuepairs,andtheclienttostoretheseinahashtable.Thisletsussendupdatesfromoneservertoasetofclientsusingthe
classicpubsubmodel.

Anupdateiseitheranewkeyvaluepair,amodifiedvalueforanexistingkey,oradeletedkey.Wecanassumefornowthatthe
wholestorefitsinmemoryandthatapplicationsaccessitbykey,suchasbyusingahashtableordictionary.Forlargerstores
andsomekindofpersistencewe'dprobablystorethestateinadatabase,butthat'snotrelevanthere.

Thisistheserver:

clonesrv1:Cloneserver,ModelOneinC

Java|Python|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|

http://zguide.zeromq.org/page:all 110/225
12/31/2015 MQ - The Guide - MQ - The Guide
Q|Racket|Ruby|Scala

Andhereistheclient:

clonecli1:Cloneclient,ModelOneinC

Java|Python|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|
Q|Racket|Ruby|Scala

Figure58PublishingStateUpdates

Herearesomethingstonoteaboutthisfirstmodel:

Allthehardworkisdoneinakvmsgclass.Thisclassworkswithkeyvaluemessageobjects,whicharemultipartZeroMQ
messagesstructuredasthreeframes:akey(aZeroMQstring),asequencenumber(64bitvalue,innetworkbyteorder),
andabinarybody(holdseverythingelse).

Theservergeneratesmessageswitharandomized4digitkey,whichletsussimulatealargebutnotenormoushashtable
(10Kentries).

Wedon'timplementdeletionsinthisversion:allmessagesareinsertsorupdates.

Theserverdoesa200millisecondpauseafterbindingitssocket.Thisistopreventslowjoinersyndrome,wherethe
subscriberlosesmessagesasitconnectstotheserver'ssocket.We'llremovethatinlaterversionsoftheClonecode.

We'llusethetermspublisherandsubscriberinthecodetorefertosockets.Thiswillhelplaterwhenwehavemultiple
socketsdoingdifferentthings.

Hereisthekvmsgclass,inthesimplestformthatworksfornow:

kvsimple:KeyvaluemessageclassinC

Java|Python|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|
Q|Racket|Ruby|Scala

Later,we'llmakeamoresophisticatedkvmsgclassthatwillworkinrealapplications.

Boththeserverandclientmaintainhashtables,butthisfirstmodelonlyworksproperlyifwestartallclientsbeforetheserverand
theclientsnevercrash.That'sveryartificial.

GettinganOutofBandSnapshot topprevnext

Sonowwehaveoursecondproblem:howtodealwithlatejoiningclientsorclientsthatcrashandthenrestart.

http://zguide.zeromq.org/page:all 111/225
12/31/2015 MQ - The Guide - MQ - The Guide
Inordertoallowalate(orrecovering)clienttocatchupwithaserver,ithastogetasnapshotoftheserver'sstate.Justaswe've
reduced"message"tomean"asequencedkeyvaluepair",wecanreduce"state"tomean"ahashtable".Togettheserverstate,
aclientopensaDEALERsocketandasksforitexplicitly.

Tomakethiswork,wehavetosolveaproblemoftiming.Gettingastatesnapshotwilltakeacertaintime,possiblyfairlylongif
thesnapshotislarge.Weneedtocorrectlyapplyupdatestothesnapshot.Buttheserverwon'tknowwhentostartsendingus
updates.Onewaywouldbetostartsubscribing,getafirstupdate,andthenaskfor"stateforupdateN".Thiswouldrequirethe
serverstoringonesnapshotforeachupdate,whichisn'tpractical.

Figure59StateReplication

Sowewilldothesynchronizationintheclient,asfollows:

Theclientfirstsubscribestoupdatesandthenmakesastaterequest.Thisguaranteesthatthestateisgoingtobenewer
thantheoldestupdateithas.

Theclientwaitsfortheservertoreplywithstate,andmeanwhilequeuesallupdates.Itdoesthissimplybynotreading
them:ZeroMQkeepsthemqueuedonthesocketqueue.

Whentheclientreceivesitsstateupdate,itbeginsonceagaintoreadupdates.However,itdiscardsanyupdatesthatare
olderthanthestateupdate.Soifthestateupdateincludesupdatesupto200,theclientwilldiscardupdatesupto201.

Theclientthenappliesupdatestoitsownstatesnapshot.

It'sasimplemodelthatexploitsZeroMQ'sowninternalqueues.Here'stheserver:

clonesrv2:Cloneserver,ModelTwoinC

Java|Python|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|
Q|Racket|Ruby|Scala

Andhereistheclient:

clonecli2:Cloneclient,ModelTwoinC

Java|Python|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|
Q|Racket|Ruby|Scala

Herearesomethingstonoteaboutthesetwoprograms:

Theserverusestwotasks.Onethreadproducestheupdates(randomly)andsendsthesetothemainPUBsocket,while
theotherthreadhandlesstaterequestsontheROUTERsocket.ThetwocommunicateacrossPAIRsocketsoveran
inproc://connection.

Theclientisreallysimple.InC,itconsistsofaboutfiftylinesofcode.Alotoftheheavyliftingisdoneinthekvmsgclass.

http://zguide.zeromq.org/page:all 112/225
12/31/2015 MQ - The Guide - MQ - The Guide
Evenso,thebasicClonepatterniseasiertoimplementthanitseemedatfirst.

Wedon'tuseanythingfancyforserializingthestate.Thehashtableholdsasetofkvmsgobjects,andtheserversends
these,asabatchofmessages,totheclientrequestingstate.Ifmultipleclientsrequeststateatonce,eachwillgeta
differentsnapshot.

Weassumethattheclienthasexactlyoneservertotalkto.Theservermustberunningwedonottrytosolvethe
questionofwhathappensiftheservercrashes.

Rightnow,thesetwoprogramsdon'tdoanythingreal,buttheycorrectlysynchronizestate.It'saneatexampleofhowtomix
differentpatterns:PAIRPAIR,PUBSUB,andROUTERDEALER.

RepublishingUpdatesfromClients topprevnext

Inoursecondmodel,changestothekeyvaluestorecamefromtheserveritself.Thisisacentralizedmodelthatisuseful,for
exampleifwehaveacentralconfigurationfilewewanttodistribute,withlocalcachingoneachnode.Amoreinterestingmodel
takesupdatesfromclients,nottheserver.Theserverthusbecomesastatelessbroker.Thisgivesussomebenefits:

We'relessworriedaboutthereliabilityoftheserver.Ifitcrashes,wecanstartanewinstanceandfeeditnewvalues.

Wecanusethekeyvaluestoretoshareknowledgebetweenactivepeers.

Tosendupdatesfromclientsbacktotheserver,wecoulduseavarietyofsocketpatterns.Thesimplestplausiblesolutionisa
PUSHPULLcombination.

Whydon'tweallowclientstopublishupdatesdirectlytoeachother?Whilethiswouldreducelatency,itwouldremovethe
guaranteeofconsistency.Youcan'tgetconsistentsharedstateifyouallowtheorderofupdatestochangedependingonwho
receivesthem.Saywehavetwoclients,changingdifferentkeys.Thiswillworkfine.Butifthetwoclientstrytochangethesame
keyatroughlythesametime,they'llendupwithdifferentnotionsofitsvalue.

Thereareafewstrategiesforobtainingconsistencywhenchangeshappeninmultipleplacesatonce.We'llusetheapproachof
centralizingallchange.Nomattertheprecisetimingofthechangesthatclientsmake,theyareallpushedthroughtheserver,
whichenforcesasinglesequenceaccordingtotheorderinwhichitgetsupdates.

Figure60RepublishingUpdates

Bymediatingallchanges,theservercanalsoaddauniquesequencenumbertoallupdates.Withuniquesequencing,clientscan
detectthenastierfailures,includingnetworkcongestionandqueueoverflow.Ifaclientdiscoversthatitsincomingmessage

http://zguide.zeromq.org/page:all 113/225
12/31/2015 MQ - The Guide - MQ - The Guide
streamhasahole,itcantakeaction.Itseemssensiblethattheclientcontacttheserverandaskforthemissingmessages,butin
practicethatisn'tuseful.Ifthereareholes,they'recausedbynetworkstress,andaddingmorestresstothenetworkwillmake
thingsworse.Alltheclientcandoiswarnitsusersthatitis"unabletocontinue",stop,andnotrestartuntilsomeonehas
manuallycheckedthecauseoftheproblem.

We'llnowgeneratestateupdatesintheclient.Here'stheserver:

clonesrv3:Cloneserver,ModelThreeinC

Java|Python|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|
Q|Racket|Ruby|Scala

Andhereistheclient:

clonecli3:Cloneclient,ModelThreeinC

Java|Python|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|
Q|Racket|Ruby|Scala

Herearesomethingstonoteaboutthisthirddesign:

Theserverhascollapsedtoasingletask.ItmanagesaPULLsocketforincomingupdates,aROUTERsocketforstate
requests,andaPUBsocketforoutgoingupdates.

Theclientusesasimpleticklesstimertosendarandomupdatetotheserveronceasecond.Inarealimplementation,we
woulddriveupdatesfromapplicationcode.

WorkingwithSubtrees topprevnext

Aswegrowthenumberofclients,thesizeofoursharedstorewillalsogrow.Itstopsbeingreasonabletosendeverythingto
everyclient.Thisistheclassicstorywithpubsub:whenyouhaveaverysmallnumberofclients,youcansendeverymessage
toallclients.Asyougrowthearchitecture,thisbecomesinefficient.Clientsspecializeindifferentareas.

Soevenwhenworkingwithasharedstore,someclientswillwanttoworkonlywithapartofthatstore,whichwecallasubtree.
Theclienthastorequestthesubtreewhenitmakesastaterequest,anditmustspecifythesamesubtreewhenitsubscribesto
updates.

Thereareacoupleofcommonsyntaxesfortrees.Oneisthepathhierarchy,andanotheristhetopictree.Theselooklikethis:

Pathhierarchy:/some/list/of/paths
Topictree:some.list.of.topics

We'llusethepathhierarchy,andextendourclientandserversothataclientcanworkwithasinglesubtree.Onceyouseehow
toworkwithasinglesubtreeyou'llbeabletoextendthisyourselftohandlemultiplesubtrees,ifyourusecasedemandsit.

Here'stheserverimplementingsubtrees,asmallvariationonModelThree:

clonesrv4:Cloneserver,ModelFourinC

Java|Python|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|
Q|Racket|Ruby|Scala

Andhereisthecorrespondingclient:

clonecli4:Cloneclient,ModelFourinC

Java|Python|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|
Q|Racket|Ruby|Scala

EphemeralValues topprevnext

http://zguide.zeromq.org/page:all 114/225
12/31/2015 MQ - The Guide - MQ - The Guide
Anephemeralvalueisonethatexpiresautomaticallyunlessregularlyrefreshed.IfyouthinkofClonebeingusedforaregistration
service,thenephemeralvalueswouldletyoudodynamicvalues.Anodejoinsthenetwork,publishesitsaddress,andrefreshes
thisregularly.Ifthenodedies,itsaddresseventuallygetsremoved.

Theusualabstractionforephemeralvaluesistoattachthemtoasession,anddeletethemwhenthesessionends.InClone,
sessionswouldbedefinedbyclients,andwouldendiftheclientdied.Asimpleralternativeistoattachatimetolive(TTL)to
ephemeralvalues,whichtheserverusestoexpirevaluesthathaven'tbeenrefreshedintime.

AgooddesignprinciplethatIusewheneverpossibleistonotinventconceptsthatarenotabsolutelyessential.Ifwehavevery
largenumbersofephemeralvalues,sessionswillofferbetterperformance.Ifweuseahandfulofephemeralvalues,it'sfineto
setaTTLoneachone.Ifweusemassesofephemeralvalues,it'smoreefficienttoattachthemtosessionsandexpirethemin
bulk.Thisisn'taproblemwefaceatthisstage,andmayneverface,sosessionsgooutthewindow.

Nowwewillimplementephemeralvalues.First,weneedawaytoencodetheTTLinthekeyvaluemessage.Wecouldadda
frame.TheproblemwithusingZeroMQframesforpropertiesisthateachtimewewanttoaddanewproperty,wehaveto
changethemessagestructure.Itbreakscompatibility.Solet'saddapropertiesframetothemessage,andwritethecodetolet
usgetandputpropertyvalues.

Next,weneedawaytosay,"deletethisvalue".Upuntilnow,serversandclientshavealwaysblindlyinsertedorupdatednew
valuesintotheirhashtable.We'llsaythatifthevalueisempty,thatmeans"deletethiskey".

Here'samorecompleteversionofthekvmsgclass,whichimplementsthepropertiesframe(andaddsaUUIDframe,whichwe'll
needlateron).Italsohandlesemptyvaluesbydeletingthekeyfromthehash,ifnecessary:

kvmsg:Keyvaluemessageclass:fullinC

Java|Python|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|
Q|Racket|Ruby|Scala

TheModelFiveclientisalmostidenticaltoModelFour.Itusesthefullkvmsgclassnow,andsetsarandomizedttlproperty
(measuredinseconds)oneachmessage:

kvmsg_set_prop(kvmsg,"ttl","%d",randof(30))

UsingaReactor topprevnext

Untilnow,wehaveusedapollloopintheserver.Inthisnextmodeloftheserver,weswitchtousingareactor.InC,weuse
CZMQ'szloopclass.Usingareactormakesthecodemoreverbose,buteasiertounderstandandbuildoutbecauseeachpiece
oftheserverishandledbyaseparatereactorhandler.

Weuseasinglethreadandpassaserverobjectaroundtothereactorhandlers.Wecouldhaveorganizedtheserverasmultiple
threads,eachhandlingonesocketortimer,butthatworksbetterwhenthreadsdon'thavetosharedata.Inthiscaseallworkis
centeredaroundtheserver'shashmap,soonethreadissimpler.

Therearethreereactorhandlers:

OnetohandlesnapshotrequestscomingontheROUTERsocket
Onetohandleincomingupdatesfromclients,comingonthePULLsocket
OnetoexpireephemeralvaluesthathavepassedtheirTTL.

clonesrv5:Cloneserver,ModelFiveinC

Java|Python|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|
Q|Racket|Ruby|Scala

AddingtheBinaryStarPatternforReliability topprevnext

TheClonemodelswe'veexploreduptonowhavebeenrelativelysimple.Nowwe'regoingtogetintounpleasantlycomplex

http://zguide.zeromq.org/page:all 115/225
12/31/2015 MQ - The Guide - MQ - The Guide
territory,whichhasmegettingupforanotherespresso.Youshouldappreciatethatmaking"reliable"messagingiscomplex
enoughthatyoualwaysneedtoask,"Doweactuallyneedthis?"beforejumpingintoit.Ifyoucangetawaywithunreliableorwith
"goodenough"reliability,youcanmakeahugewinintermsofcostandcomplexity.Sure,youmaylosesomedatanowandthen.
Itisoftenagoodtradeoff.Havingsaid,that,andsipsbecausetheespressoisreallygood,let'sjumpin.

Asyouplaywiththelastmodel,you'llstopandrestarttheserver.Itmightlooklikeitrecovers,butofcourseit'sapplyingupdates
toanemptystateinsteadofthepropercurrentstate.Anynewclientjoiningthenetworkwillonlygetthelatestupdatesinsteadof
thefullhistoricalrecord.

Whatwewantisawayfortheservertorecoverfrombeingkilled,orcrashing.Wealsoneedtoprovidebackupincasetheserver
isoutofcommissionforanylengthoftime.Whensomeoneasksfor"reliability",askthemtolistthefailurestheywanttohandle.
Inourcase,theseare:

Theserverprocesscrashesandisautomaticallyormanuallyrestarted.Theprocesslosesitsstateandhastogetitback
fromsomewhere.

Theservermachinediesandisofflineforasignificanttime.Clientshavetoswitchtoanalternateserversomewhere.

Theserverprocessormachinegetsdisconnectedfromthenetwork,e.g.,aswitchdiesoradatacentergetsknockedout.
Itmaycomebackatsomepoint,butinthemeantimeclientsneedanalternateserver.

Ourfirststepistoaddasecondserver.WecanusetheBinaryStarpatternfromChapter4ReliableRequestReplyPatternsto
organizetheseintoprimaryandbackup.BinaryStarisareactor,soit'susefulthatwealreadyrefactoredthelastservermodel
intoareactorstyle.

Weneedtoensurethatupdatesarenotlostiftheprimaryservercrashes.Thesimplesttechniqueistosendthemtoboth
servers.Thebackupservercanthenactasaclient,andkeepitsstatesynchronizedbyreceivingupdatesasallclientsdo.It'll
alsogetnewupdatesfromclients.Itcan'tyetstoretheseinitshashtable,butitcanholdontothemforawhile.

So,ModelSixintroducesthefollowingchangesoverModelFive:

Weuseapubsubflowinsteadofapushpullflowforclientupdatessenttotheservers.Thistakescareoffanningoutthe
updatestobothservers.Otherwisewe'dhavetousetwoDEALERsockets.

Weaddheartbeatstoserverupdates(toclients),sothataclientcandetectwhentheprimaryserverhasdied.Itcanthen
switchovertothebackupserver.

WeconnectthetwoserversusingtheBinaryStarbstarreactorclass.BinaryStarreliesontheclientstovotebymaking
anexplicitrequesttotheservertheyconsideractive.We'llusesnapshotrequestsasthevotingmechanism.

WemakeallupdatemessagesuniquelyidentifiablebyaddingaUUIDfield.Theclientgeneratesthis,andtheserver
propagatesitbackonrepublishedupdates.

Thepassiveserverkeepsa"pendinglist"ofupdatesthatithasreceivedfromclients,butnotyetfromtheactiveserveror
updatesit'sreceivedfromtheactiveserver,butnotyetfromtheclients.Thelistisorderedfromoldesttonewest,sothatit
iseasytoremoveupdatesoffthehead.

Figure61CloneClientFiniteStateMachine

http://zguide.zeromq.org/page:all 116/225
12/31/2015 MQ - The Guide - MQ - The Guide

It'susefultodesigntheclientlogicasafinitestatemachine.Theclientcyclesthroughthreestates:

Theclientopensandconnectsitssockets,andthenrequestsasnapshotfromthefirstserver.Toavoidrequeststorms,it
willaskanygivenserveronlytwice.Onerequestmightgetlost,whichwouldbebadluck.Twogettinglostwouldbe
carelessness.

Theclientwaitsforareply(snapshotdata)fromthecurrentserver,andifitgetsit,itstoresit.Ifthereisnoreplywithin
sometimeout,itfailsovertothenextserver.

Whentheclienthasgottenitssnapshot,itwaitsforandprocessesupdates.Again,ifitdoesn'thearanythingfromthe
serverwithinsometimeout,itfailsovertothenextserver.

Theclientloopsforever.It'squitelikelyduringstartuporfailoverthatsomeclientsmaybetryingtotalktotheprimaryserverwhile
othersaretryingtotalktothebackupserver.TheBinaryStarstatemachinehandlesthis,hopefullyaccurately.It'shardtoprove
softwarecorrectinsteadwehammerituntilwecan'tproveitwrong.

Failoverhappensasfollows:

Theclientdetectsthatprimaryserverisnolongersendingheartbeats,andconcludesthatithasdied.Theclientconnects
tothebackupserverandrequestsanewstatesnapshot.

Thebackupserverstartstoreceivesnapshotrequestsfromclients,anddetectsthatprimaryserverhasgone,soittakes
overasprimary.

Thebackupserverappliesitspendinglisttoitsownhashtable,andthenstartstoprocessstatesnapshotrequests.

Whentheprimaryservercomesbackonline,itwill:

Startupaspassiveserver,andconnecttothebackupserverasaCloneclient.

Starttoreceiveupdatesfromclients,viaitsSUBsocket.

Wemakeafewassumptions:

Atleastoneserverwillkeeprunning.Ifbothserverscrash,weloseallserverstateandthere'snowaytorecoverit.

Multipleclientsdonotupdatethesamehashkeysatthesametime.Clientupdateswillarriveatthetwoserversina
differentorder.Therefore,thebackupservermayapplyupdatesfromitspendinglistinadifferentorderthantheprimary
serverwouldordid.Updatesfromoneclientwillalwaysarriveinthesameorderonbothservers,sothatissafe.

http://zguide.zeromq.org/page:all 117/225
12/31/2015 MQ - The Guide - MQ - The Guide
ThusthearchitectureforourhighavailabilityserverpairusingtheBinaryStarpatternhastwoserversandasetofclientsthat
talktobothservers.

Figure62HighavailabilityCloneServerPair

HereisthesixthandlastmodeloftheCloneserver:

clonesrv6:Cloneserver,ModelSixinC

Java|Python|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|Q|
Racket|Ruby|Scala|Tcl

Thismodelisonlyafewhundredlinesofcode,butittookquiteawhiletogetworking.Tobeaccurate,buildingModelSixtook
aboutafullweekof"Sweetgod,thisisjusttoocomplexforanexample"hacking.We'veassembledprettymucheverythingand
thekitchensinkintothissmallapplication.Wehavefailover,ephemeralvalues,subtrees,andsoon.Whatsurprisedmewasthat
theupfrontdesignwasprettyaccurate.Stillthedetailsofwritinganddebuggingsomanysocketflowsisquitechallenging.

Thereactorbaseddesignremovesalotofthegruntworkfromthecode,andwhatremainsissimplerandeasiertounderstand.
WereusethebstarreactorfromChapter4ReliableRequestReplyPatterns.Thewholeserverrunsasonethread,sothere'sno
interthreadweirdnessgoingonjustastructurepointer(self)passedaroundtoallhandlers,whichcandotheirthinghappily.
Onenicesideeffectofusingreactorsisthatthecode,beinglesstightlyintegratedintoapollloop,ismucheasiertoreuse.Large
chunksofModelSixaretakenfromModelFive.

Ibuiltitpiecebypiece,andgoteachpieceworkingproperlybeforegoingontothenextone.Becausetherearefourorfivemain
socketflows,thatmeantquitealotofdebuggingandtesting.Idebuggedjustbydumpingmessagestotheconsole.Don'tuse
classicdebuggerstostepthroughZeroMQapplicationsyouneedtoseethemessageflowstomakeanysenseofwhatisgoing
on.

Fortesting,IalwaystrytouseValgrind,whichcatchesmemoryleaksandinvalidmemoryaccesses.InC,thisisamajorconcern,
asyoucan'tdelegatetoagarbagecollector.UsingproperandconsistentabstractionslikekvmsgandCZMQhelpsenormously.

TheClusteredHashmapProtocol topprevnext

WhiletheserverisprettymuchamashupofthepreviousmodelplustheBinaryStarpattern,theclientisquitealotmore
complex.Butbeforewegettothat,let'slookatthefinalprotocol.I'vewrittenthisupasaspecificationontheZeroMQRFC
websiteastheClusteredHashmapProtocol.

Roughly,therearetwowaystodesignacomplexprotocolsuchasthisone.Onewayistoseparateeachflowintoitsownsetof
sockets.Thisistheapproachweusedhere.Theadvantageisthateachflowissimpleandclean.Thedisadvantageisthat
managingmultiplesocketflowsatoncecanbequitecomplex.Usingareactormakesitsimpler,butstill,itmakesalotofmoving

http://zguide.zeromq.org/page:all 118/225
12/31/2015 MQ - The Guide - MQ - The Guide
piecesthathavetofittogethercorrectly.

Thesecondwaytomakesuchaprotocolistouseasinglesocketpairforeverything.Inthiscase,I'dhaveusedROUTERforthe
serverandDEALERfortheclients,andthendoneeverythingoverthatconnection.Itmakesforamorecomplexprotocolbutat
leastthecomplexityisallinoneplace.InChapter7AdvancedArchitectureusingZeroMQwe'lllookatanexampleofaprotocol
doneoveraROUTERDEALERcombination.

Let'stakealookattheCHPspecification.Notethat"SHOULD","MUST"and"MAY"arekeywordsweuseinprotocol
specificationstoindicaterequirementlevels.

Goals

CHPismeanttoprovideabasisforreliablepubsubacrossaclusterofclientsconnectedoveraZeroMQnetwork.Itdefinesa
"hashmap"abstractionconsistingofkeyvaluepairs.Anyclientcanmodifyanykeyvaluepairatanytime,andchangesare
propagatedtoallclients.Aclientcanjointhenetworkatanytime.

Architecture

CHPconnectsasetofclientapplicationsandasetofservers.Clientsconnecttotheserver.Clientsdonotseeeachother.
Clientscancomeandgoarbitrarily.

PortsandConnections

TheserverMUSTopenthreeportsasfollows:

ASNAPSHOTport(ZeroMQROUTERsocket)atportnumberP.
APUBLISHERport(ZeroMQPUBsocket)atportnumberP+1.
ACOLLECTORport(ZeroMQSUBsocket)atportnumberP+2.

TheclientSHOULDopenatleasttwoconnections:

ASNAPSHOTconnection(ZeroMQDEALERsocket)toportnumberP.
ASUBSCRIBERconnection(ZeroMQSUBsocket)toportnumberP+1.

TheclientMAYopenathirdconnection,ifitwantstoupdatethehashmap:

APUBLISHERconnection(ZeroMQPUBsocket)toportnumberP+2.

Thisextraframeisnotshowninthecommandsexplainedbelow.

StateSynchronization

TheclientMUSTstartbysendingaICANHAZcommandtoitssnapshotconnection.Thiscommandconsistsoftwoframesas
follows:

ICANHAZcommand

Frame0:"ICANHAZ?"
Frame1:subtreespecification

BothframesareZeroMQstrings.ThesubtreespecificationMAYbeempty.Ifnotempty,itconsistsofaslashfollowedbyoneor
morepathsegments,endinginaslash.

TheserverMUSTrespondtoaICANHAZcommandbysendingzeroormoreKVSYNCcommandstoitssnapshotport,followed
withaKTHXBAIcommand.TheserverMUSTprefixeachcommandwiththeidentityoftheclient,asprovidedbyZeroMQwith
theICANHAZcommand.TheKVSYNCcommandspecifiesasinglekeyvaluepairasfollows:

KVSYNCcommand

Frame0:key,asZeroMQstring
Frame1:sequencenumber,8bytesinnetworkorder
Frame2:<empty>
Frame3:<empty>
Frame4:value,asblob

http://zguide.zeromq.org/page:all 119/225
12/31/2015 MQ - The Guide - MQ - The Guide
Thesequencenumberhasnosignificanceandmaybezero.

TheKTHXBAIcommandtakesthisform:

KTHXBAIcommand

Frame0:"KTHXBAI"
Frame1:sequencenumber,8bytesinnetworkorder
Frame2:<empty>
Frame3:<empty>
Frame4:subtreespecification

ThesequencenumberMUSTbethehighestsequencenumberoftheKVSYNCcommandspreviouslysent.

WhentheclienthasreceivedaKTHXBAIcommand,itSHOULDstarttoreceivemessagesfromitssubscriberconnectionand
applythem.

ServertoClientUpdates

WhentheserverhasanupdateforitshashmapitMUSTbroadcastthisonitspublishersocketasaKVPUBcommand.The
KVPUBcommandhasthisform:

KVPUBcommand

Frame0:key,asZeroMQstring
Frame1:sequencenumber,8bytesinnetworkorder
Frame2:UUID,16bytes
Frame3:properties,asZeroMQstring
Frame4:value,asblob

ThesequencenumberMUSTbestrictlyincremental.TheclientMUSTdiscardanyKVPUBcommandswhosesequencenumbers
arenotstrictlygreaterthanthelastKTHXBAIorKVPUBcommandreceived.

TheUUIDisoptionalandframe2MAYbeempty(sizezero).Thepropertiesfieldisformattedaszeroormoreinstancesof
"name=value"followedbyanewlinecharacter.Ifthekeyvaluepairhasnoproperties,thepropertiesfieldisempty.

Ifthevalueisempty,theclientSHOULDdeleteitskeyvalueentrywiththespecifiedkey.

IntheabsenceofotherupdatestheserverSHOULDsendaHUGZcommandatregularintervals,e.g.,oncepersecond.The
HUGZcommandhasthisformat:

HUGZcommand

Frame0:"HUGZ"
Frame1:00000000
Frame2:<empty>
Frame3:<empty>
Frame4:<empty>

TheclientMAYtreattheabsenceofHUGZasanindicatorthattheserverhascrashed(seeReliabilitybelow).

ClienttoServerUpdates

Whentheclienthasanupdateforitshashmap,itMAYsendthistotheserverviaitspublisherconnectionasaKVSETcommand.
TheKVSETcommandhasthisform:

KVSETcommand

Frame0:key,asZeroMQstring
Frame1:sequencenumber,8bytesinnetworkorder
http://zguide.zeromq.org/page:all 120/225
12/31/2015 MQ - The Guide - MQ - The Guide
Frame2:UUID,16bytes
Frame3:properties,asZeroMQstring
Frame4:value,asblob

Thesequencenumberhasnosignificanceandmaybezero.TheUUIDSHOULDbeauniversallyuniqueidentifier,ifareliable
serverarchitectureisused.

Ifthevalueisempty,theserverMUSTdeleteitskeyvalueentrywiththespecifiedkey.

TheserverSHOULDacceptthefollowingproperties:

ttl:specifiesatimetoliveinseconds.IftheKVSETcommandhasattlproperty,theserverSHOULDdeletethekey
valuepairandbroadcastaKVPUBwithanemptyvalueinordertodeletethisfromallclientswhentheTTLhasexpired.

Reliability

CHPmaybeusedinadualserverconfigurationwhereabackupservertakesoveriftheprimaryserverfails.CHPdoesnot
specifythemechanismsusedforthisfailoverbuttheBinaryStarpatternmaybehelpful.

Toassistserverreliability,theclientMAY:

SetaUUIDineveryKVSETcommand.
DetectthelackofHUGZoveratimeperiodandusethisasanindicatorthatthecurrentserverhasfailed.
Connecttoabackupserverandrerequestastatesynchronization.

ScalabilityandPerformance

CHPisdesignedtobescalabletolargenumbers(thousands)ofclients,limitedonlybysystemresourcesonthebroker.Because
allupdatespassthroughasingleserver,theoverallthroughputwillbelimitedtosomemillionsofupdatespersecondatpeak,
andprobablyless.

Security

CHPdoesnotimplementanyauthentication,accesscontrol,orencryptionmechanismsandshouldnotbeusedinany
deploymentwherethesearerequired.

BuildingaMultithreadedStackandAPI topprevnext

Theclientstackwe'veusedsofarisn'tsmartenoughtohandlethisprotocolproperly.Assoonaswestartdoingheartbeats,we
needaclientstackthatcanruninabackgroundthread.IntheFreelancepatternattheendofChapter4ReliableRequest
ReplyPatternsweusedamultithreadedAPIbutdidn'texplainitindetail.ItturnsoutthatmultithreadedAPIsarequiteuseful
whenyoustarttomakemorecomplexZeroMQprotocolslikeCHP.

Figure63MultithreadedAPI

http://zguide.zeromq.org/page:all 121/225
12/31/2015 MQ - The Guide - MQ - The Guide

Ifyoumakeanontrivialprotocolandyouexpectapplicationstoimplementitproperly,mostdeveloperswillgetitwrongmostof
thetime.You'regoingtobeleftwithalotofunhappypeoplecomplainingthatyourprotocolistoocomplex,toofragile,andtoo
hardtouse.WhereasifyougivethemasimpleAPItocall,youhavesomechanceofthembuyingin.

OurmultithreadedAPIconsistsofafrontendobjectandabackgroundagent,connectedbytwoPAIRsockets.Connectingtwo
PAIRsocketslikethisissousefulthatyourhighlevelbindingshouldprobablydowhatCZMQdoes,whichispackagea"create
newthreadwithapipethatIcanusetosendmessagestoit"method.

ThemultithreadedAPIsthatweseeinthisbookalltakethesameform:

Theconstructorfortheobject(clone_new)createsacontextandstartsabackgroundthreadconnectedwithapipe.It
holdsontooneendofthepipesoitcansendcommandstothebackgroundthread.

Thebackgroundthreadstartsanagentthatisessentiallyazmq_pollloopreadingfromthepipesocketandanyother
sockets(here,theDEALERandSUBsockets).

ThemainapplicationthreadandthebackgroundthreadnowcommunicateonlyviaZeroMQmessages.Byconvention,the
frontendsendsstringcommandssothateachmethodontheclassturnsintoamessagesenttothebackendagent,like
this:

void
clone_connect(clone_t*self,char*address,char*service)
{
assert(self)
zmsg_t*msg=zmsg_new()
zmsg_addstr(msg,"CONNECT")
zmsg_addstr(msg,address)
zmsg_addstr(msg,service)
zmsg_send(&msg,self>pipe)
}

Ifthemethodneedsareturncode,itcanwaitforareplymessagefromtheagent.

Iftheagentneedstosendasynchronouseventsbacktothefrontend,weaddarecvmethodtotheclass,whichwaitsfor
messagesonthefrontendpipe.

http://zguide.zeromq.org/page:all 122/225
12/31/2015 MQ - The Guide - MQ - The Guide
Wemaywanttoexposethefrontendpipesockethandletoallowtheclasstobeintegratedintofurtherpollloops.
Otherwiseanyrecvmethodwouldblocktheapplication.

ThecloneclasshasthesamestructureastheflcliapiclassfromChapter4ReliableRequestReplyPatternsandaddsthe
logicfromthelastmodeloftheCloneclient.WithoutZeroMQ,thiskindofmultithreadedAPIdesignwouldbeweeksofreallyhard
work.WithZeroMQ,itwasadayortwoofwork.

TheactualAPImethodsforthecloneclassarequitesimple:

//Createanewcloneclassinstance
clone_t*
clone_new(void)

//Destroyacloneclassinstance
void
clone_destroy(clone_t**self_p)

//Definethesubtree,ifany,forthiscloneclass
void
clone_subtree(clone_t*self,char*subtree)

//Connectthecloneclasstooneserver
void
clone_connect(clone_t*self,char*address,char*service)

//Setavalueinthesharedhashmap
void
clone_set(clone_t*self,char*key,char*value,intttl)

//Getavaluefromthesharedhashmap
char*
clone_get(clone_t*self,char*key)

SohereisModelSixofthecloneclient,whichhasnowbecomejustathinshellusingthecloneclass:

clonecli6:Cloneclient,ModelSixinC

Java|Python|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|Q|
Racket|Ruby|Scala|Tcl

Notetheconnectmethod,whichspecifiesoneserverendpoint.Underthehood,we'reinfacttalkingtothreeports.However,as
theCHPprotocolsays,thethreeportsareonconsecutiveportnumbers:

Theserverstaterouter(ROUTER)isatportP.
Theserverupdatespublisher(PUB)isatportP+1.
Theserverupdatessubscriber(SUB)isatportP+2.

Sowecanfoldthethreeconnectionsintoonelogicaloperation(whichweimplementasthreeseparateZeroMQconnectcalls).

Let'sendwiththesourcecodefortheclonestack.Thisisacomplexpieceofcode,buteasiertounderstandwhenyoubreakit
intothefrontendobjectclassandthebackendagent.Thefrontendsendsstringcommands("SUBTREE","CONNECT","SET",
"GET")totheagent,whichhandlesthesecommandsaswellastalkingtotheserver(s).Hereistheagent'slogic:

1. Startupbygettingasnapshotfromthefirstserver
2. Whenwegetasnapshotswitchtoreadingfromthesubscribersocket.
3. Ifwedon'tgetasnapshotthenfailovertothesecondserver.
4. Pollonthepipeandthesubscribersocket.
5. Ifwegotinputonthepipe,handlethecontrolmessagefromthefrontendobject.
6. Ifwegotinputonthesubscriber,storeorapplytheupdate.
7. Ifwedidn'tgetanythingfromtheserverwithinacertaintime,failover.
8. RepeatuntiltheprocessisinterruptedbyCtrlC.

Andhereistheactualcloneclassimplementation:

clone:CloneclassinC

http://zguide.zeromq.org/page:all 123/225
12/31/2015 MQ - The Guide - MQ - The Guide

Java|Python|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|Q|
Racket|Ruby|Scala|Tcl

Chapter6TheZeroMQCommunity topprevnext

Peoplesometimesaskmewhat'ssospecialaboutZeroMQ.MystandardansweristhatZeroMQisarguablythebestanswerwe
havetothevexingquestionof"Howdowemakethedistributedsoftwarethatthe21stcenturydemands?"Butmorethanthat,
ZeroMQisspecialbecauseofitscommunity.Thisisultimatelywhatseparatesthewolvesfromthesheep.

Therearethreemainopensourcepatterns.Thefirstisthelargefirmdumpingcodetobreakthemarketforothers.Thisisthe
ApacheFoundationmodel.Thesecondistinyteamsorsmallfirmsbuildingtheirdream.Thisisthemostcommonopensource
model,whichcanbeverysuccessfulcommercially.Thelastisaggressiveanddiversecommunitiesthatswarmoveraproblem
landscape.ThisistheLinuxmodel,andtheonetowhichweaspirewithZeroMQ.

It'shardtooveremphasizethepowerandpersistenceofaworkingopensourcecommunity.Therereallydoesnotseemtobea
betterwayofmakingsoftwareforthelongterm.Notonlydoesthecommunitychoosethebestproblemstosolve,itsolvesthem
minimally,carefully,anditthenlooksaftertheseanswersforyears,decades,untilthey'renolongerrelevant,andthenitquietly
putsthemaway.

ToreallybenefitfromZeroMQ,youneedtounderstandthecommunity.Atsomepointdowntheroadyou'llwanttosubmita
patch,anissue,oranaddon.Youmightwanttoasksomeoneforhelp.Youwillprobablywanttobetapartofyourbusinesson
ZeroMQ,andwhenItellyouthatthecommunityismuch,muchmoreimportantthanthecompanythatbackstheproduct,even
thoughI'mCEOofthatcompany,thisshouldbesignificant.

InthischapterI'mgoingtolookatourcommunityfromseveralanglesandconcludebyexplainingindetailourcontractfor
collaboration,whichwecall"C4".Youshouldfindthediscussionusefulforyourownwork.We'vealsoadaptedtheZeroMQC4
processforclosedsourceprojectswithgoodsuccess.

We'llcover:

TheroughstructureofZeroMQasasetofprojects
What"softwarearchitecture"isreallyabout
WhyweusetheLGPLandnottheBSDlicense
HowwedesignedandgrewtheZeroMQcommunity
ThebusinessthatbacksZeroMQ
WhoownstheZeroMQsourcecode
HowtomakeandsubmitapatchtoZeroMQ
WhocontrolswhatpatchesactuallygointoZeroMQ
Howweguaranteecompatibilitywitholdcode
Whywedon'tusepublicgitbranches
WhodecidesontheZeroMQroadmap
Aworkedexampleofachangetolibzmq

ArchitectureoftheZeroMQCommunity topprevnext

YouknowthatZeroMQisanLGPLlicensedproject.Infactit'sacollectionofprojects,builtaroundthecorelibrary,libzmq.I'll
visualizetheseprojectsasanexpandinggalaxy:

Atthecore,libzmqistheZeroMQcorelibrary.It'swritteninC++,withalowlevelCAPI.Thecodeisnasty,mainly
becauseit'shighlyoptimizedbutalsobecauseit'swritteninC++,alanguagethatlendsitselftosubtleanddeepnastiness.
MartinSustrikwrotethebulkofthiscode.Todayithasdozensofpeoplewhomaintaindifferentpartsofit.

Aroundlibzmq,thereareabout50bindings.TheseareindividualprojectsthatcreatehigherlevelAPIsforZeroMQ,orat
leastmapthelowlevelAPIintootherlanguages.Thebindingsvaryinqualityfromexperimentaltoutterlyawesome.
ProbablythemostimpressivebindingisPyZMQ,whichwasoneofthefirstcommunityprojectsontopofZeroMQ.Ifyou
areabindingauthor,youshouldreallystudyPyZMQandaspiretomakingyourcodeandcommunityasgreat.

http://zguide.zeromq.org/page:all 124/225
12/31/2015 MQ - The Guide - MQ - The Guide
Alotoflanguageshavemultiplebindings(Erlang,Ruby,C#,atleast)writtenbydifferentpeopleovertime,ortaking
varyingapproaches.Wedon'tregulatetheseinanyway.Thereareno"official"bindings.Youvotebyusingoneorthe
other,contributingtoit,orignoringit.

Thereareaseriesofreimplementationsoflibzmq,startingwithJeroMQ,afullJavatranslationofthelibrary,whichis
nowthebasisforNetMQ,aC#stack.ThesenativestacksoffersimilaroridenticalAPIs,andspeakthesameprotocol
(ZMTP)aslibzmq.

OntopofthebindingsarealotofprojectsthatuseZeroMQorbuildonit.Seethe"Labs"pageonthewikiforalonglistof
projectsandprotoprojectsthatuseZeroMQinsomeway.Thereareframeworks,webserverslikeMongrel2,brokerslike
Majordomo,andenterpriseopensourcetoolslikeStorm.

Libzmq,mostofthebindings,andsomeoftheouterprojectssitintheZeroMQcommunity"organization"onGitHub.This
organizationis"run"byagroupconsistingofthemostseniorbindingauthors.There'sverylittletorunasit'salmostallself
managingandthere'szeroconflictthesedays.

iMatix,myfirm,playsaspecificroleinthecommunity.Weownthetrademarksandenforcethemdiscretelyinordertomakesure
thatifyoudownloadapackagecallingitself"ZeroMQ",youcantrustwhatyouaregetting.Peoplehaveonrareoccasiontriedto
hijackthename,maybebelievingthat"freesoftware"meansthereisnopropertyatstakeandnoonewillingtodefendit.One
thingyou'llunderstandfromthischapterishowseriouslywetaketheprocessbehindoursoftware(andImean"us"asa
community,notacompany).iMatixbacksthecommunitybyenforcingthatprocessonanythingcallingitself"ZeroMQ"or
"ZeroMQ".WealsoputmoneyandtimeintothesoftwareandpackagingforreasonsI'llexplainlater.

Itisnotacharityexercise.ZeroMQisaforprofitproject,andaveryprofitableone.Theprofitsarewidelydistributedamongall
thosewhoinvestinit.It'sreallythatsimple:takethetimetobecomeanexpertinZeroMQ,orbuildsomethingusefulontopof
ZeroMQ,andyou'llfindyourvalueasanindividual,orteam,orcompanyincreasing.iMatixenjoysthesamebenefitsaseveryone
elseinthecommunity.It'swinwintoeveryoneexceptourcompetitors,whofindthemselvesfacingathreattheycan'tbeatand
can'treallyescape.ZeroMQdominatesthefutureworldofmassivelydistributedsoftware.

Myfirmdoesn'tjusthavethecommunity'sbackwealsobuiltthecommunity.ThiswasdeliberateworkintheoriginalZeroMQ
whitepaperfrom2007,thereweretwoprojects.Onewastechnical,howtomakeabettermessagingsystem.Thesecondwas
howtobuildacommunitythatcouldtakethesoftwaretodominantsuccess.Softwaredies,butcommunitysurvives.

HowtoMakeReallyLargeArchitectures topprevnext

Thereare,ithasbeensaid(atleastbypeoplereadingthissentenceoutloud),twowaystomakereallylargescalesoftware.
OptionOneistothrowmassiveamountsofmoneyandproblemsatempiresofsmartpeople,andhopethatwhatemergesisnot
yetanothercareerkiller.Ifyou'reveryluckyandarebuildingonlotsofexperience,havekeptyourteamssolid,andarenot
aimingfortechnicalbrilliance,andarefurthermoreincrediblylucky,itworks.

Butgamblingwithhundredsofmillionsofothers'moneyisn'tforeveryone.Fortherestofuswhowanttobuildlargescale
software,there'sOptionTwo,whichisopensource,andmorespecifically,freesoftware.Ifyou'reaskinghowthechoiceof
softwarelicenseisrelevanttothescaleofthesoftwareyoubuild,that'stherightquestion.

ThebrilliantandvisionaryEbenMoglenoncesaid,roughly,thatafreesoftwarelicenseisthecontractonwhichacommunity
builds.WhenIheardthis,abouttenyearsago,theideacametomeCanwedeliberatelygrowfreesoftwarecommunities?

Tenyearslater,theansweris"yes",andthereisalmostasciencetoit.Isay"almost"becausewedon'tyethaveenough
evidenceofpeopledoingthisdeliberatelywithadocumented,reproducibleprocess.ItiswhatI'mtryingtodowithSocial
Architecture.ZeroMQcameafterWikidot,aftertheDigitalStandardsOrganization(Digistan)andaftertheFoundationforaFree
InformationInfrastructure(akatheFFII,anNGOthatfightsagainstsoftwarepatents).Thisallcameafteralotoflesssuccessful
communityprojectslikeXitamiandLibero.Mymaintakeawayfromalongcareerofprojectsofeveryconceivableformatis:ifyou
wanttobuildtrulylargescaleandlonglastingsoftware,aimtobuildafreesoftwarecommunity.

PsychologyofSoftwareArchitecture topprevnext

DirkjanOchtmanpointedmetoWikipedia'sdefinitionofSoftwareArchitectureas"thesetofstructuresneededtoreasonabout
thesystem,whichcomprisesoftwareelements,relationsamongthem,andpropertiesofboth".Formethisvapidandcircular
jargonisagoodexampleofhowmiserablylittleweunderstandwhatactuallymakesasuccessfullargescalesoftware

http://zguide.zeromq.org/page:all 125/225
12/31/2015 MQ - The Guide - MQ - The Guide
architecture.

Architectureistheartandscienceofmakinglargeartificialstructuresforhumanuse.IfthereisonethingI'velearnedandapplied
successfullyin30yearsofmakinglargerandlargersoftwaresystems,itisthis:softwareisaboutpeople.Largestructuresin
themselvesaremeaningless.It'showtheyfunctionforhumanusethatmatters.Andinsoftware,humanusestartswiththe
programmerswhomakethesoftwareitself.

Thecoreproblemsinsoftwarearchitecturearedrivenbyhumanpsychology,nottechnology.Therearemanywaysour
psychologyaffectsourwork.Icouldpointtothewayteamsseemtogetstupiderastheygetlargerorwhentheyhavetowork
acrosslargerdistances.Doesthatmeanthesmallertheteam,themoreeffective?Howthendoesalargeglobalcommunitylike
ZeroMQmanagetoworksuccessfully?

TheZeroMQcommunitywasn'taccidental.Itwasadeliberatedesign,mycontributiontotheearlydayswhenthecodecameout
ofacellarinBratislava.Thedesignwasbasedonmypetscienceof"SocialArchitecture",whichWikipediadefinesas"the
consciousdesignofanenvironmentthatencouragesadesiredrangeofsocialbehaviorsleadingtowardssomegoalorsetof
goals."Idefinethisasmorespecificallyas"theprocess,andtheproduct,ofplanning,designing,andgrowinganonline
community."

OneofthetenetsofSocialArchitectureisthathowweorganizeismoresignificantthanwhoweare.Thesamegroup,organized
differently,canproducewhollydifferentresults.WearelikepeersinaZeroMQnetwork,andourcommunicationpatternshavea
dramaticimpactonourperformance.Ordinarypeople,wellconnected,canfaroutperformateamofexpertsusingpoorpatterns.
Ifyou'rethearchitectofalargerZeroMQapplication,you'regoingtohavetohelpothersfindtherightpatternsforworking
together.Dothisright,andyourprojectcansucceed.Doitwrong,andyourprojectwillfail.

Thetwomostimportantpsychologicalelementsarethatwe'rereallybadatunderstandingcomplexityandthatwearesogoodat
workingtogethertodivideandconquerlargeproblems.We'rehighlysocialapes,andkindofsmart,butonlyintherightkindof
crowd.

SohereismyshortlistofthePsychologicalElementsofSoftwareArchitecture:

Stupidity:ourmentalbandwidthislimited,sowe'reallstupidatsomepoint.Thearchitecturehastobesimpleto
understand.Thisisthenumberonerule:simplicitybeatsfunctionality,everysingletime.Ifyoucan'tunderstandan
architectureonacoldgrayMondaymorningbeforecoffee,itistoocomplex.

Selfishness:weactonlyoutofselfinterest,sothearchitecturemustcreatespaceandopportunityforselfishactsthat
benefitthewhole.Selfishnessisoftenindirectandsubtle.Forexample,I'llspendhourshelpingsomeoneelseunderstand
somethingbecausethatcouldbeworthdaystomelater.

Laziness:wemakelotsofassumptions,manyofwhicharewrong.Wearehappiestwhenwecanspendtheleasteffortto
getaresultortotestanassumptionquickly,sothearchitecturehastomakethispossible.Specifically,thatmeansitmust
besimple.

Jealousy:we'rejealousofothers,whichmeanswe'llovercomeourstupidityandlazinesstoproveotherswrongandbeat
themincompetition.Thearchitecturethushastocreatespaceforpubliccompetitionbasedonfairrulesthatanyonecan
understand.

Fear:we'reunwillingtotakerisks,especiallyifitmakesuslookstupid.Fearoffailureisamajorreasonpeopleconform
andfollowthegroupinmassstupidity.Thearchitectureshouldmakesilentexperimentationeasyandcheap,givingpeople
opportunityforsuccesswithoutpunishingfailure.

Reciprocity:we'llpayextraintermsofhardwork,evenmoney,topunishcheatsandenforcefairrules.Thearchitecture
shouldbeheavilyrulebased,tellingpeoplehowtoworktogether,butnotwhattoworkon.

Conformity:we'rehappiesttoconform,outoffearandlaziness,whichmeansifthepatternsaregood,clearlyexplained
anddocumented,andfairlyenforced,we'llnaturallychoosetherightpatheverytime.

Pride:we'reintenselyawareofoursocialstatus,andwe'llworkhardtoavoidlookingstupidorincompetentinpublic.The
architecturehastomakesureeverypiecewemakehasournameonit,sowe'llhavesleeplessnightsstressingabout
whatotherswillsayaboutourwork.

Greed:we'reultimatelyeconomicanimals(seeselfishness),sothearchitecturehastogiveuseconomicincentiveto
investinmakingithappen.Maybeit'spolishingourreputationasexperts,maybeit'sliterallymakingmoneyfromsome
skillorcomponent.Itdoesn'tmatterwhatitis,buttheremustbeeconomicincentive.Thinkofarchitectureasamarket
place,notanengineeringdesign.

Thesestrategiesworkonalargescalebutalsoonasmallscale,withinanorganizationorteam.

http://zguide.zeromq.org/page:all 126/225
12/31/2015 MQ - The Guide - MQ - The Guide

TheImportanceofContracts topprevnext

Letmediscussacontentiousbutimportantarea,whichiswhatlicensetochoose.I'llsay"BSD"tocoverMIT,X11,BSD,Apache,
andsimilarlicenses,and"GPL"tocoverGPLv3,LGPLv3,andAGPLv3.Thesignificantdifferenceistheobligationtoshareback
anyforkedversions,whichpreventsanyentityfromcapturingthesoftware,andthuskeepsit"free".

Asoftwarelicenseisn'ttechnicallyacontractsinceyoudon'tsignanything.Butbroadly,callingitacontractisusefulsinceit
takestheobligationsofeachparty,andmakesthemlegallyenforceableincourt,undercopyrightlaw.

Youmightask,whydoweneedcontractsatalltomakeopensource?Surelyit'sallaboutdecency,goodwill,peopleworking
togetherforselflessmotives.Surelytheprincipleof"lessismore"applieshereofallplaces?Don'tmorerulesmeanless
freedom?Dowereallyneedlawyerstotellushowtoworktogether?Itseemscynicalandevencounterproductivetoforcea
restrictivesetofrulesonthehappycommunesoffreeandopensourcesoftware.

Butthetruthabouthumannatureisnotthatpretty.We'renotreallyangels,nordevils,justselfinterestedwinnersdescended
fromabillionyearunbrokenlineofwinners.Inbusiness,marriage,andcollectiveworks,soonerorlater,weeitherstopcaring,or
wefightandweargue.

Putthisanotherway:acollectiveworkhastwoextremeoutcomes.Eitherit'safailure,irrelevant,andworthless,inwhichcase
everysanepersonwalksaway,withoutafight.Or,it'sasuccess,relevant,andvaluable,inwhichcasewestartjockeyingfor
power,control,andoften,money.

Whatawellwrittencontractdoesistoprotectthosevaluablerelationshipsfromconflict.Amarriagewherethetermsofdivorce
areclearlyagreedupfrontismuchlesslikelytoendindivorce.Abusinessdealwherebothpartiesagreehowtoresolvevarious
classicconflictssuchasonepartystealingtheothers'clientsorstaffismuchlesslikelytoendinconflict.

Similarly,asoftwareprojectthathasawellwrittencontractthatdefinesthetermsofbreakupclearlyismuchlesslikelytoendin
breakup.Thealternativeseemstobetoimmersetheprojectintoalargerorganizationthatcanassertpressureonteamstowork
together(orlosethebackingandbrandingoftheorganization).ThisisforexamplehowtheApacheFoundationworks.Inmy
experienceorganizationbuildinghasitsowncosts,andendsupfavoringwealthierparticipants(whocanaffordthosesometimes
hugecosts).

Inanopensourceorfreesoftwareproject,breakupusuallytakestheformofafork,wherethecommunitysplitsintotwoormore
groups,eachwithdifferentvisionsofthefuture.Duringthehoneymoonperiodofaproject,whichcanlastyears,there'sno
questionofabreakup.Itisasaprojectbeginstobeworthmoney,orasthemainauthorsstarttoburnout,thatthegoodwilland
generositytendstodryup.

Sowhendiscussingsoftwarelicenses,forthecodeyouwriteorthecodeyouuse,alittlecynicismhelps.Askyourself,not"which
licensewillattractmorecontributors?"becausetheanswertothatliesinthemissionstatementandcontributionprocess.Ask
yourself,"ifthisprojecthadabigfight,andsplitthreeways,whichlicensewouldsaveus?"Or,"ifthewholeteamwasboughtby
ahostilefirmthatwantedtoturnthiscodeintoaproprietaryproduct,whichlicensewouldsaveus?"

Longtermsurvivalmeansenduringthebadtimes,aswellasenjoyingthegoodones.

WhenBSDprojectsfork,theycannoteasilymergeagain.Indeed,onewayforkingofBSDprojectsisquitesystematic:everytime
BSDcodeendsupinacommercialproject,thisiswhat'shappened.WhenGPLprojectsfork,however,remergingistrivial.

TheGPL'sstoryisrelevanthere.Thoughcommunitiesofprogrammerssharingtheircodeopenlywerealreadysignificantbythe
1980's,theytendedtouseminimallicensesthatworkedaslongasnorealmoneygotinvolved.Therewasanimportantlanguage
stackcalledEmacs,originallybuiltinLispbyRichardStallman.Anotherprogrammer,JamesGosling(wholatergaveusJava),
rewroteEmacsinCwiththehelpofmanycontributors,ontheassumptionthatitwouldbeopen.Stallmangotthatcodeandused
itasthebasisforhisownCversion.Goslingthensoldthecodetoafirmwhichturnedaroundandblockedanyonedistributinga
competingproduct.Stallmanfoundthissaleofthecommonworkhugelyunethical,andbegandevelopingareusablelicensethat
wouldprotectcommunitiesfromthis.

WhateventuallyemergedwastheGNUGeneralPublicLicense,whichusedtraditionalcopyrighttoforceremixability.Itwasa
neathackthatspreadtootherdomains,forinstancetheCreativeCommonsforphotographyandmusic.In2007,wesawversion
3ofthelicense,whichwasaresponsetobelatedattacksfromMicrosoftandothersontheconcept.Ithasbecomealongand
complexdocumentbutcorporatecopyrightlawyershavebecomefamiliarwithitandinmyexperience,fewcompaniesmind
usingGPLsoftwareandlibraries,solongastheboundariesareclearlydefined.

Thus,agoodcontractandIconsiderthemodernGPLtobethebestforsoftwareletsprogrammersworktogetherwithout
upfrontagreements,organizations,orassumptionsofdecencyandgoodwill.Itmakesitcheapertocollaborate,andturnsconflict
intohealthycompetition.GPLdoesn'tjustdefinewhathappenswithafork,itactivelyencouragesforksasatoolfor
experimentationandlearning.Whereasaforkcankillaprojectwitha"moreliberal"license,GPLprojectsthriveonforkssince

http://zguide.zeromq.org/page:all 127/225
12/31/2015 MQ - The Guide - MQ - The Guide
successfulexperimentscan,bycontract,beremixedbackintothemainstream.

Yes,therearemanythrivingBSDprojectsandmanydeadGPLones.It'salwayswrongtogeneralize.Aprojectwillthriveordie
formanyreasons.However,inacompetitivesport,oneneedseveryadvantage.

TheotherimportantpartoftheBSDvs.GPLstoryiswhatIcall"leakage",whichistheeffectofpouringwaterintoapotwitha
smallbutrealholeinthebottom.

EatMe topprevnext

Hereisastory.Ithappenedtotheeldestbrotherinlawofthecousinofafriendofmine'scolleagueatwork.Hisnamewas,and
stillis,Patrick.

PatrickwasacomputerscientistwithaPhDinadvancednetworktopologies.Hespenttwoyearsandhissavingsbuildinganew
product,andchoosetheBSDlicensebecausehebelievedthatwouldgethimmoreadoption.Heworkedinhisattic,atgreat
personalcost,andproudlypublishedhiswork.Peopleapplauded,foritwastrulyfantastic,andhismailinglistsweresoonabuzz
withactivityandpatchesandhappychatter.Manycompaniestoldhimhowtheyweresavingmillionsusinghiswork.Someof
themevenpaidhimforconsultancyandtraining.Hewasinvitedtospeakatconferencesandstartedcollectingbadgeswithhis
nameonthem.Hestartedasmallbusiness,hiredafriendtoworkwithhim,anddreamedofmakingitbig.

Thenoneday,someonepointedhimtoanewproject,GPLlicensed,whichhadforkedhisworkandwasimprovingonit.Hewas
irritatedandupset,andaskedhowpeoplefellowopensourcers,noless!wouldsoshamelesslystealhiscode.Therewere
longargumentsonthelistaboutwhetheritwasevenlegaltorelicensetheirBSDcodeasGPLcode.Turnedout,itwas.Hetried
toignorethenewproject,butthenhesoonrealizedthatnewpatchescomingfromthatprojectcouldn'tevenbemergedbackinto
hiswork!

Worse,theGPLprojectgotpopularandsomeofhiscorecontributorsmadefirstsmall,andthenlargerpatchestoit.Again,he
couldn'tusethosechanges,andhefeltabandoned.Patrickwentintoadepression,hisgirlfriendlefthimforaninternational
currencydealercalled,weirdly,Patrice,andhestoppedallworkontheproject.Hefeltbetrayed,andutterlymiserable.Hefired
hisfriend,whotookitratherbadlyandtoldeveryonethatPatrickwasaclosetbanjoplayer.Finally,Patricktookajobasaproject
managerforacloudcompany,andbytheageofforty,hehadstoppedprogrammingevenforfun.

PoorPatrick.Ialmostfeltsorryforhim.ThenIaskedhim,"Whydidn'tyouchoosetheGPL?""Becauseit'sarestrictiveviral
license",hereplied.Itoldhim,"YoumayhaveaPhD,andyoumaybetheeldestbrotherinlawofthecousinofafriendofmy
colleague,butyouareanidiotandMoniquewassmarttoleaveyou.Youpublishedyourworkinvitingpeopletopleasestealyour
codeaslongastheykeptthis'pleasestealmycode'statementintheresultingwork",andwhenpeopledidexactlythat,yougot
upset.Worse,youwereahypocritebecausewhentheydiditinsecret,youwerehappy,butwhentheydiditopenly,youfelt
betrayed."

Seeingyourhardworkcapturedbyasmarterteamandthenusedagainstyouisenormouslypainful,sowhyevenmakethat
possible?EveryproprietaryprojectthatusesBSDcodeiscapturingit.ApublicGPLforkisperhapsmorehumiliating,butit'sfully
selfinflicted.

BSDislikefood.Itliterally(andImeanthatmetaphorically)whispers"eatme"inthelittlevoiceoneimaginesacubeofcheese
mightusewhenit'ssittingnexttoanemptybottleofthebestbeerintheworld,whichisofcourseOrval,brewedbyanancient
andalmostextinctorderofsilentBelgianmonkscalledLesGarsLabasQuiFabriquel'Orval.TheBSDlicense,likeitsnearclone
MIT/X11,wasdesignedspecificallybyauniversity(Berkeley)withnoprofitmotivetoleakworkandeffort.Itisawaytopush
subsidizedtechnologyatbelowitscostprice,adumpingofunderpricedcodeinthehopethatitwillbreakthemarketforothers.
BSDisanexcellentstrategictool,butonlyifyou'realargewellfundedinstitutionthatcanaffordtouseOptionOne.TheApache
licenseisBSDinasuit.

Forussmallbusinesseswhoaimourinvestmentslikepreciousbullets,leakingworkandeffortisunacceptable.Breakingthe
marketisgreat,butwecannotaffordtosubsidizeourcompetitors.TheBSDnetworkingstackendedupputtingWindowsonthe
Internet.Wecannotaffordbattleswiththoseweshouldnaturallybeallieswith.Wecannotaffordtomakefundamentalbusiness
errorsbecauseintheend,thatmeanswehavetofirepeople.

Itcomesdowntobehavioraleconomicsandgametheory.Thelicensewechoosemodifiestheeconomicsofthosewhouseour
work.Inthesoftwareindustry,therearefriends,foes,andfood.BSDmakesmostpeopleseeusaslunch.Closedsourcemakes
mostpeopleseeusasenemies(doyoulikepayingpeopleforsoftware?)GPL,however,makesmostpeople,withtheexception
ofthePatricksoftheworld,ourallies.AnyforkofZeroMQislicensecompatiblewithZeroMQ,tothepointwhereweencourage
forksasavaluabletoolforexperimentation.Yes,itcanbeweirdtoseesomeonetrytorunoffwiththeballbuthere'sthesecret,I
cangetitbackanytimeIwant.

http://zguide.zeromq.org/page:all 128/225
12/31/2015 MQ - The Guide - MQ - The Guide

TheProcess topprevnext

Ifyou'veacceptedmythesisuptonow,great!Now,I'llexplaintheroughprocessbywhichweactuallybuildanopensource
community.ThiswashowwebuiltorgreworgentlysteeredtheZeroMQcommunityintoexistence.

Yourgoalasleaderofacommunityistomotivatepeopletogetoutthereandexploretoensuretheycandososafelyand
withoutdisturbingotherstorewardthemwhentheymakesuccessfuldiscoveriesandtoensuretheysharetheirknowledgewith
everyoneelse(andnotbecauseweaskthem,notbecausetheyfeelgenerous,butbecauseit'sTheLaw).

Itisaniterativeprocess.Youmakeasmallproduct,atyourowncost,butinpublicview.Youthenbuildasmallcommunity
aroundthatproduct.Ifyouhaveasmallbutrealhit,thecommunitythenhelpsdesignandbuildthenextversion,andgrows
larger.Andthenthatcommunitybuildsthenextversion,andsoon.It'sevidentthatyouremainpartofthecommunity,maybe
evenamajoritycontributor,butthemorecontrolyoutrytoassertoverthematerialresults,thelesspeoplewillwanttoparticipate.
Planyourownretirementwellbeforesomeonedecidesyouaretheirnextproblem.

Crazy,Beautiful,andEasy topprevnext

Youneedagoalthat'scrazyandsimpleenoughtogetpeopleoutofbedinthemorning.Yourcommunityhastoattractthevery
bestpeopleandthatdemandssomethingspecial.WithZeroMQ,wesaidweweregoingtomake"theFastest.Messaging.Ever.",
whichqualifiesasagoodmotivator.Ifwe'dsaid,we'regoingtomake"asmarttransportlayerthat'llconnectyourmovingpieces
cheaplyandflexiblyacrossyourenterprise",we'dhavefailed.

Thenyourworkmustbebeautiful,immediatelyuseful,andattractive.Yourcontributorsareuserswhowanttoexplorejustalittle
beyondwheretheyarenow.Makeitsimple,elegant,andbrutallyclean.Theexperiencewhenpeoplerunoruseyourwork
shouldbeanemotionalone.Theyshouldfeelsomething,andifyouaccuratelysolvedevenjustonebigproblemthatuntilthen
theydidn'tquiterealizetheyfaced,you'llhaveasmallpartoftheirsoul.

Itmustbeeasytounderstand,use,andjoin.Toomanyprojectshavebarrierstoaccess:putyourselfintheotherperson'smind
andseeallthereasonstheycometoyoursite,thinking"Um,interestingproject,but"andthenleave.Youwantthemtostay
andtryit,justonce.UseGitHubandputtheissuetrackerrightthere.

Ifyoudothesethingswell,yourcommunitywillbesmartbutmoreimportantly,itwillbeintellectuallyandgeographicallydiverse.
Thisisreallyimportant.Agroupoflikemindedexpertscannotexploretheproblemlandscapewell.Theytendtomakebig
mistakes.Diversitybeatseducationanytime.

Stranger,MeetStranger topprevnext

Howmuchupfrontagreementdotwopeopleneedtoworktogetheronsomething?Inmostorganizations,alot.Butyoucan
bringthiscostdowntonearzero,andthenpeoplecancollaboratewithouthavingevermet,doneaphoneconference,meeting,
orbusinesstriptodiscussRolesandResponsibilitiesoverwaytoomanybottlesofcheapKoreanricewine.

Youneedwellwrittenrulesthataredesignedbycynicalpeoplelikemetoforcestrangersintomutuallybeneficialcollaboration
insteadofconflict.TheGPLisagoodstart.GitHubanditsfork/mergestrategyisagoodfollowup.Andthenyouwantsomething
likeourC4rulebooktocontrolhowworkactuallyhappens.

C4(whichInowuseforeverynewopensourceproject)hasdetailedandtestedanswerstoalotofcommonmistakespeople
make,suchasthesinofworkingofflineinacornerwithothers"becauseit'sfaster".Transparencyisessentialtogettrust,which
isessentialtogetscale.Byforcingeverysinglechangethroughasingletransparentprocess,youbuildrealtrustintheresults.

Anothercardinalsinthatmanyopensourcedevelopersmakeistoplacethemselvesaboveothers."Ifoundedthisprojectthus
myintellectissuperiortothatofothers".It'snotjustimmodestandrude,andusuallyinaccurate,it'salsopoorbusiness.Therules
mustapplyequallytoeveryone,withoutdistinction.Youarepartofthecommunity.Yourjob,asfounderofaproject,isnotto
imposeyourvisionoftheproductoverothers,buttomakesuretherulesaregood,honest,andenforced.

InfiniteProperty topprevnext

http://zguide.zeromq.org/page:all 129/225
12/31/2015 MQ - The Guide - MQ - The Guide

Oneofthesaddestmythsoftheknowledgebusinessisthatideasareasensibleformofproperty.It'smedievalnonsensethat
shouldhavebeenjunkedalongwithslavery,butsadlyit'sstillmakingtoomanypowerfulpeopletoomuchmoney.

Ideasarecheap.Whatdoesworksensiblyaspropertyisthehardworkwedoinbuildingamarket."Youeatwhatyoukill"isthe
rightmodelforencouragingpeopletoworkhard.Whetherit'smoralauthorityoveraproject,moneyfromconsulting,orthesaleof
atrademarktosomelarge,richfirm:ifyoumakeit,youownit.Butwhatyoureallyownis"footfall",participantsinyourproject,
whichultimatelydefinesyourpower.

Todothisrequiresinfinitefreespace.Thankfully,GitHubsolvedthisproblemforus,forwhichIwilldieagratefulperson(there
aremanyreasonstobegratefulinlife,whichIwon'tlistherebecauseweonlyhaveahundredorsopagesleft,butthisisoneof
them).

Youcannotscaleasingleprojectwithmanyownerslikeyoucanscaleacollectionofmanysmallprojects,eachwithfewer
owners.Whenweembraceforks,apersoncanbecomean"owner"withasingleclick.Nowtheyjusthavetoconvinceothersto
joinbydemonstratingtheiruniquevalue.

SoinZeroMQ,weaimedtomakeiteasytowritebindingsontopofthecorelibrary,andwestoppedtryingtomakethose
bindingsourselves.Thiscreatedspaceforotherstomakethose,becometheirowners,andgetthatcredit.

CareandFeeding topprevnext

Iwishacommunitycouldbe100%selfsteering,andperhapsonedaythiswillwork,buttodayit'snotthecase.We'reveryclose
withZeroMQ,butfrommyexperienceacommunityneedsfourtypesofcareandfeeding:

First,simplybecausemostpeoplearetoonice,weneedsomekindofsymbolicleadershiporownerswhoprovideultimate
authorityincaseofconflict.Usuallyit'sthefoundersofthecommunity.I'veseenitworkwithselfelectedgroupsof
"elders",butoldmenliketotalkalot.I'veseencommunitiessplitoverthequestion"whoisincharge?",andsettingup
legalentitieswithboardsandsuchseemstomakeargumentsovercontrolworse,notbetter.Maybebecausethereseems
tobemoretofightover.Oneoftherealbenefitsoffreesoftwareisthatit'salwaysremixable,soinsteadoffightingovera
pie,onesimplyforksthepie.

Second,communitiesneedlivingrules,andthustheyneedalawyerabletoformulateandwritethesedown.Rulesare
criticalwhendoneright,theyremovefriction.Whendonewrong,orneglected,weseerealfrictionandargumentthatcan
driveawaythenicemajority,leavingtheargumentativecoreinchargeoftheburninghouse.OnethingI'vetriedtodowith
theZeroMQandpreviouscommunitiesiscreatereusablerules,whichperhapsmeanswedon'tneedlawyersasmuch.

Thirdly,communitiesneedsomekindoffinancialbacking.Thisisthejaggedrockthatbreaksmostships.Ifyoustarvea
community,itbecomesmorecreativebutthecorecontributorsburnout.Ifyoupourtoomuchmoneyintoit,youattractthe
professionals,whoneversay"no",andthecommunitylosesitsdiversityandcreativity.Ifyoucreateafundforpeopleto
share,theywillfight(bitterly)overit.WithZeroMQ,we(iMatix)spendourtimeandmoneyonmarketingandpackaging
(likethisbook),andthebasiccare,likebugfixes,releases,andwebsites.

Lastly,salesandcommercialmediationareimportant.Thereisanaturalmarketbetweenexpertcontributorsand
customers,butbotharesomewhatincompetentattalkingtoeachother.Customersassumethatsupportisfreeorvery
cheapbecausethesoftwareisfree.Contributorsareshyataskingafairratefortheirwork.Itmakesforadifficultmarket.
Agrowingpartofmyworkandmyfirm'sprofitsissimplyconnectingZeroMQuserswhowanthelpwithexpertsfromthe
communityabletoprovideit,andensuringbothsidesarehappywiththeresults.

I'veseencommunitiesofbrilliantpeoplewithnoblegoalsdyingbecausethefoundersgotsomeorallofthesefourthingswrong.
Thecoreproblemisthatyoucan'texpectconsistentlygreatleadershipfromanyonecompany,person,orgroup.Whatworks
todayoftenwon'tworktomorrow,yetstructuresbecomemoresolid,notmoreflexible,overtime.

ThebestanswerIcanfindisamixoftwothings.One,theGPLanditsguaranteeofremixability.Nomatterhowbadthe
authority,nomatterhowmuchtheytrytoprivatizeandcapturethecommunity'swork,ifit'sGPLlicensed,thatworkcanwalk
awayandfindabetterauthority.Beforeyousay,"allopensourceoffersthis,"thinkitthrough.IcankillaBSDlicensedprojectby
hiringthecorecontributorsandnotreleasinganynewpatches.Butevenwithabillionofdollars,IcannotkillaGPLlicensed
project.Two,thephilosophicalanarchistmodelofauthority,whichisthatwechooseit,itdoesnotownus.

TheZeroMQProcess:C4 topprevnext

http://zguide.zeromq.org/page:all 130/225
12/31/2015 MQ - The Guide - MQ - The Guide

WhenwesayZeroMQwesometimesmeanlibzmq,thecorelibrary.Inearly2012,wesynthesizedthelibzmqprocessintoa
formalprotocolforcollaborationthatwecalledtheCollectiveCodeConstructionContract,orC4.Youcanseethisasalayer
abovetheGPL.Theseareourrules,andI'llexplainthereasoningbehindeachone.

C4isanevolutionoftheGitHubFork+PullModel.YoumaygetthefeelingI'mafanofgitandGitHub.Thiswouldbeaccurate:
thesetwotoolshavemadesuchapositiveimpactonourworkoverthelastyears,especiallywhenitcomestobuilding
community.

Language topprevnext

Thekeywords"MUST","MUSTNOT","REQUIRED","SHALL","SHALLNOT","SHOULD","SHOULDNOT",
"RECOMMENDED","MAY",and"OPTIONAL"inthisdocumentaretobeinterpretedasdescribedinRFC2119.

BystartingwiththeRFC2119language,theC4textmakesveryclearitsintentiontoactasaprotocolratherthanarandomly
writtensetofrecommendations.Aprotocolisacontractbetweenpartiesthatdefinestherightsandobligationsofeachparty.
Thesecanbepeersinanetworkortheycanbestrangersworkinginthesameproject.

IthinkC4isthefirsttimeanyonehasattemptedtocodifyacommunity'srulebookasaformalandreusableprotocolspec.
Previously,ourruleswerespreadoutoverseveralwikipages,andwerequitespecifictolibzmqinmanyways.Butexperience
teachesusthatthemoreformal,accurate,andreusabletherules,theeasieritisforstrangerstocollaborateupfront.Andless
frictionmeansamorescalablecommunity.AtthetimeofC4,wealsohadsomedisagreementinthelibzmqprojectover
preciselywhatprocesswewereusing.Noteveryonefeltboundbythesamerules.Let'sjustsaysomepeoplefelttheyhada
specialstatus,whichcreatedfrictionwiththerestofthecommunity.Socodificationmadethingsclear.

It'seasytouseC4:justhostyourprojectonGitHub,getoneotherpersontojoin,andopenthefloortopullrequests.Inyour
README,putalinktoC4andthat'sit.We'vedonethisinquiteafewprojectsanditdoesseemtowork.I'vebeenpleasantly
surprisedafewtimesjustapplyingtheserulestomyownwork,likeCZMQ.Noneofusaresoamazingthatwecanworkwithout
others.

Goals topprevnext

C4ismeanttoprovideareusableoptimalcollaborationmodelforopensourcesoftwareprojects.

TheshorttermreasonforwritingC4wastoendargumentsoverthelibzmqcontributionprocess.Thedissenterswentoff
elsewhere.TheZeroMQcommunityblossomedsmoothlyandeasily,asI'dpredicted.Mostpeopleweresurprised,butgratified.
There'sbeennorealcriticismsofC4exceptitsbranchingpolicy,whichI'llcometolaterasitdeservesitsowndiscussion.

There'sareasonI'mreviewinghistoryhere:asfounderofacommunity,youareaskingpeopletoinvestinyourproperty,
trademark,andbranding.Inreturn,andthisiswhatwedowithZeroMQ,youcanusethatbrandingtosetabarforquality.When
youdownloadaproductlabeled"ZeroMQ",youknowthatit'sbeenproducedtocertainstandards.It'sabasicruleofquality:write
downyourprocessotherwiseyoucannotimproveit.Ourprocessesaren'tperfect,norcantheyeverbe.Butanyflawinthemcan
befixed,andtested.

MakingC4reusableisthereforereallyimportant.Tolearnmoreaboutthebestpossibleprocess,weneedtogetresultsfromthe
widestrangeofprojects.

Ithasthesespecificgoals:
Tomaximizethescaleofthecommunityaroundaproject,byreducingthefrictionfornewContributorsandcreatingascaled
participationmodelwithstrongpositivefeedbacks

Thenumberonegoalissizeandhealthofthecommunitynottechnicalquality,notprofits,notperformance,notmarketshare.
Thegoalissimplythenumberofpeoplewhocontributetotheproject.Thesciencehereissimple:thelargerthecommunity,the
http://zguide.zeromq.org/page:all 131/225
12/31/2015 MQ - The Guide - MQ - The Guide
moreaccuratetheresults.

Torelievedependenciesonkeyindividualsbyseparatingdifferentskillsetssothatthereisalargerpoolofcompetencein
anyrequireddomain

Perhapstheworstproblemwefacedinlibzmqwasdependenceonpeoplewhocouldunderstandthecode,manageGitHub
branches,andmakecleanreleasesallatthesametime.It'slikelookingforathleteswhocanrunmarathonsandsprint,swim,
andalsoliftweights.Wehumansarereallygoodatspecialization.Askingustobereallygoodattwocontradictorythingsreduces
thenumberofcandidatessharply,whichisaBadThingforanyproject.Wehadthisproblemseverelyinlibzmqin2009orso,
andfixeditbysplittingtheroleofmaintainerintotwo:onepersonmakespatchesandanothermakesreleases.

Toallowtheprojecttodevelopfasterandmoreaccurately,byincreasingthediversityofthedecisionmakingprocess

Thisistheorynotfullyproven,butnotfalsified.Thediversityofthecommunityandthenumberofpeoplewhocanweighinon
discussions,withoutfearofbeingcriticizedordismissed,thefasterandmoreaccuratelythesoftwaredevelops.Speedisquite
subjectivehere.Goingveryfastinthewrongdirectionisnotjustuseless,it'sactivelydamaging(andwesufferedalotofthatin
libzmqbeforeweswitchedtoC4).

Tosupportthenaturallifecycleofprojectversionsfromexperimentalthroughtostable,byallowingsafeexperimentation,
rapidfailure,andisolationofstablecode

Tobehonest,thisgoalseemstobefadingintoirrelevance.It'squiteaninterestingeffectoftheprocess:thegitmasterisalmost
alwaysperfectlystable.Thishastodowiththesizeofchangesandtheirlatency,i.e.,thetimebetweensomeonewritingthecode
andsomeoneactuallyusingitfully.However,peoplestillexpect"stable"releases,sowe'llkeepthisgoalthereforawhile.

Toreducetheinternalcomplexityofprojectrepositories,thusmakingiteasierforContributorstoparticipateandreducing
thescopeforerror

Curiousobservation:peoplewhothriveincomplexsituationsliketocreatecomplexitybecauseitkeepstheirvaluehigh.It'sthe
CobraEffect(Googleit).Gitmadebrancheseasyandleftuswiththealltoocommonsyndromeof"gitiseasyonceyou
understandthatagitbranchisjustafoldedfivedimensionalleptonspacethathasadetachedhistorywithnointerveningcache".
Developersshouldnotbemadetofeelstupidbytheirtools.I'veseentoomanytopclassdevelopersconfusedbyrepository
structurestoacceptconventionalwisdomongitbranches.We'llcomebacktodisposeofgitbranchesshortly,dearreader.

Toenforcecollectiveownershipoftheproject,whichincreaseseconomicincentivetoContributorsandreducestheriskof
hijackbyhostileentities.

Ultimately,we'reeconomiccreatures,andthesensethat"weownthis,andourworkcanneverbeusedagainstus"makesit
mucheasierforpeopletoinvestinanopensourceprojectlikeZeroMQ.Anditcan'tbejustafeeling,ithastobereal.Therearea
numberofaspectstomakingcollectiveownershipwork,we'llseetheseonebyoneaswegothroughC4.

Preliminaries topprevnext

TheprojectSHALLusethegitdistributedrevisioncontrolsystem.

Githasitsfaults.ItscommandlineAPIishorriblyinconsistent,andithasacomplex,messyinternalmodelthatitshovesinyour
faceattheslightestprovocation.Butdespitedoingitsbesttomakeitsusersfeelstupid,gitdoesitsjobreally,reallywell.More
pragmatically,I'vefoundthatifyoustayawayfromcertainareas(branches!),peoplelearngitrapidlyanddon'tmakemany
mistakes.Thatworksforme.

TheprojectSHALLbehostedongithub.comorequivalent,hereincalledthe"Platform".

http://zguide.zeromq.org/page:all 132/225
12/31/2015 MQ - The Guide - MQ - The Guide
I'msureonedaysomelargefirmwillbuyGitHubandbreakit,andanotherplatformwillriseinitsplace.Untilthen,Githubserves
upanearperfectsetofminimal,fast,simpletools.I'vethrownhundredsofpeopleatit,andtheyallsticklikefliesstuckinadish
ofhoney.

TheprojectSHALLusethePlatformissuetracker.

WemadethemistakeinlibzmqofswitchingtoJirabecausewehadn'tlearnedyethowtoproperlyusetheGitHubissuetracker.
Jiraisagreatexampleofhowtoturnsomethingusefulintoacomplexmessbecausethebusinessdependsonsellingmore
"features".ButevenwithoutcriticizingJira,keepingtheissuetrackeronthesameplatformmeansonelessUItolearn,oneless
login,andsmoothintegrationbetweenissuesandpatches.

TheprojectSHOULDhaveclearlydocumentedguidelinesforcodestyle.

Thisisaprotocolplugin:insertcodestyleguidelineshere.Ifyoudon'tdocumentthecodestyleyouuse,youhavenobasis
exceptprejudicetorejectpatches.

A"Contributor"isapersonwhowishestoprovideapatch,beingasetofcommitsthatsolvesomeclearlyidentifiedproblem.
A"Maintainer"isapersonwhomergepatchestotheproject.Maintainersarenotdeveloperstheirjobistoenforceprocess.

Nowwemoveontodefinitionsoftheparties,andthesplittingofrolesthatsavedusfromthesinofstructuraldependencyonrare
individuals.Thisworkedwellinlibzmq,butasyouwillseeitdependsontherestoftheprocess.C4isn'tabuffetyouwillneed
thewholeprocess(orsomethingverylikeit),oritwon'tholdtogether.

ContributorsSHALLNOThavecommitaccesstotherepositoryunlesstheyarealsoMaintainers.
MaintainersSHALLhavecommitaccesstotherepository.

Whatwewantedtoavoidwaspeoplepushingtheirchangesdirectlytomaster.Thiswasthebiggestsourceoftroubleinlibzmq
historically:largemassesofrawcodethattookmonthsoryearstofullystabilize.WeeventuallyfollowedotherZeroMQprojects
likePyZMQinusingpullrequests.Wewentfurther,andstipulatedthatallchangeshadtofollowthesamepath.Noexceptions
for"specialpeople".

Everyone,withoutdistinctionordiscrimination,SHALLhaveanequalrighttobecomeaContributorunderthetermsofthis
contract.

Wehadtostatethisexplicitly.Itusedtobethatthelibzmqmaintainerswouldrejectpatchessimplybecausetheydidn'tlike
them.Now,thatmaysoundreasonabletotheauthorofalibrary(thoughlibzmqwasnotwrittenbyanyoneperson),butlet's
rememberourgoalofcreatingaworkthatisownedbyasmanypeopleaspossible.Saying"Idon'tlikeyourpatchsoI'mgoingto
rejectit"isequivalenttosaying,"IclaimtoownthisandIthinkI'mbetterthanyou,andIdon'ttrustyou".Thosearetoxic
messagestogivetootherswhoarethinkingofbecomingyourcoinvestors.

Ithinkthisfightbetweenindividualexpertiseandcollectiveintelligenceplaysoutinotherareas.ItdefinedWikipedia,andstill
does,adecadeafterthatworksurpassedanythingbuiltbysmallgroupsofexperts.Forme,wemakesoftwarebyslowly
synthesizingthemostaccurateknowledge,muchaswemakeWikipediaarticles.

LicensingandOwnership topprevnext

TheprojectSHALLusetheGPLv3oravariantthereof(LGPL,AGPL).

I'vealreadyexplainedhowfullremixabilitycreatesbetterscaleandwhytheGPLanditsvariantsseemstheoptimalcontractfor
remixablesoftware.Ifyou'realargebusinessaimingtodumpcodeonthemarket,youwon'twantC4,butthenyouwon'treally
careaboutcommunityeither.

http://zguide.zeromq.org/page:all 133/225
12/31/2015 MQ - The Guide - MQ - The Guide
Allcontributionstotheprojectsourcecode("patches")SHALLusethesamelicenseastheproject.

Thisremovestheneedforanyspecificlicenseorcontributionagreementforpatches.YouforktheGPLcode,youpublishyour
remixedversiononGitHub,andyouoranyoneelsecanthensubmitthatasapatchtotheoriginalcode.BSDdoesn'tallowthis.
AnyworkthatcontainsBSDcodemayalsocontainunlicensedproprietarycodesoyouneedexplicitactionfromtheauthorofthe
codebeforeyoucanremixit.

Allpatchesareownedbytheirauthors.ThereSHALLNOTbeanycopyrightassignmentprocess.

HerewecometothekeyreasonpeopletrusttheirinvestmentsinZeroMQ:it'slogisticallyimpossibletobuythecopyrightsto
createaclosedsourcecompetitortoZeroMQ.iMatixcan'tdothiseither.Andthemorepeoplethatsendpatches,theharderit
becomes.ZeroMQisn'tjustfreeandopentodaythisspecificrulemeansitwillremainsoforever.Notethatit'snotthecasein
allGPLprojects,manyofwhichstillaskforcopyrighttransferbacktothemaintainers.

TheprojectSHALLbeownedcollectivelybyallitsContributors.

Thisisperhapsredundant,butworthsaying:ifeveryoneownstheirpatches,thentheresultingwholeisalsoownedbyevery
contributor.There'snolegalconceptofowninglinesofcode:the"work"isatleastasourcefile.

EachContributorSHALLberesponsibleforidentifyingthemselvesintheprojectContributorlist.

Inotherwords,themaintainersarenotkarmaaccountants.Anyonewhowantscredithastoclaimitthemselves.

PatchRequirements topprevnext

Inthissection,wedefinetheobligationsofthecontributor:specifically,whatconstitutesa"valid"patch,sothatmaintainershave
rulestheycanusetoacceptorrejectpatches.

MaintainersandContributorsMUSThaveaPlatformaccountandSHOULDusetheirrealnamesorawellknownalias.

Intheworstcasescenario,wheresomeonehassubmittedtoxiccode(patented,orownedbysomeoneelse),weneedtobeable
totracewhoandwhen,sowecanremovethecode.Askingforrealnamesorawellknownaliasisatheoreticalstrategyfor
reducingtheriskofboguspatches.Wedon'tknowifthisactuallyworksbecausewehaven'thadtheproblemyet.

ApatchSHOULDbeaminimalandaccurateanswertoexactlyoneidentifiedandagreedproblem.

ThisimplementstheSimplicityOrientedDesignprocessthatI'llcometolaterinthischapter.Oneclearproblem,oneminimal
solution,apply,test,repeat.

ApatchMUSTadheretothecodestyleguidelinesoftheprojectifthesearedefined.

Thisisjustsanity.I'vespenttimecleaningupotherpeoples'patchesbecausetheyinsistedonputtingtheelsebesidetheif
insteadofjustbelowasNatureintended.Consistentcodeishealthier.

ApatchMUSTadheretothe"EvolutionofPublicContracts"guidelinesdefinedbelow.

Ah,thepain,thepain.I'mnotspeakingofthetimeatageeightwhenIsteppedonaplankwitha4inchnailprotrudingfromit.
ThatwasrelativelyOK.I'mspeakingof20102011whenwehadmultipleparallelreleasesofZeroMQ,eachwithdifferent
incompatibleAPIsorwireprotocols.Itwasanexerciseinbadrules,pointlesslyenforced,thatstillhurtsustoday.Therulewas,"If
youchangetheAPIorprotocol,youSHALLcreateanewmajorversion".Givemethenailthroughthefootthathurtless.
http://zguide.zeromq.org/page:all 134/225
12/31/2015 MQ - The Guide - MQ - The Guide
OneofthebigchangeswemadewithC4wassimplytoban,outright,thiskindofsanctionedsabotage.Amazingly,it'snoteven
hard.Wejustdon'tallowthebreakingofexistingpubliccontracts,period,unlesseveryoneagrees,inwhichcasenoperiod.As
LinusTorvaldsfamouslyputiton23December2012,"WEDONOTBREAKUSERSPACE!"

ApatchSHALLNOTincludenontrivialcodefromotherprojectsunlesstheContributoristheoriginalauthorofthatcode.

Thisrulehastwoeffects.Thefirstisthatitforcespeopletomakeminimalsolutionsbecausetheycannotsimplyimportswathes
ofexistingcode.InthecaseswhereI'veseenthishappentoprojects,it'salwaysbadunlesstheimportedcodeisverycleanly
separated.Thesecondisthatitavoidslicensearguments.Youwritethepatch,youareallowedtopublishitasLGPL,andwe
canmergeitbackin.Butyoufinda200linecodefragmentontheweb,andtrytopastethat,we'llrefuse.

ApatchMUSTcompilecleanlyandpassprojectselftestsonatleasttheprincipletargetplatform.

Forcrossplatformprojects,itisfairtoaskthatthepatchworksonthedevelopmentboxusedbythecontributor.

ApatchcommitmessageSHOULDconsistofasingleshort(lessthan50character)linesummarizingthechange,optionally
followedbyablanklineandthenamorethoroughdescription.

Thisisagoodformatforcommitmessagesthatfitsintoemail(thefirstlinebecomesthesubject,andtherestbecomestheemail
body).

A"CorrectPatch"isonethatsatisfiestheaboverequirements.

Justincaseitwasn'tclear,we'rebacktolegaleseanddefinitions.

DevelopmentProcess topprevnext

Inthissection,weaimtodescribetheactualdevelopmentprocess,stepbystep.

ChangeontheprojectSHALLbegovernedbythepatternofaccuratelyidentifyingproblemsandapplyingminimal,accurate
solutionstotheseproblems.

Thisisaunapologeticrammingthroughofthirtyyears'softwaredesignexperience.It'saprofoundlysimpleapproachtodesign:
makeminimal,accuratesolutionstorealproblems,nothingmoreorless.InZeroMQ,wedon'thavefeaturerequests.Treating
newfeaturesthesameasbugsconfusessomenewcomers.Butthisprocessworks,andnotjustinopensource.Enunciatingthe
problemwe'retryingtosolve,witheverysinglechange,iskeytodecidingwhetherthechangeisworthmakingornot.

Toinitiatechanges,auserSHALLloganissueontheprojectPlatformissuetracker.

Thisismeanttostopusfromgoingofflineandworkinginaghetto,eitherbyourselvesorwithothers.Althoughwetendtoaccept
pullrequeststhathaveclearargumentation,thisruleletsussay"stop"toconfusedortoolargepatches.

TheuserSHOULDwritetheissuebydescribingtheproblemtheyfaceorobserve.

"Problem:weneedfeatureX.Solution:makeit"isnotagoodissue."Problem:usercannotdocommontasksAorBexceptby
usingacomplexworkaround.Solution:makefeatureX"isadecentexplanation.BecauseeveryoneI'veeverworkedwithhas
neededtolearnthis,itseemsworthrestating:documenttherealproblemfirst,solutionsecond.

TheuserSHOULDseekconsensusontheaccuracyoftheirobservation,andthevalueofsolvingtheproblem.

http://zguide.zeromq.org/page:all 135/225
12/31/2015 MQ - The Guide - MQ - The Guide
Andbecausemanyapparentproblemsareillusionary,bystatingtheproblemexplicitlywegiveothersachancetocorrectour
logic."You'reonlyusingAandBalotbecausefunctionCisunreliable.Solution:makefunctionCworkproperly."

UsersSHALLNOTlogfeaturerequests,ideas,suggestions,oranysolutionstoproblemsthatarenotexplicitlydocumented
andprovable.

Thereareseveralreasonsfornotloggingideas,suggestions,orfeaturerequests.Inourexperience,thesejustaccumulateinthe
issuetrackeruntilsomeonedeletesthem.Butmoreprofoundly,whenwetreatallchangeasproblemsolutions,wecanprioritize
trivially.Eithertheproblemisrealandsomeonewantstosolveitnow,orit'snotonthetable.Thus,wishlistsareoffthetable.

Thus,thereleasehistoryoftheprojectSHALLbealistofmeaningfulissuesloggedandsolved.

I'dlovetheGitHubissuetrackertosimplylistalltheissueswesolvedineachrelease.Todaywestillhavetowritethatbyhand.If
oneputstheissuenumberineachcommit,andifoneusestheGitHubissuetracker,whichwesadlydon'tyetdoforZeroMQ,
thisreleasehistoryiseasiertoproducemechanically.

Toworkonanissue,aContributorSHALLforktheprojectrepositoryandthenworkontheirforkedrepository.

HereweexplaintheGitHubfork+pullrequestmodelsothatnewcomersonlyhavetolearnoneprocess(C4)inorderto
contribute.

Tosubmitapatch,aContributorSHALLcreateaPlatformpullrequestbacktotheproject.

GitHubhasmadethissosimplethatwedon'tneedtolearngitcommandstodoit,forwhichI'mdeeplygrateful.Sometimes,I'll
tellpeoplewhoIdon'tparticularlylikethatcommandlinegitisawesomeandalltheyneedtodoislearngit'sinternalmodelin
detailbeforetryingtouseitonrealwork.WhenIseethemseveralmonthslatertheylookchanged.

AContributorSHALLNOTcommitchangesdirectlytotheproject.

Anyonewhosubmitsapatchisacontributor,andallcontributorsfollowthesamerules.Nospecialprivilegestotheoriginal
authors,becauseotherwisewe'renotbuildingacommunity,onlyboostingouregos.

Todiscussapatch,peopleMAYcommentonthePlatformpullrequest,onthecommit,orelsewhere.

Randomlydistributeddiscussionsmaybeconfusingifyou'rewalkingupforthefirsttime,butGitHubsolvesthisforallcurrent
participantsbysendingemailstothosewhoneedtofollowwhat'sgoingon.Wehadthesameexperienceandthesamesolution
inWikidot,anditworks.There'snoevidencethatdiscussingindifferentplaceshasanynegativeeffect.

Toacceptorrejectapatch,aMaintainerSHALLusethePlatforminterface.

WorkingviatheGitHubwebuserinterfacemeanspullrequestsareloggedasissues,withworkflowanddiscussion.I'msure
therearemorecomplexwaystowork.Complexityiseasyit'ssimplicitythat'sincrediblyhard.

MaintainersSHALLNOTaccepttheirownpatches.

TherewasarulewedefinedintheFFIIyearsagotostoppeopleburningout:nolessthantwopeopleonanyproject.One
personprojectstendtoendintears,oratleastbittersilence.Wehavequitealotofdataonburnout,whyithappens,andhowto
preventit(evencureit).I'llexplorethislaterinthechapter,becauseifyouworkwithoronopensourceyouneedtobeawareof
therisks.The"nomergingyourownpatch"rulehastwogoals.First,ifyouwantyourprojecttobeC4certified,youhavetogetat
leastoneotherpersontohelp.Ifnoonewantstohelpyou,perhapsyouneedtorethinkyourproject.Second,havingacontrolfor
everypatchmakesitmuchmoresatisfying,keepsusmorefocused,andstopsusbreakingtherulesbecausewe'reinahurry,or
justfeelinglazy.

http://zguide.zeromq.org/page:all 136/225
12/31/2015 MQ - The Guide - MQ - The Guide
MaintainersSHALLNOTmakevaluejudgmentsoncorrectpatches.

Wealreadysaidthisbutit'sworthrepeating:theroleofMaintainerisnottojudgeapatch'ssubstance,onlyitstechnicalquality.
Thesubstantiveworthofapatchonlyemergesovertime:peopleuseit,andlikeit,ortheydonot.Andifnooneisusingapatch,
eventuallyit'llannoysomeoneelsewhowillremoveit,andnoonewillcomplain.

MaintainersSHALLmergecorrectpatchesrapidly.

ThereisacriteriaIcallchangelatency,whichistheroundtriptimefromidentifyingaproblemtotestingasolution.Thefasterthe
better.Ifmaintainerscannotrespondtopullrequestsasrapidlyaspeopleexpect,they'renotdoingtheirjob(ortheyneedmore
hands).

TheContributorMAYtaganissueas"Ready"aftermakingapullrequestfortheissue.

Bydefault,GitHubofferstheusualvarietyofissues,butwithC4wedon'tusethem.Instead,weneedjusttwolabels,"Urgent"
and"Ready".Acontributorwhowantsanotherusertotestanissuecanthenlabelitas"Ready".

TheuserwhocreatedanissueSHOULDclosetheissueaftercheckingthepatchissuccessful.

Whenonepersonopensanissue,andanotherworksonit,it'sbesttoallowtheoriginalpersontoclosetheissue.Thatactsasa
doublecheckthattheissuewasproperlyresolved.

MaintainersSHOULDaskforimprovementstoincorrectpatchesandSHOULDrejectincorrectpatchesiftheContributor
doesnotrespondconstructively.

Initially,Ifeltitwasworthmergingallpatches,nomatterhowpoor.There'sanelementoftrollinginvolved.Acceptingeven
obviouslyboguspatchescould,Ifelt,pullinmorecontributors.Butpeoplewereuncomfortablewiththissowedefinedthe
"correctpatch"rules,andtheMaintainer'sroleincheckingforquality.Onthenegativeside,Ithinkwedidn'ttakesomeinteresting
risks,whichcouldhavepaidoffwithmoreparticipants.Onthepositiveside,thishasledtolibzmqmaster(andinallprojects
thatuseC4)beingpracticallyproductionquality,practicallyallthetime.

AnyContributorwhohasvaluejudgmentsonacorrectpatchSHOULDexpresstheseviatheirownpatches.

Inessence,thegoalhereistoallowuserstotrypatchesratherthantospendtimearguingprosandcons.Aseasyasitisto
makeapatch,it'saseasytorevertitwithanotherpatch.Youmightthinkthiswouldleadto"patchwars",butthathasn't
happened.We'vehadahandfulofcasesinlibzmqwherepatchesbyonecontributorwerekilledbyanotherpersonwhofeltthe
experimentationwasn'tgoingintherightdirection.Itiseasierthanseekingupfrontconsensus.

MaintainersMAYcommitchangestononsourcedocumentationdirectlytotheproject.

Thisexitallowsmaintainerswhoaremakingreleasenotestopushthosewithouthavingtocreateanissuewhichwouldthen
affectthereleasenotes,leadingtostressonthespacetimefabricandpossiblyinvoluntaryreroutingbackwardsinthefourth
dimensiontobeforetheinventionofcoldbeer.Shudder.Itissimplertoagreethatreleasenotesaren'ttechnicallysoftware.

CreatingStableReleases topprevnext

Wewantsomeguaranteeofstabilityforaproductionsystem.Inthepast,thismeanttakingunstablecodeandthenovermonths
hammeringoutthebugsandfaultsuntilitwassafetotrust.iMatix'sjob,foryears,hasbeentodothistolibzmq,turningraw
codeintopackagesbyallowingonlybugfixesandnonewcodeintoa"stabilizationbranch".It'ssurprisinglynotasthanklessasit
sounds.

http://zguide.zeromq.org/page:all 137/225
12/31/2015 MQ - The Guide - MQ - The Guide
Now,sincewewentfullspeedwithC4,we'vefoundthatgitmasteroflibzmqismostlyperfect,mostofthetime.Thisfreesour
timetodomoreinterestingthings,suchasbuildingnewopensourcelayersontopoflibzmq.However,peoplestillwantthat
guarantee:manyuserswillsimplynotinstallexceptfroman"official"release.Soastablereleasetodaymeanstwothings.First,
asnapshotofthemastertakenatatimewhentherewerenonewchangesforawhile,andnodramaticopenbugs.Second,a
waytofinetunethatsnapshottofixthecriticalissuesremaininginit.

Thisistheprocessweexplaininthissection.

TheprojectSHALLhaveonebranch("master")thatalwaysholdsthelatestinprogressversionandSHOULDalwaysbuild.

Thisisredundantbecauseeverypatchalwaysbuildsbutit'sworthrestating.Ifthemasterdoesn'tbuild(andpassitstests),
someoneneedswakingup.

TheprojectSHALLNOTusetopicbranchesforanyreason.PersonalforksMAYusetopicbranches.

I'llcometobranchessoon.Inshort(or"tldr",astheysayonthewebs),branchesmaketherepositorytoocomplexandfragile,
andrequireupfrontagreement,allofwhichareexpensiveandavoidable.

TomakeastablereleasesomeoneSHALLforktherepositorybycopyingitandthusbecomemaintainerofthisrepository.
ForkingaprojectforstabilizationMAYbedoneunilaterallyandwithoutagreementofprojectmaintainers.

It'sfreesoftware.Noonehasamonopolyonit.Ifyouthinkthemaintainersaren'tproducingstablereleasesright,forkthe
repositoryanddoityourself.Forkingisn'tafailure,it'sanessentialtoolforcompetition.Youcan'tdothiswithbranches,which
meansabranchbasedreleasepolicygivestheprojectmaintainersamonopoly.Andthat'sbadbecausethey'llbecomelazier
andmorearrogantthanifrealcompetitionischasingtheirheels.

AstabilizationprojectSHOULDbemaintainedbythesameprocessasthemainproject.

Stabilizationprojectshavemaintainersandcontributorslikeanyproject.Inpracticeweusuallycherrypickpatchesfromthemain
projecttothestabilizationproject,butthat'sjustaconvenience.

Apatchtoarepositorydeclared"stable"SHALLbeaccompaniedbyareproducibletestcase.

Bewareofaonesizefitsallprocess.Newcodedoesnotneedthesameparanoiaascodethatpeoplearetrustingforproduction
use.Inthenormaldevelopmentprocess,wedidnotmentiontestcases.There'sareasonforthis.WhileIlovetestablepatches,
manychangesaren'teasilyoratalltestable.However,tostabilizeacodebaseyouwanttofixonlyseriousbugs,andyouwantto
be100%sureeverychangeisaccurate.Thismeansbeforeandaftertestsforeverychange.

EvolutionofPublicContracts topprevnext

By"publiccontracts",ImeanAPIsandprotocols.Upuntiltheendof2011,libzmq'snaturallyhappystatewasmarredbybroken
promisesandbrokencontracts.Westoppedmakingpromises(aka"roadmaps")forlibzmqcompletely,andourdominant
theoryofchangeisnowthatitemergescarefullyandaccuratelyovertime.Ata2012Chicagomeetup,GarrettSmithandChuck
Remescalledthisthe"drunkenstumbletogreatness",whichishowIthinkofitnow.

Westoppedbreakingpubliccontractssimplybybanningthepractice.Beforethenithadbeen"OK"(asinwediditandeveryone
complainedbitterly,andweignoredthem)tobreaktheAPIorprotocolsolongaswechangedthemajorversionnumber.Sounds
fine,untilyougetZeroMQv2.0,v3.0,andv4.0allindevelopmentatthesametime,andnotspeakingtoeachother.

AllPublicContracts(APIsorprotocols)SHOULDbedocumented.

You'dthinkthiswasagivenforprofessionalsoftwareengineersbutno,it'snot.So,it'sarule.YouwantC4certificationforyour

http://zguide.zeromq.org/page:all 138/225
12/31/2015 MQ - The Guide - MQ - The Guide
project,youmakesureyourpubliccontractsaredocumented.No"It'sspecifiedinthecode"excuses.Codeisnotacontract.
(Yes,IintendatsomepointtocreateaC4certificationprocesstoactasaqualityindicatorforopensourceprojects.)

AllPublicContractsSHALLuseSemanticVersioning.

Thisruleismainlyherebecausepeopleaskedforit.I'venorealloveforit,asSemanticVersioningiswhatledtothesocalled
"WhydoesZeroMQnotspeaktoitself?!"debacle.I'veneverseentheproblemthatthissolved.Somethingaboutruntime
validationoflibraryversions,orsomesuch.

AllPublicContractsSHOULDhavespaceforextensibilityandexperimentation.

Now,therealthingisthatpubliccontractsdochange.It'snotaboutnotchangingthem.It'saboutchangingthemsafely.This
meanseducating(especiallyprotocol)designerstocreatethatspaceupfront.

ApatchthatmodifiesastablePublicContractSHOULDnotbreakexistingapplicationsunlessthereisoverridingconsensus
onthevalueofdoingthis.

SometimesthepatchisfixingabadAPIthatnooneisusing.It'safreedomweneed,butitshouldbebasedonconsensus,not
oneperson'sdogma.However,makingrandomchanges"justbecause"isnotgood.InZeroMQv3.x,didwebenefitfrom
renamingZMQ_NOBLOCKtoZMQ_DONTWAIT?Sure,it'sclosertothePOSIXsocketrecv()call,butisthatworthbreaking
thousandsofapplications?Nooneeverreporteditasanissue.TomisquoteStallman:"yourfreedomtocreateanidealworld
stopsoneinchfrommyapplication."

ApatchthatintroducesnewfeaturestoaPublicContractSHOULDdosousingnewnames.

WehadtheexperienceinZeroMQonceortwiceofnewfeaturesusingoldnames(orworse,usingnamesthatwerestillinuse
elsewhere).ZeroMQv3.0hadanewlyintroduced"ROUTER"socketthatwastotallydifferentfromtheexistingROUTERsocket
in2.x.Dearlord,youshouldbefacepalming,why?Thereason:apparently,evensmartpeoplesometimesneedregulationto
stopthemdoingsillythings.

OldnamesSHOULDbedeprecatedinasystematicfashionbymarkingnewnamesas"experimental"untiltheyarestable,
thenmarkingtheoldnamesas"deprecated".

Thislifecyclenotationhasthegreatbenefitofactuallytellinguserswhatisgoingonwithaconsistentdirection."Experimental"
means"wehaveintroducedthisandintendtomakeitstableifitworks".Itdoesnotmean,"wehaveintroducedthisandwill
removeitatanytimeifwefeellikeit".Oneassumesthatcodethatsurvivesmorethanonepatchcycleismeanttobethere.
"Deprecated"means"wehavereplacedthisandintendtoremoveit".

Whensufficienttimehaspassed,olddeprecatednamesSHOULDbemarked"legacy"andeventuallyremoved.

Intheorythisgivesapplicationstimetomoveontostablenewcontractswithoutrisk.Youcanupgradefirst,makesurethings
work,andthen,overtime,fixthingsuptoremovedependenciesondeprecatedandlegacyAPIsandprotocols.

OldnamesSHALLNOTbereusedbynewfeatures.

Ah,yes,thejoywhenZeroMQv3.xrenamedthetopusedAPIfunctions(zmq_send()andzmq_recv())andthenrecycledthe
oldnamesfornewmethodsthatwereutterlyincompatible(andwhichIsuspectfewpeopleactuallyuse).Youshouldbeslapping
yourselfinconfusionagain,butreally,thisiswhathappenedandIwasasguiltyasanyone.Afterall,wedidchangetheversion
number!Theonlybenefitofthatexperiencewastogetthisrule.

Whenoldnamesareremoved,theirimplementationsMUSTprovokeanexception(assertion)ifusedbyapplications.

I'venottestedthisruletobecertainitmakessense.Perhapswhatitmeansis"ifyoucan'tprovokeacompileerrorbecausethe
http://zguide.zeromq.org/page:all 139/225
12/31/2015 MQ - The Guide - MQ - The Guide
APIisdynamic,provokeanassertion".

ProjectAdministration topprevnext

TheprojectfoundersSHALLactasAdministratorstomanagethesetofprojectMaintainers.

Someoneneedstoadministertheproject,anditmakessensethattheoriginalfoundersstartthisballrolling.

TheAdministratorsSHALLensuretheirownsuccessionovertimebypromotingthemosteffectiveMaintainers.

Atthesametime,asfounderofaprojectyoureallywanttogetoutofthewaybeforeyoubecomeoverattachedtoit.Promoting
themostactiveandconsistentmaintainersisgoodforeveryone.

AnewContributorwhomakesacorrectpatchSHALLbeinvitedtobecomeaMaintainer.

ImetFelixGeisendrferinLyonsin2012attheMixITconferencewhereIpresentedSocialArchitectureandonethingthatcame
outofthiswasFelix'snowfamousPullRequestHack.ItfitselegantlyintoC4andsolvestheproblemofmaintainersdroppingout
overtime.

AdministratorsMAYremoveMaintainerswhoareinactiveforanextendedperiodoftime,orwhorepeatedlyfailtoapplythis
processaccurately.

ThiswasIanBarber'ssuggestion:weneedawaytocropinactivemaintainers.Originallymaintainerswereselfelectedbutthat
makesithardtodroptroublemakers(whoarerare,butnotunknown).

C4isnotperfect.Fewthingsare.Theprocessforchangingit(Digistan'sCOSS)isalittleoutdatednow:itreliesonasingleeditor
workflowwiththeabilitytofork,butnotmerge.ThisseemstoworkbutitcouldbebettertouseC4forprotocolslikeC4.

ARealLifeExample topprevnext

Inthisemailthread,DanGoesaskshowtomakeapublisherthatknowswhenanewclientsubscribes,andsendsoutprevious
matchingmessages.It'sastandardpubsubtechniquecalled"lastvaluecaching".Nowovera1waytransportlikepgm(where
subscribersliterallysendnopacketsbacktopublishers),thiscan'tbedone.ButoverTCP,itcan,ifweuseanXPUBsocketand
ifthatsocketdidn'tcleverlyfilteroutduplicatesubscriptionstoreduceupstreamtraffic.

ThoughI'mnotanexpertcontributortolibzmq,thisseemslikeafunproblemtosolve.Howhardcoulditbe?Istartbyforking
thelibzmqrepositorytomyownGitHubaccountandthencloneittomylaptop,whereIbuildit:

gitclonegit@github.com:hintjens/libzmq.git
cdlibzmq
./autogen.sh
./configure
make

Becausethelibzmqcodeisneatandwellorganized,itwasquiteeasytofindthemainfilestochange(xpub.cppand
xpub.hpp).Eachsockettypehasitsownsourcefileandclass.Theyinheritfromsocket_base.cpp,whichhasthishookfor
socketspecificoptions:

//First,checkwhetherspecificsockettypeoverloadstheoption.

http://zguide.zeromq.org/page:all 140/225
12/31/2015 MQ - The Guide - MQ - The Guide
intrc=xsetsockopt(option_,optval_,optvallen_)
if(rc==0||errno!=EINVAL)
returnrc

//Ifthesockettypedoesn'tsupporttheoption,passitto
//thegenericoptionparser.
returnoptions.setsockopt(option_,optval_,optvallen_)

ThenIcheckwheretheXPUBsocketfiltersoutduplicatesubscriptions,initsxread_activatedmethod:

boolunique
if(*data==0)
unique=subscriptions.rm(data+1,size1,pipe_)
else
unique=subscriptions.add(data+1,size1,pipe_)

//Ifthesubscriptionisnotaduplicatestoreitsothatitcanbe
//passedtousedonnextrecvcall.
if(unique&&options.type!=ZMQ_PUB)
pending.push_back(blob_t(data,size))

Atthisstage,I'mnottooconcernedwiththedetailsofhowsubscriptions.rmandsubscriptions.addwork.Thecode
seemsobviousexceptthat"subscription"alsoincludesunsubscription,whichconfusedmeforafewseconds.Ifthere'sanything
elseweirdinthermandaddmethods,that'saseparateissuetofixlater.Timetomakeanissueforthischange.Iheadoverto
thezeromq.jira.comsite,login,andcreateanewentry.

Jirakindlyoffersmethetraditionalchoicebetween"bug"and"newfeature"andIspendthirtysecondswonderingwherethis
counterproductivehistoricaldistinctioncamefrom.Presumably,the"we'llfixbugsforfree,butyoupayfornewfeatures"
commercialproposal,whichstemsfromthe"youtelluswhatyouwantandwe'llmakeitfor$X"modelofsoftwaredevelopment,
andwhichgenerallyleadsto"wespentthreetimes$Xandwegotwhat?!"emailFistsofFury.

Puttingsuchthoughtsaside,Icreateanissue#443anddescribedtheproblemandplausiblesolution:

Problem:XPUBsocketfiltersoutduplicatesubscriptions(deliberatedesign).Howeverthismakesitimpossibletodo
subscriptionbasedintelligence.Seehttp://lists.zeromq.org/pipermail/zeromqdev/2012October/018838.htmlforausecase.
Solution:makethisbehaviorconfigurablewithasocketoption.

It'snamingtime.TheAPIsitsininclude/zmq.h,sothisiswhereIaddedtheoptionname.Whenyouinventaconceptinan
APIoranywhere,pleasetakeamomenttochooseanamethatisexplicitandshortandobvious.Don'tfallbackongeneric
namesthatneedadditionalcontexttounderstand.Youhaveonechancetotellthereaderwhatyourconceptisanddoes.A
namelikeZMQ_SUBSCRIPTION_FORWARDING_FLAGisterrible.Ittechnicallykindofaimsintherightdirection,butismiserably
longandobscure.IchoseZMQ_XPUB_VERBOSE:shortandexplicitandclearlyanon/offswitchwith"off"beingthedefaultsetting.

So,it'stimetoaddaprivatepropertytothexpubclassdefinitioninxpub.hpp:

//Iftrue,sendallsubscriptionmessagesupstream,notjust
//uniqueones
boolverbose

Andthenliftsomecodefromrouter.cpptoimplementthexsetsockoptmethod.Finally,changethexread_activated
methodtousethisnewoption,andwhileatit,makethattestonsockettypemoreexplicittoo:

//Ifthesubscriptionisnotaduplicatestoreitsothatitcanbe
//passedtousedonnextrecvcall.
if(options.type==ZMQ_XPUB&&(unique||verbose))
pending.push_back(blob_t(data,size))

http://zguide.zeromq.org/page:all 141/225
12/31/2015 MQ - The Guide - MQ - The Guide
Thethingbuildsnicelythefirsttime.Thismakesmealittlesuspicious,butbeinglazyandjetlaggedIdon'timmediatelymakea
testcasetoactuallytryoutthechange.Theprocessdoesn'tdemandthat,evenifusuallyI'ddoitjusttocatchthatinevitable10%
ofmistakesweallmake.Idohoweverdocumentthisnewoptiononthedoc/zmq_setsockopt.txtmanpage.Intheworst
case,Iaddedapatchthatwasn'treallyuseful.ButIcertainlydidn'tbreakanything.

Idon'timplementamatchingzmq_getsockoptbecause"minimal"meanswhatitsays.There'snoobvioususecaseforgetting
thevalueofanoptionthatyoupresumablyjustset,incode.Symmetryisn'tavalidreasontodoublethesizeofapatch.Idid
havetodocumentthenewoptionbecausetheprocesssays,"AllPublicContractsSHOULDbedocumented."

Committingthecode,Ipushthepatchtomyforkedrepository(the"origin"):

gitcommitam"Fixedissue#443"
gitpushoriginmaster

SwitchingtotheGitHubwebinterface,Igotomylibzmqfork,andpressthebig"PullRequest"buttonatthetop.GitHubasks
meforatitle,soIenter"AddedZMQ_XPUB_VERBOSEoption".I'mnotsurewhyitasksthisasImadeaneatcommitmessage
buthey,let'sgowiththeflowhere.

ThismakesanicelittlepullrequestwithtwocommitstheoneI'dmadeamonthagoonthereleasenotestoprepareforthe
v3.2.1release(amonthpassessoquicklywhenyouspendmostofitinairports),andmyfixforissue#443(37newlinesof
code).GitHubletsyoucontinuetomakecommitsafteryou'vekickedoffapullrequest.Theygetqueuedupandmergedinone
go.Thatiseasy,butthemaintainermayrefusethewholebundlebasedononepatchthatdoesn'tlookvalid.

BecauseDaniswaiting(atleastinmyhighlyoptimisticimagination)forthisfix,IgobacktothezeromqdevlistandtellhimI've
madethepatch,withalinktothecommit.ThefasterIgetfeedback,thebetter.It's1a.m.inSouthKoreaasImakethispatch,so
earlyeveninginEurope,andmorningintheStates.Youlearntocounttimezoneswhenyouworkwithpeopleacrosstheworld.
Ianisinaconference,Mikkoisgettingonaplane,andChuckisprobablyintheoffice,butthreehourslater,Ianmergesthepull
request.

AfterIanmergesthepullrequest,Iresynchronizemyforkwiththeupstreamlibzmqrepository.First,Iaddaremotethattellsgit
wherethisrepositorysits(IdothisjustonceinthedirectorywhereI'mworking):

gitremoteaddupstreamgit://github.com/zeromq/libzmq.git

AndthenIpullchangesbackfromtheupstreammasterandcheckthegitlogtodoublecheck:

gitpullrebaseupstreammaster
gitlog

Andthatisprettymuchit,intermsofhowmuchgitoneneedstolearnandusetocontributepatchestolibzmq.Sixgit
commandsandsomeclickingonwebpages.Mostimportantlytomeasanaturallylazy,stupid,andeasilyconfuseddeveloper,I
don'thavetolearngit'sinternalmodels,andneverhavetodoanythinginvolvingthoseinfernalenginesofstructuralcomplexity
wecall"gitbranches".Nextup,theattemptedassassinationofgitbranches.Let'slivedangerously!

GitBranchesConsideredHarmful topprevnext

Oneofgit'smostpopularfeaturesisitsbranches.Almostallprojectsthatusegitusebranches,andtheselectionofthe"best"
branchingstrategyislikeariteofpassageforanopensourceproject.VincentDriessen'sgitflowmaybethebestknown.Ithas
basebranches(master,develop),featurebranches,releasebranches,hotfixbranches,andsupportbranches.Manyteamshave
adoptedgitflow,whichevenhasgitextensionstosupportit.I'magreatbelieverinpopularwisdom,butsometimesyouhaveto
recognizemassdelusionforwhatitis.

HereisasectionofC4thatmighthaveshockedyouwhenyoufirstreadit:

TheprojectSHALLNOTusetopicbranchesforanyreason.PersonalforksMAYusetopicbranches.

http://zguide.zeromq.org/page:all 142/225
12/31/2015 MQ - The Guide - MQ - The Guide
Tobeclear,it'spublicbranchesinsharedrepositoriesthatI'mtalkingabout.Usingbranchesforprivatework,e.g.,toworkon
differentissues,appearstoworkwellenough,thoughit'smorecomplexitythanIpersonallyenjoy.TochannelStallmanagain:
"yourfreedomtocreatecomplexityendsoneinchfromoursharedworkspace."

LiketherestofC4,therulesonbranchesarenotaccidental.TheycamefromourexperiencemakingZeroMQ,startingwhen
MartinSustrikandIrethoughthowtomakestablereleases.Webothloveandappreciatesimplicity(somepeopleseemtohavea
remarkabletoleranceforcomplexity).WechattedforawhileIaskedhim,"I'mgoingtostartmakingastablerelease.Wouldit
beOKformetomakeabranchinthegityou'reworkingin?"Martindidn'tliketheidea."OK,ifIforktherepository,Icanmove
patchesfromyourrepotothatone".Thatfeltmuchbettertobothofus.

TheresponsefrommanyintheZeroMQcommunitywasshockandhorror.Peoplefeltwewerebeinglazyandmaking
contributorsworkhardertofindthe"right"repository.Still,thisseemedsimple,andindeeditworkedsmoothly.Thebestpartwas
thatweeachworkedaswewantedto.Whereasbefore,theZeroMQrepositoryhadfelthorriblycomplex(anditwasn'teven
anythinglikegitflow),thisfeltsimple.Anditworked.Theonlydownsidewasthatwelostasingleunifiedhistory.Now,perhaps
historianswillfeelrobbed,butIhonestlycan'tseethatthehistoricalminutiaeofwhochangedwhat,when,includingeverybranch
andexperiment,areworthanysignificantpainorfriction.

Peoplehavegottenusedtothe"multiplerepositories"approachinZeroMQandwe'vestartedusingthatinotherprojectsquite
successfully.Myownopinionisthathistorywilljudgegitbranchesandpatternslikegitflowasacomplexsolutiontoimaginary
problemsinheritedfromthedaysofSubversionandmonolithicrepositories.

Moreprofoundly,andperhapsthisiswhythemajorityseemstobe"wrong":Ithinkthebranchesversusforksargumentisreallya
deeperdesignversusevolveargumentabouthowtomakesoftwareoptimally.I'lladdressthatdeeperargumentinthenext
section.Fornow,I'lltrytobescientificaboutmyirrationalhatredofbranches,bylookingatanumberofcriteria,andcomparing
branchesandforksineachone.

SimplicityVersusComplexity topprevnext

Thesimpler,thebetter.

Thereisnoinherentreasonwhybranchesaremorecomplexthanforks.However,gitflowusesfivetypesofbranch,whereasC4
usestwotypesoffork(development,andstable)andonebranch(master).Circumstantialevidenceisthusthatbranchesleadto
morecomplexitythanforks.Fornewusers,itisdefinitely,andwe'vemeasuredthisinpractice,easiertolearntoworkwithmany
repositoriesandnobranchesexceptmaster.

ChangeLatency topprevnext

Thesmallerandmorerapidthedelivery,thebetter.

Developmentbranchesseemtocorrelatestronglywithlarge,slow,riskydeliveries."Sorry,Ihavetomergethisbranchbeforewe
cantestthenewversion"signalsabreakdowninprocess.It'scertainlynothowC4works,whichisbyfocusingtightlyon
individualproblemsandtheirminimalsolutions.Allowingbranchesindevelopmentraiseschangelatency.Forkshaveadifferent
outcome:it'suptotheforkertoensurethathischangesmergecleanly,andtokeepthemsimplesotheywon'tberejected.

LearningCurve topprevnext

Thesmootherthelearningcurve,thebetter.

Evidencedefinitelyshowsthatlearningtousegitbranchesiscomplex.Forsomepeople,thisisOK.Formostdevelopers,every
cyclespentlearninggitisacyclelostonmoreproductivethings.I'vebeentoldseveraltimes,bydifferentpeoplethatIdonotlike
branchesbecauseI"neverproperlylearnedgit".Thatisfair,butitisacriticismofthetool,notthehuman.

CostofFailure topprevnext

http://zguide.zeromq.org/page:all 143/225
12/31/2015 MQ - The Guide - MQ - The Guide

Thelowerthecostoffailure,thebetter.

Branchesdemandmoreperfectionfromdevelopersbecausemistakespotentiallyaffectothers.Thisraisesthecostoffailure.
Forksmakefailureextremelycheapbecauseliterallynothingthathappensinaforkcanaffectothersnotusingthatfork.

UpfrontCoordination topprevnext

Thelessneedforupfrontcoordination,thebetter.

Youcandoahostilefork.Youcannotdoahostilebranch.Branchesdependonupfrontcoordination,whichisexpensiveand
fragile.Onepersoncanvetothedesiresofawholegroup.ForexampleintheZeroMQcommunitywewereunabletoagreeona
gitbranchingmodelforayear.Wesolvedthatbyusingforkinginstead.Theproblemwentaway.

Scalability topprevnext

Themoreyoucanscaleaproject,thebetter.

Thestrongassumptioninallbranchstrategiesisthattherepositoryistheproject.Butthereisalimittohowmanypeopleyoucan
gettoagreetoworktogetherinonerepository.AsIexplained,thecostofupfrontcoordinationcanbecomefatal.Amorerealistic
projectscalesbyallowinganyonetostarttheirownrepositories,andensuringthesecanworktogether.AprojectlikeZeroMQ
hasdozensofrepositories.Forkinglooksmorescalablethanbranching.

SurpriseandExpectations topprevnext

Thelesssurprising,thebetter.

Peopleexpectbranchesandfindforkstobeuncommonandthusconfusing.Thisistheoneaspectwherebrancheswin.Ifyou
usebranches,asinglepatchwillhavethesamecommithashtag,whereasacrossforksthepatchwillhavedifferenthashtags.
Thatmakesithardertotrackpatchesastheycrossforks,true.Butseriously,havingtotrackhexadecimalhashtagsisnota
feature.It'sabug.Sometimesbetterwaysofworkingaresurprisingatfirst.

EconomicsofParticipation topprevnext

Themoretangibletherewards,thebetter.

Peopleliketoowntheirworkandgetcreditforit.Thisismucheasierwithforksthanwithbranches.Forkscreatemore
competitioninahealthyway,whilebranchessuppresscompetitionandforcepeopletocollaborateandsharecredit.Thissounds
positivebutinmyexperienceitdemotivatespeople.Abranchisn'taproductyoucan"own",whereasaforkcanbe.

RobustnessinConflict topprevnext

Themoreamodelcansurviveconflict,thebetter.

Likeitornot,peoplefightoverego,status,beliefs,andtheoriesoftheworld.Challengeisanecessarypartofscience.Ifyour
organizationalmodeldependsonagreement,youwon'tsurvivethefirstrealfight.Branchesdonotsurviverealargumentsand
http://zguide.zeromq.org/page:all 144/225
12/31/2015 MQ - The Guide - MQ - The Guide
fights,whereasforkscanbehostile,andstillbenefitallparties.Andthisisindeedhowfreesoftwareworks.

GuaranteesofIsolation topprevnext

Thestrongertheisolationbetweenproductioncodeandexperiment,thebetter.

Peoplemakemistakes.I'veseenexperimentalcodepushedtomainlineproductionbyerror.I'veseenpeoplemakebadpanic
changesunderstress.Buttherealfaultisinallowingtwoentirelyseparategenerationsofproducttoexistinthesameprotected
space.Ifyoucanpushtorandombranchx,youcanpushtomaster.Branchesdonotguaranteeisolationofproductioncritical
code.Forksdo.

Visibility topprevnext

Themorevisibleourwork,thebetter.

Forkshavewatchers,issues,aREADME,andawiki.Brancheshavenoneofthese.Peopletryforks,buildthem,breakthem,
patchthem.Branchessitthereuntilsomeonerememberstoworkonthem.Forkshavedownloadsandtarballs.Branchesdonot.
Whenwelookforselforganization,themorevisibleanddeclarativetheproblems,thefasterandmoreaccuratelywecanwork.

Conclusions topprevnext

Inthissection,I'velistedaseriesofarguments,mostofwhichcamefromfellowteammembers.Here'showitseemstobreak
down:gitveteransinsistthatbranchesarethewaytowork,whereasnewcomerstendtofeelintimidatedwhenaskedtonavigate
gitbranches.Gitisnotaneasytooltomaster.Whatwe'vediscovered,accidentally,isthatwhenyoustopusingbranchesatall,
gitbecomestrivialtouse.Itliterallycomesdowntosixcommands(clone,remote,commit,log,push,andpull).
Furthermore,abranchfreeprocessactuallyworks,we'veuseditforacoupleofyearsnow,andnovisibledownsideexcept
surprisetotheveteransandgrowthof"single"projectsovermultiplerepositories.

Ifyoucan'tuseforks,perhapsbecauseyourfirmdoesn'ttrustGitHub'sprivaterepositories,thenyoucanperhapsusetopic
branches,oneperissue.You'llstillsufferthecostsofgettingupfrontconsensus,lowcompetitiveness,andriskofhumanerror.

DesigningforInnovation topprevnext

Let'slookatinnovation,whichWikipediadefinesas,"thedevelopmentofnewvaluesthroughsolutionsthatmeetnew
requirements,inarticulateneeds,oroldcustomerandmarketneedsinvalueaddingnewways."Thisreallyjustmeanssolving
problemsmorecheaply.Itsoundsstraightforward,butthehistoryofcollapsedtechgiantsprovesthatit'snot.I'lltrytoexplain
howteamssooftengetitwrong,andsuggestawayfordoinginnovationright.

TheTaleofTwoBridges topprevnext

Twooldengineersweretalkingoftheirlivesandboastingoftheirgreatestprojects.Oneoftheengineersexplainedhowhehad
designedoneofthegreatestbridgesevermade.

"Webuiltitacrossarivergorge,"hetoldhisfriend."Itwaswideanddeep.Wespenttwoyearsstudyingtheland,andchoosing
designsandmaterials.Wehiredthebestengineersanddesignedthebridge,whichtookanotherfiveyears.Wecontractedthe
largestengineeringfirmstobuildthestructures,thetowers,thetollbooths,andtheroadsthatwouldconnectthebridgetothe
mainhighways.Dozensdiedduringtheconstruction.Undertheroadlevelwehadtrains,andaspecialpathforcyclists.That

http://zguide.zeromq.org/page:all 145/225
12/31/2015 MQ - The Guide - MQ - The Guide
bridgerepresentedyearsofmylife."

Thesecondmanreflectedforawhile,thenspoke."Oneeveningmeandafriendgotdrunkonvodka,andwethrewaropeacross
agorge,"hesaid."Justarope,tiedtotwotrees.Thereweretwovillages,oneateachside.Atfirst,peoplepulledpackages
acrossthatropewithapulleyandstring.Thensomeonethrewasecondrope,andbuiltafootwalk.Itwasdangerous,butthe
kidslovedit.Agroupofmenthenrebuiltthat,madeitsolid,andwomenstartedtocross,everyday,withtheirproduce.Amarket
grewupononesideofthebridge,andslowlythatbecamealargetown,becausetherewasalotofspaceforhouses.Therope
bridgegotreplacedwithawoodenbridge,toallowhorsesandcartstocross.Thenthetownbuiltarealstonebridge,withmetal
beams.Later,theyreplacedthestonepartwithsteel,andtodaythere'sasuspensionbridgestandinginthatsamespot."

Thefirstengineerwassilent."Funnything,"hesaid,"mybridgewasdemolishedabouttenyearsafterwebuiltit.Turnsoutitwas
builtinthewrongplaceandnoonewantedtouseit.Someguyshadthrownaropeacrossthegorge,afewmilesfurther
downstream,andthat'swhereeveryonewent."

HowZeroMQLostItsRoadMap topprevnext

PresentingZeroMQattheMixITconferenceinLyoninearly2012,Iwasaskedseveraltimesforthe"roadmap".Myanswer
was:thereisnoroadmapanylonger.Wehadroadmaps,andwedeletedthem.Insteadofafewexpertstryingtolayoutthe
nextsteps,wewereallowingthistohappenorganically.Theaudiencedidn'treallylikemyanswer.SounFrench.

However,thehistoryofZeroMQmakesitquiteclearwhyroadmapswereproblematic.Inthebeginning,wehadasmallteam
makingthelibrary,withfewcontributors,andnodocumentedroadmap.AsZeroMQgrewmorepopularandweswitchedtomore
contributors,usersaskedforroadmaps.Sowecollectedourplanstogetherandtriedtoorganizethemintoreleases.Here,we
wrote,iswhatwillcomeinthenextrelease.

Aswerolledoutreleases,wehittheproblemthatit'sveryeasytopromisestuff,andratherhardertomakeitasplanned.Forone
thing,muchoftheworkwasvoluntary,andit'snotclearhowyouforcevolunteerstocommittoaroadmap.Butalso,priorities
canshiftdramaticallyovertime.Soweweremakingpromiseswecouldnotkeep,andtherealdeliveriesdidn'tmatchtheroad
maps.

Thesecondproblemwasthatbydefiningtheroadmap,weineffectclaimedterritory,makingitharderforotherstoparticipate.
Peopledoprefertocontributetochangestheybelieveweretheiridea.Writingdownalistofthingstodoturnscontributionintoa
choreratherthananopportunity.

Finally,wesawchangesinZeroMQthatwerequitetraumatic,andtheroadmapsdidn'thelpwiththis,despitealotofdiscussion
andeffortto"doitright".ExamplesofthiswereincompatiblechangesinAPIsandprotocols.Itwasquiteclearthatweneededa
differentapproachfordefiningthechangeprocess.

Softwareengineersdon'tlikethenotionthatpowerful,effectivesolutionscancomeintoexistencewithoutanintelligentdesigner
activelythinkingthingsthrough.AndyetnooneinthatroominLyonwouldhavequestionedevolution.Astrangeirony,andoneI
wantedtoexplorefurtherasitunderpinsthedirectiontheZeroMQcommunityhastakensincethestartof2012.

Inthedominanttheoryofinnovation,brilliantindividualsreflectonlargeproblemsetsandthencarefullyandpreciselycreatea
solution.Sometimestheywillhave"eureka"momentswherethey"get"brilliantlysimpleanswerstowholelargeproblemsets.
Theinventor,andtheprocessofinventionarerare,precious,andcancommandamonopoly.Historyisfullofsuchheroic
individuals.Weowethemourmodernworld.

Lookingmoreclosely,however,andyouwillseethatthefactsdon'tmatch.Historydoesn'tshowloneinventors.Itshowslucky
peoplewhostealorclaimownershipofideasthatarebeingworkedonbymany.Itshowsbrilliantpeoplestrikingluckyonce,and
thenspendingdecadesonfruitlessandpointlessquests.ThebestknownlargescaleinventorslikeThomasEdisonwereinfact
justverygoodatsystematicbroadresearchdonebylargeteams.It'slikeclaimingthatSteveJobsinventedeverydevicemade
byApple.Itisanicemyth,goodformarketing,bututterlyuselessaspracticalscience.

Recenthistory,muchbetterdocumentedandlesseasytomanipulate,showsthiswell.TheInternetissurelyoneofthemost
innovativeandfastmovingareasoftechnology,andoneofthebestdocumented.Ithasnoinventor.Instead,ithasamassive
economyofpeoplewhohavecarefullyandprogressivelysolvedalongseriesofimmediateproblems,documentedtheiranswers,
andmadethoseavailabletoall.TheinnovativenatureoftheInternetcomesnotfromasmall,selectbandofEinsteins.Itcomes
fromRFCsanyonecanuseandimprove,madebyhundredsandthousandsofsmart,butnotuniquelysmart,individuals.It
comesfromopensourcesoftwareanyonecanuseandimprove.Itcomesfromsharing,scaleofcommunity,andthecontinuous
accretionofgoodsolutionsanddisposalofbadones.

Herethusisanalternativetheoryofinnovation:

http://zguide.zeromq.org/page:all 146/225
12/31/2015 MQ - The Guide - MQ - The Guide
1. Thereisaninfiniteproblem/solutionterrain.
2. Thisterrainchangesovertimeaccordingtoexternalconditions.
3. Wecanonlyaccuratelyperceiveproblemstowhichweareclose.
4. Wecanrankthecost/benefiteconomicsofproblemsusingamarketforsolutions.
5. Thereisanoptimalsolutiontoanysolvableproblem.
6. Wecanapproachthisoptimalsolutionheuristically,andmechanically.
7. Ourintelligencecanmakethisprocessfaster,butdoesnotreplaceit.

Thereareafewcorollariestothis:

Individualcreativitymatterslessthanprocess.Smarterpeoplemayworkfaster,buttheymayalsoworkinthewrong
direction.It'sthecollectivevisionofrealitythatkeepsushonestandrelevant.

Wedon'tneedroadmapsifwehaveagoodprocess.Functionalitywillemergeandevolveovertimeassolutionscompete
formarketshare.

Wedon'tinventsolutionssomuchasdiscoverthem.Allsympathiestothecreativesoul.It'sjustaninformationprocessing
machinethatlikestopolishitsownegoandcollectkarma.

Intelligenceisasocialeffect,thoughitfeelspersonal.Apersoncutofffromotherseventuallystopsthinking.Wecan
neithercollectproblemsnormeasuresolutionswithoutotherpeople.

Thesizeanddiversityofthecommunityisakeyfactor.Larger,morediversecommunitiescollectmorerelevantproblems,
andsolvethemmoreaccurately,anddothisfaster,thanasmallexpertgroup.

So,whenwetrustthesolitaryexperts,theymakeclassicmistakes.Theyfocusonideas,notproblems.Theyfocusonthewrong
problems.Theymakemisjudgmentsaboutthevalueofsolvingproblems.Theydon'tusetheirownwork.

Canweturntheabovetheoryintoareusableprocess?Inlate2011,IstarteddocumentingC4andsimilarcontracts,andusing
thembothinZeroMQandinclosedsourceprojects.TheunderlyingprocessissomethingIcall"SimplicityOrientedDesign",or
SOD.Thisisareproduciblewayofdevelopingsimpleandelegantproducts.Itorganizespeopleintoflexiblesupplychainsthat
areabletonavigateaproblemlandscaperapidlyandcheaply.Theydothisbybuilding,testing,andkeepingordiscarding
minimalplausiblesolutions,called"patches".Livingproductsconsistoflongseriesofpatches,appliedoneatoptheother.

SODisrelevantfirstbecauseit'showweevolveZeroMQ.It'salsothebasisforthedesignprocesswewilluseinChapter7
AdvancedArchitectureusingZeroMQtodeveloplargerscaleZeroMQapplications.Ofcourse,youcanuseanysoftware
architecturemethodologywithZeroMQ.

TobestunderstandhowweendedupwithSOD,let'slookatthealternatives.

TrashOrientedDesign topprevnext

ThemostpopulardesignprocessinlargebusinessesseemstobeTrashOrientedDesign,orTOD.TODfeedsoffthebeliefthat
allweneedtomakemoneyaregreatideas.It'stenaciousnonsense,butapowerfulcrutchforpeoplewholackimagination.The
theorygoesthatideasarerare,sothetrickistocapturethem.It'slikenonmusiciansbeingawedbyaguitarplayer,notrealizing
thatgreattalentissocheapitliterallyplaysonthestreetsforcoins.

ThemainoutputofTODsisexpensive"ideation":concepts,designdocuments,andproductsthatgostraightintothetrashcan.It
worksasfollows:

TheCreativePeoplecomeupwithlonglistsof"wecoulddoXandY".I'veseenendlesslydetailedlistsofeverything
amazingaproductcoulddo.We'veallbeenguiltyofthis.Oncethecreativeworkofideagenerationhashappened,it'sjust
amatterofexecution,ofcourse.

Sothemanagersandtheirconsultantspasstheirbrilliantideastodesignerswhocreateacresofpreciouslyrefineddesign
documents.Thedesignerstakethetensofideasthemanagerscameupwith,andturnthemintohundredsofworld
changingdesigns.

Thesedesignsgetgiventoengineerswhoscratchtheirheadsandwonderwhotheheckcameupwithsuchnonsense.
Theystarttoargueback,butthedesignscomefromuphigh,andreally,it'snotuptoengineerstoarguewithcreative
peopleandexpensiveconsultants.

Sotheengineerscreepbacktotheircubicles,humiliatedandthreatenedintobuildingthegiganticbutohsoelegantjunk
heap.Itisbonebreakingworkbecausethedesignstakenoaccountofpracticalcosts.Minorwhimsmighttakeweeksof
http://zguide.zeromq.org/page:all 147/225
12/31/2015 MQ - The Guide - MQ - The Guide
worktobuild.Astheprojectgetsdelayed,themanagersbullytheengineersintogivinguptheireveningsandweekends.

Eventually,somethingresemblingaworkingproductmakesitoutofthedoor.It'screakyandfragile,complexandugly.
Thedesignerscursetheengineersfortheirincompetenceandpaymoreconsultantstoputlipstickontothepig,andslowly
theproductstartstolookalittlenicer.

Bythistime,themanagershavestartedtotrytoselltheproductandtheyfind,shockingly,thatnoonewantsit.
Undaunted,theycourageouslybuildmilliondollarwebsitesandadcampaignstoexplaintothepublicwhytheyabsolutely
needthisproduct.Theydodealswithotherbusinessestoforcetheproductonthelazy,stupid,andungratefulmarket.

Aftertwelvemonthsofintensemarketing,theproductstillisn'tmakingprofits.Worse,itsuffersdramaticfailuresandgets
brandedinthepressasadisaster.Thecompanyquietlyshelvesit,firestheconsultants,buysacompetingproductfroma
smallstartupandrebrandsthatasitsownVersion2.Hundredsofmillionsofdollarsendupinthetrash.

Meanwhile,anothervisionarymanagersomewhereintheorganizationdrinksalittletoomuchtequilawithsomemarketing
peopleandhasaBrilliantIdea.

TrashOrientedDesignwouldbeacaricatureifitwasn'tsocommon.Somethinglike19outof20marketreadyproductsbuiltby
largefirmsarefailures(yes,87%ofstatisticsaremadeuponthespot).Theremaining1in20probablyonlysucceedsbecause
thecompetitorsaresobadandthemarketingissoaggressive.

ThemainlessonsofTODarequitestraightforwardbuthardtoswallow.Theyare:

Ideasarecheap.Noexceptions.Therearenobrilliantideas.Anyonewhotriestostartadiscussionwith"oooh,wecando
thistoo!"shouldbebeatendownwithallthepassiononereservesfortravelingevangelists.Itislikesittinginacafeatthe
footofamountain,drinkingahotchocolateandtellingothers,"Hey,Ihaveagreatidea,wecanclimbthatmountain!And
buildachaletontop!Withtwosaunas!Andagarden!Hey,andwecanmakeitsolarpowered!Dude,that'sawesome!
Whatcolorshouldwepaintit?Green!No,blue!OK,goandmakeit,I'llstayhereandmakespreadsheetsandgraphics!"

Thestartingpointforagooddesignprocessistocollectrealproblemsthatconfrontrealpeople.Thesecondstepisto
evaluatetheseproblemswiththebasicquestion,"Howmuchisitworthtosolvethisproblem?"Havingdonethat,wecan
collectthatsetofproblemsthatareworthsolving.

Goodsolutionstorealproblemswillsucceedasproducts.Theirsuccesswilldependonhowgoodandcheapthesolution
is,andhowimportanttheproblemis(andsadly,howbigthemarketingbudgetsare).Buttheirsuccesswillalsodependon
howmuchtheydemandinefforttouseinotherwords,howsimpletheyare.

Now,afterslayingthedragonofutterirrelevance,weattackthedemonofcomplexity.

ComplexityOrientedDesign topprevnext

Reallygoodengineeringteamsandsmallfirmscanusuallybuilddecentproducts.Butthevastmajorityofproductsstillendup
beingtoocomplexandlesssuccessfulthantheymightbe.Thisisbecausespecialistteams,eventhebest,oftenstubbornly
applyaprocessIcallComplexityOrientedDesign,orCOD,whichworksasfollows:

Managementcorrectlyidentifiessomeinterestinganddifficultproblemwitheconomicvalue.Indoingso,theyalready
leapfrogoveranyTODteam.

Theteamwithenthusiasmstartstobuildprototypesandcorelayers.Theseworkasdesignedandthusencouraged,the
teamgooffintointensedesignandarchitecturediscussions,comingupwithelegantschemasthatlookbeautifuland
solid.

Managementcomesbackandchallengestheteamwithyetmoredifficultproblems.Wetendtoequatecostwithvalue,so
theharderandmoreexpensivetosolve,themorethesolutionshouldbeworth,intheirminds.

Theteam,beingengineersandthuslovingtobuildstuff,buildstuff.Theybuildandbuildandbuildandendupwith
massive,perfectlydesignedcomplexity.

Theproductsgotomarket,andthemarketscratchesitsheadandasks,"Seriously,isthisthebestyoucando?"People
dousetheproducts,especiallyiftheyaren'tspendingtheirownmoneyinclimbingthelearningcurve.

Managementgetspositivefeedbackfromitslargercustomers,whosharethesameideathathighcost(intrainingand
use)meanshighvalue,andsocontinuestopushtheprocess.

http://zguide.zeromq.org/page:all 148/225
12/31/2015 MQ - The Guide - MQ - The Guide
Meanwhilesomewhereacrosstheworld,asmallteamissolvingthesameproblemusingabetterprocess,andayear
latersmashesthemarkettolittlepieces.

CODischaracterizedbyateamobsessivelysolvingthewrongproblemsinaformofcollectivedelusion.CODproductstendto
belarge,ambitious,complex,andunpopular.MuchopensourcesoftwareistheoutputofCODprocesses.Itisinsanelyhardfor
engineerstostopextendingadesigntocovermorepotentialproblems.Theyargue,"WhatifsomeonewantstodoX?"butnever
askthemselves,"WhatistherealvalueofsolvingX?"

AgoodexampleofCODinpracticeisBluetooth,acomplex,overdesignedsetofprotocolsthatusershate.Itcontinuestoexist
onlybecauseinamassivelypatentedindustrytherearenorealalternatives.Bluetoothisperfectlysecure,whichiscloseto
pointlessforaproximityprotocol.Atthesametime,itlacksastandardAPIfordevelopers,meaningit'sreallycostlytouse
Bluetoothinapplications.

Onthe#zeromqIRCchannel,Wintreoncewroteofhowenragedhewasmanyyearsagowhenhe"foundthatXMMS2hada
workingpluginsystem,butcouldnotactuallyplaymusic."

CODisaformoflargescale"rabbitholing",inwhichdesignersandengineerscannotdistancethemselvesfromthetechnical
detailsoftheirwork.Theyaddmoreandmorefeatures,utterlymisreadingtheeconomicsoftheirwork.

ThemainlessonsofCODarealsosimple,buthardforexpertstoswallow.Theyare:

Makingstuffthatyoudon'timmediatelyhaveaneedforispointless.Doesn'tmatterhowtalentedorbrilliantyouare,ifyou
justsitdownandmakestuffpeoplearenotactuallyaskingfor,youaremostlikelywastingyourtime.

Problemsarenotequal.Somearesimple,andsomearecomplex.Ironically,solvingthesimplerproblemsoftenhasmore
valuetomorepeoplethansolvingthereallyhardones.Soifyouallowengineerstojustworkonrandomthings,they'll
mostlyfocusonthemostinterestingbutleastworthwhilethings.

Engineersanddesignerslovetomakestuffanddecoration,andthisinevitablyleadstocomplexity.Itiscrucialtohavea
"stopmechanism",awaytosetshort,harddeadlinesthatforcepeopletomakesmaller,simpleranswerstojustthemost
crucialproblems.

SimplicityOrientedDesign topprevnext

Finally,wecometotherarebutpreciousSimplicityOrientedDesign,orSOD.Thisprocessstartswitharealization:wedonot
knowwhatwehavetomakeuntilafterwestartmakingit.Comingupwithideasorlargescaledesignsisn'tjustwasteful,it'sa
directhindrancetodesigningthetrulyaccuratesolutions.Thereallyjuicyproblemsarehiddenlikefarvalleys,andanyactivity
exceptactivescoutingcreatesafogthathidesthosedistantvalleys.Youneedtokeepmobile,packlight,andmovefast.

SODworksasfollows:

Wecollectasetofinterestingproblems(bylookingathowpeopleusetechnologyorotherproducts)andwelinetheseup
fromsimpletocomplex,lookingforandidentifyingpatternsofuse.

Wetakethesimplest,mostdramaticproblemandwesolvethiswithaminimalplausiblesolution,or"patch".Eachpatch
solvesexactlyagenuineandagreeduponprobleminabrutallyminimalfashion.

Weapplyonemeasureofqualitytopatches,namely"Canthisbedoneanysimplerwhilestillsolvingthestatedproblem?"
Wecanmeasurecomplexityintermsofconceptsandmodelsthattheuserhastolearnorguessinordertousethepatch.
Thefewer,thebetter.Aperfectpatchsolvesaproblemwithzerolearningrequiredbytheuser.

Ourproductdevelopmentconsistsofapatchthatsolvestheproblem"weneedaproofofconcept"andthenevolvesinan
unbrokenlinetoamatureseriesofproducts,throughhundredsorthousandsofpatchespiledontopofeachother.

Wedonotdoanythingthatisnotapatch.Weenforcethisrulewithformalprocessesthatdemandthateveryactivityor
taskistiedtoagenuineandagreeduponproblem,explicitlyenunciatedanddocumented.

Webuildourprojectsintoasupplychainwhereeachprojectcanprovideproblemstoits"suppliers"andreceivepatchesin
return.Thesupplychaincreatesthe"stopmechanism"becausewhenpeopleareimpatientlywaitingforananswer,we
necessarilycutourworkshort.

Individualsarefreetoworkonanyprojects,andprovidepatchesatanyplacetheyfeelit'sworthwhile.Noindividuals
"own"anyproject,excepttoenforcetheformalprocesses.Asingleprojectcanhavemanyvariations,eachacollectionof
different,competingpatches.

http://zguide.zeromq.org/page:all 149/225
12/31/2015 MQ - The Guide - MQ - The Guide
Projectsexportformalanddocumentedinterfacessothatupstream(client)projectsareunawareofchangehappeningin
supplierprojects.Thusmultiplesupplierprojectscancompeteforclientprojects,ineffectcreatingafreeandcompetitive
market.

Wetieoursupplychaintorealusersandexternalclientsandwedrivethewholeprocessbyrapidcyclessothata
problemreceivedfromoutsideuserscanbeanalyzed,evaluated,andsolvedwithapatchinafewhours.

Ateverymomentfromtheveryfirstpatch,ourproductisshippable.Thisisessential,becausealargeproportionof
patcheswillbewrong(1030%)andonlybygivingtheproducttouserscanweknowwhichpatcheshavebecome
problemsthatneedsolving.

SODisahillclimbingalgorithm,areliablewayoffindingoptimalsolutionstothemostsignificantproblemsinanunknown
landscape.Youdon'tneedtobeageniustouseSODsuccessfully,youjustneedtobeabletoseethedifferencebetweenthe
fogofactivityandtheprogresstowardsnewrealproblems.

Peoplehavepointedoutthathillclimbingalgorithmshaveknownlimitations.Onegetsstuckonlocalpeaks,mainly.Butthisis
nonethelesshowlifeitselfworks:collectingtinyincrementalimprovementsoverlongperiodsoftime.Thereisnointelligent
designer.Wereducetheriskoflocalpeaksbyspreadingoutwidelyacrossthelandscape,butitissomewhatmoot.The
limitationsaren'toptional,theyarephysicallaws.Thetheorysays,thisishowinnovationreallyworks,sobetterembraceitand
workwithitthantrytoworkonthebasisofmagicalthinking.

Andinfactonceyouseeallinnovationasmoreorlesssuccessfulhillclimbing,yourealizewhysometeamsandcompaniesand
productsgetstuckinaneverneverlandofdiminishingprospects.Theysimplydon'thavethediversityandcollectiveintelligence
tofindbetterhillstoclimb.WhenNokiakilledtheiropensourceprojects,theycuttheirownthroat.

AreallygooddesignerwithagoodteamcanuseSODtobuildworldclassproducts,rapidlyandaccurately.Togetthemostout
ofSODthedesignerhastousetheproductcontinuously,fromdayone,anddevelophisorherabilitytosmelloutproblemssuch
asinconsistency,surprisingbehavior,andotherformsoffriction.Wenaturallyoverlookmanyannoyances,butagooddesigner
pickstheseupandthinksabouthowtopatchthem.Designisaboutremovingfrictionintheuseofaproduct.

Inanopensourcesetting,wedothisworkinpublic.There'sno"let'sopenthecode"moment.Projectsthatdothisareinmyview
missingthepointofopensource,whichistoengageyourusersinyourexploration,andtobuildcommunityaroundtheseedof
thearchitecture.

Burnout topprevnext

TheZeroMQcommunityhasbeenandstillisheavilydependentonprobonoindividualefforts.I'dliketothinkthateveryonewas
compensatedinsomewayfortheircontributions,andIbelievethatwithZeroMQ,contributingmeansgainingexpertiseinan
extraordinarilyvaluabletechnology,whichleadstoimprovedprofessionaloptions.

However,notallprojectswillbesoluckyandifyouworkwithorinopensource,youshouldunderstandtheriskofburnoutthat
volunteersface.Thisappliestoallprobonocommunities.Inthissection,I'llexplainwhatcausesburnout,howtorecognizeit,
howtopreventit,and(ifithappens)howtotrytotreatit.Disclaimer:I'mnotapsychiatristandthisarticleisbasedonmyown
experiencesofworkinginprobonocontextsforthelast20years,includingfreesoftwareprojects,andNGOssuchastheFFII.

Inaprobonocontext,we'reexpectedtoworkwithoutdirectorobviouseconomicincentive.Thatis,wesacrificefamilylife,
professionaladvancement,freetime,andhealthinordertoaccomplishsomegoalwehavedecidedtoaccomplish.Inany
project,weneedsomekindofrewardtomakeitworthcontinuingeachday.Inmostprobonoprojectstherewardsarevery
indirect,superficiallynoteconomicalatall.Mostly,wedothingsbecausepeoplesay,"Hey,great!"Karmaisapowerfulmotivator.

However,weareeconomicbeings,andsoonerorlater,ifaprojectcostsusagreatdealanddoesnotbringeconomicrewardsof
somekind(money,fame,anewjob),westarttosuffer.Atacertainstage,itseemsoursubconscioussimplygetsdisgustedand
says,"Enoughisenough!"andrefusestogoanyfurther.Ifwetrytoforceourselves,wecanliterallygetsick.

ThisiswhatIcall"burnout",thoughthetermisalsousedforotherkindsofexhaustion.Toomuchinvestmentonaprojectwith
toolittleeconomicreward,fortoolong.Wearegreatatmanipulatingourselvesandothers,andthisisoftenpartoftheprocess
thatleadstoburnout.Wetellourselvesthatit'sforagoodcauseandthattheotherguyisdoingOK,soweshouldbeabletoas
well.

WhenIgotburnedoutonopensourceprojectslikeXitami,IrememberclearlyhowIfelt.Isimplystoppedworkingonit,refused
toansweranymoreemails,andtoldpeopletoforgetaboutit.Youcantellwhensomeone'sburnedout.Theygooffline,and
everyonestartssaying,"He'sactingstrangedepressed,ortired"

Diagnosisissimple.Hassomeoneworkedalotonaprojectthatwasnotpayingbackinanyway?Didshemakeexceptional
http://zguide.zeromq.org/page:all 150/225
12/31/2015 MQ - The Guide - MQ - The Guide
sacrifices?Didheloseorabandonhisjoborstudiestodotheproject?Ifyou'reanswering"yes",it'sburnout.

TherearethreesimpletechniquesI'vedevelopedovertheyearstoreducetheriskofburnoutintheteamsIworkwith:

Nooneisirreplaceable.Workingsoloonacriticalorpopularprojecttheconcentrationofresponsibilityononeperson
whocannotsettheirownlimitsisprobablythemainfactor.It'samanagementtruism:ifsomeoneinyourorganizationis
irreplaceable,getridofhimorher.

Weneeddayjobstopaythebills.Thiscanbehard,butseemsnecessary.Gettingmoneyfromsomewhereelsemakesit
mucheasiertosustainasacrificialproject.

Teachpeopleaboutburnout.Thisshouldbeabasiccourseincollegesanduniversities,asprobonoworkbecomesa
morecommonwayforyoungpeopletoexperimentprofessionally.

Whensomeoneisworkingaloneonacriticalproject,youknowtheyaregoingblowtheirfusessoonerorlater.It'sactuallyfairly
predictable:somethinglike1836monthsdependingontheindividualandhowmucheconomicstresstheyfaceintheirprivate
lives.I'venotseenanyoneburnoutafterhalfayear,norlastfiveyearsinaunrewardingproject.

Thereisasimplecureforburnoutthatworksinatleastsomecases:getpaiddecentlyforyourwork.However,thisprettymuch
destroysthefreedomofmovement(acrossthatinfiniteproblemlandscape)thatthevolunteerenjoys.

PatternsforSuccess topprevnext

I'llendthiscodefreechapterwithaseriesofpatternsforsuccessinsoftwareengineering.Theyaimtocapturetheessenceof
whatdividesglorioussuccessfromtragicfailure.Theyweredescribedas"religiousmaniacaldogma"byamanager,and
"anythingelsewouldbeeffinginsane"byacolleague,inasingleday.Forme,theyarescience.ButtreattheLazyPerfectionist
andothersastoolstouse,sharpen,andthrowawayifsomethingbettercomesalong.

TheLazyPerfectionist topprevnext

Neverdesignanythingthat'snotapreciseminimalanswertoaproblemwecanidentifyandhavetosolve.

TheLazyPerfectionistspendshisidletimeobservingothersandidentifyingproblemsthatareworthsolving.Helooksfor
agreementonthoseproblems,alwaysasking,"Whatistherealproblem".Thenhemoves,preciselyandminimally,tobuild,or
getotherstobuild,ausableanswertooneproblem.Heuses,orgetsotherstousethosesolutions.Andherepeatsthisuntil
therearenoproblemslefttosolve,ortimeormoneyrunsout.

TheBenevolentTyrant topprevnext

Thecontrolofalargeforceisthesameprincipleasthecontrolofafewmen:itismerelyaquestionofdividinguptheirnumbers.
SunTzu

TheBenevolentTyrantdivideslargeproblemsintosmalleronesandthrowsthematgroupstofocuson.Shebrokerscontracts
betweenthesegroups,intheformofAPIsandthe"unprotocols"we'llreadaboutinthenextchapter.TheBenevolentTyrant
constructsasupplychainthatstartswithproblems,andresultsinusablesolutions.Sheisruthlessabouthowthesupplychain
works,butdoesnottellpeoplewhattoworkon,norhowtodotheirwork.

TheEarthandSky topprevnext

Theidealteamconsistsoftwosides:onewritingcode,andoneprovidingfeedback.

TheEarthandSkyworktogetherasawhole,incloseproximity,buttheycommunicateformallythroughissuetracking.Skyseeks
http://zguide.zeromq.org/page:all 151/225
12/31/2015 MQ - The Guide - MQ - The Guide
outproblemsfromothersandfromtheirownuseoftheproductandfeedsthesetoEarth.Earthrapidlyanswerswithtestable
solutions.EarthandSkycanworkthroughdozensofissuesinaday.Skytalkstootherusers,andEarthtalkstoother
developers.EarthandSkymaybetwopeople,ortwosmallgroups.

TheOpenDoor topprevnext

Theaccuracyofknowledgecomesfromdiversity.

TheOpenDooracceptscontributionsfromalmostanyone.Shedoesnotarguequalityordirection,insteadallowingothersto
arguethatandgetmoreengaged.Shecalculatesthatevenatrollwillbringmorediverseopiniontothegroup.Sheletsthegroup
formitsopinionaboutwhatgoesintostablecode,andsheenforcesthisopinionwithhelpofaBenevolentTyrant.

TheLaughingClown topprevnext

Perfectionprecludesparticipation.

TheLaughingClown,oftenactingastheHappyFailure,makesnoclaimtohighcompetence.Insteadhisanticsandbumbling
attemptsprovokeothersintorescuinghimfromhisowntragedy.Somehowhowever,healwaysidentifiestherightproblemsto
solve.Peoplearesobusyprovinghimwrongtheydon'trealizethey'redoingvaluablework.

TheMindfulGeneral topprevnext

Makenoplans.Setgoals,developstrategiesandtactics.

TheMindfulGeneraloperatesinunknownterritory,solvingproblemsthatarehiddenuntiltheyarenearby.Thusshemakesno
plans,butseeksopportunities,thenexploitsthemrapidlyandaccurately.Shedevelopstacticsandstrategiesinthefield,and
teachesthesetohersoldierssotheycanmoveindependently,andtogether.

TheSocialEngineer topprevnext

Ifyouknowtheenemyandknowyourself,youneednotfeartheresultofahundredbattles.SunTzu

TheSocialEngineerreadstheheartsandmindsofthoseheworkswithandfor.Heasks,ofeveryone,"Whatmakesthisperson
angry,insecure,argumentative,calm,happy?"Hestudiestheirmoodsanddispositions.Withthisknowledgehecanencourage
thosewhoareuseful,anddiscouragethosewhoarenot.TheSocialEngineerneveractsonhisownemotions.

TheConstantGardener topprevnext

Hewillwinwhosearmyisanimatedbythesamespiritthroughoutallitsranks.SunTzu

TheConstantGardenergrowsaprocessfromasmallseed,stepbystepasmorepeoplecomeintotheproject.Shemakes
everychangeforaprecisereason,withagreementfromeveryone.Sheneverimposesaprocessfromabovebutletsothers
cometoconsensus,andthenheenforcesthatconsensus.Inthisway,everyoneownstheprocesstogetherandbyowningit,
theyareattachedtoit.

http://zguide.zeromq.org/page:all 152/225
12/31/2015 MQ - The Guide - MQ - The Guide

TheRollingStone topprevnext

Aftercrossingariver,youshouldgetfarawayfromit.SunTzu

TheRollingStoneacceptshisownmortalityandtransience.Hehasnoattachmenttohispastwork.Heacceptsthatallthatwe
makeisdestinedforthetrashcan,itisjustamatteroftime.Withprecise,minimalinvestments,hecanmoverapidlyawayfrom
thepastandstayfocusedonthepresentandnearfuture.Aboveall,hehasnoegoandnopridetobehurtbytheactionsof
others.

ThePirateGang topprevnext

Code,likeallknowledge,worksbestascollectivenotprivateproperty.

ThePirateGangorganizesfreelyaroundproblems.Itacceptsauthorityinsofarasauthorityprovidesgoalsandresources.The
PirateGangownsandsharesallitmakes:everyworkisfullyremixablebyothersinthePirateGang.Thegangmovesrapidlyas
newproblemsemerge,andisquicktoabandonoldsolutionsifthosestopbeingrelevant.Nopersonsorgroupscanmonopolize
anypartofthesupplychain.

TheFlashMob topprevnext

Watershapesitscourseaccordingtothenatureofthegroundoverwhichitflows.SunTzu

TheFlashMobcomestogetherinspaceandtimeasneeded,thendispersesassoonastheycan.Physicalclosenessis
essentialforhighbandwidthcommunications.Butovertimeitcreatestechnicalghettos,whereEarthgetsseparatedfromSky.
TheFlashMobtendstocollectalotoffrequentfliermiles.

TheCanaryWatcher topprevnext

Painisnot,generally,aGoodSign.

TheCanaryWatchermeasuresthequalityofanorganizationbytheirownpainlevel,andtheobservedpainlevelsofthosewith
whomheworks.Hebringsnewparticipantsintoexistingorganizationssotheycanexpresstherawpainoftheinnocent.Hemay
usealcoholtogetotherstoverbalizetheirpainpoints.Heasksothers,andhimself,"Areyouhappyinthisprocess,andifnot,
whynot?"Whenanorganizationcausespaininhimselforothers,hetreatsthatasaproblemtobefixed.Peopleshouldfeeljoy
intheirwork.

TheHangman topprevnext

Neverinterruptotherswhentheyaremakingmistakes.

TheHangmanknowsthatwelearnonlybymakingmistakes,andshegivesotherscopiousropewithwhichtolearn.Sheonly
pullstheropegently,whenit'stime.Alittletugtoremindtheotheroftheirprecariousposition.Allowingotherstolearnbyfailure
givesthegoodreasontostay,andthebadexcusetoleave.TheHangmanisendlesslypatient,becausethereisnoshortcutto
thelearningprocess.

TheHistorian topprevnext

http://zguide.zeromq.org/page:all 153/225
12/31/2015 MQ - The Guide - MQ - The Guide

Keepingthepublicrecordmaybetedious,butit'stheonlywaytopreventcollusion.

TheHistorianforcesdiscussionintothepublicview,topreventcollusiontoownareasofwork.ThePirateGangdependsonfull
andequalcommunicationsthatdonotdependonmomentarypresence.Noonereallyreadsthearchives,butthesimply
possibilitystopsmostabuses.TheHistorianencouragestherighttoolforthejob:emailfortransientdiscussions,IRCforchatter,
wikisforknowledge,issuetrackingforrecordingopportunities.

TheProvocateur topprevnext

Whenamanknowsheistobehangedinafortnight,itconcentrateshismindwonderfully.SamuelJohnson

TheProvocateurcreatesdeadlines,enemies,andtheoccasionalimpossibility.Teamsworkbestwhentheydon'thavetimefor
thecrap.Deadlinesbringpeopletogetherandfocusthecollectivemind.Anexternalenemycanmoveapassiveteamintoaction.
TheProvocateurnevertakesthedeadlinetooseriously.Theproductisalwaysreadytoship.Butshegentlyremindstheteamof
thestakes:fail,andwealllookforotherjobs.

TheMystic topprevnext

Whenpeopleargueorcomplain,justwritethemaSunTzuquotationMikkoKoppanen

TheMysticneverarguesdirectly.Heknowsthattoarguewithanemotionalpersononlycreatesmoreemotion.Insteadheside
stepsthediscussion.It'shardtobeangryataChinesegeneral,especiallywhenhehasbeendeadfor2,400years.TheMystic
playsHangmanwhenpeopleinsistontherighttogetitwrong.

Chapter7AdvancedArchitectureusingZeroMQ topprevnext

OneoftheeffectsofusingZeroMQatlargescaleisthatbecausewecanbuilddistributedarchitecturessomuchfasterthan
before,thelimitationsofoursoftwareengineeringprocessesbecomemorevisible.Mistakesinslowmotionareoftenharderto
see(orrather,easiertorationalizeaway).

MyexperiencewhenteachingZeroMQtogroupsofengineersisthatit'srarelysufficienttojustexplainhowZeroMQworksand
thenjustexpectthemtostartbuildingsuccessfulproducts.Likeanytechnologythatremovesfriction,ZeroMQopensthedoorto
bigblunders.IfZeroMQistheACMErocketpropelledshoeofdistributedsoftwaredevelopment,alotofusarelikeWileE.
Coyote,slammingfullspeedintotheproverbialdesertcliff.

WesawinChapter6TheZeroMQCommunitythatZeroMQitselfusesaformalprocessforchanges.Onereasonwebuiltthis
process,oversomeyears,wastostoptherepeatedcliffslammingthathappenedinthelibraryitself.

Partly,it'saboutslowingdownandpartially,it'saboutensuringthatwhenyoumovefast,yougoandthisisessentialDear
Readerintherightdirection.It'smystandardinterviewriddle:what'stherarestpropertyofanysoftwaresystem,theabsolute
hardestthingtogetright,thelackofwhichcausesthesloworfastdeathofthevastmajorityofprojects?Theanswerisnotcode
quality,funding,performance,oreven(thoughit'sacloseanswer),popularity.Theanswerisaccuracy.

Accuracyishalfthechallenge,andappliestoanyengineeringwork.Theotherhalfisdistributedcomputingitself,whichsetsupa
wholerangeofproblemsthatweneedtosolveifwearegoingtocreatearchitectures.Weneedtoencodeanddecodedatawe
needtodefineprotocolstoconnectclientsandserversweneedtosecuretheseprotocolsagainstattackersandweneedto
makestacksthatarerobust.Asynchronousmessagingishardtogetright.

Thischapterwilltacklethesechallenges,startingwithabasicreappraisalofhowtodesignandbuildsoftwareandendingwitha
fullyformedexampleofadistributedapplicationforlargescalefiledistribution.

We'llcoverthefollowingjuicytopics:

http://zguide.zeromq.org/page:all 154/225
12/31/2015 MQ - The Guide - MQ - The Guide
Howtogofromideatoworkingprototypesafely(theMOPEDpattern)
DifferentwaystoserializeyourdataasZeroMQmessages
Howtocodegeneratebinaryserializationcodecs
HowtobuildcustomcodegeneratorsusingtheGSLtool
Howtowriteandlicenseaprotocolspecification
HowtobuildfastrestartablefiletransferoverZeroMQ
Howtousecreditbasedflowcontrolfornonblockingtransfers
Howtobuildprotocolserversandclientsasstatemachines
HowtomakeasecureprotocoloverZeroMQ
Alargescalefilepublishingsystem(FileMQ)

MessageOrientedPatternforElasticDesign topprevnext

I'llintroduceMessageOrientedPatternforElasticDesign(MOPED),asoftwareengineeringpatternforZeroMQarchitectures.It
waseither"MOPED"or"BIKE",theBackronymInducedKineticEffect.That'sshortfor"BICICLE",theBackronymInflatedSeeifI
CareLessEffect.Inlife,onelearnstogowiththeleastembarrassingchoice.

Ifyou'vereadthisbookcarefully,you'llhaveseenMOPEDinactionalready.ThedevelopmentofMajordomoinChapter4
ReliableRequestReplyPatternsisanearperfectcase.Butcutenamesareworthathousandwords.

ThegoalofMOPEDistodefineaprocessbywhichwecantakearoughusecaseforanewdistributedapplication,andgofrom
"HelloWorld"tofullyworkingprototypeinanylanguageinunderaweek.

UsingMOPED,yougrow,morethanbuild,aworkingZeroMQarchitecturefromthegroundupwithminimalriskoffailure.By
focusingonthecontractsratherthantheimplementations,youavoidtheriskofprematureoptimization.Bydrivingthedesign
processthroughultrashorttestbasedcycles,youcanbemorecertainthatwhatyouhaveworksbeforeyouaddmore.

Wecanturnthisintofiverealsteps:

Step1:internalizetheZeroMQsemantics.
Step2:drawarougharchitecture.
Step3:decideonthecontracts.
Step4:makeaminimalendtoendsolution.
Step5:solveoneproblemandrepeat.

Step1:InternalizetheSemantics topprevnext

YoumustlearnanddigestZeroMQ's"language",thatis,thesocketpatternsandhowtheywork.Theonlywaytolearna
languageistouseit.There'snowaytoavoidthisinvestment,notapesyoucanplaywhileyousleep,nochipsyoucanpluginto
magicallybecomesmarter.Readthisbookfromthestart,workthroughthecodeexamplesinwhateverlanguageyouprefer,
understandwhat'sgoingon,and(mostimportantly)writesomeexamplesyourselfandthenthrowthemaway.

Atacertainpoint,you'llfeelaclickingnoiseinyourbrain.Maybeyou'llhaveaweirdchiliinduceddreamwherelittleZeroMQ
tasksrunaroundtryingtoeatyoualive.Maybeyou'lljustthink"aaahh,sothat'swhatitmeans!"Ifwedidourworkright,itshould
taketwotothreedays.Howeverlongittakes,untilyoustartthinkingintermsofZeroMQsocketsandpatterns,you'renotready
forstep2.

Step2:DrawaRoughArchitecture topprevnext

Frommyexperience,it'sessentialtobeabletodrawthecoreofyourarchitecture.Ithelpsothersunderstandwhatyouare
thinking,anditalsohelpsyouthinkthroughyourideas.Thereisreallynobetterwaytodesignagoodarchitecturethantoexplain
yourideastoyourcolleagues,usingawhiteboard.

Youdon'tneedtogetitright,andyoudon'tneedtomakeitcomplete.Whatyoudoneedtodoisbreakyourarchitectureinto
piecesthatmakesense.Thenicethingaboutsoftwarearchitecture(ascomparedtoconstructingbridges)isthatyourreallycan

http://zguide.zeromq.org/page:all 155/225
12/31/2015 MQ - The Guide - MQ - The Guide
replaceentirelayerscheaplyifyou'veisolatedthem.

Startbychoosingthecoreproblemthatyouaregoingtosolve.Ignoreanythingthat'snotessentialtothatproblem:youwilladdit
inlater.Theproblemshouldbeanendtoendproblem:theropeacrossthegorge.

Forexample,aclientaskedustomakeasupercomputingclusterwithZeroMQ.Clientscreatebundlesofwork,whicharesentto
abrokerthatdistributesthemtoworkers(runningonfastgraphicsprocessors),collectstheresultsback,andreturnsthemtothe
client.

Theropeacrossthegorgeisoneclienttalkingtoabrokertalkingtooneworker.Wedrawthreeboxes:client,broker,worker.We
drawarrowsfromboxtoboxshowingtherequestflowingonewayandtheresponseflowingback.It'sjustlikethemanydiagrams
wesawinearlierchapters.

Beminimalistic.Yourgoalisnottodefinearealarchitecture,buttothrowaropeacrossthegorgetobootstrapyourprocess.We
makethearchitecturesuccessfullymorecompleteandrealisticovertime:e.g.,addingmultipleworkers,addingclientandworker
APIs,handlingfailures,andsoon.

Step3:DecideontheContracts topprevnext

Agoodsoftwarearchitecturedependsoncontracts,andthemoreexplicittheyare,thebetterthingsscale.Youdon'tcarehow
thingshappenyouonlycareabouttheresults.IfIsendanemail,Idon'tcarehowitarrivesatitsdestination,aslongasthe
contractisrespected.Theemailcontractis:itarriveswithinafewminutes,noonemodifiesit,anditdoesn'tgetlost.

Andtobuildalargesystemthatworkswell,youmustfocusonthecontractsbeforetheimplementations.Itmaysoundobvious
butalltoooften,peopleforgetorignorethis,orarejusttooshytoimposethemselves.IwishIcouldsayZeroMQhaddonethis
properly,butforyearsourpubliccontractsweresecondrateafterthoughtsinsteadofprimaryinyourfacepiecesofwork.

Sowhatisacontractinadistributedsystem?Thereare,inmyexperience,twotypesofcontract:

TheAPIstoclientapplications.RememberthePsychologicalElements.TheAPIsneedtobeasabsolutelysimple,
consistent,andfamiliaraspossible.Yes,youcangenerateAPIdocumentationfromcode,butyoumustfirstdesignit,and
designinganAPIisoftenhard.

Theprotocolsthatconnectthepieces.Itsoundslikerocketscience,butit'sreallyjustasimpletrick,andonethatZeroMQ
makesparticularlyeasy.Infactthey'resosimpletowrite,andneedsolittlebureaucracythatIcallthemunprotocols.

Youwriteminimalcontractsthataremostlyjustplacemarkers.MostmessagesandmostAPImethodswillbemissingorempty.
Youalsowanttowritedownanyknowntechnicalrequirementsintermsofthroughput,latency,reliability,andsoon.Theseare
thecriteriaonwhichyouwillacceptorrejectanyparticularpieceofwork.

Step4:WriteaMinimalEndtoEndSolution topprevnext

Thegoalistotestouttheoverallarchitectureasrapidlyaspossible.MakeskeletonapplicationsthatcalltheAPIs,andskeleton
stacksthatimplementbothsidesofeveryprotocol.Youwanttogetaworkingendtoend"HelloWorld"assoonasyoucan.You
wanttobeabletotestcodeasyouwriteit,sothatyoucanweedoutthebrokenassumptionsandinevitableerrorsyoumake.Do
notgooffandspendsixmonthswritingatestsuite!Instead,makeaminimalbarebonesapplicationthatusesourstill
hypotheticalAPI.

IfyoudesignanAPIwearingthehatofthepersonwhoimplementsit,you'llstarttothinkofperformance,features,options,and
soon.You'llmakeitmorecomplex,moreirregular,andmoresurprisingthanitshouldbe.But,andhere'sthetrick(it'sacheap
one,wasbiginJapan):ifyoudesignanAPIwhilewearingthehatofthepersonwhohastoactuallywriteappsthatuseit,you
useallthatlazinessandfeartoyouradvantage.

Writedowntheprotocolsonawikiorshareddocumentinsuchawaythatyoucanexplaineverycommandclearlywithouttoo
muchdetail.Stripoffanyrealfunctionality,becauseitwillonlycreateinertiathatmakesithardertomovestuffaround.Youcan
alwaysaddweight.Don'tspendeffortdefiningformalmessagestructures:passtheminimumaroundinthesimplestpossible
fashionusingZeroMQ'smultipartframing.

Ourgoalistogetthesimplesttestcaseworking,withoutanyavoidablefunctionality.Everythingyoucanchopoffthelistofthings
todo,youchop.Ignorethegroansfromcolleaguesandbosses.I'llrepeatthisonceagain:youcanalwaysaddfunctionality,
http://zguide.zeromq.org/page:all 156/225
12/31/2015 MQ - The Guide - MQ - The Guide
that'srelativelyeasy.Butaimtokeeptheoverallweighttoaminimum.

Step5:SolveOneProblemandRepeat topprevnext

You'renowinthehappycycleofissuedrivendevelopmentwhereyoucanstarttosolvetangibleproblemsinsteadofadding
features.Writeissuesthateachstateaclearproblem,andproposeasolution.AsyoudesigntheAPI,keepinmindyour
standardsfornames,consistency,andbehavior.Writingthesedowninproseoftenhelpskeepthemsane.

Fromhere,everysinglechangeyoumaketothearchitectureandcodecanbeprovenbyrunningthetestcase,watchingitnot
work,makingthechange,andthenwatchingitwork.

Nowyougothroughthewholecycle(extendingthetestcase,fixingtheAPI,updatingtheprotocol,andextendingthecode,as
needed),takingproblemsoneatatimeandtestingthesolutionsindividually.Itshouldtakeabout1030minutesforeachcycle,
withtheoccasionalspikeduetorandomconfusion.

Unprotocols topprevnext

ProtocolsWithoutTheGoats topprevnext

Whenthismanthinksofprotocols,thismanthinksofmassivedocumentswrittenbycommittees,overyears.Thismanthinksof
theIETF,W3C,ISO,Oasis,regulatorycapture,FRANDpatentlicensedisputes,andsoonafter,thismanthinksofretirementtoa
nicelittlefarminnorthernBoliviaupinthemountainswheretheonlyotherneedlesslystubbornbeingsarethegoatschewingup
thecoffeeplants.

Now,I'venothingpersonalagainstcommittees.Theuselessfolkneedaplacetositouttheirliveswithminimalriskof
reproducingafterall,thatonlyseemsfair.Butmostcommitteeprotocolstendtowardscomplexity(theonesthatwork),ortrash
(theoneswedon'ttalkabout).There'safewreasonsforthis.Oneistheamountofmoneyatstake.Moremoneymeansmore
peoplewhowanttheirparticularprejudicesandassumptionsexpressedinprose.Buttwoisthelackofgoodabstractionson
whichtobuild.Peoplehavetriedtobuildreusableprotocolabstractions,likeBEEP.Mostdidnotstick,andthosethatdid,like
SOAPandXMPP,areonthecomplexsideofthings.

Itusedtobe,decadesago,whentheInternetwasayoungmodestthing,thatprotocolswereshortandsweet.Theyweren'teven
"standards",but"requestsforcomments",whichisasmodestasyoucanget.It'sbeenoneofmygoalssincewestartediMatixin
1995tofindawayforordinarypeoplelikemetowritesmall,accurateprotocolswithouttheoverheadofthecommittees.

Now,ZeroMQdoesappeartoprovidealiving,successfulprotocolabstractionlayerwithits"we'llcarrymultipartmessagesover
randomtransports"wayofworking.BecauseZeroMQdealssilentlywithframing,connections,androuting,it'ssurprisinglyeasy
towritefullprotocolspecsontopofZeroMQ,andinChapter4ReliableRequestReplyPatternsandChapter5Advanced
PubSubPatternsIshowedhowtodothis.

Somewherearoundmid2007,IkickedofftheDigitalStandardsOrganizationtodefinenewsimplerwaysofproducinglittle
standards,protocols,andspecifications.Inmydefense,itwasaquietsummer.Atthetime,Iwrotethatanewspecification
shouldtake"minutestoexplain,hourstodesign,daystowrite,weekstoprove,monthstobecomemature,andyearstoreplace."

In2010,westartedcallingsuchlittlespecificationsunprotocols,whichsomepeoplemightmistakeforadastardlyplanforworld
dominationbyashadowyinternationalorganization,butwhichreallyjustmeans"protocolswithoutthegoats".

ContractsAreHard topprevnext

Writingcontractsisperhapsthemostdifficultpartoflargescalearchitecture.Withunprotocols,weremoveasmuchofthe
unnecessaryfrictionaspossible.Whatremainsisstillahardsetofproblemstosolve.Agoodcontract(beitanAPI,aprotocol,

http://zguide.zeromq.org/page:all 157/225
12/31/2015 MQ - The Guide - MQ - The Guide
orarentalagreement)hastobesimple,unambiguous,technicallysound,andeasytoenforce.

Likeanytechnicalskill,it'ssomethingyouhavetolearnandpractice.Thereareaseriesofspecificationsonthe
ZeroMQRFCsite,whichareworthreadingandusingthemasabasisforyourownspecificationswhenyoufindyourselfinneed.

I'lltrytosummarizemyexperienceasaprotocolwriter:

Startsimple,anddevelopyourspecificationsstepbystep.Don'tsolveproblemsyoudon'thaveinfrontofyou.

Useveryclearandconsistentlanguage.Aprotocolmayoftenbreakdownintocommandsandfieldsuseclearshort
namesfortheseentities.

Trytoavoidinventingconcepts.Reuseanythingyoucanfromexistingspecifications.Useterminologythatisobviousand
cleartoyouraudience.

Makenothingforwhichyoucannotdemonstrateanimmediateneed.Yourspecificationsolvesproblemsitdoesnot
providefeatures.Makethesimplestplausiblesolutionforeachproblemthatyouidentify.

Implementyourprotocolasyoubuildit,sothatyouareawareofthetechnicalconsequencesofeachchoice.Usea
languagethatmakesithard(likeC)andnotonethatmakesiteasy(likePython).

Testyourspecificationasyoubuilditonotherpeople.Yourbestfeedbackonaspecificationiswhensomeoneelsetriesto
implementitwithouttheassumptionsandknowledgethatyouhaveinyourhead.

Crosstestrapidlyandconsistently,throwingothers'clientsagainstyourserversandviceversa.

Bepreparedtothrowitoutandstartagainasoftenasneeded.Planforthis,bylayeringyourarchitecturesothate.g.,you
cankeepanAPIbutchangetheunderlyingprotocols.

Onlyuseconstructsthatareindependentofprogramminglanguageandoperatingsystem.

Solvealargeprobleminlayers,makingeachlayeranindependentspecification.Bewareofcreatingmonolithicprotocols.
Thinkabouthowreusableeachlayeris.Thinkabouthowdifferentteamscouldbuildcompetingspecificationsateach
layer.

Andaboveall,writeitdown.Codeisnotaspecification.Thepointaboutawrittenspecificationisthatnomatterhowweakitis,it
canbesystematicallyimproved.Bywritingdownaspecification,youwillalsospotinconsistenciesandgrayareasthatare
impossibletoseeincode.

Ifthissoundshard,don'tworrytoomuch.OneofthelessobviousbenefitsofusingZeroMQisthatitcutstheeffortnecessaryto
writeaprotocolspecbyperhaps90%ormorebecauseitalreadyhandlesframing,routing,queuing,andsoon.Thismeansthat
youcanexperimentrapidly,makemistakescheaply,andthuslearnrapidly.

HowtoWriteUnprotocols topprevnext

Whenyoustarttowriteanunprotocolspecificationdocument,sticktoaconsistentstructuresothatyourreadersknowwhatto
expect.HereisthestructureIuse:

Coversection:witha1linesummary,URLtothespec,formalname,version,whotoblame.
Licenseforthetext:absolutelyneededforpublicspecifications.
Thechangeprocess:i.e.,howcanIasareaderfixproblemsinthespecification?
Useoflanguage:MUST,MAY,SHOULD,andsoon,withareferencetoRFC2119.
Maturityindicator:isthisanexperimental,draft,stable,legacy,orretired?
Goalsoftheprotocol:whatproblemsisittryingtosolve?
Formalgrammar:preventsargumentsduetodifferentinterpretationsofthetext.
Technicalexplanation:semanticsofeachmessage,errorhandling,andsoon.
Securitydiscussion:explicitly,howsecuretheprotocolis.
References:tootherdocuments,protocols,andsoon.

Writingclear,expressivetextishard.Doavoidtryingtodescribeimplementationsoftheprotocol.Rememberthatyou'rewritinga
contract.Youdescribeinclearlanguagetheobligationsandexpectationsofeachparty,thelevelofobligation,andthepenalties
forbreakingtherules.Youdonottrytodefinehoweachpartyhonorsitspartofthedeal.

Herearesomekeypointsaboutunprotocols:

http://zguide.zeromq.org/page:all 158/225
12/31/2015 MQ - The Guide - MQ - The Guide
Aslongasyourprocessisopen,thenyoudon'tneedacommittee:justmakecleanminimaldesignsandmakesure
anyoneisfreetoimprovethem.

Ifuseanexistinglicense,thenyoudon'thavelegalworriesafterwards.IuseGPLv3formypublicspecificationsand
adviseyoutodothesame.Forinhousework,standardcopyrightisperfect.

Formalityisvaluable.Thatis,learntowriteaformalgrammarsuchasABNF(AugmentedBackusNaurForm)andusethis
tofullydocumentyourmessages.

UseamarketdrivenlifecycleprocesslikeDigistan'sCOSSsothatpeopleplacetherightweightonyourspecsasthey
mature(ordon't).

WhyusetheGPLv3forPublicSpecifications? topprevnext

Thelicenseyouchooseisparticularlycrucialforpublicspecifications.Traditionally,protocolsarepublishedundercustom
licenses,wheretheauthorsownthetextandderivedworksareforbidden.Thissoundsgreat(afterall,whowantstoseea
protocolforked?),butit'sinfacthighlyrisky.Aprotocolcommitteeisvulnerabletocapture,andiftheprotocolisimportantand
valuable,theincentiveforcapturegrows.

Oncecaptured,likesomewildanimals,animportantprotocolwilloftendie.Therealproblemisthatthere'snowaytofreea
captiveprotocolpublishedunderaconventionallicense.Theword"free"isn'tjustanadjectivetodescribespeechorair,it'salso
averb,andtherighttoforkaworkagainstthewishesoftheownerisessentialtoavoidingcapture.

Letmeexplainthisinshorterwords.ImaginethatiMatixwritesaprotocoltodaythat'sreallyamazingandpopular.Wepublishthe
specandmanypeopleimplementit.Thoseimplementationsarefastandawesome,andfreeasinbeer.Theystarttothreatenan
existingbusiness.Theirexpensivecommercialproductisslowerandcan'tcompete.SoonedaytheycometoouriMatixofficein
MaetangDong,SouthKorea,andoffertobuyourfirm.Becausewe'respendingvastamountsonsushiandbeer,weaccept
gratefully.Withevillaughter,thenewownersoftheprotocolstopimprovingthepublicversion,closethespecification,andadd
patentedextensions.Theirnewproductssupportthisnewprotocolversion,buttheopensourceversionsarelegallyblockedfrom
doingso.Thecompanytakesoverthewholemarket,andcompetitionends.

Whenyoucontributetoanopensourceproject,youreallywanttoknowyourhardworkwon'tbeusedagainstyoubyaclosed
sourcecompetitor.ThisiswhytheGPLbeatsthe"morepermissive"BSD/MIT/X11licensesformostcontributors.Theselicenses
givepermissiontocheat.Thisappliesjustasmuchtoprotocolsastosourcecode.

WhenyouimplementaGPLv3specification,yourapplicationsareofcourseyours,andlicensedanywayyoulike.Butyoucanbe
certainoftwothings.One,thatspecificationwillneverbeembracedandextendedintoproprietaryforms.Anyderivedformsof
thespecificationmustalsobeGPLv3.Two,noonewhoeverimplementsorusestheprotocolwilleverlaunchapatentattackon
anythingitcovers,norcantheyaddtheirpatentedtechnologytoitwithoutgrantingtheworldafreelicense.

UsingABNF topprevnext

Myadvicewhenwritingprotocolspecsistolearnanduseaformalgrammar.It'sjustlesshasslethanallowingotherstointerpret
whatyoumean,andthenrecoverfromtheinevitablefalseassumptions.Thetargetofyourgrammarisotherpeople,engineers,
notcompilers.

MyfavoritegrammarisABNF,asdefinedbyRFC2234,becauseitisprobablythesimplestandmostwidelyusedformal
languagefordefiningbidirectionalcommunicationsprotocols.MostIETF(InternetEngineeringTaskForce)specificationsuse
ABNF,whichisgoodcompanytobein.

I'llgivea30secondcrashcourseinwritingABNF.Itmayremindyouofregularexpressions.Youwritethegrammarasrules.
Eachruletakestheform"name=elements".Anelementcanbeanotherrule(whichyoudefinebelowasanotherrule)orapre
definedterminallikeCRLF,OCTET,oranumber.TheRFClistsalltheterminals.Todefinealternativeelements,separatewitha
slash.Todefinerepetition,useanasterisk.Togroupelements,useparentheses.ReadtheRFCbecauseit'snotintuitive.

I'mnotsureifthisextensionisproper,butIthenprefixelementswith"C:"and"S:"toindicatewhethertheycomefromtheclient
orserver.

Here'sapieceofABNFforanunprotocolcalledNOMthatwe'llcomebacktolaterinthischapter:

http://zguide.zeromq.org/page:all 159/225
12/31/2015 MQ - The Guide - MQ - The Guide

nomprotocol=openpeering*usepeering

openpeering=C:OHAI(S:OHAIOK/S:WTF)

usepeering=C:ICANHAZ
/S:CHEEZBURGER
/C:HUGZS:HUGZOK
/S:HUGZC:HUGZOK

I'veactuallyusedthesekeywords(OHAI,WTF)incommercialprojects.Theymakedevelopersgigglyandhappy.Theyconfuse
management.They'regoodinfirstdraftsthatyouwanttothrowawaylater.

TheCheaporNastyPattern topprevnext

ThereisagenerallessonI'velearnedoveracoupleofdecadesofwritingprotocolssmallandlarge.IcallthistheCheaporNasty
pattern:youcanoftensplityourworkintotwoaspectsorlayersandsolvetheseseparatelyoneusinga"cheap"approach,the
otherusinga"nasty"approach.

ThekeyinsighttomakingCheaporNastyworkistorealizethatmanyprotocolsmixalowvolumechattypartforcontrol,anda
highvolumeasynchronouspartfordata.Forinstance,HTTPhasachattydialogtoauthenticateandgetpages,andan
asynchronousdialogtostreamdata.FTPactuallysplitsthisovertwoportsoneportforcontrolandoneportfordata.

Protocoldesignerswhodon'tseparatecontrolfromdatatendtomakehorridprotocols,becausethetradeoffsinthetwocases
arealmosttotallyopposed.Whatisperfectforcontrolisbadfordata,andwhat'sidealfordatajustdoesn'tworkforcontrol.It's
especiallytruewhenwewanthighperformanceatthesametimeasextensibilityandgooderrorchecking.

Let'sbreakthisdownusingaclassicclient/serverusecase.Theclientconnectstotheserverandauthenticates.Itthenasksfor
someresource.Theserverchatsback,thenstartstosenddatabacktotheclient.Eventually,theclientdisconnectsortheserver
finishes,andtheconversationisover.

Now,beforestartingtodesignthesemessages,stopandthink,andlet'scomparethecontroldialogandthedataflow:

Thecontroldialoglastsashorttimeandinvolvesveryfewmessages.Thedataflowcouldlastforhoursordays,and
involvebillionsofmessages.

Thecontroldialogiswhereallthe"normal"errorshappen,e.g.,notauthenticated,notfound,paymentrequired,censored,
andsoon.Incontrast,anyerrorsthathappenduringthedataflowareexceptional(diskfull,servercrashed).

Thecontroldialogiswherethingswillchangeovertimeasweaddmoreoptions,parameters,andsoon.Thedataflow
shouldbarelychangeovertimebecausethesemanticsofaresourcearefairlyconstantovertime.

Thecontroldialogisessentiallyasynchronousrequest/replydialog.Thedataflowisessentiallyaonewayasynchronous
flow.

Thesedifferencesarecritical.Whenwetalkaboutperformance,itappliesonlytodataflows.It'spathologicaltodesignaonetime
controldialogtobefast.Thuswhenwetalkaboutthecostofserialization,thisonlyappliestothedataflow.Thecostof
encoding/decodingthecontrolflowcouldbehuge,andformanycasesitwouldnotchangeathing.Soweencodecontrolusing
Cheap,andweencodedataflowsusingNasty.

Cheapisessentiallysynchronous,verbose,descriptive,andflexible.ACheapmessageisfullofrichinformationthatcanchange
foreachapplication.Yourgoalasdesigneristomakethisinformationeasytoencodeandparse,trivialtoextendfor
experimentationorgrowth,andhighlyrobustagainstchangebothforwardsandbackwards.TheCheappartofaprotocollooks
likethis:

Itusesasimpleselfdescribingstructuredencodingfordata,beitXML,JSON,HTTPstyleheaders,orsomeother.Any
encodingisfineaslongastherearestandardsimpleparsersforitinyourtargetlanguages.

Itusesastraightrequestreplymodelwhereeachrequesthasasuccess/failurereply.Thismakesittrivialtowritecorrect
clientsandserversforaCheapdialog.

Itdoesn'ttry,evenmarginally,tobefast.Performancedoesn'tmatterwhenyoudosomethingonlyonceorafewtimesper

http://zguide.zeromq.org/page:all 160/225
12/31/2015 MQ - The Guide - MQ - The Guide
session.

ACheapparserissomethingyoutakeofftheshelfandthrowdataat.Itshouldn'tcrash,shouldn'tleakmemory,shouldbehighly
tolerant,andshouldberelativelysimpletoworkwith.That'sit.

Nastyhoweverisessentiallyasynchronous,terse,silent,andinflexible.ANastymessagecarriesminimalinformationthat
practicallyneverchanges.Yourgoalasdesigneristomakethisinformationultrafasttoparse,andpossiblyevenimpossibleto
extendandexperimentwith.TheidealNastypatternlookslikethis:

Itusesahandoptimizedbinarylayoutfordata,whereeverybitispreciselycrafted.

Itusesapureasynchronousmodelwhereoneorbothpeerssenddatawithoutacknowledgments(oriftheydo,theyuse
sneakyasynchronoustechniqueslikecreditbasedflowcontrol).

Itdoesn'ttry,evenmarginally,tobefriendly.Performanceisallthatmatterswhenyouaredoingsomethingseveralmillion
timespersecond.

ANastyparserissomethingyouwritebyhand,whichwritesorreadsbits,bytes,words,andintegersindividuallyandprecisely.It
rejectsanythingitdoesn'tlike,doesnomemoryallocationsatall,andnevercrashes.

CheaporNastyisn'tauniversalpatternnotallprotocolshavethisdichotomy.Also,howyouuseCheaporNastywilldependon
thesituation.Insomecases,itcanbetwopartsofasingleprotocol.Inothercases,itcanbetwoprotocols,onelayeredontopof
theother.

ErrorHandling topprevnext

UsingCheaporNastymakeserrorhandlingrathersimpler.Youhavetwokindsofcommandsandtwowaystosignalerrors:

Synchronouscontrolcommands:errorsarenormal:everyrequesthasaresponsethatiseitherOKoranerrorresponse.
Asynchronousdatacommands:errorsareexceptional:badcommandsareeitherdiscardedsilently,orcausethewhole
connectiontobeclosed.

It'susuallygoodtodistinguishafewkindsoferrors,butasalwayskeepitminimalandaddonlywhatyouneed.

SerializingYourData topprevnext

Whenwestarttodesignaprotocol,oneofthefirstquestionswefaceishowweencodedataonthewire.Thereisnouniversal
answer.Thereareahalfdozendifferentwaystoserializedata,eachwithprosandcons.We'llexploresomeofthese.

AbstractionLevel topprevnext

Beforelookingathowtoputdataontothewire,it'sworthaskingwhatdataweactuallywanttoexchangebetweenapplications.If
wedon'tuseanyabstraction,weliterallyserializeanddeserializeourinternalstate.Thatis,theobjectsandstructuresweuseto
implementourfunctionality.

Puttinginternalstateontothewireishoweverareallybadidea.It'slikeexposinginternalstateinanAPI.Whenyoudothis,you
arehardcodingyourimplementationdecisionsintoyourprotocols.Youarealsogoingtoproduceprotocolsthataresignificantly
morecomplexthantheyneedtobe.

It'sperhapsthemainreasonsomanyolderprotocolsandAPIsaresocomplex:theirdesignersdidnotthinkabouthowto
abstractthemintosimplerconcepts.Thereisofcoursenoguaranteethananabstractionwillbesimplerthat'swherethehard
workcomesin.

AgoodprotocolorAPIabstractionencapsulatesnaturalpatternsofuse,andgivesthemnameandpropertiesthatarepredictable
andregular.Itchoosessensibledefaultssothatthemainusecasescanbespecifiedminimally.Itaimstobesimpleforthe
simplecases,andexpressivefortherarercomplexcases.Itdoesnotmakeanystatementsorassumptionsabouttheinternal

http://zguide.zeromq.org/page:all 161/225
12/31/2015 MQ - The Guide - MQ - The Guide
implementationunlessthatisabsolutelyneededforinteroperability.

ZeroMQFraming topprevnext

ThesimplestandmostwidelyusedserializationformatforZeroMQapplicationsisZeroMQ'sownmultipartframing.Forexample,
hereishowtheMajordomoProtocoldefinesarequest:

Frame0:Emptyframe
Frame1:"MDPW01"(sixbytes,representingMDP/Workerv0.1)
Frame2:0x02(onebyte,representingREQUEST)
Frame3:Clientaddress(envelopestack)
Frame4:Empty(zerobytes,envelopedelimiter)
Frames5+:Requestbody(opaquebinary)

Toreadandwritethisincodeiseasy,butthisisaclassicexampleofacontrolflow(thewholeofMDPisreally,asit'sachatty
requestreplyprotocol).WhenwecametoimproveMDPforthesecondversion,wehadtochangethisframing.Excellent,we
brokeallexistingimplementations!

Backwardscompatibilityishard,butusingZeroMQframingforcontrolflowsdoesnothelp.Here'showIshouldhavedesigned
thisprotocolifI'dfollowedmyownadvice(andI'llfixthisinthenextversion).It'ssplitintoaCheappartandaNastypart,and
usestheZeroMQframingtoseparatethese:

Frame0:"MDP/2.0"forprotocolnameandversion
Frame1:commandheader
Frame2:commandbody

Wherewe'dexpecttoparsethecommandheaderinthevariousintermediaries(clientAPI,broker,andworkerAPI),andpassthe
commandbodyuntouchedfromapplicationtoapplication.

SerializationLanguages topprevnext

Serializationlanguageshavetheirfashions.XMLusedtobebigasinpopular,thenitgotbigasinoverengineered,andthenit
fellintothehandsof"EnterpriseInformationArchitects"andit'snotbeenseenalivesince.Today'sXMListheepitomeof
"somewhereinthatmessissmall,elegantlanguagetryingtoescape".

StillXMLwasway,waybetterthanitspredecessors,whichincludedsuchmonstersastheStandardGeneralizedMarkup
Language(SGML),whichinturnwasacoolbreezecomparedtomindtorturingbeastslikeEDIFACT.Sothehistoryof
serializationlanguagesseemstobeofgraduallyemergingsanity,hiddenbywavesofrevoltingEIAsdoingtheirbesttoholdonto
theirjobs.

JSONpoppedoutoftheJavaScriptworldasaquickanddirty"I'dratherresignthanuseXMLhere"waytothrowdataontothe
wireandgetitbackagain.JSONisjustminimalXMLexpressed,sneakily,asJavaScriptsourcecode.

Here'sasimpleexampleofusingJSONinaCheapprotocol:

"protocol":{
"name":"MTL",
"version":1
},
"virtualhost":"testenv"

ThesamedatainXMLwouldbe(XMLforcesustoinventasingletoplevelentity):

http://zguide.zeromq.org/page:all 162/225
12/31/2015 MQ - The Guide - MQ - The Guide
<command>
<protocolname="MTL"version="1"/>
<virtualhost>testenv</virtualhost>
</command>

AndhereitisusingplainoldHTTPstyleheaders:

Protocol:MTL/1.0
Virtualhost:testenv

Theseareallprettyequivalentaslongasyoudon'tgooverboardwithvalidatingparsers,schemas,andother"trustus,thisisall
foryourowngood"nonsense.ACheapserializationlanguagegivesyouspaceforexperimentationforfree("ignoreany
elements/attributes/headersthatyoudon'trecognize"),andit'ssimpletowritegenericparsersthat,forexample,thunka
commandintoahashtable,orviceversa.

However,it'snotallroses.WhilemodernscriptinglanguagessupportJSONandXMLeasilyenough,olderlanguagesdonot.If
youuseXMLorJSON,youcreatenontrivialdependencies.It'salsosomewhatofapaintoworkwithtreestructureddataina
languagelikeC.

Soyoucandriveyourchoiceaccordingtothelanguagesforwhichyou'reaiming.Ifyouruniverseisascriptinglanguage,thengo
forJSON.Ifyouareaimingtobuildprotocolsforwidersystemuse,keepthingssimpleforCdevelopersandsticktoHTTPstyle
headers.

SerializationLibraries topprevnext

Themsgpack.orgsitesays:

I'mgoingtomaketheperhapsunpopularclaimthat"fastandsmall"arefeaturesthatsolvenonproblems.Theonlyrealproblem
thatserializationlibrariessolveis,asfarasIcantell,theneedtodocumentthemessagecontractsandactuallyserializedatato
andfromthewire.

Let'sstartbydebunking"fastandsmall".It'sbasedonatwopartargument.First,thatmakingyourmessagessmallerand
reducingCPUcostforencodinganddecodingwillmakeasignificantdifferencetoyourapplication'sperformance.Second,that
thisequallyvalidacrosstheboardtoallmessages.

Butmostrealapplicationstendtofallintooneoftwocategories.Eitherthespeedofserializationandsizeofencodingismarginal
comparedtoothercosts,suchasdatabaseaccessorapplicationcodeperformance.Or,networkperformancereallyiscritical,
andthenallsignificantcostsoccurinafewspecificmessagetypes.

Thus,aimingfor"fastandsmall"acrosstheboardisafalseoptimization.YouneithergettheeasyflexibilityofCheapforyour
infrequentcontrolflows,nordoyougetthebrutalefficiencyofNastyforyourhighvolumedataflows.Worse,theassumptionthat
allmessagesareequalinsomewaycancorruptyourprotocoldesign.CheaporNastyisn'tonlyaboutserializationstrategies,it's
alsoaboutsynchronousversusasynchronous,errorhandlingandthecostofchange.

Myexperienceisthatmostperformanceproblemsinmessagebasedapplicationscanbesolvedby(a)improvingtheapplication
itselfand(b)handoptimizingthehighvolumedataflows.Andtohandoptimizeyourmostcriticaldataflows,youneedtocheat
tolearnexploitfactsaboutyourdata,somethinggeneralpurposeserializerscannotdo.

Nowlet'saddressdocumentationandtheneedtowriteourcontractsexplicitlyandformally,ratherthanonlyincode.Thisisa
validproblemtosolve,indeedoneofthemainonesifwe'retobuildalonglasting,largescalemessagebasedarchitecture.

HereishowwedescribeatypicalmessageusingtheMessagePackinterfacedefinitionlanguage(IDL):

messagePerson{
1:stringsurname
2:stringfirstname
3:optionalstringemail
}

http://zguide.zeromq.org/page:all 163/225
12/31/2015 MQ - The Guide - MQ - The Guide
Now,thesamemessageusingtheGoogleprotocolbuffersIDL:

messagePerson{
requiredstringsurname=1
requiredstringfirstname=2
optionalstringemail=3
}

Itworks,butinmostpracticalcaseswinsyoulittleoveraserializationlanguagebackedbydecentspecificationswrittenbyhand
orproducedmechanically(we'llcometothis).Thepriceyou'llpayisanextradependencyandquiteprobably,worseoverall
performancethanifyouusedCheaporNasty.

HandwrittenBinarySerialization topprevnext

Asyou'llgatherfromthisbook,mypreferredlanguageforsystemsprogrammingisC(upgradedtoC99,witha
constructor/destructorAPImodelandgenericcontainers).TherearetworeasonsIlikethismodernizedClanguage.First,I'mtoo
weakmindedtolearnabiglanguagelikeC++.Lifejustseemsfilledwithmoreinterestingthingstounderstand.Second,Ifind
thatthisspecificlevelofmanualcontrolletsmeproducebetterresults,faster.

Thepointhereisn'tCversusC++,butthevalueofmanualcontrolforhighendprofessionalusers.It'snoaccidentthatthebest
cars,cameras,andespressomachinesintheworldhavemanualcontrols.Thatlevelofonthespotfinetuningoftenmakesthe
differencebetweenworldclasssuccess,andbeingsecondbest.

Whenyouarereally,trulyconcernedaboutthespeedofserializationand/orthesizeoftheresult(oftenthesecontradicteach
other),youneedhandwrittenbinaryserialization.Inotherwords,let'shearitforMr.Nasty!

YourbasicprocessforwritinganefficientNastyencoder/decoder(codec)is:

Buildrepresentativedatasetsandtestapplicationsthatcanstresstestyourcodec.
Writeafirstdumbversionofthecodec.
Test,measure,improve,andrepeatuntilyourunoutoftimeand/ormoney.

Herearesomeofthetechniquesweusetomakeourcodecsbetter:

Useaprofiler.There'ssimplynowaytoknowwhatyourcodeisdoinguntilyou'veprofileditforfunctioncountsandfor
CPUcostperfunction.Whenyoufindyourhotspots,fixthem.

Eliminatememoryallocations.TheheapisveryfastonamodernLinuxkernel,butit'sstillthebottleneckinmostnaive
codecs.Onolderkernels,theheapcanbetragicallyslow.Uselocalvariables(thestack)insteadoftheheapwhereyou
can.

Testondifferentplatformsandwithdifferentcompilersandcompileroptions.Apartfromtheheap,therearemanyother
differences.Youneedtolearnthemainones,andallowforthem.

Usestatetocompressbetter.Ifyouareconcernedaboutcodecperformance,youarealmostdefinitelysendingthesame
kindsofdatamanytimes.Therewillberedundancybetweeninstancesofdata.Youcandetecttheseandusethatto
compress(e.g.,ashortvaluethatmeans"sameaslasttime").

Knowyourdata.Thebestcompressiontechniques(intermsofCPUcostforcompactness)requireknowingaboutthe
data.Forexample,thetechniquesusedtocompressawordlist,avideo,andastreamofstockmarketdataareall
different.

Bereadytobreaktherules.Doyoureallyneedtoencodeintegersinbigendiannetworkbyteorder?x86andARM
accountforalmostallmodernCPUs,yetuselittleendian(ARMisactuallybiendianbutAndroid,likeWindowsandiOS,is
littleendian).

CodeGeneration topprevnext

http://zguide.zeromq.org/page:all 164/225
12/31/2015 MQ - The Guide - MQ - The Guide
Readingtheprevioustwosections,youmighthavewondered,"couldIwritemyownIDLgeneratorthatwasbetterthanageneral
purposeone?"Ifthisthoughtwanderedintoyourmind,itprobablyleftprettysoonafter,chasedbydarkcalculationsabouthow
muchworkthatactuallyinvolved.

WhatifItoldyouofawaytobuildcustomIDLgeneratorscheaplyandquickly?Youcanhaveawaytogetperfectlydocumented
contracts,codethatisasevilanddomainspecificasyouneedittobe,andallyouneedtodoissignawayyoursoul(whoever
reallyusedthat,amIright?)justhere

AtiMatix,untilafewyearsago,weusedcodegenerationtobuildeverlargerandmoreambitioussystemsuntilwedecidedthe
technology(GSL)wastoodangerousforcommonuse,andwesealedthearchiveandlockeditwithheavychainsinadeep
dungeon.WeactuallyposteditonGitHub.Ifyouwanttotrytheexamplesthatarecomingup,grabtherepositoryandbuild
yourselfagslcommand.Typing"make"inthesrcsubdirectoryshoulddoit(andifyou'rethatguywholovesWindows,I'msure
you'llsendapatchwithprojectfiles).

Thissectionisn'treallyaboutGSLatall,butaboutausefulandlittleknowntrickthat'susefulforambitiousarchitectswhowantto
scalethemselves,aswellastheirwork.Onceyoulearnthetrick,youcanwhipupyourowncodegeneratorsinashorttime.The
codegeneratorsmostsoftwareengineersknowaboutcomewithasinglehardcodedmodel.Forinstance,Ragel"compiles
executablefinitestatemachinesfromregularlanguages",i.e.,Ragel'smodelisaregularlanguage.Thiscertainlyworksfora
goodsetofproblems,butit'sfarfromuniversal.HowdoyoudescribeanAPIinRagel?Oraprojectmakefile?Orevenafinite
statemachineliketheoneweusedtodesigntheBinaryStarpatterninChapter4ReliableRequestReplyPatterns?

Allthesewouldbenefitfromcodegeneration,butthere'snouniversalmodel.Sothetrickistodesignyourownmodelsasyou
needthem,andthenmakecodegeneratorsascheapcompilersforthatmodel.Youneedsomeexperienceinhowtomakegood
models,andyouneedatechnologythatmakesitcheaptobuildcustomcodegenerators.Ascriptinglanguage,likePerland
Python,isagoodoption.However,weactuallybuiltGSLspecificallyforthis,andthat'swhatIprefer.

Let'stakeasimpleexamplethattiesintowhatwealreadyknow.We'llseemoreextensiveexampleslater,becauseIreallydo
believethatcodegenerationiscrucialknowledgeforlargescalework.InChapter4ReliableRequestReplyPatterns,we
developedtheMajordomoProtocol(MDP),andwroteclients,brokers,andworkersforthat.Nowcouldwegeneratethosepieces
mechanically,bybuildingourowninterfacedescriptionlanguageandcodegenerators?

WhenwewriteaGSLmodel,wecanuseanysemanticswelike,inotherwordswecaninventdomainspecificlanguagesonthe
spot.I'llinventacoupleseeifyoucanguesswhattheyrepresent:

slideshow
name=Cookerylevel3
page
title=FrenchCuisine
item=Overview
item=Thehistoricalcuisine
item=Thenouvellecuisine
item=WhytheFrenchlivelonger
page
title=Overview
item=Soupsandsalads
item=Leplatprincipal
item=Bchamelandothersauces
item=Pastries,cakes,andquiches
item=Souffl:cheesetostrawberry

Howaboutthisone:

table
name=person
column
name=firstname
type=string
column
name=lastname
type=string
column
name=rating

http://zguide.zeromq.org/page:all 165/225
12/31/2015 MQ - The Guide - MQ - The Guide
type=integer

Wecouldcompilethefirstintoapresentation.Thesecond,wecouldcompileintoSQLtocreateandworkwithadatabasetable.
Soforthisexercise,ourdomainlanguage,ourmodel,consistsof"classes"thatcontain"messages"thatcontain"fields"of
varioustypes.It'sdeliberatelyfamiliar.HereistheMDPclientprotocol:

<classname="mdp_client">
MDP/Client
<header>
<fieldname="empty"type="string"value=""
>Emptyframe</field>
<fieldname="protocol"type="string"value="MDPC01"
>Protocolidentifier</field>
</header>
<messagename="request">
Clientrequesttobroker
<fieldname="service"type="string">Servicename</field>
<fieldname="body"type="frame">Requestbody</field>
</message>
<messagename="reply">
Responsebacktoclient
<fieldname="service"type="string">Servicename</field>
<fieldname="body"type="frame">Responsebody</field>
</message>
</class>

AndhereistheMDPworkerprotocol:

<classname="mdp_worker">
MDP/Worker
<header>
<fieldname="empty"type="string"value=""
>Emptyframe</field>
<fieldname="protocol"type="string"value="MDPW01"
>Protocolidentifier</field>
<fieldname="id"type="octet">Messageidentifier</field>
</header>
<messagename="ready"id="1">
Workertellsbrokeritisready
<fieldname="service"type="string">Servicename</field>
</message>
<messagename="request"id="2">
Clientrequesttobroker
<fieldname="client"type="frame">Clientaddress</field>
<fieldname="body"type="frame">Requestbody</field>
</message>
<messagename="reply"id="3">
Workerreturnsreplytobroker
<fieldname="client"type="frame">Clientaddress</field>
<fieldname="body"type="frame">Requestbody</field>
</message>
<messagename="hearbeat"id="4">
Eitherpeertellstheotherit'sstillalive
</message>
<messagename="disconnect"id="5">
Eitherpeertellsotherthepartyisover
</message>
</class>

http://zguide.zeromq.org/page:all 166/225
12/31/2015 MQ - The Guide - MQ - The Guide
GSLusesXMLasitsmodelinglanguage.XMLhasapoorreputation,havingbeendraggedthroughtoomanyenterprisesewers
tosmellsweet,butithassomestrongpositives,aslongasyoukeepitsimple.Anywaytowriteaselfdescribinghierarchyof
itemsandattributeswouldwork.

NowhereisashortIDLgeneratorwritteninGSLthatturnsourprotocolmodelsintodocumentation:

.#TrivialIDLgenerator(specs.gsl)
.#
.output"$(class.name).md"
##The$(string.trim(class.?''):left)Protocol
.formessage
.frames=count(class>header.field)+count(field)

A$(message.NAME)commandconsistsofamultipartmessageof$(frames)
frames:

.forclass>header.field
.ifname="id"
*Frame$(item()):0x$(message.id:%02x)(1byte,$(message.NAME))
.else
*Frame$(item()):"$(value:)"($(string.length("$(value)"))\
bytes,$(field.:))
.endif
.endfor
.index=count(class>header.field)+1
.forfield
*Frame$(index):$(field.?'')\
.iftype="string"
(printablestring)
.elsiftype="frame"
(opaquebinary)
.index+=1
.else
.echo"E:unknownfieldtype:$(type)"
.endif
.index+=1
.endfor
.endfor

TheXMLmodelsandthisscriptareinthesubdirectoryexamples/models.Todothecodegeneration,Igivethiscommand:

gslscript:specsmdp_client.xmlmdp_worker.xml

HereistheMarkdowntextwegetfortheworkerprotocol:

##TheMDP/WorkerProtocol

AREADYcommandconsistsofamultipartmessageof4frames:

*Frame1:""(0bytes,Emptyframe)
*Frame2:"MDPW01"(6bytes,Protocolidentifier)
*Frame3:0x01(1byte,READY)
*Frame4:Servicename(printablestring)

AREQUESTcommandconsistsofamultipartmessageof5frames:

*Frame1:""(0bytes,Emptyframe)
*Frame2:"MDPW01"(6bytes,Protocolidentifier)
*Frame3:0x02(1byte,REQUEST)
http://zguide.zeromq.org/page:all 167/225
12/31/2015 MQ - The Guide - MQ - The Guide
*Frame4:Clientaddress(opaquebinary)
*Frame6:Requestbody(opaquebinary)

AREPLYcommandconsistsofamultipartmessageof5frames:

*Frame1:""(0bytes,Emptyframe)
*Frame2:"MDPW01"(6bytes,Protocolidentifier)
*Frame3:0x03(1byte,REPLY)
*Frame4:Clientaddress(opaquebinary)
*Frame6:Requestbody(opaquebinary)

AHEARBEATcommandconsistsofamultipartmessageof3frames:

*Frame1:""(0bytes,Emptyframe)
*Frame2:"MDPW01"(6bytes,Protocolidentifier)
*Frame3:0x04(1byte,HEARBEAT)

ADISCONNECTcommandconsistsofamultipartmessageof3frames:

*Frame1:""(0bytes,Emptyframe)
*Frame2:"MDPW01"(6bytes,Protocolidentifier)
*Frame3:0x05(1byte,DISCONNECT)

This,asyoucansee,isclosetowhatIwrotebyhandintheoriginalspec.Now,ifyouhaveclonedthezguiderepositoryand
youarelookingatthecodeinexamples/models,youcangeneratetheMDPclientandworkercodecs.Wepassthesametwo
modelstoadifferentcodegenerator:

gslscript:codec_cmdp_client.xmlmdp_worker.xml

Whichgivesusmdp_clientandmdp_workerclasses.ActuallyMDPissosimplethatit'sbarelyworththeeffortofwritingthe
codegenerator.Theprofitcomeswhenwewanttochangetheprotocol(whichwedidforthestandaloneMajordomoproject).
Youmodifytheprotocol,runthecommand,andoutpopsmoreperfectcode.

Thecodec_c.gslcodegeneratorisnotshort,buttheresultingcodecsaremuchbetterthanthehandwrittencodeIoriginallyput
togetherforMajordomo.Forinstance,thehandwrittencodehadnoerrorcheckingandwoulddieifyoupasseditbogus
messages.

I'mnowgoingtoexplaintheprosandconsofGSLpoweredmodelorientedcodegeneration.Powerdoesnotcomeforfreeand
oneofthegreatesttrapsinourbusinessistheabilitytoinventconceptsoutofthinair.GSLmakesthisparticularlyeasy,soitcan
beanequallydangeroustool.

Donotinventconcepts.Thejobofadesigneristoremoveproblems,notaddfeatures.

Firstly,Iwilllayouttheadvantagesofmodelorientedcodegeneration:

Youcancreatenearperfectabstractionsthatmaptoyourrealworld.So,ourprotocolmodelmaps100%tothe"real
world"ofMajordomo.Thiswouldbeimpossiblewithoutthefreedomtotuneandchangethemodelinanyway.
Youcandeveloptheseperfectmodelsquicklyandcheaply.
Youcangenerateanytextoutput.Fromasinglemodel,youcancreatedocumentation,codeinanylanguage,testtools
literallyanyoutputyoucanthinkof.
Youcangenerate(andImeanthisliterally)perfectoutputbecauseit'scheaptoimproveyourcodegeneratorstoanylevel
youwant.
Yougetasinglesourcethatcombinesspecificationsandsemantics.
Youcanleverageasmallteamtoamassivesize.AtiMatix,weproducedthemillionlineOpenAMQmessagingproduct
outofperhaps85Klinesofinputmodels,includingthecodegenerationscriptsthemselves.

Nowlet'slookatthedisadvantages:

Youaddtooldependenciestoyourproject.
Youmaygetcarriedawayandcreatemodelsforthepurejoyofcreatingthem.
Youmayalienatenewcomers,whowillsee"strangestuff",fromyourwork.
Youmaygivepeopleastrongexcusenottoinvestinyourproject.

http://zguide.zeromq.org/page:all 168/225
12/31/2015 MQ - The Guide - MQ - The Guide
Cynically,modelorientedabuseworksgreatinenvironmentswhereyouwanttoproducehugeamountsofperfectcodethatyou
canmaintainwithlittleeffortandwhichnoonecanevertakeawayfromyou.Personally,Iliketocrossmyriversandmoveon.
Butiflongtermjobsecurityisyourthing,thisisalmostperfect.

SoifyoudouseGSLandwanttocreateopencommunitiesaroundyourwork,hereismyadvice:

Useitonlywhereyouwouldotherwisebewritingtiresomecodebyhand.
Designnaturalmodelsthatarewhatpeoplewouldexpecttosee.
Writethecodebyhandfirstsoyouknowwhattogenerate.
Donotoveruse.Keepitsimple!Donotgettoometa!!
Introducegraduallyintoaproject.
Putthegeneratedcodeintoyourrepositories.

We'realreadyusingGSLinsomeprojectsaroundZeroMQ.Forexample,thehighlevelCbinding,CZMQ,usesGSLtogenerate
thesocketoptionsclass(zsockopt).A300linecodegeneratorturns78linesofXMLmodelinto1,500linesofperfect,butreally
boringcode.That'sagoodwin.

TransferringFiles topprevnext

Let'stakeabreakfromthelecturingandgetbacktoourfirstloveandthereasonfordoingallofthis:code.

"HowdoIsendafile?"isacommonquestionontheZeroMQmailinglists.Thisshouldnotbesurprising,becausefiletransferis
perhapstheoldestandmostobvioustypeofmessaging.Sendingfilesaroundnetworkshaslotsofusecasesapartfrom
annoyingthecopyrightcartels.ZeroMQisverygoodoutoftheboxatsendingeventsandtasks,butlessgoodatsendingfiles.

I'vepromised,forayearortwo,towriteaproperexplanation.Here'sagratuitouspieceofinformationtobrightenyourmorning:
theword"proper"comesfromthearchaicFrenchpropre,whichmeans"clean".ThedarkageEnglishcommonfolk,notbeing
familiarwithhotwaterandsoap,changedthewordtomean"foreign"or"upperclass",asin"that'sproperfood!",butlaterthe
wordcametomeanjust"real",asin"that'sapropermessyou'vegottenusinto!"

So,filetransfer.Thereareseveralreasonsyoucan'tjustpickuparandomfile,blindfoldit,andshoveitwholeintoamessage.
ThemostobviousreasonbeingthatdespitedecadesofdeterminedgrowthinRAMsizes(andwhoamongusoldtimersdoesn't
fondlyremembersavingupforthat1024bytememoryextensioncard?!),disksizesobstinatelyremainmuchlarger.Evenifwe
couldsendafilewithoneinstruction(say,usingasystemcalllikesendfile),we'dhittherealitythatnetworksarenotinfinitelyfast
norperfectlyreliable.Aftertryingtouploadalargefileseveraltimesonaslowflakynetwork(WiFi,anyone?),you'llrealizethata
properfiletransferprotocolneedsawaytorecoverfromfailures.Thatis,itneedsawaytosendonlythepartofafilethatwasn't
yetreceived.

Finally,afterallthis,ifyoubuildaproperfileserver,you'llnoticethatsimplysendingmassiveamountsofdatatolotsofclients
createsthatsituationweliketocall,inthetechnicalparlance,"serverwentbellyupduetoallavailableheapmemorybeingeaten
byapoorlydesignedapplication".Aproperfiletransferprotocolneedstopayattentiontomemoryuse.

We'llsolvetheseproblemsproperly,onebyone,whichshouldhopefullygetustoagoodandproperfiletransferprotocolrunning
overZeroMQ.First,let'sgeneratea1GBtestfilewithrandomdata(realpoweroftwogigalikeVonNeummanintended,notthe
fakesilicononesthememoryindustrylikestosell):

ddif=/dev/urandomof=testdatabs=1Mcount=1024

Thisislargeenoughtobetroublesomewhenwehavelotsofclientsaskingforthesamefileatonce,andonmanymachines,
1GBisgoingtobetoolargetoallocateinmemoryanyhow.Asabasereference,let'smeasurehowlongittakestocopythisfile
fromdiskbacktodisk.Thiswilltellushowmuchourfiletransferprotocoladdsontop(includingnetworkcosts):

$timecptestdatatestdata2

real0m7.143s
user0m0.012s
sys0m1.188s

The4figureprecisionismisleadingexpectvariationsof25%eitherway.Thisisjustan"orderofmagnitude"measurement.

http://zguide.zeromq.org/page:all 169/225
12/31/2015 MQ - The Guide - MQ - The Guide
Here'sourfirstcutatthecode,wheretheclientasksforthetestdataandtheserverjustsendsit,withoutstoppingforbreath,as
aseriesofmessages,whereeachmessageholdsonechunk:

fileio1:Filetransfertest,model1inC

Python|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|Q|
Racket|Ruby|Scala|Tcl

It'sprettysimple,butwealreadyrunintoaproblem:ifwesendtoomuchdatatotheROUTERsocket,wecaneasilyoverflowit.
Thesimplebutstupidsolutionistoputaninfinitehighwatermarkonthesocket.It'sstupidbecausewenowhavenoprotection
againstexhaustingtheserver'smemory.YetwithoutaninfiniteHWM,werisklosingchunksoflargefiles.

Trythis:settheHWMto1,000(inZeroMQv3.xthisisthedefault)andthenreducethechunksizeto100Ksowesend10K
chunksinonego.Runthetest,andyou'llseeitneverfinishes.Asthezmq_socket()manpagesayswithcheerfulbrutality,for
theROUTERsocket:"ZMQ_HWMoptionaction:Drop".

Wehavetocontroltheamountofdatatheserversendsupfront.There'snopointinitsendingmorethanthenetworkcan
handle.Let'strysendingonechunkatatime.Inthisversionoftheprotocol,theclientwillexplicitlysay,"GivemechunkN",and
theserverwillfetchthatspecificchunkfromdiskandsendit.

Here'stheimprovedsecondmodel,wheretheclientasksforonechunkatatime,andtheserveronlysendsonechunkforeach
requestitgetsfromtheclient:

fileio2:Filetransfertest,model2inC

Python|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|Q|
Racket|Ruby|Scala|Tcl

Itismuchslowernow,becauseofthetoandfrochattingbetweenclientandserver.Wepayabout300microsecondsforeach
requestreplyroundtrip,onalocalloopconnection(clientandserveronthesamebox).Itdoesn'tsoundlikemuchbutitaddsup
quickly:

$time./fileio1
4296chunksreceived,1073741824bytes

real0m0.669s
user0m0.056s
sys0m1.048s

$time./fileio2
4295chunksreceived,1073741824bytes

real0m2.389s
user0m0.312s
sys0m2.136s

Therearetwovaluablelessonshere.First,whilerequestreplyiseasy,it'salsotooslowforhighvolumedataflows.Payingthat
300microsecondsoncewouldbefine.Payingitforeverysinglechunkisn'tacceptable,particularlyonrealnetworkswith
latenciesofperhaps1,000timeshigher.

ThesecondpointissomethingI'vesaidbeforebutwillrepeat:it'sincrediblyeasytoexperiment,measure,andimproveaprotocol
overZeroMQ.Andwhenthecostofsomethingcomeswaydown,youcanaffordalotmoreofit.Dolearntodevelopandprove
yourprotocolsinisolation:I'veseenteamswastetimetryingtoimprovepoorlydesignedprotocolsthataretoodeeplyembedded
inapplicationstobeeasilytestableorfixable.

Ourmodeltwofiletransferprotocolisn'tsobad,apartfromperformance:

Itcompletelyeliminatesanyriskofmemoryexhaustion.Toprovethat,wesetthehighwatermarkto1inbothsenderand
receiver.
Itletstheclientchoosethechunksize,whichisusefulbecauseifthere'sanytuningofthechunksizetobedone,for
networkconditions,forfiletypes,ortoreducememoryconsumptionfurther,it'stheclientthatshouldbedoingthis.
Itgivesusfullyrestartablefiletransfers.
Itallowstheclienttocancelthefiletransferatanypointintime.

Ifwejustdidn'thavetodoarequestforeachchunk,it'dbeausableprotocol.Whatweneedisawayfortheservertosend

http://zguide.zeromq.org/page:all 170/225
12/31/2015 MQ - The Guide - MQ - The Guide
multiplechunkswithoutwaitingfortheclienttorequestoracknowledgeeachone.Whatareourchoices?

Theservercouldsend10chunksatonce,thenwaitforasingleacknowledgment.That'sexactlylikemultiplyingthechunk
sizeby10,soit'spointless.Andyes,it'sjustaspointlessforallvaluesof10.

Theservercouldsendchunkswithoutanychatterfromtheclientbutwithaslightdelaybetweeneachsend,sothatit
wouldsendchunksonlyasfastasthenetworkcouldhandlethem.Thiswouldrequiretheservertoknowwhat's
happeningatthenetworklayer,whichsoundslikehardwork.Italsobreakslayeringhorribly.Andwhathappensifthe
networkisreallyfast,buttheclientitselfisslow?Wherearechunksqueuedthen?

Theservercouldtrytospyonthesendingqueue,i.e.,seehowfullitis,andsendonlywhenthequeueisn'tfull.Well,
ZeroMQdoesn'tallowthatbecauseitdoesn'twork,forthesamereasonasthrottlingdoesn'twork.Theserverandnetwork
maybemorethanfastenough,buttheclientmaybeaslowlittledevice.

WecouldmodifylibzmqtotakesomeotheractiononreachingHWM.Perhapsitcouldblock?Thatwouldmeanthata
singleslowclientwouldblockthewholeserver,sonothankyou.Maybeitcouldreturnanerrortothecaller?Thenthe
servercoulddosomethingsmartlikewell,thereisn'treallyanythingitcoulddothat'sanybetterthandroppingthe
message.

Apartfrombeingcomplexandvariouslyunpleasant,noneoftheseoptionswouldevenwork.Whatweneedisawayfortheclient
totelltheserver,asynchronouslyandinthebackground,thatit'sreadyformore.Weneedsomekindofasynchronousflow
control.Ifwedothisright,datashouldflowwithoutinterruptionfromtheservertotheclient,butonlyaslongastheclientis
readingit.Let'sreviewourthreeprotocols.Thiswasthefirstone:

C:fetch
S:chunk1
S:chunk2
S:chunk3
....

Andthesecondintroducedarequestforeachchunk:

C:fetchchunk1
S:sendchunk1
C:fetchchunk2
S:sendchunk2
C:fetchchunk3
S:sendchunk3
C:fetchchunk4
....

Nowwaveshandsmysteriouslyhere'sachangedprotocolthatfixestheperformanceproblem:

C:fetchchunk1
C:fetchchunk2
C:fetchchunk3
S:sendchunk1
C:fetchchunk4
S:sendchunk2
S:sendchunk3
....

Itlookssuspiciouslysimilar.Infact,it'sidenticalexceptthatwesendmultiplerequestswithoutwaitingforareplyforeachone.
Thisisatechniquecalled"pipelining"anditworksbecauseourDEALERandROUTERsocketsarefullyasynchronous.

Here'sthethirdmodelofourfiletransfertestbench,withpipelining.Theclientsendsanumberofrequestsahead(the"credit")
andtheneachtimeitprocessesanincomingchunk,itsendsonemorecredit.Theserverwillneversendmorechunksthanthe
clienthasaskedfor:

fileio3:Filetransfertest,model3inC

http://zguide.zeromq.org/page:all 171/225
12/31/2015 MQ - The Guide - MQ - The Guide

Python|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|Q|
Racket|Ruby|Scala|Tcl

ThattweakgivesusfullcontrolovertheendtoendpipelineincludingallnetworkbuffersandZeroMQqueuesatsenderand
receiver.Weensurethepipelineisalwaysfilledwithdatawhilenevergrowingbeyondapredefinedlimit.Morethanthat,the
clientdecidesexactlywhentosend"credit"tothesender.Itcouldbewhenitreceivesachunk,orwhenithasfullyprocesseda
chunk.Andthishappensasynchronously,withnosignificantperformancecost.

Inthethirdmodel,Ichoseapipelinesizeof10messages(eachmessageisachunk).Thiswillcostamaximumof2.5MB
memoryperclient.Sowith1GBofmemorywecanhandleatleast400clients.Wecantrytocalculatetheidealpipelinesize.It
takesabout0.7secondstosendthe1GBfile,whichisabout160microsecondsforachunk.Aroundtripis300microseconds,so
thepipelineneedstobeatleast35chunkstokeeptheserverbusy.Inpractice,Istillgotperformancespikeswithapipelineof5
chunks,probablybecausethecreditmessagessometimesgetdelayedbyoutgoingdata.Soat10chunks,itworksconsistently.

$time./fileio3
4291chunksreceived,1072741824bytes

real0m0.777s
user0m0.096s
sys0m1.120s

Domeasurerigorously.Yourcalculationsmaybegood,buttherealworldtendstohaveitsownopinions.

Whatwe'vemadeisclearlynotyetarealfiletransferprotocol,butitprovesthepatternandIthinkitisthesimplestplausible
design.Forarealworkingprotocol,wemightwanttoaddsomeorallof:

Authenticationandaccesscontrols,evenwithoutencryption:thepointisn'ttoprotectsensitivedata,buttocatcherrorslike
sendingtestdatatoproductionservers.

ACheapstylerequestincludingfilepath,optionalcompression,andotherstuffwe'velearnedisusefulfromHTTP(such
asIfModifiedSince).

ACheapstyleresponse,atleastforthefirstchunk,thatprovidesmetadatasuchasfilesize(sotheclientcanpre
allocate,andavoidunpleasantdiskfullsituations).

Theabilitytofetchasetoffilesinonego,otherwisetheprotocolbecomesinefficientforlargesetsofsmallfiles.

Confirmationfromtheclientwhenit'sfullyreceivedafile,torecoverfromchunksthatmightbelostiftheclientdisconnects
unexpectedly.

Sofar,oursemantichasbeen"fetch"thatis,therecipientknows(somehow)thattheyneedaspecificfile,sotheyaskforit.The
knowledgeofwhichfilesexistandwheretheyareisthenpassedoutofband(e.g.,inHTTP,bylinksintheHTMLpage).

Howabouta"push"semantic?Therearetwoplausibleusecasesforthis.First,ifweadoptacentralizedarchitecturewithfileson
amain"server"(notsomethingI'madvocating,butpeopledosometimeslikethis),thenit'sveryusefultoallowclientstoupload
filestotheserver.Second,itletsusdoakindofpubsubforfiles,wheretheclientasksforallnewfilesofsometypeasthe
servergetsthese,itforwardsthemtotheclient.

Afetchsemanticissynchronous,whileapushsemanticisasynchronous.Asynchronousislesschatty,sofaster.Also,youcan
docutethingslike"subscribetothispath"thuscreatingapubsubfiletransferarchitecture.ThatissoobviouslyawesomethatI
shouldn'tneedtoexplainwhatproblemitsolves.

Still,hereistheproblemwiththefetchsemantic:thatoutofbandroutetotellclientswhatfilesexist.Nomatterhowyoudothis,it
endsupbeingcomplex.Eitherclientshavetopoll,oryouneedaseparatepubsubchanneltokeepclientsuptodate,oryou
needuserinteraction.

Thinkingthisthroughalittlemore,though,wecanseethatfetchisjustaspecialcaseofpubsub.Sowecangetthebestofboth
worlds.Hereisthegeneraldesign:

Fetchthispath
Hereiscredit(repeat)

Tomakethiswork(andwewill,mydearreaders),weneedtobealittlemoreexplicitabouthowwesendcredittotheserver.The
cutetrickoftreatingapipelined"fetchchunk"requestascreditwon'tflybecausetheclientdoesn'tknowanylongerwhatfiles
actuallyexist,howlargetheyare,anything.Iftheclientsays,"I'mgoodfor250,000bytesofdata",thisshouldworkequallyfor1

http://zguide.zeromq.org/page:all 172/225
12/31/2015 MQ - The Guide - MQ - The Guide
fileof250Kbytes,or100filesof2,500bytes.

Andthisgivesus"creditbasedflowcontrol",whicheffectivelyremovestheneedforhighwatermarks,andanyriskofmemory
overflow.

StateMachines topprevnext

Softwareengineerstendtothinkof(finite)statemachinesasakindofintermediaryinterpreter.Thatis,youtakearegular
languageandcompilethatintoastatemachine,thenexecutethestatemachine.Thestatemachineitselfisrarelyvisibletothe
developer:it'saninternalrepresentationoptimized,compressed,andbizarre.

However,itturnsoutthatstatemachinesarealsovaluableasafirstclassmodelinglanguagesforprotocolhandlers,e.g.,
ZeroMQclientsandservers.ZeroMQmakesitrathereasytodesignprotocols,butwe'veneverdefinedagoodpatternforwriting
thoseclientsandserversproperly.

Aprotocolhasatleasttwolevels:

Howwerepresentindividualmessagesonthewire.
Howmessagesflowbetweenpeers,andthesignificanceofeachmessage.

We'veseeninthischapterhowtoproducecodecsthathandleserialization.That'sagoodstart.Butifweleavethesecondjobto
developers,thatgivesthemalotofroomtointerpret.Aswemakemoreambitiousprotocols(filetransfer+heartbeating+credit+
authentication),itbecomeslessandlesssanetotrytoimplementclientsandserversbyhand.

Yes,peopledothisalmostsystematically.Butthecostsarehigh,andthey'reavoidable.I'llexplainhowtomodelprotocolsusing
statemachines,andhowtogenerateneatandsolidcodefromthosemodels.

Myexperiencewithusingstatemachinesasasoftwareconstructiontooldatesto1985andmyfirstrealjobmakingtoolsfor
applicationdevelopers.In1991,IturnedthatknowledgeintoafreesoftwaretoolcalledLibero,whichspatoutexecutablestate
machinesfromasimpletextmodel.

ThethingaboutLibero'smodelwasthatitwasreadable.Thatis,youdescribedyourprogramlogicasnamedstates,each
acceptingasetofevents,eachdoingsomerealwork.Theresultingstatemachinehookedintoyourapplicationcode,drivingit
likeaboss.

Liberowascharminglygoodatitsjob,fluentinmanylanguages,andmodestlypopulargiventheenigmaticnatureofstate
machines.WeusedLiberoinangerindozensoflargedistributedapplications,oneofwhichwasfinallyswitchedoffin2011after
20yearsofoperation.Statemachinedrivencodeconstructionworkedsowellthatit'ssomewhatimpressivethatthisapproach
neverhitthemainstreamofsoftwareengineering.

SointhissectionI'mgoingtoexplainLibero'smodel,anddemonstratehowtouseittogenerateZeroMQclientsandservers.
We'lluseGSLagain,butlikeIsaid,theprinciplesaregeneralandyoucanputtogethercodegeneratorsusinganyscripting
language.

Asaworkedexample,let'sseehowtocarryonastatefuldialogwithapeeronaROUTERsocket.We'lldeveloptheserverusing
astatemachine(andtheclientbyhand).WehaveasimpleprotocolthatI'llcall"NOM".I'musingtheohsoveryserious
keywordsforunprotocolsproposal:

nomprotocol=openpeering*usepeering

openpeering=C:OHAI(S:OHAIOK/S:WTF)

usepeering=C:ICANHAZ
/S:CHEEZBURGER
/C:HUGZS:HUGZOK
/S:HUGZC:HUGZOK

I'venotfoundaquickwaytoexplainthetruenatureofstatemachineprogramming.Inmyexperience,itinvariablytakesafew
daysofpractice.Afterthreeorfourdays'exposuretotheidea,thereisanearaudible"click!"assomethinginthebrainconnects
allthepiecestogether.We'llmakeitconcretebylookingatthestatemachineforourNOMserver.

Ausefulthingaboutstatemachinesisthatyoucanreadthemstatebystate.Eachstatehasauniquedescriptivenameandone
http://zguide.zeromq.org/page:all 173/225
12/31/2015 MQ - The Guide - MQ - The Guide
ormoreevents,whichwelistinanyorder.Foreachevent,weperformzeroormoreactionsandwethenmovetoanextstate(or
stayinthesamestate).

InaZeroMQprotocolserver,wehaveastatemachineinstanceperclient.Thatsoundscomplexbutitisn't,aswe'llsee.We
describeourfirststate,Start,ashavingonevalidevent:OHAI.Wechecktheuser'scredentialsandthenarriveinthe
Authenticatedstate.

Figure64TheStartState

TheCheckCredentialsactionproduceseitheranokoranerrorevent.It'sintheAuthenticatedstatethatwehandlethese
twopossibleeventsbysendinganappropriatereplybacktotheclient.Ifauthenticationfailed,wereturntotheStartstatewhere
theclientcantryagain.

Figure65TheAuthenticatedState

Whenauthenticationhassucceeded,wearriveintheReadystate.Herewehavethreepossibleevents:anICANHAZorHUGZ
messagefromtheclient,oraheartbeattimerevent.

Figure66TheReadyState

http://zguide.zeromq.org/page:all 174/225
12/31/2015 MQ - The Guide - MQ - The Guide

Thereareafewmorethingsaboutthisstatemachinemodelthatareworthknowing:

Eventsinuppercase(like"HUGZ")areexternaleventsthatcomefromtheclientasmessages.
Eventsinlowercase(like"heartbeat")areinternalevents,producedbycodeintheserver.
The"SendSOMETHING"actionsareshorthandforsendingaspecificreplybacktotheclient.
Eventsthataren'tdefinedinaparticularstatearesilentlyignored.

Now,theoriginalsourcefortheseprettypicturesisanXMLmodel:

<classname="nom_server"script="server_c">

<statename="start">
<eventname="OHAI"next="authenticated">
<actionname="checkcredentials"/>
</event>
</state>

<statename="authenticated">
<eventname="ok"next="ready">
<actionname="send"message="OHAIOK"/>
</event>
<eventname="error"next="start">
<actionname="send"message="WTF"/>
</event>
</state>

<statename="ready">
<eventname="ICANHAZ">
<actionname="send"message="CHEEZBURGER"/>
</event>
<eventname="HUGZ">
<actionname="send"message="HUGZOK"/>
</event>
<eventname="heartbeat">
<actionname="send"message="HUGZ"/>
</event>
</state>
</class>

Thecodegeneratorisinexamples/models/server_c.gsl.ItisafairlycompletetoolthatI'lluseandexpandformoreserious
http://zguide.zeromq.org/page:all 175/225
12/31/2015 MQ - The Guide - MQ - The Guide
worklater.Itgenerates:

AserverclassinC(nom_server.c,nom_server.h)thatimplementsthewholeprotocolflow.
AselftestmethodthatrunstheselfteststepslistedintheXMLfile.
Documentationintheformofgraphics(theprettypictures).

Here'sasimplemainprogramthatstartsthegeneratedNOMserver:

#include"czmq.h"
#include"nom_server.h"

intmain(intargc,char*argv[])
{
printf("StartingNOMprotocolserveronport5670\n")
nom_server_t*server=nom_server_new()
nom_server_bind(server,"tcp://*:5670")
nom_server_wait(server)
nom_server_destroy(&server)
return0
}

Thegeneratednom_serverclassisafairlyclassicmodel.ItacceptsclientmessagesonaROUTERsocket,sothefirstframeon
everyrequestistheclient'sconnectionidentity.Theservermanagesasetofclients,eachwithstate.Asmessagesarrive,it
feedstheseaseventstothestatemachine.Here'sthecoreofthestatemachine,asamixofGSLcommandsandtheCcodewe
intendtogenerate:

client_execute(client_t*self,intevent)
{
self>next_event=event
while(self>next_event){
self>event=self>next_event
self>next_event=0
switch(self>state){
.forclass.state
case$(name:c)_state:
.forevent
.ifindex()>1
else
.endif
if(self>event==$(name:c)_event){
.foraction
.ifname="send"
zmsg_addstr(self>reply,"$(message:)")
.else
$(name:c)_action(self)
.endif
.endfor
.ifdefined(event.next)
self>state=$(next:c)_state
.endif
}
.endfor
break
.endfor
}
if(zmsg_size(self>reply)>1){
zmsg_send(&self>reply,self>router)
self>reply=zmsg_new()
zmsg_add(self>reply,zframe_dup(self>address))
}
}
http://zguide.zeromq.org/page:all 176/225
12/31/2015 MQ - The Guide - MQ - The Guide
}

Eachclientisheldasanobjectwithvariousproperties,includingthevariablesweneedtorepresentastatemachineinstance:

event_tnext_event//Nextevent
state_tstate//Currentstate
event_tevent//Currentevent

Youwillseebynowthatwearegeneratingtechnicallyperfectcodethathastheprecisedesignandshapewewant.Theonly
cluethatthenom_serverclassisn'thandwrittenisthatthecodeistoogood.Peoplewhocomplainthatcodegeneratorsproduce
poorcodeareaccustomedtopoorcodegenerators.Itistrivialtoextendourmodelasweneedit.Forexample,here'showwe
generatetheselftestcode.

First,weadda"selftest"itemtothestatemachineandwriteourtests.We'renotusinganyXMLgrammarorvalidationsoitreally
isjustamatterofopeningtheeditorandaddinghalfadozenlinesoftext:

<selftest>
<stepsend="OHAI"body="Sleepy"recv="WTF"/>
<stepsend="OHAI"body="Joe"recv="OHAIOK"/>
<stepsend="ICANHAZ"recv="CHEEZBURGER"/>
<stepsend="HUGZ"recv="HUGZOK"/>
<steprecv="HUGZ"/>
</selftest>

Designingonthefly,Idecidedthat"send"and"recv"wereanicewaytoexpress"sendthisrequest,thenexpectthisreply".
Here'stheGSLcodethatturnsthismodelintorealcode:

.forclass>selftest.step
.ifdefined(send)
msg=zmsg_new()
zmsg_addstr(msg,"$(send:)")
.ifdefined(body)
zmsg_addstr(msg,"$(body:)")
.endif
zmsg_send(&msg,dealer)

.endif
.ifdefined(recv)
msg=zmsg_recv(dealer)
assert(msg)
command=zmsg_popstr(msg)
assert(streq(command,"$(recv:)"))
free(command)
zmsg_destroy(&msg)

.endif
.endfor

Finally,oneofthemoretrickybutabsolutelyessentialpartsofanystatemachinegeneratorishowdoIplugthisintomyown
code?AsaminimalexampleforthisexerciseIwantedtoimplementthe"checkcredentials"actionbyacceptingallOHAIsfrom
myfriendJoe(HiJoe!)andrejecteveryoneelse'sOHAIs.Aftersomethought,Idecidedtograbcodedirectlyfromthestate
machinemodel,i.e.,embedactionbodiesintheXMLfile.Soinnom_server.xml,you'llseethis:

<actionname="checkcredentials">
char*body=zmsg_popstr(self>request)
if(body&&streq(body,"Joe"))
self>next_event=ok_event

http://zguide.zeromq.org/page:all 177/225
12/31/2015 MQ - The Guide - MQ - The Guide
else
self>next_event=error_event
free(body)
</action>

AndthecodegeneratorgrabsthatCcodeandinsertsitintothegeneratednom_server.cfile:

.forclass.action
staticvoid
$(name:c)_action(client_t*self){
$(string.trim(.):)
}
.endfor

Andnowwehavesomethingquiteelegant:asinglesourcefilethatdescribesmyserverstatemachineandalsocontainsthe
nativeimplementationsformyactions.Anicemixofhighlevelandlowlevelthatisabout90%smallerthantheCcode.

Beware,asyourheadspinswithnotionsofalltheamazingthingsyoucouldproducewithsuchleverage.Whilethisapproach
givesyourealpower,italsomovesyouawayfromyourpeers,andifyougotoofar,you'llfindyourselfworkingalone.

Bytheway,thissimplelittlestatemachinedesignexposesjustthreevariablestoourcustomcode:

self>next_event
self>request
self>reply

IntheLiberostatemachinemodel,thereareafewmoreconceptsthatwe'venotusedhere,butwhichwewillneedwhenwe
writelargerstatemachines:

Exceptions,whichletsuswriteterserstatemachines.Whenanactionraisesanexception,furtherprocessingontheevent
stops.Thestatemachinecanthendefinehowtohandleexceptionevents.
TheDefaultsstate,wherewecandefinedefaulthandlingforevents(especiallyusefulforexceptionevents).

AuthenticationUsingSASL topprevnext

WhenwedesignedAMQPin2007,wechosetheSimpleAuthenticationandSecurityLayer(SASL)fortheauthenticationlayer,
oneoftheideaswetookfromtheBEEPprotocolframework.SASLlookscomplexatfirst,butit'sactuallysimpleandfitsneatly
intoaZeroMQbasedprotocol.WhatIespeciallylikeaboutSASListhatit'sscalable.Youcanstartwithanonymousaccessor
plaintextauthenticationandnosecurity,andgrowtomoresecuremechanismsovertimewithoutchangingyourprotocol.

I'mnotgoingtogiveadeepexplanationnowbecausewe'llseeSASLinactionsomewhatlater.ButI'llexplaintheprincipleso
you'realreadysomewhatprepared.

IntheNOMprotocol,theclientstartedwithanOHAIcommand,whichtheservereitheraccepted("HiJoe!")orrejected.Thisis
simplebutnotscalablebecauseserverandclienthavetoagreeupfrontonthetypeofauthenticationthey'regoingtodo.

WhatSASLintroduced,whichisgenius,isafullyabstractedandnegotiablesecuritylayerthat'sstilleasytoimplementatthe
protocollevel.Itworksasfollows:

Theclientconnects.
Theserverchallengestheclient,passingalistofsecurity"mechanisms"thatitknowsabout.
Theclientchoosesasecuritymechanismthatitknowsabout,andanswerstheserver'schallengewithablobofopaque
datathat(andhere'stheneattrick)somegenericsecuritylibrarycalculatesandgivestotheclient.
Theservertakesthesecuritymechanismtheclientchose,andthatblobofdata,andpassesittoitsownsecuritylibrary.
Thelibraryeitheracceptstheclient'sanswer,ortheserverchallengesagain.

ThereareanumberoffreeSASLlibraries.Whenwecometorealcode,we'llimplementjusttwomechanisms,ANONYMOUS
andPLAIN,whichdon'tneedanyspeciallibraries.

TosupportSASL,wehavetoaddanoptionalchallenge/responsesteptoour"openpeering"flow.Hereiswhattheresulting

http://zguide.zeromq.org/page:all 178/225
12/31/2015 MQ - The Guide - MQ - The Guide
protocolgrammarlookslike(I'mmodifyingNOMtodothis):

securenom=openpeering*usepeering

openpeering=C:OHAI*(S:ORLYC:YARLY)(S:OHAIOK/S:WTF)

ORLY=1*mechanismchallenge
mechanism=string
challenge=*OCTET

YARLY=mechanismresponse
response=*OCTET

WhereORLYandYARLYcontainastring(alistofmechanismsinORLY,onemechanisminYARLY)andablobofopaquedata.
Dependingonthemechanism,theinitialchallengefromtheservermaybeempty.Wedon'tcare:wejustpassthistothesecurity
librarytodealwith.

TheSASLRFCgoesintodetailaboutotherfeatures(thatwedon'tneed),thekindsofwaysSASLcouldbeattacked,andsoon.

LargeScaleFilePublishing:FileMQ topprevnext

Let'sputallthesetechniquestogetherintoafiledistributionsystemthatI'llcallFileMQ.Thisisgoingtobearealproduct,living
onGitHub.Whatwe'llmakehereisafirstversionofFileMQ,asatrainingtool.Iftheconceptworks,therealthingmayeventually
getitsownbook.

WhymakeFileMQ? topprevnext

Whymakeafiledistributionsystem?IalreadyexplainedhowtosendlargefilesoverZeroMQ,andit'sreallyquitesimple.Butif
youwanttomakemessagingaccessibletoamilliontimesmorepeoplethancanuseZeroMQ,youneedanotherkindofAPI.An
APIthatmyfiveyearoldsoncanunderstand.AnAPIthatisuniversal,requiresnoprogramming,andworkswithjustaboutevery
singleapplication.

Yes,I'mtalkingaboutthefilesystem.It'stheDropBoxpattern:chuckyourfilessomewhereandtheygetmagicallycopied
somewhereelsewhenthenetworkconnectsagain.

However,whatI'maimingforisafullydecentralizedarchitecturethatlooksmorelikegit,thatdoesn'tneedanycloudservices
(thoughwecouldputFileMQinthecloud),andthatdoesmulticast,i.e.,cansendfilestomanyplacesatonce.

FileMQmustbesecure(able),easilyhookedintorandomscriptinglanguages,andasfastaspossibleacrossourdomesticand
officenetworks.

IwanttouseittobackupphotosfrommymobilephonetomylaptopoverWiFi.Tosharepresentationslidesinrealtimeacross
50laptopsinaconference.Tosharedocumentswithcolleaguesinameeting.Tosendearthquakedatafromsensorstocentral
clusters.TobackupvideofrommyphoneasItakeit,duringprotestsorriots.Tosynchronizeconfigurationfilesacrossacloudof
Linuxservers.

Avisionaryidea,isn'tit?Well,ideasarecheap.Thehardpartismakingthis,andmakingitsimple.

InitialDesignCut:theAPI topprevnext

Here'sthewayIseethefirstdesign.FileMQhastobedistributed,whichmeansthateverynodecanbeaserverandaclientat
thesametime.ButIdon'twanttheprotocoltobesymmetrical,becausethatseemsforced.Wehaveanaturalflowoffilesfrom
pointAtopointB,whereAisthe"server"andBisthe"client".Iffilesflowbacktheotherway,thenwehavetwoflows.FileMQis

http://zguide.zeromq.org/page:all 179/225
12/31/2015 MQ - The Guide - MQ - The Guide
notyetdirectorysynchronizationprotocol,butwe'llbringitquiteclose.

Thus,I'mgoingtobuildFileMQastwopieces:aclientandaserver.Then,I'llputthesetogetherinamainapplication(the
filemqtool)thatcanactbothasclientandserver.Thetwopieceswilllookquitesimilartothenom_server,withthesamekind
ofAPI:

fmq_server_t*server=fmq_server_new()
fmq_server_bind(server,"tcp://*:5670")
fmq_server_publish(server,"/home/ph/filemq/share","/public")
fmq_server_publish(server,"/home/ph/photos/stream","/photostream")

fmq_client_t*client=fmq_client_new()
fmq_client_connect(client,"tcp://pieter.filemq.org:5670")
fmq_client_subscribe(server,"/public/","/home/ph/filemq/share")
fmq_client_subscribe(server,"/photostream/","/home/ph/photos/stream")

while(!zctx_interrupted)
sleep(1)

fmq_server_destroy(&server)
fmq_client_destroy(&client)

IfwewrapthisCAPIinotherlanguages,wecaneasilyscriptFileMQ,embeditapplications,portittosmartphones,andsoon.

InitialDesignCut:theProtocol topprevnext

Thefullnamefortheprotocolisthe"FileMessageQueuingProtocol",orFILEMQinuppercasetodistinguishitfromthesoftware.
Tostartwith,wewritedowntheprotocolasanABNFgrammar.Ourgrammarstartswiththeflowofcommandsbetweenthe
clientandserver.Youshouldrecognizetheseasacombinationofthevarioustechniqueswe'veseenalready:

filemqprotocol=openpeering*usepeering[closepeering]

openpeering=C:OHAI*(S:ORLYC:YARLY)(S:OHAIOK/error)

usepeering=C:ICANHAZ(S:ICANHAZOK/error)
/C:NOM
/S:CHEEZBURGER
/C:HUGZS:HUGZOK
/S:HUGZC:HUGZOK

closepeering=C:KTHXBAI/S:KTHXBAI

error=S:SRSLY/S:RTFM

Herearethecommandstoandfromtheserver:

Theclientopenspeeringtotheserver
OHAI=signature%x01protocolversion
signature=%xAA%xA3
protocol=stringMustbe"FILEMQ"
string=size*VCHAR
size=OCTET
version=%x01

TheserverchallengestheclientusingtheSASLmodel
ORLY=signature%x02mechanismschallenge

http://zguide.zeromq.org/page:all 180/225
12/31/2015 MQ - The Guide - MQ - The Guide
mechanisms=size1*mechanism
mechanism=string
challenge=*OCTETZeroMQframe

TheclientrespondswithSASLauthenticationinformation
YARLY=%signaturex03mechanismresponse
response=*OCTETZeroMQframe

Theservergrantstheclientaccess
OHAIOK=signature%x04

Theclientsubscribestoavirtualpath
ICANHAZ=signature%x05pathoptionscache
path=stringFullpathorpathprefix
options=dictionary
dictionary=size*keyvalue
keyvalue=stringFormattedasname=value
cache=dictionaryFileSHA1signatures

Theserverconfirmsthesubscription
ICANHAZOK=signature%x06

Theclientsendscredittotheserver
NOM=signature%x07credit
credit=8OCTET64bitinteger,networkorder
sequence=8OCTET64bitinteger,networkorder

Theserversendsachunkoffiledata
CHEEZBURGER=signature%x08sequenceoperationfilename
offsetheaderschunk
sequence=8OCTET64bitinteger,networkorder
operation=OCTET
filename=string
offset=8OCTET64bitinteger,networkorder
headers=dictionary
chunk=FRAME

Clientorserversendsaheartbeat
HUGZ=signature%x09

Clientorserverrespondstoaheartbeat
HUGZOK=signature%x0A

Clientclosesthepeering
KTHXBAI=signature%x0B

Andherearethedifferentwaystheservercantelltheclientthingswentwrong:

Servererrorreplyrefusedduetoaccessrights
S:SRSLY=signature%x80reason

Servererrorreplyclientsentaninvalidcommand
S:RTFM=signature%x81reason

FILEMQlivesontheZeroMQunprotocolswebsiteandhasaregisteredTCPportwithIANA(theInternetAssignedNumbers
Authority),whichisport5670.

BuildingandTryingFileMQ topprevnext

http://zguide.zeromq.org/page:all 181/225
12/31/2015 MQ - The Guide - MQ - The Guide

TheFileMQstackisonGitHub.ItworkslikeaclassicC/C++project:

gitclonegit://github.com/zeromq/filemq.git
cdfilemq
./autogen.sh
./configure
makecheck

YouwanttobeusingthelatestCZMQmasterforthis.Nowtryrunningthetrackcommand,whichisasimpletoolthatuses
FileMQtotrackchangesinonedirectoryinanother:

cdsrc
./track./fmqroot/send./fmqroot/recv

Andopentwofilenavigatorwindows,oneintosrc/fmqroot/sendandoneintosrc/fmqroot/recv.Dropfilesintothesend
folderandyou'llseethemarriveintherecvfolder.Theserverchecksoncepersecondfornewfiles.Deletefilesinthesend
folder,andthey'redeletedintherecvfoldersimilarly.

IusetrackforthingslikeupdatingmyMP3playermountedasaUSBdrive.AsIaddorremovefilesinmylaptop'sMusicfolder,
thesamechangeshappenontheMP3player.FILEMQisn'tafullreplicationprotocolyet,butwe'llfixthatlater.

InternalArchitecture topprevnext

TobuildFileMQIusedalotofcodegeneration,possiblytoomuchforatutorial.Howeverthecodegeneratorsareallreusablein
otherstacksandwillbeimportantforourfinalprojectinChapter8AFrameworkforDistributedComputing.Theyarean
evolutionofthesetwesawearlier:

codec_c.gsl:generatesamessagecodecforagivenprotocol.
server_c.gsl:generatesaserverclassforaprotocolandstatemachine.
client_c.gsl:generatesaclientclassforaprotocolandstatemachine.

ThebestwaytolearntouseGSLcodegenerationistotranslatetheseintoalanguageofyourchoiceandmakeyourowndemo
protocolsandstacks.You'llfinditfairlyeasy.FileMQitselfdoesn'ttrytosupportmultiplelanguages.Itcould,butit'dmakethings
needlesslycomplex.

TheFileMQarchitectureactuallyslicesintotwolayers.There'sagenericsetofclassestohandlechunks,directories,files,
patches,SASLsecurity,andconfigurationfiles.Then,there'sthegeneratedstack:messages,client,andserver.IfIwascreating
anewprojectI'dforkthewholeFileMQproject,andgoandmodifythethreemodels:

fmq_msg.xml:definesthemessageformats
fmq_client.xml:definestheclientstatemachine,API,andimplementation.
fmq_server.xml:doesthesamefortheserver.

You'dwanttorenamethingstoavoidconfusion.Whydidn'tImakethereusableclassesintoaseparatelibrary?Theansweris
twofold.First,nooneactuallyneedsthis(yet).Second,it'dmakethingsmorecomplexforyouasyoubuildandplaywithFileMQ.
It'sneverworthaddingcomplexitytosolveatheoreticalproblem.

AlthoughIwroteFileMQinC,it'seasytomaptootherlanguages.ItisquiteamazinghowniceCbecomeswhenyouadd
CZMQ'sgenericzlistandzhashcontainersandclassstyle.Letmegothroughtheclassesquickly:

fmq_sasl:encodesanddecodesaSASLchallenge.IonlyimplementedthePLAINmechanism,whichisenoughtoprove
theconcept.

fmq_chunk:workswithvariablesizedblobs.NotasefficientasZeroMQ'smessagesbuttheydolessweirdnessandso
areeasiertounderstand.Thechunkclasshasmethodstoreadandwritechunksfromdisk.

http://zguide.zeromq.org/page:all 182/225
12/31/2015 MQ - The Guide - MQ - The Guide
fmq_file:workswithfiles,whichmayormaynotexistondisk.Givesyouinformationaboutafile(likesize),letsyouread
andwritetofiles,removefiles,checkifafileexists,andcheckifafileis"stable"(moreonthatlater).

fmq_dir:workswithdirectories,readingthemfromdiskandcomparingtwodirectoriestoseewhatchanged.Whenthere
arechanges,returnsalistof"patches".

fmq_patch:workswithonepatch,whichreallyjustsays"createthisfile"or"deletethisfile"(referringtoafmq_fileitem
eachtime).

fmq_config:workswithconfigurationdata.I'llcomebacktoclientandserverconfigurationlater.

Everyclasshasatestmethod,andthemaindevelopmentcycleis"edit,test".Thesearemostlysimpleselftests,buttheymake
thedifferencebetweencodeIcantrustandcodeIknowwillstillbreak.It'sasafebetthatanycodethatisn'tcoveredbyatest
casewillhaveundiscoverederrors.I'mnotafanofexternaltestharnesses.Butinternaltestcodethatyouwriteasyouwriteyour
functionalitythat'slikethehandleonaknife.

Youshould,really,beabletoreadthesourcecodeandrapidlyunderstandwhattheseclassesaredoing.Ifyoucan'treadthe
codehappily,tellme.IfyouwanttoporttheFileMQimplementationintootherlanguages,startbyforkingthewholerepository
andlaterwe'llseeifit'spossibletodothisinoneoverallrepo.

PublicAPI topprevnext

ThepublicAPIconsistsoftwoclasses(aswesketchedearlier):

fmq_client:providestheclientAPI,withmethodstoconnecttoaserver,configuretheclient,andsubscribetopaths.

fmq_server:providestheserverAPI,withmethodstobindtoaport,configuretheserver,andpublishapath.

TheseclassesprovideanmultithreadedAPI,amodelwe'veusedafewtimesnow.WhenyoucreateanAPIinstance(i.e.,
fmq_server_new()orfmq_client_new()),thismethodkicksoffabackgroundthreadthatdoestherealwork,i.e.,runsthe
serverortheclient.TheotherAPImethodsthentalktothisthreadoverZeroMQsockets(apipeconsistingoftwoPAIRsockets
overinproc://).

IfIwasakeenyoungdevelopereagertouseFileMQinanotherlanguage,I'dprobablyspendahappyweekendwritingabinding
forthispublicAPI,thenstickitinasubdirectoryofthefilemqprojectcalled,say,bindings/,andmakeapullrequest.

TheactualAPImethodscomefromthestatemachinedescription,likethis(fortheserver):

<methodname="publish">
<argumentname="location"type="string"/>
<argumentname="alias"type="string"/>
mount_t*mount=mount_new(location,alias)
zlist_append(self>mounts,mount)
</method>

Whichgetsturnedintothiscode:

void
fmq_server_publish(fmq_server_t*self,char*location,char*alias)
{
assert(self)
assert(location)
assert(alias)
zstr_sendm(self>pipe,"PUBLISH")
zstr_sendfm(self>pipe,"%s",location)
zstr_sendf(self>pipe,"%s",alias)
}

http://zguide.zeromq.org/page:all 183/225
12/31/2015 MQ - The Guide - MQ - The Guide

DesignNotes topprevnext

ThehardestpartofmakingFileMQwasn'timplementingtheprotocol,butmaintainingaccuratestateinternally.AnFTPorHTTP
serverisessentiallystateless.Butapublish/subscribeserverhastomaintainsubscriptions,atleast.

SoI'llgothroughsomeofthedesignaspects:

Theclientdetectsiftheserverhasdiedbythelackofheartbeats(HUGZ)comingfromtheserver.Itthenrestartsitsdialog
bysendinganOHAI.There'snotimeoutontheOHAIbecausetheZeroMQDEALERsocketwillqueueanoutgoing
messageindefinitely.

Ifaclientstopsreplyingwith(HUGZOK)totheheartbeatsthattheserversends,theserverconcludesthattheclienthas
diedanddeletesallstatefortheclientincludingitssubscriptions.

TheclientAPIholdssubscriptionsinmemoryandreplaysthemwhenithasconnectedsuccessfully.Thismeansthecaller
cansubscribeatanytime(anddoesn'tcarewhenconnectionsandauthenticationactuallyhappen).

Theserverandclientusevirtualpaths,muchlikeanHTTPorFTPserver.Youpublishoneormoremountpoints,each
correspondingtoadirectoryontheserver.Eachofthesemapstosomevirtualpath,forinstance"/"ifyouhaveonlyone
mountpoint.Clientsthensubscribetovirtualpaths,andfilesarriveinaninboxdirectory.Wedon'tsendphysicalfile
namesacrossthenetwork.

Therearesometimingissues:iftheserveriscreatingitsmountpointswhileclientsareconnectedandsubscribing,the
subscriptionswon'tattachtotherightmountpoints.So,webindtheserverportaslastthing.

ClientscanreconnectatanypointiftheclientsendsOHAI,thatsignalstheendofanypreviousconversationandthestart
ofanewone.Imightonedaymakesubscriptionsdurableontheserver,sotheysurviveadisconnection.Theclientstack,
afterreconnecting,replaysanysubscriptionsthecallerapplicationalreadymade.

Configuration topprevnext

I'vebuiltseverallargeserverproducts,liketheXitamiwebserverthatwaspopularinthelate90's,andtheOpenAMQmessaging
server.Gettingconfigurationeasyandobviouswasalargepartofmakingtheseserversfuntouse.

Wetypicallyaimtosolveanumberofproblems:

Shipdefaultconfigurationfileswiththeproduct.
Allowuserstoaddcustomconfigurationfilesthatareneveroverwritten.
Allowuserstoconfigurefromthecommandline.

Andthenlayertheseoneontheother,socommandlinesettingsoverridecustomsettings,whichoverridedefaultsettings.Itcan
bealotofworktodothisright.ForFileMQ,I'vetakenasomewhatsimplerapproach:allconfigurationisdonefromtheAPI.

Thisishowwestartandconfiguretheserver,forexample:

server=fmq_server_new()
fmq_server_configure(server,"server_test.cfg")
fmq_server_publish(server,"./fmqroot/send","/")
fmq_server_publish(server,"./fmqroot/logs","/logs")
fmq_server_bind(server,"tcp://*:5670")

Wedouseaspecificformatfortheconfigfiles,whichisZPL,aminimalistsyntaxthatwestartedusingforZeroMQ"devices"a
fewyearsago,butwhichworkswellforanyserver:

#Configureserverforplainaccess
#
server
monitor=1#Checkmountpoints

http://zguide.zeromq.org/page:all 184/225
12/31/2015 MQ - The Guide - MQ - The Guide
heartbeat=1#Heartbeattoclients

publish
location=./fmqroot/logs
virtual=/logs

security
echo=I:useguest/guesttologintoserver
#TheseareSASLmechanismsweaccept
anonymous=0
plain=1
account
login=guest
password=guest
group=guest
account
login=super
password=secret
group=admin

Onecutething(whichseemsuseful)thegeneratedservercodedoesistoparsethisconfigfile(whenyouusethe
fmq_server_configure()method)andexecuteanysectionthatmatchesanAPImethod.Thusthepublishsectionworks
asafmq_server_publish()method.

FileStability topprevnext

Itisquitecommontopolladirectoryforchangesandthendosomething"interesting"withnewfiles.Butasoneprocessiswriting
toafile,otherprocesseshavenoideawhenthefilehasbeenfullywritten.Onesolutionistoaddasecond"indicator"filethatwe
createaftercreatingthefirstfile.Thisisintrusive,however.

Thereisaneaterway,whichistodetectwhenafileis"stable",i.e.,nooneiswritingtoitanylonger.FileMQdoesthisby
checkingthemodificationtimeofthefile.Ifit'smorethanasecondold,thenthefileisconsideredstable,atleaststableenough
tobeshippedofftoclients.Ifaprocesscomesalongafterfiveminutesandappendstothefile,it'llbeshippedoffagain.

Forthistowork,andthisisarequirementforanyapplicationhopingtouseFileMQsuccessfully,donotbuffermorethana
second'sworthofdatainmemorybeforewriting.Ifyouuseverylargeblocksizes,thefilemaylookstablewhenit'snot.

DeliveryNotifications topprevnext

OneofthenicethingsaboutthemultithreadedAPImodelwe'reusingisthatit'sessentiallymessagebased.Thismakesitideal
forreturningeventsbacktothecaller.AmoreconventionalAPIapproachwouldbetousecallbacks.Butcallbacksthatcross
threadboundariesaresomewhatdelicate.Here'showtheclientsendsamessagebackwhenithasreceivedacompletefile:

zstr_sendm(self>pipe,"DELIVER")
zstr_sendm(self>pipe,filename)
zstr_sendf(self>pipe,"%s/%s",inbox,filename)

Wecannowadda_recv()methodtotheAPIthatwaitsforeventsbackfromtheclient.Itmakesacleanstyleforthecaller:
createtheclientobject,configureit,andthenreceiveandprocessanyeventsitreturns.

SymbolicLinks topprevnext

http://zguide.zeromq.org/page:all 185/225
12/31/2015 MQ - The Guide - MQ - The Guide

Whileusingastagingareaisanice,simpleAPI,italsocreatescostsforsenders.IfIalreadyhavea2GBvideofileonacamera,
andwanttosenditviaFileMQ,thecurrentimplementationasksthatIcopyittoastagingareabeforeitwillbesentto
subscribers.

Oneoptionistomountthewholecontentdirectory(e.g.,/home/me/Movies),butthisisfragilebecauseitmeanstheapplication
can'tdecidetosendindividualfiles.It'severythingornothing.

Asimpleansweristoimplementportablesymboliclinks.AsWikipediaexplains:"Asymboliclinkcontainsatextstringthatis
automaticallyinterpretedandfollowedbytheoperatingsystemasapathtoanotherfileordirectory.Thisotherfileordirectoryis
calledthetarget.Thesymboliclinkisasecondfilethatexistsindependentlyofitstarget.Ifasymboliclinkisdeleted,itstarget
remainsunaffected."

Thisdoesn'taffecttheprotocolinanywayit'sanoptimizationintheserverimplementation.Let'smakeasimpleportable
implementation:

Asymboliclinkconsistsofafilewiththeextension.ln.
Thefilenamewithout.lnisthepublishedfilename.
Thelinkfilecontainsoneline,whichistherealpathtothefile.

Becausewe'vecollectedalloperationsonfilesinasingleclass(fmq_file),it'sacleanchange.Whenwecreateanewfile
object,wecheckifit'sasymboliclinkandthenallreadonlyactions(getfilesize,readfile)operateonthetargetfile,notthelink.

RecoveryandLateJoiners topprevnext

Asitstandsnow,FileMQhasonemajorremainingproblem:itprovidesnowayforclientstorecoverfromfailures.Thescenariois
thataclient,connectedtoaserver,startstoreceivefilesandthendisconnectsforsomereason.Thenetworkmaybetooslow,or
breaks.Theclientmaybeonalaptopwhichisshutdown,thenresumed.TheWiFimaybedisconnected.Aswemovetoamore
mobileworld(seeChapter8AFrameworkforDistributedComputing)thisusecasebecomesmoreandmorefrequent.Insome
waysit'sbecomingadominantusecase.

IntheclassicZeroMQpubsubpattern,therearetwostrongunderlyingassumptions,bothofwhichareusuallywronginFileMQ's
realworld.First,thatdataexpiresveryrapidlysothatthere'snointerestinaskingfromolddata.Second,thatnetworksarestable
andrarelybreak(soit'sbettertoinvestmoreinimprovingtheinfrastructureandlessinaddressingrecovery).

TakeanyFileMQusecaseandyou'llseethatiftheclientdisconnectsandreconnects,thenitshouldgetanythingitmissed.A
furtherimprovementwouldbetorecoverfrompartialfailures,likeHTTPandFTPdo.Butonethingatatime.

Oneanswertorecoveryis"durablesubscriptions",andthefirstdraftsoftheFILEMQprotocolaimedtosupportthis,withclient
identifiersthattheservercouldholdontoandstore.Soifaclientreappearsafterafailure,theserverwouldknowwhatfilesithad
notreceived.

Statefulserversare,however,nastytomakeanddifficulttoscale.Howdowe,forexample,dofailovertoasecondaryserver?
Wheredoesitgetitssubscriptionsfrom?It'sfarnicerifeachclientconnectionworksindependentlyandcarriesallnecessary
statewithit.

Anothernailinthecoffinofdurablesubscriptionsisthatitrequiresupfrontcoordination.Upfrontcoordinationisalwaysared
flag,whetherit'sinateamofpeopleworkingtogether,orabunchofprocessestalkingtoeachother.Whataboutlatejoiners?In
therealworld,clientsdonotneatlylineupandthenallsay"Ready!"atthesametime.Intherealworld,theycomeandgo
arbitrarily,andit'svaluableifwecantreatabrandnewclientinthesamewayasaclientthatwentawayandcameback.

ToaddressthisIwilladdtwoconceptstotheprotocol:aresynchronizationoptionandacachefield(adictionary).Iftheclient
wantsrecovery,itsetstheresynchronizationoption,andtellstheserverwhatfilesitalreadyhasviathecachefield.Weneed
both,becausethere'snowayintheprotocoltodistinguishbetweenanemptyfieldandanullfield.TheFILEMQRFCdescribes
thesefieldsasfollows:

Theoptionsfieldprovidesadditionalinformationtotheserver.TheserverSHOULDimplementtheseoptions:RESYNC=1
iftheclientsetsthis,theserverSHALLsendthefullcontentsofthevirtualpathtotheclient,exceptfilestheclientalready
has,asidentifiedbytheirSHA1digestinthecachefield.

And:

http://zguide.zeromq.org/page:all 186/225
12/31/2015 MQ - The Guide - MQ - The Guide

WhentheclientspecifiestheRESYNCoption,thecachedictionaryfieldtellstheserverwhichfilestheclientalreadyhas.
Eachentryinthecachedictionaryisa"filename=digest"key/valuepairwherethedigestSHALLbeaSHA1digestin
printablehexadecimalformat.Ifthefilenamestartswith"/"thenitSHOULDstartwiththepath,otherwisetheserverMUST
ignoreit.Ifthefilenamedoesnotstartwith"/"thentheserverSHALLtreatitasrelativetothepath.

Clientsthatknowtheyareintheclassicpubsubusecasejustdon'tprovideanycachedata,andclientsthatwantrecovery
providetheircachedata.Itrequiresnostateintheserver,noupfrontcoordination,andworksequallywellforbrandnewclients
(whichmayhavereceivedfilesviasomeoutofbandmeans),andclientsthatreceivedsomefilesandwerethendisconnectedfor
awhile.

IdecidedtouseSHA1digestsforseveralreasons.First,it'sfastenough:150msectodigesta25MBcoredumponmylaptop.
Second,it'sreliable:thechanceofgettingthesamehashfordifferentversionsofonefileiscloseenoughtozero.Third,it'sthe
widestsupporteddigestalgorithm.Acyclicredundancycheck(e.g.,CRC32)isfasterbutnotreliable.MorerecentSHAversions
(SHA256,SHA512)aremoresecurebuttake50%moreCPUcycles,andareoverkillforourneeds.

HereiswhatatypicalICANHAZmessagelookslikewhenweusebothcachingandresyncing(thisisoutputfromthedump
methodofthegeneratedcodecclass):

ICANHAZ:
path='/photos'
options={
RESYNC=1
}
cache={
DSCF0001.jpg=1FABCD4259140ACA99E991E7ADD2034AC57D341D
DSCF0006.jpg=01267C7641C5A22F2F4B0174FFB0C94DC59866F6
DSCF0005.jpg=698E88C05B5C280E75C055444227FEA6FB60E564
DSCF0004.jpg=F0149101DD6FEC13238E6FD9CA2F2AC62829CBD0
DSCF0003.jpg=4A49F25E2030B60134F109ABD0AD9642C8577441
DSCF0002.jpg=F84E4D69D854D4BF94B5873132F9892C8B5FA94E
}

Althoughwedon'tdothisinFileMQ,theservercanusethecacheinformationtohelptheclientcatchupwithdeletionsthatit
missed.Todothis,itwouldhavetologdeletions,andthencomparethislogwiththeclientcachewhenaclientsubscribes.

TestUseCase:TheTrackTool topprevnext

ToproperlytestsomethinglikeFileMQweneedatestcasethatplayswithlivedata.Oneofmysysadmintasksistomanagethe
MP3tracksonmymusicplayer,whichis,bytheway,aSansaClipreflashedwithRockBox,whichIhighlyrecommend.AsI
downloadtracksintomyMusicfolder,Iwanttocopythesetomyplayer,andasIfindtracksthatannoyme,Ideletetheminthe
Musicfolderandwantthosegonefrommyplayertoo.

Thisiskindofoverkillforapowerfulfiledistributionprotocol.IcouldwritethisusingabashorPerlscript,buttobehonestthe
hardestworkinFileMQwasthedirectorycomparisoncodeandIwanttobenefitfromthat.SoIputtogetherasimpletoolcalled
track,whichcallstheFileMQAPI.Fromthecommandlinethisrunswithtwoargumentsthesendingandthereceiving
directories:

./track/home/ph/Music/media/32306364/MUSIC

ThecodeisaneatexampleofhowtousetheFileMQAPItodolocalfiledistribution.Hereisthefullprogram,minusthelicense
text(it'sMIT/X11licensed):

#include"czmq.h"
#include"../include/fmq.h"

http://zguide.zeromq.org/page:all 187/225
12/31/2015 MQ - The Guide - MQ - The Guide
intmain(intargc,char*argv[])
{
fmq_server_t*server=fmq_server_new()
fmq_server_configure(server,"anonymous.cfg")
fmq_server_publish(server,argv[1],"/")
fmq_server_set_anonymous(server,true)
fmq_server_bind(server,"tcp://*:5670")

fmq_client_t*client=fmq_client_new()
fmq_client_connect(client,"tcp://localhost:5670")
fmq_client_set_inbox(client,argv[2])
fmq_client_set_resync(client,true)
fmq_client_subscribe(client,"/")

while(true){
//Getmessagefromfmq_clientAPI
zmsg_t*msg=fmq_client_recv(client)
if(!msg)
break//Interrupted
char*command=zmsg_popstr(msg)
if(streq(command,"DELIVER")){
char*filename=zmsg_popstr(msg)
char*fullname=zmsg_popstr(msg)
printf("I:received%s(%s)\n",filename,fullname)
free(filename)
free(fullname)
}
free(command)
zmsg_destroy(&msg)
}
fmq_server_destroy(&server)
fmq_client_destroy(&client)
return0
}

Notehowweworkwithphysicalpathsinthistool.Theserverpublishesthephysicalpath/home/ph/Musicandmapsthistothe
virtualpath/.Theclientsubscribesto/andreceivesallfilesin/media/32306364/MUSIC.Icoulduseanystructurewithinthe
serverdirectory,anditwouldbecopiedfaithfullytotheclient'sinbox.NotetheAPImethodfmq_client_set_resync(),which
causesaservertoclientsynchronization.

GettinganOfficialPortNumber topprevnext

We'vebeenusingport5670intheexamplesforFILEMQ.Unlikeallthepreviousexamplesinthisbook,thisportisn'tarbitrarybut
wasassignedbytheInternetAssignedNumbersAuthority(IANA),which"isresponsiblefortheglobalcoordinationoftheDNS
Root,IPaddressing,andotherInternetprotocolresources".

I'llexplainverybrieflywhenandhowtorequestregisteredportnumbersforyourapplicationprotocols.Themainreasonisto
ensurethatyourapplicationscanruninthewildwithoutconflictwithotherprotocols.Technically,ifyoushipanysoftwarethat
usesportnumbersbetween1024and49151,youshouldbeusingonlyIANAregisteredportnumbers.Manyproductsdon't
botherwiththis,however,andtendinsteadtousetheIANAlistas"portstoavoid".

Ifyouaimtomakeapublicprotocolofanyimportance,suchasFILEMQ,you'regoingtowantanIANAregisteredport.I'llexplain
brieflyhowtodothis:

Documentyourprotocolclearly,asIANAwillwantaspecificationofhowyouintendtousetheport.Itdoesnothavetobe
afullyformedprotocolspecification,butmustbesolidenoughtopassexpertreview.

Decidewhattransportprotocolsyouwant:UDP,TCP,SCTP,andsoon.WithZeroMQyouwillusuallyonlywantTCP.

Fillintheapplicationoniana.org,providingallthenecessaryinformation.

http://zguide.zeromq.org/page:all 188/225
12/31/2015 MQ - The Guide - MQ - The Guide
IANAwillthencontinuetheprocessbyemailuntilyourapplicationisacceptedorrejected.

Notethatyoudon'trequestaspecificportnumberIANAwillassignyouone.It'sthereforewisetostartthisprocessbeforeyou
shipsoftware,notafterwards.

Chapter8AFrameworkforDistributedComputing topprevnext

We'vegonethoughajourneyofunderstandingZeroMQinitsmanyaspects.Bynowyoumayhavestartedtobuildyourown
productsusingthetechniquesIexplained,aswellasothersyou'vefiguredoutyourself.Youwillstarttofacequestionsabouthow
tomaketheseproductsworkintherealworld.

Butwhatisthat"realworld"?I'llarguethatitisbecomingaworldofeverincreasingnumbersofmovingpieces.Somepeopleuse
thephrasethe"InternetofThings",suggestingthatwe'llseeanewcategoryofdevicesthataremorenumerousbutalsomore
stupidthanourcurrentsmartphones,tablets,laptops,andservers.However,Idon'tthinkthedatapointsthiswayatall.Yes,
therearemoreandmoredevices,butthey'renotstupidatall.They'resmartandpowerfulandgettingmoresoallthetime.

ThemechanismatworkissomethingIcall"CostGravity"andithastheeffectofreducingthecostoftechnologybyhalfevery18
24months.Putanotherway,ourglobalcomputingcapacitydoubleseverytwoyears,overandoverandover.Thefutureisfilled
withtrillionsofdevicesthatarefullypowerfulmulticorecomputers:theydon'trunacutdown"operatingsystemforthings"but
fulloperatingsystemsandfullapplications.

Andthisistheworldwe'retargetingwithZeroMQ.Whenwetalkof"scale",wedon'tmeanhundredsofcomputers,oreven
thousands.Thinkofcloudsoftinysmartandperhapsselfreplicatingmachinessurroundingeveryperson,fillingeveryspace,
coveringeverywall,fillingthecracksandeventually,becomingsomuchapartofusthatwegetthembeforebirthandtheyfollow
ustodeath.

Thesecloudsoftinymachinestalktoeachother,allthetime,overshortrangewirelesslinksusingtheInternetProtocol.They
createmeshnetworks,passinformationandtasksaroundlikenervoussignals.Theyaugmentourmemory,vision,everyaspect
ofourcommunications,andphysicalfunctions.Andit'sZeroMQthatpowerstheirconversationsandeventsandexchangesof
workandinformation.

Now,tomakeevenathinimitationofthiscometruetoday,weneedtosolveasetoftechnicalproblems.Theseinclude:Howdo
peersdiscovereachother?HowdotheytalktoexistingnetworksliketheWeb?Howdotheyprotecttheinformationtheycarry?
Howdowetrackandmonitorthem,togetsomeideaofwhatthey'redoing?Thenweneedtodowhatmostengineersforget
about:packagethissolutionintoaframeworkthatisdeadeasyforordinarydeveloperstouse.

Thisiswhatwe'llattemptinthischapter:tobuildaframeworkfordistributedapplicationsasanAPI,protocols,and
implementations.It'snotasmallchallengebutI'veclaimedoftenthatZeroMQmakessuchproblemssimple,solet'sseeifthat's
stilltrue.

We'llcover:

Requirementsfordistributedcomputing
TheprosandconsofWiFiforproximitynetworking
DiscoveryusingUDPandTCP
AmessagebasedAPI
Creatinganewopensourceproject
Peertopeerconnectivity(theHarmonypattern)
Trackingpeerpresenceanddisappearance
Groupmessagingwithoutcentralcoordination
Largescaletestingandsimulation
Dealingwithhighwatermarksandblockedpeers
Distributedloggingandmonitoring

DesignforTheRealWorld topprevnext

Whetherwe'reconnectingaroomfulofmobiledevicesoverWiFioraclusterofvirtualboxesoversimulatedEthernet,wewillhit
thesamekindsofproblems.Theseare:

http://zguide.zeromq.org/page:all 189/225
12/31/2015 MQ - The Guide - MQ - The Guide
Discovery:howdowelearnaboutothernodesonthenetwork?Doweuseadiscoveryservice,centralizedmediation,or
somekindofbroadcastbeacon?

Presence:howdowetrackwhenothernodescomeandgo?Doweusesomekindofcentralregistrationservice,or
heartbeatingorbeacons?

Connectivity:howdoweactuallyconnectonenodetoanother?Doweuselocalnetworking,wideareanetworking,ordo
weuseacentralmessagebrokertodotheforwarding?

Pointtopointmessaging:howdowesendamessagefromonenodetoanother?Dowesendthistothenode'snetwork
address,ordoweusesomeindirectaddressingviaacentralizedmessagebroker?

Groupmessaging:howdowesendamessagefromonenodetoagroupofothers?Doweworkviaacentralized
messagebroker,ordoweuseapubsubmodellikeZeroMQ?

Testingandsimulation:howdowesimulatelargenumbersofnodessowecantestperformanceproperly?Dowehaveto
buytwodozenAndroidtablets,orcanweusepuresoftwaresimulation?

DistributedLogging:howdowetrackwhatthiscloudofnodesisdoingsowecandetectperformanceproblemsand
failures?Dowecreateamainloggingservice,ordowealloweverydevicetologtheworldaroundit?

Contentdistribution:howdowesendcontentfromonenodetoanother?DoweuseservercentricprotocolslikeFTPor
HTTP,ordoweusedecentralizedprotocolslikeFileMQ?

Ifwecansolvetheseproblemsreasonablywell,andthefurtherproblemsthatwillemerge(likesecurityandwideareabridging),
wegetsomethinglikeaframeworkforwhatImightcall"ReallyCoolDistributedApplications",orasmygrandkidscallit,"the
softwareourworldrunson".

Youshouldhaveguessedfrommyrhetoricalquestionsthattherearetwobroaddirectionsinwhichwecango.Oneisto
centralizeeverything.Theotheristodistributeeverything.I'mgoingtobetondecentralization.Ifyouwantcentralization,you
don'treallyneedZeroMQthereareotheroptionsyoucanuse.

Soveryroughly,here'sthestory.One,thenumberofmovingpiecesincreasesexponentiallyovertime(doublesevery24
months).Two,thesepiecesstopusingwiresbecausedraggingcableseverywheregetsreallyboring.Three,futureapplications
runacrossclustersofthesepiecesusingtheBenevolentTyrantpatternfromChapter6TheZeroMQCommunity.Four,today
it'sreallydifficult,naystillratherimpossible,tobuildsuchapplications.Five,let'smakeitcheapandeasyusingallthetechniques
andtoolswe'vebuiltup.Six,partay!

TheSecretLifeofWiFi topprevnext

Thefutureisclearlywireless,andwhilemanybigbusinesseslivebyconcentratingdataintheirclouds,thefuturedoesn'tlook
quitesocentralized.Thedevicesattheedgesofournetworksgetsmartereveryyear,notdumber.They'rehungryforworkand
informationtodigestandfromwhichtoprofit.Andtheydon'tdragcablesaround,exceptonceanightforpower.It'sallwireless
andmoreandmore,it's802.11brandedWiFiofdifferentalphabeticalflavors.

WhyMeshIsn'tHereYet topprevnext

Assuchavitalpartofourfuture,WiFihasabigproblemthat'snotoftendiscussed,butthatanyonebettingonitneedstobe
awareof.Thephonecompaniesoftheworldhavebuiltthemselvesniceprofitablemobilephonecartelsinnearlyeverycountry
withafunctioninggovernment,basedonconvincinggovernmentsthatwithoutmonopolyrightstoairwavesandideas,theworld
wouldfallapart.Technically,wecallthis"regulatorycapture"and"patents",butinfactit'sjustaformofblackmailandcorruption.
Ifyou,thestate,giveme,abusiness,therighttoovercharge,taxthemarket,andbanallrealcompetitors,I'llgiveyou5%.Not
enough?Howabout10%?OK,15%plussnacks.Ifyourefuse,wepullservice.

ButWiFisnuckpastthis,borrowingunlicensedairspaceandridingonthebackoftheopenandunpatentedandremarkably
innovativeInternetProtocolstack.Sotoday,wehavethecurioussituationwhereitcostsmeseveralEuroaminutetocallfrom
SeoultoBrusselsifIusethestatebackedinfrastructurethatwe'vesubsidizedoverdecades,butnothingatallifIcanfindan
unregulatedWiFiaccesspoint.Oh,andIcandovideo,sendfilesandphotos,anddownloadentirehomemoviesallforthesame
amazingpricepointofpreciselyzeropointzerozero(inanycurrencyyoulike).GodhelpmeifItrytosendjustonephotohome
usingtheserviceforwhichIactuallypay.ThatwouldcostmemorethanthecameraItookiton.
http://zguide.zeromq.org/page:all 190/225
12/31/2015 MQ - The Guide - MQ - The Guide
Itisthepricewepayforhavingtoleratedthe"trustus,we'retheexperts"patentsystemforsolong.Butmorethanthat,it'sa
massiveeconomicincentivetochunksofthetechnologysectorandespeciallychipsetmakerswhoownpatentsontheanti
InternetGSM,GPRS,3G,andLTEstacks,andwhotreatthetelcosasprimeclientstoactivelythrottleWiFidevelopment.And
ofcourseit'sthesefirmsthatbulkouttheIEEEcommitteesthatdefineWiFi.

Thereasonforthisrantagainstlawyerdriven"innovation"istosteeryourthinkingtowards"whatifWiFiwerereallyfree?"This
willhappenoneday,nottoofaroff,andit'sworthbettingon.We'llseeseveralthingshappen.First,muchmoreaggressiveuseof
airspaceespeciallyforneardistancecommunicationswherethereisnoriskofinterference.Second,bigcapacityimprovements
aswelearntousemoreairspaceinparallel.Third,accelerationofthestandardizationprocess.Last,broadersupportindevices
forreallyinterestingconnectivity.

Rightnow,streamingamoviefromyourphonetoyourTVisconsidered"leadingedge".Thisisridiculous.Let'sgettruly
ambitious.Howaboutastadiumofpeoplewatchingagame,sharingphotosandHDvideowitheachotherinrealtime,creating
anadhoceventthatliterallysaturatestheairspacewithadigitalfrenzy.Ishouldbeabletocollectterabytesofimageryfrom
thosearoundme,inanhour.WhydoesthishavetogothroughTwitterorFacebookandthattinyexpensivemobiledata
connection?Howaboutahomewithhundredsofdevicesalltalkingtoeachotherovermesh,sowhensomeoneringsthe
doorbell,theporchlightsstreamvideothroughtoyourphoneorTV?Howaboutacarthatcantalktoyourphoneandplayyour
dubstepplaylistwithoutyouplugginginwires.

Togetmoreserious,whyisourdigitalsocietyinthehandsofcentralpointsthataremonitored,censored,logged,usedtotrack
whowetalkto,collectevidenceagainstus,andthenshutdownwhentheauthoritiesdecidewehavetoomuchfreespeech?The
lossofprivacywe'relivingthroughisonlyaproblemwhenit'sonesided,butthentheproblemiscalamitous.Atrulywireless
worldwouldbypassallcentralcensorship.It'showtheInternetwasdesigned,andit'squitefeasible,technically(whichisthebest
kindoffeasible).

SomePhysics topprevnext

Naivedevelopersofdistributedsoftwaretreatthenetworkasinfinitelyfastandperfectlyreliable.Whilethisisapproximatelytrue
forsimpleapplicationsoverEthernet,WiFirapidlyprovesthedifferencebetweenmagicalthinkingandscience.Thatis,WiFi
breakssoeasilyanddramaticallyunderstressthatIsometimeswonderhowanyonewoulddareuseitforrealwork.Theceiling
movesupasWiFigetsbetter,butneverfastenoughtostopushittingit.

TounderstandhowWiFiperformstechnically,youneedtounderstandabasiclawofphysics:thepowerrequiredtoconnecttwo
pointsincreasesaccordingtothesquareofthedistance.Peoplewhogrowupinlargerhouseshaveexponentiallyloudervoices,
asIlearnedinDallas.ForaWiFinetwork,thismeansthatastworadiosgetfurtherapart,theyhavetoeitherusemorepoweror
lowertheirsignalrate.

There'sonlysomuchpoweryoucanpulloutofabatterybeforeuserstreatthedeviceashopelesslybroken.Thuseventhougha
WiFinetworkmayberatedatacertainspeed,therealbitratebetweentheaccesspoint(AP)andaclientdependsonhowfar
apartthetwoare.AsyoumoveyourWiFienabledphoneawayfromtheAP,thetworadiostryingtotalktoeachotherwillfirst
increasetheirpowerandthenreducetheirbitrate.

Thiseffecthassomeconsequencesofwhichweshouldbeawareifwewanttobuildrobustdistributedapplicationsthatdon't
danglewiresbehindthemlikepuppets:

IfyouhaveagroupofdevicestalkingtoanAP,whentheAPistalkingtotheslowestdevice,thewholenetworkhasto
wait.It'slikehavingtorepeatajokeatapartytothedesignateddriverwhohasnosenseofhumor,isstillfullyand
tragicallysober,andhasapoorgraspofthelanguage.

IfyouuseunicastTCPandsendamessagetomultipledevices,theAPmustsendthepacketstoeachdeviceseparately,
Yes,andyouknewthis,it'salsohowEthernetworks.Butnowunderstandthatonedistant(orlowpowered)devicemeans
everythingwaitsforthatslowestdevicetocatchup.

Ifyouusemulticastorbroadcast(whichworkthesame,inmostcases),theAPwillsendsinglepacketstothewhole
networkatonce,whichisawesome,butitwilldoitattheslowestpossiblebitrate(usually1Mbps).Youcanadjustthis
ratemanuallyinsomeAPs.ThatjustreducesthereachofyourAP.YoucanalsobuymoreexpensiveAPsthathavea
littlemoreintelligenceandwillfigureoutthehighestbitratetheycansafelyuse.YoucanalsouseenterpriseAPswith
IGMP(InternetGroupManagementProtocol)supportandZeroMQ'sPGMtransporttosendonlytosubscribedclients.I'd
not,however,betonsuchAPsbeingwidelyavailable,ever.

AsyoutrytoputmoredevicesontoanAP,performancerapidlygetsworsetothepointwhereaddingonemoredevicecanbreak
thewholenetworkforeveryone.ManyAPssolvethisbyrandomlydisconnectingclientswhentheyreachsomelimit,suchasfour
toeightdevicesforamobilehotspot,3050devicesforaconsumerAP,perhaps100devicesforanenterpriseAP.

http://zguide.zeromq.org/page:all 191/225
12/31/2015 MQ - The Guide - MQ - The Guide

What'stheCurrentStatus? topprevnext

Despiteitsuncomfortableroleasenterprisetechnologythatsomehowescapedintothewild,WiFiisalreadyusefulformorethan
gettingafreeSkypecall.It'snotideal,butitworkswellenoughtoletussolvesomeinterestingproblems.Letmegiveyouarapid
statusreport.

First,pointtopointversusaccesspointtoclient.TraditionalWiFiisallAPclient.EverypackethastogofromclientAtoAP,then
toclientB.Youcutyourbandwidthby50%,butthat'sonlyhalftheproblem.Iexplainedabouttheinversepowerlaw.IfAandB
areveryclosetogether,butbotharefarfromtheAP,they'llbothbeusingalowbitrate.ImagineyourAPisinthegarage,and
you'reinthelivingroomtryingtostreamvideofromyourphonetoyourTV.Goodluck!

Thereisanold"adhoc"modethatletsAandBtalktoeachother,butit'swaytooslowforanythingfun,andofcourse,it's
disabledonallmobilechipsets.Actually,it'sdisabledinthetopsecretdriversthatthechipsetmakerskindlyprovidetohardware
makers.ThereisanewTunneledDirectLinkSetup(TDLS)protocolthatletstwodevicescreateadirectlink,usinganAPfor
discoverybutnotfortraffic.Andthere'sa"5G"WiFistandard(it'samarketingterm,soitgoesinquotes)thatboostslinkspeeds
toagigabit.TDLSand5GtogethermakeHDmoviestreamingfromyourphonetoyourTVaplausiblereality.IassumeTDLSwill
berestrictedinvariouswayssoastoplacatethetelcos.

Lastly,wesawstandardizationofthe802.11smeshprotocolin2012,afteraremarkablyspeedytenyearsorsoofwork.Mesh
removestheaccesspointcompletely,atleastintheimaginaryfuturewhereitexistsandiswidelyused.Devicestalktoeach
otherdirectly,andmaintainlittleroutingtablesofneighborsthatletthemforwardpackets.ImaginetheAPsoftwareembedded
intoeverydevice,butsmartenough(it'snotasimpressiveasitsounds)todomultiplehops.

Noonewhoismakingmoneyfromthemobiledataextortionracketwantstosee802.11savailablebecausecitywidemeshis
suchanightmareforthebottomline,soit'shappeningasslowlyaspossible.Theonlylargeorganizationwiththepower(and,I
assumethesurfacetosurfacemissiles)togetmeshtechnologyintowideuseistheUSArmy.ButmeshwillemergeandI'dbet
on802.11sbeingwidelyavailableinconsumerelectronicsby2020orso.

Second,ifwedon'thavepointtopoint,howfarcanwetrustAPstoday?Well,ifyougotoaStarbucksintheUSandtrythe
ZeroMQ"HelloWorld"exampleusingtwolaptopsconnectedviathefreeWiFi,you'llfindtheycannotconnect.Why?Well,the
answerisinthename:"attwifi".AT&TisagoodoldincumbenttelcothathatesWiFiandpresumablyprovidestheservicecheaply
toStarbucksandotherssothatindependentscan'tgetintothemarket.ButanyaccesspointyoubuywillsupportclienttoAPto
clientaccess,andoutsidetheUSI'veneverfoundapublicAPlockeddowntheAT&Tway.

Third,performance.TheAPisclearlyabottleneckyoucannotgetbetterthanhalfofitsadvertisedspeedevenifyouputAandB
literallybesidetheAP.Worse,ifthereareotherAPsinthesameairspace,they'llshouteachotherout.Inmyhome,WiFibarely
worksatallbecausetheneighborstwohousesdownhaveanAPwhichthey'veamplified.Evenonadifferentchannel,it
interfereswithourhomeWiFi.InthecafewhereI'msittingnowthereareoveradozennetworks.Realistically,aslongaswe're
dependentonAPbasedWiFi,we'resubjecttorandominterferenceandunpredictableperformance.

Fourth,batterylife.There'snoinherentreasonthatWiFi,whenidle,ishungrierthanBluetooth,forexample.Theyusethesame
radiosandlowlevelframing.Themaindifferenceistuningandintheprotocols.Forwirelesspowersavingtoworkwell,devices
havetomostlysleepandbeaconouttootherdevicesonlyonceeverysooften.Forthistowork,theyneedtosynchronizetheir
clocks.Thishappensproperlyforthemobilephonepart,whichiswhymyoldflipphonecanrunfivedaysonacharge.When
WiFiisworking,itwillusemorepower.Currentpoweramplifiertechnologyisalsoinefficient,meaningyoudrawalotmore
energyfromyourbatterythanyoupumpintotheair(thewasteturnsintoahotphone).Poweramplifiersareimprovingaspeople
focusmoreonmobileWiFi.

Lastly,mobileaccesspoints.Ifwecan'ttrustcentralizedAPs,andifourdevicesaresmartenoughtorunfulloperatingsystems,
can'twemakethemworkasAPs?I'msogladyouaskedthatquestion.Yes,wecan,anditworksquitenicely.Especially
becauseyoucanswitchthisonandoffinsoftware,onamodernOSlikeAndroid.Again,thevillainsofthepeacearetheUS
telcos,whomostlydetestthisfeatureandkillitorcrippleitonthephonestheycontrol.Smartertelcosrealizethatit'sawayto
amplifytheir"lastmile"andbringhighervalueproductstomoreusers,butcrooksdon'tcompeteonsmarts.

Conclusions topprevnext

WiFiisnotEthernetandalthoughIbelievefutureZeroMQapplicationswillhaveaveryimportantdecentralizedwireless
presence,it'snotgoingtobeaneasyroad.MuchofthebasicreliabilityandcapacitythatyouexpectfromEthernetismissing.
WhenyourunadistributedapplicationoverWiFi,youmustallowforfrequenttimeouts,randomlatencies,arbitrary

http://zguide.zeromq.org/page:all 192/225
12/31/2015 MQ - The Guide - MQ - The Guide
disconnections,wholeinterfacesgoingdownandcomingup,andsoon.

Thetechnologicalevolutionofwirelessnetworkingisbestdescribedas"slowandjoyless".Applicationsandframeworksthattry
toexploitdecentralizedwirelessaremostlyabsentorpoor.Theonlyexistingopensourceframeworkforproximitynetworkingis
AllJoynfromQualcomm.ButwithZeroMQ,weprovedthattheinertiaanddecrepitincompetenceofexistingplayerswasno
reasonforustositstill.Whenweaccuratelyunderstandproblems,wecansolvethem.Whatweimagine,wecanmakereal.

Discovery topprevnext

DiscoveryisanessentialpartofnetworkprogrammingandafirstclassproblemforZeroMQdevelopers.Everyzmq_connect
()callprovidesanendpointstring,andthathastocomefromsomewhere.Theexampleswe'veseensofardon'tdodiscovery:
theendpointstheyconnecttoarehardcodedasstringsinthecode.Whilethisisfineforexamplecode,it'snotidealforreal
applications.Networksdon'tbehavethatnicely.Thingschange,andit'showwellwehandlechangethatdefinesourlongterm
success.

ServiceDiscovery topprevnext

Let'sstartwithdefinitions.Networkdiscoveryisfindingoutwhatotherpeersareonthenetwork.Servicediscoveryislearning
whatthosepeerscandoforus.Wikipediadefinesa"networkservice"as"aservicethatishostedonacomputernetwork",and
"service"as"asetofrelatedsoftwarefunctionalitiesthatcanbereusedfordifferentpurposes,togetherwiththepoliciesthat
shouldcontrolitsusage".It'snotveryhelpful.IsFacebookanetworkservice?

Infacttheconceptof"networkservice"haschangedovertime.Thenumberofmovingpieceskeepsdoublingevery1824
months,breakingoldconceptualmodelsandpushingforeversimpler,morescalableones.Aserviceis,forme,asystemlevel
applicationthatotherprogramscantalkto.Anetworkserviceisoneaccessibleremotely(ascomparedto,e.g.,the"grep"
command,whichisacommandlineservice).

IntheclassicBSDsocketmodel,aservicemaps1to1toanetworkport.Acomputersystemoffersanumberofserviceslike
"FTP",and"HTTP",eachwithassignedports.TheBSDAPIhasfunctionslikegetservbynametomapaservicenametoaport
number.Soaclassicservicemapstoanetworkendpoint:ifyouknowaserver'sIPaddressandthenyoucanfinditsFTP
service,ifthatisrunning.

Inmodernmessaging,however,servicesdon'tmap1to1toendpoints.Oneendpointcanleadtomanyservices,andservices
canmovearoundovertime,betweenports,orevenbetweensystems.Whereismycloudstoragetoday?Inarealisticlarge
distributedapplication,therefore,weneedsomekindofservicediscoverymechanism.

TherearemanywaystodothisandIwon'ttrytoprovideanexhaustivelist.Howeverthereareafewclassicpatterns:

Wecanforcetheold1to1mappingfromendpointtoservice,andsimplystateupfrontthatacertainTCPportnumber
representsacertainservice.Ourprotocolthenshouldletuscheckthis("Arethefirst4bytesoftherequest'HTTP'?").

Wecanbootstraponeserviceoffanotherconnectingtoawellknownendpointandservice,askingforthe"real"service,
andgettinganendpointbackinreturn.Thisgivesusaservicelookupservice.Ifthelookupserviceallowsit,servicescan
thenmovearoundaslongastheyupdatetheirlocation.

Wecanproxyoneservicethroughanother,sothatawellknownendpointandservicewillprovideotherservicesindirectly
(i.e.byforwardingmessagestothem).ThisisforinstancehowourMajordomoserviceorientedbrokerworks.

Wecanexchangelistsofknownservicesandendpoints,thatchangeovertime,usingagossipapproachoracentralized
approach(liketheClonepattern),sothateachnodeinadistributednetworkcanbuildupaneventuallyconsistentmapof
thewholenetwork.

Wecancreatefurtherabstractlayersinbetweennetworkendpointsandservices,e.g.assigningeachnodeaunique
identifier,sowegeta"networkofnodes"whereeachnodemayoffersomeservices,andmayappearonrandomnetwork
endpoints.

Wecandiscoverservicesopportunistically,e.g.byconnectingtoendpointsandthenaskingthemwhatservicestheyoffer.
"Hi,doyouofferasharedprinter?Ifso,what'sthemakerandmodel?"

There'sno"rightanswer".Therangeofoptionsishuge,andchangesovertimeasthescaleofournetworksgrows.Insome
http://zguide.zeromq.org/page:all 193/225
12/31/2015 MQ - The Guide - MQ - The Guide
networkstheknowledgeofwhatservicesrunwherecanliterallybecomepoliticalpower.ZeroMQimposesnospecificmodelbut
makesiteasytodesignandbuildtheonesthatsuitusbest.However,tobuildservicediscovery,wemuststartbysolving
networkdiscovery.

NetworkDiscovery topprevnext

HereisalistofthesolutionsIknowfornetworkdiscovery:

Usehardcodedendpointstrings,i.e.,fixedIPaddressesandagreedports.Thisworkedininternalnetworksadecadeago
whentherewereafew"bigservers"andtheyweresoimportanttheygotstaticIPaddresses.Thesedayshoweverit'sno
useexceptinexamplesorforinprocesswork(threadsarethenewBigIron).Youcanmakeithurtalittlelessbyusing
DNSbutthisisstillpainfulforanyonewho'snotalsodoingsystemadministrationasasidejob.

Getendpointstringsfromconfigurationfiles.Thisshovesnameresolutionintouserspace,whichhurtslessthanDNSbut
that'slikesayingapunchinthefacehurtslessthanakickinthegroin.Younowgetanontrivialmanagementproblem.
Whoupdatestheconfigurationfiles,andwhen?Wheredotheylive?DoweinstalladistributedmanagementtoollikeSalt
Stack?

Useamessagebroker.Youstillneedahardcodedorconfiguredendpointstringtoconnecttothebroker,butthis
approachreducesthenumberofdifferentendpointsinthenetworktoone.Thatmakesarealimpact,andbrokerbased
networksdoscalenicely.However,brokersaresinglepointsoffailure,andtheybringtheirownsetofworriesabout
managementandperformance.

Useanaddressingbroker.Inotherwordsuseacentralservicetomediateaddressinformation(likeadynamicDNSsetup)
butallownodestosendeachothermessagesdirectly.It'sagoodmodelbutstillcreatesapointoffailureand
managementcosts.

Usehelperlibraries,likeZeroConf,thatprovideDNSserviceswithoutanycentralizedinfrastructure.It'sagoodanswerfor
certainapplicationsbutyourmileagewillvary.Helperlibrariesaren'tzerocost:theymakeitmorecomplextobuildthe
software,theyhavetheirownrestrictions,andtheyaren'tnecessarilyportable.

BuildsystemleveldiscoverybysendingoutARPorICMPECHOpacketsandthenqueryingeverynodethatresponds.
YoucanquerythroughaTCPconnection,forexample,orbysendingUDPmessages.Someproductsdothis,likethe
EyeFiwirelesscard.

Douserlevelbruteforcediscoverybytryingtoconnecttoeverysingleaddressinthenetworksegment.Youcandothis
triviallyinZeroMQsinceithandlesconnectionsinthebackground.Youdon'tevenneedmultiplethreads.It'sbrutalbut
fun,andworksverywellindemosandworkshops.Howeveritdoesn'tscale,andannoysdecentthinkingengineers.

RollyourownUDPbaseddiscoveryprotocol.Lotsofpeopledothis(Icountedabout80questionsonthistopicon
StackOverflow).UDPworkswellforthisandit'stechnicallyclear.Butit'stechnicallytrickytogetright,tothepointwhere
anydeveloperdoingthisthefirstfewtimeswillgetitdramaticallywrong.

Gossipdiscoveryprotocols.Afullyinterconnectednetworkisquiteeffectiveforsmallernumbersofnodes(say,upto100
or200).Forlargenumbersofnodes,weneedsomekindofgossipprotocol.Thatis,wherethenodeswecanreasonable
discover(say,onthesamesegmentasus),tellusaboutnodesthatarefurtheraway.Gossipprotocolsgobeyondwhat
weneedthesedayswithZeroMQ,butwilllikelybemorecommoninthefuture.Oneexampleofawideareagossipmodel
ismeshnetworking.

TheUseCase topprevnext

Let'sdefineourusecasemoreexplicitly.Afterall,allthesedifferentapproacheshaveworkedandstillworktosomeextent.What
interestsmeasarchitectisthefuture,andfindingdesignsthatcancontinuetoworkformorethanafewyears.Thismeans
identifyinglongtermtrends.Ourusecaseisn'thereandnow,it'stenortwentyyearsfromtoday.

HerearethelongtermtrendsIseeindistributedapplications:

Theoverallnumberofmovingpieceskeepsincreasing.Myestimateisthatitdoublesevery24months,buthowfastit
increasesmatterslessthanthefactthatwekeepaddingmoreandmorenodestoournetworks.They'renotjustboxesbut

http://zguide.zeromq.org/page:all 194/225
12/31/2015 MQ - The Guide - MQ - The Guide
alsoprocessesandthreads.Thedriverhereiscost,whichkeepsfalling.Inadecade,theaverageteenagerwillcarry30
50devices,allthetime.

Controlshiftsawayfromthecenter.Possiblydatatoo,thoughwe'restillfarfromunderstandinghowtobuildsimple
decentralizedinformationstores.Inanycase,thestartopologyisslowlydyingandbeingreplacedbycloudsofclouds.In
thefuturethere'sgoingtobemuchmoretrafficwithinalocalenvironment(home,office,school,bar)thanbetweenremote
nodesandthecenter.Themathsherearesimple:remotecommunicationscostmore,runmoreslowlyandareless
naturalthancloserangecommunications.It'smoreaccuratebothtechnicallyandsociallytoshareaholidayvideowith
yourfriendoverlocalWiFithanviaFacebook.

Networksareincreasinglycollaborative,lesscontrolled.Thismeanspeoplebringingtheirowndevicesandexpectingthem
toworkseamlessly.TheWebshowedonewaytomakethisworkbutwe'rereachingthelimitsofwhattheWebcando,as
westarttoexceedtheaverageofonedeviceperperson.

Thecostofconnectinganewnodetoanetworkmustfallproportionally,ifthenetworkistoscale.Thismeansreducing
theamountofconfigurationanodeneeds:lesspresharedstate,lesscontext.Again,theWebsolvedthisproblembutat
thecostofcentralization.Wewantthesameplugandplayexperiencebutwithoutacentralagency.

Inaworldoftrillionsofnodes,theonesyoutalktomostaretheonesclosesttoyou.Thisishowitworksintherealworldandit's
thesanestwayofscalinglargescalearchitectures.Groupsofnodes,logicallyorphysicallyclose,connectedbybridgestoother
groupsofnodes.Alocalgroupwillbeanythingfromhalfadozennodestoafewthousandnodes.

Sowehavetwobasicusecases:

Discoveryforproximitynetworks,thatis,asetofnodesthatfindthemselvesclosetoeachother.Wecandefine"close
toeachother"asbeing"onthesamenetworksegment".It'snotgoingtobetrueinallcasesbutit'strueenoughtobea
usefulplacetostart.

Discoveryacrosswideareanetworks,thatis,bridgingofproximitynetworkstogether.Wesometimescallthis
"federation".Therearemanywaystodofederationbutit'scomplexandsomethingtocoverelsewhere.Fornow,let's
assumewedofederationusingacentralizedbrokerorservice.

Soweareleftwiththeproblemofproximitynetworking.Iwanttojustplugthingsintothenetworkandhavethemtalkingtoeach
other.Whetherthey'retabletsinaschoolorabunchofserversinacloud,thelessupfrontagreementandcoordination,the
cheaperitistoscale.Soconfigurationfilesandbrokersandanykindofcentralizedserviceareallout.

Ialsowanttoallowanynumberofapplicationsonabox,bothbecausethat'showtherealworldworks(peopledownloadapps),
andsothatIcansimulatelargenetworksonmylaptop.UpfrontsimulationistheonlywayIknowtobesureasystemwillwork
whenit'sloadedinreallife.You'dbesurprisedhowengineersjusthopethingswillwork."Oh,I'msurethatbridgewillstayup
whenweopenittotraffic".Ifyouhaven'tsimulatedandfixedthethreemostlikelyfailures,they'llstillbethereonopeningday.

Runningmultipleinstancesofaserviceonthesamemachinewithoutupfrontcoordinationmeanswehavetouseephemeral
ports,i.e.,portsassignedrandomlyforservices.EphemeralportsruleoutbruteforceTCPdiscoveryandanyDNSsolution
includingZeroConf.

Finally,discoveryhastohappeninuserspacebecausetheappswe'rebuildingwillberunningonrandomboxesthatwedonot
necessarilyownandcontrol.Forexample,otherpeople'smobiledevices.Soanydiscoverythatneedsrootpermissionsis
excluded.ThisrulesoutARPandICMPandonceagainZeroConfsincethatalsoneedsrootpermissionsfortheserviceparts.

TechnicalRequirements topprevnext

Let'srecaptherequirements:

Thesimplestpossiblesolutionthatworks.Therearesomanyedgecasesinadhocnetworksthateveryextrafeatureor
functionalitybecomesarisk.

Supportsephemeralports,sothatwecanrunrealisticsimulations.Iftheonlywaytotestistouserealdevices,itbecomes
impossiblyexpensiveandslowtoruntests.

Norootaccessneeded,itmustrun100%inuserspace.Wewanttoshipfullypackagedapplicationsontodeviceslike
mobilephonesthatwedon'townandwhererootaccessisn'tavailable.

Invisibletosystemadministrators,sowedonotneedtheirhelptorunourapplications.Whatevertechniqueweuseshould
befriendlytothenetworkandavailablebydefault.

http://zguide.zeromq.org/page:all 195/225
12/31/2015 MQ - The Guide - MQ - The Guide
Zeroconfigurationapartfrominstallingtheapplicationsthemselves.Askingtheuserstodoanyconfigurationisgiving
themanexcusetonotusetheapplications.

Fullyportabletoallmodernoperatingsystems.Wecan'tassumewe'llberunningonanyspecificOS.Wecan'tassume
anysupportfromtheoperatingsystemexceptstandarduserspacenetworking.WecanassumeZeroMQandCZMQare
available.

FriendlytoWiFinetworkswithupto100150participants.Thismeanskeepingmessagessmallandbeingawareofhow
WiFinetworksscaleandhowtheybreakunderpressure.

Protocolneutral,i.e.,ourbeaconingshouldnotimposeanyspecificdiscoveryprotocol.I'llexplainwhatthismeansalittle
later.

Easytoreimplementinanygivenlanguage.Sure,wehaveaniceCimplementation,butifittakestoolongtore
implementinanotherlanguage,thatexcludeslargechunksoftheZeroMQcommunity.So,again,simple.

Fastresponsetime.Bythis,Imeananewnodeshouldbevisibletoitspeersinaveryshorttime,asecondortwoatmost.
Networkschangeshaperapidly.It'sOKtotakelonger,even30seconds,torealizeapeerhasdisappeared.

FromthelistofpossiblesolutionsIcollected,theonlyoptionthatisn'tdisqualifiedforoneormorereasonsistobuildourown
UDPbaseddiscoverystack.It'salittledisappointingthataftersomanydecadesofresearchintonetworkdiscovery,thisiswhere
weendup.Butthehistoryofcomputingdoesseemtogofromcomplextosimple,somaybeit'snormal.

ASelfHealingP2PNetworkin30Seconds topprevnext

Imentionedbruteforcediscovery.Let'sseehowthatworks.Onenicethingaboutsoftwareistobruteforceyourwaythroughthe
learningexperience.Aslongaswe'rehappytothrowawaywork,wecanlearnrapidlysimplybytryingthingsthatmayseem
insanefromthesafetyofthearmchair.

I'llexplainabruteforcediscoveryapproachforZeroMQthatemergedfromaworkshopin2012.Itisremarkablysimpleand
stupid:connecttoeveryIPaddressintheroom.Ifyournetworksegmentis192.168.55.x,forinstance,youdothis:

connecttotcp://192.168.55.1:9000
connecttotcp://192.168.55.2:9000
connecttotcp://192.168.55.3:9000
...
connecttotcp://192.168.55.254:9000

WhichinZeroMQspeaklookslikethis:

intaddress
for(address=1address<255address++)
zsocket_connect(listener,"tcp://192.168.55.%d:9000",address)

Thestupidpartiswhereweassumethatconnectingtoourselvesisfine,whereweassumethatallpeersareonthesame
networksegment,wherewewastefilehandlesasiftheywerefree.Luckilytheseassumptionsareoftentotallyaccurate.Atleast,
oftenenoughtoletusdofunthings.

TheloopworksbecauseZeroMQconnectcallsareasynchronousandopportunistic.Theylieintheshadowslikehungrycats,
waitingpatientlytopounceonanyinnocentmousethatdaredstartupaserviceonport9000.It'ssimple,effective,andworked
firsttime.

Itgetsbetter:aspeersleaveandjointhenetwork,they'llautomaticallyreconnect.We'vedesignedaselfhealingpeertopeer
network,in30secondsandthreelinesofcode.

Itwon'tworkforrealcasesthough.Pooreroperatingsystemstendtorunoutoffilehandles,andnetworkstendtobemore
complexthanonesegment.Andifonenodesquatsacoupleofhundredfilehandles,largescalesimulations(withmanynodes
ononeboxorinoneprocess)areoutofthequestion.

http://zguide.zeromq.org/page:all 196/225
12/31/2015 MQ - The Guide - MQ - The Guide
Still,let'sseehowfarwecangowiththisapproachbeforewethrowitout.Here'satinydecentralizedchatprogramthatletsyou
talktoanyoneelseonthesamenetworksegment.Thecodehastwothreads:alistenerandabroadcaster.Thelistenercreatesa
SUBsocketanddoesthebruteforceconnectiontoallpeersinthenetwork.Thebroadcasteracceptsinputfromtheconsoleand
sendsitonaPUBsocket:

dechat:DecentralizedChatinC

Python|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|Q|
Racket|Ruby|Scala|Tcl

ThedechatprogramneedstoknowthecurrentIPaddress,theinterface,andanalias.Wecouldgettheseincodefromthe
operatingsystem,butthat'sgrunkynonportablecode.Soweprovidethisinformationonthecommandline:

dechat192.168.55.122eth0Joe

PreemptiveDiscoveryoverRawSockets topprevnext

Oneofthegreatthingsaboutshortrangewirelessistheproximity.WiFimapscloselytothephysicalspace,whichmapsclosely
tohowwenaturallyorganize.Infact,theInternetisquiteabstractandthisconfusesalotofpeoplewhokindof"getit"butinfact
don'treally.WithWiFi,wehavetechnicalconnectivitythatispotentiallysupertangible.Youseewhatyougetandyougetwhat
yousee.Tangiblemeanseasytounderstandandthatshouldmeanlovefromusersinsteadofthetypicalfrustrationandseething
hatred.

Proximityisthekey.WehaveabunchofWiFiradiosinaroom,happilybeaconingtoeachother.Forlotsofapplications,it
makessensethattheycanfindeachotherandstartchattingwithoutanyuserinput.Afterall,mostrealworlddataisn'tprivate,
it'sjusthighlylocalized.

I'minahotelroominGangnam,Seoul,witha4Gwirelesshotspot,aLinuxlaptop,andancoupleofAndroidphones.Thephones
andlaptoparetalkingtothehotspot.TheifconfigcommandsaysmyIPaddressis192.168.1.2.Letmetrysomeping
commands.DHCPserverstendtodishoutaddressesinsequence,somyphonesareprobablycloseby,numericallyspeaking:

$ping192.168.1.1
PING192.168.1.1(192.168.1.1)56(84)bytesofdata.
64bytesfrom192.168.1.1:icmp_req=1ttl=64time=376ms
64bytesfrom192.168.1.1:icmp_req=2ttl=64time=358ms
64bytesfrom192.168.1.1:icmp_req=4ttl=64time=167ms
^C
192.168.1.1pingstatistics
3packetstransmitted,2received,33%packetloss,time2001ms
rttmin/avg/max/mdev=358.077/367.522/376.967/9.445ms

Foundone!150300msecroundtriplatencythat'sasurprisinglyhighfigure,somethingtokeepinmindforlater.NowIping
myself,justtotrytodoublecheckthings:

$ping192.168.1.2
PING192.168.1.2(192.168.1.2)56(84)bytesofdata.
64bytesfrom192.168.1.2:icmp_req=1ttl=64time=0.054ms
64bytesfrom192.168.1.2:icmp_req=2ttl=64time=0.055ms
64bytesfrom192.168.1.2:icmp_req=3ttl=64time=0.061ms
^C
192.168.1.2pingstatistics
3packetstransmitted,3received,0%packetloss,time1998ms
rttmin/avg/max/mdev=0.054/0.056/0.061/0.009ms

Theresponsetimeisabitfasternow,whichiswhatwe'dexpect.Let'strythenextcoupleofaddresses:

http://zguide.zeromq.org/page:all 197/225
12/31/2015 MQ - The Guide - MQ - The Guide

$ping192.168.1.3
PING192.168.1.3(192.168.1.3)56(84)bytesofdata.
64bytesfrom192.168.1.3:icmp_req=1ttl=64time=291ms
64bytesfrom192.168.1.3:icmp_req=2ttl=64time=271ms
64bytesfrom192.168.1.3:icmp_req=3ttl=64time=132ms
^C
192.168.1.3pingstatistics
3packetstransmitted,3received,0%packetloss,time2001ms
rttmin/avg/max/mdev=132.781/231.914/291.851/70.609ms

That'sthesecondphone,withthesamekindoflatencyasthefirstone.Let'scontinueandseeifthereareanyotherdevices
connectedtothehotspot:

$ping192.168.1.4
PING192.168.1.4(192.168.1.4)56(84)bytesofdata.
^C
192.168.1.4pingstatistics
3packetstransmitted,0received,100%packetloss,time2016ms

Andthatisit.Now,pingusesrawIPsocketstosendICMP_ECHOmessages.TheusefulthingaboutICMP_ECHOisthatitgetsa
responsefromanyIPstackthathasnotdeliberatelyhadechoswitchedoff.That'sstillacommonpracticeoncorporatewebsites
whofeartheold"pingofdeath"exploitwheremalformedmessagescouldcrashthemachine.

Icallthispreemptivediscoverybecauseitdoesn'ttakeanycooperationfromthedevice.Wedon'trelyonanycooperationfrom
thephonestoseethemsittingthereaslongasthey'renotactivelyignoringus,wecanseethem.

Youmightaskwhythisisuseful.Wedon'tknowthatthepeersrespondingtoICMP_ECHOrunZeroMQ,thattheyareinterestedin
talkingtous,thattheyhaveanyserviceswecanuse,orevenwhatkindofdevicetheyare.However,knowingthatthere's
somethingonaddress192.168.1.3isalreadyuseful.Wealsoknowhowfarawaythedeviceis,relatively,weknowhowmany
devicesareonthenetwork,andweknowtheroughstateofthenetwork(asin,good,poor,orterrible).

Itisn'tevenhardtocreateICMP_ECHOmessagesandsendthem.Afewdozenlinesofcode,andwecoulduseZeroMQ
multithreadingtodothisinparallelforaddressesstretchingoutaboveandbelowourownIPaddress.Couldbekindoffun.

However,sadly,there'safatalflawinmyideaofusingICMP_ECHOtodiscoverdevices.ToopenarawIPsocketrequiresroot
privilegesonaPOSIXbox.Itstopsrogueprogramsgettingdatameantforothers.Wecangetthepowertoopenrawsocketson
Linuxbygivingsudoprivilegestoourcommand(pinghasthesocalledstickybitset).OnamobileOSlikeAndroid,itrequires
rootaccess,i.e.,rootingthephoneortablet.That'soutofthequestionformostpeopleandsoICMP_ECHOisoutofreachfor
mostdevices.

Expletivedeleted!Let'strysomethinginuserspace.ThenextstepmostpeopletakeisUDPmulticastorbroadcast.Let'sfollow
thattrail.

CooperativeDiscoveryUsingUDPBroadcasts topprevnext

Multicasttendstobeseenasmoremodernand"better"thanbroadcast.InIPv6,broadcastdoesn'tworkatall:youmustalways
usemulticast.Nonetheless,allIPv4localnetworkdiscoveryprotocolsendupusingUDPbroadcastanyhow.Thereasons:
broadcastandmulticastendupworkingmuchthesame,exceptbroadcastissimplerandlessrisky.Multicastisseenbynetwork
adminsaskindofdangerous,asitcanleakovernetworksegments.

Ifyou'veneverusedUDP,you'lldiscoverit'squiteaniceprotocol.Insomeways,itremindsusofZeroMQ,sendingwhole
messagestopeersusingatwodifferentpatterns:onetoone,andonetomany.ThemainproblemswithUDParethat(a)the
POSIXsocketAPIwasdesignedforuniversalflexibility,notsimplicity,(b)UDPmessagesarelimitedforpracticalpurposesto
about1,500bytesonLANsand512bytesontheInternet,and(c)whenyoustarttouseUDPforrealdata,youfindthat
messagesgetdropped,especiallyasinfrastructuretendstofavorTCPoverUDP.

HereisaminimalpingprogramthatusesUDPinsteadofICMP_ECHO:

http://zguide.zeromq.org/page:all 198/225
12/31/2015 MQ - The Guide - MQ - The Guide
udpping1:UDPdiscovery,model1inC

C++|Python|Ada|Basic|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|Q|
Racket|Ruby|Scala|Tcl

Thiscodeusesasinglesockettobroadcast1bytemessagesandreceiveanythingthatothernodesarebroadcasting.WhenI
runit,itshowsjustonenode,whichisitself:

Pingingpeers...
Foundpeer192.168.1.2:9999
Pingingpeers...
Foundpeer192.168.1.2:9999

IfIswitchoffallnetworkingandtryagain,sendingamessagefails,asI'dexpect:

Pingingpeers...
sendto:Networkisunreachable

Workingonthebasisofsolvetheproblemscurrentlyaimingatyourthroat,let'sfixthemosturgentissuesinthisfirstmodel.
Theseissuesare:

Usingthe255.255.255.255broadcastaddressisabitdubious.Ontheonehand,thisbroadcastaddressmeansprecisely
"sendtoallnodesonthelocalnetwork,anddon'tforward".However,ifyouhaveseveralinterfaces(wiredEthernet,WiFi)
thenbroadcastswillgooutonyourdefaultrouteonly,andviajustoneinterface.Whatwewanttodoiseithersendour
broadcastoneachinterface'sbroadcastaddress,orfindtheWiFiinterfaceanditsbroadcastaddress.

Likemanyaspectsofsocketprogramming,gettinginformationonnetworkinterfacesisnotportable.Dowewanttowrite
nonportablecodeinourapplications?No,thisisbetterhiddeninalibrary.

There'snohandlingforerrorsexcept"abort",whichistoobrutalfortransientproblemslike"yourWiFiisswitchedoff".The
codeshoulddistinguishbetweensofterrors(ignoreandretry)andharderrors(assert).

ThecodeneedstoknowitsownIPaddressandignorebeaconsthatitsentout.Likefindingthebroadcastaddress,this
requiresinspectingtheavailableinterfaces.

ThesimplestanswertotheseissuesistopushtheUDPcodeintoaseparatelibrarythatprovidesacleanAPI,likethis:

//Constructor
staticudp_t*
udp_new(intport_nbr)

//Destructor
staticvoid
udp_destroy(udp_t**self_p)

//ReturnsUDPsockethandle
staticint
udp_handle(udp_t*self)

//SendmessageusingUDPbroadcast
staticvoid
udp_send(udp_t*self,byte*buffer,size_tlength)

//ReceivemessagefromUDPbroadcast
staticssize_t
udp_recv(udp_t*self,byte*buffer,size_tlength)

HereistherefactoredUDPpingprogramthatcallsthislibrary,whichismuchcleanerandnicer:

udpping2:UDPdiscovery,model2inC

http://zguide.zeromq.org/page:all 199/225
12/31/2015 MQ - The Guide - MQ - The Guide
Python|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|Q|
Racket|Ruby|Scala|Tcl

Thelibrary,udplib,hidesalotoftheunpleasantcode(whichwillbecomeuglieraswemakethisworkonmoresystems).I'm
notgoingtoprintthatcodehere.Youcanreaditintherepository.

Now,therearemoreproblemssizingusupandwonderingiftheycanmakelunchoutofus.First,IPv4versusIPv6andmulticast
versusbroadcast.InIPv6,broadcastdoesn'texistatalloneusesmulticast.FrommyexperiencewithWiFi,IPv4multicastand
broadcastworkidenticallyexceptthatmulticastbreaksinsomesituationswherebroadcastworksfine.Someaccesspointsdo
notforwardmulticastpackets.Whenyouhaveadevice(e.g.,atablet)thatactsasamobileAP,thenit'spossibleitwon'tget
multicastpackets.Meaning,itwon'tseeotherpeersonthenetwork.

ThesimplestplausiblesolutionissimplytoignoreIPv6fornow,andusebroadcast.Aperhapssmartersolutionwouldbetouse
multicastanddealwithasymmetricbeaconsiftheyhappen.

We'llstickwithstupidandsimplefornow.There'salwaystimetomakeitmorecomplex.

MultipleNodesonOneDevice topprevnext

SowecandiscovernodesontheWiFinetwork,aslongasthey'resendingoutbeaconsasweexpect.SoItrytotestwithtwo
processes.ButwhenIrunudpping2twice,thesecondinstancecomplains"'Addressalreadyinuse'onbind"andexits.Oh,right.
UDPandTCPbothreturnanerrorifyoutrytobindtwodifferentsocketstothesameport.Thisisright.Thesemanticsoftwo
readersononesocketwouldbeweirdtosaytheleast.Odd/evenbytes?Yougetallthe1s,Igetallthe0's?

However,aquickcheckofstackoverflow.comandsomememoryofasocketoptioncalledSO_REUSEADDRturnsupgold.IfIuse
that,IcanbindseveralprocessestothesameUDPport,andtheywillallreceiveanymessagearrivingonthatport.It'salmostas
iftheguyswhodesignedthiswerereadingmymind!(That'swaymoreplausiblethanthechancethatImaybereinventingthe
wheel.)

AquicktestshowsthatSO_REUSEADDRworksaspromised.ThisisgreatbecausethenextthingIwanttodoisdesignanAPI
andthenstartdozensofnodestoseethemdiscoveringeachother.Itwouldbereallycumbersometohavetotesteachnodeon
aseparatedevice.Andwhenwegettotestinghowrealtrafficbehavesonalarge,flakynetwork,thetwoalternativesare
simulationortemporaryinsanity.

AndIspeakfromexperience:wewere,thissummer,testingondozensofdevicesatonce.Ittakesaboutanhourtosetupafull
testrun,andyouneedaspaceshieldedfromWiFiinterferenceifyouwantanykindofreproducibility(unlessyourtestcaseis
"provethatinterferencekillsWiFinetworksfasterthanOrvalcankillathirst".

IfIwereawhizAndroiddeveloperwithafreeweekend,I'dimmediately(asin,itwouldtakemetwodays)portthiscodetomy
phoneandgetitsendingbeaconstomyPC.Butsometimeslazyismoreprofitable.IlikemyLinuxlaptop.Ilikebeingableto
startadozenthreadsfromoneprocess,andhaveeachthreadactinglikeanindependentnode.Ilikenothavingtoworkinareal
FaradaycagewhenIcansimulateoneonmylaptop.

DesigningtheAPI topprevnext

I'mgoingtorunNnodesonadevice,andtheyaregoingtohavetodiscovereachother,aswellasabunchofothernodesout
thereonthelocalnetwork.IcanuseUDPforlocaldiscoveryaswellasremotediscovery.It'sarguablynotasefficientasusing,
e.g.,theZeroMQinproc://transport,butithasthegreatadvantagethattheexactsamecodewillworkinsimulationandinreal
deployment.

IfIhavemultiplenodesononedevice,weclearlycan'tusetheIPaddressandportnumberasnodeaddress.Ineedsomelogical
nodeidentifier.Arguably,thenodeidentifieronlyhastobeuniquewithinthecontextofthedevice.Mymindfillswithcomplex
stuffIcouldmake,likesupernodesthatsitonrealUDPportsandforwardmessagestointernalnodes.Ihitmyheadonthetable
untiltheideaofinventingnewconceptsleavesit.

ExperiencetellsusthatWiFidoesthingslikedisappearandreappearwhileapplicationsarerunning.Usersclickonthings,which
doesinterestingthingslikechangetheIPaddresshalfwaythroughasession.WecannotdependonIPaddresses,noron
establishedconnections(intheTCPfashion).Weneedsomelonglastingaddressingmechanismthatsurvivesinterfacesand
connectionsbeingtorndownandthenrecreated.

http://zguide.zeromq.org/page:all 200/225
12/31/2015 MQ - The Guide - MQ - The Guide
Here'sthesimplestsolutionIcansee:wegiveeverynodeaUUID,andspecifythatnodes,representedbytheirUUIDs,can
appearorreappearatcertainIPaddress:portendpoints,andthendisappearagain.We'lldealwithrecoveryfromlostmessages
later.AUUIDis16bytes.SoifIhave100nodesonaWiFinetwork,that's(doubleitforotherrandomstuff)3,200bytesasecond
ofbeacondatathattheairhastocarryjustfordiscoveryandpresence.Seemsacceptable.

Backtoconcepts.WedoneedsomenamesforourAPI.Attheleastweneedawaytodistinguishbetweenthenodeobjectthat
is"us",andnodeobjectsthatareourpeers.We'llbedoingthingslikecreatingan"us"andthenaskingithowmanypeersit
knowsaboutandwhotheyare.Theterm"peer"isclearenough.

Fromthedeveloperpointofview,anode(theapplication)needsawaytotalktotheoutsideworld.Let'sborrowatermfrom
networkingandcallthisan"interface".Theinterfacerepresentsustotherestoftheworldandpresentstherestoftheworldto
us,asasetofotherpeers.Itautomaticallydoeswhateverdiscoveryitmust.Whenwewanttotalktoapeer,wegettheinterface
todothatforus.Andwhenapeertalkstous,it'stheinterfacethatdeliversusthemessage.

ThisseemslikeacleanAPIdesign.Howabouttheinternals?

TheinterfacemustbemultithreadedsothatonethreadcandoI/Ointhebackground,whiletheforegroundAPItalkstothe
application.WeusedthisdesignintheCloneandFreelanceclientAPIs.

TheinterfacebackgroundthreaddoesthediscoverybusinessbindtotheUDPport,sendoutUDPbeacons,andreceive
beacons.

WeneedtoatleastsendUUIDsinthebeaconmessagesothatwecandistinguishourownbeaconsfromthoseofour
peers.

Weneedtotrackpeersthatappear,andthatdisappear.Forthis,I'lluseahashtablethatstoresallknownpeersand
expirepeersaftersometimeout.

Weneedawaytoreportpeersandeventstothecaller.Herewegetintoajuicyquestion.HowdoesabackgroundI/O
threadtellaforegroundAPIthreadthatstuffishappening?Callbacksmaybe?Heckno.We'lluseZeroMQmessages,of
course.

ThethirditerationoftheUDPpingprogramisevensimplerandmorebeautifulthanthesecond.Themainbody,inC,isjustten
linesofcode.

udpping3:UDPdiscovery,model3inC

Python|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|Q|
Racket|Ruby|Scala|Tcl

Theinterfacecodeshouldbefamiliarifyou'vestudiedhowwemakemultithreadedAPIclasses:

interface:UDPpinginterfaceinC

Python|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|Q|
Racket|Ruby|Scala|Tcl

WhenIrunthisintwowindows,itreportsonepeerjoiningthenetwork.Ikillthatpeerandafewsecondslater,ittellsmethepeer
left:


[006]JOINED
[032]418E98D4B7184844B7D5E0EE5691084C

[004]LEFT
[032]418E98D4B7184844B7D5E0EE5691084C

What'sniceaboutaZeroMQmessagebasedAPIisthatIcanwrapthisanywayIlike.Forinstance,IcanturnitintocallbacksifI
reallywantthose.IcanalsotraceallactivityontheAPIveryeasily.

Somenotesabouttuning.OnEthernet,fiveseconds(theexpirytimeIusedinthiscode)seemslikealot.Onabadlystressed
WiFinetwork,youcangetpinglatenciesof30secondsormore.Ifyouuseatooaggressivevaluefortheexpiry,you'lldisconnect
nodesthatarestillthere.Ontheotherside,enduserapplicationsexpectacertainliveliness.Ifittakes30secondstoreportthat
anodehasgone,userswillgetannoyed.

Adecentstrategyistodetectandreportdisappearednodesrapidly,butonlydeletethemafteralongerinterval.Visually,anode

http://zguide.zeromq.org/page:all 201/225
12/31/2015 MQ - The Guide - MQ - The Guide
wouldbegreenwhenit'salive,thengrayforawhileasitwentoutofreach,thenfinallydisappear.We'renotdoingthisnow,but
willdoitintherealimplementationoftheasyetunnamedframeworkwe'remaking.

Aswewillalsoseelater,wehavetotreatanyinputfromanode,notjustUDPbeacons,asasignoflife.UDPmaygetsquashed
whenthere'salotofTCPtraffic.Thisisperhapsthemainreasonwe'renotusinganexistingUDPdiscoverylibrary:it'snecessary
tointegratethistightlywithourZeroMQmessagingforittowork.

MoreAboutUDP topprevnext

SowehavediscoveryandpresenceworkingoverUDPIPv4broadcasts.It'snotideal,butitworksforthelocalnetworkswehave
today.Howeverwecan'tuseUDPforrealwork,notwithoutadditionalworktomakeitreliable.There'sajokeaboutUDPbut
sometimesyou'llgetit,andsometimesyouwon't.

We'llsticktoTCPforallonetoonemessaging.ThereisonemoreusecaseforUDPafterdiscovery,whichismulticastfile
distribution.I'llexplainwhyandhow,thenshelvethatforanotherday.Thewhyissimple:whatwecall"socialnetworks"isjust
augmentedculture.Wecreateculturebysharing,andthismeansmoreandmoresharingworksthatwemakeorremix.Photos,
documents,contracts,tweets.Thecloudsofdeviceswe'reaimingtowardsdomoreofthis,notless.

Now,therearetwoprincipalpatternsforsharingcontent.Oneisthepubsubpatternwhereonenodesendsoutcontenttoaset
ofothernodessimultaneously.Secondisthe"latejoiner"pattern,whereanodearrivessomewhatlaterandwantstocatchupto
theconversation.WecandealwiththelatejoinerusingTCPunicast.ButdoingTCPunicasttoagroupofclientsatthesame
timehassomedisadvantages.First,itcanbeslowerthanmulticast.Second,it'sunfairbecausesomewillgetthecontentbefore
others.

BeforeyoujumpofftodesignaUDPmulticastprotocol,realizethatit'snotasimplecalculation.Whenyousendamulticast
packet,theWiFiaccesspointusesalowbitratetoensurethateventhefurthestdeviceswillgetitsafely.MostnormalAPsdon't
dotheobviousoptimization,whichistomeasurethedistanceofthefurthestdeviceandusethatbitrate.Instead,theyjustusea
fixedvalue.SoifyouhaveafewdevicesclosetotheAP,multicastwillbeinsanelyslow.Butifyouhavearoomfulofdevices
whichallwanttogetthenextchapterofthetextbook,multicastcanbeinsanelyeffective.

Thecurvescrossatabout612devicesdependingonthenetwork.Intheory,youcouldmeasurethecurvesinrealtimeand
createanadaptiveprotocol.Thatwouldbecoolbutprobablytoohardforeventhesmartestofus.

IfyoudositdownandsketchoutaUDPmulticastprotocol,realizethatyouneedachannelforrecovery,togetlostpackets.
You'dprobablywanttodothisoverTCP,usingZeroMQ.Fornow,however,we'llforgetaboutmulticastUDPandassumeall
trafficgoesoverTCP.

SpinningOffaLibraryProject topprevnext

Atthisstage,however,thecodeisgrowinglargerthananexampleshouldbe,soit'stimetocreateaproperGitHubproject.It'sa
rule:buildyourprojectsinpublicview,andtellpeopleaboutthemasyougosoyourmarketingandcommunitybuildingstartson
Day1.I'llwalkthroughwhatthisinvolves.IexplainedinChapter6TheZeroMQCommunityaboutgrowingcommunitiesaround
projects.Weneedafewthings:

Aname
Aslogan
Apublicgithubrepository
AREADMEthatlinkstotheC4process
Licensefiles
Anissuetracker
Twomaintainers
Afirstbootstrapversion

Thenameandsloganfirst.Thetrademarksofthe21stcenturyaredomainnames.SothefirstthingIdowhenspinningoffa
projectistolookforadomainnamethatmightwork.Quiterandomly,oneofouroldmessagingprojectswascalled"Zyre"andI
havethedomainnameforit.Thefullnameisabackronym:theZeroMQRealtimeExchangeframework.

I'msomewhatshyaboutpushingnewprojectsintotheZeroMQcommunitytooaggressively,andnormallywouldstartaprojectin
eithermypersonalaccountortheiMatixorganization.Butwe'velearnedthatmovingprojectsaftertheybecomepopularis

http://zguide.zeromq.org/page:all 202/225
12/31/2015 MQ - The Guide - MQ - The Guide
counterproductive.Mypredictionsofafuturefilledwithmovingpiecesareeithervalidorwrong.Ifthischapterisvalid,wemight
aswelllaunchthisasaZeroMQprojectfromthestart.Ifit'swrong,wecandeletetherepositorylaterorletitsinktothebottomof
alonglistofforgottenstarts.

Startwiththebasics.Theprotocol(UDPandZeroMQ/TCP)willbeZRE(ZeroMQRealtimeExchangeprotocol)andtheproject
willbeZyre.Ineedasecondmaintainer,soIinvitemyfriendDongMin(theKoreanhackerbehindJeroMQ,apureJavaZeroMQ
stack)tojoin.He'sbeenworkingonverysimilarideassoisenthusiastic.WediscussthisandwegettheideaofbuildingZyreon
topofJeroMQ,aswellasontopofCZMQandlibzmq.ThiswouldmakeitaloteasiertorunZyreonAndroid.Itwouldalsogive
ustwofullyseparateimplementationsfromthestart,whichisalwaysagoodthingforaprotocol.

SowetaketheFileMQprojectIbuiltinChapter7AdvancedArchitectureusingZeroMQasatemplateforanewGitHubproject.
TheGNUautoconftoolsarequitedecent,buthaveapainfulsyntax.It'seasiesttocopyexistingprojectfilesandmodifythem.
TheFileMQprojectbuildsalibrary,hastesttools,licensefiles,manpages,andsoon.It'snottoolargesoit'sagoodstarting
point.

IputtogetheraREADMEtosummarizethegoalsoftheprojectandpointtoC4.Theissuetrackerisenabledbydefaultonnew
GitHubprojects,sooncewe'vepushedtheUDPpingcodeasafirstversion,we'rereadytogo.However,it'salwaysgoodto
recruitmoremaintainers,soIcreateanissue"Callformaintainers"thatsays:

Ifyou'dliketohelpclickthatlovelygreen"MergePullRequest"buttonandgeteternalgoodkarma,addacomment
confirmingthatyou'vereadandunderstandtheC4processathttp://rfc.zeromq.org/spec:22.

Finally,Ichangetheissuetrackerlabels.Bydefault,GitHubofferstheusualvarietyofissuetypes,butwithC4wedon'tuse
them.Instead,weneedjusttwolabels("Urgent",inred,and"Ready",inblack).

PointtoPointMessaging topprevnext

I'mgoingtotakethelastUDPpingprogramandbuildapointtopointmessaginglayerontopofthat.Ourgoalisthatwecan
detectpeersastheyjoinandleavethenetwork,thatwecansendmessagestothem,andthatwecangetreplies.Itisanontrivial
problemtosolveandtakesMinandmetwodaystogeta"HelloWorld"versionworking.

Wehadtosolveanumberofissues:

WhatinformationtosendintheUDPbeacon,andhowtoformatit.
WhatZeroMQsockettypestousetointerconnectnodes.
WhatZeroMQmessagestosend,andhowtoformatthem.
Howtosendamessagetoaspecificnode.
Howtoknowthesenderofanymessagesowecouldsendareply.
HowtorecoverfromlostUDPbeacons.
Howtoavoidoverloadingthenetworkwithbeacons.

I'llexplaintheseinenoughdetailsothatyouunderstandwhywemadeeachchoicewedid,withsomecodefragmentsto
illustrate.Wetaggedthiscodeasversion0.1.0soyoucanlookatthecode:mostofthehardworkisdonein
zre_interface.c.

UDPBeaconFraming topprevnext

SendingUUIDsacrossthenetworkisthebareminimumforalogicaladdressingscheme.However,wehaveafewmoreaspects
togetworkingbeforethiswillworkinrealuse:

Weneedsomeprotocolidentificationsothatwecancheckforandrejectinvalidpackets.
Weneedsomeversioninformationsothatwecanchangethisprotocolovertime.
WeneedtotellothernodeshowtoreachusviaTCP,i.e.,aZeroMQporttheycantalktouson.

Let'sstartwiththebeaconmessageformat.Weprobablywantafixedprotocolheaderthatwillneverchangeinfutureversions
andabodythatdependsontheversion.

Figure67ZREdiscoverymessage
http://zguide.zeromq.org/page:all 203/225
12/31/2015 MQ - The Guide - MQ - The Guide

Theversioncanbea1bytecounterstartingat1.TheUUIDis16bytesandtheportisa2byteportnumberbecauseUDPnicely
tellsusthesender'sIPaddressforeverymessagewereceive.Thisgivesusa22byteframe.

TheClanguage(andafewotherslikeErlang)makeitsimpletoreadandwritebinarystructures.Wedefinethebeaconframe
structure:

#defineBEACON_PROTOCOL"ZRE"
#defineBEACON_VERSION0x01

typedefstruct{
byteprotocol[3]
byteversion
uuid_tuuid
uint16_tport
}beacon_t

Thismakessendingandreceivingbeaconsquitesimple.Hereishowwesendabeacon,usingthezre_udpclasstodothe
nonportablenetworkcalls:

//Beaconobject
beacon_tbeacon

//Formatbeaconfields
beacon.protocol[0]='Z'
beacon.protocol[1]='R'
beacon.protocol[2]='E'
beacon.version=BEACON_VERSION
memcpy(beacon.uuid,self>uuid,sizeof(uuid_t))
beacon.port=htons(self>port)

//Broadcastthebeacontoanyonewhoislistening
zre_udp_send(self>udp,(byte*)&beacon,sizeof(beacon_t))

Whenwereceiveabeacon,weneedtoguardagainstbogusdata.We'renotgoingtobeparanoidagainst,forexample,denialof
serviceattacks.Wejustwanttomakesurethatwe'renotgoingtocrashwhenabadZREimplementationsendsuserroneous
frames.

Tovalidateaframe,wecheckitssizeandheader.IfthoseareOK,weassumethebodyisusable.WhenwegetaUUIDthatisn't
ourselves(recall,we'llgetourownUDPbroadcastsback),wecantreatthisasapeer:

//Getbeaconframefromnetwork
beacon_tbeacon
ssize_tsize=zre_udp_recv(self>udp,
(byte*)&beacon,sizeof(beacon_t))

//Basicvalidationontheframe
if(size!=sizeof(beacon_t)
||beacon.protocol[0]!='Z'
||beacon.protocol[1]!='R'
||beacon.protocol[2]!='E'
||beacon.version!=BEACON_VERSION)

http://zguide.zeromq.org/page:all 204/225
12/31/2015 MQ - The Guide - MQ - The Guide
return0//Ignoreinvalidbeacons

//IfwegotaUUIDandit'snotourownbeacon,wehaveapeer
if(memcmp(beacon.uuid,self>uuid,sizeof(uuid_t))){
char*identity=s_uuid_str(beacon.uuid)
s_require_peer(self,identity,
zre_udp_from(self>udp),ntohs(beacon.port))
free(identity)
}

TruePeerConnectivity(HarmonyPattern) topprevnext

BecauseZeroMQisdesignedtomakedistributedmessagingeasy,peopleoftenaskhowtointerconnectasetoftruepeers(as
comparedtoobviousclientsandservers).ItisathornyquestionandZeroMQdoesn'treallyprovideasingleclearanswer.

TCP,whichisthemostcommonlyusedtransportinZeroMQ,isnotsymmetriconesidemustbindandonemustconnectand
thoughZeroMQtriestobeneutralaboutthis,it'snot.Whenyouconnect,youcreateanoutgoingmessagepipe.Whenyoubind,
youdonot.Whenthereisnopipe,youcannotwritemessages(ZeroMQwillreturnEAGAIN).

DeveloperswhostudyZeroMQandthentrytocreateNtoNconnectionsbetweensetsofequalpeersoftentryaROUTERto
ROUTERflow.It'sobviouswhy:eachpeerneedstoaddressasetofpeers,whichrequiresROUTER.Itusuallyendswitha
plaintiveemailtothelist.

ExperienceteachesusthatROUTERtoROUTERisparticularlydifficulttousesuccessfully.Ataminimum,onepeermustbind
andonemustconnect,meaningthearchitectureisnotsymmetrical.Butalsobecauseyousimplycan'ttellwhenyouareallowed
tosafelysendamessagetoapeer.It'saCatch22:youcantalktoapeerafterit'stalkedtoyou,butthepeercan'ttalktoyou
untilyou'vetalkedtoit.Onesideortheotherwillbelosingmessagesandthushastoretry,whichmeansthepeerscannotbe
equal.

I'mgoingtoexplaintheHarmonypattern,whichsolvesthisproblem,andwhichweuseinZyre.

Wewantaguaranteethatwhenapeer"appears"onournetwork,wecantalktoitsafelywithoutZeroMQdroppingmessages.
Forthis,wehavetouseaDEALERorPUSHsocketthatconnectsouttothepeersothatevenifthatconnectiontakessome
nonzerotime,thereisimmediatelyapipeandZeroMQwillacceptoutgoingmessages.

ADEALERsocketcannotaddressmultiplepeersindividually.ButifwehaveoneDEALERperpeer,andweconnectthat
DEALERtothepeer,wecansafelysendmessagestoapeerassoonaswe'veconnectedtoit.

Now,thenextproblemistoknowwhosentusaparticularmessage.WeneedareplyaddressthatistheUUIDofthenodewho
sentanygivenmessage.DEALERcan'tdothisunlessweprefixeverysinglemessagewiththat16byteUUID,whichwouldbe
wasteful.ROUTERdoesdoitifwesettheidentityproperlybeforeconnectingtotherouter.

AndsotheHarmonypatterncomesdowntothesecomponents:

OneROUTERsocketthatwebindtoaephemeralport,whichwebroadcastinourbeacons.
OneDEALERsocketperpeerthatweconnecttothepeer'sROUTERsocket.
ReadingfromourROUTERsocket.
Writingtothepeer'sDEALERsocket.

Thenextproblemisthatdiscoveryisn'tneatlysynchronized.Wecangetthefirstbeaconfromapeerafterwestarttoreceive
messagesfromit.AmessagecomesinontheROUTERsocketandhasaniceUUIDattachedtoit,butnophysicalIPaddress
andport.WehavetoforcediscoveryoverTCP.Todothis,ourfirstcommandtoanynewpeertowhichweconnectisanOHAI
commandwithourIPaddressandport.Thisensurethatthereceiverconnectsbacktousbeforetryingtosendusanycommand.

Hereitis,brokendownintosteps:

IfwereceiveaUDPbeaconfromanewpeer,weconnecttothepeerthroughaDEALERsocket.
WereadmessagesfromourROUTERsocket,andeachmessagecomeswiththeUUIDofthesender.
Ifit'sanOHAImessage,weconnectbacktothatpeerifnotalreadyconnectedtoit.
Ifit'sanyothermessage,wemustalreadybeconnectedtothepeer(agoodplaceforanassertion).
WesendmessagestoeachpeerusingtheperpeerDEALERsocket,whichmustbeconnected.
Whenweconnecttoapeer,wealsotellourapplicationthatthepeerexists.

http://zguide.zeromq.org/page:all 205/225
12/31/2015 MQ - The Guide - MQ - The Guide
Everytimewegetamessagefromapeer,wetreatthatasaheartbeat(it'salive).

IfwewerenotusingUDPbutsomeotherdiscoverymechanism,I'dstillusetheHarmonypatternforatruepeernetwork:one
ROUTERforinputfromallpeers,andoneDEALERperpeerforoutput.BindtheROUTER,connecttheDEALER,andstarteach
conversationwithanOHAIequivalentthatprovidesthereturnIPaddressandport.Youwouldneedsomeexternalmechanismto
bootstrapeachconnection.

DetectingDisappearances topprevnext

Heartbeatingsoundssimplebutit'snot.UDPpacketsgetdroppedwhenthere'salotofTCPtraffic,soifwedependonUDP
beacons,we'llgetfalsedisconnections.TCPtrafficcanbedelayedfor5,10,even30secondsifthenetworkisreallybusy.Soif
wekillpeerswhentheygoquiet,we'llhavefalsedisconnections.

BecauseUDPbeaconsaren'treliable,it'stemptingtoaddinTCPbeacons.Afterall,TCPwilldeliverthemreliably.However,
there'sonelittleproblem.Imaginethatyouhave100nodesonanetwork,andeachnodesendsaTCPbeacononceasecond.
Eachbeaconis22bytes,notcountingTCP'sframingoverhead.Thatis100*99*22bytespersecond,or217,000bytes/second
justforheartbeating.That'sabout12%ofatypicalWiFinetwork'sidealcapacity,whichsoundsOK.Butwhenanetworkis
stressedorfightingothernetworksforairspace,thatextra200Kasecondwillbreakwhat'sleft.UDPbroadcastsareatleastlow
cost.

SowhatwedoisswitchtoTCPheartbeatsonlywhenaspecificpeerhasn'tsentusanyUDPbeaconsinawhile.Andthenwe
sendTCPheartbeatsonlytothatonepeer.Ifthepeercontinuestobesilent,weconcludeit'sgoneaway.Ifthepeercomesback
withadifferentIPaddressand/orport,wehavetodisconnectourDEALERsocketandreconnecttothenewport.

Thisgivesusasetofstatesforeachpeer,thoughatthisstagethecodedoesn'tuseaformalstatemachine:

PeervisiblethankstoUDPbeacon(weconnectusingIPaddressandportfrombeacon)
PeervisiblethankstoOHAIcommand(weconnectusingIPaddressandportfromcommand)
Peerseemsalive(wegotaUDPbeaconorcommandoverTCPrecently)
Peerseemsquiet(noactivityinsometime,sowesendaHUGZcommand)
Peerhasdisappeared(noreplytoourHUGZcommands,sowedestroypeer)

There'soneremainingscenariowedidn'taddressinthecodeatthisstage.It'spossibleforapeertochangeIPaddressesand
portswithoutactuallytriggeringadisappearanceevent.Forexample,iftheuserswitchesoffWiFiandthenswitchesitbackon,
theaccesspointcanassignthepeeranewIPaddress.We'llneedtohandleadisappearedWiFiinterfaceonournodeby
unbindingtheROUTERsocketandrebindingitwhenwecan.Becausethisisnotcentraltothedesignnow,Idecidetologan
issueontheGitHubtrackerandleaveitforarainyday.

GroupMessaging topprevnext

Groupmessagingisacommonandveryusefulpattern.Theconceptissimple:insteadoftalkingtoasinglenode,youtalktoa
"group"ofnodes.Thegroupisjustaname,astringthatyouagreeonintheapplication.It'spreciselylikeusingthepubsub
prefixesinPUBandSUBsockets.Infact,theonlyreasonIsay"groupmessaging"andnot"pubsub"istopreventconfusion,
becausewe'renotgoingtousePUBSUBsocketsforthis.

PUBSUBsocketswouldalmostwork.Butwe'vejustdonesuchalotofworktosolvethelatejoinerproblem.Applicationsare
inevitablygoingtowaitforpeerstoarrivebeforesendingmessagestogroups,sowehavetobuildontheHarmonypatternrather
thanstartagainbesideit.

Let'slookattheoperationswewanttodoongroups:

Wewanttojoinandleavegroups.
Wewanttoknowwhatothernodesareinanygivengroup.
Wewanttosendamessageto(allnodesin)agroup.

Theselookfamiliartoanyonewho'susedInternetRelayChat,exceptthatwehavenoserver.Everynodewillneedtokeeptrack
ofwhateachgrouprepresents.Thisinformationwillnotalwaysbefullyconsistentacrossthenetwork,butitwillbecloseenough.

Ourinterfacewilltrackasetofgroups(eachanobject).Thesearealltheknowngroupswithoneormoremembernode,

http://zguide.zeromq.org/page:all 206/225
12/31/2015 MQ - The Guide - MQ - The Guide
excludingourselves.We'lltracknodesastheyleaveandjoingroups.Becausenodescanjointhenetworkatanytime,wehave
totellnewpeerswhatgroupswe'rein.Whenapeerdisappears,we'llremoveitfromallgroupsweknowabout.

Thisgivesussomenewprotocolcommands:

JOINwesendthistoallpeerswhenwejoinagroup.
LEAVEwesendthistoallpeerswhenweleaveagroup.

Plus,weaddagroupsfieldtothefirstcommandwesend(renamedfromOHAItoHELLOatthispointbecauseIneedalarger
lexiconofcommandverbs).

Lastly,let'saddawayforpeerstodoublechecktheaccuracyoftheirgroupdata.Theriskisthatwemissoneoftheabove
messages.ThoughweareusingHarmonytoavoidthetypicalmessagelossatstartup,it'sworthbeingparanoid.Fornow,allwe
needisawaytodetectsuchafailure.We'lldealwithrecoverylater,iftheproblemactuallyhappens.

I'llusetheUDPbeaconforthis.Whatwewantisarollingcounterthatsimplytellshowmanyjoinandleaveoperations
("transitions")therehavebeenforanode.Itstartsat0andincrementsforeachgroupwejoinorleave.Wecanuseaminimal1
bytevaluebecausethatwillcatchallfailuresexcepttheastronomicallyrare"welostprecisely256messagesinarow"failure
(thisistheonethathitsduringthefirstdemo).WewillalsoputthetransitionscounterintotheJOIN,LEAVE,andHELLO
commands.Andtotrytoprovoketheproblem,we'lltestbyjoining/leavingseveralhundredgroupswithahighwatermarksetto
10orso.

It'stimetochooseverbsforthegroupmessaging.Weneedacommandthatmeans"talktoonepeer"andonethatmeans"talk
tomanypeers".Aftersomeattempts,mybestchoicesareWHISPERandSHOUT,andthisiswhatthecodeuses.TheSHOUT
commandneedstotelltheuserthegroupname,aswellasthesenderpeer.

Becausegroupsarelikepubsub,youmightbetemptedtousethistobroadcasttheJOINandLEAVEcommandsaswell,
perhapsbycreatinga"global"groupthatallnodesjoin.Myadviceistokeepgroupspurelyasuserspaceconceptsfortwo
reasons.First,howdoyoujointheglobalgroupifyouneedtheglobalgrouptosendoutaJOINcommand?Second,itcreates
specialcases(reservednames)whicharemessy.

It'ssimplerjusttosendJOINsandLEAVEsexplicitlytoallconnectedpeers,period.

I'mnotgoingtoworkthroughtheimplementationofgroupmessagingindetailbecauseit'sfairlypedanticandnottooexciting.
Thedatastructuresforgroupandpeermanagementaren'toptimal,butthey'reworkable.Weusethefollowing:

Alistofgroupsforourinterface,whichwecansendtonewpeersinaHELLOcommand
Ahashofgroupsforotherpeers,whichweupdatewithinformationfromHELLO,JOIN,andLEAVEcommands
Ahashofpeersforeachgroup,whichweupdatewiththesamethreecommands.

Atthisstage,I'mstartingtogetprettyhappywiththebinaryserialization(ourcodecgeneratorfromChapter7Advanced
ArchitectureusingZeroMQ),whichhandleslistsanddictionariesaswellasstringsandintegers.

Thisversionistaggedintherepositoryasv0.2.0andyoucandownloadthetarballifyouwanttocheckwhatthecodelookedlike
atthisstage.

TestingandSimulation topprevnext

Whenyoubuildaproductoutofpieces,andthisincludesadistributedframeworklikeZyre,theonlywaytoknowthatitwillwork
properlyinreallifeistosimulaterealactivityoneachpiece.

OnAssertions topprevnext

Theproperuseofassertionsisoneofthehallmarksofaprofessionalprogrammer.

Ourconfirmationbiasascreatorsmakesithardtotestourworkproperly.Wetendtowriteteststoprovethecodeworks,rather
thantryingtoproveitdoesn't.Therearemanyreasonsforthis.Wepretendtoourselvesandothersthatwecanbe(couldbe)
perfect,wheninfactweconsistentlymakemistakes.Bugsincodeareseenas"bad",ratherthan"inevitable",sopsychologically
wewanttoseefewerofthem,notuncovermoreofthem."Hewritesperfectcode"isacomplimentratherthanaeuphemismfor

http://zguide.zeromq.org/page:all 207/225
12/31/2015 MQ - The Guide - MQ - The Guide
"henevertakesriskssohiscodeisasboringandheavilyusedascoldspaghetti".

Someculturesteachustoaspiretoperfectionandpunishmistakesineducationandwork,whichmakesthisattitudeworse.To
acceptthatwe'refallible,andthentolearnhowtoturnthatintoprofitratherthanshameisoneofthehardestintellectual
exercisesinanyprofession.Weleverageourfallibilitiesbyworkingwithothersandbychallengingourownworksooner,not
later.

Onetrickthatmakesiteasieristouseassertions.Assertionsarenotaformoferrorhandling.Theyareexecutabletheoriesof
fact.Thecodeasserts,"Atthispoint,suchandsuchmustbetrue"andiftheassertionfails,thecodekillsitself.

Thefasteryoucanprovecodeincorrect,thefasterandmoreaccuratelyyoucanfixit.Believingthatcodeworksandprovingthat
itbehavesasexpectedislessscience,moremagicalthinking.It'sfarbettertobeabletosay,"libzmqhasfivehundred
assertionsanddespiteallmyefforts,notoneofthemfails".

SotheZyrecodebaseisscatteredwithassertions,andparticularlyacoupleonthecodethatdealswiththestateofpeers.This
isthehardestaspecttogetright:peersneedtotrackeachotherandexchangestateaccuratelyorthingsstopworking.The
algorithmsdependonasynchronousmessagesflyingaroundandI'mprettysuretheinitialdesignhasflaws.Italwaysdoes.

AndasItesttheoriginalZyrecodebystartingandstoppinginstancesofzre_pingbyhand,everysooftenIgetanassertion
failure.Runningbyhanddoesn'treproducetheseoftenenough,solet'smakeapropertestertool.

OnUpFrontTesting topprevnext

Beingabletofullytesttherealbehaviorofindividualcomponentsinthelaboratorycanmakea10xor100xdifferencetothecost
ofyourproject.Thatconfirmationbiasengineershavetotheirownworkmakesupfronttestingincrediblyprofitable,andlate
stagetestingincrediblyexpensive.

I'lltellyouashortstoryaboutaprojectweworkedoninthelate1990's.Weprovidedthesoftwareandotherteamsprovidedthe
hardwareforafactoryautomationproject.Threeorfourteamsbroughttheirexpertsonsite,whichwasaremotefactory(funny
howthepollutingfactoriesarealwaysinremotebordercountry).

Oneoftheseteams,afirmspecializinginindustrialautomation,builtticketmachines:kiosks,andsoftwaretorunonthem.
Nothingunusual:swipeabadge,chooseanoption,receiveaticket.Theyassembledtwoofthesekiosksonsite,eachweek
bringingsomemorebitsandpieces.Ticketprinters,monitorscreens,specialkeypadsfromIsrael.Thestuffhadtoberesistant
againstdustbecausethekioskssatoutside.Nothingworked.Thescreenswereunreadableinthesun.Theticketprinters
continuallyjammedandmisprinted.Theinternalsofthekioskjustsatonwoodenshelving.Thekiosksoftwarecrashedregularly.
Itwascomedicexceptthattheprojectreally,reallyhadtoworkandsowespentweeksandthenmonthsonsitehelpingtheother
teamsdebugtheirbitsandpiecesuntilitworked.

Ayearlater,therewasasecondfactory,andthesamestory.Bythistimetheclient,wasgettingimpatient.Sowhentheycameto
thethirdandlargestfactory,ayearlater,wejumpedupandsaid,"pleaseletusmakethekiosksandthesoftwareand
everything".

Wemadeadetaileddesignforthesoftwareandhardwareandfoundsuppliersforallthepieces.Ittookusthreemonthsto
searchtheInternetforeachcomponent(inthosedays,theInternetwasalotslower),andanothertwomonthstogetthem
assembledintostainlesssteelbrickseachweighingabouttwentykilos.Thesebricksweretwofeetsquareandeightinchesdeep,
withalargeflatscreenpanelbehindunbreakableglass,andtwoconnectors:oneforpower,oneforEthernet.Youloadedupthe
paperbinwithenoughforsixmonths,thenscrewedthebrickintoahousing,anditautomaticallybooted,founditsDNSserver,
loadeditsLinuxOSandthenapplicationsoftware.Itconnectedtotherealserver,andshowedthemainmenu.Yougotaccessto
theconfigurationscreensbyswipingaspecialbadgeandthenenteringacode.

Thesoftwarewasportablesowecouldtestthataswewroteit,andaswecollectedthepiecesfromoursupplierswekeptoneof
eachsowehadadisassembledkiosktoplaywith.Whenwegotourfinishedkiosks,theyallworkedimmediately.Weshipped
themtotheclient,whopluggedthemintotheirhousing,switchedthemon,andwenttobusiness.Wespentaweekorsoonsite,
andintenyears,onekioskbroke(thescreendied,andwasreplaced).

Lessonis,testupfrontsothatwhenyouplugthethingin,youknowpreciselyhowit'sgoingtobehave.Ifyouhaven'ttestedit
upfront,you'regoingtobespendingweeksandmonthsinthefieldironingoutproblemsthatshouldneverhavebeenthere.

TheZyreTester topprevnext

http://zguide.zeromq.org/page:all 208/225
12/31/2015 MQ - The Guide - MQ - The Guide

Duringmanualtesting,Ididhitanassertionrarely.Itthendisappeared.BecauseIdon'tbelieveinmagic,Iknowthatmeantthe
codewasstillwrongsomewhere.So,thenextstepwasheavydutytestingoftheZyrev0.2.0codetotrytobreakitsassertions,
andgetagoodideaofhowitwillbehaveinthefield.

Wepackagedthediscoveryandmessagingfunctionalityasaninterfaceobjectthatthemainprogramcreates,workswith,and
thendestroys.Wedon'tuseanyglobalvariables.Thismakesiteasytostartlargenumbersofinterfacesandsimulatereal
activity,allwithinoneprocess.Andifthere'sonethingwe'velearnedfromwritinglotsofexamples,it'sthatZeroMQ'sabilityto
orchestratemultiplethreadsinasingleprocessismucheasiertoworkwiththanmultipleprocesses.

Thefirstversionofthetesterconsistsofamainthreadthatstartsandstopsasetofchildthreads,eachrunningoneinterface,
eachwithaROUTER,DEALER,andUDPsocket(R,D,andUinthediagram).

Figure68ZyreTesterTool

ThenicethingisthatwhenIamconnectedtoaWiFiaccesspoint,allZyretraffic(evenbetweentwointerfacesinthesame
process)goesacrosstheAP.ThismeansIcanfullystresstestanyWiFiinfrastructurewithjustacoupleofPCsrunningina
room.It'shardtoemphasizehowvaluablethisis:ifwehadbuiltZyreas,say,adedicatedserviceforAndroid,we'dliterallyneed
dozensofAndroidtabletsorphonestodoanylargescaletesting.Kiosks,andallthat.

Thefocusisnowonbreakingthecurrentcode,tryingtoproveitwrong.There'snopointatthisstageintestinghowwellitruns,
howfastitis,howmuchmemoryituses,oranythingelse.We'llworkuptotrying(andfailing)tobreakeachindividual
functionality,butfirst,wetrytobreaksomeofthecoreassertionsI'veputintothecode.

Theseare:

ThefirstcommandthatanynodereceivesfromapeerMUSTbeHELLO.Inotherwords,messagescannotbelostduring
thepeertopeerconnectionprocess.

Thestateeachnodecalculatesforitspeersmatchesthestateeachpeercalculatesforitself.Inotherwords,again,no
messagesarelostinthenetwork.

Whenmyapplicationsendsamessagetoapeer,wehaveaconnectiontothatpeer.Inotherwords,theapplicationonly
"sees"apeerafterwehaveestablishedaZeroMQconnectiontoit.

WithZeroMQ,thereareseveralcaseswherewemaylosemessages.Oneisthe"latejoiner"syndrome.Twoiswhenweclose
socketswithoutsendingeverything.ThreeiswhenweoverflowthehighwatermarkonaROUTERorPUBsocket.Fouriswhen
weuseanunknownaddresswithaROUTERsocket.

Now,IthinkHarmonygetsaroundallthesepotentialcases.Butwe'realsoaddingUDPtothemix.Sothefirstversionofthe
testersimulatesanunstableanddynamicnetwork,wherenodescomeandgorandomly.It'sherethatthingswillbreak.

http://zguide.zeromq.org/page:all 209/225
12/31/2015 MQ - The Guide - MQ - The Guide
Hereisthemainthreadofthetester,whichmanagesapoolof100threads,startingandstoppingeachonerandomly.Every
~750msecsiteitherstartsorstopsonerandomthread.Werandomizethetimingsothatthreadsaren'tallsynchronized.Aftera
fewminutes,wehaveanaverageof50threadshappilychattingtoeachotherlikeKoreanteenagersintheGangnamsubway
station:

intmain(intargc,char*argv[])
{
//Initializecontextfortalkingtotasks
zctx_t*ctx=zctx_new()
zctx_set_linger(ctx,100)

//Getnumberofinterfacestosimulate,default100
intmax_interface=100
intnbr_interfaces=0
if(argc>1)
max_interface=atoi(argv[1])

//Weaddressinterfacesasanarrayofpipes
void**pipes=zmalloc(sizeof(void*)*max_interface)

//Wewillrandomlystartandstopinterfacethreads
while(!zctx_interrupted){
uintindex=randof(max_interface)
//Toggleinterfacethread
if(pipes[index]){
zstr_send(pipes[index],"STOP")
zsocket_destroy(ctx,pipes[index])
pipes[index]=NULL
zclock_log("I:Stoppedinterface(%drunning)",
nbr_interfaces)
}
else{
pipes[index]=zthread_fork(ctx,interface_task,NULL)
zclock_log("I:Startedinterface(%drunning)",
++nbr_interfaces)
}
//Sleep~750msecsrandomlysowesmoothoutactivity
zclock_sleep(randof(500)+500)
}
zctx_destroy(&ctx)
return0
}

Notethatwemaintainapipetoeachchildthread(CZMQcreatesthepipeautomaticallywhenweusethezthread_fork
method).It'sviathispipethatwetellchildthreadstostopwhenit'stimeforthemtoleave.Thechildthreadsdothefollowing(I'm
switchingtopseudocodeforclarity):

createaninterface
whiletrue:
pollonpipetoparent,andoninterface
ifparentsentusamessage:
break
ifinterfacesentusamessage:
ifmessageisENTER:
sendaWHISPERtothenewpeer
ifmessageisEXIT:
sendaWHISPERtothedepartedpeer
ifmessageisWHISPER:
sendbackaWHISPER1/2ofthetime
ifmessageisSHOUT:

http://zguide.zeromq.org/page:all 210/225
12/31/2015 MQ - The Guide - MQ - The Guide
sendbackaWHISPER1/3ofthetime
sendbackaSHOUT1/3ofthetime
oncepersecond:
joinorleaveoneof10randomgroups
destroyinterface

TestResults topprevnext

Yes,webrokethecode.Severaltimes,infact.Thiswassatisfying.I'llworkthroughthedifferentthingswefound.

Gettingnodestoagreeonconsistentgroupstatuswasthemostdifficult.Everynodeneedstotrackthegroupmembershipofthe
wholenetwork,asIalreadyexplainedinthesection"GroupMessaging".Groupmessagingisapubsubpattern.JOINsand
LEAVEsareanalogoustosubscribeandunsubscribemessages.It'sessentialthatnoneoftheseevergetlost,orwe'llfindnodes
droppingrandomlyoffgroups.

SoeachnodecountsthetotalnumberofJOINsandLEAVEsit'severdone,andbroadcaststhisstatus(as1byterollingcounter)
initsUDPbeacon.Othernodespickupthestatus,compareittotheirowncalculations,andifthere'sadifference,thecode
asserts.

ThefirstproblemwasthatUDPbeaconsgetdelayedrandomly,sothey'reuselessforcarryingthestatus.Whenabeacons
arriveslate,thestatusisinaccurateandwegetafalsenegative.Tofixthis,wemovedthestatusinformationintotheJOINand
LEAVEcommands.WealsoaddedittotheHELLOcommand.Thelogicthenbecomes:

GetinitialstatusforapeerfromitsHELLOcommand.
WhengettingaJOINorLEAVEfromapeer,incrementthestatuscounter.
CheckthatthenewstatuscountermatchesthevalueintheJOINorLEAVEcommand
Ifitdoesn't,assert.

Nextproblemwegotwasthatmessageswerearrivingunexpectedlyonnewconnections.TheHarmonypatternconnects,then
sendsHELLOasthefirstcommand.ThismeansthereceivingpeershouldalwaysgetHELLOasthefirstcommandfromanew
peer.WewereseeingPING,JOIN,andothercommandsarriving.

ThisturnedouttobeduetoCZMQ'sephemeralportlogic.Anephemeralportisjustadynamicallyassignedportthataservice
cangetratherthanaskingforafixedportnumber.APOSIXsystemusuallyassignsephemeralportsintherange0xC000to
0xFFFF.CZMQ'slogicistolookforafreeportinthisrange,bindtothat,andreturntheportnumbertothecaller.

Thissoundsfine,untilyougetonenodestoppingandanothernodestartingclosetogether,andthenewnodegettingtheport
numberoftheoldnode.RememberthatZeroMQtriestoreestablishabrokenconnection.Sowhenthefirstnodestopped,its
peerswouldretrytoconnect.Whenthenewnodeappearsonthatsameport,suddenlyallthepeersconnecttoitandstart
chattinglikethey'reoldbuddies.

It'sageneralproblemthataffectsanylargerscaledynamicZeroMQapplication.Thereareanumberofplausibleanswers.One
istonotreuseephemeralports,whichiseasiersaidthandonewhenyouhavemultipleprocessesononesystem.Another
solutionwouldbetoselectarandomporteachtime,whichatleastreducestheriskofhittingajustfreedport.Thisbringstherisk
ofagarbageconnectiondowntoperhaps1/1000butit'sstillthere.Perhapsthebestsolutionistoacceptthatthiscanhappen,
understandthecauses,anddealwithitontheapplicationlevel.

WehaveastatefulprotocolthatalwaysstartswithaHELLOcommand.Weknowthatit'spossibleforpeerstoconnecttous,
thinkingwe'reanexistingnodethatwentawayandcameback,andsendusothercommands.Steponeiswhenwediscovera
newpeer,todestroyanyexistingpeerconnectedtothesameendpoint.It'snotafullanswerbutatleastit'spolite.Steptwoisto
ignoreanythingcominginfromanewpeeruntilthatpeersaysHELLO.

Thisdoesn'trequireanychangetotheprotocol,butitmustbespecifiedintheprotocolwhenwecometoit:duetotheway
ZeroMQconnectionswork,it'spossibletoreceiveunexpectedcommandsfromawellbehavingpeerandthereisnowayto
returnanerrorcodeorotherwisetellthatpeertoresetitsconnection.Thus,apeermustdiscardanycommandfromapeeruntil
itreceivesHELLO.

Infact,ifyoudrawthisonapieceofpaperandthinkitthrough,you'llseethatyounevergetaHELLOfromsuchaconnection.
ThepeerwillsendPINGsandJOINsandLEAVEsandtheneventuallytimeoutandclose,asitfailstogetanyheartbeatsback
fromus.

http://zguide.zeromq.org/page:all 211/225
12/31/2015 MQ - The Guide - MQ - The Guide
You'llalsoseethatthere'snoriskofconfusion,nowayforcommandsfromtwopeerstogetmixedintoasinglestreamonour
DEALERsocket.

Whenyouaresatisfiedthatthisworks,we'rereadytomoveon.Thisversionistaggedintherepositoryasv0.3.0andyoucan
downloadthetarballifyouwanttocheckwhatthecodelookedlikeatthisstage.

Notethatdoingheavysimulationoflotsofnodeswillprobablycauseyourprocesstorunoutoffilehandles,givinganassertion
failureinlibzmq.Iraisedtheperprocesslimitto30,000byrunning(onmyLinuxbox):

ulimitn30000

TracingActivity topprevnext

Todebugthekindsofproblemswesawhere,weneedextensivelogging.There'salothappeninginparallel,buteveryproblem
canbetraceddowntoaspecificexchangebetweentwonodes,consistingofasetofeventsthathappeninstrictsequence.We
knowhowtomakeverysophisticatedlogging,butasusualit'swisertomakejustwhatweneedandnomore.Wehaveto
capture:

Timeanddateforeachevent.
Inwhichnodetheeventoccurred.
Thepeernode,ifany.
Whattheeventwas(e.g.,whichcommandarrived).
Eventdata,ifany.

Theverysimplesttechniqueistoprintthenecessaryinformationtotheconsole,withatimestamp.That'stheapproachIused.
Thenit'ssimpletofindthenodesaffectedbyafailure,filterthelogfileforonlymessagesreferringtothem,andseeexactlywhat
happened.

DealingwithBlockedPeers topprevnext

InanyperformancesensitiveZeroMQarchitecture,youneedtosolvetheproblemofflowcontrol.Youcannotsimplysend
unlimitedmessagestoasocketandhopeforthebest.Attheoneextreme,youcanexhaustmemory.Thisisaclassicfailure
patternforamessagebroker:oneslowclientstopsreceivingmessagesthebrokerstartstoqueuethem,andeventually
exhaustsmemoryandthewholeprocessdies.Attheotherextreme,thesocketdropsmessages,orblocks,asyouhitthehigh
watermark.

WithZyrewewanttodistributemessagestoasetofpeers,andwewanttodothisfairly.UsingasingleROUTERsocketfor
outputwouldbeproblematicbecauseanyoneblockedpeerwouldblockoutgoingtraffictoallpeers.TCPdoeshavegood
algorithmsforspreadingthenetworkcapacityacrossasetofconnections.Andwe'reusingaseparateDEALERsockettotalkto
eachpeer,sointheoryeachDEALERsocketwillsenditsqueuedmessagesinthebackgroundreasonablyfairly.

ThenormalbehaviorofaDEALERsocketthathitsitshighwatermarkistoblock.Thisisusuallyideal,butit'saproblemforus
here.Ourcurrentinterfacedesignusesonethreadthatdistributesmessagestoallpeers.Ifoneofthosesendcallsweretoblock,
alloutputwouldblock.

Thereareafewoptionstoavoidblocking.Oneistousezmq_poll()onthewholesetofDEALERsockets,andonlywriteto
socketsthatareready.Idon'tlikethisforacoupleofreasons.First,theDEALERsocketishiddeninsidethepeerclass,anditis
cleanertoalloweachclasstohandlethisopaquely.Second,whatdowedowithmessageswecan'tyetdelivertoaDEALER
socket?Wheredowequeuethem?Third,itseemstobesidesteppingtheissue.Ifapeerisreallysobusyitcan'treadits
messages,somethingiswrong.Mostlikely,it'sdead.

Sonopollingforoutput.Thesecondoptionistouseonethreadperpeer.Iquiteliketheideaofthisbecauseitfitsintothe
ZeroMQdesignpatternof"doonethinginonethread".Butthisisgoingtocreatealotofthreads(squareofthenumberofnodes
westart)inthesimulation,andwe'realreadyrunningoutoffilehandles.

Athirdoptionistouseanonblockingsend.Thisisnicerandit'sthesolutionIchoose.Wecanthenprovideeachpeerwitha
reasonableoutgoingqueue(theHWM)andifthatgetsfull,treatitasafatalerroronthatpeer.Thiswillworkforsmaller

http://zguide.zeromq.org/page:all 212/225
12/31/2015 MQ - The Guide - MQ - The Guide
messages.Ifwe'resendinglargechunkse.g.,forcontentdistributionwe'llneedacreditbasedflowcontrolontop.

ThereforethefirststepistoprovetoourselvesthatwecanturnthenormalblockingDEALERsocketintoanonblockingsocket.
ThisexamplecreatesanormalDEALERsocket,connectsittosomeendpoint(sothatthere'sanoutgoingpipeandthesocket
willacceptmessages),setsthehighwatermarktofour,andthensetsthesendtimeouttozero:

eagain:CheckingEAGAINonDEALERsocketinC

C#|Python|Ada|Basic|C++|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|Q|
Racket|Ruby|Scala|Tcl

Whenwerunthis,wesendfourmessagessuccessfully(theygonowhere,thesocketjustqueuesthem),andthenwegetanice
EAGAINerror:

Sendingmessage0
Sendingmessage1
Sendingmessage2
Sendingmessage3
Sendingmessage4
Resourcetemporarilyunavailable

Thenextstepistodecidewhatareasonablehighwatermarkwouldbeforapeer.Zyreismeantforhumaninteractionsthatis,
applicationsthatchatatalowfrequency,suchastwogamesorashareddrawingprogram.I'dexpectahundredmessagesper
secondtobequitealot.Our"peerisreallydead"timeoutis10seconds.Soahighwatermarkof1,000seemsfair.

RatherthansetafixedHWMorusethedefault(whichrandomlyalsohappenstobe1,000),wecalculateitas100*thetimeout.
Here'showweconfigureanewDEALERsocketforapeer:

//Createnewoutgoingsocket(dropanymessagesintransit)
self>mailbox=zsocket_new(self>ctx,ZMQ_DEALER)

//Setourcaller"From"identitysothatreceivingnodeknows
//whoeachmessagecamefrom.
zsocket_set_identity(self>mailbox,reply_to)

//Setahighwatermarkthatallowsforreasonableactivity
zsocket_set_sndhwm(self>mailbox,PEER_EXPIRED*100)

//SendmessagesimmediatelyorreturnEAGAIN
zsocket_set_sndtimeo(self>mailbox,0)

//Connectthroughtopeernode
zsocket_connect(self>mailbox,"tcp://%s",endpoint)

Andfinally,whatdowedowhenwegetanEAGAINonapeer?Wedon'tneedtogothroughalltheworkofdestroyingthepeer
becausetheinterfacewilldothisautomaticallyifitdoesn'tgetanymessagefromthepeerwithintheexpirationtimeout.Just
droppingthelastmessageseemsveryweakitwillgivethereceivingpeergaps.

I'dpreferamorebrutalresponse.Brutalisgoodbecauseitforcesthedesigntoa"good"or"bad"decisionratherthanafuzzy
"shouldworkbuttobehonesttherearealotofedgecasessolet'sworryaboutitlater".Destroythesocket,disconnectthepeer,
andstopsendinganythingtoit.Thepeerwilleventuallyhavetoreconnectandreinitializeanystate.It'skindofanassertionthat
100messagesasecondisenoughforanyone.So,inthezre_peer_sendmethod:

int
zre_peer_send(zre_peer_t*self,zre_msg_t**msg_p)
{
assert(self)
if(self>connected){
if(zre_msg_send(msg_p,self>mailbox)&&errno==EAGAIN){
zre_peer_disconnect(self)
return1

http://zguide.zeromq.org/page:all 213/225
12/31/2015 MQ - The Guide - MQ - The Guide
}
}

return0
}

Wherethedisconnectmethodlookslikethis:

void
zre_peer_disconnect(zre_peer_t*self)
{
//Ifconnected,destroysocketanddropallpendingmessages
assert(self)
if(self>connected){
zsocket_destroy(self>ctx,self>mailbox)
free(self>endpoint)
self>endpoint=NULL
self>connected=false
}
}

DistributedLoggingandMonitoring topprevnext

Let'slookatloggingandmonitoring.Ifyou'veevermanagedarealserver(likeawebserver),youknowhowvitalitistohavea
captureofwhatisgoingon.Therearealonglistofreasons,notleast:

Tomeasuretheperformanceofthesystemovertime.
Toseewhatkindsofworkaredonethemost,tooptimizeperformance.
Totrackerrorsandhowoftentheyoccur.
Todopostmortemsoffailures.
Toprovideanaudittrailincaseofdispute.

Let'sscopethisintermsoftheproblemswethinkwe'llhavetosolve:

Wewanttotrackkeyevents(suchasnodesleavingandrejoiningthenetwork).
Foreachevent,wewanttotrackaconsistentsetofdata:thedate/time,nodethatobservedtheevent,peerthatcreated
theevent,typeofeventitself,andothereventdata.
Wewanttobeabletoswitchloggingonandoffatanytime.
Wewanttobeabletoprocesslogdatamechanicallybecauseitwillbesizable.
Wewanttobeabletomonitorarunningsystemthatis,collectlogsandanalyzeinrealtime.
Wewantlogtraffictohaveminimaleffectonthenetwork.
Wewanttobeabletocollectlogdataatasinglepointonthenetwork.

Asinanydesign,someoftheserequirementsarehostiletoeachother.Forexample,collectinglogdatainrealtimemeans
sendingitoverthenetwork,whichwillaffectnetworktraffictosomeextent.However,asinanydesign,theserequirementsare
alsohypotheticaluntilwehaverunningcodesowecan'ttakethemtooseriously.We'llaimforplausiblygoodenoughand
improveovertime.

APlausibleMinimalImplementation topprevnext

Arguably,justdumpinglogdatatodiskisonesolution,andit'swhatmostmobileapplicationsdo(using"debuglogs").Butmost
failuresrequirecorrelationofeventsfromtwonodes.Thismeanssearchinglotsofdebuglogsbyhandtofindtheonesthat
matter.It'snotaverycleverapproach.

Wewanttosendlogdatasomewherecentral,eitherimmediately,oropportunistically(i.e.,storeandforward).Fornow,let's
focusonimmediatelogging.MyfirstideawhenitcomestosendingdataistouseZyreforthis.Justsendlogdatatoagroup

http://zguide.zeromq.org/page:all 214/225
12/31/2015 MQ - The Guide - MQ - The Guide
called"LOG",andhopesomeonecollectsit.

ButusingZyretologZyreitselfisaCatch22.Whologsthelogger?Whatifwewantaverboselogofeverymessagesent?Do
weincludeloggingmessagesinthatornot?Itquicklygetsmessy.Wewantaloggingprotocolthat'sindependentofZyre'smain
ZREprotocol.Thesimplestapproachisapubsubprotocol,whereallnodespublishlogdataonaPUBsocketandacollector
picksthatupviaaSUBsocket.

Figure69DistributedLogCollection

Thecollectorcan,ofcourse,runonanynode.Thisgivesusanicerangeofusecases:

ApassivelogcollectorthatstoreslogdataondiskforeventualstatisticalanalysisthiswouldbeaPCwithsufficienthard
diskspaceforweeksormonthsoflogdata.

Acollectorthatstoreslogdataintoadatabasewhereitcanbeusedinrealtimebyotherapplications.Thismightbe
overkillforasmallworkgroup,butwouldbesnazzyfortrackingtheperformanceoflargergroups.Thecollectorcould
collectlogdataoverWiFiandthenforwarditoverEthernettoadatabasesomewhere.

AlivemeterapplicationthatjoinedtheZyrenetworkandthencollectedlogdatafromnodes,showingeventsandstatistics
inrealtime.

Thenextquestionishowtointerconnectthenodesandcollector.Whichsidebinds,andwhichconnects?Bothwayswillwork
here,butit'smarginallybetterifthePUBsocketsconnecttotheSUBsocket.Ifyourecall,ZeroMQ'sinternalbuffersonlypopinto
existencewhenthereareconnections.Itmeansassoonasanodeconnectstothecollector,itcanstartsendinglogdatawithout
loss.

Howdowetellnodeswhatendpointtoconnectto?Wemayhaveanynumberofcollectorsonthenetwork,andthey'llbeusing
arbitrarynetworkaddressesandports.Weneedsomekindofserviceannouncementmechanism,andherewecanuseZyreto
dotheworkforus.Wecouldusegroupmessaging,butitseemsneatertobuildservicediscoveryintotheZREprotocolitself.It's
nothingcomplex:ifanodeprovidesaserviceX,itcantellothernodesaboutthatwhenitsendsthemaHELLOcommand.

We'llextendtheHELLOcommandwithaheadersfieldthatholdsasetofname=valuepairs.Let'sdefinethattheheaderX
ZRELOGspecifiesthecollectorendpoint(theSUBsocket).Anodethatactsasacollectorcanaddaheaderlikethis(for
example):

XZRELOG=tcp://192.168.1.122:9992

Whenanothernodeseesthisheader,itsimplyconnectsitsPUBsockettothatendpoint.Logdatanowgetsdistributedtoall
collectors(zeroormore)onthenetwork.

Makingthisfirstversionwasfairlysimpleandtookhalfaday.Herearethepieceswehadtomakeorchange:

Wemadeanewclasszre_logthatacceptslogdataandmanagestheconnectiontothecollector,ifany.
Weaddedsomebasicmanagementforpeerheaders,takenfromtheHELLOcommand.
WhenapeerhastheXZRELOGheader,weconnecttotheendpointitspecifies.
Wherewewereloggingtostdout,weswitchedtologgingviathezre_logclass.

http://zguide.zeromq.org/page:all 215/225
12/31/2015 MQ - The Guide - MQ - The Guide
WeextendedtheinterfaceAPIwithamethodthatletstheapplicationsetheaders.
WewroteasimpleloggerapplicationthatmanagestheSUBsocketandsetstheXZRELOGheader.
WesendourownheaderswhenwesendaHELLOcommand.

ThisversionistaggedintheZyrerepositoryasv0.4.0andyoucandownloadthetarballifyouwanttoseewhatthecodelooked
likeatthisstage.

Atthisstage,thelogmessageisjustastring.We'llmakemoreprofessionallystructuredlogdatainalittlewhile.

First,anoteondynamicports.Inthezre_testerappthatweusefortesting,wecreateanddestroyinterfacesaggressively.
Oneconsequenceisthatanewinterfacecaneasilyreuseaportthatwasjustfreedbyanotherapplication.Ifthere'saZeroMQ
socketsomewheretryingtoconnectthisport,theresultscanbehilarious.

Here'sthescenarioIhad,whichcausedafewminutes'confusion.Theloggerwasrunningonadynamicport:

Startloggerapplication
Starttesterapplication
Stoplogger
Testerreceivesinvalidmessage(andassertsasdesigned)

Asthetestercreatedanewinterface,thatreusedthedynamicportfreedbythe(juststopped)logger,andsuddenlytheinterface
begantoreceivelogdatafromnodesonitsmailbox.Wesawasimilarsituationbefore,whereanewinterfacecouldreusethe
portfreedbyanoldinterfaceandstartgettingolddata.

Thelessonis,ifyouusedynamicports,bepreparedtoreceiverandomdatafromillinformedapplicationsthatarereconnecting
toyou.Switchingtoastaticportstoppedthemisbehavingconnection.That'snotafullsolutionthough.Therearetwomore
weaknesses:

AsIwritethis,libzmqdoesn'tchecksockettypeswhenconnecting.TheZMTP/2.0protocoldoesannounceeachpeer's
sockettype,sothischeckisdoable.

TheZREprotocolhasnofailfast(assertion)mechanismweneedtoreadandparseawholemessagebeforerealizing
thatit'sinvalid.

Let'saddressthesecondone.Socketpairvalidationwouldn'tsolvethisfullyanyway.

ProtocolAssertions topprevnext

AsWikipediaputsit,"Failfastsystemsareusuallydesignedtostopnormaloperationratherthanattempttocontinueapossibly
flawedprocess."AprotocollikeHTTPhasafailfastmechanisminthatthefirstfourbytesthataclientsendstoanHTTPserver
mustbe"HTTP".Ifthey'renot,theservercanclosetheconnectionwithoutreadinganythingmore.

OurROUTERsocketisnotconnectionorientedsothere'snowayto"closetheconnection"whenwegetbadincoming
messages.However,wecanthrowouttheentiremessageifit'snotvalid.Theproblemisgoingtobeworsewhenweuse
ephemeralports,butitappliesbroadlytoallprotocols.

Solet'sdefineaprotocolassertionasbeingauniquesignaturethatweplaceatthestartofeachmessageandwhichidentities
theintendedprotocol.Whenwereadamessage,wecheckthesignatureandifit'snotwhatweexpect,wediscardthemessage
silently.Agoodsignatureshouldbehardtoconfusewithregulardataandgiveusenoughspaceforanumberofprotocols.

I'mgoingtousea16bitsignatureconsistingofa12bitpatternanda4bitprotocolID.Thepattern%xAAAismeanttostayaway
fromvalueswemightotherwiseexpecttoseeatthestartofamessage:%x00,%xFF,andprintablecharacters.

Figure70ProtocolSignature

http://zguide.zeromq.org/page:all 216/225
12/31/2015 MQ - The Guide - MQ - The Guide
Asourprotocolcodecisgenerated,it'srelativelyeasytoaddthisassertion.Thelogicis:

Getfirstframeofmessage.
Checkiffirsttwobytesare%xAAAwithexpected4bitsignature.
Ifso,continuetoparserestofmessage.
Ifnot,skipall"more"frames,getfirstframe,andrepeat.

Totestthis,Iswitchedtheloggerbacktousinganephemeralport.Theinterfacenowproperlydetectsanddiscardsany
messagesthatdon'thaveavalidsignature.Ifthemessagehasavalidsignatureandisstillwrong,that'saproperbug.

BinaryLoggingProtocol topprevnext

Nowthatwehavetheloggingframeworkworkingproperly,let'slookattheprotocolitself.Sendingstringsaroundthenetworkis
simple,butwhenitcomestoWiFiwereallycannotaffordtowastebandwidth.Wehavethetoolstoworkwithefficientbinary
protocols,solet'sdesignoneforlogging.

ThisisgoingtobeapubsubprotocolandinZeroMQv3.xwedopublishersidefiltering.Thismeanswecandomultilevel
logging(errors,warnings,information)ifweputthelogginglevelatthestartofthemessage.Soourmessagestartswitha
protocolsignature(twobytes),alogginglevel(onebyte),andaneventtype(onebyte).

Inthefirstversion,wesendUUIDstringstoidentifyeachnode.Astext,theseare32characterseach.Wecansendbinary
UUIDs,butit'sstillverboseandwasteful.Wedon'tcareaboutthenodeidentifiersinthelogfiles.Allweneedissomewayto
correlateevents.Sowhat'stheshortestidentifierwecanusethat'sgoingtobeuniqueenoughforlogging?Isay"uniqueenough"
becausewhilewereallywantzerochanceofduplicateUUIDsinthelivecode,logfilesarenotsocritical.

ThesimplestplausibleansweristohashtheIPaddressandportintoa2bytevalue.We'llgetsomecollisions,butthey'llberare.
Howrare?Asaquicksanitycheck,Iwriteasmallprogramthatgeneratesabunchofaddressesandhashestheminto16bit
values,lookingforcollisions.Tobesure,Igenerate10,000addressesacrossasmallnumberofIPaddresses(matchinga
simulationsetup),andthenacrossalargenumberofaddresses(matchingareallifesetup).Thehashingalgorithmisamodified
Bernstein:

uint16_thash=0
while(*endpoint)
hash=33*hash^*endpoint++

Idon'tgetanycollisionsoverseveralruns,sothiswillworkasidentifierforthelogdata.Thisaddsfourbytes(twoforthenode
recordingtheevent,andtwoforitspeerineventsthatcomefromapeer).

Next,wewanttostorethedateandtimeoftheevent.ThePOSIXtime_ttypewaspreviously32bits,butbecausethis
overflowsin2038,it'sa64bitvalue.We'llusethisthere'snoneedformillisecondresolutioninalogfile:eventsaresequential,
clocksareunlikelytobethattightlysynchronized,andnetworklatenciesmeanthatprecisetimesaren'tthatmeaningful.

We'reupto16bytes,whichisdecent.Finally,wewanttoallowsomeadditionaldata,formattedastextanddependingonthe
typeofevent.Puttingthisalltogethergivesthefollowingmessagespecification:

<class
name="zre_log_msg"
script="codec_c.gsl"
signature="2"
>
ThisistheZREloggingprotocolrawversion.
<includefilename="license.xml"/>

<!Protocolconstants>
<definename="VERSION"value="1"/>

<definename="LEVEL_ERROR"value="1"/>
<definename="LEVEL_WARNING"value="2"/>
<definename="LEVEL_INFO"value="3"/>

http://zguide.zeromq.org/page:all 217/225
12/31/2015 MQ - The Guide - MQ - The Guide

<definename="EVENT_JOIN"value="1"/>
<definename="EVENT_LEAVE"value="2"/>
<definename="EVENT_ENTER"value="3"/>
<definename="EVENT_EXIT"value="4"/>

<messagename="LOG"id="1">
<fieldname="level"type="number"size="1"/>
<fieldname="event"type="number"size="1"/>
<fieldname="node"type="number"size="2"/>
<fieldname="peer"type="number"size="2"/>
<fieldname="time"type="number"size="8"/>
<fieldname="data"type="string"/>
Loganevent
</message>

</class>

Thisgenerates800linesofperfectbinarycodec(thezre_log_msgclass).Thecodecdoesprotocolassertionsjustlikethemain
ZREprotocoldoes.Codegenerationhasafairlysteepstartingcurve,butitmakesitsomucheasiertopushyourdesignspast
"amateur"into"professional".

ContentDistribution topprevnext

Wenowhavearobustframeworkforcreatinggroupsofnodes,lettingthemchattoeachother,andmonitoringtheresulting
network.Nextstepistoallowthemtodistributecontentasfiles.

Asusual,we'llaimfortheverysimplestplausiblesolutionandthenimprovethatstepbystep.Attheveryleastwewantthe
following:

AnapplicationcantelltheZyreAPI,"Publishthisfile",andprovidethepathtoafilethatexistssomewhereinthefile
system.
Zyrewilldistributethatfiletoallpeers,boththosethatareonthenetworkatthattime,andthosethatarrivelater.
Eachtimeaninterfacereceivesafileittellsitsapplication,"Hereisthisfile".

Wemighteventuallywantmorediscrimination,e.g.,publishingtospecificgroups.Wecanaddthatlaterifit'sneeded.InChapter
7AdvancedArchitectureusingZeroMQwedevelopedafiledistributionsystem(FileMQ)designedtobepluggedintoZeroMQ
applications.Solet'susethat.

Eachnodeisgoingtobeafilepublisherandafilesubscriber.Webindthepublishertoanephemeralport(ifweusethestandard
FileMQport5670,wecan'trunmultipleinterfacesononebox),andwebroadcastthepublisher'sendpointintheHELLO
message,aswedidforthelogcollector.Thisletsusinterconnectallnodessothatallsubscriberstalktoallpublishers.

Weneedtoensurethateachnodehasitsowndirectoryforsendingandreceivingfiles(theoutboxandtheinbox).Again,it'sso
wecanrunmultiplenodesononebox.BecausewealreadyhaveauniqueIDpernode,wejustusethatinthedirectoryname.

Here'showwesetuptheFileMQAPIwhenwecreateanewinterface:

sprintf(self>fmq_outbox,".outbox/%s",self>identity)
mkdir(self>fmq_outbox,0775)

sprintf(self>fmq_inbox,".inbox/%s",self>identity)
mkdir(self>fmq_inbox,0775)

self>fmq_server=fmq_server_new()
self>fmq_service=fmq_server_bind(self>fmq_server,"tcp://*:*")
fmq_server_publish(self>fmq_server,self>fmq_outbox,"/")
fmq_server_set_anonymous(self>fmq_server,true)
charpublisher[32]
sprintf(publisher,"tcp://%s:%d",self>host,self>fmq_service)

http://zguide.zeromq.org/page:all 218/225
12/31/2015 MQ - The Guide - MQ - The Guide
zhash_update(self>headers,"XFILEMQ",strdup(publisher))

//Clientwillconnectasitdiscoversnewnodes
self>fmq_client=fmq_client_new()
fmq_client_set_inbox(self>fmq_client,self>fmq_inbox)
fmq_client_set_resync(self>fmq_client,true)
fmq_client_subscribe(self>fmq_client,"/")

AndwhenweprocessaHELLOcommand,wecheckfortheXFILEMQheaderfield:

//IfpeerisaFileMQpublisher,connecttoit
char*publisher=zre_msg_headers_string(msg,"XFILEMQ",NULL)
if(publisher)
fmq_client_connect(self>fmq_client,publisher)

ThelastthingistoexposecontentdistributionintheZyreAPI.Weneedtwothings:

Awayfortheapplicationtosay,"Publishthisfile"
Awayfortheinterfacetotelltheapplication,"Wereceivedthisfile".

Intheory,theapplicationcanpublishafilejustbycreatingasymboliclinkintheoutboxdirectory,butaswe'reusingahidden
outbox,thisisalittledifficult.SoweaddanAPImethodpublish:

//Publishfileintovirtualspace
void
zre_interface_publish(zre_interface_t*self,
char*filename,char*external)
{
zstr_sendm(self>pipe,"PUBLISH")
zstr_sendm(self>pipe,filename)//Realfilename
zstr_send(self>pipe,external)//Locationinvirtualspace
}

TheAPIpassesthistotheinterfacethread,whichcreatesthefileintheoutboxdirectorysothattheFileMQserverwillpickitup
andbroadcastit.Wecouldliterallycopyfiledataintothisdirectory,butbecauseFileMQsupportssymboliclinks,weusethat
instead.Thefilehasa".ln"extensionandcontainsoneline,whichcontainstheactualpathname.

Finally,howdowenotifytherecipientthatafilehasarrived?TheFileMQfmq_clientAPIhasamessage,"DELIVER",forthis,
soallwehavetodoinzre_interfaceisgrabthismessagefromthefmq_clientAPIandpassitontoourownAPI:

zmsg_t*msg=fmq_client_recv(fmq_client_handle(self>fmq_client))
zmsg_send(&msg,self>pipe)

Thisiscomplexcodethatdoesalotatonce.Butwe'reonlyataround10KlinesofcodeforFileMQandZyretogether.Themost
complexZyreclass,zre_interface,is800linesofcode.Thisiscompact.Messagebasedapplicationsdokeeptheirshapeif
you'recarefultoorganizethemproperly.

WritingtheUnprotocol topprevnext

Wehaveallthepiecesforaformalprotocolspecificationandit'stimetoputtheprotocolonpaper.Therearetworeasonsforthis.
First,tomakesurethatanyotherimplementationstalktoeachotherproperly.Second,becauseIwanttogetanofficialportfor
theUDPdiscoveryprotocolandthatmeansdoingthepaperwork.

Likealltheotherunprotocolswedevelopedinthisbook,theprotocollivesontheZeroMQRFCsite.Thecoreoftheprotocol
specificationistheABNFgrammarforthecommandsandfields:

http://zguide.zeromq.org/page:all 219/225
12/31/2015 MQ - The Guide - MQ - The Guide

zreprotocol=greeting*traffic

greeting=S:HELLO
traffic=S:WHISPER
/S:SHOUT
/S:JOIN
/S:LEAVE
/S:PINGR:PINGOK

Greetapeersoitcanconnectbacktous
S:HELLO=header%x01ipaddressmailboxgroupsstatusheaders
header=signaturesequence
signature=%xAA%xA1
sequence=2OCTETIncrementalsequencenumber
ipaddress=stringSenderIPaddress
string=size*VCHAR
size=OCTET
mailbox=2OCTETSendermailboxportnumber
groups=stringsListofgroupssenderisin
strings=size*string
status=OCTETSendergroupstatussequence
headers=dictionarySenderheaderproperties
dictionary=size*keyvalue
keyvalue=stringFormattedasname=value

Sendamessagetoapeer
S:WHISPER=header%x02content
content=FRAMEMessagecontentasZeroMQframe

Sendamessagetoagroup
S:SHOUT=header%x03groupcontent
group=stringNameofgroup
content=FRAMEMessagecontentasZeroMQframe

Joinagroup
S:JOIN=header%x04groupstatus
status=OCTETSendergroupstatussequence

Leaveagroup
S:LEAVE=header%x05groupstatus

Pingapeerthathasgonesilent
S:PING=header%06

Replytoapeer'sping
R:PINGOK=header%07

ExampleZyreApplication topprevnext

Let'snowmakeaminimalexamplethatusesZyretobroadcastfilesaroundadistributednetwork.Thisexampleconsistsoftwo
programs:

AlistenerthatjoinstheZyrenetworkandreportswheneveritreceivesafile.
AsenderthatjoinsaZyrenetworkandbroadcastsexactlyonefile.

Thelistenerisquiteshort:

http://zguide.zeromq.org/page:all 220/225
12/31/2015 MQ - The Guide - MQ - The Guide

#include<zre.h>

intmain(intargc,char*argv[])
{
zre_interface_t*interface=zre_interface_new()
while(true){
zmsg_t*incoming=zre_interface_recv(interface)
if(!incoming)
break
zmsg_dump(incoming)
zmsg_destroy(&incoming)
}
zre_interface_destroy(&interface)
return0
}

Andthesenderisn'tmuchlonger:

#include<zre.h>

intmain(intargc,char*argv[])
{
if(argc<2){
puts("Syntax:senderfilenamevirtualname")
return0
}
printf("Publishing%sas%s\n",argv[1],argv[2])
zre_interface_t*interface=zre_interface_new()
zre_interface_publish(interface,argv[1],argv[2])
while(true){
zmsg_t*incoming=zre_interface_recv(interface)
if(!incoming)
break
zmsg_dump(incoming)
zmsg_destroy(&incoming)
}
zre_interface_destroy(&interface)
return0
}

Conclusions topprevnext

BuildingapplicationsforunstabledecentralizednetworksisoneoftheendgamesforZeroMQ.Asthecostofcomputingfalls
everyyear,suchnetworksbecomemoreandmorecommon,beitconsumerelectronicsorvirtualboxesinthecloud.Inthis
chapter,we'vepulledtogethermanyofthetechniquesfromthebooktobuildZyre,aframeworkforproximitycomputingovera
localnetwork.Zyreisn'tuniquethereareandhavebeenmanyattemptstoopenthisareaforapplications:ZeroConf,SLP,
SSDP,UPnP,DDS.Buttheseallseemtoenduptoocomplexorotherwisetoodifficultforapplicationdeveloperstobuildon.

Zyreisn'tfinished.Likemanyoftheprojectsinthisbook,it'sanicebreakerforothers.Therearesomemajorunfinishedareas,
whichwemayaddressinlatereditionsofthisbookorversionsofthesoftware.

HighlevelAPIs:themessagebasedAPIthatZyreoffersnowisusablebutstillrathermorecomplexthanI'dlikefor
averagedevelopers.Ifthere'sonetargetweabsolutelycannotmiss,it'srawsimplicity.Thismeansweshouldbuildhigh
levelAPIs,inlotsoflanguages,whichhideallthemessaging,andwhichcomedowntosimplemethodslikestart,
join/leavegroup,getmessage,publishfile,stop.

Security:howdowebuildafullydecentralizedsecuritysystem?Wemightbeabletoleveragepublickeyinfrastructurefor

http://zguide.zeromq.org/page:all 221/225
12/31/2015 MQ - The Guide - MQ - The Guide
somework,butthatrequiresthatnodeshavetheirownInternetaccess,whichisn'tguaranteed.Theansweris,asfaras
wecantell,touseanyexistingsecurepeertopeerlink(TLS,BlueTooth,perhapsNFC)toexchangeasessionkeyand
useasymmetriccipher.Symmetricciphershavetheiradvantagesanddisadvantages.

Nomadiccontent:howdoI,asauser,managemycontentacrossmultipledevices?TheZyre+FileMQcombinationmight
help,forlocalnetworkuse,butI'dliketobeabletodothisacrosstheInternetaswell.AretherecloudservicesIcould
use?IstheresomethingIcouldmakeusingZeroMQ?

Federation:howdowescalealocalareadistributedapplicationacrosstheglobe?Oneplausibleanswerisfederation,
whichmeanscreatingclustersofclusters.If100nodescanjointogethertocreatealocalcluster,thenperhaps100
clusterscanjointogethertocreateawideareacluster.Thechallengesarethenquitesimilar:discovery,presence,and
groupmessaging.

Postface topprevnext

TalesfromOutThere topprevnext

IaskedsomeofthecontributorstothisbooktotelluswhattheyweredoingwithZeroMQ.Herearetheirstories.

RobGagnon'sStory topprevnext

"WeuseZeroMQtoassistinaggregatingthousandsofeventsoccurringeveryminuteacrossourglobalnetworkof
telecommunicationsserverssothatwecanaccuratelyreportandmonitorforsituationsthatrequireourattention.ZeroMQmade
thedevelopmentofthesystemnotonlyeasier,butfastertodevelopandmorerobustandfaulttolerantthanwehadoriginally
plannedinouroriginaldesign.

"We'reabletoeasilyaddandremoveclientsfromthenetworkwithoutthelossofanymessage.Ifweneedtoenhancetheserver
portionofoursystem,wecanstopandrestartitaswellwithouthavingtoworryaboutstoppingalloftheclientsfirst.Thebuiltin
bufferingofZeroMQmakesthisallpossible."

TomvanLeeuwen'sStory topprevnext

"Iwaslookingatcreatingsomekindofservicebusconnectingallkindsofservicestogether.Therewerealreadysomeproducts
thatimplementedabroker,buttheydidnothavethefunctionalityIneeded.Byaccident,IstumbleduponZeroMQ,whichis
awesome.It'sverylightweight,lean,simpleandeasytofollowbecausetheguideisverycompleteandreadsverywell.I've
actuallyimplementedtheTitanicpatternandtheMajordomobrokerwithsomeadditions(client/workerauthenticationandworkers
sendingacatalogexplainingwhattheyprovideandhowtheyshouldbeaddressed).

"ThebeautifulthingaboutZeroMQisthefactthatitisalibraryandnotanapplication.Youcanmoldithoweveryoulikeandit
simplyputsboringthingslikequeuing,reconnecting,TCPsocketsandsuchtothebackground,makingsureyoucanconcentrate
onwhatisimportanttoyou.I'veimplementedallkindsofworkers/clientsandthebrokerinRuby,becausethatisthemain
languageweusefordevelopment,butalsosomePHPclientstoconnecttothebusfromexistingPHPwebapps.Weusethis
servicebusforcloudservices,connectingallkindsofplatformdevicestoaservicebusexposingfunctionalityforautomation.

"ZeroMQisveryeasytounderstandandifyouspendadaywiththeguide,you'llhavegoodknowledgeofhowitworks.I'ma
networkengineer,notasoftwaredeveloper,butmanagedtocreateaverynicesolutionforourautomationneeds!ZeroMQ:
Thankyouverymuch!"

http://zguide.zeromq.org/page:all 222/225
12/31/2015 MQ - The Guide - MQ - The Guide

MichaelJakl'sStory topprevnext

"WeuseZeroMQfordistributingmillionsofdocumentsperdayinourdistributedprocessingpipeline.Westartedoutwithbig
messagequeuingbrokersthathadtheirownrespectiveissuesandproblems.Inthequestofsimplifyingourarchitecture,we
choseZeroMQtodothewiring.Sofarithadahugeimpactinhowourarchitecturescalesandhoweasyitistochangeandmove
thecomponents.Theplethoraoflanguagebindingsletsuschoosetherighttoolforthejobwithoutsacrificinginteroperabilityin
oursystem.Wedon'tusealotofsockets(lessthan10inourwholeapplication),butthat'sallweneededtosplitahuge
monolithicapplicationintosmallindependentparts.

"Allinall,ZeroMQletsmekeepmysanityandhelpsmycustomersstaywithinbudget."

VadimShalts'sStory topprevnext

"IamteamleaderinthecompanyActForex,whichdevelopssoftwareforfinancialmarkets.Duetothenatureofourdomain,we
needtoprocesslargevolumesofpricesquickly.Inaddition,it'sextremelycriticaltominimizelatencyinprocessingordersand
prices.Achievingahighthroughputisnotenough.Everythingmustbehandledinasoftrealtimewithapredictableultralow
latencyperprice.Thesystemconsistsofmultiplecomponentsexchangingmessages.Eachpricecantakealotofprocessing
stages,eachofwhichincreasestotallatency.Asaconsequence,lowandpredictablelatencyofmessagingbetweencomponents
becomesakeyfactorofourarchitecture.

"Weinvestigateddifferentsolutionstofindsomethingsuitableforourneeds.Wetrieddifferentmessagebrokers(RabbitMQ,
ActiveMQApollo,Kafka),butfailedtoreachalowandpredictablelatencywithanyofthem.Intheend,wechoseZeroMQused
inconjunctionwithZooKeeperforservicediscovery.ComplexcoordinationwithZeroMQrequiresarelativelylargeeffortanda
goodunderstanding,asaresultofthenaturalcomplexityofmultithreading.WefoundthatanexternalagentlikeZooKeeperis
betterchoiceforservicediscoveryandcoordinationwhileZeroMQcanbeusedprimarilyforsimplemessaging.ZeroMQfit
perfectlyintoourarchitecture.Itallowedustoachievethedesiredlatencyusingminimalefforts.Itsavedusfromabottleneckin
theprocessingofmessagesandmadeprocessingtimeverystableandpredictable.

"IcandecidedlyrecommendZeroMQforsolutionswherelowlatencyisimportant."

HowThisBookHappened topprevnext

WhenIsetouttowriteaZeroMQbook,wewerestilldebatingtheprosandconsofforksandpullrequestsintheZeroMQ
community.Today,forwhatit'sworth,thisargumentseemssettled:the"liberal"policythatweadoptedforlibzmqinearly2012
brokeourdependencyonasingleprimeauthor,andopenedthefloortodozensofnewcontributors.Moreprofoundly,itallowed
ustomovetoagentlyorganicevolutionarymodelthatwasverydifferentfromtheolderforcedmarchmodel.

ThereasonIwasconfidentthiswouldworkwasthatourworkontheGuidehad,forayearormore,showntheway.True,the
textismyownwork,whichisperhapsasitshouldbe.Writingisnotprogramming.Whenwewrite,wetellastoryandonedoesn't
wantdifferentvoicestellingonetaleitfeelsstrange.

Formethereallongtermvalueofthebookistherepositoryofexamples:about65,000linesofcodein24differentlanguages.
It'spartlyaboutmakingZeroMQaccessibletomorepeople.PeoplealreadyrefertothePythonandPHPexamplerepositories
twoofthemostcompletewhentheywanttotellothershowtolearnZeroMQ.Butit'salsoaboutlearningprogramming
languages.

Here'saloopofcodeinTcl:

while{1}{
#Processallpartsofthemessage
zmqmessagemessage
frontendrecv_msgmessage
setmore[frontendgetsockoptRCVMORE]
backendsend_msgmessage[expr{$more?"SNDMORE":""}]
messageclose
if{!$more}{
http://zguide.zeromq.org/page:all 223/225
12/31/2015 MQ - The Guide - MQ - The Guide
break#Lastmessagepart
}
}

Andhere'sthesameloopinLua:

whiletruedo
Processallpartsofthemessage
localmsg=frontend:recv()
if(frontend:getopt(zmq.RCVMORE)==1)then
backend:send(msg,zmq.SNDMORE)
else
backend:send(msg,0)
breakLastmessagepart
end
end

Andthisparticularexample(rrbroker)existsinC#,C++,CL,Clojure,Erlang,F#,Go,Haskell,Haxe,Java,Lua,Node.js,Perl,
PHP,Python,Ruby,Scala,Tcl,andofcourseC.Thiscodebase,allprovidedasopensourceundertheMIT/X11license,may
formthebasisforotherbooksorprojects.

Butwhatthiscollectionoftranslationssaysmostprofoundlyisthis:thelanguageyouchooseisadetail,evenadistraction.The
powerofZeroMQliesinthepatternsitgivesyouandletsyoubuild,andthesetranscendthecomingsandgoingsoflanguages.
Mygoalasasoftwareandsocialarchitectistobuildstructuresthatcanlastgenerations.Thereseemsnopointinaimingfor
meredecades.

RemovingFriction topprevnext

I'llexplainthetechnicaltoolchainweusedintermsofthefrictionweremoved.Inthisbookwe'retellingastoryandthegoalisto
reachasmanypeopleaspossible,ascheaplyandsmoothlyaswecan.

ThecoreideawastohostthetextandexamplesonGitHubandmakeiteasyforanyonetocontribute.Itturnedouttobemore
complexthanthat,however.

Let'sstartwiththedivisionoflabor.I'magoodwriterandcanproduceendlessamountsofdecenttextquickly.Butwhatwas
impossibleformewastoprovidetheexamplesinotherlanguages.BecausethecoreZeroMQAPIisinC,itseemedlogicalto
writetheoriginalexamplesinC.Also,Cisaneutralchoiceit'sperhapstheonlylanguagethatdoesn'tcreatestrongemotions.

Howtoencouragepeopletomaketranslationsoftheexamples?Wetriedafewapproachesandfinallywhatworkedbestwasto
offera"chooseyourlanguage"linkoneverysingleexampleinthetext,whichtookpeopleeithertothetranslationortoapage
explaininghowtheycouldcontribute.ThewayitusuallyworksisthataspeoplelearnZeroMQintheirpreferredlanguage,they
contributeahandfuloftranslationsorfixestotheexistingones.

Atthesametime,Inoticedafewpeoplequitedeterminedlytranslatingeverysingleexample.Thiswasmainlybindingauthors
whorealizedthattheexampleswereagreatwaytoencouragepeopletousetheirbindings.Fortheirefforts,Iextendedthe
scriptstoproducelanguagespecificversionsofthebook.InsteadofincludingtheCcode,we'dincludethePython,orPHPcode.
LuaandHaxealsogottheirdedicatedversions.

Oncewehaveanideaofwhoworksonwhat,weknowhowtostructuretheworkitself.It'sclearthattowriteandtestan
example,whatyouwanttoworkonissourcecode.Soweimportthissourcecodewhenwebuildthebook,andthat'showwe
makelanguagespecificversions.

Iliketowriteinaplaintextformat.It'sfastandworkswellwithsourcecontrolsystemslikegit.Becausethemainplatformforour
websitesisWikidot,IwriteusingWikidot'sveryreadablemarkupformat.

Atleastinthefirstchapters,itwasimportanttodrawpicturestoexplaintheflowofmessagesbetweenpeers.Makingdiagrams
byhandisalotofwork,andwhenwewanttogetfinaloutputindifferentformats,imageconversionbecomesachore.Istarted
withDitaa,whichturnstextdiagramsintoPNGs,thenlaterswitchedtoasciitosvg,whichproducesSVGfiles,whicharerather
better.Sincethefiguresaretextdiagrams,embeddedintheprose,it'sremarkablyeasytoworkwiththem.

http://zguide.zeromq.org/page:all 224/225
12/31/2015 MQ - The Guide - MQ - The Guide
Bynowyou'llrealizethatthetoolchainweuseishighlycustomized,thoughitusesalotofexternaltools.Allareavailableon
Ubuntu,whichisamercy,andthewholecustomtoolchainisinthezguiderepositoryinthebinsubdirectory.

Let'swalkthroughtheeditingandpublishingprocess.Hereishowweproducetheonlineversion:

bin/buildguide

Whichworksasfollows:

Theoriginaltextsitsinaseriesoftextfiles(oneperchapter).
Theexamplessitintheexamplessubdirectory,classifiedperlanguage.
WetakethetextandprocessthisusingacustomPerlscript,mkwikidot,intoasetofWikidotreadyfiles.
Wedothisforeachofthelanguagesthatgettheirownversion.
Weextractthegraphicsandcallasciitosvgandrasterizeoneachonetoproduceimagefiles,whichwestoreintheimages
subdirectory.
Weextractinlinelistings(whicharenottranslated)andstorestheseinthelistingssubdirectory.
WeusepygmentizeoneachexampleandlistingtocreateamarkeduppageinWikidotformat.
WeuploadallchangedfilestotheonlinewikiusingtheWikidotAPI.

Doingthisfromscratchtakesawhile.SowestoretheSHA1signaturesofeveryimage,listing,example,andtextfile,andonly
processanduploadchanges,andthatmakesiteasytopublishanewversionofthetextwhenpeoplemakenewcontributions.

ToproducethePDFandEpubformats,wedothefollowing:

bin/buildpdfs

Whichworksasfollows:

WeusethecustommkdocbookPerlprogramontheinputfilestoproduceaDocBookoutput.
WepushtheDocBookformatthroughdocbook2psandps2pdftocreatecleanPDFsineachlanguage.
WepushtheDocBookformatthroughdb2epubtocreateEpubbooksandineachlanguage.
WeuploadthePDFstothepublicwikiusingtheWikidotAPI.

Whencreatingacommunityproject,it'simportanttolowerthe"changelatency",whichisthetimeittakesforpeopletoseetheir
workliveor,atleast,toseethatyou'veacceptedtheirpullrequest.Ifthatismorethanadayortwo,you'veoftenlostyour
contributor'sinterest.

Licensing topprevnext

Iwantpeopletoreusethistextintheirownwork:inpresentations,articles,andevenotherbooks.However,thedealisthatif
theyremixmywork,otherscanremixtheirs.I'dlikecredit,andhavenoargumentagainstothersmakingmoneyfromtheir
remixes.Thus,thetextislicensedunderccbysa.

Fortheexamples,westartedwithGPL,butitrapidlybecameclearthiswasn'tworkable.Thepointofexamplesistogivepeople
reusablecodefragmentssotheywilluseZeroMQmorewidely,andiftheseareGPL,thatwon'thappen.Weswitchedto
MIT/X11,evenforthelargerandmorecomplexexamplesthatconceivablywouldworkasLGPL.

However,whenwestartedturningtheexamplesintostandaloneprojects(aswithMajordomo),weusedtheLGPL.Again,
remixabilitytrumpsdissemination.Licensesaretoolsusethemwithintent,notideology.

Websitedesignandcontentiscopyright(c)2014iMatixCorporation.Contactusforprofessionalsupport.Sitecontentlicensedunderccbysa3.0MQiscopyright(c)
Copyright(c)20072014iMatixCorporationandContributors.MQisfreesoftwarelicensedundertheLGPL.MQandZEROMQaretrademarksofiMatixCorporation.
TermsofUsePrivacyPolicy

http://zguide.zeromq.org/page:all 225/225

You might also like