Now-Casting Food Consumer Price Indexes With Big Data Public-Private Complementarities
Sangita Dubey and Pietro Gennari
Food and Agriculture Organization of the United Nations
Statistics Division, Viale delle Terme di Caracalla, 00153 Rome, Italy
Telephone (+39) 0657055890
Sangita.Dubey@fao.org
Telephone (+39) 06570553599
ESSDirector@fao.org
Abstract: Policymakers, particularly central banks, rely increasingly on "big data"
for information, or "nowcasts", about the current state of the economy, where official
statistics, such as GDP and unemployment rates, are available only with a significant lag.
Official statistics, however, remain hesitant about adopting or using big data based on
concerns about data quality, representativity, and legal issues.
This paper presents the uses of big data in the domain of food prices, from
producing official statistics to nowcasts for food security early warning. In the context of
private sector data production, it reviews some big food price sources, namely, from
supermarket scanners, web scraping, and crowdsourcing, with an illustration using
Brazilian food prices. It proposes comparative advantages and complementarities of
private-public production, particularly in the food security context, concluding that while
data quality issues can be addressed, organizational mandates and legislative
requirements create more difficult hurdles in public-private partnerships in the official use
of private food price statistics.
Acknowledgement: The authors thank Joseph Reisinger and Jonathan Cross of
Premise for sharing their data, analysis and methodological notes, and Franck Cachia for
his statistical and analytical support.
Keywords: big data, nowcasting, food price statistics, scanner data, web scraping,
crowdsourcing, consumer price index, mobile applications, food security.
Disclaimer: The views expressed are purely those of the authors and may not in
any circumstances be regarded as stating an official position of the Food and Agriculture
Organization of the United Nations.
1. INTRODUCTION
Big data, characterized by its three v's of volume, velocity and variety, has opened
the door to more timely, frequent, detailed and cost-effective data. This enables policy
makers to obtain nowcasts, or current-period forecasts, of key economic phenomena,
such as GDP growth, unemployment rates, retail sales, and consumer inflation, which
better inform fiscal and monetary policy, and serve as early warnings of turning points in
the economy (Armah, 2013; Askitas and Zimmermann, 2009; Banbura et al, 2010; Choi and
Varian, 2009a and 2009b; Galbraith and Tkacz, 2013; Khan, 2012; McLaren and
Shanbhogue, 2011; Wu and Brynjolfsson, 2009). Big data also helps address the
limitation that official statistics used for many policy decisions are available only with a
significant time lag and lack the detail and disaggregation required.
The use of big data in nowcasting official statistics raises several important questions.
Is there a role for big data in official statistics? As new private sector big data producers
appear, what is their role vis-a-vis official statistics? How do these roles vary by type of
data, producer, and use?
These issues and questions are being debated in both national and international
statistics offices (Karlberg and Skaliotis, 2013; Pierson, 2013; United Nations Global
Pulse, 2012, Oct 2013, June 2013; UNSC, 2014). Opponents point out the very valid risks
in using private sector big data to produce official statistics, including concerns arising from
representativity, data quality, privacy, legal and institutional mandates, and ongoing
production. Supporters, on the other hand, point out the benefits of low-cost, low-burden,
timely and detailed information, and suggest that the risks raised can be addressed, much
in the way that they have been with respect to the use of administrative data.
The domain of food price statistics is particularly helpful in contributing to these
debates, both because of its history in use of big data, and because of the importance of
frequent, detailed and real-time food price data in monitoring food security and providing
an early warning of food insecurity. For example, the food price crisis of 2007-08 saw
world food prices increase significantly, resulting in an increase in food insecurity, social
unrest and political and economic instability. A World Bank report attributes food riots to
sharp food price increases (World Bank, 2014). More recently, the Ebola outbreak in West
Africa has seen sharp increases in food prices and food insecurity, arising from a
combination of factors including travel restrictions and disruptions in agricultural production
and transportation.
Though the need for high-frequency, real-time food price data is undisputed, official food
price statistics are typically available monthly, and only at the end of the month or a week
or two after, and rarely with the kind of detail needed for food security early warning,
monitoring and policy response. Many policy departments, particularly in developing
countries, have adopted big data approaches to nowcast food prices, monitor food
security and inform their market information and early warning systems. At the same time,
many national statistics offices (NSOs) have also started to adopt big data in compiling
their official food price statistics, namely, the food and non-alcoholic beverage component
of the consumer price index (food CPI), though this has rarely led to more timely statistics.
This paper examines the actual and potential role and impact of big data on food
price statistics, particularly big data produced by the private sector. Towards that end, the
paper is structured as follows. Section 2 provides a brief overview of the collection and
production of the food CPI by national statistics offices (NSOs), which is necessary to
evaluate private sector big food data. Section 3 discusses three approaches in using
private sector big data to compute or nowcast official food price statistics: 1) retail
point-of-sale scanner data; 2) web-scraped food prices; and 3) crowdsourced mobile app
data collection. Section 4 provides an example in the context of Brazil, comparing official
Brazilian food CPI data with private sector data provided by the San Francisco-based IT
company, Premise. Section 5 concludes with some of the public and private sector
comparative advantages and complementarities in collecting and producing food CPIs and
other food price statistics.
2. Food price statistics, official food CPIs and nowcasting
Food CPIs, a subcomponent of the CPI, are the most traditional food price
statistics compiled by NSOs, and much of their importance derives from the importance of
the CPI as an official statistic used for both public and private sector decision making.
The importance of the CPI stems from its wide uses: as a monetary policy target to
monitor inflation; to escalate social security and pension payments; to index tax thresholds;
and to escalate both private and public sector wage contracts. Within NSOs, it is among
their group of mission-critical statistics that receive the highest attention to data quality
and timeliness, as the CPI impacts a country's interest and exchange rates, its government
revenues and expenditures, and private sector wage compensation bills. And the food CPI
subcomponent, as mentioned earlier, is important in its own right in order to monitor
price-related food insecurity.
Given its importance, great efforts are made to continually improve the quality and
comparability of the CPI within and across countries, with well-established international
guidelines and methodologies, and a vast volume of theoretical and applied research.
Despite this, all countries deviate from the first-best methodology in compiling CPIs, largely
due to costs and the need for timely data, with implications for the quality of the food CPI
subcomponent in monitoring food security. To examine this, the next subsection provides a
brief overview of CPI data collection and compilation.
2.1. A brief primer on the CPI methodology: data collection to compilation
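The food CPI follows the same statistical methodology as the CPI itself, with
international guidelines provided in the CPI Manual: Theory and Practice (ILO, 2004).
Most countries compile the CPI using a Laspeyres-type index, $P_L$, as follows:
\[
P_L = \frac{\sum_{i=1}^{n} p_i^t q_i^0}{\sum_{i=1}^{n} p_i^0 q_i^0}
    = \sum_{i=1}^{n} \left(\frac{p_i^t}{p_i^0}\right) \frac{p_i^0 q_i^0}{\sum_{j=1}^{n} p_j^0 q_j^0}
    = \sum_{i=1}^{n} \left(\frac{p_i^t}{p_i^0}\right) s_i^0
\]
where $n$ is the number of commodities, $i$ represents commodity $i$, $t$ is the time period,
$s_i^0 = p_i^0 q_i^0 / \sum_{j=1}^{n} p_j^0 q_j^0$ is the expenditure share of commodity $i$ in the
base period 0, and $p_i^t / p_i^0$ is the price
relative of commodity $i$ between periods 0 and $t$ (ILO, 2004). In practice, commodity $i$
refers to a set of individual products, whose individual expenditure weights are unavailable.
As a result, the price relative for this set of products is itself an elementary price index,
generally estimated by the geometric mean of price relatives of individual products (a
Jevons index).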
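
The two-level compilation described above can be illustrated with a minimal sketch. It assumes a Jevons index (geometric mean of product price relatives) at the elementary level and a share-weighted Laspeyres-type aggregation above it. The commodity names, prices and expenditure shares are hypothetical and serve only to show the mechanics; they are not taken from any NSO.

```python
import math


def jevons(base_prices, current_prices):
    """Elementary index: geometric mean of the price relatives p_t / p_0."""
    relatives = [pt / p0 for p0, pt in zip(base_prices, current_prices)]
    return math.exp(sum(math.log(r) for r in relatives) / len(relatives))


def laspeyres(shares, relatives):
    """Laspeyres-type index: base-period-share-weighted mean of price relatives."""
    total = sum(shares.values())  # normalise in case shares do not sum to 1
    return sum(shares[k] / total * relatives[k] for k in shares)


# Hypothetical price quotes per commodity group (same products, same outlets).
base = {"bread": [2.0, 2.5], "milk": [1.0, 1.0, 1.25]}
current = {"bread": [2.2, 2.5], "milk": [1.1, 1.0, 1.25]}
# Hypothetical base-period expenditure shares.
shares = {"bread": 0.6, "milk": 0.4}

relatives = {k: jevons(base[k], current[k]) for k in base}
food_index = laspeyres(shares, relatives)
print(round(100 * food_index, 2))
```

Matching each current price to the base price of the same product in the same outlet is what makes the relatives meaningful, which is why the positional pairing in `zip` matters here.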
A Laspeyres index has the property that quantity or expenditure weights are kept
fixed for the base period, 0, to enable the index to measure only the pure price change
between two periods. However, since prices are typically collected monthly, while the
expenditure weights are typically computed annually, the index is not a pure Laspeyres
index, but pertains to the more general category of Lowe-type indices. Furthermore, the
expenditure weights or shares pertain to broad groupings of commodities as opposed to
detailed commodity breakdowns. Both the weighting and the groupings take into account
the high costs of data collection, the former of which is typically obtained through
household expenditure surveys. Since consumers are known to substitute from higher to
lower priced items, this index is known to have an upward bias, which increases the less
frequently expenditure weights are updated. The opposite bias is found for Paasche-type
indices, where expenditure weights refer to the current period.
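
The direction of the substitution bias can be seen in a small numerical sketch. The prices and quantities below are invented for illustration: the price of one item rises and the hypothetical consumer shifts purchases towards the cheaper item. Under that assumption, the base-weighted (Laspeyres) index exceeds the current-weighted (Paasche) index.

```python
def laspeyres_qty(p0, pt, q0):
    """Cost of the base-period basket at current prices relative to base prices."""
    return sum(pt[k] * q0[k] for k in p0) / sum(p0[k] * q0[k] for k in p0)


def paasche_qty(p0, pt, qt):
    """Cost of the current-period basket at current prices relative to base prices."""
    return sum(pt[k] * qt[k] for k in p0) / sum(p0[k] * qt[k] for k in p0)


# Hypothetical data: beef becomes more expensive, chicken does not.
p0 = {"beef": 10.0, "chicken": 5.0}
pt = {"beef": 15.0, "chicken": 5.0}
q0 = {"beef": 4.0, "chicken": 4.0}   # base-period quantities
qt = {"beef": 2.0, "chicken": 7.0}   # consumer substitutes towards chicken

lasp = laspeyres_qty(p0, pt, q0)
paas = paasche_qty(p0, pt, qt)
print(f"Laspeyres: {lasp:.4f}  Paasche: {paas:.4f}")
```

The gap between the two figures depends entirely on how strongly quantities respond to relative prices, so it should be read as a direction of bias rather than a typical magnitude.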
The index methodology used differs from the ideal chained indices (which require
monthly expenditure weights), largely due to the high costs of conducting regular
expenditure surveys. Furthermore, the expenditure weights are designed to reflect a
representative consumer, but the lack of coverage of expenditures in small cities and
rural areas results in an index that more likely represents the representative big-city
consumer. Due to transportation, storage and post-harvest food loss, it is reasonable to
expect that food price inflation is higher in urban areas relative to small cities and rural
areas, particularly in developing countries, with the possible exception of regions and
products facing import dependency, where the opposite may be true.
For price collection, the NSO determines the regions covered, which are often only
large urban areas; and for each commodity group, the sample of markets or outlets within
the region and the sample of products within the outlet for which prices are collected.
Commodity groupings are based on the Classification of Individual Consumption by
Purpose (COICOP), an international classification system, or some variant thereof. To
ensure only price change is measured, prices are collected for the same commodity for the
same outlet, which means CPI data are longitudinal in nature.
The selection of the sample of products within each commodity grouping is based
on the main commodities purchased within that group in a particular outlet. What is "main"
is often based on the advice of the outlet, which may reflect judgment as opposed to
statistical evidence. Furthermore, while every effort is made to price the same commodity
over time, it is not always possible to ensure that quality remains fixed. CPIs
also need to take into account the introduction of new items in the market, and emerging
brands sold at higher prices.
While this description does not fully explain the methodology behind CPI data
collection and compilation, it does show that the actual practice deviates from the ideal.
The deviation is driven largely by costs, response burden, and the need for timely and
reliable CPI data to inform monetary policy.
In summary, while traditional food CPI data collection and compilation is guided by
internationally accepted guidelines, its practice suffers from the following limitations,
particularly with respect to its use in monitoring food security and warning of turning points.
Even when provided monthly, the CPI and food CPI are available with a lag of several
weeks, as they are published between the end of the month and several weeks after. Its
focus on urban areas leads to lack of representativity of consumers in smaller cities and
rural areas, who likely face lower levels of food price inflation, particularly in developing
countries, apart from those areas dependent on food imports. The lack of frequent
updating of expenditure weights creates an upward bias in measuring food price inflation,
because it ignores the fact that consumers substitute from higher to lower priced items of a
similar nature. Unless there is some ex-post reweighting to get at a better representativity
of a typical basket of food items purchased, it is likely that official price statistics will
overestimate food price inflation. And finally, the food CPI lacks the level of food product
detail and geographical detail necessary to pinpoint the product types and locations
where price-related food insecurity is likely to occur, a data need particularly important for
food security monitoring and early warning.
3. The use of private sector Big Data in computing food CPIs
As mentioned earlier, the types of big data used for computing food CPIs include
retail point-of-sale scanner data, data scraped from internet sites, and food price data
collected using mobile applications on hand-held devices such as mobile phones. A fourth
method, based on internet search queries, is not discussed in this paper. Unlike many
other domains in official statistics, where there are still significant methodology and data
quality based objections to the use of private sector big data, the use of private sector big
data to compile the CPI provides a counterexample, as scanner data is being directly used
in CPI compilation by several countries.
3.1 Scanner Data
The use of scanner data for computing CPIs, particularly food CPIs, has been
advocated by leading price index theorists, such as Erwin Diewert, Robert Feenstra, Denis
Fixler, and Jack Triplett (Feenstra and Shapiro, 2003); discussed several times in the
international meetings of the Ottawa Group on price indices; featured in the May 2014
ILO-led expert group meeting on the CPI; and advocated by Eurostat to its member
countries in compiling the Harmonised Indices of Consumer Prices (HICP). At the national
level, it has been tested and/or implemented in several countries, including the
Netherlands (van der Grient and de Haan, 2010), Norway (Rodriguez and Haraldsen,
2006), Switzerland (R. Müller et al, 2006), and the United Kingdom (James and
Campbell, 2012).
Scanner data refers to data obtained at the retail point of sale when purchases are
scanned by bar code readers. For food prices, these are typically gathered at supermarket
checkouts. The information collected includes the product purchased, its characteristics,
the expenditure, and the time and sale location/outlet. Bar codes use detailed
classification systems, such as the International Article Number, formerly the
European Article Number (EAN), or the Universal Product Code (UPC), both of which
enable a mapping to the COICOP. Using this common COICOP classification is essential
for cross-country comparisons of CPIs.
In countries where scanner data is used, advantages include improvements in data
quality and representativity of the main items purchased, reduced costs of data collection
and response burden, and the availability of near real-time data (available with a two to
three day delay) as well as a census of transactions for the retailers covered (James and
Campbell, 2012; Mueller, 2006; Silver and Heravi, 2001). These advantages increase when
food sales are concentrated among a few retailers, as in Switzerland, where the two
biggest retail chains account for 70% of sales, or the UK, where the top four account for
over 75% and the top six for almost 90% of sales.
Two limitations do exist. The first is the fact that food sales in mom-and-pop stores
are not captured. The second, and more considerable limitation, arises from the risk
associated with reliance on the private sector for data collection and maintenance. IT
glitches can delay the reporting of data by retailers, putting at risk either the quality of the
CPI, or the requirement that the NSO publish the CPI according to a fixed, pre-announced
schedule. Private firms can experience financial difficulties, reducing their ability to place
effort in activities, such as data sharing, that do not contribute to sales. Where retail sales
are concentrated, the impact of missing data is significant, and since the CPI is among the
statistical indicators that impact financial markets, these risks are not trivial. While the first
issue cannot be resolved using scanner data, a solution to the latter is to establish formal and
legally binding contracts with private sector providers of scanner data.
While this example is noteworthy in demonstrating how official statistics utilize
private sector big food data, its application in developing countries may be limited due to
low shares of food sales in supermarket chains that electronically scan bar codes.
3.2 Web-scraped price data, and the Billion Prices Project at MIT
The Billion Prices Project (BPP) at MIT is an example of the use of web scraping of
online prices to nowcast the CPI. This project, originally an academic initiative at MIT,
collects online prices for millions of items sold by a large number of retailers to produce
real-time national inflation indexes for 22 countries, as well as a global inflation index. The
indices are sold through the BPP's private sector partner, PriceStats. To protect this
bottom line, only data for the United States and Argentina are made publicly available for
free, and that with a 10-day lag relative to the access available to paying clients.
Some of the advantages of this approach include low costs and response burden,
greater timeliness and frequency, more detailed commodity prices collected, coverage of a
large number of countries, including developing countries, and the provision of (near)
real-time inflation measures. Relative to scanner data, the ability to include developing
countries in producing comparable inflation measures is an important advantage (Cavallo,
2013).
For purposes of nowcasting, James Surowiecki wrote in the New Yorker in 2011:
"...after Lehman Brothers went under, in September, 2008, the project's data showed that
businesses started cutting prices almost immediately, which suggested that demand had
collapsed. The government's numbers, by contrast, didn't show this deflationary pressure
until that November. This year, there's been a mild uptick in annual inflation, and again the
BPP detected the new trend before the Consumer Price Index did. That kind of early
heads-up could help governments make more timely decisions."
As a disadvantage, the indices deviate from internationally recommended
practices in that they do not use expenditure weights or the Jevons index for elementary
aggregates, and furthermore, prices are collected in some countries for a very limited number of
retailers and/or cities (e.g. 1 retailer in Argentina in 2012) (Cavallo, 2012). The bigger
limitation, for purposes of this paper, is the absence of food price sub-indices, which in
developing countries may reflect the fact that most food prices may not be advertised
online.
3.3 Crowdsourced mobile app price data collection
A third big food data source comes from data collected using mobile apps in hand-held
devices, such as cell phones. For NSOs that equip their trained enumerators with this
technology, it becomes analogous to computer-assisted personal interview (CAPI)
applications, already widely used by NSOs in data collection, so little more need be said.
The low cost of this technology, however, has also led to its use by non-NSO
government departments and private sector firms to collect and publish early
warning/market information on food prices. The absence of international/statistical
guidelines, however, limits the value of this data source in providing comparable
cross-country data.
To expand the value of this tool beyond the usual CAPI applications, organizations
have turned to crowdsourcing as a way of collecting larger amounts of data, with
crowdsourcing defined by Wikipedia as "the process of obtaining needed services, ideas,
or content by soliciting contributions from a large group of people"; it "combines the efforts
of numerous self-identified volunteers or part-time workers". Organizations, both private
and public, who use this approach to collect food price data have two options. They can
use pure crowdsourcing, where the crowd determines what foods and markets and
retailers to cover. In this case, the use of this data would not be appropriate for food CPI
compilation, though it could well monitor food security issues. Alternatively, an
organization can allocate its price data collection across the crowd to pinpoint specific
markets, outlets and commodities, in which case, the approach may approximate the
collection methodology most countries use to obtain food CPI data. This latter approach
is taken by Premise, though as they correctly point out, this allocation of data requirements
is typically not seen as crowdsourcing.
4. Public versus private food price statistics for Brazil
4.1 The IBGE and official Brazilian food price data
Brazil's national statistics office (NSO), the Instituto Brasileiro de Geografia e
Estatística (IBGE), created in 1937, is the main provider of official Brazilian statistics,
including its CPI and food CPI subcomponent. The IBGE follows the CPI Manual
guidelines to produce monthly Laspeyres-type CPIs and food CPIs, and like most NSOs,
adheres to the ten Fundamental Principles of Official Statistics, established by the United
Nations Statistical Commission in 1994
(http://unstats.un.org/unsd/methods/statorg/default.htm). These principles include:
relevance, impartiality and equal access to statistics; professionalism and accountability in
the use and reporting of methods and procedures for the collection, processing, storage
and presentation of data; choice over the source of data based on quality, timeliness, costs
and respondent burden; and international coordination and cooperation.
The IBGE produces several measures of the consumer price index, which vary based
on locations and households covered. The Índice Nacional de Preços ao Consumidor Amplo
(IPCA), used for this analysis, covers ten key metropolitan areas and two municipalities:
Belém, Fortaleza, Recife, Salvador, Belo Horizonte, Rio de Janeiro, São Paulo, Curitiba,
Vitória and Porto Alegre, Brasília, and the municipalities of Goiânia and Campo Grande.
The IPCA includes "families dwelling in these areas with monthly income from any source,
ranging from 1 (one) to 40 (forty) minimum wages" (from the IBGE website). This
contrasts with another of the IBGE's consumer price index measures, which covers the
same geographic areas, but includes only families with monthly income ranging from
1 (one) to 5 (five) minimum wages and whose head of household is paid a salary for their
main activity. Data collection for the IPCA occurs from day 1 to day 30 of the reference
month.
Food CPI subcomponents include meats, fruits, vegetables, fish, fats, beverages,
herbs, cereals, processed meats and fish, poultry & eggs, dairy, bread, flours, roots, and
sugars. The overall CPI and the food CPI are also published for each of the 12 urban
areas covered.
Because the private sector data compared looks at only those foods purchased for
food preparation at home, the CPI subcomponent of IPCA used for this analysis is the
"Alimentação no domicílio", or "food at home", subcomponent of the food CPI. This is one
of the two key groupings of the Brazilian food CPI, with the other referring to food prepared
and/or eaten outside the home.
4.2 Premise and its food price data
Premise, a recently established San Francisco-based IT firm, collects and compiles
food price statistics for Brazil, as well as Argentina, China, India, and the United States,
with plans to expand into Africa, starting with Nigeria and Ghana. Its data collection
methods include crowdsourced mobile data collection, combined with web-scraped
prices. Unique in its approach is the fact that it attempts to follow internationally accepted
practices in obtaining price data and calculating an index to approximate the food CPI, using
official statistics for expenditure weights, adjusted through local analysis of more current
food expenditure habits.
For its data collection, Premise has established approaches fairly similar to those used
by NSOs to obtain food prices. Both its online data and offline data rely on local experts to
identify the key outlets (internet domains or retail outlets) and the main food items, similar
to the judgmental sampling used by NSOs. These experts are also used to validate the
data collected. For its online data, the Premise web crawler collects and records pre-tax,
pre-shipping prices. For its offline data, prices for food items of interest are
"crowdsourced" using mobile apps downloaded on smartphones, though in reality, the
allocation of work renders their approach similar to a Computer-Assisted Personal
Interview (CAPI), in that its data collection strategy makes the Premise mobile app a
CAPI-type application.
For its offline sampling strategy, Premise uses multi-stage sampling to obtain prices
for a minimum of 5 cities in each country for a set of key food products. Each city is
divided into coverage areas, with elementary strata defined by store type (by size and
chain) and food products. Field workers that form the "crowd" are assigned to a coverage
area, and encouraged to collect prices from as many strata as possible. These field
workers also submit metadata and photographs for the priced items. Their incentive for
data collection and quality is the fee they receive per price quote that passes quality
control.
In reality, the approach is not strictly crowdsourcing, as field workers, usually
university students, are screened and recruited, obtain field training on both the mobile app
as well as the data and metadata required, and are assigned items and locations on which
to collect food prices, metadata and product photos. Furthermore, Premise also incentivizes
shopkeepers to allow this price data collection by providing them with reports on prices of
goods in similar stores in the area.
Combined with its underlying sampling strategy, Premise's approach could be
viewed as similar to traditional NSO price data collection strategies where interviewers are
hired on contract, CAPI applications are used for data capture and quality control, and
well-designed sampling strategies determine the market, outlet and product for which prices are
collected.
In its data processing and index compilation, Premise normalizes product prices for
size and quantity, classifies products into categories, performs outlier detection, ensures a
minimum sample size in constructing Jevons-type price relatives for its food
subcomponent, compiles its aggregate country-level Food Staples Index (FSI) and
subcomponent indices using a Laspeyres-type index, and applies official NSO
expenditure weights in the FSI.
As the company and its data series are relatively new, dating back to 2013, Premise
currently computes a 7-day and 30-day inflation measure from its FSI, and publishes
indices for its subcomponents, such as processed meat, fruit, vegetables, oils and fats, and
dairy and eggs. However, it also publishes price levels of key individual food commodities
at city level for purposes of food security monitoring, such as potato and green pepper
prices in Brazil, or wheat bread and vegetable prices in Buenos Aires.
Premise data and metadata are available to paying clients or can be requested for a
trial period.
The FSI price index subcomponents include meat, fruit, vegetables, fish & seafood,
oils & fats, beverages, herbs, processed grains, processed meats, dairy & eggs, flours,
roots, sugars, grains and nuts, processed fruits and vegetables, sweets, and other snacks.
4.3 A comparison of IBGE and Premise food consumer price indexes
So how does the Premise FSI compare with the food-at-home component of the
IBGE's IPCA? This can be assessed against the typical dimensions of a statistical quality
assurance framework: relevance, timeliness, accuracy, accessibility, interpretability and
coherence. These multi-dimensional elements of quality assurance cover the notion of
"fitness for use", and hence, are typically interpreted from the perspective of the user
(Statistics Canada, 2002).
In terms of relevance, the uses and users of IBGE and Premise statistics potentially
overlap, but are not identical. The IBGE food CPI is largely used by government to inform
policy, though it is also used by the private sector for generic analysis of food price
changes as well as by food and agriculture researchers. Given the types of food
subcomponents available (see Table 2), there are likely other users and uses of which the
authors are unaware. Premise data users include private sector firms, including
international banks and hedge funds, interested in monitoring real-time price movements
for purposes of corporate, lending and investment decisions, though Premise is also
looking to expand users to include governments interested in food prices to monitor food
security. The key advantage to IBGE data arises from its indices available for each of its
12 urban areas covered.
The key advantage to Premise arises in terms of food security monitoring from the
fact that it provides daily indices in near real time, as well as prices available for some
individual products and their comparison locally and historically. The latter also suggests a
potential use by consumers in identifying lower prices, albeit with a day or two lag, though
it is unclear how Premise would obtain revenues from providing this data publicly,
particularly given that it would likely lose revenues from clients who currently purchase it.
The key advantage from the IBGE arises from its longer time series, which enables
more thorough analysis of historical trends. This also points to a risk created in using
private sector food price indices: any risk to continuity in their data collection and index
publication will reduce the relevance of their data, given the longitudinal nature of the CPI.
Timeliness leans in favor of Premise, both because the FSI is a daily index, and
because it can make available a monthly index similar to the food CPI 10 days before
month end and up to 25 days before the official IBGE release. Indeed, the later analysis
will show that the first 7-day average of Premise's daily FSI, available before mid-month,
does a reasonably good job of predicting or nowcasting the monthly food-at-home CPI.
Accessibility clearly favors the IBGE, which ensures impartial access to all, as per
the Fundamental Principles of Official Statistics, while Premise provides access, as
expected, mainly to paying clients.
In terms of accuracy, the methodology and strategy followed by Premise appears to
be as rigorous as that of an NSO, but as a private firm, its data are not subject to the
international scrutiny or government auditing and quality assurance faced by an NSO.
Both indices appear to cover similar food commodity groups, and both the IBGE and Premise
provide both an overall food-at-home price index and indices for key food subcomponents.
If both the IBGE and Premise apply the same expenditure weights, it is expected that
household coverage will also be similar. Premise updates these weights with recent and
local analysis of current food expenditure patterns, though costs may render this less
robust in comparison to official household expenditure surveys. IPCA, on the other hand,
covers a broader geographic area than Premise, including 12 urban areas compared to 5
cities covered by Premise, and produces food CPIs for these areas, which may better
inform local decision making.
Some degree of assessment of accuracy may be informed by a statistical
comparison of the FSI and its subcomponents against the IBGE's food-at-home CPI/IPCA
and its subcomponents. This assessment is limited to simple analysis (correlation analysis
and simple linear regression forecasts/predictions) given the limited time series in
Premise data. Please keep in mind that differences in Premise and IBGE indices can arise
from differences in sample size and selection (and hence, in sampling errors), geographic
coverage, and food product/item coverage. As a result, one cannot conclude that one set
of indices is necessarily more accurate than the other based on index comparisons and
analyses alone.
Table 1 provides the IBGE food-at-home CPI and select subcomponent food price
indices for the IPCA, with indices rebased to June 2013 to facilitate comparison. Tables 2
and 3 provide the Premise FSI and select subcomponents: Table 2 provides the daily
average index for the first 7 days, while Table 3 provides the daily average index for the
first 30 days (except for February, which contains a 28-day average). Monthly indexes
were also compiled, though are not presented, for the daily average of the first 15 days,
and the first 21 days (3 weeks), to evaluate prediction accuracy relative to lead times.
The index subcomponents were selected to enable comparison of as similar product
groupings between the IBGE and Premise as possible. Month-over-month food price
inflation was calculated for the four Premise series, to compare Premise data's predictive
power relative to the monthly IBGE food price inflation series.
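
The construction of the four monthly series can be sketched as follows. This is a minimal illustration of the two steps described above (averaging the first k daily index values of a month, then taking month-over-month percentage changes), not Premise's actual processing code. The function names and the daily values are hypothetical.

```python
def first_k_day_average(daily_index, k):
    """Average of the first k daily index values of a month (calendar order).

    If the month has fewer than k observations (e.g. February with k=30),
    all available days are averaged.
    """
    window = daily_index[:k]
    return sum(window) / len(window)


def month_over_month(monthly_levels):
    """Month-over-month inflation, in percent, from consecutive monthly levels."""
    return [
        (cur / prev - 1.0) * 100.0
        for prev, cur in zip(monthly_levels, monthly_levels[1:])
    ]


# Hypothetical daily index values for two consecutive (shortened) months.
june = [100.0, 100.2, 100.1, 100.4, 100.3, 100.5, 100.6, 100.9, 101.0]
july = [101.2, 101.1, 101.4, 101.6, 101.5, 101.8, 101.9, 102.0, 102.3]

seven_day_levels = [first_k_day_average(m, 7) for m in (june, july)]
print(month_over_month(seven_day_levels))
```

Using a shorter window makes the monthly figure available earlier but averages over fewer price quotes, which is the lead-time trade-off the comparison is designed to evaluate.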
Chart 1 plots the four series based on the Premise daily FSI (7-day average, 15-day
average, 21-day average, 30-day average) against the Brazilian IBGE food-at-home CPI,
with all indices rebased to June 2013. All four Premise series seem to track the
official Brazilian statistic well, with similar trends except in July and August 2014, when Premise
data shows increasing prices while official Brazilian data shows the reverse.
Chart 2, which shows month-over-month inflation for the five series, confirms these
results. Again, except for July and August 2014, official data and the Premise series all
seem to have similar movements. Between September 2013 and January 2014, it almost
appears that official data lag the Premise series. This interpretation would not be valid,
however, given that both datasets set out to measure the same phenomenon. Most
problematic, however, is the July and August 2014 data, in which official statistics show a
fall in food prices, while Premise data shows an increase. Though it may be tempting to
conclude that the private sector data is faulty, the World Cup event in Brazil, and the news
of its upwards pressure on inflation, might lead one to speculate whether, instead, it was
Premise that got it right.
To evaluate Premise data in terms of its ability to nowcast or predict official
food-at-home price inflation, a simple linear regression model was constructed for each of
the four Premise series (7-day average, 15-day average, 21-day average, 30-day
average), with each regression using Premise data as the explanatory variable. Different
regression models were constructed, for each series, to predict food-at-home price inflation
from April to August 2014. April inflation was predicted using June 2013 to March 2014
monthly Premise indices; May inflation using June 2013-April 2014 data; June inflation
using June 2013-May 2014 data; July inflation using June 2013-June 2014 data; and
August inflation using June 2013-July 2014 data.
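
The expanding-window design described above can be sketched as below. The sketch rests on assumptions that the text does not spell out: that each model is an ordinary least squares fit of official inflation on Premise inflation over the months up to, but excluding, the target month, and that the Premise value for the target month is then plugged into the fitted line. The function names and the data are hypothetical.

```python
def ols_fit(x, y):
    """Simple linear regression y = a + b*x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def expanding_window_nowcasts(premise, official, first_target):
    """Nowcast official[t] for each t >= first_target.

    The model for month t is estimated on months 0..t-1 only (assumption),
    then evaluated at the Premise value observed for month t.
    """
    nowcasts = []
    for t in range(first_target, len(premise)):
        intercept, slope = ols_fit(premise[:t], official[:t])
        nowcasts.append(intercept + slope * premise[t])
    return nowcasts


# Hypothetical month-over-month inflation rates (percent), oldest first.
premise_mom = [0.4, 0.9, 0.2, -0.1, 0.6, 1.1, 0.8, 0.3]
official_mom = [0.5, 0.8, 0.3, 0.0, 0.5, 1.0, 0.9, 0.2]

# Nowcast the last three months, each from all earlier months.
print(expanding_window_nowcasts(premise_mom, official_mom, first_target=5))
```

Re-estimating the model for every target month keeps each nowcast free of information from later months, at the cost of very small estimation samples in the early windows.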
To evaluate the predictive power of each of the series, a Mean Absolute Prediction
Error (MAPE) was computed using the following formula:
\[
MAPE = \frac{1}{n} \sum_{t=1}^{n} \mathrm{abs}\left(\frac{A_t - F_t}{A_t}\right)
\]
where $A_t$ is the actual value from IBGE data; $F_t$ is the forecast based on a simple linear
regression using the Premise FSI as the independent variable; and $n = 5$ is the number of
months forecasted. Table 4 provides the results of the predictions/nowcasts, the MAPE,
and the lead times for each of the four monthly Premise series.
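
The error measure can be computed in a few lines. This is a minimal sketch that assumes the MAPE is the average, over the forecast months, of the absolute relative error; the input values are hypothetical and are not the figures in Table 4.

```python
def mape(actual, forecast):
    """Mean absolute prediction error, as a fraction of the actual values.

    Each term is abs((A_t - F_t) / A_t), so actual values of zero are not
    allowed and actual values close to zero inflate the measure.
    """
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return sum(errors) / len(errors)


# Hypothetical monthly inflation rates (percent): actual versus nowcast.
actual = [0.8, 0.5, 0.4, -0.3, -0.2]
nowcast = [0.7, 0.6, 0.4, 0.2, 0.3]

print(f"MAPE: {mape(actual, nowcast):.1%}")
```

Because the error is relative to the actual value, a forecast with the wrong sign on a small inflation rate contributes a term well above 100%, which is worth keeping in mind when reading MAPE values for months with near-zero inflation.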
None of the values of the MAPEs are particularly compelling, with values in the 95%
to 97% range based on the forecasts for the five months from April to August 2014,
inclusive. More importantly, the signs predicted for July and August are incorrect.
Interestingly, the 7-day series, produced about 25 days before the IBGE publishes the
IPCA, has the same MAPE as the 15-day and 30-day series.
Keeping in mind questions about the July and August data, MAPEs were also
calculated for only April through June of 2014, with values ranging from 15% for the 7-day
series to 7% for the 30-day series. These predictions have a much more acceptable
MAPE, the signs are all correct, and, as in the case of most nowcasts, the addition of
more information in the Premise series reduces the MAPE. The 15-day series is the most
attractive in its trade-off between lead time, of 17 days before official data publication, and
MAPE. Waiting to obtain the full month of data (the 30-day series), often published around
the same day or a day or two ahead of the official series, only gains a marginal advantage
in prediction power. This 15-day series is also compelling, in that many NSOs compile
their CPIs and their subcomponents based on data collected during the first 15 days of the
month.
21
This analysis suggests there is some predictive power in the use of Premise's big food price data. If indeed the official food CPI statistics were incorrect in measuring food-at-home price inflation in July and August of 2014, the analysis suggests that Premise data would not only provide a valuable and timely nowcast of food price inflation, but could also help validate official statistics.
5. Conclusion: Public-private sector comparative advantages and complementarities
The descriptions and analyses above lead back to the original questions, particularly in the context of food price statistics: Is there a role for big data in official statistics? And what is the role of private sector producers of big data vis-à-vis official statistics? How do these roles vary by type of data, producer, or use?
In the case of food CPIs, there is no question that there has been and is a role for big data in official statistics, as the use of scanner data in compiling official CPIs demonstrates. This role varies by country, as some NSOs use scanner data directly to compile their food CPIs, while others use it to validate their CPIs, and many do not use it at all.
Statistical organizations have also begun to adopt other big data tools described in this paper, such as web scraping and mobile tools. Eurostat, for example, is developing a generic tool to collect web-scraped prices to improve its CPI, and Italy's Istat is experimenting with web scraping and text mining for its Survey on information and communication technology in enterprises. Eurostat, New Zealand and Slovenia obtain microdata on mobile phone call/text times and positions to enhance their population and migration statistics, with data-sharing legislation in place in Slovenia to obtain this data for free from its private sector providers (UNECE website, Big Data Home).
In adopting tools such as web scrapers and mobile apps, NSOs need to consider technical, legislative and security issues, such as stability of the application, reliability of mobile networks, security of confidential information transmitted, and the personal security of interviewers. This leads to the final two questions on the role of the private sector as big data producers, and the varying types of roles.
Again, in the case of scanner data, NSOs already obtain scanner data from private supermarket outlets, for which the key risks arising from IT glitches and delays in data transmission are managed with legal contracts. Furthermore, since the data is collected for a different purpose than an NSO's use, namely, to inform market research, wholesale food purchases and marketing campaigns, and NSOs publish aggregate data, most producers of scanner data do not compromise their business line by sharing this data with NSOs.
For some of the newer private producers of big food data, such as Premise and PriceStats, the considerations are quite different, and stem from their business model, in which the data itself is a key product. Such firms earn revenues primarily from the data, statistics and analysis they provide to paying clients, who receive this intelligence in advance of their competitors or the public at large. The comparative advantage of private sector production in using crowd-sourced and/or web-scraped prices likely arises from the lower per-unit costs of data collection incurred by specialist firms, as well as their flexibility in modifying and improving their data collection and production processes over time.
Such private firms normally focus on a narrow area of statistics, for which they recruit and train staff that specialize in one subject-matter domain and one set of IT platforms (contrasted with NSOs, where efficiency gains accrue from generic IT platforms and knowledge across multiple subject-matter areas). The value these private firms bring their clients is in producing (near) real-time, frequent and detailed data with restricted access. Their clients, in turn, benefit from this high-frequency, real-time proprietary information, which enables them to make decisions ahead of their competitors, or with more detailed product and geographic information than is available from official statistics. Not surprisingly, some of the key clients of Premise and PriceStats include hedge funds and other investment firms, who rely on this type of just-in-time detailed intelligence to inform profitable business decisions.
This business model underlying firms like Premise and PriceStats circumscribes the type of public-private partnership possible. NSOs are required by law to publicly provide statistics on their country, economy and peoples, and the ten fundamental principles of official statistics require impartial access and transparency in the sharing of the underlying data collection and compilation methodology (which may be viewed as a trade secret in a private firm). The direct use of such private sector big data in compiling official statistics would therefore likely run into legislative and political problems. Since the CPI has the ability to move financial markets, the knowledge that some firms have advance access to even part of the official CPI would likely create, at a minimum, adverse public reaction. On the other hand, if an NSO could republish the CPI data purchased from a private firm, this would undermine the profitability of such firms, which make their income from selling data. Finally, while most NSOs have some financial stability, given their legislative mandates and tax funding, private firms lack this financial security. In short, if a private sector data producer goes bust, we say bye-bye to the data they produce. This is particularly problematic for the CPI and food CPI, which rely on a relatively long monthly series of longitudinal data.
These differences in mandates and financial security do suggest some alternatives and complementarities. At one extreme, NSOs can and have adopted the big data tools pioneered by private sector firms, including the development of web-scraping technologies and mobile apps. In at least one country, an NSO has bought out the private sector pioneer. This does lead to a separate set of questions regarding public sector crowding out of private sector firms.
In the middle of the spectrum, NSOs can use, sometimes at a purchase price, private sector big food data for validation or quality assurance of official statistics. This has been the case in some countries with respect to scanner data. The Premise-based nowcasts suggest a similar role with respect to this company's data. Similarly, policy departments can and do use private sector big data to nowcast official statistics, such as GDP growth and employment statistics, with the analysis of this paper suggesting this may be extended to nowcasting official food CPIs. Furthermore, the complementary and timely nature of private sector big food data suggests a role for central banks, finance departments and ministries of agriculture to use this data source to monitor food security, and provide an early warning of key turning points. There is already precedent for such use, as many central banks and policy departments purchase private sector economic forecasts as inputs into their fiscal and monetary policy decisions and to nowcast key official statistics.
At the other extreme lies the purchase of private sector big food data for direct use in compiling official statistics, though differences in business and institutional models, mandates and legal obligations provide the key factors in determining what types of big data and private sector providers can serve this function.
6. REFERENCES
[1] Michaela Agafitei and Sorina Vaju, Addressing the Challenge of Producing European Comparable Data using Administrative Data. Presented at the Seminar on Statistical Data Collection, 25-27 September 2013. Geneva: United Nations Economic Commission for Europe, 2013.
[2] Jose Ramon G. Albert, Big Data: Big Threat or Big Opportunity for Official Statistics? Published by Paris21, http://www.paris21.org/newsletter/fall2013/bigdatadrjoseramonalbert, 2013.
[3] Pedro Less Andrade et al., From Big Data to Big Social and Economic Opportunities: Which Policies Will Lead to Leveraging Data-Driven Innovation's Potential? in: The Global Information Technology Report 2014: Rewards and Risks of Big Data, 2014, pp. 81-86.
[4] Nii Ayi Armah, Big Data Analysis: The Next Frontier, in: Bank of Canada Review, Summer 2013, pp. 32-39.
[5] N. Askitas and K. F. Zimmermann, Google Econometrics and Unemployment Forecasting. Applied Economics Quarterly, 55(2), 2009, pp. 107-20.
[6] Marta Banbura et al., Nowcasting, European Central Bank Working Paper Series No 1275, December 2010.
[7] Benat Bilbao-Osorio et al., The Global Information Technology Report 2014: Rewards and Risks of Big Data, Geneva, 2014.
[8] Danah Boyd and Kate Crawford, Six Provocations for Big Data, in: A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society, September 2011.
[9] Alberto Cavallo, Online and official price indexes: Measuring Argentina's inflation, in: Journal of Monetary Economics, 2012.
[10] Alberto Cavallo, Scraped Data and Sticky Prices, MIT Sloan Working Paper, May 2013.
[11] H. Choi and H. Varian, Predicting the Present with Google Trends. Google Inc, April 2009a.
[12] H. Choi and H. Varian, Predicting Initial Claims for Unemployment Benefits. Google Inc, July 2009b.
[13] P. Daas and M. van der Loo, Big Data (and Official Statistics), presented at the Meeting on the Management of Statistical Information Systems, Paris and Bangkok, 23-25 April 2013.
[14] Robert C. Feenstra and Matthew D. Shapiro, Eds, Scanner Data and Price Indexes, University of Chicago Press, 2003.
[15] John W. Galbraith and Greg Tkacz, Nowcasting GDP: Electronic Payments, Data Vintages and the Timing of Data Releases, CIRANO working paper, Montreal, 2013.
[16] International Labour Organization (ILO), Consumer Price Index Manual: Theory and Practice, 2004.
[17] International Telecommunication Union, Measuring the Information Society, Geneva, 2013.
[18] Adam Jacobs, The Pathologies of Big Data, in: Communications of the ACM, 52(8), August 2009.
[19] Sara James and Richard Campbell, Obtaining Scanner Data Project, presented to the Workshop on Scanner Data for HICP, Stockholm, 8 June 2011.
[20] Martin Karlberg and Michail Skaliotis, Big Data for Official Statistics - Strategies and Some Initial European Applications, presented to the Conference of European Statisticians, Geneva, 25-27 September 2013.
[21] Irfan Khan, Nowcasting: Big data predicts the present, in: ITWorld, Oct 2012.
[22] Robert Kirkpatrick, Beyond Targeted Ads: Big Data for a Better World, presented at the O'Reilly Strata Conference, United Nations Global Pulse, Oct 2012.
[23] O. Lamont, Do Shortages Cause Inflation? in: Reducing Inflation: Motivation and Strategy, C. D. Romer and D. H. Romer, eds., University of Chicago Press, Chicago, 1997, pp. 281-306.
[24] Clifford Lynch, Big Data: How do your data grow? in: Nature 455, 3 September 2008, pp. 28-29.
[25] James Manyika et al., Big Data: The next frontier for innovation, competition and productivity, McKinsey & Company, San Francisco, 2011.
[26] N. McLaren and R. Shanbhogue, Using Internet Search Data as Economic Indicators, in: Bank of England Quarterly Bulletin Q2, 2011, pp. 134-140.
[27] R. Müller et al., Recent Developments in the Swiss CPI: Scanner Data, Telecommunications and Health Price Collection, presented to the 9th meeting of the Ottawa Group Meeting on Prices, London, 2006, pp. 14-16.
[28] National Research Council of the National Academies, Improving Data to Analyze Food and Nutrition Policies, Washington, 2005.
[29] OECD, Exploring data-driven Innovation as a new source of growth: Mapping the policy issues raised by Big Data, Paris, June 2013.
[30] Steve Pierson, Big Data: A Perspective from the BLS, in: AMSTAT NEWS, 1 January 2013.
[31] J. Rodriguez and F. Haraldsen, The Use of Scanner Data in the Norwegian CPI: The New Index for Food and Non-Alcoholic Beverages, in: Economic Survey 4, 2006, pp. 21-28.
[32] Hillary Sanders et al., The Relationship between Premise Price Data & Official Government Releases,
[33] Monica Scannapieco et al., Placing Big Data in Official Statistics: A Big Challenge? Presented to the New Techniques and Technologies for Statistics conference, United Nations Economic Commission for Africa, 2013.
[34] Mick Silver and Saeed Heravi, Scanner Data and the Measurement of Inflation, in: The Economic Journal 111, June 2001, pp. 383-404.
[35] Michail Skaliotis, The role of official statistics in a big data ecosystem: what will change? Presented to the European Central Bank Workshop on Big Data for Forecasting and Statistics, Frankfurt, 2014.
[36] T. Suhoy, Query Indices and a 2008 Downturn: Israeli Data, Bank of Israel Discussion Paper No. 200906, 2009.
[37] James Surowiecki, A Billion Prices Now, in: The New Yorker, May 30, 2011.
[38] Statistics Canada, Statistics Canada's Quality Assurance Framework, 2002.
[39] Statistics Sweden, Issues in the use of scanner data, (undated document).
[40] Jack E. Triplett, Should the Cost of Living Index Provide the Conceptual Framework for a Consumer Price Index? Brookings Institution, 2000.
[41] United Nations Economic Commission for Europe (UNECE), Big Data Home, UNECE Statistics Wikis, 2014.
[42] United Nations Global Pulse, Big Data for Development: A Primer, June 2013.
[43] United Nations Global Pulse, Mobile Phone Network Data for Development, Oct 2013.
[44] United Nations Global Pulse, Big Data for Development: Challenges and Opportunities, May 2012.
[45] United Nations Statistics Division (UNSD), Fundamental Principles of Official Statistics, 1994-2013.
[46] United Nations Statistical Commission, Big data and modernization of statistical systems. Report of the Secretary-General presented at the Forty-fifth session, 4-7 March 2014.
[47] Heymerik van der Grient and Jan de Haan, The use of supermarket scanner data in the Dutch CPI, Statistics Netherlands, 2010.
[48] Joe Weisenthal, Is MIT's Billion Prices Project Warning of a large spike up in the CPI? in: Business Insider, 25 April 2011.
[49] Wikibon, A Comprehensive List of Big Data Statistics. Wikibon Blog, 1 Aug 2012. Available at http://wikibon.org/blog/bigdatastatistics/.
[50] World Bank, Food Price Watch, May 2014.
[51] L. Wu and E. Brynjolfsson, The Future of Prediction: How Google Searches Foreshadow Housing Prices and Sales. Sloan School of Management, Massachusetts Institute of Technology, 2009.
Table 1: Official Brazilian food price statistics - the food CPI and select subcomponents

Month     Food CPI  Meats   Fruits  Vegetables  Fish    Fats    Drinks  Herbs
Jan 2013  96.32     104.24  92.50   78.84       103.38  107.76  99.31   95.70
Feb 2013  97.89     104.10  92.90   89.62       102.59  107.70  99.44   95.66
Mar 2013  99.22     102.41  97.09   97.66       104.41  107.03  100.30  97.90
Apr 2013  100.31    100.58  100.23  107.74      105.56  104.84  100.97  99.37
May 2013  100.36    99.87   101.25  105.10      102.60  101.99  100.58  99.87
Jun 2013  100.00    100.00  100.00  100.00      100.00  100.00  100.00  100.00
Jul 2013  99.27     100.08  97.40   86.96       99.85   98.25   100.46  99.90
Aug 2013  98.93     100.23  97.07   77.40       100.26  97.07   100.90  100.34
Sep 2013  98.90     101.11  99.88   69.11       100.40  96.89   101.25  99.10
Oct 2013  99.96     104.32  101.88  70.87       101.44  96.37   102.33  97.36
Nov 2013  100.37    105.28  102.46  72.81       104.19  96.55   102.85  95.92
Dec 2013  101.16    107.73  107.09  74.97       106.78  96.87   103.24  96.90
Jan 2014  102.07    111.04  110.77  74.34       113.01  97.45   104.52  97.76
Feb 2014  102.30    111.04  113.89  75.38       112.47  97.37   105.06  97.67
Mar 2014  104.78    113.54  116.12  91.93       115.91  99.26   105.37  98.03
Apr 2014  106.38    115.61  116.78  97.99       119.26  101.85  106.05  98.96
May 2014  106.81    116.08  114.22  98.73       118.81  102.98  106.67  100.93
Jun 2014  106.17    116.55  110.19  89.77       115.91  103.15  107.44  101.88
Jul 2014  105.63    116.69  109.54  77.46       113.58  102.26  108.11  103.07
Aug 2014  104.99    117.19  107.40  71.64       112.89  99.40   108.22  103.45

Source: Instituto Brasileiro de Geografia e Estatística. Indexes rebased to June 2013.
Table 2: Premise food price statistics - the Food Staples Index and select subcomponents, first 7-day average of daily indices

Month     Food Staples Index  Meat    Fruit   Vegetables  Fish & Seafood  Oils & Fats  Beverages  Herbs, spices & condiments
May 2013  101.49              100.28  97.35   104.41      99.59           98.48        99.36      100.38
Jun 2013  100.00              100.00  100.00  100.00      100.00          100.00       100.00     100.00
Jul 2013  97.85               98.19   97.20   93.53       98.75           99.17        100.25     100.18
Aug 2013  97.69               100.26  96.29   93.00       98.76           97.38        101.69     98.64
Sep 2013  99.69               102.65  100.83  92.11       100.13          95.92        103.11     99.10
Oct 2013  100.34              104.76  102.31  89.42       100.71          94.88        105.22     99.38
Nov 2013  101.68              106.64  102.07  90.00       101.03          95.78        104.58     98.96
Dec 2013  102.96              108.45  105.46  91.54       100.77          95.41        105.51     98.18
Jan 2014  103.62              109.16  106.65  94.36       104.19          95.83        105.76     98.72
Feb 2014  102.54              107.73  104.96  92.07       102.11          95.43        105.30     98.58
Mar 2014  103.91              108.61  106.11  96.11       104.70          96.32        105.11     98.49
Apr 2014  105.75              109.73  105.68  98.85       106.06          97.53        106.82     99.94
May 2014  105.96              109.75  103.99  99.37       108.20          97.76        107.24     101.79
Jun 2014  104.87              108.65  100.35  95.54       107.53          97.57        107.17     102.55
Jul 2014  105.51              109.34  100.29  92.93       107.32          97.76        107.16     102.95
Aug 2014  107.17              108.77  101.28  92.75       107.12          97.83        108.06     104.15

Source: Premise. Indexes rebased to June 2013.
Table 3: Premise food price statistics - the Food Staples Index and select subcomponents, 30-day average of daily indices

Month     Food Staples Index  Meat    Fruit   Vegetables  Fish & Seafood  Oils & Fats  Beverages  Herbs, spices & condiments
May 2013  102.72              101.55  99.86   105.94      104.02          100.27       99.88      100.64
Jun 2013  100.00              100.00  100.00  100.00      100.00          100.00       100.00     100.00
Jul 2013  98.96               99.78   97.80   96.32       99.53           98.78        100.57     100.13
Aug 2013  99.38               101.85  99.37   95.90       99.77           97.53        101.30     98.97
Sep 2013  100.52              104.40  102.70  92.98       100.70          96.38        103.47     99.38
Oct 2013  101.83              107.00  102.68  92.63       101.29          95.64        104.76     98.92
Nov 2013  102.86              108.18  104.17  93.03       101.05          96.60        104.28     98.51
Dec 2013  104.14              109.94  106.78  94.92       102.44          96.29        105.38     98.19
Jan 2014  104.50              110.50  107.09  96.25       105.03          96.48        105.16     98.58
Feb 2014  103.87              109.40  107.03  95.40       103.75          96.15        104.65     98.46
Mar 2014  105.57              110.02  106.95  100.78      105.82          97.62        105.43     99.25
Apr 2014  106.76              110.89  106.38  101.06      107.97          98.46        106.68     100.38
May 2014  107.03              110.70  104.20  101.53      108.96          98.51        106.85     102.15
Jun 2014  106.08              109.68  101.14  98.08       107.87          98.57        106.87     102.87
Jul 2014  106.71              109.91  101.01  94.26       108.60          98.42        106.25     102.99
Aug 2014  108.60              111.83  102.90  96.71       108.37          98.27        107.91     104.17

Source: Premise. Indexes rebased to June 2013.
Chart 1: Brazil's food-at-home CPI versus the Premise FSI, Jan 2013 to Aug 2014
Data Source: IBGE and Premise

Chart 2: Consumer Food Price Inflation, Brazil's official statistics versus the Premise Food Staples Index, Jan 2013 to Aug 2014
Table 4: Using Premise data to predict consumer food price inflation: Mean Absolute Prediction Errors (MAPE) and Lead Times

The 7-day to 30-day columns give predicted food inflation using daily-average Premise indices (Ft); the last column gives the IBGE value (At).

Month             7-day    15-day   21-day   30-day   IBGE value (At)
April             0.81     0.60     0.55     0.50     1.52
May               0.09     0.11     0.17     0.12     0.41
June              0.44     0.47     0.49     0.43     0.60
July              0.28     0.34     0.32     0.30     0.51
August            0.69     0.69     0.83     0.88     0.61
MAPE, April-Aug   0.97     0.97     0.95     0.97
MAPE, April-June  0.15     0.08     0.08     0.07
Lead Time         25 days  17 days  10 days  2 days

Source data: Premise and IBGE.