Now-Casting Food Consumer Price Indexes With Big Data Public-Private Complementarities

Nowcasting Food Consumer Price Indexes with Big Data: Public-Private Complementarities

Sangita Dubey and Pietro Gennari
Food and Agriculture Organization of the United Nations
Statistics Division, Viale delle Terme di Caracalla, 00153 Rome, Italy
Telephone (+39) 0657055890
Sangita.Dubey@fao.org
Telephone (+39) 06570553599
ESSDirector@fao.org

Abstract: Policymakers, particularly central banks, rely increasingly on big data for information, or nowcasts, about the current state of the economy, where official statistics, such as GDP and unemployment rates, are available only with a significant lag. Official statisticians, however, remain hesitant about adopting or using big data because of concerns about data quality, representativity, and legal issues.
This paper presents the uses of big data in the domain of food prices, from producing official statistics to nowcasts for food security early warning. In the context of private sector data production, it reviews some big food price sources, namely supermarket scanners, web scraping, and crowdsourcing, with an illustration using Brazilian food prices. It proposes comparative advantages and complementarities of private-public production, particularly in the food security context, concluding that while data quality issues can be addressed, organizational mandates and legislative requirements create more difficult hurdles in public-private partnerships in the official use of private food price statistics.
Acknowledgement: The authors thank Joseph Reisinger and Jonathan Cross of Premise for sharing their data, analysis and methodological notes, and Franck Cachia for his statistical and analytical support.

Keywords: big data, nowcasting, food price statistics, scanner data, web scraping, crowdsourcing, consumer price index, mobile applications, food security.

Disclaimer: The views expressed are purely those of the authors and may not in any circumstances be regarded as stating an official position of the Food and Agriculture Organization of the United Nations.

1. INTRODUCTION

Big data, characterized by its three Vs of volume, velocity and variety, has opened the door to more timely, frequent, detailed and cost-effective data. This enables policymakers to obtain nowcasts, or current-period forecasts, of key economic phenomena, such as GDP growth, unemployment rates, retail sales, and consumer inflation, which better inform fiscal and monetary policy and serve as early warnings of turning points in the economy (Armah, 2013; Askitas and Zimmermann, 2009; Banbura et al., 2010; Choi and Varian, 2009a and 2009b; Galbraith and Tkacz, 2013; Khan, 2012; McLaren and Shanbhogue, 2011; Wu and Brynjolfsson, 2009). Big data also helps address the limitation that official statistics used for many policy decisions are available only with a significant time lag and lack the detail and disaggregation required.
The use of big data in nowcasting official statistics raises several important questions. Is there a role for big data in official statistics? As new private sector big data producers appear, what is their role vis-à-vis official statistics? How do these roles vary by type of data, producer, and use?
These issues and questions are being debated in both national and international statistics offices (Karlberg and Skaliotis, 2013; Pierson, 2013; United Nations Global Pulse, 2012, Oct 2013, June 2013; UNSC, 2014). Opponents point out the very valid risks in using private sector big data to produce official statistics, including concerns arising from representativity, data quality, privacy, legal and institutional mandates, and ongoing production. Supporters, on the other hand, point out the benefits of low-cost, low-burden, timely and detailed information, and suggest that the risks raised can be addressed, much as they have been for the use of administrative data.
The domain of food price statistics is particularly helpful in contributing to these debates, both because of its history in the use of big data, and because of the importance of frequent, detailed and real-time food price data in monitoring food security and providing an early warning of food insecurity. For example, the food price crisis of 2007-08 saw world food prices increase significantly, resulting in an increase in food insecurity, social unrest and political and economic instability. A World Bank report attributes food riots to sharp food price increases (World Bank, 2014). More recently, the Ebola outbreak in West Africa has seen sharp increases in food prices and food insecurity, arising from a combination of factors including travel restrictions and disruptions in agricultural production and transportation.
Though the need for high-frequency, real-time food price data is undisputed, official food price statistics are typically available monthly, and only at the end of the month or a week or two after, and rarely with the kind of detail needed for food security early warning, monitoring and policy response. Many policy departments, particularly in developing countries, have adopted big data approaches to nowcast food prices, monitor food security and inform their market information and early warning systems. At the same time, many national statistics offices (NSOs) have also started to adopt big data in compiling their official food price statistics, namely the food and non-alcoholic beverage component of the consumer price index (food CPI), though this has rarely led to more timely statistics.
This paper examines the actual and potential role and impact of big data on food price statistics, particularly big data produced by the private sector. Towards that end, the paper is structured as follows. Section 2 provides a brief overview of the collection and production of the food CPI by national statistics offices (NSOs), which is necessary to evaluate private sector big food data. Section 3 discusses three approaches in using private sector big data to compute or nowcast official food price statistics: 1) retail point-of-sale scanner data; 2) web-scraped food prices; and 3) crowdsourced mobile app data collection. Section 4 provides an example in the context of Brazil, comparing official Brazilian food CPI data with private sector data provided by the San Francisco-based IT company, Premise. Section 5 concludes with some of the public and private sector comparative advantages and complementarities in collecting and producing food CPIs and other food price statistics.

2. Food price statistics, official food CPIs and nowcasting

Food CPIs, a subcomponent of the CPI, are the most traditional food price statistics compiled by NSOs, and much of their importance derives from the importance of the CPI as an official statistic used for both public and private sector decision-making. The importance of the CPI stems from its wide uses: as a monetary policy target; to monitor inflation; to escalate social security and pension payments; to index tax thresholds; and to escalate both private and public sector wage contracts. Within NSOs, it is among their group of mission-critical statistics that receive the highest attention to data quality and timeliness, as the CPI impacts a country's interest and exchange rates, its government revenues and expenditures, and private sector wage compensation bills. And the food CPI subcomponent, as mentioned earlier, is important in its own right in order to monitor price-related food insecurity.
Given its importance, great efforts are made to continually improve the quality and comparability of the CPI within and across countries, with well-established international guidelines and methodologies, and a vast volume of theoretical and applied research. Despite this, all countries deviate from the first-best methodology in compiling CPIs, largely due to costs and the need for timely data, with implications for the quality of the food CPI subcomponent in monitoring food security. To examine this, the next subsection provides a brief overview of CPI data collection and compilation.

2.1. A brief primer on the CPI methodology: data collection to compilation

The food CPI follows the same statistical methodology as the CPI itself, with international guidelines provided in the CPI Manual: Theory and Practice (ILO, 2004). Most countries compile the CPI using a Laspeyres-type index, $P_L$, as follows:

$$P_L = \frac{\sum_{i=1}^{n} p_i^t q_i^0}{\sum_{i=1}^{n} p_i^0 q_i^0} = \sum_{i=1}^{n} \left(\frac{p_i^t}{p_i^0}\right) \frac{p_i^0 q_i^0}{\sum_{j=1}^{n} p_j^0 q_j^0} = \sum_{i=1}^{n} \left(\frac{p_i^t}{p_i^0}\right) s_i^0$$

where $n$ is the number of commodities, $i$ represents commodity $i$, $t$ is the time period, $s_i^0 = p_i^0 q_i^0 / \sum_{j=1}^{n} p_j^0 q_j^0$ is the share of expenditure on commodity $i$ in period 0, and $p_i^t / p_i^0$ is the price relative of commodity $i$ between periods 0 and $t$ (ILO, 2004). In practice, commodity $i$ refers to a set of individual products, whose individual expenditure weights are unavailable. As a result, the price relative for this set of products is itself an elementary price index, generally estimated by the geometric mean of price relatives of individual products (a Jevons index).
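The two-stage calculation described above can be sketched in a few lines of Python. This is a minimal illustration only: the commodity names, price relatives and expenditure shares are hypothetical, and the function names `jevons` and `laspeyres` are introduced here for the example rather than taken from the CPI Manual.

```python
import math

def jevons(price_relatives):
    """Elementary index: geometric mean of product-level price relatives."""
    logs = [math.log(r) for r in price_relatives]
    return math.exp(sum(logs) / len(logs))

def laspeyres(relatives, base_shares):
    """Upper-level index: base-period share-weighted mean of price relatives."""
    total = sum(base_shares.values())
    return sum(relatives[c] * base_shares[c] / total for c in base_shares)

# Hypothetical product-level relatives p_t / p_0 inside each commodity group
product_relatives = {
    "bread": [1.10, 1.00],
    "rice": [1.21, 1.00],
    "milk": [1.00, 1.00],
}
# Hypothetical base-period expenditure shares s_i^0
base_shares = {"bread": 0.5, "rice": 0.3, "milk": 0.2}

elementary = {c: jevons(r) for c, r in product_relatives.items()}
index = laspeyres(elementary, base_shares)
print(f"Food index (base = 100): {100 * index:.2f}")
```

The shares are renormalized inside `laspeyres`, so the same function can be applied to a subset of commodities, such as food items only.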
A Laspeyres index has the property that quantity or expenditure weights are kept fixed for the base period, 0, to enable the index to measure only the pure price change between two periods. However, since prices are typically collected monthly, while the expenditure weights are typically computed annually, the index is not a pure Laspeyres index, but pertains to the more general category of Lowe-type indices. Furthermore, the expenditure weights or shares pertain to broad groupings of commodities as opposed to detailed commodity breakdowns. Both the weighting and the groupings take into account the high costs of data collection, the former of which is typically obtained through household expenditure surveys. Since consumers are known to substitute from higher- to lower-priced items, this index is known to have an upward bias, which increases the less frequently expenditure weights are updated. The opposite bias is found for Paasche-type indices, where expenditure weights refer to the current period.
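A small worked example may help make the direction of the two biases concrete. The prices and quantities below are invented for illustration: beef becomes more expensive and the hypothetical consumer shifts towards chicken.

```python
def laspeyres(p0, pt, q0):
    """Price index with quantities fixed at the base period."""
    return sum(pt[i] * q0[i] for i in p0) / sum(p0[i] * q0[i] for i in p0)

def paasche(p0, pt, qt):
    """Price index with quantities taken from the current period."""
    return sum(pt[i] * qt[i] for i in p0) / sum(p0[i] * qt[i] for i in p0)

# Hypothetical data: the beef price rises by 50%, the chicken price is unchanged
p0 = {"beef": 10.0, "chicken": 5.0}
pt = {"beef": 15.0, "chicken": 5.0}
q0 = {"beef": 10, "chicken": 10}   # base-period basket
qt = {"beef": 5, "chicken": 18}    # basket after substituting towards chicken

print(f"Laspeyres: {laspeyres(p0, pt, q0):.3f}")
print(f"Paasche:   {paasche(p0, pt, qt):.3f}")
```

With these numbers the Laspeyres index lies above the Paasche index, because the base-period basket still gives beef its old, larger weight.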

The index methodology used differs from the ideal chained indices (which require monthly expenditure weights), largely due to the high costs of conducting regular expenditure surveys. Furthermore, the expenditure weights are designed to reflect a representative consumer, but the lack of coverage of expenditures in small cities and rural areas results in an index that more likely represents the representative big-city consumer. Due to transportation, storage and post-harvest food loss, it is reasonable to expect that food price inflation is higher in urban areas relative to small cities and rural areas, particularly in developing countries, with the possible exception of regions and products facing import dependency, where the opposite may be true.
For price collection, the NSO determines the regions covered, which are often only large urban areas; and, for each commodity group, the sample of markets or outlets within the region and the sample of products within the outlet for which prices are collected. Commodity groupings are based on the Classification of Individual Consumption by Purpose (COICOP), an international classification system, or some variant thereof. To ensure only price change is measured, prices are collected for the same commodity at the same outlet, which means CPI data are longitudinal in nature.
The selection of the sample of products within each commodity grouping is based on the main commodities purchased within that group in a particular outlet. What is "main" is often based on the advice of the outlet, which may reflect judgment as opposed to statistical evidence. Furthermore, while every effort is made to price the same commodity over time, it is not always possible to ensure that quality remains fixed. CPIs also need to take into account the introduction of new items in the market, and emerging brands sold at higher prices.
While this description does not fully explain the methodology behind CPI data collection and compilation, it does show that the actual practice deviates from the ideal. The deviation is driven largely by costs, response burden, and the need for timely and reliable CPI data to inform monetary policy.
In summary, while traditional food CPI data collection and compilation is guided by internationally accepted guidelines, its practice suffers from the following limitations, particularly with respect to its use in monitoring food security and warning of turning points. Even when provided monthly, the CPI and food CPI are available with a lag of several weeks, as they are published between the end of the month and several weeks after. Its focus on urban areas leads to a lack of representativity of consumers in smaller cities and rural areas, who likely face lower levels of food price inflation, particularly in developing countries, apart from those areas dependent on food imports. The lack of frequent updating of expenditure weights creates an upward bias in measuring food price inflation, because it ignores the fact that consumers substitute from higher- to lower-priced items of a similar nature. Unless there is some ex-post reweighting to better represent a typical basket of food items purchased, it is likely that official price statistics will overestimate food price inflation. And finally, the food CPI lacks the level of food product detail and geographical detail necessary to pinpoint the product types and locations where price-related food insecurity is likely to occur, a data need particularly important for food security monitoring and early warning.

3. The use of private sector Big Data in computing food CPIs

As mentioned earlier, the types of big data used for computing food CPIs include retail point-of-sale scanner data, data scraped from internet sites, and food price data collected using mobile applications on handheld devices such as mobile phones. A fourth method, based on internet search queries, is not discussed in this paper. Unlike many other domains in official statistics, where there are still significant methodology and data quality-based objections to the use of private sector big data, the use of private sector big data to compile the CPI provides a counterexample, as scanner data is being directly used in CPI compilation by several countries.

3.1 Scanner Data
The use of scanner data for computing CPIs, particularly food CPIs, has been advocated by leading price index theorists, such as Erwin Diewert, Robert Feenstra, Denis Fixler, and Jack Triplett (Feenstra and Shapiro, 2003); discussed several times in the international meetings of the Ottawa Group on price indices; featured in the May 2014 ILO-led expert group meeting on the CPI; and advocated by Eurostat to its member countries in compiling the Harmonised Indices of Consumer Prices (HICP). At the national level, it has been tested and/or implemented in several countries, including the Netherlands (van der Grient and de Haan, 2010), Norway (Rodriguez and Haraldsen, 2006), Switzerland (R. Müller et al., 2006), and the United Kingdom (James and Campbell, 2012).
Scanner data refers to data obtained at the retail point of sale when purchases are scanned by barcode readers. For food prices, these are typically gathered at supermarket checkouts. The information collected includes the product purchased, its characteristics, the expenditure, and the time and sale location/outlet. Barcodes use detailed classification systems, such as the International Article Number, formerly the European Article Number (EAN), or the Universal Product Code (UPC), both of which enable a mapping to the COICOP. Using this common COICOP classification is essential for cross-country comparisons of CPIs.
In countries where scanner data is used, advantages include improvements in data quality and representativity of the main items purchased, reduced costs of data collection and response burden, and the availability of near real-time data (available with a two- to three-day delay) as well as a census of transactions for the retailers covered (James and Campbell, 2012; Mueller, 2006; Silver and Heravi, 2001). These advantages increase when food sales are concentrated among a few retailers, as in Switzerland, where the two biggest retail chains account for 70% of sales, or the UK, where the top four account for over 75% and the top six for almost 90% of sales.
Two limitations do exist. The first is the fact that food sales in mom-and-pop stores are not captured. The second, and more considerable limitation, arises from the risk associated with reliance on the private sector for data collection and maintenance. IT glitches can delay the reporting of data by retailers, putting at risk either the quality of the CPI, or the requirement that the NSO publish the CPI according to a fixed, pre-announced schedule. Private firms can experience financial difficulties, reducing their ability to place effort in activities, such as data sharing, that do not contribute to sales. Where retail sales are concentrated, the impact of missing data is significant, and since the CPI is among the statistical indicators that impact financial markets, these risks are not trivial. While the first issue cannot be resolved using scanner data, a solution to the latter is to establish formal and legally binding contracts with private sector providers of scanner data.
While this example is noteworthy in demonstrating how official statistics utilize private sector big food data, its application in developing countries may be limited due to low shares of food sales in supermarket chains that electronically scan barcodes.

3.2 Web-scraped price data, and the Billion Prices Project at MIT
The Billion Prices Project (BPP) at MIT is an example of the use of web scraping of online prices to nowcast the CPI. This project, originally an academic initiative at MIT, collects online prices for millions of items sold by a large number of retailers to produce real-time national inflation indexes for 22 countries, as well as a global inflation index. The indices are sold through the BPP's private sector partner, PriceStats. To protect this bottom line, only data for the United States and Argentina are made publicly available for free, and that with a 10-day lag relative to access available to paying clients.

Some of the advantages of this approach include low costs and response burden, greater timeliness and frequency, more detailed commodity prices collected, coverage of a large number of countries, including developing countries, and the provision of (near) real-time inflation measures. Relative to scanner data, the ability to include developing countries in producing comparable inflation measures is an important advantage (Cavallo, 2013).
For purposes of nowcasting, James Surowiecki wrote in the New Yorker in 2011: "...after Lehman Brothers went under, in September, 2008, the project's data showed that businesses started cutting prices almost immediately, which suggested that demand had collapsed. The government's numbers, by contrast, didn't show this deflationary pressure until that November. This year, there's been a mild uptick in annual inflation, and again the BPP detected the new trend before the Consumer Price Index did. That kind of early heads-up could help governments make more timely decisions."
As a disadvantage, the indices deviate from internationally recommended practices in that they do not use expenditure weights or the Jevons index for elementary aggregates, and furthermore collect prices in some countries for a very limited number of retailers and/or cities (e.g. 1 retailer in Argentina in 2012) (Cavallo, 2012). The bigger limitation, for purposes of this paper, is the absence of food price sub-indices, which, in developing countries, may reflect the fact that most food prices may not be advertised online.

3.3 Crowdsourced mobile app price data collection
A third big food data source comes from data collected using mobile apps on handheld devices, such as cell phones. For NSOs that equip their trained enumerators with this technology, it becomes analogous to computer-assisted personal interview (CAPI) applications, already widely used by NSOs in data collection, so little more need be said.


The low cost of this technology, however, has also led to its use by non-NSO government departments and private sector firms to collect and publish early warning/market information on food prices. The absence of international/statistical guidelines, however, limits the value of this data source in providing comparable cross-country data.
To expand the value of this tool beyond the usual CAPI applications, organizations have turned to crowdsourcing as a way of collecting larger amounts of data, with crowdsourcing defined by Wikipedia as "the process of obtaining needed services, ideas, or content by soliciting contributions from a large group of people"; it "combines the efforts of numerous self-identified volunteers or part-time workers." Organizations, both private and public, that use this approach to collect food price data have two options. They can use pure crowdsourcing, where the crowd determines what foods, markets and retailers to cover. In this case, the use of this data would not be appropriate for food CPI compilation, though it could well monitor food security issues. Alternatively, an organization can allocate its price data collection across the crowd to pinpoint specific markets, outlets and commodities, in which case the approach may approximate the collection methodology most countries use to obtain food CPI data. This latter approach is taken by Premise, though, as they correctly point out, this allocation of data requirements is typically not seen as crowdsourcing.


4. Public versus private food price statistics for Brazil

4.1 The IBGE and official Brazilian food price data

Brazil's national statistics office (NSO), the Instituto Brasileiro de Geografia e Estatística (IBGE), created in 1937, is the main provider of official Brazilian statistics, including its CPI and food CPI subcomponent. The IBGE follows the CPI Manual guidelines to produce monthly Laspeyres-type CPIs and food CPIs, and like most NSOs, adheres to the ten Fundamental Principles of Official Statistics, established by the United Nations Statistical Commission in 1994 (http://unstats.un.org/unsd/methods/statorg/default.htm). These principles include: relevance, impartiality and equal access to statistics; professionalism and accountability in the use and reporting of methods and procedures for the collection, processing, storage and presentation of data; choice over the source of data based on quality, timeliness, costs and respondent burden; and international coordination and cooperation.
The IBGE produces several measures of the consumer price index, which vary based on locations and households covered. The Índice Nacional de Preços ao Consumidor Amplo (IPCA), used for this analysis, covers ten key metropolitan areas and two municipalities: Belém, Fortaleza, Recife, Salvador, Belo Horizonte, Rio de Janeiro, São Paulo, Curitiba, Vitória and Porto Alegre, Brasília, and the municipalities of Goiânia and Campo Grande. The IPCA includes "families dwelling in these areas with monthly income from any source, ranging from 1 (one) to 40 (forty) minimum wages" (from the IBGE website). This contrasts with another of the IBGE's consumer price index measures, which covers the same geographic areas, but includes only families with monthly income ranging from 1 (one) to 5 (five) minimum wages and whose head of household is paid a salary for their main activity. Data collection for the IPCA occurs from day 1 to day 30 of the reference month.


Food CPI subcomponents include meats, fruits, vegetables, fish, fats, beverages, herbs, cereals, processed meats and fish, poultry & eggs, dairy, bread, flours, roots, and sugars. The overall CPI and the food CPI are also published for each of the 12 urban areas covered.
Because the private sector data compared looks at only those foods purchased for food preparation at home, the CPI subcomponent of the IPCA used for this analysis is the "Alimentação no domicílio", or food-at-home, subcomponent of the food CPI. This is one of the two key groupings of the Brazilian food CPI, with the other referring to food prepared and/or eaten outside the home.

4.2 Premise and its food price data

Premise, a recently established San Francisco-based IT firm, collects and compiles food price statistics for Brazil, as well as Argentina, China, India, and the United States, with plans to expand into Africa, starting with Nigeria and Ghana. Its data collection methods include crowdsourced mobile data collection, combined with web-scraped prices. Unique in its approach is the fact that it attempts to follow internationally accepted practices in obtaining price data and calculating an index to approximate the food CPI, using official statistics for expenditure weights, adjusted through local analysis of more current food expenditure habits.
For its data collection, Premise has established approaches fairly similar to those used by NSOs to obtain food prices. Both its online data and offline data rely on local experts to identify the key outlets (internet domains or retail outlets) and the main food items, similar to the judgmental sampling used by NSOs. These experts are also used to validate the data collected. For its online data, the Premise web crawler collects and records pre-tax, pre-shipping prices. For its offline data, prices for food items of interest are "crowdsourced" using mobile apps downloaded on smartphones, though in reality, the allocation of work renders their approach similar to a Computer-Assisted Personal Interview (CAPI), in which its data collection strategy makes the Premise mobile app a CAPI-type application.
For its offline sampling strategy, Premise uses multi-stage sampling to obtain prices for a minimum of 5 cities in each country for a set of key food products. Each city is divided into coverage areas, with elementary strata defined by store type (by size and chain) and food products. Field workers that form the "crowd" are assigned to a coverage area, and encouraged to collect prices from as many strata as possible. These field workers also submit metadata and photographs for the priced items. Their incentive for data collection and quality is the fee they receive per price quote that passes quality control.
In reality, the approach is not strictly crowdsourcing, as field workers, usually university students, are screened and recruited, obtain field training on both the mobile app as well as the data and metadata required, and are assigned items and locations on which to collect food prices, metadata and product photos. Furthermore, Premise also incents shopkeepers to allow this price data collection by providing them with reports on prices of goods in similar stores in the area.
Combined with its underlying sampling strategy, Premise's approach could be viewed as similar to traditional NSO price data collection strategies where interviewers are hired on contract, CAPI applications are used for data capture and quality control, and well-designed sampling strategies determine the market, outlet and product for which prices are collected.
In its data processing and index compilation, Premise normalizes product prices for size and quantity, classifies products into categories, performs outlier detection, ensures a minimum sample size in constructing Jevons-type price relatives for its food subcomponents, compiles its aggregate country-level Food Staples Index (FSI) and subcomponent indices using a Laspeyres-type index, and applies official NSO expenditure weights in the FSI.
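The processing steps listed above can be illustrated with a rough Python sketch. It is not Premise's actual pipeline, whose details are not given here: the outlier rule (median absolute deviation), the minimum of three quotes, and the use of a ratio of geometric means for the Jevons-type relative are all assumptions made for the example, as are the price quotes.

```python
import math
from statistics import median

MIN_QUOTES = 3  # assumed minimum sample size; the source does not state one

def unit_price(price, size_kg):
    """Normalize a shelf price to a price per kilogram."""
    return price / size_kg

def drop_outliers(values, k=3.0):
    """Keep values within k median absolute deviations of the median (assumed rule)."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return list(values)
    return [v for v in values if abs(v - med) <= k * mad]

def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))

def jevons_relative(base_prices, current_prices, min_quotes=MIN_QUOTES):
    """Ratio of geometric mean unit prices, or None if either sample is too small."""
    base = drop_outliers(base_prices)
    current = drop_outliers(current_prices)
    if min(len(base), len(current)) < min_quotes:
        return None
    return geometric_mean(current) / geometric_mean(base)

# Hypothetical quotes: (shelf price, package size in kg) for one product category
base_quotes = [(5.0, 1.0), (10.4, 2.0), (2.45, 0.5), (5.1, 1.0)]
current_quotes = [(5.5, 1.0), (11.2, 2.0), (2.8, 0.5), (30.0, 1.0)]  # last quote looks erroneous

base = [unit_price(p, s) for p, s in base_quotes]
current = [unit_price(p, s) for p, s in current_quotes]
print(jevons_relative(base, current))
```

The resulting category-level relatives would then be aggregated with expenditure weights, as in the Laspeyres-type formula of Section 2.1.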
As the company and its data series are relatively new, dating back to 2013, Premise currently computes a 7-day and 30-day inflation measure from its FSI, and publishes indices for its subcomponents, such as processed meat, fruit, vegetables, oils and fats, and dairy and eggs. However, it also publishes price levels of key individual food commodities at city level for purposes of food security monitoring, such as potato and green pepper prices in Brazil, or wheat bread and vegetable prices in Buenos Aires.
Premise data and metadata are available to paying clients or can be requested for a trial period.
The FSI price index subcomponents include meat, fruit, vegetables, fish & seafood, oils & fats, beverages, herbs, processed grains, processed meats, dairy & eggs, flours, roots, sugars, grains and nuts, processed fruits and vegetables, sweets, and other snacks.

4.3 A comparison of IBGE and Premise food consumer price indexes

So how does the Premise FSI compare with the food-at-home component of the IBGE's IPCA? This can be assessed against the typical dimensions of a statistical quality assurance framework: relevance, timeliness, accuracy, accessibility, interpretability and coherence. These multidimensional elements of quality assurance cover the notion of "fitness for use", and hence are typically interpreted from the perspective of the user (Statistics Canada, 2002).
In terms of relevance, the uses and users of IBGE and Premise statistics potentially overlap, but are not identical. The IBGE food CPI is largely used by government to inform policy, though it is also used by the private sector for generic analysis of food price changes as well as by food and agriculture researchers. Given the types of food subcomponents available (see Table 2), there are likely other users and uses of which the authors are unaware. Premise data users include private sector firms, including international banks and hedge funds, interested in monitoring real-time price movements for purposes of corporate, lending and investment decisions, though Premise is also looking to expand users to include governments interested in food prices to monitor food security. The key advantage to IBGE data arises from its indices available for each of its 12 urban areas covered.
The key advantage to Premise arises in terms of food security monitoring from the fact that it provides daily indices in near real time, as well as prices available for some individual products and their comparison locally and historically. The latter also suggests a potential use by consumers in identifying lower prices, though with a day or two lag. It is unclear, however, how Premise would obtain revenues from providing this data publicly, particularly given that it would likely lose revenues from clients who currently purchase it.
The key advantage from the IBGE arises from its longer time series, which enables more thorough analysis of historical trends. This also points to a risk created in using private sector food price indices: any risk to continuity in their data collection and index publication will reduce the relevance of their data, given the longitudinal nature of the CPI.
Timeliness leans in favor of Premise, both because the FSI is a daily index, and because it can make available a monthly index similar to the food CPI 10 days before month end and up to 25 days before the official IBGE release. Indeed, the later analysis will show that the first 7-day average of Premise's daily FSI, available before mid-month, does a reasonably good job of predicting or nowcasting the monthly food-at-home CPI.
Accessibility clearly favors the IBGE, which ensures impartial access to all, as per the Fundamental Principles of Official Statistics, while Premise provides access, as expected, mainly to paying clients.
In terms of accuracy, the methodology and strategy followed by Premise appear to be as rigorous as those of an NSO, but as a private firm, its data are not subject to the international scrutiny or government auditing and quality assurance faced by an NSO. Both indices appear to cover similar food commodity groups, and both the IBGE and Premise provide an overall food-at-home price index and indices for key food subcomponents. If both the IBGE and Premise apply the same expenditure weights, it is expected that household coverage will also be similar. Premise updates these weights with recent and local analysis of current food expenditure patterns, though costs may render this less robust in comparison to official household expenditure surveys. The IPCA, on the other hand, covers a broader geographic area than Premise, including 12 urban areas compared to 5 cities covered by Premise, and produces food CPIs for these areas, which may better inform local decision-making.
Some degree of assessment of accuracy may be informed by a statistical comparison of the FSI and its subcomponents against the IBGE's food-at-home CPI/IPCA and its subcomponents. This assessment is limited to simple analysis (correlation analysis and simple linear regression forecasts/predictions) given the limited time series in Premise data. Differences between Premise and IBGE indices can arise from differences in sample size and selection (and hence in sampling errors), geographic coverage, and food product/item coverage. As a result, one cannot conclude that one set of indices is necessarily more accurate than the other based on index comparisons and analyses alone.
Table 1 provides the IBGE food-at-home CPI and select subcomponent food price indices for the IPCA, with indices rebased to June 2013 to facilitate comparison. Tables 2 and 3 provide the Premise FSI and select subcomponents: Table 2 provides the daily average index for the first 7 days, while Table 3 provides the daily average index for the first 30 days (except for February, which contains a 28-day average). Monthly indexes were also compiled, though are not presented, for the daily average of the first 15 days, and the first 21 days (3 weeks), to evaluate prediction accuracy relative to lead times. The index subcomponents were selected to enable comparison of product groupings that are as similar as possible between the IBGE and Premise. Month-over-month food price inflation was calculated for the four Premise series, to compare Premise data's predictive power relative to the monthly IBGE food price inflation series.
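The construction of these monthly series from a daily index can be sketched as follows. The daily values are invented for illustration and the helper names are hypothetical; only the steps (average the first k days of each month, rebase to a reference month, take month-over-month percentage changes) follow the description above.

```python
def first_k_day_average(daily_index, k):
    """Mean of the first k daily values of a month (fewer if the month is shorter)."""
    window = daily_index[:k]
    return sum(window) / len(window)

def rebase(series, base_period):
    """Rescale a monthly series so that the base period equals 100."""
    base = series[base_period]
    return {month: 100 * value / base for month, value in series.items()}

def mom_inflation(series, months):
    """Month-over-month percentage change, keyed by the later month."""
    return {
        cur: 100 * (series[cur] / series[prev] - 1)
        for prev, cur in zip(months, months[1:])
    }

# Hypothetical daily index values (truncated to 10 days per month for brevity)
daily = {
    "2013-06": [200.0, 200.4, 200.2, 200.6, 200.8, 201.0, 201.0, 201.3, 201.5, 201.6],
    "2013-07": [202.0, 202.1, 202.5, 202.4, 202.8, 203.0, 203.2, 203.1, 203.5, 203.9],
}
months = sorted(daily)
seven_day = {m: first_k_day_average(daily[m], 7) for m in months}
rebased = rebase(seven_day, "2013-06")
print(mom_inflation(rebased, months))
```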
Chart 1 plots the four series based on the Premise daily FSI (7-day average, 15-day average, 21-day average, 30-day average) against the Brazilian IBGE food-at-home CPI, with all indices rebased to June 2013. All four Premise series seem to track the official Brazilian statistic well, with similar trends except in July and August 2014, when Premise data show increasing prices while official Brazilian data show the reverse.
Chart 2, which shows month-over-month inflation for the five series, confirms these results. Again, except for July and August 2014, official data and the Premise series all seem to have similar movements. Between September 2013 and January 2014, it almost appears that official data lag the Premise series. This interpretation would not be valid, however, given that both data sets set out to measure the same phenomenon. Most problematic, however, is the July and August 2014 data, in which official statistics show a fall in food prices, while Premise data show an increase. Though it may be tempting to conclude that the private sector data are faulty, the World Cup event in Brazil, and the news of its upward pressure on inflation, might lead one to speculate whether it was instead Premise that got it right.
To evaluate Premise data in terms of its ability to nowcast or predict official food-at-home price inflation, a simple linear regression model was constructed for each of the four Premise series (7-day average, 15-day average, 21-day average, 30-day average), with each regression using Premise data as the explanatory variable. Different regression models were constructed, for each series, to predict food-at-home price inflation from April to August 2014. April inflation was predicted using June 2013 to March 2014 monthly Premise indices; May inflation using June 2013-April 2014 data; June inflation using June 2013-May 2014 data; July inflation using June 2013-June 2014 data; and August inflation using June 2013-July 2014 data.
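One way to read this procedure is as an expanding-window regression: for each target month, fit the line on all earlier months, then plug in the target month's Premise value. The sketch below follows that reading, which is an assumption, since the exact specification is not spelled out; the two short series are made up and are not the data of this paper.

```python
def ols_fit(x, y):
    """Intercept and slope of a simple least-squares line y = a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def expanding_nowcasts(premise, official, first_target):
    """Nowcast official[t] for each t >= first_target from a fit on periods before t."""
    nowcasts = []
    for t in range(first_target, len(premise)):
        intercept, slope = ols_fit(premise[:t], official[:t])
        nowcasts.append(intercept + slope * premise[t])
    return nowcasts

# Hypothetical month-over-month inflation rates (%), oldest first
premise_mom = [0.4, 0.9, 1.2, 0.7, 0.3, 0.8, 1.1, 0.5]
official_mom = [0.5, 1.0, 1.1, 0.8, 0.2, 0.9, 1.0, 0.6]

# Nowcast the last three months, each using only the months that precede it
print(expanding_nowcasts(premise_mom, official_mom, first_target=5))
```

Because each fit uses only data available before the target month, the exercise mimics what a user of the series could have computed in real time.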
To evaluate the predictive power of each of the series, a Mean Absolute Prediction Error (MAPE) was computed using the following formula:

$$\mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{A_t - F_t}{A_t} \right|$$

where $A_t$ is the actual value from IBGE data; $F_t$ is the forecast based on a simple linear regression using the Premise FSI as the independent variable; and $n = 5$ is the number of months forecasted. Table 4 provides the results of the predictions/nowcasts, the MAPE, and the lead times for each of the four monthly Premise series.
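As a quick numerical illustration of this error measure, the sketch below computes it for invented values, assuming the absolute relative errors are averaged over the n forecast months and reported in percent. It also shows why forecasts with the wrong sign push the measure towards or beyond 100%.

```python
def mape(actual, forecast):
    """Mean absolute prediction error, in percent of the actual values."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100 * sum(errors) / len(errors)

# Hypothetical monthly inflation rates (%): the last forecast has the wrong sign
actual = [1.0, 0.8, -0.5]
forecast = [0.9, 1.0, 0.5]

print(f"All three months: {mape(actual, forecast):.1f}%")
print(f"First two months: {mape(actual[:2], forecast[:2]):.1f}%")
```

Since the actual value sits in the denominator, months with inflation close to zero can dominate the average, which is worth keeping in mind when reading MAPEs computed on month-over-month rates.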
None of the values of the MAPEs are particularly compelling, with values in the 95% to 97% range based on the forecasts for the five months from April to August 2014, inclusive. More importantly, the signs predicted for July and August are incorrect. Interestingly, the 7-day series, produced about 25 days before the IBGE publishes the IPCA, has the same MAPE as the 15-day and 30-day series.
KeepinginmindquestionsabouttheJulyandAugustdata,MAPEswerealso
calculatedforonlyAprilthroughJuneof2014,withvaluesrangingfrom15%forthe7day
seriesto7%forthe30dayseries.Thesepredictionshaveamuchmoreacceptable
MAPE,thesignsareallcorrect,and,asinthecaseofmostnowcasts,theadditionof
moreinformationinthePremiseseriesreducestheMAPE.The15dayseriesisthemost
attractiveinitstradeoffbetweenleadtimeof17daysbeforeofficialdatapublication,and
MAPE.Waitingtoobtainthefullmonthofday(the30dayseries),oftenpublishedaround
thesamedayoradayortwoaheadoftheofficialseries,onlygainsamarginaladvantage
inpredictionpower.This15dayseriesisalsocompelling,inthatmanyNSOscompile
theirCPIsanditssubcomponentsbasedondatacollectedduringthefirst15daysofthe
month.


This analysis suggests there is some predictive power in the use of Premise's big food price data. If indeed the official food CPI statistics were incorrect in measuring food-at-home price inflation in July and August of 2014, the analysis suggests that Premise data would not only provide a valuable and timely nowcast of food price inflation, but could also help validate official statistics.

5. Conclusion: Public-private sector comparative advantages and complementarities

The descriptions and analyses above lead back to the original questions, particularly in the context of food price statistics: Is there a role for big data in official statistics? And what is the role of private sector producers of big data vis-a-vis official statistics? How do these roles vary by type of data, producer, or use?
In the case of food CPIs, there is no question that there has been and is a role for big data in official statistics, as the use of scanner data in compiling official CPIs demonstrates. This role varies by country, as some NSOs use scanner data directly to compile their food CPIs, while others use it to validate their CPIs, and many do not use it at all.
Statistical organizations have also begun to adopt other big data tools described in this paper, such as web scraping and mobile tools. Eurostat, for example, is developing a generic tool to collect web-scraped prices to improve its CPI, and Italy's Istat is experimenting with web scraping and text mining for its Survey on information and communication technology in enterprises. Eurostat, New Zealand and Slovenia obtain microdata on mobile phone call/text times and positions to enhance their population and migration statistics, with data-sharing legislation in place in Slovenia to obtain this data for free from its private sector providers (UNECE website, Big Data Home).
In adopting tools such as web scrapers and mobile apps, NSOs need to consider technical, legislative and security issues, such as stability of the application, reliability of mobile networks, security of confidential information transmitted, and the personal security of interviewers. This leads to the final two questions on the role of the private sector as big data producers, and the varying types of roles.
Again, in the case of scanner data, NSOs already obtain scanner data from private supermarket outlets, for which the key risks arising from IT glitches and delays in data transmission are managed with legal contracts. Furthermore, since the data is collected for a different purpose than an NSO's use, namely, to inform market research, wholesale food purchases and marketing campaigns, and since NSOs publish aggregate data, most producers of scanner data do not compromise their business line by sharing this data with NSOs.
For some of the newer private producers of big food data, such as Premise and PriceStats, the considerations are quite different, and stem from their business model, in which the data itself is a key product. Such firms earn revenues primarily from the data, statistics and analysis they provide to paying clients, who receive this intelligence in advance of their competitors or the public at large. The comparative advantage of private sector production in using crowdsourced and/or web-scraped prices likely arises from the lower per-unit costs of data collection incurred by specialist firms, as well as their flexibility in modifying and improving their data collection and production processes over time.
Such private firms normally focus on a narrow area of statistics, for which they recruit and train staff that specialize in one subject matter domain and one set of IT platforms (contrasted with NSOs, where efficiency gains accrue from generic IT platforms and knowledge across multiple subject matter areas). The value these private firms bring their clients is in producing (near) real-time, frequent and detailed data with restricted access. Their clients, in turn, benefit from this high-frequency, real-time proprietary information, which enables them to make decisions ahead of their competitors, or with more detailed product and geographic information than is available from official statistics. Not surprisingly, some of the key clients of Premise and PriceStats include hedge funds and other investment firms, who rely on this type of just-in-time detailed intelligence to inform profitable business decisions.
This business model underlying firms like Premise and PriceStats circumscribes the type of public-private partnership possible. Since NSOs are required by law to publicly provide statistics on their country, economy and peoples, and the ten fundamental principles of official statistics require impartial access and transparency in the sharing of the underlying data collection and compilation methodology (which may be viewed as a trade secret in a private firm), the direct use of such private sector big data in compiling official statistics would likely run into legislative and political problems. Since the CPI has the ability to move financial markets, the knowledge that some firms have advance access to even part of the official CPI would likely create, at a minimum, adverse public reaction. On the other hand, if an NSO could republish the CPI data purchased from a private firm, this would undermine the profitability of such firms, which make their income from selling data. Finally, while most NSOs have some financial stability, given their legislative mandates and tax funding, private firms lack this financial security. In short, if a private sector data producer goes bust, the data it produces goes with it. This is particularly problematic for the CPI and food CPI, which rely on a relatively long monthly series of longitudinal data.
These differences in mandates and financial security do suggest some alternatives and complementarities. At one extreme, NSOs can adopt, and have adopted, the big data tools pioneered by private sector firms, including the development of web scraping technologies and mobile apps. In at least one country, an NSO has bought out the private sector pioneer. This does lead to a separate set of questions regarding public sector crowding out of private sector firms.


In the middle of the spectrum, NSOs can use, sometimes at a purchase price, private sector big food data for validation or quality assurance of official statistics. This has been the case in some countries with respect to scanner data. The Premise-based nowcasts suggest a similar role with respect to this company's data. Similarly, policy departments can and do use private sector big data to nowcast official statistics, such as GDP growth and employment statistics, with the analysis of this paper suggesting this may be extended to nowcasting official food CPIs. Furthermore, the complementary and timely nature of private sector big food data suggests a role for central banks, finance departments and ministries of agriculture to use this data source to monitor food security, and provide an early warning of key turning points. There is already precedent for such use, as many central banks and policy departments purchase private sector economic forecasts as inputs into their fiscal and monetary policy decisions and to nowcast key official statistics.
At the other extreme lies the purchase of private sector big food data for direct use in compiling official statistics, though differences in business and institutional models, mandates and legal obligations are the key factors in determining what types of big data and private sector providers can serve this function.

6. REFERENCES

[1] Michaela Agafitei and Sorina Vaju, Addressing the Challenge of Producing European Comparable Data using Administrative Data. Presented at the Seminar on Statistical Data Collection, 25-27 September 2013. Geneva: United Nations Economic Commission for Europe, 2013.
[2] Jose Ramon G. Albert, Big Data: Big Threat or Big Opportunity for Official Statistics? Published by Paris21, http://www.paris21.org/newsletter/fall2013/bigdatadrjoseramonalbert, 2013.
[3] Pedro Less Andrade et al., From Big Data to Big Social and Economic Opportunities: Which Policies Will Lead to Leveraging Data-Driven Innovation's Potential? in: The Global Information Technology Report 2014: Rewards and Risks of Big Data, 2014, pp. 81-86.
[4] Nii Ayi Armah, Big Data Analysis: The Next Frontier, in: Bank of Canada Review, Summer 2013, pp. 32-39.
[5] N. Askitas and K. F. Zimmermann, Google Econometrics and Unemployment Forecasting. Applied Economics Quarterly, 55(2), 2009, pp. 107-20.
[6] Marta Banbura et al., Nowcasting, European Central Bank Working Paper Series No 1275, December 2010.
[7] Benat Bilbao-Osorio et al., The Global Information Technology Report 2014: Rewards and Risks of Big Data, Geneva, 2014.
[8] Danah Boyd and Kate Crawford, Six Provocations for Big Data, in: A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society, September 2011.
[9] Alberto Cavallo, Online and official price indexes: Measuring Argentina's inflation, in: Journal of Monetary Economics, 2012.
[10] Alberto Cavallo, Scraped Data and Sticky Prices, MIT Sloan Working Paper, May 2013.
[11] H. Choi and H. Varian, Predicting the Present with Google Trends. Google Inc, April 2009a.
[12] H. Choi and H. Varian, Predicting Initial Claims for Unemployment Benefits. Google Inc, July 2009b.
[13] P. Daas and M. van der Loo, Big Data (and Official Statistics), presented at the Meeting on the Management of Statistical Information Systems, Paris and Bangkok, 23-25 April 2013.
[14] Robert C. Feenstra and Matthew D. Shapiro, Eds, Scanner Data and Price Indexes, University of Chicago Press, 2003.
[15] John W. Galbraith and Greg Tkacz, Nowcasting GDP: Electronic Payments, Data Vintages and the Timing of Data Releases, CIRANO working paper, Montreal, 2013.

[16] International Labour Organization (ILO), Consumer Price Index Manual: Theory and Practice, 2004.
[17] International Telecommunication Union, Measuring the Information Society, Geneva, 2013.
[18] Adam Jacobs, The Pathologies of Big Data, in: Communications of the ACM, 52(8), August 2009.
[19] Sara James and Richard Campbell, Obtaining Scanner Data Project, presented to the Workshop on Scanner Data for HICP, Stockholm, 8 June 2011.
[20] Martin Karlberg and Michail Skaliotis, Big Data for Official Statistics - Strategies and Some Initial European Applications, presented to The Conference of European Statisticians, Geneva, 25-27 September 2013.
[21] Irfan Khan, Nowcasting: Big data predicts the present, in: ITWorld, Oct 2012.
[22] Robert Kirkpatrick, Beyond Targeted Ads: Big Data for a Better World, presented at the O'Reilly Strata Conference, United Nations Global Pulse, Oct 2012.
[23] O. Lamont, Do Shortages Cause Inflation? in: Reducing Inflation: Motivation and Strategy, C. D. Romer and D. H. Romer, eds., University of Chicago Press, Chicago, 1997, pp. 281-306.
[24] Clifford Lynch, Big Data: How do your data grow? in: Nature 455, 3 September 2008, pp. 28-29.
[25] James Manyika et al., Big Data: The next frontier for innovation, competition and productivity, McKinsey & Company, San Francisco, 2011.
[26] N. McLaren and R. Shanbhogue, Using Internet Search Data as Economic Indicators, in: Bank of England Quarterly Bulletin Q2, 2011, pp. 134-140.
[27] R. Müller et al., Recent Developments in the Swiss CPI: Scanner Data, Telecommunications and Health Price Collection, presented to the 9th meeting of the Ottawa Group Meeting on Prices, London, 2006, pp. 14-16.
[28] National Research Council of the National Academies, Improving Data to Analyze Food and Nutrition Policies, Washington, 2005.
[29] OECD. Exploring data-driven Innovation as a new source of growth: Mapping the policy issues raised by Big Data, Paris, June 2013.
[30] Steve Pierson, Big Data: A Perspective from the BLS, in: AMSTAT NEWS, 1 January 2013.
[31] J. Rodriguez and F. Haraldsen, The Use of Scanner Data in the Norwegian CPI: The New Index for Food and Non-Alcoholic Beverages, in: Economic Survey 4, 2006, pp. 21-28.
[32] Hillary Sanders et al., The Relationship between Premise Price Data & Official Government Releases,

[33] Monica Scannapieco et al., Placing Big Data in Official Statistics: A Big Challenge? Presented to the New Techniques and Technologies for Statistics conference, United Nations Economic Commission for Africa, 2013.
[34] Mick Silver and Saeed Heravi, Scanner Data and the Measurement of Inflation, in: The Economic Journal 111, June 2001, pp. 383-404.
[35] Michail Skaliotis, The role of official statistics in a big data ecosystem: what will change? presented to the European Central Bank Workshop on Big Data for Forecasting and Statistics, Frankfurt, 2014.
[36] T. Suhoy, Query Indices and a 2008 Downturn: Israeli Data, Bank of Israel Discussion Paper No. 200906, 2009.
[37] James Surowiecki, A Billion Prices Now, in: The New Yorker, May 30, 2011.
[38] Statistics Canada, Statistics Canada's Quality Assurance Framework, 2002.
[39] Statistics Sweden, Issues in the use of scanner data, (undated document).
[40] Jack E. Triplett, Should the Cost-of-Living Index Provide the Conceptual Framework for a Consumer Price Index? Brookings Institution, 2000.
[41] United Nations Economic Commission for Europe (UNECE), Big Data Home, UNECE Statistics Wikis, 2014.
[42] United Nations Global Pulse, Big Data for Development: A Primer, June 2013.
[43] United Nations Global Pulse, Mobile Phone Network Data for Development, Oct 2013.
[44] United Nations Global Pulse, Big Data for Development: Challenges and Opportunities, May 2012.
[45] United Nations Statistics Division (UNSD), Fundamental Principles of Official Statistics, 1994-2013.
[46] United Nations Statistical Commission, Big data and modernization of statistical systems. Report of the Secretary-General presented at the Forty-fifth session, 4-7 March 2014.
[47] Heymerik van der Grient and Jan de Haan, The use of supermarket scanner data in the Dutch CPI, Statistics Netherlands, 2010.
[48] Joe Weisenthal, Is MIT's Billion Prices Project Warning of a large spike up in the CPI? in: Business Insider, 25 April 2011.
[49] Wikibon, A Comprehensive List of Big Data Statistics. Wikibon Blog, 1 Aug 2012. Available at http://wikibon.org/blog/bigdatastatistics/.
[50] World Bank, Food Price Watch, May 2014.

[51] L. Wu and E. Brynjolfsson, The Future of Prediction: How Google Searches Foreshadow Housing Prices and Sales. Sloan School of Management, Massachusetts Institute of Technology, 2009.


Table 1: Official Brazilian food price statistics - the food CPI and select subcomponents

Month      Food CPI       Meats      Fruits  Vegetables        Fish        Fats      Drinks       Herbs
Jan-13        96.32      104.24       92.50       78.84      103.38      107.76       99.31       95.70
Feb-13        97.89      104.10       92.90       89.62      102.59      107.70       99.44       95.66
Mar-13        99.22      102.41       97.09       97.66      104.41      107.03      100.30       97.90
Apr-13       100.31      100.58      100.23      107.74      105.56      104.84      100.97       99.37
May-13       100.36       99.87      101.25      105.10      102.60      101.99      100.58       99.87
Jun-13       100.00      100.00      100.00      100.00      100.00      100.00      100.00      100.00
Jul-13        99.27      100.08       97.40       86.96       99.85       98.25      100.46       99.90
Aug-13        98.93      100.23       97.07       77.40      100.26       97.07      100.90      100.34
Sep-13        98.90      101.11       99.88       69.11      100.40       96.89      101.25       99.10
Oct-13        99.96      104.32      101.88       70.87      101.44       96.37      102.33       97.36
Nov-13       100.37      105.28      102.46       72.81      104.19       96.55      102.85       95.92
Dec-13       101.16      107.73      107.09       74.97      106.78       96.87      103.24       96.90
Jan-14       102.07      111.04      110.77       74.34      113.01       97.45      104.52       97.76
Feb-14       102.30      111.04      113.89       75.38      112.47       97.37      105.06       97.67
Mar-14       104.78      113.54      116.12       91.93      115.91       99.26      105.37       98.03
Apr-14       106.38      115.61      116.78       97.99      119.26      101.85      106.05       98.96
May-14       106.81      116.08      114.22       98.73      118.81      102.98      106.67      100.93
Jun-14       106.17      116.55      110.19       89.77      115.91      103.15      107.44      101.88
Jul-14       105.63      116.69      109.54       77.46      113.58      102.26      108.11      103.07
Aug-14       104.99      117.19      107.40       71.64      112.89       99.40      108.22      103.45

Source: Instituto Brasileiro de Geografia e Estatística. Indexes rebased to June 2013.


Table 2: Premise food price statistics - the Food Staples Index and select subcomponents, first 7-day average of daily indices

Month  Food Staples        Meat       Fruit  Vegetables      Fish &      Oils &   Beverages      Herbs,
              Index                                         Seafood        Fats                spices &
                                                                                             condiments
May-13       101.49      100.28       97.35      104.41       99.59       98.48       99.36      100.38
Jun-13       100.00      100.00      100.00      100.00      100.00      100.00      100.00      100.00
Jul-13        97.85       98.19       97.20       93.53       98.75       99.17      100.25      100.18
Aug-13        97.69      100.26       96.29       93.00       98.76       97.38      101.69       98.64
Sep-13        99.69      102.65      100.83       92.11      100.13       95.92      103.11       99.10
Oct-13       100.34      104.76      102.31       89.42      100.71       94.88      105.22       99.38
Nov-13       101.68      106.64      102.07       90.00      101.03       95.78      104.58       98.96
Dec-13       102.96      108.45      105.46       91.54      100.77       95.41      105.51       98.18
Jan-14       103.62      109.16      106.65       94.36      104.19       95.83      105.76       98.72
Feb-14       102.54      107.73      104.96       92.07      102.11       95.43      105.30       98.58
Mar-14       103.91      108.61      106.11       96.11      104.70       96.32      105.11       98.49
Apr-14       105.75      109.73      105.68       98.85      106.06       97.53      106.82       99.94
May-14       105.96      109.75      103.99       99.37      108.20       97.76      107.24      101.79
Jun-14       104.87      108.65      100.35       95.54      107.53       97.57      107.17      102.55
Jul-14       105.51      109.34      100.29       92.93      107.32       97.76      107.16      102.95
Aug-14       107.17      108.77      101.28       92.75      107.12       97.83      108.06      104.15

Source: Premise. Indexes rebased to June 2013.


Table 3: Premise food price statistics - the Food Staples Index and select subcomponents, 30-day average of daily indices

Month  Food Staples        Meat       Fruit  Vegetables      Fish &      Oils &   Beverages      Herbs,
              Index                                         Seafood        Fats                spices &
                                                                                             condiments
May-13       102.72      101.55       99.86      105.94      104.02      100.27       99.88      100.64
Jun-13       100.00      100.00      100.00      100.00      100.00      100.00      100.00      100.00
Jul-13        98.96       99.78       97.80       96.32       99.53       98.78      100.57      100.13
Aug-13        99.38      101.85       99.37       95.90       99.77       97.53      101.30       98.97
Sep-13       100.52      104.40      102.70       92.98      100.70       96.38      103.47       99.38
Oct-13       101.83      107.00      102.68       92.63      101.29       95.64      104.76       98.92
Nov-13       102.86      108.18      104.17       93.03      101.05       96.60      104.28       98.51
Dec-13       104.14      109.94      106.78       94.92      102.44       96.29      105.38       98.19
Jan-14       104.50      110.50      107.09       96.25      105.03       96.48      105.16       98.58
Feb-14       103.87      109.40      107.03       95.40      103.75       96.15      104.65       98.46
Mar-14       105.57      110.02      106.95      100.78      105.82       97.62      105.43       99.25
Apr-14       106.76      110.89      106.38      101.06      107.97       98.46      106.68      100.38
May-14       107.03      110.70      104.20      101.53      108.96       98.51      106.85      102.15
Jun-14       106.08      109.68      101.14       98.08      107.87       98.57      106.87      102.87
Jul-14       106.71      109.91      101.01       94.26      108.60       98.42      106.25      102.99
Aug-14       108.60      111.83      102.90       96.71      108.37       98.27      107.91      104.17

Source: Premise. Indexes rebased to June 2013.


Chart 1: Brazil's food-at-home CPI versus the Premise FSI, Jan 2013-Aug 2014

Data Source: IBGE and Premise


Chart 2: Consumer Food Price Inflation, Brazil's official statistics versus the Premise Food Staples Index, Jan 2013-Aug 2014


Table 4: Using Premise data to predict consumer food price inflation: Mean Absolute Prediction Errors (MAPE) and Lead Times

Columns 7-day to 30-day: predicted food inflation using daily average Premise indices (Ft).

Month                 7-day    15-day    21-day    30-day   IBGE value (At)
April                  0.81      0.60      0.55      0.50              1.52
May                    0.09      0.11      0.17      0.12              0.41
June                  -0.44     -0.47     -0.49     -0.43             -0.60
July                   0.28      0.34      0.32      0.30             -0.51
August                 0.69      0.69      0.83      0.88             -0.61
MAPE, April-Aug        0.97      0.97      0.95      0.97
MAPE, April-June       0.15      0.08      0.08      0.07
Lead Time           25 days   17 days   10 days    2 days

Source data: Premise and IBGE.
