
Why do we need Staging Area during ETL Load

Written by DWBI Concepts Team

Last Updated: 31 December 2014

"We have a simple data warehouse that takes data from a few RDBMS source systems and loads the data into the
dimension and fact tables of the warehouse. I wonder why we have a staging layer in between. Why can't
we process everything on the fly and push it into the data warehouse?"
Last night, I received this question from one of the members of the DWBI Concepts community over email and
thought of discussing the pros and cons of having a staging layer in this article.
Strictly speaking, a staging area is not a necessity if we can handle everything on the fly. But can we? Here are a few reasons
why you can't avoid a staging area:
1. Source systems are only available for extraction during a specific time slot, which is generally shorter
than your overall data loading time. It's a good idea to extract the data and keep it at your end before
you lose the connection to the source systems.
2. You want to extract data based on conditions that require you to join two or more different
systems together. E.g. you want to extract only those customers who also exist in some other
system. A single SQL query usually cannot join tables that live in two physically different
databases, at least not without database links or a federation layer.
3. Various source systems have different allotted timings for data extraction.
4. The data warehouse's data loading frequency does not match the refresh frequencies of the source
systems.
5. Data extracted from the same set of source systems is going to be used in multiple places (data
warehouse loading, ODS loading, third-party applications, etc.).
6. The ETL process involves complex data transformations that require extra space to temporarily stage the
data.
7. There is a specific data reconciliation / debugging requirement that warrants the use of a staging area
for pre-load, during-load, or post-load data validations.
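To make reasons 2 and 7 concrete, here is a minimal sketch in Python using the standard sqlite3 module. Two in-memory databases stand in for two physically separate source systems; their rows are copied into a third, staging database, where an ordinary join becomes possible and row counts can be reconciled. The system names (CRM, billing), the table names, and the sample rows are all hypothetical and are not taken from any real setup.

```python
import sqlite3

def make_source(rows):
    """Create an in-memory database standing in for one source system (hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO customer VALUES (?, ?)", rows)
    conn.commit()
    return conn

def extract_to_staging(source, staging, staging_table):
    """Copy all customer rows from a source into a staging table; return the row count."""
    rows = source.execute("SELECT customer_id, name FROM customer").fetchall()
    staging.execute(f"CREATE TABLE {staging_table} (customer_id INTEGER, name TEXT)")
    staging.executemany(f"INSERT INTO {staging_table} VALUES (?, ?)", rows)
    staging.commit()
    return len(rows)

def staged_count(staging, staging_table):
    """Count rows in a staging table, for a simple post-extract reconciliation."""
    return staging.execute(f"SELECT COUNT(*) FROM {staging_table}").fetchone()[0]

def customers_in_both(staging):
    """Join the two staged extracts: keep only CRM customers that also exist in billing."""
    return staging.execute(
        "SELECT c.customer_id, c.name "
        "FROM stg_crm_customer c "
        "JOIN stg_billing_customer b ON b.customer_id = c.customer_id "
        "ORDER BY c.customer_id"
    ).fetchall()

# Two separate "source systems" with overlapping customers (sample data is made up).
crm = make_source([(1, "Asha"), (2, "Bruno"), (3, "Chen")])
billing = make_source([(2, "Bruno"), (3, "Chen"), (4, "Dara")])

staging = sqlite3.connect(":memory:")
crm_rows = extract_to_staging(crm, staging, "stg_crm_customer")
billing_rows = extract_to_staging(billing, staging, "stg_billing_customer")

# The sources can be released as soon as the extract is done (reason 1).
crm.close()
billing.close()

# Reconciliation: staged counts should match what was extracted (reason 7).
assert staged_count(staging, "stg_crm_customer") == crm_rows
assert staged_count(staging, "stg_billing_customer") == billing_rows

print(customers_in_both(staging))  # [(2, 'Bruno'), (3, 'Chen')]
```

Note that the join runs only after both source connections are closed, which is the point of staging: the heavy work happens on your side, outside the extraction window.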
Clearly, a staging area gives a lot of flexibility during data loading. Shouldn't we always have a separate staging area,
then? Is there any impact of having a staging area? Yes, there are a few.
1. A staging area increases latency, that is, the time required for a change in the source system to take
effect in the data warehouse. In a lot of real-time / near-real-time applications, a staging area is therefore
avoided.
2. Data in the staging area occupies extra space.
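The latency cost in point 1 can be illustrated with a little schedule arithmetic. The sketch below assumes purely hypothetical batch schedules (an extract into staging every 6 hours, and a staging-to-warehouse load that starts one hour later on the same cycle) and compares the result with a load that writes to the warehouse directly on the extract schedule. The schedules, function names, and timestamps are illustrative assumptions, not figures from any real system.

```python
from datetime import datetime, timedelta

def next_run(after, first_run, interval):
    """Return the first scheduled run at or after `after`."""
    if after <= first_run:
        return first_run
    # Ceiling of (after - first_run) / interval, via floor division on the negated gap.
    runs = -((first_run - after) // interval)
    return first_run + runs * interval

def latency_with_staging(change_time, extract, load):
    """Time until a source change is visible when data passes through staging."""
    staged_at = next_run(change_time, *extract)   # change lands in staging
    loaded_at = next_run(staged_at, *load)        # staging is loaded into the warehouse
    return loaded_at - change_time

def latency_direct(change_time, load):
    """Time until a source change is visible when the load writes to the warehouse directly."""
    return next_run(change_time, *load) - change_time

# Hypothetical schedules: (first run, interval).
day = datetime(2014, 12, 31)
extract = (day, timedelta(hours=6))                    # 00:00, 06:00, 12:00, ...
load = (day + timedelta(hours=1), timedelta(hours=6))  # 01:00, 07:00, 13:00, ...

change = day + timedelta(hours=10, minutes=20)         # source row changes at 10:20

print(latency_with_staging(change, extract, load))  # 2:40:00
print(latency_direct(change, extract))               # 1:40:00
```

Under these assumed schedules the extra hop through staging adds one hour. In practice the gap depends entirely on how the extract and load jobs are scheduled.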
To me, in all practical senses, the benefits of having a staging area outweigh its problems. Hence, in
general, I suggest designating a specific staging area in data warehousing projects.

Copyright: Except where otherwise noted, contents of DWBI.ORG by Intellip LLP (http://intellip.com) are licensed under
a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (https://creativecommons.org/licenses/by-nc-sa/4.0/).