CH1 5
Big data (BD): results from the immense amount of data that is becoming available each
day.
Ex: Walmart collects multiple terabytes of new data every day → added to its
petabytes of historical data
A collection of large and complex datasets from different sources that are difficult
to process using traditional data management and processing applications
It can also be seen as a large amount of either organized or unorganized data
that is analyzed to make an informed decision or evaluation
At least 4 characteristics/dimensions:
1. Variety
Many different forms of data depending on the data source (phones, video,
text, retail scanners, Google searches, gov. docs, etc.)
It can be:
Structured: in databases or Excel sheets
Unstructured: writing & photography
2. Velocity
Speed w/ which data becomes available and w/ which it can be
processed
Important for ensuring data is current and updated in real time
3. Veracity
Has to do w/ data quality, correctness, accuracy
Data lacking veracity may be imprecise, unrepresentative,
untrustworthy
“Garbage in, garbage out”
4. Volume
Deals w/ the ever-increasing size of the data and databases
BD produces vast amounts of data
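As a rough illustration of the veracity point ("garbage in, garbage out"), the sketch below screens records before they are analyzed. The field names (`store`, `amount`), the sample records, and the validity rules are all hypothetical and chosen only for illustration; real data-quality checks depend on the dataset.

```python
# Hypothetical sales records; field names and values are made up for illustration.
records = [
    {"store": "A", "amount": 120.0},
    {"store": "B", "amount": None},    # missing value
    {"store": "",  "amount": 75.5},    # missing store identifier
    {"store": "C", "amount": -40.0},   # implausible negative sale
    {"store": "A", "amount": 60.0},
]


def is_valid(record):
    """Return True if the record passes the assumed quality rules."""
    amount = record.get("amount")
    return (
        bool(record.get("store"))             # store must be present
        and isinstance(amount, (int, float))  # amount must be numeric
        and amount >= 0                       # amount must be non-negative
    )


# Split the data into records that are safe to analyze and records to review.
clean = [r for r in records if is_valid(r)]
rejected = [r for r in records if not is_valid(r)]

print(f"kept {len(clean)} of {len(records)} records")
```

Only the records in `clean` would feed later analysis; the `rejected` list can be inspected to see why data was dropped.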
Business analytics: the application of processes and techniques that transform raw data into
meaningful information to improve decision-making
Bc BD sources are too large/complex, new questions have arisen that cannot be
answered w/ traditional analysis methods
New methodologies and processing techniques have been developed →
new era in decision-making (business analytics period: convert data into
actionable insight for more timely and accurate decision-making; AKA
data analytics)
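To make "convert data into actionable insight" concrete, here is a minimal sketch that turns raw transactions into a per-store summary. The transactions, the store names, and the choice of total revenue as the metric are assumptions for illustration and are not taken from the notes.

```python
from collections import defaultdict

# Hypothetical raw data: (store, sale amount) pairs.
transactions = [
    ("North", 200.0),
    ("South", 150.0),
    ("North", 50.0),
    ("East", 300.0),
    ("South", 60.0),
]

# Aggregate raw transactions into total revenue per store.
revenue_by_store = defaultdict(float)
for store, amount in transactions:
    revenue_by_store[store] += amount

# Rank stores by revenue so a decision-maker can see where sales are weakest.
ranking = sorted(revenue_by_store.items(), key=lambda item: item[1], reverse=True)
weakest_store = ranking[-1][0]

for store, total in ranking:
    print(f"{store}: {total:.2f}")
print(f"lowest revenue: {weakest_store}")
```

The raw list of transactions says little on its own; the ranked totals are the kind of meaningful information that can support a decision, such as which store to investigate first.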