Database and Analytics F
Name
Course
Instructor
Date
DATABASE AND ANALYTICS
Table of Contents
Database and Analytics
Background
Data analytics
Forms of data analytics applications
The data analytics process
Forms of data analytics
Some applications of data analytics
Big data analytics
Traditional data analysis
Big data analytics methods
The big data processing model
Significance of Big Data and Data analytics
The significance
Enhancing Competitive Advantages over Competitors
Conclusion
References
Background
During the 1980s, vendors of relational database management systems introduced data
warehousing technology, categorized as online analytical processing (OLAP), to support
business decisions and business intelligence (Kwon, Lee & Shin, 2014). It was initially
developed to archive the large amounts of data linked to the production database and so keep
that database lean for effective performance. In data warehousing, several copies of the data
are located on several database servers that are known as data marts. A data mart may be
independent or, in other cases, an enterprise data mart. These data are subsequently extracted and loaded
into several analytical data marts. The data analysts tend to develop their algorithms that are
Data analytics
Data analytics is the science of analyzing raw data in order to draw conclusions about the
information it contains (Kambatla, et al., 2014). Most of the techniques and processes
involved in data analytics have been automated into mechanical processes and algorithms that
work over raw data for human consumption. Data analytics techniques are used to reveal trends
and metrics that would otherwise be lost in the mass of information (Raghupathi & Raghupathi,
2014). The information can subsequently be used for the optimization of processes as a way of
According to Kambatla et al. (2014), data analytics is a broad term that covers an
assortment of different forms of data analysis. Any form of information can be subjected to
data analytics techniques to generate insights that can be used to improve things. For instance,
companies often record the downtime, runtime, and work queue of their different machines and
then analyze the data to plan workloads so that the machines function close to their peak
capacity (Kwon, Lee & Shin, 2014).
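The machine example above can be sketched in a few lines of Python. This is a minimal illustration only: the machine names, the hour figures, and the greedy rule for placing queued jobs are assumptions made for the sketch and do not come from the cited studies.

```python
from dataclasses import dataclass


@dataclass
class MachineLog:
    """Hypothetical log entry for one machine (all hours are made-up values)."""
    name: str
    runtime_h: float   # hours spent running
    downtime_h: float  # hours spent idle or broken
    queued_h: float    # hours of work already waiting in the queue

    @property
    def utilization(self) -> float:
        # Share of recorded time during which the machine was actually running.
        total = self.runtime_h + self.downtime_h
        return self.runtime_h / total if total else 0.0


def assign_jobs(machines, job_hours):
    """Greedy plan: the longest job goes to the machine with the shortest queue."""
    load = {m.name: m.queued_h for m in machines}
    plan = {m.name: [] for m in machines}
    for job in sorted(job_hours, reverse=True):
        target = min(load, key=load.get)
        plan[target].append(job)
        load[target] += job
    return plan, load


logs = [
    MachineLog("press_A", runtime_h=70.0, downtime_h=10.0, queued_h=4.0),
    MachineLog("press_B", runtime_h=40.0, downtime_h=40.0, queued_h=1.0),
]
plan, load = assign_jobs(logs, [3.0, 2.0, 1.0])
for m in logs:
    print(m.name, f"utilization={m.utilization:.3f}", "new jobs:", plan[m.name])
```

The utilization figure summarizes what has already happened, while the job plan shows how the same records could feed a workload decision.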
Gandomi and Haider (2015) assert that data analytics can do more than indicate bottlenecks
in production. Gaming companies rely on data analytics to design reward schemes that keep most
players active in the game. Content companies, on the other hand, use much of the same data
analytics to keep users watching, clicking, and reorganizing content to get an additional view
or click
The data analytics process is characterized by some core elements that are required for
any initiative. By combining these elements, a successful data analytics initiative offers a
clear picture of where an organization is, where it has been, and where it ought to go. In
general, the process starts with descriptive analytics (Najafabadi, et al., 2015), which
describes the historical trends that are evident in the data being examined. Descriptive
analytics seeks to answer the question of what happened. It often involves measuring
traditional indicators like return on investment, although the indicators used differ from
industry to industry (Najafabadi, et al., 2015). Descriptive analytics does not make
predictions or directly inform decisions, since its main focus is on summarizing data in a
meaningful, descriptive manner.
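As a small illustration of descriptive analytics, the sketch below computes return on investment for a few periods and summarizes it. The period labels and the gain and cost figures are invented for the example; the formula used is the common definition ROI = (gain - cost) / cost.

```python
from statistics import mean, median


def roi(gain: float, cost: float) -> float:
    """Return on investment as a fraction of the cost."""
    return (gain - cost) / cost


# Hypothetical (gain, cost) pairs per reporting period.
campaigns = {
    "Q1": (12000.0, 10000.0),
    "Q2": (15000.0, 10000.0),
    "Q3": (9000.0, 10000.0),
}

roi_by_period = {period: roi(gain, cost) for period, (gain, cost) in campaigns.items()}

# Descriptive summary: it reports what happened and makes no prediction.
summary = {
    "mean": mean(list(roi_by_period.values())),
    "median": median(list(roi_by_period.values())),
    "best": max(roi_by_period, key=roi_by_period.get),
    "worst": min(roi_by_period, key=roi_by_period.get),
}
print(roi_by_period)
print(summary)
```

Nothing in the summary says why the third period underperformed or what will happen next; those questions belong to the other forms of analytics.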
The next integral part of data analytics is advanced analytics. It is the part of data
science that takes advantage of advanced tools to extract data, make predictions, and
establish trends. The tools encompass classical statistics in addition to machine learning
(Tsai, et al., 2015). Machine learning technologies, including neural networks, sentiment
analysis, and natural language processing among others, make advanced analytics possible.
The information offers a new insight from data, addressing the
Access to machine learning techniques, huge data sets, and cheap computing power has
facilitated the use of these methods in most industries (Hu, et al., 2014). The collection of
extensive data sets has been integral in allowing these techniques to be used. Data analytics
thus makes it possible for businesses to draw important conclusions from complex and varied
sources of data, which has been promoted by advances in the parallel processing
Data analytics encompasses the processes of evaluating data sets to draw conclusions
about the information they contain, with the support of specialized systems and software.
Data analytics technologies and techniques are extensively used in commercial industries to
help organizations make more informed business decisions, and by scientists and researchers
to verify or disprove
The term data analytics mainly refers to a range of applications, from basic business
intelligence, reporting, and online analytical processing to the diverse forms of advanced
analytics (Wang, Kung & Byrd, 2018). In this sense, it is similar to business analytics,
which is another umbrella term that
is employed for approaches to data analysis, the sole difference being that business
analytics is directed at business uses whereas data analytics has a broader scope. This
broad view of the term is not universal: in some instances people use the term data analytics
to mean advanced analytics specifically, thus treating business intelligence as a separate
category (Wang,
Data analytics initiatives are employed by businesses to help increase their revenues,
improve their operational efficiency, optimize their marketing campaigns and customer
service efforts, react quickly to emerging market trends, and gain a competitive edge over
their rivals, all with the objective of boosting overall business performance (Ousterhout,
et al., 2015). Depending on the specific application, the data analyzed could consist of
either historical records or new information that has been processed for real-time uses.
Further, the information could come from a mix of internal
At a high level, methodologies of data analytics include exploratory data analysis,
which seeks to determine patterns and relationships in data, and confirmatory data analysis,
which employs statistical techniques to determine whether hypotheses about a data set are
true or false (Wamba, et al., 2017). Data analytics can additionally be separated into
qualitative and quantitative data analysis. Quantitative data analysis entails the analysis
of numerical data using quantifiable variables that can be measured or compared
statistically. The qualitative approach, on the other hand, is more interpretative in the sense that it
seeks to understand the content of non-numerical data such as images, text, video, and audio,
including common phrases, themes, and points of view (Wamba, et al., 2017).
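The distinction between exploratory and confirmatory analysis drawn above can be made concrete with a short quantitative sketch. The exploratory step looks for a relationship with a Pearson correlation, and the confirmatory step tests a stated hypothesis with a permutation test. The data values, the choice of a permutation test, and the function names are assumptions made for illustration.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def permutation_p_value(a, b, n_iter=2000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Null hypothesis: the group labels are interchangeable.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_iter + 1)


# Exploratory: is there a visible relationship between the two variables?
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 6, 8, 10]
r = pearson(ad_spend, sales)

# Confirmatory: is the difference between two groups unlikely under the null?
group_a = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2]
group_b = [12.0, 11.8, 12.3, 12.1, 11.9, 12.2]
p = permutation_p_value(group_a, group_b)

print(f"correlation r = {r:.3f}, permutation p-value = {p:.4f}")
```

A small p-value here leads to rejecting the hypothesis that the two groups share the same mean, which is the kind of true-or-false judgment confirmatory analysis is meant to deliver.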
At the application level, business intelligence provides business executives and other
corporate workers with actionable information about key performance indicators, customers,
and business operations, among others. In the past, data queries and reports were commonly
created for end users by business intelligence developers who worked in IT or in a
centralized business intelligence team (Ghazal, et al., 2013). Presently, organizations are
increasingly using self-service business intelligence tools that allow executives, business
analysts, and operational workers to run their own ad hoc queries and create reports
The more advanced forms of data analytics include data mining, the process of sorting
through large sets of data to identify patterns, trends, and relationships, and predictive
analytics, which seeks to anticipate customer behavior, equipment failure, and other
possible failures (Sun, et al., 2016). Another form is machine learning, an artificial
intelligence technique that employs automated algorithms to churn through data sets faster
than data scientists can through conventional analytical modeling. Big data analytics applies
predictive analytics, data mining, and machine learning tools to sets of big data that
commonly contain unstructured and semi-structured data. Text mining offers a way of
analyzing emails, documents, and other text-based content (Ghazal, et al., 2013).
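As a minimal sketch of text mining, the code below counts the most frequent terms across a few short messages. The sample messages and the tiny stop-word list are invented for illustration; real text mining pipelines typically add steps such as stemming and much larger stop-word lists.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "is", "to", "and", "of", "my", "was", "for"}


def term_frequencies(documents):
    """Count how often each non-stop-word term appears across all documents."""
    counts = Counter()
    for doc in documents:
        tokens = re.findall(r"[a-z']+", doc.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts


# Hypothetical customer emails.
emails = [
    "The delivery was late and the package was damaged.",
    "Late delivery again; refund requested for the damaged item.",
]

frequencies = term_frequencies(emails)
top_terms = frequencies.most_common(3)
print(top_terms)
```

Even this simple count surfaces recurring themes (late deliveries, damaged goods) that would be tedious to find by reading every message.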
Data analytics initiatives support a wide assortment of business uses.
For instance, banks and credit card companies analyze customer withdrawal and
spending patterns as a way of preventing fraud and identity theft (Elgendy & Elragal, 2014).
seek to identify the website visitors who have a higher likelihood of buying a specific product or
service depending on the navigation and page viewing patterns (Sun, et al., 2016). Mobile
network operators, on the other hand, evaluate customer data to ensure they forecast churn thus
allowing them to take steps that will be used to prevent defections to their rivals, boost their
relationship management analytics that seeks to segment customers for marketing campaigns and
further equip their call center employees with up-to-date information relating to their callers.
Healthcare organizations on the other hand mine patient data that is used in the evaluation of the
Data analytics applications involve more than mere data analysis. Especially in advanced
analytics projects, most of the required work is conducted upfront: collecting, integrating,
and preparing data, followed by developing, testing, and revising analytical models to ensure
that they produce accurate results. Besides data scientists and other data analysts, the
analytics team commonly includes data engineers whose job is to help ensure that data sets
are ready for analysis (Gupta & George, 2016).
Data analytics is an extensive field that has four fundamental forms: descriptive,
diagnostic, predictive, and prescriptive analytics. Each of these forms has a different objective and different
Descriptive analytics helps answer the question of what happened. Its techniques
summarize large datasets in order to describe outcomes to stakeholders. Through
the creation of key performance indicators, these techniques can help track
successes or failures. Metrics like return on investment are employed in most industries
(Marjani, et al., 2017). Further, specialized metrics are used to evaluate performance in
some industries. The process requires the collection of relevant data, processing of the
data, analysis, and visualization. The process offers the necessary insights into past performance
Diagnostic analytics, on the other hand, answers the question of why things happened.
Its techniques supplement the more basic descriptive analytics: they take the findings of
descriptive analytics and dig deeper to establish the cause. Performance indicators are
further investigated to discover why they became better or worse (Qin, 2014). This generally
occurs in three phases: first, the identification of anomalies in the data, which could be
unexpected changes in a metric or a specific market; second, the collection of data related
to the identified anomalies; and lastly, the use of statistical
techniques along with machine learning methodologies like decision trees, neural networks, and
Prescriptive analytics helps answer the question of what should be done. Using insights
generated by predictive analytics, data-driven decisions can be made (Loebbecke & Picot,
2015). This makes it possible for a business to reach informed decisions when faced with
uncertainty. Prescriptive analytics methodologies depend on machine learning techniques that
can establish patterns in large datasets. Through the analysis of past events and decisions, the
The adoption of data analytics is considered to be extensive. Analysis of the data can
in an increasingly competitive world since data analytics can be employed with immense success
One of the earliest adopters of data analytics was the financial sector (Qin, 2014).
employed in the prediction of market trends as well as the assessment of risk. Credit scores
are one example of the ways in which data analytics affects everyone. The scores rely on
numerous data points to establish the risk of lending. Data analytics is also used to detect
and prevent fraud, improving efficiency and curtailing risk for financial institutions
(Ahmed, et al., 2017).
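One simple way to illustrate fraud screening of the kind mentioned above is to flag transactions that lie far from a customer's usual spending. The sketch below uses a z-score rule; the transaction amounts, the threshold of 2.0, and the function name are assumptions, and real fraud systems rely on far richer features and models.

```python
from statistics import mean, stdev


def flag_anomalies(amounts, threshold=2.0):
    """Return (index, amount) pairs whose z-score exceeds the threshold.

    With the sample standard deviation and n observations, a single value
    can never score above (n - 1) / sqrt(n), so the threshold has to stay
    modest for short histories like the one below.
    """
    mu, sigma = mean(amounts), stdev(amounts)
    if sigma == 0:
        return []
    return [(i, x) for i, x in enumerate(amounts) if abs(x - mu) / sigma > threshold]


# Hypothetical card transactions for one customer.
history = [42.0, 38.5, 51.0, 45.2, 40.1, 47.3, 39.9, 980.0, 44.0, 43.5]
suspicious = flag_anomalies(history)
print(suspicious)
```

Because the outlier itself inflates the mean and the standard deviation, production systems often prefer robust statistics such as the median and the median absolute deviation.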
The use of data analytics goes beyond the maximization of profits and return on
investment. Data analytics can provide vital information for the healthcare sector in the
form of health informatics, for the prevention of crime, and for the protection of the
environment. These applications rely on the same techniques to improve the world (Kelleher,
Mac Namee, & D'arcy, 2015).
While statistics and data analysis have long been employed in research, advanced analytics
techniques together with big data make it possible to attain numerous new insights. These
techniques can establish trends in complex systems; scientists, for instance, rely on machine
learning for the protection of wildlife (Kelleher, Mac Namee, & D'arcy, 2015). The use of data
analytics in the healthcare sector is already extensive. Prediction of patient outcomes,
efficient allocation of funds, and improvement of diagnostics are just a few
examples of the ways in which data analytics has been revolutionizing healthcare (Duan & Xiong,
through the improvement of drug discovery while the pharmaceutical companies rely on data
analytics to comprehend the market for drugs and predict sales (Duan & Xiong, 2015).
The internet of things (IoT) is another field that has been growing rapidly alongside
machine learning. IoT devices offer an excellent opportunity for data analytics. They commonly
contain numerous sensors that gather meaningful data points for their operations (Akter, et
al., 2016). Devices such as the Nest thermostat track movement and temperature to regulate
heating and cooling. Smart devices further use data to learn from and subsequently predict an
individual's behavior. The outcome is access to advanced home automation that can
chains, assessing floor operations, examining consumer sentiment, or anything else that
relates to large-scale analytic challenges, big data has been exerting a significant
impact on the enterprise. The volume of business data being generated has increased
steadily over the years, and more and more forms of information are stored in digital formats
(Zhou, et al., 2014). One of the main challenges is determining how to deal with the
new types of data sources, transactions, selected events, or blog posts. Collecting different
types of data very fast does not by itself add any value. It is imperative to integrate
analytics, as it helps uncover the underlying insights that add value to the business (Zhou,
et al., 2014).
Of the numerous data models, the relational model has dominated since the 1980s, with
implementations such as MySQL, Oracle Database, and Microsoft SQL Server known as
relational database management systems (RDBMS) (Das & Kumar, 2013). In the recent past, however, in an
ever-increasing number of cases the use of relational databases has led to problems, both
because of deficits in data modeling and because of constraints on horizontal scalability
across multiple servers and large amounts of data. Two core trends have been
generating challenges for the database and analytics fields (Slavakis, Giannakis & Mateos,
2014). The first is the exponential growth in the volume of data generated by users, sensors,
and systems, further accelerated by the concentration of a large portion of that volume in
big distributed systems such as Google, Amazon, and other cloud services. The second is the
ever-increasing interdependency and complexity of data, accelerated by Web 2.0, the
internet, social networks, and open, standardized access to data sources from a huge
Entities that collect large amounts of unstructured data are increasingly turning to
non-relational databases, now referred to as NoSQL databases. NoSQL databases focus on the
analytical processing of large-scale datasets and provide high scalability over commodity
hardware (Puiu, et al., 2016). The computational and storage needs of applications such as
big data analytics, social networking, and business intelligence over petabyte datasets have
pushed SQL-like centralized databases to their limits. The
data stores that are called NoSQL databases like Google BigTable and its open-source
implementation HBase (Verhoef, Kooge, & Walk, 2016). The emergence of distributed key-value
stores like Voldemort and Cassandra highlights the efficiency and cost-effectiveness of their
approach. The major limitation associated with RDBMSs is the difficulty of scaling with data
warehousing, Web 2.0, grid, and cloud applications (Verhoef, Kooge & Walk, 2016).
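The horizontal scalability of key-value stores can be illustrated with a toy, in-memory store that spreads keys across several shards by hashing the key. This is a sketch only: the class and method names are hypothetical, the shards are ordinary dictionaries rather than separate servers, and simple modulo partitioning is used here instead of the more elaborate schemes (such as consistent hashing) found in production systems.

```python
import hashlib


class ShardedKVStore:
    """Toy key-value store that partitions keys across in-memory shards."""

    def __init__(self, n_shards: int):
        # Each dict stands in for one commodity server.
        self.shards = [dict() for _ in range(n_shards)]

    def _shard_for(self, key: str) -> int:
        # A stable hash (unlike Python's salted built-in hash) picks the shard.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key: str, value) -> None:
        self.shards[self._shard_for(key)][key] = value

    def get(self, key: str, default=None):
        return self.shards[self._shard_for(key)].get(key, default)


store = ShardedKVStore(n_shards=4)
for i in range(1000):
    store.put(f"user:{i}", {"visits": i})

sizes = [len(shard) for shard in store.shards]
print("keys per shard:", sizes)
print(store.get("user:42"))
```

Every read or write touches exactly one shard, so capacity can grow by adding shards. The price is that operations spanning many keys, such as the joins relational databases offer, are no longer straightforward.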
Big data refers to colossal data volumes that cannot be processed effectively with
traditional applications. The processing of big data starts with raw data that is not
aggregated and is on most occasions impossible to store in the memory of a single computer
(Xiang, et al., 2015). Big data is a buzzword for the huge volumes of data, both structured
and unstructured, that flood a business on a daily basis. Big data can be analyzed for
insights that result in
Big data analysis predominantly entails the analytical methods of big data, the systematic
architecture of big data, and big data mining, besides software analysis. Data investigation,
in this case, is the most integral step in big data, used for the exploration of meaningful
values and for offering suggestions and decisions (Raghupathi & Raghupathi, 2014). It, however, follows that
Traditional data analysis means the use of appropriate statistical methods to analyze
large volumes of data, exploring and extracting the information concealed in a complex data
set so that the value of the data can be maximized. Data analysis guides development plans
for a country, the anticipation of customer demands, and the forecasting of market trends for
an organization (Gandomi & Haider, 2015). Big data analytics can be regarded as the analysis
of a special form of data. Thus most of the
In the big data era, everyone wants to extract the core value and key information from
extensive datasets to attain their organizational objectives. Presently, the
The core idea of the Bloom filter method is that a bit array is used to store the hash
values of elements rather than the elements themselves. The bit array is in essence a bitmap
index that stores a lossy compression of the hashed elements. The main advantage of the
method is its high space efficiency in addition to its high query speed. The disadvantage is
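A Bloom filter as described in this paragraph can be sketched as follows. The bit-array size, the number of hash functions, and the way the hash functions are derived from SHA-256 are illustrative assumptions rather than recommended settings.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: a bit array plus several hash functions."""

    def __init__(self, n_bits: int = 1024, n_hashes: int = 3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray((n_bits + 7) // 8)  # one bit per position

    def _positions(self, item: str):
        # Derive n_hashes positions by prefixing the item with a seed.
        for seed in range(self.n_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


bf = BloomFilter()
for name in ["alice", "bob", "carol"]:
    bf.add(name)

print("alice" in bf)        # added items are always reported as present
print(len(bf.bits), "bytes used, regardless of how long the items are")
```

Only bits are stored, never the items, which is where the space efficiency comes from. The same property means a lookup can report an item that was never added (a false positive), although an item that was added is never missed.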
Hashing is a method that transforms data into short numeric values. Its advantages are
fast reading, writing, and querying, although it is difficult to find an appropriate hash
function (Tsai, et al., 2015). An index is an effective method for reducing the cost of disk
reads and writes and for improving the speed of queries, deletions, insertions, and
modifications. The demerit of the method, however, is the extra cost of storing
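The trade-off between a full scan and an index lookup can be shown with a small hash-based index built on a Python dictionary. The records, column names, and helper functions are hypothetical and chosen only to make the idea concrete.

```python
from collections import defaultdict

# Hypothetical table of records.
records = [
    {"id": 1, "city": "Oslo", "amount": 120},
    {"id": 2, "city": "Lima", "amount": 75},
    {"id": 3, "city": "Oslo", "amount": 310},
    {"id": 4, "city": "Cairo", "amount": 42},
]


def build_index(rows, column):
    """Map each value of `column` to the positions of the rows holding it."""
    index = defaultdict(list)
    for pos, row in enumerate(rows):
        index[row[column]].append(pos)
    return index


def scan(rows, column, value):
    """Without an index: every row has to be inspected."""
    return [row for row in rows if row[column] == value]


def lookup(rows, index, value):
    """With an index: hash the value and jump straight to the matching rows."""
    return [rows[pos] for pos in index.get(value, [])]


city_index = build_index(records, "city")
print(lookup(records, city_index, "Oslo"))
```

The dictionary `city_index` is extra data that has to be stored and kept up to date whenever rows are inserted, deleted, or modified, which is the cost side of the faster lookups.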
A trie is a method mostly employed for fast retrieval; it improves query efficiency by
using the common prefixes of character strings to minimize comparisons (Elgendy & Elragal,
2014). Parallel computing, in contrast to serial computing, refers to the simultaneous use of
several computing resources to complete a task. The fundamental idea is to decompose a
problem and assign the pieces to separate processes for the attainment of co-processing
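The trie mentioned at the start of this paragraph can be sketched as a tree of dictionaries in which words with a common prefix share the same path from the root. The class names and the sample words are assumptions made for illustration.

```python
class TrieNode:
    """One node of the trie: outgoing edges plus an end-of-word marker."""
    __slots__ = ("children", "is_word")

    def __init__(self):
        self.children = {}
        self.is_word = False


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        node = self.root
        for ch in word:
            # Shared prefixes reuse existing nodes instead of creating new ones.
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def starts_with(self, prefix: str):
        """Return all stored words that begin with `prefix`, sorted."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        results, stack = [], [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.is_word:
                results.append(text)
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        return sorted(results)


trie = Trie()
for word in ["data", "database", "datum", "index"]:
    trie.insert(word)

matches = trie.starts_with("dat")
print(matches)
```

Reaching the subtree for a prefix costs one step per character of the prefix, independent of how many words are stored, which is why the structure suits fast retrieval and autocomplete-style queries.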
The idea of big data is used in the context of datasets that cannot be categorized,
accessed, managed, analyzed, or processed by present tools. Different definitions of big data
have been provided by its various users and analysts, such as research scholars, technical
practitioners, and data analysts (Gandomi & Haider, 2015). The most widely accepted
definition of big data is a dataset that cannot be captured, managed, and processed by
general-purpose computers within an acceptable scope.
Big data technologies describe a new generation of technologies and architectures
designed to economically extract value from very large volumes of a wide variety of data by
enabling high-velocity capture, discovery, and analysis of the data (Belle, et al., 2015). In
this context, the attributes of big data can be high volume, varied forms and structures of
data, quick development as well
The 4Vs definition sheds light on the meaning of big data in the context of the
assessment of concealed values. The definition highlights the most pertinent aspect of big
data, namely the new value that is created from datasets (Ousterhout, et al., 2015).
The META group research offers a three-tier structure for the big data mining model.
Tier I emphasizes low-level data access and computing. Tier II emphasizes information
sharing and privacy, along with the
domains and knowledge of big data applications. Tier III supports the
Big data analytics is an often complex process that involves the examination of large
and varied data sets to uncover underlying information such as hidden patterns, market
trends, unknown correlations, and customer preferences that could help organizations
reach informed business decisions (Ahmed, et al., 2017). On a broad scale, data analytics
technologies and techniques offer a means of analyzing data sets and drawing conclusions
intelligence queries respond to the basic question concerning business operations as well as
The significance
software and high-powered computing systems, allow big data analytics to offer an assortment
Big data analytics and data analytics applications allow big data analysts,
predictive modelers, data scientists, statisticians, and other analytics professionals to
analyze the growing volumes of structured transactional data that are left untapped by the
conventional BI and analytics programs (Zhou, et al., 2014). Structured as well as
unstructured data may not fit well in traditional data warehouses that are founded on
relational databases oriented to structured data sets. Additionally, data warehouses may not
be able to handle the processing demands posed by sets of big data that need to be updated
constantly or even continually, as is the case with real-time data on stock
Conclusion
Overall, big data and data analytics applications encompass data that comes from both
internal as well as external sources like demographic data on consumers, weather data compiled
become prevalent in the big data and data analytics environment as users attempt to perform
real-time analytics on data that is fed into different systems via stream processing engines.
As evident from the analysis, big data and data analytics help organizations harness their
data and use it to identify new opportunities. This subsequently leads to smarter business
moves, more efficient operations, and improved profits, besides happier customers. Big data
and data analytics add value to organizations by reducing the overall cost of storage,
improving decision making, and promoting the creation of new products and services via the ease of
References
Ahmed, E., Yaqoob, I., Hashem, I. A. T., Khan, I., Ahmed, A. I. A., Imran, M., & Vasilakos, A. V. (2017).
Akter, S., Wamba, S. F., Gunasekaran, A., Dubey, R., & Childe, S. J. (2016). How to improve firm
performance using big data analytics capability and business strategy alignment?. International
Belle, A., Thiagarajan, R., Soroushmehr, S. M., Navidi, F., Beard, D. A., & Najarian, K. (2015). Big data
Das, T. K., & Kumar, P. M. (2013). Big data analytics: A framework for unstructured data
Duan, L., & Xiong, Y. (2015). Big data analytics and business analytics. Journal of Management
Analytics, 2(1), 1-21.
Elgendy, N., & Elragal, A. (2014, July). Big data analytics: a literature review paper. In Industrial
Gandomi, A., & Haider, M. (2015). Beyond the hype: Big data concepts, methods, and
Ghazal, A., Rabl, T., Hu, M., Raab, F., Poess, M., Crolotte, A., & Jacobsen, H. A. (2013, June). BigBench:
towards an industry standard benchmark for big data analytics. In Proceedings of the 2013 ACM
Gupta, M., & George, J. F. (2016). Toward the development of a big data analytics capability. Information
Hu, H., Wen, Y., Chua, T. S., & Li, X. (2014). Toward scalable systems for big data analytics: A
Kelleher, J. D., Mac Namee, B., & D'arcy, A. (2015). Fundamentals of machine learning for predictive
data analytics: algorithms, worked examples, and case studies. MIT Press.
Kwon, O., Lee, N., & Shin, B. (2014). Data quality management, data usage experience and acquisition
Loebbecke, C., & Picot, A. (2015). Reflections on societal and business model transformation arising from
digitization and big data analytics: A research agenda. The Journal of Strategic Information
Systems, 24(3), 149-157.
Marjani, M., Nasaruddin, F., Gani, A., Karim, A., Hashem, I. A. T., Siddiqa, A., & Yaqoob, I. (2017). Big
IoT data analytics: architecture, opportunities, and open research challenges. IEEE Access, 5,
5247-5261.
Najafabadi, M. M., Villanustre, F., Khoshgoftaar, T. M., Seliya, N., Wald, R., & Muharemagic, E. (2015).
Deep learning applications and challenges in big data analytics. Journal of Big Data, 2(1), 1.
Ousterhout, K., Rasti, R., Ratnasamy, S., Shenker, S., & Chun, B. G. (2015). Making sense of
Puiu, D., Barnaghi, P., Toenjes, R., Kümper, D., Ali, M. I., Mileo, A., ... & Gao, F. (2016). Citypulse: Large
Qin, S. J. (2014). Process data analytics in the era of big data. AIChE Journal, 60(9), 3092-3100.
Raghupathi, W., & Raghupathi, V. (2014). Big data analytics in healthcare: promise and potential. Health
(statistical) learning tools for our era of data deluge. IEEE Signal Processing Magazine, 31(5), 18-31.
Sun, Y., Song, H., Jara, A. J., & Bie, R. (2016). Internet of things and big data analytics for smart and
Tsai, C. W., Lai, C. F., Chao, H. C., & Vasilakos, A. V. (2015). Big data analytics: a survey. Journal of Big
Data, 2(1), 21.
Verhoef, P., Kooge, E., & Walk, N. (2016). Creating value with big data analytics: Making smarter
Wamba, S. F., Gunasekaran, A., Akter, S., Ren, S. J. F., Dubey, R., & Childe, S. J. (2017). Big data
Research, 70, 356-365.
Wang, Y., Kung, L., & Byrd, T. A. (2018). Big data analytics: Understanding its capabilities and potential
Xiang, Z., Schwartz, Z., Gerdes Jr, J. H., & Uysal, M. (2015). What can big data and text analytics tell us
Management, 44, 120-130.
Zhou, Z. H., Chawla, N. V., Jin, Y., & Williams, G. J. (2014). Big data opportunities and challenges:
magazine, 9(4), 62-74.