Big Data Vs Data Mining: Abstract
VINAMRA MITTAL
A. Volume: massive datasets that are orders of magnitude larger than the data managed in traditional storage and analytical solutions. Imagine ...
B. Variety: complex, variable and heterogeneous data, generated in formats as different as social media, e-mail, images, video, blogs and sensor data, as well as "shadow data" such as access journals and Web search histories.
C. Velocity: data is generated as a constant stream, with real-time queries for meaningful information served on demand rather than in batches.
D. Value: meaningful insights that surface patterns from deep, complex analysis based on graph algorithms, machine learning and statistical modeling. These analytics go beyond the results of traditional querying, reporting and business intelligence.

IV. Data Mining for Big Data
Data mining involves extracting and analyzing large amounts of data to discover models for big data. The methods came out of the fields of artificial intelligence (AI) and statistics, with a bit of database management. Extracting information from data takes two major forms: prediction and description. It is often hard to know what the data shows. Data mining is used to summarize and simplify the data in a way that we can understand, and then allows us to infer things about [...] variable. For example, a seller might be interested in predicting who will respond to a promotion.
Typical algorithms used in data mining are as follows:
A. Classification trees: a popular data-mining technique used to classify a dependent categorical variable based on measurements of one or more predictor variables. The result is a tree with nodes and links between the nodes that can be read to form if-then rules.
B. Logistic regression: a statistical technique that is a variant of standard regression but extends the concept to deal with classification. It produces a formula that predicts the probability of the occurrence as a function of the predictor variables.
C. Neural networks: a software algorithm modeled after the parallel architecture of animal brains. The network consists of input nodes, hidden layers and output nodes. Each unit is assigned a weight. Data is given to the input node, and by a process of trial and error the algorithm adjusts the weights until it meets a certain stopping criterion. Some people have likened this to a black-box approach.
D. Clustering techniques like K-nearest neighbors: a technique that identifies groups of similar records. The K-nearest neighbor technique calculates the distances between a record and the points in the historical data. It then assigns this record to the class of its nearest neighbor in the data set.
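As an illustration of the logistic regression item above, the following is a minimal sketch of a formula that turns predictor variables into a probability and then into a class. The weights, the bias, the feature meanings and the function names are all hypothetical values chosen for the example; in practice the weights would be fitted to historical data.

```python
import math


def predict_probability(features, weights, bias):
    """Return p = 1 / (1 + exp(-(bias + sum(w_i * x_i))))."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def classify(features, weights, bias, threshold=0.5):
    """Turn the predicted probability into a 0/1 class label."""
    return 1 if predict_probability(features, weights, bias) >= threshold else 0


# Hypothetical promotion-response model (assumed, not fitted to real data).
# Features: [number of past purchases, opened the last e-mail (0 or 1)]
WEIGHTS = [0.8, 1.5]
BIAS = -2.0

if __name__ == "__main__":
    customer = [2, 1]
    print("probability of reply:", predict_probability(customer, WEIGHTS, BIAS))
    print("predicted class:", classify(customer, WEIGHTS, BIAS))
```

The threshold of 0.5 is only a common default; a seller who cares more about missing a likely responder than about wasted mailings might choose a lower one.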
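The K-nearest neighbor step described above (measure the distance from a new record to every point in the historical data, then assign the record to the group of its nearest neighbors) can be sketched as follows. This is a minimal sketch under assumptions: the Euclidean distance, the majority vote over k neighbors, the function names and the tiny historical data set are all illustrative choices, not taken from the paper. Note that K-nearest neighbors is usually described as a classification method rather than a clustering method; the sketch follows the description given in the text.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(record, history, k=3):
    """Label a record by majority vote among its k nearest historical points.

    history is a list of (point, label) pairs.
    """
    nearest = sorted(history, key=lambda item: euclidean(record, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical historical data: (feature vector, known group).
HISTORY = [
    ((1.0, 1.0), "responder"),
    ((1.5, 2.0), "responder"),
    ((2.0, 1.5), "responder"),
    ((6.0, 6.5), "non-responder"),
    ((7.0, 6.0), "non-responder"),
    ((6.5, 7.5), "non-responder"),
]

if __name__ == "__main__":
    print(knn_classify((1.2, 1.4), HISTORY, k=3))
    print(knn_classify((6.8, 6.9), HISTORY, k=1))
```

With k=1 the record simply takes the group of its single nearest neighbor, which matches the wording in the text; a larger odd k makes the vote less sensitive to one noisy record.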
I. CONCLUSION
Big data is expected to keep growing over the coming years, and every data scientist will have to handle a larger amount of data every year. This data will be more diverse, larger and faster. In this paper we discussed several insights about the subject and what we consider the major concerns and core challenges for the future. Big Data is becoming the new final frontier for scientific data research and for business applications. Data mining with big data will help us discover facts that nobody has discovered before.

Heterogeneous mixture learning is an advanced technology used in big data analysis. Above, we introduced the difficulties inherent in heterogeneous mixture data analysis, the basic concept of heterogeneous mixture learning and the results of a demonstration experiment on electricity demand prediction. As big data analysis grows in importance, heterogeneous mixture data mining technology is also expected to play a significant role in the market. The range of applications of heterogeneous mixture learning will expand further in the future.

To explore Big Data, we have examined a number of challenges at the data, model and system levels. To support Big Data mining, high-performance computing platforms are required, which impose systematic designs to unleash the full power of Big Data. At the data level, autonomous information sources and the variety of data collection environments often result in data with complicated conditions, such as missing or uncertain values. The essential challenge is that a Big Data mining framework needs to consider complex relationships between data sources, samples and models, along with their evolving changes over time and other possible factors. A system needs to be carefully designed so that unstructured data can be linked through their complex relationships to form useful patterns, and the growth of data volumes and relationships should help form patterns to predict trends and the future.