

Emerging Trends in Computer and Information Technology

NAME: SHIVAM D SURYAWANSHI

TYCO A

ROLL NO: 21

ASSIGNMENT 1
1. Explain the History of AI

ANS-Maturation of Artificial Intelligence (1943-1952)


o Year 1943: The first work which is now recognized as AI was done by Warren
McCulloch and Walter Pitts in 1943. They proposed a model of artificial neurons.
o Year 1949: Donald Hebb demonstrated an updating rule for modifying the
connection strength between neurons. His rule is now called Hebbian learning.
o Year 1950: Alan Turing, an English mathematician, pioneered machine learning in
1950. He published "Computing Machinery and Intelligence", in which he proposed a
test that checks a machine's ability to exhibit intelligent behavior equivalent to
human intelligence, now called the Turing test.

The birth of Artificial Intelligence (1952-1956)

o Year 1955: Allen Newell and Herbert A. Simon created the "first artificial
intelligence program", which was named "Logic Theorist". This program proved
38 of 52 mathematics theorems and found new and more elegant proofs for some
of them.
o Year 1956: The term "Artificial Intelligence" was first adopted by American computer
scientist John McCarthy at the Dartmouth Conference, and for the first time AI was
coined as an academic field.

At that time, high-level computer languages such as FORTRAN, LISP, and COBOL were
invented, and enthusiasm for AI was very high.
The golden years-Early enthusiasm (1956-1974)

o Year 1966: Researchers emphasized developing algorithms that could solve
mathematical problems. Joseph Weizenbaum created the first chatbot, named
ELIZA, in 1966.
o Year 1972: The first intelligent humanoid robot, named WABOT-1, was built in
Japan.

The first AI winter (1974-1980)

o The period between 1974 and 1980 was the first AI winter. An AI winter refers to a
time period in which computer scientists dealt with a severe shortage of government
funding for AI research.
o During AI winters, public interest in artificial intelligence decreased.

A boom of AI (1980-1987)

o Year 1980: After the AI winter, AI came back with "Expert Systems". Expert
systems were programs that emulate the decision-making ability of a human
expert.
o In the year 1980, the first national conference of the American Association for
Artificial Intelligence was held at Stanford University.

The second AI winter (1987-1993)

o The period between 1987 and 1993 was the second AI winter.
o Investors and governments again stopped funding AI research due to the high cost
and lack of efficient results. Expert systems such as XCON proved very expensive to
maintain.

The emergence of intelligent agents (1993-2011)

o Year 1997: In 1997, IBM's Deep Blue beat world chess champion Garry Kasparov
and became the first computer to defeat a reigning world chess champion.
o Year 2002: For the first time, AI entered the home in the form of the Roomba, a
robotic vacuum cleaner.
o Year 2006: AI entered the business world by 2006. Companies like Facebook,
Twitter, and Netflix started using AI.

Deep learning, big data and artificial general intelligence (2011-present)


o Year 2011: In 2011, IBM's Watson won Jeopardy!, a quiz show in which it had to
solve complex questions as well as riddles. Watson proved that it could understand
natural language and solve tricky questions quickly.
o Year 2012: Google launched the Android app feature "Google Now", which was
able to provide information to the user as a prediction.
o Year 2014: In 2014, the chatbot "Eugene Goostman" won a competition based on
the famous "Turing test".
o Year 2018: IBM's "Project Debater" debated complex topics with two master
debaters and performed extremely well.
o Google demonstrated an AI program, "Duplex", a virtual assistant that booked a
hairdresser appointment over a phone call, and the person on the other end did not
notice that she was talking to a machine.

Now AI has developed to a remarkable level. The concepts of deep learning, big data, and
data science are booming. Companies like Google, Facebook, IBM, and Amazon are
working with AI and creating amazing devices. The future of Artificial Intelligence is
inspiring and promises highly intelligent systems.

2. What is Data Mining?

ANS-Data mining, also known as knowledge discovery in data (KDD), is the process
of uncovering patterns and other valuable information from large data sets. Given the
evolution of data warehousing technology and the growth of big data, the adoption of
data mining techniques has accelerated rapidly over the last couple of decades,
assisting companies by transforming their raw data into useful knowledge. However,
even though the technology continuously evolves to handle data at a large scale,
leaders still face challenges with scalability and automation.
Data mining has improved organizational decision-making through insightful data
analyses. The data mining techniques that underpin these analyses can be divided
into two main purposes; they can either describe the target dataset or they can
predict outcomes through the use of machine learning algorithms. These methods
are used to organize and filter data, surfacing the most interesting information, from
fraud detection to user behaviors, bottlenecks, and even security breaches.
When combined with data analytics and visualization tools like Apache Spark, data
mining has never been easier, and extracting relevant insights has never been
faster. Advances within artificial intelligence only continue to expedite adoption
across industries.

The data mining process involves a number of steps from data collection to
visualization to extract valuable information from large data sets. As mentioned
above, data mining techniques are used to generate descriptions and predictions
about a target data set. Data scientists describe data through their observations of
patterns, associations, and correlations. They also classify and cluster data through
classification and regression methods, and identify outliers for use cases, like spam
detection.

Data mining usually consists of four main steps: setting objectives, data gathering
and preparation, applying data mining algorithms, and evaluating results.

1. Set the business objectives: This can be the hardest part of the data mining
process, and many organizations spend too little time on this important step. Data
scientists and business stakeholders need to work together to define the business
problem, which helps inform the data questions and parameters for a given project.
Analysts may also need to do additional research to understand the business context
appropriately.
2. Data preparation: Once the scope of the problem is defined, it is easier for data
scientists to identify which set of data will help answer the pertinent questions for the
business. Once they collect the relevant data, the data will be cleaned, removing any
noise such as duplicates, missing values, and outliers. Depending on the dataset,
an additional step may be taken to reduce the number of dimensions, as too many
features can slow down any subsequent computation. Data scientists will look to
retain the most important predictors to ensure optimal accuracy within any models.
(A minimal code sketch of steps 2 to 4 is given after this list.)
3. Model building and pattern mining: Depending on the type of analysis, data
scientists may investigate any interesting data relationships, such as sequential
patterns, association rules, or correlations. While high frequency patterns have
broader applications, sometimes the deviations in the data can be more interesting,
highlighting areas of potential fraud.
Deep learning algorithms may also be applied to classify or cluster a data set
depending on the available data. If the input data is labelled (i.e. supervised
learning), a classification model may be used to categorize data, or alternatively, a
regression may be applied to predict the likelihood of a particular assignment. If the
dataset isn’t labelled (i.e. unsupervised learning), the individual data points in the
training set are compared with one another to discover underlying similarities,
clustering them based on those characteristics.
4. Evaluation of results and implementation of knowledge: Once the data is
aggregated, the results need to be evaluated and interpreted. Final results should be
valid, novel, useful, and understandable. When these criteria are met, organizations
can use this knowledge to implement new strategies, achieving their intended
objectives.
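
Below is a minimal, hypothetical sketch of steps 2 to 4 in Python. The tiny in-memory
dataset and the choice of the pandas and scikit-learn libraries are assumptions made
purely for illustration; they are not prescribed by the process described above.

# Illustrative sketch only: invented data, assumed libraries (pandas, scikit-learn).
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Step 2: data preparation - drop duplicate rows and rows with missing values.
raw = pd.DataFrame({
    "amount":   [120.0, 120.0, 15.5, None, 890.0, 42.0, 13.0, 950.0],
    "n_items":  [3, 3, 1, 2, 12, 2, 1, 14],
    "is_fraud": [0, 0, 0, 0, 1, 0, 0, 1],   # label for the supervised case
})
clean = raw.drop_duplicates().dropna()

# Step 3a: labelled data (supervised learning) - fit a classification model.
X = clean[["amount", "n_items"]]
y = clean["is_fraud"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)
classifier = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Step 3b: unlabelled data (unsupervised learning) - cluster the same points.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Step 4: evaluation - check how well the classifier generalizes to held-out rows.
print("accuracy on held-out rows:", accuracy_score(y_test, classifier.predict(X_test)))
print("cluster assignments:", clusters.tolist())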

3. What is a Dataset?

ANS- A dataset is a set or collection of data, normally presented in a tabular pattern. Every
column describes a particular variable, and each row corresponds to a given member of the
dataset. This is a part of data management. Datasets describe the values of each variable for
quantities such as the height, weight, temperature, or volume of an object, or the values of
random numbers. Each value in the set is known as a datum. The dataset consists of data for
one or more members, one member per row. Let us learn the definition of a dataset, the
different types of datasets, and their properties, with examples.
A data set is an ordered collection of data. As we know, a collection of information obtained
through observations, measurements, study, or analysis is referred to as data. It could include
information such as facts, numbers, figures,  names, or even basic descriptions of objects. For
our study, data can be organized in the form of graphs, charts, or tables. Through data mining,
data scientists assist in the analysis of gathered data.
A dataset is a set of numbers or values that pertain to a specific topic. For example, the test
scores of each student in a certain class form a dataset. Datasets can be written as a list of
numbers in random order, as a table, or enclosed in curly brackets. Datasets are normally
labelled so that you understand what the data represents; however, while dealing with datasets
you don't always know what the data stands for, and you don't necessarily need to know what it
represents in order to solve the problem.
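
As a small illustration of this tabular view (the student names and scores below are invented,
and the use of the pandas library is an assumption, not something required by the definition),
a dataset can be held as a table in which each column is a variable and each row is a member:

# An invented dataset of test scores: columns are the variables,
# rows are the members of the dataset.
import pandas as pd

scores = pd.DataFrame({
    "name":    ["Asha", "Ravi", "Meera", "Kiran"],
    "maths":   [78, 62, 91, 55],
    "science": [84, 70, 88, 61],
})

print(scores)            # the whole dataset as a table
print(scores["maths"])   # a single variable (one column)
print(scores.iloc[0])    # a single member (one row); each value is a datum
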
In Statistics, we have different types of data sets available for different types of information. They
are:

o Numerical data sets
o Bivariate data sets
o Multivariate data sets
o Categorical data sets
o Correlation data sets

Let us discuss all these data sets with examples.

Numerical Datasets
A numerical dataset is a dataset in which the data are expressed in numbers rather than in
natural language. Numerical data is sometimes called quantitative data, and the set of all
quantitative/numerical data is called a numerical dataset. Numerical data is always in number
form, so we can perform arithmetic operations on it. Examples:

o Weight and height of a person
o The count of RBC in a medical report
o Number of pages present in a book

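As a quick sketch (the height values below are invented), numerical data supports arithmetic
operations directly:

# An invented numerical dataset: heights of five people in centimetres.
heights_cm = [162.0, 175.5, 158.0, 181.2, 169.3]

# Because the values are numbers, arithmetic operations are meaningful.
mean_height = sum(heights_cm) / len(heights_cm)
print("mean height:", round(mean_height, 1), "cm")
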
Bivariate Datasets
A dataset that has two variables is called a bivariate dataset. It deals with the relationship
between the two variables. A bivariate dataset usually contains two types of related data.
Examples:

1. The percentage score and age of the students in a class. Score and age can be
considered as the two variables.
2. The sales of ice cream versus the temperature on that day. Here the two variables used
are ice cream sales and temperature.

(Note: If you have one set of data alone, say temperature, then it is called a univariate
dataset.)
Multivariate Datasets
A dataset with multiple variables. When the dataset contains three or more variables, it is
called a multivariate dataset. In other words, a multivariate dataset consists of individual
measurements that are acquired as a function of three or more variables.
Example: If we have to measure the length, width, height, and volume of a rectangular box, we
have to use multiple variables to distinguish between those entities.

Categorical Datasets
Categorical data sets represent features or characteristics of a person or an object. A
categorical dataset consists of a categorical variable, also called a qualitative variable, that can
take exactly two values; hence, it is termed a dichotomous variable. Categorical data/variables
with more than two possible values are called polytomous variables. Qualitative/categorical
variables are often assumed to be polytomous unless otherwise specified.
Example:

o A person's gender (male or female)
o Marital status (married/unmarried)

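As a brief sketch of the distinction just described (the values are invented and the use of
pandas is an assumption), a dichotomous variable has exactly two categories while a
polytomous variable has more than two:

import pandas as pd

# Dichotomous categorical variable: exactly two possible values.
marital_status = pd.Categorical(
    ["married", "unmarried", "married", "unmarried"],
    categories=["married", "unmarried"],
)

# Polytomous categorical variable (invented example): more than two possible values.
blood_group = pd.Categorical(
    ["A", "O", "B", "AB"],
    categories=["A", "B", "AB", "O"],
)

print(list(marital_status.categories))  # ['married', 'unmarried'] -> dichotomous
print(list(blood_group.categories))     # ['A', 'B', 'AB', 'O']    -> polytomous
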
Correlation Datasets
The set of values that demonstrate some relationship with each other indicates a correlation
dataset. Here the values are found to be dependent on each other.
Generally, correlation is defined as a statistical relationship between two entities/variables. In
some scenarios, you might have to predict the correlation between things, so it is essential to
understand how correlation works. Correlation is classified into three types:

o Positive correlation – the two variables move in the same direction (either both go up or
both go down).
o Negative correlation – the two variables move in opposite directions (one variable goes up
while the other goes down, and vice versa).
o No or zero correlation – no relationship between the two variables.

Example: A tall person is considered to be heavier than a short person, so here the weight and
height variables are dependent on each other.
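
The following minimal sketch (all numbers invented; NumPy's corrcoef, which returns the
Pearson correlation coefficient, is an assumed tool) illustrates the three cases:

import numpy as np

height_cm   = np.array([150, 160, 170, 180, 190])
weight_kg   = np.array([50, 58, 66, 74, 82])            # rises with height
time_100m_s = np.array([16.0, 15.2, 14.5, 13.9, 13.1])  # falls as height rises
die_rolls   = np.array([3, 6, 2, 5, 4])                 # unrelated to height

print(np.corrcoef(height_cm, weight_kg)[0, 1])    # close to +1: positive correlation
print(np.corrcoef(height_cm, time_100m_s)[0, 1])  # close to -1: negative correlation
print(np.corrcoef(height_cm, die_rolls)[0, 1])    # near 0: no correlation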
