
Introduction to Data

 Data is a collection of raw facts.

 These raw facts undergo processing to generate information.
 Data is the central thread of any activity.
 Understanding the nature of data is fundamental to the proper and effective use of statistical skills.

Indrani Sen, MCA, MPhil Computer Science

Information (captured and processed data)

Patient_ID | Patient Name | Medical History | Symptoms     | Diagnosis        | Medicine Prescribed
A_3162     | Rita Lamba   | High BP         | Cough & Cold | Throat Infection | Amoxicillin
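As a rough illustration of how raw facts become information, the patient row above can be read as six unlabelled values that only acquire meaning once each is attached to a named field. The class name `PatientRecord` and its field names are assumptions made for this sketch; they are not part of the original slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientRecord:
    """Hypothetical record type mirroring the column headers of the table."""
    patient_id: str
    patient_name: str
    medical_history: str
    symptoms: str
    diagnosis: str
    medicine_prescribed: str


# Raw facts: a bare sequence of values with no labels attached.
raw_facts = ["A_3162", "Rita Lamba", "High BP", "Cough & Cold",
             "Throat Infection", "Amoxicillin"]

# Processing step: attach each raw value to a named field.
record = PatientRecord(*raw_facts)

print(record.patient_name, "was diagnosed with", record.diagnosis)
```

Once the values are labelled, questions such as "what was prescribed to patient A_3162?" can be answered directly from the record.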


Types of Data


Qualitative data

 Qualitative or categorical data describe qualities rather than quantities and can't be meaningfully translated into a numerical value.
 Eye colour is an example: it has no logical order, because 'brown' is not higher or lower than 'blue'.


Quantitative data

 Quantitative data is information about quantities; that is, information that can be measured and written down with numbers.
 Some examples of quantitative data are your height and your shoe size.
 Not all numerical data is quantitative.
 One exception is the security code on your credit card: it is written in digits, but it does not measure anything.


Is each of the following qualitative or quantitative?
 The age of your car.
 The softness of a cat's fur.
 The colour of the sky.
 The number of pennies in your pocket.
 The complexion of your skin.
 The flavour of your food.


 The age of your car. (Quantitative.)
 The softness of a cat's fur. (Qualitative.)
 The colour of the sky. (Qualitative.)
 The number of pennies in your pocket. (Quantitative.)
 The complexion of your skin. (Qualitative.)
 The flavour of your food. (Qualitative.)


Data types

 Based on their mathematical properties, data are divided into four groups (NOIR):
 Nominal
 Ordinal
 Interval
 Ratio


Nominal Data

 Nominal means name and count; data are alphabetic or numerical in name only.
 They are categories without order.
 Nominal scales are used for labelling variables, without any quantitative value.
 “Nominal” scales could simply be called “labels.”
 An easy way to remember this type of data is that nominal sounds like named: nominal = named.

 A sub-type of nominal scale with only two categories (e.g. male/female) is called dichotomous.
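Because nominal values are only labels, the operations that make sense on them are counting and finding the most frequent category. A minimal sketch follows, assuming a made-up list of eye colours; the helper `is_dichotomous` is a hypothetical name introduced here.

```python
from collections import Counter

# Hypothetical nominal observations: labels with no order and no numeric value.
eye_colours = ["brown", "blue", "brown", "green", "blue", "brown"]

# Counting is a meaningful operation on nominal data.
counts = Counter(eye_colours)

# The mode (most frequent label) is the only sensible "average" here.
mode_colour, mode_count = counts.most_common(1)[0]


def is_dichotomous(values):
    """Return True when the observations fall into exactly two categories."""
    return len(set(values)) == 2


print(counts)
print("mode:", mode_colour, "seen", mode_count, "times")
print(is_dichotomous(eye_colours), is_dichotomous(["male", "female", "male"]))
```

Sorting these labels or averaging them would not mean anything, which is why only the counts and the mode are computed.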


Ordinal Data

 Ordinal data is data that is placed into some kind of order or scale.
 In ordinal data there is no standardised value for the difference from one score to the next.
 This can be explained in terms of positions in a race (1st, 2nd, 3rd, etc.).
 This is ordinal data because the runners are placed in order from the fastest finishing time to the slowest, but there is no standardised difference in time between the positions.

 An ordering such as U < G < P < D does not tell us the size of the differences between levels: the difference between any two of them cannot be said to be the same (say, the difference between U and G is not necessarily the same as that between G and P).
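The ordering U < G < P < D can be sketched in code by giving each label a rank that is used only for comparison. The sample responses and the function names below are assumptions made for illustration; the ranks are positions, not measurements, so they support sorting and a median but not subtraction or a mean.

```python
# Ordered categories; only their relative order is assumed to carry meaning.
LEVELS = ["U", "G", "P", "D"]
RANK = {level: position for position, level in enumerate(LEVELS)}


def sort_by_level(values):
    """Sort labels according to the ordinal scale, lowest level first."""
    return sorted(values, key=RANK.__getitem__)


def median_level(values):
    """Return the (lower) median label; valid because order is defined."""
    ordered = sort_by_level(values)
    return ordered[(len(ordered) - 1) // 2]


# Hypothetical survey responses.
responses = ["P", "U", "G", "G", "D", "U", "G"]

print(sort_by_level(responses))
print("median level:", median_level(responses))
```

Comparisons such as `RANK["U"] < RANK["P"]` are meaningful, while `RANK["P"] - RANK["G"]` is just an artefact of the chosen numbering.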


Interval (or Score/Mark) Data

 What is an interval scale?


 Interval scales are numeric scales in which we know not only the order, but also the exact differences between the values.
 Like the others, you can remember the key points of an “interval scale” pretty easily.
 “Interval” itself means “space in between,” which is the important thing to remember: interval scales not only tell us about order, but also about the value between each item.
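Temperature in degrees Celsius is a commonly cited interval scale; it is used here as an assumed example, since the slides do not name one. The sketch below suggests why differences are meaningful on such a scale while ratios are not: the difference between two readings is the same in Celsius and Kelvin, but the ratio changes with the choice of zero point. The readings are made up.

```python
import math

KELVIN_OFFSET = 273.15  # 0 degrees Celsius expressed in kelvin


def celsius_to_kelvin(celsius: float) -> float:
    """Shift a Celsius reading onto the Kelvin scale (which has a true zero)."""
    return celsius + KELVIN_OFFSET


# Hypothetical readings in degrees Celsius.
morning_c, noon_c = 10.0, 20.0

# Differences survive the change of zero point.
difference_c = noon_c - morning_c
difference_k = celsius_to_kelvin(noon_c) - celsius_to_kelvin(morning_c)

# Ratios do not: "twice as hot" is an artefact of where zero was placed.
ratio_c = noon_c / morning_c
ratio_k = celsius_to_kelvin(noon_c) / celsius_to_kelvin(morning_c)

print("differences agree:", math.isclose(difference_c, difference_k))
print("ratio in Celsius:", ratio_c, "ratio in Kelvin:", round(ratio_k, 3))
```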
Ratio data

 Ratio scales provide a wealth of possibilities when it comes to statistical analysis.
 These variables can be meaningfully added, subtracted, multiplied and divided (ratios).
 Central tendency can be measured by mode, median, or mean.
 Measures of dispersion, such as standard deviation and coefficient of variation, can also be calculated from ratio scales.
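The measures listed above can be computed with Python's standard `statistics` module. This is a small sketch on made-up weights in kilograms; it uses the population standard deviation (`pstdev`), which is one possible choice, and takes the coefficient of variation as the standard deviation divided by the mean.

```python
import statistics

# Hypothetical ratio-scale measurements (weights in kg); zero means "no weight".
weights_kg = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of central tendency.
mean = statistics.mean(weights_kg)
median = statistics.median(weights_kg)
mode = statistics.mode(weights_kg)

# Measures of dispersion.
std_dev = statistics.pstdev(weights_kg)      # population standard deviation
coefficient_of_variation = std_dev / mean    # unitless relative spread

# Ratios are meaningful because the scale has a true zero.
heaviest_to_lightest = max(weights_kg) / min(weights_kg)

print(mean, median, mode)
print(std_dev, coefficient_of_variation, heaviest_to_lightest)
```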

Kinds of data

 Static
 Dynamic
 Spatial
 Temporal
 Text
 Media


Static data

 It is data that doesn’t change.
 In other words, it is not real-time.
 Static data is self-contained or controlled.
 E.g., when you fill in a form, you select your city from a defined list of cities. The list doesn’t change (generally).
 So we can say that the data collected from the form is static data.
 Static data may still require some cleansing, preparation, and preprocessing before you can use it for an analysis.
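The city-list example can be sketched as a small cleansing step: values collected from a form are normalised against a fixed list before analysis. The city names and the function `clean_city` are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical fixed list of cities offered by the form; it rarely changes.
CITIES = ("Mumbai", "Delhi", "Kolkata")


def clean_city(raw):
    """Return the canonical city name for a raw form value, or None if unknown."""
    candidate = raw.strip().casefold()
    for city in CITIES:
        if city.casefold() == candidate:
            return city
    return None


# Hypothetical values as they might arrive from stored form submissions.
submitted = ["  mumbai ", "DELHI", "Kolkata", "Paris"]
cleaned = [clean_city(value) for value in submitted]

print(cleaned)
```

Values that do not match the fixed list come back as `None`, so they can be reviewed or dropped before the analysis.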


[Screenshot: the Aquarius page of www.astrology-zodiac-signs.com/zodiac-signs/aquarius/, listing attributes of the sign such as Quality (Fixed), Color (Light-Blue, Silver), Day (Saturday), Ruler (Uranus, Saturn) and Lucky Numbers (4, 7, 11, 22, 29), followed by its strengths, weaknesses, likes and dislikes.]

Streaming or dynamic data

 In data management, the time scale of the data determines how it is processed and stored.
 Dynamic data or transactional data is information that is periodically updated, meaning it changes asynchronously over time as new information becomes available.


[Screenshot: the front page of a news website showing the day's headlines, e.g. "UAE says no financial aid offered to Kerala", "India's first Humboldt penguin dies" and "India win rowing gold but coach future uncertain".]
Spatial data

 Census data
 NASA satellite imagery (terabytes of data per day)
 Weather and climate data
 Rivers, farms, ecological impact
 Medical imaging

 Army Field Commander: Has there been any significant enemy troop movement since last night?
 Insurance Risk Manager: Which homes are most likely to be affected in the next great flood on the Mississippi?
 Medical Doctor: Based on this patient's MRI, has the size of the tumor changed?
 Molecular Biologist: Does this DNA sequence match that of a criminal in the database?
Traffic data:


 Temporal data denotes the state of an object over a period of time.
 Stock market data
 Camera recording body movements
 Car moving in a plane
 Plane moving in 3D space


- An Unmanned Air Vehicle
- Tuples are of the form: (time, x, y, z)
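A track of (time, x, y, z) tuples can be processed directly. The sketch below assumes made-up samples and unspecified but consistent units; the names `Sample` and `average_speed` are hypothetical. It estimates the average speed along the recorded path by summing straight-line distances between consecutive samples.

```python
import math
from typing import NamedTuple


class Sample(NamedTuple):
    """One temporal observation of the vehicle: (time, x, y, z)."""
    time: float
    x: float
    y: float
    z: float


def average_speed(track):
    """Path length divided by elapsed time, using consecutive samples."""
    if len(track) < 2:
        raise ValueError("need at least two samples")
    ordered = sorted(track, key=lambda sample: sample.time)
    elapsed = ordered[-1].time - ordered[0].time
    if elapsed == 0:
        raise ValueError("samples must span a non-zero time interval")
    # sample[1:] is the (x, y, z) position; math.dist gives Euclidean distance.
    distance = sum(math.dist(a[1:], b[1:]) for a, b in zip(ordered, ordered[1:]))
    return distance / elapsed


# Hypothetical track: three position fixes taken one time unit apart.
track = [Sample(0, 0, 0, 0), Sample(1, 3, 4, 0), Sample(2, 3, 4, 12)]

print(average_speed(track))
```

Sorting by `time` first reflects the defining property of temporal data: the order of the observations matters.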
Example: Video-tracking
[Diagram: frames captured at times t and t+1 are stored; the camera performs tracking of body features.]
Quicksort
From Wikipedia, the free encyclopedia

Quicksort (sometimes called partition-exchange sort) is an efficient sorting algorithm, serving as a systematic method for placing the elements of an array in order. Developed by Tony Hoare in 1959 and published in 1961, it is still a commonly used algorithm for sorting. When implemented well, it can be about two or three times faster than its main competitors, merge sort and heapsort.
Quicksort is a comparison sort, meaning that it can sort items of any type for which a "less-than" relation (formally, a total order) is defined. In efficient implementations it is not a stable sort, meaning that the relative order of equal sort items is not preserved.
Quicksort can operate in-place on an array, requiring small additional amounts of memory to perform the sorting. It is very similar to selection sort, except that it does not always choose worst-case partition.
Mathematical analysis of quicksort shows that, on average, the algorithm takes O(n log n) comparisons to sort n items. In the worst case, it makes O(n^2) comparisons, though this behavior is rare.