
Use of computer in data analysis
DR SA Balogun ACC
HOU Research
Introduction
• A computer is an electronic device capable of accepting data in a prescribed format, processing the data, storing the data and the results of processing, and releasing the information in a prescribed format.
• In recent times it has become imperative for Road Safety practitioners to have knowledge of the computer.
• This is because of the voluminous data, often in more than two dimensions, which neither the human brain nor ordinary statistics can cope with.
Hardware
  CPU
    CU
    MM
      RAM: read/write memory; loses its data when the computer is switched off
      ROM (cannot be altered): EPROM (Erasable Programmable ROM), EAROM
  Peripherals
    Auxiliary memory
      Floppy: 5¼-inch (1.2M), 3½-inch (1.44M), 2HD
      Flash
      Hard disk: C drive (1.2G to over 80G)
      CD-ROM
    Input/output (transport): terminal
      Keyboard
      VDU or monitor
        Monochrome (black and white)
        Colour display
          CGA: Colour Graphics Adapter
          EGA: Enhanced Graphics Adapter
          VGA: Video Graphics Array
          SVGA: Super Video Graphics Array
        LCD: Liquid Crystal Display
      Printer: dot matrix, Inkjet/DeskJet, LaserJet
Software (programs)
  Operating system: DOS; Windows 98, 2000, Millennium, XP; Unix; Linux
  Language
    Low level
    High level: FORTRAN (Formula Translation); BASIC (Beginner's All-purpose Symbolic Instruction Code); COBOL (Common Business Oriented Language); others are Pascal, C/C++ and Ada
  Types: source program (raw code); object program (code translation)
  File extensions: .EXE, .COM (command), .BAT (batch file), INT and SYS etc.
  Types of computer programming: algorithm; system programming; virus programming; application programming (Word, Excel, SPSS etc.)

FIG 36: COMPONENT OF COMPUTER



INTRODUCTION
• I shall concentrate on the last two application programs:
• Excel
• Statistical Package for the Social Sciences (SPSS)
EXCEL
• The electronic spreadsheet is used for many business applications, such as accounting, financial analysis, payroll processing, statistical analysis and budgeting.
• It consists of rows and columns.
• The intersection of a row and a column is known as a cell. Cells make up a worksheet, and a group of worksheets is a workbook. Each cell has an address.
• The sum of the contents of cell C5 and a cell in column B can be written in another cell, C3. Cell C3 displays the result, while the formula behind it remains visible in the formula bar, hence the spreadsheet is used for auditing.
Examples of spreadsheet packages are:

• Microsoft Excel
• Lotus 1-2-3
• Quattro Pro etc.
Operations in the spreadsheet environment
• financial operations,
• mathematics and trigonometry,
• date and time functions,
• database management,
• logical operations (AND, OR, NOT), as well as
• statistical operations (such as average, beta distribution, chi test, confidence level, correlation and regression).
• To see these, open the computer, click fx on the toolbar, select the Statistical and Financial categories and look at at least 3 functions.
Starting and closing Excel
• Click the Start button, Programs, then Excel. You can also double-click the Excel icon.
• Other activities: a wide range of activities similar to those in Word can be performed with Excel.
• These include entering text, numbers, dates and percentages; correcting mistakes either by overwriting or by using undo (Ctrl+Z) and redo (Ctrl+Y); and opening a workbook from the task pane, among other uses of the task pane.
• The Print dialogue box opens with default settings, so one has to change them as desired.
Auto fill
• It saves time when copying text, numbers etc. in a spreadsheet.
• To do this, click on the desired cell (e.g. Jan), move the pointer to the corner of the cell until it changes to a plus sign, then drag over all the cells you wish to fill.
• The other cells then read Feb, March etc. A formula can also be copied to other cells by this method.
• A cell reference in the formula can be made absolute (constant) by pressing F4 after typing the reference, then confirming the formula, before auto fill is carried out.
Column or row
• Selecting a column or row: click the letter representing the column or the number representing the row.
• Inserting a row or column: select the row or column, click Insert, then Rows or Columns.
• Changing column width and row height: place the mouse on the border below the row heading or to the right of the column heading you wish to resize, then double-click.
Cut, copy and paste
• Cut, copy and paste: press Ctrl+X or Ctrl+C, click the desired cell(s), then press Ctrl+V. Microsoft Office XP has an Office Clipboard that allows up to 24 items to be copied.
• To use it, press Ctrl+C twice to display the clipboard, select the item to paste and choose Paste in the clipboard.
• Cells can also be moved or copied to a new location by selecting the cells, positioning the cursor at the edge of the selection so that it changes to a cross sign, and dragging to the new location (hold down the Ctrl key while dragging to copy).
• Zoom is found in the View menu.
Auto sum
• It is an easy way to add figures.
• To do this, click the empty answer cell next to the figures to be added, click the AutoSum icon, and then press Enter.
• Alternatively, to do this with a formula, select the answer cell, type =SUM( , click the first cell to insert its cell reference, type a colon, click the last cell, close the bracket and press Enter.
• Note that +, -, *, /, () and ^ stand for addition, subtraction, multiplication, division, brackets and raising to a power respectively.
• Where they occur together, priority follows BODMAS: brackets come first, then raising to a power, then multiplication and division, and finally addition and subtraction.
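The priority rules can be checked outside the spreadsheet. The following is a minimal Python sketch, not part of the original slides; the cell values are made up, and Python's `**` stands in for Excel's `^`.

```python
# Hypothetical cell values standing in for a range such as B2:B5.
cells = [12, 7, 30, 1]

# Equivalent of =SUM(B2:B5): add every value in the range.
total = sum(cells)

# Priority: brackets, then powers, then * and /, then + and -.
without_brackets = 2 + 3 * 4 ** 2   # 4**2 = 16, 3*16 = 48, 2+48 = 50
with_brackets = (2 + 3) * 4 ** 2    # 5 * 16 = 80

print(total, without_brackets, with_brackets)
```

Adding the brackets changes the answer from 50 to 80, which is why bracket placement matters when a formula mixes operators.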
Sheet manipulation
• Inserting a sheet: right-click the sheet tab and select Insert, then Worksheet, from the options displayed, or select Insert from the menu and then Worksheet.
• Deleting a sheet works the same way.
• Renaming a sheet: double-click the desired sheet tab, type the new name and click any cell.
• Moving a sheet: click the tab of the desired sheet and drag it to the new location, or select the sheet, click Edit, then Move or Copy Sheet, and select the new location.
• Copying a sheet: select the sheet, hold down the Ctrl key, then click and drag the sheet to the new location.
• Creating formulae across sheets: this is used to get the grand total of data.
• For instance, if RTA figures by month for each year occupy one sheet each, the grand total for many years can be obtained on a final sheet by typing =SUM( , clicking the first sheet tab, clicking the required cell, holding down the Shift key and clicking the last data sheet tab, then pressing Enter.
• In this way the sheets are grouped together.
• If the sheets are not next to each other, the Ctrl key is used instead of Shift in the above exercise.
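The idea of a formula across sheets can be sketched in Python as a sum over the same position on every sheet. This is an illustration only: the sheet names and monthly figures below are invented, and `sum_across_sheets` is a hypothetical helper, not an Excel feature.

```python
# Hypothetical monthly RTA counts, one "sheet" per year (figures are made up).
sheets = {
    "2001": [10, 12, 9],
    "2002": [11, 8, 14],
    "2003": [7, 13, 10],
}

def sum_across_sheets(sheets, index):
    """Add the value at the same position on every sheet,
    in the spirit of a formula such as =SUM(first:last!B2)."""
    return sum(values[index] for values in sheets.values())

january_total = sum_across_sheets(sheets, 0)          # 10 + 11 + 7
grand_total = sum(sum(v) for v in sheets.values())    # all months, all years

print(january_total, grand_total)
```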
Sending a spreadsheet to Word
• Follow the select, copy and paste procedure; a smart tag appears after pasting.
• You are at liberty to keep the source formatting, match the destination (Word) style, or maintain a link to Excel, in which case any change in the Excel sheet is automatically reflected in the Word table.
• It is also possible to paste into Word with the Excel menus available for use.
• This is done by selecting Paste Special from the Edit menu and then selecting Excel Worksheet Object.
• Double-clicking the embedded spreadsheet within Word then brings up the Excel functions.
Function
• Functions can be obtained from the drop-down arrow of the AutoSum icon or from the Insert Function icon.
• Click on the cell where the answer is desired, choose the function and click OK.
• An IF function asks Excel to test whether something is true or false.
• The condition is stated first, followed by a comma, then what Excel should return when the condition is true and when it is false, e.g. =IF(B6>$B$3,"NO","YES").
• Operators used apart from equal to (=) are <, >, <>, >= and <=.
• It is sometimes necessary to nest several IFs together so that Excel picks the applicable answer, as in =IF(B2>65,"over 65",IF(B2>45,"45-65",IF(B2>24,"25-45","under 25"))).
• In this case Excel returns the answer for the first IF whose condition is true; if all are false it returns under 25.
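The nested test can be read as a chain of checks made from the top down. A small Python sketch of the same age banding follows; the function name `age_band` is hypothetical, and the 45 cut-off is an assumption taken from the '45-65' label.

```python
def age_band(age):
    """Return the first band whose condition is true, checking
    from the highest cut-off down, like a nested IF."""
    if age > 65:
        return "over 65"
    if age > 45:      # assumed cut-off, inferred from the '45-65' label
        return "45-65"
    if age > 24:
        return "25-45"
    return "under 25"  # reached only when every condition is false

for age in (70, 50, 30, 20):
    print(age, age_band(age))
```

Because the checks run in order, an age of exactly 65 fails the first test and falls into the 45-65 band.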
Excel as database
• Filtering is another word for querying the database to find information.
• To do this, find AutoFilter under Filter in the Data menu; drop-down arrows appear on all the headings.
• Click on the arrow of the desired heading and filter with the chosen criteria.
• For instance, the RTA table shown below can be filtered to show locations of fatal accidents and their causes.
• The filter can be customized to find, say, days that begin with 'T'.

Location     Type of RTA  Cause  Sex of victim  Age of victim  Hour of occurrence  Season  Day  Date
Abuja-Lokoj  F            DGD    M              45             1300                Rain    Mon  25th
etc          S            SPV    M              etc            etc                 Dry     etc  etc
             S            etc    F                                                 Rain
             S                   F                                                 Rain
             M                   M                                                 Dry
             S                   M                                                 Rain
             etc                 etc                                               Rain
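Filtering amounts to keeping only the rows that satisfy a condition. The sketch below shows the two filters described above in Python; the records are invented for illustration, and reading type code "F" as a fatal accident is an assumption.

```python
# Hypothetical RTA records (locations, causes and days are illustrative only).
records = [
    {"location": "Route A", "type": "F", "cause": "DGD", "day": "Mon"},
    {"location": "Route B", "type": "S", "cause": "SPV", "day": "Tue"},
    {"location": "Route C", "type": "F", "cause": "SPV", "day": "Thu"},
    {"location": "Route D", "type": "M", "cause": "DGD", "day": "Sat"},
]

# Filter on the type heading: assumed code "F" for fatal accidents.
fatal = [(r["location"], r["cause"]) for r in records if r["type"] == "F"]

# Custom filter: days that begin with 'T'.
t_days = [r["day"] for r in records if r["day"].startswith("T")]

print(fatal)
print(t_days)
```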
Drawing diagram
• Click the Drawing toolbar, click Insert Diagram, select the desired diagram and click OK.
• Enter the words into the diagram. To add a shape, resize the diagram or change it, click Insert Shape, Layout or Change To respectively on the Diagram toolbar.
EXCEL FOR STATISTICS
• A lecture on the different methods of data analysis will be given by another lecturer, but as a refresher the different areas of analysis are shown below.
The Statistical Model
STATISTICS IN EXCEL
Frequency (how often values occur within each bin)
  Scores (A2:A10): 79, 85, 78, 85, 50, 81, 95, 88, 97
  Bins (B2:B4): 70, 79, 89
  =FREQUENCY(A2:A10,B2:B4)
Confidence (50 vehicles with average speed 30 kmph, SD = 2.5)
  A2 = 0.05 (significance), A3 = 2.5 (SD), A4 = 50 (sample size)
  =CONFIDENCE(A2,A3,A4) gives 0.69, i.e. between 29.3 and 30.7 kmph
Probability
  X (A2:A5): 0, 1, 2, 3
  Probability (B2:B5): 0.2, 0.3, 0.1, 0.4
  =PROB(A2:A5,B2:B5,2) gives the probability that x = 2 (0.1)
  =PROB(A2:A5,B2:B5,1,3) gives the probability that x is between 1 and 3 (0.8)
Binomials (i.e. failure/success of outcome, e.g. probability of the next baby being male)
  A2 = 6 (number of no-crash outcomes, i.e. successes, at the black spot)
  A3 = 10 (number of trips, i.e. independent trials)
  A4 = 0.5 (probability of success)
  =BINOMDIST(A2,A3,A4,FALSE)
Z score (show that the sample mean will be greater than the average observation)
  Data: ten observations in A2:A11
  =ZTEST(A2:A11,4) gives the one-tail probability for x = 4 (0.0906)
  =2*MIN(ZTEST(A2:A11,4),1-ZTEST(A2:A11,4)) gives the two-tail probability (0.1812)
  =ZTEST(A2:A11,6) gives the one-tail probability for x = 6 (0.863)
  =2*MIN(ZTEST(A2:A11,6),1-ZTEST(A2:A11,6)) gives the two-tail probability (0.274)
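The CONFIDENCE and PROB examples on this slide can be reproduced with the Python standard library. This is a sketch under stated assumptions rather than Excel's own implementation: `confidence` and `prob` are hypothetical helper names, a normal distribution is assumed for the margin, and the X and probability columns are as read from the slide.

```python
from math import sqrt
from statistics import NormalDist

def confidence(alpha, sd, n):
    """Half-width of a normal-based confidence interval for a mean."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 when alpha = 0.05
    return z * sd / sqrt(n)

# 50 vehicles, average speed 30 kmph, SD 2.5, significance 0.05.
margin = confidence(0.05, 2.5, 50)
low, high = 30 - margin, 30 + margin

def prob(xs, ps, lower, upper=None):
    """Sum the probabilities of the x values between lower and upper."""
    if upper is None:
        upper = lower
    return sum(p for x, p in zip(xs, ps) if lower <= x <= upper)

xs = [0, 1, 2, 3]
ps = [0.2, 0.3, 0.1, 0.4]

print(round(margin, 2), round(low, 1), round(high, 1))
print(prob(xs, ps, 2), round(prob(xs, ps, 1, 3), 2))
```

The margin of about 0.69 reproduces the interval of 29.3 to 30.7 kmph quoted on the slide.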
STATISTICS IN EXCEL
Chi test (verify hypothesis)
  Description   Men (observed)  Women (observed)
  Agree         58              35
  Neutral       11              25
  Disagree      10              23
  Description   Men (expected)          Women (expected)
  Agree         45.35 or (79x93/162)    etc
  Neutral       etc                     etc
  Disagree      etc                     etc
  =CHITEST(A2:B4,A6:B8)
T test (test that 2 samples are from the same population)
  The data could be: 1. paired (to/fro); 2. equal variance (homoscedasticity); 3. unequal variance (heteroscedasticity)
  A2:A10: 3, 4, 5, 8, 9, 1, 2, 4, 5
  B2:B10: 6, 19, 3, 2, 14, 4, 5, 17, 1
  =TTEST(A2:A10,B2:B10,2,1) gives the paired test result (0.196)
Correlation (compares two properties)
  Veh speed (A2:A6): 3, 2, 4, 5, 6
  Fuel consumed (B2:B6): 9, 7, 12, 15, 17
  =CORREL(A2:A6,B2:B6) gives 0.997
Pearson (ranges from -1 to 1 and reflects the linear relationship)
  Indep (driv exp), A2:A6: 9, 7, 5, 3, 1
  Dep (no crash), B2:B6: 10, 6, 1, 5, 3
  =PEARSON(A2:A6,B2:B6) gives 0.699
F test (test the significance if 2 samples have different variance)
  Test scores of two groups, untrained and trained
  A2:A6: 6, 7, 9, 15, 21
  B2:B6: 20, 28, 31, 38, 40
  =FTEST(A2:A6,B2:B6)
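The expected counts used by the chi test come from the row and column totals of the observed table, as in the 79x93/162 entry. A minimal Python sketch follows, using the observed counts for men and women (58/35, 11/25, 10/23); the closing p-value line uses a shortcut that is only valid for 2 degrees of freedom, which is what a 3-by-2 table has.

```python
from math import exp

# Observed counts: rows are Agree, Neutral, Disagree; columns are Men, Women.
observed = [[58, 35], [11, 25], [10, 23]]

row_totals = [sum(row) for row in observed]         # 93, 36, 33
col_totals = [sum(col) for col in zip(*observed)]   # 79, 83
grand_total = sum(row_totals)                       # 162

# Expected count = row total x column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# For exactly 2 degrees of freedom the upper-tail probability is exp(-x/2).
p_value = exp(-chi_square / 2)

print(round(expected[0][0], 2), round(chi_square, 2), p_value)
```

The first expected count works out to 45.35, matching the figure on the slide.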
STATISTICS IN EXCEL
Regression
Forecast (predict a future value from existing values)
  Fuel consumed (Y), A2:A6: 6, 7, 9, 15, 21
  Veh speed (X), B2:B6: 20, 28, 31, 38, 40
  =FORECAST(30,A2:A6,B2:B6) predicts y given x = 30
Growth (predict exponential (curve) growth using existing data)
  Location (A2:A7): 11, 12, 13, 14, 15, 16
  Traffic count (B2:B7): 33,100; 47,300; 69,000; 102,000; 150,000; 220,000
  =GROWTH(B2:B7,A2:A7,A9:A10)
Trend (use the least squares method to fit a linear value of Y for a given X)
  Month/RTA: 1 to 12
  Revenue: 133,890; 135,000; 135,790; 137,300; 138,130; 139,100; 139,900; 141,120; 141,890; 143,230; 144,000; 145,290
  =TREND(A2:A13,A15:A19)
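A forecast of this kind fits a straight line by least squares and reads off the value at the new x. The Python sketch below assumes, from the argument order of =FORECAST(30,A2:A6,B2:B6), that column A holds the y values and column B the x values; `forecast` is a hypothetical helper written for illustration.

```python
def forecast(x_new, known_y, known_x):
    """Least-squares straight line through (x, y), evaluated at x_new."""
    n = len(known_x)
    mean_x = sum(known_x) / n
    mean_y = sum(known_y) / n
    # Slope = sum of cross-deviations / sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(known_x, known_y))
    sxx = sum((x - mean_x) ** 2 for x in known_x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * x_new

known_y = [6, 7, 9, 15, 21]      # column A (assumed to be y)
known_x = [20, 28, 31, 38, 40]   # column B (assumed to be x)

predicted = forecast(30, known_y, known_x)
print(round(predicted, 3))
```

A fitted line always passes through the point of means, which gives a quick sanity check: forecasting at the mean of x returns the mean of y.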
SPSS
• The three stages in the use of SPSS in data analysis are:
• 1. Data entry and data encoding
• 2. Data analysis
• 3. Data interpretation
Parametric Test
1.Summary statistics using frequency
2.Summary statistics using descriptive
3.Exploratory data analysis
4.Analysis of cross classification using crosstab
- To study nominal-nominal relationship
- Ordinal-ordinal
- To measure relative risk of events
- To measure agreements
5.The summarise procedure
6.The means procedure
7.The OLAP cubes procedure
8.T Test
-one sample T Test
a)each machine as separate sample
b)sample mean against known value
-paired sample T Test
-independent sample T Test
a)determining groups in an independent sample T Test
b)testing two independent sample means
c)using cut point to define samples
Parametric Test (continued)
9.One way ANOVA
-Testing equality of group means
-Performing One way ANOVA
-All possible comparisons between means
-Robust ANOVA
10.GLM Univariate
11.Partial correlation
12.Linear regression
13.Ordinal regression
14.Curve estimation
-model law of diminishing returns
-model viral growth
15.Discriminant analysis
-assess credit risk
-classify customers
16.Factor analysis
-data reduction
-structure detection
17.Two step cluster analysis
-to classify
18.Hierarchical cluster analysis
-used to classify
-used to study relationships
19.K-means cluster analysis
-to classify customers
Non-Parametric Test
20.Chi-square test
a)testing independence
b)testing a specific range
c)customising expected value
-Binomial test
a)comparing several distributions
b)using cut point to define the samples
-Run test
a)examining usability of test result
b)testing multiple cut points
-One sample Kolmogorov-Smirnov test
a)testing goodness of fit
-Two independent sample test
a)using Mann-Whitney to test ordinal outcomes
b)using two sample Kolmogorov-Smirnov test to compare distributions
-Test for several independent samples
a)using the median test to detect group differences
b)using Kruskal-Wallis to test ordinal outcomes
-Two related sample test
a)testing a sample median against known value
b)using McNemar test in a pre-post design
-Test for several related samples
a)i.e. testing usability of a website
b)using Friedman test on related ordinal measures
21.Multiple response analysis
22.Ratio statistics
23.ROC curve
-used to evaluate performance
-used to choose between competing classification schemes
24.Measure of reliability in scale problems
-used to analyze survey items
-used to analyze inter-rater agreement
25.Control chart
-used to monitor
-used to track proportion of defective units
26.Scoring data with predictive model
27.Select predictors
-used to mine customer database
28.Naïve Bayes
-used for prediction, selection and classification
-used to classify respondents
Modeling
21.Regression model option
-Binary logistic regression
a)to assess credit risk
-Multinomial logistic regression
a)used to profile consumers of packaged goods
b)used to classify customers
c)used to analyse a 1-1 matched case control study
-Non linear regression
a)used to model law of diminishing returns
b)used to model viral growth
-Probit analysis
a)used to test promotional effect on sales
-Weight estimation
a)to model cost of mall construction
-Two stage least-square regression
a)used to model cross sales
22.Advanced model option
-Multivariate General Linear Modeling
a)GLM Multivariate to profile difference in amount spent by two groups
b)GLM repeated measure to measure effect of each promotion on sales
-Variance component
a)used to analyse product test results from multiple markets
-Linear Mixed models
a)used to analyse product test results from multiple markets
b)used to analyse repeated measurement of, e.g., weight and alcohol level after a meal for 6 months
c)used to model random effects and repeated measures (e.g. banning helmets in a few states and using different enforcement strategies to know the best)
d)used to fit a random coefficient model to find change before and after treatment
-Generalized linear models
a)GLM to model Poisson distribution of RT cases
b)to fit Gamma regression of insurance claims on RTC
c)to analyze interval censored survival data of events whose time of occurrence is unknown, e.g. RTC
-Generalized Estimating Equations
a)to fit repeated measure logistic regression, e.g. effect of an observed behavior on the subject
-Loglinear modeling (relationship between categorical variables)
a)to model accident rate, e.g. age and gender risk factors
b)paired data: response of subject before and after treatment
c)Logit loglinear analysis to model 1/more categorical variables against 1/more predictors, e.g. 800 drivers asked which of 3 helmets they like best
-Life tables
a)e.g. to find relationship between time spent before becoming a licensed driver
-Kaplan-Meier survival analysis
a)to study distribution of time to event, e.g. time taken for a drug to affect driving
-Cox regression
a)to model time to a specific event based on given covariates
23.Complex sample option
-Sampling wizard
a)obtaining sample from full sampling frame
b)obtaining samples from partial frame
c)sampling with probability proportional to size (PPS)
-Analysis wizard
a)used to ready NHIS data
b)used when sampling weights are not in the data files
c)tabulation
d)descriptive
-Complex sample
a)frequencies
b)descriptives
c)crosstab
d)ratio
e)GLM
f)logistic regression
g)ordinal regression
24.Trend option: forecast and modeling
-Bulk forecasting with expert modeler
a)using expert modeler to determine significant predictors
-Bulk forecasting by applying saved models
a)experimenting with predictors by applying saved models
-seasonal decomposition
-spectral plots
25.Categories option
-Categorical regression
a)e.g. effect of socio-eco on driving habit
-Categorical principal component analysis
a)e.g. effect of socio-eco on driving habit of different vehicle modes
-Non linear canonical correlation analysis
a)e.g. to find similarities in the socio-eco factors
-Correspondence analysis
a)analysis from cross tab
-Multiple correspondence analysis
a)e.g. effect of socio-eco on driving habit of different vehicle types in the mode
-Multidimensional scaling
-Multidimensional unfolding
26.Conjoint option
-used to model carpet cleaner preference, e.g. effect of socio-eco on driving habit
27.Tree option
-to evaluate credit risk
28.Data preparation option
Summarising a scale/numeric variable within categories: what kind of display do you want?
- Tables (what kind of summary?)
  - Descriptive: Analyse, Report, Case Summaries; select the variable and display cases
- Chart/graph (what kind of chart?)
  - Boxplot: Graphs, Chart Builder, Gallery, Boxplot; drop in the variables
Compare groups for significant differences
- Data in categories (ordinal): Analyse, Descriptive, Crosstabs, select variable
- Scale/numeric divided into groups
  - One group, e.g. comparing violation of 100 kmph along 8 different roads; 16 values are obtained: Data, Split File, select road
  - Two groups/variables
    - One scale/numeric variable divided into two unrelated groups. Which independent-sample test do you want?
      - Test that assumes data are normally distributed within groups, e.g. finding the accuracy of 2 types of traffic counter on reaction time of vehicles moving at a hypothetical speed of 100 kmph. The vehicles were randomly assigned to trained and untrained drivers and made to travel equal distances: Analyse, Compare Means, Independent-Samples T Test
      - Test that does not assume data are normally distributed within groups: 2 Independent Samples test
    - Two scale/numeric variables. Which paired-sample test do you want?
      - Assumes both variables are normally distributed: Analyse, Compare Means, Paired-Samples T Test, select paired variables
      - Does not assume both variables are normally distributed: Analyse, Nonparametric, 2 Related Samples, select paired variables
  - Three or more groups. How many grouping (factor) variables do you have?
    - One, e.g. revenue for three groups defined by region
      - Test that assumes data are normally distributed within groups, e.g. influence of speed (independent variable) on fatal and serious accident cases: Analyse, One-Way ANOVA
      - Data not normally distributed (normality is skewed): Analyse, Nonparametric, K Independent Samples
    - Two or more, e.g. revenue for groups defined by division within each region: Analyse, GLM, Univariate; put the scale variable as the dependent variable
Identify significant relationships between variables. What kind of data do you have?
- Data in categories (nominal/ordinal): Analyse, Descriptive, Crosstabs
- Ordinal/rank order or non-normal: Analyse, Correlate, Bivariate, Spearman/Kendall
- Scale/numeric (interval/ratio). How many variables do you want to evaluate?
  - Two (or multiple pairs of variables)
    - Tables and numbers: Analyse, Correlate, Bivariate, Pearson, e.g. compare driving experience with causes of accidents like indis. and speed
    - Charts and graphs. How many pairs of variables do you want to look at?
      - One: Graphs, Chart Builder, Gallery, Scatter
      - Two or more pairs of variables: Graphs, Chart Builder, Scatter
  - Two, controlling for the effects of one or more additional variables: Analyse, Correlate, Partial
  - Exactly three (3D scatter plot): Graphs, Chart Builder, Gallery, Scatter, 3D
  - One dependent variable and two or more independent (predictor) variables: Analyse, Regression, Linear; select the scale variable as dependent and 2 or more scale variables as independent
  - Ordinal dependent and scale or categorical independent variables: Analyse, Regression; select the ordinal dependent as categorical and the others as factor/covariate
Multivariate
Identify groups of similar cases. What kind of data do you have?
- Scale, numeric (interval/ratio): identify groups
  - Less than 200 cases: Analyse, Classify, Hierarchical Cluster; select all scale variables to use
  - 200 or more cases: Analyse, Classify, K-Means Cluster; select the scale variables to be used, specify the number of clusters etc.
- Identify characteristics of known groups: Analyse, Classify, Discriminant; select the categorical grouping variable, define the range to specify the categories of interest, then select the scale independent variables (no. of crashes and experience, age, income etc.)
- Categorical (nominal/ordinal) or a mix of scale and categorical variables to use in cluster analysis
- Identify groups of similar variables: Analyse, Data Reduction, Factor; select the scale variables for factor analysis
TESTING
• This involves hypothesizing.
• Usually a hypothesis of no significant difference (Ho) is set.
• Ho is rejected if the calculated value is more than the table value at the 95% confidence level, i.e. the 0.05 level of significance (except in one non-parametric test known as the Wilcoxon test).
• Rejection of Ho means that the result is unlikely to be due to chance alone.
• The two levels used in hypothesizing are the confidence level and the level of significance.
Testing
These are the levels within which our errors are constrained.
They also compare figures from the sample with the population.
Although the two levels are used interchangeably, they are nonetheless different.
While the confidence level is represented as 90%, 95% or 99%, the level of significance is represented as 0.1, 0.05 or 0.01 respectively.
A level of 0.01, or 99%, means the conclusion is expected to be wrong one time in a hundred, or right 99% of the time, respectively.
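The link between the two levels is simple arithmetic: the level of significance is one minus the confidence level. The Python sketch below illustrates this; the function names are hypothetical, and the use of a p-value (as reported by packages such as SPSS) in place of comparing calculated and table values is an assumption made for illustration.

```python
def significance_level(confidence_percent):
    """Convert a confidence level in percent to a level of significance."""
    return round(1 - confidence_percent / 100, 4)

def reject_null(p_value, confidence_percent=95):
    """Reject Ho when the p-value falls below the level of significance."""
    return p_value < significance_level(confidence_percent)

for level in (90, 95, 99):
    print(level, significance_level(level))

# A p-value of 0.03 leads to rejection at 95% but not at 99%.
print(reject_null(0.03), reject_null(0.03, 99))
```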
Conclusion
• Both Excel and SPSS are useful in data analysis.
• The way data are arranged in SPSS is different from that of Excel.
• While SPSS data do not require any preliminary summary, Excel data sometimes require preliminary summaries.
• The explanations in this paper do not fully cover the two application programmes.
• The more each of you continues to practise analysis with the above software, the more versatile you become in it: PRACTICE MAKES PERFECT.
• Thank you
