
CHAPTER 2

OVERVIEW OF THE STATA SOFTWARE
VIETNAM FORESTRY UNIVERSITY
thangpn@vfu.edu.vn
Contents
1. Basics of using Stata
2. Organizing and managing data in Stata
3. Drawing graphs
4. Data analysis with Stata
Introduction to Stata
Stata (statistics and data) is a statistical program with powerful capabilities for data management, data analysis, and graphics.
Stata editions (the latest version is Stata 12):
- Standard edition, Stata/IC (up to 2,047 variables)
- Special edition, Stata/SE (up to 32,766 variables)
- Multiprocessor edition, Stata/MP (faster processing)
Starting and exiting Stata
Method 1: Start Stata from the shortcut on the desktop by double-clicking it.
Method 2: Start from the Start button:
Start -> Programs -> Stata 10 -> StataSE 10
To exit Stata, use any of the following:
- File -> Exit, or type exit in the Command window
- Alt + F4
- Click the close button (X)
Stata windows
Stata windows are opened by choosing options from the Window menu on the menu bar. The windows include:
Results          Shows commands and their results
Graph            Shows graphs
Viewer           Shows help and the contents of text files
Command          Where commands are typed
Review           Shows the commands already executed
Variables        Lists the variables in the dataset
Data Editor      Shows and edits data in spreadsheet form
Do-file Editor   Editor window for writing programs (do-files)
Menu bar
File: work with files, data, logs, the printer, and exiting
Edit: editing operations on data (copy, ...)
Data: work with variables (describe, create, ...) and combine datasets
Graphics: work with graphs
Statistics: statistical work such as frequencies, means, regression, ...
Window: show windows such as Command, Review, and the Do-file Editor
Interacting with Stata
Since Stata 8 there are two ways to work:
1. Through the menus: choose a command from the menu -> dialog box -> set the parameters
2. By typing the command directly into the Command window
The result of a command is shown in the Results window after the command runs.
Command syntax in Stata
Commands are case-sensitive (lowercase is recommended).
General syntax of a Stata command:
command [varlist] [if exp] [in range] [weight] [using filename] [, options]
1. command: the action to perform
2. varlist, filename: the objects acted upon
3. Conditions (if exp, in range): restrict the observations acted upon
4. options: settings that modify how the command is carried out
Commands can be abbreviated when typing. For example, instead of typing generate (create a variable) you can simply type gen.
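As a rough illustration of the general syntax, the sketch below applies each part of a command to the auto example dataset shipped with Stata. The variable name price_k is made up for this illustration and is not part of the slides.

```stata
* Load the example dataset shipped with Stata
sysuse auto, clear

* command + varlist + if + options
summarize price mpg if foreign == 1, detail

* command + varlist + in
list make price in 1/5

* Abbreviated command: gen is short for generate
* (price_k is a hypothetical variable name)
gen price_k = price / 1000
```

Each line follows the same pattern: the command comes first, then the variables, then the qualifiers, and options come last after a comma.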
Recording your work: the log file
A log file records everything done during a working session. It contains commands and result tables, but not charts or graphs. To open a log file:
- From the menu: File -> Log -> Begin
- By command (tentep is the file name):
log using tentep [, append replace [text|smcl]]
Log file extensions: .log, .smcl
Pause logging: log off
Resume logging: log on
Close the log file: log close
View the contents: type tentep.log or type tentep.smcl
Convert smcl to text: translate tentep.smcl tentep.txt
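The log commands above could be combined into a short session such as the following sketch. The file name mysession is hypothetical, and the sketch assumes no other log is already open.

```stata
* Start a plain-text log; replace overwrites an existing file
* (mysession is a hypothetical file name)
log using mysession, replace text

sysuse auto, clear
summarize price

log off        // pause logging
describe       // this output is not written to the log
log on         // resume logging

log close      // close the log file
type mysession.log
```

Choosing the text format here avoids the later translate step; with the default smcl format the log keeps Stata's markup and can be converted afterwards.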
Data structure
At any one time Stata works with a single dataset (*.dta), which is loaded into memory (use the Data Editor or Data Browser to view it).
Data are organized as a table of rows and columns: each column is a variable (vars) whose column name is the variable name, and each row is an observation (obs), also called a record.
Working with datasets (1)
Menu: File -> Open
Open a dataset: use tentep [, clear]
Example: use D:\data\mydata.dta, clear
Open the example datasets shipped with Stata: sysuse tentep
Example: sysuse auto
The example datasets can also be browsed from File -> Example datasets
Working with datasets (2)
Open the data table read-only
Menu: Data -> Browser
Command: browse
Open the data table for editing
Menu: Data -> Edit
Command: edit
Structure of the data file
Check the data structure, the value ranges of variables, the variable labels, and the value labels.
Use:
describe: describes the dataset or individual variables
codebook: shows the data together with descriptive statistics, missing values, ...
list: shows the values of variables in the Results window (often combined with if, in)
Listing data
Menu: Data -> Describe data -> List data
Command: list [varlist] [if] [in] [, options]
list: lists the whole dataset
list danh_sach_bien: lists the data for the variables in the list, for example: list make mpg weight
list [danh_sach_bien] if dieu_kien: lists the observations that satisfy the condition, for example: list if mpg>20 & mpg<23
Describing the dataset
Menu: Data -> Describe data -> Describe data in memory
Command: describe

Contains data from E:\Stata\ado\base/a/auto.dta
  obs:            74                          1978 Automobile Data
 vars:            12                          13 Apr 2007 17:45
 size:         3,478 (99.9% of memory free)   (_dta has notes)

               storage  display   value
variable name  type     format    label    variable label
make           str18    %-18s              Make and Model
price          int      %8.0gc             Price
mpg            int      %8.0g              Mileage (mpg)
rep78          int      %8.0g              Repair Record 1978
headroom       float    %6.1f              Headroom (in.)
trunk          int      %8.0g              Trunk space (cu. ft.)
weight         int      %8.0gc             Weight (lbs.)
length         int      %8.0g              Length (in.)
turn           int      %8.0g              Turn Circle (ft.)
displacement   int      %8.0g              Displacement (cu. in.)
gear_ratio     float    %6.2f              Gear Ratio
foreign        byte     %8.0g     origin   Car type (imported or domestic)

Sorted by: foreign
Variable information
Variable name: at most 32 characters, made up of letters, digits, and the underscore _, always beginning with a letter or _. Variables are either numeric or string.
Storage type: described in the following table

Data types and formats
Type     Minimum                 Maximum                Kind
byte     -127                    100                    integer
int      -32,767                 32,740                 integer
long     -2,147,483,647          2,147,483,620          integer
float    -1.70141173319*10^38    1.70141173319*10^38    real
double   -8.9884656743*10^307    8.9884656743*10^307    real
str      1                       80                     string (length in characters)
Display format
Format   Description              Example
%#.#g    general numeric format   %9.0g: a number 9 characters wide
%#.#f    fixed numeric format     %9.2f: width 9, with 2 decimal places
%#s      string format            %15s: a string 15 characters wide
Modifiers: - left-aligns; c groups digits in threes with commas, for example %-9.0gc
The format command
Syntax: format varlist %fmt
With %fmt:
%w.df: w is the total width of the number, d is the number of digits after the decimal point
for example: 1.5235 displayed with %8.2f gives 1.52
%w.0g: w is the width of the number
Default formats by storage type:
int      %8.0g
byte     %8.0g
long     %12.0g
float    %9.0g
double   %10.0g
str#     %#s
Example: format length %9.0g
Creating a dataset (1)
Creating a dataset and creating variables (generate): there are two ways to assign values and labels to variables:
Method 1: Create the dataset from the Data Editor window
Method 2: Create the dataset with commands
Creating a dataset (2)
Method 1: Create the dataset from the Data Editor window
1. Clear the old variables: clear
2. Open the Data Editor: edit
3. Select the first cell of a column and type a representative value (a number or a string) into it
4. Double-click the header of the column just entered; a dialog box appears in which you set:
   - the variable name
   - the variable label
   - the display format
   - the value label (from Stata 9 onward)
5. Save the dataset: File -> Save, or save ten_tep
Creating a dataset (3)
Method 2: Create the dataset with commands
1. Clear the old variables: clear
2. Define the variables: generate ten_bien=gia_tri
(gia_tri only serves as a template that fixes the data type)
3. Label the variables: label variable ten_bien nhan_bien
4. Enter the values of the variables: edit
5. Save the dataset: File -> Save, or save ten_tep
Example data (thunhap = income, tieudung = consumption):
obs  thunhap  tieudung
1    1        0.6
2    1.1      0.65
3    0.7      0.48
4    1.4      0.9
5    0.5      0.38
6    0.4      0.23
7    0.55     0.32
8    0.8      0.48
9    0.7      0.45
10   0.25     0.18
11   0.65     0.4
12   0.4      0.25
13   1.8      0.95
14   0.4      0.25
15   0.5      0.3
16   0.3      0.2
17   1        0.5
18   0.5      0.25
19   0.8      0.45
20   1.4      0.7
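The slides enter these values by hand in the Data Editor. As an alternative that the slides do not mention, the same table could be typed with the input command, as in this sketch. The file name thunhap_demo and the English variable labels are assumptions made for the illustration.

```stata
* Enter the 20 observations directly instead of using the Data Editor
clear
input thunhap tieudung
1    0.6
1.1  0.65
0.7  0.48
1.4  0.9
0.5  0.38
0.4  0.23
0.55 0.32
0.8  0.48
0.7  0.45
0.25 0.18
0.65 0.4
0.4  0.25
1.8  0.95
0.4  0.25
0.5  0.3
0.3  0.2
1    0.5
0.5  0.25
0.8  0.45
1.4  0.7
end

* Label the variables, then save (thunhap_demo is a hypothetical file name)
label variable thunhap "Income"
label variable tieudung "Consumption"
save thunhap_demo, replace
```

Keeping the data entry in a do-file like this makes the dataset reproducible, which typing into the Data Editor does not.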
Adding variables
Add a new variable to the dataset
1. Open the dataset
Method 1:
1. Select the first cell of an empty column and type a representative value (a number or a string) into it
2. Double-click the header of the column just entered; a dialog box appears.
Method 2:
1. Define the variable: generate ten_bien=gia_tri
2. Label the variable: label variable ten_bien nhan_bien
Note: to create a variable with no values (all missing): generate ten_bien = .
Dropping and renaming variables
Drop a variable from the dataset
1. Open the dataset
Method 1:
1. Open the Data Editor with the command: edit
2. Select the column to delete and press Delete
Method 2:
1. Drop it with the command: drop ten_bien
Note: the command deletes immediately, without asking for confirmation
Rename a variable: rename ten_bien_cu ten_bien_moi
Labeling values (1)
Method 1: From the menu: Data -> Labels -> Label values -> Define or modify value labels
Labeling values (2)
Method 2: By command: label define ten_nhan gia_tri "nhan" [gia_tri "nhan" ...]
Example: label define gioi 1 "Nam" 0 "Nu" (Nam = male, Nu = female)
Attach the label set to a variable: label values ten_bien ten_nhan
List the label sets currently defined: label dir
Drop a label set: label drop ten_nhan
Example: label drop gioi
Drop all label sets: label drop _all
obs  gioi  thunhap  tieudung
1    Nam   1        0.6
2    Nu    1.1      0.65
3    Nam   0.7      0.48
4    Nam   1.4      0.9
5    Nam   0.5      0.38
6    Nu    0.4      0.23
7    Nam   0.55     0.32
8    Nu    0.8      0.48
9    Nu    0.7      0.45
10   Nam   0.25     0.18
11   Nam   0.65     0.4
12   Nu    0.4      0.25
13   Nam   1.8      0.95
14   Nam   0.4      0.25
15   Nam   0.5      0.3
16   Nu    0.3      0.2
17   Nam   1        0.5
18   Nu    0.5      0.25
19   Nu    0.8      0.45
20   Nu    1.4      0.7
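A table like the one above, where a numeric variable displays as text, takes two steps: defining the label set and attaching it to the variable. The sketch below uses only the first three rows of the table and assumes gioi is stored as 1 for Nam and 0 for Nu.

```stata
* First three rows of the table, with gioi stored as a number
clear
input gioi thunhap tieudung
1 1   0.6
0 1.1 0.65
1 0.7 0.48
end

* Step 1: define the label set (here also named gioi)
label define gioi 1 "Nam" 0 "Nu"

* Step 2: attach the label set to the variable
label values gioi gioi

list               // gioi now displays as Nam / Nu
label list gioi    // show the mapping from values to labels
```

The stored values remain 0 and 1, so conditions are still written numerically, for example list if gioi == 1.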
Some operations on variables
Describe the structure of the data or of variables (describe):
des varlist
Rename a variable: rename old_var new_var
Remove variables or observations (drop or keep); keep is the opposite of drop:
drop var1 [var2 ...]
drop if var1 >=15
Mathematical functions and operators (1)
Arithmetic operators
Symbol  Operation
+       addition
-       subtraction
*       multiplication
/       division
^       exponentiation
Relational operators
Symbol  Operation
>       greater than
<       less than
==      equal to
>=      greater than or equal to
<=      less than or equal to
~=      not equal to
!=      not equal to
Logical operators
Symbol  Operation
~       negation (not)
&       and
|       or
Functions: a function is a predefined computation. Its arguments can be variables, constants, or expressions. Stata has several kinds of functions: mathematical, statistical, random-number, string, date, special, matrix, and time-series functions.
Example: abs(x) returns the absolute value of x; sin(x) computes the sine of x
Mathematical functions and operators (2)
To calculate and show a result as on a calculator, use display.
Mathematical functions: mod(x,y), sign(x), max(x1,x2,x3)
Example: the remainder of 5 divided by 2:
display mod(5,2) gives 1
Functions can be combined with gen to create variables
for example: gen phandu = mod(5,2)
The if qualifier
Syntax: cau_lenh if bieu_thuc
Examples:
summarize mpg if mpg == 20
sum mpg if (weight<2000)
Operators can be combined with generate and replace.
Example: gen var1 = 3^2 * 5
replace var1 = 1 if var1 == .
The in qualifier
Syntax: cau_lenh in khoang
Range (khoang):
#: the observation at position #
#/#: from one position to another
f/#: from the first position to position #
#/l (last): from position # to the last
Examples:
list mpg in 1/10 shows the values of mpg for observations 1 to 10
list in f/20 lists the observations from the first to the 20th
list in -10/l lists the last 10 observations
sum mpg in 1/10 summarizes mpg over observations 1 to 10
sum mpg in 20/l summarizes mpg from observation 20 to the last
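Since sum is the abbreviation of summarize rather than a command that adds values up, a short sketch may help separate the two ideas. It uses the auto dataset and also combines if with in.

```stata
sysuse auto, clear

* in selects observations by position
count in 1/10
list make mpg in -5/l          // the last five observations

* if and in can be combined in one command
summarize mpg if foreign == 0 in 1/20

* summarize reports N, mean, sd, min, max; the total is stored in r(sum)
summarize mpg in 1/10
display "Total of mpg over observations 1-10: " r(sum)
```

Because in depends on the current sort order, it is usually worth sorting the data explicitly before relying on positions.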
Recoding variables (1)
recode varlist (rule) [(rule) ...] [, generate(newvar)]
Rule             Example       Meaning
# = #            3 = 1         3 is recoded to 1
# # = #          2 . = 9       2 and . are recoded to 9
#/# = #          1/5 = 4       1 through 5 are recoded to 4
nonmissing = #   nonmiss = 8   all nonmissing values are recoded to 8
missing = #      miss = 9      all missing values are recoded to 9
Recoding variables (2)
Example: recode the variable age, creating a new variable with value labels:
0-17: 1 Below working age;
18-65: 2 Working age;
66 and over: 3 Above working age;
creating a new variable named newage and a value label named new_age
recode age (0/17 = 1 "Below working age") (18/65 = 2 "Working age") (66/105 = 3 "Above working age"), gen(newage) label(new_age)
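To check that the ranges behave as intended at the boundaries, the recode can be tried on a few made-up ages. The sketch below is a hypothetical test dataset, and it uses the keyword max in place of the upper limit 105 so that no age is left uncoded.

```stata
* Hypothetical ages chosen to hit the boundaries of each group
clear
input age
5
17
18
40
65
66
80
end

* Rules are applied in order; ranges should not overlap
recode age (0/17 = 1 "Below working age") ///
           (18/65 = 2 "Working age") ///
           (66/max = 3 "Above working age"), ///
           generate(newage) label(new_age)

tabulate age newage
```

Writing the result to a new variable with generate() keeps the original age intact, so the recoding can be checked with a cross-tabulation as in the last line.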
Using system variables
When data are in memory, _N stands for the total number of observations.
_n stands for the number of the current observation: _n=1 for the first observation, _n=2 for the second, up to _n=_N for the last.
_n can be used to create an index:
gen caseID = _n
Using system variables
Stata also allows the data of one particular observation to be referenced.
The system variable _n is also useful with series data. Suppose we have daily data on the price of a particular stock on the stock market, in a variable named open, and we want to compute the change in price from one day to the next:
sysuse sp500
gen difopen = open - open[_n-1]
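A small sketch of explicit subscripting with the sp500 example data follows. It shows that the first observation has no predecessor, so its difference is missing.

```stata
sysuse sp500, clear

* Index of each observation
gen caseID = _n

* Change in the opening price from the previous observation;
* open[0] does not exist, so difopen is missing in the first row
gen difopen = open - open[_n-1]

list date open difopen in 1/5
count if missing(difopen)
```

The subscript refers to the previous row, not the previous calendar day, so the data must be sorted by date for the difference to be meaningful.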
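Memory management
By default Stata allocates 10 MB of memory; if your data are larger than 10 MB, the memory size must be reset: set memory #[b|k|m|g]
b: bytes; k: kilobytes; m: megabytes; g: gigabytes
Examples:
set memory 120m
set memory 3g
Possible error messages:
op sys refuses to provide memory (the operating system cannot supply the requested amount)
no; data in memory would be lost (clear the data before changing the memory size)
From Stata 12 onward memory is managed automatically, so set memory is no longer needed.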
Combining datasets
Two or more Stata data files can be combined with append or merge.
To combine two datasets by observation (case), adding rows, use append.
To combine two datasets by variable, adding columns, use merge (note that before using merge both datasets must be sorted with sort).
Combining datasets (append)
Syntax:
append using filename [, options]
Options:
keep(varlist): append only the variables of the using dataset named in varlist; if keep() is not specified, all variables are appended by default.
After appending, list shows the result.
Combining datasets (merge)
In merge, the master dataset is the one in memory and the using dataset is the one being merged in.
merge [varlist] using filename [filename ...] [, options]
By default a variable _merge is created, taking 3 values:
1 the observation is found only in the master dataset
2 the observation is found only in the using dataset
3 the observation is found in both master and using
merge can handle 1-to-1, 1-to-many, and many-to-many relationships
Combining datasets: examples
Example 1: a 1-to-1 merge by observation order; sort the data before merging.
use thuc_hanh1.dta, clear
merge using thuc_hanh2.dta
thuc_hanh1.dta is the master file and thuc_hanh2.dta is the using file
Example 2: merging on a key variable (usually an id variable)
use thuc_hanh1.dta, clear
sort id
save, replace // save, replacing the data in the file
use thuc_hanh2.dta, clear // thuc_hanh2.dta is the master dataset
sort id
merge id using thuc_hanh1.dta
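The examples above use the merge syntax of Stata 10 and earlier. From Stata 11 onward the documented syntax states the relationship explicitly (merge 1:1, m:1, 1:m) and sorts the data itself. The sketch below assumes Stata 11 or later and uses two small made-up datasets held in a temporary file rather than the thuc_hanh files.

```stata
* Using dataset: income by id (made-up values)
clear
input id thunhap
1 1.0
2 1.1
3 0.7
end
tempfile income
save `income'

* Master dataset: consumption by id (made-up values)
clear
input id tieudung
1 0.60
2 0.65
4 0.90
end

* One-to-one merge on the key variable id
merge 1:1 id using `income'

* _merge: 1 = master only, 2 = using only, 3 = matched
tabulate _merge
list, sepby(_merge)
```

Here ids 1 and 2 are matched, id 4 exists only in the master data, and id 3 exists only in the using data, so all three values of _merge appear.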
Statistical measures
Measures of central tendency: mean, mode, median
Measures of variation: range, variance, standard deviation, coefficient of variation
Measures of central tendency
- Express the general characteristics of large-scale socio-economic phenomena
- Allow comparison of phenomena of different scales
- Support the study of change over time
- Play an important role in applying analytical and forecasting methods
The mean
Definition:
In statistics, the mean is the value that expresses the representative level, with respect to some criterion, of a socio-economic phenomenon made up of many units of the same kind.
Properties of the mean:
- It is the most characteristic, most general level of a population made up of many units of the same kind
- It results from evening out all differences
- It is strongly influenced by the value with the highest frequency
The arithmetic mean
a) Simple arithmetic mean
\[ \bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n} \]
b) Weighted arithmetic mean
\[ \bar{x} = \frac{x_1 f_1 + x_2 f_2 + \dots + x_n f_n}{f_1 + f_2 + \dots + f_n} = \frac{\sum_{i=1}^{n} x_i f_i}{\sum_{i=1}^{n} f_i} \]
The geometric mean
a/ When to use it: the values are related multiplicatively.
b/ Formulas:
Simple geometric mean
\[ \bar{x} = \sqrt[n]{x_1 \cdot x_2 \cdots x_n} \]
Weighted geometric mean
\[ \bar{x} = \sqrt[f_1 + f_2 + \dots + f_n]{x_1^{f_1} \cdot x_2^{f_2} \cdots x_n^{f_n}} \]
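The mean formulas can be tried in Stata on a tiny made-up frequency table. In the sketch below, x holds the values and f their frequencies. Both names and the numbers are assumptions chosen so that the results are easy to verify by hand.

```stata
* Made-up values x with frequencies f
clear
input x f
10 1
20 2
30 1
end

* Weighted arithmetic mean: (10*1 + 20*2 + 30*1) / (1 + 2 + 1) = 20
summarize x [fweight = f]
display "Weighted arithmetic mean = " r(mean)

* Simple geometric mean: exp of the mean of ln(x), i.e. (10*20*30)^(1/3)
gen lnx = ln(x)
summarize lnx
display "Simple geometric mean = " exp(r(mean))

* ameans reports arithmetic, geometric, and harmonic means together
ameans x
ameans x [fweight = f]
```

Frequency weights make each row count f times, which matches the weighted formulas; the logarithm form of the geometric mean avoids overflow when many values are multiplied.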
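2.2 - The mode (M_0)
a/ Definition
- For a series without class intervals: the mode is the value or attribute that occurs most often in the frequency distribution.
How to determine M_0: find the value or attribute with the highest frequency in the distribution; that value is M_0.
- For a series with class intervals, once the class containing the mode has been located:
B2: Compute the approximate value of M_0 from the formula:
\[ M_0 = x_{M_0 \min} + h_{M_0} \cdot \frac{f_{M_0} - f_{M_0 - 1}}{(f_{M_0} - f_{M_0 - 1}) + (f_{M_0} - f_{M_0 + 1})} \]
or, with the distribution densities D in place of the frequencies f:
\[ M_0 = x_{M_0 \min} + h_{M_0} \cdot \frac{D_{M_0} - D_{M_0 - 1}}{(D_{M_0} - D_{M_0 - 1}) + (D_{M_0} - D_{M_0 + 1})} \]
Here x_{M_0 min} is the lower limit and h_{M_0} the width of the class containing the mode; the subscripts M_0 - 1 and M_0 + 1 refer to the classes immediately before and after it.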
2.3 The median (M_e)
a/ Definition
The median is the value of the unit in the exact middle position of the ordered series; it divides the units of the series into two equal parts.
b/ Determining the median
- Locate the unit in the middle position
+ If the number of units is odd (n = 2m + 1), the middle unit is unit number m + 1.
+ If the number of units is even (n = 2m), the middle units are units number m and m + 1.
- Notes:
+ The median is the value of the unit in the middle position, not the value that lies midway between the other values.
+ To find the unit in the middle position, the series must first be sorted in a definite order (ascending or descending).
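- Computing the median:
+ For a series without class intervals, the median is the value of the unit in the middle position
If the number of units is odd: \( M_e = x_{m+1} \)
If the number of units is even: \( M_e = (x_m + x_{m+1}) / 2 \)
+ For a series with class intervals, two steps are needed
B1: Locate the class containing the median, that is, the class containing the value of the unit in the middle position.
B2: Compute the median from the formula (assuming values are spread evenly within the class):
\[ M_e = x_{M_e \min} + h_{M_e} \cdot \frac{\frac{\sum f_i}{2} - S_{M_e - 1}}{f_{M_e}} \]
Here x_{M_e min} is the lower limit, h_{M_e} the width, and f_{M_e} the frequency of the class containing the median, and S_{M_e - 1} is the cumulative frequency of the classes before it.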
Compute for the example.
c/ Uses of M_e:
- Supplements or replaces the mean when necessary.
- Combined with the arithmetic mean and the mode, the median describes the shape of the distribution:
+ Symmetric: Mean = Median = Mode
+ Right-skewed: Mode < Median < Mean
+ Left-skewed: Mean < Median < Mode
- The median is widely used in engineering and public services, because \( \sum |x_i - M_e| f_i = \min \).
Among the measures of central tendency, which one represents the data best?
Example (wage levels shown in the figure: 6000 $, 2000 $, 300 $, 100 $):
- The workers say that wages are low: most earn only 100 $/month.
- The business owner says that wages are quite high: the average is 840 $/month!
The range (R)
a/ Definition: the difference between the largest and the smallest value of the variable.
b/ Formula: \( R = x_{\max} - x_{\min} \)
Example: Group 1: 45 50 55 60 65, R_1 = ?
Group 2: 51 53 55 57 59, R_2 = ?
c/ Advantage: simple to compute; gives a quick impression of the variation in the population.
Disadvantage: gives an inaccurate impression when there are outlying values (too large or too small).
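2.2 The variance (\( \sigma^2 \))
a/ Definition: the arithmetic mean of the squared deviations between the values and their mean.
b/ Formulas:
\[ \sigma^2 = \frac{\sum (x_i - \bar{x})^2}{n} = \frac{\sum x_i^2}{n} - (\bar{x})^2 \]
\[ \sigma^2 = \frac{\sum (x_i - \bar{x})^2 f_i}{\sum f_i} = \frac{\sum x_i^2 f_i}{\sum f_i} - (\bar{x})^2 \]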
c/ Uses:
- Expresses the variation of the variable
- Widely used in statistical analysis, for example in computing correlation coefficients and determining survey sample sizes
d/ Disadvantages:
- Amplifies errors
- Its unit of measurement is not appropriate (it is the square of the original unit).
The standard deviation (\( \sigma \))
a/ Definition: the square root of the variance
b/ Uses:
- One of the most complete measures of the variation of a variable
- Widely used in statistical analysis.
- Describes how the values are distributed in a population (based on Chebyshev's theorem)
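The measures of variation can be read off the results that summarize leaves behind. One point to keep in mind in this sketch: Stata's r(Var) and r(sd) divide by n - 1 (the sample variance), whereas the formula on the slide divides by n, so a rescaling is shown. The scalar names are made up for the illustration.

```stata
sysuse auto, clear
summarize mpg

* Range: largest value minus smallest value
scalar rng = r(max) - r(min)

* Variance with divisor n, as in the slide formula
* (r(Var) uses the divisor n - 1)
scalar var_n = r(Var) * (r(N) - 1) / r(N)
scalar sd_n  = sqrt(var_n)

* Coefficient of variation: standard deviation relative to the mean
scalar cv = r(sd) / r(mean)

display "Range = " rng
display "Variance (divisor n) = " var_n "   SD (divisor n) = " sd_n
display "Coefficient of variation = " cv
```

For large n the two versions of the variance are nearly identical; the difference matters mainly for small samples.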
Descriptive statistics
Descriptive statistics for continuous variables: summarize
summarize [varlist] [if] [in] [weight] [, options]
summarize uses only observations without missing values; observations with a missing value are excluded.
Descriptive statistics
What is the average price of a car?
Command: summarize price
What is the average price of cars whose mileage is below the overall mean of 21.3 mpg?
Command: summarize price if mpg<21.3
What is the median mileage?
Command: summarize mpg, detail
Descriptive statistics
Show statistics on price and mileage separately for domestic and foreign cars.
sort foreign
by foreign: summarize price mpg
Handling missing values
Missing values in Stata are treated as infinitely large numbers.
Example: using the auto data, summarize the variable price according to rep78.
sysuse auto
summarize price if rep78>3 (result in Table 1.1)
sum price if rep78>3 & rep78 <. (result in Table 1.2)
Table 1.1
Variable   Obs   Mean       Std. Dev.   Min    Max
price      34    6073       2315.435    3748   12990
Table 1.2
Variable   Obs   Mean       Std. Dev.   Min    Max
price      29    6011.379   2055.312    3748   11995
Table 1.1 includes the 5 cars whose rep78 is missing, because a missing value counts as larger than 3.
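The difference between the two tables can be confirmed by counting observations directly. The sketch below uses the auto dataset; missing() is an alternative way of writing the same exclusion.

```stata
sysuse auto, clear

* Observations with a missing repair record
count if missing(rep78)

* Missing counts as larger than any number, so it passes rep78 > 3
count if rep78 > 3

* Two equivalent ways to exclude missing values
count if rep78 > 3 & rep78 < .
count if rep78 > 3 & !missing(rep78)
```

The first count plus either of the last two counts equals the second count, which is exactly the gap between Table 1.1 and Table 1.2.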
One-way frequency tables
Used to describe discrete variables
Syntax: two commands are available
tabulate varname [if] [in] [weight] [, tabulate1_options]
tab1 varlist [if] [in] [weight] [, tab1_options] (runs several variables at once)
Example: count the domestic and foreign cars
tabulate foreign
Create several tables at once
tab1 foreign mpg price
Two-way frequency tables
Two-way frequency tables (cross-tabulations) are produced with tabulate.
Syntax: two commands are available
tabulate varname1 varname2 [if] [in] [weight] [, options]
tab2 varlist [if] [in] [weight] [, options]
Tables of summary statistics (mean, standard deviation, and frequency) by category
Example: tabulate var, sum(varname)
where var is a discrete variable and varname is a continuous variable
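The table commands above can be tried on the auto dataset as in the following sketch. The tabstat line is an addition not covered in the slides; it is included as one possible way to obtain medians by group, which tabulate with summarize() does not report.

```stata
sysuse auto, clear

* One-way frequency table
tabulate foreign

* Two-way table with row percentages
tabulate rep78 foreign, row

* Mean, standard deviation, and frequency of price within each car type
tabulate foreign, summarize(price)

* Means, medians, and standard deviations by group (not in the slides)
tabstat price mpg, by(foreign) statistics(mean median sd)
```

By default tabulate leaves out observations with missing values; adding the missing option shows them as a separate category.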
Histogram
A histogram shows the distribution of the values of a variable.
Syntax:
histogram varname [if] [in] [weight] [, [continuous_opts | discrete_opts] options]
continuous_opts: bin(#), width(#), start(#), for continuous variables
bin(#): # is the number of bars shown; if bin(#) is not specified, the default number of bins is
# = min{sqrt(N), 10*ln(N)/ln(10)}, where N is the number of observations.
width(#): # is the width of each bar, which is tied to the number of bars
start(#): # is the lower limit of the first bar; the default is the smallest value of the variable
Histogram (2)
discrete_opts: for discrete variables
discrete: declares that the variable is discrete
width(#) and start(#): as for continuous variables
options:
density: the vertical scale is density (the default)
fraction: the vertical scale is the fraction of observations
frequency: the vertical scale is the frequency (count)
percent: the vertical scale is in percent
Histogram (3)
options:
percent: the vertical scale is in percent; the bar heights sum to 100
gap(#): gap between bars, 0 <= # < 100
axis_options: xlabel(), ylabel(), ytitle(), xtitle()
normal: adds a normal density curve to the graph
caption(): cites the source of the information
title(), subtitle(): titles
note(): a note
Examples: histogram mpg
hist mpg, normal freq xtitle("Miles per gallon") note("Data from auto.dta")
histogram mpg, percent
histogram mpg, fraction
Scatter plots
twoway (typeplot_1 y1 x1) [(typeplot_2 y2 x2) ... (typeplot_n yn xn)] [if] [in] [weight] [, options]
typeplot: the plot type:
scatter: scatter plot of points
lfit: linear prediction line
qfit: quadratic prediction curve
fpfit: fractional-polynomial prediction curve
lowess: LOWESS curve (locally weighted scatterplot smoothing)
Example:
sysuse auto
scatter mpg weight // a simple plot
Scatter plots
A scatter plot gives a picture of the relationship between two quantities:
- Form of the relationship: if the points cluster in a band along a straight line, the relationship can be called linear
- Strength of the relationship: the narrower the band of points, the stronger the relationship
- Direction of the relationship: increasing or decreasing.
scatter mpg weight
Scatter plots
Some options: msymbol(symbolstylelist)
symbolstyle         synonym   description
circle              O         solid
diamond             D         solid
triangle            T         solid
square              S         solid
plus                +
x                   X
smcircle            o         solid
smdiamond           d         solid
smsquare            s         solid
smtriangle          t         solid
smplus
smx                 x
circle_hollow       Oh        hollow
diamond_hollow      Dh        hollow
triangle_hollow     Th        hollow
square_hollow       Sh        hollow
smcircle_hollow     oh        hollow
smdiamond_hollow    dh        hollow
smtriangle_hollow   th        hollow
smsquare_hollow     sh        hollow
point               p         a small dot
none                i         a symbol that is invisible
scatter mpg weight, msymbol(diamond)
scatter mpg weight, msymbol(x)
Scatter plots
Overlaying prediction curves on the scatter plot helps to judge the form of the relationship.
Example:
twoway (scatter mpg weight) (lfit mpg weight) (qfit mpg weight)
Scatter plots
Label the points with values: mlabel(varlist)
xscale() and yscale(): nolog is the default; log switches to a logarithmic scale, which helps when the values are concentrated in one range.
Examples: scatter mpg weight, xscale(log)
scatter mpg weight, msymbol(plus) mlabel(mpg)
Bar plots (twoway bar)
Syntax: twoway bar yvar xvar [if] [in] [, options]
Options:
vertical: draws the bars vertically
horizontal: draws the bars horizontally
Other options are as for histogram
Example: using the sp500 data, draw a bar plot of the price change (variable change) by day (variable date)
twoway bar change date in 1/52
Matrix plots
A matrix plot extends the two-way scatter plot to every pair of variables.
Syntax: graph matrix varlist [if] [in] [weight] [, options]
[Figure: layout of a scatterplot matrix for variables v1 to v5; the cell in row i and column j plots vi against vj, with x axes axis(1) to axis(5) and y axes axis(1) to axis(5) placed around the edges]
Example using the auto data:
sysuse auto, clear
graph matrix mpg price weight length
Box plots
There are two kinds of box plot:
graph box yvars [if] [in] [weight] [, options]
graph hbox yvars [if] [in] [weight] [, options]
Examples:
graph box mpg
graph box mpg, by(foreign)
Parts of a box plot, from top to bottom:
o               outside values
adjacent line   upper adjacent value
whisker
top of box      75th percentile (upper hinge)
line in box     median
bottom of box   25th percentile (lower hinge)
whisker
adjacent line   lower adjacent value
o               outside value
Pie charts
Slices showing the share of each variable:
graph pie varlist [if] [in] [weight] [, options]
Note: the variables in varlist must be in the same units
Slices showing the percentage or total of a variable for each category of the discrete variable in over():
graph pie varname [if] [in] [weight], over(varname) [options]
Slices showing the frequency of each category of the discrete variable in over():
graph pie [if] [in] [weight], over(varname) [options]
Pie charts
Main options
over(varname): a discrete variable
angle0(#): # is the angle of the first slice; the default is 90
missing: show missing values as a slice on the chart
Example: graph pie mpg, over(foreign)
Pie charts
sysuse auto
gen price1 = price if price < 5000
gen price2 = price if price <8000 & price >=5000
gen price3 = price if price >=8000
graph pie price1 price2 price3 // a simple chart of the first kind
graph pie price1 price2 price3, plabel(_all percent)
graph pie price1 price2 price3, plabel(_all percent) by(foreign, total)
Bar charts
Syntax: graph bar yvars [if] [in] [weight] [, options]
graph hbar yvars [if] [in] [weight] [, options]
yvars: (stat) varlist, where stat is one of: mean median p1 p2 ... p99 sum count min max
Options:
over(varname): varname is a discrete variable; several over() options can be combined
by(varname): like over(varname), but draws a separate panel for each category
blabel(): labels on the bars; the default is none, and other choices include bar and total
Bar charts
sysuse auto
graph bar (mean) price weight, over(foreign)
graph bar (mean) price weight (median) price weight, by(foreign)
graph bar (mean) price weight, by(foreign) blabel(bar)
Saving graphs
Draw the graph: graph pie price1 price2 price3, plabel(_all percent) by(foreign, total)
Save the graph: graph save "E:\graph1.gph", replace
Reopen the graph: graph use "E:\graph1.gph"
Menu: File -> Save (in the Graph window)
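A .gph file can only be opened in Stata. To place a graph in a report it is usually exported to another format with graph export, which the slides do not cover. The sketch below assumes the current working directory is writable; the file name mpg_hist is hypothetical, and EPS is used because it is available on all platforms (other suffixes such as .png depend on the platform and Stata version).

```stata
sysuse auto, clear

* Draw a graph; it becomes the current graph in memory
histogram mpg, percent title("Mileage")

* Save in Stata's own format (.gph is added automatically)
graph save mpg_hist, replace

* Export the current graph; the format is chosen from the suffix
graph export mpg_hist.eps, replace
```

Saving the .gph file as well keeps the graph editable in Stata, while the exported file is a fixed picture.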
