02-Introduction To Stata
MR. B. CHIZONDE
Department of Economics, UNZA
©2016
“Whether you think you can or think you can’t, you are
generally right”
Zig Ziglar
OUTLINE
1. What is Stata?
2. The Stata Display
3. Exiting Stata
4. Stata Data Files
5. Opening Stata Data File
6. Basic Tasks in Stata
1. What is Stata?
The name Stata is a syllabic abbreviation of the words
“statistics and data.”
Stata is a general-purpose statistical software package
which was created in 1985 by StataCorp.
Stata is used in conducting research in fields like
economics, sociology, political science, biomedicine,
epidemiology, mathematical statistics and others.
This statistical package has both a graphical user interface
(GUI) and a command line interface.
2. The Stata Display
Stata has the following display components:
1. Command: This is where Stata commands are typed.
2. Results: This is the largest part of the display. It shows the
output from the commands and error messages in red.
3. Review: Shows a list of commands recently executed.
4. Variables: Shows the names of the variables in the data
and their labels (if created).
5. Menu Bar: This is located across the top of the display. It
gives access to the GUI (point-and-click) menus.
6. Toolbar: This is right below the Menu Bar and provides GUI
shortcut buttons.
3. Exiting Stata
Note: Stata can be opened by double-clicking on
the icon on your computer.
It can be exited using the following methods:
1. Menu Bar: File > Exit
2. Command: type exit and press Enter.
3. Window: click the close (X) button of the Stata
window.
5. Opening Stata Data File
*Once the data files have been put in the Working Directory,
opening the individual data file is very simple.
1. Via Commands:
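* Open the data file (fails if unsaved data are in memory)
use cps4_small
* Remove the data currently in memory
clear
* Clear memory and open the data file in one step
use cps4_small, clear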
2. Via Toolbar
Click on the folder icon, and then locate the file using the
dialog box and open it.
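The workflow above can be sketched as follows. This is only an illustration: it uses the auto dataset that ships with Stata as a stand-in for cps4_small, and the file name auto_copy is made up for the example.

```stata
* Show the current working directory
pwd
* Load a dataset shipped with Stata (stand-in for cps4_small)
sysuse auto, clear
* Save a copy in the working directory (auto_copy is a hypothetical name)
save auto_copy, replace
* Reopen the copy by name, clearing whatever is in memory
use auto_copy, clear
describe, short
```

If the data file is stored elsewhere, the cd command changes the working directory before use is run.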
6. Basic Tasks in Stata
1. Editing Variables
(Open cps4_small.dta)
(a) Via Toolbar
Click on the Data Editor icon (in edit mode) on the toolbar.
Once you are in edit mode, you can change variable properties
using the properties section.
(b) Via commands:
* Change the variable name:
rename wage wage1
* Change the variable label:
label variable wage1 "earnings"
(c) Variable Manager
Click on the ‘Variable Manager’ icon on the toolbar and then you can
edit the variable names, labels and more.
2. Stata Command Syntax
Stata commands have the following common syntax:
command [varlist] [if] [in] [weight] [, options]
* command: the name of the command, e.g. summarize
* [varlist]: list of variables for the command
* [if]: condition imposed on the command
* [in]: specific range of observations for the command
* [weight]: used when some sample observations are to be
weighted differently than others.
* [, options]: command options come after a comma.
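Each element of the syntax can be tried in turn. The sketch below assumes the auto dataset that ships with Stata (not the cps4_small file used in these notes); the variables price, mpg, foreign and weight belong to that dataset.

```stata
sysuse auto, clear
* command + varlist
summarize price mpg
* command + varlist + if
summarize price if foreign == 1
* command + varlist + in
summarize price in 1/20
* command + varlist + weight (analytic weights from the variable weight)
summarize mpg [aweight = weight]
* command + varlist + if + in + options
summarize price if foreign == 0 in 1/50, detail
```

The square brackets around if, in and options in the syntax diagram only mark those parts as optional and are not typed; a weight, however, is typed inside square brackets.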
3. Summarize and Describe commands
(These are the most basic Stata commands and usually the first commands
run on the data.)
The following are variations of the commands based on the Stata
syntax.
describe
summarize
sum
d
summarize wage, detail
summarize wage if female==1
sum wage if female==0
summarize if exper>=10
summarize in 1/50
use cps4_small, clear
sum wage if female==1 in 1/500, detail
4. Saving the output
The output can be saved in the following ways:
(a) Copying and pasting
* You can copy Stata output as a picture and then paste it into a Word
document for saving.
(b) Log File
A log file is a simple way to save your output in either text or SMCL
format.
(i) Text Log File:
log using chap01, replace text
sum
describe
log close
(The commands above generate a log file in text format, chap01.log, which is saved
in the working directory.)
Simply go to the working directory and open the document.
(ii) SMCL Log File
The SMCL format generates a log file which looks like Stata output and
is meant to be viewed in Stata.
log using chap02, replace
sum
describe
log close
* The SMCL log file (chap02.smcl) has been generated.
* To open the log file, go to File > View, then locate the log file and open it.
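If a SMCL log needs to be read outside Stata, it can be converted to plain text. A minimal sketch, assuming the auto dataset shipped with Stata in place of cps4_small and reusing the log name chap02 from above:

```stata
* Close any log left open from earlier work (no error if none is open)
capture log close
sysuse auto, clear
* Start a SMCL log, run some commands, and close it
log using chap02, replace
summarize
describe
log close
* Convert the SMCL log to a plain-text log file
translate chap02.smcl chap02.log, replace
```

The resulting chap02.log can be opened in any text editor or word processor.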
5. Stata Graphing
Stata graphing is better done through the GUI and not commands.
(a) Histogram
Graphics > Histogram
Select the variable “wage”, choose percent for the Y-axis and then click OK.
The graph can then be saved, printed or copied.
Command: histogram wage, percent
(b) Scatter Diagram
Graphics > Twoway graph (scatter, line, etc.)
Then click on Create and add the variables: Y-variable “wage” and X-variable “educ”.
Then click Submit.
Command: scatter wage educ
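For those who prefer commands, the two graphs can also be drawn and saved from the Command window or a do-file. The sketch below is an illustration only: it uses the auto dataset shipped with Stata (price and mpg in place of wage and educ), and the output file names are made up.

```stata
sysuse auto, clear
* Histogram with percent on the Y-axis
histogram price, percent
* Export the current graph as a picture (hypothetical file name)
graph export hist_price.png, replace
* Scatter diagram: Y-variable first, then X-variable
scatter price mpg
* Save the graph in Stata's own .gph format (hypothetical file name)
graph save scatter_price_mpg, replace
```

A .gph file can be reopened and edited in Stata later, while the exported picture can be pasted into a document.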
6. Stata Do-Files
A do-file is a file containing the list of commands that will be
executed as a batch.
A do-file can be created from the commands in the review
window.
Right-click in the Review window, select all the commands, then
send them to the Do-file Editor.
After editing the do-file, choose File > Save.
Give it the name do1 and then save it.
Note: * can be used to add comments within the do-file. Any
line starting with * will be skipped when the do-file is executed.
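A do-file along these lines might look as follows. This is a sketch, not the file used in class: it assumes the auto dataset shipped with Stata, and the log name do1_output is made up.

```stata
* do1.do: a minimal do-file sketch
* Lines starting with * are comments and are skipped
capture log close
log using do1_output, replace text
sysuse auto, clear
describe
// In a do-file, two slashes also start a comment
summarize price mpg
log close
```

Once saved as do1.do in the working directory, the whole batch can be run by typing do do1 in the Command window.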
7. Generating Variables
(a) Using GUI
Data > Create or change data > Create new variable
* Under “variable name” put the name of the new variable, e.g. wage2.
Then enter the expression wage^2 and click OK.
(b) Using Commands
generate wage3=wage^3
gen lwage=ln(wage)
drop wage3 wage2
drop if wage>5.50
drop in 1/5
use cps4_small, clear
drop in 51/100
use cps4_small, clear
gen negwage = -wage
gen blackeduc = black*educ
gen wage_educ = wage/educ
gen elwage = exp(lwage)
gen rootexper= sqrt(exper)
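The same pattern of generating, dropping and reloading can be tried on any dataset. The sketch below assumes the auto dataset shipped with Stata, with price standing in for wage; the new variable names are made up for the example.

```stata
sysuse auto, clear
* Powers and logs, stored as double for precision
generate double price2 = price^2
generate double lprice = ln(price)
* exp() undoes ln(), so elprice should match price
generate double elprice = exp(lprice)
summarize price elprice
* Drop a variable, then drop observations
drop price2
drop if price > 10000
* Reload the original data to undo the drops
sysuse auto, clear
```

Dropped variables and observations are gone from memory only; the file on disk is unchanged until the data are saved over it, which is why reloading restores them.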
8. Generating Indicator Variables
To generate an indicator variable we use the generate
command with a condition to be satisfied.
If the condition is satisfied, the variable is assigned the value
1, otherwise it is assigned the value 0.
generate hs=(9<=educ)&(educ<=12)
* This generates an indicator variable named ‘hs’ which is 1 if the
education level is between 9 and 12 years (inclusive) and 0 otherwise.
* To avoid transforming missing values to 0, we add an if
condition when the data has missing values (drop the first hs before rerunning).
generate hs=(9<=educ)&(educ<=12) if !missing(educ)
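The effect of the missing-value condition can be seen on a small made-up dataset. The five educ values below are invented for illustration and do not come from cps4_small; hs2 is a hypothetical name used so that both versions can sit side by side.

```stata
clear
* Toy data: four education levels and one missing value
input educ
8
10
12
16
.
end
* Without the condition, the missing value is coded 0
generate hs = (9 <= educ) & (educ <= 12)
* With the condition, the missing value stays missing
generate hs2 = (9 <= educ) & (educ <= 12) if !missing(educ)
list
```

Stata treats a missing value as larger than any number, so educ<=12 is false for the missing observation and hs becomes 0 there, while hs2 remains missing.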
9. Tabulating Data
Tabulating shows the frequency of observations for
each value of a variable.
tabulate educ
tab educ, gen(educ)
* The last command generates indicator variables (educ1, educ2, ...),
one for each possible value of educ.
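A small sketch of what the gen() option produces, using four made-up educ values (not taken from cps4_small) and the hypothetical stub name ed:

```stata
clear
* Toy data with three distinct education levels
input educ
10
12
12
16
end
* One indicator per distinct value: ed1 (educ==10), ed2 (educ==12), ed3 (educ==16)
tabulate educ, generate(ed)
list
```

The indicators are numbered in ascending order of the values of educ, and exactly one of them equals 1 for each observation.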
END
Reference
Adkins, L., & Hill, R. (2011). Using Stata for Principles of
Econometrics, 4th Edition. USA.