
! all by itself means "not" and reverses whatever condition follows it.

Internally, Stata equates true and false with one and zero. That means you can write:

browse make foreign if foreign

or:

browse make foreign if !foreign

This makes for simple and readable code. Just be careful: anything other than zero will also be
interpreted as true, including missing.
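To see that caveat in action, here is a minimal sketch. It assumes the auto data set these examples use is already in memory (for instance via sysuse auto) and borrows the count command, which reports how many observations satisfy a condition:

```stata
* total number of cars in the data set
count

* rep78 is never zero in this data, so every car is "true" here,
* including the cars whose rep78 is missing
count if rep78

* the cars with a missing rep78 that were swept in by the line above
count if rep78==.
```

The first two counts should match, even though the third shows that some cars have no repair record at all.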

Combining Conditions

You can combine conditions with & (logical and) or | (logical or). The character used for logical
or is called the "pipe" character; on a standard US keyboard you type it by pressing Shift-Backslash,
the key right above Enter. Try:

browse make price mpg if mpg>25 & price<5000

This shows you cars that get more than 25 miles per gallon and cost less than $5,000 (in 1978
dollars). In set theory terms it is the intersection of the two sets. Now try:

browse make price mpg if mpg>25 | price<5000

This shows you cars that get more than 25 miles per gallon or cost less than $5,000. A car must
meet only one of the two conditions to be shown. In set theory terms it is the union of the two
sets.
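One way to convince yourself of the set logic is to count rather than browse. The following is only a sketch, assuming the auto data is loaded. The count command is not introduced in this section, but it accepts if conditions just as browse does:

```stata
* size of each set on its own
count if mpg>25
count if price<5000

* intersection: cars that meet both conditions
count if mpg>25 & price<5000

* union: cars that meet at least one condition
count if mpg>25 | price<5000
```

The last number should equal the first plus the second minus the third, because cars that meet both conditions would otherwise be counted twice.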

All the conditions to be combined must be complete. If you wanted to list the cars that have a 1
or a 2 for rep78 you should not use:

browse make rep78 if rep78==1 | 2

(Why this does what it does is left as an exercise for the reader, but it's not what you want.)
Instead you should use:

browse make rep78 if rep78==1 | rep78==2
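When the list of acceptable values grows, repeating the variable name gets tedious. One alternative, offered here as a sketch rather than as part of the lesson itself, is Stata's inlist() function, which is true when its first argument equals any of the arguments that follow it:

```stata
* true when rep78 equals 1 or 2
browse make rep78 if inlist(rep78, 1, 2)
```

This should select the same cars as the two-condition version.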

Missing Values

If you have missing values in your data, you need to keep them in mind when writing if
conditions. Internally, missing values are stored using the 27 largest possible numbers, starting
with the generic missing value (.) and the extended missing values (.a, .b, etc.) after that in
alphabetical order, so the following inequalities hold:

any observed value < . < .a < .b < .c ... < .x < .y < .z
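You can check this ordering yourself with the display command, which evaluates an expression and prints the result. This is a small sketch, and the number 1000000 is just an arbitrary stand-in for "any observed value." Because true is stored as one, each line should print 1:

```stata
* an ordinary number is smaller than the generic missing value
display 1000000 < .

* the generic missing value is smaller than the first extended missing value
display . < .a

* extended missing values sort alphabetically
display .a < .z
```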

If you want a list of cars that are known to have good repair records, you won't get it with:

browse make rep78 if rep78>3

An easy shortcut is to think of missing values as (positive) infinity: since infinity is greater
than 3, cars with a missing value for rep78 are included in the list. So add a second condition to
exclude them:

browse make rep78 if rep78>3 & rep78<.

Why <. rather than !=. ? In this data set it makes no difference. But if the data set included
extended missing values, the condition !=. would not exclude them. The condition <. excludes
them because extended missing values are greater than the generic missing value. Thus using <.
ensures you're excluding all missing values.
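Stata also provides a missing() function that is true for the generic missing value and for every extended missing value. As a sketch, here is the same selection written with it, for readers who prefer to name the test explicitly:

```stata
* !missing(rep78) is true only when rep78 holds an observed value
browse make rep78 if rep78>3 & !missing(rep78)
```

Both forms exclude all missing values; which one reads better is largely a matter of taste.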

Exercise: Browse domestic cars that get more than 25 miles per gallon and are known to
have good repair records (rep78 greater than 3). Then browse foreign cars that cost less
than $5,000 and are not known to have poor repair records (a poor record being rep78 less than or equal to 3).
Include the variables used in the conditions so you can spot-check your results. Explain
why you handled missing values the way you did in both cases.

Options

Options change how a command works. They go after any variable list or if condition, following
a comma. The comma means "everything after this is options" so you only type one comma no
matter how many options you're using.

Consider:

browse make foreign

We know that value labels have been applied to the foreign variable, so the words "Domestic"
and "Foreign" are not the actual values. We can see the values instead of the labels by adding the
nolabel option:

browse make foreign, nolabel

Options must always be one word. Here the words "no" and "label" are combined because
otherwise Stata would think they were two different options.

Many options require additional information, such as a number or a variable they apply to. This
additional information goes in parentheses directly after the option name. To illustrate that we
need to use a command other than browse, because nolabel is the only option it has.

The list command is very similar to browse, but it just lists the data in the Results window. If you
have a log open, the list output will be stored in the log, which is sometimes useful. Try:

list make

The string() option tells the list command to truncate string variables after a given number of
characters, with the number going in the parentheses:

list make, string(5)

You might use the string() option to save space, or if the first part of the string contains all the
information you really need. But it's mostly here as an example of the "additional information
goes in parentheses" syntax you'll use regularly.
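Because a single comma introduces all the options, an option that takes no argument and one that does can sit side by side. This is a sketch, assuming the auto data is loaded; the if condition is there only to keep the output short:

```stata
* one comma, two options: show numeric codes instead of value labels,
* and truncate string variables to five characters
list make foreign if mpg>25, nolabel string(5)
```

The if condition comes before the comma, and the options after it can go in either order.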

Stata reuses option names wherever it makes sense. Thus many commands take a nolabel option
that prompts them to ignore value labels. Other common options include gen() to create a new
variable (with the name of the new variable going in parentheses), by() to act on groups, and
vce() to tell regression commands how to estimate the variance-covariance matrix.
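To make these concrete, here are three commands this section does not otherwise cover, each using one of those options. Treat it as a sketch: it assumes the auto data is loaded, and origin is a hypothetical stub name chosen purely for illustration.

```stata
* gen() names the new variables: tabulate creates origin1 and origin2,
* one indicator variable for each value of foreign
tabulate foreign, gen(origin)

* by() names the grouping variable: compare mean mpg for domestic and foreign cars
ttest mpg, by(foreign)

* vce() chooses the variance estimator: robust standard errors for a simple regression
regress price mpg, vce(robust)
```

In each case the option name is the same one the paragraph describes, and the parentheses carry the extra information that option needs.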
