
Session 6

Selecting and Sampling Cases



Selecting Cases
Selecting the whole set of cases again
Sampling Cases
Split File Command
Practical session 6

SESSION 6: Selecting and Sampling Cases

Selecting Cases

This command is useful if you wish to perform an analysis on a subset of
cases, e.g. only women or only married people from bsas91b.sav, etc.

To select a subset of cases, choose:

Data
Select Cases ...


Figure 6.1
Which gives:

Figure 6.2
[Callout in the figure: this is the shortcut button for 'Select'.]
The default selection is All cases. To select a subset of cases based on
the values of a variable or variables, select the If condition is satisfied
option and click on the If button. The Select Cases: If box will appear
(Figure 6.3).


Figure 6.3

A condition may be created in exactly the same way as in the If dialog box
but here the conditional expression will be used for selecting cases.

The following are examples of case selection conditions ...

RSEX = 2
(Respondent is female)

RSEX = 2 AND MARSTAT = 2
(Respondent is female and living as married)

PRSOCCL < SRSOCCL
(Parent's social class code is lower than the respondent's. Because of
the way class is coded (1 is high, 6 is low), this selects those cases
where downward social mobility has occurred.)
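
For readers who want to check the logic of these conditions outside SPSS, the following is a minimal Python sketch. It is not SPSS syntax. The variable names follow the handout, but the records, the id field and the helper select_cases are invented for illustration only.

```python
# Toy records; variable names follow the handout, values are invented.
cases = [
    {"id": 1, "rsex": 1, "marstat": 1, "prsoccl": 3, "srsoccl": 3},
    {"id": 2, "rsex": 2, "marstat": 2, "prsoccl": 2, "srsoccl": 4},
    {"id": 3, "rsex": 2, "marstat": 1, "prsoccl": 5, "srsoccl": 3},
    {"id": 4, "rsex": 1, "marstat": 2, "prsoccl": 1, "srsoccl": 2},
]


def select_cases(rows, condition):
    """Return only the rows for which the condition is true (hypothetical helper)."""
    return [row for row in rows if condition(row)]


female = select_cases(cases, lambda r: r["rsex"] == 2)
female_living_as_married = select_cases(
    cases, lambda r: r["rsex"] == 2 and r["marstat"] == 2
)
# Lower codes mean higher classes, so parent < respondent is downward mobility.
downward = select_cases(cases, lambda r: r["prsoccl"] < r["srsoccl"])

print([r["id"] for r in female])                    # [2, 3]
print([r["id"] for r in female_living_as_married])  # [2]
print([r["id"] for r in downward])                  # [2, 4]
```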

Let us take RSEX = 2 as an example. If we select on this condition we
obtain the following Data Editor window (Figure 6.4).

Figure 6.4

Notice that some cases have a line through their row numbers. These are
the cases that have been deselected.

Selecting the whole set of cases again

Unselected cases may be either filtered out temporarily or deleted
permanently. If the Unselected Cases Are box has the Filtered radio
button checked, then cases are only temporarily deselected. In this case,
clicking once on the All cases button will restore the whole file again.

If, however, the Deleted button was checked in the Unselected Cases
Are box, then the deselected cases were permanently deleted and the
whole file can only be restored by retrieving the data file again.
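
The difference between the two options can be pictured with a small Python sketch. This is only an illustration of the idea, not of how SPSS stores its data: the records and the selected flag are invented for the example.

```python
cases = [
    {"id": 1, "rsex": 1},
    {"id": 2, "rsex": 2},
    {"id": 3, "rsex": 2},
]

# "Filtered": every case stays in the file; a flag marks which ones are in use.
for row in cases:
    row["selected"] = row["rsex"] == 2
in_use = [row for row in cases if row["selected"]]

# Choosing "All cases" again simply switches the flag back on everywhere.
for row in cases:
    row["selected"] = True
restored = [row for row in cases if row["selected"]]

# "Deleted": the unselected rows are dropped from the working copy and can
# only be recovered by reopening the original data file.
deleted_copy = [row for row in cases if row["rsex"] == 2]

print(len(in_use), len(restored), len(deleted_copy))  # 2 3 2
```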


Sampling Cases

If you are working with a very large data set, it may be advisable to try
out your analysis on a sample before using the whole data set. This can
save a great deal of processing time.

To sample cases, choose:

Data
Select Cases ...

Choose the Random sample of cases option and then click on the
Sample... button.

Figure 6.5

The following dialog box appears:


Figure 6.6

Here, I chose Approximately and entered 50%, but an exact number of
cases can also be sampled from the first x cases.
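
The two sampling options can be sketched in Python as follows. This is a rough illustration, not SPSS's own procedure: it assumes that the approximate option keeps each case independently with the stated probability, and the function names, the seed and the case numbers are invented for the example.

```python
import random


def approximate_sample(rows, proportion, rng):
    """Keep each case independently with the given probability, so the
    sample size is only approximately proportion * len(rows)."""
    return [row for row in rows if rng.random() < proportion]


def exact_sample(rows, n, first_x, rng):
    """Draw exactly n cases at random from the first first_x cases."""
    return rng.sample(rows[:first_x], n)


rng = random.Random(6)         # fixed seed so the sketch is repeatable
cases = list(range(1, 201))    # 200 hypothetical case numbers

approx = approximate_sample(cases, 0.5, rng)
exact = exact_sample(cases, 20, 100, rng)

print(len(approx))  # close to 100, but not necessarily exactly 100
print(len(exact))   # 20
```

The approximate option does not guarantee a particular sample size, which is why the wording in the dialog box is Approximately.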

The 50% random sample gave the following Data Editor window
(Figure 6.7).

Figure 6.7

Notice also that (if the Data Editor is maximised) Filter On appears in the
status bar at the bottom right.


Split File Command

SPSS for Windows enables you to split your data file into separate
groups for analysis. For instance, if the file were split according to the
variable rrgclass (respondent's social class according to the Registrar
General's Classification), and you then asked for the frequencies of the
variable rsex, you would obtain a frequency table of rsex for each social
class. Using the Split File command is equivalent to separately selecting
each category of social class and then running the Frequencies command.

Alternatively, you may wish to perform a particular analysis based not only
on the sex of the respondent but also on their age, say, whether they are
under 40 or aged 40 or over. In other words, you want to split your file
based on two variables. Suppose you wanted a separate frequency table for
the following subgroups:

males under 40
males 40 or over
females under 40
females 40 or over

If this was for bsas91b.sav you would first need to recode the variable
rage into two categories, under 40 and 40 or over, and then use that new
variable, agegroup, in the Groups Based on box. In addition, you would
need to add the variable rsex to this dialog box. Having created
agegroup, choose:

Data
Split File ...



Figure 6.8

In the Split File box, I clicked on the radio button Organize output by
groups, and moved the variables agegroup and rsex into the Groups
Based on: box. This gave Figure 6.9.


Figure 6.9
[Callout in the figure: this is the shortcut button for 'Split File'.]

When you have split the file, Split File On appears in the status bar
of the Data Editor (bottom right corner).

To produce a separate analysis for each section of the split file, proceed
as normal for the analysis (e.g. Frequencies, Crosstabs). For example,
requesting Frequencies for the variable tenure1 gives four separate
frequency tables in the Output Viewer window (see Figure 6.10).


Figure 6.10
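
The recode-then-split procedure described in this section can be sketched in Python as follows. It is a minimal illustration on invented records, not BSAS data and not SPSS syntax. The coding of rsex as 1 = male is an assumption (the handout only states that 2 is female), and the agegroup codes 1 and 2 are chosen for the example.

```python
from collections import Counter, defaultdict

# Invented records; rage is age in years, rsex 1 = male (assumed), 2 = female,
# tenure1 is a categorical code.
cases = [
    {"rsex": 1, "rage": 25, "tenure1": 1},
    {"rsex": 1, "rage": 52, "tenure1": 2},
    {"rsex": 2, "rage": 38, "tenure1": 1},
    {"rsex": 2, "rage": 40, "tenure1": 1},
    {"rsex": 2, "rage": 67, "tenure1": 2},
    {"rsex": 1, "rage": 31, "tenure1": 1},
]

# Recode rage into agegroup: 1 = under 40, 2 = 40 or over.
for row in cases:
    row["agegroup"] = 1 if row["rage"] < 40 else 2

# Split by (agegroup, rsex) and count tenure1 within each group.
tables = defaultdict(Counter)
for row in cases:
    tables[(row["agegroup"], row["rsex"])][row["tenure1"]] += 1

# One frequency table per subgroup, four in all for these records.
for group in sorted(tables):
    print(group, dict(tables[group]))
```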

Once a data file has been split, if you wish to process the whole file again,
the Analyze all cases option must be selected in the Split File box.


Practical session 6

1. Mobility tables

An 'inter-generational social mobility' table cross-tabulates parents' class
by respondents' class, to show the extent to which a society is open or
closed to movement through the class structure. Most mobility tables
studied in the research literature have examined fathers' class against
sons' class and have ignored the class of mothers and daughters. This is
partly because women have for so long been almost ignored by
sociologists, but also because class is normally assessed on the basis of
respondents' occupation and until the 1960s the majority of women were
not in paid employment.

Usually mobility tables are constructed from data about people's actual
occupations categorized into social classes. In the BSAS dataset,
however, the only data on parents' social class comes from respondents'
own rating of their parents' social class. In some ways this is less
satisfactory than occupational data (the ratings may well be confounded
by the respondents' own positions in the class structure, for instance), but
one of the requirements of secondary analysis of data collected by other
people is that one has to make the best of what one has got.

A complication with the interpretation of mobility tables is that the
occupational and class structure has changed significantly over the course
of the century. In a representative sample of the population, there will be
some young respondents whose fathers are still alive and working, and
some old respondents whose fathers retired near the beginning of the
century from an occupational structure very different from the present one.
Thus a variable about the social class of fathers will be a rather messy
composite, holding some data about fathers whose class is assessed in
terms of a class structure which no longer exists and some data about
fathers whose class is assessed in terms of the present structure. One
tactic for getting over this problem is to include only respondents within a
particular age range.

Retrieve the H:\My Documents\spss data\bsas91b.sav file, then select
only those aged between 18 and 40.

Following case selection, enter a CROSSTABS command to obtain the
table of parents' social class (prsoccl) (in the rows) by own social
class (srsoccl). Select row percentages.
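
As a reminder of what a row percentage is, here is a small Python sketch on invented data. The figures are not BSAS results and will not answer the question that follows. The class labels and the helper row_percent are hypothetical.

```python
from collections import Counter

# Invented (parents' class, own class) pairs; not BSAS results.
pairs = [
    ("working", "working"), ("working", "working"), ("working", "middle"),
    ("working", "working"), ("middle", "middle"), ("middle", "working"),
    ("middle", "middle"), ("middle", "middle"),
]

counts = Counter(pairs)
row_totals = Counter(parent for parent, _ in pairs)


def row_percent(parent, own):
    """Percentage of the parents'-class row that falls in the own-class column."""
    return 100.0 * counts[(parent, own)] / row_totals[parent]


own_classes = sorted({own for _, own in pairs})
for parent in sorted(row_totals):
    for own in own_classes:
        print(parent, own, round(row_percent(parent, own), 1))
```

Each row sums to 100%, so a row percentage reads as "of respondents whose parents were in this class, what share now place themselves in that class".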

What percentage of respondents with working class parents now think of
themselves as middle class?

2. Subdividing by sex

The table you have just obtained includes both male and female
respondents. However, the class structure and the mobility of men
and women are very different. It would make more sense to look at
separate mobility tables for the two sexes. One simple way of
achieving this is to get a three-way table by adding RSEX to the
CROSSTABS command. This is done by adding rsex to the box
headed Layer 1 of 1 in the Crosstabs dialog box.
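
Adding a layer variable amounts to producing one two-way table per value of that variable. The Python sketch below illustrates this on invented records. It is not SPSS output, and the coding rsex 1 = male is an assumption.

```python
from collections import Counter, defaultdict

# Invented (rsex, parents' class, own class) triples; not BSAS results.
records = [
    (1, "working", "middle"), (1, "working", "working"),
    (1, "middle", "middle"), (2, "working", "working"),
    (2, "working", "working"), (2, "middle", "working"),
    (2, "middle", "middle"),
]

# One two-way table of counts per value of the layer variable.
layers = defaultdict(Counter)
for rsex, parent, own in records:
    layers[rsex][(parent, own)] += 1

layer_totals = {rsex: sum(table.values()) for rsex, table in layers.items()}

for rsex in sorted(layers):
    print("rsex =", rsex, dict(layers[rsex]), "n =", layer_totals[rsex])

# The layer totals add up to the total of the unlayered table.
print(sum(layer_totals.values()) == len(records))  # True
```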

Compare the resulting tables for men and for women. Is upward mobility
more or less likely for men than for women?

What reservations is it necessary to make about drawing conclusions from
these data?

You will notice that the filtering effect of the SELECT IF has remained in
effect for the three-way cross-tabulation. Compare the sum of the total
cases in the two tables controlling for RSEX with the table total for the
first tabulation. Are they the same?

SAVE YOUR OUTPUT AS EXER6.SPO.
