Filling The Gaps Stata

Uploaded by

zbidi seifeddine

0% found this document useful (0 votes)

0 views2 pages

Original Title

filling the gaps stata

Copyright

Available Formats

PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Download as pdf or txt

0% found this document useful (0 votes)

0 views2 pages

Filling The Gaps Stata

Uploaded by

zbidi seifeddine

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Download as pdf or txt

Jump to Page

You are on page 1of 2

Search inside document

The Stata Journal (2005)

5, Number 1, pp. 135–136

Stata tip 17: Filling in the gaps

Nicholas J. Cox
University of Durham, UK
n.j.cox@durham.ac.uk

The fillin command (see [R] fillin) does precisely one thing: it ﬁlls in the gaps in a
rectangular data structure. That is very well explained in the manual entry, but people
who do not yet know the command often miss it, so here is one more plug. Suppose
that you have a dataset of people and the choices they make, something like this:

id choice
1 1
2 3
3 1
4 2

Now suppose that you wish to run a nested logit model using nlogit (see [R] nlogit).
This command requires all choices, those made and those not made, to be explicit.
With even 4 values of id and 3 values of choice, we need 12 observations so that each
combination of variables exists once in the dataset; hence, 8 more are needed in this
case. The solution is just

. fillin id choice

and a new variable, fillin, is added to the dataset with values 1 if the observation
was “ﬁlled in” and 0 otherwise. Thus count if fillin tells you how many obser-
vations were added. You will often want to replace or rename fillin to something
appropriate:

. rename _fillin chosen

. replace chosen = 1 - chosen

If you do not rename or drop fillin, it will get in the way of a subsequent fillin.
Usually, the decision is clear-cut: Either fillin has a natural interpretation, so you
want to keep it, or a relative, under a diﬀerent name; or fillin was just a by-product,
and you can get rid of it without distress.
Another common variant is to show zero counts or amounts explicitly. With a dataset
of political donations for several years, we might want an observation showing that
amount is zero for each pair of donor and year not matched by a donation. This typically
leads to neater tables and graphs and may be needed for modeling: in particular, for
panel models, the zeros must be present as observations. The main idea is the same,
but the aftermath is diﬀerent:

. fillin donor year

. replace amount = 0 if _fillin

c 2005 StataCorp LP dm0011
136 Stata tip 17

Naturally if we have more than one donation from various donors in various years,
we might also want to collapse (or just possibly contract) the data, but that is the
opposite kind of problem.
Yet another common variant is the creation of a grid for some purpose, perhaps
before data entry, or before you draw a graph. You can be very lazy by typing

. clear
. set obs 20
. gen y = _n
. gen x = y
. fillin y x

which creates a 20 × 20 grid. This is good, but sometimes you want something different;
see functions fill() and seq() in [R] egen.
The messiest fillin problems are when some of the categories you want are not
present in the dataset at all. If you know a person is not one of the values of donor, no
amount of filling in will add a set of zeros for that person. One strategy here is to add
pseudo-observations so that every category occurs at least once and then to fillin in
terms of that larger dataset. This is just a variation on the technique for creating a grid
out of nothing.
As far as you can see from what is here, fillin just does things in place, so you
need not worry about file manipulation. This is an illusion, as underneath the surface,
fillin is firing up cross (see [R] cross), which does the work using files. Thus cross
is more fundamental. A forthcoming tip will say more.

Machine Learning Interview Questions
From Everand
Machine Learning Interview Questions
Tech Interviews
Rating: 5 out of 5 stars
5/5 (1)
01 - Describing and Summarizing Data
Document41 pages
01 - Describing and Summarizing Data
ScarfaceXXX
No ratings yet
Correlating Vapor Pressures and Heats of Solution For The Ammonium Nitrate-Water System An Enthalpy-Concentration Diagram PDF
Document5 pages
Correlating Vapor Pressures and Heats of Solution For The Ammonium Nitrate-Water System An Enthalpy-Concentration Diagram PDF
Magdy Saleh
No ratings yet
Session 5 - Correlation and Regression
Document32 pages
Session 5 - Correlation and Regression
Phuong Nguyen Minh
100% (1)
Homework2 PDF
Document3 pages
Homework2 PDF
TeoTheSpaz
No ratings yet
Data Clean R
Document11 pages
Data Clean R
roy.scar2196
100% (1)
Datascienece
Document18 pages
Datascienece
ajus ady
No ratings yet
Prediction
Document10 pages
Prediction
endriasmit_469556062
100% (1)
Self Learning Module Cca
Document14 pages
Self Learning Module Cca
Irene Taba
100% (1)
Gheorghe Micula
Document402 pages
Gheorghe Micula
Mavrodin Constantin
No ratings yet
Big Data
Document5 pages
Big Data
Dinhkhanh Nguyen
100% (4)
Algorithm - Plain English Explanation of Big O - Stack Overflow
Document15 pages
Algorithm - Plain English Explanation of Big O - Stack Overflow
ghera_ghera
No ratings yet
COMP90038 Algorithms and Complexity
Document15 pages
COMP90038 Algorithms and Complexity
Bogdan Jovicic
No ratings yet
Exercise 2
Document3 pages
Exercise 2
Yassine Azzimani
No ratings yet
Lecture 04 Correlation
Document27 pages
Lecture 04 Correlation
Dhruv Patel
No ratings yet
Week13 2 Data Analysis 2
Document44 pages
Week13 2 Data Analysis 2
Obi Salvation
No ratings yet
Speaking Stata: Making It Count: 7, Number 1, Pp. 117-130
Document14 pages
Speaking Stata: Making It Count: 7, Number 1, Pp. 117-130
Mariia Sungatullina
No ratings yet
The Problem of Overfitting - Coursera
Document1 page
The Problem of Overfitting - Coursera
Brian Ramiro Oporto Quispe
No ratings yet
Lab Module: Coding & Big Data
Document30 pages
Lab Module: Coding & Big Data
Kinan
No ratings yet
Appendix C Working With Tableau 10
Document110 pages
Appendix C Working With Tableau 10
cantinflas08
No ratings yet
Data Preparation & Exploration
Document12 pages
Data Preparation & Exploration
IUB VIBES
No ratings yet
Basic Data Science Interview Questions: 1. What Do You Understand by Linear Regression?
Document38 pages
Basic Data Science Interview Questions: 1. What Do You Understand by Linear Regression?
sahil kumar
No ratings yet
CIMA Part 7
Document10 pages
CIMA Part 7
cssaspirantresources
No ratings yet
Prolog Programming - 11334
Document101 pages
Prolog Programming - 11334
Khairul Aizreen
No ratings yet
Cracking The Di (Data Interpretation) Part of Competitive Exams - The Fallacy, How To Prepare and Tips & Tricks
Document5 pages
Cracking The Di (Data Interpretation) Part of Competitive Exams - The Fallacy, How To Prepare and Tips & Tricks
Sanjeev Chaudhary
No ratings yet
Scientific Programming in C II. Data Types: Susi Lehtola
Document19 pages
Scientific Programming in C II. Data Types: Susi Lehtola
Sean Macfoy
No ratings yet
Illustration of Using Excel To Find Maximum Likelihood Estimates
Document14 pages
Illustration of Using Excel To Find Maximum Likelihood Estimates
Rustam Muhammad
No ratings yet
ECMT1020: Introduction To Econometrics Tutorial Questions, Week 3
Document3 pages
ECMT1020: Introduction To Econometrics Tutorial Questions, Week 3
Rainie Xu
No ratings yet
CS168: The Modern Algorithmic Toolbox Lecture #5: Generalization (Or, How Much Data Is Enough?)
Document16 pages
CS168: The Modern Algorithmic Toolbox Lecture #5: Generalization (Or, How Much Data Is Enough?)
Danish Shabbir
No ratings yet
Python Fundamentals
Document100 pages
Python Fundamentals
myanmar lover tamu
No ratings yet
Exploratory Data Analysis-1
Document10 pages
Exploratory Data Analysis-1
Sunil Arava
No ratings yet
Big o
Document21 pages
Big o
Pragati Chimangala
No ratings yet
Benford's Law
Document6 pages
Benford's Law
Scrooty
No ratings yet
Modeling Utah Population Data: Estimates of Utah Resident Population, in 100,000's Year X
Document8 pages
Modeling Utah Population Data: Estimates of Utah Resident Population, in 100,000's Year X
api-284563579
No ratings yet
BC2406 Week 3
Document34 pages
BC2406 Week 3
Alex Tay
No ratings yet
Working With Data: V Balachandra 19F41A0471
Document17 pages
Working With Data: V Balachandra 19F41A0471
balu
No ratings yet
Elitmus Guide - Guide To Solve Cryptic Multiplication Question
Document8 pages
Elitmus Guide - Guide To Solve Cryptic Multiplication Question
Vivekananda Ganjigunta Narayana
0% (1)
STA1007S Lab 4: Scatterplots and Basic Programming: "Hist"
Document9 pages
STA1007S Lab 4: Scatterplots and Basic Programming: "Hist"
mlungu
No ratings yet
TCS Technical Interview Questions and Answers 2011
Document16 pages
TCS Technical Interview Questions and Answers 2011
Rajesh Sinha
No ratings yet
Mathematics JSS 3
Document133 pages
Mathematics JSS 3
Patrick Chukwunonso A.
100% (2)
Sample Latex File
Document2 pages
Sample Latex File
JSBELL1962
No ratings yet
Business Data Mining Week 10 A
Document28 pages
Business Data Mining Week 10 A
pm6566
No ratings yet
Charts & Diagrams Primer
From Everand
Charts & Diagrams Primer
Beam Vanwaardenberg
No ratings yet
Strong Tie:: Q1. Illustrate Granovetter Strenghth of Weak Ties
Document10 pages
Strong Tie:: Q1. Illustrate Granovetter Strenghth of Weak Ties
Samreen Mahagami
No ratings yet
R Regression Exercise 2019
Document9 pages
R Regression Exercise 2019
Clarissa
No ratings yet
Data Mining Assignment
Document8 pages
Data Mining Assignment
Amanat Construction
No ratings yet
Introduction To Python
Document24 pages
Introduction To Python
GideonBardi
No ratings yet
Contest3 Writeup Zhang Edward Das Sauman
Document13 pages
Contest3 Writeup Zhang Edward Das Sauman
api-635899958
No ratings yet
Working With Percents
Document5 pages
Working With Percents
dony65
No ratings yet
Design Integrity Notes
Document4 pages
Design Integrity Notes
djamila abbad
No ratings yet
Stata Working With Ado Files
Document4 pages
Stata Working With Ado Files
chala24
No ratings yet
Data Structures: Will It Work?
Document9 pages
Data Structures: Will It Work?
Hamed Nilforoshan
No ratings yet
3 Awesome Visualization Techniques For Every Dataset: Mlwhiz
Document13 pages
3 Awesome Visualization Techniques For Every Dataset: Mlwhiz
Shivank Das
No ratings yet
Reading 5 - Data Preparation
Document23 pages
Reading 5 - Data Preparation
NR Yalife
No ratings yet
5 Summarizing Data
Document29 pages
5 Summarizing Data
akmam.haque
No ratings yet
Jaeger 2007
Document4 pages
Jaeger 2007
László Sági
No ratings yet
Ggplot2 Exercise
Document6 pages
Ggplot2 Exercise
retokoller44
No ratings yet
Show Me The Numbers Course
Document172 pages
Show Me The Numbers Course
Admire Mamvura
83% (6)
Aiep1 S4 SC
Document8 pages
Aiep1 S4 SC
Ayin Pernito
No ratings yet
Unit 8 - Reading Task
Document3 pages
Unit 8 - Reading Task
Nhu Nguyen
No ratings yet
Mixed Operation and Law
Document8 pages
Mixed Operation and Law
Chenta Sepi
No ratings yet
From Average To K-means
From Everand
From Average To K-means
Beam van Waardenberg
No ratings yet
Excel Data Cleansing Straight to the Point
From Everand
Excel Data Cleansing Straight to the Point
Oz du Soleil
Rating: 4.5 out of 5 stars
4.5/5 (2)
Pivot for office workers: Using Excel 365 and 2021
From Everand
Pivot for office workers: Using Excel 365 and 2021
Ina Koys
No ratings yet
New ways to go!: Modern Excel features making your work easier
From Everand
New ways to go!: Modern Excel features making your work easier
Ina Koys
No ratings yet
Data Analytics Course Outline
Document3 pages
Data Analytics Course Outline
rolland verniun
No ratings yet
Odisha B.Ed Entrance Exam 2024 Syllabus - PDF Download, Paper 1 & 2
Document8 pages
Odisha B.Ed Entrance Exam 2024 Syllabus - PDF Download, Paper 1 & 2
dmuduli033
0% (1)
CH 13 Skate Park Physics
Document4 pages
CH 13 Skate Park Physics
KobiXD
No ratings yet
8.1 Worksheet Polar Coordinates
Document3 pages
8.1 Worksheet Polar Coordinates
RonnieMaeMaullion
No ratings yet
1 Tangent Lines
Document5 pages
1 Tangent Lines
yamadacodm
No ratings yet
Chapter 13
Document33 pages
Chapter 13
Mustafa Farrag
No ratings yet
5 EisnerElliotWDa - 2004 - 12ResearchingImpossib - HandbookOfResearchAnd PDF
Document20 pages
5 EisnerElliotWDa - 2004 - 12ResearchingImpossib - HandbookOfResearchAnd PDF
dvadesetsedam123
No ratings yet
A Qualitative Study of Spending Behavior of ABM Students in STI College Malolos, Bulacan
Document18 pages
A Qualitative Study of Spending Behavior of ABM Students in STI College Malolos, Bulacan
Jonathan Rantael
No ratings yet
Graph Algorithms Using Depth First Search: John Reif, PH.D
Document59 pages
Graph Algorithms Using Depth First Search: John Reif, PH.D
KHAWAJA MANNAN
No ratings yet
Age Estimation Using Secondary Dentin
Document50 pages
Age Estimation Using Secondary Dentin
Manjiri Joshi
100% (1)
Composite Pressure Hulls For Deep Ocean Submersibles
Document13 pages
Composite Pressure Hulls For Deep Ocean Submersibles
Suja Subramanian
No ratings yet
Algebra Word Problems
Document7 pages
Algebra Word Problems
Sudha Puttagunta
No ratings yet
IDEA TEST - Identity and Eating Disorders
Document12 pages
IDEA TEST - Identity and Eating Disorders
Leire Sanz Mulero
No ratings yet
Alternatives To Selected-Response Assessment
Document2 pages
Alternatives To Selected-Response Assessment
yunangraeni
No ratings yet
Boonwattana School M4 Unit Test Permutation and Combination Name: - Class: - Number: - Instructions
Document4 pages
Boonwattana School M4 Unit Test Permutation and Combination Name: - Class: - Number: - Instructions
Sarah Anderson
No ratings yet
Process Design
Document46 pages
Process Design
devya123
No ratings yet
CBSE Class 6 Maths Chapter 5 - Understanding Elementary Shapes Important Questions 2022-23
Document14 pages
CBSE Class 6 Maths Chapter 5 - Understanding Elementary Shapes Important Questions 2022-23
Bharathi
No ratings yet
JMO Geometry
Document8 pages
JMO Geometry
Tulus
No ratings yet
Prediction of Cutting Force and Temperature Rise in The End-Milling Operation
Document12 pages
Prediction of Cutting Force and Temperature Rise in The End-Milling Operation
Palanisamy Ponnusamy
No ratings yet
Darkbasic Programming Manual
Document161 pages
Darkbasic Programming Manual
Andreas Rosenkreutz
0% (1)
Methods and Tools Used in Criticality Analysis in Industrial Systems
Document18 pages
Methods and Tools Used in Criticality Analysis in Industrial Systems
Leandro Baran
100% (1)
The Electric Field of A Uniform Ring of Charge
Document4 pages
The Electric Field of A Uniform Ring of Charge
Mohamed Alsied
No ratings yet
Bharat Khanal Edit
Document15 pages
Bharat Khanal Edit
nabin nit
No ratings yet
Msds
Document2 pages
Msds
Syed Aousaf
No ratings yet
Modellus 2.5 Leaflet
Document2 pages
Modellus 2.5 Leaflet
Comcyt Oaxaca
No ratings yet