Capitulo2 R


HOMEWORK

Chapter 2

Fabricio Trujillo

2.1 Create a frequency distribution table and a percent table of the variable absent12, which gives
the number of times a student was absent in twelfth grade and the proportion in each category,
respectively. Use these tables to answer the following questions.

a. How many students in the NELS dataset were absent 3–6 times in twelfth grade?
180
b. How many students in the NELS dataset were absent exactly 3 times in twelfth grade?
180
c. What percentage of students in the NELS dataset was absent 3–6 times in twelfth grade?
36%
d. What percentage of students in the NELS dataset was absent 6 or fewer times in twelfth
grade?
86%
e. How would you describe the shape of the distribution of absences in twelfth grade –
positively skewed, negatively skewed, or reasonably symmetric? Explain
The distribution of absences is positively skewed; that is, it is asymmetric, with a tail
extending to the right.

2.5 In this exercise, we will make a bar graph of the variable, alcbinge, which indicates whether a
student has ever binged on alcohol, first for females and then for males.
a. What percentage of males in the NELS dataset has ever binged on alcohol?
29.07489%
b. Which gender, males or females, has more of a tendency to have ever binged on alcohol?
Males

2.10 Create a stem-and-leaf of the variable total cholesterol, given by the variable TOTCHOL1
found in the Framingham dataset. Use it to answer the following questions.

a. What is the lowest score in this distribution?
123

b. Are there any outliers (i.e., extreme values) in this distribution? Describe them.
Yes, there is an outlier: the value 464.
c. How would you describe the shape of this distribution? Explain.
The distribution is asymmetric: it has a tail to the right, so it is positively skewed.
d. What is the 50th percentile of these data?

e. What is the highest score in this distribution?
464
f. Can we use this stem-and-leaf to obtain the complete set of original values for this
variable?
Yes. As the stem-and-leaf plot shows, the original values of this variable can be read back
from it.

2.15 Create two histograms, one for a student’s self-concept in eighth grade as given by the
variable slfcnc08 in the NELS dataset and the other for a student’s self-concept in twelfth grade
as given by the variable slfcnc12 in the NELS dataset. Edit the histograms so that they each have
the same scale and range of values on both the horizontal and vertical axes using the xlim and
ylim arguments in the hist function, and use them to answer the following questions.
a. For which grade (eighth or twelfth) is the overall level of self-concept higher?
The level of self-concept is higher in eighth grade.
b. For which grade is the spread of self-concept scores less?
The spread is smaller for eighth-grade students.
c. For which grade is self-concept more skewed, and in which direction?
Self-concept is more skewed for twelfth-grade students, and the skew is negative.

2.20 Why is the calculation of percentiles not appropriate for the variable parmarl8?

This variable is a factor with 6 levels, so calculating percentiles is not appropriate: percentiles
require a numeric variable.

2.25 Construct four side-by-side boxplots of the number of Advanced Placement (AP) classes
offered at a student’s school (apoffer), one for each of the four types of high-school programs
(hsprog) represented in the NELS dataset.

a. How might one explain that there are academic program schools that offer many AP
courses, but not any rigorous academic program schools that offer more than 20?
Advanced Placement is a program that gives students the opportunity to take college-level
courses, and it is much harder for students to take the rigorous academic program.
b. What would be a better way to represent AP courses offerings other than by their number
per school?
They could be summarized in a frequency table, which could then be turned into a histogram.

2.30 For the following variables in the NELS dataset, indicate whether the most appropriate types
of graphical displays are (1) bar chart or pie chart, or (2) histogram, line graph, stem-and-leaf plot,
or boxplot.

a. gender (bar chart or pie chart)
b. urban (bar chart or pie chart)
c. schtyp8 (bar chart or pie chart)
d. tcherint (bar chart or pie chart)
e. numinst (histogram, line graph, stem-and-leaf plot, or boxplot)
f. achrdg08 (histogram, line graph, stem-and-leaf plot, or boxplot)
g. schattrt (bar chart or pie chart)
h. absent12 (bar chart or pie chart)

2.35 On June 3, 2014, H&R Block (www.businessinsider.com/worst-college-majors2014-6)
published pie and bar charts, indicating the best and worst college majors in terms of employment
prospects.
a. In the pie chart, what is the interpretation of the 31 percent associated with the business
major?
Business is the major in greatest demand: at 31%, it has the largest share of all the majors
shown.
b. What would be an advantage of using a bar chart instead of a pie chart to depict the most
in-demand college majors?
A bar chart presents the data simply and makes it easy to compare each major with the
others. It can also accommodate more categories than a pie chart can.

c. What percentage of recent grads who majored in anthropology and archeology are
unemployed?
10.5%
d. Why does it make sense for this bar chart to be stacked?
Stacking the bars makes it possible to compare the two pieces of information directly: the
chart shows the difference in unemployment rates between experienced and inexperienced
graduates.

2.40 Determine what, if anything, is wrong with the following statement: If it takes 10 raw score
points to go from a percentile rank of 50 to a percentile rank of 58, then it must also take 10 raw
score points to go from a percentile rank of 90 to a percentile rank of 98

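The statement is wrong. The number of raw score points between two percentile ranks depends on
how the data are distributed: where scores are densely packed, a few raw points cover many
percentile ranks, while in a sparse tail many more raw points are needed to cover the same number.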
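
As an illustration of this point, here is a minimal sketch in Python (not the R workflow used in the exercises). It assumes, purely for the example, that raw scores follow a normal distribution with mean 50 and standard deviation 10. That distribution and the helper name `raw_score` are hypothetical and do not come from the exercise.

```python
from statistics import NormalDist

# Hypothetical score distribution (an assumption for illustration only).
scores = NormalDist(mu=50, sigma=10)


def raw_score(percentile_rank):
    """Return the raw score located at the given percentile rank (0-100)."""
    return scores.inv_cdf(percentile_rank / 100)


# Raw score points needed to move up 8 percentile ranks near the center...
gap_middle = raw_score(58) - raw_score(50)
# ...and to move up 8 percentile ranks in the upper tail.
gap_tail = raw_score(98) - raw_score(90)

print(f"50th -> 58th percentile: {gap_middle:.2f} raw points")
print(f"90th -> 98th percentile: {gap_tail:.2f} raw points")
```

Under this assumed distribution, the same 8-rank jump costs several times more raw points in the tail than near the center, because far fewer scores lie in the tail.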

2.45 Explain how a boxplot could have only one whisker.

Consider the salaries of 12 workers, 4 of whom earn 700, the highest salary. The third quartile
(Q3) then equals 700, which is also the maximum value, so no whisker extends from that end of
the box.
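
The situation can be checked numerically with a minimal sketch in Python (not the R workflow used in the exercises). The salary values below are hypothetical, chosen only so that the four highest of 12 salaries all equal 700. The whisker rule used is the common convention of extending to the most extreme observation within 1.5 IQR of the box, which is an assumption about how the boxplot is drawn.

```python
from statistics import quantiles

# Hypothetical salaries for 12 workers; the four highest all earn 700.
salaries = [300, 350, 400, 450, 500, 550, 600, 650, 700, 700, 700, 700]

# Quartiles with linear interpolation between the sorted data points.
q1, median, q3 = quantiles(salaries, n=4, method="inclusive")
iqr = q3 - q1

# Conventional fences: whiskers stop at the most extreme data point
# that still lies within 1.5 * IQR of the box.
upper_fence = q3 + 1.5 * iqr
lower_fence = q1 - 1.5 * iqr
upper_whisker_end = max(x for x in salaries if x <= upper_fence)
lower_whisker_end = min(x for x in salaries if x >= lower_fence)

# A whisker is drawn only if it reaches beyond the edge of the box.
has_upper_whisker = upper_whisker_end > q3
has_lower_whisker = lower_whisker_end < q1

print("Q3 equals the maximum:", q3 == max(salaries))
print("Upper whisker drawn:", has_upper_whisker)
print("Lower whisker drawn:", has_lower_whisker)
```

With these values Q3 coincides with the maximum, so only the lower whisker appears.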
