Centro de Enseñanza Técnica Y Superior
MBA
Statistical Methods
Assignment #2
Presented by:
CHAPTER 7
74. Refer to the North Valley Real Estate data, which report information on homes sold during
the last year.
a. The mean selling price (in $ thousands) of the homes was computed earlier to be $357.0, with a
standard deviation of $160.7. Use the normal distribution to estimate the percentage of homes selling for
more than $500,000. Compare this to the actual results. Is price normally distributed? Try another test. If
price is normally distributed, how many homes should have a price greater than the mean? Compare this
to the actual number of homes. Construct a frequency distribution of price. What do you observe?
[Figure: Distribution plot. Normal, Mean = 357, StDev = 160.7. Shaded area to the right of X = 500: 0.1868.]
Under the normal model, the probability that a home sells for more than $500,000 is 18.68%. The
data are treated as normally distributed.
Of the 105 prices, 14 (13.3%) are greater than $500,000.
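The tail probability read from the plot can be checked with Python's standard library. This is a minimal sketch, assuming the summary statistics quoted in the exercise (mean 357.0, standard deviation 160.7, in $ thousands) and the counts reported in the answer; the variable names are illustrative.

```python
from statistics import NormalDist

# Summary statistics quoted in the exercise (prices in $ thousands).
price = NormalDist(mu=357.0, sigma=160.7)
n_homes = 105          # homes in the data set, as reported in the answer
actual_over_500 = 14   # observed count above $500,000, as reported in the answer

# Upper-tail probability P(X > 500) under the normal model.
p_over_500 = 1 - price.cdf(500)

# Expected counts if prices followed this normal distribution.
expected_over_500 = n_homes * p_over_500
expected_over_mean = n_homes * (1 - price.cdf(price.mean))  # half of the homes

print(f"P(price > 500) = {p_over_500:.4f}")
print(f"expected homes over $500K = {expected_over_500:.1f}, actual = {actual_over_500}")
print(f"expected homes over the mean = {expected_over_mean:.1f}")
```

Comparing the expected and actual counts is one way to judge how well the normal model fits the prices.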
b. The mean days on the market is 30 with a standard deviation of 10 days. Use the normal distribution to
estimate the number of homes on the market more than 24 days. Compare this to the actual results. Try
another test. If days on the market is normally distributed, how many homes should be on the market
more than the mean number of days? Compare this to the actual number of homes. Does the normal
distribution yield a good approximation of the actual results? Create a frequency distribution of days on
the market. What do you observe?
[Figure: Distribution plot. Normal, Mean = 30, StDev = 10. Shaded area to the right of X = 24: 0.7257.]
Under the normal model, the probability that a home is on the market more than 24 days is 72.57%,
or about 76 of the 105 homes. The data are treated as normally distributed.
Of the 105 observations, 62 are greater than 24 days.
75. Refer to the Baseball 2016 data, which report information on the 30 Major League Baseball
teams for the 2016 season.
a. The mean attendance per team for the season was 2.439 million, with a standard deviation of 0.618
million. Use the normal distribution to estimate the number of teams with attendance of more than 3.5
million. Compare that estimate with the actual number. Comment on the accuracy of your estimate.
[Figure: Distribution plot. Normal, Mean = 2.439, StDev = 0.618. Shaded area to the right of X = 3.5: 0.04301.]
Under the normal model, the probability of attendance above 3.5 million is 4.3%, or about 1.3 of
the 30 teams. The data are treated as normally distributed.
Of the 30 teams, only 2 have attendance above 3.5 million.
Attendance
3764815
3520889
b. The mean team salary was $121 million, with a standard deviation of $40.0 million. Use the normal
distribution to estimate the number of teams with a team salary of more than $100 million. Compare that
estimate with the actual number. Comment on the accuracy of the estimate.
[Figure: Distribution plot. Normal, Mean = 121, StDev = 40. Shaded area to the right of X = 100: 0.7002.]
Under the normal model, the probability of a team salary above $100 million is 70%, or about 21 of
the 30 teams. The data are treated as normally distributed.
Of the 30 teams, 21 have a salary above $100 million.
Team Salary ($ millions):
118.90, 213.50, 168.70, 133.00, 117.20, 126.60, 110.70, 166.50, 117.70, 123.20,
172.80, 120.30, 112.90, 144.80, 146.40, 116.40, 230.40, 174.50, 108.30, 100.10
[Figure: Distribution plot. Normal, Mean = 4552, StDev = 2332. Shaded area to the right of X = 6000: 0.2673.]
Under the normal model, the probability of a maintenance cost above $6,000 is 26.73%. The data are
treated as normally distributed.
Of the 80 observations, only 17 are greater than $6,000.
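The estimates in this chapter can be set side by side with the observed counts. The sketch below assumes only the means, standard deviations, thresholds, sample sizes, and actual counts quoted in the exercises and answers; the function and variable names are illustrative.

```python
from statistics import NormalDist

# (label, mean, std dev, threshold, n, actual count above threshold), all taken from the exercises.
cases = [
    ("price > 500 ($K)",        357.0,  160.7,  500.0, 105, 14),
    ("days on market > 24",      30.0,   10.0,   24.0, 105, 62),
    ("attendance > 3.5 (M)",    2.439,  0.618,    3.5,  30,  2),
    ("team salary > 100 ($M)",  121.0,   40.0,  100.0,  30, 21),
    ("maintenance > 6000 ($)", 4552.0, 2332.0, 6000.0,  80, 17),
]

def expected_count(mean, sd, threshold, n):
    """Expected number of observations above `threshold` under a normal model."""
    return n * (1 - NormalDist(mean, sd).cdf(threshold))

results = [(label, expected_count(m, s, t, n), actual)
           for label, m, s, t, n, actual in cases]

for label, expected, actual in results:
    print(f"{label:26s} expected {expected:6.1f}   actual {actual:3d}")
```

A large gap between an expected and an actual count would be a reason to question the normal approximation for that variable.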
69. Refer to the North Valley Real Estate data, which reports information on homes sold in the area
during the last year. Select a random sample of twenty homes.
a. Based on your random sample of twenty homes, develop a 95% confidence interval for the mean days
on the market.
The data represent the homes sold in the area during the last year.
Using Minitab, find a confidence interval by following these steps:
1.- Import the data.
2.- Select Stat > Basic Statistics.
3.- Select 1 sample t and select Samples in columns.
4.- Click the Options button and choose the level of significance and the alternative hypothesis.
5.- Click OK.
The 95% confidence interval for the mean selling price is 328308 to 437865.
The 95% confidence interval for the mean days on the market is 25.38 to 34.12.
c. The 95% confidence interval for the proportion of homes with a pool is 0.272 to 0.728.
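An interval of this shape is consistent with an exact (Clopper-Pearson) binomial interval. The sketch below rests on two assumptions that the text does not state: that 10 of the 20 sampled homes have a pool (inferred from the interval's midpoint of 0.5), and that an exact method was used. The helper names are hypothetical.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def bisect_increasing(f, lo=0.0, hi=1.0, iterations=60):
    """Root of an increasing function f on [lo, hi], found by bisection."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_proportion_interval(x, n, confidence=0.95):
    """Clopper-Pearson (exact binomial) interval for a proportion."""
    alpha = 1 - confidence
    # Lower limit: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else bisect_increasing(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2)
    # Upper limit: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else bisect_increasing(
        lambda p: alpha / 2 - binom_cdf(x, n, p))
    return lower, upper

# Assumption: 10 of the 20 sampled homes have a pool.
lower, upper = exact_proportion_interval(10, 20)
print(f"({lower:.3f}, {upper:.3f})")
```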
70. Refer to the Baseball 2016 data, which report information on the 30 Major League Baseball
teams for the 2016 season. Assume the 2016 data represents a sample.
a. Develop a 95% confidence interval for the mean number of home runs per team.
We obtain a 95% confidence interval for the mean number of home runs per team as follows:
1.- Import the data.
2.- Select Stat and choose the Basic Statistics option.
3.- Select 1 sample t and select Samples in columns.
4.- Click the Options button and choose the level of significance and the alternative hypothesis.
5.- Click OK.
The 95% confidence interval for the mean number of home runs per team is 151.75 to 175.52.
b. Determine the mean batting average per team with a 95% confidence interval:
Minitab gives the following results:
The 95% confidence interval for the mean team batting average is 0.251 to 0.257.
Using 1 sample t:
The 95% confidence interval for the mean of the average runs produced per team is 3.79 to 4.13.
We obtain a 95% confidence interval for the mean bus odometer miles:
1.- Import the data (add a label to the first column cell and enter the data in the relevant column).
2.- Select Stat and choose Basic Statistics.
3.- Select 1 sample t and select Samples in columns.
4.- Click the Options button and choose the level of significance and the alternative hypothesis.
5.- Click OK.
The 95% confidence interval for the mean bus odometer miles is 71038.3 to 84878.9.
c) Write a business memo to the state transportation official to report your results.
The state transportation office found, with 95% confidence, that the mean bus maintenance cost is
between 4033.03 and 5070.74 (part a) and, with 95% confidence, that the mean bus odometer miles is
between 71038.3 and 84878.9 (part b).
b. Refer to the variable on the number of miles driven since the last maintenance. The mean is
11,121 and the standard deviation is 617 miles. Estimate the number of buses traveling more than
11,500 miles since the last maintenance. Compare that number with the actual value. Create a
frequency distribution of miles since maintenance cost. Is the distribution normally distributed?
a) What is the typical salary for a team? What is the range of the salaries?
b) Comment on the shape of the distribution. Does it appear that any of the teams have a salary
that is out of line with the others?
c) Draw a cumulative relative frequency distribution of team salary. Using this distribution, forty
percent of the teams have a salary of less than what amount? About how many teams have a
total salary of more than $220 million?
[Figure: Cumulative relative frequency distribution of team salary. Classes: 65≤x≤100, 100≤x≤135, 135≤x≤170, 170≤x≤205, 205≤x≤240.]
CHAPTER 3
86. Refer to the North Valley Real Estate data and prepare a report on the sales prices of the
homes. Be sure to answer the following questions in your report.
[Figure: Histogram of Price. X-axis: Price (200000 to 900000); y-axis: Frequency.]
[Figure: Histogram of FICO. X-axis: FICO (600 to 800); y-axis: Frequency.]
a) Around what values of price do the data tend to cluster? What is the mean sales price? What is
the median sales price? Is one measure more representative of the typical sales prices than the
others?
The mean sales price is $357,026 and the median is $323,417. Because the mean is larger than the
median, the distribution is skewed to the right (positively skewed). The median, $323,417, is
therefore more representative of the typical sales price.
b) What is the range of sales prices? What is the standard deviation? About 95% of the sales prices
are between what two values? Is the standard deviation a useful statistic for describing the
dispersion of sales price?
The range of the sales prices is $751,518 and the standard deviation is $160,700. About 95% of
the sales prices lie between:
Mean - 2 × standard deviation = 357,026 - 2(160,700) = 35,626
Mean + 2 × standard deviation = 357,026 + 2(160,700) = 678,426
That is, about 95% of the sales prices are between $35,626 and $678,426.
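The two-standard-deviation range can be reproduced in a few lines. This sketch assumes the mean and standard deviation quoted in the answer, and it also computes the share of a normal distribution that lies within two standard deviations of the mean, which is where the "about 95%" figure comes from; variable names are illustrative.

```python
from statistics import NormalDist

mean_price = 357_026   # mean sales price from the answer, in dollars
sd_price = 160_700     # standard deviation from the answer, in dollars

lower = mean_price - 2 * sd_price
upper = mean_price + 2 * sd_price

# Share of a normal distribution within two standard deviations of its mean.
z = NormalDist()
share_within_2sd = z.cdf(2) - z.cdf(-2)

print(f"{lower:,} to {upper:,}")
print(f"normal share within 2 sd = {share_within_2sd:.4f}")
```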
87. Refer to the Baseball 2016 data, which report information on the 30 Major League Baseball
teams for the 2016 season. Refer to the variable team salary.
a) Prepare a report on the team salaries. Be sure to answer the following questions in your report.
1.1. Around what values do the data tend to cluster? Around salaries between $75 and $125
million.
[Figure: Histogram of Team Salary. X-axis: Team Salary (75 to 225); y-axis: Frequency.]
Mean - 2 × standard deviation = 121.94 - 2(40.58) = 40.78
Mean + 2 × standard deviation = 121.94 + 2(40.58) = 203.1
About 95% of the team salaries are between these two values: 40.78 and 203.1.
b) Refer to the information on the average salary for each year. In 2000 the average player
salary was $1.99 million. By 2016 the average player salary had increased to $4.40 million.
What was the rate of increase over the period?
There are 16 annual periods between 2000 and 2016, so n = 16:
$$GM = \sqrt[16]{\frac{4.40}{1.99}} - 1 \approx 0.0508$$
The average player salary increased at a rate of about 5.08% per year.
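The geometric mean rate of increase can be checked numerically. This sketch assumes the number of periods is the number of year-to-year steps (2016 - 2000 = 16); counting 17 periods instead gives a slightly lower rate, so both are computed. The function name is hypothetical.

```python
def geometric_rate(start, end, periods):
    """Average rate of increase per period: (end / start) ** (1 / periods) - 1."""
    return (end / start) ** (1 / periods) - 1

start_salary = 1.99   # average player salary in 2000, $ millions
end_salary = 4.40     # average player salary in 2016, $ millions

rate_16 = geometric_rate(start_salary, end_salary, 2016 - 2000)  # 16 yearly steps
rate_17 = geometric_rate(start_salary, end_salary, 17)           # if 17 periods are counted

print(f"16 periods: {rate_16:.4f}")
print(f"17 periods: {rate_17:.4f}")
```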
CHAPTER 4
44. Refer to the North Valley real estate data recorded on homes sold during the last year.
Prepare a report on the selling prices of the homes based on the answers to the following
questions.
a) Compute the minimum, maximum, median, and the first and the third quartiles of price.
Create a box plot. Comment on the distribution of home prices.
Variable  N  N*  Minimum  Q1  Median  Q3  Maximum
[Figure: Boxplot of Price. Y-axis: Price (100000 to 1000000).]
b) Develop a scatter diagram with price on the vertical axis and the size of the home on the
horizontal. Is there a relationship between these variables? Is the relationship direct or indirect?
[Figure: Scatter plot of Price (vertical axis, 100000 to 900000) versus Size (horizontal axis, 1000 to 8000).]
Yes, there is a direct relationship between price and size: as the size of the home increases,
the price also increases.
c) For homes without a pool, develop a scatter diagram with price on the vertical axis and the size
of the home on the horizontal. Do the same for homes with a pool. How do the relationships
between price and size for homes without a pool and homes with a pool compare?
[Figure: Scatter plot of Price_1 (vertical axis, 100000 to 900000) versus Size_1 (horizontal axis, 1000 to 8000).]
[Figure: Scatter plot of Price_2 (vertical axis, 100000 to 800000) versus Size_2 (horizontal axis, 1000 to 7000).]
Both plots show a direct relationship between price and size. However, for homes with a pool
the points are more widely scattered and more of them fall at high prices than for homes
without a pool.
For homes without a pool, the great majority of the points are concentrated at sizes below
4,000 and prices below 400,000.
45. Refer to the Baseball 2016 data that report information on the 30 Major League Baseball teams
for the 2016 season.
a) In the data set, the year opened, is the first year of operation for that stadium. For each team, use
this variable to create a new variable, stadium age, by subtracting the value of the variable, year
opened, from the current year. Develop a box plot with the new variable, age. Are there any
outliers? If so, which of the stadiums are outliers?
Dodgers.
[Figure: Box plot of stadium age.]
b) Using the variable, salary, create a box plot. Are there any outliers? Compute the quartiles using
formula (4-1). Write a brief summary of your analysis.
[Figure: Box plot of team salary.]
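The quartile calculation requested in part (b) can be sketched as follows, assuming formula (4-1) is the textbook's percentile-location formula L_p = (n + 1)P/100 and that outliers are flagged with the usual box-plot rule of 1.5 times the interquartile range. The function names are hypothetical, and the team salary data themselves are not reproduced here.

```python
def percentile_location(n, p):
    """Position of the p-th percentile in sorted data: L_p = (n + 1) * p / 100."""
    return (n + 1) * p / 100

def percentile_value(data, p):
    """Value at that position, interpolating linearly between neighbouring observations."""
    values = sorted(data)
    loc = percentile_location(len(values), p)
    # Clamp to the data range so extreme percentiles do not run past the ends.
    loc = min(max(loc, 1), len(values))
    below = int(loc)          # 1-based index of the observation at or below the position
    fraction = loc - below
    if below == len(values):
        return values[-1]
    return values[below - 1] + fraction * (values[below] - values[below - 1])

def outlier_fences(q1, q3):
    """Lower and upper limits beyond which a box plot marks points as outliers."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Quartile positions for the 30 teams.
locations = {p: percentile_location(30, p) for p in (25, 50, 75)}
print(locations)
```

With the 30 salaries sorted, `percentile_value(salaries, 25)` and `percentile_value(salaries, 75)` would give Q1 and Q3, and `outlier_fences` the limits for identifying outliers.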
c) Draw a scatter diagram with the variable, wins, on the vertical axis and salary on the horizontal
axis. What are your conclusions?
[Figure: Scatter plot of Wins (vertical axis, 60 to 100) versus Team Salary (horizontal axis, 50 to 250).]
d) Using the variable, wins, draw a dot plot. What can you conclude from this plot?
There is a peak in wins between 67.5 and 68.5.
The distribution shows no apparent skew to the left or to the right.
Wins are spread fairly evenly between 59 and 103.
[Figure: Dotplot of Wins. X-axis: Wins (65 to 100).]