
8/30/2021 Asphalt Shingles Data Analysis

An important quality characteristic used by the manufacturers of ABC asphalt
shingles is the amount of moisture the shingles contain when they are
packaged. Customers may feel that they have purchased a product lacking in
quality if they find moisture and wet shingles inside the packaging. In some
cases, excessive moisture can cause the granules attached to the shingles for
texture and coloring purposes to fall off, resulting in appearance
problems. To monitor the amount of moisture present, the company conducts
moisture tests. A shingle is weighed and then dried. The shingle is then
reweighed, and based on the amount of moisture taken out of the product,
the pounds of moisture per 100 square feet are calculated. The company
would like to show that the mean moisture content is less than 0.35 pounds
per 100 square feet.
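
The moisture calculation described above can be sketched as follows. This is a minimal illustration only: the function name, the sample weights, and the shingle area are hypothetical and do not come from the company's test procedure.

```python
def moisture_per_100_sqft(wet_weight_lb, dry_weight_lb, area_sqft):
    """Pounds of moisture removed by drying, scaled to 100 square feet."""
    if area_sqft <= 0:
        raise ValueError("area_sqft must be positive")
    moisture_lb = wet_weight_lb - dry_weight_lb
    return moisture_lb / area_sqft * 100.0

# Hypothetical shingle: 2.50 lb before drying, 2.49 lb after, 3 sq ft of area.
example = moisture_per_100_sqft(2.50, 2.49, 3.0)
print(round(example, 3))
```

For these made-up inputs the result is roughly 0.333 pounds per 100 square feet, just under the 0.35 target.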

Importing the libraries


In [1]:
import pandas as pd

import numpy as np

import copy

import matplotlib.pyplot as plt

import seaborn as sns

import pylab

import math

%matplotlib inline

import os

import warnings

warnings.filterwarnings('ignore')

Loading the Data


In [32]:
df = pd.read_csv('A+&+B+shingles.csv')

df.head(10)

Out[32]: A B

0 0.44 0.14

1 0.61 0.15

2 0.47 0.31

3 0.30 0.16

4 0.15 0.37

5 0.24 0.18

6 0.16 0.42

7 0.20 0.58

8 0.20 0.25

9 0.20 0.41

In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>

RangeIndex: 36 entries, 0 to 35

Data columns (total 2 columns):

# Column Non-Null Count Dtype

--- ------ -------------- -----

0 A 36 non-null float64

1 B 31 non-null float64

dtypes: float64(2)

memory usage: 704.0 bytes

Summary of the Dataset


In [10]:
df.info()

<class 'pandas.core.frame.DataFrame'>

RangeIndex: 36 entries, 0 to 35

Data columns (total 2 columns):

# Column Non-Null Count Dtype

--- ------ -------------- -----

0 A 36 non-null float64

1 B 31 non-null float64

dtypes: float64(2)

memory usage: 704.0 bytes

Checking missing Values


There are a few null values: column B has 5 missing entries.

In [11]:
df.isnull().sum()

Out[11]: A 0

B 5

dtype: int64

Deleting the missing Values


In [31]:
df.dropna(inplace=True)

df.shape

Out[31]: (31, 2)
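
Note that dropna on the whole frame removes every row in which either column is missing, so valid A measurements are discarded along with the missing B values. A small sketch with made-up numbers (not the shingle data) shows the difference between row-wise deletion and dropping missing values per column:

```python
import numpy as np
import pandas as pd

# Made-up toy data: column B has fewer observations than column A.
toy = pd.DataFrame({
    "A": [0.44, 0.61, 0.47, 0.30, 0.15],
    "B": [0.14, 0.15, 0.31, np.nan, np.nan],
})

rowwise = toy.dropna()        # drops rows 3 and 4 entirely
a_only = toy["A"].dropna()    # keeps all five A values
b_only = toy["B"].dropna()    # keeps the three observed B values

print(rowwise.shape, len(a_only), len(b_only))
```

Because the two samples are independent rather than paired, handling each column separately (or passing nan_policy='omit' to the test function) keeps every available observation.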

In [21]:
df.isnull().sum() # Additional Validation

Out[21]: A 0

B 0

dtype: int64

In [35]:
df.describe() # Summary statistics, including the five-number summary of each column

Out[35]: A B

count 36.000000 31.000000

mean 0.316667 0.273548

std 0.135731 0.137296

min 0.130000 0.100000

25% 0.207500 0.160000

50% 0.290000 0.230000



75% 0.392500 0.400000

max 0.720000 0.580000

3.1 Do you think there is evidence that the mean moisture contents in both
types of shingles are within the permissible limits? State your conclusions
clearly, showing all steps.
In [22]:
from scipy import stats

from scipy.stats import ttest_1samp

t_statistic, p_value = ttest_1samp(df.A, 0.35) #FOR COLUMN A

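# Assumption: the source line is truncated after "p_val"; halving the two-sided
# p value matches the one-sided output shown below (Ha: mean < 0.35).
print('One sample t test \nt statistic: {0} p value: {1} '.format(t_statistic, p_value / 2))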

One sample t test

t statistic: -1.6005252585398313 p value: 0.05998085400516971

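With H0: μ ≥ 0.35 and Ha: μ < 0.35, the p-value exceeds 0.05, so do not reject H0. There is not enough evidence
to conclude that the mean moisture content for Sample A shingles is less than 0.35 pounds per 100 square feet.
Using all 36 shingles in Sample A (sample mean 0.3167), the one-sided p-value is 0.0748: if the population mean
moisture content is in fact no less than 0.35 pounds per 100 square feet, the probability of observing a sample of
36 shingles with a sample mean moisture content of 0.3167 pounds per 100 square feet or less is 0.0748. The output
printed above (t = -1.6005, p = 0.0600) differs slightly, which is consistent with the test having been run on the
31 rows that remain after dropna; the conclusion is the same in either case.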
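
As a cross-check, the one-sided p-value of 0.0748 can be approximately reproduced from the summary statistics reported by df.describe() for column A (n = 36, mean 0.316667, std 0.135731). This is a sketch under the assumption H0: μ ≥ 0.35 versus Ha: μ < 0.35; the variable names are illustrative.

```python
import math
from scipy import stats

n = 36
sample_mean = 0.316667
sample_std = 0.135731   # sample standard deviation (ddof=1), as reported by describe()
mu0 = 0.35

standard_error = sample_std / math.sqrt(n)
t_stat = (sample_mean - mu0) / standard_error
# Lower-tail probability of the t distribution with n - 1 degrees of freedom.
p_one_sided = stats.t.cdf(t_stat, df=n - 1)

print(round(t_stat, 4), round(p_one_sided, 4))
```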

In [23]:
t_statistic, p_value = ttest_1samp(df.B, 0.35,nan_policy='omit' ) #FOR COLUMN B

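# Assumption: the source line is truncated after "p_val"; halving the two-sided
# p value matches the one-sided output shown below (Ha: mean < 0.35).
print('One sample t test \nt statistic: {0} p value: {1} '.format(t_statistic, p_value / 2))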

One sample t test

t statistic: -3.1003313069986995 p value: 0.0020904774003191813

Since the p-value is less than 0.05, reject H0. There is enough evidence to conclude that the mean moisture content
for Sample B shingles is less than 0.35 pounds per 100 square feet. p-value = 0.0021. If the population mean
moisture content is in fact no less than 0.35 pounds per 100 square feet, the probability of observing a sample of
31 shingles with a sample mean moisture content of 0.2735 pounds per 100 square feet or less is 0.0021.
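
SciPy 1.6 and later accept an `alternative` argument in ttest_1samp, which returns the one-sided p-value directly instead of requiring the two-sided value to be halved by hand. The sketch below uses only the first ten B values shown by df.head(10), so its numbers are illustrative and will not match the full-sample result.

```python
import numpy as np
from scipy.stats import ttest_1samp

# First ten values of column B, copied from the df.head(10) output.
b_head = np.array([0.14, 0.15, 0.31, 0.16, 0.37, 0.18, 0.42, 0.58, 0.25, 0.41])

t_two, p_two = ttest_1samp(b_head, 0.35)
t_less, p_less = ttest_1samp(b_head, 0.35, alternative='less')   # Ha: mean < 0.35

# With a negative t statistic, the lower-tail p value is half the two-sided one.
print(t_two, p_two, p_less)
```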

3.2 Do you think that the population means for shingles A and B are equal?
Form the hypothesis and conduct the test of the hypothesis. What assumption
do you need to check before the test for equality of means is performed?
In [30]:
#Solution:

#H0 : μ(A)= μ(B)

#Ha : μ(A)!= μ(B)

#α = 0.05

from scipy.stats import ttest_ind

t_statistic,p_value=ttest_ind(df['A'],df['B'],equal_var=True ,nan_policy='omit')

print("t_statistic={} and pvalue={}".format(round(t_statistic,3),round(p_value,3)))

t_statistic=0.985 and pvalue=0.328


As the p-value (0.328) > α (0.05), do not reject H0. There is not enough evidence to conclude that the population
means for shingles A and B differ.

Test Assumptions

When running a two-sample t-test, the basic assumptions are that the distributions of the two populations are
normal and that the variances of the two distributions are the same. If those assumptions are not likely to be
met, another testing procedure could be used.
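
One possible way to check these assumptions is the Shapiro-Wilk test for normality and Levene's test for equal variances, with Welch's t-test as a fallback when the variances look unequal. The choice of tests is an assumption of this sketch, not something the original analysis ran, and the data are only the ten rows shown by df.head(10).

```python
import numpy as np
from scipy import stats

# First ten rows of each column, copied from the df.head(10) output.
a_head = np.array([0.44, 0.61, 0.47, 0.30, 0.15, 0.24, 0.16, 0.20, 0.20, 0.20])
b_head = np.array([0.14, 0.15, 0.31, 0.16, 0.37, 0.18, 0.42, 0.58, 0.25, 0.41])

# Normality of each sample (H0: the data come from a normal distribution).
_, p_shapiro_a = stats.shapiro(a_head)
_, p_shapiro_b = stats.shapiro(b_head)

# Equality of variances (H0: both populations have the same variance).
_, p_levene = stats.levene(a_head, b_head)

# Welch's t-test does not assume equal variances.
_, p_welch = stats.ttest_ind(a_head, b_head, equal_var=False)

print(p_shapiro_a, p_shapiro_b, p_levene, p_welch)
```

A small p-value from Shapiro-Wilk or Levene's test would suggest that the corresponding assumption is doubtful for that sample.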
