Business Analytics
I took the data from the Kaggle data set "Trending YouTube Video Statistics".
Descriptive Analysis
Descriptive statistics are brief descriptive coefficients that summarize a given data set,
which can be either a representation of the entire population or a sample of a
population. Descriptive statistics are broken down into measures of central tendency
and measures of variability (spread). Measures of central tendency include the mean,
median, and mode, while measures of variability include standard deviation, variance,
minimum and maximum values, kurtosis, and skewness.
This data set contains statistics for 300 trending YouTube videos. For these
I calculated the measures of central tendency (mean, median, and mode) and the
measures of variability (standard deviation, variance, minimum and maximum values,
kurtosis, and skewness).
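The measures listed above can be sketched in a few lines of Python using only the standard library. The `describe` helper and the `likes` values are hypothetical, chosen for illustration and not taken from the Kaggle data. Skewness and kurtosis are computed here from population moments; spreadsheet and library functions often apply small-sample corrections, so their values can differ slightly.

```python
import statistics

def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    n = len(values)
    mean = statistics.mean(values)
    # Population standard deviation, used to standardise the moments below.
    pstdev = statistics.pstdev(values)
    # Third and fourth central moments (population form, no small-sample correction).
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "stdev": statistics.stdev(values),        # sample form, divides by n - 1
        "variance": statistics.variance(values),  # sample form, divides by n - 1
        "min": min(values),
        "max": max(values),
        "skewness": m3 / pstdev ** 3,
        "kurtosis": m4 / pstdev ** 4 - 3,         # excess kurtosis (normal = 0)
    }

# Hypothetical like counts, for illustration only.
likes = [120, 340, 340, 560, 780, 1500]
summary = describe(likes)
for name, value in summary.items():
    print(f"{name}: {value}")
```

A positive skewness, as in this toy list, indicates a long right tail: a few videos with far more likes than the rest.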
Sample Mean
A sample mean is the average of a set of sampled data. It is a measure of central
tendency and is also used in calculating the standard deviation and the variance of a
data set. The sample mean has a variety of uses, including estimating population
averages.
From the 300 records I took 100 as a sample and calculated the same measures for it:
central tendency (mean, median, and mode) and variability (standard deviation,
variance, minimum and maximum values, kurtosis, and skewness).
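Drawing 100 records out of 300 can be sketched as below. The source does not say how the 100 were chosen, so simple random sampling without replacement is an assumption here, and `population` is a randomly generated stand-in, not the real Kaggle values.

```python
import random
import statistics

random.seed(0)  # fixed seed so the draw is repeatable

# Hypothetical stand-in for the 300 like counts; not the real data.
population = [random.randint(0, 10_000) for _ in range(300)]

# Draw 100 records without replacement (assumed sampling method).
sample = random.sample(population, 100)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print("population mean:", population_mean)
print("sample mean:    ", sample_mean)
```

The sample mean will generally be close to, but not equal to, the population mean; a different seed gives a different sample and a slightly different estimate.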
Correlation
Correlation, in the finance and investment industries, is a statistic that measures the
degree to which two securities move in relation to each other. Correlations are used in
advanced portfolio management, computed as the correlation coefficient, which has a
value that must fall between -1.0 and +1.0.
The Correlation (Population) is 0.364, which means that there is a positive
relationship between likes and dislikes, but it is weak.
The Correlation (Sample Mean) is 0.288, which likewise indicates a positive but weak
relationship between likes and dislikes in the sample.
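For reference, a correlation coefficient of this kind (Pearson's r is assumed, since the source does not name the variant) can be computed as follows. The `pearson_r` helper and both lists are hypothetical illustrations and do not reproduce the 0.364 or 0.288 figures.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical likes and dislikes for five videos, for illustration only.
likes = [10, 20, 30, 40, 50]
dislikes = [3, 1, 4, 2, 6]

r = pearson_r(likes, dislikes)
print(f"r = {r:.3f}")
```

The result always lies between -1.0 and +1.0; values near 0 indicate a weak linear relationship, values near either end a strong one.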
Covariance
Covariance measures the directional relationship between the returns on two assets. A
positive covariance means that asset returns move together while a negative
covariance means they move inversely. Covariance is calculated by analysing
at-return surprises (standard deviations from the expected return) or by multiplying
the correlation between the two variables by the standard deviation of each variable.
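The link between covariance and correlation described above can be checked numerically. This is a minimal sketch using the sample (n - 1) form; the `sample_covariance` helper and the data are hypothetical.

```python
import statistics

def sample_covariance(xs, ys):
    """Sample covariance of two equal-length lists (divides by n - 1)."""
    n = len(xs)
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical likes and dislikes, for illustration only.
likes = [10, 20, 30, 40, 50]
dislikes = [3, 1, 4, 2, 6]

cov = sample_covariance(likes, dislikes)
sd_likes = statistics.stdev(likes)
sd_dislikes = statistics.stdev(dislikes)

# Correlation is covariance rescaled by both standard deviations,
# so covariance = correlation * sd_likes * sd_dislikes.
r = cov / (sd_likes * sd_dislikes)

print("covariance: ", cov)
print("correlation:", r)
```

Unlike correlation, covariance depends on the units of both variables, so its magnitude alone does not indicate how strong the relationship is.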
Regression
Regression is a statistical method used in finance, investing, and other disciplines that
attempts to determine the strength and character of the relationship between one
dependent variable (usually denoted by Y) and a series of other variables (known as
independent variables).
Regression helps investment and financial managers to value assets and understand
the relationships between variables, such as commodity prices and the stocks of
businesses dealing in those commodities.
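As an illustration, a simple linear regression with one independent variable can be fitted by ordinary least squares. Treating dislikes as the dependent variable and likes as the independent variable is an assumption made for this sketch; the `fit_line` helper and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the sum of cross-deviations over the sum of squared x-deviations.
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: likes (X) and dislikes (Y), for illustration only.
likes = [10, 20, 30, 40, 50]
dislikes = [3, 1, 4, 2, 6]

slope, intercept = fit_line(likes, dislikes)
predicted = intercept + slope * 60  # predicted dislikes for a video with 60 likes
print(f"dislikes = {intercept:.2f} + {slope:.2f} * likes")
```

The slope estimates how much Y changes for a one-unit increase in X; it says nothing on its own about whether the relationship is causal.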
T-Test
A t-test is a type of inferential statistic used to determine if there is a significant
difference between the means of two groups, which may be related in certain features.
It is mostly used when the data sets, like the data set recorded as the outcome from
flipping a coin 100 times, would follow a normal distribution and may have unknown
variances. A t-test is used as a hypothesis testing tool, which allows testing of
an assumption applicable to a population.
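A two-sample t statistic can be sketched as follows. Because the variances may be unknown and unequal, this example uses Welch's variant, which is an assumption and not necessarily the test used in this analysis. The `welch_t` helper and both groups are hypothetical. Turning the statistic into a p-value requires the t distribution, which is left out here.

```python
import statistics
from math import sqrt

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    # Squared standard error contributed by each group.
    se2_a = statistics.variance(a) / len(a)
    se2_b = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df

# Hypothetical measurements for two groups, for illustration only.
group_a = [5, 7, 8, 6, 9]
group_b = [1, 2, 3, 2, 2]

t, df = welch_t(group_a, group_b)
print(f"t = {t:.3f}, df = {df:.2f}")
```

A large absolute t value relative to the critical value for the given degrees of freedom suggests that the difference between the group means is unlikely to be due to chance alone.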
Scatter Plot
A scatter plot (aka scatter chart, scatter graph) uses dots to represent values for two
different numeric variables. The position of each dot on the horizontal and vertical
axis indicates values for an individual data point. Scatter plots are used to observe
relationships between variables.
As a general rule, sample sizes of around 30-50 are deemed sufficient for the central
limit theorem (CLT) to hold, meaning that the distribution of the sample means is
approximately normal. The more samples one takes, the more closely the graphed sample
means take the shape of a normal distribution.
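The behaviour of sample means can be explored with a small simulation. This is a sketch under stated assumptions: the skewed `population` is randomly generated (exponential), not the YouTube data, and the `sample_means` helper and the sample sizes 2 and 40 are chosen only for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the simulation is repeatable

# Hypothetical, strongly right-skewed population (exponential distribution).
population = [random.expovariate(1.0) for _ in range(10_000)]

def sample_means(sample_size, draws=2_000):
    """Means of repeated random samples of the given size."""
    return [
        statistics.mean(random.sample(population, sample_size))
        for _ in range(draws)
    ]

means_small = sample_means(2)   # far below the 30-50 rule of thumb
means_large = sample_means(40)  # within the 30-50 rule of thumb

spread_small = statistics.stdev(means_small)
spread_large = statistics.stdev(means_large)

print("population mean:        ", statistics.mean(population))
print("mean of n=40 means:     ", statistics.mean(means_large))
print("spread of n=2 means:    ", spread_small)
print("spread of n=40 means:   ", spread_large)
```

With the larger sample size the sample means cluster more tightly around the population mean, and a histogram of `means_large` would look much closer to a bell curve than one of `means_small`, even though the population itself is skewed.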