1. Explain about distribution functions for random variables & continuous random variables?

Ans:

Random variable

In probability, a real-valued function defined over the sample space of a random experiment is called a random variable. That is, the values of the random variable correspond to the outcomes of the random experiment. Random variables can be either discrete or continuous.

Random Variable and Probability Distribution

The probability distribution of a random variable can be:
• a theoretical listing of outcomes and the probabilities of the outcomes;
• an experimental listing of outcomes associated with their observed relative frequencies;
• a subjective listing of outcomes associated with their subjective probabilities.

The probability that a random variable X takes the value x is given by the probability function of X, denoted $f(x) = P(X = x)$.

A probability distribution always satisfies two conditions:
• $f(x) \ge 0$
• $\sum_x f(x) = 1$

The important probability distributions are:
• Binomial distribution
• Poisson distribution
• Bernoulli distribution
• Exponential distribution
• Normal distribution

Continuous Random Variable

A numerically valued variable is said to be continuous if, in any unit of measurement, whenever it can take on the values a and b, it can also take on every value between a and b. If the random variable X can assume an infinite and uncountable set of values, it is said to be a continuous random variable. When X can take any value in a given interval (a, b), it is said to be a continuous random variable in that interval.

Formally, a continuous random variable is one whose cumulative distribution function is continuous throughout. There are no "gaps" in it, which would correspond to individual numbers that have a non-zero probability of occurring. Instead, such a variable almost never takes an exactly prescribed value c, but there is a positive probability that its value will lie in particular intervals, which can be very small.
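The last point can be checked numerically. The sketch below is only an illustration: it assumes a standard normal variable (chosen here as an example, not prescribed by the answer) and uses its cumulative distribution function to show that ever smaller intervals around c = 1 keep a positive probability, while the single point c has probability zero. The function names are made up for this example.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def interval_probability(a, b):
    """P(a <= X <= b) for a standard normal X, as a difference of CDF values."""
    return normal_cdf(b) - normal_cdf(a)


c = 1.0
for half_width in (0.5, 0.1, 0.01, 0.001):
    p = interval_probability(c - half_width, c + half_width)
    print(f"P({c - half_width:.3f} <= X <= {c + half_width:.3f}) = {p:.6f}")

# A single point is an interval of width zero, so its probability is zero.
point_probability = interval_probability(c, c)
print("P(X = 1.0) =", point_probability)
```

The probabilities shrink with the width of the interval but stay positive, which is the behaviour described for a continuous random variable.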
Definition

Let X be a continuous r.v. Then a probability distribution or probability density function (pdf) of X is a function f(x) such that for any two numbers a and b with a < b, we have

$$P(a \le X \le b) = \int_a^b f(x)\,dx$$

That is, the probability that X is in the interval [a, b] can be calculated by integrating the pdf of the r.v. X. For f(x) to be a legitimate pdf, it must satisfy the following two conditions:
1. $f(x) \ge 0$ for all x
2. $\int_{-\infty}^{\infty} f(x)\,dx$ = area under the entire graph of f(x) = 1

2. Explain about relative frequency distribution & computation of mean, variance?

Ans:

Relative frequency distribution

A relative frequency distribution shows the proportion of the total number of observations associated with each value or class of values. It is related to a probability distribution, which is extensively used in statistics. The distribution consists of the relative frequencies, or proportions (percentages), of observations belonging to each category, interval or class. It is useful for comparing different data sets or for analyzing the distribution of data within a set. The relative frequency is given by

Relative Frequency = Frequency of the Event / Total Number of Events

Mean and variance

The mean is a measure of central tendency and the variance is a measure of dispersion. The mean is the average of a given set of numbers. The variance is the average of the squared differences from the mean. Dispersion tells us how the observed data are scattered and distributed around the centre. Before looking at their properties, we need to be familiar with features such as the mean, median and variance of a given data distribution.

If we multiply the observed values of a random variable by a constant t, its sample mean, sample standard deviation and sample variance will be multiplied by t, |t| and $t^2$, respectively.
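The relative-frequency formula and the scaling rule just stated can be tried on a small data set. This is a minimal sketch: the observations are invented for the illustration, and the variance is computed as the average squared deviation (Python's `pvariance`), which is an assumption about the intended divisor. The same scaling rule also holds with the n - 1 divisor.

```python
import math
from collections import Counter
from statistics import mean, pstdev, pvariance

# Hypothetical observations, invented for this illustration.
observations = [2, 3, 3, 5, 5, 5, 7, 7, 8, 10]

# Relative frequency = frequency of the value / total number of observations.
counts = Counter(observations)
total = len(observations)
relative_frequency = {value: counts[value] / total for value in sorted(counts)}
for value, share in relative_frequency.items():
    print(f"value {value:>2}: frequency {counts[value]}, relative frequency {share:.2f}")

# Multiply every observation by a constant t (negative here, to show the |t|).
t = -3
scaled = [t * x for x in observations]

print("mean:    ", mean(observations), "->", mean(scaled))
print("std dev: ", pstdev(observations), "->", pstdev(scaled))
print("variance:", pvariance(observations), "->", pvariance(scaled))

# The mean scales by t, the standard deviation by |t|, the variance by t squared.
scaling_rule_holds = (
    math.isclose(mean(scaled), t * mean(observations))
    and math.isclose(pstdev(scaled), abs(t) * pstdev(observations))
    and math.isclose(pvariance(scaled), t**2 * pvariance(observations))
)
print("scaling rule holds:", scaling_rule_holds)
```

The relative frequencies always add up to 1, which is what links a relative frequency distribution to a probability distribution.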
Also, if we add a constant m to the observed values of a random variable, that constant will be added to the sample mean, but the sample standard deviation and sample variance remain unchanged. A similar rule applies to the theoretical mean and variance of random variables.

Mean:

For ungrouped data, the formula is

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

where $x_1, x_2, x_3, \dots, x_n$ denote the values of the respective terms and n = number of terms.

For discrete frequency data, the formula for the mean is

$$\bar{x} = \frac{x_1 f_1 + x_2 f_2 + \dots + x_n f_n}{f_1 + f_2 + \dots + f_n}$$

where $x_1, x_2, x_3, \dots, x_n$ denote the values of the respective terms, $f_1, f_2, f_3, \dots, f_n$ denote the frequencies of the respective terms, and n = number of terms.

Variance:

Variance is the expected value of the squared deviation of a random variable from its mean value. Sometimes we measure spread by the mean deviation instead, taking the absolute values of the deviations; absolute values are used because otherwise the positive and negative deviations may cancel each other out. Another way to remove the sign of a deviation is to square it, and this is what the variance does. As squares are never negative, the variance is never negative.

Let us take n observations $a_1, a_2, a_3, \dots, a_n$ and represent their mean by $\bar{a}$. Then the variance is denoted by $\sigma^2$ and given by

$$\sigma^2 = \frac{(a_1 - \bar{a})^2 + (a_2 - \bar{a})^2 + (a_3 - \bar{a})^2 + \dots + (a_n - \bar{a})^2}{n}$$

3. Explain about confidence interval estimation of population parameters?
Ans:

Confidence interval

A confidence interval for a population mean, when the population standard deviation is known, is based on the conclusion of the Central Limit Theorem that the sampling distribution of the sample mean follows an approximately normal distribution.

There are two types of estimates for each population parameter: the point estimate and the confidence interval (CI) estimate. For both continuous variables (e.g., population mean) and dichotomous variables (e.g., population proportion) one first computes the point estimate from a sample. Recall that sample means and sample proportions are unbiased estimates of the corresponding population parameters.

For both continuous and dichotomous variables, the confidence interval estimate (CI) is a range of likely values for the population parameter based on:
• the point estimate, e.g., the sample mean
• the investigator's desired level of confidence (most commonly 95%, but any level between 0-100% can be selected)
• the sampling variability or the standard error of the point estimate.

Confidence Interval for the Population Proportion

If there are at least 5 successes and at least 5 failures, then the confidence interval can be computed with this formula:

$$\hat{p} \pm z \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

The point estimate for the population proportion is the sample proportion, and the margin of error is the product of the Z value for the desired confidence level (e.g., Z = 1.96 for 95% confidence) and the standard error of the point estimate. In other words, the standard error of the point estimate is:

$$SE(\hat{p}) = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

This formula is appropriate for large samples, defined as at least 5 successes and at least 5 failures in the sample. This was a condition for the Central Limit Theorem for binomial outcomes.
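As a rough sketch of this calculation, the function below computes the interval as the sample proportion plus or minus z times its standard error, and refuses to run when the large-sample condition is not met. The function name, its parameters and the counts in the example (40 successes out of 100) are hypothetical and chosen only for illustration.

```python
import math


def proportion_confidence_interval(successes, n, z=1.96):
    """Large-sample confidence interval for a population proportion.

    z = 1.96 corresponds to 95% confidence.
    """
    failures = n - successes
    if successes < 5 or failures < 5:
        # The normal approximation is not appropriate for such small counts.
        raise ValueError("fewer than 5 successes or failures: use an exact method")
    p_hat = successes / n                                   # point estimate
    standard_error = math.sqrt(p_hat * (1.0 - p_hat) / n)   # SE of the point estimate
    margin = z * standard_error                             # margin of error
    return p_hat - margin, p_hat + margin


# Hypothetical sample: 40 successes in 100 observations.
low, high = proportion_confidence_interval(successes=40, n=100)
print(f"95% CI for the proportion: ({low:.3f}, {high:.3f})")
```

A larger sample with the same proportion gives a narrower interval, because n appears in the denominator of the standard error.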
If there are fewer than 5 successes or failures then alternative procedures, called exact methods, must be used to estimate the population proportion [1, 2].

4. Explain about statistical inference, sampling with & without replacement, and random samples?

Ans:

Statistical Inference Definition

Statistical inference is the process of analysing results and drawing conclusions from data subject to random variation. It is also called inferential statistics. Hypothesis testing and confidence intervals are applications of statistical inference. Statistical inference is a method of making decisions about the parameters of a population, based on random sampling. It helps to assess the relationship between the dependent and independent variables. The purpose of statistical inference is to estimate the uncertainty, or sample-to-sample variation. It allows us to provide a probable range of values for the true value of something in the population. The components used for making statistical inference are:
• Sample size
• Variability in the sample
• Size of the observed differences

Types of Statistical Inference

There are different types of statistical inference that are extensively used for drawing conclusions. They are:
• One sample hypothesis testing
• Confidence interval
• Pearson correlation
• Bi-variate regression
• Multi-variate regression
• Chi-square statistics and contingency table
• ANOVA or T-test

Randomly selecting records from a large data set may be helpful if your data set is so large as to prevent or slow processing, or if one is conducting a survey and needs to select a random sample from some master database. When you select records randomly from a larger data set (or some master database), you can achieve the sampling in a few different ways, including:
• sampling without replacement, in which a subset of the observations is selected randomly, and once an observation is selected it cannot be selected again.
• sampling with replacement, in which a subset of observations is selected randomly, and an observation may be selected more than once.
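The two sampling schemes can be sketched with Python's standard `random` module. The list of record IDs below is a stand-in for a master database and is invented for this example, as are the sample size and the seed.

```python
import random

rng = random.Random(42)        # fixed seed so that reruns give the same draws
records = list(range(1, 21))   # hypothetical record IDs 1..20

# Without replacement: each record can be selected at most once.
without_replacement = rng.sample(records, k=8)

# With replacement: the same record may be selected more than once.
with_replacement = rng.choices(records, k=8)

print("without replacement:", without_replacement)
print("with replacement:   ", with_replacement)
print("distinct records drawn without replacement:", len(set(without_replacement)))
print("distinct records drawn with replacement:   ", len(set(with_replacement)))
```

Without replacement the sample size cannot exceed the number of records, whereas with replacement any sample size is possible.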
