Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 2

In R, statistical models refer to mathematical representations of relationships between variables.

They can be used for a variety of tasks including estimation, prediction, hypothesis testing, and
model selection. R provides many functions and packages for fitting a wide range of statistical
models namely-

linear regression,

logistic regression

decision trees

time series models.

Regression is a statistical technique used to analyze and model the relationship between a
dependent variable and one or more independent variables. The dependent variable is the response
or outcome that we are interested in predicting, while the independent variables are the predictor
or explanatory variables that are used to make predictions.

There are several types of regression, including linear regression, logistic regression, and polynomial
regression.Continuous variables are used.

Regression models are used in a variety of applications, such as in finance to predict stock prices, in
medicine to predict patient outcomes, and in marketing to predict consumer behaviour.

Linear Regression is a statistical technique used to model the linear relationship between a
dependent variable and one or more independent variables.

Linear regression models the relationship between the dependent and independent variables as a
straight line, represented by the equation:

y = β0 + β1x1 + β2x2 + ... + βnxn

where y is the dependent variable, x1, x2, ..., xn are the independent variables, and β0, β1, β2, ..., βn
are the coefficients that represent the intercept and the slopes of the line.

In order to build a linear regression model, data is collected on both the dependent and
independent variables. The goal is to find the coefficients that minimize the difference between the
predicted values of the dependent variable and the actual values. This can be done using various
optimization techniques such as gradient descent or the normal equation.

Logistic Regression is a statistical method for analyzing a dataset in which there are one or more
independent variables that determine an outcome. It is used to predict a binary outcome (1 / 0, Yes /
No, True / False) given a set of independent variables. Used in medical diagnosis, credit scoring, and
marketing research. just like Linear regression assumes that the data follows a linear function,
Logistic regression models the data using the sigmoid function.

The logistic regression model is an equation that is defined as follows:

p = e^(b0 + b1x1 + b2x2 + ... + bnxn) / (1 + e^(b0 + b1x1 + b2x2 + ... + bnxn))

where

p is the probability of the positive class (class 1).

b0, b1, b2, ..., bn are the coefficients of the equation, representing the strength of the relationship
between each independent variable and the outcome.
x1, x2, ..., xn are the independent variables.

The objective of training a logistic regression model is to find the best coefficients (b0, b1, b2, ..., bn)
that minimize the error between the predicted probabilities and the actual outcomes in the training
data.

Once the coefficients have been determined, the model can be used to make predictions on new
data. Given a set of independent variables, the predicted probability of the positive class (class 1)

can be calculated using the logistic regression equation. A threshold is then set, typically 0.5, and
predictions are made based on whether the predicted probability is above or below the threshold.

lm(formula,data)
Following is the description of the parameters used −
 formula is a symbol presenting the relation between x and y.
 data is the vector on which the formula will be applied

You might also like