Learning objective

This module outlines the basics of computers.


What is a computer?

• A computer is a fast electronic data-processing device that receives, stores,
processes, analyzes and retrieves any amount of data or information by
following a set of instructions given to it by a human being. It carries out
the instructions exactly as given, so the accuracy of its results depends on
the correctness of the program and the data supplied.

Components of computer (anatomy of a P.C)

• The components of computers are

o Input unit
o Central Processing Unit (CPU)
▪ Control unit
▪ Arithmetic and Logic unit
▪ Register.
o Memory unit
o Output Unit

Block diagram of a computer



Input unit

• It is a device through which we enter the program and data into a computer. It
performs two important functions.
• First, we feed information or data into the computer for the purpose of
processing. Secondly, we instruct the computer to perform various arithmetic
operations and specify the logical sequence in which they are to be computed.
• Punched cards, punched paper tapes (used in earlier days), magnetic tapes,
magnetic diskettes, magnetic drums, keyboards, compact discs, etc., are some of
the input devices.

Central Processing Unit (CPU)

• It is the main part of a computer system, like the heart of a human being. It
interprets the instructions in the program and executes them one by one.
• It consists of three major units.
• Control Unit
o It controls and directs the transfer of program instructions and data
between various units. The other important functions of the control unit are
▪ opening and closing of proper logic circuits
▪ receiving data from input devices
▪ storing them in memory
▪ getting instructions from the memory
▪ executing the instructions
▪ sending the information to the output unit and so on.
• Arithmetic and Logic unit (ALU)
o Arithmetic operations like addition (+), subtraction (-), multiplication
(*), division (/), exponentiation, etc., and logical operations like
comparisons using the operators <, >, <=, >=, etc., are carried out
in this unit.
• Registers: They are used to store instructions and data for further use.
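
The arithmetic and logical operations named above can be tried directly in Python. This is only an illustrative sketch; the values `a` and `b` and the variable names are assumptions made for the example and are not part of the original notes.

```python
# Illustrative values (assumed for this sketch).
a, b = 7, 2

# Arithmetic operations of the kind carried out by the ALU.
total = a + b        # addition
difference = a - b   # subtraction
product = a * b      # multiplication
quotient = a / b     # division
power = a ** b       # exponentiation

# Logical (comparison) operations give either True or False.
is_less = a < b
is_greater_or_equal = a >= b

print(total, difference, product, quotient, power)
print(is_less, is_greater_or_equal)
```

Each comparison yields a truth value, which is what allows a program to choose between alternative sequences of instructions.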

Memory units

• It is used to store the programs and data. Computers have two types of memories
like a human being: main memory and auxiliary memory. For a human being, the
brain acts as main memory and notes, books and diaries act as auxiliary
memory.
• The main memory is used to store only vital information and auxiliary memory is
used to store a lot of information, which is not frequently used.
• The main memory is of the type of Random Access Memory (RAM) and the
auxiliary memory is of the magnetic type.
• RAM can retain the information only as long as there is electric power. The
auxiliary memories are very voluminous compared to the main memories, but
the access time to auxiliary memory is slow compared to that of main memory.

Output units

• Output devices are used to print/display the useful results or processed data that
are stored in the memory unit. Card punches, paper tape punches (used in
earlier days), dot matrix printers, line printers, plotters, video display units
(VDU), graphic printers, laser printers, etc., are some of the output devices.


Learning objective

This module deals with the classification of computers.


• Computers are classified based on the type of data they are designed to
process. Data may be obtained either as a result of counting or through the
use of some measurement.

• Data obtained by counting are called discrete data. For example, total number of
students in a classroom.
• Continuous data is obtained by measurement. For example, measurement of
temperature or voltage.


Comparison between HUMAN BEING and COMPUTER

• A computer can be compared to a human being. A human being reads data using
eyes, hears data using ears, gathers data using feelings, tastes some data
using the tongue and smells some data using the nose.
• So these parts like eye, ear, nose are used to get input and hence these parts are
comparable to the input units (or) input devices of a computer.
• The output of information can be given orally using mouth and in writing using
hands and sometimes using actions, facial expressions and body postures.
• These parts such as mouth, hands, etc., are used for giving out the information
and are comparable to the output devices of a computer. The logical and
arithmetic operations are done using the brain.
• The data are also kept in the memory. So, the brain is considered as the memory
as well as the processing unit.

Human being                              Computer

• Have common sense.                     • Have no common sense.
• Perform arithmetic calculations        • Perform arithmetic calculations
  in minutes or hours.                     in milliseconds or microseconds.
• Having a poor memory power.            • Having a very good memory power.
• Reliability is not bad.                • Very good reliability.
• Most of the time gives approximate     • At all times gives accurate
  solutions for a problem.                 solutions for a problem.
• Have natural intelligence.             • Have artificial intelligence.
• Have very good thinking power.         • Have no thinking power.
• Learn things quickly and act           • No learning power.
  accordingly.
• Due to tiredness, cannot do            • No tiredness and so, do repetitive
  repetitive jobs for a long time.         jobs any number of times with same
                                           accuracy and efficiency.


Alpha (α)

The probability of Type I error.

Alternate hypothesis

Any hypothesis which is complementary to the null hypothesis is called the alternate hypothesis.
This hypothesis states that there is a difference between the mean of the sample and that of the
population.


ALU

Arithmetic Logic Unit

Analysis of variance (ANOVA)

A statistical technique used to test the equality of three or more sample means and thus make
inferences as to whether the samples come from populations having the same mean.


ANSI

American National Standards Institute

Arithmetic Mean

A central tendency measure representing the arithmetic average of a set of observations.


ASCII

American Standard Code for Information Interchange


Average

A single value that describes the characteristics of the entire mass of data.

Bar diagram

One dimensional diagram where the length of the bar is important and not the width.

Bernoulli process

A Bernoulli process is one where an experiment can result in only one of two mutually exclusive
outcomes such as success or failure, dead or alive, male or female, etc.


Beta (β)

The probability of Type II error.

Between sample variance

An estimate of the population variance derived from the variance among the sample means.

Biased errors

Biased errors are those which arise because of bias in selection, estimation etc.,

Bi-modal distribution

A distribution in which two values occur more frequently than the rest of the values in the data set.

Binomial distribution


A discrete distribution which describes the results of an experiment known as a Bernoulli process.
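
As a rough illustration of the binomial distribution, the sketch below computes the probability of exactly `k` successes in `n` Bernoulli trials. The function name `binomial_pmf` and the coin-toss figures are assumptions made for this example.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent Bernoulli trials,
    each with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Assumed example: three tosses of a fair coin, where success = head.
probabilities = [binomial_pmf(k, 3, 0.5) for k in range(4)]
for heads, prob in enumerate(probabilities):
    print(heads, "head(s):", prob)
```

The probabilities for all possible numbers of successes add up to 1, as they must for any probability distribution.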


Biostatistics

The application of the statistical method or mathematical logic to the analysis and interpretation of
biological variation.

Bivariate analysis

If there is some relationship existing between the two variables, the statistical analysis of such data
is called bivariate analysis.

Chance selection

The selection of units depends entirely on chance, and one does not know beforehand which
unit will actually constitute the sample.

Chi-square distribution

It is a type of probability distribution, differentiated by its degrees of freedom, used to test a
number of different hypotheses about variances, proportions and distributional goodness of fit.

Chi-square test

A statistical test used to determine the significance of overall deviation between observed and
expected frequencies: $\chi^2 = \sum (f_o - f_e)^2 / f_e$, where $f_o$ is the observed frequency
and $f_e$ is the expected frequency.
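
The chi-square formula above can be sketched in a few lines of Python. The function name `chi_square` and the coin-toss counts are hypothetical, chosen only to show the arithmetic.

```python
def chi_square(observed, expected):
    """Sum of (fo - fe)^2 / fe over all classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

# Assumed example: 50 tosses of a coin gave 30 heads and 20 tails,
# while a fair coin would be expected to give 25 of each.
observed = [30, 20]
expected = [25, 25]
chi_square_value = chi_square(observed, expected)
print(chi_square_value)
```

The calculated value is then compared with the table value of chi-square for the appropriate degrees of freedom (here, number of classes minus one).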

Cluster sampling

In such type of sampling, the population is divided into some recognizable sub groups which are
called as clusters.

Coefficient of correlation

The square root of the coefficient of determination. Its sign indicates the direction of the
relationship between two variables, direct or inverse.
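
A minimal sketch of how the coefficient of correlation may be computed, assuming the usual product-moment (Pearson) formula; the function name `correlation_coefficient` and the data are invented for illustration.

```python
from math import sqrt

def correlation_coefficient(x, y):
    """Product-moment correlation between two equally long series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sum_xx = sum((xi - mean_x) ** 2 for xi in x)
    sum_yy = sum((yi - mean_y) ** 2 for yi in y)
    return sum_xy / sqrt(sum_xx * sum_yy)

x = [1, 2, 3, 4, 5]
direct = [2, 4, 6, 8, 10]    # rises with x: direct relationship
inverse = [10, 8, 6, 4, 2]   # falls as x rises: inverse relationship

r_direct = correlation_coefficient(x, direct)
r_inverse = correlation_coefficient(x, inverse)
print(r_direct, r_inverse)
```

A positive sign indicates a direct relationship and a negative sign an inverse relationship, which is the point made in the definition.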

Coefficient of variation

A relative measure of dispersion which expresses the standard deviation as a percentage of mean.
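
The definition can be illustrated with Python's standard `statistics` module. The sketch assumes the population form of the standard deviation (`pstdev`); the observations are invented.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Standard deviation expressed as a percentage of the mean."""
    return pstdev(values) / mean(values) * 100

# Assumed observations with mean 5 and standard deviation 2.
observations = [2, 4, 4, 4, 5, 5, 7, 9]
cv = coefficient_of_variation(observations)
print(round(cv, 2), "%")
```

Because the result is a pure percentage, it can be used to compare the variability of series measured in different units.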

Conditional probability

The probability of one event occurring, given that another event has occurred.
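
As a small worked example, the sketch below uses the standard relation P(A given B) = P(A and B) / P(B) for one throw of a fair die. The events chosen and the helper name `probability` are assumptions for illustration.

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}                     # one throw of a fair die
event_a = {n for n in outcomes if n % 2 == 0}     # A: an even number
event_b = {n for n in outcomes if n > 3}          # B: a number greater than 3

def probability(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

# P(A given B) = P(A and B) / P(B)
p_a_given_b = probability(event_a & event_b) / probability(event_b)
print(p_a_given_b)
```

Of the three outcomes greater than 3 (4, 5 and 6), two are even, which agrees with the computed value.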

Continuous probability distribution

A probability distribution in which the variable is allowed to take any value within the given range.

Contingency table

A table having R rows and C columns. Each row corresponds to a level of one variable; each column
to a level of another variable. Entries in the body of the table are the frequencies with which each
variable combination occurred.

Continuous data

Data that may progress from one class to the next without a break and may be expressed by either
whole numbers or fractions.

Continuous variable

Variables which can take any value within a certain range exhibited by the population.

Convenience sampling

Selection of items results in obtaining a chunk of the population. Chunk is a convenient slice of a
population which is commonly referred to as a sample.



Coordinates

Together, the ordinate and abscissa of a point are called the coordinates of the point.


Correlation

It is a statistical tool which measures the closeness of the relationship between two variables.

Correlation analysis

It is a technique to ascertain the strength of the relationship between two variables.

Correlation and regression

These techniques help in measuring the interdependence or relationship between bivariate data and
in predicting the value of one variable for a given value of the other variable.


CPU

Central Processing Unit

Cumulative frequency distribution

A tabular display of data showing how many observations lie above or below certain values.

Curvilinear relationship

An association between two variables that is described by a curved line.


Data

A collection of any number of related observations on one or more variables.

Data array

The arrangement of raw data by observation in either ascending or descending order.

Data point

A single observation from a data set.

Data set

A collection of data.

Dependent variable

The variable which is to be predicted in regression analysis.


Diagrams

Diagrams help to visualize the meaning of a numerical complex at a single glance.

Direct relationship

A relationship between two variables in which, as the value of the independent variable
increases, so does the value of the dependent variable.

Discontinuous variable

Variables which have only certain fixed numerical values with no intermediate values possible in
between.

Discrete data

Data that do not progress from one class to the next without a break, i.e. where classes represent
distinct categories or counts and may be represented by whole numbers.

Discrete probability distribution

A probability distribution in which the variable is allowed to take only a limited number of values.

Discrete random variable

A discrete random variable is one that is allowed to take on only a limited number of values.


Dispersion

The scatter or variability in a set of data.

Divided bar diagram

In a divided bar diagram, the frequency represented by each bar is divided into its different
components.

Estimation equation

A statistical formula that relates unknown variable to the known variable in regression analysis.


Event

One of the possible outcomes which result from conducting an experiment.


Expected frequency

The frequencies we would expect to see in a contingency table or frequency distribution if the null
hypothesis is true.
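
For a contingency table, expected frequencies are commonly obtained as (row total × column total) / grand total, which is what would be expected if the two variables were independent. The sketch below assumes that formula; the function name `expected_frequencies` and the counts are hypothetical.

```python
def expected_frequencies(table):
    """Expected cell frequencies of an R x C contingency table,
    computed as row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in column_totals] for r in row_totals]

# Assumed 2 x 2 table of observed frequencies.
observed = [[10, 20],
            [30, 40]]
expected = expected_frequencies(observed)
for row in expected:
    print(row)
```

The expected table keeps the same marginal totals as the observed table; only the cell entries change.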


Experiment

The activity that results in an event.


F-distribution

A type of distribution differentiated by two parameters (df-numerator, df-denominator), used
primarily to test hypotheses regarding variances.


F-ratio

A ratio used in the analysis of variance, among other tests, to compare the magnitudes of
two estimates of the population variance to determine if the two estimates are approximately
equal; in ANOVA, the ratio of between-sample variance to within-sample variance is used.
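
The ratio of between-sample variance to within-sample variance can be sketched for a one way classification as follows. The function name `f_ratio` and the three small samples are assumptions for illustration only.

```python
from statistics import mean

def f_ratio(samples):
    """Ratio of between-sample variance to within-sample variance."""
    k = len(samples)                                  # number of samples
    n_total = sum(len(sample) for sample in samples)  # total observations
    grand_mean = mean([x for sample in samples for x in sample])

    # Between-sample sum of squares, with k - 1 degrees of freedom.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Within-sample sum of squares, with n_total - k degrees of freedom.
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)

    between_variance = ss_between / (k - 1)
    within_variance = ss_within / (n_total - k)
    return between_variance / within_variance

samples = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]   # assumed data
f_value = f_ratio(samples)
print(f_value)
```

A ratio close to 1 suggests that the two estimates of variance are approximately equal; the calculated value is judged against the table value of F for (k - 1, n_total - k) degrees of freedom.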

Frequency curve

A frequency polygon smoothed by adding classes and data points to a data set.

Frequency distribution

An organized display of data that shows the number of observations from the data set that fall
into each of a set of mutually exclusive classes.
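
A frequency distribution, together with its cumulative and relative forms, can be built in a few lines. The marks and the class width of 10 below are invented for the example.

```python
from collections import Counter
from itertools import accumulate

observations = [12, 15, 21, 23, 27, 31, 34, 38, 39, 45]   # assumed raw data
class_width = 10

# Assign each observation to the lower limit of its class (10, 20, 30, ...).
counts = Counter((value // class_width) * class_width for value in observations)
lower_limits = sorted(counts)
frequencies = [counts[lower] for lower in lower_limits]

# "Less than" cumulative frequencies and relative frequencies.
cumulative = list(accumulate(frequencies))
relative = [f / len(observations) for f in frequencies]

for lower, f, cf, rf in zip(lower_limits, frequencies, cumulative, relative):
    print(f"{lower}-{lower + class_width - 1}: f={f}, cf={cf}, rf={rf:.1f}")
```

Because the classes are mutually exclusive, every observation is counted exactly once and the frequencies add up to the number of observations.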

Frequency polygon

A line graph connecting the mid points of each class in a data set, plotted at a height corresponding
to the frequency of the class.

Geometric mean

A measure of central tendency used to measure the average rate of change or growth for some
quantity, computed by taking the nth root of the product of ‘n’ values representing change.

Goodness of fit test

A statistical test for determining whether there is a significant difference between an observed
frequency distribution and a theoretical frequency distribution such as binomial, Poisson and


Graph

A graph is a visual form of representation of statistical data.


Harmonic mean

A measure of central tendency used to measure an average rate like kilometers per hour, items
manufactured per day, etc., computed by taking the reciprocal of the arithmetic mean of the
reciprocals of the values of the variable.
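
The harmonic mean can be computed directly from this definition and checked against Python's `statistics.harmonic_mean`. The two speeds below are assumed values for a journey covered in two equal distances.

```python
from statistics import harmonic_mean, mean

speeds = [40, 60]   # km per hour over two equal distances (assumed)

# Reciprocal of the arithmetic mean of the reciprocals.
reciprocals = [1 / s for s in speeds]
hm_by_definition = 1 / mean(reciprocals)

print(hm_by_definition)        # harmonic mean from the definition
print(harmonic_mean(speeds))   # same quantity from the standard library
print(mean(speeds))            # arithmetic mean, for comparison
```

For rates such as speed over equal distances, the harmonic mean gives the true average rate, which is lower than the arithmetic mean of the same values.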


Histogram

A histogram is a set of vertical bars whose areas are proportional to the frequencies represented.


HTML

Hyper Text Markup Language


HTTP

Hyper Text Transfer Protocol


Hypothesis

An assumption or speculation we make about a population parameter.


Independent events

A set of events are said to be independent, if the occurrence of any event does not affect the
chance of the occurrence of any other event of the set.

Independent variable

The known variable or variables in regression analysis.

Inverse relationship

A relationship between two variables in which, as the independent variable increases, the
dependent variable decreases.

Judgment sampling

In judgment sampling, the choice of sample item depends exclusively on the judgment of the



Kurtosis

The degree of peakedness of a distribution.


LAN

Local Area Network


Leptokurtic

A strongly peaked distribution.

Line diagram

Straight lines whose lengths are proportional to the frequencies.

Linear regression analysis

The statistical analysis employed to find out the exact position of the straight line or lines is known
as the linear regression analysis.

Linear relationship

A particular type of association between two variables that can be described statistically by a
straight line.

Lower tailed test

A one tailed hypothesis test in which a sample value significantly below the hypothesized
population value will lead us to reject the null hypothesis.

Marginal probability

The unconditional probability of one event occurring; the probability of a single event.

Marginal totals

The row and column totals of a contingency table are called marginal totals.

Measures of central tendency

A measure indicating the value to be expected of a typical or middle data point.

Measures of dispersion

A measure describing how scattered or spread out the observations in a data set are.


Median

The middle point of a data set, a measure of location that divides the data into halves.

Median class

The class in a frequency distribution that contains the median value for a data set.


Mesokurtic

A moderately peaked distribution.


Mode

The value most often repeated in the data set. It is represented by the highest point in the
distribution curve of a data set.

Multiple bar diagram

The technique of simple bar diagrams can be extended to represent two or more sets of
interrelated data in a diagram.

Mutually exclusive events

Events that cannot happen together.

Negative correlation

The values of two variables move in a reverse direction.

Non proportional stratified sampling

In non proportional stratified sampling, an equal number of units are taken from each stratum
irrespective of its size.

Non-sampling error

Errors that mainly arise at the stages of observation and processing of data are called
non-sampling errors.

Normal distribution

A distribution of continuous random variable with a bell shaped curve. The mean lies at the center
of the distribution and the curve is symmetrical. The two tails extend indefinitely, never touching
the horizontal axis.

Null hypothesis

The null hypothesis is the hypothesis which is tested for possible rejection under the assumption
that it is true.


Ogive

A graph of a cumulative frequency distribution.

One tailed test

A hypothesis test in which there is one rejection region, i.e. we are concerned only with whether
the observed value deviates from the hypothesized value in one direction.

One way classification

When one factor is involved in the analysis of the variance.

Ordinate and abscissa

In the construction of graphs, two simple lines are first drawn which cut each other at right angles.
These lines are called axes. The horizontal line is called the abscissa or x-axis and the vertical
line is called the ordinate or y-axis. The point at which they cut each other is called the point of
origin.

Paired difference test

A hypothesis test of the difference between the sample means of two dependent (paired) samples.


Parameter

A measure which describes the characteristics of a population.

Pascal triangle

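A set of figures written in triangular shape, in which each figure is the sum of the two figures
above it; each row gives the binomial coefficients.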
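
A short sketch that builds the first few rows of the Pascal triangle, using the rule that each inner figure is the sum of the two figures above it. The function name `pascal_triangle` is hypothetical.

```python
def pascal_triangle(n_rows):
    """Return the first n_rows rows of the Pascal triangle."""
    rows = []
    row = [1]
    for _ in range(n_rows):
        rows.append(row)
        # Each inner figure is the sum of the two figures above it.
        inner = [row[i] + row[i + 1] for i in range(len(row) - 1)]
        row = [1] + inner + [1]
    return rows

triangle = pascal_triangle(5)
for row in triangle:
    print(" ".join(str(figure) for figure in row).center(20))
```

The figures in row n are the coefficients in the expansion of (p + q) raised to the power n, which is why the triangle is useful when working with the binomial distribution.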

Percentage bar diagram

In a percentage bar diagram, the length of the bar is equal to 100 and the divisions of the bar
correspond to the percentages of the different components.

Permutation and combination

Permutation refers to the different arrangements of items, where the order matters; combination
refers to a group (selection) of items, where the order does not matter.
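
The difference between arrangements and groups can be seen by counting both for the same items. The sketch uses `math.perm` and `math.comb` (available from Python 3.8); the three letters are an assumed example.

```python
from itertools import combinations, permutations
from math import comb, perm

items = ["A", "B", "C"]

# Arrangements of 2 items out of 3: AB and BA are counted separately.
arrangements = list(permutations(items, 2))
# Groups of 2 items out of 3: AB and BA are the same group.
groups = list(combinations(items, 2))

print(len(arrangements), perm(3, 2))
print(len(groups), comb(3, 2))
```

There are always at least as many permutations as combinations, because each group can be arranged in several orders.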

Pie diagram

Used to represent qualitative data. The different components or frequencies are shown by means
of sectors of a circle; the angles of the sectors are proportional to the respective measurements
of the different components.


Platykurtic

A slightly peaked distribution.

Poisson distribution

It is a discrete probability distribution. The Poisson distribution is a limiting form of the binomial
distribution as ‘n’ moves towards infinity and ‘p’ moves towards zero while ‘np’, the mean, remains
constant.
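
The limiting behaviour described above can be observed numerically. In this sketch the mean `np` is held at 2 while `n` grows; the function names and the chosen numbers are assumptions for illustration.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, mean_value):
    """Poisson probability of exactly k occurrences."""
    return exp(-mean_value) * mean_value**k / factorial(k)

mean_value = 2   # np is kept constant
k = 2            # probability of exactly 2 occurrences

for n in (10, 100, 1000):
    p = mean_value / n
    print(n, binomial_pmf(k, n, p), poisson_pmf(k, mean_value))
```

As `n` increases and `p` decreases, the binomial probability moves closer to the Poisson probability.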


Population

The total number of individual observations from which inferences are to be made at a particular

Positive correlation

The values of two variables move in same direction.


Probability

Probability is the likelihood of occurrence of an event.

Probability distribution

Probability distributions are distributions which are not obtained by actual observations or
experiments but are mathematically deduced on certain assumptions.

Probability tree

A graphical representation which shows the possible outcomes of a series of experiments and their
respective probabilities.

Proportional stratified sampling

Proportional stratified sampling is one in which items are taken from each stratum in proportion to
the number of units in the stratum relative to the total population.
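
Proportional allocation can be sketched as follows. The function name `proportional_allocation`, the strata and their sizes are all hypothetical.

```python
def proportional_allocation(stratum_sizes, sample_size):
    """Number of units to draw from each stratum, in proportion to its size.

    Rounding is used, so for awkward sizes the allocations may not add up
    exactly to sample_size and may need a small manual adjustment.
    """
    total = sum(stratum_sizes.values())
    return {name: round(sample_size * size / total)
            for name, size in stratum_sizes.items()}

# Assumed population of 1000 farms divided into three strata.
strata = {"small farms": 500, "medium farms": 300, "large farms": 200}
allocation = proportional_allocation(strata, 100)
print(allocation)
```

Each stratum contributes the same fraction of its units (here one in ten), so the sample mirrors the composition of the population.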

Qualitative characters

Individuals comprising the materials under consideration are distinguished by some quality.

Quantitative characters

Individuals are distinguished by measurements.

Quota sampling

Quota sampling is a type of judgment sampling.

Random sample

A random sample is one where each item of the population has an equal chance of being included
in the sample.

Random sampling

Random sampling is one in which each and every item of the population has the same probability
of being included in the sample. It completely depends on elements of chance.

Random selection

It is one in which each unit of the population has the same chance of being included in the sample.


Randomization

Randomization is a required condition for an experimental design, which is necessary to obtain a
valid estimate of the error variation. The process of randomization will ensure that the various
soil conditions represented in the field will have an equal chance of being used in the experiment.


Range

The difference between the largest and smallest values of the distribution of data.

Raw data

Information before it is arranged or analyzed by statistical methods.


Regression

The statistical method which helps us to estimate the unknown value of one variable from the known
value of the related variable.

Regression analysis

This technique is meant for measuring the probable form of relationship between the two

Regression equations

The two equations based on two regression lines are called regression equations.

Regression line

A line fitted to a set of data points to estimate the relation between two variables.
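
A minimal sketch of fitting a regression line, assuming the ordinary least-squares method (the notes do not name a fitting method). The function name `least_squares_line` and the data points are invented.

```python
def least_squares_line(x, y):
    """Return (slope, y_intercept) of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sum_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum_xy / sum_xx
    y_intercept = mean_y - slope * mean_x
    return slope, y_intercept

x = [1, 2, 3, 4]   # independent (known) variable, assumed values
y = [3, 5, 7, 9]   # dependent variable, assumed values

slope, y_intercept = least_squares_line(x, y)
predicted = y_intercept + slope * 5   # estimate y for x = 5
print(slope, y_intercept, predicted)
```

The slope gives the change in the dependent variable for each unit change in the independent variable, and the Y-intercept is the value of Y when X is 0.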

Relative frequency distribution

The display of a data set that shows the fraction or percentage of the total data set that falls
into each of a set of mutually exclusive classes.


Replication

The repetition of treatments under investigation is known as replication. Replication increases the
accuracy and the scope of the experiment, and it enables us to determine the magnitude of the
uncontrolled variation that is usually referred to as error.

Representative sample

A sample that contains the relevant characteristics of a population in the same proportion as they
are included in that population.



Sample

Part of the population selected for study.


Sampling

Selection of a part of the population to represent the whole population.

Sampling errors

Errors that arise due to drawing inferences about the population on the basis of a sample are
termed sampling errors.

Sampling units

Every population consists of individuals or items which are known as sampling units.

Scatter diagram

The scatter of various points in the form of dots.

Significance level


Statistical tests fix the probability of committing a Type I error at a certain level, called the
significance level, and minimize the chances of committing a Type II error.


Skewness

The extent to which a distribution of data points is concentrated at one end or other; the lack of


Slope

A constant for any given straight line, whose value represents how much each unit change of the
independent variable changes the dependent variable.

Standard deviation

It is the square root of the arithmetic mean of the squares of all the deviations of a set of
observations in a series from the arithmetic mean.
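
The definition translates step by step into code. The sketch below follows it literally and compares the result with `statistics.pstdev`, which uses the same (population) form; the observations are invented.

```python
from math import sqrt
from statistics import pstdev

def standard_deviation(values):
    """Square root of the arithmetic mean of the squared deviations
    of the observations from their arithmetic mean."""
    n = len(values)
    arithmetic_mean = sum(values) / n
    squared_deviations = [(x - arithmetic_mean) ** 2 for x in values]
    return sqrt(sum(squared_deviations) / n)

observations = [2, 4, 4, 4, 5, 5, 7, 9]   # assumed data, mean = 5
sd = standard_deviation(observations)
print(sd, pstdev(observations))
```

The quantity under the square root is the variance, that is, the mean of the squares of the deviations.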

Standard error of estimate

A measure of the reliability of the estimating equation, indicating the variability of the observed
points around the regression line, i.e. the extent to which observed values differ from their
predicted values on the regression line.

Standard error of the regression coefficient


A measure of the variability of the sample regression coefficients around the true population
regression coefficient.

Standard normal curve

The curve with zero mean and unit standard deviation is known as the standard normal curve.


Statistics

A statistic is a measure computed from the data of a sample. Statistics is also the science which
deals with the collection, analysis and interpretation of numerical data.

Stratified random sampling

In a stratified random sampling, first the population is divided into relatively homogenous groups
or strata and a random sample is drawn from each group or stratum to produce an overall sample.


Symmetry

A characteristic of a distribution in which each half is the mirror image of the other half.

Systematic sampling


In systematic sampling, the first unit is selected at random and the remaining units are selected
at regular intervals. This method is used when a complete list of the population is available.
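
One common way of drawing a systematic sample is to pick a random start within the first interval and then take every k-th unit from the list. The sketch below assumes that procedure; the function name `systematic_sample` and the list of 100 numbered animals are hypothetical.

```python
import random

def systematic_sample(units, sample_size, rng=random):
    """Pick a random start, then every k-th unit from the complete list."""
    if not 0 < sample_size <= len(units):
        raise ValueError("sample_size must be between 1 and len(units)")
    interval = len(units) // sample_size    # the sampling interval k
    start = rng.randrange(interval)         # random start within first interval
    return units[start::interval][:sample_size]

animals = list(range(1, 101))               # complete list of the population
sample = systematic_sample(animals, 10, random.Random(1))
print(sample)
```

Only the starting unit is left to chance; once it is chosen, the rest of the sample is fixed by the interval.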


TCP/IP

Transmission Control Protocol/Internet Protocol

Test of homogeneity

A statistical test used to determine whether two or more independent random samples are drawn
from the same population or from different populations. In independence problems, one
sample is taken into consideration, whereas two or more samples are taken in homogeneity problems.

Test of independence

A statistical test of proportion of frequencies to determine if membership in categories of one
variable is different as a function of membership in the categories of a second variable.

Test of significance

A statistical test used to determine whether observed differences between two samples drawn
from the same population are actually due to chance or whether they are really significant.


Treatment

The object of comparison in the experimental trials.

Two tailed test

A hypothesis test in which the null hypothesis is rejected, if the sample value is significantly higher
or lower than the hypothesized value of the population parameter, a test involving two rejection

Two way classification

When two factors are involved in the analysis of variance.

Type I error

Rejecting null hypothesis when it is true.

Type II error

Accepting a null hypothesis when it is false.

Unbiased errors

Unbiased errors arise due to chance differences between members of the population included in
the sample and those not included in the sample

Unimodal distribution

A distribution in which one value occurs more frequently than the rest of the values in the data set.

Univariate analysis

When only one variable is involved, this type of statistical analysis is called univariate analysis.

Upper tailed test

A one tailed hypothesis test in which a sample value significantly above the hypothesized
population value will lead us to reject the null hypothesis.


URL

Uniform Resource Locator


Variable

Any quantity or quality liable to show variation from one individual to the next in the same


Variance

It is defined as the mean of the squares of the deviations from the arithmetic mean.


Variate

An individual observation of any variable.


WAN

Wide Area Network


WWW

World Wide Web


Y-intercept

A constant for any given straight line whose value represents the value of the Y variable when the X
variable has the value of 0.


AGB-111: Biostatistics and Computer Applications (2+1)

Define the following

1. Primary data

2. Secondary data

3. Geographical classification

4. Chronological classification

5. Frequency

6. Cumulative frequency

7. Sturges' rule

8. Yule's rule

9. Absolute measure of dispersion

10. Relative measure of dispersion

11. Large samples

12. Small samples

13. Type I Error

14. Type II Error

15. Null Hypothesis

16. Alternative Hypothesis

17. Mutually exclusive events

18. Independent events

19. Parameter

20. Statistics

21. Variance

22. Scatter diagram

23. Local Control

24. Randomization

25. Replication

26. Critical difference

27. Sample

28. Population

29. Variable

30. Constant

31. Questionnaire

32. Sheppard’s Correction

33. Yates' Correction

34. Weighted mean

35. Standard Error

36. Comparative dilution assay

37. Analytical dilution assay

38. Therapeutic index

39. Analysis of variance

40. Cumulative frequency distribution

41. Kurtosis

42. Platykurtic

43. Skewness

44. Leptokurtic

Short Questions

1. Give the requisites of an ideal average

2. What is Standard Deviation? Explain it.

3. What do you mean by regression? Explain.

4. What is Normal Distribution? Give its properties.

5. Write about Chi-square test of goodness of fit

6. Write about Paired ‘t’ test

7. Write about Chi-square test of independence

8. Write about Non-paired ‘t’ test


1. Define primary data. Give the various ways of collecting primary data and
discuss their merits and demerits.

2. What are the different graphical representations of data? Explain them.

3. Name the different measures of averages and explain them.

4. Define correlation and give the various methods of measuring correlation.

5. Give the various steps involved in the analysis of a completely randomized
design. Name the different large sample tests and explain them.

