NON-PARAMETRIC TESTS

In the literal meaning of the terms, a parametric statistical test is one that makes assumptions about the parameters (defining properties) of the population distribution(s) from which one's data are drawn, while a non-parametric test is one that makes no such assumptions. Nonparametric statistics refer to a statistical method wherein the data is not required to fit a normal distribution. Nonparametric statistics uses data that is often ordinal, meaning it does not rely on numbers, but rather a ranking or order of sorts.

7.4.1 WILCOXON RANK SUM TEST

The Wilcoxon test refers to either the Rank Sum test or the Signed Rank test. The Signed Rank test is a nonparametric test that compares two paired groups: it calculates the difference within each pair and analyses these differences. The Rank Sum test compares two independent groups and is the same test as the Mann-Whitney U test of section 7.4.2. The Wilcoxon signed-rank test is a non-parametric statistical hypothesis test used when comparing two related samples, matched samples, or repeated measurements on a single sample to assess whether their population mean ranks differ (i.e. it is a paired difference test).

7.4.2 MANN-WHITNEY U TEST

In statistics, the Mann-Whitney U test (also called the Mann-Whitney-Wilcoxon (MWW) test, Wilcoxon rank-sum test, or Wilcoxon-Mann-Whitney test) is a nonparametric test of the null hypothesis that it is equally likely that a randomly selected value from one sample will be less than or greater than a randomly selected value from a second sample. The Mann-Whitney U test is used to compare differences between two independent groups when the dependent variable is either ordinal or continuous, but not normally distributed. In order to run a Mann-Whitney U test, the following four assumptions must be met.
The first three relate to the choice of study design, whilst the fourth reflects the nature of the data. Assumption: there is one dependent variable that is measured at the continuous or ordinal level.

7.4.3 KRUSKAL-WALLIS TEST

The Kruskal-Wallis H test (sometimes also called the "one-way ANOVA on ranks") is a rank-based nonparametric test that can be used to determine if there are statistically significant differences between two or more groups of an independent variable on a continuous or ordinal dependent variable. ANOVA stands for analysis of variance.

7.4.4 FRIEDMAN TEST

The Friedman test is the non-parametric alternative to the one-way ANOVA with repeated measures. It is used to test for differences between groups when the dependent variable being measured is ordinal.

7.2 INTRODUCTION TO RESEARCH

Research methods are the tools and techniques for doing research. Research is a term used liberally for any kind of investigation that is intended to uncover interesting or new facts. As with all activities, the rigour with which this activity is carried out will be reflected in the quality of the results. This book presents a basic review of the nature of research and the methods which are used to undertake a variety of investigations relevant to a wide range of subjects, such as the natural sciences, social science, social anthropology, psychology, politics, leisure studies and sport, hospitality, healthcare and nursing studies, the environment, business, education and the humanities. Research methods are a range of tools that are used for different types of enquiry, just as a variety of tools are used for doing different practical jobs, for example, a pick for breaking up the ground or a rake for clearing leaves. In all cases, it is necessary to know what the correct tools are for doing the job, and how to use them to best effect.
This book provides you with the basic information about the tools used in research and the situations in which they are applied, and indicates briefly how they are used by giving practical examples.

Research is a very general term for an activity that involves finding out, in a more or less systematic way, things you did not know. A more academic interpretation is that research involves finding out about things that no-one else knew either. It is about advancing the frontiers of knowledge. Research methods are the techniques you use to do research. They represent the tools of the trade, and provide you with ways to collect, sort and analyse information so that you can come to some conclusions. If we use the right sort of methods for our particular type of research, then we should be able to convince other people that our conclusions have some validity, and that the new knowledge we have created is soundly based.

It would be really boring to learn about all these tools without being able to try them out, like reading about how to use a plane, chisel, drill etc. and never using them to make something out of a piece of wood. Therefore courses in research methods are commonly linked to assignments that require these methods to be applied: an actual research project that is described in a dissertation or thesis, or a research report. In the workplace, it is often the other way round. When there is a perception that more information and understanding is needed to advance the work or process of work, ways are sought in which research can be carried out to meet this need.

7.2.1 NEED FOR RESEARCH

So what is the need for research? The question is a logical one, and therefore we describe here the need to perform research. Research helps us to understand the problems that concern society and hence helps to increase knowledge. Adequate knowledge of research is important in support of solving problems.
Some of the ways it can be used are:
• To categorise. This involves forming a typology of objects, events or concepts, i.e. a set of names or 'boxes' into which these can be sorted. This can be useful in explaining which 'things' belong together and how.
• To describe. Descriptive research relies on observation as a means of collecting data. It attempts to examine situations in order to establish what is the norm, i.e. what can be predicted to happen again under the same circumstances.
• To explain. This is a descriptive type of research specifically designed to deal with complex issues. It aims to move beyond 'just getting the facts' in order to make sense of the myriad other elements involved, such as human, political, social, cultural and contextual.
• To evaluate. This involves making judgements about the quality of objects or events. Quality can be measured either in an absolute sense or on a comparative basis. To be useful, the methods of evaluation must be relevant to the context and intentions of the research.
• To compare. Two or more contrasting cases can be examined to highlight differences and similarities between them, leading to a better understanding of phenomena.
• To correlate. The relationships between two phenomena are investigated to see whether and how they influence each other. The relationship might be just a loose link at one extreme or a direct link when one phenomenon causes another. These are measured as levels of association.
• To predict. This can sometimes be done in research areas where correlations are already known. Predictions of possible future behaviour or events are made on the basis that if there has been a strong relationship between two or more characteristics or events in the past, then these should exist in similar circumstances in the future, leading to predictable outcomes.
• To control. Once we understand an event or situation, we may be able to find ways to control it.
For this we need to know what the cause and effect relationships are and that we are capable of exerting control over the vital ingredients. All of technology relies on this ability to control.

Two or more of these objectives may be combined in a research project, with sometimes one objective needing to be successfully achieved before starting the next; for example, we usually need to be able to explain how something happens before we can work out how to control it.

7.2.2 NEED FOR DESIGN OF EXPERIMENT

Design of Experiments (DoE)

The choice of the design of experiments can have a large influence on the accuracy of the approximation and the cost of constructing the response surface. An important aspect of response surface methodology (RSM) is the design of experiments (Box and Draper, 1987), usually abbreviated as DoE. It was originally developed for the model fitting of physical experiments, but can also be applied to numerical experiments. The objective of DoE is the selection of the points where the response should be evaluated. Most of the criteria for optimal design of experiments are associated with the mathematical model of the process. Generally, these mathematical models are polynomials with an unknown structure, so the corresponding experiments are designed separately for each particular problem.

In a traditional DoE, screening experiments are performed in the early stages of the process, when it is likely that many of the design variables initially considered have little or no effect on the response. The purpose is to identify the design variables that have large effects for further investigation. A detailed description of the design of experiments theory can be found in Box and Draper (1987), Myers and Montgomery (1995), among many others.
Schoofs (1987) has reviewed the application of experimental design to structural optimization, Unal et al. (1996) discussed the use of several designs for response surface methodology and multidisciplinary design optimization, and Simpson et al. (1997) presented a complete review of the use of statistics in design. A particular combination of runs defines an experimental design. The possible settings of each independent variable in the N-dimensional space are called levels. Different methodologies are used, such as full factorial design, central composite design and D-optimal designs.

There are numerous types of research design that are appropriate for the different types of research projects. The choice of which design to apply depends on the nature of the problems posed by the research aims. Each type of research design has a range of research methods that are commonly used to collect and analyse the type of data that is generated by the investigations. Here is a list of some of the more common research designs, with a short explanation of the characteristics of each.

HISTORICAL

This aims at a systematic and objective evaluation and synthesis of evidence in order to establish facts and draw conclusions about past events. It uses primary historical data, such as archaeological remains as well as documentary sources of the past. It is usually necessary to carry out tests in order to check the authenticity of these sources. Apart from informing us about what happened in previous times and re-evaluating beliefs about the past, historical research can be used to find contemporary solutions based on the past and to inform present and future trends. It stresses the importance of interactions and their effects.

DESCRIPTIVE

This design relies on observation as a means of collecting data. It attempts to examine situations in order to establish what is the norm, i.e.
what can be predicted to happen again under the same circumstances. 'Observation' can take many forms. Depending on the type of information sought, people can be interviewed, questionnaires distributed, visual records made, even sounds and smells recorded. What is important is that the observations are written down or recorded in some way, so that they can be subsequently analysed. The scale of the research is influenced by two major factors: the level of complexity of the survey and the scope or extent of the survey.

CORRELATION

This design is used to examine a relationship between two concepts. There are two broad classifications of relational statements: an association between two concepts, where there is some kind of influence of one on the other; and a causal relationship, where one causes changes to occur in the other. Causal statements describe what is sometimes called a 'cause and effect' relationship. The cause is referred to as the 'independent variable'; the variable that is affected is referred to as the 'dependent variable'. The correlation between two concepts can either be none (no correlation); positive (where an increase in one results in an increase in the other, or a decrease results in a decrease); or negative (where an increase in one results in a decrease in the other, or vice versa). The degree of association is often measurable.

COMPARATIVE

This design is used to compare past and present or different parallel situations, particularly when the researcher has no control over events. It can look at situations at different scales, macro or micro (community, individual). Analogy is used to identify similarities in order to predict results, assuming that if two events are similar in certain characteristics, they could well be similar in others too. In this way comparative design is used to explore and test what conditions were necessary to cause certain events, so that it is possible, for example, to understand the likely effects of making certain decisions.

EXPERIMENTAL

Experimental research attempts ...

SIMULATION

Simulation involves devising a representation in a small and simplified form (model) of a system, which can be manipulated to gauge effects. It is similar to experimental design in respect of this manipulation, but it provides a more artificial environment in that it does not work with original materials at the same scale. Models can be mathematical (number crunching in a computer) or physical, working with two- or three-dimensional materials. The performance of the model must be checked and calibrated against the real system to check that the results are reliable. Simulation enables theoretical situations to be tested: what if?

EVALUATION

This descriptive type of research is specifically designed to deal with complex social issues. It aims to move beyond 'just getting the facts', by trying to make sense of the myriad human, political, social, cultural and contextual elements involved. There is a range of different approaches to evaluation, for example systems analysis, which is a holistic type of research looking at the complex interplay of many variables, and responsive evaluation, which entails a series of investigative steps to evaluate how responsive a programme is to all those taking part in it. A common purpose of evaluation research is to examine the working of projects from the point of view of levels of awareness, costs and benefits, cost-effectiveness, attainment of objectives and quality assurance. The results are generally used to prescribe changes to improve and develop the situation.

ACTION

Essentially, this is an 'on the spot' procedure, principally designed to deal with a specific problem found in a particular situation. There is no attempt made to separate the problem from its context in order to study it in isolation.
What are thought to be useful changes are made, and then constant monitoring and evaluation are carried out to see the effects of the changes. The conclusions from the findings are applied immediately, and further monitored to gauge their effectiveness. Action research depends mainly on observation and behavioural data. Because it is so bound up in a particular situation, it is difficult to generalize the results, i.e. to be confident that the action will be successful in another context.

BIOSTATISTICS AND RESEARCH METHODOLOGY

ETHNOLOGICAL

Ethnological research focuses on people. In this approach, the researcher is interested in how the subjects of the research interpret their own behaviour rather than imposing a theory from outside. It takes place in the undisturbed natural settings of the subjects' environment. It regards the context to be as important as the actions it studies, and attempts to represent the totality of the social, cultural and economic situation. This is not easy, as much of culture is hidden and rarely made explicit, and the cultural background and assumptions of the researcher may unduly influence the interpretations and descriptions. Moreover, confusion can be produced by the use of language and the different meanings which may be given to words by the respondents and the researcher.

FEMINIST

This is more of a perspective than a research design; it involves theory and analysis that highlight the differences between men's and women's lives. Researchers who ignore these differences can come to incorrect conclusions. However, everyone is male or female, so value neutrality is impossible, as no researcher practises research outside his or her system of values. No specific methods are seen to be particularly feminist, but the methodology used is informed by theories of gender relations.
Although feminist research is undertaken with a political commitment to identify and transform gender relations, it is not uniquely political, but exposes all methods of social research as being political.

CULTURAL

Many of the prevailing theoretical debates (e.g. postmodernism, poststructuralism etc.) are concerned with the subjects of language and cultural interpretation. Cultural research provides methodologies that allow a consistent analysis of cultural texts so that they can be compared, replicated, disproved and generalized. Examples of approaches to the interpretation of cultural texts are content analysis, semiotics and discourse analysis. The meaning of the term 'cultural texts' has been broadened from that of purely literary works to that of the many different forms of communication, both formal, such as opera, TV news programmes, cocktail parties etc., and informal, such as how people dress or converse.

7.2.3 EXPERIMENTAL DESIGN TECHNIQUES

EXPERIMENTS

Experiments are in principle comparative tests; they mean a comparison between two or more alternatives. One may want to compare the yield of a certain process to that of a new one, to prove the effect of a process change compared to the existing situation, to show the effect of new raw materials or a new catalyst on product quality, or to compare the performance of an automated process with a manually controlled one. When we speak about systematic experimental design techniques, we presume statistical interpretation of the results, so that we can say that a certain alternative outperforms the other one with e.g. 95% probability or, correspondingly, that there is a 5% risk that our decision is erroneous. Best of all, we can tell the statistical significance of the results before testing or, to put it the other way round, we can define our test procedure so that it produces results with a required significance.

We can also experiment with some process aiming to optimize its performance.
Then we have to know in advance what the available operation area is, and design our experiments so that, by using them together with some mathematical software, we can search for the optimum point. The famous Taguchi method is a straightforward approach to optimizing quality, mainly by searching for process conditions that produce the smallest quality variations. By the way, this is also the approach that control engineers most often use when speaking about stabilizing controls. Also in this case, the focus is on optimizing operational conditions using systematic experimental design. There is also a large group of experimental design methods that are useful in optimizing nonlinear systems. Some of them are briefly described below.

MATRIX DESIGNS

Conventional experiment design usually proceeds so that changes are made one variable at a time; i.e. first the first variable is changed and its effect is measured, then the same takes place for the second variable, and so on. This is an inefficient and time-consuming approach. It also cannot find the probable interactions between the variables. Result analysis is straightforward, but care must be taken in interpreting the results, and multivariable modelling is impossible. Systematic design is usually based on so-called matrix designs that change several variables simultaneously according to a program decided beforehand. Changing is done systematically, and the design includes either all possible combinations of the variables or at least the most important ones.

FULL FACTORIAL DESIGNS

These designs include all possible combinations of all factors (variables) at all levels. There can be two or more levels, but the number of levels has an influence on the number of experiments needed. For p factors at two levels, 2^p experiments are needed for a full factorial design; in general, k factors at p levels each require p^k experiments.

FRACTIONAL FACTORIAL DESIGNS

These are designs that include the most important combinations of the variables.
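To make the run counts concrete, the following minimal sketch enumerates a full factorial design and then keeps a half fraction of a two-level design. The function names are hypothetical, and the rule used to pick the fraction (keep the runs whose coded levels multiply to +1) is one standard textbook choice assumed here for illustration; it is not a method taken from this chapter.

```python
from itertools import product


def full_factorial(levels_per_factor):
    """Return every combination of the given factor levels (one tuple per run)."""
    return list(product(*levels_per_factor))


def half_fraction(two_level_runs):
    """Keep the runs whose coded levels (-1/+1) multiply to +1.

    Assumption: this is one common way to choose a half fraction;
    other fractions are possible.
    """
    kept = []
    for run in two_level_runs:
        sign = 1
        for level in run:
            sign *= level
        if sign == 1:
            kept.append(run)
    return kept


# Two factors at three levels each: 3**2 runs
three_level = full_factorial([(1, 2, 3), (1, 2, 3)])
# Three factors at two coded levels each: 2**3 runs
two_level = full_factorial([(-1, 1)] * 3)
# Half of the two-level runs
fraction = half_fraction(two_level)

print(len(three_level), len(two_level), len(fraction))
```

The number of runs in the full design grows as a power of the number of factors, which is why a fraction of the runs is often used instead.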
The significance of effects found by using these designs is expressed using statistical methods. Most designs that will be shown later are fractional factorial designs. This is necessary in order to avoid exponential explosion. Quite often, the experiment design problem is defined as finding the minimum number of experiments for the purpose.

ORTHOGONAL DESIGNS

Full factorial designs are always orthogonal. Orthogonal designs range from Hadamard matrices in the 1800s to Taguchi designs later.

ON STATISTICAL TESTING

In process analysis, we often encounter a situation where we are studying whether two populations are similar or different with respect to some variable; e.g. whether the yield of a process is different at two reaction temperatures. In this comparison, there are two possibilities: the populations are either similar or different (statistically). The comparison usually uses means or variances. We may be testing whether the energy consumption of the new process is smaller (on average) than that of the existing one, or whether the variation in some quality variable increases if we take a new raw material into use. In many cases it is advantageous to set formal hypotheses and do some tests to show which is the actual situation.

TWO LEVEL HADAMARD MATRIX DESIGNS

This section deals with the Hadamard matrix for eight runs. It was originally developed by the French mathematician Jacques Hadamard. Plackett and Burman used it in experiment design in 1945. There are different Hadamard matrices (8x8, 16x16, 32x32, 64x64 and 128x128) developed from initial vectors by permutation. The 8x8 matrix makes it possible to make 8 runs (T) for seven factors (T − 1) at two levels (+, −).

RESPONSE SURFACE METHODS

Linear methods reveal main effects and interactions, but cannot find quadratic (or cubic) effects. Therefore they have limitations in optimization; the optimum is found at some edge point, corresponding to linear programming.
They cannot model nonlinear systems; e.g. quadratic phenomena. In an industrial process even third-order models are highly unusual. Therefore, the focus will be on designs that are good for fitting quadratic models. The following example shows a situation where we are dealing with a nonlinear system and a two-level design does not provide us with a good solution. The details of this experimental design technique will be discussed in a later chapter.

BOX WILSON CENTRAL COMPOSITE DESIGNS

Central Composite Design (CCD) has three different kinds of design points: edge points as in two-level designs (±1), star points at ±α, |α| ≥ 1, that take care of quadratic effects, and centre points. Three variants exist: circumscribed (CCC), inscribed (CCI) and face centred (CCF).

CCC

The CCC design is the original central composite design, and it does testing at five levels. The edge points (factorial or fractional factorial points) are at the design limits. The star points are at some distance from the centre, depending on the number of factors in the design. The star points extend the range outside the low and high settings for all factors. The centre points complete the design. Completing an existing factorial or resolution V fractional factorial design with star and centre points leads to this design. CCC designs provide high quality predictions over the entire design space, but care must be taken when deciding on the factor ranges. In particular, it must be ensured that the star points also remain at feasible (reasonable) levels.

CCI

In CCI, the star points are set at the design limits (hard limits) and the edge points are inside the range. In a way, a CCI design is a scaled-down CCC design. It also results in five levels for each factor. CCI designs use only points within the factor ranges originally specified, so the prediction space is limited compared to the CCC.
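The CCC and CCI variants described above can be sketched in coded units as follows. This is only an illustration under stated assumptions: the function names are hypothetical, a single centre point is used, and the default star distance α = (2^k)^(1/4) (the usual rotatable choice for a full two-level factorial in k factors) is an assumption, since the text only says that the distance depends on the number of factors.

```python
from itertools import product


def ccc_points(k, alpha=None):
    """Coded design points of a circumscribed central composite design (CCC)."""
    if alpha is None:
        # Assumption: rotatable choice, alpha = (number of factorial runs) ** 0.25
        alpha = (2 ** k) ** 0.25
    # Edge (factorial) points at every combination of -1 and +1
    edge = [tuple(float(x) for x in run) for run in product((-1, 1), repeat=k)]
    # Star points at -alpha and +alpha on each factor axis
    star = []
    for axis in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[axis] = sign * alpha
            star.append(tuple(point))
    # A single centre point (assumption; replicated centre points are common)
    centre = [tuple([0.0] * k)]
    return edge + star + centre


def cci_points(k, alpha=None):
    """Inscribed variant: the CCC design scaled down so the star points sit at -1 and +1."""
    if alpha is None:
        alpha = (2 ** k) ** 0.25
    return [tuple(x / alpha for x in point) for point in ccc_points(k, alpha)]


ccc = ccc_points(2)
cci = cci_points(2)
levels_ccc = sorted({point[0] for point in ccc})          # distinct settings of the first factor
largest_cci = max(abs(x) for point in cci for x in point)  # furthest coded setting in CCI
print(len(ccc), len(levels_ccc), largest_cci)
```

For two factors the sketch gives four edge points, four star points and one centre point, with five distinct levels per factor in the CCC design, while every CCI point stays within the coded range from -1 to +1.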
CCF

In this design the star points are at the centre of each face of the factorial space, so α = ±1 and only three levels are used (Figure 5.6). Complementing an existing factorial or resolution V design with appropriate star points can also produce this design. CCF designs provide relatively high quality predictions over the entire design range, but poor precision for estimating pure quadratic coefficients. They do not require using points outside the original factor range.

BOX-BEHNKEN DESIGN

The Box-Behnken design is an independent quadratic design in that it does not contain an embedded factorial or fractional factorial design. In this design the treatment combinations are at the midpoints of the edges of the process space and at the centre. These designs are nearly rotatable and require three levels of each factor. The designs have a limited capacity for orthogonal blocking compared to central composite designs.

D-OPTIMAL DESIGNS

D-optimal designs are one form of design provided by a computer algorithm. These types of computer-aided designs are particularly useful when classical designs do not apply. Unlike standard classical designs such as factorials and fractional factorials, D-optimal design matrices are usually non-orthogonal and effect estimates are correlated. These types of designs are always an option regardless of the type of model the experimenter wishes to fit (for example, first order, first order plus some interactions, full quadratic, cubic, etc.) or the objective specified for the experiment (for example, screening, response surface, etc.). The optimality criterion results in minimizing the generalized variance of the parameter estimates for a pre-specified model; the 'optimality' of a given D-optimal design is therefore model dependent. The experimenter must specify a model for the design and the total number of runs allowed, and the computer algorithm chooses the optimal set of design runs from a candidate set.
This candidate set usually consists of all possible combinations of the various factor levels that one wishes to use in the experiment. To put it another way, the candidate set is a collection of treatment combinations from which the D-optimal algorithm chooses the treatment combinations to be included in the design. The computer algorithm generally uses a stepping and exchanging process to select the set of runs. Note that there is no guarantee that the design the computer generates is actually D-optimal.

The reasons for using D-optimal designs instead of standard classical designs generally fall into two categories: the standard factorial or fractional factorial design requires too many runs for the amount of resources or time allowed for the experiment, or the design space is constrained, i.e. the process space contains factor settings that are not feasible or are impossible to run.

PLAGIARISM

Plagiarism is a multifaceted and ethically complex problem which simply means the stealing and publication of another author's work as one's own original work without acknowledging the original author. Plagiarism is simply an act of fraud. Authors should acknowledge sources fully and appropriately in order to follow the ethics, and plagiarism is the failure to do so. However, students or authors sometimes fail to acknowledge the source appropriately and properly. These failures are largely due to a lack of knowledge of how to use citations properly. Therefore, to maintain the ethics, plagiarism should be avoided. The aim of this section is to define plagiarism, its types, methods of detection and the ways to avoid it.

"Plagiarism is presenting someone else's work as if it were your own, whether you mean to or not." Intention was the sole point on which policies were found to be contradictory.
There are nonetheless a number of reasons for concluding that plagiarism, in its central sense, involves intentional deception (Pecorari, 2001). Considerable time is spent attempting to distinguish between deliberate and unintentional plagiarism. Prototypical plagiarism is thereby a form of cheating, an act of deception, in an attempt to gain unearned credit. This raises another problem: what to do in the cases of merging sources, or patchwriting, which is not intentionally deceptive but is indicative of a developmental stage in learning the skills and techniques of academic writing? The answer comes by citing sources. When doing this, a person cannot assume that paragraphs belong to him/her and that readers will be able to tell the difference between them without being confused. Plagiarism includes not only written work, such as books or journals, but also data or images that may be presented in tables, diagrams, designs, plans, photographs, film, music, formulae, Web sites and computer programs. Penalties associated with plagiarism extend from cancelling all marks for the specific assessment item or for the entire unit through to exclusion from your course.

A graph is a pictorial representation of a set of objects where some pairs of objects are connected by links. The interconnected objects are represented by points termed vertices, and the links that connect the vertices are called edges. Formally, a graph G = (V, E) consists of a nonempty set V of vertices (or nodes) and a set E of edges. Each edge has either one or two vertices associated with it, called its endpoints. An edge is said to connect its endpoints.

TERMINOLOGY

In a simple graph each edge connects two different vertices and no two edges connect the same pair of vertices. Multigraphs may have multiple edges connecting the same two vertices.
When m different edges connect the vertices u and v, we say that {u, v} is an edge of multiplicity m. An edge that connects a vertex to itself is called a loop. A pseudograph may include loops, as well as multiple edges connecting the same pair of vertices.

Graphs and graph theory can be used to model:
• Computer networks
• Social networks
• Communications networks
• Information networks
• Software design
• Transportation networks
• Biological networks

Graph basics: there are two types of basic graph:
1. Directed
2. Undirected

DIRECTED GRAPHS

1. A directed graph G = (V, E) consists of V, a nonempty set of vertices (or nodes), and E, a set of directed edges. Each directed edge is an ordered pair of vertices. The directed edge (u, v) is said to start at u and end at v. Here u is the initial vertex of this edge and is adjacent to v, and v is the terminal vertex of this edge and is adjacent from u. The initial and terminal vertices of a loop are the same.

UNDIRECTED GRAPHS

1. Two vertices u, v in an undirected graph G are called adjacent (or neighbours) in G if there is an edge e between u and v. Such an edge e is called incident with the vertices u and v, and e is said to connect u and v.
2. The set of all neighbours of a vertex v of G = (V, E), denoted by N(v), is called the neighbourhood of v. If A is a subset of V, we denote by N(A) the set of all vertices in G that are adjacent to at least one vertex in A. So N(A) is the union of the neighbourhoods N(v) over all v in A.
3. The degree of a vertex in an undirected graph is the number of edges incident with it, except that a loop at a vertex contributes two to the degree of that vertex. The degree of the vertex v is denoted by deg(v).

7.3.4 HISTOGRAM

A Histogram is a vertical bar chart that depicts the distribution of a set of data.
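As a rough illustration of how the bars of a Histogram are obtained, the sketch below groups a small set of measurements into equal-width class intervals and counts how many values fall into each. The data values, the interval limits and the function name are all hypothetical and chosen only for the example.

```python
def class_interval_counts(values, low, width, n_intervals):
    """Count the values falling in each equal-width class interval.

    Interval i covers [low + i*width, low + (i+1)*width); the last interval
    also includes its upper end. Values outside the range are ignored.
    """
    high = low + width * n_intervals
    counts = [0] * n_intervals
    for value in values:
        if value < low or value > high:
            continue
        index = min(int((value - low) / width), n_intervals - 1)
        counts[index] += 1
    return counts


# Hypothetical measurements (for example, weights in grams)
measurements = [98, 101, 104, 100, 96, 102, 109, 103, 99, 101, 106, 100]
counts = class_interval_counts(measurements, low=95, width=5, n_intervals=3)

# Text rendering: one bar per class interval, bar length equal to the frequency
for i, count in enumerate(counts):
    start = 95 + 5 * i
    print(f"{start:>3}-{start + 5:<3} {'#' * count}")
```

Each printed row plays the role of one bar: its position corresponds to the class interval and its length to the number of values in that interval.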
Unlike Run Charts or Control Charts, which are discussed in other modules, a Histogram does not reflect process performance over time. It's helpful to think of a Histogram as being like a snapshot, while a Run Chart or Control Chart is more like a movie.

A histogram is a graphical representation of a frequency distribution. Bars are drawn over each class interval on a number line. The areas of the bars are proportional to the frequencies with which data fall into the class intervals. The shape of a unimodal distribution of a quantitative variable may be symmetric (right side close to a mirror image of left side) or skewed to the right or left. A distribution is skewed to the right if the right tail of the distribution is longer than the left, and is skewed to the left if the left tail of the distribution is longer than the right.

REPRESENTATION OF DATA USING HISTOGRAM

Different data can be shown on a single platform in the form of rectangles whose areas represent the data. Scales are defined on an interval basis and different data are compared for further analysis. The graph below represents a Histogram of four categories of data for three series.

Figure: Bar Chart (three series across Category 1 to Category 4)

PARTS OF A HISTOGRAM

A Histogram is made up of five parts:
1. Title: The title briefly describes the information that is contained in the Histogram.
2. Horizontal or X-Axis: The horizontal or X-axis shows you the scale of values into which the measurements fit. These measurements are generally grouped into intervals to help you summarize large data sets. Individual data points are not displayed.
3. Bars: The bars have two important characteristics: height and width. The height represents the number of times the values within an interval occurred. The width represents the length of the interval covered by the bar. It is the same for all bars.
4.
Vertical or Y-Axis: The vertical or Y-axis is the scale that shows you the number of times the values within an interval occurred. The number of times is also referred to as "frequency."
5. Legend: The legend provides additional information that documents where the data came from and how the measurements were gathered.

APPLICATION OF HISTOGRAM

When we are unsure what to do with a large set of measurements presented in a table, we can use a Histogram to organize and display the data in a more user-friendly format. A Histogram will make it easy to see where the majority of values fall in a measurement scale, and how much variation there is. It is helpful to construct a Histogram when we want to do the following:
> Summarize large data sets graphically. We can make a large table of data much easier to understand by summarizing it on a tally sheet and organizing it into a Histogram.
> Compare process results with specification limits. If we add the process specification limits to our Histogram, we can determine quickly whether the current process was able to produce "good" products. Specification limits may take the form of length, weight, density, quantity of materials to be delivered, or whatever is important for the product of a given process.
> The team members can easily see the values which occur most frequently. When we use a Histogram to summarize large data sets, or to compare measurements to specification limits, we are employing [...]
> Use a tool to assist in decision making. Certain shapes, sizes, and the spread of data can help in investigating problems and making decisions. But [...] aren't recent, or we don't know how [...]

[...] part is of the whole. Hence, it should be used [...] individual categories with the whole.
A pie chart is a graphical data analysis technique for summarizing the distributional information of a variable. It is a circular plot consisting of wedges, where the area of each wedge is proportional to the frequency (= number of observations) in that class. If two variables are specified, the first variable contains pre-computed frequencies and the second variable is a group identifier. This second form is more commonly used. Some examples of pie charts are given below.

Example-1: Pie chart showing world population by countries.

Figure: World Population by Countries

Example-2: Pie chart representing the percentage of different communities of animals.

Figure: Pie chart of animal communities

Example-3: Pie chart representing the percentage of sales of a shopkeeper in four different quarters.

Figure: Sales

PRESENTATION OF DATA USING PIE CHART

Pie charts are a visual way of displaying data that might otherwise be given in a small table. Pie charts are useful for displaying data that are classified into nominal or ordinal categories. Nominal data are categorised according to descriptive or qualitative information such as country of birth or type of pet owned. Ordinal data are similar but the different categories can also be ranked; for example, in a survey people may be asked to say whether they classed something as very poor, poor, fair, good or very good.

Pie charts are generally used to show percentage or proportional data, and usually the percentage represented by each category is provided next to the corresponding slice of pie. Pie charts are good for displaying data for around 6 categories or fewer. When there are more categories it is difficult for the eye to distinguish between the relative sizes of the different sectors, and so the chart becomes difficult to interpret.
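Since each wedge is proportional to the frequency in its class, the percentage and the central angle of every wedge follow directly from the counts. The sketch below illustrates that arithmetic; the function name `wedge_angles` and the quarterly sales figures are hypothetical, chosen only for illustration.

```python
def wedge_angles(frequencies):
    """Return (label, percentage, angle in degrees) for each category.

    Each wedge gets the same share of the full circle (360 degrees)
    as its category has of the total frequency.
    """
    total = sum(frequencies.values())
    if total <= 0:
        raise ValueError("total frequency must be positive")
    wedges = []
    for label, count in frequencies.items():
        share = count / total
        wedges.append((label, round(100 * share, 1), round(360 * share, 1)))
    return wedges


# Hypothetical quarterly sales counts, made up for this example.
sales = {"Q1": 20, "Q2": 30, "Q3": 10, "Q4": 40}
for label, percent, angle in wedge_angles(sales):
    print(f"{label}: {percent}% of the pie, {angle} degrees")
```

The percentages are the values that would normally be printed next to each slice, and the angles are what a plotting tool would use to draw the wedges.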
PIE CHART: DESIGN

Pie charts provide a good visual representation of the data when the categories show some variation in size. When there are several similar-sized categories, the chart can end up looking cluttered and it may be difficult to interpret the data. In such cases consider whether a table would present the information more effectively.

It is usual for the different sectors of the pie chart to be arranged clockwise in order of magnitude. If the data contain a category such as "other types" or "other answers", then even if it is not the smallest category it is usual to display it last in order that it does not detract from the named categories. It is helpful to colour or shade the different slices so that they grade from dark to light tones as we move from the first to the last slice.

7.3.3 CUBIC GRAPH

A graph is said to be cubic if every vertex has exactly three edges emanating from it.

How to sketch the cubic graph of a function
> Find the x-intercepts for the function by setting the factors equal to zero and solving those equations.
> Identify the multiplicity of each zero. Remember that the multiplicity represents the number of times that zero appears. Decide if the curve touches or crosses through each zero. If the multiplicity is even, then the curve touches the x-axis at the zero without crossing. If the multiplicity is odd, then the curve crosses through the x-axis at the intercept.
> Coefficient of x³: If the coefficient of x³ is positive, then the right-hand end of the curve goes up and the left-hand end goes down. If the coefficient of x³ is negative, then the right-hand end goes down and the left-hand end goes up.
> Once the above information is known, mark out the x-intercepts on the graph and start sketching from the LEFT side of the curve to the right. Make sure to take into account the multiplicity information.

Exercise: Q1) Sketch the graph of y = x³ + 6x² - x - 30.
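The first sketching step, finding the x-intercepts, can be checked numerically. The following is a minimal sketch that takes the exercise polynomial to be y = x^3 + 6x^2 - x - 30 and tests every integer divisor of the constant term as a possible zero. The helper names `evaluate` and `integer_roots` are hypothetical, and the search only finds integer zeros of a polynomial with integer coefficients and a nonzero constant term.

```python
def evaluate(coeffs, x):
    """Evaluate a polynomial given as [a3, a2, a1, a0] using Horner's rule."""
    result = 0
    for coefficient in coeffs:
        result = result * x + coefficient
    return result


def integer_roots(coeffs):
    """Return the integer zeros, found by trying divisors of the constant term."""
    constant = coeffs[-1]
    if constant == 0:
        raise ValueError("this sketch assumes a nonzero constant term")
    divisors = [k for k in range(1, abs(constant) + 1) if constant % k == 0]
    return sorted({x for k in divisors for x in (k, -k) if evaluate(coeffs, x) == 0})


# Assumed reading of the exercise: y = x^3 + 6x^2 - x - 30
coeffs = [1, 6, -1, -30]
print("x-intercepts:", integer_roots(coeffs))
print("y-intercept:", evaluate(coeffs, 0))
print("right-hand end goes", "up" if coeffs[0] > 0 else "down")
```

Under that reading the polynomial factors as (x + 5)(x + 3)(x - 2), so there are three distinct zeros, each of multiplicity one, and the curve crosses the x-axis at each of them.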
Figure: Graph of the equation

7.3.4 RESPONSE SURFACE PLOT

Surface plots are diagrams of three-dimensional data. Rather than showing the individual data points, surface plots show a functional relationship between a designated dependent variable (Y) and two independent variables (X and Z). The surface plot is a companion to the two-dimensional contour plot.

It is important to understand how these plots are constructed. A two-dimensional grid of X and Z is constructed. The range of this grid is equal to the range of the data. Next, a Y value is calculated for each grid point. This Y value is a weighted average of all data values that are "near" this grid point. (The number of points averaged is specified by the user.) The three-dimensional surface is constructed using these averaged values. Hence, the surface plot does not show the variation at each grid point.

These plots are useful in regression analysis for viewing the relationship among a dependent and two independent variables. Multiple regression assumes that this surface is perfectly flat (a plane), so the surface plot lets you visually determine if multiple regression is appropriate.

A surface plot contains the following elements:
* Predictors on the x- and y-axes.
* A continuous surface that represents the fitted response values on the z-axis.

The type of fitted response values that Minitab displays depends on the type of response variable in your model. Minitab displays the following types of fitted values:
> Means for response variables that contain continuous measurements, such as length or weight.
> Means for response variables that contain counts that follow the Poisson distribution, such as the number of defects per sample.
> Probabilities for response variables that contain only two possible outcomes, such as pass/fail.
> Standard deviations for models that are fit using Analyse Variability.

Because a surface plot shows only two continuous variables at a time, any extra variables are held at a constant level.
A surface plot can include only two continuous variables. If a model has more than two continuous variables, then Minitab holds each variable that is not on the plot constant. If a model has categorical variables, then Minitab also holds the categorical variables constant. Thus, these plots are valid only for fixed levels of the extra variables. If you change the holding levels, the response surface also changes, sometimes drastically.

Surface plots do not use the data in the worksheet. Instead, Minitab estimates the response surface based on a model. The accuracy of the surface plot depends on how well the model represents the true relationships between the variables. A Response Surface Plot is shown in the figure below.

TIP: To annotate the values of the predictors and the predicted responses for any point in the plot, use Plant Flag. To plant a flag, right-click the plot, choose Plant Flag in the menu that appears, and then click the point on the plot that you want to annotate. Use Predict to determine whether these points are unusual and to assess the precision of the predictions.

Figure: Surface Curve (Surface Plot of Rating vs Ratio and Conc; hold values: Temp 140, Time 3.75)

Key Results: Surface Plot

The response surface is curved because the model contains quadratic terms that are statistically significant.

7.3.5 CONTOUR PLOT GRAPH

A contour plot is a graphical technique for representing a 3-dimensional surface by plotting constant z slices, called contours, on a 2-dimensional format. That is, given a value for z, lines are drawn connecting the (x, y) coordinates where that z value occurs. The contour plot is an alternative to a 3-D surface plot.

Figure: Contour Plot

This contour plot shows that the surface is symmetric and peaks in the centre.
The contour plot is formed by:
* Vertical axis: Independent variable 2
* Horizontal axis: Independent variable 1
* Lines: iso-response values

The independent variables are usually restricted to a regular grid. The actual techniques for determining the correct iso-response values are rather complex and are almost always computer generated. An additional variable may be required to specify the Z values for drawing the iso-lines. Some software packages require explicit values. Other software packages will determine them automatically. If the data (or function) do not form a regular grid, you typically need to perform a 2-D interpolation to form a regular grid.

Figure: Regular Grid

4 WRITING AND PRESENTATION OF DATA

The written word

News releases are often the vehicle through which your statistical organization communicates key findings of its statistical and analytical programmes to the intended audience, which is most probably the general public. The text is the principal vehicle for explaining the findings, outlining trends and providing contextual information. We will provide many suggestions for preparing an "effective" news release or other document, such as a report or an analytical article.

What makes a news release, report or analytical article effective? Perhaps the best explanation comes from the first Making Data Meaningful guide, Part 1: A guide to writing stories about numbers, which provides an initial set of [...] for getting a message across. An effective news release is one that:
> Tells a story about the data;
> Has relevance for the public and answers the question "Why should my audience want to read about this?";
> Catches the reader's attention quickly with a headline or image;
> Is easily understood, interesting and often entertaining;
> Encourages others, including the media, to [...] what they are communicating.

Data analysis is the process of developing answers to questions through the examination and interpretation of data. The basic steps in the analytic process consist of identifying issues, determining the availability of suitable data, deciding on which methods are appropriate for answering the questions of interest, applying the methods, and evaluating, summarizing and communicating the results.

Analytical results underscore the usefulness of data sources by shedding light on relevant issues. Some Statistics Canada programs depend on analytical output as a major data product because, for confidentiality reasons, it is not possible to release the microdata to the public. Data analysis also plays a key role in data quality assessment by pointing to data quality problems in a given survey. Analysis can thus influence future improvements to the survey process.

Data analysis is essential for understanding results from surveys, administrative sources and pilot studies; for providing information on data gaps; for designing and redesigning surveys; for planning new statistical activities; and for formulating quality objectives. Results of data analysis are often published or summarized in official Statistics Canada releases.

7.6 PRINCIPLES

A statistical agency is concerned with the relevance and usefulness to users of the information contained in its data. Analysis is the principal tool for obtaining information from the data.

Data from a survey can be used for descriptive or analytic studies. Descriptive studies are directed at the estimation of summary measures of a target population, for example, the average profits of owner-operated businesses in 2008 or the proportion of 2007 high school graduates who went on to higher education in the next twelve months. Analytical studies may be used to explain the behaviour of and relationships among characteristics; for example, a study of risk factors for obesity in children would be analytic.

To be effective, the analyst needs to understand the relevant issues, both current and those likely to emerge in the future, and how to present the results to the audience. The study of background information allows the analyst to choose suitable data sources and appropriate statistical methods. Any conclusions presented in an analysis, including those that can impact public policy, must be supported by the data being analysed.

7.7 INITIAL PREPARATION: GUIDELINES

Prior to conducting an analytical study the following questions should be addressed:
> Objectives. What are the objectives of the analysis? What issue am I addressing? What question(s) will I answer?
> Justification. Why is this issue interesting? How will these answers contribute to existing knowledge? How is this study relevant?
> Data. What data am I using? Why is it the best source for this analysis? Are there any limitations?
> Analytical methods. What statistical techniques are appropriate? Will they satisfy the objectives?
> Audience. Who is interested in this issue and why?

7.8 SUITABLE DATA

Ensure that the data are appropriate for the analysis to be carried out.
This requires investigation of a wide range of details, such as whether the target population of the data source is sufficiently related to the target population of the analysis, whether the source variables and their concepts and definitions are relevant to the study, whether the longitudinal or cross-sectional nature of the data source is appropriate for the analysis, whether the sample size in the study domain is sufficient to obtain meaningful results, and whether the quality of the data, as outlined in the survey documentation or assessed through analysis, is sufficient.

If more than one data source is being used for the analysis, investigate whether the sources are consistent and how they may be appropriately integrated into the analysis.

7.9 APPROPRIATE METHODS AND TOOLS

> Choose an analytical approach that is appropriate for the question being investigated and the data to be analysed.
> When analysing data from a probability sample, analytical methods that ignore the survey design can be appropriate, provided that sufficient model conditions for analysis are met. (See Binder and Roberts, 2003.) However, methods that incorporate the sample design information will generally be effective even when some aspects of the model are incorrectly specified.
> Assess whether the survey design information can be incorporated into the analysis and, if so, how this should be done, such as by using design-based methods. See Binder and Roberts (2009) and Thompson (1997) for discussion of approaches to inferences on data from a probability sample.
> See Chambers and Skinner (2003), Korn and Graubard (1999), Lehtonen and Pahkinen (1995), Lohr (1999), and Skinner, Holt and Smith (1989) for a number of examples illustrating design-based analytical methods.
> For a design-based analysis consult the survey documentation about the recommended approach for variance estimation for the survey.
> If the data from more than one survey are included in the same analysis, determine whether or not the different samples were independently selected and how this would impact the appropriate approach to variance estimation.
> The data files for probability surveys frequently contain more than one weight variable, particularly if the survey is longitudinal or if it has both cross-sectional and longitudinal purposes. Consult the survey documentation and survey experts if it is not obvious which might be the best weight to use in any particular design-based analysis.
> When analysing data from a probability survey, there may be insufficient design information available to carry out analyses using a full design-based approach. Assess the alternatives.
> Consult with experts on the subject matter and on the statistical methods.
> Having determined the appropriate analytical method for the data, investigate the software choices that are available to apply the method. If analysing data from a probability sample by design-based methods, use software specifically for survey data, since standard analytical software packages that can produce weighted point estimates do not correctly calculate variances for survey-weighted estimates.
> It is advisable to use commercial software, if suitable, for implementing the chosen analyses, since these software packages have usually undergone more testing than non-commercial software.
> Determine whether it is necessary to reformat your data in order to use the selected software.
> Include a variety of diagnostics among your analytical methods if you are fitting any models to your data.

Data sources vary widely with respect to missing data. At one extreme, there are data sources which seem complete, where any missing units have been accounted for through a weight variable with a nonresponse component and all missing items on responding units have been filled in by imputed values. At the other extreme, there are data sources where no processing has been done with respect to missing data. The work required by the analyst to handle missing data can thus vary widely. It should be noted that the handling of missing data in analysis is an ongoing topic of research.

> Refer to the documentation about the data source to determine the degree and types of missing data and the processing of missing data that has been performed. This information will be a starting point for what further work may be required.
> Consider how unit and/or item nonresponse could be handled in the analysis, taking into consideration the degree and types of missing data in the data sources being used.
> Consider whether imputed values should be included in the analysis and, if so, how they should be handled. If imputed values are not used, consideration must be given to what other methods may be used to properly account for the effect of nonresponse in the analysis.
> If the analysis includes modelling, it could be appropriate to include some aspects of nonresponse in the analytical model.
> Report any caveats about how the approaches used to handle missing data could have an impact on results.

7.9.1 RESULTS INTERPRETATION

> Since most analyses are based on observational studies rather than on the results of a controlled experiment, avoid drawing conclusions concerning causality.
> When studying changes over time, beware of focusing on short-term trends without inspecting them in light of medium- and long-term trends. Frequently, short-term trends are merely minor fluctuations around a more important medium- and/or long-term trend.
> Where possible, avoid arbitrary time reference points.
Instead, use meaningful points of reference, such as the last major turning point for economic data, generation-to-generation differences for demographic statistics, and legislative changes for social statistics.

7.9.2 PRESENTATION OF RESULTS

> Focus the article on the important variables and topics. Trying to be too comprehensive will often interfere with a strong story line.
> Arrange ideas in a logical order and in order of relevance or importance. Use headings, subheadings and sidebars to strengthen the organization of the article.
> Keep the language as simple as the subject permits. Depending on the targeted audience for the article, some loss of precision may sometimes be an acceptable trade-off for more readable text.
> Use graphs in addition to text and tables to communicate the message. Use headings that capture the meaning (e.g. "Women's earnings still trail men's") in preference to traditional chart titles (e.g. "Income by age and sex"). Always help readers understand the information in the tables and charts by discussing it in the text.
> When tables are used, take care that the overall format contributes to the clarity of the data in the tables and prevents misinterpretation. This includes spacing; the wording, placement and appearance of titles; row and column headings and other labelling.
> Explain rounding practices or procedures. In the presentation of rounded data, do not use more significant digits than are consistent with the accuracy of the data.
> Satisfy any confidentiality requirements (e.g. minimum cell sizes) imposed by the surveys or administrative sources whose data are being analysed.
> Include information about the data sources used and any shortcomings in the data that may have affected the analysis. Either have a section in the paper about the data or a reference to where the reader can get the details.
> Include information about the analytical methods and tools used. Either have a section on methods or a reference to where the reader can get the details.
> Include information regarding the quality of the results. Standard errors, confidence intervals and/or coefficients of variation provide the reader important information about data quality. The choice of indicator may vary depending on where the article is published.
> Ensure that all references are accurate, consistent and are referenced in the text.
> Check for errors in the article. Check details such as the consistency of figures used in the text, tables and charts, the accuracy of external data, and simple arithmetic.
> Ensure that the intentions stated in the introduction are fulfilled by the rest of the article. Make sure that the conclusions are consistent with the evidence.
> Have the article reviewed by others for relevance, accuracy and comprehensibility, regardless of where it is to be disseminated. As a good practice, ask someone from the data providing division to review how the data were used. If the article is to be disseminated outside of Statistics Canada, it must undergo institutional and peer review as specified in the Policy on the Review of Information Products (Statistics Canada).
> If the article is to be disseminated in a Statistics Canada publication, make sure that it complies with the current Statistics Canada Publishing Standards. These standards affect graphs, tables and style, among other things.
> As a good practice, consider presenting the results to peers prior to finalizing the text. This is another kind of peer review that can help improve the article. Always do a dry run of presentations involving external audiences.
> Refer to available documents that could provide further guidance for improvement of your article, such as Guidelines on Writing Analytical Articles (Statistics Canada 2008) and the Style Guide (Statistics Canada 2004).

7.9.3 QUALITY INDICATORS

The main quality elements are relevance, interpretability, accuracy and accessibility.
An analytical product is relevant if there is an audience who is (or will be) interested in the results of the study. For the interpretability of an analytical article to be high, the style of writing must suit the intended audience. As well, sufficient details must be provided that another person, if allowed access to the data, could replicate the results. For an analytical product to be accurate, appropriate methods and tools need to be used to produce the results.

7.9.4 CHECKLIST FOR DEVELOPING GOOD DATA VISUALIZATIONS

When producing visual presentations, we should think about:
> The target group: different forms of presentation may be needed for different audiences (e.g. business or academia, specialists or the general population).
> The role of the graphic in the overall presentation: analysing the big picture or focusing attention on key points may require different types of visual presentations.
> How and where the message will be presented: a long, detailed analysis or a quick slideshow.
> Contextual issues that may distort understanding: expert or novice data user.
> Whether textual analysis or a data table would be a better solution.
> Accessibility considerations:
  * Provide text alternatives for non-text elements such as charts and images.
  * Don't rely on colour alone. If you remove the colour, is the presentation still understandable? Do colour combinations have sufficient contrast? Do the colours work for the colour blind (red/green)?
  * Ensure that time-sensitive content can be controlled by the user (e.g. pausing of animated graphics).
> Consistency across data visualizations: ensure that elements within visualizations are designed consistently and use common conventions where possible (e.g. blue to represent water on a map).
> Size, duration and complexity: Is your presentation easy to understand? Is it too much for the audience to grasp at a given session?
> Possibility of misinterpretation: test your presentation out on colleagues, friends or some people from your target group to see if they get the intended messages.

7.9.6 PROTOCOL

Oregon laws allow nurses to use Nursing Treatment Protocols. Oregon DOC Health Services has written Nursing Treatment Protocols consistent with the guidelines set by the Oregon Board of Nursing and the Oregon Board of Medical Examiners. Oregon DOC Health Services chooses to use this accepted practice to enhance inmate health care. Oregon DOC Health Services wants to ensure that the use of Nursing Treatment Protocols enhances medical care directed by a physician and does not replace it.

Implementation of the Nursing Protocols involves another application of the general concept of nursing triage practice. The protocols are designed to assist and educate nursing staff in this triage process. Oregon DOC Health Services requires additional training in physical assessment and the use of treatment protocols for the nurses who use them. It is recognized that nursing staff are responsible for reviewing the changes that have been implemented and for understanding the proper use of the Nursing Protocols. Oregon DOC Health Services requires that all nurses who use the protocols are supervised for this privilege by the Health Services Manager and the Chief Medical Officer of the institution where the nurse works.

Key concepts apply. If an individual is seen for the same problem twice without expected resolution or improvement, they are referred to a medical provider. All applications of Nursing Treatment Protocols that apply to the use of prescription medications are reviewed and signed off by the practitioner on the next working day (and within 72 hours).

There is more to the art of nursing than the use of medication. The majority of the Nursing Treatment Protocols actually result in using educational materials or self-care treatments. Sometimes over the counter or prescription medication will be suggested.
While some patients seen by the nurse will require an immediate referral to a practitioner, the inmate patient's first access to health care is the initial encounter with the nursing staff during the sick call/triage process. This encounter is the first chance to intervene and often resolves the inmate patient's health concern. Review has found that 80% of inmate patients' health concerns can be addressed during the sick call visit and resolved through the use of the nursing treatment protocols.

It is also clear that the Protocols are not intended as the cure for every ailment in every patient. The effectiveness of the health care team is enhanced by empowering nurses to apply their knowledge and skills through the use of the nursing treatment protocols. Sometimes, no nursing protocol will or should apply to the patient that the nurse is evaluating. In this case the patient usually is referred for evaluation and treatment by a provider.

The Nursing Treatment Protocols have been in place for many years. The inception and subsequent reviews and revisions of the nursing treatment protocols have been the concerted effort of many staff within the health services program. Nursing staff are encouraged to work with and offer feedback to the current work group for the Nursing Protocols. Your input into the ongoing revision process is a valuable resource to the group in helping with continuous quality improvement of the protocols.

7.9.7 COHORTS STUDIES

Cohort studies or case control studies: these either look back over patient records to see what happened to them, or follow a group of people over time [...]. There may be comparison groups of people (with or without [...]
7.9.7.1 Cohort study

A cohort is a group of subjects that represents the population of interest, having certain common characteristics and studied over a sufficient period of time, for example, survivors of the Bhopal gas tragedy, or a cohort of children born in a particular year. In these types of studies, cohorts with and without exposure are selected and then followed up to measure the occurrence/incidence of disease in them. By comparing the incidence of the disease in the two groups, one can provide some evidence on the cause-effect relationship between the exposure and the outcome, as here you are sure that the exposure factor precedes the outcome. If you select the cohorts at the present time point and then follow them in the future, it is called a prospective cohort study. Alternatively, if you have detailed past records of the exposure and health characteristics of the individuals, you can look in the past records, select cohorts from there and then follow them till the present time point to look for the occurrence of the disease. This is known as a retrospective or historical cohort study design. The term 'retrospective' in study designs is used when past records are taken into account. [Figure 1] shows a schematic presentation of a cohort study design.

Figure: A schematic presentation of a cohort study design (disease-free subjects are recruited into exposed and unexposed cohorts and, after a sufficient time lag, the disease incidence is compared between the two cohorts)

PROSPECTIVE COHORT STUDY: AN EXAMPLE

To find out whether peer victimization during adolescence is associated with the risk of anxiety disorder in adulthood, two parallel cohorts of adolescents aged 13-14 years, who had and who did not have peer victimization experience, were followed up to the age of 18 years, and the incidence of anxiety disorder was compared between the cohorts.

RETROSPECTIVE COHORT STUDY: AN EXAMPLE

Villwock et al. conducted a study to find the effect of age (≤80 vs. >80 years) on acute ischemic stroke (AIS) outcomes in patients following endovascular mechanical thrombectomy (EMT). A retrospective cohort study design was adopted by selecting the entire cohort of patients from 2008 to 2010 with a primary diagnosis of AIS who received EMT from the Nationwide Inpatient Sample database.

Some studies may start from the past records, move ahead in time to the present time point and then continue to follow these cohorts in the future too, thus combining the retrospective and prospective study designs into one, known as an ambi-directional cohort study. At times, one cohort is selected and comparisons are made between subgroups, also known as internal cohorts. For example, to study the effect of body mass index (BMI) on cardiovascular disease, a cohort of subjects free of cardiovascular disease was selected, categorized on the basis of BMI into internal cohorts and followed up for a sufficient period of time to capture and compare the incidence of cardiovascular disease.

In some instances, multiple cross-sectional studies are conducted on a certain population. This is known as a pseudo-cohort study because, unlike in a true cohort study, the same cohort has not been followed. For example, a pseudo-cohort study using national cross-sections (2001, 2004, 2007, and 2010) was conducted to examine differences in smoking prevalence under different smoking ban policies.

Cohort study designs enable researchers to study the temporal association between cause and effect, study multiple outcomes in the cohort, find out the incidence of the disease and calculate relative and attributable risks. But these types of studies are not suitable for rare diseases or those that have a longer latent period.
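The relative and attributable risks that a cohort study yields can be worked out from the counts in the exposed and unexposed cohorts. The following is only a minimal sketch: the function name `cohort_risk_measures` and the counts are hypothetical, chosen for illustration, and attributable risk is taken here to mean the simple risk difference between the two cohorts.

```python
def cohort_risk_measures(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Summarise a two-cohort follow-up as incidences, relative risk and attributable risk."""
    if exposed_total <= 0 or unexposed_total <= 0:
        raise ValueError("each cohort must contain at least one subject")

    # Incidence (risk) = new cases during follow-up / subjects at risk at the start.
    incidence_exposed = exposed_cases / exposed_total
    incidence_unexposed = unexposed_cases / unexposed_total
    if incidence_unexposed == 0:
        raise ValueError("relative risk is undefined when no unexposed subject develops the disease")

    return {
        "incidence_exposed": incidence_exposed,
        "incidence_unexposed": incidence_unexposed,
        # Relative risk: ratio of the incidence in the exposed to that in the unexposed.
        "relative_risk": incidence_exposed / incidence_unexposed,
        # Attributable risk (assumed here to be the risk difference): excess incidence among the exposed.
        "attributable_risk": incidence_exposed - incidence_unexposed,
    }


# Hypothetical counts, invented purely for illustration.
measures = cohort_risk_measures(
    exposed_cases=40, exposed_total=200,
    unexposed_cases=20, unexposed_total=200,
)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

A relative risk above 1 suggests that disease is more frequent in the exposed cohort, while the attributable risk expresses the same comparison as an absolute excess. Neither figure says anything about attrition or confounding, which still have to be addressed in the design.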
Other major limitations of cohort studies are that they are relatively expensive, time consuming, and have a higher attrition rate.

7.9.8 OBSERVATIONAL STUDIES

In fields such as epidemiology, social sciences, psychology and statistics, an observational study draws inferences from a sample to a population where the independent variable is not under the control of the researcher because of ethical concerns or logistical constraints. The design of a study is relatively more important than the data analysis, because an error made in designing the study can never be corrected unless the researcher devises a fresh study, whereas wrongly analysed data can be re-analysed to get meaningful results. A better understanding of study designs is essential for the proper conduct and interpretation of medical research. This section focuses on observational study designs which are commonly used in medical research.

Medical researchers are usually interested in finding out the effect of a particular risk/therapeutic factor on the disease/health outcome. Comparisons between two groups are a logical method for providing such evidence. The groups to be compared can be based on a) exposure/risk factor, b) disease/outcome or c) intervention.

In observational studies, the groups to be compared are not based on intervention/manipulation by the investigator; the subjects' existing membership of an exposure or outcome group hence forms the mainstay of this type of research design.

Observational Study: An example

To study the effect of industrial pollution on the health of people, the health status of people living in an industrial area can be compared with that of those living in a non-industrial area.

Due to the lack of intervention by the investigator, these types of studies reflect the comparative effectiveness in real world scenarios, and therefore have a higher external validity. Observational studies are not without limitations, though. Selection bias, information bias and confounding are the major issues to be taken care of in observational studies. Details of other biases that can affect observational studies are presented elsewhere. Now the following questions may arise:

I. Why observational studies?
II. Why use them when randomized controlled trials are considered the gold standard in medical research?

Even though randomized controlled trials (RCTs) have certain unique advantages like randomization, allocation concealment etc., there are instances when RCTs or other interventional studies are inappropriate or not feasible. Experimentation may prove to be inadequate in cases where the outcome of the intervention is determined by activities of the care provider, such as physiotherapy, surgery etc. When studying rare diseases, it may not be possible to have a sufficient number of patients for conducting a clinical trial. It may also be unethical to perform an interventional study in certain situations; for example, to study the effect of a harmful substance such as tobacco on health status, you cannot expose someone to the harmful substance for the purpose of your study. In such instances, the most appropriate study designs are observational study designs.

A recent Cochrane review has reported that there is little difference between the results obtained from observational studies and RCTs, and hence factors other than study design per se need to be considered when exploring reasons for a lack of agreement between the results of RCTs and observational studies. Thus a proper appraisal of the study design is critical to interpret the findings. The 'type of study design' alone cannot be the endpoint in debates concerning the strength of the evidence generated.

"Observational studies should not be considered to be of lesser relevance in medical research as compared with RCTs".
7.9.9 TYPES OF OBSERVATIONAL STUDIES

Observational studies are usually categorized into various categories such as case report or case series, ecologic, cross-sectional (prevalence study), case-control and cohort studies. Other variants of these observational studies are also possible, such as the nested case-control study, the case-cohort study etc.

7.9.9.1 Case reports and case series

A description of a single case, typically describing the manifestations, clinical course, and prognosis of that case, is a case report. An example is a case report of a patient with severe lactic acidosis and multiorgan failure due to thiamine deficiency during total parenteral nutrition (TPN). When multiple case reports are reviewed for a particular disease condition, diagnosis or treatment procedure, etc., it forms a case series.

Example of a case series study: the clinical presentation and management of six cases presenting to the emergency department following synthetic cannabinoid intoxication has been reported. Due to the lack of a control or comparison group, these types of studies do not establish cause-effect relationships. Nonetheless, they can be used as a source of hypotheses by researchers to design studies that provide stronger empirical evidence.

"A case report or a case series is a description of a single case or a set of cases, respectively, where usually the diagnostic modalities, treatment procedures, prognosis, etc., of the concerned condition are discussed without taking any control group into account".

7.9.9.2 Prevalence Study

If the data regarding both the exposure and the outcome are collected simultaneously from a selected group of individuals belonging to a specified population at a given point of time, the study is a prevalence study, also known as a cross-sectional study. Frequently referred to as a survey, this type of study design is a preferred design to find out the prevalence and associated risk factors of a particular outcome/disease.
A cross-sectional study may extend from weeks to months depending upon the magnitude of the survey, but the variables from a subject are measured at only a single given point of time and the measurements are not repeated at different time intervals. The data collected in a cross-sectional study can be grouped into diseased/non-diseased and exposed/non-exposed groups, and certain associations can be studied. As temporality is not taken into account, one cannot comment on causality, though associations can be pointed out. For example, if a cross-sectional study finds that milk drinking is associated with peptic ulcer, is that because milk causes the disease, or because ulcer sufferers drink milk to relieve their symptoms?

"If the data regarding both exposure and outcome are collected simultaneously from a select group of individuals in a specified population at a given point of time, then this type of study is termed a cross-sectional study".

Ecological (Aggregate) Study

An observational study where two variables, that is, a risk/therapeutic factor and an outcome, are studied and one of the variables is measured at the population level is an ecological study. Group comparisons are made in these types of studies, as opposed to individual level comparisons. A recent study reported that the prevalence of obesity in a given population is inversely proportional to the prevalence of H. pylori in the same population. It would be fallacious to report from this study that obese people have lower chances of being infected with H. pylori, or that H. pylori protects against obesity, as individual level data on obese and non-obese people and their individual H. pylori status were not collected. This faulty interpretation is common with ecological studies and is called the ecological fallacy.
But these types of studies provide some direction for more in-depth research, even though their results by themselves provide very weak empirical evidence.

7.9.9.3 Case-control study

Case-control studies begin when the outcome/disease has already occurred. Here, two groups, namely cases (those having the outcome of interest, i.e., a particular disease or complication) and controls (those not having the outcome of interest), are selected, and information about their exposure(s)/risk factor(s) under consideration is collected from existing records, patient examinations, personal interviews, etc., for comparison. The figure shows a schematic presentation of a case-control study design.

Figure: A schematic diagram of a case-control study design (diseased cases and non-diseased controls, each classified by exposure status as present or absent)

Even though the population-based case-control study is more appropriate because subjects are captured from a representative population, hospital-based case-control studies, where both the cases and the controls are captured from the same hospital, are more common for the sake of convenience. Even the landmark study that found the association between smoking and lung carcinoma was a hospital-based case-control study.

7.9.9.4 Selection of cases

Only confirmed cases and those that are homogeneous with respect to disease severity should be included. In general, incident (new) cases are preferred to prevalent cases (both new and old) because newly diagnosed cases participate more actively and have less chance of recall bias, and the diagnostic criteria among the cases are more consistent than among subjects captured from different periods (prevalent cases). Due to the non-availability of adequate incident cases at one point of time, researchers recruit the cases over a period of time, and this should not be misunderstood as a follow-up study.
But when case-control studies include prevalent cases, this may introduce survival bias; for instance, those cases that were exposed to the factor under study may have poorer survival than those not exposed, so prevalent cases do not represent the exact population of cases, and this bias increases as time progresses.

"The criteria to select cases and controls should be the same to ensure comparability during case-control study".

7.9.9.5 Selection of controls (who do not have the concerned disease)

To ensure comparability, the controls, along with the cases, should represent the same population and meet the same inclusion criteria. It is at times assumed that if controls are selected from the same hospital from where cases have been selected, they are being selected from the same population, but this may not be true, as the catchment populations of the cases and controls may be entirely different. In order to reduce this selection bias, controls are recruited from the cases' friends or relatives, as they usually represent the same population. Inclusion of blood relatives is generally not preferred due to the apprehension of overmatching in terms of exposure to various risk factors.

Another situation where selection bias is possible is when the controls have a disease which, though different from the case definition, may share the same risk factors with the disease under study. For example, in a case-control study exploring the effect of ... on liver cancer, the strength of association would be underestimated if controls are chosen from the accident and emergency ward, because the frequency of ... in accident and emergency wards. One way to reduce this bias can be to select the controls from different diseases instead of a particular disease.
For rare diseases, a sufficient number of cases may not be available, and in such situations there is an option to include multiple controls (up to three or four) per case, which increases the power of the study.

Case-control studies are relatively less time consuming, relatively less expensive and best suited for rare diseases or those having a longer latent period. These studies usually serve as a quick assessment of associations between suspected exposures and outcomes, which can later be studied by more appropriate, costly, and robust study designs like cohort studies. Still, these types of studies are prone to recall and interviewer biases, which affect their internal validity. It is erroneous to establish temporal sequence in a case-control study with prevalent cases included, but when incident cases are included, it may be established.

7.9.9.6 Nested case-control study: Variation of a case-control design

In the nested case-control study, cases of a disease that occur in a defined cohort are identified and, for each, a specified number of matched controls is selected from among those in the cohort who have not developed the disease by the time of disease occurrence in the case. The type of nested case-control study depends on how the controls are selected. At the end of the cohort study, controls are randomly selected from the disease-free subjects. This design is known as the exclusive design or cumulative incidence sampling; it is usually applied in case-control studies, and the odds ratio is calculated for such a design. ... referred to as nested case-control.

"The main feature of these studies is that both cases and controls are selected from the same source population, which reduces the chance of selection bias. In addition, these studies are cost-effective compared with full cohort studies, albeit the decrease in sample size may reduce the statistical power".
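The odds ratio mentioned above can be computed from a 2x2 table of exposure status in cases and controls. The sketch below is illustrative only: the function name `odds_ratio` and the counts are hypothetical, the table is treated as unmatched, and the 95% confidence interval uses the common large-sample approximation on the log scale, which is an assumption added here rather than a method stated in the text. Matched designs generally call for a matched analysis instead.

```python
import math


def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds ratio with an approximate 95% confidence interval from an unmatched 2x2 table."""
    a, b = cases_exposed, cases_unexposed
    c, d = controls_exposed, controls_unexposed
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this simple estimate")

    # Odds of exposure among cases (a/b) divided by odds of exposure among controls (c/d).
    estimate = (a * d) / (b * c)

    # Large-sample standard error of log(OR); 1.96 is the normal quantile for 95% coverage.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - 1.96 * se_log_or)
    upper = math.exp(math.log(estimate) + 1.96 * se_log_or)
    return estimate, lower, upper


# Hypothetical counts, invented purely for illustration.
estimate, lower, upper = odds_ratio(
    cases_exposed=60, cases_unexposed=40,
    controls_exposed=30, controls_unexposed=70,
)
print(f"odds ratio = {estimate:.2f}, approximate 95% CI = ({lower:.2f}, {upper:.2f})")
```

An interval that excludes 1 suggests an association between exposure and disease in this table, but it does not correct for recall bias, interviewer bias or confounding.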
Nested Case-Control study: An example

To study whether treatment with valproate (exposure) was associated with the risk of stroke, a nested case-control study was implemented with cases diagnosed with ischaemic or haemorrhagic stroke and controls matched for sex, year of birth, ... The cohort from which these cases and controls were selected came from the electronic health records data, which were extracted from the Clinical Practice Research Database for participants ever diagnosed with epilepsy and prescribed antiepileptic drugs.

7.9.9.7 Measures of association

Relative risk (RR) is a measure of association frequently ... incidence rates, and hence risks, can be computed by these types of study designs. In ...

... replication, to provide an estimate of experimental error; randomization, to ensure that this estimate is statistically valid; and local control, to reduce experimental error by making the experiment more efficient.

Experimental Method: An experiment is an investigation in which a hypothesis is scientifically tested. In an experiment, an independent variable (the cause) is manipulated and the dependent variable (the effect) is measured; any extraneous variables are controlled. An advantage is that experiments should be objective ...

Completely randomized design: in terms of data analysis and convenience, this is probably the simplest experimental design. In this design, the experimenter randomly assigns subjects to one of the treatment conditions: they receive a placebo or they receive a cold vaccine.

In engineering, science, and statistics, replication is the repetition of an experimental condition so that the variability associated with the phenomenon can be estimated. ASTM, in standard E1847, defines replication as "the repetition of the set of all the treatment combinations to be compared in an experiment".

DESIGNING CLINICAL TRIAL

The most common clinical trial design is the parallel-group design, in which participants are randomized to one of two or more arms (Pocock, 1984). These arms include the new intervention under investigation and one or more control arms, such as a placebo control or an active control.

Figure: Clinical Research Design (observational studies, descriptive or analytical such as case-control and cohort, versus experimental studies such as the randomized controlled trial)

Experimental design originated in agricultural research and influenced laboratory and industrial research before being applied to trials of pharmaceuticals in humans. Experimental design is characterized by control of the experimental process to reduce experimental error, replication of the experiment to quantify variability in the response, and randomization. For example, in comparing the yield of two varieties of corn, the experimenter uses the same type of ..., the same fertilizer and weed control methods in each test plot. Multiple plots of ground are planted with the two varieties of corn. The assignment of a seed variety to a test plot is randomized.

Clinical trial design has its roots in classical experimental design, yet has some different features. The clinical investigator is not able to control as many sources of variability through design as a laboratory or industrial experimenter. Human responses to medical treatments display greater variability than observations from experiments in genetically identical plants and animals or from measuring the effects of tightly-controlled physical and chemical processes. And of course, ethical issues are paramount in clinical research. To study a clinical response with adequate precision, a trial may require lengthy periods for patient accrual and follow-up. It is unlikely that all the study subjects can be enrolled on the same day. There is opportunity for study volunteers to decide to no longer participate.

The implications of these issues will be considered as we extend experimental design to clinical trials. Good trial design and conduct are far more important than selecting the correct statistical analysis. When a trial is well designed and properly conducted, statistical analyses can be performed, modified, and if necessary, corrected. On the other hand, inaccuracy (bias) and imprecision (large variability) in estimating treatment effects, the two major shortcomings of poorly designed and conducted trials, cannot be ameliorated after the trial. Skillful statistical analysis cannot overcome basic design flaws.

Piantadosi (2005) lists the following advantages of proper design:
1. Allows investigators to satisfy ethical constraints
2. Permits efficient use of scarce resources
3. Isolates the treatment effect of interest from confounders
4. Controls precision
5. Reduces selection bias and observer bias
6. Minimizes and quantifies random error or uncertainty
7. Simplifies and validates the analysis
8. Increases the external validity of the trial

The objective of most clinical trials is to estimate the magnitude of treatment effects or to estimate differences in treatment effects. Precise statements about observed treatment effects are dependent on a study design that allows the treatment effect to be sorted out from person-to-person variability in response. An accurate estimate requires a study design that minimizes bias.

Piantadosi (2005) states that clinical trial design should accomplish the following:
1. Quantify and reduce errors due to chance
2. Reduce or eliminate bias
3. Yield clinically relevant estimates of effects and precision
4. Be simple in design and analysis
5. Provide a high degree of credibility, reproducibility, and external validity
6. Influence future clinical practice
(Reference: onlinecourses.science.psu.edu/stat509/node/19)

VARIOUS PHASES

The development of investigational new drugs (INDs) involves performing clinical trials (or studies) to assess the safety and efficacy of the IND in humans. These trials are usually divided into 4 phases of development (Phase 1 to 4), with each potentially lasting for several years. Successful completion of each phase and approval by the appropriate regulatory authority or authorities (the European Medicines Agency [EMA] in the European Union, the Food and Drug Administration [FDA] in the United States, Health Canada in Canada, or the Ministry of Health, Labour and Welfare in Japan) is required for progression to the next phase. The various phases are discussed below.

Phase 0

The official name of a Phase 0 study is an exploratory IND study, and the goal is to quickly establish whether an agent will work as desired in humans, based on in vivo safety pharmacology and toxicology preclinical studies. In Phase 0 studies, a single subtherapeutic dose of the IND is administered to a small number of healthy subjects (c.10 to 15), over a short duration (7 days). Since the dose administered is too low to result in a therapeutic effect (ensuring the absence of toxic effects), preliminary pharmacokinetic (PK) and, where possible, pharmacodynamic (PD) data are collected for evaluation.

Phase 1

Phase 1 studies are designed to assess the safety of an IND, to understand its PK and PD properties, and ideally to identify a potential therapeutic dose. These studies are usually conducted in a small number of healthy volunteers/subjects (c.15 to 30).

Phase 2

These studies are typically conducted to test the IND in a larger group of patients who have the disease or illness for which the IND is being developed, to determine whether it is efficacious, at least in the short term. Phase 2 studies are larger than those conducted earlier in the drug development, typically comprising up to 300 patients.
Phase 3

Phase 3 studies are designed and performed to assess the efficacy and effectiveness of an IND in a larger cohort of patients, all of whom have the disease that the treatment is intended to treat. Such studies are typically conducted in several hundred patients, and are usually conducted at multiple sites in multiple countries. Phase 3 studies often compare the new treatment with the current 'gold standard' treatment for the condition for which the new treatment is being developed.

Phase 4

Post marketing surveillance involves monitoring for safety (pharmacovigilance) once a treatment has been approved by the appropriate regulatory authority or authorities. Such surveillance is intended to identify any rare adverse effects that have not been observed previously or have only been observed infrequently, and to monitor the effects of long term administration in a wider population.

Phase | Objectives | Dose | Approximate number of subjects
Phase 0* | PK, particularly oral bioavailability and half-life of the drug | Subtherapeutic | 10 healthy subjects
Phase 1 | Testing of drug on healthy volunteers to confirm safety and likely therapeutic dose | Often subtherapeutic, but with ascending doses | ... healthy subjects
Phase 2 | Testing of drug on patients to assess efficacy and safety | Therapeutic dose | Up to 300 patients
Phase 3 | Testing of drug on patients to assess efficacy, effectiveness and safety | Therapeutic dose | Over 300 patients
Phase 4 | Post marketing surveillance: monitoring the use of the drug after approval | Therapeutic dose | Anyone seeking treatment from their doctor

“Satisfactory completion and approval of Phases 1 to 3 is required for a drug to be approved for marketing. Phase 4 studies are conducted after a compound has been approved, for the primary purpose of post marketing surveillance.
In an attempt both to speed up the drug development process and to quickly identify safety issues, Phase 0 studies, also referred to as 'human microdosing studies', were introduced”.
(Ref: https://www.quanticate.com/blog/bid/63143/the-different-phases-of-clinical-trials)

The different clinical trial phases are described in further detail below and summarised in the figure at the end of this section.

HUMAN CLINICAL TRIAL PHASES

Phase I studies assess the safety of a drug or device. This initial phase of testing, which can take several months to complete, usually includes a small number of healthy volunteers (20 to 100), who are generally paid for participating in the study. The study is designed to determine the effects of the drug or device on humans, including how it is absorbed, metabolized, and excreted. This phase also investigates the side effects that occur as dosage levels are increased. About 70% of experimental drugs pass this phase of testing.

Phase II studies test the efficacy of a drug or device. This second phase of testing can last from several months to two years, and involves up to several hundred patients. Most phase II studies are randomized trials where one group of patients receives the experimental drug, while a second "control" group receives a standard treatment or placebo. Often these studies are "blinded", which means that neither the patients nor the researchers know who has received the experimental drug. This allows investigators to provide the pharmaceutical company and the FDA with comparative information about the relative safety and effectiveness of the new drug. About one-third of experimental drugs successfully complete both Phase I and Phase II studies.

Phase III studies involve randomized ... thousand patients. This large-scale ... marketing the drug.

Phase IV studies, often called Post Marketing Surveillance Trials, are conducted after a drug or device has been approved for consumer sale. Pharmaceutical companies have several objectives at this stage:
> To compare a drug with other drugs already in the market.
> To monitor a drug's long-term effectiveness and impact on a patient's quality of life.
> To determine the cost-effectiveness of a drug therapy relative to other traditional and new therapies.

Phase IV studies can result in a drug or device being taken off the market, or restrictions of use could be placed on the product, depending on the findings in the study.
(Ref: https://www.centerwatch.com/clinical-trials/overview.aspx)

Figure: Comparison of Clinical Trial Phases (objectives, population and sample size for Phases I to IV)

Q.3 Explain the ...
Hint: See article 3.22

Q.4 What is the difference between Wilcoxon and Mann Whitney?
Ans. The difference comes from the assumptions. In the MWW test you are interested in the difference between two independent populations (the null hypothesis being that they are the same, the alternative that there is a difference), while in the Wilcoxon signed-rank test you are interested in testing the same hypothesis but with paired/matched samples.

Q.5 What is the difference between Kruskal Wallis test and Mann Whitney test?
Ans. The major difference between the Mann-Whitney U and the Kruskal-Wallis H tests is simply that the latter can accommodate more than two groups. Both tests require independent (between-subjects) designs and use summed rank scores to determine the results.

Q.6 What is a non-parametric test?
Ans. These include, among others, distribution free methods, which do not rely on assumptions that the data are drawn from a given probability distribution. As such it is the opposite of parametric statistics. It includes non-parametric descriptive statistics, statistical models, inference and statistical tests.

Q.7 Discuss Disadvantages and Advantages of Using Observational Research.
Hint: See article 348

Q.8 What is the difference between Kruskal Wallis and Mann Whitney?
Ans. The major difference between the Mann-Whitney U and the Kruskal-Wallis H is simply that the latter can accommodate more than two groups. Both tests require independent (between-subjects) designs and use summed rank scores to determine the results.

Q.9 What does the Kruskal Wallis test tell you?
Ans. The Kruskal-Wallis H test (sometimes also called the "one-way ANOVA on ranks") is a rank-based nonparametric test that can be used to determine if there are statistically significant differences between two or more groups of an independent variable on a continuous or ordinal dependent variable.

Q.10 Explain the term Plagiarism.
Hint: See article 324

Q.11 Explain the following terms:
(a) Response Surface Plot
(c) Checklist for developing good data visualization
(d) Cohorts Studies
