Analysis of Data and Testing of Hypothesis

Name:

Course:

Code:

Institution:

Date:
Table of Contents

Section 1 Introduction to the report

1a) Define the terms used in the report

1b) Brief discussion of the data used in the report

3: Do the students understand the “cheating cannot be tolerated” policy, Yes or no

Section 2 Data sets and analysis (body of the

2a) Summarize and comment dataset

Interpretation of the slope

2b) Analyse a large dataset that has the 2 variables

2c) Analyse the dataset that has the 2 variables

Section 3 Conclusion

Appendices

Appendix 1: Data Set 1

Appendix 2: Data Set 2

Section 1 Introduction to the report

1a) Define the terms used in the report

Define an agent

An agent is a representative who acts on behalf of another person, commonly referred to as the principal. In business, an agent is selected based on their ability to perform specific tasks, whether on a continuous or intermittent basis. A business uses an agent to find customers. Some agents do more harm than good because they win customers by making promises the business cannot keep.

What is Turnitin?

Turnitin is software that reports a similarity index for content that has been developed for publication. By matching key words in the new text against existing literature in online repositories, it helps determine whether the new content was copied directly from one or more sources. The purpose of checking the similarity index, or plagiarism, is to ensure that newly developed content does not duplicate pre-existing literature and that every new piece of work is original.

What is cheating?

Cheating is the use of unfair or illegal approaches to achieve specific objectives. In the academic context, presenting material that is either copied from existing sources or acquired through the efforts of a different individual is categorised as cheating.

A high Turnitin score indicates that the similarity index is high, but it is not necessarily an indication of cheating. A high similarity index can be accounted for by the presence of references. However, it is important for students to focus on reducing the Turnitin score in order to avoid the perception that the work is copied from existing sources.

Cheating can also be perpetrated when the Turnitin score is low, especially if the student relies on the efforts of another individual to complete the assignment. Under such circumstances, the similarity index is not a reliable measure of academic integrity.

Cheating entails academic dishonesty and limits progress in education. First, instructors have to invest time in inspecting the work to determine whether there is any similarity with existing sources. Secondly, graduating students are unable to replicate their skills in research, since they relied on the abilities of an agent. As a result, top-performing students lack the necessary abilities in spite of appealing grades.

1b) Brief discussion of the data used in the report

The variables that will be used are:

1: Turnitin match: this is a percentage, so it is between 0 and 100

The first variable, Turnitin match, is the product of a comparison between the words in the report and content in online sources. Since Turnitin focuses on key words, the higher the number of key words drawn from specific sources, the closer the score gets to 100%.

2: Amount of cheating: this is a percentage, so it is between 0 and 100

The amount of cheating depends on the proportion of the students who rely on assistance, or whose Turnitin scores are high. As a result, if a student does not cheat at all, the percentage is 0. If they cheat, the percentage moves closer to 100% depending on the level of cheating.

3: Do the students understand the “cheating cannot be tolerated” policy, Yes or no

The “cheating cannot be tolerated” policy is intended to discourage cheating, including high levels of plagiarism and the use of agents in the completion of assignments. When students avoid plagiarism and complete their own assignments, it is a clear indication that they understand the policy. Secondly, when students do not deny cheating once it happens, it becomes easy to establish the cause of the cheating, and it indicates that they understand the policy. Failure to dispute the grades is an indication that they are aware of the consequences of cheating, whether it is intentional or accidental.

If the students do not understand the policy, they do not comprehend the reason for ‘zero’ scores and may still request an opportunity to resubmit the assignment. The most common defences include “I cheated in the assignment, however the assignment is worth a lot of marks and failing the assignment will make it hard to pass, I promise I will never do it again”; “I was sick so I cheated”; and “I only cheated on half the assignment”.

Students with valid reasons for high plagiarism and turnitin scores can be allowed to

retake the tests. However, lack of time and complexity of the test do not count as valid

reasons.

Section 2 Data sets and analysis (body of the

In this section, two data sets are analysed. The analysis focuses on descriptive statistics (total, minimum, maximum, mean and standard deviation), coupled with scatter plots and tabulation of the results. Hypothesis testing is also performed, covering the difference between two means and the independence of two categorical variables, with conclusions drawn about the data.

2a) Summarize and comment dataset

The variables under analysis are the “% of cheating” and the “% Turnitin match”. The data is included in Appendix 1, with the descriptive statistics, including the mean and standard deviation of each variable, given in the following table.

Table: Mean and Standard Deviation for Each of the Variables

                      % of cheating    % Turnitin match
Total                 1873             1100
Minimum               0                0
Maximum               95               91
Mean                  18.73            11
Standard Deviation    30.49            24.45

Figure: Amount of Cheating and Turnitin Match

The variable on the X-axis is the % of cheating, while the variable on the Y-axis is the % Turnitin match. A suitable title for the graph is ‘A Scatter Plot for the Amount of Cheating and Turnitin Match’. From the scatter plot, a high Turnitin match can occur together with a low amount of cheating: student 35 in Appendix 1, for example, has a 78% match and 0% cheating. It is also true (yes) that a high amount of cheating can occur together with a low Turnitin match: student 19 has 90% cheating and a 0% match.

The regression line in the scatter plot represents the line of best fit through the points. It is the straight line that minimises the sum of squared errors in prediction. It represents the linear pattern used to predict the scores on one variable (% Turnitin match, y) from an explanatory variable (% of cheating, x). The line has the smallest overall squared vertical distance from the scattered points. In this data, the regression line is y = 11.159174358936 - 0.0084983640649175x

The slope of the line is -0.0084983640649175. The negative sign indicates that the Y values are inversely related to the X values, so the line slopes downward from left to right. For every one-unit increase in the variable on the X-axis (% of cheating), the predicted value of the variable on the Y-axis (% Turnitin match) changes by -0.0084983640649175 units. Because the slope is so close to zero, the line is almost flat.
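The slope and intercept quoted above can be checked against the data in Appendix 1. The following is a minimal sketch, not necessarily the tool used for the report: it assumes the first data column in Appendix 1 is the % of cheating (x) and the second is the % Turnitin match (y), and the names `NONZERO`, `pairs` and `sample_sd` are illustrative only.

```python
import math

# Non-zero rows transcribed from Appendix 1 as
# student number -> (% of cheating, % Turnitin match).
# Every student not listed here has the row (0, 0).
NONZERO = {
    1: (5, 88), 2: (23, 0), 3: (41, 88), 5: (79, 0), 9: (0, 39),
    11: (41, 0), 12: (9, 40), 13: (67, 0), 15: (0, 15), 19: (90, 0),
    20: (48, 88), 21: (30, 0), 23: (23, 0), 25: (15, 0), 29: (75, 25),
    31: (0, 41), 34: (47, 0), 35: (0, 78), 36: (0, 67), 40: (7, 0),
    41: (12, 0), 42: (71, 0), 44: (15, 0), 45: (89, 0), 47: (39, 0),
    48: (0, 41), 51: (0, 51), 54: (70, 0), 55: (85, 0), 57: (91, 31),
    58: (52, 0), 59: (63, 0), 62: (52, 0), 67: (0, 27), 68: (22, 0),
    69: (0, 8), 71: (86, 28), 72: (73, 0), 77: (0, 91), 79: (31, 0),
    82: (92, 0), 83: (95, 43), 85: (0, 66), 86: (0, 75), 87: (68, 0),
    90: (64, 0), 93: (0, 70), 97: (19, 0), 99: (84, 0),
}
N_STUDENTS = 100

pairs = [NONZERO.get(i, (0, 0)) for i in range(1, N_STUDENTS + 1)]
x = [cheat for cheat, _ in pairs]   # % of cheating (X-axis)
y = [match for _, match in pairs]   # % Turnitin match (Y-axis)


def mean(values):
    return sum(values) / len(values)


def sample_sd(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


mean_x, mean_y = mean(x), mean(y)

# Ordinary least squares: slope = Sxy / Sxx, intercept = mean_y - slope * mean_x.
s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
s_xx = sum((a - mean_x) ** 2 for a in x)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

print(f"mean x = {mean_x:.2f}, SD x = {sample_sd(x):.2f}")
print(f"mean y = {mean_y:.2f}, SD y = {sample_sd(y):.2f}")
print(f"y = {intercept:.6f} + ({slope:.6f})x")
```

The sketch uses the sample standard deviation (dividing by n - 1), which is the version that matches the values in the summary table.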

2b) Analyse a large dataset that has the 2 variables.

The data set has two variables: the agent that each student is under (Agent A or Agent B) and the amount of cheating recorded.

Table: Summary Statistics for the Two Agents (% of Cheating)

                      Agent A    Agent B
Total                 920        1743
Minimum               0          0
Maximum               100        100
Mean                  15.59      29.54
Standard Deviation    28.49      38.47

Figure: Back to Back Histogram-% of Cheating for Students under each Agent.

The X-axis represents the % of cheating recorded for the students under each of the two agents. The Y-axis measures the frequency, that is, the number of students at each level of cheating.


Test the claim that there is a difference between the means

The test assesses whether the difference between the two means is zero.

The hypotheses for the test are

H0: μ1 = μ2 (the mean percentage of cheating is the same for the two agents)

H1: μ1 ≠ μ2 (the mean percentage of cheating differs between the two agents)

The test statistic = 2.2384, and the p value = 0.0271 (p < 0.05).

Based on this result, we reject the null hypothesis and conclude that the mean percentage of cheating differs between the two agents.
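The test statistic above can be approximately reproduced from the summary statistics alone. The sketch below is illustrative rather than the report's own calculation. It assumes 59 students per agent (the number of data rows in Appendix 2, which the text does not state explicitly) and uses the standard two-sample formula t = (mean B - mean A) / sqrt(sA²/nA + sB²/nB). The p value is a large-sample normal approximation, so it comes out slightly smaller than the t-distribution value quoted in the text.

```python
import math
from statistics import NormalDist

# Summary statistics from the table in Section 2b.
mean_a, sd_a = 15.59, 28.49
mean_b, sd_b = 29.54, 38.47

# Assumption: 59 students per agent, taken from the row count in Appendix 2.
n_a = n_b = 59

# Standard error of the difference between the two sample means.
standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)

# Two-sample t statistic for H0: the two means are equal.
t_stat = (mean_b - mean_a) / standard_error

# Two-sided p value from a normal approximation (not the exact t distribution).
p_approx = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print(f"t = {t_stat:.4f}, approximate two-sided p = {p_approx:.4f}")
```

With equal group sizes, the pooled-variance and unequal-variance versions of the statistic coincide, so the choice between them does not affect the value of t here.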

2c) Analyse the dataset that has the 2 variables

The data comprises two variables: the agent name (Agent A or Agent B) and the classification of the students based on whether or not they understand that cheating cannot be tolerated (yes or no).

Table: Data sets of the Two Agents

                       Agent A    Agent B
Student Understands    516        570
Does not Understand    170        143

Figure: A Graphical Summary-Stacked Chart (Number of Students under the Two Agents Based on Whether they Understand or Not; X-axis: Agent A, Agent B; stacked series: Student Understands, Does not Understand)

The X-axis shows the two agents, whereas the Y-axis measures the number of students under each agent. The proportion of Agent A’s students that understand is 516/686 = 0.75. The proportion of Agent B’s students that understand is 570/713 = 0.80.

Test the claim of independence

The test for independence seeks to determine whether the row and column variables are independent of each other. Based on this definition, the hypotheses are stated as follows.

The hypotheses for the test are

H0: the two variables (agent and understanding of the policy) are independent of one another

H1: the two variables are not independent of one another

The chi-square test statistic = 4.4947, and the p value = 0.033999 (p < 0.05).

Based on this result, we reject the null hypothesis and conclude that the two variables are not independent of one another.
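The reported statistic is consistent with a Pearson chi-square test on the 2x2 table of counts, without a continuity correction. The following is a minimal sketch under that assumption; the variable names are illustrative. For one degree of freedom, the upper-tail probability of the chi-square distribution equals erfc(sqrt(chi-square / 2)), so no statistics library is needed.

```python
import math

# Observed counts from the table in Section 2c.
# Rows: Student Understands, Does not Understand. Columns: Agent A, Agent B.
observed = [
    [516, 570],
    [170, 143],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells,
# where expected = row total * column total / grand total under independence.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (count - expected) ** 2 / expected

# Degrees of freedom = (rows - 1) * (columns - 1) = 1 for a 2x2 table.
# For 1 degree of freedom the upper-tail probability has this closed form.
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"chi-square = {chi_square:.4f}, p = {p_value:.6f}")
```

Applying a continuity correction would give a somewhat smaller statistic, so the match with the reported value suggests that no correction was applied.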

Section 3 Conclusion

Based on these test results, both agents have the same range, since both have a minimum of 0 and a maximum of 100. However, Agent A has a lower mean and standard deviation in the percentage of cheating than Agent B. The two agents also differ in the proportion of students who understand the ‘Zero Tolerance Policy’. It is assumed that a higher understanding of the ‘Zero Tolerance Policy’ on cheating results in a reduction in the percentage of cheating. However, 75% of the students under Agent A understand the policy, whereas 80% of the students under Agent B understand it, yet the mean percentage of cheating under Agent A is 15.59 (SD=28.49) while the mean percentage of cheating under Agent B is 29.54 (SD=38.47). Based on these results, the assumption does not hold. The hypothesis test in Section 2c (test statistic = 4.4947, p < 0.05) indicates that agent and understanding of the policy are not independent of one another, and the test in Section 2b indicates that the mean percentage of cheating differs between the agents. As a result, the propensity to cheat does not appear to be reduced by knowledge of the effects of cheating and of getting caught cheating. Based on the sample, it is reasonable to suggest that there are exogenous motivations and rationales for cheating under Agent B, as opposed to Agent A. It is, however, imperative to determine the rationale behind the circumstances, since a high Turnitin match could have a cause unrelated to academic dishonesty and illegal actions.


Section 4: Appendices

Appendix 1: Data Set 1

Student    % of cheating    % Turnitin match

student 1 5 88

student 2 23 0

student 3 41 88

student 4 0 0

student 5 79 0

student 6 0 0

student 7 0 0

student 8 0 0

student 9 0 39

student 10 0 0
student 11 41 0

student 12 9 40

student 13 67 0

student 14 0 0

student 15 0 15

student 16 0 0

student 17 0 0

student 18 0 0

student 19 90 0

student 20 48 88

student 21 30 0

student 22 0 0

student 23 23 0

student 24 0 0

student 25 15 0

student 26 0 0

student 27 0 0

student 28 0 0

student 29 75 25

student 30 0 0

student 31 0 41

student 32 0 0

student 33 0 0

student 34 47 0

student 35 0 78

student 36 0 67

student 37 0 0
student 38 0 0

student 39 0 0

student 40 7 0

student 41 12 0

student 42 71 0

student 43 0 0

student 44 15 0

student 45 89 0

student 46 0 0

student 47 39 0

student 48 0 41

student 49 0 0

student 50 0 0

student 51 0 51

student 52 0 0

student 53 0 0

student 54 70 0

student 55 85 0

student 56 0 0

student 57 91 31

student 58 52 0

student 59 63 0

student 60 0 0

student 61 0 0

student 62 52 0

student 63 0 0

student 64 0 0
student 65 0 0

student 66 0 0

student 67 0 27

student 68 22 0

student 69 0 8

student 70 0 0

student 71 86 28

student 72 73 0

student 73 0 0

student 74 0 0

student 75 0 0

student 76 0 0

student 77 0 91

student 78 0 0

student 79 31 0

student 80 0 0

student 81 0 0

student 82 92 0

student 83 95 43

student 84 0 0

student 85 0 66

student 86 0 75

student 87 68 0

student 88 0 0

student 89 0 0

student 90 64 0

student 91 0 0
student 92 0 0

student 93 0 70

student 94 0 0

student 95 0 0

student 96 0 0

student 97 19 0

student 98 0 0

student 99 84 0

student 100 0 0
Total 1873 1100
Minimum 0 0
Maximum 95 91
Mean 18.73 11
Standard Deviation 30.49 24.45

Appendix 2: Data Set 2

Percentage of cheating

Agent A    Agent B

99 38

28 85

13 16

78 31

100 100

45 94

43 98
70 91

76 93

67 99

50 96

59 71

64 64

43 64

43 68

42 77

0 70

0 65

0 77

0 67

0 64

0 71

0 69

0 75

0 0

0 0

0 0

0 0

0 0

0 0
0 0

0 0

0 0

0 0

0 0

0 0

0 0

0 0

0 0

0 0

0 0

0 0

0 0

0 0

0 0

0 0

0 0

0 0

0 0

0 0

0 0

0 0

0 0
0 0

0 0

0 0

0 0

0 0

0 0
Total 920 1743
Minimum 0 0
Maximum 100 100
Mean 15.59 29.54
Standard Deviation 28.49 38.47

Note: a reference list is not required for this assignment. You can use other sources to help you analyse the data, but you do not need to cite these references.
