
BAHIR DAR UNIVERSITY

BAHIR DAR INSTITUTE OF TECHNOLOGY


FACULTY OF COMPUTING
DEPARTMENT OF INFORMATION TECHNOLOGY
Program: MSc (Regular)
Title: Plagiarism checker

By: Bizuayehu Tadege
ID: BDU1300729
Presented to: Million M. (PhD)
Presentation date: May 2021
bizuayehutadege4@gmail.com 1
Outline
• Overview and concepts
• Approaches
• Architecture
• Advantages and disadvantages
• Application
• Concluding remarks

Overview and concepts
What is plagiarism?
• According to the Oxford English Dictionary, plagiarism is “the act of
taking someone's work or ideas and passing them off as one’s own”.
• It means using an author’s ideas, text, or work without proper
citation.
• It includes paraphrasing a sentence or paragraph without citing the
original document.

Overview and concepts (cont.)
• There are different types of plagiarism:
• Paraphrasing plagiarism: rephrasing someone else’s idea without citation.
• Self-plagiarism: reusing passages or ideas from one's own previously
submitted work.
• Mosaic plagiarism: combining ideas from multiple sources without
proper citation.
What is a checker?
• Checking is the practice of comparing a document against other documents
to determine whether it is plagiarized.

Overview and concepts (cont.)
• To detect these types of plagiarism, software tools have been developed
that detect them automatically, quickly, and cost-efficiently.
• Plagiarism checker: a software tool that takes an input document from
the user and detects whether it is duplicated or plagiarized from
documents written by others.
• It is used to measure the similarity between a given document and
documents written by others, such as researchers.

Approaches
Plagiarism checking can follow different approaches:
• Rule-based approach: an algorithm applies predefined rules to decide
whether a given document is plagiarized from other source documents.
• Learning-based approach: a model learns from a collection of documents
and then measures the similarity between a given document and the
documents it has learned.
• This approach uses machine learning techniques.
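As a minimal sketch of the rule-based approach, the rule below flags a document when the share of its word trigrams that also appear in a source reaches a fixed threshold. The function names, the trigram size, and the 0.5 threshold are illustrative assumptions, not values taken from any particular tool.

```python
import re

def word_ngrams(text, n=3):
    """Return the set of word n-grams in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap_ratio(suspect, source, n=3):
    """Fraction of the suspect's n-grams that also occur in the source."""
    suspect_grams = word_ngrams(suspect, n)
    if not suspect_grams:
        return 0.0
    return len(suspect_grams & word_ngrams(source, n)) / len(suspect_grams)

def is_plagiarized(suspect, source, threshold=0.5, n=3):
    """Hypothetical rule: flag when the overlap ratio reaches the threshold."""
    return overlap_ratio(suspect, source, n) >= threshold

source = "the quick brown fox jumps over the lazy dog"
copied = "the quick brown fox jumps over a sleeping cat"
original = "a completely different sentence about software testing"

print(is_plagiarized(copied, source))
print(is_plagiarized(original, source))
```

Here the copied sentence shares 4 of its 7 trigrams with the source, so the rule flags it, while the unrelated sentence shares none. A fixed threshold is simple to explain but has to be chosen by hand.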

Approaches (cont.)
• Machine learning offers three methods of learning:
• Supervised learning: a model is trained on labeled data and then used
to measure the similarity between an input document and what it has
learned.
• Unsupervised learning: the model is given unlabeled data, finds
patterns in the documents on its own, and then measures their similarity
to the user's input document.
• Hybrid learning: a combination of the two techniques, in which an
unsupervised step preprocesses the documents and a supervised step
learns from the preprocessed data.
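To make the supervised case concrete, here is a deliberately tiny sketch: instead of fixing a similarity cut-off by hand, it is learned from document pairs labeled as plagiarized or not. The Jaccard word-set similarity, the function names, and the toy training pairs are all assumptions made for illustration; a real system would use richer features and a proper learning library.

```python
def jaccard(text_a, text_b):
    """Jaccard similarity between the word sets of two texts."""
    set_a, set_b = set(text_a.lower().split()), set(text_b.lower().split())
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)

def fit_threshold(labeled_pairs):
    """Pick the similarity cut-off that classifies the most training pairs correctly."""
    scored = [(jaccard(a, b), label) for a, b, label in labeled_pairs]
    best_threshold, best_correct = 1.0, -1
    for t in sorted({score for score, _ in scored}):
        # A pair is predicted "plagiarized" when its score reaches the candidate cut-off.
        correct = sum((score >= t) == label for score, label in scored)
        if correct > best_correct:
            best_threshold, best_correct = t, correct
    return best_threshold

# Toy labeled data: (text, text, is_plagiarized)
training = [
    ("data mining finds patterns", "data mining finds hidden patterns", True),
    ("neural networks learn weights", "neural networks learn the weights", True),
    ("data mining finds patterns", "rivers flow to the sea", False),
    ("neural networks learn weights", "networks of roads need repair", False),
]

threshold = fit_threshold(training)
new_score = jaccard("data mining finds patterns", "data mining finds useful patterns")
print(threshold, new_score >= threshold)
```

With unlabeled data only, the same similarity scores could instead be grouped by a clustering step, which is the unsupervised variant described above.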
Methods of plagiarism checking
• Vector-based method: represents each document by its tokens and
computes a similarity measure between the document vectors.
• Syntax-based method: compares the part-of-speech tags of words and
phrases across sentences.
• Semantic-based method: analyzes the meaning of sentences, so that
statements expressed in different ways can still be matched, using a
vector space model (VSM) between documents.
• Structure-based method: analyzes the lexical, syntactic, and semantic
features of documents to find the similarity between two documents.
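The vector-based method can be sketched with term-frequency vectors and cosine similarity. This is one common choice of similarity measure, assumed here for illustration; the helper names are hypothetical.

```python
import math
import re
from collections import Counter

def term_vector(text):
    """Bag-of-words term-frequency vector, stored sparsely as a Counter."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

doc1 = term_vector("plagiarism is copying work")
doc2 = term_vector("copying work is plagiarism")
doc3 = term_vector("bananas grow on trees")

print(round(cosine_similarity(doc1, doc2), 3))
print(round(cosine_similarity(doc1, doc3), 3))
```

The first pair scores 1.0 because both sentences contain exactly the same words, and the second pair scores 0.0 because they share none. The first result also shows a limit of this method: word order is ignored, which is one reason the syntax-based and semantic-based methods exist.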
Architecture

Figure 1: General architecture of a plagiarism checker


Architecture (cont.)
• The architecture describes the procedure for checking whether a given document is
plagiarized from the existing documents.
• Plagiarism checking goes through a number of steps to decide whether a document is
original:
• Input document: the document to be analyzed.
• Text preprocessing: tokenization, stop-word removal, stemming, and
normalization are performed on the input document.
• Language translation: converting the text from one language to another.
• Encoding: the process of encoding the text to train a model.
• Matching: comparing the input document with existing research papers using a
matching algorithm.
• Result: the outcome of the analysis is displayed.
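The steps above can be sketched as a small pipeline. Everything here is a simplified assumption: the stop-word list is tiny, `stem` is a naive suffix stripper standing in for a real stemmer, `translate` is an identity placeholder, and encoding and matching are reduced to a set of stems compared with Jaccard similarity rather than a trained model.

```python
import re

# Tiny illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in", "on"}

def stem(token):
    """Naive suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Normalize, tokenize, drop stop words, and stem each remaining token."""
    tokens = re.findall(r"[a-z]+", text.lower())         # normalization + tokenization
    tokens = [t for t in tokens if t not in STOP_WORDS]   # stop-word removal
    return [stem(t) for t in tokens]                      # stemming

def translate(text):
    """Placeholder for the language-translation step (identity here)."""
    return text

def encode(tokens):
    """Encode a document as the set of its stems."""
    return set(tokens)

def match(encoded_a, encoded_b):
    """Jaccard similarity between two encoded documents."""
    if not encoded_a and not encoded_b:
        return 0.0
    return len(encoded_a & encoded_b) / len(encoded_a | encoded_b)

def check(input_doc, existing_doc):
    """Run input -> translation -> preprocessing -> encoding -> matching."""
    encoded_input = encode(preprocess(translate(input_doc)))
    encoded_existing = encode(preprocess(translate(existing_doc)))
    return match(encoded_input, encoded_existing)

score = check("The students copied the reports", "A student copied a report")
print(f"Similarity: {score:.0%}")   # result display
```

After preprocessing, both sentences reduce to the same three stems, so they match fully even though their surface wording differs. That is the purpose of the preprocessing step.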

Advantages of plagiarism checker
• A plagiarism checker is used to:
• Keep work free from plagiarism
• Check the quality of work
• It is easy to use.
• It addresses the problem of document duplication.
• It supports better writing.
• It saves time and cost by detecting plagiarism quickly.

Disadvantages of plagiarism checker
• It does not show the details of the plagiarized content, only a
similarity score.
• It produces false positives for common phrases.
• It can keep researchers from conducting research freely.
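The false-positive problem with common phrases can be illustrated with a trigram overlap score. The sketch below is an assumption-laden example, not a description of how any existing checker works: `COMMON_PHRASES` is a hypothetical ignore list, and the two sentences are invented.

```python
import re

# Hypothetical list of stock academic phrases, stored as word trigrams.
COMMON_PHRASES = {("in", "this", "paper")}

def trigrams(text):
    """Set of word trigrams in text (lowercased)."""
    words = re.findall(r"[a-z']+", text.lower())
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}

def overlap(suspect, source, ignore=frozenset()):
    """Share of the suspect's trigrams found in the source, skipping ignored ones."""
    grams = trigrams(suspect) - ignore
    if not grams:
        return 0.0
    return len(grams & trigrams(source)) / len(grams)

a = "In this paper caching helps"
b = "In this paper indexing fails"

print(round(overlap(a, b), 3))
print(round(overlap(a, b, COMMON_PHRASES), 3))
```

Without the ignore list, the two unrelated sentences overlap by one third purely because of the shared stock phrase. With it, the overlap drops to zero. Maintaining such a list is itself a trade-off, since an overly broad list can hide real copying.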

Application
• Educational institutions: to verify that students' assignments and
teachers' work are free from plagiarism.
• Research centers and publishers: to verify that a document (journal
paper, article, or book) is free from plagiarism before it is published.
• Multimedia production: to detect copied video and audio for multimedia
companies such as YouTube.

Concluding remarks
• A plagiarism checker is a software application that identifies whether a
research paper, book, article, or journal is plagiarized by applying a
matching algorithm between documents.
• This mechanism is very important for researchers as well as
educational institutions.
• It is used to improve the originality of a document.
• It is a valuable mechanism in the research world: by applying this
technique we can address the problem of document duplication or plagiarism.
• It efficiently checks a document against the documents it was trained on.

Concluding remarks (cont.)
• Despite these strengths, a plagiarism checker has the limitation that
it takes time to train on (learn) the documents.
• Research writing contains many common phrases, but a plagiarism
checker flags those phrases as plagiarized from others.
• Every plagiarism checker has limits on the input it can check, such as
the number of words, sentence blocks, and paragraphs accepted for
detection.

Concluding remarks (cont.)
• As a recommendation: there is still no plagiarism checker for the
Amharic language, so research can be conducted on plagiarism checkers
for Ethiopian languages.
• There is also no multilingual plagiarism checker that handles any
language automatically, so research can be conducted on a multilingual
plagiarism checker.

