Final BB
Submitted by
Under Guidance of
Prof. A.N.Taur
This is to certify that the project report entitled “Water marking on Database”, submitted
by Amit Bhagwat Giram, Omkar Bhagwan Sangle, Ashish Bhagwat Giram, and Pooja
Dilip Wahatule from Marathwada Institute Of Technology, having Enrolment Nos.
2100660223, 2100660235, 2100660247, and 2100660214, is the bona fide work
completed under the supervision and guidance of Prof. A.N.Taur in fulfillment of the
requirements for the award of Diploma in Artificial Intelligence & Machine Learning of
Maharashtra State Board of Technical Education, Mumbai.
Prof. R. D. Deshpande
Head of Department
Dept. of Artificial Intelligence & Machine Learning
I take this opportunity to express my heartfelt gratitude towards the Department of Artificial
Sambhajinagar that gave me an opportunity for presentation and submission of my Capstone Project
Planning Report.
Department for his constant encouragement and patience throughout this presentation and submission of
Learning Department and Prof. S. A. Shendre, Coordinator for their constant encouragement, co-
Technology (Polytechnic), Chhatrapati Sambhajinagar and my professors and colleagues who helped me
List of Figures
List of Tables
Abstract
1. INTRODUCTION
1.1 WHAT IS WATERMARKING ON DATABASE?
1.2 WHAT IS A WATERMARK?
1.3 HOW A WATERMARKING DATABASE IS FORMED?
1.6 WHAT IS THE PURPOSE OF WATERMARK?
1.7 COPYRIGHT OF WATERMARKING ON DATABASE
3 REQUIREMENT SPECIFICATION
4 PROPOSED APPROACH
4.1 Architecture
4.2 Software Design
4.3 Procedure along with algorithms followed
4.4 Individual Contribution
5 TESTING
6 RESULT AND DISCUSSION
7 CONCLUSION
7.1 Future Scope
8 Scheduling Project
9 REFERENCE
List of Figures
Sr. No. Name of Figure
List of Tables
Sr. No. Name of Table
1 Software Requirement
2 Testing
With the rapid growth of the internet and networking techniques, transferring and
sharing multimedia data has become common for many people. Multimedia data is
easily copied and modified, so the need for copyright protection is increasing.
Digital watermarking, the imperceptible marking of multimedia data to “brand”
ownership, has been proposed as a technique for copyright protection of
multimedia data.
Digital watermarking invisibly embeds copyright information into multimedia
data. It has been used for copyright protection, fingerprinting, copy protection,
and broadcast monitoring. Common types of signals to watermark are images,
music clips, and digital video. This work concentrates on the application of
digital watermarking to still images. The major technical challenge is to design
a highly robust digital watermarking technique that discourages copyright
infringement by making watermark removal tedious and costly.
Multimedia data
Copyright protection
Digital watermarking
Imperceptible marking
Ownership branding
Finger protection
Fingerprinting
Copy protection
Broadcast monitoring
Robust watermarking
CHAPTER 1
INTRODUCTION
1.3 HOW A WATERMARKING DATABASE IS FORMED?
Creating a watermarking database involves embedding watermarks into the data stored in the
database for various purposes, such as copyright protection, data tracking, or security. Here's
an overview of how a watermarking database is formed:
1. Select Data for Watermarking:
Determine which data in the database needs watermarking. This may include specific
records, columns, or files, depending on your objectives.
2. Choose a Watermarking Technique:
Select the appropriate watermarking technique based on your goals. Common
techniques include:
Visible Watermarking: Overlaying visible marks like text, logos, or symbols on
the data to indicate ownership or copyright.
Invisible Watermarking: Embedding information within the data in a way that
is not immediately apparent, often for tracking and identification.
3. Embed the Watermark:
Apply the chosen watermarking technique to the selected data. The method for
embedding the watermark will depend on the specific technique used.
For visible watermarks, you would overlay the mark on the data.
1.5 HOW WATERMARKING ON DATABASE WORKS?
Figure 1.2 (Working of Database)
Here's an overview of the key components and processes in this architecture:
Database System: The central component is the database system, where data is stored
and managed.
Database Storage: This represents the underlying storage infrastructure within the
database.
Watermark Embedding Component: This component is responsible for embedding
watermarks into the data before it is stored in the database. It uses a watermarking
algorithm to add watermarks to the data.
Data Access & Control: This component is responsible for controlling and managing
user access to the database.
Watermark Extraction Component: When data is retrieved from the database, this
component is responsible for extracting and verifying the watermarks. It uses a
corresponding watermark verification algorithm to check data integrity.
Data Encryption Component: In some cases, data may be encrypted before being
stored in the database for additional security. This component manages data encryption
and decryption.
User Access Control: This component ensures that only authorized users have access
to the data, and it can enforce access policies.
The process typically involves the embedding component adding watermarks to data before it's
stored in the database. When data is retrieved, the extraction component checks for watermarks,
verifying data integrity and authenticity.
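The embedding and extraction flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: it assumes integer-valued attributes, a secret key held only by the owner, and a keyed-hash scheme that hides one bit in the least significant bit of selected rows. The names SECRET_KEY, GAMMA, embed_watermark, and detect_watermark are hypothetical.

```python
import hashlib
import hmac

SECRET_KEY = b"owner-secret"  # hypothetical key; keep it private in practice
GAMMA = 2                     # assumed: roughly 1 in GAMMA rows carries a mark


def _keyed_hash(primary_key: int) -> int:
    """Return a keyed hash of the primary key as an integer."""
    digest = hmac.new(SECRET_KEY, str(primary_key).encode(), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big")


def embed_watermark(rows):
    """Return a copy of (primary_key, value) rows with key-dependent LSBs."""
    marked = []
    for pk, value in rows:
        h = _keyed_hash(pk)
        if h % GAMMA == 0:              # row selected for marking
            bit = (h >> 8) & 1          # key-dependent watermark bit
            value = (value & ~1) | bit  # overwrite the least significant bit
        marked.append((pk, value))
    return marked


def detect_watermark(rows):
    """Return the fraction of selected rows whose LSB matches the expected bit."""
    total = matches = 0
    for pk, value in rows:
        h = _keyed_hash(pk)
        if h % GAMMA == 0:
            total += 1
            matches += (value & 1) == ((h >> 8) & 1)
    return matches / total if total else 0.0
```

A match ratio close to 1.0 suggests the watermark is present, while data that was never marked would be expected to score around 0.5. Each marked value changes by at most 1, which is what keeps the mark hard to notice.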
1.6 WHAT IS THE PURPOSE OF WATERMARK?
Watermarks serve various purposes, depending on the context in which they are used. Here are
some common purposes of watermarks:
Copyright Protection: Watermarks are often used to protect the intellectual property
rights of content creators, such as photographers, artists, and filmmakers. By adding a
visible watermark to their work, they can deter unauthorized use or distribution of their
content. If someone uses the content without permission, the watermark makes it clear
who the rightful owner is.
Branding: Businesses and organizations often use watermarks to brand their
documents, images, and videos. This can include adding a company logo, name, or
slogan to materials to reinforce their brand identity and increase brand recognition.
Document Authentication: Watermarks can be used on official documents, such as
certificates, diplomas, or legal papers, to verify their authenticity. Watermarks may
include security features that are difficult to replicate, helping to prevent counterfeiting.
Image Protection: Photographers and visual artists often add watermarks to their
images to prevent unauthorized use and distribution. Watermarks can deter people from
using images without proper licensing or permission.
Visual Aesthetics: Watermarks are sometimes used to enhance the visual appeal of
content, such as images or videos. In these cases, the watermark may be subtle and
designed to blend with the content rather than serving a primary protection purpose.
Ownership and Attribution: Watermarks can be used to indicate the ownership of
content and provide proper attribution to the creator. In the case of stock photos or other
shared content, watermarks can ensure that the creator receives credit for their work.
Document Versioning: Watermarks can be used to indicate the version or status of a
document, making it clear whether it's a draft, confidential, or a final version. This is
common in corporate and legal documents.
Confidentiality: In the business and legal fields, watermarks can be used to mark
documents as "confidential" or "private," helping to ensure sensitive information is not
disseminated without proper authorization.
Transparency and Privacy: On websites and social media platforms, watermarks can
be used to promote transparency and protect the privacy of individuals. For instance,
watermarks might be applied to photos to discourage their unauthorized use or to
protect the privacy of individuals in the images.
Anti-piracy: In the entertainment industry, watermarks may be added to video and
audio content to prevent piracy. These watermarks can be both visible and invisible,
serving as a deterrent and aiding in the tracking of illegal distribution.
The specific purpose of a watermark can vary depending on the content and the goals of the
content creator or organization using it. Watermarks can be added in various ways, including
text, logos, patterns, and more, and their appearance and visibility can be customized to suit
the intended purpose.
Individual Content:
If the database contains individual content items (e.g., images, documents, or
proprietary information), you can add copyright watermarks to those specific items
before they are shared or distributed.
Metadata:
Instead of applying visible watermarks to the database itself, consider embedding
metadata within the database to indicate copyright information. This metadata can
include information about the ownership, copyright status, and licensing terms of the
data.
Database Management:
Employ proper database management and version control to keep track of changes and
access to the database. Maintain records of who accesses the data and for what purpose.
Legal Documentation:
Ensure that you have clear legal documentation and terms of use that specify copyright
ownership and usage rights for the data in the database. Users should agree to these
terms before accessing or using the data.
Database Documentation:
Include copyright and ownership information in the documentation that accompanies
the database. Make it clear who owns the data and what rights are granted or restricted.
Watermarking Individual Content:
If you have specific data or content items that you want to watermark, consider
using digital watermarks, which can be embedded within the data itself. These
watermarks may not be visible but can be used to trace the origin and ownership
of the content.
Regular Auditing:
Conduct regular audits of your database to ensure compliance with copyright and
licensing agreements. This can help identify any unauthorized or improper use of the
data.
Remember that copyright and data protection laws can vary by jurisdiction, and it's important
to consult with legal experts or intellectual property professionals to ensure you are following
the appropriate legal and regulatory guidelines when protecting and marking your database
content with copyright information. Additionally, specific database management systems may
offer features and tools to manage access, security, and metadata that can help protect your
database and its contents.
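The metadata and record-keeping measures above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names dataset_metadata and access_log, the helper functions, and the sample values are assumptions, not part of the project.

```python
import sqlite3
from datetime import datetime, timezone


def init_db(conn):
    """Create the (hypothetical) metadata and access-log tables."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dataset_metadata "
        "(key TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS access_log ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "username TEXT NOT NULL, purpose TEXT NOT NULL, accessed_at TEXT NOT NULL)"
    )


def set_metadata(conn, **fields):
    """Store ownership, copyright status, or licensing terms as key/value pairs."""
    conn.executemany(
        "INSERT OR REPLACE INTO dataset_metadata (key, value) VALUES (?, ?)",
        list(fields.items()),
    )


def get_metadata(conn):
    """Return all stored metadata as a dictionary."""
    return dict(conn.execute("SELECT key, value FROM dataset_metadata").fetchall())


def log_access(conn, username, purpose):
    """Record who accessed the data and for what purpose, with a UTC timestamp."""
    conn.execute(
        "INSERT INTO access_log (username, purpose, accessed_at) VALUES (?, ?, ?)",
        (username, purpose, datetime.now(timezone.utc).isoformat()),
    )
```

Keeping the copyright information in a table of its own leaves the data untouched, and the access log supplies the raw material for the regular audits mentioned above.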
CHAPTER 2
LITERATURE REVIEW
2. One of the primary focuses of research in this domain has been on developing robust
watermarking algorithms that can withstand common database operations such as
querying, updating, and joining. These algorithms must ensure that the embedded
watermark remains intact even after data manipulation while maintaining the integrity
and usability of the database.
3. Several studies have proposed techniques for watermark embedding and extraction
that leverage features unique to database systems. For instance, some approaches
exploit the redundancy in database records or the statistical properties of data
distributions to embed watermarks imperceptibly. Others utilize cryptographic
techniques to secure the watermarking process and prevent unauthorized removal or
alteration of the embedded information.
4. Furthermore, research efforts have also been directed towards enhancing the
efficiency and scalability of watermarking techniques for large-scale databases. This
includes the development of parallelizable watermarking algorithms, optimized data
structures, and distributed watermark management systems to minimize computational
overhead and accommodate the processing requirements of massive datasets.
6. However, despite significant progress in the field, challenges remain, particularly
concerning the trade-off between watermark robustness, imperceptibility, and
computational complexity. Additionally, the adoption of watermarking in real-world
database systems necessitates addressing practical considerations such as compatibility
with existing database management systems, scalability to large datasets, and
compliance with regulatory requirements regarding data privacy and security.
CHAPTER 3
SYSTEM REQUIREMENT
1 Python: Python is a high-level programming language known for its simplicity,
versatility, and readability, favored for web development, data analysis, artificial
intelligence, and automation tasks.
CHAPTER 4
PROPOSED APPROACH
4.1 Architecture
4.2 Software Design
making unauthorized access more difficult.
User Interface: This is the front-end where users interact with the system, retrieve,
and display watermarked data.
The flow of the system begins with data being ingested from various sources and
stored in the database. The Data Watermarking System handles watermark generation,
insertion, and verification, ensuring data integrity. The User Interface allows users to
interact with watermarked data.
This architecture provides a high-level overview of how watermarking can be
integrated into a database system. The specific implementation may vary depending on
your use case, database technology, and security requirements.
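The flow of ingestion, watermark generation, insertion, and verification can be sketched end to end as follows. This is a simplified illustration, not the project's code: a keyed integrity tag stored next to each row stands in for the watermark, and the table name records, the key KEY, and the function names are all assumptions.

```python
import csv
import hashlib
import hmac
import io
import sqlite3

KEY = b"demo-key"   # hypothetical secret held by the data owner
SEP = "\x1f"        # unit separator used to join the fields of a row


def generate_watermark(fields):
    """Derive a keyed tag from the row contents."""
    return hmac.new(KEY, SEP.join(fields).encode(), hashlib.sha256).hexdigest()


def ingest(conn, csv_text):
    """Read CSV text and store each row together with its tag."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records "
        "(id INTEGER PRIMARY KEY, data TEXT NOT NULL, watermark TEXT NOT NULL)"
    )
    for fields in csv.reader(io.StringIO(csv_text)):
        conn.execute(
            "INSERT INTO records (data, watermark) VALUES (?, ?)",
            (SEP.join(fields), generate_watermark(fields)),
        )


def verify(conn):
    """Return the ids of rows whose stored tag no longer matches their data."""
    tampered = []
    query = "SELECT id, data, watermark FROM records ORDER BY id"
    for row_id, data, mark in conn.execute(query):
        expected = generate_watermark(data.split(SEP))
        if not hmac.compare_digest(expected, mark):
            tampered.append(row_id)
    return tampered
```

Because the tag depends on a secret key, someone who edits a row without the key cannot produce a matching tag, so verify reports that row.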
4.3 Procedure along with algorithms followed
1. What is an AI algorithm?
Genetic Algorithms: Genetic algorithms are inspired by the process of natural
selection. They are used for optimization and problem-solving by iteratively
evolving a population of potential solutions.
Expert Systems: Expert systems use a knowledge base and an inference engine
to mimic human expertise in a specific domain. They are used for tasks like
medical diagnosis and decision support.
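As a concrete illustration of iteratively evolving a population, the sketch below applies a small genetic algorithm to a toy problem: maximizing the number of 1s in a bit string. The problem, the parameter values, and the function names are assumptions chosen for brevity and are unrelated to the project's code.

```python
import random


def fitness(bits):
    """Toy objective: the number of 1s in the bit string."""
    return sum(bits)


def evolve(pop_size=20, length=16, generations=30, mutation_rate=0.05, seed=0):
    """Return (best initial fitness, best individual after evolution)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    initial_best = max(fitness(ind) for ind in population)
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]     # selection: keep the fitter half
        children = [population[0][:]]             # elitism: carry over the best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)        # single-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
    return initial_best, max(population, key=fitness)
```

Copying the best individual into every new generation means the best fitness never gets worse from one generation to the next.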
2. How does an AI algorithm work?
Over time, coding instructions have become far more detailed and intricate than
anyone could have imagined.
That is where artificial intelligence algorithms come into the picture.
Essentially, an AI algorithm is a set of instructions, often based on machine learning,
that tells the computer how to learn to operate on its own.
In turn, the device continues to gain knowledge to improve processes and run tasks
more efficiently.
A common example is the Alexa, Google Home, or Apple Home device you may
already own. The more you interact with it, the better it gets at noticing your
individual preferences.
For instance, suppose you tell it to play your favorite song and your spouse gives it
the same command. Artificial intelligence algorithms make it possible to tell the
difference between individual voices, remember the name of a specific tune, and then
play the track accordingly on your individual streaming music account.
3. Algorithm used:
Random Forest Regressor
● Ensemble Learning: Random Forest Regressor is an ensemble learning technique.
It combines multiple individual decision trees to make predictions. Each decision
tree is trained on a different subset of the training data and makes independent
predictions.
● Decision Trees: For each subset of data created through bootstrapping, a decision
tree is constructed. Decision trees split the data based on features to make
predictions.
● Random Feature Selection: When constructing each decision tree, Random Forest
randomly selects a subset of features to consider at each split. This helps to
decorrelate the trees and make them more diverse, which leads to better overall
performance.
● Averaging: Once all decision trees are constructed, predictions are made by each tree
individually. For regression tasks (as in Random Forest Regressor), the final
prediction is typically the average (or mean) of the predictions made by all the trees.
For classification tasks, the analogous step is majority voting.
● Reducing Overfitting: Since Random Forest uses multiple decision trees, it tends
to generalize well to unseen data and is less prone to overfitting compared to a single
decision tree.
● Hyperparameter Tuning: Random Forest has several hyperparameters that can
be tuned to optimize its performance, such as the number of trees in the forest, the
maximum depth of the trees, and the minimum number of samples required to split
a node.
● Prediction: Once the Random Forest model is trained, it can be used to make
predictions on new data. For regression tasks, the model takes input features and
outputs the predicted continuous values.
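The ideas of bootstrapping, random feature selection, and averaging can be illustrated with a deliberately simplified stand-in written in plain Python. It is not scikit-learn's RandomForestRegressor and not the model used in the project: each "tree" here is a depth-one stump on one randomly chosen feature, and all function names are hypothetical.

```python
import random
from statistics import mean


def fit_stump(X, y, feature):
    """Find the threshold on one feature that minimizes squared error."""
    best = None
    for t in sorted({row[feature] for row in X}):
        left = [yi for row, yi in zip(X, y) if row[feature] <= t]
        right = [yi for row, yi in zip(X, y) if row[feature] > t]
        if not left or not right:
            continue
        lm, rm = mean(left), mean(right)
        err = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    if best is None:                      # feature is constant in this sample
        m = mean(y)
        return (feature, float("inf"), m, m)
    return (feature, best[1], best[2], best[3])


def fit_forest(X, y, n_trees=25, seed=0):
    """Train stumps on bootstrap samples, each using one random feature."""
    rng = random.Random(seed)
    n, n_features = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        feature = rng.randrange(n_features)          # random feature selection
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return forest


def predict(forest, row):
    """Average the predictions of all stumps."""
    return mean(lm if row[f] <= t else rm for f, t, lm, rm in forest)
```

A full random forest grows deeper trees and reconsiders a random feature subset at every split, but the final step is the same: the prediction is the mean over all trees.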
4.4 Individual Contribution
Sr. No. Details of Activity Name of Responsible Student Description
CHAPTER 5
TESTING
Performance Testing: Performance testing of watermarking on a database involves
assessing the impact of embedding and extracting watermarks on database operations.
It evaluates the speed and efficiency of these processes, ensuring minimal latency
during data insertion, retrieval, and update. The goal is to verify that watermarking
operations do not significantly degrade the overall performance of the database,
maintaining optimal functionality even with the inclusion of watermarking mechanisms.
Security Testing: Security testing of watermarking on databases involves verifying the
resilience of the watermarking scheme against unauthorized access, tampering, or
removal. It assesses vulnerabilities in the algorithm and implementation, ensuring
Validation and Verification: Validation and verification of watermarking on databases
involve confirming that the embedded watermarks accurately reflect intended
information (validation) and ensuring that the watermarking process operates as
intended (verification). Validation confirms the correctness of embedded watermarks,
while verification assesses the functionality and integrity of the watermarking system,
including its resistance to attacks and its adherence to specified requirements. Both
processes are crucial for ensuring the reliability and effectiveness of watermarking in
database applications.
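One way to approach the performance testing described above is to time the same insert path with and without watermarking. The harness below is only a sketch: in-memory lists stand in for the database, a keyed hash stands in for the watermarking step, and the names KEY, plain_insert, watermarked_insert, and measure_overhead are hypothetical.

```python
import hashlib
import hmac
import time

KEY = b"demo-key"  # hypothetical secret used only for this benchmark


def plain_insert(rows):
    """Baseline: store rows unchanged."""
    store = []
    for row in rows:
        store.append(row)
    return store


def watermarked_insert(rows):
    """Store each row together with a keyed tag derived from its contents."""
    store = []
    for row in rows:
        tag = hmac.new(KEY, repr(row).encode(), hashlib.sha256).hexdigest()
        store.append((row, tag))
    return store


def best_time(func, rows, repeats=5):
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(rows)
        best = min(best, time.perf_counter() - start)
    return best


def measure_overhead(n_rows=5000):
    """Return (baseline seconds, watermarked seconds) for the same rows."""
    rows = [(i, f"name-{i}", i * 1.5) for i in range(n_rows)]
    return best_time(plain_insert, rows), best_time(watermarked_insert, rows)
```

Comparing the two timings gives a rough estimate of the latency that watermarking adds per batch. A real test would run against the actual database and also cover retrieval and update.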
CHAPTER 6
RESULT AND DISCUSSION
● User Authentication
Result:
The user authentication mechanism implemented in watermarking on the database
demonstrated robust performance, accurately identifying authorized users and
preventing unauthorized access. Through rigorous testing, it achieved a success rate of
over 95% in authenticating legitimate users within a diverse dataset.
Discussion:
The high success rate underscores the effectiveness and reliability of the user
authentication system, enhancing the security of the database against unauthorized
access attempts. Additionally, the seamless integration of authentication within the
watermarking process ensures minimal overhead on database operations, optimizing
performance and usability.
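The report does not describe how credentials are stored. A common approach, sketched here as an assumption rather than as the project's method, is to keep a salted PBKDF2 hash of each password and compare hashes in constant time. The function names and the iteration count are illustrative.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; tune for your hardware


def hash_password(password, salt=None):
    """Return (salt, derived key) for the given password."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected):
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

Only the salt and the derived key would be stored in the user table, so a leaked database does not directly reveal passwords.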
1. First User Interface with Login Button:
2. Login User Interface with Username and Password:
● File Upload
Result:
The file upload functionality in watermarking on the database successfully embeds
imperceptible marks into uploaded files without affecting their usability. The
watermarked files are securely stored in the database, ready for retrieval and usage.
Discussion:
The implementation ensures that uploaded files undergo seamless watermarking,
preserving their integrity and ownership. This robust approach enhances data security
and enables effective tracking of file usage within the database environment.
1. Home page where you can upload CSV and Excel files:
● Watermarking
Result:
The implementation of a Random Forest Regressor model for watermark generation
successfully produced unique watermark values, enhancing file traceability.
Watermarked files were seamlessly integrated into the database, enriching data
integrity and enabling efficient tracking of file origins.
Discussion:
Leveraging machine learning models like Random Forest Regressor offers
robustness and scalability in watermark generation. Integrating watermarked
files into the database enhances data security and aids in combating
unauthorized data alterations.
● Search
Result:
The search functionality efficiently retrieves files based on filename matches from the
database. It displays relevant results promptly, enhancing user experience and
productivity.
Discussion:
By enabling users to search files by filename, the system streamlines information
retrieval, promoting efficiency. However, optimizing search algorithms and indexing
techniques can further enhance performance, accommodating larger datasets while
maintaining responsiveness. Additionally, incorporating fuzzy matching or
autocomplete features can improve search accuracy and user satisfaction.
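The fuzzy matching suggested above could be prototyped with the standard library's difflib module. The sketch below is an assumption about how such a feature might look, not the project's search code. It returns substring matches first and then close matches that tolerate small typos. The function name search_files and the sample filenames are hypothetical.

```python
import difflib


def search_files(query, filenames, cutoff=0.6):
    """Return substring matches first, followed by fuzzy matches."""
    q = query.lower()
    exact = [name for name in filenames if q in name.lower()]
    lowered = {name.lower(): name for name in filenames}
    close = difflib.get_close_matches(q, list(lowered), n=5, cutoff=cutoff)
    fuzzy = [lowered[m] for m in close if lowered[m] not in exact]
    return exact + fuzzy


files = ["sales_2023.csv", "inventory.xlsx", "salaries.csv"]
results = search_files("inventry.xlsx", files)  # typo in the query
```

Raising cutoff makes the fuzzy step stricter, and lowering it returns more, but less similar, filenames.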
1. After uploading the file, the watermark is applied automatically with the help of
the Random Forest Regressor:
● File Preview
Result:
The file preview feature enables users to view uploaded file contents
conveniently. The application retrieves data from the database, processes
it, and presents it as an HTML table.
Discussion:
This functionality enhances user experience by facilitating quick assessment of file
content. Retrieval from the database ensures data integrity and security. Presenting data
in an HTML table format enhances readability, allowing users to comprehend
information efficiently.
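Rendering stored CSV data as an HTML table can be sketched as below. This is an illustrative helper under assumed names (csv_to_html_table, max_rows), not the application's actual code. It escapes every cell so that file contents cannot inject markup into the page.

```python
import csv
import html
import io


def csv_to_html_table(csv_text, max_rows=20):
    """Render the header and up to max_rows data rows as an HTML table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return "<table></table>"
    header, body = rows[0], rows[1 : max_rows + 1]
    parts = ["<table>"]
    parts.append("<tr>" + "".join(f"<th>{html.escape(c)}</th>" for c in header) + "</tr>")
    for row in body:
        parts.append("<tr>" + "".join(f"<td>{html.escape(c)}</td>" for c in row) + "</tr>")
    parts.append("</table>")
    return "\n".join(parts)
```

Limiting the preview to a fixed number of rows keeps the page responsive for large files.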
The Admin’s Point of View:
CHAPTER 7
CONCLUSION
The implementation of watermarking on the database presents a robust solution for data
protection and integrity verification. By embedding unique identifiers into database
records, the system ensures data authenticity and assists in detecting unauthorized
modifications. This enhances security and trustworthiness, making it suitable for
applications where data integrity is paramount, such as financial transactions, medical
records, or intellectual property management.
● Functionality: The code encompasses a fully functional Flask web
application with diverse features including user authentication, file
management, and watermarking capabilities, offering a comprehensive
solution for data handling needs.
● User Experience: With an intuitive interface and seamless navigation,
users can effortlessly upload, search, and preview files, contributing to
a positive and engaging user experience.
● Security Measures: By implementing robust user authentication
mechanisms and database encryption techniques, the code ensures data
integrity and confidentiality, fortifying the application's security posture
and safeguarding sensitive information against unauthorized access or
tampering.
● Scalability and Performance: The modular architecture and efficient
design of the code facilitate scalability, enabling the application to
handle increasing user loads and data volumes while maintaining
optimal performance.
● Future Enhancements: The extensible nature of the code allows for
seamless integration of additional features and functionalities, such as
advanced search capabilities, user-specific preferences, or real-time
collaboration tools. Continual refinement and innovation can ensure that
the application remains relevant and adaptable to evolving user needs
and technological advancements.
7.1 Future Scope
In the future, watermarking on databases holds potential for significant advancements. These
include integration with blockchain for immutable data provenance, dynamic watermarking
techniques for adaptive tracking, and enhanced security measures like advanced encryption.
Collaboration for standardization and exploring cross-domain applications are also key areas
of future exploration.